The world is currently obsessed with building bigger and faster AI models. However, there is a massive problem that most people ignore. It is called the inference bottleneck. Inference is the stage where a trained model actually answers requests, and right now it is basically a traffic jam for AI data. Zain Asgar, a Stanford professor and seasoned founder, just raised $80 million in a Series A round to fix this. His new startup, Gimlet Labs, has a plan to make AI workloads run across any hardware you have lying around.
Right now, most companies think they need the newest and most expensive chips to run AI properly. This leads to a lot of wasted money and idle resources. McKinsey estimated that data center spending could hit $7 trillion by 2030. Asgar points out that most apps use only about 15 to 30 percent of the hardware they already have. That leaves 70 to 85 percent of the capacity sitting there doing nothing. Gimlet Labs wants to stop this waste.
Breaking the One-Chip Habit
Gimlet Labs created what it calls a multi-silicon inference cloud. Instead of forcing a single task to stay on one specific chip, its software slices up the work. It can run parts of an AI model on a traditional CPU while an AI-tuned GPU handles the heavy math, and it can pull high-memory systems into the same job at the same time. This means you don’t have to wait for the latest hardware to roll off the assembly line to get your work done.
Think about a single AI agent. It might need to perform several steps to answer a prompt. Each step has different needs. One part might need a lot of memory, while another needs raw computing power. Gimlet’s software acts like a smart conductor. It looks at the hardware available and sends each part of the job to the best chip for that specific task. Gimlet says this approach can make AI workloads up to ten times more efficient than they are today.
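To make the conductor idea concrete, here is a minimal sketch in Python. It is not Gimlet's actual scheduler, whose internals the company has not described here. Every name and number in it (Device, Stage, pick_device, the throughput, memory and overhead figures) is a hypothetical stand-in. The sketch assumes each step of an agent can be described by how much math and how much memory it needs, and it simply sends each step to the chip with the lowest estimated run time among those with enough memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """A hypothetical piece of hardware, described in made-up relative units."""
    name: str
    compute: float      # relative math throughput (work units per second)
    memory_gb: float    # memory available to a single step
    overhead_s: float   # fixed cost of dispatching any step to this device


@dataclass(frozen=True)
class Stage:
    """One step of an AI agent's work, with hypothetical resource needs."""
    name: str
    compute_need: float    # work units of math required
    memory_need_gb: float  # memory the step must hold at once


def estimated_seconds(stage: Stage, device: Device) -> float:
    # Toy cost model: fixed dispatch overhead plus math divided by throughput.
    return device.overhead_s + stage.compute_need / device.compute


def pick_device(stage: Stage, devices: list[Device]) -> Device:
    # Rule out any chip that cannot hold the step in memory.
    candidates = [d for d in devices if d.memory_gb >= stage.memory_need_gb]
    if not candidates:
        raise ValueError(f"no device has enough memory for {stage.name}")
    # Among the rest, take the fastest; break ties in favor of the smaller
    # machine so that large-memory systems stay free for steps that need them.
    return min(candidates, key=lambda d: (estimated_seconds(stage, d), d.memory_gb))


def place(stages: list[Stage], devices: list[Device]) -> dict[str, str]:
    # Map each step name to the name of the device chosen for it.
    return {s.name: pick_device(s, devices).name for s in stages}


# Illustrative fleet and workload; all figures are invented for the example.
DEVICES = [
    Device("cpu", compute=1.0, memory_gb=64.0, overhead_s=0.0),
    Device("gpu", compute=20.0, memory_gb=24.0, overhead_s=0.05),
    Device("highmem", compute=4.0, memory_gb=512.0, overhead_s=0.02),
]
STAGES = [
    Stage("tool_call", compute_need=0.01, memory_need_gb=1.0),
    Stage("decode", compute_need=10.0, memory_need_gb=16.0),
    Stage("long_context_prefill", compute_need=8.0, memory_need_gb=200.0),
]

plan = place(STAGES, DEVICES)
print(plan)
```

With these invented numbers, the tiny tool call stays on the CPU because the GPU's dispatch overhead outweighs its speed, the math-heavy decode step goes to the GPU, and the long-context step lands on the high-memory system because nothing else can hold it. A real system would also have to account for the cost of moving data between chips and for several jobs competing for the same hardware, which this sketch leaves out.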
Speed Meets Savings
The claimed results are impressive. Gimlet Labs says it can speed up AI inference by three to ten times. The best part, according to the company, is that it does this without increasing the cost or the power usage. It has already formed partnerships with the biggest names in the industry. Nvidia, AMD, Intel, ARM, and even newer players like Cerebras are on the list.
Gimlet’s product isn’t for the average person building a simple app in their garage. The company is targeting the biggest AI labs and massive data centers. These are the places where even a small increase in efficiency can save millions of dollars. The company launched last October and already has eight-figure revenue. Its customer base has doubled in just the last four months. This shows just how desperate the industry is for a way to manage these massive workloads.
Building for the Long Haul
Asgar and his co-founders Michelle Nguyen, Omid Azizi, and Natalie Serrino are not new to this world. They previously built a startup called Pixie that was acquired by New Relic. They know how to scale a business quickly. After a chance meeting with a lead investor and a flurry of interest from venture capital firms, they chose Menlo Ventures to lead this round.
With a total of $92 million in funding and a team of 30 people, Gimlet Labs is ready to change how we think about computing power. As new chips hit the market and old GPUs get retired, its software is meant to keep everything running smoothly. The company is betting that you don’t always need more hardware. Sometimes, you just need a better way to use what you already have.

