
Ignite Startups: How Thunder Compute is Maximizing GPU Efficiency with Carl Peterson

Episode 134 of the Ignite Podcast

The world of artificial intelligence is evolving at a breakneck pace, with demand for computing power growing exponentially. AI models require massive amounts of GPU processing, yet a significant portion of this compute power remains underutilized. Enter Thunder Compute, a YC-backed startup that is revolutionizing how GPUs are allocated and used, dramatically improving efficiency and cost-effectiveness for AI developers and enterprises.

In this episode of Ignite Podcast, host Brian Bell sat down with Carl Peterson, founder and CEO of Thunder Compute, to discuss how his company is tackling one of the biggest inefficiencies in AI infrastructure. Their conversation delved into GPU virtualization, the future of AI computing, and what it means for the industry.

The Problem: GPUs Sitting Idle 60-90% of the Time

At the heart of Thunder Compute’s innovation is a shocking inefficiency in AI computing: most GPUs sit idle for the majority of their lifecycle. While enterprises are paying a premium to access cloud-based GPUs, studies show that these processors are often underutilized, with as much as 60-90% of their time spent doing nothing.

This inefficiency isn’t just costly—it’s slowing down AI development. Researchers, engineers, and companies are paying for compute power they aren’t fully using, leading to bottlenecks and wasted resources.

Carl Peterson shared how his experience at Georgia Tech opened his eyes to this problem. His co-founder, also named Brian (a really good name, as it turns out), was part of a research lab where students had to sign up for GPU time using an Excel spreadsheet. The system was frustratingly inefficient, often forcing researchers to wait weeks to access compute power.

This led them to ask a simple but powerful question:

👉 Why aren’t GPUs being shared more efficiently, just like CPUs were virtualized decades ago?

The Solution: Virtualizing GPUs Like VMware Did for CPUs

Thunder Compute’s approach is similar to what VMware did for CPUs in the 1990s—but for GPUs. By virtualizing GPUs, their technology enables multiple users to efficiently share GPU resources without wasted downtime.

Here’s how it works:

  1. GPU Scheduling Optimization: Thunder Compute decouples GPU scheduling from server scheduling. Instead of dedicating a full GPU to a single user 24/7, it allows multiple users to share the same GPU, ensuring near 100% utilization.

  2. AI Workload Optimization: Their software dynamically assigns GPU power based on actual workload needs, meaning that idle GPUs can be allocated to other processes instead of sitting unused.

  3. Lowering Costs for AI Developers: By making GPU usage 4-5x more efficient, Thunder Compute significantly lowers the cost of running AI models, a huge benefit for startups and enterprises alike.
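To make the scheduling idea concrete, here is a toy sketch (not Thunder Compute's actual implementation, whose details weren't covered in the episode) contrasting the status quo — one reserved GPU per user for a whole workday — with a shared pool where any free GPU picks up the next queued job:

```python
def dedicated_utilization(job_hours, window_hours):
    """Status quo: each user reserves one GPU for the full window,
    paying for idle time between their jobs."""
    return sum(job_hours) / (window_hours * len(job_hours))

def schedule_shared(job_hours, num_gpus):
    """Toy shared-pool scheduler: each job runs on whichever GPU
    frees up first, so GPUs stay busy while work remains.
    Returns (makespan, utilization)."""
    gpu_free_at = [0.0] * num_gpus
    for d in job_hours:
        g = min(range(num_gpus), key=gpu_free_at.__getitem__)
        gpu_free_at[g] += d          # run this job on the earliest-free GPU
    makespan = max(gpu_free_at)
    return makespan, sum(job_hours) / (makespan * num_gpus)

# Four researchers, each with 2 hours of work across an 8-hour day:
jobs = [2.0, 2.0, 2.0, 2.0]
print(dedicated_utilization(jobs, 8))   # 0.25 -> 4 GPUs, 75% idle
print(schedule_shared(jobs, 1))         # (8.0, 1.0) -> 1 GPU, fully busy
```

In this illustrative scenario the shared pool does the same work with a quarter of the GPUs, which is the intuition behind the near-100% utilization claim above.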

Carl describes their vision as "turning GPUs into a shared cloud resource, much like how AWS and VMware virtualized CPU computing decades ago."

The Impact: 4-5x Efficiency Gains with Minimal Performance Trade-offs

One of the most surprising insights from the conversation was that users don’t even notice the latency introduced by Thunder Compute’s GPU sharing model.

According to Carl, their system currently sees a 20-50% performance slowdown compared to native GPU processing, but the tradeoff is worthwhile given the cost savings. In practice, most users report no noticeable difference when running AI workloads.

More importantly, Thunder Compute believes they can reduce this slowdown to within 5% of native GPU performance—a game-changer for AI developers looking to optimize costs.

“If a GPU is sitting idle 90% of the time, but we can make it work 100% of the time, that’s a massive efficiency gain,” says Carl.

By enabling near 100% GPU utilization, companies can get 4-5x more processing power per dollar spent, unlocking enormous savings, especially for cloud-based AI workloads.
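The economics follow directly from utilization: what matters is the price per hour of actual GPU work, not the sticker price per rented hour. A quick calculation (the $2.50/hour rate is an illustrative assumption, not a figure from the episode):

```python
def effective_cost_per_busy_hour(hourly_rate, utilization):
    """Dollars paid per hour the GPU is actually computing.
    utilization is the fraction of rented time spent on real work."""
    return hourly_rate / utilization

rate = 2.50                                        # assumed $/hr, for illustration
low  = effective_cost_per_busy_hour(rate, 0.20)    # GPU busy 20% of the time
full = effective_cost_per_busy_hour(rate, 1.00)    # GPU busy all the time
print(low, full, low / full)                       # 12.5 2.5 5.0
```

At 20% utilization every productive hour effectively costs 5x the rental rate; pushing utilization toward 100% removes that markup, which is where the 4-5x more processing power per dollar comes from.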

Why Hasn’t AWS or Google Cloud Solved This?

A natural question arises: If this is such an obvious problem, why hasn’t AWS, Google Cloud, or Nvidia already built a similar solution?

Carl explains that major cloud providers have traditionally assumed that GPUs need the fastest possible connection to CPUs—an assumption that has gone unchallenged for years.

Thunder Compute breaks this assumption by using a different networking model that accepts a small amount of added latency in exchange for dramatic efficiency improvements.

Additionally, big cloud providers are focused on selling more GPUs, not necessarily optimizing their usage. Since AWS, Google, and Azure make money renting out dedicated GPU instances, they have little incentive to disrupt their pricing model by introducing efficient sharing solutions.

However, Thunder Compute’s software-based approach is cloud-agnostic, meaning that companies can use it across different providers, reducing dependency on expensive, dedicated GPU instances.

Where Is Thunder Compute Headed?

Currently, Thunder Compute is reselling AWS and Google Cloud GPU instances at a fraction of the cost by applying their efficiency model. But their long-term vision extends far beyond cloud reselling.

Carl hints that as they scale, they may build their own GPU data centers to further drive down costs and control the entire stack.

“We want to stay focused on our core software right now, but in the future, vertical integration could be a major opportunity for us,” says Carl.

In the meantime, they are laser-focused on improving their virtualization technology to make it as fast and seamless as possible, ensuring that AI developers can get maximum compute power for minimal cost.

The Bigger Picture: The Future of AI Compute

The conversation also touched on broader AI trends, including the exponential growth in AI compute demand, advancements in AI model efficiency, and how companies are increasingly relying on AI-powered automation.

Brian and Carl also discussed the possibility of AI-powered startups with no employees, where AI agents manage everything from customer support to software development.

Carl envisions a future where AI compute is far more efficient, democratized, and accessible to everyone, unlocking a new wave of innovation across industries.

“Right now, AI compute is where CPUs were in the 1990s. But in the next few years, we’re going to see a massive transformation in how GPUs are used and shared.”

Final Thoughts: What This Means for AI Developers & Startups

If you’re building AI applications, Thunder Compute’s approach could significantly lower your infrastructure costs while ensuring access to high-performance computing.

Here’s why it matters:

  • 4-5x GPU efficiency gains → Lower cloud costs

  • Eliminates wasted compute time → Faster AI development

  • Works across multiple cloud providers → More flexibility

  • Future-proof AI workloads → Scalable & cost-effective

For startups, researchers, and enterprises looking to optimize AI infrastructure costs, Thunder Compute is an exciting company to watch.

👂🎧 Watch, listen, and follow on your favorite platform: https://tr.ee/S2ayrbx_fL

🙏 Join the conversation on your favorite social network: https://linktr.ee/theignitepodcast

Chapters:

  • Introduction and Guest Overview (00:01 – 00:32)

  • The Founder’s Journey (00:32 – 02:33)

  • The Problem: Underutilized GPUs in AI Computing (02:33 – 06:13)

  • The Thunder Compute Solution: Virtualizing GPUs (06:13 – 09:57)

  • Breaking Industry Assumptions: Why Now? (09:57 – 12:49)

  • Performance vs. Cost Tradeoff: How Fast is Virtualized AI Compute? (12:49 – 16:12)

  • Monetizing GPU Efficiency: Thunder Compute’s Business Model (16:12 – 19:33)

  • Competing with Cloud Giants & Future Expansion Plans (19:33 – 21:57)

  • The Evolution of AI Compute & The Role of Networking (21:57 – 23:52)

  • YC’s Role in Thunder Compute’s Growth (23:52 – 26:37)

  • Real-World Use Cases and Customer Feedback (26:37 – 29:48)

  • The Future of AI Startups: Fewer Employees, More AI Agents? (29:48 – 33:12)

  • Long-Term Vision for Thunder Compute (33:12 – 36:41)

  • Rapid-Fire Questions & Tech Insights (36:41 – 40:17)

  • Final Thoughts & Where to Find Thunder Compute (40:17 – 42:35)

Thank-you to our sponsor!

Byldd helps non-technical domain-expert founders build and launch tech businesses by providing a complete product team — that's everyone you need, from designers to engineers to testers, all the way up to a CTO. We'll ship products while you focus on the other essentials: validation, sales, and distribution. Our portfolio companies have been backed by YC, Google, ERA, and other top-tier investors.

Get Started Here