Three Ingredients, Three Labs, One Squeeze: Reading the 2026 AI Compute Crisis
The Sentence That Broke the Chess Board
May 6, 2026. San Francisco. Code with Claude.
Ami Vora is on stage. Anthropic's new Chief Product Officer. She's about ten minutes into the keynote when she drops the line that empties the air out of the room:
"We're partnering with SpaceX to use all of the capacity of their Colossus data center."
That's Simon Willison's live blog, captured in real time at 09:12. One sentence. Then she moves on to feature announcements like she hasn't just rewritten the alliance map of the entire industry.
Here's why that sentence matters. Four months earlier, on January 12, Anthropic cut off xAI's access to Claude after discovering Musk's lab had been using it through Cursor for internal development. xAI co-founder Tony Wu sent an internal memo that morning, first reported by Kylie Robison:
"According to Cursor, this is a new policy Anthropic is enforcing for all its major competitors."
Three months before that, Musk had publicly called Claude "misanthropic" on X. There were op-eds. Lawsuits in adjacent territory. By any honest reading, mutual hostility.
Today they share a power grid. Fortune put a price tag on the reconciliation in a single headline: "Elon Musk called Anthropic 'evil' 3 months ago. Now he's taking $4 billion to become its data landlord." Business Insider's reporting put the volume at 300 megawatts of new computing power at SpaceX's Colossus One.
Press releases don't stage cease-fires this fast unless something underneath is on fire. The fire has a name: compute.
Three Ingredients (Credit Theo, Then Extend)
I didn't have a clean way to think about this until Theo Browne's video on May 6 crystallized it. The framework is his. The angle below is mine.
There are exactly three ingredients to building a frontier AI lab in 2026, and right now nobody has all three at the scale they need:
- Research — the people who know which paper is worth reading, which architecture is worth chasing, and which apparent breakthrough is just an artefact of test-set contamination.
- Data — and not just any data. The high-grade stuff. The "no, you missed step 2 — fix it" correction traces from people building real software with real frustration. You can buy text. You cannot easily buy that.
- Compute — gigawatts, not GPUs. The unit changed sometime in the last eighteen months and most of us missed it. When SemiAnalysis describes Colossus 2 as "the first gigawatt datacenter in the world," that's the new benchmark.
A quick scoreboard:
| Lab | Research | Data | Compute | What they're plugging |
|---|---|---|---|---|
| OpenAI | ✓ | ✓ | ✓ | Distribution (AWS deal) |
| Anthropic | ✓ | ✓ | ✗ | Leasing Colossus, Google TPUs, AWS Trainium |
| xAI / SpaceX | partial | ✗ | ✓ | Cursor deal for the data |
| Google | ✓ | ✓ | ✓ | Their own internal coordination |
Once you see those columns, every move on the board over the last six months snaps into place. It also reveals a fourth ingredient that money cannot buy: time. You can write a $50 billion check this morning. You cannot write a check that conjures eighteen months of substation construction, transformer manufacturing lead times, and grid-interconnect approvals. Treat that constraint as the silent backdrop to everything that follows.
Start with the lab feeling the squeeze most.
Anthropic: The Lab That Tripped Over Its Own Demand
In the first quarter of 2026, Anthropic grew 80x. They had planned for 10x.
That's not my framing. That's Dario Amodei's, on the same Code with Claude stage, in conversation with his sister Daniela. Per Business Insider, the Anthropic CEO said his company had 80x year-over-year growth in revenue and usage in the first quarter, and "added, half-joking, that he hopes this doesn't continue because that level of hyper-growth is 'too hard to handle.'"
Then, the line that explains everything that followed: "Anthropic had planned for anywhere from a 'little' revenue growth to 10x, Amodei said, and that gap between expectations and reality is why his company's computing resources have been stretched thin this year."
The numbers behind the squeeze are stark. From Anthropic's own announcement on April 6:
- Run-rate revenue surpassed $30 billion, up from approximately $9 billion at the end of 2025.
- Over 1,000 business customers each spending more than $1 million annualized — doubled from 500 in February. In two months.
- API volume up 17x year-on-year (Vora at Code with Claude).
Here's the part I'm hesitant to write, because I'm a fan and a daily user.
I was one of the people getting throttled.
Through March and April, Claude Code on my Pro plan started dropping context mid-session, returning oddly clipped responses, and burning through quotas faster than the same workloads had two weeks earlier. I assumed it was me. It wasn't. Fortune's April 24 story walked through the postmortem: three engineering missteps — a March 4 reduction in default reasoning effort from "high" to "medium," a March 26 bug that caused the model to discard its own reasoning history mid-session, and an April 16 system prompt that capped responses at 25 words between tool calls.
All three were resolved by April 20. But the underlying story was the one Anthropic conceded to Fortune in plain English:
"Demand for Claude has grown at an unprecedented rate, and our infrastructure has been stretched to meet it, particularly at peak hours."
There's a number that gives the regression teeth. Veracode's analysis found Claude Opus 4.7 introduced a vulnerability in 52% of coding tasks tested, against roughly 30% for OpenAI's models. Opus 4.7 wasn't just slower at peak hours. For a window of weeks, it was measurably worse at the one thing Anthropic's reputation rests on.
And here's what the writers at MindStudio nailed in their April 23 piece: this isn't fixable on a quarter timeline. "Ordering GPUs, signing colocation deals, and provisioning infrastructure takes 18 to 24 months at minimum. Money raised today turns into compute capacity in late 2026 or 2027."
Look at Anthropic's compute portfolio in that light. Each row is a phone call:
- Google + Broadcom (Apr 6, 2026): multiple gigawatts of next-generation TPU capacity, online from 2027 (Anthropic official).
- CoreWeave (Apr 10, 2026): a multi-year deal, with compute online in late 2026 and an option to expand.
- $50 billion U.S. AI infrastructure commitment (Nov 2025), referenced in Anthropic's announcement.
- AWS Trainium (primary cloud provider and training partner, per Anthropic).
- SpaceX Colossus (May 6, 2026): the new news.
Read those dates: April 6, April 10, May 6. That's a five-week sprint to bolt anything that produces electrons onto the side of the company. If you're Anthropic and you've already sold next year's GPU capacity twice over, you make a phone call you swore you'd never make.
xAI: The Lab With Spare Compute and a Data Hole
Across the country, Memphis has the opposite problem.
Colossus 1 — the original xAI supercomputer, built in 122 days from a converted Electrolux factory — runs at roughly 300 MW, with around 200,000 H100/H200s and ~30,000 GB200 NVL72s. SemiAnalysis: "the largest fully operational, single-coherent cluster" anywhere.
It's also mostly idle for Grok. Adoption never hit the trajectory Musk projected.
Meanwhile, xAI has moved its own training onto Colossus 2, the next-door build. SemiAnalysis estimates the project went from zero to 200 MW in six months. The Introl writeup of the January 2026 expansion puts the total site capacity near 2 gigawatts and the GPU count at 555,000 Nvidia Blackwell units, at a hardware cost of about $18 billion. Tom's Hardware notes that satellite imagery currently shows about 350 MW of cooling capacity, so treat the headline numbers as ambition; the trajectory is clear regardless.
The "genius move" SemiAnalysis flags is worth dwelling on. Memphis pushed back hard on gas-turbine permits. So xAI bought a former Duke Energy power plant in Southaven, Mississippi — across the state line, two miles south — and Mississippi regulators granted xAI temporary approval to run gas turbines there for up to 12 months without a permit. Power capacity that would have taken three years of Tennessee paperwork came online in months. Infrastructure as a regulatory arbitrage problem. That's the new game.
So xAI has compute. What it doesn't have is data. Specifically the kind of data that matters most for the coding-agent market the labs are now fighting over.
Hence: April 21, 2026.
SpaceX announces a $10 billion collaboration with Cursor to develop "coding and knowledge work AI," plus an option to acquire Cursor outright for $60 billion later in the year. Per TechCrunch, the partnership pairs "Cursor's product and distribution to expert software engineers" with SpaceX's Colossus, "which the company claims has the equivalent compute power of a million Nvidia H100 chips."
Cursor's valuation arc is its own short story:
- January 2025: $2.5 billion
- May 2025: $9 billion
- November 2025: $29.3 billion (post-money on a $2.3 billion Series D)
- April 2026: $50 billion target for the next round
- April 2026: $60 billion option price from SpaceX
One company, 24x in 15 months.
Here's the read I think is correct — the one I'll commit to as a builder even if I'd hedge it as an analyst: the $10 billion is a data-licensing fee dressed as a partnership.
Cursor's value isn't the IDE. The IDE is good. The IDE is not $60 billion good. What's $60 billion good is the corpus of every developer correction message ever sent to a frontier coding model — and Cursor has them against Claude, GPT, and Gemini. Those "no, you missed step 2 — fix it" messages are the highest-grade reinforcement-learning training data on Earth. Each one teaches the next model to skip the step that triggered the correction.
And that — finally — is why Anthropic banned xAI from Claude in January. Not pettiness. Anti-leakage. The Tony Wu memo line lands differently in this light:
"According to Cursor, this is a new policy Anthropic is enforcing for all its major competitors."
Read in May, that's an early warning siren. The labs already understood that competitor IDEs were data faucets. Anthropic was the first to admit it out loud, with terms that explicitly prohibit using Claude to "build a competing product or service, including to train competing AI models." Three months later, SpaceX put a price tag on the water.
Two months before the deal, two senior Cursor engineers, Andrew Milich and Jason Ginsberg, had already left to join xAI, both reporting directly to Musk. The plumbing was in place.
OpenAI's Quiet Wedge: AWS Distribution
OpenAI doesn't need compute. So why did they sign a deal with Amazon?
Until early 2026, Anthropic's quiet enterprise wedge was almost embarrassingly simple: Claude was the only frontier model on Bedrock. Most Fortune 500 procurement runs through AWS. That meant Claude had a procurement moat OpenAI couldn't reach.
Then in February 2026, AWS launched a multiyear partnership with OpenAI — per CIO Dive — "to distribute OpenAI Frontier, the Anthropic competitor's enterprise platform for AI agents." The exact deal mechanics matter less than the signal: the Bedrock moat is leakier than it was a year ago.
Couple that with the Codex catch-up. The same Fortune piece on the Claude Code regression slipped a quiet but devastating data point into the second half: "OpenAI said it now had 4 million active Codex users, 9 million paying business customers, 900 million weekly active users on ChatGPT, and more than 50 million subscribers. Anthropic has not published comparable user figures."
Now you can read why Anthropic was willing to break bread with the man who had called them evil. Two of their three structural moats — coding leadership and Bedrock-only distribution — were narrowing at the same moment. Compute had become the binding constraint. And the only operator with significant idle frontier-grade compute happened to also run a competing lab and hate their CEO on principle.
The framework holds: when OpenAI doesn't need compute, it goes after distribution. When Anthropic doesn't have compute, it leases from its enemy. Each lab plays the move that costs them the least pride.
All of which is fascinating geopolitics. None of which matters until you sit down at your keyboard.
What This Looks Like at 8 a.m. in Singapore
If you're reading this on a Pro or Max plan in Singapore on Thursday morning, here's what changed for you on Wednesday.
From Simon Willison's live blog of the keynote, captured at 09:12:
- Claude Code's 5-hour session limit doubled for Pro, Max, and Enterprise customers.
- Peak-hour rate-limit reductions removed.
- "Increased rate limits for developers on Claude Code and the API."
These aren't generosity. They're triage — what happens when a 300 MW chunk of Colossus 1 starts taking traffic off the bottleneck. Welcome news, real relief. But read in the context of the 80x quote, it's the equivalent of opening a second lane on a road already running at 800% of design capacity.
Three things this should change about how you build. Monday morning, not abstract eventually:
1. Treat compute as a multi-year supply chain, not a budget line. Anthropic's run-rate jump from $9B to $30B doesn't translate into capacity for 18–24 months. The Google/Broadcom TPU capacity is online "starting in 2027." The CoreWeave compute is "later in 2026." The capital is real; the electrons take time. Architect accordingly. Implement exponential backoff like you mean it. Treat rate limits as a permanent feature of the landscape, not a temporary inconvenience until your next plan upgrade.
2. Assume your IDE is a training corpus. Every "no, you missed step 2" you type into Cursor, into a Claude Code session, into Antigravity, into any agentic coding tool — that's training data for somebody. The Cursor deal made the price tag explicit; it didn't invent the dynamic. If your team's code is sensitive, or your differentiation lives in your prompts and your domain knowledge, read the data clauses again. Then read them a second time.
3. Pick your dependencies with the three ingredients in mind. A model provider strong on research and data but compute-starved (Anthropic, today) will throttle you when demand spikes — and Anthropic's own statement about being stretched at peak hours is now part of the public record. A provider strong on compute but thin on research and data (xAI) will be cheap but lag the frontier. There's no neutral choice. Only an informed one. Multi-LLM routing with task-aware fallback isn't a nice-to-have anymore. It's resilience.
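What "multi-LLM routing with task-aware fallback" plus real exponential backoff looks like can be sketched in a few lines of Python. Everything here — the chain names, the `RateLimited` stand-in, the `call_model` stub — is hypothetical scaffolding to show the shape, not any provider's actual API:

```python
import random
import time

# Hypothetical per-task fallback chains, ordered by preference.
# The model names are placeholders, not real product identifiers.
FALLBACK_CHAINS = {
    "coding": ["frontier-coder", "mid-tier-coder", "local-small"],
    "summary": ["mid-tier-coder", "local-small"],
}

class RateLimited(Exception):
    """Stand-in for a provider's 429 / overloaded error."""

def call_model(model: str, prompt: str) -> str:
    # Replace with your real provider SDK call.
    raise RateLimited(model)

def complete(task: str, prompt: str, call=call_model,
             max_retries: int = 3, base_delay: float = 1.0) -> str:
    """Walk the task's chain: exponential backoff with jitter on
    rate limits, then fall through to the next (cheaper) model."""
    for model in FALLBACK_CHAINS[task]:
        for attempt in range(max_retries):
            try:
                return call(model, prompt)
            except RateLimited:
                # 1s, 2s, 4s... plus jitter so clients don't retry in lockstep.
                time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)
        # Retries exhausted: drop down the chain instead of failing the user.
    raise RuntimeError(f"all models exhausted for task {task!r}")
```

The point of the structure is that degradation is a designed path, not an exception handler bolted on later: when the frontier model throttles you at peak hours, the request lands on a cheaper model instead of on your user.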
Honest admission: I drafted most of this article in a Claude Code session. Hit a rate limit two thirds of the way through, swapped to a smaller model to finish a section, then came back to Opus to polish. The three ingredients aren't theoretical for me. They're operational. The bumps that landed Wednesday will help. They will not solve it.
The Bigger Question
Two years ago, the bottleneck was can the model do it?
One year ago, it was can humans verify what the model produces?
In May 2026, the bottleneck is which lab has access to enough electricity? — and the answer, increasingly, is that the labs trade favors with each other across grudges most of us would consider unforgivable.
That's not a temporary truce. It's the new shape of the industry. If you're building on AI, you're not picking a vendor — you're picking an alliance, in a market where alliances reshuffle every quarter and rate limits move faster than your architecture diagrams.
Press releases don't stage cease-fires this fast unless something underneath is on fire. In 2026, the fire is electricity, and we're all standing close enough to feel the heat.