The Quiet Failure: Block's World Model Manifesto and the Line AI Can't Cross
When Zappos adopted Holacracy and eliminated management titles in 2015, roughly 29% of the workforce left. When Valve tried its famous flat structure, employees compared it to "Lord of the Flies" — informal power hierarchies replacing visible ones with zero accountability. These failures were loud. Everyone saw them break.
The world model failure is different.
The system flags a seasonal revenue dip as a crisis — and the prioritization shift happens before anyone questions the signal. It surfaces a correlation between a feature launch and a churn spike when the real cause was a billing change — and the team kills the wrong feature. The system drifts. It stops surfacing certain signals. Decisions get made on incomplete pictures. When the results are bad, the postmortem blames "market conditions." As technologist Nate Jones puts it: "The most dangerous version of a world model is the one that works well enough that nobody questions it until the decision quality degrades."
On March 31, 2026, Block published From Hierarchy to Intelligence — the most ambitious manifesto anyone has written for replacing middle management with AI. Within weeks, a technical lead named Richard Hesse was threatening to quit to get his infrastructure team rehired, and Block was admitting that some of the layoff cuts had been "clerical errors."
The manifesto is fascinating. The gap between the manifesto and the correction is the story worth examining.
What Dorsey Gets Right
Before the critique, credit where it's due — and it's more credit than most critics are giving.
The historical arc in From Hierarchy to Intelligence is genuinely excellent. Dorsey and co-author Roelof Botha trace organizational hierarchy from the Roman contubernium — small units of eight soldiers with a strict span of control — through the Prussian General Staff's invention of middle management as a professional class (explicitly designed to "support incompetent generals"), the American railroads where West Point engineers imported military org structures into business, Frederick Taylor's scientific management pyramid, the Manhattan Project's cross-functional coordination under wartime pressure, and McKinsey's matrix organization in 1959.
The argument that emerges is elegant: hierarchy was never a natural law of organizations. It was an information-routing technology — the best one available for two thousand years. As Dorsey and Botha write: "For two thousand years, from the Roman contubernium to today's global enterprises, we have had no real alternative."
And then: "The question was never whether you needed layers. The question was whether humans were the only option for what those layers do. They aren't anymore."
This is the most compelling reframe of organizational structure I've seen. Previous flat-org advocates — Zappos, Valve, the agile movement — all struggled because they removed the coordination mechanism without replacing it. Dorsey's argument is that AI is the replacement. The coordination doesn't disappear; it shifts from humans to a continuously updated model of the company.
The four-layer architecture Block proposes — Capabilities (atomic financial primitives), World Model (how the company understands itself and its customers), Intelligence Layer (composing capabilities into solutions), and Interfaces (Square, Cash App, Afterpay) — is architecturally coherent. And Block's signal advantage is real. As Dorsey puts it: "Money is the most honest signal in the world." A fintech company sitting on transaction data from both the buyer side (Cash App) and the seller side (Square) has a richer foundation for a world model than almost anyone.
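For concreteness, here is one way those four layers might be expressed as interfaces. A minimal sketch: the layer names come from the manifesto, but every method signature below is my assumption, included only to show the dependency direction, not Block's actual design.

```python
from typing import Protocol

class Capability(Protocol):
    """Atomic financial primitive (e.g., issue a refund, move funds)."""
    def execute(self, request: dict) -> dict: ...

class WorldModel(Protocol):
    """How the company understands itself and its customers."""
    def query(self, question: str) -> dict: ...

class IntelligenceLayer(Protocol):
    """Composes capabilities into solutions, informed by the world model."""
    def plan(self, goal: str, world: WorldModel) -> list[Capability]: ...

class Interface(Protocol):
    """User-facing surface (Square, Cash App, Afterpay)."""
    def serve(self, user_request: str) -> dict: ...
```

Note what even this skeletal version makes visible: the world model sits under everything, so an error in how it understands the company propagates upward into every plan and every interface.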
The vision of "a company built as an intelligence (or mini-AGI)" is audacious. It's also directionally correct about a significant portion of what management currently does. If I were evaluating this purely as a technical architecture document, I'd give it high marks. The historical framing is better than anything else I've read on the subject.
But architecture documents don't lay people off. Organizational decisions do. And those decisions depend on a number the manifesto never confronts.
The 60/40 Split
Here's the number the manifesto never addresses directly.
Managers spend 60–70% of their time on administrative logistics: status syncs, information routing, report generation, scheduling, approvals, compiling updates for their own managers. This is the work Dorsey's manifesto targets. And he's right — most of it can be automated. It probably should be.
The remaining 30–40% is judgment work. Mentoring a struggling engineer through a career crisis. Negotiating priorities between two teams that both think they should go first. Reading the room in a meeting and realizing the loudest objection isn't the real objection. Knowing that one name on a layoff list means losing the only person who understands the infrastructure that keeps three products running. These aren't information-routing tasks. They're interpretation tasks — and they require context that no model currently captures.
The manifesto treats management as a monolithic function. It replaces "the hierarchy" without distinguishing between the logistics layer (automate it) and the judgment layer (can't automate it — at least not yet). The word "judgment" barely appears in the document. The word "mentoring" doesn't appear at all.
Neil Thompson, a researcher at MIT, framed the distinction precisely: "If part of your job gets automated, and it's something that really didn't use the expertise that you needed, that's great. You get to spend more of your time on the part of your job that is really valuable." The implication cuts the other way, too: the payoff only arrives if the expertise parts survive. Block's manifesto automates the logistics and assumes the judgment will simply find new footing at "the edge." It doesn't design for that transition.
The broader trend makes this urgent. The average American manager now oversees 12 direct reports — nearly double the figure Gallup recorded when it began tracking in 2013. Meta is running a 50:1 ratio in its applied AI engineering division. The "megamanager" trend is happening regardless of AI — and it's already straining the judgment layer. Only 27% of managers globally feel enthusiastic about their work, and most have never received formal training. The system is already broken. The question isn't whether to change it, but how.
Josh Bersin, who studied 70+ companies implementing AI productivity tools, found a pattern: "If you look at the 10-100X transformations from AI, none of them start with job displacement. They all start with business re-engineering."
Block started with displacement.
The Rehiring Problem
The 60/40 split is a useful framework. But frameworks are abstractions. What does it look like when the judgment layer actually fails?
Block gave us a case study within weeks.
In February 2026, Block cut over 4,000 employees — roughly half its workforce. Dorsey explicitly tied the layoffs to AI productivity gains, and the stock rose nearly 18% the day of the announcement as investors bet on the efficiency story.
Then the corrections started.
Richard Hesse, a Block technical lead, threatened to quit to get his infrastructure team rehired. Block admitted some of the cuts were "clerical errors." Let that land. The company that just published the most sophisticated argument for replacing human judgment with AI systems described the failure of its own decision-making process as a clerical error.
If the world model actually understood which people were essential — if it carried an accurate model of who does what and which knowledge is irreplaceable — a technical lead wouldn't have needed to threaten to quit. The gap between the manifesto's vision of "a continuously updated model of the company" and this specific failure is the interpretive boundary in action. The system got the logistics right (here are the headcount numbers) and the judgment wrong (here's who you can't afford to lose).
The employees filled in the rest. Seven Block workers, speaking anonymously, gave detailed accounts to The Guardian disputing the company's AI capability claims. One employee — whose job was specifically to help others use AI tools — reported that roughly 95% of AI-generated code at Block still needed human fixes. Another said: "Everyone that I know that's still there has a ton of dread because they just realized their workload has quadrupled or 10xed and AI is not going to fix it."
The financial context complicates the AI narrative further. Block had grown from 4,000 to 13,000 employees during the pandemic hiring spree between 2019 and 2022. The stock had fallen 40% before the layoff announcement. Bloomberg called it AI-washing. Mizuho Americas analyst Dan Dolev said bluntly: "The vast majority of these cuts were probably not due to AI."
This is the quiet failure in miniature. The system — or in this case, the organizational narrative built around the system — presented the layoff plan with structured confidence. Senior leaders acted on it. The failure was invisible until someone with enough leverage made it visible. Most people don't have Richard Hesse's leverage.
The research supports the concern. McKinsey partners warned in Harvard Business Review that cutting middle management hastily "can be a costly mistake" — middle managers are "the glue that holds teams and enterprises together." Forrester predicts that 50% of AI-attributed layoffs industry-wide will be quietly rehired.
The rehiring problem isn't unique to Block. It's the predictable consequence of building organizational AI without solving for a specific design challenge — one that has a name.
The Interpretive Boundary
There's a name for what Block's manifesto is missing: the interpretive boundary.
Every AI system in an organization is implicitly distinguishing between two types of output. The first is "act on this" — factual, verified, low-risk conclusions. Status rollups. Threshold alerts. Routine approvals. The second is "interpret this first" — judgment calls the system isn't equipped to make. Trends requiring context. Correlations that might be causation. Prioritization decisions with political dimensions. Personnel decisions with human consequences.
The question isn't whether your AI system is making this distinction. It is. The question is whether you've designed it explicitly.
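What designing it explicitly could look like, in a minimal sketch: classify every output before anything downstream consumes it. The fields, thresholds, and category names here are illustrative assumptions, not Block's architecture.

```python
from dataclasses import dataclass
from enum import Enum

class Boundary(Enum):
    ACT = "act_on_this"            # factual, verified, low-risk
    INTERPRET = "interpret_first"  # judgment call: route to a human

@dataclass
class ModelOutput:
    claim: str            # what the system is asserting
    confidence: float     # 0.0 to 1.0, from the model's own calibration
    reversible: bool      # can the action be cheaply undone?
    affects_people: bool  # personnel or career consequences?

def classify(output: ModelOutput) -> Boundary:
    """An explicit boundary policy: anything touching people, and anything
    irreversible without near-certainty, is 'interpret this first'."""
    if output.affects_people:
        return Boundary.INTERPRET
    if not output.reversible and output.confidence < 0.99:
        return Boundary.INTERPRET
    if output.confidence < 0.90:
        return Boundary.INTERPRET
    return Boundary.ACT

# A status rollup auto-applies; a layoff list never does.
print(classify(ModelOutput("weekly status rollup", 0.97, True, False)))  # Boundary.ACT
print(classify(ModelOutput("cut the infra team", 0.97, False, True)))    # Boundary.INTERPRET
```

The policy itself is trivial. What matters is that it exists as reviewable code instead of living implicitly in whatever the system happens to surface.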
Nate Jones's analysis of world model architectures identifies three approaches, each with its own blind spot:
The Vector Database approach is fast to deploy and adequate for information logistics — but semantic retrieval has no mechanism to distinguish surfacing from interpreting. Rankings become editorial judgments by default. At scale, "the ranking becomes a reality no one intended."
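A toy illustration of that blind spot, with invented signals and deliberately crude scoring: nothing in top-k retrieval marks the cutoff as an editorial choice, yet everything below it vanishes from the decision.

```python
# Toy retrieval: score by word overlap, return the top k. The point is not
# the scoring function; it is that k is an editorial judgment nobody wrote down.
signals = [
    "Q3 churn spike in EU market",
    "billing migration completed with errors",
    "seasonal revenue dip, matches last 3 years",
    "infra team reports pager load doubled",
]

def score(query: str, doc: str) -> int:
    return len(set(query.lower().split()) & set(doc.lower().split()))

query = "why did churn rise"
top_k = sorted(signals, key=lambda d: score(query, d), reverse=True)[:2]

# Two signals get surfaced; the other two are silently interpreted away.
print(top_k)
```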
The Structured Ontology approach (think Palantir) defines objects and relationships explicitly, prevents hallucinations, and enforces precision. But it can only represent what you've already categorized. It's "accurate about what it knows. Silent about what it doesn't know."
The Signal Fidelity approach (Block's model) is built on high-quality data, specifically transaction data. The signal is clean. But that cleanliness is precisely the vulnerability. As Jones puts it: "High signal fidelity at the input layer creates an illusion of high judgment quality at the output layer."
That last line is the key insight. Transaction data is clean. It's factual. It's "the most honest signal in the world," as Dorsey says. But the decisions built on that data — who to lay off, which products to kill, which markets to enter — are judgment calls wearing the costume of data. The cleaner the input, the more trustworthy the output looks, regardless of whether the interpretive leap was warranted.
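To see the costume in action, consider a deliberately naive sketch with invented numbers: the arithmetic over clean transaction data is exact, yet the ranking it produces smuggles in an unexamined judgment, namely that revenue per head equals value.

```python
# Revenue per head, computed from perfectly clean transaction data.
# Every number below is exact; the conclusion drawn from them is not.
teams = {
    "payments-core":  {"revenue": 42_000_000, "headcount": 38},
    "infrastructure": {"revenue": 0,          "headcount": 6},   # no direct revenue
    "growth":         {"revenue": 9_500_000,  "headcount": 21},
}

ranked = sorted(teams, key=lambda t: teams[t]["revenue"] / teams[t]["headcount"])

# The "least productive" team is the one every other team depends on.
# The input was high-fidelity; the interpretive leap was never flagged,
# because nothing in the pipeline marks it as a judgment.
print(ranked[0])  # infrastructure
```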
Richard Teachout's analysis of decision boundaries in AI systems makes this explicit: "Boundaries encode risk tolerance, regulatory posture, and organizational accountability. They reflect where humans must remain in the loop, not because machines are incapable, but because consequences extend beyond the model's view."
This is what's missing from Block's manifesto. The interpretive boundary isn't an emergent property of good architecture. It's an explicit design element — one that must be mapped, maintained, and enforced. Without it, you get multi-agent AI systems that fail through "silent failure states" — agents optimizing for subgoals that don't align with the overarching mission. The same pattern plays out at organizational scale. The world model optimizes for efficiency. The humans who would have caught the error are gone. The system looks like it's working.
That's the quiet failure.
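Part of the antidote is mechanical: make the boundary load-bearing rather than advisory. A sketch, with an assumed sign-off convention and invented names:

```python
from typing import Callable, Optional

class InterpretFirst(Exception):
    """Raised when an action crosses the interpretive boundary without sign-off."""

def execute(action: Callable[[], None], *, boundary: str,
            human_signoff: Optional[str] = None) -> None:
    """Refuse to auto-execute judgment calls. Recording who signed off
    gives accountability somewhere concrete to live."""
    if boundary == "interpret_first" and human_signoff is None:
        raise InterpretFirst("judgment call: route to a human, do not auto-execute")
    action()
    print(f"executed (sign-off: {human_signoff or 'auto'})")

execute(lambda: None, boundary="act_on_this")                              # runs unattended
execute(lambda: None, boundary="interpret_first", human_signoff="vp-eng")  # runs, and is attributable
# execute(lambda: None, boundary="interpret_first")                        # raises InterpretFirst
```

The design choice worth copying is the default: the system's answer to a judgment call is "not without a named human," which is exactly the default the February layoff list lacked.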
The Counter-Model
So what does an organization look like when it gets the interpretive boundary right?
DBS Bank in Singapore offers the clearest counter-example. By 2025, DBS had generated over S$1 billion in economic value from AI and data analytics, deployed more than 300 AI models, and reskilled all 40,000 employees. They didn't replace management — they transformed it, creating new oversight roles whose job is explicitly to sit at the interpretive boundary.
The DBS model takes the 60% logistics and automates it. Then it retrains the humans for the 40% judgment — plus new oversight roles that didn't exist before. This is augmentation, not replacement. And it's producing measurable results at scale.
Bersin's study of 70+ companies confirms the pattern. The real transformations start with business re-engineering, not job displacement. Block started from the wrong end.
If you're a CTO or VP of Engineering flattening your org right now, here are five design principles adapted from Jones's framework and the broader research:
- Map the boundary explicitly — Audit every AI output in your management stack. Is this "act on this" or "interpret this first"? If you haven't classified it, you've left the boundary to chance.
- Make uncertainty visible — Systems must communicate confidence levels, not just conclusions. A world model that presents a judgment call with the same visual weight as a verified fact is a system designed to produce quiet failures.
- Encode outcomes — As Jones argues: "A knowledge base might record what happened. A world model is supposed to record what happened, what was done about it, and what happened next." Build the feedback loop (sketched after this list). Without it, the model never learns from its own judgment errors.
- Design for resistance — Capture signal as a byproduct of work, not as a separate documentation burden. If keeping the world model accurate requires people to stop working and start documenting, they won't do it. The model will drift.
- Preserve the judgment layer — Automate the 60%. Invest in the 40%. Retrain managers for interpretation, not elimination. The DBS model works because it does both.
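Principles 2 and 3 are concrete enough to sketch together. A world-model entry that carries its own confidence and closes the outcome loop might look like this; the field names are my assumptions, not Jones's or Block's schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class WorldModelEntry:
    # What happened, with uncertainty attached rather than implied.
    observation: str
    confidence: float                    # always surfaced alongside the claim
    # What was done about it.
    action_taken: Optional[str] = None
    decided_by: Optional[str] = None     # a named human for judgment calls
    # What happened next: the part a plain knowledge base never records.
    outcome: Optional[str] = None
    outcome_date: Optional[date] = None

    def is_closed(self) -> bool:
        """An entry without an outcome is an open loop the model cannot learn from."""
        return self.outcome is not None

entry = WorldModelEntry(
    observation="churn spike correlates with feature launch",
    confidence=0.6,  # a correlation, not a verified cause; say so explicitly
)
entry.action_taken = "held the feature; audited the billing change"
entry.decided_by = "pm-lead"
entry.outcome = "churn traced to billing change, not the feature"
entry.outcome_date = date(2026, 5, 2)
assert entry.is_closed()
```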
If you haven't mapped the interpretive boundary, you're building the quiet failure.
The Unresolved Question
Dorsey, to his credit, acknowledges the risk: "Block is in the early stages of this transition. It will be a difficult one, and parts of it will likely break before they work."
That's intellectually honest. But "break before they work" hits differently when the things that break are people's careers and livelihoods. Richard Hesse's infrastructure team wasn't an abstraction. The employee who watched their workload quadruple while being told AI would handle it wasn't a theoretical concern. These are real costs, incurred at the interpretive boundary the manifesto never names.
And here's the question that keeps me up at night: When the quiet failure happens — when the world model confidently surfaces the wrong signal, and the team acts on it, and the damage is done — who is responsible? The system? The engineer who built it? The executive who replaced the manager who would have caught it?
We don't have an answer yet. The accountability frameworks haven't caught up to the architecture.
Dorsey's manifesto is the best case anyone has made for automating the 60%. The historical framework is genuinely brilliant. The four-layer architecture is technically coherent. The signal advantage is real.
But the most dangerous version of this future isn't the one where AI fails. It's the one where AI looks like it's succeeding — where the world model presents its findings with calm, structured confidence, and nobody with the leverage to question it is left in the room.
The question was never whether you needed layers. The question is whether you've designed the boundary between automation and judgment.
That boundary is the 40% Block's manifesto barely mentions. It's also the part that matters most.