The Great Deflation: A Strategy for the AI Era


The regime has changed.

Since 2023, we've watched Google cut over 12,000 jobs [1], Meta drop 21,000 [2], and Amazon trim 27,000 — with another 14,000 announced in late 2025 [3]. But this isn't just tech. McKinsey is cutting 10% of its workforce — roughly 5,000 roles [4]. PwC eliminated 3,300 positions in under a year [5]. Goldman Sachs and Morgan Stanley are trimming thousands more [6]. The official line is "efficiency." The unofficial line is that large language models are starting to do what junior analysts, consultants, and knowledge workers used to do — at a fraction of the cost.

This article isn't about whether AI will "take your job." That framing is too binary, too emotional, and frankly, not useful. This is a risk assessment. If you get paid to think — whether you're an engineer, lawyer, analyst, designer, writer, or consultant — you are holding an asset. That asset is your skillset. And the market dynamics around that asset are changing.

The goal here is simple: understand what's changing, identify where value is migrating, and build a strategy for the next decade.

The Economics of the Drop

For all of human history, "intelligence" — the ability to reason, analyze, write, and produce knowledge — has been a scarce asset. Scarcity meant pricing power. A senior engineer could command $300/hour because there weren't that many people who could do what they did. The same applied to lawyers, doctors, consultants, and anyone whose value came from cognitive work.

We are entering a period where the supply of intelligence is becoming functionally infinite.

📝

This isn't hyperbole. When frontier models can produce working code, coherent prose, and reasonable analysis at a fraction of a cent per query, the supply curve has fundamentally shifted.

The Subsidized Reality

Here's something most people miss: we are currently getting intelligence below cost.

AI companies are running the classic Silicon Valley playbook — achieve market penetration at any price, capture users and data, figure out profitability later. OpenAI, Anthropic, and Google are burning billions in compute costs to offer you $20/month subscriptions. The price you pay today does not reflect the true cost of the service.

This is a deferred cost. When the VC money runs dry or shareholders demand profitability, prices will rise. We're already seeing it: API prices are creeping up and usage limits are tightening.

The Long-Term Trend

But here's the thing: even when subsidies end, the long-term price trajectory is still down.

Four forces guarantee this:

  1. Model Distillation: Smaller models are being trained to replicate the outputs of larger ones. What required GPT-4 in 2023 now runs on consumer hardware. What requires GPT-5 today will follow the same path (a minimal sketch of the mechanism follows this list).

  2. Hardware Efficiency: Moore's Law applied to GPUs. Every generation of chips does more inference per dollar.

  3. Reinforcement Learning at Scale: In domains with verifiable solutions — mathematics, programming, formal logic — more compute means higher quality training data. Models can generate solutions, verify correctness, and learn from the results. This creates a flywheel: better models produce better synthetic data, which trains even better models.

  4. Algorithmic Innovation: The transformer architecture isn't the end of the road. New architectures, training methods, and algorithmic tricks will continue to reduce the compute required per unit of intelligence. We're still in the early innings of understanding how to efficiently model cognition.
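
To make the first force concrete, here's a minimal distillation sketch in PyTorch. The teacher, student, and token batch are placeholders rather than a real training setup; the point is only that the small model learns to imitate the big model's output distribution.

```python
# Minimal knowledge-distillation sketch (illustrative, not a real training run).
# `teacher` and `student` are placeholder nn.Module language models that
# return logits over the vocabulary; `tokens` is a placeholder input batch.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Push the student's token distribution toward the teacher's softened one."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence, scaled by T^2 per the standard distillation recipe
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * temperature ** 2

def train_step(student, teacher, tokens, optimizer):
    with torch.no_grad():
        teacher_logits = teacher(tokens)   # large model: expensive, but run offline
    student_logits = student(tokens)       # small model: cheap to serve afterwards
    loss = distillation_loss(student_logits, teacher_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Only the student has to run at inference time, which is why yesterday's frontier capability keeps sliding down to consumer hardware.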

The cost per token has dropped by roughly 10x between 2023 and 2026. This trend continues.

The implication: If your job is purely outputting syntax — writing boilerplate code, generating standard documents, producing routine analysis — you are holding a depreciating asset.

The Coherence Gap: Why Humans Are Still SOTA

Now for the part that AI evangelists don't want to talk about.

Models have fundamental limitations that aren't going away with scale. These aren't bugs to be fixed — they're architectural constraints.

Failure Mode 1: No Introspection

Humans are relatively good at recognizing when they don't know something. A junior analyst says "I'm not sure, let me check with someone senior." A doctor orders additional tests when symptoms don't fit the pattern. This escalation pattern is how organizations and professions manage risk.

Models do not have this functionality.

When a model lacks knowledge or expertise, it doesn't escalate. It doesn't flag uncertainty. It produces an answer with the same confidence it produces everything else. The term for this is confidently wrong.

⚠️

In high-stakes environments — finance, medicine, infrastructure — the cost of verifying model output can exceed the cost of doing the work yourself. The model's speed advantage disappears when you factor in the verification overhead.
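
To make that concrete, here's a back-of-the-envelope sketch. Every number below is a made-up assumption; the structure of the math is the point.

```python
# Back-of-the-envelope: when does AI drafting actually save time?
# Every number below is an illustrative assumption, not a measurement.
t_manual = 60     # minutes to do the task yourself
t_draft = 5       # minutes for the model to produce a draft
t_verify = 40     # minutes to carefully review the draft in a high-stakes setting
p_reject = 0.3    # chance the draft fails review and you redo the work by hand

expected_with_model = t_draft + t_verify + p_reject * t_manual
print(expected_with_model)  # 5 + 40 + 0.3 * 60 = 63 minutes, worse than the 60 doing it yourself
```

Shrink the verification cost and the model wins easily; in domains where verification is expensive, the ledger flips.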

Failure Mode 2: Context Decay

Consider your career as a long-context task. You're maintaining trajectory over years, sometimes decades. You remember why a decision was made three years ago. You understand the political dynamics of your organization. You hold context that isn't written down anywhere.

Models lose coherence over long contexts. This isn't just about information falling outside the context window; even within it, performance degrades as tasks grow longer. They lose the thread.

Strategic coherence over a six-month project, let alone a career, is something models fundamentally cannot maintain.

Failure Mode 3: Irregular Intelligence

Model performance is inconsistent. The same prompt can yield dramatically different quality outputs. You might get a brilliant solution on one attempt and a mediocre one on the next.

This variance makes models unreliable for tasks where consistent quality matters. You can't build a system on a component that works 80% of the time.
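
Here's why 80% is worse than it sounds: reliability compounds across steps, so any multi-step workflow built on an 80%-reliable component degrades fast. The 0.8 below is an illustrative number, not a benchmark.

```python
# Per-step reliability compounds across a multi-step workflow.
p_step = 0.8
for n_steps in (1, 3, 5, 10):
    print(f"{n_steps} steps -> {p_step ** n_steps:.2f} chance of end-to-end success")
# 1 step -> 0.80, 3 -> 0.51, 5 -> 0.33, 10 -> 0.11
```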

Failure Mode 4: Regression to the Mean

LLMs are probabilistic engines designed to predict the most likely next token. By definition, they gravitate toward the median of their training data.

In most fields, value is often found in the outliers — the non-consensus view, the edge case, the novel approach. Because models are trained to maximize token probability, they inherently bias toward consensus. Ask for a business strategy, a diagnosis, or a system architecture, and you'll get the average of everything the model has seen.

Models can produce B+ work instantly. But they structurally struggle to produce A+ work, because A+ work is, by definition, an outlier that deviates from the training distribution.
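
A toy example of why the objective itself pulls toward consensus. The probabilities below are invented, but the mechanics are real: greedy decoding always returns the most likely continuation, and sampling surfaces the outlier only as often as the training distribution allows.

```python
import random

# Invented next-token distribution for "Our strategy should focus on ..."
next_token_probs = {
    "the enterprise segment": 0.40,    # the consensus answer
    "mid-market customers": 0.30,
    "channel partnerships": 0.20,
    "an unfashionable niche": 0.10,    # the outlier that might actually win
}

# Greedy decoding: always the highest-probability token, i.e., the B+ consensus.
greedy = max(next_token_probs, key=next_token_probs.get)

# Sampling: the outlier appears only ~10% of the time, and never on demand.
sampled = random.choices(list(next_token_probs),
                         weights=list(next_token_probs.values()))[0]

print(greedy, "|", sampled)
```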

Failure Mode 5: The Interpolation Trap

Neural networks excel at interpolation — connecting dots within the cloud of data they were trained on. They're terrible at extrapolation — predicting what happens outside that cloud.

The real world generates events with no historical precedent: a global pandemic, a novel cyberattack, an unprecedented regulatory shift. When a Black Swan event occurs — a situation completely outside the training distribution — models don't just fail; they fail catastrophically and unpredictably.

A human can use first-principles reasoning to navigate a novel crisis. A model tries to map the crisis to the closest pattern it knows, which is often the wrong map entirely.

Failure Mode 6: Sycophancy

Through Reinforcement Learning from Human Feedback (RLHF), models are fine-tuned to be helpful and agreeable. They will often agree with a false premise, or hallucinate details that support it, just to align with the user's prompt.

In a risk assessment or code review, you need a critic, not a cheerleader. If you inadvertently ask a leading question or present a flawed premise, the model will often fabricate supporting evidence rather than correct you. It prioritizes alignment over truth.

⚠️

A tool that validates your bad ideas is more dangerous than a tool that offers no ideas at all.

Where Value Remains

If generation is cheap, what's expensive?

1. Liability (The Signature)

AI cannot be sued. AI cannot go to jail. AI cannot sign off on a building design, a financial audit, or a medical diagnosis.

In regulated industries — finance, healthcare, engineering, law — the value is increasingly in taking legal ownership of output, not generating it. Someone has to put their name on the line. Someone has to be accountable when things go wrong.

The model can draft the document. A human must sign it.

2. Creativity (The Outlier)

This ties directly to Failure Mode 4. Models regress to the mean — they produce the weighted average of their training data. By definition, they cannot reliably generate outliers.

But outliers are where value lives. The breakthrough scientific hypothesis. The novel legal argument. The architectural innovation. The art that moves people. These require deviation from the norm, not convergence toward it.

True creativity isn't about generating variations on existing patterns — models do that well. It's about recognizing when the existing patterns are wrong and proposing something fundamentally different. That requires the kind of judgment, taste, and willingness to be wrong that probabilistic systems structurally lack.

In any field where the goal is to solve truly complex problems or produce work that stands out, human creativity remains the bottleneck — and the value.

3. The Physical and Social Barrier

Some tasks depend on physical presence and social nuance that models can't handle:

  • Empathy: Understanding what someone actually needs versus what they're asking for
  • Ethics: Navigating situations where the "right" answer depends on values, not logic
  • Physical interaction: Anything requiring presence, touch, real-time adaptation to physical environments

Even as humanoid robots scale, the bandwidth of human social interaction remains difficult to replicate.

The Strategy

Don't fight the trend. The economics are too compelling. Every company that can reduce headcount by 20% while maintaining output will do so.

The question isn't whether to adopt AI. It's how to position yourself in a world where AI is ubiquitous.

Sovereignty: Own Your Stack

There's a risk in centralized AI that few people discuss.

When a handful of companies control the models, they control the ethics filters, the pricing, the access. They can change terms of service. They can raise prices once you're dependent. They can decide what questions you're allowed to ask.

The solution: run local.

Model distillation has made it possible to run capable models on consumer hardware. An RTX 4090 or 5090 can run models that would have required a data center three years ago. The weights are open. The data stays on your machine.
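
As one possible starting point, here is a minimal local-inference sketch using llama-cpp-python with a quantized GGUF checkpoint. The model file, path, and settings are placeholders; treat this as an illustration of the shape of a local setup, not a recommendation of a specific stack.

```python
# Minimal local-inference sketch (pip install llama-cpp-python).
# The GGUF path is a placeholder for whatever open-weights model you download.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/your-model.Q4_K_M.gguf",  # placeholder path to a quantized checkpoint
    n_gpu_layers=-1,   # offload all layers to the local GPU
    n_ctx=8192,        # context window to allocate
)

out = llm("List the trade-offs of running inference locally.", max_tokens=256)
print(out["choices"][0]["text"])
```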

💡

Own the weights, own the data. This is sovereignty in the AI era. I'll write more about my local setup in a future post.

The Pivot: Augment and Specialize

The strategy has two parts.

First, use AI to multiply your output. Don't resist the tools — master them. Every hour you spend learning to effectively prompt, chain, and verify model outputs is an hour invested in your own productivity. The knowledge workers who thrive will be those who produce 10x the output by treating AI as leverage, not those who refuse to engage.
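
One concrete shape of "AI as leverage" is a generate-then-verify loop: the model drafts, a mechanical check you control decides whether the draft ships, and a human gets pulled in only on failure. The `call_model` function and the test command below are placeholders for whatever model and test suite you actually use.

```python
# Sketch of a generate-then-verify loop. `call_model` is a placeholder
# for whatever API or local model you plug in; `pytest tests/` stands in
# for any mechanical check you trust.
import subprocess

def call_model(prompt: str) -> str:
    raise NotImplementedError("plug in your model of choice")

def generate_and_verify(task: str, max_attempts: int = 3) -> str | None:
    for _ in range(max_attempts):
        draft = call_model(f"Write a Python function that {task}. Return only code.")
        with open("candidate.py", "w") as f:
            f.write(draft)
        # Verification is cheap and mechanical: run the tests you wrote yourself.
        result = subprocess.run(["pytest", "tests/"], capture_output=True)
        if result.returncode == 0:
            return draft    # verified draft: ship it
    return None             # escalate to a human; the loop decides, not the model
```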

Second, specialize in what's hard to tokenize. Move toward areas where models structurally struggle:

  • Liability: Roles where someone must sign off, take legal ownership, and be accountable. The model drafts; you decide.
  • Creativity: Work that requires genuine novelty — not variations on patterns, but recognizing when the patterns themselves are wrong.
  • Deep Domain Knowledge: Expertise built over years that exists in long-context memory, tacit understanding, and proprietary experience. Regulatory edge cases. Clinical intuition. Organizational politics. Industry dynamics that never made it into training data.

The goal is to become the human in the loop that the system cannot remove — not because of gatekeeping, but because the value you provide is genuinely difficult to replicate.

The Owner Mindset

Here's the philosophical shift:

For most of the 20th century, the deal was clear. You sell your time and expertise to an organization. They pay you a wage. The organization captures most of the value, but you get stability.

That deal is breaking down.

When the cost of intelligence trends toward zero, organizations need fewer people. The stability disappears. What remains is ownership.

The question becomes: are you a laborer selling time, or an owner capturing value?

Tools are available. Models are accessible. Running inference locally is increasingly viable. The same capabilities that let companies reduce headcount can let individuals produce more.

The path forward isn't to compete with AI on its terms — raw output, speed, cost. It's to use AI as leverage while focusing on the things it can't do: judgment, liability, long-term coherence, and the messy human dynamics that no model can navigate.


Footnotes

  1. Google laid off 12,000 in January 2023, with ongoing cuts through 2024-2025.

  2. Meta cut 11,000 in November 2022 and another 10,000 in March 2023.

  3. Amazon cut 27,000 in 2022-2023, then announced 14,000 more in October 2025.

  4. McKinsey cutting up to 10% of workforce, reducing headcount from 45,000 to approximately 40,000.

  5. PwC laid off 1,800 in September 2024 and another 1,500 in May 2025.

  6. Goldman Sachs, Morgan Stanley, and JPMorgan planning thousands of cuts in 2025, with JPMorgan targeting 10% reduction in operations staff over five years.

Thanks for reading! Connect with me on LinkedIn or check out my GitHub.