This isn't about whether AI is capable. It clearly is. What's less certain is whether it can stay as accessible as it is today.
That question comes down to money.
The economics don't add up. Yet.
It's an open secret that Anthropic, OpenAI, and the rest are losing enormous amounts of money on compute. The subscription price you pay for Claude or ChatGPT covers a fraction of what it actually costs to run those models. The GPUs are expensive. Very expensive. The numbers bear this out: OpenAI's inference costs hit $8.4 billion in 2024, growing fourfold in a single year, while Anthropic's are projected to reach $2.7 billion in 2025, up more than threefold. Both companies have missed their own gross margin targets significantly. Anthropic's gross margin was a staggering negative 94% in 2024.
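To make that last figure concrete, here is a minimal sketch of what a negative gross margin implies. It assumes the standard definition (revenue minus cost of revenue, divided by revenue). The function name and the normalised revenue of 1.00 are illustrative, not figures from either company's accounts.

```python
def cost_per_unit_revenue(gross_margin: float) -> float:
    """Cost of revenue incurred per 1.00 of revenue.

    Assumes the standard definition:
        gross_margin = (revenue - cost) / revenue
    which rearranges to:
        cost / revenue = 1 - gross_margin
    """
    return 1.0 - gross_margin


# The -94% gross margin quoted above, expressed as a fraction.
spend = cost_per_unit_revenue(-0.94)
print(f"Spend per 1.00 of revenue: {spend:.2f}")
```

Under that definition, a margin of negative 94% means spending roughly 1.94 on serving the product for every 1.00 it brings in, before any other costs are counted.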
This isn't unusual as a business model: sacrifice margin early, acquire users fast, worry about profit later. It works fine as long as your VC or PE backers have patience and there's a credible path to making money. There are really only two ways to get there:
- Economies of scale. As users pile on, variable costs come down, or the fixed costs become viable simply because enough people are splitting them.
- Lock-in pricing. Once people are hooked, raise the price and bet that inertia keeps them subscribed. Most people will grumble. Most won't cancel. This tends to happen once the company believes it has hit (or is approaching) market saturation, i.e. anyone who was going to subscribe already has.
Given long enough, the second one always happens. You can see it with Netflix, Xbox Game Pass, and a dozen others. It's practically mandatory for any publicly listed company once growth starts to plateau: shareholders stop rewarding user numbers and start demanding margin.
Three ways this plays out for AI
For AI specifically, I think there are three paths forward, and the most likely outcome is some combination of all three.
1. The models get smaller and smarter
I like the adapted phrase from MKBHD: "good models are getting small, and small models are getting good". The labs are betting heavily that they can compress their largest models into a much smaller footprint without losing the capabilities that matter. There's also interesting work at the architecture level, like multi-token prediction, which could dramatically improve the efficiency of running large models. If this works, inference gets cheaper, and prices can stay sane.
2. Chips get cheaper
In theory, falling hardware costs help everyone. Nvidia is one of the most valuable companies on earth. Micron has pivoted almost entirely to enterprise. That could mean a major ramp-up in production to satiate demand and bring prices down. In practice, I'm sceptical this rescues the economics in any meaningful timeframe. This is the weakest of the three levers, and I wouldn't build a business plan around it.
3. Prices go up - significantly
This is the most predictable outcome, and the one that most people aren't quite ready for. Unlike Netflix hiking from £10.99 to £13.99, AI companies may need to raise prices by 200–300% to close the gap with their actual costs. Alternatively, and this feels more likely, they'll continue releasing new flagship models that are a few percentage points better than the last one, but lock them behind a £250+/month tier, while the current-generation models get smaller, cheaper, and good enough for most use cases.
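For a sense of scale, the arithmetic can be sketched as below. The Netflix prices are the ones quoted above. The 20-a-month starting price is purely a hypothetical chosen for illustration, not any provider's actual tariff, and the function names are my own.

```python
def percent_increase(old: float, new: float) -> float:
    """Percentage rise going from an old price to a new price."""
    return (new - old) / old * 100.0


def price_after_increase(old: float, percent: float) -> float:
    """New price after raising the old price by the given percentage."""
    return old * (1.0 + percent / 100.0)


# The Netflix-style hike quoted above: 10.99 -> 13.99.
netflix_rise = percent_increase(10.99, 13.99)
print(f"Netflix-style hike: {netflix_rise:.1f}%")

# HYPOTHETICAL starting price of GBP 20/month, for illustration only.
hypothetical_price = 20.0
for rise in (200, 300):
    new_price = price_after_increase(hypothetical_price, rise)
    print(f"{rise}% rise on GBP {hypothetical_price:.2f}: GBP {new_price:.2f}")
```

The Netflix hike works out to roughly 27%. A 200–300% rise is a different order of magnitude: it triples or quadruples the starting price.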
What actually happens
My best guess: a bit of everything above, and it's not as bleak as it sounds.
Small models are improving at a pace that's genuinely awesome. Qwen, Gemma, Kimi, and others are closing in fast. Within a year or two, a 120–250 billion parameter model will probably match what Claude Opus 4.7 can do today, at a fraction of the cost to run. The frontier labs will keep pushing ahead, but their most powerful models will increasingly sit behind partner agreements or restricted access programmes. The Anthropic Glasswing / Claude Mythos situation is a preview of this: a model that is alleged to be such a step-change that it cannot be released to the public, available only to a curated group of enterprise partners. That pattern will repeat.
The good news for developers, practitioners, and businesses is that the tools for using AI effectively have matured just as fast as the models themselves, if not faster. Better context management in coding harnesses like Claude Code, Pi, and OpenHands, combined with techniques like the Advisor Strategy, sub-agent architectures, and multi-agent orchestration, means you can go surprisingly far with a small, cheap model. I have personally seen great success with Qwen3.6 27B running locally on well-scoped coding tasks.
The sky isn't falling. AI will still be part of the daily workflow. It'll just cost more for the very top tier, and the value will increasingly live in how smartly you use whatever model you have access to.