In 2023, the cost of asking a machine a PhD-level science question and getting it right fell by roughly forty times in a single year.

By that logic, intelligence should be nearly free by now, and a great deal of human work should already be done by software. Neither is quite true.

Yet in May 2026, just as prices were supposed to be racing toward zero, GitHub (owned by Microsoft, the most committed corporate believer in this technology) quietly tore up the way it charges for its AI coding assistant, because the new generation of “agents” had become too expensive to give away. Some developers watched their projected bills rise ten to fifty times overnight.

So which is it? Is AI collapsing toward free, or quietly becoming a luxury? The honest answer — the one this piece is about — is both at once, and the gap between those two facts is where the promise of replacing human effort is currently stranded.

This is not an argument that AI is overhyped, nor that it is unstoppable. It is an attempt to stand exactly on the line between the two, with the numbers in hand, and ask where the line is heading.

A short history of the anxiety

The fear came before the capability. For seventy years, “artificial intelligence” was a promise that kept arriving late — bursts of optimism in the 1950s and 1980s, each followed by a “winter” when the machines failed to do what their makers swore they would.

Every wave carried the same headline: the machines are coming for the work. Every wave, the work mostly stayed. Hold onto that pattern. It is the reason a sceptic is right to raise an eyebrow now — and the reason the sceptic might, this time, be wrong. Because in 2017 something genuinely different happened, and it did not come from a prophecy. It came from a paper about translation.

The inflection: eight authors and one idea

In June 2017, eight researchers at Google published a paper with a title that sounds like a manifesto: “Attention Is All You Need.” It introduced the Transformer, a way of building neural networks that, instead of reading a sentence strictly left to right, looks at every word at once and learns which words matter to which.

It was written to improve machine translation. It became the foundation of nearly everything now called “AI.” Two small clarifications, because they recur in the story. First, the work came from Google Brain, which was then a separate lab from DeepMind; the two merged into Google DeepMind only in 2023. Second, the architecture that now powers OpenAI’s and Anthropic’s models was, at its root, Google’s idea, given away freely.
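For readers who want to see the mechanism, here is a minimal Python sketch of the attention step described above, in which every word is scored against every other word at once. It is a toy under stated assumptions: the two-number word vectors are made up for illustration, and the learned query, key, and value projections of a real Transformer are left out, so this shows the idea rather than the paper's implementation.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Score this word against every word in the sentence at once.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        w = softmax(scores)
        # Blend the value vectors according to how much each word matters.
        out = [sum(wj * v[i] for wj, v in zip(w, values))
               for i in range(len(values[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights

# Hypothetical 2-dimensional vectors for three words; real models learn
# these, and use hundreds or thousands of dimensions.
words = ["capital", "of", "France"]
vectors = [[1.0, 0.0], [0.1, 0.1], [0.9, 0.3]]

# Simplification: the same vectors serve as queries, keys, and values.
outputs, weights = attention(vectors, vectors, vectors)
for word, row in zip(words, weights):
    print(word, [round(w, 2) for w in row])
```

Each printed row shows how strongly one word attends to every word in the sentence, itself included. With these invented vectors, “capital” leans on “France” far more than on “of,” which is the kind of relationship a trained model learns for itself.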

So what is a model, really?

Strip away the vocabulary and a large AI model does one humble thing, very fast, a staggering number of times: it predicts the next piece. The example below shows how a model really reads a sentence.

What actually happens when AI answers. (Interactive walkthrough: pick a prompt, walk the four steps, then change the temperature and model size and sample the next word yourself.)

Prompt: The capital of France is

Tokens: The | capital | of | France | is

Your words are split into tokens, the chunks a model reads. (Real tokens are often word-fragments: “token” → “tok” + “en.”)

Mechanism per Vaswani et al. (2017); next-token prediction is the training objective of the GPT family.

That last idea is the quiet, important one. A model is not a mind that understands and then answers. It is an extraordinarily good guesser of what comes next, trained until its guesses are useful.
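A small Python sketch can make “guessing what comes next” concrete. It assumes a made-up table of scores for four candidate continuations of “The capital of France is” (a real model computes such scores over tens of thousands of tokens), turns them into probabilities, and samples one. The `temperature` parameter is the standard dial for how adventurous the guess is; the function names and numbers here are illustrative, not taken from any particular model.

```python
import math
import random

def next_token_probs(logits, temperature=1.0):
    """Softmax over raw scores; temperature must be greater than zero.

    Lower temperature sharpens the distribution toward the top guess,
    higher temperature flattens it.
    """
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def sample_next_token(logits, temperature=1.0, rng=random):
    """Draw one token at random, weighted by its probability."""
    probs = next_token_probs(logits, temperature)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

# Hypothetical scores for what follows "The capital of France is".
toy_logits = {" Paris": 6.0, " a": 2.5, " located": 2.0, " Lyon": 1.0}

for temperature in (0.5, 1.0, 2.0):
    probs = next_token_probs(toy_logits, temperature)
    print(temperature, {tok: round(p, 3) for tok, p in probs.items()})

print(sample_next_token(toy_logits, temperature=1.0))
```

At low temperature nearly all the probability sits on “ Paris”; at high temperature the alternatives get a real chance. Nothing in the procedure checks whether the answer is true. It only ranks what is likely to come next.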

Keep this in your pocket. Almost every argument that follows — about whether AI can truly replace human effort, about why it costs what it costs, about where it breaks — comes back to this one fact: we have built a machine that predicts, and we are asking it to understand.

The chain reaction, and the race that followed

A freely published paper is a strange kind of gift. Google gave away the Transformer, and within five years it had seeded its own rivals. OpenAI built GPT on it. Anthropic was founded by people who left OpenAI. Google’s own DeepMind raced to catch the products built on Google’s own idea. Each new model became an argument that the last one was obsolete — and the argument was conducted in benchmarks.

For a while the benchmarks told a thrilling story. Scores climbed; press releases followed. But look closely at where they sit in 2026 and the story turns into something more honest — and more interesting.

Who’s actually ahead? Pick a test. The same models look nearly tied on easy benchmarks and spread far apart on hard ones; switch the test and the gaps open and close. (Model names are illustrative near-future placeholders.)

What it measures: One overall score that blends ten hard tests into a single number. What a high score tells you: The closest thing to “how capable is it, overall.” The models bunch within a few points, and the race is tightening.

Overall score (higher is better):

Claude Opus 4.8: 61.4
GPT-5.5: 60.2
Gemini 3.1 Pro: 57
Grok 4.3: 53
DeepSeek V4: 51.5
Llama 5: 47

Spread between top and bottom: 14.4 points. A clear but modest spread between the leaders and the pack.

Curated snapshot in the style of the Artificial Analysis leaderboard, June 2026. Illustrative and dated; frontier rankings shift week to week. Model names follow the essay’s near-future framing.

Two truths sit inside that picture. The models are genuinely, remarkably capable — and they are converging, bunching together near the frontier rather than one runaway winner pulling away. That matters, because “AI will replace human effort” was sold as an escalator: capability rising without limit. What the benchmarks actually show in 2026 is a plateau forming at the top, with everyone arriving together. The race is real. It is also, increasingly, a race to the same place.

The promise meets the invoice

Here is the paradox this whole piece is named for. By one measure, intelligence is collapsing toward free. Epoch AI tracks not the price of a model but the price of a capability — what it costs to reach a fixed level of performance. By that measure, prices have fallen between nine and nine hundred times per year, depending on the task; after January 2024, a median of roughly two hundred times per year. Almost nothing in the history of technology has gotten cheaper this fast.

And yet. On 30 May 2026, GitHub — Microsoft’s, and the single most-used home for AI-assisted coding on Earth — abandoned flat-rate pricing for its Copilot assistant and switched to billing by the token. Developers who had paid a fixed monthly fee watched projected bills rise ten to fifty times. GitHub paused new sign-ups to stem the bleeding and pulled Anthropic’s most capable models from its cheaper tier entirely.

How can both be true? How does the price fall two hundred times a year while the bill rises fifty times overnight?

This is the precise location where the promise is stranded. “AI will replace human effort” assumed the cost of applying AI would fall as fast as the cost of a token. Instead, the moment we asked these systems to do real, multi-step human work — the autonomous, agentic work that replacement actually requires — their appetite for tokens grew faster than the per-token price could fall. A cheaper ingredient, used a thousand times over, is not a cheaper meal. The unit price is a headline. The invoice is the truth.
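The arithmetic behind that last point fits in a few lines of Python. Every number below is hypothetical, chosen only to show the shape of the problem: a per-token price that falls tenfold, set against an agentic task that consumes five hundred times more tokens than a single chat answer. The function name and parameters are illustrative too.

```python
def monthly_bill(price_per_million_usd, tokens_per_task, tasks_per_month):
    """Invoice in dollars: unit price times total tokens consumed."""
    total_tokens = tokens_per_task * tasks_per_month
    return price_per_million_usd * total_tokens / 1_000_000

# Hypothetical "before": one-shot chat answers at a higher token price.
chat = monthly_bill(price_per_million_usd=10.0,
                    tokens_per_task=2_000,
                    tasks_per_month=500)

# Hypothetical "after": tokens are 10x cheaper, but each agentic task
# loops through planning, tool calls, and retries, using 500x the tokens.
agent = monthly_bill(price_per_million_usd=1.0,
                     tokens_per_task=1_000_000,
                     tasks_per_month=500)

print(f"chat:  ${chat:,.2f} per month")
print(f"agent: ${agent:,.2f} per month")
print(f"bill multiplier: {agent / chat:.0f}x")
```

Under these assumed figures the unit price drops by a factor of ten and the invoice still rises fiftyfold. The bill follows the product of price and appetite, not the price alone.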

A view from outside the dollar

Every number so far has been quoted in US dollars. For most of the people on the planet, that is the catch hiding in plain sight. The frontier models are priced in dollars and, almost without exception, hosted in the United States. So the real cost of intelligence, for anyone who doesn’t earn in dollars, is not the sticker price — it’s the sticker price multiplied by an exchange rate, and divided by what a local wage can buy.

Take India: the rupee sat near ₹95 to the dollar in mid-2026, with forecasts pointing toward ₹100, having weakened roughly eleven percent in a single year. An API bill that looks merely uncomfortable in San Francisco is, after that exchange rate and against Indian wages, closer to prohibitive for a two-person startup. The comparison below puts numbers on it.

Cheaper than a person? Depends where you live. Pick a job and a country, then set how hard the AI works the problem. The token price is the same dollar everywhere; local wages are not.

A customer-support agent (India): ₹33,915/mo
AI doing the same month of work: ₹8,892/mo

AI does it for 26% of what a customer-support agent earns in India.

Same AI bill, every market, as a share of local pay:

Nigeria: 77%
Pakistan: 57%
Philippines: 34%
Indonesia: 32%
India: 26%
Brazil: 19%
Germany: 3.5%
United States: 2.6%

Human pay: US BLS OEWS (May 2024 medians), scaled by average wage relative to the US (Numbeo, 2025) as a wage-level proxy. AI cost = tokens × $1.5/1M (representative 2026 blend); tokens-per-task are illustrative estimates. Exchange rates: live via open.er-api.com (snapshot 2 Jun 2026).
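The comparison above reduces to one formula: tokens times the dollar price, converted at the exchange rate, divided by the local wage. The Python sketch below reproduces the India figures under stated assumptions. The $1.5 per million tokens and the ₹33,915 wage come from the figures quoted here; the ₹95 rate is the mid-2026 level mentioned earlier, and the monthly token count is back-solved from the ₹8,892 bill at that rate, so treat it as an illustrative guess rather than a measured workload.

```python
def ai_cost_local(tokens_per_month, usd_per_million_tokens, local_per_usd):
    """Monthly AI bill in local currency."""
    usd = tokens_per_month / 1_000_000 * usd_per_million_tokens
    return usd * local_per_usd

def share_of_wage(ai_cost, monthly_wage):
    """AI bill as a fraction of what a local worker earns in a month."""
    return ai_cost / monthly_wage

# Assumption: back-solved so the bill matches the quoted figure at 95 INR/USD.
TOKENS_PER_MONTH = 62_400_000
USD_PER_MILLION = 1.5    # representative blended price quoted in the essay
INR_PER_USD = 95.0       # approximate mid-2026 rate quoted in the essay
AGENT_WAGE_INR = 33_915  # monthly customer-support wage quoted in the essay

cost_inr = ai_cost_local(TOKENS_PER_MONTH, USD_PER_MILLION, INR_PER_USD)
share = share_of_wage(cost_inr, AGENT_WAGE_INR)
print(f"AI bill: INR {cost_inr:,.0f} per month, {share:.0%} of the wage")
```

Hold the dollar bill fixed and only the denominator moves. That is why the same workload comes to a small fraction of pay in the United States and most of a salary in Nigeria.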

And the obvious escape route comes with its own toll. The one serious non-American frontier — China’s models, led by DeepSeek, which stunned the industry on cost — arrives wrapped in a question that has nothing to do with capability. DeepSeek’s own policy disclosed that user data is stored on servers in China, where the law grants the state broad access; within a year it had been banned or restricted by Italy, Australia, India, South Korea, and at least 17 US states.

So the developing-world buyer is offered a genuine choice and a genuine trap: pay a premium for American intelligence in a currency that keeps weakening, or accept a cheaper alternative whose price is paid in data and sovereignty. “Getting cheaper and more accessible” is true — if you earn in dollars. From outside the dollar, the same trend line bends the other way.

What is actually happening to work

Strip away both the doom and the cheerleading, and the clearest picture comes from watching what people actually use AI for. Anthropic does this through its Economic Index, mapping millions of real conversations onto the tasks that make up real jobs. As of early 2026, roughly 57% of AI use is augmentation — a person and a model working together — against about 43% automation, where the model does the task alone.

Augment or automate? It depends on the job. Pick a kind of work and see how much of its AI use is a person and a model together (augment) versus the model alone (automate).

Augment = person + model. Automate = model alone.

36% of all AI use

Software & IT: Coding pulls both ways. Developers steer the model, but agentic tools increasingly write and fix code on their own.

Across all AI use, about 57% is augmentation: a person and a model working together, not the machine on its own.

Curated snapshot in the style of the Anthropic Economic Index (early 2026). Illustrative and dated; shares of AI use are approximate.

More striking is the deskilling effect: AI is not chewing through the low-skill bottom of the labour market first, as the old fear assumed. It is absorbing the higher-skill parts of jobs — the analysis, the drafting, the reasoning. The machine is taking the interesting part.

So is anyone actually losing a job? Some are, but fewer, and for blurrier reasons, than the headlines suggest. The outplacement firm Challenger, Gray & Christmas logs the reasons employers give for layoffs. In all of 2025, AI was named in 54,836 job cuts, roughly five percent of the year’s total. That is real, rising, and almost certainly undercounted. But against about 1.2 million total cuts in a year of broad tightening, it is not yet the wave that was promised, or feared.

The honest summary of this moment: AI is reshaping work far faster than it is removing it. The danger is not sudden mass unemployment. It is subtler: a slow hollowing of the rungs that juniors once climbed, the disappearance of the skilled-but-routine task that used to be how a person learned.

The next two years, counted conservatively

Forecasts about AI age like milk, so this is built deliberately low and slow — the conservative edge of credible institutional projections, each one a projection, not a fact.

Employers intend to cut where they can: in the World Economic Forum’s survey of more than a thousand large employers, 40% expect to reduce their workforce where AI can automate tasks by 2030. The net number, though, is positive — the same report projects 92 million roles displaced and 170 million created by 2030, a net gain of about 78 million jobs. The story is churn, not collapse.

The near term is muted: Gartner expects AI’s net effect on global employment to be roughly neutral through 2026, with gains arriving as augmentation before they arrive as replacement. And reskilling is the real battleground — 77% of employers say they plan to retrain existing staff to work alongside AI. Whether they truly do is the variable that decides whether the deskilling effect becomes opportunity or abandonment.

Put conservatively: over the next two years, expect disruption to outpace destruction. The task you do will change more than the job you hold. The risk concentrates at the entrance to careers and at the edges of the dollar — on the juniors, and on the economies that can least afford either the retraining or the API bill.

Nearly free, hardly cheap

Set the two facts side by side one last time. The cost of intelligence is falling faster than almost anything humans have ever built. And the promise that this would replace human effort — cheaply, universally, soon — has not arrived, and is furthest from arriving exactly where it was supposed to matter most: for the small firm, the junior worker, the economy outside the dollar.

Both are true. That is not a contradiction to be resolved; it is the shape of the thing. We built a machine that predicts and asked it to understand. We watched its unit price collapse and assumed the invoice would follow. We called it accessible and quoted the price in a currency most of the world doesn’t earn.

None of this says the promise is false. The escalator may simply be slower, and steeper, and more unevenly distributed than its sellers claimed. But the gap between nearly free and hardly cheap is not a rounding error to be smoothed away by next year’s model. It is where this technology actually lives right now — and the question worth sitting with is not whether the machines are coming, but who gets to afford them when they do.