6 May 2026 · Celesnity team

The Map Inside the Machine

World models, the $15trn bet behind today's AI race, and why a serviceable one is the precondition for real autonomy on the factory floor.

Why artificial intelligence is racing to build a working model of reality — and why it matters who succeeds first.

[Illustration: a circular instrument dial bolted into a larger machine. Inside, topographic contour lines form a small landscape; one outer contour fades into a cyan dashed line where the model breaks down, and a single amber pin lands on that boundary.]

In 1943 Kenneth Craik, a young Cambridge psychologist, proposed that the brain works by carrying around a “small-scale model” of the world. Such a model, he wrote, lets a person try out alternatives in advance — rehearse a fall, anticipate an enemy, imagine a meal — before committing to act. Mr Craik died in a cycling accident two years later, aged 31. His idea has outlasted him. It now animates the most expensive race in technology.

For the past two years the debate over what large language models (LLMs) actually do has hinged on a single phrase: world model. The term denotes an internal representation, learned from data, that encodes how reality behaves — how objects fall, how prices respond to interest rates, how a sentence in French maps onto one in Mandarin. With such a model, an artificial intelligence (AI) can predict, plan and reason. Without one, it can only mimic.

The bet driving Silicon Valley — and increasingly Beijing, London and Paris — is that whoever first builds a sufficiently faithful world model will own the next stage of computing. Sales pitches put the prize at $15trn or more in added global output by 2035. The engineering is harder.

Borges and bandwidth

A useful world model must satisfy two demands that pull against each other. It has to be compact enough to fit in a computer’s memory and quick enough to query in milliseconds — and it has to be rich enough to reproduce the texture of a world that, as Jorge Luis Borges once noted, is most accurately mapped only at full size. Today’s models cheat. Video generators such as OpenAI’s Sora, Google DeepMind’s Veo and China’s Kling produce footage that obeys gravity, occlusion and shadow most of the time — and breaks them whenever the prompt drifts off-distribution. Glasses pour through tables. Basketballs split into twins.

Yet the trend line is steep. Genie, a system unveiled by DeepMind in 2024, learned to generate playable two-dimensional video games from a description alone, by inferring the underlying physics from internet footage. GAIA-1, built by Wayve, a London-based startup valued at $2bn, simulates driving scenes realistic enough to train autonomous fleets. Tesla, whose self-driving software now logs roughly 2bn miles a year, treats its neural network as a proto-world-model that imagines counterfactual trajectories before the car turns the wheel. The aspiration, in each case, is the Craikian one: a machine that runs reality forward in its head.

The sceptics’ chorus

Not everyone is convinced this counts. Yann LeCun, Meta’s chief AI scientist and a Turing-award laureate, reckons that LLMs lack any real world model: they are, he has argued, glorified pattern-matchers that confuse fluency with understanding. His preferred architecture, JEPA, predicts not the next token but the latent state of the world, on the grounds that nature is best modelled in concepts, not pixels. François Chollet, a former Google researcher, has long argued that benchmarks overstate machine reasoning. The doubts have empirical bite. Studies in 2025 found that frontier LLMs, asked to navigate simple gridworlds they had ostensibly mastered, collapsed when the grid was rotated by 90 degrees.

But the sceptics may be defining the goal away. A growing camp, including researchers at Anthropic and DeepMind, contends that fragmentary world models are already visible inside today’s networks. Probe a chess-trained transformer, and a representation of the board falls out. Probe Othello-GPT, and the disc layout appears. The models are not merely predicting next moves; they are carrying around something that looks suspiciously like Mr Craik’s small-scale image. The argument is no longer whether LLMs have any world model, but how partial, brittle and useful these models are.

Cui bono

The implications stretch beyond engineering. A serviceable world model is the precondition for autonomous agents — software that books flights, places trades or pilots drones without minute-by-minute supervision. It is also the precondition for industrial deployment in manufacturing. That is the bet we are making at Minder AI: that voice assistants paired with predictive models can replace clipboards on factory floors. Regulators are circling. The European Union's AI Act, in force since August 2024, subjects systems "designed for general purposes" to extra scrutiny precisely because their internal models cannot be inspected. Insurers, no fools, want the same.

The deeper question is older than computing. Mr Craik thought a mind was useful insofar as its model was accurate; an inaccurate model produced surprise, and surprise produced learning. Today’s giants are surprised constantly. They learn anyway. The race, in the end, is to the firm whose machine is least often astonished by its own world.

For now, no one is winning cleanly. Mr LeCun maintains that no one has built a machine that truly understands. He may be right today. He will not be right for long.

Tags · engineering · product · model