The Cost Collapse — Summary
Dr. Yuki Tanaka assembles the swarm on a Friday afternoon, drinking black tea from a thermos she has carried since graduate school, dented from a fall on a research vessel off the Kuril Islands in 2019. She is a marine biologist at a regional university in Hokkaido. Her institution does not have a frontier compute cluster. What it has is a laptop cluster, three shared GPUs, and thirty years of field research on cold-water kelp forest dynamics, documented in field notes, gray literature, and papers the major journals found too regional to warrant wide attention.
She is assembling five models. A small language model trained on the oceanographic literature relevant to her region. A state-space model for the time-series analysis of temperature gradients her buoys have been recording for a decade. A Tiny LM built from her team’s own field notes, holding knowledge about this specific stretch of coast that no published paper contains. A transformer-based model for cross-domain inference with atmospheric chemistry. And a routing layer that assembles whichever combination is most relevant to the specific question she puts to it.
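A routing layer of this kind can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the component names, the keyword sets, and the word-overlap scoring are hypothetical, not a description of Yuki's actual configuration. A real router would more plausibly compare embeddings than count shared words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One swarm member, described by the topics it covers (hypothetical)."""
    name: str
    keywords: frozenset


def route(query, components, top_k=2):
    """Return the names of up to top_k components sharing words with the query."""
    words = set(query.lower().split())
    # Score each component by how many of its keywords appear in the query.
    scored = [(len(words & c.keywords), c.name) for c in components]
    # Drop components with no overlap; sort by score, then by name for stability.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [name for _, name in ranked[:top_k]]


# Illustrative stand-ins for the four models described above.
SWARM = [
    Component("literature_lm", frozenset({"kelp", "literature", "species", "published"})),
    Component("buoy_ssm", frozenset({"temperature", "gradient", "buoy", "trend"})),
    Component("field_notes_lm", frozenset({"kelp", "coast", "site", "observed"})),
    Component("atmos_transformer", frozenset({"atmospheric", "chemistry", "aerosol"})),
]

selected = route("how does the temperature trend affect kelp on this coast", SWARM, top_k=3)
print(selected)
```

For this query the atmospheric chemistry model shares no words with the question and is left out, which is the point: only the components the question calls for are assembled.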
The whole thing cost less to build than the conference she attended in Bergen last autumn.
The cost does not just drop. The cost structure changes. Training a frontier model costs what a mid-sized country spends on its public agricultural research system annually. Training a swarm component costs what a single postdoctoral researcher earns in a year. Running inference costs what Yuki’s department spends on field equipment in a month. These are not refinements of the same economy. They are a different economy.
The hyper-local contextual assembly matters as much as the cost. The frontier model brings broad knowledge to every query, most of which is irrelevant to the specific context. Yuki’s swarm brings the knowledge most relevant to this coast, this season, this question, assembled on demand, discarded when the query is complete. This is closer to how domain expertise actually works. The expert doesn’t activate everything she knows at once. She assembles what the situation requires.
What the swarm enables is not a cheaper version of the frontier model. It is a different instrument. The discovery pipeline run through a frontier model finds what the frontier model’s architecture can find. The discovery pipeline run through a swarm finds what the swarm’s curated components were built to find. The two search spaces are not the same.
The equity barrier has not dissolved. It has moved. The question is no longer who can afford the compute but who has the curation knowledge to build the relevant components. This expertise is more distributed than frontier compute. It lives in domain communities: the marine biologists who know their coastlines, the water engineers who know their watersheds, the historians who know their archives. The exclusions follow different patterns than before. The distance between those who can build and those who cannot is smaller, and differently shaped.
Yuki closes the routing layer’s configuration file. The swarm is assembled. She sets the thermos, still dented from the Kuril Islands fall, back in its place beside the monitor. She types the first query. The swarm assembles the relevant components. She has been waiting thirty years for something that could hold all of this at once.