The Architecture of the Center

The previous essay in this series described a dependency relationship: the global south consuming AI infrastructure built, owned, and governed by a small number of wealthy countries, with surplus flowing outward along familiar channels. The description is structurally accurate. It is also incomplete in a specific way.

It treats the center as fixed.

The center is not fixed. The architectural question of where AI capability will actually live — in massive data centers running frontier models, or in distributed networks of smaller specialized models running on accessible hardware, or in some combination not yet clearly resolved — is genuinely in contest. The outcome of that contest has structural implications for whether the dependency described in AM 69 deepens, transforms, or partially breaks over the next decade.

This essay examines the architecture honestly. Not as a technology survey. As a structural question about power: where capability concentrates, who controls the points of concentration, and what the history of comparable architectural transitions suggests about where new centers form when old ones are disrupted.

Two Visions, Both Partially True

The dominant narrative about AI capability is centralization. The most capable models require extraordinary compute: tens of thousands of the most advanced GPUs running for months, consuming electricity at the scale of small cities, trained on datasets of nearly incomprehensible size. GPT-4, Claude, Gemini: these are products of infrastructure concentration that has no precedent in the history of computing. The organizations that can afford this infrastructure can be counted on one hand. The countries where they operate can be counted on two fingers.

This narrative is accurate for the frontier. It is not accurate for the full picture.

A competing architectural reality has been developing alongside the frontier narrative, and it accelerated dramatically in 2024 and 2025. Models that run on a laptop. Models that run on a phone. Models that run on agricultural advisory devices in rural areas without reliable internet. Phi-3 Mini from Microsoft. Gemma from Google. Llama 3 in its smaller variants. Mistral 7B. These are not toys. For the majority of real-world tasks that real people and organizations actually need to perform, the capability gap between these models and frontier systems is smaller than the frontier narrative implies, and shrinking.

DeepSeek’s emergence in early 2025 sharpened this picture. A Chinese laboratory produced frontier-capable models at a fraction of the compute cost that American laboratories had assumed was necessary. The efficiency gains were not marginal. They suggested that the relationship between compute expenditure and model capability was less fixed than the scaling law orthodoxy held. If frontier capability is achievable at dramatically lower cost, the center becomes more accessible. If the center becomes more accessible, the dependency becomes less severe.

Both visions are true in different domains. For the hardest problems, the frontier still holds. For most of what most users actually need, the frontier is not necessary. The question for the dependency argument is which domain the global south primarily needs to operate in.

The Distillation Problem

Here is where the architectural picture becomes more complicated than either narrative captures.

Most capable small models are not independently trained from scratch on raw data. They are distilled from, fine-tuned with reference to, or evaluated against large frontier models. The knowledge encoded in a small model that runs efficiently on accessible hardware flows, in significant part, from the large models at the center. The inference is local. The epistemology is imported.
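The mechanics of that import can be made concrete. What follows is a minimal sketch, not any laboratory's actual training code: it assumes the textbook form of distillation, in which a small "student" model is penalized for diverging from the output distribution of a large "teacher" model. The function names, the example logits, and the temperature value are all illustrative assumptions, and the temperature-squared scaling often applied in practice is left out for brevity.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The student is trained to push this toward zero, which means its notion
    of a good answer is defined by whatever the teacher already believes.
    """
    p = softmax(teacher_logits, temperature)  # the center's judgment
    q = softmax(student_logits, temperature)  # the local model's judgment
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical scores over three candidate answers.
teacher = [4.0, 1.0, 0.5]
aligned_student = [3.9, 1.1, 0.4]    # agrees with the teacher
divergent_student = [0.5, 1.0, 4.0]  # holds a different view

print(distillation_loss(teacher, aligned_student))
print(distillation_loss(teacher, divergent_student))
```

The loss is small when the student agrees with the teacher and large when it does not. Nothing in the objective asks whether the teacher is right about the local context; agreement with the center is the training signal.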

This matters for the dependency question in a way that is easy to miss. A country that deploys local small models for healthcare, for agricultural advisory, for educational support, for legal guidance in local languages, may believe it has achieved a form of AI sovereignty: the model runs on locally controlled hardware, the data stays within national borders, there is no ongoing payment to foreign API providers. But if the model’s capability derives from distillation from frontier models that it cannot independently replicate, a structural dependency persists at a deeper layer than the infrastructure layer.

The analogy is technology licensing. A country that licenses foreign manufacturing technology and builds its own factories has more sovereignty than one that imports finished goods. It has less sovereignty than one that can design and build the technology itself. The licensing dependency is less visible than import dependency. It is still dependency.

The direction of knowledge flow matters as much as the location of inference. A model that can only receive capability from the center, that cannot independently develop capability and contribute it back upward, occupies a structurally peripheral position regardless of where the hardware sits.

The question this poses: is independent capability development possible at the small model layer, for specialized domains, using locally controlled data? The answer matters enormously for whether the architecture of small language models (SLMs) running on edge hardware represents genuine partial exit from the dependency, or a more comfortable version of the same structural position.

The answer is: conditionally yes, and the conditions are instructive.

Where Local Beats Frontier

A general frontier model trained on data from across the internet is, by construction, trained primarily on content produced in wealthy countries, in dominant languages, about dominant-context problems. Its baseline assumptions about what a healthcare presentation looks like, what an agricultural problem looks like, what a legal question looks like, reflect the contexts most represented in its training data.

A specialized model trained on locally controlled data about local problems can, for those specific problems, outperform the frontier. Not because it is more capable in general. Because it is more specifically calibrated to the problem that actually matters.

This is not hypothetical. The performance of specialized medical models on specific clinical domains has demonstrated that domain-specific fine-tuning on high-quality domain data produces better results for domain-specific tasks than general frontier models. Agricultural advisory models trained on specific soil types, climate patterns, crop varieties, and pest pressure profiles for specific geographies can be more useful to the farmers in those geographies than general AI systems that answer agricultural questions from the average of global agricultural data.

The leapfrog possibility lives here. The global south is not going to outcompete the American and Chinese AI ecosystems at the general frontier. It does not need to. The problems that matter for its populations are specific, and specificity is a domain where local capability can compete with central capability if the institutional infrastructure to develop it exists.

The India Stack demonstrates this logic at the application layer. India did not build a better general digital payment system than those available from American technology companies. It built a payment system specifically designed for Indian scale, Indian linguistic diversity, Indian institutional context. That specificity is a competitive advantage rather than a limitation. UPI now processes transaction volumes that dwarf what any foreign payment infrastructure had achieved in India. The sovereign infrastructure is better for its specific context than the imported alternative.

The question is whether this logic can be extended upward from the application layer to the model layer. That requires something the India Stack also required: state capacity, sustained investment, willingness to make the long bet, and enough population scale to generate the domain-specific data that makes specialized models actually superior.

The Hardware Layer Underneath

Even if the model layer partially decentralizes, the hardware layer has its own dependency structure.

AI hardware exists at two distinct tiers with different concentration profiles.

Training infrastructure is severely concentrated. The GPUs that train frontier models are manufactured primarily by NVIDIA. The chips that power NVIDIA’s GPUs are fabricated primarily by TSMC in Taiwan, using equipment manufactured primarily in the Netherlands by ASML. This supply chain is a chokepoint that involves three companies in three countries, and it is the specific vulnerability that American export controls on China are targeting. No country outside this network can currently train frontier models on domestically produced hardware. The hardware dependency for training is as severe as any dependency described in AM 69.

Inference infrastructure is less concentrated. ARM-based processor architectures, custom silicon for inference rather than training, and the broader ecosystem of chips that runs applications on consumer devices are more distributed in their design and manufacturing base. The efficiency improvements in inference hardware are democratizing this layer meaningfully. A phone assembled in Vietnam or South Africa, running inference on an ARM chip, does not require NVIDIA's data center GPUs, although many leading mobile chips are still fabricated by TSMC.

The practical implication: the dependency is severe at the training layer and diminishing at the inference layer. A country that can run inference on accessible hardware but cannot train its own models faces a different dependency structure than the one AM 69 describes. It is a dependency that is contingent on continued access to models from the center, rather than continued access to cloud compute from the center. The terms of the relationship are different. The structural position is similar.

The Bipolar Complication in Technical Terms

China’s development of a competing AI infrastructure deserves technical precision, because the political framing tends to obscure what is actually being built.

China’s AI stack is real and substantial. At the model layer, DeepSeek R1 and V3 demonstrated genuine frontier capability built outside the American ecosystem. Qwen 2.5 and its successors are competitive with American frontier models on many benchmarks. These are not imitations. They are independently developed systems with different architectural choices reflecting different research priorities.

At the application layer, China has built extensive domestic AI infrastructure: payment systems, social media platforms, e-commerce infrastructure, government service delivery, surveillance and administrative systems. The Chinese ecosystem is genuinely distinct from the American one and is not dependent on it for core functionality.

The hardware dependency remains. China’s ability to manufacture leading-edge semiconductors is constrained by American-led export controls on EUV lithography equipment. This is the genuine chokepoint, and it is not being resolved quickly. China is investing heavily in domestic semiconductor manufacturing capability, but the gap between its current capability and TSMC’s leading-edge processes remains significant and will take years to close at minimum.

What China’s emergence creates for the rest of the global south is not freedom from dependency but choice of dependency. The African country that deploys Chinese AI infrastructure is in a structural position similar to the one that deploys American AI infrastructure: external training pipeline, external hardware supply chain, surplus flowing outward, governance terms set elsewhere. The idiom is different. The structural relationship is similar.

This is the honest version of the bipolar argument. Two centers do not resolve the periphery’s structural position. They offer the periphery a choice of which center’s terms to accept.

Quantum: Honest Uncertainty

Quantum computing’s relationship to AI dependency is genuinely uncertain, and that uncertainty deserves to be named rather than resolved by generating excitement.

For the near and medium term, quantum computing does not meaningfully change the AI architectural picture. The core computations in transformer-based neural networks, the matrix multiplications that dominate training and inference, are not obviously amenable to quantum speedup in ways that would alter the economics of AI development. Most researchers who study both quantum computing and neural network architecture are skeptical that quantum advantage will restructure AI compute economics in the next five to ten years.

The more immediate quantum implication for the dependency question is cryptographic. Sufficiently powerful quantum computers could break the public key encryption that secures most digital communications and transactions. The countries and institutions that develop this capability first would gain, in principle, the ability to decrypt communications protected by current standards. This is a sovereignty and security question of significant importance: a world in which a small number of actors can read encrypted communications is a world in which digital infrastructure sovereignty means something different from what it means today.

The longer-term possibility is more speculative but structurally significant. If quantum computing eventually provides meaningful advantages for AI training, it creates a new center at a new layer, with a new dependency structure. The current investment map in quantum computing is led by the United States, China, and the European Union, with significant activity in the UK, Canada, and Australia. A quantum dependency map would closely resemble the current AI infrastructure map. The periphery would face a new version of the same structural position at a higher layer of technical sophistication.

The honest summary on quantum: it is a genuine wildcard that restructures the dependency question if quantum advantage materializes for AI-relevant computation. The timeline and probability are genuinely uncertain. Planning around quantum’s disruption is premature. Ignoring it entirely is unwise.

What This Means for the Dependency Argument

The architectural picture, held honestly, neither fully confirms nor fully refutes the dependency framing in AM 69. It complicates it in ways that matter for anyone thinking about what the peripheral countries’ options actually are.

The dependency is real but not static. The inference layer is decentralizing. The training layer is not. The distillation pipeline creates epistemological dependency even when infrastructure is local. The hardware dependency is severe for training and diminishing for inference. China’s emergence creates choice rather than exit. Quantum creates new layers of potential dependency whose timeline is uncertain.

The conditions under which partial exit from the dependency is achievable: large enough population to generate domain-specific training data at scale, sufficient state capacity to make sustained investment in local AI infrastructure, political will to prioritize structural independence over short-term access efficiency, and the specific insight that competing at the general frontier is unnecessary if local specialization can outperform the frontier on locally relevant problems.

These conditions are not widely distributed. Where they exist, the architectural possibilities for a third path between the American and Chinese AI ecosystems are real. Where they do not exist, the architectural evolution of the next decade is more likely to change the form of the dependency than to break it.

The PC revolution distributed compute and created new centers at higher layers. The question for AI’s architectural evolution is not whether new centers will form as inference distributes. They will. The question is whether the new centers are more accessible to the periphery than the current ones, or whether the pattern of democratization followed by reconcentration repeats at a higher layer of the stack.

The historical record suggests reconcentration is the more common outcome. The possibility of a different outcome this time is real, and worth working toward. But it requires deliberate effort, sustained over years, against the structural tendency of productive infrastructure to concentrate in the hands of those who were already positioned to build it.

That is where the technical analysis ends, and where the political analysis must begin again.

The Approximate Mind is a philosophical essay series examining how artificial intelligence transforms human work, identity, development, and society. Parts 63-70, The New Periphery suite, trace the arc from broken educational contracts through the civilizational consequences of automation to the technical architecture of the dependency that organizes the whole. Part 71 translates the suite’s argument for the broadest possible audience.
