Main Series · Foundations · TAM_009

Who Gets Approximated


Not everyone benefits equally from AI that approximates human understanding. Some people will be approximated accurately because they match the patterns in training data. Others will be systematically misunderstood because they don’t fit dominant patterns. This isn’t a technical problem to solve. It’s a political reality that shapes whose understanding counts.

The question isn’t just “can AI approximate human understanding?” It’s “whose understanding gets approximated well, and who gets left out?”

The Over-Represented and the Invisible

Current AI systems are trained primarily on data from WEIRD populations: Western, Educated, Industrialized, Rich, Democratic. If you fit this profile, AI approximates your patterns well. If you don’t, it frequently fails in predictable ways.

A health AI trained on clinical trials that over-sample white populations will approximate white patient presentations well and miss presentations that are more common in other populations, which look atypical only relative to the training data. The system isn’t broken. It’s working exactly as trained. The problem is whose data shaped the training.

A language AI trained primarily on formal written English will approximate formal communication styles well and struggle with code-switching, dialect, and informal speech patterns. It doesn’t fail equally for everyone. It fails systematically for people whose language use doesn’t match the training corpus.

A financial AI trained on mainstream banking patterns will approximate standard financial behavior well and flag non-standard patterns as suspicious. If your financial behavior looks different because of cultural practices, immigration status, or economic marginalization, the system sees you as anomalous. Your patterns aren’t wrong. They’re just not represented in what the system learned.
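To make the mechanism concrete, here is a minimal sketch in Python, assuming a deliberately simple detector that flags anything far from the mean of its training data. The feature (deposits per month), the numbers, and the function names are all hypothetical. Real fraud systems are far more elaborate, but the dependence on who was in the training set is the same.

```python
from statistics import mean, stdev

def fit(values):
    """Learn what 'normal' looks like from the training data alone."""
    return mean(values), stdev(values)

def is_flagged(value, mu, sigma, threshold=3.0):
    """Flag a value that sits more than `threshold` standard deviations from the mean."""
    return abs(value - mu) / sigma > threshold

# Hypothetical training data: deposits per month for salaried customers.
training = [2, 2, 1, 2, 3, 2, 2, 1, 2, 2]
mu, sigma = fit(training)

# A customer who resembles the training data passes unnoticed.
salaried_customer = is_flagged(2, mu, sigma)

# A customer paid in many small cash deposits is flagged.
cash_paid_customer = is_flagged(12, mu, sigma)
```

Nothing in the detector says twelve deposits are risky. It only says twelve is far from what it saw, which is a statement about the training set rather than about the customer.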

The Compounding Effect

These failures compound. If health AI misses your symptoms, you get worse care. If language AI misunderstands you, you get worse service. If financial AI flags you as suspicious, you face more scrutiny. Each misapproximation adds friction to your life that others don’t experience.

Over time, people learn to adapt. You code-switch to match what AI expects. You describe symptoms in ways the system recognizes rather than how you actually experience them. You modify your financial behavior to avoid algorithmic suspicion. The burden of translation falls on those already marginalized.

This adaptation has costs. You’re spending cognitive resources to match the system rather than having the system match you. You’re potentially receiving less accurate service because you’re describing yourself in ways that don’t quite fit. You’re learning to see yourself through the system’s categories rather than your own.

The Feedback Loop of Exclusion

Here’s where Article 8’s feedback loop becomes sinister: As AI systems train on adapted behavior, they learn the adaptations. The next generation of AI approximates the adapted version. People adapt further. The system never learns to understand the original patterns because it only sees the translated version.

The person who code-switches to match AI expectations contributes data that reinforces the expectation of code-switching. The system gets better at understanding the adapted behavior and never improves at understanding the natural behavior. The gap between authentic experience and system understanding widens even as the system appears to improve.

This creates what looks like inclusion but functions as assimilation. The system works for you if you become more like what it expects. The solution to being misunderstood is to change yourself, not to be understood as you are.

The Measurement Problem

We struggle even to measure these failures. Standard metrics assess average performance. If an AI system is 90% accurate overall, it looks successful. But if it’s 95% accurate for the majority population and 70% accurate for a minority population, the average obscures systematic inequality.
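The arithmetic behind those figures is worth seeing once. This is a minimal sketch, assuming an 80/20 split between the two groups. The split is my assumption, chosen because it is the one under which 95% and 70% average out to exactly 90%. The evaluation records and the function name are hypothetical.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Return (overall accuracy, accuracy per group) from (group, correct) pairs."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, correct in records:
        totals[group] += 1
        hits[group] += int(correct)
    overall = sum(hits.values()) / sum(totals.values())
    per_group = {group: hits[group] / totals[group] for group in totals}
    return overall, per_group

# Hypothetical evaluation set: 800 majority cases and 200 minority cases.
records = (
    [("majority", True)] * 760 + [("majority", False)] * 40
    + [("minority", True)] * 140 + [("minority", False)] * 60
)

overall, per_group = accuracy_by_group(records)
print(f"overall:  {overall:.2f}")
for group, acc in per_group.items():
    print(f"{group}: {acc:.2f}")
```

The headline number only separates into its two parts if each evaluation record carries a group label. Without that label, the 70% is simply not recoverable from the 90%.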

Worse, we often lack data to know whether failures are random or systematic. If someone’s health symptoms are misclassified, we might attribute it to the inherent difficulty of diagnosis rather than systematic bias in training data. The failure appears as noise rather than signal.

Even when we identify disparate impact, we struggle to fix it. You can’t improve approximation for populations whose data you don’t have. You can’t get data from populations that don’t trust systems that have failed them. The historical exclusion perpetuates itself.

The Justice Question

This isn’t just a technical challenge. It’s a question of justice. Whose understanding matters enough to be approximated well? Whose experiences count as valid training data? Who bears the burden when approximation fails?

Current AI development largely answers these questions by default: the understanding of those who produce the most data, whose patterns are easiest to learn, whose experiences match majority norms. This isn’t a conspiracy. It’s the natural result of optimizing for aggregate metrics without examining whom those aggregates obscure.

The alternative requires explicit choices. Whose understanding do we prioritize when we can’t approximate everyone equally well? Do we optimize for average performance or worst-case performance? Do we deploy systems that work well for some while working poorly for others, or do we wait until they work well for everyone?
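The choice between average and worst-case performance can be stated precisely, which shows how much rides on it. This is a minimal sketch with two hypothetical models. The first uses the 95% and 70% figures from the measurement example. The second model's figures, the 80/20 population shares, and all names are my assumptions for illustration.

```python
def average_accuracy(per_group, shares):
    """Population-weighted accuracy: what a standard aggregate metric reports."""
    return sum(per_group[group] * shares[group] for group in per_group)

def worst_group_accuracy(per_group):
    """Accuracy for whichever group the model serves least well."""
    return min(per_group.values())

# Assumed population shares and hypothetical per-group accuracies.
shares = {"majority": 0.8, "minority": 0.2}
candidates = {
    "model_a": {"majority": 0.95, "minority": 0.70},
    "model_b": {"majority": 0.90, "minority": 0.85},
}

best_by_average = max(
    candidates, key=lambda name: average_accuracy(candidates[name], shares)
)
best_by_worst_case = max(
    candidates, key=lambda name: worst_group_accuracy(candidates[name])
)

print("best by average:   ", best_by_average)
print("best by worst case:", best_by_worst_case)
```

Under these assumed numbers the two criteria pick different models: the average favors model_a at roughly 90% against 89%, while the worst-case criterion favors model_b at 85% against 70%. The code can compute either answer. It cannot say which criterion to use.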

These aren’t technical questions. They’re ethical and political questions about distributive justice in the age of AI.

The Representation Paradox

There’s a deeper paradox here. To approximate someone’s understanding well, you need to understand them first. But if you already understood them well enough to gather representative data, you wouldn’t need the AI system. The populations most in need of good approximation are often the ones least represented in training data.

This creates a choice between two approaches:

The inclusion approach: Work harder to gather representative data from marginalized populations. Build systems that approximate diverse experiences. Ensure everyone benefits from AI understanding.

The skeptical approach: Recognize that some experiences resist approximation. Not because they’re less valid, but because they emerge from contexts the system can’t access. Accept that AI approximation has limits and shouldn’t be applied universally.

Both approaches have merit. Both have risks. The inclusion approach risks extracting data from vulnerable populations for systems that may still fail them. The skeptical approach risks denying beneficial technology to those who might benefit from even imperfect approximation.

What This Means for Approximate Understanding

Throughout this series, I’ve explored what AI can and can’t approximate about human understanding. This article adds a crucial dimension: the question isn’t just capability but distribution.

An AI system that approximates human understanding well for some while systematically failing for others isn’t approximating human understanding. It’s approximating a particular human understanding and presenting it as universal.

The challenge for responsible AI development is acknowledging this partiality while working to expand the circle of whose understanding counts. Not pretending to a universality we don’t have. Not accepting exclusion as inevitable. Finding the difficult path between false claims of inclusion and resigned acceptance of exclusion.

That’s not a technical problem with a technical solution. It’s a moral challenge that requires continuous attention to who benefits, who’s burdened, and whose understanding gets to count as human understanding worth approximating.


This is the ninth in a series exploring how AI approaches understanding. Previous articles examined capabilities and limitations. This one examines how those capabilities and limitations distribute unequally, creating questions of justice alongside questions of accuracy.

How this essay connects to others across The Approximate Mind.

TAM_009 maps who gets approximated well and who gets left out: WEIRD populations in the training data, the compounding effect of misapproximation, the feedback loop of exclusion that looks like inclusion but functions as assimilation. TRF_6-03 follows the same structure into professional transformation: who retains access to human judgment, who gets the AI version, and whether the equity gap widens as AI absorbs the work that previously served as equalizer.
TAM_009 describes the burden of translation falling on those already marginalized: code-switching to match AI expectations, describing symptoms in ways the system recognizes rather than how they are experienced. TRF_1-05 locates the same dynamic in language professions: Yuki's interpretive work bridges cultures precisely because meaning does not map cleanly across linguistic systems. The people most misapproximated are those whose communication requires the most contextual translation.