The Reimagined Study

TAM-079 · The Approximate Mind

Dr. Kavitha Subramanian works at a public health institute in Hyderabad. She studies maternal nutrition in Telangana’s tribal districts. She has run three studies over seven years. Each was well-designed by conventional standards. Each produced clean findings. Each finding, when implemented as an intervention, worked less well than the study predicted.

She collects old maps. Not valuable ones. Tourist maps from the 1970s and 1980s, purchased for a few rupees at second-hand bookstalls in Abids. She likes the way they show a version of the world that was confident and already wrong. A 1978 road map of Hyderabad shows a city that no longer exists, its confident lines tracing routes through neighborhoods that have been rebuilt three times since. She keeps them rolled and banded on a shelf behind her desk, each one a document of someone’s certainty about a territory they had not fully seen.

She has a whiteboard. On the left side, her study results. On the right side, what she has observed in the field that her studies cannot capture. She drew a line between the two sides last year. Above the line, in her handwriting, she wrote a word she learned from a philosophy seminar she attended on a whim: “stratum.”

What the Study Found

Her third study: a cluster-randomized trial of a micronutrient supplementation program for pregnant women in three tribal blocks. Well-powered. Clean randomization. Good adherence monitoring. Twelve months. Published in a journal she respects.

Supplementation improved hemoglobin levels by a statistically significant margin. Birthweight improved modestly. Preterm birth rates showed no significant difference. The intervention was recommended for scale-up.

The recommendation was reasonable given the evidence. Kavitha wrote it herself.

She also knew, from seven years of fieldwork in the same blocks, what the study could not find.

Adherence dropped by 40% during planting season. It dropped differently across the three blocks despite identical protocols. The hemoglobin improvement did not translate into the expected reduction in preterm births. And the women in one block who received supplementation reported worse overall health at the end of the study than at the beginning, despite improved biomarkers.

The study design could not find these things because they are not hemoglobin questions. They are life questions.

The adherence dropped during planting season because the women were the primary agricultural laborers in their households. The supplementation schedule required clinic visits that conflicted with the labor pattern the household’s food security depended on. It dropped differently across blocks because the three blocks have different cropping patterns, different gender labor divisions, and different household structures determining who does what work and when.

The hemoglobin improvement did not reduce preterm births because the mechanisms producing preterm births in this population are not primarily nutritional. They are compound: nutritional status interacting with labor intensity interacting with water access interacting with psychosocial stress interacting with healthcare access. Each interaction operates at a level the study was not designed to reach. The single-variable intervention touched one component of a mechanism whose power resides in the compound.

The self-reported health worsened because the study itself created an additional obligation in lives already saturated with obligations. The clinic visits displaced something. Kavitha does not know what. Rest, possibly. Time with children. The walk to the water source at the hour when other women walk and the conversation that happens on the way. The study did not measure what it displaced because it did not know the displacement was occurring.

The study found what it was designed to find. The mechanisms that determined whether the finding actually mattered for these women’s pregnancies operated at a stratum the study was not designed to reach.

What Retroduction Would Do Differently

The same question, approached from the other direction.

Start from the outcome, not the intervention. The outcome: maternal and neonatal mortality and morbidity in these three tribal blocks exceed what the documented risk factors predict. Nutritional status, healthcare access, and anemia prevalence account for some of the excess. They do not account for all of it.

The residual is not noise. It is the starting point.

Step one: map the compound. Not individual risk factors scored separately. The interaction mapped. Nutrition interacts with labor pattern. Labor pattern interacts with water access. Water access interacts with household structure. Household structure interacts with healthcare timing. Healthcare timing interacts with psychosocial load. The compound, not the individual components, is the unit of analysis.

This is what the Intersectional Systemic Harm Index does at the individual clinical level. Applied to a research population, it produces a compound condition profile for the study’s target group. The profile shows not just what barriers exist but how they interact, and where the interaction produces outcomes that no individual barrier would predict.

Step two: identify the stratum gap. Where does the compound produce outcomes that the published studies, the clinical guidelines, the empirical record cannot explain? The gap between predicted preterm birth rates (based on documented nutritional risk) and actual preterm birth rates in this population is the data. It points at mechanisms the research tradition has not captured.
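The gap in step two can be written down as a residual: the observed rate minus the rate the documented risk factors predict. The sketch below is only an illustration of that arithmetic. The block names, the rates, and the `BlockRates` and `stratum_gap` names are all hypothetical; none of the numbers come from Kavitha's studies.

```python
from dataclasses import dataclass


@dataclass
class BlockRates:
    """Preterm birth rates for one block (all values are illustrative)."""
    name: str
    predicted: float  # rate implied by documented nutritional risk
    observed: float   # rate actually recorded in the population


def stratum_gap(block: BlockRates) -> float:
    """Excess that the documented risk factors leave unexplained."""
    return block.observed - block.predicted


# Hypothetical blocks and rates, chosen only to show the calculation.
blocks = [
    BlockRates("Block A", predicted=0.12, observed=0.19),
    BlockRates("Block B", predicted=0.12, observed=0.15),
    BlockRates("Block C", predicted=0.13, observed=0.22),
]

gaps = {b.name: round(stratum_gap(b), 2) for b in blocks}

# Largest unexplained excess first: that is where the inquiry would start.
for name, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: unexplained excess of {gap:.0%}")
```

In this framing, a conventional analysis would try to shrink the residual. The retroductive reading treats its size and its variation across blocks as the thing to be explained.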

Step three: reason backward from the gap to the mechanism. The retroductive inference: compound physical stress, agricultural labor plus water-carrying plus household labor, interacts with nutritional status in ways that are not additive. The nutritional intervention cannot offset the load because the load is generated by the compound, not by the nutritional component alone. The interaction is the mechanism. The mechanism has not been documented because the research tradition treats interaction effects as complications to control for rather than causal structures to study.
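A toy model may help show what "not additive" means here. The sketch below assumes, purely for illustration, that physical strain is the product of labor and water burden, and that nutrition acts as a reserve the strain has to exceed. The functional form, the scales, and the function names are assumptions, not a documented mechanism.

```python
import math


def risk_score(nutrition: float, labor: float, water: float) -> float:
    """Toy risk score in (0, 1); not a calibrated probability.

    Strain is modeled as labor * water (a compound term). Nutrition is
    a reserve that offsets strain. Both choices are assumptions.
    """
    strain = labor * water
    return 1.0 / (1.0 + math.exp(-(strain - nutrition)))


def supplementation_benefit(labor: float, water: float,
                            before: float = 1.0, after: float = 2.0) -> float:
    """Drop in risk score when nutrition rises from `before` to `after`."""
    return risk_score(before, labor, water) - risk_score(after, labor, water)


# Identical supplementation, different compound load (illustrative values).
low_load = supplementation_benefit(labor=1.0, water=1.0)
high_load = supplementation_benefit(labor=3.0, water=2.0)

print(f"benefit under low compound load:  {low_load:.3f}")
print(f"benefit under high compound load: {high_load:.3f}")
```

Under these assumptions the same nutritional improvement yields a much smaller benefit for the household carrying the heavier compound load. An additive model would predict the same benefit for both.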

Step four: design the study to investigate the inferred mechanism. Not a supplementation trial. A compound-load study. The unit of analysis is not the nutrient. It is the household’s compound burden across the agricultural cycle. The outcome is not hemoglobin. It is whether the pregnancy unfolds inside conditions the body can sustain. The method is not randomization of a single variable, because you cannot randomize the monsoon, the water source, the household labor arrangement, the accumulated history of what these women’s bodies have been asked to carry.

Step five: apply the skeptic operations before the study begins.

Is “pregnant woman” the right unit of analysis, or is it the household whose labor arrangement determines what her body endures? Is “anemia” a natural kind, or is it a reification of a biomarker that the clinical tradition treats as a condition when it is actually a symptom of a compound the biomarker cannot represent? Is the pregnancy separable from the web of relationships and obligations that constitute this woman’s daily existence?

From whose position was “micronutrient supplementation” identified as the right intervention? From the researcher’s? The funder’s? The woman whose planting season the supplementation schedule disrupted?

If the study’s findings are implemented, what happens in two years? If the answer is “hemoglobin improves and nothing else changes,” the study was insufficient for the reality it was trying to serve.

Were the clinical guidelines the study was built on developed in populations with similar labor patterns, similar water access, similar compound conditions? If not, what is lost in the transfer? Is the transfer cost borne by the researcher or by the woman in the destination context?

Each question catches something the conventional study design accepts without examination. Together they do not replace the study. They redirect it. They point it at the stratum where the mechanisms actually operate rather than the stratum where the instruments were built to measure.

What This Looks Like in Education

A school district in Madhya Pradesh. The conventional study: a randomized evaluation of an AI-assisted personalized learning platform in government schools. Well-designed. Clean execution. Twelve months.

Test scores improved by 15%. The platform was recommended for scale-up.

What the study could not find: the improvement came primarily from students already in the top third. The bottom third showed no improvement. The middle third improved on the platform’s assessments and declined on assessments requiring unassisted reasoning. Teachers reported that their role had narrowed from facilitation to monitoring. Students learned to produce the outputs the platform rewarded in ways that did not transfer to contexts without the platform.
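A small calculation shows how an average can conceal this pattern. Only the 15% overall figure comes from the account above. The split across terciles and the unassisted-reasoning changes are hypothetical values chosen to match the pattern described.

```python
from statistics import mean

# Hypothetical gains by baseline tercile (fractions, not percentages).
platform_gain = {"top": 0.40, "middle": 0.05, "bottom": 0.00}
# Hypothetical change on assessments requiring unassisted reasoning.
unassisted_change = {"top": 0.05, "middle": -0.10, "bottom": 0.00}

# Terciles are equal in size, so the overall effect is a simple mean.
overall = mean(platform_gain.values())
print(f"overall platform gain: {overall:.0%}")

for tercile in ("top", "middle", "bottom"):
    print(
        f"{tercile:>6}: platform {platform_gain[tercile]:+.0%}, "
        f"unassisted {unassisted_change[tercile]:+.0%}"
    )
```

The headline number is the same whether the gain is spread evenly or concentrated in one group. Only the disaggregated view, paired with a second outcome measure, shows the difference.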

The retroductive question: given that test scores improved but the capacity for unassisted reasoning did not, what mechanisms are operating?

The inference: the platform optimized for assessment performance. The students learned to match patterns the platform recognized. The cognitive capacity the patterns were supposed to represent, the ability to encounter difficulty without external scaffolding, to reason through ambiguity, to sit with material you did not choose, was not developed because the platform removed the difficulty that was the developmental substrate.

The platform did not fail. The study’s outcome measure was insufficient for the thing that mattered.

The reimagined study starts from the formation outcome, not the test score. Does the child develop the capacity for unassisted reasoning? Boredom tolerance? The ability to persist with material that is not immediately engaging? These are measurable, though the instruments for measuring them are less developed than the instruments for measuring test scores, because the research tradition has invested decades in the latter and years in the former.

The study design asks: what learning environment produces these formation outcomes? The AI platform may be part of the answer. It cannot be the whole answer, because the whole answer includes the difficulty the platform is designed to remove. The reimagined study measures what the conventional study’s success metric was supposed to represent but did not: whether the child is becoming someone who can think without assistance.

What This Looks Like in Agriculture

A research station in Maharashtra. A randomized trial of a drought-resistant crop variety. Well-designed. The variety performed well. Yield maintained under simulated drought conditions. Recommended for adoption.

What the study could not find: the farmers who adopted the variety abandoned their polyculture because the new variety required monoculture management. The polyculture they abandoned was managing risk across monsoon variability, soil health, dietary diversity, and seed preservation simultaneously. The yield improvement in the trial year came at the cost of the risk management architecture that would have protected the household in the bad year.

The bad year came eighteen months after the study period ended.

The retroductive question: given that yield improved in the trial year and household food security declined two years later, what mechanisms produced the decline?

The inference: the polyculture was not inefficiency. It was a compound risk-management mechanism operating at the level of the real, invisible to a study designed to measure yield at the level of the empirical. The study’s unit of analysis, the crop, was the wrong unit. The mechanism resided in the household’s relationship to risk, a relationship the monoculture adoption dismantled.
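The difference between what the trial could see and what the household faced can be sketched with invented numbers. The scores and the year categories below are hypothetical, and "unsimulated shock" simply stands for any bad year the trial conditions did not cover.

```python
from statistics import mean

# Hypothetical household food-security scores (0 to 100) by kind of year.
# None of these numbers come from the trial described above.
scores = {
    "monoculture": {"normal": 85, "simulated drought": 80, "unsimulated shock": 25},
    "polyculture": {"normal": 70, "simulated drought": 65, "unsimulated shock": 60},
}

# The kinds of year the trial conditions covered.
TRIAL_YEARS = ("normal", "simulated drought")


def trial_view(system: str) -> float:
    """Average score over only the conditions the trial covered."""
    return mean(scores[system][year] for year in TRIAL_YEARS)


def worst_year(system: str) -> int:
    """Score in the worst year across all conditions, covered or not."""
    return min(scores[system].values())


for system in scores:
    print(f"{system}: trial view {trial_view(system):.1f}, "
          f"worst year {worst_year(system)}")
```

Judged on the trial's window, the monoculture looks better. Judged on the worst year, the ranking reverses, which is the comparison a multi-year, household-level design is meant to make visible.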

The reimagined study: the unit of analysis is the farming household across a multi-year cycle. The outcome is not crop yield but household food security, economic resilience, and soil health measured across seasons that include the bad year. The study design assumes that the farmer’s existing practice, the polyculture, is itself a form of knowledge, a situated response to conditions the study must understand before it intervenes.

What Changes and What Doesn’t

The essay owes the reader an honest position on what retroductive design does not replace.

The randomized controlled trial remains the strongest design for estimating the average effect of an isolable intervention in a defined population. When the mechanism is genuinely isolable and the context genuinely controllable, the RCT is the right tool. Drug efficacy trials for single-mechanism pharmaceuticals. Vaccine trials. Surgical technique comparisons. These are real applications where the assumptions hold and the design produces valid findings.

The retroductive design is for the situations where those assumptions do not hold. Where the mechanisms are interactive. Where the context cannot be controlled because the context is the mechanism. Where the compound is the unit and decomposing it destroys the thing being studied.

The argument is not RCT versus retroduction. The argument is that the research tradition has one tool for every situation, and some situations require a different tool. The institutional architecture (funding streams, journals, career incentives) overwhelmingly rewards the one tool. It does not reward the other.

The research enterprise needs both. The institutional reform required to fund and publish and reward retroductive study design is the same institutional reform the previous essay described for model integration: funding structures that reward boundary-crossing, career structures that do not penalize it, time horizons that match the phenomena being studied.

I wonder whether the generation of researchers now forming, the ones watching AI handle the computation and the data processing, will be the ones who build the integrative method. Not because they are smarter than their predecessors but because the thing they can contribute, the cross-domain judgment that AI cannot replicate, is exactly the thing the integrative framework requires. The pipeline handles the within-domain analysis. The human handles the between-domain connection. The connection is where the mechanisms live.

The Fourth Study

Kavitha has started designing her fourth study. It does not look like the first three.

The unit of analysis is not the nutrient. It is the household’s compound load across the agricultural cycle. She has developed a compound condition index, adapted from work she encountered at a conference, that measures the interaction between nutritional status, physical labor intensity, water access burden, household structure, and healthcare timing as a single variable. The index treats the interaction as signal, not noise.
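The essay does not specify how the index is computed, so the following is a guess at one possible shape, not Kavitha's instrument. It assumes each burden is scored from 0 to 1, and it adds a term for each interacting pair named in step one of the retroductive approach. The formula, the weight, and the example households are hypothetical.

```python
from typing import Dict, List, Tuple

# Interaction pairs follow the chain described in step one.
# The scoring formula and the 0-to-1 burden scale are assumptions.
INTERACTING_PAIRS: List[Tuple[str, str]] = [
    ("nutrition", "labor"),
    ("labor", "water"),
    ("water", "household"),
    ("household", "healthcare_timing"),
]


def additive_score(burdens: Dict[str, float]) -> float:
    """Conventional risk score: burdens summed independently."""
    return sum(burdens.values())


def compound_index(burdens: Dict[str, float],
                   interaction_weight: float = 1.0) -> float:
    """Additive score plus a term for each interacting pair of burdens."""
    interaction = sum(burdens[a] * burdens[b] for a, b in INTERACTING_PAIRS)
    return additive_score(burdens) + interaction_weight * interaction


# Two illustrative households with the same total burden.
household_a = {"nutrition": 0.2, "labor": 0.9, "water": 0.9,
               "household": 0.2, "healthcare_timing": 0.2}
household_b = {"nutrition": 0.9, "labor": 0.2, "water": 0.2,
               "household": 0.2, "healthcare_timing": 0.9}

for label, h in (("A", household_a), ("B", household_b)):
    print(f"household {label}: additive {additive_score(h):.2f}, "
          f"compound {compound_index(h):.2f}")
```

Both households receive the same additive score. The compound index separates them because household A's heaviest burdens, labor and water, are ones that interact.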

The outcome is not hemoglobin. It is a composite indicator she has built from three measures: biomarkers, functional capacity, and the women’s own assessment of whether their body can sustain what is being asked of it. The third measure, the self-assessment, is the one she trusts most and the one the reviewers will question first. She is including it anyway, because the women know things about their own pregnancies that no biomarker captures, and treating their testimony as data rather than anecdote is one of the things the reimagined study requires.

The method is not randomization of a single variable. It is a prospective cohort design that follows households through a full agricultural cycle, measuring the compound condition index at multiple points and tracking how changes in the compound predict maternal outcomes. She cannot randomize the monsoon. She cannot randomize the water source. She cannot randomize the household’s labor arrangement. She can observe the compound as it changes across the seasons and trace how the changes interact to produce the outcomes the supplementation trial could not explain.
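As a rough picture of the data such a cohort would produce, the sketch below stores one index reading per household per measurement point and summarizes how far the index swings across the cycle. The season labels, the record structure, and every value are invented for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List

# Hypothetical measurement points across one agricultural cycle.
SEASONS = ("pre-planting", "planting", "growing", "harvest")


@dataclass
class HouseholdRecord:
    household_id: str
    index_by_season: Dict[str, float]  # compound index at each point
    adverse_outcome: bool              # any adverse maternal outcome

    def peak_season(self) -> str:
        """Season in which the compound index was highest."""
        return max(SEASONS, key=lambda s: self.index_by_season[s])

    def seasonal_swing(self) -> float:
        """Difference between the highest and lowest index readings."""
        values = [self.index_by_season[s] for s in SEASONS]
        return max(values) - min(values)


def mean_swing(cohort: List[HouseholdRecord], adverse: bool) -> float:
    """Average seasonal swing among households with the given outcome."""
    return mean(h.seasonal_swing() for h in cohort
                if h.adverse_outcome == adverse)


# Illustrative cohort; every value here is invented for the sketch.
cohort = [
    HouseholdRecord("H1", dict(zip(SEASONS, (2.0, 3.5, 3.0, 2.5))), True),
    HouseholdRecord("H2", dict(zip(SEASONS, (2.0, 2.5, 2.5, 2.0))), False),
    HouseholdRecord("H3", dict(zip(SEASONS, (1.5, 3.5, 2.5, 2.0))), True),
    HouseholdRecord("H4", dict(zip(SEASONS, (2.5, 3.0, 2.5, 2.5))), False),
]

print("peak seasons:", {h.household_id: h.peak_season() for h in cohort})
print("mean swing, adverse outcome:   ", mean_swing(cohort, True))
print("mean swing, no adverse outcome:", mean_swing(cohort, False))
```

A comparison like this is descriptive, not causal, and four invented households prove nothing. The point is the shape of the record: the compound is observed as it moves through the seasons instead of being held fixed.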

She does not know if the journal will accept it. The design does not fit the standard reporting templates. The methods section describes retroduction, and the reviewers may not know the term. The sample size calculation does not apply in the conventional sense because the compound is the unit and compounds do not decompose into calculable independent observations.

She is submitting it anyway.

The Maps

The old maps on her shelf are beautiful and wrong. The 1978 Hyderabad map shows a city of two million with confident road markings and labeled neighborhoods. The city is now ten million. The roads have been rerouted three times. The neighborhoods have been renamed, demolished, rebuilt, renamed again. The mapmaker was not wrong in 1978. The territory changed, and the map’s categories could not hold the change.

Her first three studies are maps. Carefully drawn. Methodologically sound. Showing a version of the territory that was accurate within the projection system the discipline provided. The territory they were trying to describe, the actual structure of these women’s lives and the mechanisms determining whether their pregnancies end in health or harm, was always larger than what the projection could hold.

She would rather draw a new map, imperfect but pointed at the territory as it actually is, than keep refining the old one until its precision is flawless and its relationship to reality is nil.

The whiteboard still has the line. Study results on the left. Field observations on the right. The word “stratum” above it.

She has been thinking about erasing the line. Not because the distinction is wrong. Because the fourth study is an attempt to build a method that does not require the line. A method that treats what she observes in the field and what she measures in the study as evidence from different strata of the same reality, rather than as two kinds of knowledge, one rigorous and one anecdotal, that cannot speak to each other.

The line is still there. She has not erased it yet. She picks up the marker sometimes and holds it near the whiteboard and puts it down again. The erasing feels like a commitment she is not quite ready to make, or a commitment the institution she works within is not quite ready to absorb.

The maps on the shelf do not judge her. They were all, in their time, the best rendering of a territory someone cared enough to draw. They were all, in time, replaced by better renderings. The territory did not mind. The territory was always there, beneath every map, waiting to be seen as it actually was.

This is Part 79 of The Approximate Mind, and it completes the diagnostic arc that began with Part 74. The Interrogator asked what AI systems cannot see. The Epistemic Framework specified what a system designed to see it would need to be. The Amplitude Problem described the destruction of effort-as-filter. The Injected Center described manufactured consensus. The Missing Model asked why we cannot simulate the social contract’s consequences across dimensions. This essay asks what research itself would look like if it stopped decomposing what should not be decomposed. The answer is retroductive study design: start from outcomes, reason backward to mechanisms, treat the compound as the unit, apply the skeptic operations before the protocol is finalized, and measure what matters rather than what the existing instruments were built to measure. The prescriptive work, what could be built, belongs to The Reimagined, the series that follows.

References

Critical Realism and Research Methodology

Bhaskar, Roy. A Realist Theory of Science. Verso, 1975.

Danermark, Berth, et al. Explaining Society: Critical Realism in the Social Sciences. Routledge, 2002.

Pawson, Ray, and Nick Tilley. Realistic Evaluation. SAGE Publications, 1997.

The Limits of Conventional Method

Cartwright, Nancy. How the Laws of Physics Lie. Oxford University Press, 1983.

Cartwright, Nancy, and Jeremy Hardie. Evidence-Based Policy: A Practical Guide to Doing It Better. Oxford University Press, 2012.

Deaton, Angus, and Nancy Cartwright. “Understanding and Misunderstanding Randomized Controlled Trials.” Social Science & Medicine, vol. 210, 2018, pp. 2-21.

Epidemiology and Population Health

Krieger, Nancy. Epidemiology and the People’s Health: Theory and Context. Oxford University Press, 2011.

Geronimus, Arline T. Weathering: The Extraordinary Stress of Ordinary Life in an Unjust Society. Little, Brown Spark, 2023.

Decolonizing Research

Smith, Linda Tuhiwai. Decolonizing Methodologies: Research and Indigenous Peoples. Zed Books, 1999.

Chambers, Robert. Whose Reality Counts? Putting the First Last. Intermediate Technology Publications, 1997.

Qualitative and Integrative Approaches

Greenhalgh, Trisha. “Of Lamp Posts, Keys, and Fabled Drunkards: A Perspectival Tale of Four Guideline Reviews.” BMJ Quality & Safety, vol. 21, 2012, pp. 1-5.

Flyvbjerg, Bent. Making Social Science Matter: Why Social Inquiry Fails and How It Can Succeed Again. Cambridge University Press, 2001.

Another Science

Stengers, Isabelle. Another Science Is Possible: A Manifesto for Slow Science. Polity Press, 2018.

Indian Agriculture and Knowledge Systems

Shiva, Vandana. The Violence of the Green Revolution: Third World Agriculture, Ecology, and Politics. Zed Books, 1991.

Education and Formation

Biesta, Gert. The Rediscovery of Teaching. Routledge, 2017.
