Policy · Regulatory · Funders

Deployable Evidence vs Research Still Needed.

The single most common objection to acting on the gender health gap is "we need more research." Some of it is fair. Much of it is not. This essay draws the line, carefully.

The objection, and why it functions as a delay

In any regulatory, funder, or payer conversation about women's health, the recurring sentence is that the evidence is emerging, the signals are suggestive, and more research is needed before action is warranted. This sentence is deployed across findings that differ wildly in their evidentiary status. Some of them genuinely warrant that caveat. Others are supported by three decades of replicated, peer-reviewed work and a clear mechanism, and attaching more-research-is-needed to them is not epistemic humility but inertia.

The absence of shared criteria for deployability allows the objection to move fluidly between categories. A finding that is actually deployable gets treated as if it required further research; a finding that genuinely does need further research gets treated as if it had already failed to survive scrutiny. The conversation stalls in both directions simultaneously.

Four criteria for deployability

A finding is deployable when all four of the following conditions are met. If any one fails, the finding moves to research-still-needed, or to a smaller-scoped pilot.

01. Replicated effect: Two or more independent peer-reviewed studies in distinct populations showing the same direction and comparable magnitude.
02. Traceable mechanism: A biological or systemic mechanism that the effect plausibly depends on, published in mechanism-focused literature.
03. Actionable lever: A specific clinical, regulatory, or billing action that moves the outcome, requiring no new infrastructure.
04. Defined population: A population to whom the action applies, identifiable from existing records (an ICD-10 code, demographic criteria, or a standard risk flag).
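
The four-part rule above can be written as a small decision function. What follows is a minimal sketch, not the authors' tooling: the `Finding` class, its field names, and the `failed_criteria` and `classify` functions are hypothetical names introduced here for illustration. It encodes only what the text states, namely that all four criteria must hold for a finding to count as deployable. The essay does not say how to choose between research-still-needed and a smaller-scoped pilot, so the sketch returns a single combined label for any failure.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical record of one finding scored against the four criteria."""
    name: str
    replicated_effect: bool    # criterion 01
    traceable_mechanism: bool  # criterion 02
    actionable_lever: bool     # criterion 03
    defined_population: bool   # criterion 04


def failed_criteria(finding: Finding) -> list[str]:
    """Return the labels of the criteria the finding does not meet."""
    checks = {
        "replicated effect": finding.replicated_effect,
        "traceable mechanism": finding.traceable_mechanism,
        "actionable lever": finding.actionable_lever,
        "defined population": finding.defined_population,
    }
    return [label for label, met in checks.items() if not met]


def classify(finding: Finding) -> str:
    """Deployable only if all four criteria hold; otherwise not deployable."""
    if not failed_criteria(finding):
        return "deployable"
    return "research still needed (or smaller-scoped pilot)"


# The essay states that GDM-to-T2D screening passes all four tests.
gdm = Finding("GDM to T2D postpartum screening", True, True, True, True)
# Placeholder finding, invented here only to show a failing case.
placeholder = Finding("hypothetical finding", True, False, True, True)

print(classify(gdm))
print(classify(placeholder), failed_criteria(placeholder))
```

Returning the list of failed criteria, rather than only a label, keeps visible which gap a pilot or research program would have to close.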

Six findings that meet all four criteria

Each item below passes all four tests. They are deployable now. The phrase "more research is needed" applied to them is, politely, a placeholder for "we have not scheduled the action."

Deploy now: evidence sufficient for action

GDM to T2D postpartum screening
Meta-analysis establishes 10x T2D risk within 10 years of gestational diabetes. Current screening rate under 20 percent within the recommended window.
Lever: mandatory 6-12 week OGTT + annual HbA1c, coded as standard-of-care

Preeclampsia to CVD 10-year surveillance
22-study meta-analysis, 258,000 women, 4x heart-failure risk. Fewer than 10 percent receive any cardiovascular follow-up within the first year.
Lever: extend obstetric bundle to 10 years postpartum; see essay 14

Endometriosis diagnostic pathway correction
7 to 10 year diagnostic delay, $10,002 annual excess cost per patient (Soliman 2018, n=113,506). ICD-10 coding defaults to K58.9, R10.2.
Lever: diagnostic protocol + ICD-10 specificity at primary care

Sex-stratified CVD risk calculation
Framingham and ASCVD under-estimate female risk. Recalibrated tools (QRISK3, SCORE2 recalibration) exist and are validated.
Lever: primary-care software update, already available

Pharmacokinetic sex-adjusted dosing
76 of 86 FDA drugs show higher PK values in women (Zucker 2020). 96% of female-biased PK associated with higher ADR.
Lever: dose-flag at prescribing interface, starting with top-10 ADR-signal drugs

Sex stratification pre-specified in pivotal trials
IQVIA 2025 documents trial parity deficits. NIH SABV policy already requires this in principle.
Lever: FDA and EMA pre-specification guidance at end-of-Phase-2

Research still needed: genuine frontiers

Fetal microchimerism and autoimmune onset
Persistence is documented. Causal contribution to autoimmune mid-life clustering is hypothesised but not yet mechanistically proven.
Need: prospective cohort + tissue-level microchimeric cell isolation

Cycle-phase pharmacokinetic dosing
Aggregate sex differences are documented. Within-cycle pharmacokinetic variance is real but the clinically-meaningful magnitude and drug list are not yet established.
Need: cycle-stratified Phase 1 protocols for CYP3A4-sensitive drugs

Ghost-population prevalence quantification
Claims-vs-PREMs gap documented directionally; the cross-condition magnitude is not yet rigorously estimated.
Need: paired-cohort PREMs-against-claims study in one jurisdiction; see essay 17

Climate-menopause vulnerability at municipal granularity
The thermoregulatory mechanism is established; the population-level heat-event ED admissions are documented. Municipal vulnerability stratification is not.
Need: CDC HeatRisk + N95.x code linkage at MSA level; see essay 18

Medial-septum Y-chromosome-dosage pharmacology
Transcriptomic evidence is strong for chromosomally-driven neural dimorphism. Drug-discovery implications are nascent.
Need: YCD-stratified neuropsychiatric trial protocols

Cross-indication inflammation-axis therapeutic validation
Mechanism is plausible per essay 19. Pan-indication trials are not yet commissioned.
Need: industry-sponsored cross-indication inflammation-target pivotal

The same findings, plotted on the two axes that matter

Mechanistic certainty sits on the horizontal axis. Action readiness (whether the clinical, billing, or regulatory lever already exists) sits on the vertical axis. A finding in the top-right quadrant is deployable. A finding in the bottom-left is genuinely research-frontier. The remaining two quadrants are pilots and stopped programs, respectively. Points are positioned by the authors' calibrated reading of the underlying literature and are illustrative rather than meta-analytic.

[Figure: Evidence-readiness map · mechanism certainty × action readiness. Horizontal axis: mechanism certainty, 0 to 1.0. Vertical axis: action readiness. Quadrant labels: deploy now; pilot · build lever; research frontier; low priority. Deploy-column findings (6): GDM→T2D, Preeclampsia, Endometriosis, CVD calc, PK dosing, SABV pre-spec. Research-column findings (6): Microchimerism, Cycle-phase PK, Ghost-pop, Climate granularity, Medial-septum YCD, Inflammation trial.]

Positioning reflects the authors' calibrated reading of each finding's evidentiary base against the four deployability criteria. The two clusters are separated by a clear diagonal band, not because the underlying science is neatly bimodal, but because the four criteria above happen to move together: a finding with a replicated effect and traceable mechanism tends to already have a lever and a defined population, because those attributes accrete during the same decades of work.
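
The quadrant reading of the map can be sketched as a lookup on two scores. This is an illustration under stated assumptions, not the authors' scoring method: the essay gives no numeric coordinates or cut-offs, so the `quadrant` function name, the 0-to-1 scale on the action-readiness axis, and the 0.5 threshold are all hypothetical. Only the top-right and bottom-left assignments come from the text. Placing "pilot (build lever)" at high mechanism certainty with low action readiness, and "low priority" at the opposite corner, is an assumption read off the figure labels.

```python
def quadrant(mechanism_certainty: float,
             action_readiness: float,
             threshold: float = 0.5) -> str:
    """Map two scores in [0, 1] to a quadrant label.

    The 0.5 threshold and the off-diagonal labels are assumptions,
    not values given in the essay.
    """
    for value in (mechanism_certainty, action_readiness):
        if not 0.0 <= value <= 1.0:
            raise ValueError("scores must lie in [0, 1]")

    high_mechanism = mechanism_certainty >= threshold
    high_readiness = action_readiness >= threshold

    if high_mechanism and high_readiness:
        return "deploy now"            # top-right, per the essay
    if high_mechanism:
        return "pilot (build lever)"   # assumed: mechanism known, lever missing
    if high_readiness:
        return "low priority"          # assumed: lever exists, mechanism weak
    return "research frontier"         # bottom-left, per the essay


# Hypothetical coordinates, chosen only to exercise each branch.
for point in [(0.9, 0.9), (0.9, 0.2), (0.2, 0.9), (0.2, 0.2)]:
    print(point, quadrant(*point))
```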

The cost of the distinction: how long the deployable evidence has been waiting

Each deploy-column finding has an evidence-publication date and a current implementation rate. The gap between the two is not a measure of epistemic caution; it is a measure of institutional lag. The horizon bars below encode that gap on a calendar from the year of first major meta-analysis (or equivalent consolidation) to 2026, with the darker overlay showing the fraction of the at-risk population currently receiving the indicated action.

[Figure: Evidence-publication year → 2026 · deploy-column lag horizons. Light band: evidence known since. Solid overlay: current implementation rate. Rows, with years of lag:]
GDM to T2D postpartum screening · Bellamy Lancet 2009 · ~18% within window · 17 yr
Preeclampsia to CVD surveillance · Bellamy BMJ 2007 · <10% follow-up · 19 yr
Endometriosis diagnostic pathway · Nnoaham F&S 2011 · delay unchanged · 15 yr
Sex-stratified CVD risk calc · Ridker JAMA 2007 · QRISK3 2017 · 19 yr
Sex-adjusted PK dosing · FDA zolpidem 2013 · Zucker BoS 2020 · 13 yr
Trial sex-stratification pre-spec · NIH SABV 2016 · IQVIA 2025 gap · 10 yr

The lightest band encodes the span of years during which each finding has been consolidated in peer-reviewed literature; the solid overlay encodes the fraction of the target population currently receiving the indicated action. Implementation-rate figures are taken from the respective specialty surveillance literatures and rounded for display; the directional point — long known, barely deployed — is robust to reasonable variation in the exact percentage.
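
The lag figures in the horizon chart are simple subtraction from the 2026 reference year. The sketch below reproduces that arithmetic; the names `EVIDENCE_YEAR`, `REFERENCE_YEAR`, and `lag_years` are hypothetical, and where a row lists two dates the earlier one is used, which is an assumption that happens to match the lags shown.

```python
REFERENCE_YEAR = 2026  # end of the calendar used in the chart

# Earliest evidence year listed for each deploy-column row.
EVIDENCE_YEAR = {
    "GDM to T2D postpartum screening": 2009,    # Bellamy Lancet 2009
    "Preeclampsia to CVD surveillance": 2007,   # Bellamy BMJ 2007
    "Endometriosis diagnostic pathway": 2011,   # Nnoaham F&S 2011
    "Sex-stratified CVD risk calc": 2007,       # Ridker JAMA 2007
    "Sex-adjusted PK dosing": 2013,             # FDA zolpidem 2013
    "Trial sex-stratification pre-spec": 2016,  # NIH SABV 2016
}


def lag_years(evidence_year: int, reference_year: int = REFERENCE_YEAR) -> int:
    """Years elapsed between evidence consolidation and the reference year."""
    if evidence_year > reference_year:
        raise ValueError("evidence year cannot be after the reference year")
    return reference_year - evidence_year


lags = {name: lag_years(year) for name, year in EVIDENCE_YEAR.items()}

# Print the longest-waiting findings first.
for name, lag in sorted(lags.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {lag} yr")
```

Keeping the reference year as a parameter lets the same table be recomputed in later years without editing the evidence dates.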

Why the distinction matters operationally

A regulator, a funder, or a payer who internalises the two-column ledger above can move immediately. The deploy column converts directly into policy instruments: CMS bundled-payment extension, FDA pre-specification guidance, commercial-payer clinical-pathway updates, CPT code additions, reimbursement-protocol modifications. None of these requires waiting on the research column, and none is contingent on it.

The research column, separately, carries its own funding and design implications. It should be resourced as a distinct portfolio, not as a justification for inaction on the deploy column. The NIH Sex as a Biological Variable policy, in its 2016 form, was precisely this kind of research-portfolio instrument. Its enforcement gap is documented (IQVIA 2025) but the enforcement-gap conversation is separate from the deploy-column conversation.

When the two columns are collapsed, the result is the status quo: every finding is treated as pre-deployable, every action is postponed, and the cumulative national cost of the gender health gap continues to be paid from the downstream clinical and disability budgets while the upstream action remains unscheduled.

One specific request to funders, regulators, and health-system leads

Every internal memo, guideline draft, and policy brief that references women's health research should, as a formatting convention, list findings under the two-column format used here. The convention forces the question "what action does this imply?" on every finding at the moment of citation. It is a small editorial change. It converts the deploy column into an operational queue and the research column into a funded program, without allowing the two to substitute for each other.

This essay is short on purpose. The framework is the deliverable. The six deploy-column items are not controversial within their specialty literatures. They are, institutionally, still waiting to be scheduled.

Related reading: Pathway Failure Is Correctable for the thesis; Pricing the Pathway for the actuarial-profession deploy list; Pregnancy Is a Cardiovascular Stress Test for a fully-worked single-example deploy case.