Prediction Validation — GaiaLab Biological Intelligence Platform

Prospective validation evidence — building in real time

Each analysis run adds to this dataset. As predictions accumulate and ClinicalTrials.gov checks complete, calibration curves populate here automatically. No other open biological intelligence platform publishes this data continuously — that is the standard we hold ourselves to.

Pre-registered Prospective Validation Protocol

We pre-register how forward predictions are judged, before outcomes are known, so that the claim is falsifiable and cannot be tuned to a result. The protocol is modeled on the sealed-lockbox methodology used in rigorous ML-for-medicine work (e.g. Obermeyer et al., Nature 2026).

Sealed lockbox: 8,087 unresolved predictions + 8,087 placebo pairs, frozen 2026-06-28, content-hashed (SHA-256 e6d94d57…). Append-only; the git commit history is the pre-registration record.
Primary endpoint: off-label prospective hit rate. On-label concordances are excluded — for an approved drug a later trial is expected and is not evidence of foresight.
Placebo control: shuffled drug–disease pairs must show no lead-time enrichment. If they match the real cohort, the signal is an artifact and the claim is rejected.
Honest caveats (pre-stated): AI training data predates predictions (preprint leakage possible); prospective precedence is necessary but not sufficient for independent prediction.
Readouts: 90 / 180 / 365 days — published here pass or fail, no cherry-picking.
Pre-specified failure criteria: the claim is rejected if off-label hits ≤ placebo hits, if there are 0 off-label hits, or if calibration error (ECE) stays > 0.20.
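
The failure criteria above can be written as a mechanical check. The following is a minimal sketch, not GaiaLab's implementation: the Readout type and its field names are hypothetical, and it evaluates a single readout, whereas the protocol speaks of ECE that "stays" above 0.20 across readouts. Comparing raw hit counts assumes the real and placebo cohorts are the same size, as in the lockbox described above.

```python
from dataclasses import dataclass

ECE_THRESHOLD = 0.20  # threshold taken from the pre-specified failure criteria


@dataclass(frozen=True)
class Readout:
    """Hypothetical summary of one readout (90, 180, or 365 days)."""
    off_label_hits: int  # real off-label predictions later matched by a trial
    placebo_hits: int    # shuffled placebo pairs later matched by a trial
    ece: float           # expected calibration error at this readout


def rejection_reasons(r: Readout) -> list[str]:
    """Return every pre-specified failure criterion that this readout triggers."""
    reasons = []
    if r.off_label_hits == 0:
        reasons.append("zero off-label hits")
    if r.off_label_hits <= r.placebo_hits:
        reasons.append("off-label hits <= placebo hits")
    if r.ece > ECE_THRESHOLD:
        reasons.append("ECE > 0.20")
    return reasons


def claim_rejected(r: Readout) -> bool:
    """The claim is rejected if any single criterion fires."""
    return bool(rejection_reasons(r))
```

Returning the list of triggered criteria, rather than a bare boolean, keeps a published readout auditable: a reader can see which rule caused a rejection.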

Full protocol: docs/PROSPECTIVE_VALIDATION_PROTOCOL.md · raw cohort: /api/predictions
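
For readers who want to reason about the lockbox mechanics, the sketch below shows one way a cohort could be content-hashed and a placebo cohort derived from it. It is an illustration under assumptions: the canonical JSON serialization, the record layout, and the seeded shuffle are hypothetical choices, and the protocol document remains the authority on how the published SHA-256 digest was actually computed.

```python
import hashlib
import json
import random


def content_hash(records: list[dict]) -> str:
    """SHA-256 over a canonical JSON serialization (assumed scheme).

    Sorting keys and fixing separators makes the digest independent of
    dict key order; the order of records in the list still matters.
    """
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def placebo_pairs(pairs: list[tuple[str, str]], seed: int) -> list[tuple[str, str]]:
    """Shuffle the disease column against the drug column with a fixed seed.

    Every drug and every disease appears as often as in the real cohort;
    only the pairing is randomized.
    """
    drugs = [drug for drug, _ in pairs]
    diseases = [disease for _, disease in pairs]
    random.Random(seed).shuffle(diseases)
    return list(zip(drugs, diseases))
```

A shuffle like this can reproduce a real pair by chance; a stricter variant would reshuffle or drop such collisions. Fixing the seed makes the placebo cohort reproducible, so it can be frozen and hashed alongside the real one.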

Full Prediction Ledger

Every therapeutic candidate GaiaLab has scored: timestamped, unfiltered, and continuously cross-referenced against ClinicalTrials.gov. All entries in this name-matched ledger are retrospective benchmarks. Semantic matching has separately surfaced 15 prospective matches (2 off-label repurposing hypotheses + 13 on-label concordances); see /new-trials.

Drug	Disease Context	Confidence	Outcome	Trial NCT IDs	Recorded

Methodology & Definitions

How predictions are recorded: At the end of every analysis, up to 10 drug candidates are saved with their confidence score, disease context, target genes, and a timestamp. Records are immutable — no retroactive changes.
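
As an illustration of the recording step, the sketch below models a prediction as an immutable record. The class name, field names, and input dictionary keys are hypothetical, and ranking candidates by confidence before applying the 10-candidate cap is an assumption: the description above only says that up to 10 are saved.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

MAX_CANDIDATES_PER_RUN = 10  # "up to 10 drug candidates" per analysis


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class PredictionRecord:
    drug: str
    disease_context: str
    confidence: float
    target_genes: tuple[str, ...]  # tuple rather than list, so it cannot be mutated
    recorded_at: datetime


def records_for_run(candidates: list[dict], disease_context: str) -> list[PredictionRecord]:
    """Build records for one analysis run, all sharing one UTC timestamp."""
    recorded_at = datetime.now(timezone.utc)
    # Assumption: the highest-confidence candidates are the ones kept.
    ranked = sorted(candidates, key=lambda c: c["confidence"], reverse=True)
    return [
        PredictionRecord(
            drug=c["drug"],
            disease_context=disease_context,
            confidence=c["confidence"],
            target_genes=tuple(c["target_genes"]),
            recorded_at=recorded_at,
        )
        for c in ranked[:MAX_CANDIDATES_PER_RUN]
    ]
```

A frozen dataclass only prevents in-process mutation; immutability of the stored ledger would still depend on an append-only store.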

Validation check: Each prediction is queried against the ClinicalTrials.gov API v2 (clinicaltrials.gov/api/v2/studies) using the drug name + disease context. "Validated" = a completed disease-matched trial was found. "Trial active" = an actively recruiting trial was found (direction confirmed, outcome pending). "Insufficient data" = no trials were found; this counts against accuracy.
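
The validation check can be sketched roughly as follows. This is an approximation, not the platform's code: mapping the drug name to query.intr and the disease context to query.cond is an assumption about how the query is built, the function names are hypothetical, and the precedence (a completed trial outranks a recruiting one) is inferred from the definitions above. Any stricter disease matching beyond the query itself is not shown.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://clinicaltrials.gov/api/v2/studies"


def build_query_url(drug: str, disease: str, page_size: int = 50) -> str:
    """Build a ClinicalTrials.gov API v2 search URL for a drug + condition."""
    params = {"query.intr": drug, "query.cond": disease, "pageSize": page_size}
    return f"{API_URL}?{urlencode(params)}"


def fetch_studies(drug: str, disease: str) -> list[dict]:
    """Fetch the first page of matching studies (network call)."""
    with urlopen(build_query_url(drug, disease), timeout=30) as resp:
        return json.load(resp).get("studies", [])


def classify(studies: list[dict]) -> tuple[str, list[str]]:
    """Map study records to an outcome label plus the supporting NCT IDs."""
    completed, recruiting = [], []
    for study in studies:
        protocol = study.get("protocolSection", {})
        nct_id = protocol.get("identificationModule", {}).get("nctId")
        status = protocol.get("statusModule", {}).get("overallStatus")
        if status == "COMPLETED":
            completed.append(nct_id)
        elif status == "RECRUITING":
            recruiting.append(nct_id)
    if completed:
        return "Validated", completed
    if recruiting:
        return "Trial active", recruiting
    return "Insufficient data", []
```

Calling classify(fetch_studies(drug, disease)) with a drug name and disease context of interest would yield the label and trial IDs for one ledger row. Only the first result page is read here; a fuller version would follow the API's pagination token.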

What this is NOT: This does not measure whether GaiaLab identified the drug before the trial started (we don't have that date information). It measures whether the drug+disease direction is being/was pursued in a clinical setting — a proxy for research relevance, not therapeutic efficacy.

Calibration curve: A well-calibrated system shows higher-confidence predictions matching trials at higher rates than lower-confidence ones. This is research direction correspondence, not therapeutic outcome accuracy. A true efficacy calibration curve requires prospective trial completion data — that data will be added as it matures. We publish what we have, not what looks best.
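
To make the calibration idea concrete, the sketch below bins predictions by confidence and compares mean confidence with the observed trial-match rate in each bin. It assumes confidence scores lie in [0, 1] and uses equal-width bins; both are illustrative choices, as is the standard weighted-gap definition of expected calibration error (ECE) used here. The function names are hypothetical.

```python
def calibration_bins(confidences, matched, n_bins=5):
    """Return (count, mean confidence, match rate) for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, matched):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 lands in the last bin
        bins[idx].append((conf, hit))
    rows = []
    for members in bins:
        if members:
            mean_conf = sum(c for c, _ in members) / len(members)
            match_rate = sum(1 for _, h in members if h) / len(members)
            rows.append((len(members), mean_conf, match_rate))
    return rows


def expected_calibration_error(rows):
    """Count-weighted average gap between mean confidence and match rate."""
    total = sum(n for n, _, _ in rows)
    if total == 0:
        return 0.0
    return sum(n * abs(mean_conf - rate) for n, mean_conf, rate in rows) / total
```

Plotting match rate against mean confidence per bin gives the curve; a well-calibrated system sits near the diagonal, and ECE summarizes the distance from it in one number.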

Data source: GET /api/predictions · GET /api/predictions/calibration — public, no auth required.
