GaiaLab

Type a gene panel. Get ranked drug candidates
cross-referenced against 75+ databases — in 60 seconds.

Six-factor scored drug candidates, pathway enrichment, and PMID-linked hypotheses — grounded in 75+ live biological databases, reviewed by a six-agent AI debate, and cross-referenced against ClinicalTrials.gov. All outputs are computational research hypotheses requiring independent validation.

Case studies: Dabrafenib ↗ Lecanemab ↗ Adagrasib ↗ Concordance ledger ↗
⟳ Convergent Dysfunction Detection · 6-agent structured AI debate · Timestamped concordance ledger · No login required · open accuracy data
75+
Databases queried per analysis (parallel)
6
AI agent debate roles per analysis
Unique drug-disease pairs benchmarked — retrospective concordance matches ↗ vs ClinicalTrials.gov · plus 15 semantic prospective matches ↗ (2 off-label repurposing hypotheses + 13 on-label concordance)
Brier score (N≈500 resolved) · vs 0.25 no-skill baseline · lower is better · methodology →
Caveat: calibration error (ECE) is currently high, so the low Brier score here partly reflects class imbalance rather than proven calibration. See /calibration.
0.90 / 0.545
Two AUROC benchmarks · 0.90 temporal holdout (n=22, curated) · 0.545 retrospective (N=529, all areas) · neither is a clinical predictor · full validation →
Researchers who run 3+ analyses on the same disease unlock contradiction alerts, score drift detection, and hypothesis confidence tiers. Start now →
How GaiaLab gets smarter with every analysis
drug associations promoted to high-confidence through repeated evidence
analyses have calibrated the scoring model
6h
self-improving proactive cycles — adaptive weights, prompt variants, model re-ranking
View full learning dashboard →
CONVERGENT DYSFUNCTION DETECTION
New · powered by pathway-level analysis

GaiaLab identifies gene pairs that independently dysregulate the same disease pathway via distinct molecular mechanisms — with no direct protein-protein interaction. Like convergent evolution in nature (sharks and dolphins arriving at the same body plan), these pairs suggest the pathway is attacked from two independent angles, making pathway-level therapeutics more powerful than single-gene targeting.

Example pairs detected
BRCA1 ↔ TP53 · DNA repair pathway
EGFR ↔ KRAS · MAPK signalling
Score 0–100 · shown on gene cards after analysis
GaiaLab-scored · ClinicalTrials.gov concordance
Tier I Olaparib BRCA1/2 · PARP inhibition ✦ FDA-approved · PARP inhibitor · concordance tracked
Tier I Midostaurin FLT3-ITD AML · FLT3 inhibition ✦ FDA-approved · concordance tracked
Tier I Osimertinib EGFR-mutant NSCLC · 3rd-gen TKI ✦ FDA-approved · concordance tracked
Tier I Ivosidenib IDH1-mutant AML · IDH1 inhibition ✦ FDA-approved · concordance tracked
Tier I Dabrafenib BRAF V600E · BRAF inhibition ✦ BRAF V600E · concordance tracked Case study →
Tier I Nivolumab Pan-cancer · PD-1 checkpoint ✦ FDA-approved · PD-1 checkpoint · concordance tracked
Full concordance ledger →
Platform Activity
Live platform counts · predictions cross-referenced against ClinicalTrials.gov
Gene Panels profiled
Drug–Target Pairs evidence-validated
Unique Predictions timestamped ledger
Trial Corroborations ClinicalTrials.gov
Evidence Grounding citation-backed insights
Actionable Loci cross-source concordant
ACTIVE Evolution Log →
Evidence Operating System LIVE

Every claim has a lineage. Every drug has a state. Every board accumulates evidence over time.

GaiaLab is not a report generator. It is a persistent translational intelligence layer where claims carry provenance (PMID, confidence, contradiction score), drug candidates transition through a lifecycle (proposed → grounded → validated → deprecated), and disease boards accumulate evidence across every analysis your team runs — with exponential decay for stale signals and contradiction alerts when new data conflicts with prior conclusions.
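
The exponential decay of stale signals mentioned above can be pictured with a half-life model. This is a minimal sketch only: the 180-day half-life and the `decayed_weight` helper are hypothetical, since the page does not state the actual decay constant or formula.

```python
HALF_LIFE_DAYS = 180.0  # assumption: illustrative value, not a documented GaiaLab parameter


def decayed_weight(initial_weight: float, age_days: float,
                   half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Exponentially decay an evidence weight so that it halves every half_life_days."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return initial_weight * 0.5 ** (age_days / half_life_days)
```

Under these assumptions, a claim weighted 1.0 today would carry weight 0.5 after one half-life and 0.25 after two.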

⊡ Disease Boards ▣ Decision Workspace ◎ Semantic Search ◍ Analytics Spine ⇄ KG Path Finder ⚠ Evidence Decay Monitor

Drug Repurposing Engine

A six-factor weighted score (target overlap, clinical evidence, mechanism alignment, pathway relevance, safety profile, disease context) ranks FDA-approved and investigational agents into Tiers I–III. AlphaFold pLDDT provides an orthogonal structural druggability signal, and CIViC and OncoKB evidence levels calibrate confidence assignments. Scores are computational estimates. Two benchmarks are reported. Prospective calibration: AUROC 0.90 on 22 known drug approvals held out by approval year (temporal holdout), with 22/22 temporal recalls, 8/8 negative controls passed, and mean rank 3 (methodology →). Retrospective benchmark (May 2026 snapshot, 22 disease areas): AUROC 0.545 (bootstrap 95% CI: 0.526–0.562) against a 0.50 random baseline (full validation data →). The two benchmarks measure different things; neither is a clinical predictor.
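
To make the six-factor idea concrete, here is a minimal sketch of a weighted composite score with tier assignment. The factor names come from the text; the weights, the 0–1 input scale, and the tier cut-offs (70 and 40) are illustrative assumptions, not GaiaLab's published parameters.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FactorScores:
    """Six factor scores, each assumed to be on a 0-1 scale."""
    target_overlap: float
    clinical_evidence: float
    mechanism_alignment: float
    pathway_relevance: float
    safety_profile: float
    disease_context: float


# Assumption: illustrative weights that sum to 1.0; the real weights are not given on this page.
WEIGHTS = {
    "target_overlap": 0.25,
    "clinical_evidence": 0.25,
    "mechanism_alignment": 0.15,
    "pathway_relevance": 0.15,
    "safety_profile": 0.10,
    "disease_context": 0.10,
}


def composite_score(s: FactorScores) -> float:
    """Weighted sum of the six factors, rescaled to 0-100."""
    for f in fields(s):
        value = getattr(s, f.name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{f.name} must be in [0, 1], got {value}")
    return 100.0 * sum(WEIGHTS[f.name] * getattr(s, f.name) for f in fields(s))


def tier(score: float) -> str:
    """Map a 0-100 score to a tier; the cut-offs 70 and 40 are hypothetical."""
    if score >= 70:
        return "Tier I"
    if score >= 40:
        return "Tier II"
    return "Tier III"
```

A candidate that is strong on every factor lands in Tier I under these cut-offs, while one with only partial target overlap and no clinical evidence would fall to Tier III.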

Convergent Dysfunction Detection

Identifies gene pairs that independently dysregulate the same disease pathway via distinct molecular mechanisms — with no direct protein-protein interaction between them. Inspired by convergent evolution in biology (sharks and dolphins independently arriving at the same body plan). When two panel genes converge on the same pathway without interacting, pathway-level therapeutic targeting is stronger than targeting either gene alone. Detected pairs are flagged in gene cards, generate novel hypotheses (novelty 0.80), and boost drug scores for agents targeting the shared pathway. ⟳ See gene cards after analysis.
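
The three conditions in that definition (shared pathway, distinct mechanisms, no direct interaction) can be sketched as a simple filter over gene pairs. The function name, the input shapes, and the toy labels are hypothetical; the platform's actual detection logic and its 0–100 scoring are not described here.

```python
from itertools import combinations


def convergent_pairs(pathway_members, mechanisms, ppi_edges):
    """Return (gene_a, gene_b, pathway) triples for genes that share a pathway,
    act through different mechanisms, and have no direct PPI edge.

    pathway_members: dict mapping pathway name -> set of gene symbols
    mechanisms:      dict mapping gene symbol -> mechanism label
    ppi_edges:       iterable of (gene, gene) pairs with a direct interaction
    """
    edges = {frozenset(e) for e in ppi_edges}
    hits = []
    for pathway, genes in sorted(pathway_members.items()):
        for a, b in combinations(sorted(genes), 2):
            if mechanisms.get(a) is None or mechanisms.get(b) is None:
                continue  # no mechanism annotation: cannot judge convergence
            if mechanisms[a] == mechanisms[b]:
                continue  # same mechanism: not convergent
            if frozenset((a, b)) in edges:
                continue  # direct interaction: excluded by the definition
            hits.append((a, b, pathway))
    return hits
```

With toy inputs where G1 and G2 interact directly and G1 and G3 share a mechanism, only the G2/G3 pair would be reported for their common pathway.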

Concordance Tracking

Each candidate is timestamped at output time and matched against ClinicalTrials.gov. Matches where the prediction timestamp precedes trial registration date are classified as prospective; all others are retrospective. Prospective concordance indicates the system surfaced a hypothesis before investigators registered a corresponding trial — it does not constitute efficacy evidence. Completed trial status does not imply positive outcome. Full ledger: /validation.
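
The prospective/retrospective rule stated above reduces to a single timestamp comparison. A minimal sketch, assuming both timestamps are timezone-aware `datetime` objects; the function name is hypothetical.

```python
from datetime import datetime, timezone


def classify_match(predicted_at: datetime, trial_registered_at: datetime) -> str:
    """'prospective' if the prediction strictly precedes trial registration,
    otherwise 'retrospective' (this includes equal timestamps)."""
    return "prospective" if predicted_at < trial_registered_at else "retrospective"


# Usage with made-up dates:
prediction = datetime(2025, 1, 10, tzinfo=timezone.utc)
registration = datetime(2025, 6, 1, tzinfo=timezone.utc)
label = classify_match(prediction, registration)
```

The comparison says nothing about trial outcome; it only orders the two events in time, consistent with the caveats in the paragraph above.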

CIViC + OncoKB Evidence Integration

Each gene is queried against CIViC (community-curated clinical variant evidence, Levels A–E; Washington University) and OncoKB (FDA-recognised precision oncology knowledge base; Memorial Sloan Kettering). Returns: drug associations per variant, AMP/ACMG tier, oncogene/TSG classification, and Level 1–R2 biomarker designations. CIViC: no token required. OncoKB: institutional token recommended.

Mechanism Derivation

Checkpoint and resistance genes are mapped to immunotherapy escape mechanisms via pathway enrichment, protein interaction topology, and cross-source agreement. Mechanistic assignments are derived from structured database outputs, not generative text synthesis. Assignments flagged as low-agreement carry an explicit uncertainty annotation.

Protein Interaction Network

Force-directed PPI graph with temporal overlay. Displays hub centrality, curated and computationally predicted edges, and topological context aggregated across STRING, BioGRID, and related interaction databases. Predicted edges are visually distinguished from experimentally validated interactions.

Evidence Ledger

Each output claim is linked to one or more PubMed citations and assigned a polarity classification: supporting, contradicting, or mixed. Claims without citation support are flagged, not suppressed. Full PMID traceability is preserved in exported evidence packages.

Reproducible Snapshots

Each run produces a tamper-evident snapshot encoding gene inputs, database versions, model configuration, scoring parameters, and complete outputs. Snapshots can be diffed against prior runs or replayed independently. Intended for methods-section documentation and internal audit trails.
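
One common way to make a snapshot tamper-evident is to hash a canonical serialisation of it. The sketch below assumes a JSON-serialisable snapshot and SHA-256; the page does not say which mechanism GaiaLab actually uses, and the function names are hypothetical.

```python
import hashlib
import hmac
import json


def snapshot_digest(snapshot: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_snapshot(snapshot: dict, expected_digest: str) -> bool:
    """Recompute the digest and compare it in constant time."""
    return hmac.compare_digest(snapshot_digest(snapshot), expected_digest)
```

Because keys are sorted before hashing, two snapshots with the same content produce the same digest regardless of key order, while any change to inputs, versions, or outputs changes it.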

Quality Gating

Evidence Depth Score, Contention Index, and grounded citation ratio gate every output. Claims below configurable thresholds are flagged with explicit rationale. Nothing is silently discarded — suppressed claims are logged and accessible in the full evidence export.

Multi-Source Evidence Consensus

Each conclusion is cross-validated across 16 independent data channels: genomics, protein structure, pathway enrichment, literature, drug bioactivity, clinical trials, disease association, interaction networks, expression, safety, and others. Agreement across channels elevates confidence score; divergence triggers a contradiction flag and downgrades the claim. No single source is treated as determinative.

Structured Export

Outputs export as JSON evidence packages or formatted briefs. Each export includes scoring context, PMID citations, contradiction annotations, and complete model configuration metadata. Format is designed for direct insertion into methods sections or internal research reports.

Calibration Feedback Loop

Prediction outcomes are periodically verified against ClinicalTrials.gov status updates. Hypothesis outputs are cross-checked against new PubMed entries. Calibration drift is detected and a recalibration multiplier is applied at each server cycle. Confidence score distributions are published at /validation.
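
For readers who want to check calibration figures themselves, the Brier score and expected calibration error (ECE) can be computed as below. This is a generic textbook sketch with ten equal-width bins assumed; it is not taken from GaiaLab's code.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def expected_calibration_error(probs, outcomes, n_bins=10):
    """Bin-size-weighted mean of |observed rate - mean predicted probability|."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and of equal length")
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        confidence = sum(p for p, _ in b) / len(b)
        observed = sum(y for _, y in b) / len(b)
        ece += len(b) / n * abs(observed - confidence)
    return ece
```

Predicting 0.5 for every case on a balanced set gives a Brier score of 0.25, which is the no-skill baseline quoted elsewhere on this page. On an imbalanced set, always predicting a low probability can yield a low Brier score without the individual probabilities being informative, which is why ECE is reported alongside it.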

TCGA Survival Stratification

Kaplan-Meier OS curves stratified by mutation status across 15 TCGA cohorts (BRCA, LUAD, GBM, PAAD, and 11 additional). Returns log-rank p-value, hazard ratio, and median OS via the cBioPortal public API. Survival data is observational; no causal inference is implied. No institutional subscription required.
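
As background for the survival outputs, the Kaplan-Meier estimator itself is short enough to sketch in plain Python. This illustrates the standard product-limit estimate and a median read-off; it does not call cBioPortal and is not the platform's implementation.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per patient
    events:    1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= (at_risk - deaths) / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival is at or below 0.5, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

Stratifying by mutation status would mean running `kaplan_meier` once per group and comparing the curves, for example with a log-rank test, which is not shown here.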

Researcher Outcome Submission

Users can submit confirmed, refuted, partial, or inconclusive outcomes from the results page. Submissions are aggregated into per-decile calibration curves and applied as recalibration multipliers at the next server cycle. All submitted outcomes are treated as self-reported and are not independently verified.

Knowledge Graph Contradiction Alerts

Each analysis is compared against prior runs in the knowledge graph for the same disease context. A contradiction alert is raised when a therapeutic score shifts more than 20 points relative to prior runs for the same gene–drug pair. Weekly digests list the most materially changed conclusions. Alerts indicate score drift, not independent evidence of a clinical finding.
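
The 20-point rule can be sketched as a small check. One assumption is flagged loudly: "relative to prior runs" is read here as the mean of prior scores for the same gene–drug pair, which the page does not specify.

```python
from statistics import mean

SHIFT_THRESHOLD = 20.0  # points, as stated in the text above


def contradiction_alert(prior_scores, new_score, threshold=SHIFT_THRESHOLD):
    """True when new_score differs from the mean of prior scores by more than threshold.

    Assumption: the baseline is the mean of prior runs; the real baseline may differ.
    """
    if not prior_scores:
        return False  # no history for this gene-drug pair, so nothing to contradict
    return abs(new_score - mean(prior_scores)) > threshold
```

Under this reading, prior scores of 60, 62, and 64 followed by a new score of 85 would raise an alert (a 23-point shift), while a new score of 80 would not.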

CIViC Variant Evidence

Each gene is queried against CIViC (Clinical Interpretation of Variants in Cancer; Washington University in St. Louis). Returns evidence levels A (validated association) through E (inferential), AMP/ACMG tier, and drug associations per variant. CIViC is community-curated; evidence quality varies by entry and requires independent verification.

Pathway Enrichment

Gene panels are enriched against MSigDB Hallmark (50 gene sets), KEGG 2021 Human, and WikiPathways via the Enrichr API. Statistical significance is assessed with Benjamini-Hochberg FDR correction. Results indicate overrepresentation in curated gene sets — not direct measurement of pathway activity in a specific tumour.
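
The Benjamini-Hochberg step can be reproduced independently on any list of enrichment p-values. The sketch below is the standard step-up procedure in plain Python; it does not call the Enrichr API.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values
```

Gene sets are then ranked by q-value, and a set is typically called significant when its q-value falls below a chosen FDR level such as 0.05.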

OncoKB Biomarker Levels

When an API token is configured, each gene is queried against OncoKB (Memorial Sloan Kettering; FDA-recognised). Returns Level 1 biomarkers (FDA-approved companion diagnostics), Level 2 (standard of care), Level 3B (investigational), and Level R1/R2 (resistance markers), plus oncogene vs. tumour suppressor classification. Requires institutional token for full access.

Analysis output — per run
🔬
Pathway Enrichment
Overrepresented gene sets ranked by BH-corrected FDR q-value. MSigDB Hallmark, KEGG, WikiPathways.
💊
Repurposing Candidates
FDA-approved and investigational agents scored 0–100 across six factors. Tier I–III assignment. Computational estimates only — not clinical recommendations. Methodology · Calibration →
🧠
Mechanistic Hypotheses
Multi-agent debate output with calibrated confidence scores and suggested experimental designs. Requires independent wet-lab validation.
📋
PMID Evidence Ledger
Each claim linked to primary PubMed citations with polarity classification: supporting, contradicting, or mixed.
📊
Concordance Record
Each candidate timestamped and matched against ClinicalTrials.gov. Prospective vs. retrospective classification explicit per entry.
Export: JSON evidence package · PDF brief · CSV drug table · Reproducible snapshot (inputs, model version, scoring parameters)

Signal Credibility Metrics — live from the most recent analysis run

Citation Coverage
Grounded Ratio
Contradiction Rate
Grounded Depth
Inferred Depth
Populated after analysis · Run a gene panel above to see live metrics · Full methodology →
Independently validated · published openly · no login required · gailabai.com/validation
unique drug-disease pairs benchmarked vs ClinicalTrials.gov · name-matched ledger all retrospective (semantic matching adds 15 prospective: 2 off-label + 13 on-label)
80
retrospective concordance matches — unique pairs where a ClinicalTrials.gov trial exists (trial pre-dates our analysis). Plus 15 AI-verified prospective matches found via pgvector semantic similarity, checked against the ClinicalTrials.gov intervention list (AI-assisted review + operator sign-off — not expert peer review). Honest split: 13 on-label concordance + 2 off-label hypotheses. View ↗
0.545
Platform-wide retrospective AUROC · N=529 predictions · 22 disease areas · May 2026 snapshot · bootstrap 95% CI: 0.526–0.562 · vs 0.50 baseline · full data →
Temporal holdout benchmark (different, smaller test): AUROC 0.90 on 22 known drug approvals held out by year · 8/8 negative controls · methodology →
75+
Live biological databases queried per analysis · all outputs are research hypotheses requiring independent validation
Accuracy data published openly. Raw JSON: /api/predictions/calibration · Benchmark source: scripts/benchmark-auroc.js in the public repository.
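
For anyone re-deriving the AUROC and its bootstrap interval from the raw JSON, a generic sketch follows. It is not a port of scripts/benchmark-auroc.js, whose contents are not shown here; the function names, the percentile method, and the default of 1,000 resamples are assumptions.

```python
import random


def auroc(scores, labels):
    """AUROC via the Mann-Whitney identity; tied scores count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC, resampling cases with replacement."""
    n = len(scores)
    if not 0 < sum(labels) < n:
        raise ValueError("labels must be 0/1 and contain both classes")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lost one class
            stats.append(auroc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

An interval whose lower bound sits above 0.50 indicates signal above the random baseline, but as the page notes, that alone does not make the score a clinical predictor.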
About

Open computational drug repurposing with published accuracy data.

GaiaLab generates ranked repurposing candidates from a gene list in under 60 seconds, drawing from 75+ biological databases (CIViC, OncoKB, DGIdb, DrugCentral, OpenAlex, PharmGKB, MSigDB Hallmark, AlphaFold, DepMap, TCGA, ClinGen, COSMIC, JASPAR, STRING, gnomAD, OT Genetics, and 38 others). Each candidate is scored across six evidence dimensions, reviewed through a six-role structured AI debate, and cross-referenced against ClinicalTrials.gov. Performance data (AUROC, calibration curves, concordance breakdown) is published at /validation. All outputs are computational research hypotheses. Independent experimental validation is required before any therapeutic or clinical application.

Houston, TX  ·  Disease areas: GBM, AML, Alzheimer's, breast cancer, NSCLC, pancreatic cancer, and others  ·  Research use only  ·  partnerships@gailabai.com

Retrospective concordance matches across unique drug-disease pairs · all pre-date our analyses · full ledger →
$0
Cost to run your first analysis

Context

Developed in Houston, TX. Designed for translational research teams that need rapid hypothesis generation across gene panels without institutional informatics infrastructure. Primary use cases: target prioritisation, drug repurposing triage, mechanistic hypothesis scoping before wet-lab investment.

Access

No login or subscription required for standard analyses. API access and team workspaces available on paid plans. All analyses return the same evidence — access tier affects export formats and rate limits, not scoring or data sources.

Limitations

  • LLM synthesis: Mechanistic hypotheses are generated by language models, not trained predictors. Treat as hypothesis input, not conclusion.
  • Public APIs only: No proprietary databases. Coverage gaps exist for certain gene classes and rare indications.
  • Two benchmarks: Prospective calibration AUROC 0.90 (22 known drug approvals, temporal holdout, mean rank 3, 8/8 neg controls) — methodology →. Retrospective AUROC 0.545 (May 2026 snapshot, 22 disease areas) — modest signal above 0.50 random baseline. Not a clinically validated predictor.

Methods & Scoring

  • Confidence tiers derived from cross-source agreement, study design classification, and citation depth across 75+ databases.
  • Evidence polarity scoring (supporting / contradicting / mixed) identifies where published data diverges from the scored conclusion.
  • Per-claim PMID ledger with full scoring context exported with every analysis.
  • Run snapshots encode database versions, model configuration, gate outcomes, and all scored outputs for independent replay.

Inspect a Sample Snapshot

Download a complete audit snapshot containing evidence packages, scoring context, reproducible gene inputs, data source versions, and full model configuration metadata.

🌐 Platform Analytics

Aggregated across all GaiaLab analyses — updated continuously
🔬 816 analyses · ⭐ #1 gene: KRAS · 💊 #1 drug: encorafenib · 🧬 30 genes tracked
Top Genes by Study Frequency
KRAS · 532×
BRAF · 530×
NRAS · 527×
NKX2-1 · 104×
BRCA1 · 85×
PALB2 · 85×
BRCA2 · 85×
IDH1 · 72×
SMAD4 · 68×
PIK3CA · 67×
Most Studied Disease Areas
colorectal cancer · 524
breast cancer · 62
non-small cell lun… · 61
kras g12d pancreat… · 60
brca-mutant early-… · 59
pan-cancer · 56
alzheimers disease · 56
kras g12c non-smal… · 55
Most Co-Studied Gene Pairs
BRAF+KRAS · 528
BRAF+NRAS · 527
KRAS+NRAS · 527
BRCA1+BRCA2 · 85
BRCA1+PALB2 · 85
BRCA2+PALB2 · 85
Top Surfaced Drug Candidates
🥇 encorafenib · 76% trial match · 966×
🥈 dabrafenib · 81% trial match · 834×
🥉 glecaprevir · 100% trial match · 727×
telaprevir · 100% trial match · 727×
selumetinib · 48% trial match · 498×
regorafenib · 75% trial match · 473×
mrtx-1133 · 471×
adagrasib · 34% trial match · 471×
Disease × Gene Frequency
Genes: KRAS · BRAF · NRAS · BRCA2 · PALB2 · BRCA1 · EGFR · TP53
colorectal cance… · 524 · 522 · 521
breast cancer · 62 · 62 · 62 · 12 · 12
non-small cell l… · 61
kras g12d pancre… · 60 · 60
brca-mutant earl… · 59 · 59 · 59
  • KRAS is the most frequently studied gene on the platform (532 analyses).
  • Encorafenib is the top surfaced drug candidate across all disease contexts (966×).

Melanoma IO Resistance Panel

Oncology reference panels
Flagship →
Internal demos / reference panels

Anti-PD-1 Resistance Audit: Melanoma

  • 10-gene IO resistance panel: PDCD1, CD274, CTLA4, LAG3, HAVCR2, PTEN, B2M, JAK1, STK11, BRAF
  • IO Response Score + TCGA SKCM mutation frequencies (n=440) queried live from cBioPortal
  • Export a reproducible JSON evidence package with per-claim PMID traceability

Breast Cancer Panel

  • TP53, BRCA1, EGFR analyzed in breast cancer disease context
  • Inspect pathway enrichment rankings and grounded ratio
  • Diff against a prior snapshot for run-to-run stability

Colorectal KRAS Panel

  • KRAS, NRAS, BRAF analyzed in colorectal cancer context
  • Explore 3D interaction network hub centrality
  • Review mechanism classifications and therapeutic overlap

IO Responder Profile: Inflamed TME

  • Inflamed panel: CD8A, CXCL9, CXCL10, PDCD1, LAG3, TIGIT — cytotoxic T-cell infiltration with chemoattractant signature
  • IO Response Score 100/100 (strong response likelihood) — contrast with 16/100 resistance panel
  • Identify actionable checkpoints: LAG3 → relatlimab, TIGIT → tiragolumab

Configure Analysis

Enter any gene list and disease context. The pipeline queries 75+ biological databases in parallel — including PubMed, ChEMBL, OpenTargets, ClinicalTrials.gov, OpenFDA, DGIdb, DrugCentral, OpenAlex, and PharmGKB — then synthesises pathways, therapeutic candidates, mechanistic hypotheses, and a confidence-scored evidence ledger. No account required.
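
The parallel fan-out described above can be illustrated with `asyncio`. In this sketch `query_source` is a stand-in that only sleeps, and `query_all` is a hypothetical orchestrator; real clients, endpoints, and the platform's own concurrency model are not shown on this page.

```python
import asyncio


async def query_source(name: str, genes: list[str], delay: float = 0.01) -> dict:
    """Stand-in for one database call; a real client would issue an HTTP request here."""
    await asyncio.sleep(delay)  # simulate network latency
    return {"source": name, "genes": genes, "hits": []}


async def query_all(sources: list[str], genes: list[str], timeout: float = 5.0) -> dict:
    """Query every source concurrently; one failure or timeout does not sink the rest."""
    tasks = [asyncio.wait_for(query_source(s, genes), timeout) for s in sources]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    ok = {s: r for s, r in zip(sources, results) if not isinstance(r, Exception)}
    failed = [s for s, r in zip(sources, results) if isinstance(r, Exception)]
    return {"ok": ok, "failed": failed}


# Usage: total wall time is close to the slowest source, not the sum of all sources.
out = asyncio.run(query_all(["PubMed", "ChEMBL", "DGIdb"], ["KRAS", "NRAS", "BRAF"]))
```

Collecting failures separately, rather than raising on the first error, lets downstream steps record which sources were missing for a given run.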
Analyses run · Drug candidates · Diseases mapped · Gene nodes (live · updated every analysis)
Enter 2–15 gene symbols separated by commas
Be specific — disease, subtype, and mechanism context improve output quality
Optional. Reuse the same workspace ID to track prior runs, contradictions, and changed conclusions over time.
Workspace IDs can stay anonymous, or you can create a protected workspace with invite-based team access.
Adds DGIdb, ChEMBL, DrugCentral, ClinicalTrials.gov, OpenFDA, and PubChem. Adds ~10s to analysis time.
Clinical Biomarkers · FDA-Approved IO Predictors (optional)
≥10 mut/Mb = TMB-H (KEYNOTE-158). ≥20 = very high.
CPS≥1 nivolumab eligible · CPS≥10 pembrolizumab preferred · TPS≥50% monotherapy
Unlock drug candidates, exports & higher limits · Academic: Free → · Get a key →
5 free analyses/day · Results in ~30 seconds · Upgrade for more →
⚖️
Compare two diseases with the same gene panel
Free & instant. Same genes. Two diseases. See exactly where the evidence diverges.
BRCA1/2 · breast vs ovarian →
KRAS · colorectal vs pancreatic →
EGFR · NSCLC vs TNBC →
Custom comparison →

Running Analysis

Pipeline stages: Databases · Literature · AI Synthesis · Evidence Gate · Assembly
GaiaLab Evidence Assistant
Melanoma IO resistance · Checkpoint biology · Evidence audit
Evidence assistant for melanoma IO resistance audits. I can interpret checkpoint gene evidence, explain resistance mechanisms, review polarity scores, and surface clinical trial matches.

Example queries:
• "What resistance mechanisms involve PD-L1 upregulation?"
• "Which checkpoint genes have active clinical trials?"
• "Explain LAG3 role in anti-PD-1 resistance"