GaiaLab — Computational Drug Repurposing · 75+ databases · Research Use

Evidence Operating System LIVE

Every claim has a lineage. Every drug has a state. Every board accumulates evidence over time.

GaiaLab is not a report generator. It is a persistent translational intelligence layer where claims carry provenance (PMID, confidence, contradiction score), drug candidates transition through a lifecycle (proposed → grounded → validated → deprecated), and disease boards accumulate evidence across every analysis your team runs — with exponential decay for stale signals and contradiction alerts when new data conflicts with prior conclusions.
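
A minimal sketch of the lifecycle and decay ideas above. The state names come from the text; the table of allowed moves, the function names, and the 180-day half-life are illustrative assumptions, not GaiaLab's published configuration.

```python
# Lifecycle states come from the text; which moves are legal is an assumption.
ALLOWED_TRANSITIONS = {
    "proposed": {"grounded", "deprecated"},
    "grounded": {"validated", "deprecated"},
    "validated": {"deprecated"},
    "deprecated": set(),
}

def advance(state: str, new_state: str) -> str:
    """Return new_state if the move is allowed, otherwise raise ValueError."""
    if new_state not in ALLOWED_TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition: {state} -> {new_state}")
    return new_state

def decayed_weight(age_days: float, half_life_days: float = 180.0) -> float:
    """Exponential decay: the weight halves every half_life_days (assumed value)."""
    if age_days < 0 or half_life_days <= 0:
        raise ValueError("age_days must be >= 0 and half_life_days > 0")
    return 0.5 ** (age_days / half_life_days)
```

Under these assumptions a signal one half-life old counts half as much as a fresh one, and a deprecated candidate cannot return to an earlier state.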

⊡ Disease Boards ▣ Decision Workspace ◎ Semantic Search ◍ Analytics Spine ⇄ KG Path Finder ⚠ Evidence Decay Monitor

Drug Repurposing Engine

Six-factor weighted score (target overlap, clinical evidence, mechanism alignment, pathway relevance, safety profile, disease context) ranks FDA-approved and investigational agents into Tier I–III. AlphaFold pLDDT provides an orthogonal structural druggability signal. CIViC and OncoKB evidence levels calibrate confidence assignments. Scores are computational estimates.

Two benchmarks:
Prospective calibration: AUROC 0.90 on 22 known drug approvals held out by approval year (temporal holdout), 22/22 temporal recalls, 8/8 negative controls passed, mean rank 3 · methodology →
Retrospective benchmark (May 2026 snapshot, 22 disease areas): AUROC 0.545 (bootstrap 95% CI: 0.526–0.562) vs 0.50 random baseline · full validation data →
These measure different things; neither is a clinical predictor.
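
A minimal sketch of a six-factor weighted score with tier assignment. The factor names follow the text; the equal weights, the 0–1 input scale, and the tier cut-offs (70 and 40) are assumptions chosen for illustration, since the actual values are not stated here.

```python
FACTORS = (
    "target_overlap", "clinical_evidence", "mechanism_alignment",
    "pathway_relevance", "safety_profile", "disease_context",
)

# Hypothetical equal weights; the source does not publish the weighting here.
WEIGHTS = {name: 1.0 / len(FACTORS) for name in FACTORS}

def composite_score(factors: dict) -> float:
    """Weighted sum of six factor values in [0, 1], scaled to 0-100."""
    if set(factors) != set(FACTORS):
        raise ValueError("expected exactly the six named factors")
    if any(not 0.0 <= v <= 1.0 for v in factors.values()):
        raise ValueError("factor values must lie in [0, 1]")
    return 100.0 * sum(WEIGHTS[name] * factors[name] for name in FACTORS)

def assign_tier(score: float, tier_1_min: float = 70.0, tier_2_min: float = 40.0) -> str:
    """Map a 0-100 score to Tier I-III using assumed cut-offs."""
    if score >= tier_1_min:
        return "Tier I"
    if score >= tier_2_min:
        return "Tier II"
    return "Tier III"

# Made-up factor values for a single candidate.
example = dict.fromkeys(FACTORS, 0.5)
example["clinical_evidence"] = 0.9
print(round(composite_score(example), 1), assign_tier(composite_score(example)))
```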

Convergent Dysfunction Detection

Identifies gene pairs that independently dysregulate the same disease pathway via distinct molecular mechanisms — with no direct protein-protein interaction between them. Inspired by convergent evolution in biology (sharks and dolphins independently arriving at the same body plan). When two panel genes converge on the same pathway without interacting, pathway-level therapeutic targeting is stronger than targeting either gene alone. Detected pairs are flagged in gene cards, generate novel hypotheses (novelty 0.80), and boost drug scores for agents targeting the shared pathway. ⟳ See gene cards after analysis.
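
The detection rule described above can be sketched as a filter over gene pairs: shared pathway, different mechanism, no direct interaction edge. The input shapes, function name, and toy data are assumptions for illustration; they are not real biology and not GaiaLab's internal data model.

```python
from itertools import combinations

def convergent_pairs(pathways, mechanisms, ppi_edges):
    """Return (gene_a, gene_b, shared_pathways) for pairs that share a pathway,
    act through different mechanisms, and have no direct PPI edge."""
    edges = {frozenset(e) for e in ppi_edges}
    hits = []
    for a, b in combinations(sorted(pathways), 2):
        shared = pathways[a] & pathways[b]
        if not shared:
            continue  # no common pathway
        if mechanisms[a] == mechanisms[b]:
            continue  # same mechanism, so not "distinct"
        if frozenset((a, b)) in edges:
            continue  # direct interaction rules the pair out
        hits.append((a, b, sorted(shared)))
    return hits

# Toy, made-up inputs (placeholder gene and pathway names).
pathways = {"GENE_A": {"P1", "P2"}, "GENE_B": {"P2"}, "GENE_C": {"P2"}}
mechanisms = {"GENE_A": "loss_of_function", "GENE_B": "amplification",
              "GENE_C": "loss_of_function"}
ppi_edges = [("GENE_B", "GENE_C")]
print(convergent_pairs(pathways, mechanisms, ppi_edges))
```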

Concordance Tracking

Each candidate is timestamped at output time and matched against ClinicalTrials.gov. Matches where the prediction timestamp precedes trial registration date are classified as prospective; all others are retrospective. Prospective concordance indicates the system surfaced a hypothesis before investigators registered a corresponding trial — it does not constitute efficacy evidence. Completed trial status does not imply positive outcome. Full ledger: /validation.
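
The prospective/retrospective rule above reduces to a single date comparison. A minimal sketch, with a hypothetical function name and made-up dates:

```python
from datetime import date

def classify_match(predicted_on: date, trial_registered_on: date) -> str:
    """Prospective only if the prediction strictly precedes trial registration;
    every other case, including same-day, is retrospective."""
    return "prospective" if predicted_on < trial_registered_on else "retrospective"

print(classify_match(date(2025, 1, 10), date(2025, 6, 1)))
print(classify_match(date(2025, 6, 1), date(2024, 3, 15)))
```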

CIViC + OncoKB Evidence Integration

Each gene is queried against CIViC (community-curated clinical variant evidence, Levels A–E; Washington University) and OncoKB (FDA-recognised precision oncology knowledge base; Memorial Sloan Kettering). Returns: drug associations per variant, AMP/ACMG tier, oncogene/TSG classification, and Level 1–R2 biomarker designations. CIViC: no token required. OncoKB: institutional token recommended.

Mechanism Derivation

Checkpoint and resistance genes are mapped to immunotherapy escape mechanisms via pathway enrichment, protein interaction topology, and cross-source agreement. Mechanistic assignments are derived from structured database outputs, not generative text synthesis. Assignments flagged as low-agreement carry an explicit uncertainty annotation.

Protein Interaction Network

Force-directed PPI graph with temporal overlay. Displays hub centrality, curated and computationally predicted edges, and topological context aggregated across STRING, BioGRID, and related interaction databases. Predicted edges are visually distinguished from experimentally validated interactions.

Evidence Ledger

Each output claim is linked to one or more PubMed citations and assigned a polarity classification: supporting, contradicting, or mixed. Claims without citation support are flagged, not suppressed. Full PMID traceability is preserved in exported evidence packages.

Reproducible Snapshots

Each run produces a tamper-evident snapshot encoding gene inputs, database versions, model configuration, scoring parameters, and complete outputs. Snapshots can be diffed against prior runs or replayed independently. Intended for methods-section documentation and internal audit trails.
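
The text does not say how tamper evidence is achieved. One common approach, shown here as an assumption, is a SHA-256 digest over a canonical JSON encoding, plus a simple top-level diff between two snapshots. Field names in the usage are hypothetical.

```python
import hashlib
import json

_MISSING = object()

def snapshot_digest(snapshot: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, fixed separators)."""
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def changed_fields(old: dict, new: dict) -> list:
    """Top-level keys whose values differ between two snapshots."""
    keys = old.keys() | new.keys()
    return sorted(k for k in keys if old.get(k, _MISSING) != new.get(k, _MISSING))

# Hypothetical snapshot fields for illustration.
run_1 = {"genes": ["KRAS", "NRAS", "BRAF"], "db_versions": {"example_db": "1"}}
run_2 = {"genes": ["KRAS", "NRAS"], "db_versions": {"example_db": "1"}}
print(snapshot_digest(run_1) == snapshot_digest(run_2), changed_fields(run_1, run_2))
```

Any edit to a stored snapshot changes its digest, so a digest recorded at run time is enough to detect later modification.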

Quality Gating

Evidence Depth Score, Contention Index, and grounded citation ratio gate every output. Claims below configurable thresholds are flagged with explicit rationale. Nothing is silently discarded — suppressed claims are logged and accessible in the full evidence export.
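
A minimal sketch of threshold gating that flags rather than discards. The three metric names follow the text; the threshold values, field names, and the `Claim` class are assumptions for illustration.

```python
from dataclasses import dataclass, field

# Assumed threshold values; the text says only that they are configurable.
THRESHOLDS = {"min_evidence_depth": 0.5, "max_contention": 0.4, "min_grounded_ratio": 0.6}

@dataclass
class Claim:
    text: str
    evidence_depth: float
    contention: float
    grounded_ratio: float
    flags: list = field(default_factory=list)

def gate(claim: Claim, thresholds: dict = THRESHOLDS) -> Claim:
    """Attach a rationale for every failed gate; the claim itself is always kept."""
    min_depth = thresholds["min_evidence_depth"]
    max_contention = thresholds["max_contention"]
    min_grounded = thresholds["min_grounded_ratio"]
    if claim.evidence_depth < min_depth:
        claim.flags.append(f"evidence depth {claim.evidence_depth} below {min_depth}")
    if claim.contention > max_contention:
        claim.flags.append(f"contention index {claim.contention} above {max_contention}")
    if claim.grounded_ratio < min_grounded:
        claim.flags.append(f"grounded ratio {claim.grounded_ratio} below {min_grounded}")
    return claim
```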

Multi-Source Evidence Consensus

Each conclusion is cross-validated across 16 independent data channels: genomics, protein structure, pathway enrichment, literature, drug bioactivity, clinical trials, disease association, interaction networks, expression, safety, and others. Agreement across channels elevates confidence score; divergence triggers a contradiction flag and downgrades the claim. No single source is treated as determinative.
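
One way to picture the consensus rule: each channel casts a vote, agreement raises confidence, and any mix of supporting and contradicting votes raises a contradiction flag and applies a downgrade. The vote encoding, step size, and function name are assumptions, not the platform's actual formula.

```python
def consensus(votes: dict, base_confidence: float, step: float = 0.05):
    """votes maps channel name -> +1 (supports), -1 (contradicts), 0 (no signal).
    Returns (adjusted confidence clamped to [0, 1], contradiction flag)."""
    support = sum(1 for v in votes.values() if v > 0)
    against = sum(1 for v in votes.values() if v < 0)
    contradiction = support > 0 and against > 0
    adjusted = base_confidence + step * (support - against)
    if contradiction:
        adjusted -= step  # assumed extra downgrade when channels diverge
    return max(0.0, min(1.0, adjusted)), contradiction

print(consensus({"genomics": 1, "literature": 1, "safety": 0}, 0.5))
print(consensus({"genomics": 1, "literature": -1}, 0.5))
```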

Structured Export

Outputs export as JSON evidence packages or formatted briefs. Each export includes scoring context, PMID citations, contradiction annotations, and complete model configuration metadata. Format is designed for direct insertion into methods sections or internal research reports.

Calibration Feedback Loop

Prediction outcomes are periodically verified against ClinicalTrials.gov status updates. Hypothesis outputs are cross-checked against new PubMed entries. Calibration drift is detected and a recalibration multiplier is applied at each server cycle. Confidence score distributions are published at /validation.

TCGA Survival Stratification

Kaplan-Meier OS curves stratified by mutation status across 15 TCGA cohorts (BRCA, LUAD, GBM, PAAD, and 11 additional). Returns log-rank p-value, hazard ratio, and median OS via the cBioPortal public API. Survival data is observational; no causal inference is implied. No institutional subscription required.
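
For readers unfamiliar with the estimator, a plain-Python Kaplan-Meier sketch follows. It is independent of cBioPortal and of GaiaLab's implementation; the function names and the four-patient toy data are made up.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.
    events: 1 = death observed, 0 = censored."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

def median_survival(curve):
    """First time at which S(t) drops to 0.5 or below, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Toy data: months of follow-up and event indicators.
curve = kaplan_meier([5, 8, 12, 20], [1, 0, 1, 1])
print(curve, median_survival(curve))
```

Comparing two such curves (mutated vs wild-type) with a log-rank test gives the p-value mentioned above; that step is omitted here.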

Researcher Outcome Submission

Users can submit confirmed, refuted, partial, or inconclusive outcomes from the results page. Submissions are aggregated into per-decile calibration curves and applied as recalibration multipliers at the next server cycle. All submitted outcomes are treated as self-reported and are not independently verified.
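
A sketch of how per-decile recalibration multipliers could be derived from submitted outcomes. The numeric value given to each outcome label, the exclusion of "inconclusive", and the ratio formula (observed rate divided by mean predicted confidence) are all assumptions; the text does not define them.

```python
from collections import defaultdict

# Assumed scoring of self-reported outcomes; "inconclusive" is left out of the curve.
OUTCOME_VALUE = {"confirmed": 1.0, "partial": 0.5, "refuted": 0.0}

def decile_multipliers(records):
    """records: iterable of (confidence in [0, 1], outcome label).
    Returns {decile index 0-9: observed rate / mean predicted confidence}."""
    bins = defaultdict(list)
    for confidence, outcome in records:
        if outcome not in OUTCOME_VALUE:
            continue
        decile = min(int(confidence * 10), 9)
        bins[decile].append((confidence, OUTCOME_VALUE[outcome]))
    multipliers = {}
    for decile, rows in sorted(bins.items()):
        mean_predicted = sum(c for c, _ in rows) / len(rows)
        observed = sum(v for _, v in rows) / len(rows)
        if mean_predicted > 0:
            multipliers[decile] = observed / mean_predicted
    return multipliers
```

A multiplier below 1 means predictions in that decile were overconfident relative to the reported outcomes; above 1 means underconfident.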

Knowledge Graph Contradiction Alerts

Each analysis is compared against prior runs in the knowledge graph for the same disease context. A contradiction alert is raised when a therapeutic score shifts more than 20 points relative to prior runs for the same gene–drug pair. Weekly digests list the most materially changed conclusions. Alerts indicate score drift, not independent evidence of a clinical finding.
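
The alert rule above can be sketched as a comparison of per-pair scores. Here "prior runs" is simplified to a single prior score per gene–drug pair, which is an assumption; the scores in the usage are made up.

```python
def contradiction_alerts(prior_scores, current_scores, threshold=20.0):
    """Both inputs map (gene, drug) -> score on the 0-100 scale.
    Returns (pair, prior, current) for shifts strictly greater than threshold."""
    alerts = []
    for pair, current in sorted(current_scores.items()):
        prior = prior_scores.get(pair)
        if prior is not None and abs(current - prior) > threshold:
            alerts.append((pair, prior, current))
    return alerts

# Made-up scores for illustration only.
prior = {("KRAS", "adagrasib"): 70, ("BRAF", "encorafenib"): 80}
current = {("KRAS", "adagrasib"): 45, ("BRAF", "encorafenib"): 65,
           ("NRAS", "selumetinib"): 50}
print(contradiction_alerts(prior, current))
```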

CIViC Variant Evidence

Each gene is queried against CIViC (Clinical Interpretation of Variants in Cancer; Washington University in St. Louis). Returns evidence levels A (validated association) through E (inferential), AMP/ACMG tier, and drug associations per variant. CIViC is community-curated; evidence quality varies by entry and requires independent verification.

Pathway Enrichment

Gene panels are enriched against MSigDB Hallmark (50 gene sets), KEGG 2021 Human, and WikiPathways via the Enrichr API. Statistical significance is assessed with Benjamini-Hochberg FDR correction. Results indicate overrepresentation in curated gene sets — not direct measurement of pathway activity in a specific tumour.
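
For reference, the Benjamini-Hochberg adjustment itself is short. This is a generic sketch of the standard procedure, not the Enrichr API or GaiaLab's code; the p-values in the usage are made up.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone in rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues

print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```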

OncoKB Biomarker Levels

When an API token is configured, each gene is queried against OncoKB (Memorial Sloan Kettering; FDA-recognised). Returns Level 1 biomarkers (FDA-approved companion diagnostics), Level 2 (standard of care), Level 3B (investigational), and Level R1/R2 (resistance markers), plus oncogene vs. tumour suppressor classification. Requires institutional token for full access.

Analysis output — per run

🔬 Pathway Enrichment

Overrepresented gene sets ranked by BH-corrected FDR q-value. MSigDB Hallmark, KEGG, WikiPathways.

💊 Repurposing Candidates

FDA-approved and investigational agents scored 0–100 across six factors. Tier I–III assignment. Computational estimates only, not clinical recommendations. Methodology · Calibration →

🧠 Mechanistic Hypotheses

Multi-agent debate output with calibrated confidence scores and suggested experimental designs. Requires independent wet-lab validation.

📋 PMID Evidence Ledger

Each claim linked to primary PubMed citations with polarity classification: supporting, contradicting, or mixed.

📊 Concordance Record

Each candidate timestamped and matched against ClinicalTrials.gov. Prospective vs. retrospective classification explicit per entry.

Export: JSON evidence package · PDF brief · CSV drug table · Reproducible snapshot (inputs, model version, scoring parameters)

Signal Credibility Metrics — live from the most recent analysis run

Citation Coverage
Grounded Ratio
Contradiction Rate
Grounded Depth
Inferred Depth

Populated after analysis · Run a gene panel above to see live metrics · Full methodology →

Independently validated · published openly · no login required · gailabai.com/validation

—

unique drug-disease pairs benchmarked vs ClinicalTrials.gov · name-matched ledger all retrospective (semantic matching adds 15 prospective: 2 off-label + 13 on-label)

retrospective concordance matches — unique pairs where a ClinicalTrials.gov trial exists (trial pre-dates our analysis). Plus 15 AI-verified prospective matches found via pgvector semantic similarity, checked against the ClinicalTrials.gov intervention list (AI-assisted review + operator sign-off — not expert peer review). Honest split: 13 on-label concordance + 2 off-label hypotheses. View ↗

0.545

Platform-wide retrospective AUROC · N=529 predictions · 22 disease areas · 2026-03 benchmark · bootstrap 95% CI: 0.526–0.562 · vs 0.50 baseline · full data →

Temporal holdout benchmark (different, smaller test): AUROC 0.90 on 22 known drug approvals held out by year · 8/8 negative controls · methodology →

75+

Live biological databases queried per analysis · all outputs are research hypotheses requiring independent validation

Accuracy data published openly. Raw JSON: /api/predictions/calibration · Benchmark source: scripts/benchmark-auroc.js in the public repository.
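
To make the AUROC and bootstrap-interval figures concrete, here is a generic sketch of both calculations. It is not a port of scripts/benchmark-auroc.js; the percentile bootstrap, resample count, and toy labels are assumptions.

```python
import random

def auroc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC, resampling cases with replacement."""
    auroc(labels, scores)  # fails fast if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lost a class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy labels (1 = known positive) and scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.2, 0.4, 0.6]
print(auroc(labels, scores), bootstrap_ci(labels, scores, n_boot=200))
```

An AUROC of 0.50 corresponds to random ranking, which is the baseline quoted above.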

About

Open computational drug repurposing with published accuracy data.

GaiaLab generates ranked repurposing candidates from a gene list in under 60 seconds, drawing from 75+ biological databases (CIViC, OncoKB, DGIdb, DrugCentral, OpenAlex, PharmGKB, MSigDB Hallmark, AlphaFold, DepMap, TCGA, ClinGen, COSMIC, JASPAR, STRING, gnomAD, OT Genetics, and 38 others). Each candidate is scored across six evidence dimensions, reviewed through a six-role structured AI debate, and cross-referenced against ClinicalTrials.gov. Performance data (AUROC, calibration curves, concordance breakdown) is published at /validation. All outputs are computational research hypotheses. Independent experimental validation is required before any therapeutic or clinical application.

Houston, TX · Disease areas: GBM, AML, Alzheimer's, breast cancer, NSCLC, pancreatic cancer, and others · Research use only · partnerships@gailabai.com

—

Retrospective concordance matches across — unique drug-disease pairs · all pre-date our analyses · full ledger →

Cost to run your first analysis

Context

Developed in Houston, TX. Designed for translational research teams that need rapid hypothesis generation across gene panels without institutional informatics infrastructure. Primary use cases: target prioritisation, drug repurposing triage, mechanistic hypothesis scoping before wet-lab investment.

Access

No login or subscription required for standard analyses. API access and team workspaces available on paid plans. All analyses return the same evidence — access tier affects export formats and rate limits, not scoring or data sources.

Limitations

LLM synthesis: Mechanistic hypotheses are generated by language models, not trained predictors. Treat as hypothesis input, not conclusion.
Public APIs only: No proprietary databases. Coverage gaps exist for certain gene classes and rare indications.
Two benchmarks: Prospective calibration AUROC 0.90 (22 known drug approvals, temporal holdout, mean rank 3, 8/8 negative controls) · methodology →. Retrospective AUROC 0.545 (May 2026 snapshot, 22 disease areas), a modest signal above the 0.50 random baseline. Not a clinically validated predictor.

Methods & Scoring

Confidence tiers derived from cross-source agreement, study design classification, and citation depth across 75+ databases.
Evidence polarity scoring (supporting / contradicting / mixed) identifies where published data diverges from the scored conclusion.
Per-claim PMID ledger with full scoring context exported with every analysis.
Run snapshots encode database versions, model configuration, gate outcomes, and all scored outputs for independent replay.

Inspect a Sample Snapshot

Download a complete audit snapshot containing evidence packages, scoring context, data sources, and model configuration metadata.

Download sample snapshot · Load melanoma IO panel

Includes reproducible gene inputs, data source versions, and full model configuration details.

🌐 Platform Analytics

Aggregated across all GaiaLab analyses — updated continuously

🔬 818 analyses · ⭐ #1 gene: KRAS · 💊 #1 drug: encorafenib · 🧬 30 genes tracked

Top Genes by Study Frequency

KRAS: 534×
BRAF: 532×
NRAS: 529×
NKX2-1: 104×
BRCA1: 85×
PALB2: 85×
BRCA2: 85×
IDH1: 72×
SMAD4: 68×
PIK3CA: 67×

Most Studied Disease Areas

colorectal cancer: 526
breast cancer
non-small cell lun…
kras g12d pancreat…
brca-mutant early-…
pan-cancer
alzheimers disease
kras g12c non-smal…

Most Co-Studied Gene Pairs

BRAF+KRAS: 530
BRAF+NRAS: 529
KRAS+NRAS: 529
BRCA1+BRCA2
BRCA1+PALB2
BRCA2+PALB2

Top Surfaced Drug Candidates

🥇 encorafenib · 76% trial match · 970×
🥈 dabrafenib · 81% trial match · 836×
🥉 glecaprevir · 99% trial match · 729×
telaprevir · 99% trial match · 729×
selumetinib · 48% trial match · 500×
regorafenib · 75% trial match · 475×
mrtx-1133 · 473×
adagrasib · 34% trial match · 473×

Disease × Gene Frequency

	KRAS	BRAF	NRAS	BRCA2	PALB2	BRCA1	EGFR	TP53
colorectal cance…	526	524	523
breast cancer				62	62	62	12	12
non-small cell l…							61
kras g12d pancre…	60							60
brca-mutant earl…				59	59	59

• KRAS is the most frequently studied gene on the platform (534 analyses).
• encorafenib is the top surfaced drug candidate across all disease contexts (970×).

Melanoma IO Resistance Panel

Oncology reference panels

Flagship →

Internal demos / reference panels

Anti-PD-1 Resistance Audit: Melanoma

10-gene IO resistance panel: PDCD1, CD274, CTLA4, LAG3, HAVCR2, PTEN, B2M, JAK1, STK11, BRAF
IO Response Score + TCGA SKCM mutation frequencies (n=440) queried live from cBioPortal
Export a reproducible JSON evidence package with per-claim PMID traceability

Breast Cancer Panel

TP53, BRCA1, EGFR analyzed in breast cancer disease context
Inspect pathway enrichment rankings and grounded ratio
Diff against a prior snapshot for run-to-run stability

Colorectal KRAS Panel

KRAS, NRAS, BRAF analyzed in colorectal cancer context
Explore 3D interaction network hub centrality
Review mechanism classifications and therapeutic overlap

IO Responder Profile: Inflamed TME

Inflamed panel: CD8A, CXCL9, CXCL10, PDCD1, LAG3, TIGIT — cytotoxic T-cell infiltration with chemoattractant signature
IO Response Score 100/100 (strong response likelihood) — contrast with 16/100 resistance panel
Identify actionable checkpoints: LAG3 → relatlimab, TIGIT → tiragolumab

Configure Analysis

Enter any gene list and disease context. The pipeline queries 75+ biological databases in parallel — including PubMed, ChEMBL, OpenTargets, ClinicalTrials.gov, OpenFDA, DGIdb, DrugCentral, OpenAlex, and PharmGKB — then synthesises pathways, therapeutic candidates, mechanistic hypotheses, and a confidence-scored evidence ledger. No account required.

Analyses run · Drug candidates · Diseases mapped · Gene nodes · live · updated every analysis

Gene Symbols

Enter 2–15 gene symbols separated by commas
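
A minimal sketch of validating this input before submission. The 2–15 limit comes from the form hint; the function name and the choice to drop duplicates while preserving case are assumptions.

```python
def parse_gene_list(raw: str, min_genes: int = 2, max_genes: int = 15) -> list:
    """Split on commas, trim whitespace, drop blanks and duplicates, check the count.
    Case is preserved because some symbols contain lowercase letters."""
    genes = []
    for token in raw.split(","):
        symbol = token.strip()
        if symbol and symbol not in genes:
            genes.append(symbol)
    if not min_genes <= len(genes) <= max_genes:
        raise ValueError(f"expected {min_genes}-{max_genes} gene symbols, got {len(genes)}")
    return genes

print(parse_gene_list("KRAS, NRAS ,BRAF,KRAS"))
```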

Disease Context

Be specific — disease, subtype, and mechanism context improve output quality

Output Perspective

Workspace Memory

Optional. Reuse the same workspace ID to track prior runs, contradictions, and changed conclusions over time.

Email me when predictions are confirmed or contradicted

Workspace IDs can stay anonymous, or you can create a protected workspace with invite-based team access.

Include therapeutic signal mapping (drug targets, trials, safety)

Adds DGIdb, ChEMBL, DrugCentral, ClinicalTrials.gov, OpenFDA, and PubChem. Adds ~10s to analysis time.

Clinical Biomarkers · FDA-Approved IO Predictors · optional

TMB (mut/Mb) FDA-Approved

≥10 mut/Mb = TMB-H (KEYNOTE-158). ≥20 = very high.
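
The two cut-offs in this hint can be expressed as a small classifier. The function name and the label for values under 10 mut/Mb are assumptions; the 10 and 20 thresholds are the ones quoted above.

```python
def tmb_category(mut_per_mb: float) -> str:
    """Bucket a tumour mutational burden value using the thresholds in the form hint."""
    if mut_per_mb < 0:
        raise ValueError("TMB cannot be negative")
    if mut_per_mb >= 20:
        return "TMB-H (very high)"
    if mut_per_mb >= 10:
        return "TMB-H"
    return "below TMB-H cut-off"

print(tmb_category(12.5))
```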

MSI / MMR Status FDA-Approved

Not tested · MSS / pMMR · MSI-L · MSI-H / dMMR

PD-L1 Expression FDA-Approved

CPS (Combined Positive Score)

TPS (Tumor Proportion, %)

CPS≥1 nivolumab eligible · CPS≥10 pembrolizumab preferred · TPS≥50% monotherapy

Unlock drug candidates, exports & higher limits · Academic: Free → · Get a key →

5 free analyses/day · Results in ~30 seconds · Upgrade for more →

⚖️

Compare two diseases with the same gene panel

Free & instant. Same genes. Two diseases. See exactly where the evidence diverges.

BRCA1/2 · breast vs ovarian → KRAS · colorectal vs pancreatic → EGFR · NSCLC vs TNBC → Custom comparison →

Running Analysis

Databases · Literature · AI Synthesis · Evidence Gate · Assembly
