GaiaLab

Computational drug repurposing from gene input.
10,937 candidates cross-referenced against ClinicalTrials.gov; 2,153 concordance matches.

Accepts a gene list. Returns ranked repurposing candidates with six-factor evidence scores, pathway enrichment, PMID-linked mechanistic hypotheses, and a timestamped concordance record against ClinicalTrials.gov. Query-to-output time: under 60 seconds.

Gene cards include a disease-relevance score, evidence strength meter, panel pathway convergence signal, pharmacogenomic conflict warnings, and source quality tiers across 54+ databases queried in parallel: CIViC, OncoKB, AlphaFold, MSigDB, DepMap, TCGA, DGIdb, ClinGen, COSMIC, JASPAR, OpenAlex, PharmGKB, and 42 more.

All outputs are computational research hypotheses. Independent experimental validation is required before any therapeutic or clinical application.

Concordance ledger →

AUROC 0.545 · 95% CI 0.526–0.562 · 10,937 predictions logged · No login required · open accuracy data
54+ databases queried per analysis (parallel)
6 independent AI agents that evaluate and debate each output before scoring is finalised
10,937 repurposing candidates logged and cross-referenced against ClinicalTrials.gov (2,153 concordance matches)
0.545 AUROC vs 0.50 random baseline · bootstrap 95% CI: 0.526–0.562 · 22 disease areas
GaiaLab-scored · ClinicalTrials.gov concordance
Tier I Olaparib BRCA1/2 · PARP inhibition ✦ 71 trial concordance matches
Tier I Midostaurin FLT3-ITD AML · FLT3 inhibition ✦ FDA-approved · concordance tracked
Tier I Osimertinib EGFR-mutant NSCLC · 3rd-gen TKI ✦ FDA-approved · concordance tracked
Tier I Ivosidenib IDH1-mutant AML · IDH1 inhibition ✦ FDA-approved · concordance tracked
Tier I Dabrafenib BRAF V600E · BRAF inhibition ✦ 60 trial concordance matches Case study →
Tier I Nivolumab Pan-cancer · PD-1 checkpoint ✦ 56 trial concordance matches
Full concordance ledger →
Platform Activity
Live counts from analyses run on this platform — not externally verified benchmarks
Gene Panels profiled
Drug–Target Pairs evidence-validated
Biomarker Associations literature-derived
Clinical Patterns hypothesis→confirmed
Evidence Grounding citation-backed insights
Actionable Loci cross-source concordant
ACTIVE Evolution Log →
Evidence Operating System LIVE

Every claim has a lineage. Every drug has a state. Every board accumulates evidence over time.

GaiaLab is not a report generator. It is a persistent translational intelligence layer where claims carry provenance (PMID, confidence, contradiction score), drug candidates transition through a lifecycle (proposed → grounded → validated → deprecated), and disease boards accumulate evidence across every analysis your team runs — with exponential decay for stale signals and contradiction alerts when new data conflicts with prior conclusions.
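The exponential decay applied to stale signals can be pictured with the sketch below. The page does not state the decay constant or the function used, so the half-life parameter, its default of 180 days, and the name `decayed_weight` are all assumptions made for illustration.

```python
import math


def decayed_weight(initial_weight: float, age_days: float,
                   half_life_days: float = 180.0) -> float:
    """Return the weight of an evidence signal after `age_days`.

    The weight halves every `half_life_days`. The 180-day default is a
    hypothetical value, not a documented GaiaLab setting.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return initial_weight * math.exp(-math.log(2) * age_days / half_life_days)
```

Under this form, a signal one half-life old contributes half of its original weight, and older signals fade smoothly rather than being dropped at a cutoff.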

⊡ Disease Boards ▣ Decision Workspace ◎ Semantic Search ◍ Analytics Spine

Drug Repurposing Engine

Six-factor weighted score (target overlap, clinical evidence, mechanism alignment, pathway relevance, safety profile, disease context) ranks FDA-approved and investigational agents into Tier I–III. AlphaFold pLDDT provides an orthogonal structural druggability signal. CIViC and OncoKB evidence levels calibrate confidence assignments. Scores are computational estimates. AUROC 0.545 vs 0.50 random baseline across 22 disease areas. Full methodology →
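As a rough illustration of how a six-factor weighted score and tier assignment might fit together, consider the sketch below. The factor names come from the description above; the equal weights, the 0–100 input scale per factor, and the tier cutoffs (70 and 40) are assumptions, since the page does not publish them here.

```python
FACTORS = (
    "target_overlap",
    "clinical_evidence",
    "mechanism_alignment",
    "pathway_relevance",
    "safety_profile",
    "disease_context",
)

# Hypothetical equal weights; the real weighting is not stated on this page.
WEIGHTS = {name: 1.0 / len(FACTORS) for name in FACTORS}


def composite_score(factor_scores: dict[str, float],
                    weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted mean of six factor scores, each assumed to lie in 0-100."""
    missing = [name for name in FACTORS if name not in factor_scores]
    if missing:
        raise ValueError(f"missing factor scores: {missing}")
    total_weight = sum(weights[name] for name in FACTORS)
    weighted_sum = sum(weights[name] * factor_scores[name] for name in FACTORS)
    return weighted_sum / total_weight


def assign_tier(score: float, tier1_min: float = 70.0,
                tier2_min: float = 40.0) -> str:
    """Map a 0-100 composite score to a tier using hypothetical cutoffs."""
    if score >= tier1_min:
        return "Tier I"
    if score >= tier2_min:
        return "Tier II"
    return "Tier III"
```

Dividing by the total weight keeps the composite on the same 0–100 scale even if the weights are later changed and no longer sum to one.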

Concordance Tracking

Each candidate is timestamped at output time and matched against ClinicalTrials.gov. Matches where the prediction timestamp precedes trial registration date are classified as prospective; all others are retrospective. Prospective concordance indicates the system surfaced a hypothesis before investigators registered a corresponding trial — it does not constitute efficacy evidence. Completed trial status does not imply positive outcome. Full ledger: /validation.
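The prospective/retrospective rule described above reduces to a single timestamp comparison. A minimal sketch follows; the function name and the use of timezone-aware `datetime` values are assumptions, and the lookup against ClinicalTrials.gov itself is left out.

```python
from datetime import datetime, timezone


def classify_concordance(prediction_time: datetime,
                         trial_registered: datetime) -> str:
    """Return 'prospective' only when the prediction strictly precedes
    the trial registration date; every other case is 'retrospective'."""
    if prediction_time < trial_registered:
        return "prospective"
    return "retrospective"


# Example: a prediction logged before the trial was registered.
example = classify_concordance(
    datetime(2024, 1, 15, tzinfo=timezone.utc),
    datetime(2024, 6, 1, tzinfo=timezone.utc),
)
```

Equal timestamps fall into the retrospective class here, which matches the wording "precedes" in the description.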

CIViC + OncoKB Evidence Integration

Each gene is queried against CIViC (community-curated clinical variant evidence, Levels A–E; Washington University) and OncoKB (FDA-recognised precision oncology knowledge base; Memorial Sloan Kettering). Returns: drug associations per variant, AMP/ACMG tier, oncogene/TSG classification, and Level 1–R2 biomarker designations. CIViC: no token required. OncoKB: institutional token recommended.

Mechanism Derivation

Checkpoint and resistance genes are mapped to immunotherapy escape mechanisms via pathway enrichment, protein interaction topology, and cross-source agreement. Mechanistic assignments are derived from structured database outputs, not generative text synthesis. Assignments flagged as low-agreement carry an explicit uncertainty annotation.

Protein Interaction Network

Force-directed PPI graph with temporal overlay. Displays hub centrality, curated and computationally predicted edges, and topological context aggregated across STRING, BioGRID, and related interaction databases. Predicted edges are visually distinguished from experimentally validated interactions.

Evidence Ledger

Each output claim is linked to one or more PubMed citations and assigned a polarity classification: supporting, contradicting, or mixed. Claims without citation support are flagged, not suppressed. Full PMID traceability is preserved in exported evidence packages.

Reproducible Snapshots

Each run produces a tamper-evident snapshot encoding gene inputs, database versions, model configuration, scoring parameters, and complete outputs. Snapshots can be diffed against prior runs or replayed independently. Intended for methods-section documentation and internal audit trails.
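The page does not say how tamper evidence is achieved. One common approach, shown here purely as an assumption, is to hash a canonical JSON serialisation of the snapshot with SHA-256 and store the digest alongside it. The field names in the example are hypothetical.

```python
import hashlib
import hmac
import json


def snapshot_digest(snapshot: dict) -> str:
    """SHA-256 hex digest of a canonical JSON form of the snapshot.

    Sorting keys and fixing separators makes the digest independent of
    dictionary insertion order.
    """
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_snapshot(snapshot: dict, expected_digest: str) -> bool:
    """True when the snapshot still hashes to the recorded digest."""
    return hmac.compare_digest(snapshot_digest(snapshot), expected_digest)


# Hypothetical snapshot fields, for illustration only.
run = {"genes": ["TP53", "BRCA1", "EGFR"], "scoring": {"version": "example"}}
recorded = snapshot_digest(run)
```

Any change to inputs, versions, or outputs alters the digest, so a replayed run can be compared against the original by digest before diffing the contents.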

Quality Gating

Evidence Depth Score, Contention Index, and grounded citation ratio gate every output. Claims below configurable thresholds are flagged with explicit rationale. Nothing is silently discarded — suppressed claims are logged and accessible in the full evidence export.
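A threshold gate that flags claims with an explicit rationale, rather than dropping them, might be sketched as follows. The threshold values, the 0–1 scale for the Contention Index and grounded ratio, and all names are assumptions; only the three metric names come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateThresholds:
    # All defaults are hypothetical placeholders for configurable values.
    min_evidence_depth: float = 50.0    # assumed 0-100 scale
    max_contention_index: float = 0.30  # assumed 0-1 scale
    min_grounded_ratio: float = 0.80    # assumed 0-1 scale


def gate_claim(evidence_depth: float, contention_index: float,
               grounded_ratio: float,
               thresholds: GateThresholds = GateThresholds()) -> list[str]:
    """Return the reasons a claim is flagged; an empty list means it passes."""
    reasons = []
    if evidence_depth < thresholds.min_evidence_depth:
        reasons.append(
            f"evidence depth {evidence_depth} below {thresholds.min_evidence_depth}")
    if contention_index > thresholds.max_contention_index:
        reasons.append(
            f"contention index {contention_index} above {thresholds.max_contention_index}")
    if grounded_ratio < thresholds.min_grounded_ratio:
        reasons.append(
            f"grounded ratio {grounded_ratio} below {thresholds.min_grounded_ratio}")
    return reasons
```

Returning the list of reasons instead of a boolean keeps the rationale attached to the claim, so a flagged claim can still be written to the evidence export with its explanation.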

Multi-Source Evidence Consensus

Each conclusion is cross-validated across 16 independent data channels: genomics, protein structure, pathway enrichment, literature, drug bioactivity, clinical trials, disease association, interaction networks, expression, safety, and others. Agreement across channels raises the confidence score; divergence triggers a contradiction flag and downgrades the claim. No single source is treated as determinative.

Structured Export

Outputs export as JSON evidence packages or formatted briefs. Each export includes scoring context, PMID citations, contradiction annotations, and complete model configuration metadata. Format is designed for direct insertion into methods sections or internal research reports.

Calibration Feedback Loop

Prediction outcomes are periodically verified against ClinicalTrials.gov status updates. Hypothesis outputs are cross-checked against new PubMed entries. Calibration drift is detected and a recalibration multiplier is applied at each server cycle. Confidence score distributions are published at /validation.

TCGA Survival Stratification

Kaplan-Meier OS curves stratified by mutation status across 15 TCGA cohorts (BRCA, LUAD, GBM, PAAD, and 11 additional). Returns log-rank p-value, hazard ratio, and median OS via the cBioPortal public API. Survival data is observational; no causal inference is implied. No institutional subscription required.
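For readers unfamiliar with the estimator behind these curves, the sketch below computes a Kaplan-Meier survival curve and median survival in plain Python. It is a textbook product-limit implementation, not the platform's code; the log-rank test and hazard ratio are omitted, and the function names are assumptions.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per patient.
    events: True if death was observed, False if the patient was censored.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

Stratifying by mutation status amounts to calling `kaplan_meier` once per group (mutated, wild-type) and comparing the resulting curves.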

Researcher Outcome Submission

Users can submit confirmed, refuted, partial, or inconclusive outcomes from the results page. Submissions are aggregated into per-decile calibration curves and applied as recalibration multipliers at the next server cycle. All submitted outcomes are treated as self-reported and are not independently verified.
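One way per-decile calibration and a recalibration multiplier could be derived from submitted outcomes is sketched below. Several choices here are assumptions: confidences are taken to lie in 0–1, bins are equal-width tenths, only confirmed (1) and refuted (0) outcomes are counted, and the multiplier is the ratio of observed rate to mean predicted confidence.

```python
def decile_calibration(confidences, outcomes):
    """Bin predictions into ten confidence bands and summarise each band.

    confidences: predicted confidence per claim, in [0, 1].
    outcomes: 1 for a confirmed outcome, 0 for a refuted one.
    Returns {bin_index: {"n", "mean_confidence", "observed_rate", "multiplier"}}.
    """
    if len(confidences) != len(outcomes):
        raise ValueError("confidences and outcomes must have the same length")
    bins = [[] for _ in range(10)]
    for c, y in zip(confidences, outcomes):
        if not 0.0 <= c <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        index = min(int(c * 10), 9)  # confidence 1.0 falls in the top bin
        bins[index].append((c, y))
    curve = {}
    for index, members in enumerate(bins):
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        # Hypothetical multiplier: how far observed outcomes sit from predictions.
        multiplier = observed / mean_conf if mean_conf > 0 else 1.0
        curve[index] = {
            "n": len(members),
            "mean_confidence": mean_conf,
            "observed_rate": observed,
            "multiplier": multiplier,
        }
    return curve
```

A multiplier below 1 in a band indicates overconfidence there; because submissions are self-reported, bands with few members would warrant caution before any adjustment is applied.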

Knowledge Graph Contradiction Alerts

Each analysis is compared against prior runs in the knowledge graph for the same disease context. A contradiction alert is raised when a therapeutic score shifts more than 20 points relative to prior runs for the same gene–drug pair. Weekly digests list the most materially changed conclusions. Alerts indicate score drift, not independent evidence of a clinical finding.
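The 20-point rule can be expressed in a few lines. In this sketch the baseline is the mean of prior scores for the same gene–drug pair, which is an assumption; the page says only that the shift is measured "relative to prior runs".

```python
def contradiction_alert(current_score: float, prior_scores: list[float],
                        threshold: float = 20.0) -> bool:
    """True when the score moved more than `threshold` points from the
    mean of prior runs. With no prior runs there is nothing to contradict."""
    if not prior_scores:
        return False
    baseline = sum(prior_scores) / len(prior_scores)
    return abs(current_score - baseline) > threshold
```

The comparison is strict, so a shift of exactly 20 points does not raise an alert, consistent with "more than 20 points".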

CIViC Variant Evidence

Each gene is queried against CIViC (Clinical Interpretation of Variants in Cancer; Washington University in St. Louis). Returns evidence levels A (validated association) through E (inferential), AMP/ACMG tier, and drug associations per variant. CIViC is community-curated; evidence quality varies by entry and requires independent verification.

Pathway Enrichment

Gene panels are enriched against MSigDB Hallmark (50 gene sets), KEGG 2021 Human, and WikiPathways via the Enrichr API. Statistical significance is assessed with Benjamini-Hochberg FDR correction. Results indicate overrepresentation in curated gene sets — not direct measurement of pathway activity in a specific tumour.
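The Benjamini-Hochberg correction mentioned above can be illustrated with a short standalone function. This is a generic sketch of the procedure, independent of whatever the Enrichr API returns; the function name is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is capped by the
    # q-values of all larger p-values (the monotonicity step).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values
```

A gene set is then reported as significant at a chosen FDR level, for example 0.05, when its q-value falls at or below that level.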

OncoKB Biomarker Levels

When an API token is configured, each gene is queried against OncoKB (Memorial Sloan Kettering; FDA-recognised). Returns Level 1 biomarkers (FDA-approved companion diagnostics), Level 2 (standard of care), Level 3B (investigational), and Level R1/R2 (resistance markers), plus oncogene vs. tumour suppressor classification. Requires institutional token for full access.

Analysis output — per run
Pathway Enrichment: Overrepresented gene sets ranked by BH-corrected FDR q-value. MSigDB Hallmark, KEGG, WikiPathways.
Repurposing Candidates: FDA-approved and investigational agents scored 0–100 across six factors. Tier I–III assignment. Computational estimates only, not clinical recommendations.
Mechanistic Hypotheses: Multi-agent debate output with calibrated confidence scores and suggested experimental designs. Requires independent wet-lab validation.
PMID Evidence Ledger: Each claim linked to primary PubMed citations with polarity classification: supporting, contradicting, or mixed.
Concordance Record: Each candidate timestamped and matched against ClinicalTrials.gov. Prospective vs. retrospective classification explicit per entry.
Export: JSON evidence package · PDF brief · CSV drug table · Reproducible snapshot (inputs, model version, scoring parameters)

Signal Credibility Metrics — live from the most recent analysis run

Citation Coverage: 94%
Grounded Ratio: 87%
Contradiction Rate: 2.1%
Evidence Depth Score: 62/100
Live metrics from most recent run · Run an analysis to refresh · Full methodology →
Independently validated · published openly · no login required · gaialabai.com/validation
drug predictions logged and validated against ClinicalTrials.gov
77% of scored candidates concordant with an active or completed clinical trial (retrospective + prospective combined)
0.545 AUROC vs 0.50 random baseline; modest but non-trivial signal on unsupervised task · bootstrap 95% CI: 0.526–0.562
41 live biological databases queried per analysis · all outputs are research hypotheses requiring independent validation
Accuracy data published openly. Raw JSON: /api/predictions/calibration · Benchmark source: scripts/benchmark-auroc.js in the public repository.
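To make the headline metric concrete, the sketch below computes AUROC by pairwise comparison and a percentile bootstrap confidence interval. It is not the repository's `scripts/benchmark-auroc.js`; the function names, the resample count, and the fixed seed are assumptions chosen for illustration.

```python
import random


def auroc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. labels: 1 for positive, 0 for negative.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def bootstrap_ci(scores, labels, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (95% by default)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to bootstrap AUROC")
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        if len(set(sample_labels)) < 2:
            continue  # skip resamples that lost one class entirely
        estimates.append(auroc([scores[i] for i in idx], sample_labels))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[max(int((1 - alpha / 2) * n_resamples) - 1, 0)]
    return lower, upper
```

An interval whose lower bound sits above 0.50, as reported here (0.526–0.562), indicates signal distinguishable from random ranking, though an AUROC of 0.545 remains a weak discriminator.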
About

Open computational drug repurposing with published accuracy data.

GaiaLab generates ranked repurposing candidates from a gene list in under 60 seconds, drawing from 54+ biological databases (CIViC, OncoKB, DGIdb, DrugCentral, OpenAlex, PharmGKB, MSigDB Hallmark, AlphaFold, DepMap, TCGA, ClinGen, COSMIC, JASPAR, STRING, gnomAD, OT Genetics, and 38 others). Each candidate is scored across six evidence dimensions, reviewed by six independent AI agents, and cross-referenced against ClinicalTrials.gov. Performance data (AUROC, calibration curves, concordance breakdown) is published at /validation. All outputs are computational research hypotheses. Independent experimental validation is required before any therapeutic or clinical application.

Houston, TX  ·  Disease areas: GBM, AML, Alzheimer's, breast cancer, NSCLC, pancreatic cancer, and others  ·  Research use only  ·  partnerships@gailabai.com

54+ live biological databases queried per analysis
0.545 AUROC vs 0.50 random baseline · research hypotheses, not clinical evidence
$0 cost to run your first analysis

Context

Developed in Houston, TX. Designed for translational research teams that need rapid hypothesis generation across gene panels without institutional informatics infrastructure. Primary use cases: target prioritisation, drug repurposing triage, mechanistic hypothesis scoping before wet-lab investment.

Access

No login or subscription required for standard analyses. API access and team workspaces available on paid plans. All analyses return the same evidence — access tier affects export formats and rate limits, not scoring or data sources.

Limitations

  • LLM synthesis: Mechanistic hypotheses are generated by language models, not trained predictors. Treat as hypothesis input, not conclusion.
  • Public APIs only: No proprietary databases. Coverage gaps exist for certain gene classes and rare indications.
  • AUROC 0.545: Modest signal above random on a 22-disease retrospective benchmark. Not a clinically validated predictor.

Methods & Scoring

  • Confidence tiers derived from cross-source agreement, study design classification, and citation depth across 54+ databases.
  • Evidence polarity scoring (supporting / contradicting / mixed) identifies where published data diverges from the scored conclusion.
  • Per-claim PMID ledger with full scoring context exported with every analysis.
  • Run snapshots encode database versions, model configuration, gate outcomes, and all scored outputs for independent replay.

Inspect a Sample Snapshot

Download a complete audit snapshot containing evidence packages, scoring context, reproducible gene inputs, data source versions, and full model configuration metadata.

🌐 Platform Analytics

Aggregated across all GaiaLab analyses — updated continuously
106 analyses · #1 gene: EGFR · #1 drug: olaparib · 30 genes tracked
Top Genes by Study Frequency
EGFR: 52×
BRCA1: 48×
TP53: 44×
KRAS: 14×
BRCA2: 13×
PALB2: 13×
BRAF: 13×
NRAS: 13×
ALK: 12×
MET: 10×
Most Studied Disease Areas
breast cancer: 47
colorectal cancer: 12
non-small cell lun…: 12
pan-cancer: 8
alzheimers disease: 7
glioblastoma: 5
parkinson's diseas…: 4
inflammatory bowel…: 3
Most Co-Studied Gene Pairs
BRCA1+EGFR: 35
BRCA1+TP53: 35
EGFR+TP53: 35
BRCA1+PALB2: 13
BRCA2+PALB2: 13
BRAF+NRAS: 13
Top Surfaced Drug Candidates
olaparib · 100% trial match · 60×
cetuximab · 100% trial match · 41×
durvalumab · 31×
talazoparib · 98% trial match · 25×
niraparib · 98% trial match · 25×
carboplatin · 100% trial match · 23×
encorafenib · 100% trial match · 23×
paclitaxel · 100% trial match · 22×
Disease × Gene Frequency
[Heatmap: disease areas (rows: breast cancer, non-small cell l…, colorectal cance…, pan-cancer, alzheimers disea…) by genes (columns: BRCA1, EGFR, TP53, PALB2, BRCA2, ALK, NRAS, BRAF)]
• EGFR is the most frequently studied gene on the platform (52 analyses).
• olaparib is the top surfaced drug candidate across all disease contexts (60×).

Melanoma IO Resistance Panel

Oncology reference panels
Internal demos / reference panels

Anti-PD-1 Resistance Audit: Melanoma

  • 10-gene IO resistance panel: PDCD1, CD274, CTLA4, LAG3, HAVCR2, PTEN, B2M, JAK1, STK11, BRAF
  • IO Response Score + TCGA SKCM mutation frequencies (n=440) queried live from cBioPortal
  • Export a reproducible JSON evidence package with per-claim PMID traceability

Breast Cancer Panel

  • TP53, BRCA1, EGFR analyzed in breast cancer disease context
  • Inspect pathway enrichment rankings and grounded ratio
  • Diff against a prior snapshot for run-to-run stability

Colorectal KRAS Panel

  • KRAS, NRAS, BRAF analyzed in colorectal cancer context
  • Explore 3D interaction network hub centrality
  • Review mechanism classifications and therapeutic overlap

IO Responder Profile: Inflamed TME

  • Inflamed panel: CD8A, CXCL9, CXCL10, PDCD1, LAG3, TIGIT — cytotoxic T-cell infiltration with chemoattractant signature
  • IO Response Score 100/100 (strong response likelihood) — contrast with 16/100 resistance panel
  • Identify actionable checkpoints: LAG3 → relatlimab, TIGIT → tiragolumab

Configure Analysis

Enter any gene list and disease context. The pipeline queries 54+ biological databases in parallel — including PubMed, ChEMBL, OpenTargets, ClinicalTrials.gov, OpenFDA, DGIdb, DrugCentral, OpenAlex, and PharmGKB — then synthesises pathways, therapeutic candidates, mechanistic hypotheses, and a confidence-scored evidence ledger. No account required.
Enter 2–15 gene symbols separated by commas.
Be specific — disease, subtype, and mechanism context improve output quality
Optional. Reuse the same workspace ID to track prior runs, contradictions, and changed conclusions over time.
Workspace IDs can stay anonymous, or you can create a protected workspace with invite-based team access.
Adds DGIdb, ChEMBL, DrugCentral, ClinicalTrials.gov, OpenFDA, and PubChem. Adds ~10s to analysis time.
Clinical Biomarkers: FDA-Approved IO Predictors (optional)
TMB: ≥10 mut/Mb = TMB-H (KEYNOTE-158). ≥20 = very high.
PD-L1: CPS≥1 nivolumab eligible · CPS≥10 pembrolizumab preferred · TPS≥50% monotherapy
No account required · Results stream in ~30 seconds · Data not stored beyond your session

Running Analysis

Databases → Literature → AI Synthesis → Evidence Gate → Assembly
GaiaLab Evidence Assistant
Melanoma IO resistance · Checkpoint biology · Evidence audit
Evidence assistant for melanoma IO resistance audits. I can interpret checkpoint gene evidence, explain resistance mechanisms, review polarity scores, and surface clinical trial matches.

Example queries:
• "What resistance mechanisms involve PD-L1 upregulation?"
• "Which checkpoint genes have active clinical trials?"
• "Explain LAG3 role in anti-PD-1 resistance"