Version 0.1.1 · May 2026 · Platform: https://www.gailabai.com
GaiaLab is an AI-powered biological intelligence platform that transforms gene panels into structured, citation-linked research insights in under 60 seconds. The platform aggregates live data from 75+ biological databases,¹ applies a six-agent AI debate framework for hypothesis generation and critique, and produces PMID-grounded therapeutic hypotheses ranked by a six-factor scoring model. An MCP (Model Context Protocol) server interface allows direct integration into AI assistant workflows. All outputs are computational research hypotheses requiring independent experimental validation.
Translational genomics research faces a reproducibility and synthesis bottleneck: the information required to reason about a gene panel is distributed across dozens of databases, literature corpora, and clinical trial registries, none of which share a common schema. A researcher submitting a panel of 3–10 genes to understand their disease biology must manually cross-reference PubMed, ClinicalTrials.gov, ChEMBL, gnomAD, UniProt, AlphaFold, KEGG, Reactome, COSMIC, and dozens of other sources — a process that takes days and produces results that are difficult to cite, compare, or reproduce.
GaiaLab automates this synthesis layer. It does not replace experimental biology; it accelerates the hypothesis-generation and literature-triangulation steps that precede it. Every output carries PMID citations, confidence labels, and an explicit evidence-quality flag so researchers can distinguish AI-generated inference from database-backed assertion.
pg · @modelcontextprotocol/sdk v1.24 · py/ for SAE and ESM-2 protein embeddings
User submits gene panel + disease context
│
▼
Rate / quota gate (tier-based: free / researcher / enterprise)
│
▼
Gene normalisation ── HGNC alias resolution, symbol canonicalisation
│
▼
Parallel data fetch ── 75+ sources via Promise.allSettled() [~15–40 s cold]
│
▼
Literature cache check ── 5-min in-memory keyed by genes + disease
│
▼
Evidence ledger build ── PMID validation, polarity classification,
grounding attribution, knownPmid backfill
│
▼
Drug repurposing engine ── Six-factor scoring, tier assignment,
AlphaFold structural bonus, DepMap essentiality
│
▼
Six-agent AI debate ── Parallel: Hypothesis · Critic · Evidence ·
Innovation · Risk · Synthesis agents
│
▼
Convergence scoring ── Cross-source family validation (PubMed,
ClinicalTrials, FDA, ChEMBL, structural, network)
│
▼
Insight assembly + grounding gate ── PMID linkage, confidence
labelling, pathway FDR
│
▼
Snapshot persistence + KG write ── PostgreSQL, shareable URL
│
▼
Structured JSON response ── Rendered in analyze.html widget
Typical latency: 30–60 s cold start; < 1 s from L1 cache.
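The literature cache step in the pipeline (5-minute, in-memory, keyed by genes + disease) can be pictured with a small sketch. The class name `LiteratureCache`, the helper `cacheKey`, and the key normalisation (sorting and case-folding) are assumptions made for illustration; the source only states the TTL and the key components.

```javascript
// Hypothetical sketch of a 5-minute in-memory cache keyed by genes + disease.
const TTL_MS = 5 * 60 * 1000;

// Assumption: gene order and letter case should not change the key.
function cacheKey(genes, disease) {
  const normalised = genes.map((g) => g.trim().toUpperCase()).sort();
  return `${normalised.join(',')}|${disease.trim().toLowerCase()}`;
}

class LiteratureCache {
  constructor(ttlMs = TTL_MS, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, which keeps expiry testable
    this.entries = new Map();
  }

  get(genes, disease) {
    const key = cacheKey(genes, disease);
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (this.now() - hit.storedAt > this.ttlMs) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return hit.value;
  }

  set(genes, disease, value) {
    this.entries.set(cacheKey(genes, disease), { storedAt: this.now(), value });
  }
}
```

A cache hit skips the literature fetch entirely, which is consistent with the sub-second warm latency quoted above.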
GaiaLab fetches from 75+ biological databases¹ across seven domains. All clients follow the same defensive pattern: they never throw; they return partial results with an error field so Promise.allSettled() can continue regardless of upstream failures.
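A minimal sketch of the defensive client pattern follows. The wrapper `defensiveClient`, the aggregator `fetchAllSources`, and the result shape `{ source, data, error }` are hypothetical names chosen for illustration; only the never-throw behaviour, the error field, and the use of Promise.allSettled() come from the description above.

```javascript
// Wrap an async fetcher so it never throws: failures become an error field.
function defensiveClient(name, fetcher) {
  return async (genes) => {
    try {
      return { source: name, data: await fetcher(genes), error: null };
    } catch (err) {
      return { source: name, data: null, error: String(err.message || err) };
    }
  };
}

// Run every client in parallel; one failing upstream never blocks the others.
async function fetchAllSources(clients, genes) {
  const settled = await Promise.allSettled(clients.map((client) => client(genes)));
  // Wrapped clients should always fulfil; the rejected branch is a safety net.
  return settled.map((outcome, i) =>
    outcome.status === 'fulfilled'
      ? outcome.value
      : { source: `client-${i}`, data: null, error: String(outcome.reason) }
  );
}
```

Downstream stages can then filter on `error === null` to decide which sources contributed to an analysis.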
- Gene annotation & variation: AlphaFold EBI · ClinVar · Ensembl · gnomAD (variant + ancestry + constraint) · GWAS Catalog · HGNC · OMIM · UniProt · VEP · NCBI Gene · ClinGen · Monarch Initiative
- Pathway & functional: Enrichr · GO (Gene Ontology) · JASPAR (transcription factor binding) · KEGG · MSigDB · PathwayCommons · Reactome · ChEA3
- Interaction & network: BioGRID · IntAct · STRING · STRING-DB partners · ComplexPortal · SynLethDB (synthetic lethality)
- Literature: PubMed (NCBI Entrez) · Europe PMC · PubMed Central full-text (JATS XML → quantitative extraction: IC50, p-values, HR, OR, fold-changes) · bioRxiv · Semantic Scholar · OpenAlex · preprint monitor
- Drug & clinical: ChEMBL · CIViC · ClinicalTrials.gov API v2 · DGIdb · DrugBank · DrugCentral · FDA Regulatory · OncoKB · OpenFDA adverse events · OpenTargets (genetics + disease–gene association) · OpenTargets Genetics · PubChem (compound + bioassay) · RxNorm DDI · TTD · PharmGKB · HMDB (metabolomics)
- Omics & cancer: cBioPortal · COSMIC Signatures · CPTAC (proteomics) · DepMap (cancer dependency + co-essentiality) · GDSC (drug sensitivity) · GTEx (expression + eQTL) · HPA (protein atlas) · MetaboLights · PRIDE (proteomics) · ProteomicsDB · scRNA (Cell × Gene) · TCGA (mutation + survival) · AGR (Alliance of Genome Resources)
- Structural: AlphaFold (pLDDT → druggability score) · PDB (experimental structures) · NGL molecular viewer
- Regulatory / patent: Drug resistance intelligence · FDA regulatory intelligence · Patent status + expiry · Regulatory intelligence client · LINCS (perturbation signatures)
Sources are grouped into seven aggregators (src/data/aggregators/):
- Gene aggregator — symbol resolution, aliases, disease associations
4. Drug Repurposing Engine
4.1 Six-Factor Scoring Model
Every candidate drug receives a score from 0–100 derived from six independently calibrated factors:
| Factor | Weight | Signal captured |
| --- | --- | --- |
| Target match (targetMatch) | 0.30 | Direct binding evidence against panel genes (ChEMBL pChEMBL ≥ 6, confirmed binding targets, DGIdb) |

Weights were calibrated against 10 known disease–drug pairs from OpenTargets and clinical guidelines.
Beyond the six factors, three bonus signals can increase the final score:
- AlphaFold structural bonus (+0 to +10): derived from mean pLDDT of the target protein (pLDDT ≥ 80 → +10; ≥ 70 → +6; ≥ 60 → +3; < 60 → 0). Fetched from the AlphaFold EBI API.
FDA-approved on-panel drugs receive a score floor of 70 (Tier I guaranteed) regardless of context relevance, because the approval represents validated clinical evidence. Off-label FDA-approved drugs with non-zero context relevance receive a floor of 35.
Context penalty: Off-label drugs with contextRelevance < 20 receive a ×0.3 score multiplier; those with contextRelevance 20–34 receive ×0.45. These are mutually exclusive — only the most severe penalty applies.
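The AlphaFold bonus, score floors, and context penalty described above can be written as three small rules. This is a sketch only: the function names and the input fields (`fdaApproved`, `onPanel`, `offLabel`) are hypothetical, the 20–34 band is treated as "below 35" for non-integer values, and taking the higher floor when both could apply is an assumption.

```javascript
// AlphaFold structural bonus from the mean pLDDT of the target protein.
function alphaFoldBonus(meanPlddt) {
  if (meanPlddt >= 80) return 10;
  if (meanPlddt >= 70) return 6;
  if (meanPlddt >= 60) return 3;
  return 0;
}

// Context penalty for off-label drugs; only one multiplier ever applies.
function contextPenaltyMultiplier(offLabel, contextRelevance) {
  if (!offLabel) return 1;
  if (contextRelevance < 20) return 0.3;
  if (contextRelevance < 35) return 0.45; // assumption: the 20–34 band ends below 35
  return 1;
}

// Score floor for FDA-approved drugs (0 means no floor).
function scoreFloor({ fdaApproved, onPanel, offLabel, contextRelevance }) {
  if (!fdaApproved) return 0;
  if (onPanel) return 70; // applies regardless of context relevance
  if (offLabel && contextRelevance > 0) return 35;
  return 0;
}
```

The text does not state the order in which the bonus, the penalty, and the floor are combined into the final 0–100 score, so the sketch keeps them as separate functions rather than guessing at a composition.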
Each drug is evaluated against six independent evidence families. A drug scoring 4/6 or higher is considered "convergent" — supported by multiple orthogonal source types rather than a single strong signal:
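The pipeline overview names the six families as PubMed, ClinicalTrials, FDA, ChEMBL, structural, and network. A hedged sketch of the 4/6 rule, assuming each family reduces to a boolean "has supporting evidence" flag (the key names and the `convergence` function are hypothetical):

```javascript
// Family keys are illustrative labels for the six families named in the overview.
const EVIDENCE_FAMILIES = ['pubmed', 'clinicalTrials', 'fda', 'chembl', 'structural', 'network'];
const CONVERGENCE_THRESHOLD = 4; // "4/6 or higher"

// evidence: object mapping a family key to a truthy value when that family supports the drug.
function convergence(evidence) {
  const supporting = EVIDENCE_FAMILIES.filter((family) => Boolean(evidence[family]));
  return {
    score: supporting.length,
    total: EVIDENCE_FAMILIES.length,
    convergent: supporting.length >= CONVERGENCE_THRESHOLD,
    supporting,
  };
}
```

Counting families rather than summing signal strength is what keeps a single very strong source from making a drug look convergent on its own.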
AI-generated insights are only as credible as their citations. GaiaLab's grounding pipeline ensures that every strategy card and pathway insight either carries validated PMIDs or is explicitly labelled as a hypothesis.
Attribution occurs in two passes:
Pass 1 — _preAttributePmids: After data fetch, each insight item is matched against the full literature pool. Matching uses token overlap between the item's text (label + mechanism + gene symbols) and each paper's title + abstract. A gene-symbol match requires only 1 overlapping token; non-gene content requires 2. Up to 2 PMIDs are collected per item (collecting two is important — items with ≥ 2 PMIDs plus a disease/gene text match reach grounded status, the highest evidence tier).
Pass 2 — buildInsightLinks: Inside the polarity assignment loop, secondary attribution runs the same gene-aware 1-token / 2-token rule against the known-PMID set, collecting up to 2 additional papers.
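The gene-aware 1-token / 2-token rule can be illustrated as follows. This is a sketch, not the platform's `_preAttributePmids` or `buildInsightLinks` code: the tokenizer, the stopword list, and the function name `matchPmids` are assumptions, and papers are taken in pool order rather than ranked.

```javascript
// Assumption: a tiny stopword list; the real tokenizer is not described in the text.
const STOPWORDS = new Set(['the', 'and', 'of', 'in', 'for', 'with', 'an', 'to', 'by', 'is']);

function tokenize(text) {
  return new Set(
    text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 1 && !STOPWORDS.has(t))
  );
}

// item: { label, mechanism, genes }; papers: [{ pmid, title, abstract }]
function matchPmids(item, papers, maxPmids = 2) {
  const geneTokens = new Set(item.genes.map((g) => g.toLowerCase()));
  const itemTokens = tokenize(`${item.label} ${item.mechanism} ${item.genes.join(' ')}`);
  const matched = [];
  for (const paper of papers) {
    const paperTokens = tokenize(`${paper.title} ${paper.abstract}`);
    const overlap = [...itemTokens].filter((t) => paperTokens.has(t));
    const geneHit = overlap.some((t) => geneTokens.has(t));
    // One overlapping gene symbol is enough; non-gene content needs two tokens.
    if (geneHit || overlap.length >= 2) matched.push(paper.pmid);
    if (matched.length === maxPmids) break; // collect at most two PMIDs per item
  }
  return matched;
}
```

The lower bar for gene symbols reflects that a symbol such as BRCA1 is far more specific than an ordinary word, so a single hit already carries signal.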
Each insight item receives an evidenceStatus:
The grounded ratio — the proportion of pathway + strategy items reaching grounded or supported — is reported on every analysis and monitored by the canary suite. Observed range: 28% (cold start, PubMed rate-limited) to 70%+ (warm cache, full paper pool).
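As a worked illustration of the ratio, assuming each item carries an `evidenceStatus` string and that `grounded` and `supported` are the two qualifying values (the function name and the handling of an empty list are assumptions):

```javascript
// Proportion of pathway + strategy items whose status is grounded or supported.
function groundedRatio(items) {
  if (items.length === 0) return 0; // assumption: an empty analysis reports 0
  const qualifying = items.filter(
    (item) => item.evidenceStatus === 'grounded' || item.evidenceStatus === 'supported'
  );
  return qualifying.length / items.length;
}
```

With 7 qualifying items out of 10, the ratio is 0.7, which sits at the upper end of the observed range quoted above.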
Each linked PMID is classified as support, neutral, or contradict relative to the insight claim. Items in the knownPmids set with neutral polarity are promoted to support (papers confirmed as relevant by prior analysis are treated as supporting unless explicitly classified as contradictions).
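The promotion rule is small enough to show directly. The link shape `{ pmid, polarity }` and the function name `promotePolarity` are hypothetical; the rule itself (neutral becomes support for known PMIDs, contradictions are left alone) follows the paragraph above.

```javascript
// links: [{ pmid, polarity }] with polarity in 'support' | 'neutral' | 'contradict'
// knownPmids: Set of PMIDs confirmed as relevant by prior analysis
function promotePolarity(links, knownPmids) {
  return links.map((link) =>
    link.polarity === 'neutral' && knownPmids.has(link.pmid)
      ? { ...link, polarity: 'support' } // promoted copy; the input is not mutated
      : link
  );
}
```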
GaiaLab uses a structured six-agent debate (enabled by default via GAIALAB_MULTI_AGENT_ENABLED=true) where each agent has a distinct epistemic role:
Each agent operates independently in parallel. The SynthesisAgent receives all five outputs and produces the final structured response.
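The fan-out / fan-in shape of the debate might look like the sketch below. The function `runDebate` and the agent call signature are assumptions, and `Promise.all` is used for brevity; the text does not say how a single failing agent is handled.

```javascript
// agents: object mapping an agent name to an async function (context) => output.
// synthesisAgent: async function (context, outputsByAgent) => final response.
async function runDebate(agents, synthesisAgent, context) {
  const names = Object.keys(agents);
  // The five non-synthesis agents run independently and in parallel.
  const outputs = await Promise.all(names.map((name) => agents[name](context)));
  const outputsByAgent = Object.fromEntries(names.map((name, i) => [name, outputs[i]]));
  // The synthesis step sees every agent's output at once.
  return synthesisAgent(context, outputsByAgent);
}
```

Because no agent sees another agent's output before synthesis, the critic and risk roles cannot be softened by the hypothesis agent's framing.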
Before the debate, src/ai/agent-data-retrieval.js pre-fetches four live APIs and formats the results per agent:
- OpenTargets disease–gene associations → EvidenceAgent + CriticAgent
This grounds agent reasoning in current database state rather than training-data recall.
The platform supports four AI providers in priority order:
Each analysis attempt tries the primary provider; if it times out or returns an error, the next provider is tried. Only after all providers fail does the system return an error. This ensures high availability during provider-side outages.
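A sequential failover chain of this kind can be sketched as below. The provider object shape `{ name, call }`, the helper `withTimeout`, and the 30-second default timeout are assumptions for illustration; the per-provider timeout used by the platform is not stated.

```javascript
// Reject if the promise does not settle within ms milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// providers: [{ name, call: async (prompt) => result }] in priority order.
async function completeWithFailover(providers, prompt, timeoutMs = 30000) {
  const errors = [];
  for (const provider of providers) {
    try {
      const result = await withTimeout(provider.call(prompt), timeoutMs);
      return { provider: provider.name, result };
    } catch (err) {
      errors.push(`${provider.name}: ${err.message}`); // remember why, then try the next one
    }
  }
  // Only reached when every provider has failed or timed out.
  throw new Error(`All providers failed: ${errors.join('; ')}`);
}
```

Collecting the per-provider errors keeps the final failure message useful for diagnosing whether an outage was local or upstream.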
7. Knowledge Graph & Population Insights
7.1 Knowledge Graph
Every completed analysis writes to a PostgreSQL knowledge graph (kg_nodes + kg_edges + kg_cooccurrence tables). Nodes represent genes, drugs, pathways, and disease contexts. Edges represent:
- Drug → target (gene) bindings
- Gene → pathway memberships
- Gene → disease associations
- Drug → pathway links (derived from target memberships)

The KG accumulates cross-analysis signal over time. Endpoints:
- GET /api/knowledge-graph/stats (node/edge counts)
- GET /api/knowledge-graph/drugs?disease= (top drugs by disease context)
- GET /api/knowledge-graph/gene-neighbors?gene= (PPI neighbourhood)

7.2 Population-Level Insights
scripts/aggregate-insights.js reads all stored snapshots and computes gene, pathway, and drug frequency + co-occurrence across all analyses run on the platform. This surfaces cross-patient, cross-study signal that single-analysis views cannot show. The Research Intelligence Panel on the homepage shows calibration charts and frequency pills derived from this aggregate.
7.3 Prediction Tracking
src/utils/prediction-tracker.js records every drug-disease prediction at analysis time, then polls ClinicalTrials.gov v2 periodically for outcome updates. This enables prospective calibration: the fraction of predictions that are eventually validated by trial completion is tracked as a calibration curve and reported on the platform's validation page.
8. Intelligence Boards
GaiaLab maintains 10 active disease intelligence boards — living summaries updated on a 24-hour refresh cycle. Each board aggregates evidence across the KG, surfaces emerging contradictions, and sends email alerts when new contradictions are detected against prior conclusions.
Current boards:
Breast Cancer · Triple-Negative Breast Cancer · Non-Small Cell Lung Cancer · Colorectal Cancer · Glioblastoma · Prostate Cancer · Ovarian Cancer · Melanoma · Pancreatic Ductal Adenocarcinoma · Alzheimer's Disease
Board data is stored in PostgreSQL (disease_boards, board_evidence_items, board_alerts tables). Case study pages for selected boards (lecanemab/AD, adagrasib/NSCLC, adagrasib/PDAC) provide focused mechanistic analysis with open research questions.
9. MCP Server Interface
GaiaLab exposes a Model Context Protocol server at POST /mcp, allowing AI assistants (Claude Desktop, custom agents built with the Anthropic Agent SDK) to call the platform as a tool.
Tool: gaialab_generate_insights
Input schema (Zod-validated):
```json
{
  "genes": ["string"],
  "diseaseContext": "string",
  "audience": "researcher | clinician | general"
}
```
Each POST creates a fresh McpServer + StreamableHTTPServerTransport instance. Responses carry Access-Control-Allow-Origin: * for cross-origin use. The MCP interface is the primary integration surface for research workflow automation.
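For orientation, a tool invocation over MCP is a JSON-RPC 2.0 `tools/call` request whose `params` carry the tool name and its arguments. The sketch below builds such a request body for gaialab_generate_insights after a dependency-free shape check. The helper names are hypothetical, and the check is a stand-in for the server's Zod schema: treating `genes` as non-empty and `audience` as optional are assumptions.

```javascript
const AUDIENCES = ['researcher', 'clinician', 'general'];

// Stand-in for the Zod schema; returns a list of human-readable problems.
function validateInsightInput(input) {
  const errors = [];
  const genesOk =
    Array.isArray(input.genes) &&
    input.genes.length > 0 &&
    input.genes.every((g) => typeof g === 'string' && g.length > 0);
  if (!genesOk) errors.push('genes must be a non-empty array of strings');
  if (typeof input.diseaseContext !== 'string' || input.diseaseContext.length === 0) {
    errors.push('diseaseContext must be a non-empty string');
  }
  // Assumption: audience may be omitted, but if present must be one of the three values.
  if (input.audience !== undefined && !AUDIENCES.includes(input.audience)) {
    errors.push(`audience must be one of: ${AUDIENCES.join(', ')}`);
  }
  return errors;
}

// Build the JSON-RPC 2.0 body for an MCP tools/call request.
function buildToolCall(id, input) {
  const errors = validateInsightInput(input);
  if (errors.length > 0) throw new Error(errors.join('; '));
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'gaialab_generate_insights', arguments: input },
  };
}
```

In practice an MCP client library would normally handle the initialization handshake and transport headers; the sketch shows only the shape of the call itself.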
10. Workspace & Collaboration
Authenticated users (Stripe-backed tiers) receive persistent workspaces with:
- Saved analyses: shareable snapshot URLs, reproducible replay
- Analysis history: timeline of past gene panels and results
- Workspace memory: cross-session context for the chat assistant
- Report export: PDF export with PMID-gated trust score (blocked if valid-PMID rate < 70%)
- Weekly digests: emailed summaries of board updates and new contradictions relevant to saved analyses
- Row-level security: per-workspace PostgreSQL isolation (when GAIALAB_RLS_ENABLED=true)

Subscription tiers:

| Tier | Daily analyses | Drug repurposing | Export |
| --- | --- | --- | --- |
| Free | Limited (IP-gated) | Basic | No |
| Researcher | Unlimited | Full (all tiers) | PDF + CSV |
| Enterprise | Unlimited | Full + matrix | Full + API |

11. SAE / ESM-2 Interpretability (Optional)
When GAIALAB_INTERPRETABILITY_ENABLED=1, GaiaLab spawns Python 3.11 subprocesses from py/ to run sparse autoencoder (SAE) inference over ESM-2 protein language model embeddings. This surfaces learned biological features from the ESM-2 representation that are not explicitly encoded in database annotations, providing an experimental interpretability layer for protein function.
Requirements: PostgreSQL, Python 3.11, py/requirements.txt, ESM-2 model cache. This feature is disabled by default on the public deployment.
12. Evaluation & Benchmarking
12.1 AUROC (Retrospective)
A retrospective AUROC of 0.545 (95% CI bootstrap: 0.526–0.562) was computed in March 2026 against 529 predictions across 22 disease areas, using ClinicalTrials.gov completed trial matches as the gold standard. The random baseline is 0.50. This represents a modest but consistent signal above random. This is not a clinically validated predictor.
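For readers unfamiliar with the metric, the sketch below computes AUROC as the fraction of (positive, negative) pairs that the score ranks correctly, with ties counted as half, plus a percentile bootstrap interval. It is illustrative only: the function names are hypothetical, and the percentile method and resample count are assumptions, since the text says only "95% CI bootstrap".

```javascript
// scores: number[]; labels: 0/1 array of the same length (1 = validated by a completed trial).
function auroc(scores, labels) {
  let wins = 0;
  let pairs = 0;
  for (let i = 0; i < scores.length; i++) {
    if (labels[i] !== 1) continue;
    for (let j = 0; j < scores.length; j++) {
      if (labels[j] !== 0) continue;
      pairs += 1;
      if (scores[i] > scores[j]) wins += 1;
      else if (scores[i] === scores[j]) wins += 0.5; // ties count as half
    }
  }
  if (pairs === 0) throw new Error('need at least one positive and one negative');
  return wins / pairs;
}

// Percentile bootstrap CI; rng is injectable so results can be made reproducible.
function bootstrapCi(scores, labels, { resamples = 1000, alpha = 0.05, rng = Math.random } = {}) {
  auroc(scores, labels); // fail fast if a class is missing from the full data
  const n = scores.length;
  const stats = [];
  while (stats.length < resamples) {
    const idx = Array.from({ length: n }, () => Math.floor(rng() * n));
    const l = idx.map((i) => labels[i]);
    if (!l.includes(1) || !l.includes(0)) continue; // resample lost a class, draw again
    stats.push(auroc(idx.map((i) => scores[i]), l));
  }
  stats.sort((a, b) => a - b);
  const lo = stats[Math.floor((alpha / 2) * resamples)];
  const hi = stats[Math.ceil((1 - alpha / 2) * resamples) - 1];
  return [lo, hi];
}
```

The pairwise loop is quadratic, which is acceptable at the scale of a few hundred predictions; a rank-based formulation would be preferable for much larger sets.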
12.2 Temporal Holdout
A separate temporal holdout benchmark using 22 known drug approvals (held out by year) showed AUROC 0.90 with 8/8 negative controls correctly rejected. This benchmark is on a smaller, curated dataset and should be interpreted accordingly.
12.3 Grounding Rate
The primary quality signal for day-to-day health is the grounding ratio: the fraction of insight items with at least one validated PMID. The canary suite monitors this on every push. The npm run verify:engineering gate includes a grounding check (test:grounding-gate).
12.4 Continuous Evaluation
The evaluation suite (scripts/gaialab-eval.js) supports:
- NDCG@10 for drug ranking quality
- Paired t-test vs. baseline for significance
- Gold standard benchmarks (data/benchmarks/gold-standard.json)
- Trust and reliability benchmarks (data/benchmarks/trust-benchmarks.json)
- Snapshot replay for regression detection
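As a reference point for the ranking metric, here is a minimal NDCG@k sketch. It uses the exponential-gain form of DCG and derives the ideal ordering from the same list of relevance grades; both choices are assumptions, and scripts/gaialab-eval.js may define gains or the ideal ranking differently.

```javascript
// Discounted cumulative gain over the top k items (exponential gain variant).
function dcgAtK(relevances, k) {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + (2 ** rel - 1) / Math.log2(i + 2), 0);
}

// rankedRelevances: graded relevance of each drug, in the order the system ranked them.
function ndcgAtK(rankedRelevances, k = 10) {
  const ideal = [...rankedRelevances].sort((a, b) => b - a); // best possible ordering
  const idealDcg = dcgAtK(ideal, k);
  return idealDcg === 0 ? 0 : dcgAtK(rankedRelevances, k) / idealDcg;
}
```

A value of 1 means the system's top 10 is already in the best possible order for the given relevance grades.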
13. Engineering Quality Gates
All deployments must pass npm run verify:engineering, which runs in sequence:
1. security:scan (secret detection: no API keys, connection strings, or credentials in committed code)
2. test:critical (critical flow tests)
3. test:grounding-gate (grounding ratio threshold check)
4. test:biomedical-trust (biomedical claim credibility checks)
5. test:biomedical-trust:standard (golden test suite for trust surfaces)
6. test:week2 through test:week6 (regression suites per weekly milestone)
7. test:drug-scoring (drug scoring unit tests)
8. test:ad-regression (Alzheimer's disease regression)
9. test:contracts (60+ API contract tests)
10. test:export-surfaces, test:provenance-surfaces, test:critical-flows, test:ui-trust-surfaces
11. test:smoke, test:lineage, test:nav-e2e
12. test:concordance-integrity (PMID annotation concordance)
13. test:sse-stream, test:kg-explorer, test:drug-paywall, test:prod-smoke

The gate is enforced by a canary runner (scripts/canary.js) that executes a full live analysis against https://www.gailabai.com and validates: completion time, drug candidate count, grounded ratio, job failure rate, and trust page availability.
14. Security Model
- Secret detection: scripts/check_secrets.js runs as a git pre-commit hook and in every CI job. Matches known patterns for API keys, connection strings, and credentials.
- Stripe webhook verification: All payment webhook events are verified using stripe.webhooks.constructEvent() with a required STRIPE_WEBHOOK_SECRET. Unsigned webhooks are rejected with HTTP 400.
- Rate limiting: IP-based daily quota for free-tier users; API key tier for researcher/enterprise.
- No SSRF exposure: All outbound API calls are to fixed, known biological database URLs with 30-second timeouts. User input does not influence outbound URL construction.
- PMID validity gate: PDF export is blocked when the valid-PMID rate in an analysis falls below a configurable threshold (default 70%), preventing export of poorly-grounded reports.
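The PMID validity gate reduces to a rate check against a threshold. In the sketch below, `canExportPdf` is a hypothetical name, the validator is passed in as a callback, and `looksLikePmid` is only a format check standing in for real validation against PubMed; blocking export when an analysis has no PMIDs at all is an assumption.

```javascript
// Stand-in validator: PMIDs are positive integers of at most 8 digits.
// A real gate would confirm each PMID exists; this only checks the format.
function looksLikePmid(value) {
  return /^[1-9]\d{0,7}$/.test(String(value));
}

// Export is allowed only when the valid-PMID rate meets the threshold (default 70%).
function canExportPdf(pmids, isValidPmid = looksLikePmid, threshold = 0.7) {
  if (pmids.length === 0) return { allowed: false, validRate: 0 }; // assumption: nothing to ground the report
  const validCount = pmids.filter((pmid) => isValidPmid(pmid)).length;
  const validRate = validCount / pmids.length;
  return { allowed: validRate >= threshold, validRate };
}
```

Returning the rate alongside the decision lets the UI explain why an export was blocked rather than failing silently.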
15. Deployment
GaiaLab is deployed on Railway via Nixpacks with automatic deploys from the main branch of the GitHub repository. The production URL is https://www.gailabai.com.
Environment variables required for full functionality:

| Variable | Purpose |
| --- | --- |
| DEEPSEEK_API_KEY | Primary AI provider |
| OPENAI_API_KEY | AI failover #1 |
| GOOGLE_API_KEY | AI failover #2 |
| ANTHROPIC_API_KEY | AI failover #3 |
| DATABASE_URL | PostgreSQL connection |
| STRIPE_SECRET_KEY | Payments |
| STRIPE_WEBHOOK_SECRET | Webhook signature verification |
| NCBI_API_KEY | PubMed rate: 3 → 10 req/s |

Optional premium data source keys: BIOGRID_API_KEY, DISGENET_API_KEY, DRUGBANK_API_KEY, SEMANTIC_SCHOLAR_API_KEY.
GaiaLab is a computational research tool. Its outputs are hypotheses, not clinical recommendations. Specific limitations:
- All therapeutic suggestions require independent experimental validation before any clinical application.
If you use GaiaLab in published research, please cite:
*This document reflects the platform as of version 0.1.1, May 2026. For the latest implementation details, refer to the source repository.*
¹ Active source count varies with API key configuration. Full source list in Section 3.1. Without optional paid keys (DisGeNET, DrugBank), active coverage is approximately 60 sources. The count of 75+ reflects the full set of integrated clients shipped with the platform.