HBISS Recap: Knowledge Graphs for Space Life Sciences; Peter Rose (UCSD) & Amanda Saravia-Butler (AI4LS)
March 12, 2026
Today’s Horizons in Biosciences and Informatics Seminar series (HBISS) featured @asaravia and Peter Rose presenting on the GeneLab Knowledge Graph and how Model Context Protocol (MCP) servers can connect it to external knowledge graphs using natural language queries.
Here is the recording if you missed the meeting (I hit record a few minutes after the meeting began, sorry). All links from the chat are below, including those shared by Melissa from the Monarch KG group.
What Peter covered
Peter walked through the SPOKE-GeneLab composite knowledge graph built on Neo4j. The graph integrates OSDR study metadata, differential gene expression, DNA methylation, and amplicon/metagenomic data with self-describing meta nodes that make it AI-ready. He demonstrated a Cypher query combining hypermethylated promoter regions with downregulated genes across spaceflight vs. ground control comparisons, and showed how the composite graph links GeneLab data to the broader SPOKE biomedical knowledge graph through shared nodes (genes, cell types, anatomy).
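For readers who have not written Cypher before, here is a rough sketch of the shape such a query could take, wrapped in Python. The node labels, relationship types, property names, and default thresholds are hypothetical placeholders, not the actual SPOKE-GeneLab schema; check the SPOKE-GeneLab repo and the graph's meta nodes for the real names.

```python
# Hypothetical sketch: every label, relationship type, and property name in
# this query is a placeholder, NOT the real SPOKE-GeneLab schema.
QUERY = """
MATCH (a:Assay)-[m:METHYLATED_IN]->(r:MethylationRegion)-[:IN_PROMOTER_OF]->(g:Gene),
      (a)-[e:MEASURED_EXPRESSION]->(g)
WHERE a.study_id = $study_id
  AND m.methylation_diff > $min_methylation_diff
  AND m.q_value < $max_q
  AND e.log2fc < $max_log2fc
  AND e.adj_p_value < $max_q
RETURN g.symbol AS gene, m.methylation_diff AS methylation_diff, e.log2fc AS log2fc
ORDER BY e.log2fc
"""


def build_params(study_id, min_methylation_diff=10.0, max_log2fc=-1.0, max_q=0.05):
    """Collect query parameters; the default thresholds are illustrative only."""
    return {
        "study_id": study_id,
        "min_methylation_diff": min_methylation_diff,
        "max_log2fc": max_log2fc,
        "max_q": max_q,
    }


def run_query(uri, user, password, params):
    """Run QUERY against a Neo4j instance and return rows as plain dicts."""
    # Imported lazily so the rest of the sketch works without the driver.
    from neo4j import GraphDatabase  # pip install neo4j (5.x)

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        records, _summary, _keys = driver.execute_query(QUERY, parameters_=params)
        return [record.data() for record in records]
```

Passing thresholds as parameters rather than pasting them into the query string keeps the query reusable across studies, for example `build_params("OSD-48")` for the study used in Amanda's second demo.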
What Amanda covered
Amanda demonstrated how MCP servers allow users to query the GeneLab knowledge graph (and connect it to external graphs) entirely through natural language in a chatbot client like Claude, with no Cypher or coding required. She showed three live demos:
- Querying OSD-244 (Rodent Research 6, thymus, muscle atrophy) for differentially expressed genes across 30-day and 60-day spaceflight timepoints, generating volcano plots and Venn diagrams, then pulling related publications from PubMed via its MCP connector
- Combining the GeneLab KG with the Monarch knowledge graph to find genes that are both hypermethylated in the promoter region and downregulated in OSD-48, then using Monarch for pathway enrichment analysis (growth factor signaling, lipid metabolism, circadian clock, ECM remodeling)
- Using GeneLab KG + SPOKE-OKN + Monarch + PubMed together to analyze 16S amplicon data from OSD-267 (Veggie hardware validation test), identifying the top 20 most abundant bacteria in spaceflight roots and cross-referencing them against known plant and human pathogens across multiple knowledge graphs
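The core logic of the third demo (rank taxa by abundance, then cross-reference against pathogen lists) can be sketched in plain Python. In the demo these steps were carried out by the chatbot through MCP tool calls against the knowledge graphs; the function names, taxon names, and counts below are made-up placeholders for illustration, not OSD-267 data.

```python
def top_taxa(counts, n=20):
    """Return the n most abundant taxa as (taxon, relative_abundance) pairs."""
    total = sum(counts.values())
    if total == 0:
        return []
    # Sort by descending count; break ties alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(taxon, count / total) for taxon, count in ranked[:n]]


def flag_pathogens(taxa, pathogen_sets):
    """Map each taxon to the names of the reference sets that list it."""
    return {
        taxon: sorted(name for name, members in pathogen_sets.items() if taxon in members)
        for taxon, _abundance in taxa
    }


# Placeholder read counts and reference sets (illustrative only).
counts = {"taxon_A": 50, "taxon_B": 30, "taxon_C": 20}
pathogen_sets = {
    "plant_pathogens": {"taxon_B"},
    "human_pathogens": {"taxon_B", "taxon_C"},
}

top = top_taxa(counts, n=2)
flags = flag_pathogens(top, pathogen_sets)
for taxon, abundance in top:
    print(f"{taxon}\t{abundance:.2f}\t{', '.join(flags[taxon]) or 'none'}")
```

An empty list in `flags` only means the taxon was absent from the reference sets supplied, not that it is known to be harmless.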
Amanda outlined the near-term plan: host the GeneLab MCP server publicly so users only need to add a connector URL to their chatbot of choice and start querying. Longer term, an overarching “Space Life Sciences” MCP server will route queries across multiple KG sub-servers automatically. Registration on PyPI is planned once testing is complete.
Key discussion highlights
- Melissa Haendel (Monarch Initiative) raised important points about KG interoperability: even when graphs use the same ontology terms, they can model source data differently. She encouraged the team to train Claude to document equivalency decisions and provenance. Melissa shared several standards and resources (linked below) and expressed strong interest in collaborating.
- Rebecca Ringuette @rebecca.ringuette stressed the importance of being able to see the actual code and data sources behind any AI-generated plot, and suggested adding DOIs to study nodes. Amanda confirmed DOI properties are planned.
- Nick Brereton @nicholas.brereton noted that FDR thresholds could be too strict for cross-experiment work and asked about integrating the Environmental Data App and RadLab. Amanda confirmed MCP servers for both the RadLab API and EDA API are planned once those APIs are available in OpenAPI format.
- Adam Amara @adam.amara asked about an open-source dump of the graph database for testing in other graph engines. Peter shared a Neo4j dump file (linked below).
- Simon Cole @simoncole asked about variance from experimental differences (read depth, library prep) across studies. Amanda noted that full metadata (currently in OSDR but not yet in the KG) will be accessible via API-based MCP queries so users can make informed decisions.
- Anu Iris @anuiris asked about accessibility for non-coders. Amanda confirmed that once hosted, using the tool will require nothing more than adding a connector in your chatbot and writing natural language prompts. Tutorials and template prompts are planned.
- Peter Rose noted that Claude (Opus 4.6 and Sonnet 4.6) currently gives the best and most consistent results among the LLMs tested.
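To make Nick's point about FDR thresholds concrete, here is a small sketch of how the cutoff changes the number of genes retained. The recap does not say which FDR procedure the pipelines use; Benjamini-Hochberg is assumed here as a common choice, and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_significant(p_values, alpha):
    """Count tests whose adjusted p-value is at or below alpha."""
    return sum(q <= alpha for q in benjamini_hochberg(p_values))


# Made-up raw p-values for four genes (illustrative only).
p_values = [0.01, 0.04, 0.03, 0.20]
print(count_significant(p_values, alpha=0.05))
print(count_significant(p_values, alpha=0.10))
```

With these placeholder values, more genes pass at the looser cutoff, which is the trade-off to weigh when intersecting gene lists across experiments.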
Next HBISS
April 30, 2026 — Bowhead whales and improved DNA repair in long-lived species
Bowhead whales live ~200 years. Contrary to expectations (elephants, for example, have extra tumor suppressor genes), bowhead fibroblasts actually require fewer oncogenic hits for malignant transformation, but compensate with dramatically superior DNA repair via cold-inducible RNA-binding protein (CIRBP). Exciting potential applications for spaceflight radiation countermeasures!
All links from the chat and presentation
- SPOKE-OKN on FRINK — https://frink.renci.org/registry/kgs/spoke-okn/
- Prime-KG (Harvard) — https://zitniklab.hms.harvard.edu/projects/PrimeKG/
- Monarch Initiative KG — https://monarchinitiative.org/kg/about
- Monarch DisMech repo — https://github.com/monarch-initiative/dismech
- SPOKE-GeneLab GitHub — https://github.com/BaranziniLab/spoke_genelab
- MCP-GeneLab (main repo) — https://github.com/sbl-sdsc/mcp-genelab
- MCP-GeneLab (Amanda’s fork) — https://github.com/asaravia-butler/mcp-genelab
- Neo4j dump file — https://drive.google.com/file/d/1Wy-BqUhtmGk7qB8WpnG5LefDDsF_6xQW/view?usp=sharing
- HBISS presentation slides — https://docs.google.com/presentation/d/1qq9C4HgCsyYKpsRcRMWIV-U7WeSCJN9g
- SSSOM mapping standard — https://doi.org/10.1093/database/baac035
- LinkML modeling language — https://linkml.io/
- Biolink Model — https://biolink.github.io/biolink-model/
- Bowhead whale Nature paper — https://www.nature.com/articles/s41586-025-09694-5
- OSMED — https://www.osmed.org/
- RadLab API — https://visualization.osdr.nasa.gov/radlab/gui/data-api/
- Environmental Data App — https://visualization.osdr.nasa.gov/eda/
- AWG Forum-Space About — https://awg.osdr.space/about
