New AI/ML Subgroup for Genetic Perturbation Predictive Modeling (GPPM)

Genetic Perturbation Predictive Modeling

We are excited to introduce a new subgroup within the AI/ML AWG centered on predictive modeling using omic data generated from genetic perturbation. We already have an existing project that involves training ML algorithms on Perturb-Seq scRNA-Seq data to make predictions in unrelated single-cell and bulk RNA-seq datasets, namely those from spaceflight and spaceflight simulation. The premise is that, by training on the profile of transcriptional changes created by an upstream perturbation, the origins of other widespread transcriptional changes (such as those which occur in humans during spaceflight) can be traced to their sources. This approach is unique because it can identify an upstream source gene or cluster even if the source itself does not undergo a significant change in expression. This project already has a manuscript on bioRxiv, but it still needs to be submitted to a journal, possibly BMC Bioinformatics, and will likely require some revision and expansion. We would like to see this published. From there we can explore avenues to expand upon the existing paradigm with new datasets and algorithms. There are several possible directions for this already on paper, but we are interested in any new avenues or ideas you might bring to the table.

Active Project Pre-Print:

https://doi.org/10.1101/2024.11.28.625741

Interest Response Form:

We are looking for members to fill the following roles/areas of attention:

Computation – Data processing and model training

Output Analysis – Gene set enrichment analysis and literature validation.

Research – Searching for new datasets, algorithms to apply, and studies to validate predictions against. Involves extensive literature review.

Code Organization – Maintaining and synchronizing versions, GitHub and Hugging Face maintenance. Familiarity with Google Colab notebooks would be helpful, as we need to pivot towards that system.

Submission Experience – Expertise in navigating journal submissions.

Resources we are looking for:

Compute – The existing models were created using a limited dataset that fit within 164 GB of memory, but we will need a larger server configured for remote access via Colab. We have a 96 GB server which may be available for this as a last resort, but we would like to find a better solution.

OSDR human spaceflight RNA-seq datasets – The existing project was created with the recently compiled human datasets in mind, but has never been used on them due to access limitations.

@AIMLawg @MultiOmicsAWG

@rachelcrivero and I might want to set up a meeting with you all regarding some new datasets we have been working on… :slight_smile:

Hi all,

Congratulations on developing this framework!
I would be very interested in discussing how this can be expanded towards viral evolution, genomic epidemiology, and public health purposes.
Please let me know if this would be of interest, and when would be a good time to connect.
I’ve also registered to become part of your subgroup.

Many thanks,
Nidia

@liamfj17 @lauren.sanders

Hello everyone :folded_hands:

I’m Chalermchai from Saraburi, Thailand.

The project details of the AI/ML subgroup are close to what I have imagined. I would like to present my idea in case any experts want to add some ideas :pushpin: to the project. :victory_hand: I named the project “Astro-Symbiote: AI-Powered Bio-Adaptive Habitat Management”.

* :pushpin: Concept: :backhand_index_pointing_right: Create an AI system that acts as a “symbiotic organism” (Symbiote) with a closed-loop ecosystem in a spacecraft or a base on another planet. The AI will learn and adapt to maintain the balance of living things (plants, microorganisms, captive insects) and resources (water, air, nutrients) as appropriate in real time.
* AI/ML uses:
  * Reinforcement Learning: The AI learns from experiments to adjust various factors (light, temperature, humidity, watering/nutrients) and receives feedback from biological sensors (plant growth rate, microbial health) to find the best “policy” to maintain the balance.
  * Computer Vision & Anomaly Detection: The AI analyzes images from microscopes and regular cameras to detect plant diseases, pest/microbial outbreaks, or ecosystem abnormalities early on.
  * Predictive Modeling: Forecast future resource needs, oxygen/food production, and waste management so the system can adapt in advance.
* Novelty: It’s not just about controlling the environment, but about creating AI that “understands” and “nurtures” complex biological systems so they can grow and sustain themselves in limited environments.

Thank you :folded_hands:

Chalermchai

@Anatta

Just a reminder that the first meeting for the GPPM subgroup is today at 2:00 PM Pacific. Hope to see you there!

Video call link: https://meet.google.com/kgu-oxpk-hee

Did you all see this paper?

https://www.nature.com/articles/s41592-025-02772-6

@liamfj17 @lauren.sanders @nidiatrovao

Hello! I’d love to join this subgroup. I’ve just filled out the form linked above and am looking forward to hearing from you all! ML models and scRNA-seq are two topics I’ve worked with firsthand, so I hope I can bring valuable insight and skill to this team!

@liamfj17 @lauren.sanders

Still a space for a data engineer? @liamfj17 @lauren.sanders

Topic→New PR for Perturbation Theory

URL: https://github.com/liamfj17/Perturb-Seq-Transfer-ML-for-Prediction/pull/1

Reviewers assigned→ @liamfj17, cc'd @lauren.sanders

Content: Improved README with best practices for the pull request and merge process. (@liamfj17, you will need to block main for now to avoid automatic merges.)

Main action item: Establish access to public datasets for local testing. Next Ticket…?

Dev: Felipe Pineda

Hiya - The GitHub repo is ready for contributions! For now the process is to:

A. Clone the repo (request access). :rescue_worker_s_helmet:
B. Jump to the branch UAT using

git checkout UAT

C. Create a new branch off UAT with:

git checkout -b {featureName}FeatureDev

D. Once you have made all your changes and are happy with them, commit and push them on your branch and open a pull request to UAT! That would be it. :satellite:
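
For anyone who wants to rehearse steps A through D before touching the real repo, here is a minimal sketch that runs end to end in a throwaway sandbox. Everything outside the `git checkout` commands above is an assumption for illustration: a local bare repository stands in for the GitHub remote, and the feature name `readmeFix`, the file `NOTES.md`, the demo identity, and the commit messages are all hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sandbox: a temporary directory, so no real repository is touched.
work="$(mktemp -d)"

# Stand-in for the GitHub remote (assumption): a local bare repo
# seeded with one commit on main and a UAT branch.
git init --quiet --bare "$work/remote.git"
git --git-dir="$work/remote.git" symbolic-ref HEAD refs/heads/main
git init --quiet "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/main
echo "# demo" > README.md
git add README.md
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit --quiet -m "Initial commit"
git branch UAT
git push --quiet "$work/remote.git" main UAT

# A. Clone the repo (with the real project this is the GitHub URL,
#    once access has been granted).
git clone --quiet "$work/remote.git" "$work/clone"
cd "$work/clone"

# B. Jump to the UAT branch.
git checkout --quiet UAT

# C. Create a feature branch off UAT ("readmeFix" is a hypothetical name).
feature="readmeFix"
git checkout --quiet -b "${feature}FeatureDev"

# D. Make a change, commit it, and push the branch to the remote.
echo "hypothetical change" > NOTES.md
git add NOTES.md
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit --quiet -m "Add notes"
git push --quiet -u origin "${feature}FeatureDev"
```

With the real repo, only the A through D part applies, and the last step (opening the pull request with UAT as the base branch) happens on GitHub itself.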

If you have any questions feel free to reach out. Let the improvements begin!

cc. @nidiatrovao @sriram.susarla @Anatta @liamfj17
