Regarding the idea that @lauren.sanders brought up, I’d like to share a relevant foundation model (RetFound) in ophthalmology (A foundation model for generalizable disease detection from retinal images | Nature), which demonstrates promising results in using eye images to detect systemic diseases such as stroke and Parkinson’s disease. I really love this idea and have been working on something related. It would be great to check it out!
Hi @AliReza-H - this foundation model is so cool! As an easy first step, OSD-679 has optical coherence tomography (OCT) images from rodents that were exposed to simulated spaceflight stress - one of the data types this model was trained on. It would be cool to test the model on this data!
Hi @jgong and all,
I’ve prepared the gene table we discussed in the meeting: mouse_retina_genes_phenotypes.csv - Google Drive
I went through all the gene sets and included any missing genes. The table’s columns are Ensembl ID, gene symbol, description, and the phenotypes in which each gene appears, for the targeted analysis. I think we can use it to select the most relevant genes involved in retinal function.
Lauren, this is the exact same list from your document. We formatted it for better interpretation (with gene symbols) and to help us keep track of the shared genes across different assays/phenotypes.
I finished making the two gene lists: one from the phenotype-related gene sets (LINK TO FILE), and one from XGBoost regression identifying which genes are most predictive of the phenotypes (LINK TO FILE) (thanks for the XGBoost suggestion, V, it worked much better!)
It turns out there is NO overlap between the 2 lists. There are 1777 genes from the phenotype-related gene sets, and 26 genes that XGBoost identified as predictive of the phenotypes.
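For anyone who wants the gist without opening the notebook, here’s a minimal sketch of the two steps: ranking genes by predictive importance with a gradient-boosted regressor (scikit-learn’s GradientBoostingRegressor as a stand-in for XGBoost, since the fit/importance interface is nearly identical) and checking overlap against a phenotype-derived set. The gene names, data, and cutoff are all illustrative, not the project’s actual values.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Toy expression matrix: 40 samples x 6 genes (gene names are illustrative)
genes = ["Rho", "Opn1sw", "Gnat1", "Pde6b", "Rpe65", "Crx"]
X = rng.normal(size=(40, len(genes)))
# Synthetic phenotype driven mostly by the first two genes
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.1, size=40)

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Keep the genes carrying most of the predictive signal
top_idx = np.argsort(model.feature_importances_)[::-1][:2]
predictive_genes = {genes[i] for i in top_idx}

# Overlap with a phenotype-derived gene set (illustrative)
phenotype_genes = {"Rpe65", "Crx", "Nrl"}
overlap = predictive_genes & phenotype_genes
print(sorted(predictive_genes), sorted(overlap))
```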
Here’s the notebook with all the code: Google Colab. Anyone should be able to run it and get the files locally.
@vaishnavi.nagesh I’d suggest trying imputation with both gene lists and seeing which performs better.
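One way to run that comparison (a sketch, not the project pipeline): mask a random subset of known values, impute them back with each gene list’s expression matrix, and compare reconstruction error on the hidden entries. KNNImputer and the masking scheme here are assumptions; the matrices are random stand-ins for expression data restricted to each list.

```python
import numpy as np
from sklearn.impute import KNNImputer

rng = np.random.default_rng(1)

def masked_rmse(X, mask_frac=0.1, seed=0):
    """Hide a fraction of entries, impute them back, return RMSE on the hidden entries."""
    r = np.random.default_rng(seed)
    X_missing = X.copy()
    mask = r.random(X.shape) < mask_frac
    X_missing[mask] = np.nan
    X_imputed = KNNImputer(n_neighbors=5).fit_transform(X_missing)
    return float(np.sqrt(np.mean((X_imputed[mask] - X[mask]) ** 2)))

# Random stand-ins for expression data restricted to each gene list
X_phenotype_list = rng.normal(size=(50, 20))   # e.g. phenotype-derived genes
X_xgb_list = rng.normal(size=(50, 8))          # e.g. XGBoost-selected genes

rmse_a = masked_rmse(X_phenotype_list)
rmse_b = masked_rmse(X_xgb_list)
print(f"phenotype list RMSE: {rmse_a:.3f}, XGBoost list RMSE: {rmse_b:.3f}")
```

Whichever list gives the lower masked-entry RMSE on the real data would be the better candidate for the downstream imputation task.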
The OSD-679 data is split into 4 cohorts (female old and young; male old and young). I ended up downloading only cohort 3 (old male mice) because it’s the smallest.
There are lots of ways to break this dataset into classes; as a first pass I used Control vs HLS (hind limb suspension).
So far I’ve only fine-tuned the model for 10 epochs without hyperparameter tuning, and the performance is not great (56%), but I think a few things could improve it:
- More data / additional cohorts
- Preprocessing the images more like the paper does
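For reference, the fine-tuning loop boils down to something like this sketch (PyTorch, with a tiny stand-in backbone instead of RetFound’s ViT and random tensors instead of the OSD-679 OCT images; shapes, learning rate, and epoch count are illustrative):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a pretrained image encoder (RetFound would go here)
backbone = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 64), nn.ReLU())
head = nn.Linear(64, 2)  # 2 classes: Control vs HLS

# Freeze the pretrained weights; train only the new classification head
for p in backbone.parameters():
    p.requires_grad = False

opt = torch.optim.AdamW(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Random tensors standing in for preprocessed OCT images and labels
images = torch.randn(16, 1, 32, 32)
labels = torch.randint(0, 2, (16,))

for epoch in range(10):
    opt.zero_grad()
    logits = head(backbone(images))
    loss = loss_fn(logits, labels)
    loss.backward()
    opt.step()
```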
Thanks for this beautiful notebook, @lauren.sanders
I also tried unfreezing the last 4 blocks; unfortunately, the result didn’t change. Or maybe I made a mistake.
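For anyone else trying this, unfreezing the last N transformer blocks of a ViT-style model usually looks like the sketch below (a toy stack of linear layers stands in for RetFound’s encoder, and the `.blocks` attribute name is an assumption based on typical ViT implementations):

```python
import torch.nn as nn

# Toy ViT-style encoder: a stack of 12 "blocks" (real blocks would be attention + MLP)
class TinyEncoder(nn.Module):
    def __init__(self, depth=12, dim=8):
        super().__init__()
        self.blocks = nn.ModuleList(nn.Linear(dim, dim) for _ in range(depth))

model = TinyEncoder()

# Freeze everything, then unfreeze only the last 4 blocks
for p in model.parameters():
    p.requires_grad = False
for block in model.blocks[-4:]:
    for p in block.parameters():
        p.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable}/{total}")
```

Printing the trainable/total parameter count like this is a quick sanity check that the unfreezing actually took effect before rerunning training.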
my fork:
Can I please be added to the subgroup mailing list and calendar? I would love to contribute to this project and have independent research experience relating to retinal foundation models and ophthalmology.
I would also like to be added to the mailing list and calendar. I truly believe this is a great opportunity to contribute to research and gain experience.
Hi all, all recordings should be up to date now, here: Digital Twin - Google Drive (I fixed the recording for last week. Thanks @alavia).
Sunkalp and Kush, please use the calendar here to join the meeting today. I need your email addresses to add you to our calendar invite (@schandra, @kshah, please PM me). Our meeting today is scheduled for 10 am PST. See you all soon!
I cannot make the meeting today. I will watch the recordings and catch up on what is discussed in the meeting. Please let me know if there is anything specific I should focus on or look into.
Hi all,
During our meeting last Friday, the literature review was discussed. I was wondering how I might be able to get involved with that. I’d love to contribute however I can.
Hi Kush, thanks for asking this question. Please take a look at our folder on Google Drive, specifically the spreadsheet that Amey created, where he has collected quite a few papers: Literature_Review - Google Drive
I’d like you to look into the existing collection of papers and get a sense of the kind of input data required for each study (make a table / do a bit of classification, for example). All of these papers are currently focused on imputation.
We want to expand this collection to cover digital twins more broadly; Alireza has been working on that, and his latest find is a heart digital twin project. Please take a look at that as well. If you want, feel free to create another spreadsheet for digital-twin-related papers.
This Friday, it would be great if you could share your findings in a 5-10 min presentation format, so we can learn this together. Thanks so much!