Microgravity analogue literature review

Mining the Microgravity Analogue Literature: Help Needed!

Hey everyone,

With so many open-source microgravity platforms available, I figured it's time to review the existing literature, and I need your help sifting through a massive pile of research papers!

Here's the challenge: I've pulled 17,411 papers from NCBI related to three groups of microgravity analogues (random positioning machines (RPMs), 2D clinostats, and slow rotating wall vessels). The tables listing these papers are in a GitHub repository, along with the code used to find them and a summary of some initial findings.
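For reference, the kind of PubMed query involved looks roughly like the sketch below (a minimal Biopython Entrez example with placeholder search terms; the actual queries live in the notebook in the repo):

```python
# Minimal sketch of pulling PubMed IDs per analogue group.
# The search terms below are illustrative placeholders, not the real queries.
from Bio import Entrez

Entrez.email = "you@example.org"  # NCBI asks for a contact address

def pubmed_ids(query, retmax=20000):
    """Return the list of PubMed IDs matching a query string."""
    handle = Entrez.esearch(db="pubmed", term=query, retmax=retmax)
    record = Entrez.read(handle)
    handle.close()
    return record["IdList"]

groups = {
    "RPM": '"random positioning machine" AND microgravity',
    "2D_clinostat": 'clinostat AND microgravity',
    "rotating_wall": '"rotating wall vessel"',
}

for name, query in groups.items():
    print(name, len(pubmed_ids(query)))
```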

The question is: How many of these papers are truly relevant to microgravity analogue experiments? There are likely to be some false positives in this dataset (my manual inspection of a small fraction of the slow rotating wall vessel group was promising, but I'm not sure if that's just because I'm relying on the title and not the abstract).

Can you lend a hand? I'm looking for clever ways to analyze this list of papers (titles, abstracts, journals) to identify the real deal and remove any irrelevant studies (false positives).

Does this require humans to read all the titles and abstracts to pass judgment? Or is it more time-effective to find a way to make the initial search more precise? And how do we assess the accuracy of this initial NCBI search? (One possibility is sketched below.)
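One way to gauge the accuracy without reading everything would be to hand-label a random sample and put a confidence interval on the precision. A rough sketch (the combined CSV name is from the repo; the sample size and the example counts are made up):

```python
# Estimate the precision of the initial NCBI search from a random sample.
import math
import pandas as pd

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

papers = pd.read_csv("RPM_clinostat_literature_combined_4_ai.csv")
sample = papers.sample(n=200, random_state=0)   # label these 200 by hand
sample.to_csv("sample_to_label.csv", index=False)

# After manual labeling, suppose 183 of the 200 turn out to be relevant:
low, high = wilson_interval(183, 200)
print(f"Estimated precision: {183/200:.1%} (95% CI {low:.1%} to {high:.1%})")
```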

I've put the lists of papers (and related metadata such as author, abstract, title, publication year, and journal), along with the code and a summary, in this GitHub repo (GitHub - dr-richard-barker/Microgravity_analogue_review).

If this sounds interesting to you, please let me know and we could create a focus group to assess or enhance possible solutions and strategies.

Thanks for your time and consideration :slight_smile:
Cheers DRB

@PlantAWG @ALSDAawg @AnimalAWG

16 Likes

Hi Richard,

This sounds interesting. I would love to be part of it. Anything I can do?

1 Like

For sure, give me examples of the false positives in a CSV and I can help you with this task.

1 Like

Sure, here's the link to the .csv in the repo that contains all the papers.
The same repo also contains a Python notebook that was used to create each of the groups.

1 Like

I was an author on a similar paper in recent years. It was nowhere near this exhaustive.

This led to the development of the NASA Space Life Science Library. Many interns contributed to this work over the years.
https://public.ksc.nasa.gov/nslsl/

3 Likes

In this case, are RPM.csv and rotating_wall.csv partially clean, and is RPM_clinostat_literature_combined_4_ai.csv the combined file that still contains false positives?

1 Like

None of the .csv files have been cleaned.
Each was pulled separately and then merged into one with a new “Group” designation.
The paper @Botanynerd provided and the references inside it are the closest thing we have to a “cleaned” dataset.
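For context, the merge step was roughly of this form (a pandas sketch, not the actual notebook code; the 2D clinostat filename is a guess):

```python
# Rough reconstruction of the merge: tag each group's CSV and concatenate.
import pandas as pd

files = {
    "RPM": "RPM.csv",
    "2D_clinostat": "2D_clinostat.csv",   # assumed filename
    "rotating_wall": "rotating_wall.csv",
}

frames = []
for group, path in files.items():
    df = pd.read_csv(path)
    df["Group"] = group                    # new "Group" designation
    frames.append(df)

combined = pd.concat(frames, ignore_index=True)
combined.to_csv("RPM_clinostat_literature_combined_4_ai.csv", index=False)
```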

OK, can you send me the false positives?

Not a comprehensive list, but here are 168 (not evenly distributed among the 3 groups)…

1 Like

Looking through the titles and abstracts, I think ideally they should mention the species, the tissue, (sometimes) the assay name, and the microgravity analogue type/description, which would enable grouping for the meta-analysis. I think finding a way to pull this information into columns in the table may help identify false positives (see the sketch below). Possibly?
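Something like this could be a first pass (just a sketch: the keyword lists are illustrative and would need expanding by domain experts, and the “Title”/“Abstract” column names are assumptions about the CSV layout):

```python
# Scan title + abstract for a small controlled vocabulary and record the
# hits as new columns; papers with no analogue term are review candidates.
import pandas as pd

VOCAB = {
    "species": ["arabidopsis", "human", "mouse", "rat", "zebrafish", "yeast"],
    "tissue": ["root", "shoot", "bone", "muscle", "osteoblast", "lymphocyte"],
    "analogue": ["random positioning machine", "rpm", "clinostat",
                 "rotating wall vessel", "rotating-wall vessel"],
}

def find_terms(text, terms):
    text = str(text).lower()
    return "; ".join(t for t in terms if t in text)

papers = pd.read_csv("RPM_clinostat_literature_combined_4_ai.csv")
text = papers["Title"].fillna("") + " " + papers["Abstract"].fillna("")

for column, terms in VOCAB.items():
    papers[column] = text.apply(lambda t: find_terms(t, terms))

# Rows that never mention an analogue keyword are candidates for review.
candidates = papers[papers["analogue"] == ""]
print(len(candidates), "papers never mention an analogue keyword")
```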

1 Like

Hey Richard, I'm really interested in developing biophysical assessments of these treatments in terms of mechanical and thermal systems, and in developing some quantitative models of these factors within the treatment systems…

1 Like

Yes, it is. I'm going to check the information and develop something; hold on.

1 Like

I am interested in being a part of it.

1 Like

I found 108 automatically (the 2D case), but I need you to check whether they are correct while I search for matches with slightly more complex methods:

https://www.grupoalianzaempresarial.biz/nasa/awg/microgravityanalogueliteraturereview/identified_false_positives_2D.csv

RPM case:

https://www.grupoalianzaempresarial.biz/nasa/awg/microgravityanalogueliteraturereview/identified_false_positives_rpm.csv

I found 153 in 2D using a more complex method, but it consumes considerably more resources:

https://www.grupoalianzaempresarial.biz/nasa/awg/microgravityanalogueliteraturereview/identified_false_positives_2D_a.csv

Tell me if you consider any of the results useful so I can share the code with you. I have several meetings but when I return I can explore other approaches that might be useful to you.

1 Like

Hello, this is really interesting… Yes, this is really helpful; I just need help understanding what happened.
It's hard to be sure what happened with just these .csv files.
I'm a bit confused, as for the RPM analysis there were ~9958 papers but this filtered list only contains 683 entries. I can see there are new columns entitled “is_false_positive_2D” or “predicted_false_positive”, which is exactly what we want to see! Awesome, is it possible to do it with all 17,411 papers in the combined list? Thanks for your help with this :slight_smile:

1 Like

I'm back. Sorry for not being specific: the files I gave you are false positives, and you have to delete them from your database. What I need to know is whether I identified them correctly with the processes I developed (I checked the first 20 and they're OK). We can do a hybrid process (with humans labeling) or a fully automatic one (but that can be less rigorous).

For the first case, I developed a non-intensive method and got:

2D (found 108 to exclude):
https://www.grupoalianzaempresarial.biz/nasa/awg/microgravityanalogueliteraturereview/identified_false_positives_2D.csv

RPM (found 684 to exclude):
https://www.grupoalianzaempresarial.biz/nasa/awg/microgravityanalogueliteraturereview/identified_false_positives_rpm.csv

I also developed a more intensive method, but I have only tested it on 2D (found 153 to exclude):

https://www.grupoalianzaempresarial.biz/nasa/awg/microgravityanalogueliteraturereview/identified_false_positives_2D_a.csv

If humans can label whether a paper should be discarded, I can develop a recursive process to simplify the task and multiply the human capacity, but we need a server for the interface (maybe a NASA server with Linux and SSH; I can host this process, but maybe you will want to use these methods for other tasks in the future). For example, if a human labels one paper, we get more cases and can discard more papers; in this instance, using your CSV of false positives, I discarded these elements (2D: 163 and RPM: 684).

Apart from some meetings I have this week, I have time to help you label, so maybe we can end up with a hybrid approach this week, though the result might not be perfect. I can develop another interface for labeling, so that if someone finds an irrelevant document, that one label could help discard more documents. A rough sketch of the hybrid idea is below.
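As an illustration only (generic active learning with TF-IDF and logistic regression, not the exact process described above; the “Title”, “Abstract”, and “label” column names are assumptions), the loop could look like this:

```python
# Train a simple classifier on the human-labeled papers, auto-flag the
# confident false positives, and queue the most uncertain papers for review.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

papers = pd.read_csv("RPM_clinostat_literature_combined_4_ai.csv")
text = papers["Title"].fillna("") + " " + papers["Abstract"].fillna("")

# "label" column assumed: 1 = relevant, 0 = false positive, NaN = unlabeled
labeled = papers["label"].notna()

vec = TfidfVectorizer(max_features=20000, stop_words="english")
X = vec.fit_transform(text)

clf = LogisticRegression(max_iter=1000)
clf.fit(X[labeled.values], papers.loc[labeled, "label"])

papers["p_relevant"] = clf.predict_proba(X)[:, 1]

unlabeled = papers[~labeled].copy()
auto_discard = unlabeled[unlabeled["p_relevant"] < 0.05]     # threshold is a choice
unlabeled["uncertainty"] = (unlabeled["p_relevant"] - 0.5).abs()
to_review = unlabeled.nsmallest(100, "uncertainty")          # next batch for humans

print(len(auto_discard), "auto-flagged;", len(to_review), "queued for human review")
```

Each round of human labels retrains the classifier, so one label can indeed help discard several more documents.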

1 Like

These false positives are from the 17,411 papers, but we need more human labeling to discard more. Maybe I can do that, but perhaps you have a team doing this that we could use to finish the task faster.

1 Like

Bad news: the false positives are the same ones you sent me (168; if you have more, that would be better); the only difference was repeated PMIDs.
Good news: with the more intensive process, I discarded 1392 in RPM.

https://www.grupoalianzaempresarial.biz/nasa/awg/microgravityanalogueliteraturereview/identified_false_positives_rpm_a.csv

Now I can exclude 1392 + 108 = 1500 papers.

Now I’m going to work with:

RPM: 8565.
https://www.grupoalianzaempresarial.biz/nasa/awg/microgravityanalogueliteraturereview/RPM_w.csv

2D: 6320.
https://www.grupoalianzaempresarial.biz/nasa/awg/microgravityanalogueliteraturereview/2d_w.csv

At the moment that's 14,885. I'm going to develop something; maybe I can discard a lot more… When I finish this first approach, I can develop a platform for labeling with humans if you have a team. We can divide the papers into sections, and I can add options, for example “human reviewed” (positive or false positive) and “automatically labeled” (for those flagged false positives to be reviewed by humans). This can help obtain a rigorous result while still being an easy way to work.

I'm showing the process in case someone finds errors or wants to participate; I will publish the methods at the end.

1 Like