Hi AWG members,
I’m excited to announce that the updated bulk RNAseq pipeline for eukaryotic organisms is ready for your review. Please use the link to review the pipeline and send your feedback as soon as possible, and no later than Monday, 2/17/2025.
Updates include the following:
Updated Ensembl Reference Files to the following releases:
- Animals: Ensembl release 112
- Plants: Ensembl Plants release 59
- Bacteria: Ensembl Bacteria release 59
Software version updates.
STAR Alignment:
- Added unaligned reads FASTQ output file(s) via STAR --outReadsUnmapped Fastx:
  - {sample}_Unmapped.out.mate1
  - {sample}_Unmapped.out.mate2
RSeQC Analysis:
- Updated the inner_distance.py invocation to use a lower minimum value to account for longer read lengths:
  - Previously used a fixed minimum value of -150
  - Now uses -(max read length)
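For anyone reviewing the inner-distance change, here is a minimal Python sketch of how the lower bound could be derived from the read lengths. The function name, file names, and read lengths are hypothetical, and the pipeline's actual invocation may pass additional options. Only the documented RSeQC options -i, -r, -o, and -l are used.

```python
def build_inner_distance_cmd(bam, ref_bed, out_prefix, read_lengths):
    """Build an inner_distance.py command whose lower bound is -(max read length).

    Hypothetical helper for illustration; not the pipeline's actual code.
    """
    if not read_lengths:
        raise ValueError("at least one read length is required")
    # The old behavior was a fixed lower bound of -150.
    lower_bound = -max(read_lengths)
    return [
        "inner_distance.py",
        "-i", bam,               # input alignment file
        "-r", ref_bed,           # reference gene model in BED format
        "-o", out_prefix,        # prefix for output files
        "-l", str(lower_bound),  # lower bound of the inner distance
    ]


# Example with made-up file names and read lengths.
cmd = build_inner_distance_cmd("sample.bam", "genes.bed", "sample", [101, 151])
print(" ".join(cmd))
```

With 151 bp reads, the lower bound becomes -151 instead of the old fixed -150, so longer reads are no longer clipped at the minimum.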
DESeq2 Analysis Workflow:
- Added a variance-stabilizing transformation (VST) counts output file, VST_Counts_GLbulkRNAseq.csv.
- Account for technical replicates:
  - For datasets with uniform technical replicates (all samples have the same number of technical replicates):
    - Sum counts across technical replicates using DESeq2’s collapseReplicates
  - For datasets with non-uniform technical replicates:
    - Keep only the first N technical replicates for each sample, where N is the smallest number of technical replicates among all samples
    - Sum counts across the kept technical replicates
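To make the replicate handling concrete, here is a small Python sketch of the arithmetic only. The pipeline itself uses DESeq2's collapseReplicates in R; the function name, data layout, and sample IDs below are hypothetical. Because N equals the full replicate count when replicates are uniform, one function covers both cases.

```python
def collapse_technical_replicates(counts, replicates):
    """Sum gene counts across the first N technical replicates of each sample.

    counts:     dict mapping run ID -> list of per-gene counts (same gene order)
    replicates: dict mapping sample ID -> ordered list of its run IDs
    N is the smallest number of technical replicates among all samples.
    Hypothetical helper for illustration; not the pipeline's actual code.
    """
    n_keep = min(len(runs) for runs in replicates.values())
    collapsed = {}
    for sample, runs in replicates.items():
        kept = runs[:n_keep]  # keep only the first N replicates
        # Sum gene-by-gene across the kept replicates.
        collapsed[sample] = [sum(vals) for vals in zip(*(counts[r] for r in kept))]
    return collapsed


# Made-up example: sample A has 3 technical replicates, sample B has 2, so N = 2.
counts = {
    "A_1": [1, 2], "A_2": [3, 4], "A_3": [5, 6],
    "B_1": [10, 20], "B_2": [30, 40],
}
replicates = {"A": ["A_1", "A_2", "A_3"], "B": ["B_1", "B_2"]}
collapsed = collapse_technical_replicates(counts, replicates)
print(collapsed)
```

In this example the third replicate of sample A is dropped before summing, which keeps the summed library sizes comparable across samples.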
- Removed DGE and PCA output tables previously used for GeneLab visualization (visualization_output_table_GLbulkRNAseq.csv and visualization_PCA_table_GLbulkRNAseq.csv)
- Removed ERCC-normalized DGE analysis, along with the following output files:
  - ERCC_Normalized_Counts_GLbulkRNAseq.csv
  - ERCCnorm_differential_expression_GLbulkRNAseq.csv
  - ERCCnorm_contrasts_GLbulkRNAseq.csv
  - visualization_output_table_ERCCnorm_GLbulkRNAseq.csv
  - visualization_PCA_table_ERCCnorm_GLbulkRNAseq.csv
- Added parallel rRNA-removed DGE analysis:
  - Create filtered RSEM count files with rRNA features removed:
    - {sample}_rRNA_removed.genes.results
  - Normalize rRNA-removed counts
  - Perform DGE analysis using rRNA-removed counts
  - Output an additional set of rRNA-removed counts and DGE results
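As a rough illustration of the rRNA filtering step, here is a minimal Python sketch that drops rRNA rows from an RSEM genes.results table. It assumes the standard tab-delimited RSEM layout with a gene_id column and assumes the rRNA gene IDs are already known. How the pipeline identifies rRNA features is not described in this message, and the function name and gene IDs below are hypothetical.

```python
import csv
import io


def remove_rrna_features(genes_results_text, rrna_gene_ids):
    """Return RSEM genes.results text with rows for rRNA genes removed.

    genes_results_text: contents of a tab-delimited {sample}.genes.results file
    rrna_gene_ids:      set of gene IDs to drop (assumed to be known in advance)
    Hypothetical helper for illustration; not the pipeline's actual code.
    """
    reader = csv.DictReader(io.StringIO(genes_results_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        if row["gene_id"] not in rrna_gene_ids:
            writer.writerow(row)
    return out.getvalue()


# Made-up two-gene table; geneB stands in for an rRNA feature.
example = (
    "gene_id\ttranscript_id(s)\tlength\teffective_length\texpected_count\tTPM\tFPKM\n"
    "geneA\ttxA\t1000.00\t850.00\t120.00\t10.50\t8.20\n"
    "geneB\ttxB\t1500.00\t1350.00\t90000.00\t900.00\t700.00\n"
)
filtered = remove_rrna_features(example, {"geneB"})
print(filtered)
```

The filtered text would then be written to {sample}_rRNA_removed.genes.results and fed into normalization and DGE in parallel with the unfiltered counts.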
A big thank you to @alexis.torres and @crystal.han on the Data Processing Team for leading this effort!
@MicrobesAWG @AIMLawg @AnimalAWG @ALSDAawg @PlantAWG @MultiOmicsAWG @HUMANawg @RLWG @FemaleReproAWG