Hello,

I am analyzing the ROSMAP blood RNA-seq data (syn22024498). The batch 3 fastq files are paired-end reads from three different lanes. I have obtained the counts for each lane with featureCounts, but I don't know how to handle counts that come from the same sample but different lanes. Should I combine them, or treat them as separate samples?

Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L001_001.R1.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L001_001.R2.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L002_001.R1.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L002_001.R2.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HN3M2DMXX_L001_001.R1.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HN3M2DMXX_L001_001.R2.fastq.gz

Thanks,
DuanTingting

Created by Tingting Duan Duantingting
Hi Dana, I'm not sure but hopefully Dr. Yiyi Ma at Columbia can help answer! @yiyima, can you help advise on the correct order for the fastq merge process for the ROSMAP monocyte RNAseq data? Thanks, Abby
Hi @abby.vanderlinden,

Is it possible to get a short explanation of the correct order of the reads for the monocyte samples in batch 3? I understand that all the Sample_XXX_...._xxx.R1 files belong to the same read end. But how do I decide the order among the three files that share the R1 suffix? What is their internal order? What would be the correct file sequence for the merge in the following example?

Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L001_001.R1.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L001_001.R2.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L002_001.R1.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L002_001.R2.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HN3M2DMXX_L001_001.R1.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HN3M2DMXX_L001_001.R2.fastq.gz

Thanks,
Dana
The original merged fastq files that were replaced were merged incorrectly -- the order of the reads in the provided files was incorrect. Best practice is still to merge the files before alignment, but we wanted to provide the raw files so that downstream users can start with the raw data and do the merging with their preferred method.
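For anyone who wants to sanity-check a merge of their own, here is a minimal Python sketch that compares read IDs between an R1 file and its R2 mate and reports the first record where they disagree. It assumes standard 4-line FASTQ records and that mates share the header text before the first whitespace (optionally followed by `/1` or `/2`). The function names `read_ids` and `first_mismatch` are made up for this illustration, and this is not necessarily the check that uncovered the original problem.

```python
import gzip
from itertools import zip_longest


def read_ids(path):
    """Yield the read ID of every record in a gzipped FASTQ file.

    Assumes 4-line records; the ID is the header text before the first
    whitespace, with the leading '@' and any trailing /1 or /2 removed.
    """
    with gzip.open(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 != 0:
                continue  # only header lines carry the read ID
            fields = line[1:].split()
            if not fields:
                continue  # tolerate a stray blank line at the end of the file
            read_id = fields[0]
            if read_id.endswith(("/1", "/2")):
                read_id = read_id[:-2]
            yield read_id


def first_mismatch(r1_path, r2_path):
    """Return (record_index, r1_id, r2_id) for the first out-of-sync pair.

    Returns None when both files list the same IDs in the same order.
    A missing mate (files of different length) shows up as a None ID.
    """
    pairs = zip_longest(read_ids(r1_path), read_ids(r2_path))
    for index, (id1, id2) in enumerate(pairs):
        if id1 != id2:
            return index, id1, id2
    return None
```

A result of `None` only means the two files are in sync with each other. It does not say anything about whether reads were lost or duplicated in both files at once.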
@abby.vanderlinden Thanks! However, the batch 3 data was merged previously, and that merge was later found to be wrong, so the re-uploaded data is not merged. [Corrections to batch 3 fastq files in February 2022: Samples from batch 3 were sequenced across 1-3 lanes depending on the sample, and fastqs from those sequenced across multiple lanes were initially merged and provided as one file per read end. However, it was discovered that files were incorrectly merged, resulting in incorrect read sequences and incompatible transcript counts. The original merged fastqs provided from batch 3 samples have been deprecated and replaced with the raw unmerged fastq files generated from these samples.](https://www.synapse.org/#!Synapse:syn22024496)
Hi there, I would recommend merging the fastq files that belong to the same sample before doing the alignment and generating the counts.
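As a rough illustration of that merge, here is a minimal Python sketch. It assumes the filename pattern seen in the examples in this thread (`Sample_<id>_<index>_<flowcell>_L<lane>_001.R<1|2>.fastq.gz`); the names `plan_merges` and `merge_fastqs` are hypothetical helpers, not part of any ROSMAP pipeline. As far as I know, the specific lane order should not matter to a paired-end aligner as long as the R1 and R2 files are concatenated in the same order, so the sketch simply sorts both read ends by flowcell and lane.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed pattern, inferred from the example filenames in this thread.
FASTQ_NAME = re.compile(
    r"^(?P<sample>Sample_\d+)_(?P<index>[ACGTN]+-[ACGTN]+)_(?P<flowcell>[A-Z0-9]+)"
    r"_L(?P<lane>\d{3})_\d{3}\.R(?P<read>[12])\.fastq\.gz$"
)


def plan_merges(filenames):
    """Map (sample, read end) to its input files, ordered by flowcell and lane.

    R1 and R2 go through the same sort, so mates stay in matching order.
    """
    groups = defaultdict(list)
    for name in filenames:
        match = FASTQ_NAME.match(Path(name).name)
        if match is None:
            raise ValueError(f"unexpected FASTQ name: {name}")
        key = (match["sample"], "R" + match["read"])
        groups[key].append((match["flowcell"], match["lane"], name))
    return {key: [name for _, _, name in sorted(parts)]
            for key, parts in groups.items()}


def merge_fastqs(parts, destination):
    """Concatenate gzipped FASTQ files byte for byte.

    Back-to-back gzip members form a valid gzip stream, so the inputs do not
    need to be decompressed first.
    """
    with open(destination, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
```

For the six Sample_001 files listed in this thread, `plan_merges` yields the order HK735DMXX_L001, HK735DMXX_L002, HN3M2DMXX_L001 for both R1 and R2. Each list can then be passed to `merge_fastqs` with an output name of your choosing.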
