Dear all, sorry to disturb you! I am new to gene expression analysis and have been stuck for days on the differential expression (DE) analysis step. Almost all DE analysis tools require raw count data as input, and TPM data is not recommended for comparisons among individuals. However, in the ROSMAP dataset, the bulk tissue RNA-seq data is provided in FPKM-normalized format. One alternative might be to download the original files and process them into raw counts, but they are too big (>1.6T) to download as a whole, and the time cost is unacceptable. Are there any suggestions for solving this problem? I would appreciate any help. Thanks! Best wishes

Created by fanc232
No, these are raw counts. No normalization or adjustment has been done.
@sieberts Excuse me, but were batch effects taken into account when generating the raw counts dataset syn8691134? I ask because, using this file as input to the limma package, I got more than 10,000 differentially expressed genes (out of 60,000+ in total), and when I used hclust to check whether samples cluster by diagnosis (NCI/MCI/AD), I found that the samples were almost uniformly distributed across the clusters.
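A raw count matrix with 60,000+ gene IDs usually contains many rows with very low counts, and a common preprocessing step before limma-style modelling is to filter those rows on counts per million (CPM). The following is a minimal sketch of that filter in plain Python. It is not something described in this thread, it does not address batch effects, and the function names, the `min_cpm` and `min_samples` thresholds, and the toy data are all assumptions made for illustration.

```python
def counts_per_million(counts):
    """Convert raw counts to CPM.

    counts: dict mapping gene ID -> list of raw counts, one per sample.
    Returns a dict of the same shape holding CPM values.
    """
    n_samples = len(next(iter(counts.values())))
    # Library size = total counts in each sample (column sum).
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    if any(size == 0 for size in lib_sizes):
        raise ValueError("every sample needs a non-zero library size")
    return {
        gene: [row[j] / lib_sizes[j] * 1e6 for j in range(n_samples)]
        for gene, row in counts.items()
    }


def filter_low_expression(counts, min_cpm=1.0, min_samples=2):
    """Keep genes whose CPM reaches min_cpm in at least min_samples samples.

    The thresholds are illustrative defaults, not recommended values.
    """
    cpm = counts_per_million(counts)
    return {
        gene: row
        for gene, row in counts.items()
        if sum(value >= min_cpm for value in cpm[gene]) >= min_samples
    }


# Toy data (hypothetical gene IDs, three samples).
toy_counts = {
    "geneA": [600000, 500000, 700000],
    "geneB": [399998, 500000, 299997],
    "geneC": [2, 0, 3],
}
kept = filter_low_expression(toy_counts, min_cpm=1.0, min_samples=3)
```

The filter returns the original raw counts for the retained genes, so the result can still be passed to tools that expect counts rather than normalized values.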
@sieberts I know. Thank you for your help!
@fanc232 - The file I pointed to is from a consortium-wide re-alignment effort using an updated gene reference. In other words, these files are aligned to different gene references, which is why the genes do not match between the two files.
@sieberts Sorry to disturb you again, but I have run into another problem: the gene IDs do not match between the normalized gene expression file and the counts file in the ROSMAP data. The normalized file (syn3505720, https://www.synapse.org/#!Synapse:syn3505720) contains 55,889 gene IDs in total, while the counts file (syn8691134, https://www.synapse.org/#!Synapse:syn8691134) contains 60,729. The two files share only 27,760 gene IDs. Could you tell me if I did something wrong?
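The comparison described in this post can be sketched as a set overlap between the two ID lists. The helper names below are hypothetical, and the optional `ignore_version` step assumes Ensembl-style IDs with a `.N` version suffix, which the thread does not confirm. Even if stripping the suffix raises the overlap, files aligned to different gene references will still differ in which genes they contain.

```python
def strip_version(gene_id):
    """Drop a trailing '.N' version suffix (assumed Ensembl-style), if present."""
    return gene_id.split(".", 1)[0]


def compare_gene_ids(ids_a, ids_b, ignore_version=False):
    """Count gene IDs unique to each list and shared between them."""
    if ignore_version:
        ids_a = [strip_version(g) for g in ids_a]
        ids_b = [strip_version(g) for g in ids_b]
    set_a, set_b = set(ids_a), set(ids_b)
    return {
        "only_a": len(set_a - set_b),
        "shared": len(set_a & set_b),
        "only_b": len(set_b - set_a),
    }


# Toy ID lists (made up for illustration, not taken from either file).
normalized_ids = ["ENSG1.1", "ENSG2.3", "ENSG3.1"]
count_ids = ["ENSG1.2", "ENSG2.3", "ENSG4.1"]

exact = compare_gene_ids(normalized_ids, count_ids)
unversioned = compare_gene_ids(normalized_ids, count_ids, ignore_version=True)
```

Running the comparison both ways separates mismatches that are only formatting differences from genes that are truly present in one file and absent from the other.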
@sieberts I am so happy to receive your help. I'll try it. Thanks a lot!
@fanc232 - There are counts for ROSMAP here, which were generated as part of a consortium-wide data reprocessing: https://www.synapse.org/#!Synapse:syn8691134. Hopefully those will be more useful for you.
