The annotation file (anno_refseq.csv) for the RNA-seq experiment (Biochronicity) has 56085 gene names, but the gene expression file RespiratoryViralChallenge_IndependentTest_Biochronicity_Time0_EXPRESSION_VST.tsv has only 35517 lines. I do not understand how to map the RNA-seq genes to the expression values given in the .tsv file. I need this mapping in order to match the RNA-seq genes to the Affymetrix genes in HG-U133A_2.na35.annot.csv; otherwise we cannot use the models trained so far to make predictions on the RNA-seq data. Thanks.

Created by Zafer Aydin zaferaydin
I now see that each line in RespiratoryViralChallenge_IndependentTest_Biochronicity_Time0_EXPRESSION_VST.tsv starts with the ID, which I had not noticed before. It is clear now. Thanks.
Not every gene in the annotation file will be present in the gene expression matrix, because some genes have no observations. The RNA-seq annotation file maps RefSeq IDs to HUGO IDs (gene symbols) in order to identify the assayed gene. Similarly, the Affymetrix annotation file maps each probe set ID to a RefSeq ID and a HUGO ID (gene symbol), so the probe sets can be matched to the RNA-seq genes through either the RefSeq ID or the HUGO ID.
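A rough Python sketch of the mapping described above, joining on the gene symbol, might look like the following. The column names are assumptions and should be checked against the actual file headers: `refseq_id` and `symbol` for anno_refseq.csv are hypothetical, while `Probe Set ID` and `Gene Symbol` are what Affymetrix annotation files typically use. The sketch also assumes that the Affymetrix file may start with `#` comment lines and may list several symbols per probe set separated by `///`. The toy IDs in the demo are invented for illustration only.

```python
import csv
import io
from collections import defaultdict


def load_rnaseq_symbols(anno_file, id_col="refseq_id", symbol_col="symbol"):
    """Map each RNA-seq row ID to its gene symbol (column names are assumed)."""
    reader = csv.DictReader(anno_file)
    return {row[id_col]: row[symbol_col] for row in reader if row[symbol_col]}


def load_probe_sets_by_symbol(affy_file, probe_col="Probe Set ID",
                              symbol_col="Gene Symbol"):
    """Map each gene symbol to the list of probe sets annotated with it."""
    # Assumption: leading comment lines start with '#' and must be skipped.
    lines = (line for line in affy_file if not line.startswith("#"))
    by_symbol = defaultdict(list)
    for row in csv.DictReader(lines):
        # Assumption: multiple symbols are separated by '///', and '---'
        # marks a probe set with no symbol.
        for symbol in row[symbol_col].split("///"):
            symbol = symbol.strip()
            if symbol and symbol != "---":
                by_symbol[symbol].append(row[probe_col])
    return by_symbol


def map_expression_ids(expression_ids, id_to_symbol, probes_by_symbol):
    """Return {RNA-seq ID: [probe set IDs]} for IDs that can be matched."""
    mapping = {}
    for gene_id in expression_ids:
        symbol = id_to_symbol.get(gene_id)
        if symbol is not None and symbol in probes_by_symbol:
            mapping[gene_id] = probes_by_symbol[symbol]
    return mapping


if __name__ == "__main__":
    # Invented toy data standing in for the real files.
    anno = io.StringIO("refseq_id,symbol\nID_A,GENE1\nID_B,GENE2\n")
    affy = io.StringIO(
        '"Probe Set ID","Gene Symbol"\n"p1","GENE1"\n"p2","GENE1 /// GENE3"\n'
    )
    id_to_symbol = load_rnaseq_symbols(anno)
    probes_by_symbol = load_probe_sets_by_symbol(affy)
    # The IDs would come from the first column of the expression .tsv file.
    print(map_expression_ids(["ID_A", "ID_B"], id_to_symbol, probes_by_symbol))
```

IDs from the expression file that have no symbol, or whose symbol has no probe set, are simply left out of the result, so the mapped set will generally be smaller than either file. Joining on the RefSeq ID instead would follow the same pattern with a different key column.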
