The RNA-Seq dataset appears to be upper-quantile normalized, based on the description in the data section.
However, the table below (also in the data section) indicates that RNA-Seq expression is provided in TPM format, which is not the same as upper-quantile normalization. It is unclear to me which normalization was applied to the RNA-Seq data in the validation set.

| Annotation | Type | Measure | Patient Samples by | File Name |
|---|---|---|---|---|
| Clinical | See Codebook | See Codebook | Row | clinical_data.csv |
| Ensembl17 | gene | count | Column | GRCh37ERCC_ensembl75_genes_count.csv |
| Ensembl17 | gene | tpm | Column | GRCh37ERCC_ensembl75_genes_tpm.csv |
| Ensembl17 | isoform | count | Column | GRCh37ERCC_ensembl75_isoforms_count.csv |
| Ensembl17 | isoform | tpm | Column | GRCh37ERCC_ensembl75_isoforms_tpm.csv |
| RefSeq | gene | count | Column | GRCh37ERCC_refseq105_genes_count.csv |
| RefSeq | gene | tpm | Column | GRCh37ERCC_refseq105_genes_tpm.csv |
| RefSeq | isoform | count | Column | GRCh37ERCC_refseq105_isoforms_count.csv |
| RefSeq | isoform | tpm | Column | GRCh37ERCC_refseq105_isoforms_tpm.csv |

Created by wdgong

Dear @wdgong,
Counts will also be provided, but upper-quantile normalized data will not. You will likely notice that the synthetic data does not meet certain expectations: it was generated in a way that breaks gene-gene correlation, without ensuring that, for example, TPM values sum to a million or that the TPM files match the count files.
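
To see this for yourself, a quick check of the per-sample TPM totals may help. The following is only a minimal sketch: it assumes a genes-by-samples CSV with a header row and gene IDs in the first column, and the helper names `column_sums` and `sums_to_million` as well as the toy data are hypothetical, not part of the Challenge files.

```python
import csv
import io
import math


def column_sums(csv_lines):
    """Sum each sample column of a genes-by-samples table.

    Assumes (hypothetically) that the first row is a header and the first
    column holds gene IDs, with one column per patient sample.
    """
    reader = csv.reader(csv_lines)
    header = next(reader)
    samples = header[1:]
    totals = dict.fromkeys(samples, 0.0)
    for row in reader:
        for sample, value in zip(samples, row[1:]):
            totals[sample] += float(value)
    return totals


def sums_to_million(totals, rel_tol=1e-3):
    """Flag which samples have a total close to one million."""
    return {s: math.isclose(t, 1e6, rel_tol=rel_tol) for s, t in totals.items()}


# Toy table standing in for a *_tpm.csv file (made-up values).
toy = io.StringIO("gene,sampleA,sampleB\ng1,250000,100\ng2,750000,200\n")
totals = column_sums(toy)
checks = sums_to_million(totals)
print(totals)
print(checks)
```

For a real file, an open file handle (for example `open(path, newline="")`) can be passed in place of the `io.StringIO` object. Samples flagged `False` are the ones that do not behave like conventional TPM columns.
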
Kind Regards,
Mike

Hi Mike,
So the RNA-Seq data in the validation set will be in TPM format? Just to make sure.
Thanks,
Weida

Dear @wdgong,
Thanks for participating in this Challenge. There was a miscommunication among the Challenge organizers, and upper-quantile normalized files **are not provided**. I will be updating the text description shortly. If you are interested, you can use [this script](https://github.com/mozack/ubu/blob/master/src/perl/quartile_norm.pl) or one of your choosing to perform upper-quantile normalization.
Apologies for any inconvenience,
Mike
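
For anyone who prefers to normalize the count files without the Perl script, here is a minimal Python sketch of upper-quantile normalization. It is not a port of the linked script. It assumes one common convention: each sample is divided by the 75th percentile of its non-zero counts and then multiplied by a fixed target. The target of 1000, the function name `upper_quartile_normalize`, and the toy counts are all assumptions made for illustration.

```python
from statistics import quantiles


def upper_quartile_normalize(counts, target=1000.0):
    """Scale each sample so its upper quartile of non-zero counts equals `target`.

    `counts` maps a sample name to a list of raw counts (same gene order
    in every sample). The default target of 1000 is an assumption.
    """
    normalized = {}
    for sample, values in counts.items():
        nonzero = [v for v in values if v > 0]
        if len(nonzero) < 2:
            raise ValueError(f"sample {sample!r} has too few non-zero counts")
        # Third cut point of the quartiles, with linear interpolation.
        upper_q = quantiles(nonzero, n=4, method="inclusive")[2]
        normalized[sample] = [v * target / upper_q for v in values]
    return normalized


# Toy counts: sampleB is sampleA sequenced ten times deeper (made-up values).
toy = {
    "sampleA": [0, 10, 20, 30, 40],
    "sampleB": [0, 100, 200, 300, 400],
}
norm = upper_quartile_normalize(toy)
print(norm["sampleA"])
print(norm["sampleB"])
```

In this toy case the two samples differ only by sequencing depth, so they become identical after normalization. Note that percentile definitions differ between tools, so values may not match another implementation exactly.
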