Hi,
I was trying to figure out which genes are available for the different `SEQ_ASSAY_ID`s. When I run the R code below, the number of genes listed in `assay_information.txt` often doesn't match (though it usually isn't far off) the number of unique `Hugo_Symbol`s in `genomic_information.txt`. I'm likely missing something, but could someone point me in the right direction? Also, the column `clinicalReported` mentioned in the data guide is missing from `genomic_information.txt`.
```
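# Read the panel-level and gene-level metadata files
assay_info = read.delim("synapse/assay_information.txt")
genom_info = read.delim("synapse/genomic_information.txt")
assays = unique(assay_info$SEQ_ASSAY_ID)
# Per panel: reported gene count minus unique in-panel Hugo symbols (0 means they match)
sapply(assays, function(assay){
  reported = assay_info$number_of_genes[assay_info$SEQ_ASSAY_ID == assay]
  # read.delim may parse "True"/"False" as logical, so compare case-insensitively as text
  in_panel = genom_info$SEQ_ASSAY_ID == assay &
    toupper(as.character(genom_info$includeInPanel)) == "TRUE"
  reported - length(unique(genom_info$Hugo_Symbol[in_panel]))
})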
```
Thanks
Steve
Created by stephenrho

Thanks!

Hello,
Some discrepancies are expected due to panel-specific properties; these are described in the [release notes](https://www.synapse.org/#!Synapse:syn26838309). Other discrepancies may result from errors in the panel metadata submitted by the sites.
Regarding `clinicalReported` being missing, this was an oversight and will be corrected in the next release. Thank you for bringing this to our attention.
Best,
Haley
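
For anyone who wants to see which panels disagree rather than just the raw differences, here is a minimal sketch of the same comparison laid out as a table. It uses toy data frames in place of the two files, so the panel IDs, gene lists and counts below are made up; only the column names `SEQ_ASSAY_ID`, `number_of_genes`, `Hugo_Symbol` and `includeInPanel` come from the post above. The `observed_genes` and `difference` columns are names chosen for illustration.

```r
# Toy stand-ins for assay_information.txt and genomic_information.txt.
# Panel names, gene names and counts are invented for this example.
assay_info <- data.frame(
  SEQ_ASSAY_ID = c("PANEL-A", "PANEL-B"),
  number_of_genes = c(3, 2)
)
genom_info <- data.frame(
  SEQ_ASSAY_ID = c("PANEL-A", "PANEL-A", "PANEL-A", "PANEL-B", "PANEL-B", "PANEL-B"),
  Hugo_Symbol = c("TP53", "KRAS", "KRAS", "TP53", "EGFR", "BRAF"),
  includeInPanel = c("True", "True", "True", "True", "True", "False")
)

# Keep rows flagged as in-panel, whether the flag was read as logical or as text.
in_panel <- genom_info[toupper(as.character(genom_info$includeInPanel)) == "TRUE", ]

# Count unique gene symbols per panel.
observed <- aggregate(Hugo_Symbol ~ SEQ_ASSAY_ID, data = in_panel,
                      FUN = function(x) length(unique(x)))
names(observed)[2] <- "observed_genes"

# Join to the reported counts; all.x = TRUE keeps panels with no in-panel rows.
comparison <- merge(assay_info, observed, by = "SEQ_ASSAY_ID", all.x = TRUE)
comparison$observed_genes[is.na(comparison$observed_genes)] <- 0
comparison$difference <- comparison$number_of_genes - comparison$observed_genes

# Panels where the reported and observed counts disagree.
mismatches <- comparison[comparison$difference != 0, ]
print(mismatches)
```

With the real files, replacing the two toy data frames with the `read.delim()` calls from the original post should give one row per panel whose counts differ, which can then be checked against the release notes.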