Hi,
I am just wondering whether (1) the gene annotation file from GENCODE (basically the location of each gene; this file is not included in the annotation folder) and (2) the gene set information from GSEA or Gene Ontology (GO) are allowed to be used. Gene sets are essentially very similar to motif databases such as JASPAR, since they are not directly dependent on the ENCODE sequencing data. Thanks!
Created by zhicheng ji zji
You can use TSS coordinates, but only from the GENCODE files we pointed to. RefGene is not the same.
Thanks,
Anshul.
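For anyone wanting to derive TSS coordinates from a GENCODE GTF file, here is a minimal sketch. It assumes the standard 9-column, tab-separated GTF layout with 1-based inclusive coordinates, and it takes the TSS of a gene to be its start on the + strand and its end on the - strand. The function name `tss_from_gtf_lines` and the toy records are made up for illustration; they are not from the challenge files.

```python
import re

def tss_from_gtf_lines(lines):
    """Yield (chrom, tss, strand, gene_id) for each 'gene' record in GTF lines."""
    for line in lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Keep only well-formed 'gene' records (feature type is column 3).
        if len(fields) < 9 or fields[2] != "gene":
            continue
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        # GTF is 1-based inclusive: TSS is the start on '+', the end on '-'.
        tss = start if strand == "+" else end
        match = re.search(r'gene_id "([^"]+)"', fields[8])
        gene_id = match.group(1) if match else None
        yield chrom, tss, strand, gene_id

# Toy records for illustration only (hypothetical, not real GENCODE entries).
example_lines = [
    "##description: toy example\n",
    'chr1\tHAVANA\tgene\t1000\t2000\t.\t+\t.\tgene_id "GENE_A"; gene_name "A";\n',
    'chr1\tHAVANA\ttranscript\t1000\t2000\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A";\n',
    'chr1\tHAVANA\tgene\t5000\t9000\t.\t-\t.\tgene_id "GENE_B"; gene_name "B";\n',
]
tss_records = list(tss_from_gtf_lines(example_lines))
```

To run this on a real file, the lines could be read from the downloaded GTF (for a gzipped file, `gzip.open(path, "rt")`) and passed to the same function.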
Can TSS coordinates be used? For instance, those inferred from refGene?
http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/refGene.txt.gz
Anshul's answers above about gene annotations would seem to imply that this is OK too, but I just wanted to make sure... Thanks!
Yes. Any annotations from the GTF files should be fine since everyone has access to that data and can potentially use it.
Anshul
Thanks for your reply. How about the gene annotation files (GTF files from http://www.gencodegenes.org/releases/19.html)?
Sorry, I'm afraid you cannot use this information for the purposes of this challenge. GO and GSEA gene sets are an independent source of information, but using them is not the same as using motifs from a database. The point of allowing motifs from any database is to allow freedom in modeling sequence affinity. The challenge focuses on integrating sequence affinity with accessibility and TF or target expression. GO and GSEA don't model any of these, so they are not admissible as features used directly in models. You are free to use them to learn motifs or sequence affinity models in some way if you like, but you cannot use them directly as features in the model.
Thanks,
Anshul.