I noticed the contest included ChIP-seq peaks for the training set as bigWig (bw) files, independent of the labeled tsv files.
However, the final test regions will presumably not have this data available.
Are we expected to use the raw ChIP-seq peak files in some way?
Created by Lawrence Du LawrenceDu

No, there is no a priori expectation that you use any of the provided data in any specific way. That said, the TF ChIP-seq data serves as training labels, not features, so it doesn't need to be present in the test data. For example, you could train a model on the actual ChIP-seq signal rather than the binary labels and see whether that improves your performance on the binary prediction task in the test set. Or you could use ChIP-seq signal strength as weights on training examples. In each of these scenarios, you don't need ChIP-seq in the test set, since it's only being used to define training labels. On the test data you will simply use your model to predict, and we will compare those predictions with the hidden ground-truth binary labels observed from ChIP-seq to evaluate performance.
Thanks,
Anshul.
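
To make the sample-weighting idea above concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration only: the single `motif_score` feature, the synthetic signal values, and the weighting rule `1 + log1p(signal)` are hypothetical and are not part of the challenge data or any recommended method. The point is only that the ChIP-seq signal enters through the training weights, so prediction on test regions needs features alone.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_weighted_logistic(X, y, sample_weight, lr=0.5, epochs=200):
    # Batch gradient descent on the sample-weighted log loss.
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    total = sum(sample_weight)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi, si in zip(X, y, sample_weight):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = si * (p - yi)  # each example's error is scaled by its weight
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / total for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / total
    return w, b


def predict_proba(X, w, b):
    # Prediction uses features only; no ChIP-seq signal is required.
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


random.seed(0)

# Synthetic training set (hypothetical): one feature, a binary label,
# and a ChIP-seq signal value that exists only for training regions.
train_X, train_y, train_signal = [], [], []
for i in range(200):
    bound = i % 2
    motif_score = random.gauss(1.0 if bound else -1.0, 1.0)
    signal = random.uniform(5.0, 50.0) if bound else random.uniform(0.0, 2.0)
    train_X.append([motif_score])
    train_y.append(bound)
    train_signal.append(signal)

# Assumed weighting rule: stronger peaks count more; unbound regions get weight 1.
weights = [1.0 + math.log1p(s) if y == 1 else 1.0
           for s, y in zip(train_signal, train_y)]

w, b = fit_weighted_logistic(train_X, train_y, weights)

# Synthetic test regions: features only, no signal and no labels.
test_X = [[random.gauss(0.0, 1.5)] for _ in range(20)]
test_pred = predict_proba(test_X, w, b)
```

The other option mentioned above, training directly on the ChIP-seq signal, would follow the same pattern with a regression loss in place of the weighted log loss; in both cases the test-time call takes only features.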