Dear Organizers,
You have provided quite a large amount of data. However, any user who wants to step even a little outside your pipeline needs at least the BAM files.
There are ways to combine replicates other than IDR.
For several analyses I prefer to work with p-value signals (ppois) rather than fold-change ones.
Are all the files for this challenge publicly available in ENCODE?
Is there an easy way to get a list of all FASTQ files with their ENCODE identifiers? For example, an index file from kundajelab/TF_chipseq_pipeline, for the training data only, of course.
Correction2.
Is it possible to have an extra folder with p-value signals (ppois) for each cell line and transcription factor pair (macs2 bdgcmp -t Treat.bdg -c Input.bdg --outdir Out -m ppois ......)?
Ramil
Created by Ramil Nurtdinov (n.ramil)

Hi Ramil,
Sorry, the challenge is restricted to using the data that is provided. We don't plan to provide p-value tracks for any of the data types at this stage. You will need to work with the fold-enrichment tracks provided for ChIP-seq (which are the outputs). For DNase-seq you can process the BAMs as you wish. For the challenge it is critical that we define the labels in a specific way; otherwise methods cannot be compared head-to-head. There are clearly many ways in which you can call peaks and compute signal. We have made choices that typically work well. For the purposes of setting up a challenge that we can effectively benchmark, participants will unfortunately have to stick to some of these choices with respect to the output variable.
Thanks,
Anshul.
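
For readers following the thread, here is a rough sketch of why a Poisson p-value signal (ppois) and a fold-change signal can differ. It assumes the p-value score is -log10 of the upper-tail Poisson probability of seeing at least the treatment pileup when the control pileup is the expected rate. The function names and the pseudocount are made up for illustration, and the exact tail convention used by macs2 bdgcmp may differ, so treat this as an approximation and not as the MACS2 implementation.

```python
import math


def neg_log10_poisson_pvalue(treat, control):
    """Return -log10 P(X >= treat) for X ~ Poisson(control).

    Illustrative only: `control` plays the role of the expected rate
    (lambda) and `treat` the observed pileup, truncated to an integer.
    """
    if control <= 0:
        raise ValueError("control (lambda) must be positive")
    k = max(int(treat), 0)
    if k == 0:
        return 0.0  # P(X >= 0) is 1, so the score is 0

    # Sum the upper tail in log space, starting at k and stopping once
    # the terms are past the mode and negligible relative to the largest.
    log_terms = []
    largest = -math.inf
    i = k
    while True:
        log_term = -control + i * math.log(control) - math.lgamma(i + 1)
        log_terms.append(log_term)
        largest = max(largest, log_term)
        if i > control and log_term < largest - 40.0:
            break
        i += 1

    log_p = largest + math.log(sum(math.exp(t - largest) for t in log_terms))
    return max(0.0, -log_p / math.log(10))


def fold_change(treat, control, pseudocount=1.0):
    """Simple fold change with a hypothetical pseudocount of 1."""
    return (treat + pseudocount) / (control + pseudocount)


if __name__ == "__main__":
    # Both positions have treat/control = 2, but very different support.
    for treat, control in [(4, 2.0), (40, 20.0)]:
        print(treat, control,
              round(fold_change(treat, control), 3),
              round(neg_log10_poisson_pvalue(treat, control), 3))
```

The two example positions have the same raw treatment-to-control ratio, yet the deeper one gets a much larger p-value score, which is the property that makes ppois tracks attractive for some analyses.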