Dear All,
Our sincere apologies for the radio silence on the status of the main paper. It was challenging, to say the least, to perform a systematic and robust comparison of the methods (many of which are giant ensembles) and TFs in order to draw meaningful conclusions from the challenge results for a high-profile paper. We also had unforeseen logistical and personal challenges that added to the delays. Nevertheless, we have derived what we think are interesting conclusions from the challenge that pave the way for potentially new paradigms for cross-cell models of TF binding. We are in the final stretch of completing the paper. I am attaching the parts of the paper that are finalized so that you can all start getting insights into the key findings.
Please see this Google Drive folder: https://drive.google.com/drive/folders/15UA3HB5RPNpAXNX3lqFwXNDyZA-3hmBT?usp=sharing
1. Main figures with legends
2. Slides deconstructing and explaining the main figures
Title: Systematic evaluation of multimodal approaches to predict in vivo DNA binding landscapes of regulatory proteins across cell types
Abstract: Predictive models of in vivo transcription factor (TF) binding are an essential complement to experimental assays for comprehensively mapping dynamic, multi-cellular regulatory landscapes. We organized a DREAM Challenge (https://www.synapse.org/encode) to crowdsource the development and benchmarking of in vivo TF binding models. Genome-wide TF binding predictions of models trained on ENCODE ChIP-seq data for 31 TFs in reference cell lines by integrating DNA sequence, DNA shape, chromatin accessibility and gene expression data were prospectively evaluated against newly acquired ChIP-seq experiments in held-out cellular contexts. Top performers used innovative feature engineering, data normalization, sampling and model ensembling approaches. We identified significant biases of evaluation strategies used in the literature. Systematic error analysis across methods, TFs, genomic and cellular contexts revealed a fundamental limitation of current modeling approaches in adapting to differences in cell-type specific sequence determinants of TF binding. We suggest promising new research directions and best practices to further improve cross-cell-type in vivo TF binding models.
We are aiming to circulate the main paper and a gigantic supplement in the next 2 weeks, but will keep you posted on progress and send regular updates with portions of the paper on a weekly basis. We will put it on bioRxiv immediately afterwards for comments from you all and from the community. We expect to submit to Cell by the end of April at the latest.
Thank you for your participation, support and hopefully your forgiveness for the immense delays. I greatly appreciate your patience.
Please feel free to discuss the figures/paper on the Synapse discussion board. The narrative of the paper will become clearer once we circulate a complete draft, but we are happy to hear your thoughts and answer any queries you might have.
Cheers,
Anshul on behalf of the ENCODE-DREAM Challenge Organizers.
Created by ANSHUL KUNDAJE (akundaje)
Great. Please feel free to use this data in any form and cite the Synapse DOI for this challenge website.
Also, the 3 winners of the challenge have published their papers. Pointing to them here since they might be helpful.
https://genome.cshlp.org/content/early/2018/12/19/gr.237156.118
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6327544/
https://www.sciencedirect.com/science/article/pii/S1046202318303293?via%3Dihub
Awesome! Looking forward to it!
Also, please let me know if you need a hand with any data analysis/validation (I am working on this data for another project).
Not yet. We have some important validation experiments partially completed. Once the shutdown ends, we expect to complete them. We will aim to submit by the end of the summer.
Is this paper published or archived anywhere?
We are largely done with the supplement. We have made some revisions to the figures and added a few more analyses based on feedback. We are working on finalizing the results and discussion in the main text, and aim to finish by the end of the month.
Hi Anshul,
May I ask if there is any further progress in this regard?
Thanks.