I have a couple of technical questions regarding the EHR Dream Challenge on patient mortality prediction. As I understand it, we are supposed to submit a training methodology based on open-source code, which will be used to train a model on the University of Washington server on real data from 2009 to 2017. The model will then be evaluated on the independent test set (the remaining data up to 2019).
1. What are the limitations of the training process (maximum training and run time, number of cores and whether we can use parallel processing, maximum RAM usage, etc.)?
2. Are we able to configure the Docker image so that it supports calculations in R?
3. Are we able to use packages/libraries that are proprietary but openly accessible for research purposes?
4. Should we provide the prediction as a score (continuous scale) or as a binary label?
5. What is the specific scoring metric? AUROC or accuracy?
Many thanks in advance for answering my questions.

Created by Marcus Vollmer Vollmer
Hi @toekneesunshine, Yes, we just published the [technical limitations](https://www.synapse.org/#!Synapse:syn18405991/wiki/598712) description page. We do not have GPUs at this time. We hope to have GPUs for future challenges. Thank you, Tim
Hi Tim, As a followup to the technical limitations question, are we just allotted CPUs? I couldn't find any mention of GPUs on the Wiki page. Also, does a more detailed technical limitations page exist yet? Thanks, Tony
Hi @sarahmul, We have created a new technical limitations section in the wiki that should be published shortly. The Leaderboard technical limitations are:
- 70 GB RAM
- 50 GB hard drive
- 4 CPUs
- 10 hours max runtime

Let me know if you have more questions! Thanks, Tim
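For anyone wiring these limits into a training script, here is a minimal sketch, not an official challenge utility. The constants mirror the numbers quoted above; `worker_count`, `Deadline`, and the 90% safety margin are hypothetical names and choices made for illustration.

```python
import os
import time

# Leaderboard-phase limits as quoted in the post above.
MAX_CPUS = 4
MAX_RUNTIME_S = 10 * 60 * 60  # 10 hours


def worker_count(limit=MAX_CPUS):
    """Return how many parallel workers to request, never more than the limit."""
    return max(1, min(limit, os.cpu_count() or 1))


class Deadline:
    """Tracks a wall-clock budget so training can stop before the hard limit.

    safety_fraction is an assumption: it reserves part of the budget for
    writing predictions after training stops.
    """

    def __init__(self, budget_s=MAX_RUNTIME_S, safety_fraction=0.9):
        self.start = time.monotonic()
        self.budget = budget_s * safety_fraction

    def remaining(self):
        """Seconds left in the usable budget (may be negative)."""
        return self.budget - (time.monotonic() - self.start)

    def expired(self):
        return self.remaining() <= 0
```

The value from `worker_count()` could be passed to whatever parallelism setting your library exposes, and `Deadline.expired()` could be checked between training epochs or folds.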
Hello, I can't seem to find if you released the technical limitations (1) for the Leaderboard phase on the wiki. Can you point me in the direction of these limitations? Thanks! Sarah
Hi @Vollmer, Thanks for the questions. That is correct. During the Leaderboard phase, models will be trained and evaluated on data ranging from 2009-2018. During the validation phase, we will be evaluating the models on the independent test set ranging from 2018-2019.
1. For the current open phase, you will have 30 GB RAM, 4 cores (you can use parallel processing), and one hour to run your model. This applies only to the open phase, which uses only synthetic data. The Leaderboard phase will include UW data, which is larger than the synthetic data. We are going to announce the available resources when we start the Leaderboard phase, but they will be equal to or greater than the open phase limitations.
2. Yes, you can use whatever language you would like.
3. Yes, these algorithms won't be used for commercial purposes, so they should be clear to use under open access for research purposes.
4. We are expecting the score as a continuous number between 0 and 1 (closer to 0 means death is less likely, closer to 1 means death is more likely).
5. We are using AUROC as our primary scoring metric and AUPRC as the tie-breaking metric.
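To check continuous scores locally against the two metrics named above, here is a rough pure-Python sketch. It is not the challenge's scoring code: AUPRC is estimated here with average precision, which is only one common estimator, and the organizers' exact computation is not stated in this thread. The function names and toy data are illustrative.

```python
def auroc(y_true, y_score):
    """AUROC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This is O(n_pos * n_neg), fine for small checks.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Average precision: sum over thresholds of (recall gain) * precision."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    pairs = sorted(zip(y_score, y_true), reverse=True)
    tp, ap, i = 0, 0.0, 0
    while i < len(pairs):
        # Group tied scores so they share a single threshold.
        j, tp_group = i, 0
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp_group += pairs[j][1]
            j += 1
        tp += tp_group
        ap += (tp_group / n_pos) * (tp / j)  # j = predictions at or above threshold
        i = j
    return ap


# Toy data: 1 = died, 0 = survived; scores are predicted probabilities in [0, 1].
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))              # 0.75
print(average_precision(labels, scores))  # approximately 0.8333
```

Both functions depend only on how the scores rank the patients, which is why a continuous score rather than a binary label is requested.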
