Dear DREAM team,

I am really glad to have participated in the last week of this challenge (I registered something like 10 days ago), which I absolutely did not expect! This is the very first time I have taken part in a DREAM challenge, so please forgive my lack of knowledge regarding submission procedures! I am also quite new to the Docker platform.

To summarize where I am: I could only do the SC1 part in this rather short time, but my result on the leaderboard did not look so bad, so I thought I had to push it to the end. Where I am right now, a few hours before submission: I built the docker based on one example you provided, I also used a few functions you provided to process the data, and I could run the code and make my predictions inside the docker. So it seems I am very close.

As this seems to be my first and last try, I just wanted to ask you a few questions before pushing my docker:

- I did not really understand which script will be run on the data on your side. I wrote one script that fits the data to build one model per drug, and another script to compute the predictions. I could successfully run the whole thing with this command on my machine:

  docker run -v "$PWD/data/:/data/" -v "$PWD/model/:/model/" -v "$PWD/training/:/input/" -v "$PWD/leaderboard:/validation/" -v "$PWD/output:/output/" docker.synapse.org/$SYNAPSE_PROJECT_ID/sc1_model

  where `leaderboard` could be replaced by a folder containing the unseen data. Is it OK to have a docker that needs to be run like that? Should I provide this command somewhere?
- I provided the functions I used for feature selection and hyperparameter optimization (I used an SVR model), but my scripts rely on pre-computed data coming from these functions. Indeed, the feature selection / hyperparameter optimization takes around 4 hours, so I wanted to spare you from launching the same process again. My question is: **is it OK to use pre-computed data for feature selection / hyperparameter optimization**?
- If I understood correctly, for this last submission I should provide models built with the merged training and leaderboard datasets, is that correct? I am not sure whether I am still supposed to use only the `training` dataset to build my models.
- As I said, I built one model per drug. Since I pickle-dump the models, I decided to replace the space character with the underscore character '_' in the model file names (see the sketch after this post). Is that OK?

Thank you very much in advance, and sorry if you have already explained this before.

Best,
Alexandre Coudray
PhD student at EPFL, Lausanne, Switzerland
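For concreteness, here is a minimal sketch of the per-drug pickling described above. The `save_per_drug_models` helper, the pandas inputs `X` and `auc`, and the `/model` layout are illustrative assumptions, not the challenge's actual code; hyperparameter tuning and feature selection are omitted.

```python
# Minimal sketch: fit and pickle one SVR per drug, replacing spaces in
# drug names with underscores in the file names. `X` (features) and
# `auc` (one response column per drug) are assumed to be pandas objects
# produced by the organizers' loading functions.
import os
import pickle

from sklearn.svm import SVR

def save_per_drug_models(X, auc, model_dir="/model"):
    os.makedirs(model_dir, exist_ok=True)
    for drug, y in auc.items():       # (drug name, response Series) pairs
        mask = y.notna()              # keep only specimens with a measurement
        model = SVR().fit(X.loc[mask], y[mask])
        safe_name = drug.replace(" ", "_")
        with open(os.path.join(model_dir, safe_name + ".pkl"), "wb") as fh:
            pickle.dump(model, fh)
```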

Unfortunately my last-minute submission hit the problem I had seen in a previous post:

`raise Exception("No 'predictions.csv' file written to /output, "`

although it works on my computer when I run it with:

  docker run -v "$PWD/leaderboard:/input/" -v "$PWD/output:/output/" docker.synapse.org/$SYNAPSE_PROJECT_ID/sc1_model

I know this is very last minute, but it would be great if I could get just one submission in before you close. Thanks a lot
I will launch the push in the coming minutes, I hope it's going to pass!
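The exception quoted above means the container exited without leaving a file at `/output/predictions.csv`. For reference, a sketch of a prediction entry point that satisfies that convention; the feature file name, the output column names, and the `/input` mount point are assumptions based on this thread, not the official specification.

```python
# Sketch: load the pickled per-drug models baked into the image and
# write /output/predictions.csv. Jacob's post says /inputs while the
# docker run command above mounts /input; match whatever the harness
# actually mounts.
import glob
import os
import pickle

import pandas as pd

def main(input_dir="/input", model_dir="/model", output_dir="/output"):
    # hypothetical feature file; X must carry exactly the features the
    # models were trained on
    X = pd.read_csv(os.path.join(input_dir, "features.csv"), index_col=0)
    rows = []
    for path in glob.glob(os.path.join(model_dir, "*.pkl")):
        # undo the space-to-underscore substitution used when pickling
        drug = os.path.basename(path)[:-len(".pkl")].replace("_", " ")
        with open(path, "rb") as fh:
            model = pickle.load(fh)
        for specimen, score in zip(X.index, model.predict(X)):
            rows.append({"lab_id": specimen, "inhibitor": drug, "auc": score})
    pd.DataFrame(rows).to_csv(os.path.join(output_dir, "predictions.csv"),
                              index=False)

if __name__ == "__main__":
    main()
```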
Alexandre,

(1) Correct: your submitted models should already be fit.
(2) Correct: I misspoke; you should be using training + leaderboard now. Good catch!

Best,
Jacob from the CTD^2 BeatAML DREAM Challenge Team
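A short sketch of what point (2) implies, concatenating the training and leaderboard releases before the final refit; the file paths are placeholders.

```python
# Sketch: refit the final models on training + leaderboard combined.
# File paths are placeholders for however the challenge data is stored.
import pandas as pd

train = pd.read_csv("training/aucs.csv")
leader = pd.read_csv("leaderboard/aucs.csv")
combined = pd.concat([train, leader], ignore_index=True)
# ...then fit the per-drug models on `combined` exactly as before.
```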
Hello Jacoberts, and thanks for your help! It seems I had not understood a few things; tell me if I am wrong:

- Is it correct that we need to launch the script that writes predictions to /output/predictions.csv, using the data in /inputs, but we **do not** need to fit/train the model inside the docker?
- "use the training data to fit a model" -> Is it correct that at this stage, for the final submission, I need to provide **models built with training + leaderboard**?

I think the rest is clear to me, thanks!

Alexandre Coudray
PhD student at EPFL, Lausanne, Switzerland
Dear Alexandre,

We're so glad you are participating! First off, the high-level procedure is:

1. use the training data to fit a model
2. store the fit model in a docker container
3. submit the docker container
4. your container should look for data at a predetermined path (`/inputs`) and place predictions at `/output/predictions.csv`

*(by agreeing upon particular conventions for input and output paths, everyone's docker files can be run with the same command)*

So it seems like you're close! A couple of specifics based on your comments:

* from within your container, you won't have access to any outside data; in other words, the only mounts onto your container are /inputs and /output
* validation data will be mounted at `/inputs`
* if you need data to make predictions (e.g., your model/ directory), you should copy that data *into* the docker container (e.g., with a `COPY model .` command), as in the sketch below

Does that help?

Best,
Jacob from the CTD^2 BeatAML DREAM Challenge Team
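A minimal Dockerfile sketch following these conventions; the base image, package list, and script name are placeholders, and only the `COPY` of the pre-fit models plus the fixed entry point matter here.

```dockerfile
# Sketch: bake the pre-fit models into the image so the only mounts the
# harness needs are /inputs (data) and /output (predictions).
FROM python:3.7-slim

RUN pip install --no-cache-dir pandas scikit-learn

# copy the pre-fit pickled models and the prediction script into the image
COPY model/ /model/
COPY predict.py /predict.py

# the harness runs every container with the same command, so the entry
# point must do everything: read /inputs, write /output/predictions.csv
ENTRYPOINT ["python", "/predict.py"]
```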
