Where are the details of the input for the containerized solution?
In particular, may we assume that the program's command-line input looks something like this:
```
docker run my_ra2_app case_lh.jpg case_rh.jpg case_lf.jpg case_rf.jpg output.csv
```
That is, will we always be given 4 input files in a fixed order like LH, RH, LF, RF, for each test case?
Created by Lars Ericson lars.ericson
No problem!
I do not yet have a more precise schedule. We are still testing infrastructure and finalizing the example docker model and documentation. I will provide an update as soon as possible.
Cheers,
Robert
Thanks @allawayr for your answer.
So I would be happy to begin submitting my models as soon as you have the leaderboard working.
A target date towards the end of January was mentioned previously; do you have a more precise schedule?
Hi @arielis -
I chatted with the organizers who are fine with you submitting a pre-trained model.
So, we will accept pre-trained models for the leaderboard and final submission phases. However, all top-performing teams are subject to an additional audit and will need to provide a docker container that will train on Cheaha and produce reproducible results, or that can be retrained on additional datasets. Hope this helps!
@allawayr
Thanks for your answer.
As my model is already trained on the full training dataset, additional training on the content of the "/train" directory in the docker environment would make sense only if additional images are made available.
@lars.ericson - yes, I think that is an appropriate assumption.
@arielis - The expectation as laid out initially was that the model both train and test on Cheaha. However, this was before the full training dataset was released by the organizers. From a reproducibility and extensibility perspective, it's preferable that the model be designed to both train and test. Let me get in touch with the group of organizers to determine if this is still a hard requirement.
Thanks!
Robert
Are we expected to have new training images available in the docker environment that are not available here?
If not, can we just put the model in the docker image, and ignore the content of the /train directory?
Should we assume that, for a single patient, the four images may be of different sizes and could be either monochrome or RGB?
Hi Lars,
We've been working with the UAB folks to upgrade some software on their end that has delayed us a bit in finalizing the challenge scoring architecture.
We'll be running your docker container using Singularity 3.5.2 (`singularity run your_container`), and we will also mount the test and training data into it as described here: https://www.synapse.org/#!Synapse:syn20545111/wiki/597249
Here is the snippet that I think most directly addresses your question:
>**IMPORTANT:** Your docker image _must_ be able to run without a network connection.
>
>### Input files
>* When running your container, we will mount all training image data to the container working directory as `/train/` and the test image data to `/test/`, so please be sure your code looks for these files in these directories, respectively.
>* `/train` will include 4 image files *per patient* of the format `Patient_ID-region.jpg`. For example: `UAB001-RH.jpg UAB001-LH.jpg` for the right and left hands, respectively, and `UAB001-RF.jpg UAB001-LF.jpg` for the right and left feet, respectively. There will also be a filled training matrix called `training.csv` in the same format as the [testing template](syn21072036). Your method must be able to match the Patient_ID column (e.g. UAB001) with the correct images for training.
>* `/test` will include 4 image files per patient of the format `Patient_ID-region.jpg`. For example: `UAB001-RH.jpg UAB001-LH.jpg` for the right and left hands, respectively, and `UAB001-RF.jpg UAB001-LF.jpg` for the right and left feet, respectively. There will also be an empty matrix in the same format as the [leaderboard/test template](syn21072036) (note that the linked template is the _leaderboard_ template; the final evaluation template will have the same format, but more patients). This file will be called `template.csv`. Your method must be able to match the Patient_ID column (e.g. UAB001) with the correct images for prediction.
>
>### Output files
>* Any output files should be written into a directory in the working directory of the container called `/output`.
>* Specifically, your container must write your predictions to one csv file (by filling in the [leaderboard/test template](syn21072036) ) named `predictions.csv`.
>*If you'd like to participate only in SC1, simply have your container fill in your Overall_Tol predictions and fill the other prediction columns and rows with `0`.*
So, we will be mounting all of the files into your running container at once as these directories. You are free to have your model run on these images however is easiest for you. For example, if using `keras` in R, you could use something like the `flow_images_from_dataframe` function to read them all in at once.
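For anyone working in Python rather than R, here is a minimal sketch of the input/output contract described in the quoted snippet. It assumes that `template.csv` sits inside `/test` next to the images and that every column other than `Patient_ID` is a prediction column. The helper names (`group_images`, `predict_patient`, `fill_template`) are hypothetical, and `predict_patient` is only a placeholder that returns `0` everywhere; a real model would go in its place.

```python
import csv
from collections import defaultdict
from pathlib import Path

# Region codes used in the `Patient_ID-region.jpg` naming scheme.
REGIONS = ("LH", "RH", "LF", "RF")


def group_images(image_dir):
    """Map each Patient_ID to {region: path} based on the file names."""
    groups = defaultdict(dict)
    for path in sorted(Path(image_dir).glob("*.jpg")):
        # "UAB001-RH" -> ("UAB001", "-", "RH")
        patient_id, sep, region = path.stem.rpartition("-")
        if sep and region in REGIONS:
            groups[patient_id][region] = path
    return dict(groups)


def predict_patient(images, columns):
    """Hypothetical placeholder model: predicts 0 for every column."""
    return {column: 0 for column in columns}


def fill_template(test_dir="/test", output_dir="/output"):
    """Fill template.csv row by row and write predictions.csv."""
    test_dir, output_dir = Path(test_dir), Path(output_dir)
    images = group_images(test_dir)

    # Assumption: the template lives in the mounted test directory.
    with open(test_dir / "template.csv", newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)

    # Assumption: everything except Patient_ID is a prediction column.
    score_columns = [c for c in fieldnames if c != "Patient_ID"]

    for row in rows:
        patient_images = images.get(row["Patient_ID"], {})
        missing = [r for r in REGIONS if r not in patient_images]
        if missing:
            raise FileNotFoundError(
                f"{row['Patient_ID']}: missing images for {missing}"
            )
        row.update(predict_patient(patient_images, score_columns))

    output_dir.mkdir(parents=True, exist_ok=True)
    out_path = output_dir / "predictions.csv"
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

Calling `fill_template()` with no arguments uses the mounted `/test` and `/output` paths; passing other directories makes it easy to try locally before building the container. Nothing here touches the network, which matches the requirement that the image run offline.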
I should also note that I'll be putting together a more comprehensive hello_world model that will have some more explicit instructions on things you need to do to have your running container find the GPU drivers. I hope to finalize this next week, but have been waiting until we have the scoring infrastructure and software finalized so that I don't confuse people with lots of changes.