Hi,
Our preprocessing step only finds 317617 files matching "*.dcm*" when executing the following
find /trainingData/ -name '*.dcm*' | wc -l
but the wiki mentions 640k images. According to Wikipedia, a DICOM file can only hold one pixel array,
so presumably we only have 317k images. It seems unlikely that you are holding back 50% as a test set.
What have I misunderstood?
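A quick way to double-check the one-image-per-file assumption, as a minimal sketch assuming pydicom is installed and can read these files:
```
# Count frames across a sample of .dcm files to confirm that one file
# corresponds to one image (sketch; assumes pydicom is installed).
import glob
import pydicom

paths = glob.glob('/trainingData/**/*.dcm*', recursive=True)
total_frames = 0
for path in paths[:1000]:  # sample only; drop the slice to scan everything
    ds = pydicom.dcmread(path, stop_before_pixels=True)
    total_frames += int(getattr(ds, 'NumberOfFrames', 1))

print(len(paths[:1000]), 'files,', total_frames, 'frames')
```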
Cheers
Bob
@paoloinglese Yes, that's what we always do. The newsletter is on its way (1-2 days) and will include additional announcements. However, I wanted to let the participants who requested the feature know that it's already available.
Thanks! @tschaffter
Hi Thomas,
could you please send a notification (by email, for instance) when changes are introduced?
Thank you!
Bw,
Paolo Hi all,
A scratch space of 200 GB (`/scratch`) is now available during inference submissions. The [Cheat Sheet](https://www.synapse.org/#!Synapse:syn4224222/wiki/409763) now reflects this update.
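For reference, a minimal sketch for confirming the mount and its free space from inside a submission container (assuming the space is mounted at `/scratch` as described above):
```
# Report free space on the announced scratch mount (sketch; path per the announcement above).
import shutil

usage = shutil.disk_usage('/scratch')
print('free on /scratch: %.1f GB' % (usage.free / 1e9))
```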
Thanks! Hi Thomas, can you consider the request to add scratch space to the scoring run? It would be a win-win situation for everyone: it encourages people to use more sophisticated models while potentially shortening the running time. Thanks a lot! Thomas:
Thanks for taking the suggestion! I hope the time allowance for each will be generous :-) Sometimes even reading one image from the volume takes many seconds.
Another thing is that currently the container size (image size + usable space) is limited to 10GB. I wonder if you can make it larger. For a container that integrates many models, 10GB may be too small for the image size.
Thanks! To speed up the scoring run, I am wondering if we can have an explicit scratch space with a known size. There are two benefits to this:
1. Without knowing how much we can write to temporary space (for now, we write into the Docker filesystem, which has limited space), we have to go with smaller batches. That is a problem if I need to load multiple models to work on the same small batch,
because for each small batch, loading a model onto the GPU takes a significant amount of time. By amortizing this load over a bigger batch (or ideally processing all images in one batch), we could improve performance to some degree.
In the extreme case, if people are processing one image at a time and swapping models for each image, it will be very, very slow. If we know the size of the scratch space ahead of time, we can properly determine the batch size and thus speed up our scoring runs (see the sketch after this list).
2. Mapped-in storage volumes (either a host path or an external volume) are much more stable and perform better than Docker's internal overlay storage. As of today, all the major Docker storage drivers, such as devicemapper, AUFS, OverlayFS and Btrfs,
still have some stability or performance issues. Ideally, using `-v` to map host-local storage into the container is preferable for scratch space, because it also avoids a network bottleneck. So this might also help improve the running time of everybody's scoring run.
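To make point 1 concrete, here is a minimal sketch of how a known scratch quota could drive the batch size (the per-image footprint and the commented-out image list are placeholders, not anything provided by the challenge):
```
# Sketch: size batches from a known scratch quota so each model is loaded
# once per (large) batch instead of once per image.
import shutil

SCRATCH = '/scratch'
PER_IMAGE_BYTES = 50 * 1024**2   # rough per-image scratch footprint; tune for your pipeline
SAFETY = 0.8                     # keep 20% of the scratch space free

free = shutil.disk_usage(SCRATCH).free
batch_size = max(1, int(free * SAFETY // PER_IMAGE_BYTES))

def batches(paths, size):
    for start in range(0, len(paths), size):
        yield paths[start:start + size]

# for batch in batches(all_image_paths, batch_size):   # all_image_paths: placeholder
#     stage preprocessed data under /scratch, then run each model once over
#     the whole batch, so GPU model loading is amortized per batch.
```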
Thanks a lot!
Hi all,
> I think at least the organizers should set up a performance guideline for classifying an image, like X seconds / image, upfront
This is a good suggestion and we are planning to communicate on that in the next newsletter. For this round, we will do our best to process all the submissions as we did in Round 1. During the scoring phase, we will certainly allow the inference submissions to run longer, since no new round starts a week later.
Thanks a lot for the feedback! If a computation time limit needs to be set during scoring, I think at least the organizers should set up a performance guideline for classifying an image, like X seconds / image, upfront, since the size of the test dataset hasn't been disclosed. Otherwise it will be very discouraging to build a sophisticated model: you don't even know where the red tape is. Also, the organizers said we have to use the time-series data in SC2, otherwise we are not eligible. **Do you know why we don't use it? Is it that we don't know how to use it, or that we are stupid? It is because using the time-series data is going to cost 3 times more compute time!** During training, we don't need to treat every sample the same, especially in this highly unbalanced dataset (note that positive samples are a very small portion of the whole training set). For example, a sophisticated training pipeline may ignore the majority of the negative samples in the training phase and still be good. But during inference, every sample needs to go through the whole classification process. So it is possible that inference takes longer than training even though the test dataset is smaller than the training dataset. I assembled 12 models from different epochs.
To save time, I already threw out a critical aspect that would improve performance, **and threw out many other things I wanted to run, because I know that if I am the one taking the longest time, I will be the one being cut.**
I think that, at least in the final round, the model should be allowed 4 weeks to run, because in the final round I am going to assemble all of the 50 models I collected through the 3 phases. For the pilot data and training data, it seems to me that the number of positive images (with cancer) is much smaller than the number of negative ones, so we don't have to process all the negative images to train the model. That is how we can afford to do computationally intensive pre-processing. But if we are going to predict 300K images, it would take a much longer time to run.
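As a minimal illustration of that asymmetry (the label format here is hypothetical, not the challenge's metadata format):
```
# Sketch: keep every positive image but only a fraction of negatives when
# building the training list; at inference time no such shortcut exists.
import random

def subsample_training(labels, neg_keep_fraction=0.1, seed=0):
    """labels: list of (image_path, is_positive) pairs (hypothetical format)."""
    rng = random.Random(seed)
    positives = [x for x in labels if x[1]]
    negatives = [x for x in labels if not x[1]]
    kept_negatives = rng.sample(negatives, int(len(negatives) * neg_keep_fraction))
    return positives + kept_negatives
```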
There was no guideline before about how much time is allotted to each team for scoring. If there needs to be a guideline or limit on running time for the scoring phase, can we set it for the next phase rather than the current one? Thanks!
@yuanfang.guan The complexity of the models is somewhat limited by the 2 weeks of computational time allotted per team for training. In that configuration, do you expect a model to take 2-3 weeks to classify a dataset that is smaller than the training set? > as it's not viable for the Cloud to have submissions running for weeks.
I think 2-3 weeks is probably the minimum? Otherwise no good model can finish. Especially for the final submission, you have until the end of Oct to score them. > Or will a much smaller subset of the 300+k images that have been held back be used?
The test set is smaller than the training set. We are currently discussing a suitable walltime to apply to the inference submissions, as it's not viable for the Cloud to have submissions running for weeks. Does that mean that in the inference stage the submitted container has to analyse 300+k images to get a score? If each image takes 10 seconds (because running high-resolution images through a deep learning neural network really takes time), it will take about 40 days to finish one inference run. Or will a much smaller subset of the 300+k images that have been held back be used?
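A quick back-of-envelope check of that estimate (illustrative numbers only, since the real test-set size has not been disclosed):
```
# Rough walltime for a full pass over the held-back images
# (illustrative figures; the actual test-set size is not disclosed).
seconds_per_image = 10
for n_images in (300_000, 340_000):
    days = n_images * seconds_per_image / 86_400
    print(n_images, 'images ->', round(days, 1), 'days')
# 300k -> ~34.7 days, 340k -> ~39.4 days: roughly the "about 40 days" above
```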
Hi Bob,
The Competitive Phase of this Challenge consists of two phases during which inference methods are evaluated on different datasets (see [Challenge Timelines](https://www.synapse.org/#!Synapse:syn4224222/wiki/401751)). That should have been:
```
find /trainingData/ -name '*.dcm*' | wc -l
```