Hi All,

Here are the answers to the questions asked during Webinar #1. Please follow the link of a given question for further discussion.

[1. Can we expect that every exam will have at least the 4 standard views?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=850)
**Answer**: No; however, more than 99% of the exams have CC and MLO views for each breast.

[2. What proportion of exams have more than the 4 standard views?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=851)
**Answer**: About 20% of the exams have extra images in addition to the 4 standard views.

[3. What is the distribution of exams by manufacturer? I.e., do the majority come from the same manufacturer? If not, do different manufacturers look significantly different?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=852)
**Answer**: Most of the images are from the same manufacturer. This information is available in the DICOM header of each image (a short example of reading it is shown after question 8).

[4. The open phase doesn't have any scoring, right? So can we test our models in this phase?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=853)
**Answer**: Correct, no _meaningful_ scoring during that phase. Participants will still be able to test whether the evaluation part of their code works, but no information regarding the performance of the method will be returned during the Open Phase.

[5. Where can we find information regarding the "reference algorithm"?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=854)
**Answer**: If the question refers to the example Docker container that we are providing, additional information will shortly be available on the Wiki page [Submitting Models](https://www.synapse.org/#!Synapse:syn4224222/wiki/401759). One example Docker container that we are preparing contains the deep learning _framework_ Caffe and a settings file that allows you to select one of the following _models_: AlexNet, GoogLeNet, and VGG Net. Another example will feature the same content but use the framework TensorFlow instead of Caffe. If the question refers to the baseline algorithm against which the performance of the submitted inference methods will be compared, the answer is that this baseline algorithm will be available only after the end of the Competitive Phase.

[6. Are you going to release more subjects for the training set? The current training set is too small to be meaningful. The number of cases is very small and the total number of subjects is very small compared with the whole data set.](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=855)
**Answer**: If "release" means "make public", the answer is no. We are only authorized to release 500 mammography images ([Pilot Set](https://www.synapse.org/#!Synapse:syn6174174)). However, the training set in the Leaderboard Phase will contain data from about 43k subjects. More data will be made available during the Final Round of the Evaluation Phase (for a total of about 60k subjects).

[7. What models will your internal team build and when do we expect release? You mentioned TensorFlow - could you elaborate please?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=856)
**Answer**: TensorFlow is one framework for deep learning; Caffe and Theano are two others. We will provide examples that use Caffe and TensorFlow. The models that will be included in the Caffe and TensorFlow examples are the following standard models: AlexNet, GoogLeNet, and VGG Net.

[8. When will the basic Caffe and TensorFlow examples be available?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=857)
**Answer**: In one week or so.
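To illustrate the answer to question 3: a minimal sketch of reading the manufacturer field from a DICOM header, assuming the `pydicom` package and a local image from the Pilot Set (the file path below is hypothetical):

```python
import pydicom

# Hypothetical path; point it at any image downloaded from the Pilot Set.
ds = pydicom.dcmread("pilot_set/000001.dcm")

# Standard DICOM attributes; use a default in case a tag is absent.
print("Manufacturer:", ds.get("Manufacturer", "unknown"))
print("Model:", ds.get("ManufacturerModelName", "unknown"))
```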
[9. Is Caffe the recommended deep learning framework to use?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=858)
**Answer**: Caffe is one of the most widely used deep learning frameworks. However, we are not making any recommendation regarding the choice of deep learning framework.

[10. You've said we mustn't copy images out - I get that. Is it OK for us to have partial images or partially processed images in the 10TB local storage?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=859)
**Answer**: No, images hosted on the Challenge Cloud, as well as processed images, stay on the Cloud.

[11. Will the provided example container have example input data/directories, intermediate log files, and output? (so that we can make sure our container mirrors the same behavior)](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=860)
**Answer**: No; however, the example containers provided are guaranteed to run successfully on the Challenge Cloud.

[12. Is there a deadline for forming teams? E.g., can individual participants join together during any phase?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=861)
**Answer**: The composition of the teams can be changed during a small time window between the rounds of the Challenge. Once a round has started, the composition of a team cannot be changed until the end of the round. New teams can register during this time window. Individual participants who are not already part of a team can join during any phase of the Challenge, but they cannot join _together_ and form a team before the end of the round.

[13. Are the DICOM metadata available for Sub-Challenge 1?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=863)
**Answer**: The DICOM header of each image provides a wealth of technical information as well as clinical information. All the metadata that are clinical data (e.g., subject age) are removed from the DICOM header (Sub-challenges 1 and 2). The list of fields that have been removed from the DICOM header of each image will be provided shortly. In Sub-challenge 2, clinical and demographic data are provided as a table (.tsv); a sketch of loading such a table is shown after question 17.

[14. Will we get access to our trained models?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=864)
**Answer**: Not during the Competitive Phase of the Challenge. As training progresses, we capture and return your log files. By "log files" we mean any command-line output your model produces. We hope you can use this mechanism to assess how well your model converges, informing your submission choice.

[15. Are we provided BI-RADS scores or a physician's impression?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=866)
**Answer**: No, only objective information is provided during the Competitive Phase of the Challenge.

[16. Are all the images at the same physical resolution per pixel?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=867)
**Answer**: No, the images have different resolutions (a sketch of checking and normalizing the pixel spacing is also shown after question 17).

[17. If we submit a preprocessing Docker container, will we be able to review the preprocessed images so we understand how well the preprocessing works?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=868)
**Answer**: You can test the pre-processing Docker container on your computer using the [Pilot Set](https://www.synapse.org/#!Synapse:syn6174174).
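Regarding question 13, a minimal sketch of loading the clinical/demographic table with `pandas`, assuming a tab-separated file; the file name below is hypothetical, as the actual schema is not described in this post:

```python
import pandas as pd

# Hypothetical file name; use the .tsv table provided for Sub-challenge 2.
clinical = pd.read_csv("exams_metadata.tsv", sep="\t")

# Inspect the schema before joining this table with the images.
print(clinical.columns.tolist())
print(clinical.head())
```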
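Regarding question 16, one way to handle the varying resolutions is to read the pixel spacing from the DICOM header and resample every image to a common spacing. This is only a sketch assuming `pydicom`, `numpy`, and `scipy`; the target spacing of 0.1 mm/pixel is an arbitrary illustration, not a recommendation:

```python
import numpy as np
import pydicom
from scipy.ndimage import zoom

TARGET_SPACING_MM = 0.1  # arbitrary target, for illustration only

ds = pydicom.dcmread("pilot_set/000001.dcm")  # hypothetical path

# The physical resolution is stored in PixelSpacing or ImagerPixelSpacing
# as (row spacing, column spacing) in mm; neither tag is guaranteed.
spacing = ds.get("PixelSpacing") or ds.get("ImagerPixelSpacing")
if spacing is not None:
    factors = [float(s) / TARGET_SPACING_MM for s in spacing]
    resampled = zoom(ds.pixel_array.astype(np.float32), factors, order=1)
    print("original:", ds.pixel_array.shape, "resampled:", resampled.shape)
else:
    print("No pixel spacing tag found; image left unchanged.")
```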
[18. Is there a baseline algorithm?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=869)
**Answer**: The baseline algorithm will be available after the Leaderboard Phase.

[19. Can we preprocess the original data more than once?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=870)
**Answer**: Yes. You can submit and run a pre-processing Docker container as many times as you want, as long as you have not reached your time quota.

[20. You've been using Python for your examples. Can we use other languages - or is it more sensible to stick with Python too?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=871)
**Answer**: Yes, a great feature of Docker is that you can install any piece of software in your Docker container. See https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=796

[21. For the open phase, will we get log files regarding the model training accuracy?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=872)
**Answer**: The log files contain whatever you decide to add to them. I recommend including any metrics that can help you decide 1) whether your container should be stopped to save time quota (in case your model is not converging) and 2) whether you later want to submit the trained model for evaluation against the leaderboard set (this action can be performed only up to three times per round). A minimal logging sketch is shown at the end of this post.

[22. Would it be possible to add a Docker image with a higher-level framework such as Keras, which I feel is more practical than TensorFlow for prototyping?](https://www.synapse.org/#!Synapse:syn4224222/discussion/threadId=873)
**Answer**: You can install any software of your choice in your Docker container. If several participants are interested in using Keras, we may release an example Docker image that already has it installed.

Thanks!
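As an addendum to questions 14 and 21: the returned log files simply capture whatever your code writes to standard output, so per-epoch metrics printed there are enough to judge convergence. A minimal sketch (the training step is a placeholder, not part of any provided example):

```python
import sys
import time

def train_one_epoch(epoch):
    """Placeholder for a real training step; returns a dummy loss."""
    return 1.0 / (epoch + 1)

for epoch in range(10):
    loss = train_one_epoch(epoch)
    # Anything printed to stdout ends up in the log files returned to you.
    print("epoch=%d loss=%.4f time=%s"
          % (epoch, loss, time.strftime("%H:%M:%S")))
    sys.stdout.flush()  # flush so output is captured promptly
```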

Created by Thomas Schaffter (tschaffter)
