According to the webinar Q&A, we are not allowed access to the trained models. However, on the model submission wiki page, it says "When your model completes, the contents of /modelState are also zipped and uploaded to Synapse." I'm really confused. Are we allowed to obtain a copy of the trained model or not? A related question is: are the log files the only way to examine model status?

Created by Li Shen (thefaculty)
Any updates on this?
I would agree with most of what's been said -- without access to some form of the model state, there's no way to improve our models iteratively in a principled manner. A percentage correct is definitely better than nothing, but we'd basically be designing these algorithms with one hand tied behind our backs. Thomas, is there any possibility of both giving us access to the model state and checking for image theft, perhaps with some sort of automatic screening?
This is the same question I asked earlier. I also need to tune my model. I only need about one line of data per image to do the tuning - but it would be best to have information from all the images. Say a maximum of 1 kB per image; for 60,000 images, that would be 60 MB. I agree with the earlier comment: it's partly computation, but partly heuristic tuning through inspection - essentially to understand the necessary weighting.
To add another argument for making the model state available to us: if we were able to download the trained model, we could use it to **make predictions on the pilot set**. That would help us tremendously in determining **when and why our models fail**. As we know, machine learning researchers can learn a great deal from **failures**.
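To illustrate the kind of failure analysis this would enable, here is a minimal sketch, assuming a downloaded tf.keras model and pilot images/labels stored in the hypothetical files named below:

```python
# Hedged sketch: score the pilot set with a downloaded model and list the
# worst failures. File names, array shapes and the .h5 format are assumptions.
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("downloaded_model.h5")   # hypothetical file
pilot_images = np.load("pilot_images.npy")                   # assumed (N, H, W, 1)
pilot_labels = np.load("pilot_labels.npy")                   # assumed binary labels

probs = model.predict(pilot_images).ravel()
errors = np.abs(probs - pilot_labels)
worst = np.argsort(errors)[::-1][:20]                        # 20 largest errors
for i in worst:
    print("image %d: label=%d predicted=%.3f" % (i, pilot_labels[i], probs[i]))
```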
Hi Thomas,

TensorFlow's checkpoint files vary in size according to the number of parameters saved. An example is Inception, which is between 90 MB and 400 MB (see * below), although quantization can halve this or more. You might want to write a piece of code to scan the .ckpt files for images before they are returned (a rough sketch follows below), although in my view it would be foolish and pointless to try to steal images, as already noted.

* http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz and http://download.tensorflow.org/models/image/imagenet/inception-v3-2016-03-01.tar.gz

Regards, Stephen
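A rough sketch of what such a screening script might look like - assuming TF 1.x checkpoints, and assuming that a hidden image would show up as a stored tensor with two mammogram-scale dimensions (the path and threshold below are hypothetical):

```python
# Hedged screening sketch: list checkpoint variables and flag any tensor whose
# shape has two large spatial dimensions, since weight tensors rarely do.
import tensorflow as tf

def flag_suspicious_tensors(ckpt_path, min_side=1024):
    reader = tf.train.NewCheckpointReader(ckpt_path)
    for name, shape in sorted(reader.get_variable_to_shape_map().items()):
        large_dims = [d for d in shape if d >= min_side]
        if len(large_dims) >= 2:
            print("suspicious: %s shape=%s" % (name, shape))
        else:
            print("ok:         %s shape=%s" % (name, shape))

# flag_suspicious_tensors("/modelState/model.ckpt")   # hypothetical checkpoint path
```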
Also, I cannot agree that a big attraction of Kaggle or other image competitions, e.g. ImageNet, is that all images are released. To me that is the biggest DISADVANTAGE, because one can hand-label the test set. You don't even need to do pixel level - just quickly click through the images, yes or no, and then train a model based on your own annotations of the test set. I often label thousands of images in a morning, because it is such a relaxing yet seemingly productive activity. I think for this dataset I could finish labeling the test set in 3 days, but I cannot do it, since I have the burden of a university job and title. Then I will have to watch others do it.
I think the state should not be provided, because providing the state is effectively equivalent to providing the original images: one can train an autoencoder-style decoder and reconstruct images from the second-to-last layer's activations.
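For concreteness, here is a minimal sketch of the kind of decoder that concern refers to, assuming one had pairs of penultimate-layer feature vectors and their source images (the dimensions and the `features`/`images` arrays are assumptions):

```python
# Illustrative sketch only (tf.keras): a decoder trained to map penultimate-layer
# features back to images, i.e. the reconstruction risk described above.
import tensorflow as tf

def build_decoder(feature_dim=2048, out_hw=(256, 256)):
    h, w = out_hw
    inp = tf.keras.Input(shape=(feature_dim,))
    x = tf.keras.layers.Dense((h // 16) * (w // 16) * 64, activation="relu")(inp)
    x = tf.keras.layers.Reshape((h // 16, w // 16, 64))(x)
    for filters in (64, 32, 16, 8):                      # 4 upsamplings by 2 -> x16
        x = tf.keras.layers.Conv2DTranspose(filters, 3, strides=2,
                                            padding="same", activation="relu")(x)
    out = tf.keras.layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)
    return tf.keras.Model(inp, out)

# decoder = build_decoder()
# decoder.compile(optimizer="adam", loss="mse")
# decoder.fit(features, images, epochs=10)   # 'features' and 'images' are assumed arrays
```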
I agree with Stephen M. Most deep learning models are trained with SGD, which requires many rounds of tuning. Without access to the model state, we are basically throwing darts in the dark, and the model will most likely end up sub-optimal...
Hi Stephen,

Before addressing your questions in detail, can you please give us a rough estimate of the total number of MB that we would return to you (for the model state only, not including the log files) if we decide to return it? You mention that a checkpoint file is about 100 MB, but which network architecture and image size do you plan to use? Also, how many MB would one of your checkpoint files be if you use an image resolution closer to that of the original images we are providing (the smaller images are 3328x2560 pixels)?

Here are some numbers that should be useful:
- the challenge has 3 rounds, each lasting five weeks
- 2 weeks of computation time per user per round

Thanks!
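For reference, a back-of-envelope sketch of how checkpoint size relates to the model rather than to the input resolution: convolutional layers keep the same number of weights regardless of input size, so only fully-connected layers tied to the input dimensions grow with resolution. The parameter counts below are illustrative assumptions.

```python
# Rough sizing sketch: an uncompressed checkpoint is roughly
# (number of parameters) x (bytes per parameter).
def ckpt_size_mb(num_params, bytes_per_param=4):
    return num_params * bytes_per_param / 1e6

print(ckpt_size_mb(25e6))    # ~100 MB, an Inception-v3-scale model (illustrative)
print(ckpt_size_mb(140e6))   # ~560 MB, a VGG-scale model (illustrative)
```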
There is concern that the potential withdrawal of access to trained models will make this project very difficult. I will outline the issues for everyone's benefit and hope to hear some response.

**Access**

Designing and training neural nets is empirical and experimental. Teams need access to the model state to:
- understand how backprop is working: whether signals decay, explode, or more generally how they behave;
- see how hyperparameters affect weights and kernel activation patterns;
- understand what features are being learned (see AlexNet's features below);
- reuse weights, filters and layers in each new model. Retraining from scratch every time could take weeks, which is expensive for any developer, and the limits on server time here make that an even bigger drag.

![Feature maps](https://devblogs.nvidia.com/parallelforall/wp-content/uploads/2014/09/nn_example-624x218.png)

**Tools**

If teams have their model files, they can use existing tools to inspect learning curves, weights and other parts of the models. Google has already built TensorBoard, which analyses the checkpoint and summary files (a minimal sketch of that workflow follows below). Without the model files, participants would have to build new tools to save variables and to analyse the results. That would take weeks of coding, and it is reinventing the wheel.

**Why Join**

Participants joined this challenge on the understanding that they would get access to their models. That documentation was open to all and was surely agreed by the data providers. The webinar now says the models can be downloaded if there is a good reason; I think there is. Most participants are also doing this for the learning experience, not the prize money - there are too many contestants to make the prize achievable. This competition would be far less attractive without access. Other competitions, like the nerve segmentation challenge on Kaggle, give access to the data, and that makes them relatively more attractive.

**Theft**

One of the organisers said there is concern about theft, but isn't that overcautious? Bear in mind that a checkpoint file is ~100 MB, which is about 20-30 images, i.e. roughly 0.00004 of the training data. That is insignificant and not worth the risk of disqualification.

I hope the organisers are able to recognise these points and give access to the model states.
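As a concrete illustration of the "existing tools" point above, here is a minimal sketch (TF 1.x summary API, with a toy variable standing in for a real layer) of the files TensorBoard reads to plot learning curves and weight histograms:

```python
# Minimal sketch: write scalar and histogram summaries that TensorBoard can plot.
import tensorflow as tf

w = tf.Variable(tf.random_normal([3, 3, 1, 16]), name="conv1_weights")  # toy layer
loss = tf.reduce_mean(tf.square(w))                                     # toy loss

tf.summary.histogram("conv1_weights", w)
tf.summary.scalar("loss", loss)
merged = tf.summary.merge_all()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter("./tb_logs", sess.graph)
    summary = sess.run(merged)
    writer.add_summary(summary, global_step=0)
    writer.close()

# then inspect with: tensorboard --logdir ./tb_logs
```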
Hi Li Shen, Here is some additional information. We are aware that not returning the trained model may make the task more complicated for you. The motivation behind doing so is that we need to prevent the theft of the mammography images as requested by Group Health. We are still discussing this question internally, but the current consensus is that insight about a model can be collected *live* or after the completion of the training and included in the log file. Thomas
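One possible way to collect such insight live, if only the log files come back, would be a sketch along these lines (assuming a TF 1.x training session; the call site in the training loop is hypothetical):

```python
# Hedged sketch: periodically print per-layer weight statistics to stdout so
# they end up in the returned log file.
import numpy as np
import tensorflow as tf

def log_weight_stats(sess, step):
    for var in tf.trainable_variables():
        w = sess.run(var)
        print("step=%d layer=%s shape=%s mean=%.4g std=%.4g max_abs=%.4g"
              % (step, var.op.name, w.shape, w.mean(), w.std(), np.abs(w).max()))

# inside the training loop (hypothetical):
#     if step % 1000 == 0:
#         log_weight_stats(sess, step)
```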
> "When your model completes, the contents of /modelState are also zipped and uploaded to Synapse." Thanks for spotting this deprecated part of the documentation. > are the log files the only way to examine model status? Yes
