Hello everyone,
We are heading into the last week of the competition. Currently, 100+ teams comprising 300+ researchers from 80+ institutions and companies are participating in the challenge. I have some announcements to share with you.
You can check the final submission package format here (https://www.synapse.org/#!Synapse:syn28469146/wiki/617563).
Please keep an eye on the forum so you do not miss essential discussions (https://www.synapse.org/#!Synapse:syn28469146/discussion/default).
Week 7 leaderboard will run from July 27 to July 30 as scheduled. However, we are adding one additional day to the challenge and will allow 20 more submissions on the final day. We will close the leaderboard on July 31st.
Thank you! Thanks a lot @muntakimrafi!

Midnight.

Does that mean 11.59pm ET, or close of business (e.g., about 5pm ET)?

I will try my best to follow Eastern Time (ET).

Hi @muntakimrafi,
Regarding the closure of the leaderboard on 31 July, is that at a specific time (and timezone)?
Thanks in advance,
Dimitri

@penzard Ideally, we want to run everyone's models on TPUs. However, do not worry: we will evaluate a GPU-trained model on TPUs only if we can get performance comparable to the submitted predictions. I know that TPUs are not the best at small batch sizes, and it could be unfair to models that are better optimized for small batch sizes.
I have experience with both TensorFlow and PyTorch, but I would still advise you to document your code properly. Not only would that make the evaluation phase easier for us, it would also help future users of the model.
By the way, [Kaggle](https://www.kaggle.com/) offers a free weekly quota of TPU v3-8 time. You can use Kaggle if you want to check whether your code is compatible with TPUs. If you plan to do that, be sure to upload the challenge data as a private dataset; do not share it as a public dataset on Kaggle. You can also choose not to do this, given that there is not much time left in the competition. I would not want you to worry about TPU compatibility right now.
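For anyone who wants to try that, here is a minimal TensorFlow 2 sketch of such a compatibility check: it detects a TPU (as on a Kaggle TPU notebook), initializes it, and otherwise falls back to the default CPU/GPU strategy. The tiny Keras model is only a placeholder, not part of any challenge pipeline.

```python
import tensorflow as tf

try:
    # On Kaggle/Colab TPU notebooks the TPU address is discovered automatically.
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.TPUStrategy(resolver)
    print("Running on TPU:", resolver.master())
except ValueError:
    # No TPU available: fall back to the default (CPU / single-GPU) strategy.
    strategy = tf.distribute.get_strategy()
    print("Running on CPU/GPU")

# Variables must be created inside the strategy scope so they are placed
# on the selected device(s).
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(10,)),
        tf.keras.layers.Dense(1),  # placeholder model
    ])
    model.compile(optimizer="adam", loss="mse")
```

If this runs and the model trains on a small batch of dummy data, the model-building code is at least TPU-compatible; tuning batch size and the input pipeline is a separate question.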
@muntakimrafi Could a team ask for its model to be run in a specified environment? For example, our model is trained on a GPU because we cannot use TPUs from Google Cloud. We have no idea how it will behave on a TPU and, frankly speaking, we would be much more relaxed if you trained our neural network on a GPU. We can also provide files to reproduce our conda environment.

About developing a TPU version: we write our code in PyTorch, while you, as far as I know, use TensorFlow 2. Also, our code is not an example of best-ever readability, so it might be hard to introduce changes to it, and since we have no access to TPUs, we would not be able to validate your changes. For now we have almost-reproducible code for GPU (given the restrictions I mentioned); for TPU we cannot guarantee that.
About random seeds: I think some misunderstanding occurred. I was talking about selecting a number of seeds (say, 42, 79, ...) and training a different version of the model with each of them. You will get different predictions and will be able to see the variation in model performance, but the result will still be reproducible, in contrast to not setting a seed at all.
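To make that concrete, here is a toy, self-contained PyTorch sketch of the pattern: fix a list of seeds, run the pipeline once per seed, and report the spread. The tiny linear-regression "pipeline" is purely illustrative and only stands in for a team's actual training and scoring code.

```python
import random
import numpy as np
import torch
import torch.nn as nn

def set_seed(seed: int) -> None:
    """Make the Python, NumPy, and PyTorch RNGs deterministic for one run."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

def train_and_score(seed: int) -> float:
    """Toy stand-in for a team's pipeline: train a tiny model, return a score."""
    set_seed(seed)
    x = torch.randn(256, 10)
    y = x.sum(dim=1, keepdim=True) + 0.1 * torch.randn(256, 1)
    model = nn.Linear(10, 1)
    opt = torch.optim.SGD(model.parameters(), lr=0.05)
    for _ in range(100):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

SEEDS = [42, 79, 123]  # a fixed list: runs differ from each other, but each one is reproducible
scores = [train_and_score(s) for s in SEEDS]
print(f"mean = {np.mean(scores):.4f}, sd = {np.std(scores):.4f}")
```

Rerunning the script should give the same mean and sd on the same hardware and library versions; removing the set_seed call would not.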
@penzard The predictions submitted by the participants will act as a reference. If we cannot recreate a similar level of performance, we will work with the participant to see what could have gone wrong in our implementation. Ideally, the evaluation will be done with model weights that we train ourselves; I think this will increase the acceptance of the models.
Once we are sure that we can reproduce the models, and given resource availability, we want to use multiple runs to validate them. We may not fix the seeds across those runs, so that we can see the variation in performance between runs. What are your thoughts on not fixing the seeds for multiple runs?
We do not want to use different platforms to evaluate models that will be compared against each other. Ideally, all of the models will be run on TPUs.
If we fail to implement a model on TPUs, its training will be done on GPUs. That could create a scenario where model A, trained on a TPU, is compared with model B, trained on a GPU. However, we will try our best to avoid that by developing TPU implementations of the models.
Another important thing to remember is that we won't be comparing only two neural networks but two proposed methods (the overall pipelines). Different teams could propose using different proportions of training and validation data. Some could modify how much data is drawn from different expression ranges, or even how each batch is composed. In the evaluation phase, we won't be focusing on which model is best; we will try to see which team proposed the best pipeline. The team whose method gives the best result will win the challenge.
That said, we plan to explore all of these things too (best model architecture, best preprocessing, best post-processing, etc.) before officially presenting the results of the competition to the community (ideally in the form of a paper).

@muntakimrafi Sorry to bother you. Could you clarify how you are going to evaluate the predictions of the participants' models?
As I have posted in another thread, predictions depend not only on the seed but also on the platform and the NVIDIA card you are using.
Will you use our predictions and only check whether they are similar to those produced by your independent run(s)?
Or are you going to train your own version of the model and evaluate only that version's predictions?
Or are you going to train the model, say, 10 times with different seeds and then report the mean ± sd for each participant?
This is crucial, because even for today's models this can result in different rankings, especially if two or more models have very similar performance.