I noticed others have had this problem too. I know it's my own foolishness, but two of my three submissions were faulty. I'd have liked to cancel them and make working submissions, but I couldn't.
I submitted them because I was frustrated by the slowness of the express lane and thought using up a full submission might get me a result more quickly. All three are still in the queue, so I was obviously quite wrong, and I'll try not to make the same mistake again.
... but we are human. In case I click the wrong button late at night and make a full submission instead of putting it onto the express queue, it'd be really nice to have a 'cancel' button.
You know that running locally is not enough. When you build the docker training image, you are not putting the images and the metadata in, so you cannot check whether it will work. You have to check that the scripts work locally, import them into docker, run the image locally, and test that the libraries are working. All these steps just increase the chances of stupid errors (like typos) that will be discovered only after the submission fails. I understand and appreciate the effort to make the infrastructure as 'democratic' as possible, but unfortunately it results in a very convoluted and over-complicated mechanism for the real purpose, which is providing a good predictive model.
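That build-and-test loop can at least be scripted so it runs the same way every time. A minimal sketch in Python, assuming the image tag and container mount points below (they are illustrative guesses, not the challenge's documented interface):

```python
import subprocess
import tempfile
from pathlib import Path

# Assumptions for illustration only: replace the image tag and the
# mount points with whatever the challenge documentation specifies.
IMAGE = "docker.synapse.org/syn0000000/my-model:latest"  # hypothetical tag
MOUNTS = {"trainingData": "/trainingData",
          "metadata": "/metadata",
          "modelState": "/modelState"}

def local_dry_run() -> int:
    """Run the image exactly as built, against an empty copy of the
    expected directory layout, so missing libraries and path typos
    surface on the laptop instead of in the submission queue."""
    root = Path(tempfile.mkdtemp(prefix="dryrun_"))
    cmd = ["docker", "run", "--rm"]
    for name, target in MOUNTS.items():
        host_dir = root / name
        host_dir.mkdir()
        # Copy a handful of sample images and a truncated crosswalk
        # file into host_dir here to exercise the real code paths.
        cmd += ["-v", f"{host_dir}:{target}"]
    cmd.append(IMAGE)
    return subprocess.call(cmd)

if __name__ == "__main__":
    raise SystemExit(local_dry_run())
```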
I understand the idea of the express lane, whose purpose is to give you a quick answer about the correctness of the submission, but often even that requires hours and hours of waiting time.

Great! That's excellent news.
Yes, I do run everything on my own machines before submitting. Unfortunately, the differences in architecture are just enough to let me make silly errors: 48 CPUs instead of 4 CPUs means that it goes through slightly different logic. It shouldn't make a difference, of course, and it's only because I'm an idiot, I know, but there we are; I am what I am, as Popeye would say.
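One way to stop that 48-CPU/4-CPU divergence is to make the parallelism an explicit setting instead of deriving it from the machine, so both hosts take the same code path. A minimal sketch, assuming a hypothetical N_WORKERS environment variable and a 4-CPU submission host:

```python
import multiprocessing
import os

# Take the worker count from an explicit knob, capped at the assumed
# 4 CPUs of the submission host, so local runs exercise the same
# logic as submitted runs. N_WORKERS is hypothetical.
N_WORKERS = int(os.environ.get("N_WORKERS",
                               min(4, multiprocessing.cpu_count())))

def score_chunk(chunk):
    # Placeholder for the real per-chunk work.
    return sum(chunk)

if __name__ == "__main__":
    chunks = [list(range(i, i + 10)) for i in range(0, 100, 10)]
    with multiprocessing.Pool(N_WORKERS) as pool:
        print(pool.map(score_chunk, chunks))
```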
> it'd be really nice to have a 'cancel' button.

Yes, we have discussed this internally in the last week. Our thinking is to provide a 'cancel' button for inference submissions that have not yet started running. Such canceled submissions would not count toward your quota. Hopefully we will be able to provide this in Round 3.
> All this just causes a delay because it requires a new submission, waiting until the code is run and hoping everything is fine.
I wonder if it would help to run your submission locally (on your own desktop or laptop) to check for correctness before submitting. Doing so would avoid waiting for your submission to run through our queue just to verify correctness.

I can understand your feelings. I got an error because of a typo: image_crosswalk.tsv instead of images_crosswalk.tsv.
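A cheap guard against exactly that kind of typo is a pre-flight check at the top of the entry point, so the container fails within seconds with a clear message instead of after hours in the queue. A sketch; the expected paths are taken from this thread and should be treated as assumptions:

```python
import sys
from pathlib import Path

# Expected inputs, per the posts in this thread; verify against the
# challenge documentation before relying on them.
EXPECTED = [
    Path("/metadata/images_crosswalk.tsv"),  # note the plural "images"
    Path("/trainingData"),
]

def preflight() -> int:
    """Return 0 if every expected input exists; otherwise print what
    is missing and return 1, failing fast before any heavy work."""
    missing = [p for p in EXPECTED if not p.exists()]
    for p in missing:
        print(f"missing expected input: {p}", file=sys.stderr)
    return 1 if missing else 0

if __name__ == "__main__":
    sys.exit(preflight())
```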
I am starting to feel a bit frustrated with the whole computational infrastructure: limited computational time, a huge queue, no possibility of an interactive shell. I am starting to think that this is more about overcoming hardware and software limitations than about finding a good algorithm for the predictions.
I don't see the real point of using docker and having to submit an entire OS in which the modelling script represents a minimal part. There are endless pitfalls introduced by docker alone: missing libraries, for example, that are usually available in distros other than CentOS (try to call matplotlib and you'll see). All this just causes a delay because it requires a new submission, waiting until the code is run and hoping everything is fine.
I am starting to be convinced that other rule schemes (like those used on Kaggle) are designed to keep the focus on the modelling, not on all this useless, time-wasting docker-based infrastructure.
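For what it's worth, the matplotlib failure mentioned above is usually the default backend looking for a display that a headless container doesn't have; forcing a non-interactive backend sidesteps it. More generally, an import check run during the image build (e.g. a `RUN python check_imports.py` step in the Dockerfile) turns a missing library into a build failure rather than a failed submission. A sketch, with an illustrative dependency list:

```python
import importlib
import sys

# Force a headless backend before pyplot is ever imported; the
# container has no display. If matplotlib itself is missing, the
# loop below will report it.
try:
    import matplotlib
    matplotlib.use("Agg")
except ImportError:
    pass

# Illustrative dependency list; adjust to what the model really needs.
REQUIRED = ["numpy", "pandas", "sklearn", "matplotlib.pyplot"]

failed = []
for name in REQUIRED:
    try:
        importlib.import_module(name)
    except ImportError as exc:
        failed.append(f"{name}: {exc}")

for entry in failed:
    print("cannot import", entry, file=sys.stderr)
sys.exit(1 if failed else 0)
```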