I ran a number of model training sessions today on the challenge cluster.
But now it appears that my latest model is stuck in the "VALIDATED" state for more than 90 minutes:
${imageLink?synapseId=syn7880081&align=None&responsive=true}
Why would that be? Am I hitting a daily submission quota? Or maybe I'm in a queue because all servers are busy?
Being able to quickly test different models and prototype rapidly (but with the full data set) is important for competitiveness. It would be great if we could make sure that a fast iteration style is supported by the competition's cluster job system, and that people making more frequent submissions aren't at a disadvantage.
Thanks!
Created by Daniel Nouri dnouri

Any chance we could have the submission queue open past the 23rd even if training is shut down then? Still in VALIDATED status and need my trained model to submit...

Dear all,
Apologies for the delay in response. There is a spike in submissions, so all the servers are busy processing. I see all your submissions as VALIDATED and hopefully the machines free up soon to run your submissions. Thank you for your patience.
Best,
Thomas

My latest job has also been stuck for more than a day. :-(

@brucehoff @tschaffter @thomas.yu
My training submissions (7882469 & 7882523) have been in "VALIDATED" status for over a day. Just want to make sure it is simply because all servers are busy?
Also, is there a way to check the status of the servers to see how many submissions are in front of mine?
Thanks.

I believe it is because all servers are busy. My jobs have been in this status for almost two days.

I'm also encountering this problem.

Any update? My last submission has been stuck in Validated for hours.
What's the "VALIDATED" status and why is my job stuck?