Hi.
I waited more than a day for a model to train (submission ID 8452548, which was stuck in 'validated' for a long time), and when it finally started running, it began preprocessing again.
I have been using the same preprocessing Docker image since round 2, so I am not sure why preprocessing is running again.
Could you please take a look?
Also, could you please provide the log file for submission ID 8460457? I need more than just 'prediction file can't be empty'.
Thanks.
We reran 8460457. The end of the log file looks like this:
```
...
/scratch/1760/797635.png does not exist
/scratch/1760/800382.png
/scratch/1760/800418.png
/scratch/1760/798018.png
/scratch/1760/798144.png
/scratch/1760/798157.png does not exist
/scratch/1760/798153.png
/scratch/1760/798117.png
/scratch/1760/797459.png
/scratch/1760/797455.png
/scratch/1760/800485.png
/scratch/1760/798302.png
/scratch/1760/798324.png
/scratch/1760/798333.png
/scratch/1760/800489.png does not exist
/scratch/1760/798441.png
/scratch/1760/800508.png
/scratch/1760/798450.png
/scratch/1760/798458.png does not exist
/scratch/1760/800540.png
/scratch/1760/797820.png does not exist
/scratch/1760/800579.png does not exist
/scratch/1760/800584.png
/scratch/1760/798751.png does not exist
/scratch/1760/800603.png
/scratch/1760/798692.png
/scratch/1760/798696.png
/scratch/1760/798755.png
/scratch/1760/798764.png does not exist
/scratch/1760/798759.png
/scratch/1760/800643.png
/scratch/1760/798122.png does not exist
/scratch/1760/798126.png
/scratch/1760/800652.png
/scratch/1760/798948.png
/scratch/1760/800670.png
/scratch/1760/800674.png does not exist
/scratch/1760/799026.png does not exist
/scratch/1760/800696.png does not exist
/scratch/1760/800751.png
/scratch/1760/800755.png
/scratch/1760/800764.png does not exist
/scratch/1760/799448.png
/scratch/1760/799457.png
/scratch/1760/799453.png does not exist
/scratch/1760/800768.png does not exist
/scratch/1760/798665.png
/scratch/1760/800823.png
/scratch/1760/799493.png does not exist
/scratch/1760/800832.png does not exist
/scratch/1760/798913.png does not exist
/scratch/1760/800845.png does not exist
/scratch/1760/799790.png
/scratch/1760/798979.png
/scratch/1760/799785.png
/scratch/1760/799789.png
/scratch/1760/799057.png
/scratch/1760/799062.png
/scratch/1760/799934.png does not exist
/scratch/1760/799943.png
/scratch/1760/799938.png does not exist
/scratch/1760/799147.png does not exist
/scratch/1760/799161.png
16838
```
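The 'does not exist' lines come from a per-file check over the expected images. Roughly speaking, the check behaves like the sketch below; this is not the actual scoring/inference code, and the directory and the two example IDs are simply taken from the log tail above.
```python
import os

# Rough sketch of the kind of per-file check that produces the lines above:
# print each expected PNG path, flag missing files, and report a count at the
# end. NOT the actual scoring code; directory and IDs are examples from the log.
scratch_dir = "/scratch/1760"
expected_ids = ["800382", "797635"]  # examples only; the real list is much longer

n_found = 0
for image_id in expected_ids:
    path = os.path.join(scratch_dir, f"{image_id}.png")
    if os.path.exists(path):
        print(path)
        n_found += 1
    else:
        print(f"{path} does not exist")

# The final number in the log (16838) is presumably a summary count like this one.
print(n_found)
```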
> We are getting the following error on a sub-challenge 2 inference submission:
> STDERR: InvalidArgumentError (see above for traceback): Invalid PNG header, data size 0
Kindly give the submission ID for which you received this error.
[Leaderboard widget: your submissions to evaluation_7453793, showing submission ID, status, failure reason, and scores]
> Also, could you please provide the log file for submission ID 8460457? I need more than just 'prediction file can't be empty'
The log file is no longer available, so I took the liberty of restarting the submission. The first time it ran in under a day, so we should have a new result soon. If you receive another failure notification for the submission, please post back to this thread and I will have a look at the log file.
Sorry for the late reply.
> I waited more than a day for a model to train (submission ID 8452548, which was stuck in 'validated' for a long time), and when it finally started running, it began preprocessing again. I have been using the same preprocessing Docker image since round 2, so I am not sure why preprocessing is running again. Could you please take a look?
It appears that 8452548 ran on a different machine than your previous submission, 8437289. We don't have a record of why this happened, but the likely cause is an effort to distribute submissions across machines. As you mention, the submission queued for a long time; occasionally, when the queue on the original machine is deep, we estimate that it will be quicker to move a submission to an available machine and immediately rerun preprocessing there than to keep waiting for the original machine.
[Leaderboard widget: your recent submissions to evaluation_7213944, showing status, status detail, timestamps, and worker ID]

We reverted to the version that does not use '/scratch', but that will take a very long time to complete, so please check whether something is wrong, as we asked before. @thomas.yu
Could you please update us on the status of this issue?
There is not much time left until the end of round 3, and if we don't hear back soon, we may not be able to make a submission for sub-challenge 2.
Thanks.

Dear Seung Wook,
I have forwarded your message to other challenge organizers. Thanks in advance for your patience.
Best,
Tom

We are getting the following error on a sub-challenge 2 inference submission:
STDERR: InvalidArgumentError (see above for traceback): Invalid PNG header, data size 0
TensorFlow raises this error when it encounters a PNG file of size 0, so most likely some DICOM files were converted to empty PNG files.
Since the submission runs fine on the express lane, and we had no problem using the same extraction+read code for sub-challenge 1, we suspect a system-related problem such as insufficient memory allocation on /scratch.
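As a quick sanity check, something along the lines of the sketch below could be run over the converted images to confirm whether empty PNGs are really present. It is only a sketch, and the /scratch/1760 path is assumed from the log posted earlier, not taken from our container.
```python
import glob
import os

# Sketch of a check for empty or corrupt PNGs left behind by the DICOM-to-PNG
# conversion. A zero-byte file, or one missing the 8-byte PNG signature, is
# exactly what makes TensorFlow's PNG decoder fail with
# "Invalid PNG header, data size 0". The directory is an assumption based on
# the log above.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
scratch_dir = "/scratch/1760"

paths = glob.glob(os.path.join(scratch_dir, "*.png"))
bad = []
for path in paths:
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        header = f.read(8)
    if size == 0 or header != PNG_SIGNATURE:
        bad.append((path, size))

for path, size in bad:
    print(f"bad PNG: {path} ({size} bytes)")
print(f"{len(bad)} bad PNG files out of {len(paths)}")
```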
Could you please look into this issue?
Thanks.
Also, we still need the log file for submission ID 8460457.