Hello all,
A couple of times over the past weeks, including during the past six hours, I've been trying to make use of the feature that allows multiple training runs to reuse data created in the /preprocessedData/ volume by an earlier preprocess.sh run.
I have tried:
* keeping the `preprocess=` line in the submission file constant,
* eliminating the `preprocess=` line and submitting only a training line,
* several other exotic configurations,
but I have NOT been able to make this work. All I see is either a fresh preprocessing stage being initiated, or no /preprocessedData/ volume at all.
If anyone has made this work, I would definitely like to see the evidence, before I spend time creating a bug report.
**Incidentally**, my test to see if I get data in the /preprocessedData/ volume is simply to
```
$ touch /preprocessedData/`date -Iseconds`
```
during the preprocess.sh phase, which gives me a simple and reliable indicator, then
```
$ ls /preprocessedData/
```
in the train.sh phase, to see if I have files in the volume.
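Put together, the marker test can be sketched as follows. This is a local approximation: `/tmp/preprocessedData` stands in for the challenge's `/preprocessedData` volume, and the two phases are simulated in one script.

```shell
# Local sketch of the marker test; /tmp/preprocessedData stands in
# for the challenge's /preprocessedData volume.
MARKER_DIR=/tmp/preprocessedData
mkdir -p "$MARKER_DIR"

# preprocess.sh phase: drop a timestamped marker file into the volume
touch "$MARKER_DIR/$(date -Iseconds)"

# train.sh phase: if the volume persisted, the marker is still visible
ls "$MARKER_DIR"
```

If the volume is carried over between the two phases, the `ls` in the training phase lists the timestamped marker(s) written during preprocessing.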
**Finally**, I just put my account in conflict by submitting a quick test as an individual instead of as part of my team (sorry, an easy mistake after many fruitless test runs today trying to make this work). Can someone on the staff please resolve that, @brucehoff, so that I can submit files again?
Created by Russ Ferriday (snaggle)

@kikoalbiol: I believe I understand your difficulty: it seems you have submitted a single Docker container (a 'training only' submission) when you intended to submit a sequence of two steps (preprocessing and training). To do the latter, prepare a file with the following contents:
```
preprocessing=docker.synapse.org/syn7349799/vanilla_convolucional@sha256:ec32bfcda96dfd8a7ad802fa4e4d7d921e86866cf184cf67e4650507b0e883b6
training=docker.synapse.org/syn7349799/vanilla_convolucional@sha256:ec32bfcda96dfd8a7ad802fa4e4d7d921e86866cf184cf67e4650507b0e883b6
```
Upload this file to Synapse and then submit it to the challenge.
I hope this helps.
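Such a two-line submission file can also be generated from the shell. This is just a convenience sketch; the image digest below is copied from the example above, so substitute your own image reference.

```shell
# Generate the two-line submission file described above.
# IMAGE is the example digest from this thread; replace it with your own.
IMAGE='docker.synapse.org/syn7349799/vanilla_convolucional@sha256:ec32bfcda96dfd8a7ad802fa4e4d7d921e86866cf184cf67e4650507b0e883b6'
printf 'preprocessing=%s\ntraining=%s\n' "$IMAGE" "$IMAGE" > submission.txt
cat submission.txt
```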
> Re-use of /preprocessedData/ volume. Has anyone had success with it?
Yes! I submitted the following entry to the challenge twice:
https://www.synapse.org/#!Synapse:syn7214584
The content of the submitted file is:
```
preprocessing=docker.synapse.org/syn5644795/dm-python-example@sha256:fa73a6c00e0b0fb188b120e2363a7b5bc36eeb06c0465ab1e029c20667b5cd4e
training=docker.synapse.org/syn5644795/dm-python-example@sha256:fa73a6c00e0b0fb188b120e2363a7b5bc36eeb06c0465ab1e029c20667b5cd4e
```
The preprocessing step copies the files matching "1*.dcm" to /preprocessedData and the 'training' step reads those files and logs selected DICOM metadata.
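The copy step described above can be sketched roughly as follows. This is a local stand-in, not the actual example's code: `/tmp/trainingData` and `/tmp/preprocessedData` substitute for the challenge's mounted volumes, and the `.dcm` files are empty placeholders created just for the sketch.

```shell
# Rough sketch of the preprocessing step described above: copy the files
# matching "1*.dcm" into the shared volume. Local stand-in directories
# replace /trainingData and /preprocessedData.
SRC=/tmp/trainingData
DST=/tmp/preprocessedData
mkdir -p "$SRC" "$DST"

# Fake inputs for the sketch: one name matching "1*.dcm", one not.
touch "$SRC/18tu0ay3.dcm" "$SRC/2xxxx.dcm"

cp "$SRC"/1*.dcm "$DST"/
ls "$DST"
```

Only the files whose names match the `1*.dcm` glob end up in the destination volume; the training step can then iterate over them.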
The first submission received ID 7509663. The logs were uploaded here:
https://www.synapse.org/#!Synapse:syn7509699
The preprocessing log contains the 'dummy' output:
```
STDOUT: This is some content for the log file!!
```
The training log contains:
```
STDOUT: 18tu0ay3.dcm: file size (MB): 16.642625, PatientID: , StudyDate: , PatientAge: , SeriesDescription: R CC, Rows: 3328, Columns: 2560
STDOUT: 1452bi2c.dcm: file size (MB): 26.626634765625, PatientID: , StudyDate: , PatientAge: , SeriesDescription: R MLO, Rows: 4096, Columns: 3328
STDOUT: 1de3yw45.dcm: file size (MB): 16.642625, PatientID: , StudyDate: , PatientAge: , SeriesDescription: L CC, Rows: 3328, Columns: 2560
...
STDOUT: 1fni2ell.dcm: file size (MB): 16.642634765625, PatientID: , StudyDate: , PatientAge: , SeriesDescription: R MLO, Rows: 3328, Columns: 2560
STDOUT: 1gnnxlg2.dcm: file size (MB): 26.626625, PatientID: , StudyDate: , PatientAge: , SeriesDescription: L CC, Rows: 4096, Columns: 3328
STDOUT: 1dcqitgt.dcm: file size (MB): 26.6266171875, PatientID: , StudyDate: , PatientAge: , SeriesDescription: R CC, Rows: 4096, Columns: 3328
```
As you can see, the training phase received the files "1*.dcm" and was able to iterate through them.
I submitted the same entry a second time. The new submission ID is 7509793. The log files were uploaded here:
https://www.synapse.org/#!Synapse:syn7509800
In this case there are **no** preprocessing logs since that step was skipped. The training log content is the same as shown above for submission 7509663, which shows that it successfully accessed the cached result of the previous submission.
In short, we feel the system is working as expected. We are happy to continue the discussion, investigate anomalies, and/or clarify the prescribed system behavior as needed. Some comments on your explorations:
> I have tried:
> keeping the preprocess= line in the submission file constant,
This is the correct way to invoke the caching of preprocessed data, i.e. to 'skip' preprocessing that has already been completed.
> eliminating the preprocess line, just having a training line
Doing so will cause the submission not to use the preprocessing results of any previous submission.
> my test to see if I get data in the /preprocessedData/ volume is simply to
> touch /preprocessedData/`date -Iseconds`
We make the implicit assumption that rerunning a preprocessing container will produce the same result (or at least a result which is not preferable to the original). If you print the current date you may find you get the date of an earlier submission, since we may use the cached result of an earlier submission having the same preprocessing container.
Hope this helps!
I still can't access this directory: **/preprocessedData**.
It seems like a 'dream' to know clearly when it will be available.
> Can someone on the staff please resolve that?
To do so, we either have to delete the submission in question or mark it 'rejected' or 'invalid'. I chose to mark it 'invalid'. If this does not resolve the issue, kindly let us know. Also, you are welcome to streamline your submissions by scripting them in R or Python (rather than using the web interface each time). Here is a link to the Evaluation-queue submission page in the Synapse Python client reference:
http://docs.synapse.org/python/Client.html#synapseclient.Synapse.submit
Hope this helps.