I ran md5sum on the images, and many of them are exactly the same. These are the pixel matrices, not the DICOM files. Why?
How can different patients have EXACTLY the same image? Why?
**I SINCERELY suggest postponing this challenge until Dec 1st to sort all the issues out. Several weeks are not enough to fix them all. I suggest 2 months; that is the minimum time you will need. I hope that the organizers consider my suggestion as seriously as that of any other participant. I SINCERELY thank you and appreciate it in advance, OK?**
STDOUT: 0310e694c6c3d10f578f5c972f8f99b3 it3hrky3
STDOUT: 0310e694c6c3d10f578f5c972f8f99b3 mgt2dzf2
STDOUT: 0310e694c6c3d10f578f5c972f8f99b3 n8fuixiz
STDOUT: 0310e694c6c3d10f578f5c972f8f99b3 pvtwa8zc
STDOUT: 0310e694c6c3d10f578f5c972f8f99b3 xjgb4m2y
STDOUT: 0310e694c6c3d10f578f5c972f8f99b3 y47ymyqr
STDOUT: 0310e694c6c3d10f578f5c972f8f99b3 yq4uynlj
STDOUT: 0310e694c6c3d10f578f5c972f8f99b3 zkyb9ewg
STDOUT: 039042870097e1e1380f2fcaece005c0 2ml2oo5f
STDOUT: 039042870097e1e1380f2fcaece005c0 2qd1vkfq
STDOUT: 039042870097e1e1380f2fcaece005c0 3jr7t5ef
STDOUT: 039042870097e1e1380f2fcaece005c0 3vdyk2ck
STDOUT: 039042870097e1e1380f2fcaece005c0 gbnovepa
STDOUT: 039042870097e1e1380f2fcaece005c0 i9yjczqm
STDOUT: 039042870097e1e1380f2fcaece005c0 l381vdbk
STDOUT: 039042870097e1e1380f2fcaece005c0 lx5uh4d6
STDOUT: 039042870097e1e1380f2fcaece005c0 ofebmzcx
STDOUT: 039042870097e1e1380f2fcaece005c0 vw9j0clz
STDOUT: 039042870097e1e1380f2fcaece005c0 yy5zws1h
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 0pp1ai5r
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 57uzjw6e
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 6ga6mdg1
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 803ak0br
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 9adb2hnz
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 ce612dtn
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 dnbjt46u
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 mf9o0mbk
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 o1n86ro3
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 rv8gjced
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 ry55qcd4
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 s9d81zmh
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 typs5csy
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 udpjnmfq
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 v4at1a82
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 w222fwx0
STDOUT: 046db067f3d6d98fa6022bc99f2cfcf0 yagdhbcc
STDOUT: 0501ed4150074b17c8931b3ef8624069 0fnwy9iz
STDOUT: 0501ed4150074b17c8931b3ef8624069 2hg1o47e
STDOUT: 0501ed4150074b17c8931b3ef8624069 3v6j9ysu
STDOUT: 0501ed4150074b17c8931b3ef8624069 67wgnnyr
STDOUT: 0501ed4150074b17c8931b3ef8624069 99jve6me
STDOUT: 0501ed4150074b17c8931b3ef8624069 9gooipox
STDOUT: 0501ed4150074b17c8931b3ef8624069 9m7ca3ur
STDOUT: 0501ed4150074b17c8931b3ef8624069 af5mfa0y
STDOUT: 0501ed4150074b17c8931b3ef8624069 c6thwj9c
STDOUT: 0501ed4150074b17c8931b3ef8624069 g0l0vt8c
STDOUT: 0501ed4150074b17c8931b3ef8624069 ki8x1azu
STDOUT: 0501ed4150074b17c8931b3ef8624069 ksbhi8w8
STDOUT: 0501ed4150074b17c8931b3ef8624069 ns6dfl6b
STDOUT: 0501ed4150074b17c8931b3ef8624069 o5rqyn9z
STDOUT: 0501ed4150074b17c8931b3ef8624069 tb0opboj
STDOUT: 0501ed4150074b17c8931b3ef8624069 wuoll8wj
STDOUT: 0501ed4150074b17c8931b3ef8624069 xttrmyfm
STDOUT: 0501ed4150074b17c8931b3ef8624069 y0emz45f
STDOUT: 0abab22622db16454fa9d217a051e4ab 0hh619cg
STDOUT: 0abab22622db16454fa9d217a051e4ab 1oa60wxn
STDOUT: 0abab22622db16454fa9d217a051e4ab 4souz9f4
STDOUT: 0abab22622db16454fa9d217a051e4ab 4zyddh0m
STDOUT: 0abab22622db16454fa9d217a051e4ab 7b0qp0s1
STDOUT: 0abab22622db16454fa9d217a051e4ab 8f42p4b9
STDOUT: 0abab22622db16454fa9d217a051e4ab 8rr0quwa
STDOUT: 0abab22622db16454fa9d217a051e4ab br9cfbo4
STDOUT: 0abab22622db16454fa9d217a051e4ab kexkhsjx
STDOUT: 0abab22622db16454fa9d217a051e4ab nyakblur
STDOUT: 0abab22622db16454fa9d217a051e4ab p07af93s
STDOUT: 0abab22622db16454fa9d217a051e4ab pnlf0suk
STDOUT: 0abab22622db16454fa9d217a051e4ab tms1u1wm
STDOUT: 0abab22622db16454fa9d217a051e4ab uk5fzrog
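For anyone who wants to reproduce a check like the listing above, here is a minimal sketch. It assumes the pixel matrix of each image has already been read into a raw byte buffer; how you extract those bytes from the DICOM files is not shown and depends on your reader. The function name `group_duplicates` and the toy byte values are made up for illustration; only the IDs are taken from the listing.

```python
import hashlib
from collections import defaultdict


def group_duplicates(pixel_bytes_by_id):
    """Map each MD5 digest to the sorted IDs whose pixel bytes hash to it.

    Only digests shared by two or more IDs are returned.
    """
    groups = defaultdict(list)
    for image_id, pixel_bytes in pixel_bytes_by_id.items():
        digest = hashlib.md5(pixel_bytes).hexdigest()
        groups[digest].append(image_id)
    return {d: sorted(ids) for d, ids in groups.items() if len(ids) > 1}


if __name__ == "__main__":
    # Toy stand-ins for raw pixel buffers (hypothetical values).
    sample = {
        "it3hrky3": bytes([0, 1, 2, 3]),
        "mgt2dzf2": bytes([0, 1, 2, 3]),
        "2ml2oo5f": bytes([9, 9, 9, 9]),
    }
    for digest, ids in group_duplicates(sample).items():
        for image_id in ids:
            print(digest, image_id)
```

Hashing only the pixel bytes, rather than the whole file, is what lets two files with different headers still show up as the same image.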
Created by Yuanfang Guan (yuanfang.guan)
@tschaffter
So does this mean we need to redo the preprocessing step for the competitive phase?
This was the response I got a while ago:
"That's the exact purpose of the directory "/processedData" where you can save up to 10 TB of data. The IT infrastructure that we have developed for this challenge is modular and enables to separate the pre-processing from the training phase (see Submitting Models)."
Thanks.
Hi Yuanfang,
I can't confirm whether it is stated anywhere that a substitute dataset is currently in use. If not, this is a communication mistake on our side and we will rectify it. We decided to use a substitute training set for the following reasons:
- the integrity of the dataset would have been put at too much risk by the numerous bugs discovered during the Open Phase
- it would not be fair to the participants who are not joining the (optional) Open Phase if you could start training a meaningful model with an unlimited time quota
The Open Phase has been devised to 1) stress test the IT infrastructure, which is more complex than in previous DREAM Challenges, and 2) give participants sufficient time to get used to the architecture of this challenge. Regarding the reasons why a substitute is in place, I hope that you will agree that it was the wise thing to do.
As you have noticed, changes have been made to the format of the data and the IT infrastructure based on feedback from the participants. A list of all the changes that have been made will be released shortly as a set of guidelines. These guidelines will be updated each time a change is made, so keep an eye on this reference document once we publish it. To answer your question regarding the format of the field _subjectId_, you can assume that all the values are 8 characters long.
Thanks!
Thanks for the explanation, Thomas.
**This information is not revealed to participants?** Or have I missed it? That is very misleading, because I am sure some of us are assuming this is real data and trying very hard to pre-train our models, and some have probably already filtered 10 models out.
By misleading I mean, e.g., **will these names remain the same in the real data?** I assume most of us are now writing code that only recognizes names with **EIGHT** letters. If it is 10 in the end, we will probably have to waste 5 submissions just to figure out that the name format has changed. Or even worse, someone may never figure it out and then train on an empty set throughout. Using fake data looks like a penalty for everyone who is in the Open Phase.
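One way to guard against the empty-set scenario is to check the ID format explicitly and fail loudly instead of silently dropping names. The sketch below is only an illustration: the helper `split_subject_ids`, the default length of 8, and the lowercase-letters-and-digits alphabet are assumptions inferred from the IDs posted in this thread, not a format confirmed by the organizers.

```python
import re

# Assumed alphabet: lowercase letters and digits, as in the IDs seen in this thread.
ID_PATTERN = re.compile(r"[a-z0-9]+")


def split_subject_ids(ids, expected_len=8):
    """Split IDs into (matching, unexpected) lists.

    Raises ValueError if no ID matches, so a format change cannot
    silently produce an empty training set.
    """
    ok, odd = [], []
    for subject_id in ids:
        if ID_PATTERN.fullmatch(subject_id) and len(subject_id) == expected_len:
            ok.append(subject_id)
        else:
            odd.append(subject_id)
    if not ok:
        raise ValueError(
            f"no subject ID has the expected length {expected_len}; "
            f"first unexpected values: {odd[:5]}"
        )
    return ok, odd


if __name__ == "__main__":
    ok, odd = split_subject_ids(["mf9o0mbk", "uk5fzrog", "tms1u1wm"])
    print(len(ok), "matching,", len(odd), "unexpected")
```

Logging the `odd` list in a submission would show a format change after one run rather than five.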
mf9o0mbk
uk5fzrog
tms1u1wm
> many many of them are exactly the same images, the pixel matrices (not dicom), why?
Because we have installed a _substitute_ training set on the cloud for the duration of the Open Phase, which we generated using material from the Pilot Set. The substitute training set is similar in size to the Challenge training set. As a reminder, one of the objectives of the Open Phase is to allow participants to develop an inference method that runs smoothly on the cloud and to evaluate the runtime characteristics of their model. This is not a phase that you can use to train your model without a time quota.
> how can different patients have EXACTLY the same image???why?
See above response.
> I hope that the organizers consider my suggestion as seriously as that of any other participant.
Thank you for your valuable past and future feedback. Please rest assured that we are doing our best to ensure that the cloud runs smoothly at the opening of the Leaderboard Phase.