Hi @RA2DREAMChallengeParticipants,
@dongmeisun and team have put together the full training dataset for release on U-BRITE. I've updated the access instructions on the wiki here under the "Training Data" subheading: https://www.synapse.org/#!Synapse:syn20545111/wiki/597243
A few notes:
- This dataset replaces the previously released "example.zip" and "example_v2.zip" datasets. It incorporates some adjustments to images that were suggested by participants.
- Please note that your submitted method must still be able to be trained using the training data mounted into your Docker container after submission.
- As a reminder, this dataset cannot be used for anything other than the RA2 DREAM Challenge.
Thanks,
Robert
Thank you for the heads up, I just fixed it on the How to Participate page! That's what I get for posting the link in 3 places on the wiki.
Hey Robert,
Before I came to this thread, I was looking at the initial instructions page - it still points to the old download link: http://ubrite.informatics.uab.edu/ra2_dream/example.zip
Just a heads up!
Yes, that is correct. You can use all of the images. The organizers initially planned to release just 50 but decided to release the full training set.
Cheers,
Robert
There are 368 * 4 images instead of 50 images in the download link. Can I use all 368 * 4 = 1472 samples for training?
Hi @arielis,
These were removed because they were suspected to be duplicates.
Cheers,
Robert
Hi @allawayr ,
Thanks for the update,
I notice that some individuals are not included in the new dataset, notably:
UAB082, UAB095, UAB201, UAB263, UAB296
Is there anything wrong with their scores, or were they removed only because of duplicates or flipped feet/hands (which I've already corrected manually)?
Best,
Ariel
Hi all,
This dataset has been corrected and posted on U-BRITE (http://ubrite.informatics.uab.edu/ra2_dream/training.zip). We're preparing a broader announcement for folks that may not be following this conversation.
All the best,
Robert
Hi all,
I just double-checked the latest version of the dataset for correct left/right hand/foot labels as well as for correct jpg formatting. We should be all good to go with the exception of one set of four images that I have a question about. Once that is resolved, I will zip up the new dataset and get it loaded onto U-BRITE. I anticipate having this dataset published to U-BRITE tomorrow morning (Pacific time).
Re @muschellij2 's question:
>What level of confidence do the challenge organizers have that the test set will be similar/fixed in the same way?
I will work with the challenge organizers in the coming days to ensure that the issues identified in the training set are either not present in the leaderboard and test datasets or, if they are, corrected. Thank you again for your help in identifying these issues in the training set, and for your patience as we resolve them.
@muschellij2 @allawayr
Thanks for your patience and for your time in helping us.
I have already fixed the data and have asked Robert to double-check it now. It should be available soon.
I can say for sure that the leaderboard data and test data will be of the same quality, since I have found some approaches to guarantee that.
Sorry for that and thanks again.
1. When would the data set be assumed to be finalized?
2. What level of confidence do the challenge organizers have that the test set will be similar/fixed in the same way?
@muschellij2
Sorry, you are right. There are still some flipped images. I will correct them.
Thanks a lot. I corrected all of them in the updated data.
UAB296 does look stretched; you are right. I didn't change it since it is the original image and I didn't see any other defect.
Some images come from the same patient at different time points. That's why some of them look exactly the same: they are the same patients.
Didn't get to look at the new data, but found all of these were left/right flipped:
UAB030-LF
UAB440-LF
UAB090-LH
UAB093-LH
UAB049-LH
UAB028-LH
UAB031-RF
UAB049-RF
UAB394-RF
UAB049-RH
UAB090-RH
UAB093-RH
UAB394-RH
The number of cases indicates that people probably shouldn't rely on the left/right designation on the image.
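If it helps for visual comparison, here is a minimal sketch (assuming the `magick` R package is installed; the file name is just an illustrative example from the list above) for mirroring a suspected flipped image:
```r
# Sketch only: mirror a suspected left/right-flipped image for side-by-side
# comparison. The path below is illustrative, not a recommendation to rename files.
library(magick)
img <- image_read("UAB030-LF.jpg")
mirrored <- image_flop(img)               # flip around the vertical axis
image_write(mirrored, "UAB030-LF_mirrored.jpg")
```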
UAB296-LH - looks stretched
UAB081-RF and UAB082-RF look like the exact same foot, not different patients
UAB081-RH and UAB082-RH
UAB201-RF and UAB200-RF look like the exact same foot, not different patients
UAB263-RF and UAB264-RF
UAB277-RF and UAB278-RF
Apologies for the delay in the announcement - this was fixed yesterday.
If you would like to confirm that you have the latest version of the dataset, you can run `md5sum` or otherwise calculate the md5. The expected result is `623f952c4dd4b25443fc765ea5ceb10e` for `training.zip`
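For example, in R (a minimal sketch using base `tools`; the command-line `md5sum` works just as well):
```r
# Verify the downloaded archive against the expected checksum using base R.
expected <- "623f952c4dd4b25443fc765ea5ceb10e"
observed <- unname(tools::md5sum("training.zip"))
if (identical(observed, expected)) {
  message("training.zip matches the expected checksum")
} else {
  warning("Checksum mismatch - consider re-downloading training.zip (got ", observed, ")")
}
```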
Have a great weekend,
Robert
Thanks. Robert found some more, like yours. We are fixing them.
These 3 are all PNG:
UAB124-LF.jpg
UAB124-LH.jpg
UAB124-RF.jpg
```
macbook-pro-8:raw johnmuschelli$ file UAB124-LH.jpg
UAB124-LH.jpg: PNG image data, 948 x 1616, 8-bit/color RGBA, non-interlaced
macbook-pro-8:raw johnmuschelli$ file UAB124-RF.jpg
UAB124-RF.jpg: PNG image data, 798 x 1485, 8-bit/color RGBA, non-interlaced
```
Sorry, will fix it asap. Thanks.
@muschellij2
I am able to reproduce this on my end:
```
(base) Roberts-MacBook-Pro-3:train rallaway$ file UAB124-LF.jpg
UAB124-LF.jpg: PNG image data, 836 x 1267, 8-bit/color RGBA, non-interlaced
```
I will work with the UAB team to get this fixed. Thanks!
@muschellij2
Thanks. Please make sure it's UAB124-LF and only this one. I did a double check and it's a jpeg file. I don't know what caused the error, but I will be happy to redo and reload it.
UAB124-LF.jpg is likely a PNG that has been renamed to a JPG (as per https://stackoverflow.com/a/11310281/2549133):
```
jpeg::readJPEG(path)
Error in jpeg::readJPEG(path) :
JPEG decompression error: Not a JPEG file: starts with 0x89 0x50
```
whereas
```
png::readPNG(path)
```
works fine.
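A minimal workaround sketch, assuming the `png` and `jpeg` packages are installed: check the file's magic bytes and dispatch to the matching reader, regardless of the extension.
```r
# Sketch: read an image by its magic bytes rather than its file extension.
read_xray <- function(path) {
  magic <- readBin(path, what = "raw", n = 2)
  if (identical(magic, as.raw(c(0x89, 0x50)))) {
    png::readPNG(path)        # PNG signature starts with 0x89 0x50
  } else if (identical(magic, as.raw(c(0xff, 0xd8)))) {
    jpeg::readJPEG(path)      # JPEG files start with 0xFF 0xD8
  } else {
    stop("Unrecognized image format: ", path)
  }
}
```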
Hi Lars,
>Around 50 training examples, 3 or 4 orders of magnitude less than the norm for successful neural network applications
There are 373 training samples (patients) now available, with four images per patient: 1492 training images for Subchallenge 1, and four appendage cohorts of 373 images each for SC2/3.
>Mislabeled cases in that set
These are being corrected, and similar changes are being made to the leaderboard and test datasets.
>No ground truth for joint localization, thus requiring either extensive manual labeling effort on the part of each solver, or a novel unsupervised learning technique that can work on a very small number of examples
Speaking for myself, and definitely not on behalf of the organizing committee, I am not convinced that joint localization information is a requirement to train or test in this challenge. It is definitely not a prerequisite to participate in SC1. It certainly could be helpful in SC2/SC3. As you have already suggested, an unsupervised approach could be employed to obviate segmentation entirely. Other possibilities include: developing a segmentation model from a small subset of annotated training images (I saw a paper that used 10 images), teaming up with other participants to create a comprehensive bounding-box dataset from the training data, or using spatial information and other prior knowledge to identify the joints without bounding boxes.
This paper from 2002 shows an example of that final option: https://www.ncbi.nlm.nih.gov/pubmed/11929022 (I'm not sure if I can provide a pdf of that for copyright reasons but will double-check and post it if I can get the right licensing).
I'm sure there are other options I'm not thinking of here. You can always create or join a team to discuss the aforementioned and other options to determine the optimal approach given the limitations of the training data. :)
For RA2, we have
- Around 50 training examples, 3 or 4 orders of magnitude less than the norm for successful neural network applications
- Mislabeled cases in that set
- No ground truth for joint localization, thus requiring either extensive manual labeling effort on the part of each solver, or a novel unsupervised learning technique that can work on a very small number of examples
I participated this year in a NIST-sponsored challenge, [OpenCLIR19](https://www.nist.gov/itl/iad/mig/openclir-evaluation), with an unrealistic training objective (translate Swahili speech to English text with 50,000 labelled speech examples, when the commercial state of the art is 1 million labelled examples) and a modest prize ($20,000). 45 teams started on this problem. That quickly dropped to 7, and then 2, for Swahili speech-to-text. Nobody claimed the speech-to-text prize.
If you set up an unrealistic problem, you will get nothing done except some free cleaning of the dataset.
@arielis
Agreed, this is not an easy challenge. We hope participants can develop new machine learning algorithms or use existing methods to solve the problem.
Good luck, and let me know if you have any suggestions. Thanks a lot.
@dongmeisun, hopefully.
But it is quite a challenge to train a model on so few samples.
I believe so, since they are well trained.
Thanks @dongmeisun for your explanations.
Are you sure that all the radiologists understood it this way?
In the training data, it seems that scores above 5 in the feet are exceptionally rare.
Out of over 4400 erosion scores documented in the feet, only 20 are higher than 5.
|Score|Count|
|:----|:----|
|6|5|
|7|4|
|8|1|
|9|4|
|10|6|
Remember, you can only find scores higher than 5 for erosion in the feet, not in the hands.
@arielis
I checked both feet; the MTP_E_5 of both feet shows severe erosion with scores of 6. These are correct since you can see erosion on both sides (for example, a 6 could be a 4 on one side plus a 2 on the other). Please refer to Figure 2 on the wiki page. We scored each side on a 0-5 range, giving a total range of 0-10.
see https://www.synapse.org/#!Synapse:syn20545111/wiki/597243
@dongmeisun
I don't understand how these scores are correct.
It is explicitly stated in the Data section that: "Each joint or joint region is assigned a score from 0 to 4 for narrowing **and 0 to 5 for erosion.**"
If you look at the first row I reported, for patient UAB318, a score of 6 was given in column LF_mtp_E__5, which is the score for erosion of the mtp_5 joint in **left foot**.
You also have a score of 6 for RF_mtp_E__5, which is the same joint in the **right foot**.
So for the same joint in each foot you have an erosion score higher than 5.
Hi all,
Based on your discussion, I think I should spend time correcting the mislabelled images. I will do that today. The scores higher than 5 are correct, since we scored both sides of the foot for erosion.
Thanks. I will try to update those mislabelled images today.
Thanks @allawayr
I've also noticed a problem with scores out of range (higher than 5):
|Patient_ID|LF_mtp_E__1|LF_mtp_E__2|LF_mtp_E__3|LF_mtp_E__4|LF_mtp_E__5|RF_mtp_E__2|RF_mtp_E__3|RF_mtp_E__5|
|:---------|:----------|:----------|:----------|:----------|:----------|:----------|:----------|:----------|
|UAB318|0|0|0|0|6|0|0|6|
|UAB634|1|4|5|3|7|3|0|7|
|UAB317|0|0|0|0|6|0|0|10|
|UAB503|0|0|0|0|0|0|0|7|
|UAB564|0|0|2|2|5|0|2|7|
|UAB504|5|0|3|6|2|0|5|8|
|UAB084|3|2|2|9|9|9|9|6|
|UAB490|10|10|10|10|10|0|0|1|
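For reference, a quick sketch of how one could list these rows in R (the file name `training_scores.csv` is hypothetical; the column names follow the table above):
```r
# Sketch: list patients with any foot (MTP) erosion score above 5.
# "training_scores.csv" is a placeholder for the released scores file.
scores <- read.csv("training_scores.csv", check.names = FALSE)
foot_erosion <- grep("_mtp_E__", names(scores), value = TRUE)
flagged <- scores[rowSums(scores[foot_erosion] > 5, na.rm = TRUE) > 0,
                  c("Patient_ID", foot_erosion)]
print(flagged)
```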
@lars.ericson, yes, I agree with you - the images and training values should be corrected, rather than keeping mislabeled images, even if the mislabeled images "match" the correct numerical training values, because it will affect any methods that train the different types of appendages as "independent" cohorts (or otherwise take the appendage type into account when training).
I will let you know when I receive an updated training dataset to post on U-BRITE.
Is the currently published dataset mislabelled as above? Is the ground truth incorrect? It seems clear we shouldn't train against mislabelled images with incorrect ground truth. Should we manually exclude these from our training? Or will you update the training data if it is still wrong?
@arielis
Thanks for your info. We noticed the problem and have already double-checked. The reason for the switched feet or hands is that we mislabelled them and scored them that way at the very beginning. If I put them in the correct order, I would have to change the scores of each set, so I left them the way they are.
@allawayr Thanks.
@dongmeisun can you double-check these and fix if necessary?
@arielis thank you for the info!
There are some swapped feet in the updated dataset:
- UAB031-LF is a right foot and UAB031-RF is a left foot
- UAB049-LF is a right foot and UAB049-RF is a left foot
- UAB394-LF is a right foot and UAB394-RF is a left foot
- UAB440-LF is a right foot and is the same as UAB440-RF
- UAB030-LF is a right foot and has a similar look as UAB030-RF
The PNG you mentioned is now a JPG again, and the numbering issue you highlighted in your initial post has been fixed. The updated training dataset is available from the same location as before.
Thanks again for pointing this out.
@dongmeisun can you fix this? UAB054-RH.png should be a jpg. :)
Thanks!
Yes - I see that. Oddly, UAB054-RH.png is a *PNG*.
Thanks @muschellij2 for catching these! Yes, the 1s in the filenames should definitely be removed.
Re UAB054, I see all four files (UAB054-LF.jpg UAB054-LH.jpg UAB054-RF.jpg UAB054-RH.png) in the zipped training data - did you mean a different ID?
Thanks,
Robert
New data looks great. Just curious if there was a reason some people were missing some feet/hands. UAB054 - no left foot. Also, UAB403-R1 should be UAB403-RH
|image |image_type |id |body_type |body_side |
|:-------------|:----------|:------|:---------|:---------|
|UAB054-LF.jpg |LF |UAB054 |foot |left |
|UAB054-LH.jpg |LH |UAB054 |hand |left |
|UAB054-RF.jpg |RF |UAB054 |foot |right |
|UAB403-LF.jpg |LF |UAB403 |foot |left |
|UAB403-LH.jpg |LH |UAB403 |hand |left |
|UAB403-R1.jpg |R1 |UAB403 |foot |right |
|UAB403-RF.jpg |RF |UAB403 |foot |right |
We have these
UAB403-R1.jpg
UAB405-RF1.jpg
UAB405-RH1.jpg,
from which the 1 should likely be removed by those analyzing the data.
Great, thank you for this, as well as for continually listening to feedback and making these changes to (hopefully) improve outcomes.
Yes, this is totally fine. You can see @james.costello's response regarding this here:
https://www.synapse.org/#!Synapse:syn20545111/discussion/threadId=6346 (it was probably posted at the same time as you were typing your message for this thread :) )
Cheers,
Robert
@allawayr Hi - thank you for this. A question with respect to "your submitted method must still be able to be trained using the training data mounted into your Docker container after submission":
Could you clarify whether we are allowed to manually label the joints in the training data, and use that to pre-train a model that can detect joints? We would then upload this model to the Docker container and use it to detect joints.
Detected joints in the training data are used to train a model that predicts the values (this happens in the Docker image). We then use the pre-trained model to detect joints in the test data, and use the model trained in the Docker image to predict the outcome?
So the classification model is trained from scratch in the Docker image, but a model that can detect joints is uploaded ready to go.
Thank you