Per the Q&A, I will need to label the locations of the joints in the training set images myself in order to train a neural net to recognize joints. However, also per the Q&A, the Docker container will need to train itself on an unmodified dataset as part of model validation. In current practice, multi-object categorization and localization with neural nets almost always starts from a training set supplied with categories and bounding boxes as input. So on that basis, either I have to sneak a pretrained model into my Docker image and then just pretend to train on the supplied images afterward, or standard neural network practice leaves me basically no way to train my net to recognize the joints. Is that correct?
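For concreteness, this is roughly the per-image annotation that off-the-shelf detectors expect as training input (a sketch only: the field names follow the COCO convention, and the filename, category IDs, and coordinates are invented, since nothing like this ships with the RA2 training images today):

```python
# Sketch of the kind of per-image annotation standard object detectors consume.
# COCO-style field names; the filename, category IDs, and coordinates are made up.
example_annotation = {
    "image_id": "UAB001-LH.jpg",        # hypothetical filename
    "annotations": [
        {
            "category_id": 1,            # e.g. 1 = a specific finger joint (illustrative ID)
            "bbox": [412, 233, 64, 64],  # [x, y, width, height] in pixels
        },
        {
            "category_id": 2,            # e.g. 2 = another joint class
            "bbox": [398, 150, 48, 48],
        },
    ],
}
```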

Created by Lars Ericson (@lars.ericson)
If you're not providing all of the data required for ground truth on the training set, then you should delete retraining of the Docker image as a step in scoring. I would go further and say that you should add a prize for adding the missing required segmentation ground truth to the training set, since that is tedious, difficult work that adds no insight to the problem, and it is work that people usually have to pay someone to do, either via Amazon Mechanical Turk or through a firm that specializes in image annotation. If the challenge sponsors ignore these practicalities, then they run the risk of having few active participants in the challenge: namely, only people who need an article credit, are skilled in machine learning, are able to risk working long hours for a contingent paycheck, and don't mind spending most of their time on the challenge doing a menial chore. I could be wrong about this; it's just my guess that that will be a fairly small set of people.
@lars.ericson Whether or not to generate and provide these masked images is a decision that will have to be made by the challenge organizers who are providing the data, i.e. @LouBridges from UAB, cc'ed above. I don't have any additional info as to whether they have the bandwidth or ability to provide this information for this challenge. Best, Robert
@allawayr please let me know if you get any confirmation on this. There is no sense putting a training phase into the Docker eval when there is really no choice but for us to manually augment the training ground truth with bounding masks for the joints in the training set. This challenge is about damage in joints. The setup is basically identical to a challenge set up by CMU (https://xview2.org/) on classifying hurricane damage to houses. There are two steps: locate the houses in the image, then classify the damage. It is literally an identical challenge. They publish a working baseline solution (https://github.com/DIUx-xView/xview2-baseline), which breaks the problem into separate localization and damage-classification models, each of which needs to be trained. The localization sub-model's training set comes with ground truth bounding masks. They are necessary: you can't generate them on the fly in an automatic way. The location ground truth is marked by hand by humans, and you can't get around it. This is with CMU setting up the challenge (top of the line for machine learning) and DIU backing it (top of the line DoD funding organization). If they are providing bounding boxes in the training set, and a model solution, we should too, to follow the best and most pragmatic practice in this problem space.
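To make the two-stage structure concrete, here is a rough sketch of the shape of the xview2 baseline transposed to joints (the function names and the box format are hypothetical placeholders, not the actual xview2-baseline API):

```python
# Rough sketch of a two-stage pipeline: a localization model feeding a
# damage classifier. joint_detector and damage_classifier are hypothetical
# callables standing in for separately trained models.
import torch

def score_joints(image, joint_detector, damage_classifier):
    """Stage 1: locate joints; Stage 2: classify damage on each joint crop."""
    with torch.no_grad():
        boxes = joint_detector(image)                   # trained on hand-labelled boxes/masks
        scores = {}
        for joint_id, (x, y, w, h) in boxes.items():
            crop = image[..., y:y + h, x:x + w]         # crop the detected joint region
            scores[joint_id] = damage_classifier(crop)  # per-joint damage score
    return scores
```

The point is that the first stage cannot be trained at all without hand-labelled location ground truth.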
A better solution would be to give everybody, for each training set image, a file of bounding boxes or bounding masks for each joint, labelled with the joint ID and the score. This kind of ground truth is the norm for supervised learning in image processing. Otherwise you are calling for something called unsupervised learning, which would turn this into a research project on the distant frontiers of machine learning (i.e. a computer science and mathematics project), as opposed to a research project on the application of the current state of the art in machine learning to this problem domain (i.e. a biotech project).
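As a concrete illustration, here is one possible shape for that per-image ground truth: a CSV with one row per joint (the column names, joint IDs, and values below are illustrative only, not an official challenge format):

```python
# One possible layout for the requested ground truth: a row per joint with its
# bounding box and score. All identifiers and values below are invented examples.
import csv
import io

example = """image,joint_id,x,y,width,height,score
UAB001-LH.jpg,LH_PIP2,398,150,48,48,0
UAB001-LH.jpg,LH_MCP2,412,233,64,64,2
"""

for row in csv.DictReader(io.StringIO(example)):
    print(row["image"], row["joint_id"], row["score"])
```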
Hi Lars, In my mind, it would not be an issue to pre-train a segmentation/object detection model on the public subset of RA2 DREAM data and have that trained model as a component of your Docker image (e.g. using it to detect joints in the Cheaha train and test datasets), but I think @james.costello @LouBridges @jakechen should confirm this. Cheers, Robert
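A minimal sketch of what that could look like in practice, assuming PyTorch/torchvision, a made-up weights path, and a placeholder class count (none of these names are challenge requirements):

```python
# Sketch of the approach described above: pre-train a joint detector offline on
# the public RA2 subset (using hand-labelled boxes), bake the weights into the
# Docker image at build time, and simply load them when the container runs.
import torch
import torchvision

NUM_JOINT_CLASSES = 16  # placeholder; depends entirely on the labelling scheme used

def load_pretrained_joint_detector(weights_path="/model/joint_detector.pt"):
    # weights_path is a hypothetical location COPY'd into the image by the Dockerfile.
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
        num_classes=NUM_JOINT_CLASSES
    )
    model.load_state_dict(torch.load(weights_path, map_location="cpu"))
    model.eval()
    return model
```

Whether the container then fine-tunes on the mounted training data or just uses the frozen weights is exactly the question the organizers would need to rule on.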
