I was expecting that the metadata in the 'Digital Mammography Model Training' lane would include the information as to whether there was a tumour present.

The two files in /metadata contain:

```
/metadata/images_crosswalk.tsv
subjectId examIndex imageIndex view laterality filename
```

```
/metadata/exams_metadata.tsv
subjectId examIndex daysSincePreviousExam cancerL cancerR invL invR age implantEver implantNow bcHistory yearsSincePreviousBc previousBcLaterality reduxHistory reduxLaterality hrt antiestrogen firstDegreeWithBc firstDegreeWithBc50 bmi race
```

I was expecting, as with the sample data, that the first file would, instead, contain:

```
/metadata/images_crosswalk.tsv
subjectId examIndex imageIndex view laterality filename cancer
```

Are we supposed to use the fields 'cancerL cancerR' to infer whether a particular image will be showing a tumour? Or is it an error that the metadata does not have the 'cancer' field?

It would be really useful if the /metadata/images_crosswalk.tsv could have the 'cancer' field, if that was possible.

If I'm misunderstanding the reason for the training lane, please let me know, but my understanding is that we should be able to examine the images knowing which have, and do not have, tumours.

Created by Peter Brooks fustbariclation
Thank you very much, Thomas, that makes good sense.
Hi Peter,

The "cancer" column was removed during the Open Phase (see Section "Important Information" in the description of the [Pilot Data](https://www.synapse.org/#!Synapse:syn6174174)).

> Are we supposed to use the fields 'cancerL cancerR' to infer whether a particular image will be showing a tumour?

Yes. Please note that "cancerL" and "cancerR" don't mean that a tumour has been observed by a radiologist, but that a cancer has been tissue-diagnosed in that breast within 12 months (see Challenge Dictionary).

> It would be really useful if the /metadata/images_crosswalk.tsv could have the 'cancer' field, if that was possible.

If you want, I have written a Python script that generates a label file at the image level (filename \TAB cancer) from the file `images_crosswalk.tsv`. You can find the script [here](https://github.com/tschaffter/dm-docker/tree/master/dm-preprocess-png). As with all the examples that we provide, please check that it works as expected and let us know if you encounter any issues.

> If I'm misunderstanding the reason for the training lane, please let me know, but my understanding is that we should be able to examine the images knowing which have, and do not have, tumours.

Please refer to the [Challenge Questions](https://www.synapse.org/#!Synapse:syn4224222/wiki/401749). In Sub-challenge 1, only one exam per subject is presented and we ask you to make a prediction for each breast of the subject (exams metadata not available). In Sub-challenge 2, we provide a sequence of exams for each subject (sometimes the sequence includes only one exam) in addition to exams metadata, and we ask you to make a prediction for each breast of the subject for the last exam of the sequence.

Hope this helps!
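For anyone who wants image-level labels when `images_crosswalk.tsv` has no `cancer` column, here is a minimal sketch of one way to derive them from `cancerL` / `cancerR`. It is not the script linked above. It assumes that both files are tab-separated with a header row, that `laterality` holds `L` or `R`, and that an image's label is simply the `cancerL` or `cancerR` value of its exam; the function name `image_labels` is made up for illustration. Please check these assumptions against the Challenge Dictionary before relying on it.

```python
import csv
import io
import sys


def image_labels(crosswalk_file, exams_file):
    """Yield (filename, cancer) pairs, one per image.

    Joins the two tables on (subjectId, examIndex) and picks cancerL or
    cancerR according to the image's laterality. Values are returned as
    the raw strings found in the exams file.
    """
    # Index exams by (subjectId, examIndex) for the join.
    exams = {}
    for row in csv.DictReader(exams_file, delimiter="\t"):
        exams[(row["subjectId"], row["examIndex"])] = row

    for row in csv.DictReader(crosswalk_file, delimiter="\t"):
        exam = exams.get((row["subjectId"], row["examIndex"]))
        if exam is None:
            continue  # no metadata for this exam: skip rather than guess
        side = row["laterality"].strip().upper()
        if side not in ("L", "R"):
            continue  # assumption: only L and R are valid laterality values
        yield row["filename"], exam["cancer" + side]


if __name__ == "__main__":
    # Paths as given in the question above.
    with open("/metadata/images_crosswalk.tsv", newline="") as cw, \
         open("/metadata/exams_metadata.tsv", newline="") as ex:
        writer = csv.writer(sys.stdout, delimiter="\t")
        for filename, cancer in image_labels(cw, ex):
            writer.writerow([filename, cancer])
```

Run as a script, it prints one `filename \TAB cancer` line per image. Keep in mind that the label is per breast, not per image: every view of a breast gets the same value, whether or not a tumour is visible in that particular image.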

Fields in metadata - in the 'Digital Mammography Model Training' lane