The predictions file is important: the better we understand its format, the easier it is to figure out how the evaluation will be performed.
The TSV file looks like this:
```
subjectId laterality confidence
1 L 0.01
1 R 0.05
2 L 0.00
2 R 0.01
```
For a patient evaluation, I would like to know (and clarify) which of the following statements are true or false:
1.- All patients **MUST** have both lateralities (Left and Right).
2.- A subjectId can have multiple orientations for one laterality.
3.- The confidence **MUST** be a consensus value summarizing all image orientations for a given laterality.
4.- However many orientations a given patient laterality has, they should be expressed as JUST one number.
5.- In trial 2, previous examinations should be packed into the same output comparison.
6.- A subjectId must always have two lateralities and two confidence numbers.
Created by Kiko Albiol kikoalbiol
Dear Kiko,
The rules I pasted above can be found at the bottom of this [page](#!Synapse:syn4224222/wiki/401759). Please let us know if you need any further assistance.
Best,
Thomas
Hi Kiko,
> The case described above is not forbidden and we should assign a number for L and R, even if there is a missing orientation for L
Just to clarify, we are asking your inference method to generate a prediction for each breast of a subject for which data are available. It doesn't matter whether the data available for the **left breast** of **subject A** include only one MLO view OR a CC view OR both OR anything else. For all these configurations, the expected prediction line should look like "**A** \TAB **L** \TAB ".
About 2 and 4.
A study can have different orientations for the same laterality.
From your answer I assume that we should pack the values from all orientations of the same laterality into a single number.
E.g.
subjectId:
Laterality **R** Orientations **MLO** and **CC**
Laterality **L** Orientations **MLO**
The case described above is not forbidden and we should assign a number for L and R, even if there is a missing orientation for **L**
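A minimal sketch of packing several orientations into one number per laterality, using the example above. The per-image scores and the name `aggregate_views` are hypothetical, and taking the maximum across views is only an assumption: nothing in this thread prescribes how the views should be combined, only that each subjectId + laterality gets a single confidence.

```python
from collections import defaultdict

def aggregate_views(view_scores, reducer=max):
    """Collapse per-view scores into one confidence per (subjectId, laterality).

    reducer is an assumption (max by default); the challenge rules quoted in
    this thread do not say how views must be combined.
    """
    grouped = defaultdict(list)
    for subject_id, laterality, _view, score in view_scores:
        grouped[(subject_id, laterality)].append(score)
    return {sample: reducer(scores) for sample, scores in grouped.items()}

# Hypothetical per-image scores: R has MLO and CC views, L has only MLO.
view_scores = [
    ("1", "R", "MLO", 0.05),
    ("1", "R", "CC", 0.02),
    ("1", "L", "MLO", 0.01),
]
predictions = aggregate_views(view_scores)

# One tab-separated line per sample, preceded by the required header.
lines = ["subjectId\tlaterality\tconfidence"]
for (subject_id, laterality), confidence in sorted(predictions.items()):
    lines.append(f"{subject_id}\t{laterality}\t{confidence}")
print("\n".join(lines))
```

Both lateralities end up with exactly one line, even though L has a single view.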
Dear Kiko,
Here are the rules for what the prediction file needs to be:
First, the predictions file must include the above header, as the scoring script refers to the columns by name. The first column contains the ID of the subject and the second column the laterality of the breast (left or right). The third column contains the confidence level predicted by the inference method that a breast will develop cancer within 12 months. The confidence level must take values in [0,1], where 0 means that the breast will not develop cancer and 1 means that the breast will certainly develop cancer.
Below are all the requirements that a prediction file must follow for it to be scored.
* Predictions MUST be a TSV file and MUST be named predictions.tsv
* Prediction file MUST have the headers (case sensitive): subjectId, laterality, confidence
* A sample is considered as subjectId + laterality (Ex. 0001R, 0001L)
* Laterality MUST be either R, or L. Please do not use lowercase letters or right/left.
* No duplicated samples allowed
* All samples in the prediction file MUST exist in the goldstandard
* Confidence values cannot be negative
* Must have at least one confidence value (Can't submit an all-NA or blank file)
* Empty confidence/missing samples will be imputed with the median of confidence values
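The requirements above could be checked locally before submitting with something like the following. This is only a sketch, not the challenge's actual scoring script: `validate_predictions` and `gold_samples` are hypothetical names, the goldstandard is assumed to be available as a set of (subjectId, laterality) pairs, and the file-name rule is not checked.

```python
import csv
import io

REQUIRED_HEADER = ["subjectId", "laterality", "confidence"]

def validate_predictions(tsv_text, gold_samples):
    """Return a list of rule violations; an empty list means none were found.

    gold_samples: set of (subjectId, laterality) pairs (hypothetical input).
    """
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    if not rows or rows[0] != REQUIRED_HEADER:
        return ["header must be exactly: subjectId, laterality, confidence"]
    errors, seen, n_values = [], set(), 0
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != 3:
            errors.append(f"line {line_no}: expected 3 tab-separated columns")
            continue
        subject_id, laterality, confidence = row
        sample = (subject_id, laterality)
        if laterality not in ("L", "R"):
            errors.append(f"line {line_no}: laterality must be L or R")
        if sample in seen:
            errors.append(f"line {line_no}: duplicated sample")
        seen.add(sample)
        if sample not in gold_samples:
            errors.append(f"line {line_no}: sample not in goldstandard")
        if confidence.strip() in ("", "NA"):
            continue  # empty value: allowed, left for median imputation
        try:
            value = float(confidence)
        except ValueError:
            errors.append(f"line {line_no}: confidence is not a number")
            continue
        if value < 0:
            errors.append(f"line {line_no}: confidence cannot be negative")
        else:
            n_values += 1
    if n_values == 0:
        errors.append("file must contain at least one confidence value")
    return errors

gold = {("1", "L"), ("1", "R")}
good = "subjectId\tlaterality\tconfidence\n1\tL\t0.01\n1\tR\t0.05\n"
bad = "subjectId\tlaterality\tconfidence\n1\tleft\t-0.2\n"
print(validate_predictions(good, gold))  # no violations expected
print(validate_predictions(bad, gold))
```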
To answer your questions
1. Not every patient **MUST** have both lateralities. As stated in the rules, "Empty confidence/missing samples will be imputed with the median of confidence values"
2. Please kindly clarify your question. A sample (subjectId + laterality, e.g. 1L) can **ONLY** appear once
3. The confidence value can be any value except a negative value
4. Please kindly clarify your question, but again, a sample can **ONLY** appear once
5. Please kindly clarify your question
6. Not every patient **MUST** have both lateralities. As stated in the rules, "Empty confidence/missing samples will be imputed with the median of confidence values"
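To illustrate the imputation rule quoted in answers 1 and 6, here is a rough sketch of what happens when one laterality is left out. It assumes the median is taken over the non-empty confidence values in the submitted file. The scoring script itself is not shown in this thread, so `impute_missing` and its inputs are hypothetical.

```python
from statistics import median

def impute_missing(predictions, gold_samples):
    """Fill absent or empty samples with the median of the submitted values.

    predictions: dict mapping (subjectId, laterality) -> float, or None if empty.
    gold_samples: set of (subjectId, laterality) pairs (hypothetical input).
    """
    submitted = [v for v in predictions.values() if v is not None]
    if not submitted:
        raise ValueError("at least one confidence value is required")
    fill = median(submitted)
    return {
        sample: predictions[sample] if predictions.get(sample) is not None else fill
        for sample in sorted(gold_samples)
    }

gold = {("1", "L"), ("1", "R"), ("2", "L"), ("2", "R")}
preds = {("1", "L"): 0.01, ("1", "R"): 0.05, ("2", "L"): 0.03}  # 2R is missing
completed = impute_missing(preds, gold)
print(completed[("2", "R")])  # 0.03, the median of 0.01, 0.03 and 0.05
```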
Hope this helps.
Thomas