Hello,
I am finalizing my inference submissions, and when I test on the express lane I am told there are repeated entries in my submission.
The error message I get is: "submission ID: 9605196 No duplicated sampleIds (subjectId + laterality)"
I checked this myself using the prediction file from the express lane submission: I printed predictions.tsv to stdout, removed the timestamp information, and checked for duplicate entries, but I was unable to find any. I was hoping to get some feedback on this before I make the final submission.
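A check along these lines can be sketched as follows. This is only a minimal illustration, not the script that was actually used: it assumes each log line has the layout "STDOUT:", timestamp, subjectId, laterality, confidence, and the function name and sample IDs are made up.

```python
from collections import Counter


def find_duplicate_samples(log_lines):
    """Return the (subjectId, laterality) pairs that occur more than once."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        # Assumed layout: "STDOUT:", timestamp, subjectId, laterality, confidence.
        if len(fields) != 5 or fields[2] == "subjectId":
            continue  # skip the header row and anything malformed
        counts[(fields[2], fields[3])] += 1
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical log lines; the IDs are placeholders, not real subjects.
example = [
    "STDOUT: 2017-04-27T01:23:40.000000001Z subjectId laterality confidence",
    "STDOUT: 2017-04-27T01:23:40.000000002Z subj_a L 1.0",
    "STDOUT: 2017-04-27T01:23:40.000000003Z subj_b R 0.0",
    "STDOUT: 2017-04-27T01:23:40.000000004Z subj_a L 0.5",
]
print(find_duplicate_samples(example))
```

Note that the same subjectId with a different laterality counts as a different sample, so only the pair is used as the key.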
Kind Regards,
Ethan
Created by Ethan Goan ethan.goan
Thanks for your reply @thomas.yu
I am reading the sampleId from the crosswalk file, as the metadata file is not available in sub challenge 1 for inference submissions.
I just made another test submission, this time looking at only 10 scans. Here is some of the output from my log file, where I print the predictions.tsv file I generate:
```
STDOUT: 2017-04-27T01:23:40.527729611Z subjectId laterality confidence
STDOUT: 2017-04-27T01:23:40.527732873Z 20fjor9z L 1.0
STDOUT: 2017-04-27T01:23:40.527735778Z jir65ihv R 0.0
STDOUT: 2017-04-27T01:23:40.527738697Z 9g54q8ov R 1.0
STDOUT: 2017-04-27T01:23:40.527741571Z 8ee9x93r L 0.0
STDOUT: 2017-04-27T01:23:40.527744518Z hmxrey04 L 0.0
STDOUT: 2017-04-27T01:23:40.527747440Z yrxt24d6 L 1.0
STDOUT: 2017-04-27T01:23:40.527750448Z j3f5gzjd L 1.0
STDOUT: 2017-04-27T01:23:40.527753435Z ivevayol R 1.0
STDOUT: 2017-04-27T01:23:40.527756681Z 4var9uqo L 1.0
STDOUT: 2017-04-27T01:23:40.527759573Z q5nmr4fc R 0.0
STDOUT: 2017-04-27T01:23:40.527762536Z gphm3wyt L 1.0
```
You can see here that there are definitely no repeated entries, yet I still received an email saying that there were duplicate entries.
I have written my model to account for duplicate entries: it takes the mean prediction score if there are any repeated entries, so that should be fine.
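A minimal sketch of that averaging step, assuming the predictions are available as (subjectId, laterality, confidence) tuples. The function name and the sample values are hypothetical and are not taken from the actual model code.

```python
from collections import defaultdict


def average_repeated_predictions(rows):
    """Collapse repeated (subjectId, laterality) rows to their mean confidence."""
    grouped = defaultdict(list)
    for subject_id, laterality, confidence in rows:
        grouped[(subject_id, laterality)].append(float(confidence))
    # One output entry per unique sample, in order of first appearance.
    return {key: sum(values) / len(values) for key, values in grouped.items()}


# Hypothetical predictions: subj_a/L appears twice and is averaged.
rows = [("subj_a", "L", 0.25), ("subj_b", "R", 1.0), ("subj_a", "L", 0.75)]
for (subject_id, laterality), confidence in average_repeated_predictions(rows).items():
    print(subject_id, laterality, confidence, sep="\t")
```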
Would you be able to look into this for me please? Is there something else I am doing incorrectly?
Everything is tab separated, so I'm not sure what is happening.
Cheers,
Ethan
Dear Ethan,
How did you generate the prediction file in the express lane? If you read the metadata file to generate the prediction file, then it won't have any duplicates on the express lane.
However, the inference lane metadata files do have duplicates: a sampleId may appear twice. Please let me know if this is confusing.
Best,
Tom
Hi @thomas.yu,
As far as I am aware, there weren't actually any repeated entries in the predictions file. I printed the prediction file on the express lane and wrote a script to check, and couldn't find any repeated entries.
There is only one entry for each combination of subjectId and laterality. Would you be able to have a look and let me know if I am doing something incorrectly? It is strange, because it is the same method I used for my express lane submissions in the previous round and I did not receive this error.
Cheers,
Ethan
Dear Ethan,
The express lane files don't have any duplicated entries. In the inference submissions, there are cases where one sampleId has more than one value. So if you are reading in the file and not removing duplicates, then you will run into this issue.
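Removing duplicates while reading the file could look roughly like this. It is a sketch under assumptions: the column names `subjectId` and `laterality`, the extra `filename` column, and the in-memory demo file are illustrative and may not match the real crosswalk or metadata layout.

```python
import csv
import io


def unique_sample_ids(tsv_file):
    """Return each (subjectId, laterality) pair once, in order of first appearance."""
    seen = set()
    ordered = []
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        key = (row["subjectId"], row["laterality"])  # assumed column names
        if key not in seen:
            seen.add(key)
            ordered.append(key)
    return ordered


# Hypothetical file contents: subj_a/L is listed twice (for example, two images).
demo = io.StringIO(
    "subjectId\tlaterality\tfilename\n"
    "subj_a\tL\timg1\n"
    "subj_a\tL\timg2\n"
    "subj_a\tR\timg3\n"
)
print(unique_sample_ids(demo))
```

Writing one prediction row per pair returned here keeps each sampleId in the output exactly once.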
Best,
Tom