Hi,
I would like to know whether the "project" column is included in the metadata of each sample in the test dataset.
That is, will the model in our Docker container be able to tell which project each test data sample was generated from?
Also, is there any chance that an "unseen project" would be included in the test dataset?
For example, we currently have data samples from projects A to J, so should we expect that there might be other project subsets (e.g., K or L)?
Thanks,
Gwanghoon
Created by Gwanghoon Jang GGG

Greetings!
It's an excellent question.
The test data does indeed include the `project` column, with a letter corresponding to each test data set, in a format similar to that used for the training data. The letters may not be sequential (i.e., we do not guarantee they are K, L, M, N...), but we can guarantee that the letters do not overlap with those from the training data.
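One practical consequence, sketched here under assumptions: because the test project letters never overlap with the training letters, a model that treats `project` as a categorical feature needs a fallback for values it has never seen. The helper names below (`build_project_index`, `encode_project`) and the choice of a reserved id 0 for unknown projects are hypothetical illustrations, not part of the challenge's code.

```python
# Hypothetical sketch: map training project letters to integer ids,
# reserving id 0 for any project not seen during training.
UNKNOWN_ID = 0


def build_project_index(train_projects):
    """Assign ids 1..N to the distinct training project labels (sorted)."""
    return {p: i for i, p in enumerate(sorted(set(train_projects)), start=1)}


def encode_project(project, index):
    """Return the id for a project, or UNKNOWN_ID if it was never seen."""
    return index.get(project, UNKNOWN_ID)


# Training projects A to J, as described in the question.
train_projects = list("ABCDEFGHIJ")
index = build_project_index(train_projects)

# Test letters do not overlap with training letters, so under this scheme
# every test sample falls back to UNKNOWN_ID. "Q" and "M" are made-up labels.
test_ids = [encode_project(p, index) for p in ["Q", "M"]]
```

With this kind of encoding, the model cannot rely on project-specific effects at test time, so it may be worth validating by holding out entire projects from the training data.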
Thank you for the question.
Jonathan Golob, MD PhD