Hi,
There is a file named goldstandard in the synthetic data folder. I could not find any description of this file. Would you mind providing more details about it? Thanks a lot!
Also, when can we expect question 2 to be released, and will there be training data for question 2?
Thanks!
Created by dyushen
Hi @egealpay, @dyushen, and @Bcragin,
This is correct: the goldstandard file in the synthetic data is functionally worthless, aside from serving as an example. However, in the training data released today, the goldstandard file is the true label file, so a fake goldstandard file is included in the synthetic data to give your models an example to work with.
Hope this helps!
@trberg
@egealpay For an official answer you'll have to wait for one of the organizers to reply, but my personal understanding is that the goldstandard file we have is just a sample label file created to accompany the simulated EHR data. While you certainly *could* use this file as a true label file, there's really no point in doing so: it contains only made-up data, so it won't improve your score.
@egealpay I also thought that before.
Hi @trberg, could we use that label as the true label? Thanks!
I thought that the goldstandard file contained the true labels for the synthetic data. Am I wrong? Thanks! @trberg
Hi @dyushen,
The gold standard file is just an example of what we use to assess the accuracy of the models. Participants don't need it to build models; we just wanted to give an example.
Question 2 should be announced soon; the first iteration will probably not have training data. We will be announcing training data for question 1 shortly as well.
Thanks,
@trberg