Hi @trberg,

I have a few questions about the gold standard and evaluation process:

1. The instructions say that after we submit the codebook, you will rerun it on newly collected data. Does that mean you will retrain our model with new training data? What tables will be provided in the new training and testing data? Will they be the same as what we have now in the challenge data folder? Since the final submission is only a codebook, is it possible for us to load other resources (e.g., pretrained model weights stored in our team folder)?

2. Following up on question 1, will you just add new patients to the training data, or will there be a completely different patient set to train and evaluate our model? Can you tell us the size of the new training and testing data, so that we can make sure our code handles it without errors? I noticed that there is overlap between training patients and testing patients in your example submission codebook. Will this overlap also exist in the new training and testing data?

3. In the Task 1 gold standard script, in the Outpatient COVID Index Visits table, you only select outpatient visits where "days_from_covid_index" is smaller than 7. Can you explain the reason for this?

Thanks in advance,
Junyi

Created by Junyi Gao junyig
Hi @junyig,

1. Yes, we may retrain the model. The training data will be the same for the main evaluation, but we will run other experiments where we change the training data around (e.g., excluding specific sites for cross-validation). The data tables and data format will be the same. The only differences will be the number of available rows and possibly the codes that are available. Yes, you may load other resources into your team folder. So long as your codebook can run all the way through, you can incorporate whatever data you'd like.

2. The training data will be mostly the same, other than some shuffling or exclusions that the judges will do to test your models. The overlap you see will not exist when we evaluate your models. We will be pulling in prospectively collected data and replacing the current test set with this new data. There will be no overlap of patients between the training and test sets. The size of the testing data shouldn't be much greater than that of the training data. I can't give you exact numbers, since the data we will be using is still being collected and is continually growing. If we find that models are failing on the testing data because it is simply too large, we may reduce its size.

3. The intended use case for the models that come from Task 1 is an outpatient setting. We want to be able to assess the risk of hospitalization when a patient has recently tested positive and comes into an outpatient clinic for evaluation. We decided to limit the time from that initial COVID-positive test to the outpatient visit to 7 days.

Let me know if you have further questions!

Thank you,
@trberg
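
For anyone who wants to sanity-check their own pipeline against points 2 and 3, here is a minimal sketch in plain Python. It is not the gold standard script. The column name "days_from_covid_index" comes from the thread, but "person_id", the function names, the strict "smaller than 7" comparison, and the toy rows are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set


def select_index_visits(visits: Iterable[Dict], max_days: int = 7) -> List[Dict]:
    """Keep outpatient visits whose days_from_covid_index is smaller than max_days.

    The strict comparison mirrors the wording in the question above; it is an
    assumption about the real script, not a copy of it.
    """
    return [v for v in visits if v["days_from_covid_index"] < max_days]


def overlapping_patients(train_ids: Iterable[int], test_ids: Iterable[int]) -> Set[int]:
    """Return patient ids that appear in both the training and the test set."""
    return set(train_ids) & set(test_ids)


# Toy rows; "person_id" is a hypothetical patient identifier column.
visits = [
    {"person_id": 1, "days_from_covid_index": 0},
    {"person_id": 2, "days_from_covid_index": 6},
    {"person_id": 3, "days_from_covid_index": 7},
    {"person_id": 4, "days_from_covid_index": 12},
]

kept = select_index_visits(visits)
overlap = overlapping_patients([1, 2, 3], [3, 4])

print([v["person_id"] for v in kept])  # ids of visits inside the window
print(sorted(overlap))                 # ids shared by both sets
```

An empty result from overlapping_patients is what the evaluation setup described above should produce, so a non-empty result on your own splits is a sign that patients are leaking between sets.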
