Hi,
I have two questions regarding the training data in Q2. Thank you very much in advance for the help!
1) According to the instructions for Question 2, the input includes **geographic, demographic, and clinical measures**. The provided example also seems to use only **measurement.csv** and **person.csv**. Does this mean the model should be built on these two datasets only?
2) Since labels are provided in **goldstandard.csv**, there seems to be no need to use **visit_occurrence.csv** unless we want to use visit information from before the COVID-19-associated hospitalization for training (if data other than **measurement.csv** and **person.csv** is allowed). Is that right?
Thank you
Created by Cong Zhu r_w_2020

I see, thank you very much @trberg

Hi @r_w_2020,
1. The model building can be based on any data that is made available. This can be **person.csv**, **measurement.csv**, **procedure_occurrence.csv**, **condition_occurrence.csv**, etc. You can [look here](https://www.synapse.org/#!Synapse:syn21849255/wiki/602415) for more info on the available tables.
2. Correct, you won't need **visit_occurrence.csv** to derive training labels unless you want to build your own version of them or use visit information in your model.
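
To make the two points above concrete, here is a minimal sketch of joining features from several tables to the provided labels. It is only an illustration under stated assumptions: the column names (`person_id`, `year_of_birth`, `measurement_concept_id`, `value_as_number`, `status`) are assumed rather than confirmed for this dataset, the function `build_training_rows` is hypothetical, and the sample rows are made up. In-memory text stands in for the CSV files so the snippet runs on its own.

```python
import csv
import io
from statistics import mean


def build_training_rows(person_file, measurement_file, goldstandard_file):
    """Join per-person features to labels. All column names are assumptions."""
    # Labels keyed by person_id ("status" is an assumed label column name).
    labels = {r["person_id"]: int(r["status"])
              for r in csv.DictReader(goldstandard_file)}

    # Collect numeric measurement values per person and per concept.
    values = {}
    for r in csv.DictReader(measurement_file):
        if r["value_as_number"] == "":
            continue  # skip measurements without a numeric value
        per_person = values.setdefault(r["person_id"], {})
        per_person.setdefault(r["measurement_concept_id"], []).append(
            float(r["value_as_number"]))

    rows = []
    for r in csv.DictReader(person_file):
        pid = r["person_id"]
        if pid not in labels:
            continue  # keep only persons that have a label
        row = {"person_id": pid,
               "year_of_birth": int(r["year_of_birth"]),
               "status": labels[pid]}
        # One feature per measurement concept: the mean of that person's values.
        for concept, vals in values.get(pid, {}).items():
            row[f"meas_{concept}"] = mean(vals)
        rows.append(row)
    return rows


# Made-up sample data standing in for person.csv, measurement.csv, goldstandard.csv.
person_csv = io.StringIO("person_id,year_of_birth\n1,1950\n2,1980\n3,1975\n")
measurement_csv = io.StringIO(
    "person_id,measurement_concept_id,value_as_number\n"
    "1,3020891,38.0\n1,3020891,39.0\n2,3027018,88.0\n")
goldstandard_csv = io.StringIO("person_id,status\n1,1\n2,0\n3,0\n")

rows = build_training_rows(person_csv, measurement_csv, goldstandard_csv)
for row in rows:
    print(row)
```

Other tables such as **condition_occurrence.csv** or **visit_occurrence.csv** could be folded in the same way, by aggregating per `person_id` and adding the result to each row; the labels still come only from **goldstandard.csv**.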
Thank you!
@trberg