Dear committee,

Thank you for your support. I would like to ask about the data format of the scoring data set. Does the hidden scoring data set contain the following columns?

1. PrevalentCHD
2. Event
3. Event_time
4. PrevalentHFAIL

Also, if the hidden scoring data set does contain a PrevalentCHD column, are we allowed to use PrevalentCHD as a feature to predict the risk?

Thank you very much.

Best regards,
Robin

Created by Chih-Han Huang Chih-Han
Thank you! I would just like to confirm: PrevalentCHD is also not allowed as one of the features for prediction, am I right? I ask because this restriction does not seem to be listed in the previous rules. Thank you so much!
Dear Robin, The training and test datasets have those columns, and you can use that information to train the model. However, the scoring dataset will not contain that information when we do the testing, to prevent any potential over-optimisation. So please do not use those columns on the scoring dataset. Thank you for your understanding, ece
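For anyone setting up a pipeline around this restriction, here is a minimal sketch of one way to keep the four columns named in this thread out of the model inputs. It is not an official template: the helper names and the example columns "Age" and "Sex" are hypothetical placeholders, and it assumes Event and Event_time serve only as training labels.

```python
# Columns that, per this thread, will not be present in the scoring dataset.
UNAVAILABLE_AT_SCORING = {"PrevalentCHD", "Event", "Event_time", "PrevalentHFAIL"}


def select_feature_columns(training_columns):
    """Return the training columns usable as model inputs, preserving order."""
    return [c for c in training_columns if c not in UNAVAILABLE_AT_SCORING]


def check_scoring_columns(feature_columns, scoring_columns):
    """Raise if the scoring data lacks a feature the model expects."""
    missing = [c for c in feature_columns if c not in scoring_columns]
    if missing:
        raise ValueError(f"scoring data is missing features: {missing}")


# Hypothetical column list: "Age" and "Sex" are placeholders, not confirmed names.
train_cols = ["Age", "Sex", "PrevalentCHD", "PrevalentHFAIL", "Event", "Event_time"]
features = select_feature_columns(train_cols)
print(features)

# Event and Event_time can still be used as the training target (assumption:
# they are the outcome labels); they are simply never passed as model inputs.
check_scoring_columns(features, scoring_columns=["Age", "Sex"])
```

Running the check before submission would catch a model that still expects a column the scoring dataset does not provide.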

Data format of the scoring data set (N=1809)