Hi,
On the leaderboard, we only see the performance of our model on the test data. However, since we do not have direct access to the real data, it would be very useful for us to also know the performance on the real training data, so that we can determine whether our model is overfitting (a model may perform well on the simulated data but strongly under- or overfit the real data). Do you think it would be possible to provide us with these scores as well?
Also, could you please confirm that the score of the Cox model with all covariates is 0.855 (thread 9782) on the real data set? I am a bit surprised that nobody has reached that performance according to the leaderboard.
Thanks,
Tristan
Created by TristanF

Hi @TristanF,
Thank you for your request.
We think this is a relevant and fair suggestion, so we will add the Harrell's C and Hosmer-Lemeshow (hoslem) test results computed on the real training data.
Please allow a couple of days for your scores to appear on the leaderboard (there will be no e-mail notification). We have added 2 columns (Harrell c Train and Hosmer_LemeshowTrain) to show the scores on the real training set.
Regarding your second question: yes, the score with all covariates that I provided in thread 9782 was calculated on the real data (test set).
Please note that the aim of this challenge is to predict time-to-event HF using both covariates and microbiome data, with a model that is well calibrated according to the hoslem test.
Thus, using covariates alone might not answer the challenge aim.
Best regards,
Pande
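
For anyone who wants to run the same kind of train-versus-test check locally on the simulated data, here is a minimal sketch of Harrell's C for right-censored data. It is not the organizers' scoring code: the function name `harrell_c`, the tie handling (pairs with equal times are skipped, tied risk scores count as 0.5), and the toy numbers are all assumptions made for illustration.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Harrell's concordance index (illustrative sketch, not the challenge scorer).

    times:       observed follow-up times
    events:      1/True if the event was observed, 0/False if censored
    risk_scores: model output, higher = expected to fail sooner
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, made up purely to show the call pattern.
train = ([2, 5, 7, 9, 12], [1, 1, 0, 1, 0], [0.9, 0.8, 0.4, 0.3, 0.1])
test = ([3, 4, 8, 10, 11], [1, 0, 1, 1, 0], [0.5, 0.7, 0.6, 0.2, 0.3])

c_train = harrell_c(*train)
c_test = harrell_c(*test)
# A train score far above the test score is one sign of overfitting.
print(f"train C = {c_train:.3f}, test C = {c_test:.3f}, gap = {c_train - c_test:.3f}")
```

Comparing the two numbers side by side is the point of the new leaderboard columns: a large gap between the training and test C suggests overfitting, while low values on both suggest underfitting. Calibration (the hoslem test) has to be checked separately, since a model can rank subjects well and still be poorly calibrated.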