Hi all,
Do you mind if I double-check which Harrell's C values you obtain on the synthetic test dataset provided with the baseline models?
Mine are:
cox sex+age,0.35933836962688
cox all covariates,0.365068516325495
random survival forest all,0.501299184626315
Created by notauser

Hi @notauser,
We have refined our baseline models and the scoring metric, so please use the benchmark model only as an example of how your code should be structured to run in our system.
Below are the Harrell's C values for the 3 baseline models (we use 3 different Cox models) on both the synthetic and the real datasets.
cox sex+age: synthetic dataset 0.723; real dataset 0.816
cox all covariates: synthetic dataset 0.711; real dataset 0.855
cox microbes+all covariates: synthetic dataset 0.659; real dataset 0.824
Our refined model now writes scores.csv as an absolute risk score at 15 years, instead of using the output of the predict() function on the fitted model.
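For readers unsure what an absolute risk score means for a Cox model, here is a minimal sketch of the standard proportional-hazards relation, risk(t) = 1 - S0(t)^exp(lp). It is only an illustration, not necessarily how the baseline code computes scores.csv. The function name `absolute_risk` and both argument names are hypothetical, and the baseline survival value is assumed to be estimated at 15 years on the same centering as the linear predictor.

```python
import math


def absolute_risk(linear_predictor, baseline_survival_t):
    """Absolute risk by time t under a Cox proportional-hazards model.

    linear_predictor: the subject's x'beta (hypothetical input name).
    baseline_survival_t: baseline survival S0(t) at the horizon of
        interest, e.g. t = 15 years (assumed to be estimated elsewhere).
    """
    if not 0.0 < baseline_survival_t <= 1.0:
        raise ValueError("baseline survival must lie in (0, 1]")
    # S(t | x) = S0(t) ** exp(x'beta), so risk = 1 - S(t | x)
    return 1.0 - baseline_survival_t ** math.exp(linear_predictor)


# Illustrative values only: a subject at the reference level (lp = 0)
# and one with a higher linear predictor.
reference_risk = absolute_risk(0.0, 0.9)
higher_risk = absolute_risk(1.0, 0.9)
```

Unlike a raw linear predictor, this value is a probability between 0 and 1, which is what a calibration check needs.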
In addition, we made a mistake in our example of scoring Harrell's C index: the example is based on event time only, whereas our current scoring metric uses both the event indicator and the event time, in line with the task that we set for the challenge.
Therefore, our scores look greatly improved compared with the ones we presented in the webinar.
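To illustrate why using the event indicator matters, here is a minimal pure-Python sketch of Harrell's C for right-censored data. It is not the challenge's scoring code. The function name `harrell_c` and the toy data are made up for illustration, and pairs with tied times are simply skipped, which is a simplification.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's C for right-censored data (O(n^2) sketch).

    A pair (i, j) is comparable when subject i has an observed event
    and a strictly shorter time than subject j. The pair is concordant
    when subject i also has the higher risk score; tied scores count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented): follow-up times, event indicators, predicted risks.
times = [2, 4, 6, 8]
events = [1, 0, 1, 0]
risks = [0.9, 0.1, 0.6, 0.3]

c_with_events = harrell_c(times, events, risks)      # uses event and time
c_time_only = harrell_c(times, [1, 1, 1, 1], risks)  # treats everyone as an event
```

On this toy data the two variants disagree (1.0 versus about 0.67), because treating the censored subject at time 4 as an event adds two pairs that the censoring-aware version excludes.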
Please remember that we will use the Hosmer-Lemeshow test (at 15 years; we have updated the wiki) as an additional scoring metric, to check whether your model is well calibrated.
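As a rough idea of what such a calibration check computes, here is a minimal sketch of the classic Hosmer-Lemeshow statistic. It assumes every subject has a known binary event status at 15 years. How censoring before 15 years is handled in the actual scoring is not stated here, so treat this as an illustration only. The function name `hosmer_lemeshow` and the toy data are hypothetical.

```python
def hosmer_lemeshow(pred_risk, observed, n_groups=10):
    """Return (statistic, degrees_of_freedom) for a Hosmer-Lemeshow test.

    pred_risk: predicted probabilities of the event (e.g. by 15 years).
    observed:  0/1 event status, assumed fully known (no censoring).
    Subjects are sorted by predicted risk and split into n_groups bins
    of near-equal size; the statistic sums
    (observed - expected)^2 / (n_g * p_g * (1 - p_g)) over the bins.
    """
    pairs = sorted(zip(pred_risk, observed), key=lambda pair: pair[0])
    n = len(pairs)
    stat = 0.0
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            raise ValueError("too few subjects for the requested groups")
        size = len(group)
        expected = sum(p for p, _ in group)
        obs = sum(y for _, y in group)
        mean_p = expected / size
        denom = size * mean_p * (1.0 - mean_p)
        if denom == 0.0:
            raise ValueError("a group has mean predicted risk of 0 or 1")
        stat += (obs - expected) ** 2 / denom
    # The statistic is usually compared with chi-square on n_groups - 2 df.
    return stat, n_groups - 2


# Toy data (invented): two risk levels, five subjects each.
pred = [0.2] * 5 + [0.8] * 5
calibrated = [1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
miscalibrated = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]

stat_good, df = hosmer_lemeshow(pred, calibrated, n_groups=2)
stat_bad, _ = hosmer_lemeshow(pred, miscalibrated, n_groups=2)
```

A large statistic (small p-value) signals that predicted and observed event counts disagree, so a well-calibrated model should give a small value.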
In addition, the challenge organizers reserve the right to use additional criteria and procedures to choose the ultimate winner.
These may include, for instance, additional scoring metrics (e.g. bootstrapping) or criteria if necessary, e.g. in the event of a tie or any other special circumstance.
Best regards,
Pande