Dear organizers and participants,

We would like to open a discussion on the difference between internal and external validation. In our case, using five-fold cross-validation, internal AUCs can surpass the 0.7 level, yet this does not transfer to the external AUCs, where we see a gap of nearly 10%. Have any participants encountered this, or do the organizers have comments on this difference?

Happy Christmas,
Bruce
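(As a point of reference for readers, here is a minimal sketch of the kind of five-fold internal-validation AUC estimate Bruce describes. The feature matrix, labels, and model are placeholders, not the team's actual pipeline.)

```python
# Minimal sketch of a five-fold internal-validation AUC, assuming a
# binary endpoint; X and y below are random placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.random((200, 50))          # placeholder feature matrix
y = rng.integers(0, 2, 200)        # placeholder binary labels

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
aucs = []
for train_idx, test_idx in cv.split(X, y):
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    scores = clf.predict_proba(X[test_idx])[:, 1]
    aucs.append(roc_auc_score(y[test_idx], scores))

print("internal 5-fold AUCs:", np.round(aucs, 3), "mean:", round(np.mean(aucs), 3))
```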

Created by Wei-Quan Fang (deleapoli)
Thanks for Mike's comment; in general we agree with this point. We also suspect that sample size, training-data representativeness, and so on, or in a word "dataset specifics", could be among the possible reasons for the gaps between internal and external validation in our case. Though, as Guan's team put it, 'in bioinformatics all AUCs are >0.9...', and eliminating the gap seems a mission impossible for now, we will still try to strike a _balance_ between the external consistency of Guan's team and the internal accuracy of our team, contributing not only to MM patients but also to patients with other types of cancer.

Sincerely,
Bruce
Dear Bruce and Yuanfang,

Your results are very consistent with our internal findings. There appears to be very little signal in the genetics, though there might be more in CNV or translocations, which were not included. This, in and of itself, is likely a novel finding from the challenge. The value of SNVs in modeling MM has been disputed in the MM field, but having this many participant models show it across multiple validations may help researchers shift their focus to expression and translocations.

As for the difference between internal and external scores: I do not think it is due to overfitting but rather to an issue of dataset specifics. Features such as treatment, which were not readily provided to participants, may differ between studies, especially with regard to when the study took place, and will have to be explored in the community phase.
That's a good point. To make the story complete, these are my cytogenetics cross-validation results with a random forest (est = sklearn.ensemble.RandomForestRegressor(n_estimators=200, max_depth=4, random_state=0).fit(X, Y)) on the following three datasets, 'EMTAB4032', 'HOVON65', 'MMRF' (I cannot remember why there is no result for the other dataset):

0_iAUC.txt: 0.4933295
1_iAUC.txt: 0.517521
2_iAUC.txt: 0.5122213

What you said indeed needs investigation, but overall it sounds to me like the commonly seen 'in bioinformatics all AUCs are >0.9' problem.
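(For completeness, a self-contained version of the inline snippet above might look like the following. X, Y, and the 5-fold out-of-fold evaluation are placeholder assumptions; the challenge's official iAUC metric is not reproduced here.)

```python
# Runnable sketch around the one-liner quoted above, same hyperparameters.
import numpy as np
import sklearn.ensemble
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
X = rng.random((300, 20))   # placeholder cytogenetic feature matrix
Y = rng.random(300)         # placeholder continuous outcome (e.g. risk)

est = sklearn.ensemble.RandomForestRegressor(
    n_estimators=200, max_depth=4, random_state=0)

# Out-of-fold predictions, so the model is never scored on data it saw;
# per-dataset scores like the iAUC files above would then be computed
# from predictions of this kind.
pred = cross_val_predict(est, X, Y, cv=5)
print(pred[:5])
```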
Thanks for sharing these interesting and consistent results. In our case, for _internal validation_:

**Models for Sub-Ch2** can generally achieve an AUC level of 0.72 (at the 1.5-year cut-off point only) on both seq and array data, using expression only.
**Models for Sub-Ch1 and Ch3** can achieve 0.75, using clinical and cytogenetic features (without expression).

BUT _in external validation_, none can surpass 0.64 in final iAUC performance, which is not as consistent as Guan's team's results. It seems the unexplained gap between internal and external validation needs to be dug into, rather than overfitting.
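(A minimal sketch of the 1.5-year cut-off evaluation mentioned above, assuming survival times are simply dichotomized at 18 months. Censoring, which a real survival analysis would have to handle, is ignored here, and all variable names and values are illustrative.)

```python
import numpy as np
from sklearn.metrics import roc_auc_score

surv_months = np.array([5.0, 30.2, 12.1, 24.5, 8.3, 40.0])  # placeholder follow-up times
risk_score  = np.array([0.9, 0.2, 0.7, 0.3, 0.8, 0.1])      # placeholder model risk output

# Label 1 = event before the 1.5-year (18-month) cut-off.
label_18mo = (surv_months < 18).astype(int)
print("AUC at 1.5-year cut-off:", roc_auc_score(label_18mo, risk_score))
```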
This is my cross-validation on the three microarray expression datasets (expression + clinical):

0_iAUC.txt: 0.6293621
1_iAUC.txt: 0.7037985
2_iAUC.txt: 0.6745611

This is my cross-validation on the RNA-seq dataset (expression + clinical): 0.64250561

I didn't find signal in genetics. This is my clinical-only performance (note these contain a different set of individuals from the above, so it can be higher sometimes): 0.6273865, 0.6368049, 0.4697577, 0.6687154

These are my final scores:
sub1: 0.6293 (0.6606, 0.5868, 0.5833)
sub2: 0.6613 (0.6982, 0.6492, 0.6694, 0.6076)
sub3: 0.6245 (0.6374, 0.5529)

Again, these contain a different set of individuals. So, in general, consistent with CV, but with huge variation. What do you mean by 0.7 or more, like 0.700001? That is within variation, I think.
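(To make the "within variation" point concrete: a bootstrap interval for an AUC on a few hundred patients is typically wide enough to cover a 0.64-vs-0.70 difference. The sketch below uses random placeholder labels and predictions, not challenge data.)

```python
# Bootstrap 95% interval for an AUC estimate on one validation set.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 200)   # placeholder labels
y_score = rng.random(200)          # placeholder predictions

boot = []
for _ in range(1000):
    idx = rng.integers(0, len(y_true), len(y_true))
    if len(np.unique(y_true[idx])) < 2:
        continue  # AUC is undefined on a single-class resample
    boot.append(roc_auc_score(y_true[idx], y_score[idx]))

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"AUC 95% bootstrap interval: [{lo:.3f}, {hi:.3f}]")
```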
