I wondered what you would need the permutations for, so I went back to my webinar recordings and realized that you want them to determine the background distribution of the model. I did some googling and came up with nothing. Would you please explain this background distribution, or point me to some material that I can go through?

Created by Mehrad Mahmoudian michelangelo
Mehrad, you could look up permutation testing, for example. The idea is, in the absence of an independent test data set, to account for each method's degree of overfitting by estimating that method's null distribution. For example, the AUROC has a null expectation of 0.5, but when fit and tested on the same data most methods will identify a model with AUROC > 0.5. Thus, we can run each method on permuted data to assess how much it overfits, and compare its result on the true data against the permutations to determine how much true signal it discovered. You can see what we did in the RA Challenge paper: http://www.nature.com/articles/ncomms12460. There we used a different resampling method, designed to ask a very specific question, but it is an example of how we used a Monte Carlo approach for inference.
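To make the idea concrete, here is a minimal Python sketch of a label-permutation test. It is only an illustration under stated assumptions, not the procedure used in the challenge or in the paper: the toy "method" (`fit_and_score`, which picks the single best feature on the same data it is scored on), the simulated data, and every name and parameter below are hypothetical.

```python
import random

def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_and_score(features, labels):
    """Hypothetical toy method: choose the feature with the highest AUROC
    on the same data it is evaluated on, so it overfits by construction."""
    return max(auroc(labels, column) for column in features)

def permutation_test(features, labels, n_perm=200, seed=0):
    """Re-run the whole method on shuffled labels to estimate its null distribution."""
    rng = random.Random(seed)
    observed = fit_and_score(features, labels)
    shuffled = list(labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any real link between features and labels
        null.append(fit_and_score(features, shuffled))
    # Empirical p-value with the usual +1 correction
    p_value = (1 + sum(s >= observed for s in null)) / (n_perm + 1)
    return observed, null, p_value

# Simulated data (an assumption for illustration): 20 noise features plus 1 with signal
rng = random.Random(1)
n_samples = 40
labels = [i % 2 for i in range(n_samples)]
features = [[rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(20)]
features.append([rng.gauss(2.0 * y, 1) for y in labels])

observed, null, p_value = permutation_test(features, labels)
null_mean = sum(null) / len(null)
print("observed AUROC:", observed)
print("mean AUROC under permutation:", null_mean)
print("empirical p-value:", p_value)
```

The mean AUROC under permutation is this method's own "background": it sits above 0.5 purely because the method is fit and scored on the same data. The gap between the observed AUROC and that background, rather than the gap to 0.5, is what reflects true signal.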
