I am trying to understand the implications of the following Q&A from the webinar.
Q: About the RMA, did you remove the validation set before performing RMA, or did you separate it out after RMA? Additionally, did you run RMA on all CEL files or study by study? (Schrodingers cat)
A: All data were normalized together.
Will the outcome of RMA be any different if it is done separately on each study's CEL files?

Created by Prasad Chodavarapu chprasad
Perhaps this basic primer on quantile normalization will help: https://en.wikipedia.org/wiki/Quantile_normalization
Quantile normalization only assumes that the overall distribution of expression is the same across arrays, and puts no constraint at the individual probe level. The CEL files have been provided, and you're welcome to normalize the data any way you see fit.
And by further extension, one could normalize separately for each class (shedding but asymptomatic, shedding and symptomatic, ...). But this is only possible on the training set, not on the test set, where class labels won't be available! That raises the question: why is quantile normalization considered kosher across categories in any of the microarray-based classification studies?
Thanks for the prompt response. I assume the goal of normalization is to make probe intensity measurements comparable across arrays (CEL files). That probably constrains the arrays included in a normalization batch to be congruent, by which I mean we intuitively expect all the arrays to have similar readings for every probe. If that is correct, even within a study's CEL files, is it appropriate to normalize across all CEL files generated at various times? Shouldn't we separately normalize the CEL files for each time point?
Yes. RMA includes a quantile normalization step, which will differ depending on which arrays are included in the normalization set.
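To make that dependence concrete, here is a minimal sketch of the quantile normalization step alone, in plain Python. It is not the full RMA pipeline (background correction and probe-set summarization are left out), ties are broken by position rather than averaged, and the function name `quantile_normalize` and the toy intensities for two made-up studies are purely illustrative assumptions.

```python
def quantile_normalize(arrays):
    """Quantile-normalize a list of arrays (each a list of probe intensities).

    Simplified sketch: ties are broken by position rather than averaged.
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across all arrays.
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays)
                 for k in range(n_probes)]
    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity on this array.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]  # replace value by the reference quantile
        normalized.append(out)
    return normalized


# Toy log-intensities (made up): 2 arrays x 4 probes per hypothetical study.
study_a = [[2.0, 4.0, 6.0, 8.0], [3.0, 5.0, 7.0, 9.0]]
study_b = [[v + 4.0 for v in a] for a in study_a]  # same pattern, brighter overall

together = quantile_normalize(study_a + study_b)
separate = quantile_normalize(study_a) + quantile_normalize(study_b)

print("together:", together[0], together[2])
print("separate:", separate[0], separate[2])
```

In this toy case, normalizing everything together forces all four arrays onto one shared distribution, so the overall offset between the two studies disappears. Normalizing study by study gives each study its own reference distribution, so the offset between studies remains in the output.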
