I propose a model in which I perform a two-sample t-test for each gene. The two groups are formed by the binary response variable SHEDDING_SC1. I first subset my dataset as follows:

1) Subset the gene expressions according to virus type, that is, into DEE1 RSV, DEE2 H3N2, DEE3 H1N1, DEE4X H1N1, DEE5 H3N2, Rhinovirus Duke, and Rhinovirus UVA.
a) For each virus type, for example DEE1 RSV, the data are subset by time point.
b) For each time point, say -24, I have a data set of 20 patients and a large number of genes.
c) For each gene, perform a t-test (t.test) with the two groups being SHEDDING_SC1=0 and SHEDDING_SC1=1. We ignore the small-sample problem.
d) This yields many t-statistics, each of which can be converted to a z-score.
e) For each gene we then have 21 z-scores, corresponding to the 21 time points.
f) My assertion is that, on average, up-regulated genes will have relatively large positive z-scores and down-regulated genes will have large negative z-scores. Variation that is purely random will end up with z-scores near or equal to zero.

The attachment contains some histograms that give a better view of the problem.
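Steps c) to f) can be sketched roughly as follows for a single virus type. This is only an illustration and not the code used for the attached histograms. Several things here are assumptions: the toy data and the gene names UP and FLAT are made up, all function names are hypothetical, the t-test is taken to be the Welch (unequal-variance) form, the t-to-z conversion is done by matching cumulative probabilities, and the per-gene summary is a plain mean over time points.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t(group0, group1):
    """Welch t statistic (group1 minus group0) and its approximate df."""
    n0, n1 = len(group0), len(group1)
    v0, v1 = variance(group0) / n0, variance(group1) / n1
    t = (mean(group1) - mean(group0)) / math.sqrt(v0 + v1)
    df = (v0 + v1) ** 2 / (v0 ** 2 / (n0 - 1) + v1 ** 2 / (n1 - 1))
    return t, df


def t_cdf(t, df, steps=2000):
    """Student t CDF via Simpson integration of the density on [0, |t|]."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + math.copysign(total * h / 3, t)


def t_to_z(t, df):
    """Map a t statistic to the z-score with the same cumulative probability."""
    p = min(max(t_cdf(t, df), 1e-12), 1 - 1e-12)  # keep inv_cdf in range
    return NormalDist().inv_cdf(p)


def gene_z_scores(expr_by_time, shedding):
    """expr_by_time: {time: {gene: [value per patient]}} -> {gene: [z per time]}."""
    scores = {}
    for time in sorted(expr_by_time):
        for gene, values in expr_by_time[time].items():
            g0 = [v for v, s in zip(values, shedding) if s == 0]
            g1 = [v for v, s in zip(values, shedding) if s == 1]
            t, df = welch_t(g0, g1)
            scores.setdefault(gene, []).append(t_to_z(t, df))
    return scores


# Hypothetical toy data: 6 patients, 2 time points, 2 genes.
shedding = [0, 0, 0, 1, 1, 1]  # SHEDDING_SC1 per patient
toy = {
    -24: {"UP": [1.0, 1.2, 0.9, 2.1, 2.3, 1.9],
          "FLAT": [1.0, 1.2, 0.9, 1.1, 0.9, 1.1]},
    0: {"UP": [1.1, 0.9, 1.0, 2.4, 2.0, 2.2],
        "FLAT": [1.1, 0.9, 1.0, 1.0, 1.1, 0.9]},
}

z_by_gene = gene_z_scores(toy, shedding)
mean_z = {gene: mean(zs) for gene, zs in z_by_gene.items()}
for gene, value in mean_z.items():
    print(gene, round(value, 2))
```

In this toy example the gene that is higher in shedders ends up with a clearly positive mean z-score, while the gene with equal group means ends up essentially at zero. With the real data, the same loop would run once per virus type and over all 21 time points.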

Created by Chamberlain Mbah chambox
