\begin{abstract}
Gene Set Enrichment Analysis (GSEA) is a basic tool for genomic data
analysis. From a statistical point of view, the centering of its
test statistic does not
allow the derivation of asymptotic results.
A test statistic with a different centering is proposed.
Under the null hypothesis, the convergence in distribution of
the new test statistic is proved, using the theory of empirical
processes. The limiting distribution can be computed by Monte Carlo
simulation. The test defined in this way is called the Weighted
Kolmogorov-Smirnov (WKS) test. Because a single evaluation of the
asymptotic distribution serves for many different gene sets,
computing times are shorter. Using
expression data from the GEO repository, tested against the MSig
Database C2, we compared the classical GSEA test with the new
procedure. Our conclusion is that, beyond its
mathematical and algorithmic advantages, the WKS test could be more
informative in many cases than the classical GSEA test.
\vskip 2mm \noindent
\emph{Keywords:} GSEA, statistical test,
empirical processes, weak convergence, Monte Carlo simulation
%Some other potential keywords: Slutsky's theorem, Continuous mapping theorem,
%Donsker classes, stochastic integral, Kolmogorov-Smirnov statistic

\vskip 2mm \noindent
\emph{AMS Subject Classification:} Primary 62F03; Secondary 60F17

\end{abstract}