\begin{abstract}
Studying the effects of groups of Single Nucleotide Polymorphisms (SNPs), as in a gene, 
genetic pathway, or network, can provide novel insight into complex diseases beyond what 
can be gleaned from studying SNPs individually. 
Common challenges in set-based genetic association testing include weak effect sizes,
correlation between SNPs in a SNP-set, and scarcity of signals, with the number of 
single-SNP effects often ranging from extremely sparse to moderately sparse.
Motivated by these challenges, we propose the Generalized Berk-Jones (GBJ) test for 
the association between a SNP-set and an outcome. 
The GBJ extends the Berk-Jones (BJ) statistic by accounting for correlation among 
SNPs, and it provides advantages over the Generalized Higher Criticism (GHC) test 
when signals in a SNP-set are moderately sparse.
We also provide an analytic p-value calculation procedure for SNP-sets of any finite size.
Using this p-value calculation, we illustrate that the rejection region for GBJ can be described
as a compromise between those of BJ and GHC.
We also develop an omnibus statistic, and we show that this omnibus test is robust to 
the degree of signal sparsity.
An additional advantage of our method is the ability to conduct inference using individual 
SNP summary statistics from a genome-wide association study.
We evaluate the finite sample performance of the GBJ through simulation studies and 
an application to gene-level association analysis of breast cancer risk.
\end{abstract}