\begin{abstract}
Studying the effects of groups of Single Nucleotide Polymorphisms (SNPs), as in a gene,
genetic pathway, or network, can provide novel insight into complex diseases beyond what
can be gleaned from studying SNPs individually.
Common challenges in set-based genetic association testing include weak effect sizes,
correlation between SNPs in a SNP-set, and scarcity of signals, with single-SNP effects
often ranging from extremely sparse to moderately sparse in number.
Motivated by these challenges, we propose the Generalized Berk-Jones (GBJ) test for
the association between a SNP-set and an outcome.
The GBJ extends the Berk-Jones (BJ) statistic by accounting for correlation among
SNPs, and it provides advantages over the Generalized Higher Criticism (GHC) test
when signals in a SNP-set are moderately sparse.
We also provide an analytic p-value calculation procedure for SNP-sets of any finite size.
Using this p-value calculation, we illustrate that the rejection region for GBJ can be described
as a compromise between those of BJ and GHC.
We also develop an omnibus statistic and show that the resulting omnibus test is robust to
the degree of signal sparsity.
An additional advantage of our method is the ability to conduct inference using individual
SNP summary statistics from a genome-wide association study.
We evaluate the finite sample performance of the GBJ through simulation studies and
an application to gene-level association analysis of breast cancer risk.
\end{abstract}
