\begin{abstract}
In large-scale genomic applications, vast numbers of molecular features are
scanned in order to find a small number of candidates that are linked to a
particular disease or phenotype. This is a variable selection problem in the
``large $p$, small $n$'' paradigm, where many more variables than samples are
available. Additionally, a complex dependence structure is often observed among
the markers/genes due to their joint involvement in biological processes and
pathways.

Bayesian variable selection methods that introduce sparseness through
additional priors on the model size are well suited to this problem. However,
the model space is very large, and standard Markov chain Monte Carlo (MCMC)
algorithms, such as a Gibbs sampler sweeping over all $p$ variables in each
iteration, are often computationally infeasible. We propose to use the
dependence structure in the data to decide which variables should always be
updated together and which are nearly conditionally independent and hence do
not need to be considered together.

Here, we focus on binary classification applications. We follow the
implementation of the Bayesian probit regression model by \citet{albert93} and
the Bayesian logistic regression model by \citet{holmes06}, both of which lead
to marginal Gaussian distributions. We investigate several MCMC samplers that
use the dependence structure in different ways. The mixing and convergence
performance of the resulting Markov chains is evaluated and compared to that of
standard samplers in two simulation studies and in an application to a real
gene expression data set.
\end{abstract}
