36342e22e2d3607f.tex
1: \begin{abstract}
2:   This paper is about variable selection, clustering and estimation in
3:   an unsupervised high-dimensional setting. Our approach is based on
4:   fitting constrained Gaussian mixture models, where we learn the
5:   number of clusters $K$ and the set of relevant variables $S$ using a
6:   generalized Bayesian posterior with a sparsity inducing prior. We
7:   prove a sparsity oracle inequality which shows that this procedure
8:   selects the optimal parameters $K$ and $S$. This procedure is
9:   implemented using a Metropolis-Hastings algorithm, based on a
10:   clustering-oriented greedy proposal, which makes the convergence to
11:   the posterior very fast.
12: \end{abstract}
13: