\begin{abstract}
This work considers the problem of learning the structure of multivariate
linear tree models, which include a variety of directed tree graphical
models with continuous, discrete, and mixed latent variables, such as
linear-Gaussian models, hidden Markov models, Gaussian mixture models, and
Markov evolutionary trees.
The setting is one where we have samples only from certain observed
variables in the tree, and our goal is to estimate the tree structure
(\emph{i.e.}, the graph of how the underlying hidden variables are
connected to each other and to the observed variables).
We propose the Spectral Recursive Grouping algorithm, an efficient and
simple bottom-up procedure for recovering the tree structure from
independent samples of the observed variables.
Our finite sample size bounds for exact recovery of the tree structure
reveal certain natural dependencies on statistical and structural
properties of the underlying joint distribution.
Furthermore, our sample complexity guarantees have no explicit dependence
on the dimensionality of the observed variables, making the algorithm
applicable to many high-dimensional settings.
At the heart of our algorithm is a spectral quartet test for determining
the relative topology of a quartet of variables from second-order
statistics.
\end{abstract}