\begin{abstract}

We present a new algorithm for discovering patterns in time series and other
sequential data.  We exhibit a reliable procedure for building the underlying
process's \emph{causal states}: the minimal set of hidden, Markovian states
that is statistically capable of producing the behavior exhibited in the data.
Unlike conventional methods for fitting hidden Markov models (HMMs)
to data, our algorithm makes no assumptions about the process's causal
architecture (the number of hidden states and their transition structure), but
rather infers it from the data.  It starts with assumptions of minimal
structure and introduces complexity only when the data demand it.  Moreover,
the causal states it infers have important predictive optimality properties
that conventional HMM states lack.  We introduce the algorithm, review the
theory behind it, prove its asymptotic reliability, use large deviation theory
to estimate its rate of convergence, and compare it to other algorithms that
also construct HMMs from data.  We also illustrate its behavior on an example
process and report selected numerical results from an implementation.

\end{abstract}
