\begin{abstract}

We present a new algorithm for discovering patterns in time series and other
sequential data. We exhibit a reliable procedure for building the minimal set
of hidden, Markovian states that is statistically capable of producing the
behavior exhibited in the data --- the underlying process's \emph{causal
states}. Unlike conventional methods for fitting hidden Markov models (HMMs)
to data, our algorithm makes no assumptions about the process's causal
architecture (the number of hidden states and their transition structure), but
rather infers it from the data. It starts with assumptions of minimal
structure and introduces complexity only when the data demand it. Moreover,
the causal states it infers have important predictive optimality properties
that conventional HMM states lack. We introduce the algorithm, review the
theory behind it, prove its asymptotic reliability, use large deviation theory
to estimate its rate of convergence, and compare it to other algorithms which
also construct HMMs from data. We also illustrate its behavior on an example
process, and report selected numerical results from an implementation.

\end{abstract}