\begin{abstract}
Herding defines a deterministic dynamical system at the edge of chaos.
It generates a sequence of model states and parameters by alternating parameter perturbations with state maximizations,
where the sequence of states can be interpreted as ``samples'' from an associated Markov random field (MRF) model.
Herding differs from maximum likelihood estimation in that the sequence of parameters does not converge to a fixed point,
and it differs from an MCMC posterior sampling approach in that the sequence of states is generated deterministically.
Herding may be interpreted as a ``perturb and map'' method in which the parameter perturbations are generated by a deterministic nonlinear dynamical system
rather than drawn randomly from a Gumbel distribution. This chapter studies the distinct statistical
characteristics of the herding algorithm and shows that the fast convergence rate of the controlled moments may
be attributed to edge-of-chaos dynamics. The herding algorithm can also be generalized to
models with latent variables and to a discriminative learning setting. The perceptron cycling theorem
ensures that the fast moment matching property is preserved in the more general framework.
\end{abstract}