\begin{abstract}
Unsupervised neural network learning extracts hidden features from
unlabeled training data. It is used as a pretraining step for
further supervised learning in deep networks; hence, understanding
unsupervised learning is of fundamental importance. Here, we study
unsupervised learning from a finite number of data samples, based on
restricted Boltzmann machine learning. Our study inspires an efficient
message passing algorithm to infer the hidden feature and to estimate
the entropy of candidate features consistent with the data. Our analysis
reveals that learning requires only a few data samples if the feature is
salient and extensively many if the feature is weak. Moreover,
the entropy of candidate features decreases monotonically with data
size and becomes negative (i.e., an entropy crisis) before the message
passing becomes unstable, suggesting a discontinuous phase transition.
In terms of the convergence time of the message passing algorithm,
unsupervised learning exhibits an easy-hard-easy phenomenon as the
training data size increases. All these properties are reproduced in an
approximate Hopfield model, with the exception that the entropy crisis
is absent and only a continuous phase transition is observed. This key
difference is also confirmed on a handwritten digits dataset. This study
deepens our understanding of unsupervised learning from a finite number
of data samples and may provide insights into its role in training deep
networks.
\end{abstract}