1: \begin{abstract}
2: Models for learning probability distributions such as generative models and density estimators behave quite differently from models for learning functions.
3: One example is found in the memorization phenomenon, namely the ultimate convergence to the empirical distribution, that
4: occurs in generative adversarial networks (GANs).
5: For this reason, the issue of generalization is more subtle than that for supervised learning.
6: For the bias potential model, we show that dimension-independent generalization accuracy is achievable if early stopping is adopted,
7: despite that in the long term, the model either memorizes the samples or diverges.
8: %The generality of our arguments indicates that this slow deterioration from generalization to memorization might be common to distribution-learning models in general.
9: \end{abstract}
10: