\begin{abstract}
A recent literature in econometrics models unobserved cross-sectional heterogeneity in panel data by assigning each cross-sectional unit a one-dimensional, discrete latent type.
Such models have been shown to allow estimation and inference by regression clustering methods.
This paper is motivated by the finding that the clustered heterogeneity models studied in this literature can be badly misspecified, even when the panel has significant discrete cross-sectional structure.
To address this issue, we generalize previous approaches to discrete unobserved heterogeneity by allowing each unit to have multiple, imperfectly correlated latent variables that describe its response type to different covariates.
We give inference results for a $k$-means-style estimator of our model and develop information criteria to jointly select the number of clusters for each latent variable.
Monte Carlo simulations confirm our theoretical results and give intuition about the finite-sample performance of estimation and model selection.
We also contribute to the theory of clustering with an over-specified number of clusters and derive new convergence rates for this setting.
Our results suggest that over-fitting can be severe in $k$-means-style estimators when the number of clusters is over-specified.
\end{abstract}