\begin{abstract}
Regularized regression models, such as the lasso and its variants,
%are standard tools in applied machine learning and statistics. These methods
are well studied and,
under appropriate conditions,
offer fast and statistically interpretable results.
However, large data sets in many applications are
heterogeneous, in the sense of harboring distributional differences between latent groups. In such cases,
the assumption that the conditional distribution of the response $Y$ given features $X$
is the same for all samples may not hold (even approximately).
Furthermore, in scientific applications, the covariance structure of the features
may contain important signals, and
learning it is also affected by latent group structure.
%The two issues -- heterogeneity in feature distributions and in regression models -- are linked, since both aspects may provide signals relevant to understanding the latent structure.
We propose a class of regularized mixture models for
paired data of the form $(X,Y)$ that
couples
the distribution of $X$ (modeled using sparse graphical models)
and the conditional $Y \mid X$ (modeled using sparse regression).
Both the regression and graphical models are specific to the latent groups, and model parameters are estimated jointly (hence we
call the approach ``regularized joint mixtures'').
This allows
signals in either or both of the feature distribution and regression model to inform learning of latent structure and
% This joint strategy deals with suspected distributional shifts and
provides automatic control of confounding by such structure. Estimation is handled via an expectation-maximization algorithm, whose convergence is established theoretically. We illustrate the key ideas via empirical examples.
\end{abstract}