90224919bc2f2495.tex
1: \begin{abstract}
2: Regularized  regression models, such as the lasso and variants,
3: %are standard tools in applied machine learning and statistics. These methods 
4: are well studied and, 
5: under appropriate conditions,
6: offer fast and statistically interpretable results.
7: However, large data sets in many applications are
8: heterogeneous in the sense of harboring distributional differences between latent groups. In such settings,
9: the assumption that the conditional distribution of the response $Y$ given the features $X$
10: is the same for all samples may not hold, even approximately.
11: Furthermore, in scientific applications, the covariance structure of the features
12: may itself contain important signals, and
13: learning it is also affected by latent group structure.
14: %The two issues -- heterogeneity in feature distributions and in regression models -- are linked, since both aspects may provide signals relevant to understanding the latent structure. 
15: We propose a class of  regularized mixture models for 
16: paired data of the form $(X,Y)$ that 
17: couples
18: the  distribution of $X$ (modeled using sparse  graphical models)
19: and the conditional $Y \mid X$ (modeled using sparse regression).
20: Both the regression and graphical models are specific to the latent groups, and model parameters are estimated jointly (hence we
21: call the approach ``regularized joint mixtures'').
22: This allows
23: signals in the feature distribution, the regression model, or both to inform learning of latent structure and
24: % This joint strategy deals with suspected distributional shifts and 
25: provides automatic control of confounding by such structure. Estimation is handled via an expectation-maximization algorithm, whose convergence is established theoretically. We illustrate the key ideas via empirical examples.
26: \end{abstract}
27: