\begin{abstract}
Compositional data sets are ubiquitous in science, including
geology, ecology, and microbiology.
In microbiome research, compositional data primarily
arise from high-throughput sequence-based profiling
experiments. These data comprise microbial compositions in
their natural habitat and are often paired with covariate
measurements that characterize physicochemical habitat properties or
the physiology of the host. Inferring parsimonious statistical
associations between microbial compositions and habitat- or
host-specific covariate data is an important step in exploratory
data analysis. A standard statistical model linking compositional
covariates to continuous outcomes is the linear log-contrast model.
This model describes the response as a linear combination of
log-ratios of the original compositions and has been extended
to the high-dimensional setting via regularization. In
this contribution, we propose a general convex optimization model
for linear log-contrast regression which includes many previous
proposals as special cases. We introduce a proximal algorithm that
solves the resulting constrained optimization problem exactly
with rigorous convergence guarantees.
We illustrate the versatility of our approach by
investigating the performance of several model instances on
soil and gut microbiome data analysis tasks.

\keywords{compositional data \and
convex optimization \and log-contrast model
\and microbiome
\and perspective function \and proximal algorithm}
% \PACS{PACS code1 \and PACS code2 \and more}
% \subclass{MSC code1 \and MSC code2 \and more}
\end{abstract}