
1: \begin{abstract}

2: {We introduce a supervised learning mixture model for censored durations (C-mix) to simultaneously detect subgroups of patients with different prognoses and order them based on their risk. Our method is applicable in a high-dimensional setting, i.e., with a large number of biomedical covariates.

3: Indeed, we penalize the negative log-likelihood by the Elastic-Net, which leads to a sparse parameterization of the model and automatically pinpoints the relevant covariates for the survival prediction.

4: Inference is achieved using an efficient Quasi-Newton Expectation Maximization (QNEM) algorithm, for which we provide convergence properties.

5: The statistical performance of the method is examined in an extensive Monte Carlo simulation study and illustrated on three publicly available genetic cancer datasets with high-dimensional covariates.

6: We show that our approach outperforms the state-of-the-art survival models in this context, namely the CURE and Cox proportional hazards models, both penalized by the Elastic-Net, in terms of C-index, AUC($t$) and survival prediction.

7: Thus, we propose a powerful tool for personalized medicine in oncology.} \\

8:

9: \noindent

10: \emph{Keywords.} Cox’s proportional hazards model; CURE model; Elastic-Net regularization; High-dimensional estimation; Mixture duration model; Survival analysis

11: \end{abstract}

12: