\begin{abstract}
We introduce the \textit{binacox}, a prognostic method for detecting multiple cut-points per feature in a multivariate setting where a large number of continuous features are available.
The method is based on the Cox model and combines one-hot encoding with the \emph{binarsity} penalty, which uses total-variation regularization together with an extra linear constraint and enables feature selection. We establish new nonasymptotic oracle inequalities for prediction (in terms of Kullback-Leibler divergence) and for estimation, with a fast rate of convergence.
The statistical performance of the method is examined in an extensive Monte Carlo simulation study and then illustrated on three publicly available genetic cancer datasets.
On these high-dimensional datasets, the proposed method significantly outperforms state-of-the-art survival models for risk prediction in terms of the C-index, with computing times that are orders of magnitude shorter. In addition, it offers strong interpretability from a clinical perspective by automatically pinpointing significant cut-points in relevant variables.\\

\emph{Keywords.} Cox model; Cut-point; Feature binarization; Nonasymptotic oracle inequality; Proximal methods; Survival analysis; Total variation
\end{abstract}
