\begin{abstract}
The use of machine learning in neuroimaging offers new perspectives for the early diagnosis and prognosis of brain diseases.
Although such multivariate methods can capture complex relationships in the data, traditional approaches provide irregular ($\ell_2$ penalty) or scattered ($\ell_1$ penalty) predictive patterns of very limited relevance.
A penalty such as Total Variation (TV), which exploits the natural 3D structure of the images, can increase the spatial coherence of the weight map.
However, TV penalization leads to non-smooth optimization problems that are hard to solve.
%
% We propose a generic optimization algorithm that minimizes, any differentiable loss (logistic, least square) with any combination of $\ell_1, \ell_2$, and $TV$ penalties.
We propose an optimization framework that minimizes a loss function penalized by any combination of $\ell_1$, $\ell_2$, and $\TV$ penalties
while preserving the exact $\ell_1$ penalty.
This algorithm uses Nesterov's smoothing technique to approximate the $\TV$ penalty with a smooth function, so that the loss and the penalties can be minimized with an exact accelerated proximal gradient algorithm.
We propose an original continuation algorithm that uses successively smaller values of the smoothing parameter to reach a prescribed precision while achieving the best possible convergence rate.
This algorithm can be used with other losses or penalties.
%
The algorithm is applied to a classification problem on the ADNI dataset.
We observe that the $\TV$ penalty does not necessarily improve the prediction, but it markedly improves the support recovery of the predictive brain regions.
\end{abstract}