\begin{abstract}

The conditional gradient method (CGM) is widely used in large-scale sparse convex optimization, owing to its low per-iteration computational cost for structured sparse regularizers and its greedy approach to collecting nonzeros. We explore the sparsity-acquiring properties of a general penalized CGM (P-CGM) for convex regularizers and a reweighted penalized CGM (RP-CGM) for nonconvex regularizers, both of which replace the usual convex constraints with gauge-inspired penalties. This generalization does not noticeably increase the per-iteration complexity. Without assuming bounded iterates or using line search, we show $O(1/t)$ convergence of the gap of each subproblem, which measures distance to a stationary point.

%\red{FB: shouldn't we make a difference between convex and non-convex in the following?}

We couple this with a screening rule that is safe in the convex case and converges to the true support at a rate $O(1/\delta^2)$, where $\delta \geq 0$ measures how close the problem is to degeneracy. In the nonconvex case, the screening rule converges to the true support in a finite number of iterations, but it is not necessarily safe at intermediate iterates.

In our experiments, we verify the consistency of the method and adjust the aggressiveness of the screening rule by tuning the concavity of the regularizer.

%\keywords{Dual screening \and conditional gradient method \and atomic sparsity \and reweighted optimization}

% \PACS{PACS code1 \and PACS code2 \and more}

% \subclass{MSC code1 \and MSC code2 \and more}

\end{abstract}
