\begin{abstract}
We consider the problem of finding critical points of functions that
are non-convex and non-smooth. Studying a fairly broad class of such
problems, we analyze the behavior of three gradient-based methods
(gradient descent, proximal update, and Frank--Wolfe update). For
each of these methods, we establish rates of convergence for general
problems, and also prove faster rates for continuous sub-analytic
functions. We also show that our algorithms can escape strict saddle
points for a class of non-smooth functions, thereby generalizing
known results for smooth functions. Our analysis leads to a
simplification of the popular CCCP algorithm, used for optimizing
functions that can be written as a difference of two convex
functions. Our simplified algorithm retains all the convergence
properties of CCCP, along with a significantly lower cost per
iteration. We illustrate our methods and theory via applications to
the problems of best subset selection, robust estimation, mixture
density estimation, and shape-from-shading reconstruction.
\end{abstract}