1: \begin{abstract}
2: We analyze the convergence of a nonlocal gradient descent method for minimizing a class of high-dimensional non-convex functions, in which directional Gaussian smoothing (DGS) is used to define the nonlocal gradient (also referred to as the DGS gradient).
3: % an optimization approach with directional Gaussian smoothing (DGS) technique, which defines and employs a novel nonlocal search direction called DGS gradient for optimizing on high-dimensional, non-convex landscapes. 
4: The method was first proposed in \cite{DBLP:conf/uai/ZhangTLZ21}, in which multiple numerical experiments showed that replacing the traditional local gradient with the DGS gradient helps optimizers escape local minima more easily and significantly improves their performance. However, a rigorous theory for the efficiency of the method on non-convex landscapes is lacking. In this work, we investigate the scenario where the objective function is a convex function perturbed by an oscillating noise. We provide a convergence theory under which the iterates converge exponentially to a {tightened} neighborhood of the solution, whose size is characterized by the noise wavelength. {We also establish a correlation between the optimal values of the Gaussian smoothing radius and the noise wavelength, thus justifying the advantage of using a moderate or large smoothing radius with the method.} Furthermore, if the noise level decays to zero when approaching the global minimum, we prove that DGS-based optimization converges to the exact global minimum at a linear rate, similarly to standard gradient-based methods for optimizing convex functions. {Several numerical experiments are provided to confirm our theory and illustrate the superiority of the approach over those based on the local gradient.}
5: \end{abstract}
6: