f246db011494adbe.tex
1: \begin{abstract}
2: We introduce a new Swarm-Based  Gradient Descent (SBGD) method for  non-convex optimization. The swarm consists of agents, each is identified with a position, $\bx$, and mass, $m$.  The key to  their dynamics is  communication: masses  are being  transferred from agents at high ground to low(-est) ground. At the same time, agents change positions  with step size, $h=h(\bx,m)$,  adjusted  to their relative mass: heavier agents  proceed with small time-steps in the direction of local gradient, while lighter agents take larger time-steps based on a backtracking protocol. Accordingly, the crowd of agents is dynamically divided between `heavier' leaders,  expected to approach local minima, and `lighter' explorers. With their large-step protocol,  explorers are expected to encounter improved position for the swarm; if they do, then they  assume the role of `heavy' swarm  leaders and so on. Convergence analysis and numerical simulations in one-, two-, and 20-dimensional benchmarks demonstrate the effectiveness of SBGD  as a global optimizer.  
3: \end{abstract}
4: