\begin{abstract}
In this paper, we propose a primal-dual algorithm with a {novel momentum term using the partial gradients of the coupling function} that can be viewed as a generalization of the method proposed by Chambolle and Pock in 2016 %\cite{chambolle2016ergodic}
to solve saddle point problems defined by a convex-concave function $\cL(x,y)=f(x)+\Phi(x,y)-h(y)$ with a general coupling term $\Phi(x,y)$ that is \emph{not} assumed to be bilinear. \sa{Assuming $\grad_x\Phi(\cdot,y)$ is Lipschitz %in $x$
for any fixed $y$, and {$\grad_y\Phi(\cdot,\cdot)$ is Lipschitz}, we show that the iterate sequence converges to a saddle point; and
for any $(x,y)$, %saddle point $(x^*,y^*)$,
we derive error bounds in terms of {$\cL(\bar{x}_k,y)-\cL(x,\bar{y}_k)$} for the ergodic sequence $\{\bar{x}_k,\bar{y}_k\}$.} In particular, we show an $\cO(1/k)$ rate when the problem is merely convex in $x$. %using a constant step-size rule.
%Furthermore,
\sa{Furthermore,} assuming $\Phi(x,\cdot)$ is linear %in $y$
for each fixed $x$ and $f$ is strongly convex, we obtain an %optimal
ergodic convergence rate of $\cO(1/k^2)$ {-- we are not aware of another single-loop method in the related literature achieving the same rate when $\Phi$ is not bilinear.} \sa{Finally, we propose a backtracking technique that does not require knowledge of the Lipschitz constants while ensuring the same convergence results.}
%method with the same convergence results is proposed to elevate the practicality of the method.
We test our method on the kernel matrix learning and \sa{quadratically constrained quadratic} problems, and compare it against other state-of-the-art first-order algorithms and interior-point methods.
\end{abstract}
