\begin{abstract}Recently, convex nested stochastic composite optimization (NSCO) has received considerable attention for its applications in reinforcement learning and risk-averse optimization. However, in the current literature, there exists a significant gap in the iteration complexities between these NSCO problems and simpler stochastic composite optimization problems (e.g., the sum of smooth and nonsmooth functions) without the nested structure.
%there is a gap in the complexities for minimizing stochastic functions and nested stochastic functions. %and these algorithms often carry a too strong assumption on the smoothness of outer layer functions.
In this paper, we close the gap by reformulating a class of convex NSCO problems as ``$\min\max\ldots\max$'' saddle point problems under mild assumptions and proposing two primal-dual type algorithms with the optimal $\bigO\{1/\ep^2\}$ (resp., $\bigO\{1/\ep\}$) complexity for solving nested convex (resp., strongly convex) problems. More specifically, for the often-considered two-layer smooth-nonsmooth problem, we introduce a simple vanilla stochastic sequential dual (SSD) algorithm which can be implemented purely in the primal form.
For the multi-layer problem, we propose a general stochastic sequential dual framework. The framework consists of modular dual updates for different types of functions (smooth, smoothable, nonsmooth, etc.), so that it can handle a more general composition of layer functions. Moreover, we present modular convergence proofs to show that the complexity of the general SSD is optimal with respect to nearly all the problem parameters.
\end{abstract}
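
% Illustrative sketch only; the notation ($f_1$, $f_2$, $\pi_1$, $\pi_2$, $\Pi_1$, $\Pi_2$, $X$) is assumed here and is not taken from the abstract.
To indicate what a ``$\min\max\ldots\max$'' reformulation can look like, consider a two-layer instance $\min_{x\in X} f_1(f_2(x))$. The following is a minimal sketch under assumptions that we state explicitly and that may differ from the precise conditions used in the paper: $f_1:\mathbb{R}^m\to\mathbb{R}$ is closed, convex, and component-wise nondecreasing, each component of $f_2:\mathbb{R}^n\to\mathbb{R}^m$ is closed and convex, and all maxima below are attained. Writing each layer through its Fenchel conjugate,
\[
f_1(y)=\max_{\pi_1\in\Pi_1}\ \langle \pi_1, y\rangle - f_1^*(\pi_1),
\qquad
f_2(x)=\max_{\pi_2\in\Pi_2}\ \pi_2 x - f_2^*(\pi_2),
\]
where $\Pi_1$ denotes the domain of $f_1^*$, $\pi_2\in\mathbb{R}^{m\times n}$ collects one dual vector per component of $f_2$, and the second maximum and $f_2^*$ are understood component-wise. Since $f_1$ is nondecreasing, $\Pi_1\subseteq\mathbb{R}^m_+$, so the inner maximum can be exchanged with the nonnegative weights $\pi_1$, which gives
\[
\min_{x\in X} f_1(f_2(x))
=\min_{x\in X}\ \max_{\pi_1\in\Pi_1}\ \max_{\pi_2\in\Pi_2}\
\big\langle \pi_1,\ \pi_2 x - f_2^*(\pi_2)\big\rangle - f_1^*(\pi_1).
\]
Repeating the same substitution layer by layer would produce one additional $\max$ per layer.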