\begin{abstract}
In safety-critical robotic tasks, potential failures must be reduced, and multiple constraints must be met, such as avoiding collisions, limiting energy consumption, and maintaining balance.
Thus, applying safe reinforcement learning (RL) to such robotic tasks requires handling multiple constraints and using risk-averse rather than risk-neutral constraints.
To this end, we propose a trust region-based safe RL algorithm for multiple constraints called \emph{safe distributional actor-critic} (SDAC).
Our main contributions are as follows: 1) a gradient integration method that manages infeasibility issues in multi-constrained problems while ensuring theoretical convergence, and 2) a TD($\lambda$) target distribution for estimating risk-averse constraints with low bias.
We evaluate SDAC through extensive experiments involving multi- and single-constrained robotic tasks.
While maintaining high scores, SDAC requires 1.93 times fewer steps to satisfy all constraints in multi-constrained tasks and incurs 1.78 times fewer constraint violations in single-constrained tasks than safe RL baselines.
Code is available at: \texttt{https://github.com/rllab-snu/Safe-Distributional-Actor-Critic}.
\end{abstract}