872bcacd9cf443ec.tex
1: \begin{abstract}
2: 		Many sequential decision-making problems require optimizing multiple objectives that may conflict with each other.
3: 		The conventional way to deal with a multi-task problem is to establish a scalar objective function 
4: 		based on a linear combination of different objectives.
5: 		However, when the objectives conflict and have different scales, 
6: 		this method requires trial and error to find suitable weights for the combination. 
7: 		As a result, in most cases, this approach cannot guarantee a Pareto-optimal solution.
8: 		In this paper, we develop a single-agent, scale-independent multi-objective reinforcement learning algorithm 
9: 		based on the Advantage Actor-Critic (A2C) algorithm.
10: 		We then present a convergence analysis of the proposed multi-objective algorithm, which provides a convergence-in-mean guarantee.
11: 		Finally, we evaluate the performance of the proposed algorithm through experiments on a multi-task problem.
12: 		Simulation results show that the proposed multi-objective A2C approach outperforms the single-objective algorithm.
13: 	\end{abstract}
14: