1: \begin{abstract}
2: Traditional reinforcement learning (RL) generates discrete control policies,
3: assigning one action per cycle. These policies are usually implemented in a
4: fixed-frequency control loop. This rigidity presents challenges because the optimal
5: control frequency is task-dependent; suboptimal frequencies increase
6: computational demands and reduce exploration efficiency. Variable Time Step
7: Reinforcement Learning (VTS-RL) addresses these issues with adaptive control
8: frequencies, executing actions only when necessary, thus reducing
9: computational load and extending the action space to include action durations.
10: In this paper, we introduce the Multi-Objective Soft Elastic Actor-Critic
11: (MOSEAC) method to perform VTS-RL, validating it through theoretical analysis
12: and experimentation in simulation and on real robots. Results show faster
13: convergence, better training results, and reduced energy consumption with
14: respect to other variable- or fixed-frequency approaches.
15: \end{abstract}
16: