1: \begin{abstract}
2: %
3: \lcrl is a software tool that implements model-free Reinforcement Learning (RL) algorithms
4: over unknown Markov Decision Processes (MDPs),
5: synthesising policies that satisfy a given linear temporal specification with maximal probability.
6: \lcrl leverages partially deterministic finite-state machines known as Limit Deterministic B\"uchi Automata (LDBA)
7: to express a given linear temporal specification.
8: A reward function for the RL algorithm is shaped on-the-fly, based on the structure of the LDBA.
9: Theoretical guarantees under proper assumptions ensure the
10: convergence of the RL algorithm to an optimal policy that maximises the satisfaction probability.
11: We present case studies to
12: demonstrate the applicability, ease of use, scalability and performance of \lcrl~. Owing to the LDBA-guided exploration
13: and \lcrl model-free architecture, we observe robust performance, which also scales well when compared to standard RL approaches (whenever applicable to LTL specifications). Full instructions on how to execute all the case studies in this
14: paper are provided on a GitHub page that accompanies the \lcrl
15: distribution \url{www.github.com/grockious/lcrl}.
16:
17: \end{abstract}
18: