
\begin{abstract}
%\vspace{-5pt}
We propose a new mechanism for efficiently solving Markov decision processes (MDPs).
%MDPs have been used as an essential framework for decision-theoretic planning where an agent has action uncertainty.
%Prevalent algorithms for solving the MDPs include policy iteration and value iteration.
We introduce the notion of a {\em reachability landscape}, in which the Mean First Passage Time (MFPT) characterizes the reachability of every state in the state space.
We show that this reachability characterization assesses the importance of states well and thus provides a natural basis for effectively prioritizing states and approximating policies.
Building on this observation, we design two new algorithms, Mean First Passage Time based Value Iteration (MFPT-VI) and Mean First Passage Time based Policy Iteration (MFPT-PI), both modified from state-of-the-art solution methods.
To validate our design, we performed numerical evaluations in robotic decision-making scenarios, comparing the proposed methods with the corresponding classic baseline mechanisms.
The evaluation results show that MFPT-VI and MFPT-PI outperform the state-of-the-art solutions in both practical runtime and number of iterations.
Beyond fast convergence, the new solution method is intuitive to understand and simple to implement.
\end{abstract}
