73a796a1523e6453.tex
1: \begin{abstract} 
2: %\vspace{-5pt}
3: This paper proposes a new mechanism for efficiently solving Markov decision processes (MDPs).
4: %MDPs have been used as an essential framework for decision-theoretic planning where an agent has action uncertainty. 
5: %Prevalent algorithms for solving the MDPs include policy iteration and value iteration. 
6: We introduce the notion of a {\em reachability landscape}, in which the Mean First Passage Time (MFPT) characterizes the reachability of every state in the state space.
7: We show that this reachability characterization accurately assesses the importance of states and thus provides a natural basis for prioritizing states and approximating policies.
8: Building on this observation, we design two new algorithms, Mean First Passage Time based Value Iteration (MFPT-VI) and Mean First Passage Time based Policy Iteration (MFPT-PI), both adapted from state-of-the-art solution methods.
9: To validate our design, we performed numerical evaluations in robotic decision-making scenarios, comparing the proposed methods with the corresponding classic baselines.
10: The results show that MFPT-VI and MFPT-PI outperform the state-of-the-art solutions in both practical runtime and number of iterations.
11: Beyond fast convergence, the new method is intuitive to understand and simple to implement.
12: \end{abstract}
13: