\begin{abstract}
This paper studies constrained continuous-time Markov
decision processes (MDPs) in the class of randomized policies depending
on \textit{state histories}. The transition rates may be \textit{unbounded},
the reward and costs are allowed to be \textit{unbounded from above and
from below}, and the state and action spaces are Polish spaces. The
optimality criterion to be maximized is the expected discounted
reward, and the constraints are imposed on the expected discounted
costs. First, we give conditions for the nonexplosion of the underlying
processes and the finiteness of the expected discounted rewards/costs.
Second, using a technique of occupation measures, we prove that the
constrained optimality problem for continuous-time MDPs can be transformed into an
\textit{equivalent} (optimality) problem over a class of probability
measures. Based on the equivalent problem and a so-called
\textit{$\bar{w}$-weak convergence} of probability measures developed in this paper,
we show the existence of a constrained optimal policy. Third, by
providing a linear programming formulation of the equivalent
problem, we show the solvability of constrained optimal policies.
Finally, we use two \textit{computable} examples to illustrate our main
results.
\end{abstract}