
\begin{abstract}
We propose universal randomized function approximation-based
empirical value learning (EVL) algorithms for Markov decision processes.
The algorithms are `empirical' in that each iteration is carried out
using samples obtained from simulations of the next state, which makes
the Bellman operator a random operator. Parametric and non-parametric
function approximation methods, using a parametric function space and a
reproducing kernel Hilbert space (RKHS) respectively, are then combined
with EVL. Both function spaces have the universal function approximation
property, and the basis functions are picked randomly. Convergence
analysis is carried out in a random operator framework, using techniques
from the theory of stochastic dominance. Finite-time sample complexity
bounds are derived for both universal approximate dynamic programming
algorithms. Numerical experiments support the versatility and
computational tractability of this approach.
\end{abstract}
