abstract:93d01375322cd67f.tex

1: \begin{abstract}                          % Abstract of not more than 200 words.

2: We consider the framework of transfer-entropy-regularized Markov Decision Process (TERMDP) in which the weighted sum of the classical state-dependent cost and the transfer entropy from the state random process to the control random process is minimized.

3: Although TERMDP is generally a nonconvex optimization problem, we derive an analytical necessary optimality condition expressed as a finite set of nonlinear equations. Based on this condition, we propose an iterative forward-backward computational procedure similar to the Arimoto-Blahut algorithm, and we establish its convergence to a stationary point of the considered TERMDP.

4: Applications of TERMDP are discussed in the context of networked control systems theory and non-equilibrium thermodynamics.

5: The proposed algorithm is applied to an information-constrained maze navigation problem, whereby we study how the price of information qualitatively alters the optimal decision policies.

6: \end{abstract}

7: