1: \begin{abstract}
2: %\boldmath
3: This paper seeks an algorithmic structure that can predict and explain economic choice behaviour, particularly under uncertainty (random policies), by adapting the prevalent actor-critic learning method to the requirements that have emerged since the advent of neuroeconomics. After briefly reviewing the basics of neuroeconomics relevant to our discussion, we outline some of the important works presented so far to simulate choice-making processes. Building on neurological findings that suggest the existence of two specific functions, namely `rewards' and `beliefs', executed through a specific pathway from the basal ganglia up to subcortical areas, we propose a modified version of the actor-critic algorithm to shed light on the relation between these functions and, most importantly, to address what is regarded as a challenge for actor-critic algorithms: the lack of inheritance or hierarchy, which prevents the system from evolving in continuous-time tasks, where convergence may consequently fail to emerge.
4: \end{abstract}
5: