\begin{abstract}
In this paper, we consider a queue-aware distributive resource
control algorithm for two-hop MIMO cooperative systems. We
illustrate that relay buffering is an effective way to reduce the
intrinsic half-duplex penalty in cooperative systems. The complex
interactions of the queues at the source node and the relays are
modeled as an average-cost infinite-horizon Markov Decision Process
(MDP). The traditional approach to solving this MDP involves
centralized control with very high complexity. To obtain a distributive
and low-complexity solution, we introduce a linear structure which
approximates the value function of the associated Bellman equation
by the sum of per-node value functions. We derive a distributive
{\em two-stage two-winner auction-based} control policy which is a
function of the local channel state information (CSI) and local queue
state information (QSI) only. Furthermore, to
estimate the {\em best fit} approximation parameter, we propose a
distributive online stochastic learning algorithm using stochastic
approximation theory. Finally, we establish technical conditions for
almost-sure convergence and show that under heavy traffic, the
proposed low-complexity distributive control is globally optimal.
\end{abstract}
