
\begin{abstract}

% We look at improving team performance in a multi-agent multi-armed bandit (MAB) framework using the Fastest Distributed Linear Averaging (FDLA) method that relies on the usage of a distributed consensus. The multi-agent network is represented using a fixed communication graph structure and is setup and simulated using the coop-UCB2 algorithm. The matrix used to represent the consensus distribution through weights at nodes and edges (the Perron matrix $P$) directly impacts learning and team performance in terms of regret and convergence time. Our goal is to shrink the timescale on which the convergence of the consensus occurs to achieve optimal team performance and maximize reward. Through our experiments, we show that the convergence to the optimal consensus occurs faster.


We introduce an approach to improving team performance in a Multi-Agent Multi-Armed Bandit (MAMAB) framework using the Fastest Mixing Markov Chain (FMMC) and Fastest Distributed Linear Averaging (FDLA) optimization algorithms. The multi-agent team is represented by a fixed relational network and simulated using the Coop-UCB2 algorithm. The edge weights of the communication network directly affect the time taken to reach distributed consensus. Our goal is to shorten the timescale on which consensus converges, so as to achieve optimal team performance and maximize reward. Our experiments show that convergence to team consensus is slightly faster in large constrained networks.

\end{abstract}
