\begin{abstract}
Machine learning over fully distributed data poses an important problem in peer-to-peer (P2P) applications.
In this model, each network node holds one data record,
and raw data cannot be moved because of privacy considerations.
Examples of such data include user profiles, ratings, history, and sensor readings.
This problem is difficult because local models cannot be learned from a single record, the system model
offers almost no reliability guarantees, and the communication cost must nevertheless be kept low.
Here we propose gossip learning, a generic approach in which multiple models take random walks
over the network in parallel, improve themselves by applying an online learning algorithm, and
are combined via ensemble learning methods.
We present an instantiation of this approach for the case of classification with linear models.
Our main contribution is an ensemble learning method that, through the continuous combination
of the models in the network, implements a virtual weighted voting mechanism over an
exponential number of models at practically no extra cost compared to independent random walks.
We prove the convergence of the method theoretically and perform extensive experiments on benchmark
datasets.
Our experimental analysis demonstrates the performance and robustness of the proposed approach.
\end{abstract}
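
As a minimal illustration of why combining linear models can act as a virtual weighted vote, suppose (an assumption made here for illustration only; the abstract does not state the combination rule) that $m$ linear models with weight vectors $w_1, \dots, w_m$ are merged by averaging into $\bar{w} = \frac{1}{m}\sum_{i=1}^{m} w_i$. By linearity of the inner product, for any input $x$,
\[
  \mathrm{sign}\bigl(\langle \bar{w}, x \rangle\bigr)
  = \mathrm{sign}\Bigl(\frac{1}{m}\sum_{i=1}^{m} \langle w_i, x \rangle\Bigr),
\]
so the single averaged model predicts the same label as a vote among the $m$ models in which each vote is weighted by the magnitude $|\langle w_i, x \rangle|$ of that model's raw output. Under this assumption, storing and forwarding $\bar{w}$ costs the same as storing and forwarding one model, which is consistent with the claim of practically no extra cost.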