\begin{abstract}
Machine learning over fully distributed data poses an important problem in peer-to-peer (P2P) applications.
In this model, each network node holds a single data record,
and raw data cannot be moved because of privacy considerations.
Examples of such records include user profiles, ratings, history, and sensor readings.
This problem is difficult because local models cannot be learned from a single record, the system model
offers almost no reliability guarantees, and yet the communication cost must be kept low.
Here we propose gossip learning, a generic approach in which multiple models take random walks
over the network in parallel, improve themselves by applying an online learning algorithm, and
are combined via ensemble learning methods.
We present an instantiation of this approach for the case of classification with linear models.
Our main contribution is an ensemble learning method that, through the continuous combination
of the models in the network, implements a virtual weighted voting mechanism over an
exponential number of models at practically no extra cost compared to independent random walks.
We prove the convergence of the method theoretically and perform extensive experiments on benchmark
datasets.
Our experimental analysis demonstrates the performance and robustness of the proposed approach.
\end{abstract}