1: \begin{abstract}
2: Efficient and robust algorithms for decentralized estimation in networks are essential to many distributed systems.
3: Whereas distributed estimation of sample mean statistics has received considerable attention, the computation of $U$-statistics, which relies on more expensive averaging over pairs of observations, has been far less investigated.
4: %We show that straightforward generalisation of basic averages (involving the summation over all pairs of $n$ observations) are not satisfactory.
5: % When the data are distributed across a set of nodes.
6: Yet, such data functionals are essential to describe global properties of a statistical population, with important examples including Area Under the Curve, empirical variance, Gini mean difference and within-cluster point scatter.
7: % in ranking through within scatter plots in clustering.
8: This paper proposes new synchronous and asynchronous randomized gossip algorithms that simultaneously propagate data across the network and maintain local estimates of the $U$-statistic of interest. We establish convergence rate bounds of $O(1/t)$ and $O(\log t / t)$ for the synchronous and asynchronous cases respectively, where $t$ is the number of iterations, with explicit data-dependent and network-dependent terms.
9: % provided that the network is connected and non-bipartite,
10: %with an explicit dependence on the connectivity properties of the considered network.
11: Beyond favorable comparisons in terms of rate analysis, numerical experiments
12: % on synthetic and real-world datasets
13: provide empirical evidence that the proposed algorithms surpass the previously introduced approach.
14: \end{abstract}
15: