
1: \documentclass[12pt]{article}

2:

3: \usepackage{amsmath,amssymb, fullpage}

4: \usepackage{graphicx}

5: \usepackage[latin1]{inputenc}

6: \usepackage[T1]{fontenc}

7: \usepackage{lmodern}

8: \usepackage[pdfstartview={FitH}]{hyperref}

9: \usepackage{algorithm}

10: \usepackage{algorithmic}

11: \setlength{\oddsidemargin}{0.25 in}

12: \setlength{\evensidemargin}{-0.25 in} \setlength{\topmargin}{-0.6

13: in} \setlength{\textwidth}{6.5 in} \setlength{\textheight}{8.5 in}

14: \setlength{\headsep}{0.75 in} \setlength{\parindent}{0 in}

15: \setlength{\parskip}{0.1 in}

16: \newcommand{\lecture}[4]{

17:    \pagestyle{myheadings}

18:    \thispagestyle{plain}

19:    \newpage

20:    \setcounter{page}{1}

21:    \noindent

22: }

23: \newtheorem{theorem}{Theorem}

24: \newtheorem{lemma}{Lemma}

25: \newtheorem{proposition}{Proposition}

26: \newtheorem{claim}{Claim}

27: \newtheorem{corollary}[theorem]{Corollary}

28: \newtheorem{defn}{Definition}

29: \newtheorem{construction}{Construction}

30: \newtheorem{exercise}{Exercise}

31: \newtheorem{example}{Example}

32: \newtheorem{open}[theorem]{Open Question}

33: \newtheorem{notation}{Notation}

34: %\newtheorem{algorithm}{Algorithm}

35: \newtheorem{observation}{Observation}

36: %\newtheorem{conjecture}[theorem]{Conjecture}

37:

38: \def\beq{\begin{eqnarray}}

39: \def\eeq{\end{eqnarray}}

40: \def\beqs{\begin{eqnarray*}}

41: \def\eeqs{\end{eqnarray*}}

42: %% Blackboard bold symbols %%%%%%%%%%%%%%%%%%%%%%%%%%%%%

43: \newcommand{\N}{\mathbb{N}}

44: \newcommand{\Z}{\mathbb{Z}}

45: \newcommand{\Q}{\mathbb{Q}}

46: \newcommand{\R}{\mathbb{R}}

47: \newcommand{\RR}{\mathbb{R}}

48: \newcommand{\C}{\mathbb{C}}

49: \newcommand{\CC}{\mathcal{C}}

50: \newcommand{\T}{\mathbb{T}}

51: \newcommand{\A}{\mathbb{A}}

52: \newcommand{\x}{\mathbf{x}}

53: \newcommand{\y}{\mathbf{y}}

54: \newcommand{\z}{\mathbf{z}}

55: \newcommand{\n}{\mathbf{n}}

56: \newcommand{\I}{\mathbb{I}}

57: \newcommand{\K}{\mathbb{K}}

58: \newcommand{\E}{\mathbb{E}}

59: \newcommand{\p}{\mathbb{P}}

60: \newcommand{\e}{\mathbf{e}}

61: \newcommand{\one}{\mathbf{1}}

62: \newcommand{\LL}{\mathcal L}

63: \newcommand{\MM}{\mathcal M}

64: \newcommand{\ra}{\rightarrow}

65: \newcommand{\la}{\leftarrow}

66: %\def\a{{\mbox{\boldmath $\alpha$}}}

67: %\def\l{{\mbox{\boldmath $\lambda$}}}

68: %\def\m{{\mbox{\boldmath $\mu$}}}

69: %\def\n{{\mbox{\boldmath $\nu$}}}

70: \def\eee{{\mathrm e}}

71: \def\a{{\mathbf{\alpha}}}

72: \def\l{{\mathbf{\lambda}}}

73: \def\m{{\mathbf{\mu}}}

74: %\def\n{{\mathbf{\nu}}}

75: \def\A{{\mathcal{A}}}

76: \def\ie{i.\,e.\,}

77: \def\of{{\bf off}}

78: \def\on{{\bf on}}

79: \def\pa{{\bf passive}}

80: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

81:

82: %\def\baselinestretch{1.5}         % double space (well...almost!)

83: \title{Geographic Gossip on Geometric Random Graphs via Affine Combinations}

84: \author{Hariharan Narayanan\\

85: Department of Computer Science, University of Chicago\\

86:  {\tt hari@cs.uchicago.edu} }

87:

88: \begin{document}

89: \maketitle

90: \begin{abstract}

91: In recent times, a considerable amount of work has been devoted to

92: the development and analysis of gossip algorithms in Geometric

93: Random Graphs. In a recently introduced model termed ``Geographic

94: Gossip,'' each node is aware of its position but possesses no further

95: information. Traditionally, gossip protocols have always used convex

96: linear combinations to achieve averaging. We develop a new protocol

97: for Geographic Gossip, in which counter-intuitively, we use {\it

98: non-convex affine combinations} as updates in addition to convex

99: combinations to accelerate the averaging process. The dependence of

100: the number of transmissions used by our algorithm on the number of

101: sensors $n$ is $n \exp(O(\log \log n)^2) = n^{1 + o(1)}$. For the

102: previous algorithm, this dependence was $\tilde{O}(n^{1.5})$.

103: %This reduces the energy consumption by a factor of

104: %n^{1/2 - o(1)} over the most efficient algorithm previously known.

105: The exponent $1 + o(1)$ of our algorithm is asymptotically optimal. Our

106: algorithm involves a hierarchical structure of $\log \log n$ depth

107: and is not completely decentralized. However, the extent of control

108: exercised by a sensor on another is restricted to switching the

109: other on or off.

110: \end{abstract}

111:

112: \section{Introduction}

113:

114: Geometric Random Graphs have become an accepted model for wireless

115: ad hoc and sensor networks. Due to applications in distributed

116: sensing, a significant amount of effort has been directed towards

117: developing energy efficient algorithms for information exchange on

118: these graphs. The problem of distributed averaging  has been studied

119: intensively because it appears in several applications such as

120: estimation on ad hoc networks, and encapsulates many of the

121: difficulties faced in asynchronous distributed computation. Let

122: $v_1, \dots, v_n$ be $n$ points independently chosen uniformly at

123: random from a unit square in $\R^2$. A Geometric Random Graph $G(n,

124: r)$ is obtained from these points by connecting any two points

125: within Euclidean distance $r$. A Gossip Algorithm is an averaging

126: algorithm that, after a certain number of information exchanges and

127: updates, leaves each node with a value close to the average of all

128: the originally held values.

129: %to be connected, $r(n)$ must scale as $\Theta(\sqrt{\frac{\log n}{n}})$.

130: % Geometric random graphs

131: %have been studied extensively (\cite{}) and have been used to model

132: %distributed wireless networks such as sensor networks.

133:

134: \subsection{Related Work}

135: There is an extensive body of work surrounding the subject of gossip

136: algorithms in various contexts. Here, we only survey the results

137: relevant in a narrow sense to the question under consideration.

138:

139: Gupta and Kumar \cite{kumar} gave conditions under which $G(n, r)$

140: is connected with high probability (w.h.p.). It is sufficient that

141: $r$ scales as $\Omega(\sqrt{\frac{\log n}{n}})$ in order that $G(n,

142: r)$ be connected with probability greater than $1 - n^{-\Theta(1)}$.

143:

144: A distributed Gossip Algorithm for arbitrary graphs was presented by

145: Boyd et al.\ \cite{Boyd}. In this algorithm, when the clock of a

146: sensor $s$ ticks, $s$ sends its value $x_s$ to a sensor $v$ chosen

147: uniformly at random from its neighbors, and receives the value $x_v$

148: of $v$. Thereafter $s$ and $v$ set their values to $\frac{x_s +

149: x_v}{2}$. The dependence of the number of transmissions required by

150: this algorithm on $n$ is $\tilde{O}(n^2)$. The performance was

151: related to the mixing time of the natural random walk on that graph.

152: In fact they showed that if the connectivity graph is $G$, the

153: number of transmissions made in the course of the algorithm is

154: $\Theta(n T_{mix}(G))$, where $T_{mix}(G)$ is the mixing time of

155: $G$.

156:

157: In the standard framework for modeling sensor networks, $n$ sensors

158: are placed at random on a unit square $\square$ and have a radius of

159: connectivity $r = \Theta(\sqrt{\frac{\log n}{n}})$. One does not

160: assume that a sensor possesses any information about its own

161: location. In this model, the number of transmissions that the best

162: known algorithm uses is $\tilde{O}(n^2)$ as described

163: above.\footnote{In using $\tilde{O}$, we ignore polylogarithmic

164: factors and depending on context, the dependence on parameters other

165: than $n$.}

166:

167: A more powerful model was proposed by Dimakis et al.\

168: \cite{wainwright}, wherein each sensor is aware of its own

169: location with reference to $\square$, but possesses no further

170: information. It is mentioned in \cite{wainwright} that this is

171: reasonable in typical scenarios. With this model, by exploiting

172: geographic information, they were able to provide an algorithm that

173: requires $\tilde{O}(n^{1.5})$ transmissions. In their algorithm,

174: each node exchanges its value with the node nearest to a position

175: chosen randomly on $\square$, and both nodes replace their values by

176: the average as in the algorithm of Boyd et al.\ \cite{Boyd}. Rejection

177: sampling is used to make the distribution roughly uniform on nodes.

178: The routing takes $\tilde{O}(\sqrt{n})$ hops w.h.p., but since the

179: mixing time on the complete graph is $O(1)$, one obtains an

180: algorithm using $\tilde{O}(n^{1.5})$ transmissions, which is an

181: improvement over \cite{Boyd} by a factor of $\tilde{O}(\sqrt{n})$.
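
The $\tilde{O}(n^{1.5})$ count can be recovered by a rough accounting. This is only a sketch: constants are suppressed, and $r = \Theta(\sqrt{\frac{\log n}{n}})$ is the connectivity radius of the standard model. Each hop advances a packet by at most $r$, so a route between two points at constant distance in $\square$ needs at least of the order of
$$\frac{1}{r} = \Theta\left(\sqrt{\frac{n}{\log n}}\right)$$
hops, which is consistent with the $\tilde{O}(\sqrt{n})$ hops quoted above for greedy routing. Since the mixing time on the complete graph is $O(1)$, of the order of $n \log \frac{1}{\epsilon}$ pairwise exchanges suffice, and the total is
$$O\left(n \log \frac{1}{\epsilon}\right) \cdot \tilde{O}(\sqrt{n}) = \tilde{O}(n^{1.5}).$$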

182:

183: A natural approach to obtaining more efficient algorithms would be

184: to engage in long-range information exchanges less frequently than

185: short-range ones. However, it appears that the benefit derived from

186: an improved mixing time with long-range transmissions more than

187: compensates for the additional cost in terms of hops for a

188: long-range routing. Due to this fact, simply altering the

189: probability distribution with which a node picks targets seems to be

190: counterproductive.

191:

192: \subsection{Our Contribution}

193: An affine combination of two vectors $\mathbf{a}$ and $\mathbf{b}$

194: has the form $\alpha \mathbf{a} + (1-\alpha) \mathbf{b}$. Unlike the

195: case of convex combinations, $\alpha$ need not belong to $[0, 1]$.

196: We introduce counter-intuitive update rules which are {\it affine

197: combinations} rather than {\it convex combinations} (with

198: coefficients possibly as large as $\Omega(\sqrt{n})$) to achieve

199: faster averaging. The total number of transmissions used by the

200: proposed algorithm in order that the $\ell_2$-distance of the output

201: from the average diminish by a multiplicative factor of $\epsilon$

202: w.h.p.\ is $n\exp(O((\log\log n) \log \log \frac{n}{\epsilon}))$.

203: When $\log \frac{1}{\epsilon} = n^{\frac{o(1)}{\log \log n}}$, the number of

204: transmissions is $n^{1+o(1)}$.

205:  The

206: exponent $1 + o(1)$ is asymptotically optimal, since every node must

207: make at least one transmission for an averaging algorithm to work.

208: Like previous algorithms, ours makes packet exchanges with random

209: nodes. Due to

210:  the instability introduced

211: into the system by the use of non-convex combinations, for the

212: present analysis to hold, a certain amount of control needs to be

213: exercised and our algorithm is not truly decentralized. However, the

214: extent of control exerted by any sensor on another is restricted to

215: switching the other on or off.
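
The claim that the bound is $n^{1+o(1)}$ can be checked as follows. This is a short sketch in the regime $\log \frac{1}{\epsilon} = n^{\frac{o(1)}{\log\log n}}$; the symbol $C$ for the hidden constant in the exponent is our notation. Since $\log \frac{n}{\epsilon} \leq 2\max(\log n, \log \frac{1}{\epsilon})$,
$$\log\log \frac{n}{\epsilon} \leq \log 2 + \max\left(\log\log n,\ \frac{o(1)\log n}{\log\log n}\right),$$
and therefore
$$C (\log\log n) \log\log \frac{n}{\epsilon} \leq C \log 2 \cdot \log\log n + \max\left(C(\log\log n)^2,\ o(1) \log n\right) = o(\log n).$$
Exponentiating gives $\exp(o(\log n)) = n^{o(1)}$, so the number of transmissions is $n \cdot n^{o(1)} = n^{1 + o(1)}$. The same computation with $\epsilon$ fixed shows that $n\exp(O(\log\log n)^2) = n^{1+o(1)}$.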

216:

217: \section{Preliminaries}

218: The standard model for a sensor network is as follows.

219:  We assume

220: that each node or sensor has a clock that is a Poisson process with

221: rate $1$, and that these processes are independent. This model is

222: equivalent to having a single clock that is Poisson of rate $n$, and

223: assigning clock ticks to nodes uniformly at random. We assume that

224: the time units are adjusted so communication time between any two

225: adjacent nodes is insignificant in comparison with the length of an

226: average time slot $n^{-1}$. Our algorithm involves packet forwarding

227: when two non-adjacent nodes communicate. We shall assume that the

228: time taken to forward a packet is also insignificant in comparison

229: with $n^{-1}$, and that a single packet exists in the network in

230: each time slot w.h.p. We assume some limited computational power,

231: which amounts to memory of logarithmic size, and the ability to do

232: floating point computations.
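
The equivalence of the two clock models rests on standard facts about Poisson processes, recalled here as a brief check. If $N_1, \dots, N_n$ denote the independent rate-$1$ clocks (a notation introduced only for this remark), their superposition $N = \sum_{i} N_i$ is a Poisson process of rate $n$, and each tick of $N$ belongs to node $i$ with probability
$$\frac{1}{\sum_{j=1}^{n} 1} = \frac{1}{n},$$
independently of all other ticks. The gaps between consecutive ticks of $N$ are exponentially distributed with mean $n^{-1}$, which is the average time slot length used above.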

233:

234: %\section{Preliminaries}

235: For our purposes, a Geometric Random Graph is defined in the

236: following way.

237:  Let $v_1, \dots, v_n$ be $n$ points independently chosen uniformly at random

238: from a unit square in $\R^2$. A Geometric Random Graph $G(n, r)$ is

239: obtained from these points by connecting any two points within

240: Euclidean distance $r$.

241: %In our context, $r(n) =

242: %\Theta(\sqrt{\frac{\log n}{n}})$.

243:

244: %We shall use the standard model for timing in a sensor network which

245: %is as follows.

246: % We assume

247: %that each node or sensor has a clock that is a Poisson process with

248: %rate $1$, and that these processes are independent. This model is

249: %equivalent to having a single global clock that is Poisson of rate

250: %$n$, and assigning clock ticks to nodes uniformly at random. We

251: %assume that the time units are adjusted so communication time

252: %between any two adjacent nodes is insignificant in comparison with

253: %the length of an average time slot $n^{-1}$. Our algorithm involves

254: %packet forwarding when two non-adjacent nodes communicate. We shall

255: %assume that the time taken to forward a packet is also insignificant

256: %in comparison with $n^{-1}$, and that a single packet exists in the

257: %network in each time slot.

258: \subsection{Problem Statement}

259: Let node $v_i$ for  $i = 1, \dots, n$ hold a value $x_i(t)$ at the

260: $t^{th}$ global clock tick, the initial values being $x_i(0)$.

261: Without loss of generality, we assume $\overline{\x(0)} = 0$. Given

262: $\epsilon, \delta > 0$,  the task is to design an algorithm

263: %using as

264: %few transmissions $Trans(n, \epsilon, \delta)$ as possible so that

265: %after this many transmissions,

266: such that $\|\x(t)\| < \epsilon \|\x(0)\|$  for all possible choices

267: of $\x(0)$  with probability $> 1 - \delta$. The cost of the

268: algorithm is the expected number of transmissions made until $t$.

269: %We are interested in designing a distributed

270: %algorithm of modifying these values so that for each $i$,  $\lim_{t

271: %\ra \infty} x_i(t) = x_{ave} := \frac{1}{n}\sum_i x_i(t)$, and the

272: %time taken for approximate convergence is small. The

273: %$\epsilon$-averaging time $T(n, \epsilon)$ is defined as follows:

274:

275: %\begin{defn}

276: %Given $\epsilon, \delta > 0$, the $\epsilon, \delta$-averaging time

277: %is the earliest time $t$ (the number of ``global" clock-ticks) at

278: %which the vector $x(t)$ is $\epsilon$ close to the normalized true

279: %average with probability $> \delta$,

280: %$$T_{ave}(n, \epsilon, \delta) := \sup_{x(0)}

281: %\inf \left(t:\p\left(\frac{\|x(k) - x_{ave}\vec{1}\|}{\|x(0)\|} \geq

282: %\epsilon \right) \leq \delta\right).$$

283: %\end{defn}

284: In the rest of the paper, we shall make the standard assumption that

285: the radius of connectivity $r(n) = \Theta(\sqrt{\frac{\log n}{n}})$

286: (e.g., \cite{wainwright}). Under this assumption, the probability of

287: the graph $G(n, r)$ being disconnected is $\Omega(n^{-a})$ for

288: an appropriate constant $a$.  As a consequence, it is not possible

289: to drive $\delta$ below $n^{-O(1)}$. For this reason, in the

290: analysis, we shall assume that $\delta = n^{-O(1)}$. On the other

291: hand $\epsilon$ can be made arbitrarily small by running the

292: averaging algorithm for a sufficiently long interval of time. In

293: this paper, we shall assume that $\log \frac{1}{\epsilon} =

294: n^{\frac{o(1)}{\log \log n}}$. This does not allow $\epsilon$ to be

295: exponentially small but permits it to be the reciprocal of a

296: quasipolynomial. A sufficiently large constant $a$ will appear in

297: the parameters of our algorithm described later. When we use the term

298: {\it high probability}, we shall mean with probability $1 -

299: n^{-\Theta(1)}$.

300: %For a

301: %discussion of the probability of $G(n, r)$ being disconnected, see

302: %\cite{kumar}.

303: %If the connectivity radius $r(n)$ is $= \sqrt{\frac{a \ln n}{\pi

304: %n}}$, the probability that the graph $G(n, r)$ is disconnected is

305: %$\Omega {n^{-a}}$, as can be seen by simply computing the

306: %probability that node $1$ is disconnected from all other nodes.

307: %Therefore under the condition that $r(n) = \Theta{\sqrt{\frac{\log

308: %n}

309: %\subsection{Gossip Algorithms}

310: %Gossi

311: %\begin{definition}

312: %Let $\A(v_1, \dots, v_n)$ be a Gossip algorithm on nodes $v_1,

313: %\dots, v_n$. Let $T[n, \epsilon]$ be the

314: %\section{Results}

315: %As in \cite{wainwright} we shall seek to minimize the total number

316: %of transmissions. If the $t^{th}$ clock tick belongs to node $j$,

317: %$j$ will communicate with some node in the network, not necessarily

318: %within the radius of connectivity. Each such communication may take

319: %$R(t)$ radio transmissions, and our communication cost shall be

320: %$$C(n, \epsilon, \delta) := \sum_{t=1}^{T_{ave}(n, \epsilon, \delta)} R(t).$$

321:

322: \section{Overview of Algorithm}

323: %The square $\square$ is partitioned into $n_1$  subsquares

324: %$\square_i$, where $n_1$ is the nearest integer to $\sqrt{n}$ that

325: %is the square of an even number. For a square $\square_{i_1\dots

326: %i_r}$, let $\E_\#\square_{i_1\dots i_r}$ denote the expected number

327: %of sensors within $\square_{i_1\dots i_r}$. Then, while

328: %$\E_\#\square_{i_1\dots i_r} > \log (n)^8$,

329:

330: %the square $\square_{i_1\dots i_r}$ is partitioned into $n_{r+1}$

331: %subsquares $\square_{i_1\dots i_{r+1}}$, where $n_{r+1}$ is the

332: %nearest integer to $\sqrt{\E_\#\square_{i_1\dots i_r}}$ that is the

333: %square of an even number. Let $$\ell := 1 +

334: %\sup\limits_{\square_{i_1\dots i_r}} r,$$ \ie the number of levels

335: %in this recursion. Given a square $\square_{<i>}$, let

336: %$s(\square_{<i>})$ denote the sensor nearest to its center. By our

337: %construction, these centers are well separated, and any sensor has

338: %this property with respect to at most one square. We shall denote

339: %this by $\square(s)$. %Note that the total number of squares in our

340: Let $\square$ be the unit square in which the $n$ sensors are

341: randomly placed. Let the initial values carried by sensors be

342: $x_i(0)$, for $i = 1$ to $n$.  We consider a partition of $\square$

343: into $\sim n^{1/2}$ smaller squares $\square_i$. Let $\square_i$

344: contain $\#(\square_i)$ sensors. Let $time(n)$ represent the

345: expected number of transmissions until $\|\x(t)\| \leq \epsilon

346: \|\x(0)\|$  w.h.p., where $\epsilon$ is some function of $n$ that we

347: shall not investigate at the moment. Suppose that we had a ``nearly

348: perfect'' averaging protocol $\A$ on the smaller squares $\square_i$,

349: \ie when $\A$ is run on each square, after $t = time_{\A}(\sqrt n)$

350: transmissions, within $\square_i$ the values are for practical

351: purposes equal to the average of the original values. That is,

352: $$(\forall i) (\forall s \in \square_i) x_s(t) \backsimeq \frac{\sum\limits_{s \in

353: \square_i} x_s(0)}{\#(\square_i)}.$$

354: \begin{defn}

355: For each square $\square_i$, let $s(\square_i)$ be the sensor

356: closest to the center of $\square_i$.

357: \end{defn}

358: This can be determined by each square, using a constant number of

359: transmissions w.h.p.

360:

361: The $s(\square_i)$ exchange values among themselves by Greedy

362: Geographic Routing (see \cite{wainwright}).

363:

364: Consider the following protocol. Suppose that $\A$ has been run on

365: each subsquare of the form $\square_i$ independently, and the values

366: carried by the nodes within $\square_i$ are all equal. When

367: $s(\square_i)$ becomes active,

368:  the following round takes place.

369: \begin{enumerate}

370: \item $s_i :=s(\square_i)$ picks a square $\square_j$ uniformly at

371: random. $s_i$ geographically routes a packet with its value to $s_j

372: := s(\square_j)$.

373:

374: %\item Node $s$ sends the packet, which is routed to the node nearest

375: %to $t$ by greedy geographical routing (see \cite{wainwright} for

376: %details). Let $v := s(\square_j)$ be the node closest to $t$.

377: \item $s_j$ routes its own value to $s_i$ by greedy

378: geographic routing.

379: \item $x_{s_i} \la x_{s_i} + \frac{2\sqrt{n}}{5}(x_{s_j} - x_{s_i})$.

380: \item  $x_{s_j} \la x_{s_j} + \frac{2\sqrt{n}}{5}(x_{s_i} - x_{s_j})$.

381: \item $\A$ is independently run on $\square_i$ (the process being activated by $s_i$ by switching certain nodes on)

382: and on $\square_j$ (initiated by $s_j$ similarly).

383: \item $\A$ is ended on square $\square_i$ by $s_i$ (by turning certain nodes off), and

384: $\A$ is ended on $\square_j$ by $s_j$ (by switching certain nodes

385: off.)

386: \end{enumerate}

387:

388: Now, let $z_i(t) := \sum\limits_{s \in \square_i} x_s(t)$. Without

389: loss of generality, we assume that $\sum_i{x_i} = 0$, since this

390: only adds a constant offset and does not affect the rate of

391: convergence. An application of the Chernoff Bound tells us that

392: $(\forall i)\left| \frac{\#(\square_i)}{\sqrt n}-1\right| <

393: \frac{1}{10}$ w.h.p . If we examine the evolution of $\z$, we see

394: that after a round of the kind described above

395:

396: \begin{itemize}

397: \item $z_i(t) = (1-\alpha_i)z_i(t-1)

398: + \alpha_jz_j(t-1)$

399: \item $z_j(t) = (1-\alpha_j) z_j(t-1) + \alpha_i z_i(t-1)$

400: \end{itemize}

401: where, for all $i$, $\alpha_i \in (\frac{1}{3}, \frac{1}{2})$. From

402: Lemma~\ref{l:1}, it follows that

403:

404: $\E[\|\z(t)\|^2] < (1-\frac{1}{2\sqrt{n}})^t \|\z(0)\|^2$. Roughly

405: speaking, after $O(\sqrt{n} \log(\frac{n}{\epsilon}))$ of these

406: steps, we have a distribution $\x(t')$ such that $\|\x(t')\| <

407: \epsilon \|\x(0)\|$.
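
The range of the $\alpha_i$ can be checked directly. The following is a sketch under the idealization used above, namely that $\A$ equalizes the values inside each subsquare exactly. Before the round, every sensor in $\square_i$ holds the value $z_i(t-1)/\#(\square_i)$. The update of $s_i$ changes the sum over $\square_i$ by $\frac{2\sqrt{n}}{5}(x_{s_j} - x_{s_i})$, and the pairwise averaging performed by $\A$ afterwards leaves this sum unchanged. Hence
$$z_i(t) = \left(1 - \frac{2\sqrt{n}}{5\,\#(\square_i)}\right) z_i(t-1) + \frac{2\sqrt{n}}{5\,\#(\square_j)}\, z_j(t-1), \qquad \text{so that} \qquad \alpha_i = \frac{2\sqrt{n}}{5\,\#(\square_i)}.$$
With $\frac{9}{10} < \frac{\#(\square_i)}{\sqrt{n}} < \frac{11}{10}$ this gives $\frac{4}{11} < \alpha_i < \frac{4}{9}$, which lies inside $(\frac{1}{3}, \frac{1}{2})$. Adding the two update rules also shows $z_i(t) + z_j(t) = z_i(t-1) + z_j(t-1)$, so a round preserves the total.

For the number of rounds, note that
$$\left(1 - \frac{1}{2\sqrt{n}}\right)^t \leq \exp\left(-\frac{t}{2\sqrt{n}}\right) \leq \epsilon^2 \qquad \text{once} \qquad t \geq 4\sqrt{n}\,\ln\frac{1}{\epsilon}.$$
One plausible reading of the additional $\log n$ in the bound $O(\sqrt{n}\log(\frac{n}{\epsilon}))$ is that it pays for passing from a bound in expectation to one that holds with high probability, via Markov's inequality.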

408:

409: Each geographical routing mentioned above takes $O(\sqrt n)$

410: transmissions w.h.p.\ (see \cite{wainwright}). Also, each process of

411: initiating or ending $\A$ on a square $\square_i$ takes

412: $O(\sqrt{n})$ transmissions.

413:

414: So, the total number of transmissions with $n$ nodes $time(n)$

415: satisfies a recurrence of the form: $$time(n) \backsimeq

416: O\left(\sqrt n \log(\frac{n}{\epsilon}) ( time_\A(\sqrt n) + O(\sqrt

417: n))\right).$$ Ignoring the dependence on $\epsilon$, this would allow

418: us to recursively define the algorithm $\A$ on $\square$, for which

419: $time_\A(n) = n \exp(O(\log\log n)^2).$
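
One way to see where this bound comes from is to unroll the recurrence. The following is a heuristic sketch only: we drop the dependence on $\epsilon$, bound every logarithmic factor by $\log n$, write $C$ for the hidden constant, and assume that the base case costs a polynomial in the number of sensors it handles. Since every sensor must transmit at least once, $time_\A(\sqrt{n}) = \Omega(\sqrt{n})$ and the additive $O(\sqrt{n})$ is absorbed. With $n_k := n^{2^{-k}}$ the recurrence reads
$$time_\A(n_k) \leq C\, n_{k+1} \log n \cdot time_\A(n_{k+1}),$$
and iterating it $\ell$ times gives
$$time_\A(n) \leq (C \log n)^{\ell} \left(\prod_{k=1}^{\ell} n^{2^{-k}}\right) time_\A(n_\ell) = (C\log n)^{\ell}\; n^{1 - 2^{-\ell}}\; time_\A(n_\ell).$$
The recursion stops when $n_\ell$ is polylogarithmic in $n$, which happens at depth $\ell = O(\log\log n)$. At that depth $time_\A(n_\ell) = \exp(O(\log\log n))$ and $(C\log n)^{\ell} = \exp(O((\log\log n)^2))$, so the product is $n\exp(O((\log\log n)^2))$.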

420:

421:

422:

423: %We shall describe our algorithm recursively. In order to do this, we

424: %shall first have to define the problem in a way that facilitates the

425: %recursion.

426:

427: %Let $k$ be a \texttt{Binomial}(n, p) random variable, where $E[k] =

428: %pn = \Omega((\log n)^4)$

429: %for some constant $c \geq 3$.

430: %Let $k$ sensors be placed uniformly at randomly in a square $S$ of

431: %area $p$. Let sensors within a distance $\Theta(\sqrt{\frac{\log

432: %n}{n}})$ be connected. We shall describe the algorithm $A(s, S, p,

433: %n)$ that sensor $s$ implements.

434:

435: %\footnote{This allows us to get the probability of "failure"

436: %$\epsilon$ down to $\frac{1}{n^a}$, for a constant $a$. For a

437: %discussion of the related probability of $G(n, r)$ being

438: %disconnected, see \cite{kumar}.}.

439: %Let $C >> 1$ be a large constant.  $A(s, S, p, n, \epsilon, \delta)$

440: %is described recursively as follows:

441: %\begin{enumerate}

442: %\item  If $pn < C(\log n)^8$:

443:

444: %Consider a partition $\{S_i\}_{1 \leq i \leq u}$ of square $S$

445: %into $u = \lceil k^{1/4} \rceil^2$ smaller squares $S_i$ of area

446: %$\frac{p}{u}$ each. It can be seen from Chernoff bounds that with

447: %high probability $\frac{2\sqrt{pn}}{3} < u <

448: %\frac{4\sqrt{pn}}{3}$.

449:

450: %Let $s_i$ be the sensor in square $S_i$, that is nearest to the

451: %center $C_i$ of $S_i$.

452: %From Lemma~\ref{centers}, a sensor $s \in

453: %S_i$ can determine whether or not it is the nearest to $C_i$ with

454: %high probability using a $\mathrm{polylog}(n)$ number of radio

455: %transmissions. By Lemma~\ref{center_dist}, with high probability,

456: %each $s_i$ is within a Euclidean distance of

457: %$\Theta(\sqrt{\frac{\log n}{n}})$ of $C_i$.

458: %Maybe the above discussion should not be here.

459: %Suppose $s \in S_i$.

460: %%I don't have to make complex error computations here. The recursively defined expression is short.

461: %With probability $1 - \frac{(\log

462: %n)^{-4}}{T_{ave}(\lceil2\frac{pn}{\ell}\rceil,

463: %\frac{\epsilon^3}{n^3}, \frac{\delta}{\ell^2})}$,%

464:

465: %Note that with high probability, there is a sensor within a

466: %Euclidean distance $\Theta(\sqrt{\frac{\log n}{n}})$ of each $C_i$.

467: \section{Description of the Algorithm}

468: \subsection{Notation}\label{s:1}

469: The square $\square$ is partitioned into $n_1$  subsquares

470: $\square_i$, where $n_1$ is the nearest integer to $\sqrt{n}$ that

471: is the square of an even number. For a square $\square_{i_1\dots

472: i_r}$, let $\E_\#\square_{i_1\dots i_r}$ denote the expected number

473: of sensors within $\square_{i_1\dots i_r}$. Then, while

474: $\E_\#\square_{i_1\dots i_r} > (\log n)^8$,

475:

476: the square $\square_{i_1\dots i_r}$ is partitioned into $n_{r+1}$

477: subsquares $\square_{i_1\dots i_{r+1}}$, where $n_{r+1}$ is the

478: nearest integer to $\sqrt{\E_\#\square_{i_1\dots i_r}}$ that is the

479: square of an even number. Let $$\ell := 1 +

480: \sup\limits_{\square_{i_1\dots i_r}} r,$$ \ie the number of levels

481: in this recursion. Given a square $\square_{<i>}$, let

482: $s(\square_{<i>})$ denote the sensor nearest to its center. By our

483: construction, these centers are well separated, and any sensor has

484: this property with respect to at most one square w.h.p. We shall

485: denote

486: this square by $\square(s)$. %Note that the total number of squares in our

487: %construction is $o(n)$.

488: We assign a Level to each node by the

489: following rule: If $s = s(\square_{i_1\dots i_r})$, $s$ has level

490: $\ell - r$. These nodes have Levels $1, \dots, \ell$. There is a

491: single root node at Level $\ell$, namely $s(\square)$. The nodes at

492: Level $0$ are the nodes not of the form $s(\square_{i_1\dots i_r})$.

493: In the informal discussion earlier, we did not concern ourselves

494: with the error in the averaging carried out on subsquares

495: $\square_i$. However, these errors propagate up the hierarchy

496: rapidly, and hence it is necessary to obtain results with greater

497: accuracy in smaller squares. Thus we define the desired accuracy

498: recursively. Let $\epsilon_r$ be the accuracy for the averaging

499: process in a square $\square_{i_1 \dots i_{r-1}}.$ Lemma~\ref{l:2}

500: tells us that it is sufficient to take $\epsilon_r$ to be

501: $\frac{\epsilon_{r-1}}{\text{poly}(n)}$ for a polynomial of

502: sufficiently large degree.

503:

504:  Let $\epsilon_0 =

505: \epsilon$, $\delta_0 = \delta$. We recursively define

506: $\epsilon_{r+1} := \frac{\epsilon_r}{25 n^{\frac{7}{2} + a}}$ and

507: $\delta_{r+1} = \frac{\delta_r}{n_r^{2a}}$.
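
Unrolling the recursion for $\epsilon_r$ gives a closed form. This is a direct computation from the definition above:
$$\epsilon_r = \frac{\epsilon}{\left(25\, n^{\frac{7}{2} + a}\right)^{r}}, \qquad \log\frac{1}{\epsilon_r} = \log\frac{1}{\epsilon} + r\left(\log 25 + \left(\frac{7}{2} + a\right)\log n\right).$$
Since $r \leq \ell$ and the hierarchy has depth of order $\log\log n$, the accuracy required at the deepest level is $\epsilon\, n^{-O(\log\log n)}$. In other words, $\log\frac{1}{\epsilon_r}$ exceeds $\log\frac{1}{\epsilon}$ by only $O(\log n \log\log n)$.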

508:

509: We define $time(n, \ell-1, \epsilon_{\ell-1}, \delta_{\ell-1})$ to be $\left((\log

510: \frac{n}{\epsilon_{\ell-1}})

511: \log(\delta_{\ell-1}^{-1})\right)^{16}$. Thereafter, we define

512: $time(n, r-1, \epsilon_{r-1}, \delta_{r-1}) := time(n, r,

513: \epsilon_r, \delta_r) n^a \left(\log (\frac{n_r}{\epsilon_r})\log

514: (\delta_r^{-1})\right)^{16}.$

515:

516: Let  $s \in \square_{i_1\dots i_{\ell-1}}$.

517:

518: \subsection{The Protocol}

519: Every node $s$ has two states, a $local.state$ and a $global.state$,

520: both of which are initially $= off$, but can also take the value

521: $on$. Each node $s$ possesses a private counter $counter(s)$. During

522: initialization, the $global.state$ of $s(\square)$ is set to $on$

523: but every other $global.state$ is $off$. The $local.state$ of {\it

524: all} nodes is set to $off$ at this juncture.

525:

526: Let us suppose that the clock of $s$ ticks. We describe the protocol

527: followed by it below. We consider two cases. If $s$ is at Level $0$,

528: it obeys the following protocol: \{

529: \begin{enumerate}

530: \item If $local.state(s) = on$\\ $Near(s)$;

531: \end{enumerate}

532: \}

533:

534: $Near(s)$\{

535: \begin{enumerate}

536: \item $s$ picks an adjacent node $v$ contained in $\square_{i_1\dots i_{\ell-1}}$

537: uniformly at random.

538: \item $s$ sets $x_s(t+1) = \frac{x_s(t) + x_v(t)}{2};$\\

539:       $v$ sets $x_v(t+1) = \frac{x_s(t) + x_v(t)}{2};$

540: \end{enumerate}

541: \}
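
The effect of one $Near$ step can be quantified by an elementary identity, included here only as a check. Writing $x_s, x_v$ for the two values before the step, the sum is preserved because $2 \cdot \frac{x_s + x_v}{2} = x_s + x_v$, and
$$x_s^2 + x_v^2 - 2\left(\frac{x_s + x_v}{2}\right)^2 = \frac{(x_s - x_v)^2}{2} \geq 0.$$
Hence $\|\x(t+1)\|^2 \leq \|\x(t)\|^2$, with equality only if $x_s = x_v$. A convex step of this kind therefore never moves the values away from the average.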

542:

543: We next describe the protocol if $s$ is at a Level greater than $0$.

544: The subroutine $Near$ is the same as above.

545: Let $\square(s)=: \square_{i_1 \dots i_r}.$\\

546: \{

547: \begin{enumerate}

548:  \item If $global.state(s) = on$ %\{\\

549:

550:  \begin{enumerate}

551:  \item  If $counter(s) = 0$ $Activate.square(s);$

552:  \item With probability $n^{-a} time(n, r, \epsilon_r,

553:  \delta_r)^{-1}$

554:  \begin{itemize}

555:  \item $Far(s)$;

556:  \item $counter(s) \la 0$;

557:  \end{itemize}

558:

559: \end{enumerate}

560:

561: \item If  $local.state(s) = on$ \\$Near(s);$

562:

563: \item If $counter(s) \geq time(n, r, \epsilon_r, \delta_r)$

564: $Deactivate.square(s);$\\

565: Else $counter(s) \la counter(s) + 1;$

566:

567: \end{enumerate}

568: \}

569:

570: ${Far(s)}$\{

571: \begin{enumerate}

572: \item $s$ picks a square $\square_{i_1'\dots i_r'} \not\ni s$ uniformly at

573: random. Let $s' := s(\square_{i_1'\dots i_r'})$ . Node $s$ routes

574: its value to $s'$ geographically.

575: %\item Node $s$ sends the packet, which is routed to the node nearest

576: %to $t$ by greedy geographical routing (see \cite{wainwright} for

577: %details). Let $v$ be the node closest to $t$.

578:

579: %\} \\

580:

581: \item Node $s'$ computes $x_{s'}(t+1) = x_{s'}(t) + \frac{2}{5}(\E_\#\square_{i_1\dots i_r} x_{s}(t) -

582: \E_\#\square_{i_1 \dots i_r}x_{s'}(t))$.

583: \item $s'$ sends back a packet with its value $x_{s'}(t)$ to $s$ by greedy

584: geographic routing.

585: \item Node $s$ computes $x_s(t+1) = x_s(t) + \frac{2}{5}(\E_\#\square_{i_1\dots

586: i_r}x_{s'}(t) - \E_\#\square_{i_1\dots i_r} x_s(t))$.

587: \item $counter(s') \la 0$.

588: \end{enumerate}\}
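
The update in $Far$ can be related to the affine combinations of the Introduction by a short check. The symbol $\beta := \frac{2}{5}\E_\#\square_{i_1\dots i_r}$ is introduced here only for convenience. The update of $s$ reads
$$x_s(t+1) = (1 - \beta)\, x_s(t) + \beta\, x_{s'}(t),$$
an affine combination whose coefficients sum to $1$. It is convex only when $\beta \leq 1$, \ie when the expected number of sensors in the square is at most $\frac{5}{2}$. Every square in the hierarchy arises from splitting a square with more than $(\log n)^8$ expected sensors into roughly the square root of that many pieces, so it contains at least about $(\log n)^4$ sensors in expectation, and $\beta$ is much larger than $1$ for large $n$. If $s'$ applies the symmetric update, as in the overview, then
$$x_s(t+1) + x_{s'}(t+1) = x_s(t) + x_{s'}(t),$$
so the global sum, and hence the average, is preserved even though individual values may overshoot.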

589:

590: $Activate.square(s)$\{

591: \begin{enumerate}

592: \item If $s \in $ Level $1$, send packets to each node $s'$ in

593: $\square(s)$ setting $local.state(s') \la on$ by flooding.

594: \item If $s \in $ Level $i>1$, send packets to each Level $i-1$ node $s'$ in

595: $\square(s)$ by greedy geographic routing, setting $global.state(s')

596: \la on$.

597: \end{enumerate}

598: \}

599:

600: $Deactivate.square(s)$\{

601: \begin{enumerate}

602: \item If $s \in $ Level $1$, send packets to each node $s'$ in

603: $\square(s)$ setting $local.state(s') \la off$ by flooding.

604: \item If $s \in $ Level $i>1$, send packets to each Level $i-1$ node $s'$ in

605: $\square(s)$ by greedy geographic routing, setting $global.state(s')

606: \la off$.

607: \end{enumerate}

608: \}

609: \section{Analyzing the number of Transmissions}

610: Let $H(n, r, \epsilon_r, \delta_r)$ denote the number of

611: transmissions used in our protocol in one round of

612: $\square_{i_1\dots i_r}$, in order to diminish the variance (of the

613: values carried by sensors in $\square_{i_1\dots i_r}$) by a factor

614: $\epsilon_r$, with probability $1-\delta_r$.

615:

616: %We shall need an observation, which is a consequence of

617: %Lemma~\ref{l:2} applied to a vector $\y$ defined analogously to $\z$

618: %in the discussion above.

619:

620: %We shall explain this in greater detail in the final version of the

621: %paper.

622:

623: \begin{observation}\label{l:red}

624:  In one round, \ie the duration

625: between $s$ activating $\square(s) := \square_{i_1\dots i_r}$ and

626: deactivating $\square(s)$, the number of long-range packet exchanges

627: between sensors of the kind $s(\square_{i_1\dots i_r i_{r+1}})$ is

628: $\Theta\left(\tilde{n} \log(\frac{\tilde{n}}{\epsilon_r})\right)$

629: w.h.p, where $$\tilde{n} = \frac{\E_{\#}[\square_{i_1\dots

630: i_r}]}{\E_{\#}[\square_{i_1 \dots i_r i_{r+1}}]}.$$

631: \end{observation}

632:   Each of these

633: involves $O(\sqrt{\E_{\#}[\square(s)]}) \tilde{n}$ hops w.h.p (see

634: \cite{wainwright}). Therefore the total number of transmissions here

635: is $O\left(\tilde{n}^2 \log(\frac{\tilde{n}}{\epsilon_r})\right)$

636: w.h.p.

637:

638: Each of these long-range packet exchanges is followed by a period of

639: averaging within the involved subsquares, and this takes $H(n, r+1,

640: \epsilon_{r+1}, \delta_{r+1}) = \Omega(\tilde{n})$ transmissions. Thus

641: we have the recurrence \beqs \label{e:1} H(n, r, \epsilon_r,

642: \delta_r) & = & O\left((H(n, r+1, \epsilon_{r+1}, \delta_{r+1}) +

643: \tilde{n})\tilde{n}\log(\frac{\tilde{n}}{\epsilon_r})\right)\\

644:  & = &  O\left(H(n, r+1, \epsilon_{r+1}, \delta_{r+1})\tilde{n}\log(\frac{\tilde{n}}{\epsilon_r})\right).

645:  \eeqs

646:

647:  %In subsection~\ref{epsdel}, we shall

648: % It can be shown that to obtain the desired

649: % result, it suffices to choose $\epsilon_{r+1} =

650: % \frac{\epsilon_{r}}{25 n^{3/2}}$ for $r \leq \ell - 1$, and

651: % $\delta_{r+1} = \frac{\delta_r}{\tilde{n}^2}$. $\epsilon_0 :=

652: % \epsilon$, and $\delta_0 := \delta$, which are input parameters.

653: As mentioned in subsection~\ref{s:1}, we let

654:  $\epsilon_0 =

655: \epsilon$, $\delta_0 = \delta$ and  recursively define

656: $\epsilon_{r+1} := \frac{\epsilon_r}{25 n^{7/2}}$ and $\delta_{r+1}

657: = \frac{\delta_r}{n_r^2}$.

658:  For these parameters,

659: $\delta_r = \Omega(\frac{1}{\text{poly}(n)})$, since $\delta_0 =

660: \Omega(\frac{1}{\text{poly}(n)})$ and the $\tilde{n}$ telescope.

661: $\epsilon_r = \epsilon_0\,\Omega\left(n^{-O(\log\log n)}\right)$ since $\ell \sim

662: \log \log n$.

663:  Now, the smallest squares that we create have $O(\text{polylog}n)$

664:  sensors each w.h.p. Since the ordinary averaging that we do there

665:  (described by the procedure ``Near(s)'') has an averaging time that is

666:  quadratic \cite{Boyd, Boyd2},

667: $H(n, \ell, \epsilon_{\ell}, \delta_\ell) =

668: \Omega(\text{polylog}(\frac{n}{\epsilon_\ell}))$. And so using the

669: recurrence for $H$ and telescoping, we see that the total number of

670: transmissions is

671:  \beqs H(n, 0, \epsilon_0, \delta_0) &=& \left(H(n,

672: \ell, \epsilon_{\ell}, \delta_{\ell})\right)\prod_r

673: \left\{\frac{\E_{\#}[\square_{i_1\dots i_r}]}{\E_{\#}[\square_{i_1

674: \dots i_r i_{r+1}}]} \log \frac{n}{\epsilon_r}\right\}\\

675: & = & n (\log \frac{n}{\epsilon})^{O(\log \log n)}. \eeqs This is

676: $n^{1+o(1)}$ if $\epsilon = \exp(-n^{\frac{o(1)}{\log \log n}})$,

677: and $\delta = n^{-O(1)}$.
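To see why the stated choice of $\epsilon$ suffices, here is a sketch of the calculation. The constant $C$ and the function $g(n) = o(1)$, $g \geq 0$, are names introduced only for this illustration. Write the polylogarithmic factor as $(\log \frac{n}{\epsilon})^{C \log\log n}$ and suppose $\log\frac{1}{\epsilon} = n^{g(n)/\log\log n}$. Then $\log\frac{n}{\epsilon} \leq 2\max\left(\log n, n^{g(n)/\log\log n}\right)$, so for $n \geq 3$
\beqs \log\left[\left(\log \frac{n}{\epsilon}\right)^{C\log\log n}\right] & = & C \log\log n \cdot \log\log\frac{n}{\epsilon}\\
& \leq & C\log\log n\left(\log 2 + \log\log n + \frac{g(n)\log n}{\log\log n}\right)\\
& = & o(\log n). \eeqs
Hence this factor is $n^{o(1)}$, and the total number of transmissions is $n^{1+o(1)}$.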

678: %\section{Lemma}

679:

680: \section{Notes on Correctness}

681: In the algorithm proposed in this paper, each square $\square(s)$

682: has a certain latency, which is the averaging time restricted to

683: that square. In order for our algorithm to be correct, we require

684: that $\square(s)$ be undisturbed by the long-range exchanges that

685: $s$ is involved in, during this period. This is not a condition that

686: can be imposed without the long-range exchanges of $s$ losing their

687: i.i.d property, which is crucial in our analysis of convergence. In

688: order to retain this, and have an algorithm that is successful w.h.p

689: we have set the rates at which long-range exchanges of $s$ occur to

690: be lower than the inverse of the latency by a factor $n^a$. As a

691: consequence, w.h.p, in the course of the entire algorithm, there are

692: no long-range transmissions made by any node $s$ while $\square(s)$

693: is active. The only issue that we have not dealt with in detail is

694: showing that our choice of errors $\epsilon_r$ achieves the

695: desired end. This follows from Lemma~\ref{l:2} interpreted as

696: follows: The nodes $i$  represent {\it subsquares} $\square_{i_1

697: \dots i_r i_{r+1}}$ of $\square_{i_1 \dots i_r}$ and the $y_j(t)$

698: for different $j$ represent  the {\it sum} of the values held by the

699: nodes in a subsquare $\square_{i_1 \dots i_r j}$ after $t$ long

700: distance transmissions between subsquares since the activation of

701: $\square_{i_1 \dots i_r}$. We set $\epsilon := {\epsilon_{r+1}}

702: \|\x(0)\|$. The perturbations $n(t)$ represent the errors generated

703: from imperfect averaging within these subsquares.

704:

705: \section{Concluding Remarks}

706: We introduced {\it non-convex affine combinations} in our averaging

707: protocol in order to accelerate Geographic Gossip in Geometric

708: random graphs. The number of transmissions used in the course of our

709: protocol is $n^{1+o(1)}$. This exponent is asymptotically optimal.

710: Our algorithm, unlike the previous one in \cite{wainwright}, is not

711: completely decentralized. However, as far as we can see, this is not

712: a necessary feature associated with the use of affine combinations.

713:

714: % In this scenario, if the sensors in $\square(s)$ are not

715: %shut down after the time required for them to have performed their

716: %task, there is an excessive wastage of power.

717:

718: % The

719: %reason we required to have a sensor $s$ activate and deactivate the

720: %region $\square(s)$ is that in order for the long-range information

721: %exchanges between $s$ and other nodes of its ``Level" to form an

722: %i.i.d process, one cannot have a clock that controls the immediate

723: %subsquares of $\square(s)$ to form an i.i.d process

724:

725: \section{Future Directions}

726: It would be interesting to study whether affine combinations can be

727: used to develop a completely decentralized algorithm for Geographic

728: Gossip that is also energy efficient.

729:

730:

731: \begin{thebibliography}{50}

732: %\vspace*{0.5mm} \scriptsize

733:

734: \bibitem{Boyd}

735: S.~Boyd, A.~Ghosh, B.~Prabhakar, and D.~Shah.

736: \newblock Gossip algorithms: Design, analysis and applications.

737: \newblock In {\em Proceedings of the 24th Conference of the IEEE Communications

738:   Society (INFOCOM 2005)}, 2005.

739:

740: \bibitem{Boyd2}

741: S.~Boyd, A.~Ghosh, B.~Prabhakar, and D.~Shah.

742: \newblock Mixing Times for Random Walks on Geometric Random Graphs.

743: \newblock SIAM ANALCO 2005.

744:

745: \bibitem{car}

746: S.~Carruthers and V.~King.

747: \newblock Connectivity of Wireless Sensor Networks with Constant

748: Density.

749: \newblock In {\em ADHOC-NOW}, pages 149--157, 2004.

750:

751: \bibitem{kumar}

752: P.~Gupta and P.~Kumar.%\\

753: \newblock The capacity of wireless networks.%\\

754: \newblock {\em IEEE Transactions on Information Theory}, 46(2):388--404, March

755:   2000.

756:

757: \bibitem{wainwright}

758:  A.~Dimakis, A.~Sarwate, and M.~Wainwright.

759:  \newblock Geographic gossip: efficient aggregation for sensor

760:  networks.

761:  \newblock In {\em Proceedings of the fifth international conference on information processing in sensor networks (IPSN)}, 2006.

762:

763: \bibitem{Karp}

764: R.~Karp, C.~Schindelhauer, S.~Shenker, and B.~V\"{o}cking.%\\

765: \newblock Randomized rumor spreading.%\\

766: \newblock In {\em Proc. IEEE Conference of Foundations of Computer Science,

767:   (FOCS)}, 2000.

768:

769: \bibitem{k1}

770: D.~Kempe, J.~Kleinberg, and A.~Demers.%\\

771:  \newblock Spatial gossip and

772: resource location protocols.%\\

773:  \newblock In {\em Proc. 33rd ACM

774: Symposium on Theory of Computing,} 2001.

775:

776: \bibitem{k2}

777: D.~Kempe and J.~Kleinberg.%\\

778: \newblock Protocols and Impossibility

779: Results for Gossip-Based Communication Mechanisms.%\\

780:  \newblock In

781: {\em Proc. 43rd IEEE Symposium on Foundations of Computer Science,}

782: 2002.

783:

784: \bibitem{MoskAoyama}

785: D.~Mosk-Aoyama and D.~Shah.

786: \newblock Information dissemination via gossip: Applications to averaging and

787:   coding.

788: \newblock http://arxiv.org/cs.NI/0504029, April 2005.

789:

790: \bibitem{MR95}

791: R.~Motwani and P.~Raghavan.%\\

792: \newblock {\em Randomized Algorithms}.%\\

793: \newblock Cambridge University Press, Cambridge, 1995.

794:

795: \bibitem{Penrose}

796: M.~Penrose.%\\

797: \newblock {\em Random Geometric Graphs}.%\\

798: \newblock Oxford studies in probability. Oxford University Press, Oxford,

799: 2003.%\\

800:

801: \bibitem{Xiao}

802: L.~Xiao, S.~Boyd, and S.~Lall.%\\

803: \newblock A scheme for asynchronous distributed sensor fusion based on average

804:   consensus.%\\

805: \newblock In {\em 2005 Fourth International Symposium on Information Processing

806:   in Sensor Networks (IPSN)}, 2005.

807: \end{thebibliography}

808: \appendix

809: \section{Appendix}

810:  Let $K_n$ be the complete graph on $n$ vertices $\{1, \dots,

811: n\}.$ $\forall i,$ let $\alpha_i \in (\frac{1}{3}, \frac{1}{2}).$ At

812: time $t \geq 0$, for $i = 1, \dots, n$, let node $i$ hold the value

813: $x_i(t)$. Consider the following update rule. If the $t^{th}$ clock

814: tick belongs to node $i$, then, $i$ chooses a node $j$ uniformly at

815: random, and the following update occurs:

816:

817: \begin{itemize}\label{update}

818: \item $x_i(t) = (1-\alpha_i) x_i(t-1) + \alpha_j x_j(t-1) .$

819: \item $x_j(t) = (1-\alpha_j) x_j(t-1) + \alpha_i x_i(t-1).$

820: \end{itemize}
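
A quick check, not spelled out in the text, is that this update preserves the sum of the values held by the nodes (and hence their average), even though the weights $\alpha_i$ and $\alpha_j$ may differ:
$$x_i(t) + x_j(t) = (1-\alpha_i + \alpha_i)\, x_i(t-1) + (\alpha_j + 1 - \alpha_j)\, x_j(t-1) = x_i(t-1) + x_j(t-1).$$
All other coordinates are unchanged, so if $\sum_k x_k(0) = 0$ then $\sum_k x_k(t) = 0$ for every $t$.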

821:

822: \begin{lemma}\label{l:1}

823: $\E[\x(t)^T \x(t)] < (1-\frac{1}{2n})^t \x(0)^T \x(0)$.

824: \end{lemma}

825: {\bf Proof:}%Please refer to the Appendix.\\

826: %{\bf Proof of Lemma~\ref{l:1}}\\

827: Let the update rule for $\x(t)$ be given by $A(t-1)$, \ie \, $\x(t)

828: = A(t-1)\x(t-1)$. Note that $A(t-1) = I - (\e_i - \e_j)(\alpha_i \e_i^T

829: - \alpha_j \e_j^T)$, if the $i^{th}$ vector of the standard

830: basis is denoted by $\e_i$.

831:

832: \beqs\label{eq1}

833: \E[\x(t)^T \x(t)|\x(t-1)] & = & \E[\x(t-1)^T A(t-1)^T A(t-1) \x(t-1)|\x(t-1)]\\

834:                & = & \x(t-1)^T \E[A(t-1)^T A(t-1)] \x(t-1).

835:                \eeqs

836: Let $\alpha_i \e_i - \alpha_j \e_j = \mathbf{\alpha}_{ij}$ and $\e_i

837: - \e_j = \e_{ij}$. Then, $\E[A(t-1)^T A(t-1)] = \E[(I -

838: \e_{ij}\alpha_{ij}^T)^T(I - \e_{ij}\alpha_{ij}^T)]$.

839:

840: Let $E_{ij}$ denote the $n \times n$ matrix whose $ij^{th}$ entry is

841: $1$ and every other entry is $0$.

842:

843: Then, by expanding, one finds that \beqs \E[A(t-1)^T A(t-1)] & = & I

844: + \sum_i \frac{(1-2\alpha_i)^2 -1}{n} E_{ii} + \sum_{i \neq j}

845: \frac{(1 - (1-2\alpha_i)(1-2\alpha_j)) E_{ij}}{n(n-1)}\\

846: & = & I (1 - \frac{1}{n-1}) + \frac{\mathbf{1}\mathbf{1}^T}{n(n-1)}

847: - \frac{(\mathbf{1}-2\alpha)(\mathbf{1}-2\alpha)^T}{n(n-1)}

848: + \sum_i\frac{(1-2\alpha_i)^2E_{ii}}{n-1}. \eeqs An application of

849: the formula for $\E[\x(t)^T \x(t)|\x(t-1)]$, now gives us the

850: following:

851:

852: \beq \label{expr} \E[x(t)^T x(t) | x(t-1)] & = & \E[x(t-1)^TA(t-1)^TA(t-1)x(t-1)|x(t-1)]\\

853:                               & = & x(t-1)^T\E[A(t-1)^TA(t-1)]x(t-1)

854:                             \eeq

855: We know that $\forall i,  1-2\alpha_i \in (0, \frac{1}{3})$.

856:

857: %and so $\|1-2\alpha\| \leq \frac{\sqrt{n}}{3}$. An application of

858: %the Cauchy-Schwarz inequality gives us  $|\x(t-1)^T(\mathbf{1} -

859: %2\alpha)| \leq \|x(t-1)\|\|1-2\alpha\|$, implying the bound

860: %$|\x(t-1)^T(\mathbf{1} - 2\alpha)| \leq

861: %\frac{\sqrt{n}}{3}\|x(t-1)\|.$ Also, $x(t-1)^T \one = 0$. Applying

862: %these to the expression for $\E[A(t-1)^T A(t-1)]$ derived earlier,

863: Let us upper bound $x(t-1)^T\E[A(t-1)^TA(t-1)]x(t-1)$ using the

864: expression for $\E[A(t-1)^T A(t-1)]$ derived earlier.

865: $$x(t-1)^T I (1 - \frac{1}{n-1}) x(t-1) = (1 -

866: \frac{1}{n-1})\|x(t-1)\|^2,$$

867: $$\frac{x(t-1)^T \mathbf{1} \one^T x(t-1)}{n(n-1)} = 0 \quad \text{(as $\one^T x(t-1) = 0$)},$$

868: $$- \frac{x(t-1)^T (\mathbf{1} - 2\alpha)(\one^T - 2\alpha^T)

869: x(t-1)}{n(n-1)} \leq 0 $$ and,

870: $$x(t-1)^T \left(\sum_i \frac{(1-2\alpha_i)^2

871: E_{ii}}{n-1}\right)x(t-1)  \leq \frac{\|x(t-1)\|^2}{9(n-1)}.$$

872:

873: Adding up the above inequalities, $$\E[x(t)^Tx(t)|x(t-1)] \leq

874: \left(1 - \frac{8}{9(n-1)}\right) x(t-1)^T x(t-1).$$ As a

875: consequence,

876: $$\E[\|x(t)\|^2 \, | \, x(t-1)] < \left(1 - \frac{1}{2n}\right) \|x(t-1)\|^2.$$

877: Successively conditioning on $x(t-2), \dots, x(0)$, we see that

878: $$\E[\|x(t)\|^2] < \left(1 - \frac{1}{2n}\right)^t \|x(0)\|^2.$$

879: This proves the lemma.{\hfill $\Box$}
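
The step from the factor $1 - \frac{8}{9(n-1)}$ to $1 - \frac{1}{2n}$ in the proof above is the following elementary comparison, valid for $n \geq 2$:
$$\frac{8}{9(n-1)} > \frac{8}{9n} > \frac{1}{2n}, \quad \text{so} \quad 1 - \frac{8}{9(n-1)} < 1 - \frac{1}{2n}.$$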

880:

881:

882: An application of Markov's inequality gives us the following

883: corollary.

884: \begin{corollary}\label{c:1}

885: $$\p\left(\|x(t)\| > \epsilon \|x(0)\|\right) \leq \epsilon^{-2}\left(1 -

886: \frac{1}{2n}\right)^t.$$

887: \end{corollary}

888: {\bf Proof:}%Please refer to the Appendix.\\%{\bf Proof of Corollary~\ref{c:1}}\\

889: \beqs \p\left(\|x(t)\| > \epsilon \|x(0)\|\right) &=& \p\left(\frac{\|x(t)\|^2}{\|x(0)\|^2} > \epsilon^2 \right)\\

890:                                                 &\leq& \epsilon^{-2}\E\left(\frac{\|x(t)\|^2}{\|x(0)\|^2}\right) {\hfill (\text{Markov's inequality})} \\

891:                                                 &\leq& \epsilon^{-2}\left(1 - \frac{1}{2n}\right)^t

892: \eeqs    {\hfill $\Box$}
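
As an illustration of how this corollary can be used, the following sketch solves for a number of clock ticks $t$ that suffices for a target failure probability $\delta$. Since $1 - u \leq e^{-u}$,
$$\epsilon^{-2}\left(1 - \frac{1}{2n}\right)^t \leq \epsilon^{-2} e^{-t/(2n)} \leq \delta \quad \text{whenever} \quad t \geq 2n \ln\frac{1}{\epsilon^2\delta}.$$
For $\delta = n^{-O(1)}$ this amounts to $t = O\left(n \log \frac{n}{\epsilon}\right)$ ticks.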

893:

894:


901:

902:

903: We now consider a modified update rule, and prove a lemma similar to

904: Lemma~\ref{l:1}.

905:

906:

907:

908: Let $K_n$ be the complete graph on $n$ vertices $\{1, \dots, n\}.$

909: $\forall i,$ let $\alpha_i \in (\frac{1}{3}, \frac{1}{2}).$ At time

910: $t \geq 0$, for $i = 1, \dots, n$, let node $i$ hold the value

911: $x_i(t)$. Let $n(0), n(1), \dots$ be a sequence of real numbers.

912: Consider the following update rule. If the $t^{th}$ clock tick

913: belongs to node $i$, then, $i$ chooses a node $j$ uniformly at

914: random, and the following update occurs:

915:

916: \begin{itemize}\label{update}

917: \item $y_i(t) = (1-\alpha_i) y_i(t-1) + \alpha_j y_j(t-1) + n(t-1).$

918: \item $y_j(t) = (1-\alpha_j) y_j(t-1) + \alpha_i y_i(t-1) - n(t-1).$

919: \end{itemize}

920:

921: \begin{lemma}\label{l:2}

922: Suppose that for each $t$, $|n(t)| < \epsilon$, and that $a

923: > 0$. Then,

924: $$\p\left[\|\y(t)\| > n^{\frac{a}{2}}\left((1-\frac{1}{2n})^{t/2}\|\y(0)\| + 8\sqrt{2} n^{3/2}

925: \epsilon \right)\right] \leq \frac{5}{n^a}.$$

926: %\mathrm{poly}(n)\left((1-\frac{1}{2n})^{\frac{t}{2}}

927: % + \epsilon \right)\|\y(0)\|\right] < \frac{1}{\text{poly(n)}}.$$

928: \end{lemma}

929: {\bf Proof:}%Please refer to the Appendix.\\%{\bf Proof of Lemma~\ref{l:2}}\\

930:  $\y(t) = A(t-1)\y(t-1) + \n(t-1)$,

931: where $A(t) = I - (\e_i - \e_j)(\alpha_i \e_i^T - \alpha_j \e_j^T)$,

932: and $\n(t-1) = n(t-1)(\e_i - \e_j).$ Let $\x(0) = \y(0)$, and let

933: the $\x(t)$ satisfy $\x(t+1) = A(t)\x(t)$ as in Lemma~\ref{l:1}. We

934: observe that

935: $$\y(1) = \x(1) + \n(0)$$ and more generally,

936: $$ \y(t+1) =  \x(t+1) + \n(t) + \sum_{i=0}^{t-1} A(t)A(t-1)

937: \dots A(i+1) \n(i) .$$ An application of the triangle inequality now

938: gives us

939: $$ \|\y(t+1)\| \leq  \|\x(t+1)\| + \|\n(t)\| + \sum_{i=0}^{t-1} \|A(t)A(t-1)

940: \dots A(i+1) \n(i)\| .$$ Our approach to proving this Lemma is to

941: upper bound each term in the right hand side.
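
The general formula for $\y(t+1)$ above can be checked by unrolling the recursion one more step, using $\x(2) = A(1)\x(1)$ and $\y(1) = \x(1) + \n(0)$:
$$\y(2) = A(1)\y(1) + \n(1) = A(1)\x(1) + A(1)\n(0) + \n(1) = \x(2) + \n(1) + A(1)\n(0),$$
which is the case $t = 1$. Each further step multiplies every earlier perturbation by one more matrix $A(\cdot)$ and adds one fresh perturbation.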

942: \begin{observation}\label{o:1}

943: \beqs \p\left[\|\x(t)\|

944:  >  (1-\frac{1}{2n})^{t/2} n^{a/2} \|x(0)\|\right] & \leq &

945: \left((1-\frac{1}{2n})^{t/2}

946: n^{a/2}\right)^{-2}\E\left(\frac{\|x(t)\|^2}{\|x(0)\|^2}\right)\\

947: & \leq & \left((1-\frac{1}{2n})^{t/2} n^{a/2}\right)^{-2} (1 -

948: \frac{1}{2n})^t \\

949: & = & \frac{1}{n^a}.\eeqs

950: \end{observation}

951: The above inequalities follow from Lemma~\ref{l:1} and

952: Corollary~\ref{c:1}. We shall now upper bound the other terms as

953: well with high probability. Using Corollary~\ref{c:1} \beqs

954: \p\left[\frac{\|A(t-1) \dots A(i) \n(i-1)\|}{\|\n(i-1)\|} >

955: (1-\frac{1}{2n})^{\frac{t-i}{4}} n^{\frac{a+1}{2}}\right] & \leq &

956: ((1-\frac{1}{2n})^{\frac{t-i}{4 }} n^{\frac{a+1}{2}})^{-2}\left(1 -

957: \frac{1}{2n}\right)^{t-i}\\

958: & = & n^{-(a+1)}(1-\frac{1}{2n})^{\frac{t-i}{2}}. \eeqs

959:

960: However,

961: $$\sum_{i=1}^{t-1} (1-\frac{1}{2n})^{\frac{t-i}{2}} n^{-(a+1)} <

962: \frac{4}{n^a}$$ and so,

963: $$ \p\left[\exists_i \left\{\frac{\|A(t-1) \dots A(i)

964: \n(i-1)\|}{\|\n(i-1)\|}

965: > (1-\frac{1}{2n})^{\frac{t-i}{4}} n^{\frac{a+1}{2}}\right\}\right] \leq

966: \frac{4}{n^a}.$$

967:

968: We next observe that $$\sum_{i \leq t}

969: (1-\frac{1}{2n})^{\frac{t-i}{4}} n^{\frac{a+1}{2}} < 8

970: n^\frac{a+3}{2}.$$ As a consequence we have

971:

972: \begin{observation}

973: $$ \p\left[\sum_i \frac{\|A(t-1) \dots A(i)

974: \n(i-1)\|}{\|\n(i-1)\|}

975: > 8 n^\frac{a+3}{2}\right] \leq \frac{4}{n^a}.$$

976: \end{observation}
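
Both geometric sums used above can be bounded with Bernoulli's inequality $(1-u)^{c} \leq 1 - cu$, valid for $u \in [0,1]$ and $c \in (0,1]$. With $u = \frac{1}{2n}$,
$$\sum_{k \geq 0} (1-u)^{k/2} = \frac{1}{1 - (1-u)^{1/2}} \leq \frac{2}{u} = 4n, \qquad \sum_{k \geq 0} (1-u)^{k/4} = \frac{1}{1-(1-u)^{1/4}} \leq \frac{4}{u} = 8n.$$
The finite sums in the text are strictly smaller than these infinite ones. Multiplying the first by $n^{-(a+1)}$ gives the bound $\frac{4}{n^a}$, and multiplying the second by $n^{\frac{a+1}{2}}$ gives $8 n^{\frac{a+3}{2}}$.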

977:

978: % ((1-\frac{1}{2n})^{\frac{t-i}{4 }} n^{\frac{a+1}{2}})^{-2}\left(1

979: %//-

980: %\frac{1}{2n}\right)^{t-i}\\

981: %& = & n^{-(a+1)}(1-\frac{1}{2n})^{\frac{t-i}{2}}. \eeqs

982: % Therefore

983: %\beqs \p\left[\sum_{i=1}^{t-1} \frac{\|A(t-1) \dots A(i)

984: %\n(i-1)\|}{\|\n(i-1)\|} > \sum_{i=1}^{t-1}

985: %(1-\frac{1}{2n})^{\frac{t-i}{4}} n^{\frac{a+1}{2}}\right] \leq

986: %\p\left[\sum_{i=1}^{t-1} \frac{\|A(t-1) \dots A(i)

987: %\n(i-1)\|}{\|\n(i-1)\|} > \right]\\ & < &  \sum_{i=1}^{t-1}

988: %n^{-(a+1)}(1-\frac{1}{2n})^{\frac{t-i}{2}}\\

989: %& < &  . \eeqs

990:

991: Once we put the above two observations together and note that

992: $(\forall i) \sqrt{2} \epsilon \geq \|\n(i)\|$, an application of

993: the union bound gives

994: $$ \p\left[\|\y(t)\| > n^{\frac{a}{2}}\left((1-\frac{1}{2n})^{t/2}\|\y(0)\|+

995: 8 \sqrt{2} n^{3/2} \epsilon \right)\right]  \leq \frac{5}{n^a}.$$

996: {\hfill $\Box$}

997: \end{document}

998: