
1: \subsection{Successor Pointers}

2: \begin{figure}

3: 	\centering

4: 	\includegraphics[width=9cm, height=7cm]{w1_trans.eps}

5: %	\includegraphics[width=9cm]{w1_trans.eps}

6: %	\vspace*{-0.25cm}

7: 	\caption{Changes in $W_1$, the number of wrong  (failed or outdated) $s_1$ pointers, due to joins, failures and stabilizations.}

8: 	\label{fig:w1-trans}

9: \end{figure}

10: %\begin{figure}

11: %	\centering

12: %		\includegraphics[height=7cm, angle=270]{wd}

13: %	\caption{Theory and Simulation for $W_1(r,\alpha)$ and $D_1(r,\alpha)$}

14: %	\label{fig:w}

15: %\end{figure}

16: %

We now turn to estimating various quantities of interest for Chord.
In all that follows we evaluate {\it average} quantities
as functions of the parameters. To do this we need
to understand how the dynamical evolution of the system affects these
quantities.

22: %However this same formalism can also be used

23: %for  evaluating higher moments like the variance.

24:

In the case of Chord, we only need to consider
three kinds of events, of which at most one happens at any micro-instant: a join, a failure
or a stabilization. One assumption made in the following
is that such a micro-instant of time exists, or in other words, that
we can divide time until we have an interval small enough that
only one of these three processes occurs in it
\minorchange{anywhere in the system}.
Implicit in this is the assumption that a stabilization
(either of successors or fingers) completes on a time scale much shorter than the
time scales over which joins and failures occur.

35:

36: Another aspect of this system which simplifies analysis is that

37: successor pointers of adjacent nodes are independent of each other.

38: That is, the state of the first successor pointer of a given node

39: does not affect the state of the first successor pointer of either its

40: predecessor or its successor. The same logic also works for the

41: state of the second successor pointers of adjacent nodes and so on.

42: On the other hand, the state of the second successor pointer

43: of a node is clearly related to the state of its first successor

pointer as well as the state of the first successor pointer of its successor.

45: This is taken into account in the analysis of second and higher successor

46: pointers. In characterizing the states of higher successors,

47: we look for the leading order behavior in terms of the

48: parameter $r$.

49:

50: %In the case of finger pointers, two adjacent nodes might have the same finger,

51: %in the sense that some finger of a node $n$ and some finger of its successor might

52: %be pointing to the same node. We take this into account in our analysis.

53: %However the state of the fingers of nodes far apart are independent.

54:

55:

56: %the rest of our analysis on successors, fingers and lookup lengths

57: %is accurate to order $1/r$. Higher order terms though

58: %not entirely neglected, are less accurate.

59: %Clearly these will need to be taken into account to get a

60: %more accurate estimate in the high-churn regime, where the rates of

61: %joins and failures become of the same order of magnitude as

62: %the rate of stabilisation.

63:

64: %Assumptions of this sort are the building blocks of fluid models,

65: %particularly in the master equation approach. If we were to

66: %take more correlations into account the probability of a state

67: %described by a number of properties (say the state of the successors

68: %and fingers) would not factorize into the probability that each

69: %of these properties hold separately. The equation for the probability

70: %of a certain property to hold would then, in general, depend on the

71: %probability of certain pairs of properties to hold, and so on,

72: %leading to a tower of equations known in physical kinetics as the BBKGY

73: %hierarchy~\cite{vanKampen}. Naturally, this full tower of equations

74: %is no simpler than an underlying more detailed description, and

75: %has to be truncated at some level by a \textit{closure approximation}.

76: %Hence, the name of the game is to make the \textit{least} number

77: %of assumptions on dependency while still catching the behavior of

78: %the system. As we see below, the assumptions made here

79: %are sufficiently precise to predict

80: %all quantities extremely accurately, but it should be kept

81: %in mind that analysis is hence not \textit{exact}.

82:

83: Consider first the successor pointers.

Let $w_k(r,\alpha)$ denote the fraction of nodes having
a \emph{wrong} $k^{th}$ successor pointer and
$d_{k}(r,\alpha)$ the fraction of nodes having
a \emph{failed} $k^{th}$ successor pointer.
Similarly, let $W_k(r,\alpha)$ and $D_{k}(r,\alpha)$ be the corresponding
\emph{numbers} of nodes, so that $w_k = W_k/N$ and $d_k = D_k/N$.
A \emph{failed} pointer is one
that points to a departed node, while
a \emph{wrong} pointer points either to an
incorrect node (alive but not correct) or to a dead one.
As we will see, both these quantities play a role
in predicting lookup consistency and lookup length.

99:

100:

By the protocol for stabilizing successors in Chord, a node periodically contacts its first successor, possibly correcting it and reconciling its own successor list with that of the successor. Therefore, the numbers of wrong $k^{th}$ successor pointers are not independent quantities but depend on the number of wrong first successor pointers.

102: %We first consider $s_1$ here, and then

103: %briefly discuss the other cases towards the end of this section.

104:

105: %( We derive similar relations for $s_k, k >1$ in \cite{ansary:analysis}).

106:

107: %Define $P_{nb}$ to be the probability that the network does not break up (i.e. a single node gets disconnected (should we

108: %say from the ring?)). In our analysis, we consider only the case where $P_{nb}=1$. This is achieved by setting the length

109: %of the successors list ${\cal S}$ to $O(\log(N))$.

110: %

111:

112: %We write an equation for $W_1(r,\alpha)$ by accounting (table \ref{tab:wrong}) for all the events that can change it in a micro %event of time $\Delta t$.

113:

114: \begin{table}[t]

115: \caption{Gain and loss terms for $W_1(r,\alpha)$: the number of wrong first successors

116: as a function of $r$ and $\alpha$.}

117: \label{tab:wrong}

118: 	\centering

119: 		\begin{tabular}{|l|l|} \hline

120: 		Change in $W_1(r,\alpha)$	&  \minorchange{Probability of Occurrence}   \\ %\hline

121: 		$W_1(t+\Delta t) = W_1(t)+1$ & $c_{1.1}=(\lambda_j \minorchange{N} \Delta t) (1-w_1)$ \\ %\hline

122: 		$W_1(t+\Delta t) = W_1(t)+1$ & $c_{1.2}=\lambda_f \minorchange{N} (1-w_1)^2   \Delta t$ \\ %\hline

123: 		$W_1(t+\Delta t) = W_1(t)-1$ & $c_{1.3}=\lambda_f \minorchange{N} w_1^2   \Delta t $ \\

124: 		$W_1(t+\Delta t) = W_1(t)-1$ & $c_{1.4}=\alpha\lambda_s \minorchange{N} w_1   \Delta t $\\ %\hline

125: 		$W_1(t+\Delta t) = W_1(t)$ & $1 - (c_{1.1} + c_{1.2} + c_{1.3} + c_{1.4})$\\

126: \hline

127: 		\end{tabular}

128: %\vspace*{-0.35cm}

129: \end{table}

130:

We write an equation for $W_1(r,\alpha)$ by accounting for all the events that can change it in a micro-instant of duration $\Delta t$. The different cases in which $W_1$ changes due to joins, failures and stabilizations are illustrated in Fig. \ref{fig:w1-trans}. In some cases $W_1$ increases or decreases by one, while in others it stays unchanged. For each
increase or decrease, Table \ref{tab:wrong} provides the corresponding
\minorchange{probability}.

134:

135: By our implementation of the join protocol, a new node $n_y$, joining between two nodes $n_x$ and $n_z$, always has a correct $s_1$ pointer after the join. However the state of $n_x.s_1$ before the join makes a difference. If $n_x.s_1$ was correct (pointing to $n_z$) before the join, then after the join it will be wrong and therefore $W_1$ increases by $1$. If $n_x.s_1$ was wrong before the join, then it will remain wrong after the join and $W_1$ is unaffected. Thus, we need to account for the former case only. The probability that $n_x.s_1$ is correct is $1-w_1$ and term $c_{1.1}$ follows from this.

136:

For failures, we have four cases. To illustrate them we use three consecutive nodes $n_x$, $n_y$, $n_z$ and assume that $n_y$ is going to fail.
First, if both $n_x.s_1$ and $n_y.s_1$ were correct, then the failure of $n_y$ makes $n_x.s_1$ wrong and hence $W_1$ increases by $1$. Second, if $n_x.s_1$ and $n_y.s_1$ were both wrong, then the failure of $n_y$ decreases $W_1$ by $1$,
since one wrong pointer disappears. Third, if $n_x.s_1$ was wrong
and $n_y.s_1$ was correct, then $W_1$ is unaffected. Fourth, if $n_x.s_1$ was correct and $n_y.s_1$ was wrong, then the wrong pointer of $n_y$ disappears and $n_x.s_1$ becomes wrong, so $W_1$ is again unaffected. For the first case to happen, we need to pick two nodes with correct pointers; the probability of this is $(1-w_1)^2$. For the second case to happen, we need to pick two nodes with wrong pointers; the probability of this is $w_1^2$. From these probabilities follow the terms $c_{1.2}$ and $c_{1.3}$.

141:

142: Finally, a successor stabilization does not affect $W_1$, unless the stabilizing node had a wrong pointer. The probability of picking such a node is $w_1$. From this follows the term $c_{1.4}$.
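
As an intermediate step, the entries of Table \ref{tab:wrong} can be combined into the expected change of $W_1$ over one micro-instant. This is only a sketch of the bookkeeping, weighting each change of $\pm 1$ by its probability:
\begin{equation}
\langle W_1(t+\Delta t) - W_1(t) \rangle = c_{1.1} + c_{1.2} - c_{1.3} - c_{1.4} \nonumber
\end{equation}
Dividing both sides by $N \Delta t$ and letting $\Delta t \to 0$ turns this into a rate equation for $W_1$.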

143:

144: Hence the equation for $W_1(r,\alpha)$ is:

145: \begin{equation}

146: \frac{d W_1}{\minorchange{N} dt}= \lambda_j (1-w_1) + \lambda_f (1-w_1)^2  - \lambda_f w_1^2 - \alpha\lambda_s w_1    \nonumber

147: \end{equation}

148: Solving for $w_1$ in the steady state and putting $\lambda_j=\lambda_f$, we get:

149: \begin{equation}

150: w_1(r,\alpha) = \frac{2}{3+r\alpha} \approx \frac{2}{r\alpha}

151: \end{equation}
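
To make the steady-state step explicit, a short sketch follows. It assumes that $r$ denotes the ratio $\lambda_s/\lambda_f$ of the stabilization rate to the failure rate, which is how the parameter enters the solution (this section uses $r$ without restating its definition). Setting $dW_1/dt=0$ and $\lambda_j=\lambda_f$, and dividing by $\lambda_f$, gives
\begin{eqnarray*}
0 &=& (1-w_1) + (1-w_1)^2 - w_1^2 - \alpha r\, w_1 \\
  &=& 2 - (3+\alpha r)\, w_1 ,
\end{eqnarray*}
since the quadratic terms cancel. As a purely illustrative check with the hypothetical values $r=100$ and $\alpha=0.5$, the full expression gives $w_1 = 2/53 \approx 0.038$, while the leading-order form gives $2/50 = 0.04$.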

152:

The expression for $w_1$ matches the simulation results well, as shown in Fig.
\ref{fig:wi}.
The fraction of failed first successors is then $d_1(r,\alpha) \approx \frac{1}{2}w_1(r,\alpha)$,
since when $\lambda_j=\lambda_f$, about half of the wrong pointers
point to live but incorrect nodes and about half point to dead nodes.
Thus $ d_1(r,\alpha) \approx \frac{1}{r\alpha}$, which
also matches the simulations well, as shown in Fig. \ref{fig:wi}.

160: %We can also use the above reasoning to iteratively get $w_k(r,\alpha)$ for

161: %any $k$.

162:

163: \begin{figure}

164: 	\centering

165: 		\includegraphics[height=8cm, angle=270]{wdboth-sep}

166: 		%\includegraphics[height=8cm, angle=270]{i-sep}

167: 	\caption{Theory and simulation for the probability of wrong $1^{st}$ successor $w_1(r,\alpha)$ and failed $1^{st}$ successor $d_1(r,\alpha)$.}

168: 	\label{fig:wi}

169: \end{figure}

170:

171: The fraction of wrong second successors can be estimated in an analogous manner.

172: Consider, for a node $n$, the possible states of

173: the successor, $n.s_1$, the successor of the successor,

174: $*(n.s_1).s_1$, and

175: the second successor, $n.s_2$.

176: In a fully correct state,

177: $*(n.s_1).s_1$ and $n.s_2$ of course point to the same node.

If in such a state either $n.s_1$ or $*(n.s_1).s_1$
becomes incorrect through the action of a join or a failure, then
$n.s_2$ is also incorrect. On the other hand, $n.s_2$
cannot be corrected by the stabilization protocol
unless $n.s_1$ and $*(n.s_1).s_1$ have both already been corrected.
Hence, $n.s_2$ is wrong if either $n.s_1$ or $*(n.s_1).s_1$ is
wrong, and also if both $n.s_1$ and $*(n.s_1).s_1$ are correct
but $n.s_2$ has not yet been corrected.

186: If the number of such non-stabilized

187: configurations is $N_2$ and the fraction is $n_2$, we have

188: \begin{equation}

189: \label{eq:w2-equation}

190: w_2 = 2w_1 - w_1^2 + n_2

191: \end{equation}

192:

193: To estimate $n_2$ we consider how these configurations might be

194: gained or lost.  The gain term arises

195: from stabilizations of configurations

196: where $n.s_1$ is correct but $*(n.s_1).s_1$  is wrong.

197: A stabilization performed by node $n.s_1$ then

results in the gain of an $N_2$ configuration.

199: %Note that configurations in which $n.s_1$ is wrong but

200: %$*(n.s_1).s_1$ is correct do not add to $N_2$, since

201: %a stabilization by node $n$

202: %the whole successor list is copied to the first list

203: %when $n.s_1$ is stabilized, so then $n.s_2$ is

204: %immediately also corrected.

205: On the other hand, non-stabilized configurations are lost either

206: by a stabilization performed by node $n$ (when it gets the correct

207: successor list from its successor and hence corrects $n.s_2$),

208: or by corrupting either  $n.s_1$ or $*(n.s_1).s_1$

(by a join or failure). The latter possibility
gives terms of order $\frac{1}{r^2}$ and we can ignore
it in the limit \minorchange{that} stabilizations happen on
a much faster time scale than joins and failures (\textit{i.e.},
$r$ much larger than unity). The equation for $N_2$ is hence

214: \begin{equation}

215: \label{eq:n2-equation}

\frac{d N_2}{N dt} \approx

217: \alpha\lambda_s w_1 (1-w_1) - \alpha\lambda_s n_2

218: \end{equation}

which implies $n_2\approx w_1$ to order $\frac{1}{r}$.
Thus, we have $w_2 \approx 3 w_1 \approx \frac{6}{\alpha r}$.
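
The step from Eq. (\ref{eq:n2-equation}) to $w_2$ can be spelled out as follows; this is a sketch at leading order in $\frac{1}{r}$. In the steady state the gain and loss terms of Eq. (\ref{eq:n2-equation}) balance, so that
\begin{eqnarray*}
n_2 &\approx& w_1 (1-w_1) , \\
w_2 &=& 2 w_1 - w_1^2 + n_2 \approx 3 w_1 - 2 w_1^2 \approx \frac{6}{\alpha r} ,
\end{eqnarray*}
where the last step uses $w_1 \approx \frac{2}{\alpha r}$ and drops terms of order $\frac{1}{r^2}$.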

221:

222:

223: For higher successors we reason similarly by considering

224: the state of the  ${k-1}^{st}$ successor pointer of node $n$,

225: the successor pointer of the ${k-1}^{st}$ successor,

226: and the $k^{th}$ successor pointer of node $n$.

We can write a recursion equation for $w_k$, the fraction of nodes with
a wrong $k^{th}$ successor pointer:

229: \begin{equation}

230: \label{eq:wk-equation}

231: w_k = w_1 + w_{k-1} - w_{k-1} w_1 + n_k

232: \end{equation}

233: where $n_k$ is the density of configurations where

234: the ${k-1}^{st}$ successor pointer of node $n$ and the first successor pointer

235: of the ${k-1}^{st}$ successor are both correct, but this information

236: has not yet been used to correct the $k^{th}$ successor pointer of node $n$.

237: If node $n$ does not as yet have the correct information about its

238: $k^{th}$ successor, that means that either all the nodes in between $n$ and its ${k-1}^{st}$  successor have the correct information but node $n$ has not as yet stabilized, or that the stabilization has propagated back from the ${k-1}^{st}$ successor

239: to  some node in between but not as yet to $n.s_1$.

240: To elaborate on this further, there is the case where the

241: second successor pointer

242: of the ${k-2}^{nd}$  successor has not been corrected, then the case where

243: this has been done, but the third successor pointer of

244: the  ${k-3}^{rd}$ successor has not been corrected, and so on.

245: Each of these is analogous to $n_2$ and each occurs with density

246: $(1-w_{k-1})w_1$, if joins and failures are neglected compared

247: to stabilizations.

248: Hence, if to leading order in $\frac{1}{r}$ we have

249: $w_k \sim \frac{c_k}{\alpha r}$, then

250: \begin{equation}

251: \label{eq:ck-equation}

252: c_k = c_{k-1} + k c_1

253: \end{equation}

254: which leads to

255: \begin{equation}

256: \label{eq:wk-leading}

w_k \approx \frac{k(k+1)}{\alpha r}.
\end{equation}

259: We note that this expression obviously depends on the

260: details of the stabilization scheme, and is in principle

261: only valid up to $k \sim \sqrt{r}$.

As shown in Fig. \ref{fig:wk}, the agreement between
theory and simulation is nevertheless quite reasonable
at $k=5$ and $r=100$.
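
For reference, the recursion (\ref{eq:ck-equation}) can be unrolled explicitly. With $c_1 = 2$, as read off from $w_1 \approx \frac{2}{\alpha r}$,
\begin{equation}
c_k = c_1 \sum_{j=1}^{k} j = c_1 \frac{k(k+1)}{2} = k(k+1) \nonumber
\end{equation}
so that $c_2 = 6$, $c_3 = 12$, $c_4 = 20$ and $c_5 = 30$; for example, $w_5 \approx \frac{30}{\alpha r}$ at leading order.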

265: \begin{figure}

266: 	\centering

267: 		\includegraphics[height=8cm, angle=270]{wk}

268: 		%\includegraphics[height=8cm, angle=270]{i-sep}

269: 	\caption{Theory and simulation for the probability of a wrong $k^{th}$ successor $w_k(r,\alpha)$.}

270: 	\label{fig:wk}

271: \end{figure}

272:

273:

274: \subsection{Break-up (Network Disconnection) Probability}

275:

276:

277: \begin{table}[t]

278: \caption{Gain and loss terms for $N_{bu} (2,r, \alpha)$:

279: the number of nodes with dead

280: first {\em and} second successors.}

281: \label{tab:disconnect}

282: 	\centering

283: 		\begin{tabular}{|l|l|} \hline

		Change in $N_{bu}(2,r,\alpha)$	&  \minorchange{Probability of Occurrence}   \\ %\hline

285: 		$N_{bu}(t+\Delta t) = N_{bu}(t)+1$ & $c_{2.1}=(\lambda_f \minorchange{N} \Delta t)d_1 (r, \alpha)$ \\ %\hline

286: 		$N_{bu}(t+\Delta t) = N_{bu}(t)+1$ & $c_{2.2}=\lambda_f \minorchange{N} \Delta t (1-d_1) d_2 $ \\ %\hline

287: 		$N_{bu}(t+\Delta t) = N_{bu}(t)-1$ & $c_{2.3}=\alpha \lambda_s \minorchange{N} \Delta t P_{bu}(2,r,\alpha) $ \\

288: 		$N_{bu}(t+\Delta t) = N_{bu}(t)$ & $1 - (c_{2.1} + c_{2.2} + c_{2.3} )$\\

289: \hline

290: 		\end{tabular}

291: %\vspace*{-0.35cm}

292: \end{table}

293:

We demonstrate below how calculating $d_k(r, \alpha)$,
the fraction of nodes with dead $k^{th}$ successor pointers,
helps in estimating the probability that
the network gets disconnected for any value of $r$ and $\alpha$.

298: Let $P_{bu} (n, r,\alpha)$ be the probability that

299: $n$ consecutive nodes fail. If

300: $n={\cal S}$, the length of the successor list, then clearly the node

301: whose successor list this is, gets disconnected from the network

302: and the network breaks up.

For the range of $r$ considered in Fig. \ref{fig:wi},
$P_{bu}({\cal S},r,\alpha) \sim 0$. For smaller values of $r$, however, this
probability becomes non-negligible. The master equation analysis
introduced here can be used to estimate $P_{bu} (n,r,\alpha)$
for any $1\le n \le {\cal S}$. We
indicate how this might be done by first considering the case $n=2$.

309: Let $N_{bu} (2,r,\alpha)$ be the number of configurations in which

310: a node has both $s_1$ and $s_2$ dead and $P_{bu}(2,r,\alpha)$ be the

311: fraction of such configurations.

312: Table \ref{tab:disconnect} indicates how this is estimated

313: within the present framework.

314:

315:

316: \begin{figure}

317: 	\centering

318: 		\includegraphics[height=9cm, angle=270]{d2}

319: 	\caption{Theory and simulation for the probability of failure of the $2^{nd}$ successor, $d_2(r,\alpha)$.}

320: 	\label{fig:d2}

321: \end{figure}

322:

323: A join event does not affect this probability in any way. So we only need to

324: consider the effect of failures or stabilization events.

The term $c_{2.1}$ accounts for the situation in which the
{\em first} successor of a node is already dead
(which happens with probability $d_1 (r, \alpha)$ as explained above)
and a failure event then kills its second successor as well.
The term $c_{2.2}$ accounts for the situation in which the first
successor is alive (with probability $1-d_1$) but the second successor
is dead (with probability $d_2$), and a failure event then kills the first successor.
The logic used to estimate $d_2$
(or $d_k$ in general) is very similar to the reasoning
we used to estimate the $w_k$'s. So we have

334: \begin{equation}

335: \label{eq:dk-equation}

336: d_k = d_{1} + (k-1) d_1 = k d_1

337: \end{equation}

Thus the $k^{th}$ successor of a node is dead if the $(k-1)^{st}$
successor's successor is dead, or if the $(k-1)^{st}$ successor's successor
is not dead but the intermediate nodes think it is
because they have not yet stabilized.
Hence $d_2 \sim 2/(\alpha r)$. This estimate for $d_2$ matches the simulation results very well, as shown in Fig. \ref{fig:d2}.

343:

344: Coming back to counting the gain and loss terms for $N_{bu}(2,r,\alpha)$,

345: a stabilization event reduces

346: the number of such configurations by one, if

347: the node doing the stabilization had such a configuration to begin with.

348:

349: Solving the equation for $N_{bu} (2,r,\alpha)$, one hence obtains

350: that $P_{bu}(2,r,\alpha) \sim 3/(\alpha r)^2$.

351: As Fig. \ref{fig:fos2} shows, this is a

352: precise estimate.
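
The rate equation implied by Table \ref{tab:disconnect} can be written out as follows. This is a sketch that again assumes $r=\lambda_s/\lambda_f$:
\begin{equation}
\frac{d N_{bu}}{N dt} = \lambda_f d_1 + \lambda_f (1-d_1) d_2 - \alpha \lambda_s P_{bu}(2,r,\alpha) \nonumber
\end{equation}
In the steady state, with $d_1 \approx \frac{1}{\alpha r}$ and $d_2 \approx \frac{2}{\alpha r}$, this gives
\begin{equation}
P_{bu}(2,r,\alpha) = \frac{d_1 + (1-d_1) d_2}{\alpha r} \approx \frac{3}{(\alpha r)^2} \nonumber
\end{equation}
to leading order in $\frac{1}{r}$.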

353:

354: We can similarly estimate the probabilities for three consecutive nodes

355: failing, {\it etc}, and hence also the general disconnection

356: probability $P_{bu}({\cal S},r,\alpha)$. In fact

357: $P_{bu}({\cal S},r,\alpha)$ may be written in terms of the

358: $d_k(r, \alpha)$ as:

359: \begin{equation}

360: \label{eq:Pbu-equation}

361: P_{bu}({\cal S}) = ({{\cal S}-1})! \frac{\sum_{1}^{\cal S} d_i(r,\alpha)}{(\alpha r)^{{\cal S}-1}}

362: \end{equation}

363: The logic behind this equation is similar to that used for

364: solving for $P_{bu}(2)$, namely that for ${\cal S}$ consecutive nodes to fail, any ${{\cal S}-1}$ of the  ${\cal S}$ nodes should have

365: failed first, and then a failure event kills the remaining node.

Equation (\ref{eq:Pbu-equation})
is readily evaluated by substituting the values of the $d_k$'s to get

368: \begin{equation}

369: \label{eq:Pbu-solution}

370: P_{bu}({\cal S})= \frac{({{\cal S}+1})!}{2 (\alpha r)^{{\cal S}}}

371: \end{equation}
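
The substitution can be sketched explicitly. Using $d_i \approx \frac{i}{\alpha r}$, which follows from Eq. (\ref{eq:dk-equation}) with $d_1 \approx \frac{1}{\alpha r}$,
\begin{eqnarray*}
\sum_{i=1}^{\cal S} d_i &\approx& \frac{{\cal S}({\cal S}+1)}{2 \alpha r} , \\
P_{bu}({\cal S}) &\approx& ({\cal S}-1)! \, \frac{{\cal S}({\cal S}+1)}{2 (\alpha r)^{\cal S}} = \frac{({\cal S}+1)!}{2 (\alpha r)^{\cal S}} .
\end{eqnarray*}
For ${\cal S}=2$ this reduces to $\frac{3}{(\alpha r)^2}$, in agreement with the estimate for $P_{bu}(2,r,\alpha)$ obtained above.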

372:

As mentioned above, this is again correct only to leading order:
there will be correction terms of order $1/r^{{\minorchange{\cal S}} +1}$, which we have not
computed at this level of approximation.

376: The Master Equation formalism

377: thus affords the possibility of making a precise

378: prediction for when the system runs the danger of

379: getting disconnected, as a function of the parameters.

380:

381:

382:

383: \begin{figure}

384: 	\centering

385: 		\includegraphics[height=9cm, angle=270]{fos2}

386: 	\caption{Theory and simulation for the break-up probability $P_{bu}(2, r, \alpha)$.}

387: 	\label{fig:fos2}

388: \end{figure}

389:

390:

391:

392:

393:

394:

395:

396:

397: