1: \subsection{Successor Pointers}
2: \begin{figure}
3: 	\centering
4: 	\includegraphics[width=9cm, height=7cm]{w1_trans.eps}
5: %	\includegraphics[width=9cm]{w1_trans.eps}
6: %	\vspace*{-0.25cm}
7: 	\caption{Changes in $W_1$, the number of wrong  (failed or outdated) $s_1$ pointers, due to joins, failures and stabilizations.}
8: 	\label{fig:w1-trans}
9: \end{figure}
10: %\begin{figure}
11: %	\centering
12: %		\includegraphics[height=7cm, angle=270]{wd}
13: %	\caption{Theory and Simulation for $W_1(r,\alpha)$ and $D_1(r,\alpha)$}
14: %	\label{fig:w}
15: %\end{figure}
16: %
17: We now turn to estimating various quantities of interest for Chord.
18: In all that follows we will evaluate various {\it average} quantities, 
19: as a function of the parameters. To do this we need
20: to understand how the dynamical evolution of the system affects these
21: quantities.
22: %However this same formalism can also be used 
23: %for  evaluating higher moments like the variance.
24: 
25: In the case of Chord, we only need to consider  one of 
26: three kinds of events happening at any micro-instant: a join, a failure
27: or a stabilization. One assumption made in the following
28: is that such a micro-instant of time exists, or in other words, that
29: we can divide time till we have an interval small enough that 
30: in this interval, only one of these three processes occurs 
31: \minorchange{anywhere in the system}. 
32: Implicit in this is the assumption that a stabilization
33: (either of successors or fingers) is done faster than the
time scales over which joins and failures occur. 
35: 
36: Another aspect of this system which simplifies analysis is that
37: successor pointers of adjacent nodes are independent of each other.
38: That is, the state of the first successor pointer of a given node
39: does not affect the state of the first successor pointer of either its 
40: predecessor or its successor. The same logic also works for the 
41: state of the second successor pointers of adjacent nodes and so on.
42: On the other hand, the state of the second successor pointer
43: of a node is clearly related to the state of its first successor 
pointer as well as the state of the first successor pointer of that successor. 
45: This is taken into account in the analysis of second and higher successor 
46: pointers. In characterizing the states of higher successors, 
47: we look for the leading order behavior in terms of the 
48: parameter $r$. 
49: 
50: %In the case of finger pointers, two adjacent nodes might have the same finger,
51: %in the sense that some finger of a node $n$ and some finger of its successor might 
52: %be pointing to the same node. We take this into account in our analysis.
53: %However the state of the fingers of nodes far apart are independent.
54: 
55: 
56: %the rest of our analysis on successors, fingers and lookup lengths 
57: %is accurate to order $1/r$. Higher order terms though 
58: %not entirely neglected, are less accurate.
59: %Clearly these will need to be taken into account to get a
60: %more accurate estimate in the high-churn regime, where the rates of 
61: %joins and failures become of the same order of magnitude as
62: %the rate of stabilisation.
63: 
64: %Assumptions of this sort are the building blocks of fluid models,
65: %particularly in the master equation approach. If we were to
66: %take more correlations into account the probability of a state 
67: %described by a number of properties (say the state of the successors
68: %and fingers) would not factorize into the probability that each
69: %of these properties hold separately. The equation for the probability
70: %of a certain property to hold would then, in general, depend on the
71: %probability of certain pairs of properties to hold, and so on,
72: %leading to a tower of equations known in physical kinetics as the BBKGY
73: %hierarchy~\cite{vanKampen}. Naturally, this full tower of equations
74: %is no simpler than an underlying more detailed description, and
75: %has to be truncated at some level by a \textit{closure approximation}. 
76: %Hence, the name of the game is to make the \textit{least} number
77: %of assumptions on dependency while still catching the behavior of
78: %the system. As we see below, the assumptions made here
79: %are sufficiently precise to predict
80: %all quantities extremely accurately, but it should be kept
81: %in mind that analysis is hence not \textit{exact}.
82: 
83: Consider first the successor pointers.
Let $w_k(r,\alpha)$ denote the fraction of nodes having 
a \emph{wrong} $k^{th}$ successor pointer and 
$d_{k}(r,\alpha)$ the fraction of nodes having 
a \emph{failed} $k^{th}$ successor pointer.
Similarly, let $W_k(r,\alpha)$ be 
the number of nodes having 
a \emph{wrong} $k^{th}$ successor pointer and 
$D_{k}(r,\alpha)$ the number of nodes having 
a \emph{failed} $k^{th}$ successor pointer.
93: A \emph{failed} pointer is one
94: which points to a departed node while
95: a \emph{wrong} pointer points either to an
96: incorrect node (alive but not correct) or a dead one. 
97: As we will see, both these quantities play a role 
98: in predicting lookup consistency and lookup length.
99: 
100: 
By the protocol for stabilizing successors in Chord, a node periodically contacts its first successor, possibly correcting it and reconciling with its successor list. Therefore, the numbers of wrong $k^{th}$ successor pointers are not independent quantities but depend on the number of wrong first successor pointers. 
102: %We first consider $s_1$ here, and then
103: %briefly discuss the other cases towards the end of this section.
104: 
105: %( We derive similar relations for $s_k, k >1$ in \cite{ansary:analysis}).
106: 
107: %Define $P_{nb}$ to be the probability that the network does not break up (i.e. a single node gets disconnected (should we
108: %say from the ring?)). In our analysis, we consider only the case where $P_{nb}=1$. This is achieved by setting the length
109: %of the successors list ${\cal S}$ to $O(\log(N))$.
110: %
111: 
112: %We write an equation for $W_1(r,\alpha)$ by accounting (table \ref{tab:wrong}) for all the events that can change it in a micro %event of time $\Delta t$.
113: 
114: \begin{table}[t]
115: \caption{Gain and loss terms for $W_1(r,\alpha)$: the number of wrong first successors
116: as a function of $r$ and $\alpha$.} 
117: \label{tab:wrong}
118: 	\centering
119: 		\begin{tabular}{|l|l|} \hline
120: 		Change in $W_1(r,\alpha)$	&  \minorchange{Probability of Occurrence}   \\ %\hline 
121: 		$W_1(t+\Delta t) = W_1(t)+1$ & $c_{1.1}=(\lambda_j \minorchange{N} \Delta t) (1-w_1)$ \\ %\hline
122: 		$W_1(t+\Delta t) = W_1(t)+1$ & $c_{1.2}=\lambda_f \minorchange{N} (1-w_1)^2   \Delta t$ \\ %\hline
123: 		$W_1(t+\Delta t) = W_1(t)-1$ & $c_{1.3}=\lambda_f \minorchange{N} w_1^2   \Delta t $ \\ 
124: 		$W_1(t+\Delta t) = W_1(t)-1$ & $c_{1.4}=\alpha\lambda_s \minorchange{N} w_1   \Delta t $\\ %\hline
125: 		$W_1(t+\Delta t) = W_1(t)$ & $1 - (c_{1.1} + c_{1.2} + c_{1.3} + c_{1.4})$\\ 
126: \hline
127: 		\end{tabular}
128: %\vspace*{-0.35cm}
129: \end{table}
130: 
We write an equation for $W_1(r,\alpha)$ by accounting for all the events that can change it in a micro-instant of time $\Delta t$. The different cases in which $W_1$ changes due to joins, failures and stabilizations are illustrated in Fig.~\ref{fig:w1-trans}. In some cases $W_1$ increases or decreases, while in others it stays unchanged. For each
increase or decrease, Table~\ref{tab:wrong} provides the corresponding 
\minorchange{probabilities}. 
134: 
135: By our implementation of the join protocol, a new node $n_y$, joining between two nodes $n_x$ and $n_z$, always has a correct $s_1$ pointer after the join. However the state of $n_x.s_1$ before the join makes a difference. If $n_x.s_1$ was correct (pointing to $n_z$) before the join, then after the join it will be wrong and therefore $W_1$ increases by $1$. If $n_x.s_1$ was wrong before the join, then it will remain wrong after the join and $W_1$ is unaffected. Thus, we need to account for the former case only. The probability that $n_x.s_1$ is correct is $1-w_1$ and term $c_{1.1}$ follows from this. 
136: 
For failures, we have $4$ cases. To illustrate them we use nodes $n_x$, $n_y$, $n_z$ and assume that $n_y$ is going to fail.
First, if both $n_x.s_1$ and $n_y.s_1$ were correct, then the failure of $n_y$ makes $n_x.s_1$ wrong and hence $W_1$ increases by $1$. Second, if $n_x.s_1$ and $n_y.s_1$ were both wrong, then the failure of $n_y$ decreases $W_1$ by $1$,
since one wrong pointer disappears. Third, if $n_x.s_1$ was wrong
and $n_y.s_1$ was correct, then $W_1$ is unaffected. Fourth, if $n_x.s_1$ was correct and $n_y.s_1$ was wrong, then the wrong pointer of $n_y$ disappears and $n_x.s_1$ becomes wrong, so $W_1$ is again unaffected. For the first case to happen, we need to pick two nodes with correct pointers; the probability of this is $(1-w_1)^2$. For the second case, we need to pick two nodes with wrong pointers; the probability of this is $w_1^2$. The terms $c_{1.2}$ and $c_{1.3}$ follow from these probabilities.
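
As a consistency check on the four failure cases (a sketch that relies only on the independence of $n_x.s_1$ and $n_y.s_1$ assumed above; the symbol $\langle \Delta W_1 \rangle_{\rm fail}$ is introduced here purely for illustration), the case probabilities sum to one, and the expected change in $W_1$ per failure event takes a simple form:
\begin{equation}
(1-w_1)^2 + w_1^2 + 2 w_1 (1-w_1) = 1, \qquad
\langle \Delta W_1 \rangle_{\rm fail} = (1-w_1)^2 - w_1^2 = 1 - 2 w_1 . \nonumber
\end{equation}
Under this reading, a failure increases $W_1$ on average whenever $w_1 < 1/2$.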
141: 
142: Finally, a successor stabilization does not affect $W_1$, unless the stabilizing node had a wrong pointer. The probability of picking such a node is $w_1$. From this follows the term $c_{1.4}$. 
143: 
144: Hence the equation for $W_1(r,\alpha)$ is: 
145: \begin{equation}
146: \frac{d W_1}{\minorchange{N} dt}= \lambda_j (1-w_1) + \lambda_f (1-w_1)^2  - \lambda_f w_1^2 - \alpha\lambda_s w_1    \nonumber
147: \end{equation}
148: Solving for $w_1$ in the steady state and putting $\lambda_j=\lambda_f$, we get:
149: \begin{equation}
150: w_1(r,\alpha) = \frac{2}{3+r\alpha} \approx \frac{2}{r\alpha}
151: \end{equation}
152: 
153: This expression matches well with the simulation results as shown in Fig. 
154: \ref{fig:wi}. 
The fraction of failed first successors is then $d_1(r,\alpha) \approx \frac{1}{2}w_1(r,\alpha)$,
since when $\lambda_j=\lambda_f$ about half of the wrong pointers
point to live but outdated nodes and about half point to dead nodes. 
Thus $ d_1(r,\alpha) \approx \frac{1}{r\alpha}$, which
also matches the simulations well, as shown in Fig.~\ref{fig:wi}. 
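
The steady-state solution for $w_1$ can be recovered in two steps. The sketch below assumes that $r$ denotes the ratio $\lambda_s/\lambda_f$, which is the reading consistent with the result for $w_1$ but is not restated in this section. Setting $dW_1/dt=0$ and $\lambda_j=\lambda_f$, dividing by $\lambda_f$, and using $(1-w_1)^2 - w_1^2 = 1-2w_1$:
\begin{equation}
(1-w_1) + (1-2w_1) - \alpha r\, w_1 = 0
\quad\Longrightarrow\quad
w_1 = \frac{2}{3+\alpha r} . \nonumber
\end{equation}
As an illustrative value (chosen here, not taken from the simulations), $\alpha r = 50$ gives $w_1 = 2/53 \approx 0.038$, compared with $2/(\alpha r) = 0.04$ from the leading-order form.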
160: %We can also use the above reasoning to iteratively get $w_k(r,\alpha)$ for 
161: %any $k$.
162: 
163: \begin{figure}
164: 	\centering
165: 		\includegraphics[height=8cm, angle=270]{wdboth-sep}
166: 		%\includegraphics[height=8cm, angle=270]{i-sep}
167: 	\caption{Theory and simulation for the probability of wrong $1^{st}$ successor $w_1(r,\alpha)$ and failed $1^{st}$ successor $d_1(r,\alpha)$.}
168: 	\label{fig:wi}
169: \end{figure}
170: 
171: The fraction of wrong second successors can be estimated in an analogous manner. 
172: Consider, for a node $n$, the possible states of 
173: the successor, $n.s_1$, the successor of the successor,
174: $*(n.s_1).s_1$, and
175: the second successor, $n.s_2$.
176: In a fully correct state, 
177: $*(n.s_1).s_1$ and $n.s_2$ of course point to the same node.
178: If in such a state either $n.s_1$ or $*(n.s_1).s_1$
179: becomes incorrect through the action of a join or a failure, then
180:  $n.s_2$ is also incorrect. On the other hand,  $n.s_2$
cannot be corrected by the stabilization protocol
unless both $n.s_1$ and $*(n.s_1).s_1$ have already been corrected.
Hence, $n.s_2$ is wrong if either $n.s_1$ or $*(n.s_1).s_1$ is
wrong, and also if both $n.s_1$ and $*(n.s_1).s_1$ are correct
but $n.s_2$ has not yet been corrected.
186: If the number of such non-stabilized
187: configurations is $N_2$ and the fraction is $n_2$, we have
188: \begin{equation}
189: \label{eq:w2-equation}
190: w_2 = 2w_1 - w_1^2 + n_2
191: \end{equation}
192: 
193: To estimate $n_2$ we consider how these configurations might be 
194: gained or lost.  The gain term arises
195: from stabilizations of configurations
196: where $n.s_1$ is correct but $*(n.s_1).s_1$  is wrong.
197: A stabilization performed by node $n.s_1$ then
198: results in the gain of a $N_2$ configuration.
199: %Note that configurations in which $n.s_1$ is wrong but  
200: %$*(n.s_1).s_1$ is correct do not add to $N_2$, since
201: %a stabilization by node $n$
202: %the whole successor list is copied to the first list
203: %when $n.s_1$ is stabilized, so then $n.s_2$ is
204: %immediately also corrected.
205: On the other hand, non-stabilized configurations are lost either
206: by a stabilization performed by node $n$ (when it gets the correct 
207: successor list from its successor and hence corrects $n.s_2$), 
208: or by corrupting either  $n.s_1$ or $*(n.s_1).s_1$ 
(by a join or failure). The latter possibility
gives terms of order $\frac{1}{r^2}$, which we can ignore
in the limit \minorchange{that} stabilizations happen on
a much faster time scale than joins and failures (\textit{i.e.},
$r$ much larger than unity). The equation for $N_2$ is hence
214: \begin{equation}
215: \label{eq:n2-equation}
\frac{dN_2}{N dt} \approx
217: \alpha\lambda_s w_1 (1-w_1) - \alpha\lambda_s n_2
218: \end{equation}
which implies, in the steady state, $n_2\approx w_1$ to order $\frac{1}{r}$.
Thus, we have $w_2 \approx 3 w_1 \approx \frac{6}{\alpha r}$. 
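
For reference, the steps behind this estimate can be sketched as follows, using only Eqs.~(\ref{eq:w2-equation}) and (\ref{eq:n2-equation}) and the leading-order form $w_1 \approx \frac{2}{\alpha r}$. In the steady state Eq.~(\ref{eq:n2-equation}) gives $n_2 \approx w_1(1-w_1)$, so that
\begin{equation}
w_2 \approx 2w_1 - w_1^2 + w_1(1-w_1) = 3 w_1 - 2 w_1^2 \approx 3 w_1 \approx \frac{6}{\alpha r} , \nonumber
\end{equation}
where the quadratic term is dropped because it is of order $\frac{1}{r^2}$.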
221: 
222: 
For higher successors we reason similarly by considering 
the state of the $(k-1)^{st}$ successor pointer of node $n$, 
the successor pointer of the $(k-1)^{st}$ successor,
and the $k^{th}$ successor pointer of node $n$. 
We can write a recursion equation for $w_k$, the fraction of nodes with 
a wrong $k^{th}$ successor pointer:
229: \begin{equation}
230: \label{eq:wk-equation}
231: w_k = w_1 + w_{k-1} - w_{k-1} w_1 + n_k
232: \end{equation}
where $n_k$ is the density of configurations in which
the $(k-1)^{st}$ successor pointer of node $n$ and the first successor pointer
of the $(k-1)^{st}$ successor are both correct, but this information
has not yet been used to correct the $k^{th}$ successor pointer of node $n$.
If node $n$ does not yet have the correct information about its
$k^{th}$ successor, then either all the nodes between $n$ and its $(k-1)^{st}$ successor have the correct information but node $n$ has not yet stabilized, or the stabilization has propagated back from the $(k-1)^{st}$ successor
to some node in between but not yet to $n.s_1$.
To elaborate, there is the case where the 
second successor pointer
of the $(k-2)^{nd}$ successor has not been corrected, then the case where
this has been done but the third successor pointer of
the $(k-3)^{rd}$ successor has not been corrected, and so on.
Each of these $k-1$ cases is analogous to $n_2$ and each occurs with density
$(1-w_{k-1})w_1$, if joins and failures are neglected compared
to stabilizations, so that $n_k \approx (k-1) w_1$ to leading order.
248: Hence, if to leading order in $\frac{1}{r}$ we have 
249: $w_k \sim \frac{c_k}{\alpha r}$, then
250: \begin{equation}
251: \label{eq:ck-equation}
252: c_k = c_{k-1} + k c_1
253: \end{equation}
254: which leads to 
255: \begin{equation}
256: \label{eq:wk-leading}
w_k \approx \frac{k(k+1)}{\alpha r} .
\end{equation}
259: We note that this expression obviously depends on the
260: details of the stabilization scheme, and is in principle 
261: only valid up to $k \sim \sqrt{r}$. 
As shown in Fig.~\ref{fig:wk}, the agreement between
theory and simulation is nevertheless quite reasonable
at $k=5$ and $r=100$. 
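
The step from the recursion (\ref{eq:ck-equation}) to the closed form (\ref{eq:wk-leading}) can be sketched as follows, taking $c_1 = 2$ from the leading-order result $w_1 \approx \frac{2}{\alpha r}$. Unrolling the recursion gives
\begin{equation}
c_k = c_1 \sum_{j=1}^{k} j = c_1 \frac{k(k+1)}{2} = k(k+1) , \nonumber
\end{equation}
so that, for instance, $c_2 = 6$, $c_3 = 12$ and $c_5 = 30$.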
265: \begin{figure}
266: 	\centering
267: 		\includegraphics[height=8cm, angle=270]{wk}
268: 		%\includegraphics[height=8cm, angle=270]{i-sep}
269: 	\caption{Theory and simulation for the probability of a wrong $k^{th}$ successor $w_k(r,\alpha)$.}
270: 	\label{fig:wk}
271: \end{figure}
272: 
273: 
274: \subsection{Break-up (Network Disconnection) Probability}
275: 
276: 
277: \begin{table}[t]
278: \caption{Gain and loss terms for $N_{bu} (2,r, \alpha)$: 
279: the number of nodes with dead 
280: first {\em and} second successors.}
281: \label{tab:disconnect} 
282: 	\centering
283: 		\begin{tabular}{|l|l|} \hline
		Change in $N_{bu}(2,r,\alpha)$	&  \minorchange{Probability of Occurrence}   \\ %\hline 
285: 		$N_{bu}(t+\Delta t) = N_{bu}(t)+1$ & $c_{2.1}=(\lambda_f \minorchange{N} \Delta t)d_1 (r, \alpha)$ \\ %\hline
286: 		$N_{bu}(t+\Delta t) = N_{bu}(t)+1$ & $c_{2.2}=\lambda_f \minorchange{N} \Delta t (1-d_1) d_2 $ \\ %\hline
287: 		$N_{bu}(t+\Delta t) = N_{bu}(t)-1$ & $c_{2.3}=\alpha \lambda_s \minorchange{N} \Delta t P_{bu}(2,r,\alpha) $ \\ 
288: 		$N_{bu}(t+\Delta t) = N_{bu}(t)$ & $1 - (c_{2.1} + c_{2.2} + c_{2.3} )$\\ 
289: \hline
290: 		\end{tabular}
291: %\vspace*{-0.35cm}
292: \end{table}
293: 
We demonstrate below how calculating $d_k(r, \alpha)$,
the fraction of nodes with dead $k^{th}$ successor pointers,
helps in estimating the probability that
the network gets disconnected for any value of $r$ and $\alpha$.
Let $P_{bu} (n, r,\alpha)$ be the probability that
$n$ consecutive nodes fail. If
$n={\cal S}$, the length of the successor list, then the node
whose successor list this is gets disconnected from the network 
and the network breaks up.
For the range of $r$ considered in Fig.~\ref{fig:wi}, 
$P_{bu}({\cal S},r,\alpha) \approx 0$. For lower values of $r$, however, it
becomes non-negligible. The master equation analysis
introduced here can be used to estimate $P_{bu} (n,r,\alpha)$ 
for any $1\le n \le {\cal S}$. We 
indicate how this might be done by first considering the case $n=2$.
309: Let $N_{bu} (2,r,\alpha)$ be the number of configurations in which 
310: a node has both $s_1$ and $s_2$ dead and $P_{bu}(2,r,\alpha)$ be the 
311: fraction of such configurations. 
312: Table \ref{tab:disconnect} indicates how this is estimated
313: within the present framework. 
314: 
315: 
316: \begin{figure}
317: 	\centering
318: 		\includegraphics[height=9cm, angle=270]{d2}
319: 	\caption{Theory and simulation for the probability of failure of the $2^{nd}$ successor, $d_2(r,\alpha)$.}
320: 	\label{fig:d2}
321: \end{figure}
322: 
A join event does not affect this probability in any way, so we only need to
consider the effect of failures and stabilization events.
The term $c_{2.1}$ accounts for the situation in which the 
{\em first} successor of a node is already dead 
(which happens with probability $d_1 (r, \alpha)$ as explained above);
a failure event can then kill its second successor as well.
The term $c_{2.2}$ accounts for the situation in which the first
successor is alive (with probability $1-d_1$) but the second successor
is dead (with probability $d_2$); a failure event can then kill the first successor.
The logic used to estimate $d_2$
(or $d_k$ in general) is very similar to the reasoning 
we used to estimate the $w_k$'s. So we have 
334: \begin{equation}
335: \label{eq:dk-equation}
336: d_k = d_{1} + (k-1) d_1 = k d_1
337: \end{equation}
Thus the $k^{th}$ successor of a node is dead if the successor of its $(k-1)^{st}$
successor is dead, or if that node
is not dead but the intermediate nodes think it is
because they have not yet stabilized. 
Hence $d_2 \sim 2/(\alpha r)$. This estimate for $d_2$ matches the simulation results very well, as shown in Fig.~\ref{fig:d2}.
343: 
344: Coming back to counting the gain and loss terms for $N_{bu}(2,r,\alpha)$, 
345: a stabilization event reduces 
346: the number of such configurations by one, if
347: the node doing the stabilization had such a configuration to begin with.
348: 
349: Solving the equation for $N_{bu} (2,r,\alpha)$, one hence obtains
350: that $P_{bu}(2,r,\alpha) \sim 3/(\alpha r)^2$. 
351: As Fig. \ref{fig:fos2} shows, this is a
352: precise estimate.
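
The steady-state solution can be sketched from the terms of Table~\ref{tab:disconnect}, again assuming that $r$ denotes $\lambda_s/\lambda_f$ and using the leading-order estimates $d_1 \approx \frac{1}{\alpha r}$ and $d_2 \approx 2 d_1$. Balancing the gain terms $c_{2.1}$ and $c_{2.2}$ against the loss term $c_{2.3}$ gives
\begin{equation}
\lambda_f d_1 + \lambda_f (1-d_1) d_2 = \alpha \lambda_s P_{bu}(2,r,\alpha)
\quad\Longrightarrow\quad
P_{bu}(2,r,\alpha) = \frac{d_1 + d_2 - d_1 d_2}{\alpha r} \approx \frac{3 d_1}{\alpha r} \approx \frac{3}{(\alpha r)^2} , \nonumber
\end{equation}
where the product $d_1 d_2$ is dropped as a higher-order correction.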
353: 
354: We can similarly estimate the probabilities for three consecutive nodes
355: failing, {\it etc}, and hence also the general disconnection
356: probability $P_{bu}({\cal S},r,\alpha)$. In fact
357: $P_{bu}({\cal S},r,\alpha)$ may be written in terms of the
358: $d_k(r, \alpha)$ as:
359: \begin{equation}
360: \label{eq:Pbu-equation}
P_{bu}({\cal S}) = ({\cal S}-1)! \, \frac{\sum_{i=1}^{\cal S} d_i(r,\alpha)}{(\alpha r)^{{\cal S}-1}}
362: \end{equation}
The logic behind this equation is similar to that used for 
solving for $P_{bu}(2)$: for ${\cal S}$ consecutive nodes to fail, any ${\cal S}-1$ of the ${\cal S}$ nodes must have 
failed first, and a failure event then kills the remaining node.
Eq.~(\ref{eq:Pbu-equation})
is readily evaluated by substituting the values of the $d_k$'s to get
368: \begin{equation}
369: \label{eq:Pbu-solution}
370: P_{bu}({\cal S})= \frac{({{\cal S}+1})!}{2 (\alpha r)^{{\cal S}}}
371: \end{equation}
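
The substitution can be written out explicitly, using $d_i = i\, d_1$ from Eq.~(\ref{eq:dk-equation}) and the leading-order value $d_1 \approx \frac{1}{\alpha r}$:
\begin{equation}
\sum_{i=1}^{\cal S} d_i \approx \frac{{\cal S}({\cal S}+1)}{2 \alpha r}
\quad\Longrightarrow\quad
P_{bu}({\cal S}) \approx \frac{({\cal S}-1)! \, {\cal S}({\cal S}+1)}{2 (\alpha r)^{\cal S}} = \frac{({\cal S}+1)!}{2 (\alpha r)^{\cal S}} . \nonumber
\end{equation}
For ${\cal S}=2$ this reduces to $\frac{3}{(\alpha r)^2}$, in agreement with the estimate for $P_{bu}(2,r,\alpha)$.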
372: 
As mentioned above, this is again correct only to leading order: 
there will be correction terms of order $1/r^{{\minorchange{\cal S}} +1}$, which we have not 
computed at this level of approximation. 
376: The Master Equation formalism 
377: thus affords the possibility of making a precise
378: prediction for when the system runs the danger of 
379: getting disconnected, as a function of the parameters.
380: 
381: 
382: 
383: \begin{figure}
384: 	\centering
385: 		\includegraphics[height=9cm, angle=270]{fos2}
386: 	\caption{Theory and simulation for the break-up probability $P_{bu}(2, r, \alpha)$.}
387: 	\label{fig:fos2}
388: \end{figure}