1: \subsection{Cost of Finger Stabilizations and Lookups}
2: \begin{figure*}[t]
3: \centering
4: \includegraphics{ck.eps}
5: % \vspace*{-0.4cm}
6: \caption{Cases that a lookup can encounter with the respective probabilities and costs.}
7: \label{fig:ck}
8: \end{figure*}
9: \begin{figure*}[t]
10: \centering
11: % \includegraphics[height=9cm, angle=270]{wd}
12: % \includegraphics[height=9cm, angle=270]{i}
13: \includegraphics[height=8cm, angle=270]{f}
14: \includegraphics[height=8cm, angle=270]{l_jan25}
15:
16: %\begin{table}[t]
17: %\centering
18: %\vspace*{-0.3cm}
19: \caption{Theory and simulation for probability of failure of the $k^{th}$ finger $f_k(r,\alpha)$, \minorchange{and the lookup length $L(r,\alpha)$.}}
20: \label{fig:w}
21: \end{figure*}
22:
In this section, we demonstrate how the information
about failed fingers and successors can be used to predict
the cost of stabilizations and lookups or, more generally, the cost of
reaching any key in the id space. By cost we mean the number
of hops needed to reach the destination {\it including}
the number of timeouts encountered en route. A timeout occurs
every time a query is passed to a dead node: the node does not answer, and
the originator of the query has to use another finger instead.
For this analysis, we take timeouts and hops to contribute equally
to the cost. The analysis generalizes easily to the case
in which a timeout costs some factor $\gamma$ times the cost of a hop.
34:
Define $C_{t}(r, \alpha)$ (also denoted by $C_{t}$) to be the expected cost for a given node
to reach some target key which is $t$ keys away from it (which
means reaching the first successor of this key). For example,
$C_1$ is the cost of looking up the adjacent key ($1$
key away). The adjacent key is always stored at the
first alive successor. Hence, if the first successor is
alive (which occurs with probability $1-d_1$), the cost is $1$ hop.
If the first successor is dead but the second is alive (which occurs with probability
$d_1(1-d_2)$), the cost is 1 hop + 1 timeout = $2$, which contributes
$2 \times d_1(1-d_2)$ to the \emph{expected} cost, and so forth. Therefore, we have $C_1= 1-d_1 + 2 \times d_1(1-d_2) + 3 \times d_1 d_2 (1-d_3)+ \dots
\approx 1 + d_1 = 1+1/(\alpha r)$.
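
The approximation for $C_1$ can be made explicit. As a sketch, write
$P_j \equiv \prod_{s=1}^{j} d_s$ with $P_0 \equiv 1$ (a shorthand introduced only for this remark),
and assume that the successor list is long enough that $j P_j \to 0$.
The term for reaching the $j^{th}$ successor is $j\,(P_{j-1}-P_j)$, so the series telescopes:
\begin{equation*}
C_1 = \sum_{j \geq 1} j \left(P_{j-1} - P_j\right) = \sum_{j \geq 0} P_j = 1 + d_1 + d_1 d_2 + \dots,
\end{equation*}
which reduces to $1+d_1$ when products of the $d_s$ are negligible.
Under the same assumptions, if a timeout costs $\gamma$ hops, the $j^{th}$ term costs $1+\gamma(j-1)$
and the sum becomes $1 + \gamma\,(d_1 + d_1 d_2 + \dots) \approx 1 + \gamma d_1$.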
46:
To find the expected cost for reaching a general distance $t$ we need
to closely follow the Chord protocol, which looks up $t$ by first finding
the closest preceding finger. For the purposes of the analysis,
we will find it easier to think in terms of the closest preceding {\it start}.
Let us hence define $\xi$ to be the \emph{start} of the
finger (say the $k^{th}$) that most closely precedes $t$.
Hence $\xi = 2^{k-1} + n$ and
$t = \xi+m$, \textit{i.e.}, there are $m$ keys between the sought target $t$
and the start of the closest preceding
finger. With that, we can write a recursion relation
for $C_{\xi+m}$ as follows:
58:
59: \begin{equation}
60: %\vspace*{-0.5cm}
61: \label{eq:cost}
62: \begin{split}
63: &C_{\xi+m} = C_{\xi} \left[1-a(m)\right] \\
64: &+ (1-f_k) a(m)\left[1 + \sum_{i=0}^{m-1} bc(i,m)C_{m-i}\right]
65: \\
66: &+ f_k a(m) \biggl[ 1 + \sum_{i=1}^{k-1} h_k(i) \\
67: &\sum_{l=0}^{\xi/2^i-1}bc(l,\xi/2^i)(1+(i-1) +C_{\xi_i-l+m}) + O(h_k(k)) \biggr]
68: % &+ \biggl[ f_k a(m) \\
69: % &+f_k a(m)\sum_{i=1}^{k-1} h_k(i) \sum_{l=1}^{\xi/2^i}bc(l,m)(1+C_{\xi_i(k)+1-l+m}) + 2h_k(k)\biggr]
70: \end{split}
71: %\vspace*{-0.5cm}
72: \end{equation}
73:
where $\xi_i \equiv \sum_{j=1}^{i} \xi/2^{j}$ and $h_k(i)$ is the
probability that a node is forced to use its $(k-i)^{th}$ finger owing to the
death of its $k^{th}$ finger.
The probabilities $a,b,bc$ have already been introduced in
Section~\ref{sec:internode},
and we define the probability $h_k(i)$ below.
80:
81:
The lookup equation, though rather complicated at first sight,
merely accounts for all the possibilities that
a Chord lookup can encounter, and deals with them
exactly as the protocol dictates.
86:
87: The first term (Fig. \ref{fig:ck}~(a)) accounts for the eventuality that there is no node intervening
88: between $\xi$ and $\xi+m$ (occurs with probability $1-a(m)$).
89: In this case, the cost of looking for $\xi + m$ is the same
90: as the cost for looking for $\xi$.
91:
The second term (Fig. \ref{fig:ck}~(b)) accounts for the situation when a node does intervene (with
probability $a(m)$), and this node is alive (with probability $1-f_k$).
Then the query is passed on to this node (with $1$ added to
register the increase in the number of hops) and the remaining cost depends on
the distance between this node and $t$.
97:
The third term (Fig. \ref{fig:ck}~(c)) accounts for the case when the intervening node is dead
(with probability $f_k$). Then the cost increases by $1$ (for a timeout)
and the query needs to find an alternative
lower finger that most closely precedes
the target. Let the $(k-i)^{th}$ finger (for some $i$, $1 \leq i \leq k-1$)
be such a finger. This happens with probability $h_k(i)$,
\textit{i.e.}, the probability
that the lookup is passed back to the $(k-i)^{th}$ finger because the intervening fingers
are either dead or share the same finger table entry as the $k^{th}$ finger.
The start of the $(k-i)^{th}$ finger is at $\xi/2^i$ and the distance between
$\xi/2^i$ and $\xi$ is equal to $\sum_{j=1}^{i} \xi/2^{j}$,
which we denote by $\xi_i$.
Therefore, the distance from the {\it start} of the $(k-i)^{th}$ finger to the
target is equal to $\xi_i+m$.
However, note that $fin_{k-i}.node$ could be $l$
keys away (with probability $bc(l,\xi/2^i)$) from $fin_{k-i}.start$
(for some $l$, $0 \leq l < \xi/2^i$).
Therefore, after making one hop
to $fin_{k-i}.node$,
the remaining distance to the target is $\xi_i+m-l$.
The increase in cost for this operation is $1+(i-1)$; the $1$ indicates
the cost of taking up the query again by $fin_{k-i}.node$,
and the $i-1$ indicates the cost for trying and discarding each of
the $i-1$ intervening fingers.
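
As a concrete illustration of this bookkeeping, take a target at distance $t=11$ and
measure all positions relative to the querying node (that is, set $n=0$; the numbers are
chosen here purely for illustration). The closest preceding start is $\xi = 2^{3} = 8$,
so $k=4$ and $m=3$. The first two fallback fingers then give
\begin{equation*}
\begin{split}
i=1: &\quad \xi/2 = 4, \quad \xi_1 = 4, \quad \xi_1 + m = 7, \\
i=2: &\quad \xi/4 = 2, \quad \xi_2 = 4+2 = 6, \quad \xi_2 + m = 9,
\end{split}
\end{equation*}
which agrees with the direct distances $11-4=7$ and $11-2=9$ from each start to the target.
If, for $i=2$, $fin_{2}.node$ lies $l=1$ key after its start, the remaining distance is
$\xi_2 + m - l = 8$ and the added cost is $1+(i-1)=2$.
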
The probability $h_k(i)$ is easy to compute given
Property~\ref{prop:ab} and the expression
for the $f_k$'s computed in the previous section.
125:
126: \begin{equation}
127: \label{eq:hki}
128: \begin{split}
h_k(i) = & a(\xi/2^{i}) (1-f_{k-i}) \\
\times &\prod_{s=1}^{i-1} \left(1-a(\xi/2^{s}) + a(\xi/2^s)f_{k-s}\right), \quad i<k \\
h_k(k) = & \prod_{s=1}^{k-1} \left(1-a(\xi/2^{s}) + a(\xi/2^s)f_{k-s}\right)
132: \end{split}
133: \end{equation}
134:
In (\ref{eq:hki}) we account for all the
reasons that a node may have to use its $(k-i)^{th}$ finger
instead of its $k^{th}$ finger. This could happen because the
intervening fingers were either dead or not distinct.
The probabilities $h_k(i)$ satisfy the constraint $\sum_{i=1}^{k} h_k(i)=1$
since, clearly, a node either uses one of its lower fingers
or it does not. The latter probability is $h_k(k)$, that is, the probability that a node
cannot use any earlier entry in its finger table.
In this case, $n$ proceeds to its successor list.
The query is now passed on to the first alive successor
and the new cost is a function of the distance of this node
from the target $t$.
We indicate this case by the last term in (\ref{eq:cost}), which is
$O(h_k(k))$. This can again be computed from the inter-node distribution
and from the functions $d_k(r,\alpha)$ computed earlier.
In practice, however, the probability of this case
is extremely small except for targets very close to $n$.
Hence it does not
significantly affect the cost of general lookups and we ignore it
in our analysis.
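
The normalization constraint can be verified directly from (\ref{eq:hki}). As a sketch, let
$q_s \equiv 1-a(\xi/2^{s}) + a(\xi/2^s)f_{k-s}$ denote the probability that the finger
$s$ places below the $k^{th}$ cannot be used (a shorthand introduced only for this check).
Then $h_k(i) = (1-q_i)\prod_{s=1}^{i-1} q_s$ for $i<k$, and the sum telescopes:
\begin{equation*}
\sum_{i=1}^{k-1} h_k(i) = \sum_{i=1}^{k-1} \left( \prod_{s=1}^{i-1} q_s - \prod_{s=1}^{i} q_s \right)
= 1 - \prod_{s=1}^{k-1} q_s = 1 - h_k(k).
\end{equation*}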
155:
156:
157:
158: %\begin{figure}[t]
159: % \centering
160: % \includegraphics[height=8cm, angle=270]{ici}
161: % \caption{The average cost $C_i$ (the number hops for looking up an
162: %item $i$ keys away) in a network of ${\cal N}=1000$ nodes and
163: %${\cal K}= 2^{20} $ keys without churn obtained from the
164: %recurrence relation (\ref{eq:constnochurn1}). The average lookup length $L$ is also plotted as a reference.}
165: % \label{fig:ici}
166: %\end{figure}
167: %
168: %
169: %\begin{figure}[t]
170: % \centering
171: % \includegraphics[height=8cm, angle=270]{nochurn}
172: % \caption{Theory and Simulation for the lookup cost without churn for a key space of size ${\cal K}=2^{14}$ for varying $N$. Plotted as reference is the curve $0.5 \log_2(N)$. Note that on the y axis we have actually plotted
173: %$L-1$ for convenience.}
174: % \label{fig:nochurn}
175: %\end{figure}
176: %
177:
The cost for general lookups is hence
\[
L(r,\alpha) = \frac{\sum_{i=1}^{{\cal K} -1} C_i(r,\alpha)}{\cal K}
\]
182:
The lookup equation is solved numerically by recursion,
given the coefficients
and $C_1$.
In Fig.~\ref{fig:w}, we compare theoretical results with
simulation for $N=1000$. The theory matches
the simulation results very well.
189:
190: In Fig.~\ref{fig:lookup_theory} we also show the theoretical predictions
191: for some larger values of $N$. \minorchange{From the structure of
Equation (\ref{eq:cost}), it is clear that the dependence of the average lookup length on churn comes entirely
193: from the presence of the terms $f_k$. Since $f_k \sim f$ is independent of $k$ for large fingers, we can
194: approximate the average lookup length by the
195: functional form $L (r, \alpha) = A + {B}f + C f^{2} + \cdots $.
The coefficients $A, B, C$, etc., can be computed recursively by solving the lookup equation
to the required order in $f$ and depend only on $N$, the number of nodes, $1- \rho $, the
density of peers, and $b$, the base or, equivalently, the
size of the finger table of each node.
The advantage of writing the lookup length this
way is that churn-specific details, such as how new joinees construct a finger table or how exactly
stabilizations are done in the system, can be isolated in the expression for $f$.
If, for example, we were to change our stabilization strategy \cite{KEAH_inprep}, we could immediately
estimate the lookup length by plugging the new expression for $f$ into the above relation.}
205:
206:
\minorchange{The coefficient $A$, which is the lookup cost without churn,
can be obtained very precisely for any base $b$} by analyzing
(\ref{eq:cost}) in the zero-churn case. This analysis is rather
laborious and will be presented elsewhere \minorchange{\cite{KEAH_inprep}}.
211: It confirms the well-known result
212: $A= \frac{1}{2}\log_2 N $ and in addition
213: reproduces small deviations from this behavior previously
214: observed by us in numerical simulations \cite{elAnsaryAurellHaridi}.
215: The values of $A$ in Fig.~\ref{fig:lookup_theory} are
216: taken from this analysis.
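
For orientation, the starting point of such a zero-churn analysis can be sketched by setting
$f_k=0$ in (\ref{eq:cost}), assuming that no fingers fail when there is no churn.
The third term then vanishes and the recursion reduces to
\begin{equation*}
C_{\xi+m} = C_{\xi} \left[1-a(m)\right] + a(m)\left[1 + \sum_{i=0}^{m-1} bc(i,m)\,C_{m-i}\right],
\end{equation*}
with $C_1 = 1$ under the matching assumption that $d_1=0$. This is only the reduced form of
the lookup equation; the solution for $A$ itself is not reproduced here.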
217:
$B$ can be qualitatively estimated as follows:
every sufficiently long finger is dead with some
finite probability \minorchange{$f$ given by (\ref{eq:fk}).
If $A$ is the average value of the lookup length {\it without} churn, then
each lookup encounters $f A$ dead fingers on average}. This estimate
predicts a lookup cost of approximately \minorchange{$A (1+f)$, giving $B=A$ and $C$ and all other
coefficients equal to $0$.}
225:
226:
227: \begin{figure}[t]
228: \centering
229: \includegraphics[height=8cm, angle=270]{lookup_theory_jan25}
\caption{Lookup cost, theoretical curve, for \minorchange{$1000$, $2000$, $4000$, $8000$ and $16000$} peers.
231: The rationale for the fits is explained in the text.}
232: \label{fig:lookup_theory}
233: \end{figure}
234:
235:
\minorchange{In Fig.~\ref{fig:lookup_theory} we show that the best fit to the data is in fact obtained
by taking $B=A$ and $C=3A$. The expression for $f$ is taken from (\ref{eq:fk}) for large $k$ (for a system with $20$ fingers,
the expression for $f_k$ becomes independent of $k$ for $k \ge 13$).
In general, as mentioned earlier, $B$ and $C$ can be obtained accurately for any value of the system parameters
by the numerical solution of Eq.~(\ref{eq:cost}) to the required order.}
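
To illustrate the size of the correction, consider the fitted form
$L \approx A\,(1 + f + 3f^2)$ with the leading-order value $A \approx \frac{1}{2}\log_2 N$
and a hypothetical failure probability $f=0.1$ (a value assumed here only for the arithmetic,
not taken from the simulations). For $N=1000$,
\begin{equation*}
A \approx \tfrac{1}{2}\log_2 1000 \approx 4.98, \qquad
L \approx 4.98 \times (1 + 0.1 + 0.03) \approx 5.63,
\end{equation*}
compared with $A(1+f) \approx 5.48$ from the qualitative estimate alone.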
241:
242: %are estimated by judging
243: %which value provides the best fit. To compare, the qualitative analysis
244: %given above predicts that
245: %$B= \frac{1}{2}\log_2 N {\cal M}(1+\tilde{P}_{rep}(k))$ if we use
246: %(\ref{eq:fk-leading}). Using ${\cal M}= 20$ and
247: %$\tilde{P}_{rep} \sim 0.879$ (we can get the latter by evaluating
248: %$p_1$,$p_2$ and $p_3$ mentioned in Property \ref{prop:share} in Section \ref{sec:internode}),
249: %we obtain $B \sim 187$ for $N=1000$, $B \sim 225$ for $N=4000$ and
250: %$B \sim 243$ for $N=8000$. While these are not the best fit values
251: %shown in Fig ~\ref{fig:lookup_theory},
252: %they are of the same order of magnitude.
253: