
\subsection{Cost of Finger Stabilizations and Lookups}

\begin{figure*}[t]
	\centering
	\includegraphics{ck.eps}
%	\vspace*{-0.4cm}
	\caption{Cases that a lookup can encounter with the respective probabilities and costs.}
	\label{fig:ck}
\end{figure*}

\begin{figure*}[t]
	\centering
%		\includegraphics[height=9cm, angle=270]{wd}
%		\includegraphics[height=9cm, angle=270]{i}
		\includegraphics[height=8cm, angle=270]{f}
		\includegraphics[height=8cm, angle=270]{l_jan25}
		%\begin{table}[t]
	   %\centering
%\vspace*{-0.3cm}
	\caption{Theory and simulation for the probability of failure of the $k^{th}$ finger, $f_k(r,\alpha)$, \minorchange{and the lookup length $L(r,\alpha)$.}}
	\label{fig:w}
\end{figure*}

In this section, we demonstrate how the information
about failed fingers and successors can be used to predict
the cost of stabilizations, lookups or, in general, the cost of
reaching any key in the id space. By cost we mean the number
of hops needed to reach the destination, {\it including}
the number of timeouts encountered en route. A timeout occurs
every time a query is passed to a dead node: the node does not answer, and
the originator of the query has to use another finger instead.
For this analysis, we consider timeouts and hops to add equally
to the cost. The analysis generalizes easily to the case
in which a timeout costs some factor $\gamma$ times the cost of a hop.

Define $C_{t}(r, \alpha)$ (also denoted by $C_{t}$) to be the expected cost for a given node
to reach some target key which is $t$ keys away from it (which
means reaching the first successor of this key). For example,
$C_1$ is the cost of looking up the adjacent key ($1$
key away). The adjacent key is always stored at the
first alive successor. Therefore, if the first successor is
alive (which occurs with probability $1-d_1$), the cost is $1$ hop.
If the first successor is dead but the second is alive (which occurs with probability
$d_1(1-d_2)$), the cost is 1 hop + 1 timeout = $2$, contributing
$2 \times d_1(1-d_2)$ to the \emph{expected} cost, and so forth.
Therefore, we have $C_1= 1-d_1 + 2 \times d_1(1-d_2) + 3 \times d_1 d_2 (1-d_3)+ \dots
\approx 1 + d_1 = 1+1/(\alpha r)$.
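
As a brief check on the approximation above, the series for $C_1$ can be summed by expanding each term, after which successive contributions telescope. This is only a sketch: it treats the failures of successive successors as independent, as the product form of the series already does.
\begin{equation*}
\begin{split}
C_1 &= (1-d_1) + 2\,d_1(1-d_2) + 3\,d_1 d_2(1-d_3) + \dots \\
    &= 1 + d_1 + d_1 d_2 + d_1 d_2 d_3 + \dots ,
\end{split}
\end{equation*}
so that $C_1 \approx 1 + d_1$ whenever products of the $d_k$ are negligible. Under the same assumptions, if a timeout is charged $\gamma$ hops, the cost of reaching the $j^{th}$ successor becomes $1+\gamma(j-1)$ and the series gives $C_1 \approx 1 + \gamma d_1$. As a purely illustrative number (not taken from the simulations), $\alpha r = 50$ gives $d_1 = 0.02$ and $C_1 \approx 1.02$.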

To find the expected cost for reaching a general distance $t$, we need
to closely follow the Chord protocol, which looks up $t$ by first finding
the closest preceding finger. For the purposes of the analysis,
we will find it easier to think in terms of the closest preceding {\it start}.
Let us hence define $\xi$ to be the \emph{start} of the
finger (say the $k^{th}$) that most closely precedes $t$.
Hence $\xi = 2^{k-1} + n$ and
$t = \xi+m$, \textit{i.e.}, there are $m$ keys between the sought target $t$
and the start of the closest preceding
finger. With that, we can write a recursion relation
for $C_{\xi+m}$ as follows:

\begin{equation}
%\vspace*{-0.5cm}
\label{eq:cost}
\begin{split}
C_{\xi+m} = {}& C_{\xi} \left[1-a(m)\right] \\
	&+ (1-f_k)\, a(m)\left[1 + \sum_{i=0}^{m-1} bc(i,m)\,C_{m-i}\right] \\
	&+ f_k\, a(m) \biggl[ 1 + \sum_{i=1}^{k-1} h_k(i) \\
	&\quad \sum_{l=0}^{\xi/2^i-1}bc(l,\xi/2^i)\bigl(1+(i-1) +C_{\xi_i-l+m}\bigr) + O(h_k(k)) \biggr]
%	&+ \biggl[ f_k  a(m) \\
%	&+f_k  a(m)\sum_{i=1}^{k-1} h_k(i) \sum_{l=1}^{\xi/2^i}bc(l,m)(1+C_{\xi_i(k)+1-l+m}) + 2h_k(k)\biggr]
\end{split}
%\vspace*{-0.5cm}
\end{equation}

where $\xi_i \equiv \sum_{j=1}^{i} \xi/2^{j}$ and $h_k(i)$ is the
probability that a node is forced to use its $(k-i)^{th}$ finger owing to the
death of its $k^{th}$ finger.
The probabilities $a,b,bc$ have already been introduced in
Section~\ref{sec:internode},
and we define the probability $h_k(i)$ below.
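
To make the notation concrete, consider a small worked example. The numbers are hypothetical, and we assume that distances are measured relative to the querying node (\textit{i.e.}, we set $n=0$), so that the start of the $k^{th}$ finger lies at distance $2^{k-1}$. For a target at distance $t=100$, the closest preceding start is that of the $7^{th}$ finger:
\begin{equation*}
\xi = 2^{6} = 64, \qquad m = t-\xi = 36, \qquad
\xi_2 = \frac{\xi}{2}+\frac{\xi}{4} = 48 = \xi - \frac{\xi}{2^{2}} .
\end{equation*}
In general $\xi_i = \xi\,(1-2^{-i})$. In this example, falling back to the $5^{th}$ finger (start at distance $\xi/2^2=16$) leaves a distance $\xi_2+m = 84$ between that start and the target.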

The lookup equation, though rather complicated at first sight,
merely accounts for all the possibilities that
a Chord lookup will encounter, and deals with them
exactly as the protocol dictates.

The first term (Fig.~\ref{fig:ck}~(a)) accounts for the eventuality that there is no node intervening
between $\xi$ and $\xi+m$ (which occurs with probability $1-a(m)$).
In this case, the cost of looking up $\xi + m$ is the same
as the cost of looking up $\xi$.

The second term (Fig.~\ref{fig:ck}~(b)) accounts for the situation in which a node does intervene (with
probability $a(m)$) and this node is alive (with probability $1-f_k$).
The query is then passed on to this node (with $1$ added to
register the increase in the number of hops), and the remaining cost depends on
the distance between this node and $t$.

The third term (Fig.~\ref{fig:ck}~(c)) accounts for the case when the intervening node is dead
(with probability $f_k$). Then the cost increases by $1$ (for a timeout)
and the query needs to find an alternative
lower finger that most closely precedes
the target. Let the $(k-i)^{th}$ finger (for some $i$, $1 \leq i \leq k-1$)
be such a finger. This happens with probability $h_k(i)$,
\textit{i.e.}, $h_k(i)$ denotes the probability
that the lookup is passed back to the $(k-i)^{th}$ finger because the intervening fingers
are either dead or share the same finger table entry as the $k^{th}$ finger.
The start of the $(k-i)^{th}$ finger is at $\xi/2^i$ and the distance between
$\xi/2^i$ and $\xi$ is equal to $\sum_{j=1}^{i} \xi/2^{j}$,
which we denote by $\xi_i$.
Therefore, the distance from the {\it start} of the $(k-i)^{th}$ finger to the
target is equal to $\xi_i+m$.
However, note that $fin_{k-i}.node$ could be $l$
keys away (with probability $bc(l,\xi/2^i)$) from $fin_{k-i}.start$
(for some $l$, $0 \leq l < \xi/2^i$).
Therefore, after making one hop
to $fin_{k-i}.node$,
the remaining distance to the target is $\xi_i+m-l$.
The increase in cost for this operation is $1+(i-1)$; the $1$ indicates
the cost of taking up the query again by $fin_{k-i}.node$,
and the $i-1$ indicates the cost of trying and discarding each of
the $i-1$ intervening fingers.
The probability $h_k(i)$ is easy to compute given
Property~\ref{prop:ab} and the expression
for the $f_k$'s computed in the previous section:

\begin{equation}
\label{eq:hki}
\begin{split}
h_k(i) = {}& a(\xi/2^{i})\, (1-f_{k-i})  \\
       & \times \prod_{s=1}^{i-1} \left[1-a(\xi/2^{s}) + a(\xi/2^s)f_{k-s}\right], \quad i<k \\
h_k(k) = {}& \prod_{s=1}^{k-1} \left[1-a(\xi/2^{s}) + a(\xi/2^s)f_{k-s}\right]
\end{split}
\end{equation}

In (\ref{eq:hki}) we account for all the
reasons that a node may have to use its $(k-i)^{th}$ finger
instead of its $k^{th}$ finger. This could happen because the
intervening fingers were either dead or not distinct.
The probabilities $h_k(i)$ satisfy the constraint $\sum_{i=1}^{k} h_k(i)=1$
since, clearly, a node either uses one of its lower fingers
or it does not. The latter probability is $h_k(k)$, that is, the probability that a node
cannot use any earlier entry in its finger table.
In this case, $n$ proceeds to its successor list.
The query is now passed on to the first alive successor
and the new cost is a function of the distance of this node
from the target $t$.
We indicate this case by the last term in (\ref{eq:cost}), which is
$O(h_k(k))$. This can again be computed from the inter-node distribution
and from the functions $d_k(r,\alpha)$ computed earlier.
In practice, however, the probability of this case
is extremely small except for targets very close to $n$.
Hence it does not
significantly affect the cost of general lookups and we ignore it
in our analysis.
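
The normalization of the $h_k(i)$ can be verified directly from (\ref{eq:hki}). As a short sketch, introduce the shorthand $q_s$ (our notation here, not used elsewhere in the text) for the probability that the $(k-s)^{th}$ finger cannot be used:
\begin{equation*}
\begin{split}
q_s &\equiv 1-a(\xi/2^{s}) + a(\xi/2^s)f_{k-s} = 1 - a(\xi/2^{s})\,(1-f_{k-s}), \\
h_k(i) &= (1-q_i)\prod_{s=1}^{i-1} q_s \quad (i<k), \qquad h_k(k) = \prod_{s=1}^{k-1} q_s, \\
\sum_{i=1}^{k-1} h_k(i) &= \sum_{i=1}^{k-1}\left[\prod_{s=1}^{i-1} q_s - \prod_{s=1}^{i} q_s\right] = 1 - \prod_{s=1}^{k-1} q_s = 1 - h_k(k).
\end{split}
\end{equation*}
The sum telescopes, so $\sum_{i=1}^{k} h_k(i) = 1$ holds for any values of $a$ and $f$.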

%\begin{figure}[t]
%	\centering
%	\includegraphics[height=8cm, angle=270]{ici}
%	\caption{The average cost $C_i$ (the number hops for looking up an
%item $i$ keys away) in a network of ${\cal N}=1000$ nodes and
%${\cal K}= 2^{20} $ keys without churn obtained from the
%recurrence relation (\ref{eq:constnochurn1}). The average lookup length $L$ is also plotted as a reference.}
%	\label{fig:ici}
%\end{figure}
%
%
%\begin{figure}[t]
%	\centering
%	\includegraphics[height=8cm, angle=270]{nochurn}
%	\caption{Theory and Simulation for the lookup cost without churn for a key space of size ${\cal K}=2^{14}$ for varying $N$. Plotted as reference is the curve $0.5 \log_2(N)$. Note that on the y axis we have actually plotted
%$L-1$ for convenience.}
%	\label{fig:nochurn}
%\end{figure}
%

The cost for general lookups is hence
$$
L(r,\alpha) = \frac{\sum_{i=1}^{{\cal K} -1} C_i(r,\alpha)}{\cal K}
$$

The lookup equation is solved numerically by recursion,
given the coefficients
and $C_1$.
In Fig.~\ref{fig:w}, we compare theoretical results with
simulation for $N=1000$. The theory matches
the simulation results very well.

In Fig.~\ref{fig:lookup_theory} we also show the theoretical predictions
for some larger values of $N$. \minorchange{From the structure of
Equation~\ref{eq:cost}, it is clear that the dependence of the average lookup length on churn comes entirely
from the presence of the terms $f_k$. Since $f_k \sim f$ is independent of $k$ for large fingers, we can
approximate the average lookup length by the
functional form $L (r, \alpha) = A + {B}f + C f^{2} + \cdots $.
The coefficients $A, B, C$, {\it etc.}, can be computed recursively by solving the lookup equation
to the required order in $f$, and depend only on $N$, the number of nodes, $1- \rho $, the
density of peers, and $b$, the base or, equivalently, the
size of the finger table of each node.
The advantage of writing the lookup length this
way is that churn-specific details, such as how new joinees construct a finger table or how exactly
stabilizations are done in the system, are isolated in the expression for $f$.
If we were to change our stabilization strategy, for example \cite{KEAH_inprep}, we could immediately
estimate the lookup length by plugging the new expression for $f$ into the above relation.}

\minorchange{The coefficient $A$, which is the lookup cost without churn,
can be obtained very precisely for any base $b$} by analyzing
(\ref{eq:cost}) in the zero-churn case. This analysis is rather
laborious and will be presented elsewhere \minorchange{\cite{KEAH_inprep}}.
It confirms the well-known result
$A= \frac{1}{2}\log_2 N $ and, in addition,
reproduces small deviations from this behavior previously
observed by us in numerical simulations \cite{elAnsaryAurellHaridi}.
The values of $A$ in Fig.~\ref{fig:lookup_theory} are
taken from this analysis.
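
For orientation, the starting point of the zero-churn case follows from (\ref{eq:cost}) by setting $f_k=0$ for all $k$, which removes the third term altogether. This is a direct substitution only, not the detailed analysis of \cite{KEAH_inprep}:
\begin{equation*}
C_{\xi+m} = C_{\xi}\left[1-a(m)\right] + a(m)\left[1+\sum_{i=0}^{m-1} bc(i,m)\,C_{m-i}\right].
\end{equation*}
The coefficient $A$ is then the average of the resulting $C_i$ over the key space, as in the definition of $L(r,\alpha)$.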

$B$ can be qualitatively estimated as follows:
every sufficiently long finger is dead with some
finite probability \minorchange{$f$ given by (\ref{eq:fk}).
If $A$ is the average value of the lookup length {\it without} churn, then
each lookup encounters $f A$ dead fingers on average}. This estimate
predicts a lookup cost of approximately \minorchange{$A (1+f)$, giving $B=A$ and $C$ and all other
coefficients equal to $0$.}

\begin{figure}[t]
	\centering
	\includegraphics[height=8cm, angle=270]{lookup_theory_jan25}
	\caption{Lookup cost, theoretical curve, for \minorchange{$1000$, $2000$, $4000$, $8000$ and $16000$} peers.
The rationale for the fits is explained in the text.}
	\label{fig:lookup_theory}
\end{figure}

\minorchange{In Fig.~\ref{fig:lookup_theory} we show that the best fit to the data is in fact obtained
by taking $B=A$ and $C=3A$. The expression for $f$ is taken from (\ref{eq:fk}) for large $k$ (for a system with $20$ fingers,
the expression for $f_k$ becomes independent of $k$ for $k \ge 13$).
In general, as mentioned earlier, $B$ and $C$ can be obtained accurately for any value of the system parameters
by the numerical solution of Eq.~\ref{eq:cost} to the required order.}
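
As a rough illustration of the fitted form, take $B=A$ and $C=3A$, so that $L \approx A\,(1+f+3f^{2})$. The numbers below are assumptions for illustration only: we use the leading-order value $A=\frac{1}{2}\log_2 N$ rather than the precise values plotted in Fig.~\ref{fig:lookup_theory}, and a hypothetical failure probability $f=0.1$. For $N=1000$,
\begin{equation*}
A \approx \tfrac{1}{2}\times 9.97 \approx 4.98, \qquad
L \approx 4.98\times\left(1+0.1+3\times 0.01\right) \approx 5.63,
\end{equation*}
compared with $A(1+f)\approx 5.48$ from the qualitative estimate with $C=0$.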

%are estimated by judging
%which value provides the best fit. To compare, the qualitative analysis
%given above predicts that
%$B= \frac{1}{2}\log_2 N {\cal M}(1+\tilde{P}_{rep}(k))$ if we use
%(\ref{eq:fk-leading}). Using ${\cal M}= 20$ and
%$\tilde{P}_{rep} \sim 0.879$ (we can get the latter by evaluating
%$p_1$,$p_2$ and $p_3$ mentioned in Property \ref{prop:share} in Section \ref{sec:internode}),
%we obtain $B \sim 187$ for $N=1000$, $B \sim 225$ for $N=4000$ and
%$B \sim 243$ for $N=8000$. While these are not the best fit values
%shown in Fig ~\ref{fig:lookup_theory},
%they are of the same order of magnitude.