1: \documentclass[10pt,twocolumn]{article}
2: \usepackage[dvips]{graphicx}
3: \usepackage{myusenix}
4:
5: \usepackage{amsmath}
6: \begin{document}
7:
8: \renewcommand\dbltopfraction{.95}
9: \renewcommand\textfraction{.05}
10:
11:
12: \title{CUP: Controlled Update Propagation in Peer-to-Peer Networks}
13:
14:
15: \author{
16: Mema Roussopoulos \hspace{0.75cm} Mary Baker\\
17: {\em Department of Computer Science} \\
18: {\em Stanford University}\\
19: {\em Stanford, California, 94305}\\
20: \\
21: \normalsize \{mema, mgbaker\}@cs.stanford.edu \\
22: http://mosquitonet.stanford.edu/} % end author
23:
24: \date{February 1, 2002}
25:
26:
27: \maketitle
28:
29: \begin{abstract}
30:
31: \small Recently the problem of indexing and locating content in
32: peer-to-peer networks has received much attention. Previous work
33: suggests caching index entries at intermediate nodes that lie on the
34: paths taken by search queries, but until now there has been little
35: focus on how to maintain these intermediate caches. This paper
36: proposes CUP, a new comprehensive architecture for Controlled Update
37: Propagation in peer-to-peer networks. CUP asynchronously builds
38: caches of index entries while answering search queries. It then
39: propagates updates of index entries to maintain these caches. Under
40: unfavorable conditions, when compared with standard caching based on
41: expiration times, CUP reduces the average miss latency by as much as a
42: factor of three. Under favorable conditions, CUP can reduce the
43: average miss latency by more than a factor of ten.
44:
45: CUP refreshes intermediate caches, reduces query latency, and reduces
46: network load by coalescing bursts of queries for the same item. CUP
47: controls and confines propagation to updates whose cost is likely to
48: be recovered by subsequent queries. CUP gives peer-to-peer nodes the
49: flexibility to use their own incentive-based policies to determine
50: when to receive and when to propagate updates. Finally, the small
51: propagation overhead incurred by CUP is more than compensated for by
52: its savings in cache misses.
53:
54: \end{abstract}
55:
56:
57:
58: % ------------------------------------------------------------------------
59: \section{Introduction}
60: % ------------------------------------------------------------------------
61:
62: Peer-to-peer systems are self-organizing distributed systems where
63: participating nodes both provide and receive services from each other
64: in a cooperative effort to prevent any one node or set of nodes from
65: being overloaded. Peer-to-peer systems have recently gained much
66: attention, primarily because of the great number of features they
67: offer applications that are built on top of them. These features
68: include: scalability, availability, fault tolerance, decentralized
69: administration, and anonymity.
70:
71: Along with these features has come an array of technical challenges.
72: In particular, over the past year, there has been much focus on the
73: fundamental indexing and routing problem inherent in all peer-to-peer
74: systems: Given the name of an object of interest, how do you locate
75: the object within the peer-to-peer network in a well-defined,
76: structured manner that avoids flooding the network \cite{ratnasamy01a,
77: rowstron01b, stoica01, zhao01}?
78:
79: As a performance enhancement, the designers of these systems suggest
80: caching index entries with expiration times at intermediate nodes that
81: lie on the path taken by a search query. Intermediate caches are
82: desirable because they balance query load for an item across multiple
83: nodes, reduce latency, and alleviate hot spots. However, little
84: attention has been given to how to maintain these intermediate caches.
85: This problem is interesting because the peer-to-peer model assumes the
86: index will change constantly. This constant change stems from several
87: factors: peer nodes continuously join and leave the network, content
88: is continuously added to and deleted from the network, and replicas of
89: existing content are continuously added to alleviate bandwidth
90: congestion at nodes holding the content.
91:
92: In this paper we propose a new comprehensive architecture for
93: Controlled Update Propagation (CUP) in peer-to-peer networks that
94: asynchronously builds caches of index entries while answering search
95: queries. It then propagates updates of index entries to maintain
96: these caches. The basic idea is that every node in the peer-to-peer
97: network maintains two logical channels per neighbor: a query channel
98: and an update channel. The query channel is used to forward search
99: queries for items of interest to the neighbor that is closest to the
100: authority node for those items. The update channel is used to forward
101: query responses (first-time updates) asynchronously to a neighbor and
102: to update index entries that are cached at the neighbor.
103:
104:
105: Queries for an item travel ``up'' the query channels of nodes along
106: the path toward the authority node for that item. Updates travel
107: ``down'' the update channels along the reverse path taken by a query.
108: Figure~\ref{fig:logicalChannels} shows this process.
109:
110: \begin{figure}[tb]
111: \centerline{\includegraphics[height=7cm]{logChannel.eps}}
112: \caption[]{\small This figure shows how CUP works. \(A_1\) and
113: \(A_2\) are authority nodes. \(Q_{A_1}\), \(Q_{A_2}\), \(U_{A_1}\),
114: and \(U_{A_2}\) are the four logical channels between nodes \(N_1\)
115: and \(N_2\). A query arriving at node \(N_2\) for an item for which
116: \(A_1\) is the authority is pushed onto query channel \(Q_{A_1}\) to
117: \(N_1\). If \(N_1\) has a cached entry for the item, it returns it through
118: \(U_{A_1}\). Otherwise, it forwards the query towards \(A_1\).
119: Any update originating from \(A_1\) flows
120: downstream to \(N_1\) which may forward it onto \(N_2\) through
121: \(U_{A_1}\). The analogous process holds for queries
122: at \(N_1\) for items for which \(A_2\) is the authority.}
123: \label{fig:logicalChannels}
124: \end{figure}
125:
126:
127: The advantages of the query channel are twofold. First, if a node
128: receives two or more queries for an item for which it does not have a
129: fresh response, the node pushes only one instance of the query for
130: that item up its query channel.
131: This approach can have
132: significant savings in traffic, because bursts of requests for an item
133: are coalesced into a single request. Second, using a single query
134: channel solves the ``open connection'' problem suffered by some
135: peer-to-peer systems. Each time a query arrives at a node which does
136: not have a cached response, the node opens one or more connections to
137: neighboring nodes and must maintain those connections open until the
138: response returns through them. The asynchronous nature of the query
139: channel relieves nodes from having to maintain many open connections
140: since all responses return through the update channel. Through simple
141: bookkeeping (setting an interest bit) the node registers the interest
142: of its neighbors so it knows which of its neighbors to push the query
143: response to when the answer arrives.
144:
145:
146: The cascaded propagation of updates from authority nodes down the
147: reverse paths of search queries has many advantages. First, updates
148: extend the lifetime of cached entries allowing intermediate nodes to
149: continue serving queries from their caches without having to push
150: queries up their channels explicitly. It has been shown that up to
151: fifty percent of content hits at caches are instances where the
152: content is valid but stale and therefore cannot be used to serve
153: queries without first being re-validated \cite{cohen01b}. These occurrences are
154: called \emph{freshness misses}. Second, a node that
155: proactively pushes updates to interested neighbors reduces its load of
156: queries generated by those neighbors. The cost of pushing the update
157: down is recovered by the first query for the same item
158: following the update.
159: Third, the further down an update gets pushed, the shorter the
160: distance subsequent queries need to travel to reach a fresh cached
161: answer. As a result, query response latency is reduced. Finally,
162: updates can help prevent errors. For example, an update to invalidate
163: an index entry prevents a node from answering queries using the entry
164: before it expires.
165:
166:
167: In CUP, nodes decide individually when to receive updates. A node only
168: receives updates for an item if the node has registered interest in
169: that item. Furthermore, each node uses its own incentive-based policy to
170: determine when to cut off its incoming supply of updates for an
171: item. This way the propagation of updates is controlled and does not
172: flood the network.
173:
174: Similarly, nodes decide individually when to propagate updates to
175: interested neighbors. This is useful because a node may not always be
176: able or willing to forward updates to interested neighbors. In fact, a
177: node's ability or willingness to propagate updates may vary with its
178: workload. CUP addresses this by introducing an adaptive mechanism
179: each node uses to regulate the rate of updates it propagates
180: downstream. A salient feature of CUP is that even if a node's
181: capacity to push updates becomes zero, nodes dependent on the
182: node for updates fall back with no overhead to the case
183: of standard caching with expiration.
184:
185:
186: When compared with standard caching, under unfavorable conditions, CUP
187: reduces the average miss latency by as much as a factor of three.
188: Under favorable conditions, CUP reduces the average miss latency by
189: more than a factor of ten. CUP overhead is more than compensated for
190: by its savings in cache misses. In fact, the ``investment'' return
191: per update pushed in saved misses grows substantially with
192: increasing network size and query rates. The cost of saved misses can
193: be one to two orders of magnitude higher than the cost of updates
194: pushed.
195:
196:
197: We demonstrate that the performance of CUP depends highly on the
198: policy a node uses to cut off its incoming updates. We find that the
199: cut-off policy should adapt to the node's query workload and we
200: present probabilistic and log based methods for doing so. Finally, we
201: show that CUP continues to outperform standard caching even when
202: update propagation is reduced by either node capacity or network
203: conditions.
204:
205: The rest of the paper is organized as follows:
206: Section~\ref{Architecture} describes in detail the design of the CUP
207: architecture. Section~\ref{Evaluation} describes the cost model we use to
208: evaluate CUP and presents experimental evidence of the benefits of CUP.
209: Section~\ref{RelatedWork} discusses related work and
210: Section~\ref{Conclusions} concludes the paper.
211:
212:
213:
214:
215: \section{CUP Architecture Design}
216: \label{Architecture}
217:
218: First, we provide some background terminology we use throughout the
219: paper and very briefly describe how peer-to-peer networks for which
220: CUP is appropriate perform their indexing and lookup operations.
221: Then we describe the components of the CUP protocol.
222:
223: \subsection{Background}
224:
225: The following terms will be useful for the remainder of the paper:
226:
227: \emph{Node}: This is a node in the peer-to-peer network. Each node
228: periodically exchanges ``keep-alive'' messages with its neighbors to
229: confirm their existence and to trigger recovery mechanisms should one
230: of the neighbors fail.
231: Every node also maintains two logical channels (connections) for each neighbor:
232: the query channel and the update channel. The query channel is used
233: by the node to push queries to its neighbor. The update channel is
234: used by the node to push updates that are of interest to the neighbor.
235:
236:
237: \emph{Global Index}: The most important operation in a peer-to-peer
238: network is that of locating content. As in \cite{ratnasamy01a} we
239: assume a hashing scheme that maps keys (names of content files or
240: keywords) onto a virtual coordinate space using a uniform hash
241: function that evenly distributes the keys to the space. The
242: coordinate space serves as a global index that stores index entries
243: which are \emph{(key, value)} pairs. The value in an index entry is a
244: pointer (typically an IP address) to the location of a replica that
245: stores the content file associated with the entry's key. There can be
246: several index entries for the same key, one for each replica of the
247: content.
248:
249:
250: \emph{Authority Node}: Each node N in the peer-to-peer system is
251: dynamically allocated a subspace of the coordinate space (i.e., a
252: partition of the global index) and all index entries mapped into its
253: subspace are owned by N. We refer to N as the authority node of these
254: entries. \emph{Replicas} of content whose key corresponds to an
255: authority node N send birth messages to N to announce they are willing
256: to serve the content. Depending on the application supported,
257: replicas might periodically send refresh messages to indicate they are
258: still serving a piece of content. They might also send deletion
259: messages that explicitly indicate they are no longer serving the
260: content. These deletion messages trigger the authority node to delete
261: the corresponding index entry from its local index directory.
262:
263: \emph{Search Query}: A search query posted at a node N is a request to
264: locate a replica for key K. The response to such a search query is a
265: set of index entries that point to replicas that serve the content
266: associated with K.
267:
268: \emph{Query Path for Key K}: This is the path a search query for key
269: \emph{K} takes. Each hop on the query path is in the direction of
270: the authority node that owns \emph{K}. If an intermediate node on this
271: path has fresh entries cached, the path ends at the intermediate
272: node; otherwise the path ends at the authority node.
273:
274: \emph{Reverse Query Path for Key K}: This path is the reverse of the
275: query path defined above.
276:
277:
278: \emph{Local index directory}: This is the subset of global index
279: entries owned by a node.
280:
281: \emph{Cached index entries}: This is the set of index entries cached
282: by a node N in the process of passing up queries and propagating down
283: updates for keys for which N is not the authority. The set of
284: cached index entries and the local index directory are disjoint sets.
285:
286: \emph{Lifetime of index entries}: We assume that each index entry
287: cached at a node has associated with it a lifetime and a timestamp
288: indicating the time at which the lifetime was set. When the
289: difference between the current time and the timestamp is greater than
290: the lifetime field, the entry has expired and cannot be used to answer
291: queries. An index entry is considered fresh until it expires.
292:
293:
294: \subsection{How Routing Works}
295:
296: We assume that anytime a node issues a query for key \emph{K}, the
297: query will be routed along a well-defined structured path with a
298: bounded number of hops from the querying node to the authority node
299: for \emph{K}. The routing mechanism is designed so that each node on
300: the path hashes \emph{K} using the same hash function to deterministically choose
301: which of its neighbors will serve as the next hop.
302: Examples of peer-to-peer systems that provide this type of structured
303: location and routing mechanism include content-addressable networks
304: (CANs) \cite{ratnasamy01a}, Chord \cite{stoica01}, Pastry
305: \cite{rowstron01b} and Tapestry \cite{zhao01}. CUP can be used in the
306: context of any of these systems.
307:
308:
309: \subsection{Node Bookkeeping}
310:
311: At each node, index entries are grouped together by key. For each key
312: K, the node stores a flag that indicates whether the node is waiting
313: to receive an update for K for the first time and an interest bit
314: vector. Each bit in the vector corresponds to a neighbor and is set
315: or clear depending on whether that neighbor is or is not interested in
316: receiving updates for K.
317:
318:
319: Each node tracks the popularity or request frequency of each non-local
320: key K for which it receives queries. The popularity measure for a key
321: K can be the number of queries for K a node receives between arrivals
322: of consecutive updates for K or a rate of queries of a larger moving
323: window. On an update arrival for K, a node uses its popularity
324: measure to re-evaluate whether it is beneficial to continue caching and
325: receiving updates for K. We elaborate on this cut-off decision in
326: Section~\ref{CutOffPolicies}.
327:
328: Node bookkeeping in CUP involves no network overhead. With increasing
329: CPU speeds and memory sizes, this bookkeeping is negligible when we
330: consider the reduction in query latency achieved.
331:
332:
333: \subsection{Update Types}
334:
335: We classify updates into four categories: first-time updates, deletes,
336: refreshes, and appends. Deletes, refreshes, and
337: appends originate from the replicas of a piece of content and are
338: directed toward the authority node that owns the index entries for
339: that content.
340:
341: First-time updates are query responses that travel down the reverse
342: query path.
343:
344: Deletes are directives to remove a cached index entry. Deletes can be
345: triggered by two events:
346: 1) a replica sends a message indicating it no
347: longer serves a piece of content to the authority node that owns the
348: index entry pointing to that replica.
349: 2) the authority node notices a
350: replica has stopped sending ``keep-alive'' messages and assumes the
351: replica has failed. In either case, the authority node deletes the
352: corresponding index entry from its local index directory and
353: propagates the delete to interested neighbors.
354:
355: Refreshes are keep-alive messages that extend the lifetimes of index
356: entries.
357: Refreshes that arrive at a cache do not result in errors as deletes
358: do, but help prevent freshness misses.
359:
360:
361:
362: Finally, appends are directives to add index entries for new replicas
363: of content. These updates help alleviate the demand for content from the
364: existing set of replicas since they add to the number of replicas from
365: which clients can download content.
366:
367:
368:
369: % ------------------------------------------------------------------------
370: \subsection{Handling Queries}
371: \label{HowQueryingWorks}
372: % ------------------------------------------------------------------------
373:
374: Upon receipt of a query for a key \emph{K}, there are three basic
375: cases to consider. In each of the cases, the node updates its
376: popularity measure for \emph{K}. The node also sets the appropriate
377: bit in the interest bit vector for \emph{K} if the query originates
378: from a neighbor. Otherwise, if the query is from a local client, the
379: node maintains the connection until it can return a fresh answer to the
380: client.
381: To simplify the protocol description we use the phrase
382: ``push the query'' to indicate that a node pushes a query upstream
383: toward the authority node. We use the phrase ``push the update'' to
384: indicate that a node pushes an update downstream in the direction of
385: the reverse query path.
386:
387: \textbf{Case 1: Fresh Entries for key K are cached.}
388: The node uses its cached entries for \emph{K} to push the response
389: as a first-time update to the querying neighbor or local client.
390:
391: \textbf{Case 2: Key K is not in cache.} The node adds \emph{K} to its
392: cache and marks it with a \emph{Pending-First-Update} flag. The
393: purpose of the \emph{Pending-First-Update} flag is to coalesce bursts
394: of queries for the same key into one query. A subsequent query for
395: \emph{K} from a neighbor or a local client will save the node from
396: pushing another instance of the query for \emph{K}.
397:
398:
399: \textbf{Case 3: All cached entries for key K have expired.} The node
400: must obtain the fresh index entries for \emph{K}. If the
401: \emph{Pending-First-Update} flag is set, the node does not need to
402: push the query; otherwise, the node sets the flag and pushes the
403: query.
404:
405: % ------------------------------------------------------------------------
406: \subsection{Handling Updates}
407: \label{UpdateArrivals}
408: % ------------------------------------------------------------------------
409:
410: A key feature of CUP is that a node does not forward an update for
411: \emph{K} to its neighbors unless those neighbors have registered
412: interest in \emph{K}. Therefore, with some light bookkeeping, we
413: prevent unwanted updates from wasting network bandwidth.
414:
415: Upon receipt of an update for key \emph{K} there are three cases to
416: consider.
417:
418: \textbf{Case 1: Pending-First-Update flag is set.} This means that
419: the update is a first-time update carrying a set of index entries in
420: response to a query. The node stores the index entries in its cache,
421: clears the \emph{Pending-First-Update} flag, and pushes the update to
422: neighbors whose interest bits are set and to local client connections
423: open at the node.
424:
425:
426: \textbf{Case 2: Pending-First-Update flag is clear.} If all the
427: interest bits for \emph{K} are clear, the node decides whether it
428: wants to continue receiving updates for \emph{K}. The node bases its
429: decision on \emph{K}'s popularity measure. Each node uses its own
430: policy for deciding whether the popularity of a key is high enough to
431: warrant receiving further updates for it. If the node decides
432: \emph{K}'s popularity is too low, it pushes a \emph{Clear-Bit} control
433: message to the neighbor from whom it received the update. The
434: \emph{Clear-Bit} message indicates that the neighbor's interest bit
435: for this node should be cleared. Otherwise, if the popularity is high
436: or some interest bits are set, the node applies the update to its
437: cache and pushes the update to the neighbors whose bits are set.
438:
439: Note that a greedy or selfish node can choose not to push updates for
440: a key K to interested neighbors. This forces downstream nodes to fall
441: back to standard caching for K. However, by choosing to cut off
442: downstream propagation, a node runs the risk of receiving subsequent
443: queries from its neighbors. Handling each of these queries is twice
444: the cost of propagating an update downward because the node has to
445: receive the query from the downstream neighbor and then push the
446: response as an update. Therefore, although each node is free to stop
447: pushing updates at any time it is in its best interest to push updates
448: for which there are interested neighbors.
449:
450: \textbf{Case 3: Incoming update has expired.} This could occur when
451: the network path has long delays and the update does not arrive in
452: time. The node does not apply the update and does not push it downstream.
453:
454: \subsection{Handling Clear-Bit Messages}
455: \label{Clear-Bit-Messages}
456:
457: A \emph{Clear-Bit} control message is pushed by a node to indicate to
458: its neighbor that it is no longer interested in updates for a
459: particular key from that neighbor.
460:
461: When a node receives a \emph{Clear-Bit} message for key K, it clears
462: the interest bit for the neighbor from which the message was sent. If
463: the node's popularity measure for K is low and all of its interest
464: bits are clear, the node also pushes a \emph{Clear-Bit} message for K.
465: This propagation of \emph{Clear-Bit} messages toward the authority
466: node for K continues until a node is reached where the popularity of K
467: is high or where at least one interest bit is set.
468:
469: \emph{Clear-Bit} messages can be piggy-backed onto queries or updates
470: intended for the neighbor, or if there are no pending queries or
471: updates, they can be pushed separately.
472:
473: \subsection{Adaptive Control of Update Push}
474: \label{ControllingUpdateArrivals}
475:
476: Ideally every node would propagate all updates to interested neighbors
477: to save itself from having to handle future downstream misses.
478: However, from time to time, nodes are likely to be limited in their
479: capacity to push updates downstream. Therefore, we introduce an
480: adaptive control mechanism that a node uses to regulate the rate of
481: pushed updates.
482:
483: We assume each node N has a capacity U for pushing updates that varies
484: with N's workload, network bandwidth, and/or network connectivity. N
485: divides U among its outgoing update channels such that each channel
486: gets a share that is proportional to the length of its queue. This
487: allocation maintains the queues roughly equally sized. The queues are
488: guaranteed to be bounded by the expiration times of the entries in the
489: queues. So even if a node has its update channels completely shut
490: down for a long period, all entries will expire and be removed from
491: the queues.
492:
493:
494: Under a limited capacity and while updates are waiting in the queues,
495: each node can re-order the updates in its
496: outgoing update channels by pushing ahead updates that are likely to
497: have greater impact on query latency reduction, on query accuracy, or
498: on the load balancing of content demand across replicas. During the
499: re-ordering any expired updates are eliminated.
500:
501: The strategy for re-ordering depends on the application. For example,
502: in an application where query latency and accuracy are of the most
503: importance, one can push updates in the following order: first-time
504: updates, deletes, refreshes, and appends. In an application subject to
505: flash crowds that query for a particular item, appends might be given
506: higher priority over the other updates. This would help distribute
507: the load faster across the entire set of replicas.
508:
509:
510: A node can also re-order refreshes and appends so that entries that
511: are closer to expiring are given higher priority. Such entries are
512: more likely to cause freshness misses which in turn trigger a new query search.
513: So it is advantageous to try to catch this in time by pushing these first.
514:
515: \subsection{Node Arrivals and Departures}
516:
517: The peer-to-peer model assumes that participating nodes will
518: continuously join and leave the network. CUP must be able to handle
519: both node arrivals and departures seamlessly.
520:
521: \textbf{Arrivals.}
522: When a new node N enters the peer-to-peer
523: network, it becomes the authority node for a portion of the index
524: entries owned by an existing node M. N, M, and all surrounding
525: affected nodes (old neighbors of M) update the bookkeeping structures
526: they maintain for indexing and routing purposes. To support CUP, the
527: issues at hand are updating the interest bit vectors of the affected
528: nodes and deciding what to do with the index entries stored at M.
529:
530: Depending on the indexing mechanism used, the cardinality of the bit
531: vectors of the affected nodes may change. That is, bit vectors may
532: expand or contract as some nodes may now have more or fewer neighbors
533: than before N's arrival. Since all nodes already need to track who
534: their neighbors are as part of the routing mechanism, updating the
535: cardinality of the interest bit vectors to reflect N's arrival is
536: straightforward. For example, nodes that now have both N and M as
537: neighbors have to increase their bit vectors by one element to include
538: N. The affected nodes also need to modify the mappings from bit ID to
539: neighbor IP address in their bit vectors. For example, if a node that
540: previously had M as its neighbor now has N as its neighbor, the node
541: must make the bit ID that pointed to M now point to N.
542:
543: To deal with its stored index entries, M could simply not hand over
544: any of its entries to N. This would cause entries at some of M's
545: previous neighbors to expire and subsequent queries from those nodes
546: will restart update propagations from N. Alternatively, M could give
547: a copy of its stored index entries to N. Both N and M would then go
548: through each entry and patch its bit vector. This way nodes that
549: previously depended on M for updates of particular keys could continue
550: to receive updates from either M or N but not both.
551:
552: \textbf{Departures.} Node departures can be either graceful (planned)
553: or ungraceful (due to sudden failure of a node). In either case the
554: index mechanism in place dictates that a neighboring node M take over
555: the departing node N's portion of the global index. To support CUP,
556: the interest bit vectors of all affected nodes must be patched to
557: reflect N's departure.
558:
559: If N leaves gracefully, N can choose not to hand over to M its index
560: entries. Any entries at surrounding nodes that were dependent on N to
561: be updated will simply expire and subsequent queries will restart
562: update propagations. Again, alternatively N may give M its set of
563: entries. M must then merge its own set of index entries with N's, by
564: eliminating duplicate entries and patching the interest bit vectors as
565: necessary. If N's departure is not planned, there can be no hand over
566: of entries and all entries in the affected neighboring nodes will
567: expire as in standard caching.
568:
569: Note that the transfer of entries can be coincided with the transfer
570: of information that is already occurring as part of the routing mechanism
571: in the peer-to-peer network, and therefore does not add extra network
572: overhead. Also the bit vector patching is a local operation that affects
573: only each individual node. Therefore even in cases where a node's
574: neighborhood changes often, the effect on the overall performance of
575: CUP is limited to that node's neighborhood (see section 3.7).
576:
577:
578:
579: \subsection{CUP Query/Update Trees}
580:
581: Figure~\ref{fig:CUPTrees} shows a snapshot of CUP in progress in a
582: network of seven nodes. The left hand side of each node shows the set
583: of keys for which the node is the authority. The right hand side
584: shows the set of keys for which the node has cached index entries as a
585: result of handling queries. For example, node A owns
586: K3 and has cached entries for K1 and K5.
587:
588: For each key, the authority node that owns the key is the root of a
589: CUP tree. The branches of the CUP tree are formed by the paths
590: traveled by queries from other nodes in the network. For example, one
591: path in the tree rooted at A is \{F, D, C, A\}.
592:
593: Updates originate at the root (authority node) of a CUP tree and
594: travel downstream to interested nodes. Queries travel upstream toward
595: the root.
596:
597: \begin{figure}[tb]
598: \centerline{\includegraphics[width=7cm, height=7cm]{7nodesA}}
599: \caption[]{\small CUP Trees}
600: \label{fig:CUPTrees}
601: \end{figure}
602:
603:
604: % ------------------------------------------------------------------------
605: \section{Evaluation}
606: \label{Evaluation}
607: % ------------------------------------------------------------------------
608:
609: The goal of CUP is to extend the benefits of standard caching based on
610: expiration times. There are two key performance questions to address.
611: First, by how much does CUP reduce the average cost per query?
612: Second, how much overhead does CUP incur in providing this reduction?
613:
614: We first present the cost model based on economic incentive used by
615: each node to determine when to cut off the propagation of updates for a
616: particular key. We give a simple analysis of how the cost per query
617: is reduced (or eliminated) through CUP. We then describe our
618: experimental results comparing the performance of CUP with that of
619: standard caching.
620:
621: \subsection{Cost Model}
622: \label{CostModel}
623:
624: Consider an authority node A that owns key K and consider the tree
625: generated by issuing a query for K from every node in the peer-to-peer
626: network. The resulting tree, rooted at A, is the \emph{Virtual Query
627: Spanning Tree} for K, V(A,K), and contains all possible query paths
628: for K. The \emph{Real Query Tree} for K, R(A,K) is a subtree of
629: V(A,K) also rooted at A and contains all paths generated by real
630: queries. The exact structure of R(A,K) depends on the actual workload
631: of queries for K. The entire workload of queries for all keys results
632: in a collection of criss-crossing Real Query Trees with overlapping
633: branches.
634:
635: We first consider the case of standard caching at the intermediate
636: nodes along the query path for key K. Consider a node N that is at
637: distance D from A in V(A,K). We define the cost per query for K at N
638: as the number of hops in the peer-to-peer network that must be
639: traversed to return an answer to N. When a query for K is posted at N
640: for the first time, it travels toward A looking for the response. If
641: none of the nodes between N and A have a fresh response cached, the
642: cost of the query at N is \(2 D\): D hops to reach A and D hops for
643: the response to travel down the reverse query path as a first-time
644: update. If there is a node on the query path with a fresh answer
645: cached, the cost is less than \(2 D \). Subsequent queries for K at N
646: that occur within the lifetime of the entries now cached at N have a
647: cost of zero. As a result, caching at intermediate nodes has the
648: benefits of balancing query load for K across multiple nodes and
649: lowering average latency per query.
650:
651: We can gauge the performance of CUP by calculating the percentage of
652: updates CUP propagates that are ``justified''. We precisely define
653: what a justified update is below, but simply put, an update is
654: justified if it recovers the overhead it incurs, i.e., if its cost is
655: recovered by a subsequent query. An unjustified update is therefore
656: overhead that is not recovered (wasted). Updates for popular keys are
657: likely to be justified more often than updates for less popular keys.
658:
659: A refresh update is justified if a query arrives sometime
660: between the previous expiration of the cached entry and the new
661: expiration time supplied by the refresh update. An append
662: update is justified if at least one query arrives between
663: the time the append is performed and the time of its expiration.
664:
665: A first-time update is always justified because it carries a query's
666: response toward the node that originally issues the query. A deletion
667: update is considered justified if at least one query arrives between
668: the time the deletion is performed and the expiration time of the
669: entry to be deleted.
670:
671: For each update, let \(T\) be the critical time interval described
672: above during which a query must arrive in order for the update to be
673: justified. (For first-time updates \(T = \infty\)). Consider a node
674: N at distance D from A in R(A,K). An update propagated down to N is justified
675: if at least one query Q is posted within \(T\) time units at any of the
676: nodes of the virtual subtree V(N,K). Note that an update is justified
677: if Q arrives at the virtual tree V(N,K), \emph{not} the real query
678: tree R(N,K) because Q can be posted anywhere in V(N,K).
679:
680: Given the distribution of query arrivals at each node in the tree
681: V(N,K), we can find the probability that the update at N is justified
682: by calculating the probability that a query will arrive at some node
683: in V(N,K). Assume that queries for K arrive at each node \(N_i\) in
684: V(N,K) according to a Poisson process with parameter \(\lambda_i\).
685: Then it follows that queries for K arrive at V(N,K) according to a
686: Poisson process with parameter \(\Lambda\) equal to the sum of all
687: \(\lambda's\). Therefore, the probability that a query for K will
688: arrive within \(T\) time units is \(1 - e^{-\Lambda T}\) and equals
689: the probability that the update pushed to N is justified.
690: The closer to the authority N is, the higher the \(\Lambda\)
691: and thus the higher the probability for an update pushed to N to be justified.
692: For \(\Lambda=1\) query arrival per second and \(T=6\)
693: seconds, the probability that an update arriving at N is justified
694: is 99 percent.
695:
696: The benefit of a justified update goes beyond recovery of its cost.
697: For each hop an update is pushed down, exactly one hop is saved since
698: without the propagation, a subsequent query arriving within \(T\) time
699: units would have to travel one hop up and one hop down. This halves
700: the number of hops traveled which reduces query response latency, and
701: at the same time provides enough benefit margin for more
702: aggressive CUP strategies.
703: For example, a more aggressive strategy would be to push some updates
704: even if they are not justified. As long as the number of justified
705: updates is at least fifty percent the total number of updates pushed,
706: the overall update overhead is completely recovered. If the
707: percentage of justified updates is less than fifty percent, then the
708: overhead will not be fully recovered but query latency will be further
709: reduced. Therefore, if network load is not the prime concern, an
710: ``all-out'' push strategy achieves minimum latency.
711:
712:
713: \subsection{Experiments}
714: \label{Experiments}
715:
716: One of the challenges in evaluating this work is the unavailability of
717: real data traces of completely decentralized peer-to-peer networks
718: such as those assumed by CUP. The reason for this is that such
719: systems \cite{ratnasamy01a, rowstron01b, stoica01, zhao01} are not yet
720: in widespread use to make collecting traces feasible.
721: Therefore, in the evaluation of CUP we choose simulation parameters
722: that range from unfavorable to favorable conditions for CUP in order
723: to show the spectrum of performance and how it compares to standard
724: caching under the same conditions. For example, low query rates do
725: not favor CUP because updates are less likely to be justified since
726: there may not be enough subsequent queries to recover the cost of the
727: updates. On the other hand, queries for keys that become suddenly hot
728: not only justify the propagation overhead, but also enjoy a significant
729: reduction in latency.
730:
731: For our experiments, we simulated a two-dimensional ``bare-bones''
732: content-addressable network (CAN) \cite{ratnasamy01a} using the
733: Stanford Narses simulator \cite{maniatis01}. The simulation takes as
734: input the number of nodes in the overlay peer-to-peer network, the
735: number of keys owned per node, the distribution of queries for keys,
736: the distribution of query inter-arrival times, the number of replicas
737: per key, and the lifetime of replicas in the system.
738: We ran experiments for n = \(2^k\) nodes where k ranged from 3 to 12.
739: Simulation time was 22000 seconds, with 3000 seconds of querying time.
740: We present results for experiments with replica lifetime of 300
741: seconds to reflect the dynamic nature of peer-to-peer networks where
742: nodes might only serve content for short periods of time. For all
743: experiments, refreshes of index entries occur at expiration. Query
744: arrivals were generated according to a Poisson process.
745: Nodes were randomly selected to post the queries.
746:
747: We present five experiments. First we compare the performance and
748: overhead of CUP against standard caching where CUP propagates updates
749: without any concern for whether the updates are justified. In this
750: experiment, we vary the level in the CUP tree to which updates are
751: propagated. We use this experiment to establish the level that
752: provides the maximum benefit and then use the performance results at
753: this level as a benchmark for comparison in later experiments.
754: Second, we compare the effect on CUP performance of different
755: incentive-based cut-off policies and compare the performance of these
756: policies to that of the benchmark. Third, using the best cut-off
757: policy of the previous experiment, we study how CUP performs as we
758: vary the size of the network. Fourth, we study the effect on
759: performance of increasing the number of replicas corresponding to a
760: key. Finally, we study the effect of limiting the outgoing update
761: capacities of nodes.
762:
763: \subsection{Varying the CUP Push Level}
764:
765: In this set of experiments we compare standard caching with a version
766: of CUP that propagates updates down the Real Query Tree of a key
767: regardless of whether or not the updates are justified. We use this
768: information to determine a maximal performance baseline. We determine
769: the reduction in misses achieved by CUP and the overhead CUP incurs to
770: achieve this reduction. We define \emph{miss cost} as the total
771: number of hops incurred by all misses, i.e. freshness and first-time
772: misses. We define the CUP overhead as the total number of hops
773: traveled by all updates sent downstream plus the total number of hops
774: traveled by all clear-bit messages upstream. (We assume clear-bit
775: messages are not piggybacked onto updates. This somewhat inflates the
776: overhead measure.) We define \emph{total cost} as the sum of the
777: \emph{miss cost} and any overhead hops incurred. Note that in
778: standard caching, the \emph{total cost} is equal to the \emph{miss
779: cost}.
780:
781:
782: Figures~\ref{fig:PushLevQ1,10} and \ref{fig:PushLevQ100,1000} plot
783: CUP's total cost and miss cost versus the push level for a network of
784: \(2^{10}\) nodes. A push level of \(p\) means that updates are
785: propagated to all nodes that have queried for the key and that are at
786: most \(p\) hops from the authority node. A push level of \(0\)
787: corresponds to the case of standard caching, since all updates from
788: the authority node (the root of the CUP tree) are immediately
789: squelched and not forwarded on. For this set of experiments, query
790: arrivals were generated according to a Poisson process with average
791: rate \(\lambda\) of 1, 10, 100, and 1000 queries per second at the
792: network.
793:
794:
795: The figures show that as the push level increases, CUP significantly
796: reduces the miss cost when compared with standard caching and does
797: so with little overhead as shown by the displacement of each pair
798: of curves.
799:
800: In Figure~\ref{fig:PushLevQ1,10}, for $\lambda = 1$ query per second,
801: the total cost incurred by CUP decreases and reaches a minimum at
802: around push level 20, after which it slightly increases. This
803: turning point is the level beyond which the overhead cost of updates is
804: not recoverable.
805: For $\lambda = 10$ queries per second, a similar turning
806: point occurs at around push level 25. In
807: Figure~\ref{fig:PushLevQ100,1000} the minimum total cost occurs at
808: push level 25 and tapers off for both $\lambda =100$ and $\lambda
809: =1000$ queries per second. For low query arrival rates, the turning
810: point occurs at lower push levels. For example, for $\lambda = 0.01$
811: queries per second, the turning point occurs at push level 15. These
812: results show that there is no specific optimal push level at which CUP
813: achieves the minimum total cost across all workloads. If there were,
814: then the simplest strategy for CUP would be to have updates be
815: propagated to that optimal push level. In fact, we have found that in
816: addition to the query workload, the optimal push level is affected by
817: the number of nodes in the network and the rate at which updates are
818: generated, both of which change dynamically.
819:
820: In the absence of an optimal push level, each node needs a policy for
821: determining when to stop receiving updates. We next examine some
822: cut-off policies.
823:
824: \begin{figure}[tb]
825: \centerline{\includegraphics[width=9cm]{Pop4TotalCostVersusPushLevQ1,10.eps}}
826: \caption{\small Total cost and miss cost versus push level. }
827: \label{fig:PushLevQ1,10}
828: \end{figure}
829:
830: \begin{figure}[tb]
831: \centerline{\includegraphics[width=9cm]{Pop4TotalCostVersusPushLevQ100,1000.eps}}
832: \caption{\small Total cost and miss cost versus push level. The
833: y-axis is log scale.}
834: \label{fig:PushLevQ100,1000}
835: \end{figure}
836:
837:
838:
839: \subsection{Varying the Cut-Off Policies}
840: \label{CutOffPolicies}
841:
842: On receiving an update for a key, each node determines
843: whether or not there is incentive to continue receiving updates
844: or to cut off updates by pushing up a clear-bit message. We base the
845: incentive on the \emph{popularity} of the key at the node.
846: The more popular a key is, the more incentive there is to receive updates
847: for that key. For a key K, the popularity is the number of queries a
848: node has received for K since the last update for K arrived at the
849: node.
850:
851: We examine two types of thresholds against which to test a key's
852: popularity when making the cut-off decision: probability-based and
853: log-based.
854:
855: A probability-based threshold uses the distance of a node N from the
856: authority node A to approximate the probability that an update pushed
857: to N is justified. Per our cost model of section 3.2, the further N
858: is from A, the less likely an update at N will be justified.
859: We examine two such thresholds, a linear one and a logarithmic one. With a
860: linear threshold, if an update for key K arrives at a node at distance
861: $D$ and the node has received at least $\alpha D$ queries for K since
862: the last update, then K is considered popular and the node continues
863: to receive updates for K. Otherwise, the node cuts off its intake of
864: updates for K by pushing up a clear-bit message. The logarithmic
865: popularity threshold is similar. A key K is popular if the node has
866: received $\alpha \lg(D)$ queries since the last update. The
867: logarithmic threshold is more lenient than the linear in that it
868: increases at a slower rate as we move away from the root.
869:
870: A log-based threshold is one that is based on the recent history of
871: the last \emph{n} update arrivals at the node.
872: If within \emph{n} updates, the node has not received
873: any queries, then the key is not popular and the node pushes up a clear-bit message. A specific example of a
874: log-based policy is the second-chance policy. In this policy, $n=3$.
875: When an update arrives, if no queries have arrived since the last
876: update, the policy gives the key a ``second chance'' and does not push
877: a clear-bit message immediately. If at the next update arrival the
878: node has still not received any queries for K, then it pushes a
879: clear-bit message. The philosophy behind this policy is that pushing
880: these two updates down from the parent node costs two hops. If a
881: query arrives in the interval between these two updates,
882: then it will recover the cost of
883: pushing them down, since a query miss would incur
884: two hops, one up and one down.
885:
886: \begin{table*}
887: \caption[]{\small Total cost for varying cut-off policies.}
888: \label{tab:policies2}
889: \begin{center}
890: \begin{tabular}{|l|r|r|r|r|} \hline
891: \textsf{\textbf{Policy}} & \textsf{\textbf{1 q/s Total Cost}}
892: & \textsf{\textbf{10 q/s Total Cost}} & \textsf{\textbf{100 q/s Total Cost}}
893: & \textsf{\textbf{1000 q/s Total Cost}} \\\hline
894: Standard Caching & 55905 (1.00) & 125288 (1.00) & 399669 (1.00) & 1967452 (1.00) \\ \hline
895: %Linear, $\alpha = 0.5$ & 69964 (1.25) & 127198 (1.02)& 69555 (0.17)& 191859 (0.10)\\ \hline
896: Linear, $\alpha = 0.25$ & 61778 (1.11)& 65668 (0.52)& 61521 (0.15)& 188772 (0.10)\\ \hline
897: Linear, $\alpha = 0.10$ & 39707 (0.71)& 44259 (0.35)& 56170 (0.14)& 184907 (0.09)\\ \hline
898: Linear, $\alpha = 0.01$ & 28234 (0.51)& 37396 (0.30)& 53613 (0.13)& 182198 (0.09)\\ \hline
899: Linear, $\alpha = 0.001$ & 28234 (0.51)& 37396 (0.30)& 53613 (0.13)& 182198 (0.09)\\ \hline
900: Logarithmic, $\alpha = 0.5$ & 30415 (0.54)& 39325 (0.31) & 54816 (0.14)& 183476 (0.09)\\ \hline
901: Logarithmic, $\alpha = 0.25$ & 28165 (0.50)& 37392 (0.30) & 53613 (0.13) & 182183 (0.09)\\ \hline
902: Logarithmic, $\alpha = 0.10$ & 28165 (0.50)& 37392 (0.30) & 53613 (0.13) & 182183 (0.09)\\ \hline
903: Logarithmic, $\alpha = 0.01$ & 28165 (0.50)& 37392 (0.30) & 53613 (0.13) & 182183 (0.09)\\ \hline
904: Second-chance & 15183 (0.27) & 24886 (0.20) & 46385 (0.12) & 177755 (0.09)\\ \hline
905: Optimal push level & 13916 (0.25) & 21805 (0.17) & 44062 (0.11) & 175914 (0.09) \\ \hline
906: \end{tabular}
907: \end{center}
908: \end{table*}
909:
910: Table~\ref{tab:policies2} compares the total cost of standard caching
911: with that of the linear and logarithmic policies for various $\alpha$
912: values, the second chance policy, and that of the optimal push level.
913: The experiment is for a network of \(2^{10}\) nodes and \(\lambda\)
914: rates of 1, 10, 100 and 1000 queries per second. In each table entry,
915: the first number is the total cost and the number in the parentheses
916: is the total cost normalized by the total cost for standard caching.
917:
918: For the lower query rates, the performance of the linear and the
919: logarithmic policies is greatly affected by the choice of parameter
920: $\alpha$. In some cases, the total cost of the linear policy exceeds
921: that of standard caching. For both the linear and logarithmic
922: policies, the total cost decreases with $\alpha$ until it reaches a
923: minimum that cannot be reduced any further. For both $\lambda = 1$ and
924: $\lambda =10$, this minimum is 1.5 to almost two times higher than that of
925: the second chance policy. For higher query rates, the performance of
926: the linear and logarithmic policies is less affected by the choice of
927: $\alpha$.
928: These results show that choosing a priori an $\alpha$ value for the
929: linear and logarithmic policies that will perform well across all
930: workloads is difficult.
931:
932: The log-based second-chance policy consistently outperforms both
933: probability-based policies and achieves a total cost very near the
934: minimum total cost. This is because, unlike the probability-based
935: policies that depend on a function of the node's network distance from
936: the root node, the second-chance policy adapts to the timing of the
937: queries within the workload and thus accounts for shifts in key
938: popularity, which is independent of the distance of the node. In all
939: remaining experiments, we use second-chance as the cut-off policy.
940:
941:
942: \subsection{Varying the Network Size}
943: \label{NetworkSize}
944:
945: In this section we show that CUP performs well as we vary the size
946: of the network.
947:
948: \begin{table*}
949: \caption[]{\small Comparison of CUP with standard caching for varying
950: numbers of nodes n = $2^k$ for k between 3 and 12.}
951: \label{tab:comparison2}
952: \begin{center}
953: \begin{tabular}{|l|r|r|r|r|r|r|r|r|r|r|r|} \hline
954: \textsf{\textbf{Number of Nodes}} & \textsf{\textbf{8}} & \textsf{\textbf{16}}
955: & \textsf{\textbf{32}} & \textsf{\textbf{64}} & \textsf{\textbf{128}}
956: & \textsf{\textbf{256}} & \textsf{\textbf{512}} & \textsf{\textbf{1024}}
957: & \textsf{\textbf{2048}} & \textsf{\textbf{4096}} \\\hline
958: CUP / STD Caching Miss Cost & 0.47 & 0.41 & 0.36 & 0.20 & 0.19 & 0.23 & 0.17 & 0.15 & 0.15 & 0.15 \\\hline
959: CUP miss latency &2.3 &2.7 &3.0 &3.0 &3.2 &4.0 & 3.9 & 3.9 & 5.5 & 7.3 \\\hline
960: STD Caching miss latency &2.8 &3.0 &3.5 &4.4 &5.1 & 5.4 & 7.7 & 9.4 & 13.0 & 19.1 \\\hline
961: Saved miss hops per CUP overhead hop & 0.77 & 1.05 & 1.47 & 2.70 & 3.44 & 3.0 & 5.57 & 7.05 & 9.13 & 13.2 \\\hline
962: \end{tabular}
963: \end{center}
964: \end{table*}
965:
966:
967:
968: Table~\ref{tab:comparison2} compares CUP and standard caching for
969: varying numbers of nodes using three metrics. We use a $\lambda$ rate
970: of one query per second. The first row shows the CUP miss cost as a
971: fraction of the standard caching miss cost. The second and third rows
972: show the query latency measured by average number of hops needed to
973: handle a miss for CUP and standard caching respectively. As can be
974: observed, CUP reduces latency respectively by 5.5, 7.5, and 11.8 hops
975: per miss for the 1024, 2048, and 4096 node networks. This is a
976: substantial reduction (as much as a factor of three) in query response
977: time in peer-to-peer networks.
978:
979: The fourth row in Table~\ref{tab:comparison2} shows the ``investment''
980: return per update push performed by CUP. This is computed as the
981: overall ratio of saved miss cost to overhead incurred by CUP:
982:
983: \small
984: \[
985: \label{eq:MissCostSaved}
986: \begin{split}
987: {S}&{avedMissOverheadRatio} \\ % \qquad \qquad \qquad \qquad \qquad \\
988: & = \frac {MissCost_{StandardCaching} - MissCost_{CUP}} {OverheadCost_{CUP}}
989: \end{split}
990: \] %\end{equation}
991:
992: \normalsize For the last three network sizes, the return is
993: respectively 7.05, 9.13, and 13.2 which are remarkable values for a
994: fully recoverable overhead investment. The return increases with the
995: network size, thus CUP is more amenable to larger networks.
996:
997: Note that this table was generated using a very low query arrival
998: rate. As a point of comparison, for a network of 1024 nodes and
999: \(\lambda = 1000\) queries per second, the CUP miss cost is 0.09 that
1000: of standard caching, the average CUP miss latency, 2.4 hops, is over
1001: ten times less than that for standard caching (25.1 hops), and the
1002: ratio of saved miss cost to update cost is 168 to 1. This
1003: demonstrates that CUP is indeed amenable to higher query arrival
1004: rates.
1005:
1006: \subsection{Multiple Replicas per Key}
1007:
1008: In this section we examine the effect on CUP performance of having
1009: multiple replicas per key and propagating updates from each of these
1010: replicas.
1011:
1012: A node that is receiving updates for key K will likely have
1013: multiple index entries cached for K. At first glance it seems
1014: that the more replicas there are in the system, the fewer the
1015: freshness misses for K the node should experience.
1016:
1017: It turns out this will occur only if the cut-off policy is independent
1018: of the number of replicas in the network. A naive implementation of a
1019: cut-off policy applies its decision at \emph{every} update arrival for
1020: K. Since the rate of update arrivals for K increases with the number
1021: of replicas in the system, the chances that the node will receive
1022: enough queries to pass its cut-off test are lower, and the naive
1023: implementation mistakenly concludes that there is not enough incentive
1024: to continue receiving updates. It therefore pushes up a clear-bit
1025: message earlier than necessary.
1026:
1027: Table~\ref{tab:replicas} illustrates this problem for a network of 1024
1028: nodes using the second-chance policy and a \(\lambda\) rate of 1 query
1029: per second. Column 2 in the table shows the miss cost and, in
1030: parentheses, the absolute number of misses incurred by the naive
1031: implementation. In this case, adding more replicas has the opposite
1032: effect of what we expected. More replicas can mean more misses. In
1033: fact, the single replica run exhibits the smallest miss cost and
1034: absolute number of misses.
1035:
1036: A solution to this problem requires that the cut-off decision be
1037: independent of the number of replicas in the network. One way to
1038: achieve this is to trigger the decision only when updates for a
1039: particular replica arrive, and to reset the popularity measure only at
1040: those times. This ensures that the popularity measure remains the same
1041: across updates for other replicas for the same key. Column 3 of
1042: Table~\ref{tab:replicas} shows that with this fix, the miss cost and
1043: number of misses do decrease as the number of replicas increases.
1044:
1045:
1046: \begin{table*}
1047: \caption[]{\small Miss cost, absolute number of misses, and total cost
1048: for varying number of replicas }
1049: \label{tab:replicas}
1050: \begin{center}
1051: \begin{tabular}{|l|r|r|r|} \hline
1052: \textsf{\textbf{ }} & \textsf{\textbf{Naive Cut-Off }} & \textsf{\textbf{Replica-independent Cut-Off}} & \textsf{\textbf{Replica-independent Cut-Off}} \\
1053: \textsf{\textbf{Replicas}} & \textsf{\textbf{Miss Cost \& (Misses)}}
1054: & \textsf{\textbf{Miss Cost \& (Misses)}} & \textsf{\textbf{Total Cost}} \\\hline
1055: %500 & 62249 (4171) & 7562 (502) & 3033595 \\ \hline
1056: 100 & 58068 (4493) & 7562 (502) & 608998 \\ \hline
1057: 50 & 59261 (4522) & 7562 (502) & 310301 \\ \hline
1058: 10 & 44079 (4296) & 7565 (504) & 69086 \\ \hline
1059: 5 & 24406 (2936) & 7565 (504) & 39615 \\ \hline
1060: 2 & 11614 (1366) & 7607 (522) & 21050 \\ \hline
1061: 1 & 8460 (1206) & 8460 (1206) & 15183 \\ \hline
1062: \end{tabular}
1063: \end{center}
1064: \end{table*}
1065:
1066: The last column of Table~\ref{tab:replicas} shows the total cost when
1067: each replica refresh is sent as a separate update. When compared to
1068: the 55905 hops of total cost for standard caching from
1069: Table~\ref{tab:policies2}, we observe that the total cost of CUP will
1070: eventually overtake that of standard caching as we increase the number
1071: of replicas. In fact this occurs at eight replicas where the total
1072: cost is 57430. While these results may seem to imply that a handful
1073: of replicas is enough for good CUP performance, for some applications,
1074: having many more replicas in the network is necessary even if they run
1075: the risk of unrecoverable additional CUP overhead. For example,
1076: having multiple replicas of content helps to balance the demand for
1077: that content across many nodes and reduces latency.
1078:
1079: One may view pushing updates for multiple replicas as an
1080: example of an aggressive CUP policy we refer to in
1081: Section~\ref{CostModel}. At 100 replicas, the total cost is about 10
1082: times that of standard caching. CUP pays the price of extra overhead
1083: but achieves a miss cost that is about 13.5 percent that of standard
1084: caching. Therefore, at the cost of extra network load, both query
1085: latency is reduced and the demand for content is balanced across a
1086: greater number of nodes.
1087:
1088: If however network load is a concern, there are a couple of techniques an
1089: authority node can use to reduce the overhead of CUP when there are
1090: many replicas in the network.
1091: First, rather than push all replica refreshes,
1092: the authority node can selectively choose to propagate a subset of the
1093: replica refreshes and suppress others. This allows the authority node
1094: to reduce update overhead as well as balance demand for content across
1095: the replicas. Another alternative would be to aggregate replica
1096: refreshes. When a refresh arrives for one replica, the authority node
1097: waits a threshold amount of time for other updates for the same key to
1098: arrive. It then batches all updates that arrive within that time and
1099: propagates them together as one update. This threshold would be a
1100: function of the lifetime of a replica and could be dynamically
1101: adjusted with the number of replicas in the system. We are
1102: experimenting with different kinds of threshold functions.
1103:
1104:
1105: \subsection{Varying Outgoing Update Capacity}
1106: Our experiments thus far show that CUP clearly outperforms standard
1107: caching under conditions where all nodes have full outgoing capacity.
1108: A node with full outgoing capacity is a node that can and does
1109: propagate all updates for which there are interested neighbors. In
1110: reality, an individual node's outgoing capacity will vary with its
1111: workload, network connectivity, and willingness to propagate updates.
1112: In this section we study the effect on CUP performance of reducing the
1113: outgoing update capacity of nodes.
1114:
1115:
1116: We present two experiments run on a network of 1024 nodes. In the
1117: first experiment, called "Up-And-Down", after a five minute warm up
1118: period, we randomly select twenty percent of the nodes to reduce their
1119: capacity to a fraction of their full capacity. These nodes operate at
1120: reduced capacity for ten minutes after which they return to full
1121: capacity. After another five minutes for stabilization, we randomly
1122: select another set of nodes and reduce their capacity for ten minutes.
1123: We proceed this way for the entire 3000 seconds during which queries
1124: are posted, so capacity loss occurs three times during the simulation.
1125: In the second experiment, called "Once-Down-Always-Down", after the
1126: initial five minute warmup period, the randomly selected nodes reduce
1127: and remain at reduced capacity for the remainder of the experiment.
1128:
1129: Figure~\ref{fig:capacityQ1} shows the total cost incurred by CUP
1130: versus reduced capacity $c$ for both Up-And-Down and
1131: Once-Down-Always-Down configurations. A reduced capacity $c = .25$
1132: means a node is only pushing out one-fourth the updates it receives.
1133: The figure also shows the total cost for standard caching as a
1134: horizontal line for comparison. The $\lambda$ rate is 1 query per
1135: second. Figure \ref{fig:capacityQ1000} shows the same for $\lambda$=
1136: 1000 which is especially interesting because CUP has bigger wins with
1137: higher query rates since more updates are justified than with lower
1138: query rates. Therefore, with high query rates CUP has more to lose if
1139: updates do not get propagated.
1140:
1141: Note that even when the capacity of one fifth of the nodes
1142: is reduced to zero percent and these nodes do not propagate updates,
1143: CUP outperforms standard
1144: caching for both query rates. The total cost incurred by CUP is about
1145: half that of standard caching for one query per second for both
1146: configurations. For 1000 queries per second, the total cost of CUP is
1147: 0.56 and 0.77 that of standard caching for "Up-And-Down" and
1148: "Once-Down-Always-Down" respectively.
1149:
1150: A key observation from these experiments is that CUP's performance
1151: degrades gracefully as $c$ decreases. This is because the reduction
1152: in propagation saves any
1153: overhead that would have occurred
1154: otherwise. The important point here is that even if nodes can only
1155: push out a fraction of updates to interested neighbors, CUP still
1156: extends the benefits of standard caching. Clearly though, CUP
1157: achieves its full potential when all nodes have maximum propagation
1158: capacity.
1159:
1160: \begin{figure}[tb]
1161: \centerline{\includegraphics[width=9cm]{UpAndDown,OnceDownAlwaysDownTotalCost,1.eps}}
1162: \caption{\small Total cost versus reduced capacity.}
1163: \label{fig:capacityQ1}
1164: \end{figure}
1165:
1166: \begin{figure}[tb]
1167: \centerline{\includegraphics[width=9cm]{UpAndDown,OnceDownAlwaysDownTotalCost,1000.eps}}
1168: \caption{\small Total cost versus reduced capacity. The y-axis is
1169: log scale.}
1170: \label{fig:capacityQ1000}
1171: \end{figure}
1172:
1173: % ------------------------------------------------------------------------
1174: \section{Related Work}
1175: \label{RelatedWork}
1176: % ------------------------------------------------------------------------
1177:
1178: Some peer-to-peer systems suffer from what
1179: we call the ``open-connection'' problem. Every time a peer node
1180: receives a query for which it does not have an answer cached, it asks
1181: one (e.g., Freenet \cite{clarke00}) or more (e.g., Gnutella \cite{gnutella})
1182: neighbors the same query by
1183: opening a connection and forwarding the query through that connection.
1184: The node keeps the connection open until the answer is returned through it.
1185: For every query on every item for which the node does not have a cached
1186: answer, the connection is maintained until the answer comes back.
1187: This results in excessive overhead for the node because it must
1188: maintain the state of many open connections. CUP avoids this overhead
1189: by asynchronously pushing responses as first-time updates and by
1190: coalescing queries for the same item into one query.
1191:
1192: Chord \cite{stoica01} and CFS \cite{dabek01} suggest alternatives to
1193: making the query response travel down the reverse query path back to
1194: the query issuer. Chord suggests iterative searches where the query
1195: issuer contacts each node on the query path one-by-one for the item of
1196: interest until one of the nodes is found to have the item. CFS
1197: suggests that the query be forwarded from node to node until a node is
1198: found to have the item. This node then directly sends the query
1199: response back to the issuer. Both of these approaches help avoid some
1200: of the long latencies that may occur as the query response traverses
1201: the reverse query path.
1202: CUP is advantageous regardless of whether the
1203: response is delivered directly to the issuer or through the reverse
1204: query path. However, to make this work for direct response delivery,
1205: CUP must not coalesce queries for the same item at a node into one
1206: query since each query would need to explicitly carry the return
1207: address information of the query issuer.
1208:
1209: All of the above systems (Gnutella, Freenet, Chord, and CFS) enable
1210: caching at the nodes along the query path. They do not focus on how
1211: to maintain entries once they have been cached. Cached items are
1212: removed when they expire and refetched on subsequent queries. For
1213: very popular items this can lead to higher average response time since
1214: subsequent bursts of queries must wait for the response to travel up
1215: and (possibly) down the query path. CUP can avoid this problem by
1216: refreshing or updating cached items for which there is interest before
1217: they expire.
1218:
1219: Consistent hashing work by Karger et al. \cite{karger97} looks at
1220: relieving hot spots at origin web servers by caching at intermediate
1221: caches between client caches and origin servers. Requests for items
1222: originate at the leaf clients of a conceptual tree and travel up
1223: through intermediate caches toward the origin server at the root of the
1224: tree. This work uses a model slightly different from the
1225: peer-to-peer model. Their model and analysis assume
1226: requests are made only at leaf clients and that intermediate caches do
1227: not store an item until it has been requested some threshold number of
1228: times. Also, this work does not focus on maintaining cache freshness.
1229:
1230: Update propagations in CUP form trees very similar to the application-level
1231: multicast trees built by Scribe \cite{rowstron01a}. Scribe
1232: is a publish-subscribe infrastructure built on top
1233: of Pastry \cite{rowstron01b}. Scribe creates a multicast tree rooted
1234: at the rendez-vous point of each multicast group.
1235: Publishers send a message to the rendez-vous point which then
1236: transmits the message to the entire group by sending it down the
1237: multicast tree.
1238: The multicast tree is formed by joining the Pastry routes from each
1239: subscriber node to the rendez-vous point. Scribe could apply the
1240: ideas CUP introduces to provide update propagation for cache
1241: maintenance in Pastry.
1242:
1243: Cohen and Kaplan study the effect that aging through cascaded caches
1244: has on the miss rates of web client caches \cite{cohen01a}. For each
1245: object an intermediate cache refreshes its copy of the object when its
1246: age exceeds a fraction \emph{v} of the lifetime duration. The
1247: intermediate cache does not push this refresh to the client; instead,
1248: the client waits until its own copy has expired at which point it
1249: fetches the intermediate cache's copy with the remaining lifetime.
1250: For some sequences of requests at the client cache and some
1251: \emph{v}'s, the client cache can suffer from a higher miss rate than
1252: if the intermediate cache only refreshed on expiration. Their model
1253: assumes zero communication delay. A CUP tree could be viewed as a
1254: series of cascaded caches in that each node depends on the previous
1255: node in the tree for updates to an index entry. The key difference is
1256: that in CUP, refreshes are pushed down the entire tree of interested
1257: nodes. Therefore, barring communication delays, whenever a parent
1258: cache gets a refresh so does the interested child node. In such
1259: situations, the miss rate at the child node actually improves.
1260:
1261: % ------------------------------------------------------------------------
1262: \section{Conclusions}
1263: \label{Conclusions}
1264: % ------------------------------------------------------------------------
1265:
1266: In this paper we propose CUP: Controlled Update Propagation for cache
1267: maintenance. CUP query channels coalesce bursts of queries for the
1268: same item into a single query. CUP update channels refresh
1269: intermediate caches and reduce the average query latency by over a
1270: factor of ten in favorable conditions, and as much as a factor of
1271: three in unfavorable conditions. Through light book-keeping, CUP
1272: controls and confines propagations so that only updates that are
1273: likely to be justified are propagated. In fact, when only half the
1274: number of updates propagated are justified, CUP's overhead is
1275: completely recovered. Finally, even when a large percentage of nodes
1276: cannot propagate updates (due to limited capacity), CUP continues to
1277: outperform standard caching with expiration.
1278:
1279: % ------------------------------------------------------------------------
1280: \section{Acknowledgements}
1281: % ------------------------------------------------------------------------
1282: This research is supported by the Stanford Networking Reseach Center,
1283: and by DARPA (contract N66001-00-C-8015).
1284:
1285: The work presented here has benefited greatly from discussions with
1286: Petros Maniatis, Armando Fox, Nick McKeown, and Rajeev Motwani. We
1287: thank them for their invaluable feedback.
1288:
1289: \small
1290: %\begin{singlespace}
1291: %\bibliographystyle{alpha} \bibliography{cup}
1292: %\end{singlespace}
1293:
1294: \newcommand{\etalchar}[1]{$^{#1}$}
1295: \begin{thebibliography}{DKK{\etalchar{+}}01}
1296:
1297: \bibitem[CK01a]{cohen01a}
1298: Edith Cohen and Haim Kaplan.
1299: \newblock {Aging Through Cascaded Caches: Performance Issues in the
1300: Distribution of Web Content}.
1301: \newblock In {\em SIGCOMM}, 2001.
1302:
1303: \bibitem[CK01b]{cohen01b}
1304: Edith Cohen and Haim Kaplan.
1305: \newblock {Refreshment Policies for Web Content Caches}.
1306: \newblock In {\em INFOCOM}, 2001.
1307:
1308: \bibitem[CSWH00]{clarke00}
1309: Ian Clarke, Oskar Sandberg, Brandon Wiley, and Theodore~W. Hong.
1310: \newblock {Freenet: A Distributed Anonymous Information Storage and Retrieval
1311: System}.
1312: \newblock In {\em DIAU}, July 2000.
1313:
1314: \bibitem[DKK{\etalchar{+}}01]{dabek01}
1315: Frank Dabek, M.~Frans Kaashoek, David Karger, Robert Morris, and Ion Stoica.
1316: \newblock {Wide-area Cooperative Storage with CFS}.
1317: \newblock In {\em SOSP}, 2001.
1318:
1319: \bibitem[gnu]{gnutella}
1320: {The Gnutella Protocol Specification v0.4}.
1321: \newblock http://gnutella.wego.com/.
1322:
1323: \bibitem[KLL{\etalchar{+}}97]{karger97}
1324: David Karger, Eric Lehman, Tom Leighton, Mathew Levine, Daniel Lewin, and Rina
1325: Panigrahy.
1326: \newblock {Consistent Hashing and Random Trees: Distributed Caching Protocols
1327: for Relieving Hot Spots on the World Wide Web}.
1328: \newblock In {\em STOC}, 1997.
1329:
1330: \bibitem[MGB01]{maniatis01}
1331: Petros Maniatis, T.J. Giuli, and Mary Baker.
1332: \newblock {Enabling the Long-Term Archival of Signed Documents through Time
1333: Stamping}.
1334: \newblock Technical Report cs.DC/0106058, Stanford University, 2001.
1335: \newblock http://www.arxiv.org/abs/cs.DC/0106058.
1336:
1337: \bibitem[RD01]{rowstron01b}
1338: Antony Rowstron and Peter Druschel.
1339: \newblock {Pastry: Scalable, distributed object location and routing for
1340: large-scale peer-to-peer systems}.
1341: \newblock In {\em MiddleWare}, November 2001.
1342:
1343: \bibitem[RFH{\etalchar{+}}01]{ratnasamy01a}
1344: Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, and Scott Shenker.
1345: \newblock {A Scalable Content-Addressable Network}.
1346: \newblock In {\em SIGCOMM}, 2001.
1347:
1348: \bibitem[RKCD01]{rowstron01a}
1349: Antony Rowstron, Anne-Marie Kermarrec, Miguel Castro, and Peter Druschel.
1350: \newblock {SCRIBE: The design of a large-scale event notification
1351: infrastructure}.
1352: \newblock In {\em NGC}, 2001.
1353:
1354: \bibitem[SMK{\etalchar{+}}01]{stoica01}
1355: Ion Stoica, Robert Morris, David Karger, Frans Kaashoek, and Hari Balakrishnan.
1356: \newblock {Chord: A Scalable Peer-to-peer Lookup Service for Internet
1357: Applications}.
1358: \newblock In {\em SIGCOMM}, 2001.
1359:
1360: \bibitem[ZKJ01]{zhao01}
1361: Ben~Y. Zhao, John~D. Kubiatowicz, and Anthony~D. Joseph.
1362: \newblock {Tapestry: An Infrastructure for Fault-tolerant Wide-area Location
1363: and Routing}.
1364: \newblock Technical Report UCB/CSD-01-1141, U. C. Berkeley, April 2001.
1365:
1366: \end{thebibliography}
1367:
1368:
1369:
1370:
1371: % \begin{tiny}
1372: % \begin{verbatim}
1373: % $Id: cup.tex,v 1.12 2002/02/02 05:04:16 mema Exp mema $
1374: % \end{verbatim}
1375: % \end{tiny}
1376:
1377: \end{document}
1378:
1379: