\section{Introduction}
A distributed system consists of a collection of geographically dispersed
autonomous sites, which are connected by a communication network. The
sites (or processes) have no shared memory and can only communicate
with one another by means of messages. \par
In the {\em mutual exclusion problem}, concurrent access to a shared
resource, called the {\em critical section} ({\em CS}), must be synchronized
so that at most one process can access the {\em CS} at any time. Mutual exclusion
is crucial for the design of distributed systems. Many problems involving
replicated data, atomic commitment, synchronization, and others require that
a resource be allocated to a single process at a time. Solutions to this
problem often entail high communication costs and are vulnerable to site and
communication failures.
\par
Several distributed algorithms exist to implement mutual exclusion
({\em e.g.} \cite{ab,ca,la,ma,ra,ri,tn}). They are usually designed for
complete or general networks, and the most recent ones are often fault
tolerant. Whatever the algorithm, it is either permission-based or
token-based, and it uses data structures suited to that approach. Lamport's
permission-based algorithm \cite{la} maintains a waiting queue at each site,
and its message complexity is $3(n - 1)$, where $n$ is the
number of sites.
Several algorithms presented later reduce the number of messages
to $\Theta(n)$ with a smaller constant factor \cite{ca,ri}. Maekawa's
permission-based algorithm \cite{ma} imposes a logical structure on the
network and only requires $c\sqrt{n}$ messages to be exchanged (where $c$
is a constant which varies between 3 and 5).
\par
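To give a sense of scale, the following worked instance evaluates the two
bounds quoted above for a hypothetical network of $n = 100$ sites (this value
of $n$ is an assumption chosen purely for illustration):
\[
3(n-1) = 3 \times 99 = 297, \qquad
c\sqrt{n} = 10\,c \in [30, 50] \quad \mbox{for } 3 \le c \le 5 .
\]
\par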
The token-based algorithm $\cal A$ (see \cite{nta,tn}), which is analysed
in the present paper, is the first mutual exclusion algorithm for complete
networks which achieves a logarithmic
average message complexity; it is also the first one to use a {\em
tree-based} structure, namely a tree maintained by path reversals, as its
basic distributed data structure. More recently, various mutual
exclusion algorithms ({\em e.g.} \cite{ab,ra}) have been designed
which use either the same data structure or closely related tree-based data
structures. They usually also provide efficient (possibly fault
tolerant) solutions to the mutual exclusion problem.
\par
The general model used in \cite{nta,tn} to design algorithm $\cal A$
assumes the underlying communication links and the processes to be
reliable. Message propagation delay is finite but unpredictable, and
messages are not assumed to obey the FIFO rule. A process entering the
{\em CS} releases it within a
finite delay. Moreover, the communication network is {\em complete}. To
ensure fair mutual exclusion, each node in the network maintains two
pointers, {\em Last} and {\em Next}, at all times. {\em Last} indicates the
node to which requests for {\em CS} access should be forwarded; {\em
Next} points to the node to which access permission must be forwarded after
the current node has executed its own {\em CS}. As described below, the
dynamic updating of these two pointers involves two distributed data
structures:
a waiting queue, and a {\em dynamic} logical rooted tree structure which is
transformed by path reversals. Algorithm $\cal A$ is thus very
efficient in terms of average-case message
complexity, {\em viz.} $H_{n-1} = \ln n + O(1)$.\footnote{Throughout the
paper, $\lg$
denotes the base two logarithm and $\ln$ the natural logarithm.
$H_{n} = \sum_{i=1}^{n} 1/i$ denotes the $n$-th harmonic number,
with asymptotic value $H_{n} = \ln n + \gamma + 1/(2n) + O(n^{-2})$
(where $\gamma = 0.577\ldots$ is Euler's constant).}
\par
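As a numerical illustration of the expansion given in the footnote, take a
hypothetical network of $n = 100$ sites (a value assumed here only as an
example). Keeping the first three terms of the expansion gives
\[
H_{99} \approx \ln 99 + \gamma + \frac{1}{2 \times 99}
\approx 4.5951 + 0.5772 + 0.0051 = 5.1774 ,
\]
so that the average-case figure $H_{n-1}$ quoted above is roughly $5.18$
for this value of $n$.
\par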
Let us now recall how these two data
structures are involved in the algorithm, which is fully
described in \cite{nta,tn}.
Algorithm $\cal A$ uses the notion of a {\em token}. A node
can enter its {\em CS} only if it holds the token. However, unlike
schemes in which a token circulates continuously in the system, the token is sent
from one node to another if and only if a request is made for it. The token
(also called the {\em privilege message}) consists of a queue of processes which
are requesting the {\em CS}. The token circulates strictly according
to the order in which the requests have been made.
\par
The first data structure
used in $\cal A$ is a {\em waiting queue}, which is updated by
each node after executing its own {\em CS}. The waiting queue of
requesting processes is maintained at the
node holding the token and is transferred along with the token whenever
the token is transferred. The requesting nodes receive the token strictly
according to the order in the queue.
Each node knows its successor in the waiting queue only if its {\em Next}
pointer is set. The head is the node which owns the token and the tail is
the last node which requested the {\em CS}. Thus, a path is constructed in such a
way that each request message is transmitted to the tail. Then, either the
tail is in the {\em CS} and it lets the requesting node enter it, or the
tail is waiting for the token, in which case the requesting node is appended
after the tail.
\par
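The queue mechanism just described can be traced on a small instance. The
node names $d$, $a$ and $b$ below are hypothetical and serve only as an
illustration. Suppose that $d$ holds the token and is in its {\em CS}, that
$a$ requests the {\em CS} first, and that $b$ requests it afterwards. The
request of $a$ reaches the tail $d$, and the request of $b$ then reaches the
new tail $a$, so that
\[
\mbox{{\em Next}}(d) = a, \qquad \mbox{{\em Next}}(a) = b, \qquad
\mbox{{\em Next}}(b) \mbox{ undefined},
\]
which encodes the waiting queue $d \rightarrow a \rightarrow b$ with head $d$
and tail $b$. When $d$ leaves its {\em CS}, the token goes to $a$, and
later from $a$ to $b$.
\par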
The second data structure involved in algorithm $\cal A$ gives the path
to the tail: it is a logical rooted ordered tree. A node which requests
the {\em CS} sends its message to its {\em Last}, and, from {\em Last} to
{\em Last}, the request is transmitted to the tail of the waiting queue. In
such a structure, every node knows only its {\em Last}. Moreover, if the
requesting node is not the last, the logical tree structure is transformed:
the requesting node becomes the new last, and every node on the path from
the requesting node to the former last takes the requesting node as its
{\em Last}.
This is precisely the logical transformation known as {\em path reversal}, which is
performed at a node $x$ of an ordered $n$-node tree $T_{n}$ consisting of
a root with $n - 1$ children. These transformations $\varphi(T_{n})$
are performed to maintain a dynamic decentralized path towards the tail of the
waiting queue.
\par
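A small worked instance may help to visualize a path reversal. The four node
names and the initial pointer values below are hypothetical, chosen only for
illustration, and the message count assumes that each forwarding step of the
request costs one message. Suppose that
\[
\mbox{{\em Last}}(a) = b, \qquad \mbox{{\em Last}}(b) = c, \qquad
\mbox{{\em Last}}(c) = d,
\]
where $d$ is the root, that is, the current last. A request issued by $a$
travels along $a \rightarrow b \rightarrow c \rightarrow d$, which takes
three messages. After the path reversal performed at $a$, the pointers become
\[
\mbox{{\em Last}}(b) = \mbox{{\em Last}}(c) = \mbox{{\em Last}}(d) = a,
\]
so that $a$ is the new root with three children, and a subsequent request
from $b$, $c$ or $d$ reaches the tail in a single message.
\par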
In \cite{gi}, Ginat, Sleator and Tarjan derived a tight upper bound of
$\lg n$ for the cost of path reversal by using the notion of {\em amortized
cost} of a path
reversal. In \cite{nta}, the average number of messages used by algorithm
$\cal A$ was obtained by means of combinatorial and algebraic methods on the
Dyck language (namely by encoding oriented ordered trees $T_{n}$ with
Dyck words).
By contrast, the present paper uses direct and general derivation methods
involving one-to-one correspondences
between combinatorial structures such as priority queues, binary tournament
trees and permutations.
Moreover, a full analysis of algorithm $\cal A$ is completed in this paper
from the computation of the first and second moments of the cost of path
reversal; {\em viz.} we derive the expected and worst-case message
complexity of $\cal A$ as well as its average and
worst-case waiting time. Note that the average-case analysis of
other efficient tree-based mutual exclusion algorithms ({\em e.g.}
\cite{ab,ra}) may easily be adapted from the present one,
since the data structures involved in such algorithms are quite close to
those of algorithm $\cal A$.
Since the analysis of the average waiting time relies on simple birth-and-death
process methods and asymptotics, it could also be applied easily to the
waiting time analysis of the above-mentioned algorithms. In this sense, the
analyses proposed in this paper are quite general.
\par
The paper is organized as follows.
In Section 2, we define the path reversal transformation performed in a
tree $T_{n}$ and give a constructive proof of the one-to-one correspondence
between priority queues and the combinatorial structure of trees $T_{n}$.
In Section 3, probability generating functions are computed which
yield the exact expected cost of path reversal, $H_{n-1}$, and the second
moment of the cost. Section 4 is devoted to the computation of the waiting
time and the expected waiting time of algorithm $\cal A$. In Section 5,
more extended complexity results are given, {\em viz.} randomized bounds
on the worst-case message complexity of the algorithm in {\em arbitrary}
networks. In the Appendix, we propose a second proof technique which directly
yields the exact expected cost of path reversal by solving a
simple recurrence equation.