
1: \section{Introduction}

2: A distributed system consists of a collection of geographically dispersed

3: autonomous sites, which are connected by a communication network. The

4: sites (or processes) have no shared memory and can only communicate

5: with one another by means of messages. \par

6: In the {\em mutual exclusion problem}, concurrent access to a shared

7: resource, called the {\em critical section} ({\em CS}), must be synchronized

8: such that at any time, only one process can access the ({\em CS}). Mutual exclusion

9: is crucial for the design of distributed systems. Many problems involving

10: replicated data, atomic commitment, synchronization, and others require that

11: a resource be allocated to a single process at a time. Solutions to this

12: problem often entail high communication costs and are vulnerable to site and

13: communication failures.

14: \par

15: Several distributed algorithms exist to implement mutual exclusion

16: \cite{ab,ca,la,ma,ra,ri,tn}. They are usually designed for complete or

17: general networks, and the most recent ones are often fault tolerant.

18: Whatever the algorithm, it is either permission-based or token-based,

19: and it uses data structures suited to that approach. Lamport's

20: permission-based algorithm \cite{la} maintains a waiting queue at each site and

21: the message complexity of the algorithm is $3(n - 1)$, where $n$ is the

22: number of sites.

23: Several algorithms were presented later, which reduce the number of messages

24: to $\Theta(n)$ with a smaller constant factor \cite{ca,ri}. Maekawa's

25: permission-based algorithm \cite{ma} imposes a logical structure on the

26: network and only requires $c\sqrt{n}$ messages to be exchanged (where $c$

27: is a constant which varies between 3 and 5).

28: \par

29: The token-based algorithm $\cal A$ (see \cite{nta,tn}), which is analysed

30: in the present paper, is the first mutual exclusion algorithm for complete

31: networks which achieves a logarithmic

32: average message complexity; moreover, it is the very first one to use a {\em

33: tree-based} structure, namely a path reversal, as its basic distributed data

34: structure. More recently, various mutual

35: exclusion algorithms ({\em e.g.} \cite{ab,ra}) have been designed

36: which use either the same data structure, or some very close tree-based data

37: structures. They usually also provide efficient (possibly fault

38: tolerant) solutions to the mutual exclusion problem.

39: \par

40: The general model used in \cite{nta,tn} to design algorithm $\cal A$

41: assumes the underlying communication links and the processes to be

42: reliable. Message propagation delay is finite but unpredictable and the

43: messages are not assumed to obey the FIFO rule. A process entering the

44: ({\em CS}) releases it within a

45: finite delay. Moreover, the communication network is {\em complete}. To

46: ensure a fair mutual exclusion, each node in the network maintains two

47: pointers, {\em Last} and {\em Next}, at any time. {\em Last} indicates the

48: node to which requests for ({\em CS}) access should be forwarded; {\em

49: Next} points to the node to which  access permission must be forwarded after

50: the current node has executed its own ({\em CS}). As described below, the

51: dynamic updating of these two pointers involves two distributed data

52: structures:

53: a waiting queue, and a {\em dynamic} logical rooted tree structure which is

54: updated by path reversals. Algorithm $\cal A$ is thus very

55: efficient in terms of  average-case message

56: complexity, {\em viz.} $H_{n-1} = \ln n + O(1)$\footnote{Throughout the

57: paper, $\lg$

58: denotes the base two logarithm and $\ln$ the natural logarithm.

59: $H_{n} =\: \sum_{i=1}^{n} 1/i$ denotes the $n$-th harmonic number,

60: with asymptotic value $H_{n} = \ln n + \gamma + 1/(2n) + O(n^{-2})$

61: (where $\gamma = 0.577\ldots$ is Euler's constant).}.
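To give a feel for the orders of magnitude quoted so far, note that $H_{n-1} = H_{n} - 1/n$, so the expansion of $H_{n}$ recalled above gives $H_{n-1} = \ln n + \gamma - 1/(2n) + O(n^{-2})$. The Python sketch below compares $3(n-1)$, $c\sqrt{n}$ and $H_{n-1}$ numerically. The function names and the choice $c = 3$ are illustrative assumptions and are not taken from \cite{nta,tn}.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, gamma = 0.577...


def harmonic(n):
    """Return the n-th harmonic number H_n = sum_{i=1}^{n} 1/i (with H_0 = 0)."""
    return sum(1.0 / i for i in range(1, n + 1))


def message_costs(n, c=3.0):
    """Message counts quoted in the text for a network of n sites.

    c is an illustrative value for the constant in the c*sqrt(n) bound;
    the text only states that c varies between 3 and 5.
    """
    return {
        "3(n-1)": 3 * (n - 1),
        "c*sqrt(n)": c * math.sqrt(n),
        "H_{n-1}": harmonic(n - 1),
    }


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        costs = message_costs(n)
        # H_{n-1} - ln n should approach gamma as n grows.
        gap = costs["H_{n-1}"] - math.log(n)
        print(n, costs, "H_{n-1} - ln n =", round(gap, 4))
```

For $n = 100$ this gives $3(n-1) = 297$ and $c\sqrt{n} = 30$, while $H_{99}$ is only slightly above $5$.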

62: \par

63: Let us recall now how the two data

64: structures at hand are actually involved in the algorithm, which is fully

65: described in \cite{nta,tn}.

66: Algorithm $\cal A$ uses the notion of {\em token}. A node

67: can enter its ({\em CS}) only if it has the token. However, unlike the

68: concept of a token circulating continuously in the system, the token is sent

69: from one node to another if and only if a request is made for it. The token

70: (also called {\em privilege message}) consists of a  queue of processes which

71: are requesting the ({\em CS}). The token circulates strictly according

72: to the order in which the requests have been made.

73: \par

74: The first data structure

75: used in $\cal A$ is a {\em waiting queue} which is updated by

76: each node after executing its own ({\em CS}). The waiting queue of

77: requesting processes is maintained at the

78: node containing the token and is transferred along with the token whenever

79: the token is transferred. The requesting nodes receive the token strictly

80: according to the order in the queue.

81: Each node knows its successor in the waiting queue only if its {\em Next}

82: pointer is set. The head is the node which owns the token and the tail is the last

83: node which requested the ({\em CS}). Thus, a path is constructed in such a

84: way that each request message is transmitted to the tail. Then, either the

85: tail is in the ({\em CS}) and it lets the requesting node enter it, or the

86: tail waits for the token, in which case the requesting node is appended to

87: the tail.
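The behaviour of the waiting queue can be sketched sequentially as follows. This is a simplified illustration and not the algorithm of \cite{nta,tn}: it assumes that every request has already reached the tail (the routing is abstracted into a global \texttt{tail} variable), that each node has at most one pending request, and that a node enters its ({\em CS}) as soon as it receives the token. The class and method names are hypothetical.

```python
class WaitingQueue:
    """Sequential sketch of the distributed waiting queue built from Next pointers."""

    def __init__(self, n, token_holder=0):
        self.next = [None] * n      # next[i]: node that receives the token after i
        self.head = token_holder    # node that currently owns the token
        self.tail = token_holder    # last node that requested the CS
        self.busy = False           # True while the head is inside its CS

    def request(self, x):
        """Node x requests the CS (the request is assumed to have reached the tail)."""
        if not self.busy:
            # The token is idle at the tail: x receives it and enters its CS.
            self.head = x
            self.busy = True
        else:
            # The tail is in the CS or waits for the token: x is appended after it.
            self.next[self.tail] = x
        self.tail = x

    def release(self):
        """The head leaves its CS; return the node that enters next, or None."""
        successor = self.next[self.head]
        if successor is None:
            self.busy = False       # nobody is waiting: the token stays idle here
            return None
        self.next[self.head] = None
        self.head = successor       # the token is forwarded along the Next pointer
        return successor


if __name__ == "__main__":
    q = WaitingQueue(5)
    for x in (2, 4, 1):
        q.request(x)
    served = [q.head]
    while True:
        nxt = q.release()
        if nxt is None:
            break
        served.append(nxt)
    print(served)  # nodes are served in request order
```

In this sketch the token visits the nodes in the order in which their requests were appended, which is the first-come, first-served behaviour described above.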

88: \par

89: The second data structure involved in algorithm $\cal A$ gives the path to

90: go to the tail: it is a logical rooted ordered tree. A node which requests

91: the ({\em CS}) sends its message to its {\em Last}, and, from {\em Last} to

92: {\em Last}, the request is transmitted to the tail of the waiting queue. In

93: such a structure, every node knows only its {\em Last}. Moreover, if the

94: requesting node is not the last, the logical tree structure is transformed:

95: the requesting node becomes the new {\em Last}, and every node located

96: between the requesting node and the former last takes the new last as its

97: {\em Last}.

98: This is a typical {\em path reversal} transformation, which is

99: performed at a node $x$ of an ordered $n$-node tree $T_{n}$ consisting of

100: a root with $n - 1$ children. These transformations  $\varphi(T_{n})$

101: are performed to keep a dynamic decentralized path towards the tail of the

102: waiting queue.
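A path reversal on the {\em Last} pointers can be sketched as below. The sketch is centralized and only illustrative: the tree is stored as an array \texttt{last} (with \texttt{None} at the root), and the returned cost is the number of edges on the path from the requesting node to the root, under the assumption that forwarding the request costs one message per edge. The function name and the array representation are ours, not those of \cite{nta,tn}.

```python
def path_reversal(last, x):
    """Perform a path reversal at node x on the tree given by the Last pointers.

    last[i] is the Last pointer of node i, or None if i is the root.
    Every node on the path from x to the root (root included) is re-pointed
    to x, and x becomes the new root. The list is modified in place.
    Returns the number of edges on the path from x to the former root.
    """
    path = []
    node = last[x]
    while node is not None:     # follow Last pointers up to the root
        path.append(node)
        node = last[node]
    for node in path:           # all nodes on the path now point to x
        last[node] = x
    last[x] = None              # x is the new root
    return len(path)


if __name__ == "__main__":
    # Initial tree: node 0 is the root and has the other four nodes as children.
    last = [None, 0, 0, 0, 0]
    print(path_reversal(last, 3), last)
    print(path_reversal(last, 1), last)
    print(path_reversal(last, 2), last)
```

Starting from a root with $n - 1$ children, the first reversal always has cost $1$; later reversals may follow longer paths, which is why the cost has to be analysed on average.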

103: \par

104: In \cite{gi}, Ginat, Sleator and Tarjan derived a tight upper bound of

105: $\lg n$ on the cost of path reversal, using the notion of {\em amortized

106: cost} of a path

107: reversal. By means of combinatorial and algebraic methods on the

108: Dyck language (namely by encoding oriented ordered trees $T_{n}$ with

109: Dyck words), the average number of messages used by algorithm

110: $\cal A$ was obtained in \cite{nta}.

111: By contrast, the present paper uses direct and general derivation methods

112: involving one-to-one correspondences

113: between combinatorial structures such as priority queues, binary tournament

114: trees and permutations.

115: Moreover, a full analysis of algorithm $\cal A$ is completed in this paper

116: from the computation of the first and second moments of the cost of path

117: reversal; {\em viz.} we derive the expected and worst-case message

118: complexity of $\cal A$ as well as its average and

119: worst-case waiting time.  Note that the average-case analysis of

120: other efficient mutual exclusion tree-based algorithms ({\em e.g.}

121: \cite{ab,ra}, among others) may easily be adapted from the present one,

122: since the data structures involved in such algorithms are quite close to

123: those of algorithm $\cal A$.

124: Since the analysis of the average waiting time uses simple birth-and-death

125: process methods and asymptotics, it could also apply easily to the

126: waiting time analysis of the above-mentioned algorithms. In this sense, the

127: analyses proposed in this paper are quite general indeed.

128:

129: The paper is organized as follows.

130: In Section 2, we define the path reversal transformation performed in a

131: tree $T_{n}$ and give a constructive proof of the one-to-one correspondence

132: between priority queues and the combinatorial structure of trees $T_{n}$.

133: In Section 3, probability generating functions are computed which

134: yield the exact expected cost of path reversal: $H_{n-1}$, and the second

135: moment of the cost. Section 4 is devoted to the computation of the waiting

136: time and the expected waiting time of algorithm $\cal A$. In Section 5,

137: more extended complexity results are given, {\em viz.} randomized bounds

138: on the worst-case message complexity of the algorithm in {\em arbitrary}

139: networks. In the Appendix, we propose a second proof technique which directly

140: yields the exact expected cost of path reversal by solving a simple

141: recurrence equation.

142: