1: \section{One-one correspondences between combinatorial structures}
2: We first define a path reversal transformation in $T_{n}$ and its {\em cost}
3: (see \cite{gi}). Then we point out some one-to-one correspondences between
4: combinatorial objects and structures which are relevant to the problem of computing
5: the average cost of path reversal. Such one-to-one tools are used in Section 3
6: to compute this expected cost and its variance by means of corresponding
7: probability generating functions.
8:
9: \subsection{Path reversal}
10: Let $T_{n}$ be a rooted $n$-node tree, or an ordered tree with $n$ nodes,
11: according to either \cite{gi}, or \cite[page 306]{kn}. A {\em path reversal}
12: at a node $x$ in $T_{n}$ is performed by traversing the path from $x$ to the
13: tree root $r$ and making $x$ the parent (or pointer {\em Last}) of each node
14: on the path other than $x$. Thus $x$ becomes the new tree root.
15: The {\em cost} of the reversal is the number of edges on the path
16: reversed. Path reversal is a variant of the standard path compression
17: algorithm for maintaining disjoint sets under union.
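\par
As a small illustration (the intermediate node names $a$ and $b$ are
hypothetical, chosen only for this example), suppose that the path from $x$
to the root $r$ is $x \rightarrow a \rightarrow b \rightarrow r$. A path
reversal at $x$ traverses these three edges, so that its cost is 3, and it
resets the pointers on the path to
$$\mbox{\em Last}(a) \;=\; \mbox{\em Last}(b) \;=\; \mbox{\em Last}(r) \;=\; x,$$
after which $x$ is the new root and $a$, $b$ and $r$ are all children of $x$.
By the definition above, the pointers of the nodes lying off the path are
left unchanged.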
18:
19: \begin{figure}[t]
\centering
21: \includegraphics[width=0.9\textwidth]{pr.eps}
22: \caption{Path reversal $\varphi_{x_{0}}$.
23: The $T_{i}$'s denote the (left/right) subtrees of $T_{n}$.}
24: \end{figure}
25:
26: The average cost of a path reversal performed on an initial ordered $n$-node
27: tree $T_{n}$ which consists of a root with $n - 1$ descendants (or
28: {\em children}, as in \cite{gi}) is the expected number of edges on the paths
29: reversed in $T_{n}$ (see Figure 1).
In other words, it is the {\em expected height of such reversed trees}
31: $\varphi(T_{n})$, provided that we let the
32: height of a tree root be 1: {\em viz.} the {\em height} of a node $x$ in
33: $T_{n}$ is thus defined as being the number of nodes on the path from the
34: node $x$ to the root $r$ of $T_{n}$.
35: \par
36: It turns out that the average number of messages used in $\cal A$ is
37: actually the expected cost of a path reversal performed on such initial
38: ordered $n$-node
39: trees $T_{n}$ which consist of a root with $n - 1$ children. This is indeed
40: the average number of changes of the variable {\em Last} which builds the
41: dynamic data structure of path reversal used in algorithm $\cal A$.
42:
43: \subsection{Priority queues, tournament trees and permutations}
44: Whenever two combinatorial structures are counted by the same number,
45: there exist one-one mappings between the two structures.
46: Explicit one-to-one correspondences between combinatorial representations
provide coding and decoding algorithms between the structures. We now need the
48: following definitions of some combinatorial structures which are closely
49: connected with path reversal and involved in the computation of its cost.
50:
51: \subsubsection{Definitions and notations}
52: \begin{description}
53: \item
54: {\sl (i)} Let $[n]$ be the set $\{1, 2, \ldots , n\}$.
55: A {\em permutation} is a {\em one-one mapping} $\sigma : [n] \rightarrow
56: [n]$; we write $\sigma \in S_{n}$, where $S_{n}$ is the symmetric group over
57: $[n]$.
58: \item
59: {\sl (ii)} A {\em binary tournament tree} of size $n$ is a binary $n$-node
60: tree whose internal nodes are labeled with consecutive integers of $[n]$, in
61: such a way that the root is labeled 1, and all labels are decreasing
62: ({\em bottom-up}) along each branch. Let ${\cal T}_{n}$ denote the set of
63: all binary tournament trees of size $n$.
64: ${\cal T}_{n}$ also denotes the set of tournament representations of all
65: permutations $\sigma~\in~S_{n}$, considered as elements of $[n]^{n}$,
66: since the correspondence $\tau : S_{n} \rightarrow {\cal T}_{n}$ is
67: one-one (see \cite{vu} for a detailed proof). Note that this one-to-one
68: mapping implies that $\left| {\cal T}_{n} \right| = n!$
69: \item
{\sl (iii)} A {\em priority queue} of size $n$ is a set $Q_{n}$ of keys;
71: each key $K \in Q_{n}$ has an associated priority $p(K)$ which is an
72: arbitrary integer. To avoid cumbersome
73: notations, we identify $Q_{n}$ with the set of priorities of its keys.
Strictly speaking, this is a set with repetitions since priorities need not
all be distinct. However, it is convenient to ignore this technicality
76: and assume {\em distinct priorities}. The simplest representation of a
77: priority queue of size $n$ is then a sequence
78: $s~=~(p_{1},p_{2},~\ldots~,p_{n})$ of the priorities of $Q_{n}$, kept in
their order of arrival. Assuming the $n!$ possible orders of arrival of the
$p_{i}$'s
to be equally likely, a priority queue $Q_{n}$ ({\em i.e.} a sequence $s$ of
$p_{i}$'s) is defined as
random {\em iff} it is associated with a random order of the $p_{i}$'s.
84: There is a one-to-one correspondence between the set ${\cal T}_{n}$ of all the
85: $n$-node binary tournament trees and the set of all the priority queues
86: $Q_{n}$ of size $n$.
To each sequence of priorities $s = (p_{1}, \ldots ,p_{n}) \in Q_{n}$,
we associate a binary tournament tree $\gamma(s) = T \in {\cal T}_{n}$
by the following rules: let {\bf m}$ \; = \; \min(s)$ and write $s =
\ell \; \mbox{\bf m} \; r$; the binary tree $T \in {\cal T}_{n}$ then has
{\bf m} as root, $\gamma(\ell)$ as left subtree and $\gamma(r)$ as right
92: subtree. The rules are applied repeatedly to all the left and right
93: subsequences of $s$, and from the root of $T$ to the leaves of $T$; by
94: convention, we let $\gamma(\emptyset) = \Lambda$ (where $\Lambda$
95: denotes the empty binary tree). The correspondence $\gamma$ is obviously
96: one-one (see \cite{fr} for a fully detailed constructive proof).
97: \par
98: We shall thus use binary tournaments ${\cal T}_{n}$ to represent the
99: permutations of $S_{n}$ as well as the priority queues $Q_{n}$ of size $n$.
100: \item
101: {\sl (iv)} If $T \in {\cal T}_{n}$ is a binary tournament, its
102: {\em right branch} $RB (T)$ is the increasing sequence of
103: priorities found on the path starting at the root of $T$ and repeatedly
104: going to the right subtree. The {\em bottom} of $RB (T)$ is the node
105: having no right son. The {\em left branch} $LB (T)$ of $T$ is defined in a
106: symmetrical manner.
107: \end{description}
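As a hedged illustration of the rules defining $\gamma$ (the sequence below
is an arbitrary choice made for this example), take $s = (3, 1, 4, 2)$.
Here {\bf m}$\; = 1$, $\ell = (3)$ and $r = (4, 2)$, so that
\begin{eqnarray*}
\gamma(3,1,4,2) & = & \mbox{root 1, with left subtree}\ \gamma(3)\ %
\mbox{and right subtree}\ \gamma(4,2), \\
\gamma(3) & = & \mbox{root 3, with two empty subtrees}\ \Lambda, \\
\gamma(4,2) & = & \mbox{root 2, with left subtree}\ \gamma(4)\ %
\mbox{and right subtree}\ \Lambda .
\end{eqnarray*}
In the resulting tournament tree $T$ the labels increase from the root to
the leaves along each branch, $RB(T) = (1, 2)$ with bottom the node
labeled 2 (which has no right son), and $LB(T) = (1, 3)$. Reading the
labels of $T$ in symmetric order (left subtree, root, right subtree) gives
back the sequence $s = (3, 1, 4, 2)$.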
108:
109: \subsection{The one-one correspondence between \protect\bm{Q_{n}} and
110: \protect\bm{T_{n}}}
We now give a constructive proof of a {\em one-to-one correspondence} mapping
112: the given combinatorial structure of ordered trees $T_{n}$ (as defined in
113: the Introduction) onto the priority queues $Q_{n}$.
114: \begin{theorem}
115: There is a one-to-one correspondence between the priority queues of size
116: $n, Q_{n}$, and the ordered $n$-node trees $T_{n}$ which consist of a root with
117: $n - 1$ children.
118: \end{theorem}
119: \begin{proof}
There are many representations of priority queues $Q_{n}$; let us
121: consider the $n$-node {\em binary heap} structure, which is very simple and
122: perfectly suitable for the constructive proof.
123: \begin{itemize}
124: \item First, a {\em binary heap} of size $n$ is an {\em essentially complete
125: binary tree}. A binary tree is {\em essentially complete} if each of its
126: internal nodes possesses exactly two children, with the possible exception
127: of a unique {\em special} node situated on level $(h - 1)$ (where $h$ denotes
128: the height of the heap), which may possess only a left child and no right child.
129: Moreover, all the leaves are either on level $h$, or else they are on levels
130: $h$ and $(h-1)$, and no leaf is found on level $(h-1)$ to the left of an
131: internal node at the same level. The unique special node, if it exists, is
132: to the right of all the other level $(h-1)$ internal nodes in the subtree.\\
133: Besides, each tree node in a binary heap contains one item, with the items
134: arranged in heap order ({\em i.e.} the priority queue ordering): the key of
135: the item in the parent node is strictly smaller than the key of the item in
136: any descendant's node.
137: Thus the root is located at position 1 and contains an item of minimum key.
If we number the nodes of such an essentially complete binary tree from 1 to
$n$ in heap order and identify nodes with numbers, the parent of the node located
at position $x$ is located at $\lfloor x/2 \rfloor$. Similarly, the left
son of node $x$ is located at $2x$ and its right son at $2x + 1$, whenever
these positions do not exceed $n$.
142: We can thus represent each node by an integer and the entire binary heap
143: by a map from $[n]$ onto the items: the binary heap with $n$ nodes fits well
144: into locations $1, \ldots , n$. This forces a breadth-first, left-to-right
145: filling of the binary tree, {\em i.e.} a heap or priority queue ordering.
146:
147: \item Next, it is well-known that any ordered tree with $n$ nodes may easily
148: be transformed into a binary tree by the {\em natural correspondence} between
149: ordered trees and binary trees. The corresponding binary tree is obtained by
150: linking together the brothering nodes of the given ordered tree and removing
151: vertical links except from a father to its first (left) son. \\
152: Conversely, it is easy to see that any binary tree may be represented as an
153: ordered tree by reversing the process. The correspondence is thus one-one
154: (see \cite[Vol.~1, page 333]{kn}).
155: \end{itemize}
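To make the two ingredients above concrete, here is a small example under
assumed data (the six keys and the names $c_{i}$ are hypothetical). For
$n = 6$, the parent map $x \mapsto \lfloor x/2 \rfloor$ gives
$$2 \mapsto 1,\quad 3 \mapsto 1,\quad 4 \mapsto 2,\quad 5 \mapsto 2,\quad
6 \mapsto 3,$$
so that nodes 1 and 2 have two children each, node 3 has the single (left)
child $6 = 2 \cdot 3$ and is the special node, and nodes 4, 5 and 6 are
leaves. Storing the keys $(1, 3, 2, 7, 4, 5)$ in positions $1, \ldots, 6$
satisfies the heap order, since $1 < 3$, $1 < 2$, $3 < 7$, $3 < 4$ and
$2 < 5$. Likewise, under the natural correspondence, the ordered tree which
consists of a root with children $c_{1}, \ldots, c_{n-1}$ becomes the binary
tree in which $c_{1}$ is the left son of the root and each $c_{i+1}$ is the
right son of $c_{i}$ $(1 \leq i \leq n-2)$.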
Note that the construction of a binary heap of size $n$ can be carried out
in linear time, and more precisely in $\Theta(n)$ sift-up operations.
158:
Now, to each sequence of priorities $s = (p_{1}, \ldots ,p_{n}) \in
160: Q_{n}$, we may associate a unique $n$-node tree $\alpha(s) = T_{n}$ in
161: the natural breadth-first, left-to-right order; by convention, we also let
162: $\alpha(\emptyset) = \Lambda$. In such a representation, $T_{n} = \alpha(s)$
163: is then an ordered $n$-node tree the ordering of which is the priority queue
164: (or heap) order, and it is thus built as an essentially complete binary heap
165: of size $n$. The correspondence $\alpha$ naturally represents the priority
166: queues $Q_{n}$ of size $n$ as ordered trees $T_{n}$ with $n$ nodes.
167:
168: Conversely, to any ordered tree $T_{n}$ with $n$ nodes, we may associate a
169: binary tree with heap ordered nodes, that is an essentially complete binary
170: heap. Hence, there exists a correspondence $\beta$ mapping any given ordered
171: $n$-node tree $T_{n}$ onto a unique sequence of priorities $s = \beta(T_{n})
172: \in Q_{n}$; by convention we again let $\beta(\Lambda) = \emptyset$.
173:
The correspondence is one-one, and it is easily seen that the mappings $\alpha$
and $\beta$ are inverses of each other.
176: \end{proof}
177:
178: Let binary tournament trees represent each one of the above structures. Any
179: operation can thus be performed as if dealing with ordered trees $T_{n}$,
180: whereas binary tournament trees or permutations are really manipulated.
181: More precisely, since we know that $T_{n} \longleftrightarrow Q_{n}
182: \longleftrightarrow {\cal T}_{n} \longleftrightarrow S_{n}$, the cost of path
183: reversal performed on initial $n$-node trees $T_{n}$ which consist of a
184: root with $n-1$ children is {\em transported} from the $T_{n}$'s onto the
185: tournament trees $T \in {\cal T}_{n}$ and onto the permutations
186: $\sigma \in S_{n}$. In the following definitions (see Section 3.1 below),
187: we therefore let $\varphi(\sigma) \in S_{n}$ denote the ``reversed''
188: permutation which corresponds to the reversed tree $T_{n}$.
189: From this point the {\em first moment of the cost of path reversal},
190: $\varphi : T_{n} \rightarrow T_{n}$, can be derived, and a
191: straightforward proof technique of the result, distinct from the one in
Section 3 below, is also detailed in the Appendix.
193:
194: \section{Expected cost of path reversal, average message complexity of $\bm{\cal A}$}
The Introduction details how the two data structures at hand
are actually involved in algorithm $\cal A$; the design of the algorithm
itself is given in~\cite{nta,tn}.
198:
199: \subsection{Analysis}
Eq.~(13), proved in the Appendix, is actually sufficient to provide the average
201: cost of path reversal. However, since we also desire to know the second moment
202: of the cost, we do need the probability generating function of the probabilities
203: $p_{n,k}$, defined as follows.
204:
205: \medskip Let $h(T_{n})$ denote the height of $T_{n}$, {\em i.e.} the number
206: of nodes on the path from the deepest node in $T_{n}$ to the root of $T_{n}$,
207: and let $T \in {\cal T}_{n-1}$.
208: $$p_{n,k} \;= \;\Pr\{\mbox{cost of path reversal for}\ T_{n}\ \mbox{is}\ k\} %
209: \;=\; \Pr\{h (\varphi(T)) = k\}$$
210: is the probability that the tournament tree $\varphi(T)$ is of height $k$.
211: We also have
212: $$p_{n,k} \;=\; \Pr\{k\ \mbox{changes occur in the variable \textit{Last}
213: of algorithm} {\cal A}\}.$$
214:
215: More precisely, let a {\bf swap} be any interchanged pair of adjacent
216: prime cycles (see {\rm \cite[Vol.~3, pages 28-30]{kn}}) in a permutation
217: $\sigma$ of $[n-1]$ to obtain the ``reversed'' permutation
218: $\varphi_{x}(\sigma)$ corresponding to the path reversal performed at a node
219: $x \in T_{n}$, that is any interchange which occurs in the relative order of
220: the elements of $\varphi_{x}(\sigma)$ from the one of $\sigma$'s elements, and
221: let $N$ be the number of these swaps occurring from $\sigma \in S_{n-1}$ to
222: $\varphi_{x}(\sigma)$, then,
223: $$p_{n,k} \:=\; \frac{1}{(n-1)!}\, (\mbox{number of}\ \sigma \in S_{n-1}\ %
224: \mbox{for which}\ N=k),$$
225: since the cost of a path reversal at the root of an ordered tree such as
226: $T_{n}$ is zero.
227:
228: \begin{lemma}
229: Let $P_{n}(z) = \sum_{k \geq 0} p_{n,k} z^{k}$ be the probability generating
230: function of the $p_{n,k}$'s. We have the following identity,
231: $$P_{n}(z) \:=\; \prod_{j=1}^{n-1} \frac{z + j - 1}{j}\,.$$
232: \end{lemma}
233: \begin{proof}
234: We have $p_{1,0} = 1$ and $p_{1,k} = 0$ for all $k > 0$.
235: \par
236: A fundamental point in this derivation is that we are averaging not over
237: all tournament trees $T \in {\cal T}_{n-1}$, but {\em over all possible
238: orders} of the elements of $S_{n-1}$.
239: Thus, every permutation of $(n - 1)$ elements with $k$ swaps corresponds to
240: $(n - 2)$ permutations of $(n - 2)$ elements with $k$ swaps and one
241: permutation of $(n - 2)$ elements with $(k - 1)$ swaps. This leads directly
242: to the recurrence
243: $$(n-1)! p_{n,k} \;=\; (n-2)(n-2)! \,p_{n-1,k} \;+\; (n-2)! p_{n-1,k-1},$$
244: or
245: \begin{equation}
246: p_{n,k} =\; \left(1-\frac{1}{n-1} \right) p_{n-1,k} \;+\; %
247: \left(\frac{1}{n-1}\right) p_{n-1,k-1}.
248: \end{equation}
249: \par
250: Consider any permutation $\sigma = \langle \sigma_{1} \ldots \sigma_{n-1}
251: \rangle$ of $[n-1]$. Formula (1) can also be derived directly with the
252: argument that the probability of $N$ being equal to $k$ is
253: the simultaneous occurrence of $\sigma_{i} = j\; \; (1 \leq i,j \leq n-1)$ and
254: $N$ being equal to $k-1$ for the remaining elements of $\sigma$, {\em plus}
255: the simultaneous occurrence of $\sigma_{i} \neq j \; (1 \leq i,j \leq n-1)$
256: and $N$ being equal to $k$ for the remaining elements of $\sigma$. Therefore,
257: \begin{eqnarray*}
258: p_{n,k} & = & \Pr\{\sigma_{i} = j\} \times p_{n-1,k-1} \;+\; %
259: \Pr\{\sigma_{i} \neq j\} \times p_{n-1,k} \\
260: & = & \Big(1/(n-1)\Big) p_{n-1,k-1} \;+\; \Big(1-1/(n-1)\Big) p_{n-1,k}.
261: \end{eqnarray*}
262: \par
263: Using now the probability generating function $P_{n}(z) =\: \sum_{k \geq 0} p_{n,k} z^{k}$,
264: we get after multiplying~(1) by $z^{k}$ and summing,
265: $$(n - 1) P_{n}(z) \;=\; z P_{n-1}(z) \;+\; (n - 2) P_{n-1}(z),$$
266: which yields
267: \begin{eqnarray}
268: P_{n}(z) & = & \frac{z + n - 2}{n - 1} \; P_{n-1}(z) \nonumber \\
P_{1}(z) & = & 1.
270: \end{eqnarray}
271:
272: The latter recurrence~(2) telescopes immediately to
273:
274: $$P_{n}(z) \; = \; \prod_{j=1}^{n-1} \frac{z + j - 1}{j} .$$
275: \end{proof}
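As a quick check of the lemma on small cases (a worked illustration, not
part of the original derivation), the product gives
\begin{eqnarray*}
P_{2}(z) & = & z, \\
P_{3}(z) & = & \frac{z(z+1)}{2} \;=\; \frac{1}{2}\,z \;+\; \frac{1}{2}\,z^{2}, \\
P_{4}(z) & = & \frac{z(z+1)(z+2)}{6} \;=\; \frac{1}{3}\,z \;+\;
\frac{1}{2}\,z^{2} \;+\; \frac{1}{6}\,z^{3},
\end{eqnarray*}
so that $p_{4,1} = 1/3$, $p_{4,2} = 1/2$ and $p_{4,3} = 1/6$, which sum
to 1. These values agree with recurrence~(1); for instance,
$$p_{4,2} \;=\; \left(1 - \frac{1}{3}\right) p_{3,2} \;+\; \frac{1}{3}\,
p_{3,1} \;=\; \frac{2}{3} \cdot \frac{1}{2} \;+\; \frac{1}{3} \cdot
\frac{1}{2} \;=\; \frac{1}{2}\,.$$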
276: \begin{remark}
277: The property proved by Trehel that the average number of
278: messages required by $\cal A$ is exactly the number of nodes at
279: height 2 in the reversed ordered trees $\varphi(T_{n})$ (see \cite{nta})
280: is hidden in the definition of the $p_{n,k}$'s. As a matter of fact, the
number of permutations of $[n]$ which contain exactly 2 prime cycles is
$\left[\begin{array}{c} n \\ 2\end{array}\right]
\;=\; (n-1)!\, H_{n-1}$ (see~\cite{kn}), whence the result.
284: \end{remark}
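For instance, for $n = 4$ the identity in the remark reads
$$\left[\begin{array}{c} 4 \\ 2\end{array}\right] \;=\; 3!\, H_{3} \;=\;
6 \left(1 + \frac{1}{2} + \frac{1}{3}\right) \;=\; 11,$$
which may be checked by direct enumeration: among the permutations of $[4]$
with exactly 2 cycles, 8 consist of a fixed point and a cycle of length 3,
and 3 consist of two cycles of length 2.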
285: \begin{theorem}
286: The expected cost of path reversal and the average message complexity
287: of algorithm $\cal A$ is $\E(C_{n})\:=\; \overline{C_{n}} \:=\: H_{n-1}$,
288: with variance $var(C_{n}) \:=\: H_{n-1} \;-\; H_{n-1}^{(2)}$.
289: Asymptotically, for large $n$,
290: $$\overline{C_{n}} \:=\; \ln n \;+\;\gamma \;+\; O(n^{-1})\ \quad \mbox{and}\
291: \quad var(C_{n}) \;=\; \ln n \;+\; \gamma \;-\; \pi^{2}/6 \;+\; O(n^{-1}).$$
292: \end{theorem}
293: \begin{proof}
294: By Lemma 3.1, the probability generating function $P_{n}(z)$ may be regarded
295: as the product of a number of very simple probability generating functions
296: (P.G.F.s), namely, for $1\leq j\leq n-1$,
297: $$P_{n}(z) =\; \prod_{1 \leq j \leq n-1} \Pi_{j}(z),\ \quad\ \mbox{with}\ %
298: \ \Pi_{j}(z) \;=\; \frac{j-1}{j} \;+\;\frac{z}{j}\,.$$
299:
Therefore, we need only compute moments for the P.G.F. $\Pi_{j}(z)$,
and then sum for $j = 1$ to $n - 1$. This is a classical property of P.G.F.s:
a product of P.G.F.s is the P.G.F. of a sum of independent random variables,
so that means and variances add up.
303:
304: \medskip \noindent Now, $\Pi_{j}'(1) \;=\; 1/j$ and $\Pi_{j}''(1) \:=\: 0$,
305: and hence
306: $$\E(C_{n}) \;=\: \overline{C_{n}} \:=\; P_{n}'(1) \;=\; \sum_{j=1}^{n-1} %
307: \Pi_{j}'(1) \;= \: H_{n-1}.$$
308: Moreover, the variance of $C_{n}$ is
309: $$var(C_{n}) \:=\; P_{n}''(1) \;+\;P_{n}'(1) \;-\; P_{n}'^{2}(1),$$
310: and thus,
311: $$var(C_{n}) \:= \; \sum_{j=1}^{n-1} \frac{1}{j} \;-\; \sum_{j=1}^{n-1}
312: \frac{1}{j^{2}} \;=\; H_{n-1} \;-\; H_{n-1}^{(2)}.$$
313: \par
314: Since $H_{n-1}^{(2)} \;=\; \pi^{2}/6 \;-\; 1/n \:+\: O(n^{-2})$ when
315: $n\rightarrow +\infty$, and by the asymptotic expansion of $H_{n}$, the
316: asymptotic values of $\overline{C_{n}}$ and of $var(C_{n})$ are easily obtained.
317: (Recall that Euler's constant is $\gamma = 0.57721\ldots$, thus
$\gamma - \pi^{2}/6 \;=\: - 1.06772\ldots$)
319:
320: Hence, $\overline{C_{n}} =\: .693\ldots \lg n \;+\; O(1)$, and
321: $var(C_{n}) =\: .693\ldots \lg n \;+\; O(1)$.
322: \end{proof}
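As a sanity check on a small case (a worked illustration only), take
$n = 4$. Lemma 3.1 gives $P_{4}(z) = z(z+1)(z+2)/6 = \frac{1}{3} z +
\frac{1}{2} z^{2} + \frac{1}{6} z^{3}$, whence
\begin{eqnarray*}
\E(C_{4}) & = & \frac{1}{3} \;+\; 2 \cdot \frac{1}{2} \;+\; 3 \cdot
\frac{1}{6} \;=\; \frac{11}{6} \;=\; H_{3}, \\
\E(C_{4}^{2}) & = & \frac{1}{3} \;+\; 4 \cdot \frac{1}{2} \;+\; 9 \cdot
\frac{1}{6} \;=\; \frac{23}{6}, \\
var(C_{4}) & = & \frac{23}{6} \;-\; \left(\frac{11}{6}\right)^{2} \;=\;
\frac{17}{36} \;=\; \frac{11}{6} \;-\; \frac{49}{36} \;=\; H_{3} \;-\;
H_{3}^{(2)},
\end{eqnarray*}
in agreement with the theorem, since $H_{3}^{(2)} = 1 + 1/4 + 1/9 = 49/36$.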
323: Note also that, by a generalization of the central limit theorem to sums of
324: independent but nonidentical random variables, it follows that
$$\frac{C_{n} \:-\: \overline{C_{n}}}{(\ln n \,-\, 1.06772\ldots)^{1/2}}$$
converges to the normal distribution when $n\rightarrow +\infty$.
327: \begin{proposition}
328: The worst-case message complexity of algorithm $\cal A$ is $O(n)$.
329: \end{proposition}
330: \begin{proof}
331: Let $\Delta$ be the {\em maximum} communication delay time in the
332: network and let $\Sigma$ be the {\em minimum} delay time for a process
333: to enter, proceed and release the critical section. \\
Setting $q =\: \left\lceil \Delta/\Sigma \right\rceil$, the number of messages
335: used in $\cal A$ is at most $(n-1) \:+\: (n-1)q \:=\: (n-1)(q+1) \:= O(n)$.
336: \end{proof}
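For instance, with the purely hypothetical values $n = 10$, $\Delta = 5$ and
$\Sigma = 2$ time units, one gets $q = \lceil 5/2 \rceil = 3$ and the bound
$$(n-1)(q+1) \;=\; 9 \times 4 \;=\; 36$$
messages.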
337:
338: \begin{remarks}
339: \item[1.]\ The one-to-one correspondence between ordered trees with $(n+1)$
nodes and the words of length $2n$ in the Dyck language with one type of
bracket is used in \cite{nta} to compute the average message complexity of
$\cal A$. Several properties and results connecting the depth of a Dyck word
343: and the height of the ordered $n$-node trees can be derived from the
344: one-to-one correspondences between combinatorial structures involved
345: in the proof of Theorem 2.1.
346:
347: \item[2.]\ In the first variant of algorithm $\cal A$ (see \cite{tn})
348: which is analysed here, a node never stores more than one request of some
349: other node and hence it only requires $O(\log n)$ bits to store the variables,
350: and the message size is also $O(\log n)$ bits. This is not true of the second
351: variant of algorithm $\cal A$ (designed in \cite{nt}). Though the constant
352: factor within the order of magnitude of the average number of messages is claimed
to be slightly improved (from 1 down to $.4$), the token now consists of a queue
354: of processes requesting the critical section.
355: Since at most $n-1$ processes belong to the requesting queue, the size of the
356: token is $O(n\log n)$. Therefore, whereas the average message complexity
357: is slightly improved (up to a constant factor), the message size increases
358: from $O(\log n)$ bits to $O(n\log n)$ bits. The bit complexity is thus much
359: larger in the second variant~\cite{nt} of $\cal A$. Moreover, the state
360: information stored at each node is also $O(n\log n)$ bits in the second
361: variant, which again is much larger than in the first variant of $\cal A$.
362: \end{remarks}
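To give an order of magnitude for the second remark (assuming, for this
illustration only, that a process identifier is encoded on
$\lceil \lg n \rceil$ bits), take $n = 1024$: an identifier then takes 10
bits, whereas a token holding a queue of up to $n - 1 = 1023$ identifiers
may take up to $1023 \times 10 = 10230$ bits.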
363: