math0411098/k2n2.tex
1: \documentclass[11pt]{article}
2: \usepackage{amsfonts,latexsym,amssymb,epsfig}
3: \usepackage{amsmath,amsthm,amstext,amscd}
4: \usepackage{fullpage}
5: \usepackage{euscript}
6: \parindent 0cm
7: \parskip 0.2cm
8: 
9: \begin{document}
10: \bibliographystyle{plain}
11: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
12: \newtheorem{theorem}{Theorem}
13: \newtheorem{proposition}[theorem]{Proposition}
14: \newtheorem{corollary}[theorem]{Corollary}
15: \newtheorem{note}[theorem]{Note}
16: \newtheorem{lemma}[theorem]{Lemma}
17: \newtheorem{definition}[theorem]{Definition}
18: \newtheorem{observation}[theorem]{Observation}
19: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
20: \newcounter{fignum}
21: \newcommand{\figlabel}[1]
22:            {\\Figure \refstepcounter{fignum}\arabic{fignum}\label{#1}}
23: \newcommand{\ignore}[1]{}
24: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
25: \def\F2{{\{0,1\}}}
26: \def\eps{{\epsilon}}
27: \def\tP{\tilde{P}}
28: \def\ttP{\tilde{\tilde{P}}}
29: \def\rhoi{\rho^{-1}}
30: \def\hM{\hat{M}}
31: \def\hpM{\hat{M'}}
32: \def\halpha{\hat{\alpha}}
33: \def\hpalpha{\hat{\alpha}'}
34: \def\tO{\tilde{O}}
35: \def\tOmega{\tilde{\Omega}}
36: \def\S{{\Sigma}}
37: \def\hn{{\lfloor n/2\rfloor}}
38: \def\ox{{\overline{x}}}
39: \def\I{\EuScript{I}}
40: \def\cardI{{I}}
41: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
42: % log-like functions
43: \newcommand{\rank} {\mbox {rank}}
44: \newcommand{\schreier} {\mbox {sc}}
45: \newcommand{\gap} {\mbox {gap}}
46: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
47: \title{Simple Permutations Mix Even Better}
48: \author{
49: \parbox{8cm}{\centering
50: Alex Brodsky\\
51: Department of Computer Science\\
52: University of Toronto\\
53: abrodsky@cs.toronto.edu
54: }
55: \parbox{8cm}{\centering 
56: Shlomo Hoory\footnote{
57: Research is supported in part by an NSERC grant and a PIMS postdoctoral 
58: fellowship.}\\
59: Department of Computer Science\\
60: University of British Columbia\\
61: shlomoh@cs.ubc.ca}
62: }
63: \maketitle
64: 
65: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
66: \begin{abstract}
67: We study the random composition of a small family of $O(n^3)$ 
68: simple permutations on $\{0,1\}^n$. 
69: Specifically we ask how many randomly selected simple permutations need be 
70: composed to yield a permutation that is close to $k$-wise independent.
71: We improve on the results of Gowers~\cite{Go96} and 
72: Hoory et al.~\cite{HMMR04} and show that up to a polylogarithmic factor, 
73: $n^2 k^2$ compositions of random permutations from this family suffice.
74: In addition, our results give an explicit construction of a degree 
75: $O(n^3)$ Cayley graph of the alternating group of $2^n$ objects with 
76: a spectral gap $\Omega(2^{-n}/n^2)$, which is a substantial improvement over
77: previous constructions.
78: 
79: \end{abstract}
80: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
81: 
82: {\bf Keywords:} Mixing-time, k-wise independent permutations, cryptography,
83: multicommodity flow, reversible computation.
84: 
85: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
86: 
87: A naturally occurring question in cryptography is how well the composition 
88: of simple permutations drawn from a simple distribution resembles a random 
89: permutation.
90: Although such constructions are a common source of security for 
91: block ciphers like DES and AES, 
92: their mathematical justification (or lack thereof) is troubling.
93: 
94: This  motivated the investigation of Hoory et al.~\cite{HMMR04} who considered
95: the notion of almost 
96: {\em $k$-wise independence}. Namely, that the distribution obtained when 
97: applying a permutation from a given distribution to any $k$ distinct 
98: elements is almost indistinguishable from the distribution obtained when 
99: applying a truly random permutation.
100: Therefore, the question is
101: how close is the composition of $T$ random simple permutations  
102: to $k$-wise independent?  
103: 
104: Another motivation is a fundamental open problem in the theory of expanding
105: graphs.%
106: \footnote{A solution to this problem was announced recently by 
107: Kassabov~\cite{Ka05}.}
108: Namely, the problem of constructing a constant degree expanding 
109: Cayley graph of the symmetric group. 
110: A possible relaxation of this problem is to  ask whether one can find a small 
111: set of simple permutations such that its action on $k$ points yields an 
112: expanding graph. 
113: %see Section~\ref{conclude:section} for more details.
114: 
115: It turns out that these two problems reduce to bounding the mixing time and 
116: the spectral gap of the random walk on the {\em same} graph.
117: This walk, $P$, is defined on the state space 
118: of $k$-tuples of distinct elements from the $n$-dimensional binary cube. 
119: In each step it randomly selects a simple permutation and
120: applies it to each of the $k$ elements at its current position.
121: The mixing time, $\tau(\eps)$, is the number of steps needed to 
122: come $\eps$-close to the uniform distribution (in total variation distance),
123: and the spectral gap, $\gap(P)$, is the difference between
124: the two largest eigenvalues of $P$'s transition matrix.
125: 
126: Following the construction of DES, and previous work by Gowers~\cite{Go96} 
127: and Hoory et al.~\cite{HMMR04}, 
128: we consider the class of {\em width $2$ simple permutations}, denoted $\S$.
129: The action of such a permutation on an element of the $n$-dimensional 
130: binary cube is to XOR a single coordinate with a Boolean function of $2$
131: other coordinates; there are $16n(n-1)(n-2)$ such 
132: permutations.  
133: 
134: These problems were first considered by Gowers~\cite{Go96}, who gave an
135: $\tO(n^3 k (n^2+k)(n^3+k) )$%
136: \footnote{The notation $\tO$ suppresses a polylogarithmic factor
137: in $n$ and $k$.} 
138: bound on the mixing time by lower bounding the 
139: spectral gap, namely $1/\gap(P) = \tO(n^2 (n^2+k)(n^3+k) )$.
140: Subsequently, Hoory et al.~\cite{HMMR04} improved the bound on the mixing 
141: time to $\tO(n^3 k^3 )$ by proving that $1/\gap(P) = \tO(n^2 k^2)$.
142: Both results were achieved using the {\em canonical paths} technique,
143: and neither result applies for $k > 2^{n/2}$.
144: Using the comparison technique, in conjunction with the theory of reversible
145: computation, we give better bounds for all values of $k$ 
146: up to the largest conceivable value, $k=2^n-2$.
147: 
148: \begin{theorem}\label{k2n2:theorem}
149: $\tau(\eps) = \tO( n^2 k^2 \cdot \log(1/\eps) )$,
150: as long as $k \leq 2^{n/50}$.
151: %In this case, it is $[\log a \cdot \log\log a \cdot (1+\log k)]^2$, where
152: %$a=\max(k,n)$.
153: \end{theorem}
154: 
155: \begin{theorem}\label{k2n3:theorem}
156: $1/\gap(P) = O( n^2 k )$
157: for all $k \leq 2^n-2$.
158: \end{theorem}
159: 
160: Using the well-known connection between the mixing time and the
161: spectral gap, Theorem~\ref{k2n3:theorem} implies:
162: 
163: \begin{corollary}
164: $\tau(\eps) = O( n^2 k \cdot (nk+\log(1/\eps)) )$
165: for all $k \leq 2^n-2$.
166: \end{corollary}
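As a sketch of how this follows (assuming Theorem~\ref{gapmix:theorem} applies,
which requires laziness; a uniformly random element of $\S$ is the identity
with probability $1/16$): the state space of $P$ has at most $2^{nk}$ elements,
so $\log(|V(G)|/\eps) = O(nk+\log(1/\eps))$ and
\[
\tau(\eps) = O\left( \frac{nk+\log(1/\eps)}{\gap(P)} \right)
= O\left( n^2 k \cdot (nk+\log(1/\eps)) \right).
\]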
167: 
168: The proofs of both theorems are based on the comparison technique for 
169: Markov chains~\cite{DiSa93}.  
170: To prove Theorem~\ref{k2n3:theorem} we compare the random walk $P$
171: either to a Glauber dynamics Markov chain or to the random walk on
172: the alternating group using $3$-cycles. 
173: To prove Theorem~\ref{k2n2:theorem} we observe that 
174: after a short preamble the random walk $P$ is almost surely in 
175: a ``generic'' state.
176: Consequently, it suffices to bound the mixing time of a Markov chain
177: restricted to ``generic'' states.  
178: To this end we again employ the comparison technique, 
179: but with a better comparison constant.
180: In all cases we construct the multicommodity flows required 
181: by the comparison technique using ideas from the 
182: theory of reversible computation.
183: 
184: %{burn-in method} introduced by Dyer and Frieze~\cite{DyFr03} to study
185: %Glauber dynamics. 
186: 
187: 
188: It follows from \cite{HMMR04,MaPi04} that these results
189: apply also in the more general setting of adaptive adversaries 
190: (see references for a definition).
191: \begin{corollary}\label{strongk2n2:corollary}
192: Let $T$ be the minimal number of random 
193: compositions of independent and uniformly distributed permutations from 
194: $\S$ needed to generate a permutation which is 
195: $\eps$-close to $k$-wise independent against an adaptive adversary.
196: Then $T = \tO( n^2 k^2 \cdot \log(1/\eps) )$ for $k \leq 2^{n/50}$,
197: and $T = O( n^2 k \cdot (nk + \log(1/\eps)) )$ for $k \leq 2^n-2$.
198: \end{corollary}
199: 
200: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
201: 
202: 
203: \section{Preliminaries}
204: 
205: Let $f$ be a random permutation on some base set $X$. 
206: Denote by $X^{(k)}$ the set of all $k$-tuples of distinct elements from $X$.
207: We say that $f$ is $\eps$-close to $k$-wise independent if for every
208: $(x_1,\ldots,x_k) \in X^{(k)}$ the distribution of 
209: $(f(x_1),\ldots,f(x_k))$ 
210: is $\eps$-close to the uniform distribution on $X^{(k)}$.
211: We measure the distance between two
212: probability distributions $p, q$ by the total variation distance, defined by
213: \begin{eqnarray*}
214: d(p,q) = \frac 1 2 \|p-q\|_1 = \frac 1 2 \sum_{\omega} |p(\omega)-q(\omega)|
215: = \max_A \sum_{\omega \in A} \left( p(\omega)-q(\omega) \right).
216: \end{eqnarray*}
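As a small illustration (the distributions are chosen here purely as an
example), take $p=(1/2,1/2)$ and $q=(3/4,1/4)$ on the two-point set $\{0,1\}$.
Then
\[
\frac 1 2 \left( |1/2-3/4| + |1/2-1/4| \right) = \frac 1 4,
\]
and the maximum over events is attained at $A=\{1\}$, where
$p(1)-q(1) = 1/2-1/4 = 1/4$, so the two expressions agree.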
217: 
218: Assume a group $H$ is acting on a set $X$ and let $S$ be a subset of $H$
219: closed under inversion. Then the {\em Schreier graph} 
220: $G=\schreier(S,X)$ is defined 
221: by $V(G)=X$ and $E(G)= \{(x,xs): x \in X,\, s \in S\}$.
222: For a sequence $\omega=(s_1,\ldots,s_\ell) \in S^\ell$ we denote
223: $x \omega = x s_1 \cdots s_\ell$, and we sometimes refer by $x \omega$ to
224: the walk $x, x s_1, \ldots, x s_1 \cdots s_\ell$.
225: 
226: The {\em random walk} $X_0,X_1,\ldots$
227: associated with a $d$-regular graph $G$ is defined by the 
228: transition matrix $P_{vu} = \Pr[X_{i+1}=u|X_i=v]$ which is $1/d$ if 
229: $(v,u) \in E(G)$ and zero otherwise. 
230: The uniform distribution $\pi$ is stationary for this Markov chain.
231: If $G$ is connected and not bipartite, we know that given any initial 
232: distribution of $X_0$, the distribution of $X_t$ tends to the uniform 
233: distribution. The mixing time of $G$ is
234: $\tau(\eps) = \max_{v \in V(G)} \min \{ t : d(P^{(t)}(v,\cdot),\pi) < \eps \}$,
235: where $P^{(t)}(v,\cdot)$ is the probability distribution of $X_t$ given that
236: $X_0=v$.
237: It is not hard to prove (see~\cite[Lemma 20]{AlFi}) that
238: \begin{eqnarray}\label{submultmix}
239: \tau(2^{-\ell-1}) \leq \ell \cdot \tau(1/4).
240: \end{eqnarray}
241: Let $1=\beta_0 \geq \beta_1 \geq \cdots \geq \beta_{|V(G)|-1}$ be the eigenvalues
242: of the transition matrix $P$.
243: We say that this random walk is lazy if for some constant $\delta>0$ we have
244: $P_{vv} \geq \delta$ for all $v \in V(G)$. 
245: We denote the spectral gap $1-\beta_1$ of the Markov chain $P$ by $\gap(P)$.
246: 
247: Two fundamental results relating the spectral gap of a Markov chain to
248: its mixing time are the following:
249: \begin{theorem}\label{gapmix:theorem}(\cite[Proposition 3]{DiSt91})
250: If the random walk on $G$ is lazy then 
251: \(\tau(\eps) = O \left( \log( |V(G)|/\eps) \,/\, \gap(P) \right).\)
252: \end{theorem}
253: \begin{theorem}\label{mixgap:theorem}(\cite[Proposition 1.ii]{Si92} 
254: or \cite[Chapter 4]{AlFi})
255: For any time reversible Markov chain $P$ and $\eps>0$,
256: \(\gap(P) = \Omega(\log(1/(2\eps))\,/\,\tau(\eps)).\)
257: \end{theorem}
258: 
259: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
260: \section{Composing simple permutations}
261: 
262: Another building block that we use is a collection of results on reversible
263: computation that enable us to compose simple permutations into permutations
264: that are easier to work with.  A classical result of Coppersmith and
265: Grossman~\cite{CoGr75} is that for $n > 3$ the set of width $2$ simple
266: permutations generates exactly the alternating group $A_{2^n}$ on $\F2^n$.  Thus, all
267: compositions must be even permutations.
268: 
269: Formally, we define the set of width $w$ simple permutations, $\S_w$, 
270: as the set of permutations
271: $f_{i,J,h}$ where $i \in [n]$, $J = \{j_1,\ldots,j_w\}$ is a size $w$ ordered
272: subset of $[n]\setminus \{i\}$, and $h$ is a Boolean function on $\F2^w$. 
273: The permutation $f_{i,J,h}$ maps 
274: $(x_1,\ldots,x_n) \in \F2^n$ to 
275: $(x_1,\ldots,x_{i-1},x_i \oplus h(x_{j_1},\ldots,x_{j_w}),x_{i+1},\ldots,x_n)$.
276: We are primarily interested in width $2$ simple permutations, 
277: and denote $\S=\S_2$.
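As a sanity check on the count quoted in the introduction (a worked
computation added here, not part of the original argument): a permutation
$f_{i,J,h} \in \S_w$ is specified by one of $n$ choices of $i$, one of
$(n-1)(n-2)\cdots(n-w)$ ordered choices of $J$, and one of $2^{2^w}$ Boolean
functions $h$, so the number of triples $(i,J,h)$ is
\[
n \cdot \prod_{t=1}^{w}(n-t) \cdot 2^{2^w},
\qquad \mbox{which for $w=2$ equals } 16\,n(n-1)(n-2).
\]
Distinct triples need not give distinct permutations (every triple with
$h \equiv 0$ gives the identity), so this counts $\S$ with multiplicity.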
278: 
279: \begin{theorem} (Barenco et al.~\cite{BBCDMSSSW95})\label{wideand:theorem}
280: The permutation that flips the $n$-th bit of input $x$ if and only if the first
281: $w$ bits of $x$ are $1$ can be implemented as a composition
282: of $O(w)$ permutations from $\S$, as long as $w \leq n-2$.
283: \end{theorem}
284: 
285: \begin{theorem} (Brodsky~\cite{Br04})\label{basic-3-cycle:theorem}
286: For any distinct $x,y,z \in \F2^n$, one can compose $O(n)$
287: permutations from $\S$ to obtain the $3$-cycle $(xyz)$.
288: \end{theorem}
289: 
290: A length $\ell$ {\em implementation} of the permutation $\sigma$ is a sequence 
291: of permutations $\sigma_1,\ldots,\sigma_\ell$ from $\S$ whose composition 
292: is $\sigma$.
293: Theorem~\ref{basic-3-cycle:theorem} gives a length $O(n)$ implementation
294: for $3$-cycles.
295: We would like to use this implementation to construct a multicommodity flow
296: with low load on all edges.  However, Theorem~\ref{basic-3-cycle:theorem}
297: does not guarantee this. We solve this problem by randomizing the 
298: implementation, enabling us to prove a stronger theorem.
299: 
300: A length $\ell$ {\em randomized implementation} of the permutation
301: $\sigma$ is a sequence of {\em random} permutations
302: $\sigma_1,\ldots,\sigma_\ell$ from $\S$ whose composition is $\sigma$.
303: In Theorem~\ref{3-cycle:theorem} we give a randomized implementation
304: for 3-cycles, such that applying any prefix
305: $\sigma_1\cdots\sigma_{\ell'}$ of the randomized implementation of a
306: uniformly random 3-cycle $(xyz)$ to $x$ yields a string that looks
307: random. Namely, its {\em min-entropy} is high, where the min-entropy
308: $H_\infty(X)$ of a random variable $X$ is the minimum amount of
309: information revealed when exposing the value of $X$, that is
310: $H_\infty(X)=\min_\chi(-\log_2(\Pr[X=\chi]))$.
311: 
312: \begin{theorem}\label{3-cycle:theorem}
313: Let $x,y,z \in \F2^n$ be uniformly distributed and distinct. Then there
314: is a length $L=O(n)$ randomized implementation $\sigma_1 \cdots \sigma_L$
315: of the 3-cycle $(xyz)$ such that for all $\ell \in [L]$ 
316: the min-entropy of 
317: $(x\sigma_1\cdots\sigma_{\ell-1},\sigma_\ell)$ (which is a random variable on
318: $\F2^n \times\S$) is at least $\log_2( 2^n\cdot n^3 ) - O(1)$.
319: \end{theorem}
320: Note, this implies that the min-entropy of the marginals is big, i.e., 
321: $H_\infty(x\sigma_1\cdots\sigma_{\ell-1}) \geq n - O(1)$
322: and $H_\infty(\sigma_\ell) \geq \log_2(n^3) - O(1)$.
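To see why (a short derivation, with $Y = x\sigma_1\cdots\sigma_{\ell-1}$ and
the constant $c=O(1)$ named here only for this remark), suppose
$H_\infty(Y,\sigma_\ell) \geq \log_2(2^n \cdot n^3) - c$. Summing the joint
probabilities over one coordinate and using $|\Sigma| \leq 16 n^3$ gives
\[
\Pr[Y=a] = \sum_{s \in \Sigma} \Pr[Y=a,\,\sigma_\ell=s]
\leq |\Sigma| \cdot \frac{2^{c}}{2^n n^3} = O(2^{-n}),
\qquad
\Pr[\sigma_\ell=s] \leq 2^n \cdot \frac{2^{c}}{2^n n^3} = O(n^{-3}).
\]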
323: 
324: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
325: \section{Proof of Theorem~\ref{k2n3:theorem}}\label{k2n3proof:section}
326: 
327: In order to prove that the composition of random permutations from $\S$ 
328: approaches $k$-wise independence quickly we construct the Schreier graph 
329: $G_{k,n}=\schreier(\S,X^{(k)})$, where $X^{(k)}$ is the set of
330: $k$-tuples of distinct elements from the base set $X = \F2^n$.
331: It is convenient to think of $X^{(k)}$ as the set of $k$ by $n$ binary
332: matrices with distinct rows. 
333: A simple permutation acts on $X^{(k)}$ by acting on each of the rows. 
334: Then $P$ is the transition matrix of the random walk on $G_{k,n}$.
335: We prove that the random walk on this graph mixes rapidly.
336: 
337: To prove that $1/\gap(P) = O( n^2 k )$,
338: we first observe that $\gap(P)$ is monotone nonincreasing in $k$. 
339: This follows from the fact that the graph $G_{k+1,n}$ is a lift
340: of $G_{k,n}$ and therefore inherits the spectrum of $G_{k,n}$.
341: To see this, observe that any eigenfunction of $G_{k,n}$ can be lifted
342: to an eigenfunction on $G_{k+1,n}$, where the value of the latter on 
343: some $k+1$ by $n$ matrix is the value of the former on the matrix obtained by
344: deleting the last row. The eigenvalues of these two eigenfunctions are the 
345: same. 
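Explicitly, writing $P_k$ for the walk on $G_{k,n}$ and $M^-$ for the matrix
obtained from $M \in X^{(k+1)}$ by deleting its last row (both notations are
introduced here only for this remark), let $P_k f = \lambda f$ and put
$\hat{f}(M) = f(M^-)$. Since a simple permutation acts on each row separately,
$(M\sigma)^- = M^-\sigma$, and therefore
\[
(P_{k+1}\hat{f})(M)
= \frac{1}{|\Sigma|} \sum_{\sigma \in \Sigma} \hat{f}(M\sigma)
= \frac{1}{|\Sigma|} \sum_{\sigma \in \Sigma} f(M^-\sigma)
= (P_k f)(M^-) = \lambda \hat{f}(M).
\]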
346: In light of this observation, it is sufficient to prove the following two 
347: lemmas:
348: 
349: \begin{lemma}\label{cayley-gap:lemma}
350: $1/\gap(P) = O(n^2\cdot 2^n)$ for $k=2^n-2$.
351: \end{lemma}
352: 
353: \begin{lemma}\label{n2k-gap:lemma}
354: $1/\gap(P) = O(n^2 k)$ for $k \leq 2^n/3$.
355: \end{lemma}
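The deduction of Theorem~\ref{k2n3:theorem} can be sketched as follows,
writing $P_k$ for the walk on $G_{k,n}$ (a notation used only in this remark).
For $k \leq 2^n/3$ the claim is exactly Lemma~\ref{n2k-gap:lemma}.
For $2^n/3 < k \leq 2^n-2$, monotonicity of the gap in $k$ together with
Lemma~\ref{cayley-gap:lemma} gives
\[
1/\gap(P_k) \leq 1/\gap(P_{2^n-2}) = O(n^2 \cdot 2^n) = O(n^2 k),
\]
where the last step uses $2^n < 3k$.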
356: 
357: We obtain the lower bound on the spectral gap of $P$ using the comparison 
358: technique~\cite{DiSa93}. This technique enables one to lower bound
359: $\gap(P)$ by $\gap(\tP)/A$, where $\tP$ is some other Markov chain,
360: and $A$ is the comparison constant. 
361: In our case, all chains are walks on regular graphs.
362: An upper bound on $A$ is obtained by constructing
363: a multicommodity flow on the underlying graph of $P$.
364: The flow routes one unit between
365: the endpoints of every edge of $\tP$,
366: in such a way that the flow through each edge of $P$ is small.
367: To prove Lemmas~\ref{cayley-gap:lemma} and \ref{n2k-gap:lemma},
368: we compare $P$ to two different Markov chains. 
369: We start with the first lemma.
370: 
371: \begin{proof}(of Lemma~\ref{cayley-gap:lemma})
372: For $k = 2^n - 2$, the state space of $P$ comprises
373: all even permutations of $\F2^n$. 
374: Let $\tP$ be a Markov chain on this state space, where
375: in each step 
376: we pick three distinct elements of the cube $x,y,z \in \F2^n$
377: and perform the permutation $(x y z)$.
378: It follows from a result of Friedman~\cite{Fr00}
379: that $1/\gap(\tP) = \Theta(2^n)$.
380: Therefore, it is sufficient to prove that the comparison constant of $P$ to
381: $\tP$ is $O(n^2)$.%
382: \footnote{
383: Alternatively, one can define a transition of $\tP$ as performing two random 
384: transpositions (not necessarily disjoint) and use a result of Diaconis and 
385: Shahshahani~\cite{DiSh81} that $1/\gap(\tP) = \Theta(2^n)$.}
386: 
387: To bound the comparison constant $A$, we need to construct a multicommodity 
388: flow $f$ in $G_{k,n}$ that flows a unit between every two matrices $M,M'$ 
389: such that $\tP(M,M')>0$.
390: Since the chains $P$ and $\tP$ correspond to random walks on regular graphs 
391: with degrees $d=\Theta(n^3)$ and 
392: $\tilde{d} = \Theta(2^{3n})$ respectively, 
393: the formula given in~\cite[Theorem 2.3]{DiSa93} reduces to:
394: \begin{eqnarray}\label{Abound:eqn}
395: A = 
396: (d/\tilde{d}) \cdot
397: \max_{(N,N') \in E(G_{k,n})} \left\{
398: \sum_{\gamma :\: (N,N') \in \gamma} f(\gamma) \cdot |\gamma|
399: \right\}.
400: \end{eqnarray}
401: 
402: Let $M,M'$ be two matrices such that $\tP(M,M')>0$.  Then $M'$ can be
403: obtained by applying some $3$-cycle $(xyz)$ to $M$.  Recall that the
404: randomized implementation given by Theorem~\ref{3-cycle:theorem}
405: induces a probability distribution on the length $L$ sequences of
406: permutations from $\Sigma$ whose composition is $(xyz)$.  Such a
407: distribution naturally translates to a distribution on length $L$
408: paths from $M$ to $M'$. We obtain a unit flow from $M$ to $M'$ by
409: flowing through each such path $\gamma$ an amount equal to its
410: probability.  We claim that the multicommodity flow obtained by
411: repeating this process for all $M,M'$ pairs satisfying $\tP(M,M')>0$
412: yields a small comparison constant.
413: 
414: Since $|\gamma| \cdot (d/\tilde{d}) = \Theta(n \cdot |\Sigma|/2^{3n})$
415: for all paths $\gamma$ with non-zero flow, the problem of bounding the
416: sum in (\ref{Abound:eqn}) reduces to bounding the total flow
417: through a given edge $e \in E(G_{k,n})$.  Let
418: $\gamma=(M_0,\ldots,M_L)$ be a path from $M_0$ to $M_L$, 
419: where $M_L$ is obtained from $M_0$ by applying the 3-cycle $(xyz)$. 
420: Assume further that $\gamma$ goes
421: through the edge $e$ at the $\ell$-th step, and that $x$ is
422: the $r$-th row of $M_0$. For any of the $\Theta(2^{4n}\cdot n)$ possible
423: assignments to $x,y,z,\ell,r$, we can determine the distribution
424: of the $r$-th row of the matrices $M_0,\ldots,M_L$. In particular, the
425: probability that $(M_{\ell-1},M_\ell)$ is equal to $e$ is bounded by the
426: probability that they coincide in their $r$-th row. By
427: Theorem~\ref{3-cycle:theorem}, on average over all assignments to
428: $x,y,z,\ell,r$, this probability is $O(1/(2^n|\Sigma|))$. 
429: Putting it all together yields that, up to a constant factor, the
430: comparison constant $A$ is bounded by $(n\cdot |\Sigma|/2^{3n}) \cdot
431: (2^{4n}\cdot n) \cdot (1/(2^n |\Sigma|)) = n^2$, as claimed.
432: \end{proof}
433: 
434: \begin{proof}(of Lemma~\ref{n2k-gap:lemma})
435: Let $\tP$ be a Markov chain on the same state space as $P$,
436: which is the $k$ by $n$ binary matrices with distinct rows.
437: If the current state of $\tP$ is the matrix $M$, 
438: then the next state is determined by picking
439: a row $r \in \{1,\ldots,k\}$ and setting it to a random new value that
440: is distinct from all other $k-1$ rows. 
441: The process $\tP$ is the Markov chain of coloring the clique on 
442: $k$ vertices with $2^n$ colors defined in~\cite[Section 4.1]{Je03}. 
443: Proposition 4.5 therein bounds its mixing time by 
444: $\tilde{\tau}(\eps) = O(k\log(k/\eps))$ as long as $k \leq 2^n/3$.
445: Setting $\eps=1/(4k)$ in Theorem~\ref{mixgap:theorem} implies that 
446: $\gap(\tP)=\Omega(1/k)$.
447: Therefore, as in the proof of Lemma~\ref{cayley-gap:lemma},
448: it is sufficient to prove that the comparison constant of $P$ to $\tP$ 
449: is $O(n^2)$.
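For the record, the substitution into Theorem~\ref{mixgap:theorem} can be
carried out as follows (a worked version of the step above). With
$\eps = 1/(4k)$ we have $\log(1/(2\eps)) = \log(2k)$ and
$\tilde{\tau}(\eps) = O(k\log(4k^2)) = O(k \log(2k))$, since
$\log(4k^2) = 2\log(2k)$. Hence
\[
\gap(\tP) = \Omega\left( \frac{\log(1/(2\eps))}{\tilde{\tau}(\eps)} \right)
= \Omega\left( \frac{\log(2k)}{k \log(2k)} \right) = \Omega(1/k).
\]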
450: 
451: Given matrices $M,M'$ such that $\tP(M,M')>0$, we know that $M'$ is obtained
452: from $M$ by changing the value of some row $r$ from $x$ to $y$.
453: To construct paths from $M$ to $M'$, we note that $M'$ can be obtained by 
454: applying the 3-cycle $(xyz)$ to $M$
455: for any $z \in \F2^n$ that is distinct from all rows of $M,M'$.
456: We choose $z$ at random from the $2^n-(k+1)$ allowed values.
457: As in the proof of Lemma~\ref{cayley-gap:lemma}, the randomized implementation
458: of $(xyz)$, given by Theorem~\ref{3-cycle:theorem}, defines 
459: a distribution on paths from $M$ to $M'$ and therefore a multicommodity
460: flow. We turn to bound the comparison constant, given by (\ref{Abound:eqn}).
461: 
462: As before, $|\gamma| \cdot (d/\tilde{d}) = \Theta(n \cdot |\Sigma|/(k2^n))$
463: for all $\gamma$ with non-zero flow, and it suffices to bound the flow through
464: some edge $e \in E(G_{k,n})$. We enumerate over the choices of the 
465: position $\ell$, row $r$ and distinct $x,y$, which make a total of 
466: $\Theta(nk2^{2n})$ possible values. 
467: Again we apply Theorem~\ref{3-cycle:theorem} to argue that on average,
468: the probability of agreement with $e$ is bounded by $O(1/(|\Sigma|2^n))$.%
469: \footnote{
470: One should note that $z$ is uniformly distributed only over
471: $2^n-(k+1) > 2^{n-1}$ values. However, this is equivalent to conditioning a
472: uniform $z$ on an event with probability at least half and therefore 
473: (by Lemma~\ref{cond:lemma}) can increase the probability of agreement 
474: with $e$ by at most a factor of two.}
475: Therefore, up to a constant factor, 
476: $A=(n \cdot |\Sigma|/(k2^n)) \cdot (nk2^{2n}) \cdot (1/(|\Sigma|2^n))=n^2$, 
477: as claimed.
478: \end{proof}
479: 
480: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
481: \section{Proof of Theorem~\ref{k2n2:theorem}}
482: 
483: In light of inequality (\ref{submultmix}), it is sufficient to
484: prove that $\tau(1/4) = \tO(n^2 k^2)$. 
485: The outline of the proof is the following.
486: We start by introducing the notion of a {\em generic} matrix,
487: and as suggested by the name, most matrices are generic.
488: The proof then proceeds by arguing that after a short random walk 
489: almost surely all matrices encountered are generic.
490: Therefore, it is sufficient to bound the mixing time of a walk that is 
491: restricted to generic matrices.
492: For such a walk, we can compare the chain to a chain defined {\em only} on 
493: generic matrices and achieve a much smaller comparison constant.
494: This yields the desired bound, $\tO(n^2 k^2)$.
495: 
496: Let $w=10\cdot(\log k + \log n)$. By assumption, we have $w \leq n/4$
497: for a sufficiently large $n$, and we set $p=\lceil n/2w \rceil$.
498: Let $C_1,\ldots,C_p,C$ be a partition of $[n]$ such that 
499: $|C_i|=w$ for $i=1,\ldots,p$ and $|C|=n-pw$. 
500: Consequently, $n/4 \leq n/2-w < |C| \leq n/2$.
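These inequalities can be checked directly. The following sketch assumes that
$w$ is an integer (rounded if necessary) and reads $\log$ as $\log_2$; neither
is stated explicitly above. Since $k \leq 2^{n/50}$ we get
$w \leq n/5 + 10\log_2 n \leq n/4$ for sufficiently large $n$. By the
definition of the ceiling,
\[
\frac{n}{2w} \leq p < \frac{n}{2w}+1
\quad\Longrightarrow\quad
\frac n 2 \leq pw < \frac n 2 + w
\quad\Longrightarrow\quad
\frac n 2 - w < n-pw \leq \frac n 2,
\]
and $w \leq n/4$ gives $n/2-w \geq n/4$.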
501: 
502: We say that a $k$ by $n$ matrix is {\em generic}, if for all $j \in [p]$,
503: its restriction to $C_j$ has distinct rows.
504: It is not difficult to check that a uniformly distributed matrix $M$ is
505: almost surely generic. 
506: In fact, it is sufficient that the rows of $M$ are $2^{-w}$-close 
507: to $2$-wise independent, since then the probability that
508: $M$ is not generic is bounded by $p$ times the probability that the restriction
509: of $M$ to $C_j$ does not have distinct rows. 
510: This yields the bound
511: $p \cdot \binom{k}{2} \cdot (2 \cdot 2^{-w}) = o(1/n^3k^3)$ and
512: implies the following lemma:
513: 
514: \begin{lemma}\label{generic-2wise:lemma}
515: If the rows of a random $k$ by $n$ matrix $M$ are $2^{-w}$-close to $2$-wise 
516: independent, then $M$ is generic with probability $1-o(1/n^3k^3)$.
517: \end{lemma}
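The arithmetic behind this bound can be sketched as follows, reading $\log$ as
$\log_2$ (an assumption; the base only affects constants), so that
$2^{-w} = (nk)^{-10}$. Two rows of a uniformly random matrix with distinct
rows agree on $C_j$ with probability $(2^{n-w}-1)/(2^n-1) \leq 2^{-w}$, and
$2^{-w}$-closeness to $2$-wise independence adds at most another $2^{-w}$.
Using $p \leq n$ and $\binom{k}{2} \leq k^2/2$,
\[
p \cdot \binom{k}{2} \cdot (2 \cdot 2^{-w})
\leq n k^2 (nk)^{-10} = n^{-9}k^{-8} = o(1/n^3k^3).
\]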
518: %\begin{proof}
519: %$\Pr[M \mbox{ is generic}] 
520: %\geq 1-\sum_{i \in [p]} \Pr[M_{|C_i} \mbox{doesn't have distinct rows}]
521: %\geq 1- 2p \cdot \binom{k}{2} \cdot 2^{-w}
522: %= 1-o(1/n^3k^3).$
523: %\end{proof}
524: 
525: It follows from a result of Chung and Graham about the mixing time 
526: of the ``Aldous Cube''~\cite{ChGr97}, that the number of steps needed
527: to come close to $2$-wise independence, which is the same as the mixing time of 
528: $G_{2,n}$, is $O(n\log n)$.
529: This is stated in the following lemma (whose proof is deferred to 
530: Section~\ref{more:section}).
531: 
532: \begin{lemma}\label{two-wise:lemma}
533: For all $w\ge 1$ the $\eps$-mixing time of the Schreier graph
534: $\schreier(\S_w,X^{(2)})$ is $O(n \log n\log(1/\eps))$.
535: \end{lemma}
536: 
537: Therefore, the matrix obtained after 
538: $T_1=O(n \log n \cdot w) = O(n\log n \cdot(\log k + \log n))$
539: steps is $2^{-w}$-close to $2$-wise independent,
540: and by Lemma~\ref{generic-2wise:lemma} it is
541: generic with probability $1-o(1/n^3k^3)$.
542: This implies that if we proceed by $T_2 = O(n^3k^3)$ steps, 
543: then all $T_2$ matrices encountered are generic
544: with probability $1-o(T_2/n^3k^3) > 1-\eps_1$, for any fixed $\eps_1>0$ and 
545: sufficiently large $n$.
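In more detail (a sketch of the two estimates just used): applying
Lemma~\ref{two-wise:lemma} with $\eps = 2^{-w}$ gives
$T_1 = O(n\log n \cdot \log(2^{w})) = O(n\log n \cdot w)$.
Further steps of the walk cannot increase the distance from $2$-wise
independence, so each of the $T_2$ subsequent matrices is non-generic with
probability $o(1/n^3k^3)$, and a union bound gives
\[
\Pr[\mbox{some encountered matrix is not generic}]
\leq T_2 \cdot o(1/n^3k^3) = o(1)
\qquad \mbox{for } T_2 = O(n^3k^3).
\]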
546: 
547: We introduce a new Markov chain $P'$. 
548: The state space of $P'$ consists of all generic $k$ by $n$ matrices. 
549: If the chain is currently at the matrix $M$, then the next state is determined
550: as follows. We pick a uniformly distributed simple permutation $\sigma \in \S$.
551: If $M\sigma$ is generic, we move to $M\sigma$. Otherwise, we remain at $M$.
552: Let $\tau'(\eps)$ denote the $\eps$-mixing time of $P'$,
553: and require that $T_2 \geq \tau'(\eps_2)$ for some fixed $\eps_2>0$.
554: 
555: We claim that as long as $2\eps_1+\eps_2 < 1/4$ the mixing time of $P$ can be 
556: bounded by $\tau(1/4) \leq T_1+T_2$.
557: To see this, 
558: let $M$ be some $k$ by $n$ matrix with distinct rows, and consider the following 
559: two matrices.
560: The first matrix, $M'$, is obtained by starting at $M$ and walking $T_1+T_2$ 
561: steps using $P$.
562: The second matrix $M''$ is defined as follows. 
563: Let $\hat{M}$ be the matrix obtained
564: when starting at $M$ and performing $T_1$ steps of $P$.
565: If $\hat{M}$ is not generic, we set $M''=\hat{M}$.
566: Otherwise, $M''$ is the matrix reached by the length $T_2$ walk using $P'$
567: that starts at $\hat{M}$.
568: We claim that $d(M',M'') \leq \eps_1$ and that $M''$ is 
569: $(\eps_1+\eps_2)$-close to the uniform distribution over $k$ by $n$ matrices 
570: with distinct rows.%
571: \footnote{Note that by our assumptions, the distance between the uniform 
572: distribution over matrices with distinct rows and the uniform distribution over generic matrices is $o(1)$.}
573: Proving those claims will imply that 
574: \begin{eqnarray}
575: \tau(1/4) \leq \tau'(\eps_2) + O(n\log n \cdot(\log k + \log n)),
576: \end{eqnarray}
577: as long as $\tau'(\eps_2) = O(n^3k^3)$.
578: 
579: We start by checking that indeed $d(M',M'') \leq \eps_1$. It is convenient to 
580: think of the two length $T_1+T_2$ walks from $M$ to $M'$ and $M''$ as
581: defined over the same probability space $\S^{T_1+T_2}$ which is the choice
582: of a simple permutation in each of the $T_1+T_2$ steps. Denote
583: the $P$-walk by $(M_0=M,M_1,\ldots,M_{T_1+T_2}=M')$.
584: Then, if all the matrices $M_{T_1},\ldots,M_{T_1+T_2}$ are generic,
585: it coincides with the walk leading to $M''$, and in particular we have
586: $M'=M''$. By the previous arguments, this event happens at least
587: with probability $1-\eps_1$, implying that $d(M',M'') \leq \eps_1$.
588: 
589: The proof that $M''$ is $(\eps_1+\eps_2)$-close to uniform is more delicate.
590: We know that the matrix $\hat{M}$ is generic with probability at least 
591: $1-\eps_1$.
592: Also, since $T_2 \geq \tau'(\eps_2)$, we know that conditioned on $\hat{M}$
593: being generic, $M''$ is $\eps_2$-close to the uniform distribution.
594: Therefore $M''$ is $(\eps_1+\eps_2)$-close to the uniform distribution over 
595: matrices with distinct rows. This argument can be easily formalized using 
596: Lemma~\ref{twoeps:lemma} of Section~\ref{more:section}.
597: 
598: We are left with the proof of the following lemma.
599: 
600: \begin{lemma}\label{genericmixing:lemma}
601: $\tau'(1/4) = \tO(n^2k^2)$.
602: \end{lemma}
603: 
604: To bound the mixing time of the Markov chain $P'$, we apply the comparison 
605: technique~\cite{DiSa93}. We compare $P'$ to the Markov chain $\tP$ defined
606: on the same state space, the $k$ by $n$ generic matrices. 
607: Given that $\tP$ is at a matrix $M$, we determine the next state as follows.
608: With probability half we pick a random column $c \in C$ and row $r \in [k]$
609: and flip the corresponding bit with probability half.
610: Otherwise, we pick at random an index $i \in [p]$, a row $r \in [k]$ and 
611: a string $\alpha \in \F2^w$ that is distinct from all other $k-1$ rows in 
612: the restriction of $M$ to the columns $C_i$. We set the bits at row $r$ 
613: and columns $C_i$ to $\alpha$.
614: 
615: Consequently, the following two lemmas
616: imply Lemma~\ref{genericmixing:lemma}. Note that we need not worry about the
617: smallest eigenvalue of $P'$ since a random permutation from $\S$ is the 
618: identity with probability $1/16$.
619: 
620: \begin{lemma}\label{spectral:lemma}
621: $\gap(\tP) = \Omega(1/nk)$.
622: \end{lemma}
623: 
624: \begin{lemma}\label{simulation:lemma}
625: The comparison constant $A$ of $\tP$ to $P'$ satisfies $A = \tO(1)$.
626: \end{lemma}
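To spell out the implication (a sketch, using the comparison bound
$\gap(P') \geq \gap(\tP)/A$ and Theorem~\ref{gapmix:theorem}): the two lemmas
give $\gap(P') = \tOmega(1/nk)$, the chain $P'$ is lazy because a random
permutation from $\S$ is the identity with probability $1/16$, and the number
of generic matrices is at most $2^{nk}$. Hence
\[
\tau'(1/4) = O\left( \frac{\log(4 \cdot 2^{nk})}{\gap(P')} \right)
= \tO(nk \cdot nk) = \tO(n^2k^2).
\]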
627: 
628: \begin{proof} (of Lemma~\ref{spectral:lemma})
629: 
630: Consider two Markov chains $\tP_1$ and $\tP_2$:
631: \begin{enumerate}
632: \item
633: The state space of $\tP_1$ consists of the $k$ by $w$ binary matrices with distinct 
634: rows. At each step one chooses a random row and sets it to a random new
635: value distinct from all other $k-1$ rows. 
636: This chain is exactly the coloring chain of a clique on $k$ vertices
637: with $2^w$ colors of~\cite[Proposition 4.5]{Je03},
638: and as in the proof of Lemma~\ref{n2k-gap:lemma}, 
639: it satisfies $\gap(\tP_1)=\Omega(1/k)$.
640: \item
641: $\tP_2$ is the random walk on the $(n-wp)\cdot k$ dimensional binary cube, 
642: where in each step with probability half, 
643: one flips a random coordinate. Therefore,  $\gap(\tP_2)=\Omega(1/nk)$.
644: \end{enumerate}
645: One can think of the chain $\tP$ as the product of $p$ copies of $\tP_1$ and
646: one copy of $\tP_2$. Indeed the state space of $\tP$ is the direct product of
647: the $p+1$ state spaces. Moreover, a step of $\tP$ performs a move of $\tP_2$ 
648: with probability $1/2$ and otherwise performs the move in a randomly selected 
649: copy of $\tP_1$. 
It is straightforward to check that the spectral gap of $\tP$ is
651: $\min(\gap(\tP_1)/p,\gap(\tP_2))/2$, implying the desired bound.
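
One way to carry out this check is sketched below; the tensor notation is ours and is not used elsewhere in the paper. Write the transition matrix of $\tP$ as
\[
\tP = \frac{1}{2}\,\bigl(\tP_2 \otimes I \otimes \cdots \otimes I\bigr)
 + \frac{1}{2p}\sum_{i=1}^{p} I \otimes \cdots \otimes \tP_1 \otimes \cdots \otimes I ,
\]
where in the $i$-th summand $\tP_1$ acts on the $i$-th copy of its state space.
Taking tensor products of eigenvectors of the factors, every eigenvalue of $\tP$ 
has the form $\frac{1}{2}\mu + \frac{1}{2p}\sum_{i=1}^p \lambda_i$, with $\mu$ an 
eigenvalue of $\tP_2$ and $\lambda_1,\ldots,\lambda_p$ eigenvalues of $\tP_1$.
The largest such value below $1$ is obtained by moving exactly one factor to its 
second largest eigenvalue, so
\[
1-\gap(\tP) = \max\Bigl(1-\frac{1}{2}\gap(\tP_2),\; 1-\frac{1}{2p}\gap(\tP_1)\Bigr).
\]
Since $wp \leq n$, we have $\gap(\tP_1)/p = \Omega(1/pk) = \Omega(1/nk)$, 
which matches the bound for $\gap(\tP_2)$.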
652: 
653: \end{proof}
654: 
655: \begin{proof}(of Lemma~\ref{simulation:lemma})
656: 
657: Let $G'$ be the underlying graph of $P'$.
658: The vertices of $G'$ are the generic $k$ by $n$ matrices, 
659: and $(N,N')$ is an edge of $G'$ if $P'(N,N')>0$.
660: To bound the comparison constant $A$, we need to construct a multicommodity 
661: flow $f$ in $G'$ that flows a unit between every two matrices $M,M'$ 
662: such that $\tilde{P}(M,M')>0$.
663: The chains $P'$ and $\tP$ correspond to random walks on regular graphs
664: with degrees $d'=\Theta(n^3)$, $\tilde{d} = \Theta(kn2^w/w)$ respectively,
665: and as before the comparison constant $A$ is defined by (\ref{Abound:eqn}).
666: 
667: To build a path $\gamma$ from $M$ to $M'$ we need to distinguish two types
668: of $\tP$ transitions.
669: Type (i) flips the bit at row $r$ and column $c \in C$.
670: Type (ii) changes the bits at row $r$ and columns 
671: $C_i$ from $\alpha$ to $\alpha'$.
672: We start by constructing the type (i) paths.
673: 
674: Let $j \in [p]$ be a random index, and let $\beta \in \F2^w$ be the 
675: restriction of the $r$-th row of $M$ to $C_j$.
676: Also let $S$ be a random sequence of $w-1$ distinct elements from 
677: $C \setminus\{c\}$.
678: The unit flow from $M$ to $M'$ is along paths $\gamma=\gamma_{M,M'}^{S,j}$.
679: Each such path is defined by composing simple permutations from $\S$
680: to achieve the permutation that acts on $x \in \F2^n$ by flipping coordinate 
681: $c$ if the restriction of $x$ to $C_j$ is $\beta$. 
682: Clearly such a permutation maps $M$ to $M'$.
683: We follow the method of Barenco et al.~\cite{BBCDMSSSW95} to build 
684: an AND gate with $w$ inputs.
685: This gate inverts its output bit (the coordinate $c$) if its $w$ inputs (the
686: coordinates $C_j$) have some fixed value $\beta$. 
687: The coordinates in the set $S$ are used as ``scratch''.
688: 
689: Let $C_j=\{j_1,\ldots,j_w\}$, $S=\{s_1,\ldots,s_{w-1}\}$ and 
690: $\beta=(b_1,\ldots,b_w)$.
691: Let $\sigma_1$ be the simple permutation that flips coordinate 
692: $s_1$ of $x \in \F2^n$ if $x_{j_1}$ is equal to $b_1$,
693: and let $\sigma_\ell$ for $2 \leq \ell \leq w-1$ be the simple permutation 
694: that flips coordinate $s_\ell$ if $x_{s_{\ell-1}}$ is one and $x_{j_\ell}$ 
695: is equal to $b_\ell$. Also, we denote by $\tau_c$ the simple permutation that 
696: flips $x_c$ if $x_{s_{w-1}}$ is one and $x_{j_w}$ is equal to $b_w$. 
697: We claim that the following permutation
698: flips coordinate $c$ of $x \in \F2^n$ if the restriction of $x$ to $C_j$ is 
699: equal to $\beta$: 
700: \begin{eqnarray*}
701: \sigma = 
702: (\tau_c \sigma_{w-1} \cdots \sigma_2\sigma_1\sigma_2 \cdots \sigma_{w-1}
703: )^2
704: \end{eqnarray*}
705: To see this, one checks by induction that 
706: $\sigma_\ell\cdots\sigma_1\cdots\sigma_\ell$ flips coordinate $s_\ell$ if 
707: $x_{j_1}, \ldots, x_{j_\ell}$ is equal to $b_1,\ldots,b_\ell$.
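
The induction can be sketched as follows. The shorthand $m_\ell$ and $V_\ell$ is 
introduced here only for this check, and we assume that the coordinates in $S$, 
$C_j$ and $c$ are pairwise distinct and that products act from left to right, 
as in $x\sigma_1\sigma_2\cdots$.
Let $m_\ell \in \{0,1\}$ indicate whether $x_{j_\ell}=b_\ell$, and let 
$V_\ell=\sigma_\ell\cdots\sigma_2\sigma_1\sigma_2\cdots\sigma_\ell$.
No $\sigma_\ell$ modifies a coordinate of $C_j$, so the bits $m_\ell$ are constant 
along the sequence. 
The base case is $V_1=\sigma_1$, which acts by 
$x_{s_1} \leftarrow x_{s_1}\oplus m_1$.
Suppose that $V_{\ell-1}$ acts by $x_{s_t} \leftarrow x_{s_t}\oplus m_1\cdots m_t$ 
for all $t \leq \ell-1$. 
In $V_\ell=\sigma_\ell V_{\ell-1}\sigma_\ell$ the coordinate $s_\ell$ is updated twice:
\begin{eqnarray*}
x_{s_\ell} &\leftarrow& 
x_{s_\ell}\oplus x_{s_{\ell-1}}m_\ell 
\oplus (x_{s_{\ell-1}}\oplus m_1\cdots m_{\ell-1})\,m_\ell
= x_{s_\ell}\oplus m_1\cdots m_\ell .
\end{eqnarray*}
Hence $V_\ell$ XORs $m_1\cdots m_t$ into $x_{s_t}$ for every $t \leq \ell$, 
and in particular $V_{w-1}^2$ is the identity. 
Applying $\tau_c V_{w-1}\tau_c V_{w-1}$ therefore gives
\begin{eqnarray*}
x_c &\leftarrow& 
x_c \oplus x_{s_{w-1}}m_w \oplus (x_{s_{w-1}}\oplus m_1\cdots m_{w-1})\,m_w
= x_c \oplus m_1\cdots m_w ,
\end{eqnarray*}
while all scratch coordinates return to their original values.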
708: 
709: For the type (ii) paths, we need to change the bits at row $r$ and
710: columns $C_i$ from $\alpha$ to $\alpha'$. 
711: The problem is that if we change $\alpha$ to $\alpha'$ bit by bit,
712: as suggested by the construction of type (i) paths, 
713: we might violate row distinctness.
714: To solve this problem, we start our path by applying a length 
715: $L = O(w\log w\cdot(1+2\log k))$ sequence $\phi$ of simple permutations with 
716: indices restricted to $C_i$.
717: Let $\hM = M\phi$ and $\hpM = M'\phi$, and let $C_i'$ and $C_i''$ be the first
718: and last $\lfloor (w-1)/2 \rfloor$ columns of $C_i$.
We say that $\phi$ is valid if both the restriction of $\hM$ to $C_i''$ and
the restriction of $\hpM$ to $C_i'$ have distinct rows. 
By Lemma~\ref{two-wise:lemma} we know that for a random $\phi$, both $\hM$
722: and $\hpM$ are $1/8k^2$-close to $2$-wise independence. 
723: Therefore, a random $\phi$ is not valid with probability bounded by 
724: $k^2\cdot(2^{-w/2+1}+1/8k^2) \leq 1/4$.
725: If $\phi$ is valid we define a path $\gamma=\gamma_{M,M'}^{S,j,\phi}$
726: from $M$ to $M'$, where $j \in [p]\setminus \{i\}$ and $S$ is 
727: a length $w-1$ sequence of elements from $C$. 
728: The path is prefixed by $\phi$ to get from $M$ to $\hM$
729: and is suffixed by $\phi^{-1}$ to get from $\hpM$ to $M'$.
730: Let $\halpha$ and $\hpalpha$ be the restriction of the $r$-th row 
731: of $\hM$ and $\hpM$ to $C_i$ respectively,
732: and let $\beta$ be the restriction of the $r$-th row of $M$ to $C_j$.
733: Then the middle path connecting $\hM$ to $\hpM$ is defined as follows:
734: \begin{eqnarray*}
735: \sigma = [(
736: \prod_{\{c \in C_i'\,:\,\halpha_c \neq \hpalpha_c\}} \tau_c
737: ) \cdot \sigma_{w-1} \cdots \sigma_2\sigma_1\sigma_2 \cdots \sigma_{w-1} ]^2
738: \cdot
739: [(
740: \prod_{\{c \in C_i \setminus C_i'\,:\,\halpha_c \neq \hpalpha_c\}} \tau_c
741: ) \cdot \sigma_{w-1} \cdots \sigma_2\sigma_1\sigma_2 \cdots \sigma_{w-1} ]^2,
742: \end{eqnarray*}
743: where $\tau_c$ and $\sigma_\ell$ are as defined for the type (i) sequences.
744: Therefore it is guaranteed that the matrices encountered 
745: along the first and second half of the sequence agree with 
746: $\hM$ on the columns $C_i''$ and with 
747: $\hpM$ on the columns $C_i'$ respectively. 
748: Since $\phi$ is valid, this implies that we never attempt to move to 
749: a non-generic matrix throughout the entire path.
750: We define the unit flow from $M$ to $M'$ by splitting the flow uniformly
751: between all valid paths $\gamma$ designated by $S,j,\phi$. 
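
The bound on the probability that a random $\phi$ is not valid can be unpacked 
as a union bound. This is a sketch, and the threshold on $w$ at the end is our 
reading rather than a statement from the text.
Each of $\hM$ and $\hpM$ has $\binom{k}{2}$ pairs of rows, so at most $k^2$ pairs 
need to be checked in total.
For a fixed pair, exact $2$-wise independence would make the two rows agree on 
the $\lfloor (w-1)/2 \rfloor$ relevant columns with probability at most 
$2^{-\lfloor (w-1)/2 \rfloor} \leq 2^{-w/2+1}$, and the distance from $2$-wise 
independence adds at most $1/8k^2$. Hence
\[
\Pr[\phi \mbox{ is not valid}] 
\leq k^2\cdot(2^{-w/2+1}+1/8k^2) 
= k^2 2^{-w/2+1} + 1/8,
\]
which is at most $1/4$ as soon as $2^{w/2} \geq 16k^2$, 
that is, $w \geq 8 + 4\log_2 k$.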
752: 
753: There are two points that need special attention in the constructed type (i) 
754: and type (ii) paths. 
755: The first point is that all indices of the simple permutations used in $\phi$
756: are in $C_i$. This is unacceptable for us, as it induces an undue load on
757: a small subset of $\S$. To solve this problem we replace
758: each simple permutation used in $\phi$ by a constant length sequence that 
759: avoids that problem. 
760: For example, the permutation that flips coordinate $i_1$ if $i_2$ and $i_3$ 
761: are $1$, denoted $\chi_{i_1,i_2,i_3}$, is replaced by the sequence
\( (\chi_{s_2,s_1,i_3}\, \chi_{s_1,i_2}\, \chi_{s_2,s_1,i_3}\, \chi_{i_1,s_2})^2 \),
763: where permutation $\chi_{i_1,i_2}$ XORs coordinate $i_1$ with $i_2$.
764: %$x_{i_1} \leftarrow x_{i_1} \oplus x_{i_2}x_{i_3}$
765: %is replaced by:
766: %\begin{eqnarray*}
767: %(\;x_{s_2} \leftarrow x_{s_2} \oplus x_{s_1}x_{i_3};\ \ 
768: %x_{s_1} \leftarrow x_{s_1} \oplus x_{i_2}; \ \ 
769: %x_{s_2} \leftarrow x_{s_2} \oplus x_{s_1}x_{i_3};\ \ 
770: %x_{i_1} \leftarrow x_{i_1} \oplus x_{s_2}\;)^2.
771: %\end{eqnarray*}
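
A quick check of this replacement is sketched below. We assume that $s_1,s_2$ are 
two scratch coordinates distinct from $i_1,i_2,i_3$, that $\chi_{a,b,c}$ acts by 
$x_a \leftarrow x_a \oplus x_bx_c$, and that the sequence is applied from left 
to right. One pass through the four steps has the net effect
\begin{eqnarray*}
x_{s_1} &\leftarrow& x_{s_1} \oplus x_{i_2},\\
x_{s_2} &\leftarrow& x_{s_2} \oplus x_{s_1}x_{i_3} \oplus (x_{s_1}\oplus x_{i_2})x_{i_3}
 = x_{s_2} \oplus x_{i_2}x_{i_3},\\
x_{i_1} &\leftarrow& x_{i_1} \oplus x_{s_2} \oplus x_{i_2}x_{i_3},
\end{eqnarray*}
where the right hand sides refer to the values before the pass.
The second pass restores $x_{s_1}$ and $x_{s_2}$ and XORs the original value 
of $x_{s_2}$ into $x_{i_1}$, so in total 
$x_{i_1} \leftarrow x_{i_1} \oplus x_{i_2}x_{i_3}$, as required.
Each permutation in the sequence has at most one index among $i_1,i_2,i_3$.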
772: 
773: The second point is that some of the simple permutations used 
774: ($\sigma_1$ and some of the permutations in $\phi$) 
775: do not use three indices. However, in the definition of $\S$, 
776: we have three indices at our disposal even if we don't use all three.
777: We use this to guarantee that all simple permutations used have one index in 
778: $C_j$ and two from $S$ or $c$ for type (i) paths or $C_i$ for type (ii) paths.
779: 
780: To complete the proof, we have to bound the comparison constant $A$
781: given by (\ref{Abound:eqn}). 
We have $d'/\tilde{d} = \Theta(n^2w/k2^w)$ and $|\gamma|=O(L)$.
783: Also, $f(\gamma)$ is $\Theta(w/n(m)_{w-1})$ for type (i) paths 
784: and $\Theta(w/|\S^{(w)}|^Ln(m)_{w-1})$ for type (ii) paths, 
785: where we denote $m=|C|$, 
786: $(m)_q = m(m-1)(m-2)\cdots(m-q+1)$,
787: and $\S^{(w)}$ as the width $2$ simple permutations restricted to the 
788: $w$-dimensional cube.
789: Therefore, we only have to bound the maximal number of $\gamma_{M,M'}^{S,j}$
790: and $\gamma_{M,M'}^{S,j,\phi}$ paths through an edge $(N,N')$.
791: 
792: We start with type (i) paths.
793: The first step is to extract as much information as possible about a path 
794: $\gamma$ through $(N,N')$ by considering the simple permutation $s$ associated 
795: with $(N,N')$.
796: Note first that $s$ determines $j$. 
797: Moreover, since only one of $\sigma_1,\ldots,\sigma_{w-1}$ and 
798: $\tau_c$ can be equal to $s$, any path $\gamma$ using $s$, 
799: must use it in one of $O(1)$ possible positions. 
800: Since a permutation $\sigma_\ell$ for $\ell \in [w-1]$ or $\tau_c$ determines 
801: two indices of $S,c$ 
802: there are only $\Theta((m)_{w-2})$ choices for $S,c$ that are 
803: consistent with $s$.
804: The last thing still needed to reconstruct $\gamma$ is the string 
805: $\beta \in \F2^w$. Since the columns $C_j$ are not modified throughout the
806: entire sequence, $\beta$ must be the restriction of some row of $N$ to $C_j$,
807: limiting $\beta$ to one of $k$ possible values.
808: Therefore, the total number of type (i) paths through $(N,N')$ 
809: is $O(k \cdot (m)_{w-2})$, 
810: and the contribution of the type (i) sequences to $A$ is:
811: \begin{eqnarray*}
812: A_{(i)} 
813: = O( \overset{d'/\tilde{d}}{\overbrace{(n^2w/k2^w)}} 
814: \cdot \overset{f(\gamma)\cdot|\gamma|}{\overbrace{(Lw/(m)_w)}}
815: \cdot \overset{\mbox{choices for }j,S,c,\beta\mbox{ and position}}
816:               {\overbrace{(k \cdot (m)_{w-2})}} )
817: = O(Lw^2/2^w) = o(1).
818: \end{eqnarray*}
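The arithmetic behind the last bound can be spelled out as follows. This is a 
sketch that assumes $m-w = \Theta(n)$, so that the pool $C$ of scratch columns 
contains a constant fraction of all columns; this assumption is ours.
Since $(m)_w = (m)_{w-2}\cdot(m-w+2)(m-w+1)$,
\[
\frac{n^2w}{k2^w} \cdot \frac{Lw}{(m)_w} \cdot k\,(m)_{w-2}
= \frac{Lw^2}{2^w}\cdot\frac{n^2}{(m-w+2)(m-w+1)}
= O(Lw^2/2^w).
\]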
819: 
For type (ii) paths we distinguish the cases where $(N,N')$ is 
in the first, middle, or last section of a path 
822: $\gamma_{M,M'}^{S,j,\phi}$.
823: Consider the first section (and similarly the last).
824: We enumerate over possible positions $\ell \in [L]$. 
825: Then we know two indices of the
sequence $S$ and one of the $3L$ indices in $C_i$ that were used by $\phi$.
827: Therefore, we have $L\cdot(m)_{w-3}\cdot|\S^{(w)}|^L/w$ possible values for 
828: $S,i,\phi$ and the position.
829: This enables us to determine $M$ and $\hM$. We still have to determine the
830: row $r$, the two strings $\alpha, \alpha'$ and the index $j$
831: which have $O(kn2^w/w)$ possibilities.
832: Therefore the contribution of the first and last sections of type (ii)
833: paths is:
834: \begin{eqnarray*}
835: A_{(ii.\mbox{first,last})} 
836: &=& O( \overset{d'/\tilde{d}}{\overbrace{(n^2w/k2^w)}} 
837: \cdot \overset{f(\gamma)\cdot|\gamma|}{\overbrace{(Lw/|\S^{(w)}|^L(m)_w)}}
838: \cdot \overset
839:        {\mbox{choices for }j,S,i,\phi,\alpha,\alpha',\beta\mbox{ and position}}
840:        {\overbrace{(k2^wL\cdot(m)_{w-2}\cdot|\S^{(w)}|^L/w^2)}})\\
841: &=& O(L^2) = O(w^2\log^2 w\cdot(1+\log k)^2).
842: \end{eqnarray*}
843: 
844: For the middle section of type (ii) paths, as for the type (i) argument,
845: given $(N,N')$ we first determine the position up to $O(1)$ possible choices.
846: Then we determine the index $i$ or $j$ and two indices from $S$,
847: then we have $O( (m)_{w-2} \cdot |\S^{(w)}|^L/w )$ possibilities for 
848: $i,j,S,\phi$.
849: Also we have $k2^w$ choices for the row and the strings $\beta,\alpha$ and 
850: $\alpha'$.
851: Therefore,
852: \begin{eqnarray*}
853: A_{(ii.\mbox{middle})}
854: = O(\overset{d'/\tilde{d}}{\overbrace{(n^2w/k2^w)}} 
855: \cdot \overset{f(\gamma)\cdot|\gamma|}{\overbrace{(Lw/|\S^{(w)}|^L(m)_w)}}
856: \cdot \overset
857:        {\mbox{choices for }j,S,i,\phi,\alpha,\alpha',\beta\mbox{ and position}}
858:        {\overbrace{k2^w \cdot (m)_{w-2} \cdot |\S^{(w)}|^L/w}}
859: )
860: = O(Lw).
861: \end{eqnarray*}
862: 
863: 
864: \end{proof}
865: 
866: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
867: 
868: \section{Proof of Theorem~\ref{3-cycle:theorem}}
869: 
870: \newcommand{\Core}{{\rho_\mathit{core}}}
871: \newcommand{\Top}{\mathit{top}}
872: \newcommand{\Bot}{\mathit{bot}}
873: \newcommand{\cI}{\I}
874: \newcommand{\cIp}{{\I}^\prime}
875: %\newcommand{\ignore}[1]{}
876: \newcommand{\vp}{v^\prime}
877: \newcommand{\vpp}{v^{\prime\prime}}
878: \newcommand{\up}{u^\prime}
879: \newcommand{\tvp}{\tilde{v}^\prime}
880: \newcommand{\tvpp}{\tilde{v}^{\prime\prime}}
881: 
882: First, we describe the randomized implementation of a $3$-cycle
883: $(xyz)$ using the simple permutations in $\Sigma$.  Second, we show
884: that this randomized implementation satisfies the statement of the
885: theorem. The randomness is introduced into the implementation of
886: $(xyz)$ by using a permutation $\phi \in S_n$ and two vectors $v_4,v_5$.
887: 
888: Let $\phi$ be some permutation of the $n$ coordinates. 
889: If $\omega=\sigma_1\cdots\sigma_L$ implements $(x\phi,y\phi,z\phi)$, 
890: then $\omega^\phi$ is an implementation of $(xyz)$, 
891: where $\omega^\phi=\phi \omega \phi^{-1}$ is the conjugation of
$\omega$ with $\phi$, i.e., the conjugation of each of the permutations
893: $\sigma_i$ used in $\omega$. Note that the set $\Sigma$ of simple
894: permutations is closed under conjugation by permutations from $S_n$, because
895: this just relabels the indices.
896: 
897: For a vector $v \in \F2^n$, we denote the first $n-2$ bits of $v$ by $\vp
898: \in \F2^{n-2}$ and the last two bits of $v$ by $\vpp \in \F2^2$, i.e.,
899: $v = \vp\vpp$.  We call the last two bits the {\it control bits}.  
900: For convenience, the notation
901: $\vp00$, $\vp01$, $\vp10$, and $\vp11$ denotes bit vectors comprising
902: the first $n-2$ bits of $v$ and the control bits $00$, $01$, $10$, and
903: $11$, respectively.  Let $(v)_j$ denote the $j$-th bit of a vector $v$.
904: Finally, let $v_1 = x\phi$, $v_2 = y\phi$ and $v_3 = z\phi$.
905: 
906: If $\vp_1$ is equal to $\vp_2$ or to $\vp_3$ then we say that $\phi$ is
907: invalid. 
This can only occur if $x$ is at Hamming distance less than $3$ from $y$ 
(or from $z$) and $\phi$ maps all indices on which $x$
and $y$ (or $z$) differ to the control indices.  
911: For the rest of the description we assume that $\phi$ is valid.
912: Let $v_4, v_5 \in \F2^n$ be two additional vectors satisfying the validity
913: requirement of being at least Hamming distance $3$ from each other and from 
914: the former three vectors. 
915: 
916: Observe that $(v_1,v_2,v_3) = \psi_1 \psi_2$ where 
917: $\psi_1=(v_1,v_2)(v_4,v_5)$ and $\psi_2=(v_1,v_3)(v_4,v_5)$.
918: Therefore it suffices to implement the two double transpositions
919: $\psi_1$ and $\psi_2$. These are implemented in an identical manner.
920: Each implementation is divided into 15 blocks: a core block, which
921: implements the permutation $\Core = (\vp_500,\vp_501) (\vp_510,\vp_511)$,
922: and seven block pairs conjugating it.
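
The factorization can be verified by tracing each vector, reading products from 
left to right as in $x\sigma_1\sigma_2\cdots$ (the left-to-right convention is 
our assumption here):
\[
\begin{array}{rcccl}
v_1 & \stackrel{\psi_1}{\longrightarrow} & v_2 & \stackrel{\psi_2}{\longrightarrow} & v_2,\\
v_2 & \stackrel{\psi_1}{\longrightarrow} & v_1 & \stackrel{\psi_2}{\longrightarrow} & v_3,\\
v_3 & \stackrel{\psi_1}{\longrightarrow} & v_3 & \stackrel{\psi_2}{\longrightarrow} & v_1,\\
v_4 & \stackrel{\psi_1}{\longrightarrow} & v_5 & \stackrel{\psi_2}{\longrightarrow} & v_4,\\
v_5 & \stackrel{\psi_1}{\longrightarrow} & v_4 & \stackrel{\psi_2}{\longrightarrow} & v_5.
\end{array}
\]
So $\psi_1\psi_2$ cycles $v_1 \to v_2 \to v_3 \to v_1$, and the auxiliary 
transposition $(v_4,v_5)$ cancels.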
923: 
The first four of these blocks, called $\pi$-blocks, ensure that the
925: control bits of each of the four vectors are distinct.  Specifically, $\vp_i
926: \vpp_i$ is mapped to $\vp_i c_i$, where $c_1 = 00$, $c_2 = c_3 = 01$, $c_4
927: = 10$ and $c_5 = 11$.  If $\vpp_i = c_i$ then the corresponding block,
928: labeled $\pi_i$ performs a nop.  Otherwise, block $\pi_i$ performs the
929: permutation $(\vp_i \vpp_i, \vp_i c_i) (\vp_i a_i, \vp_i b_i)$ where
930: $\{a_i,b_i\} = \F2^2 \backslash \{\vpp_i,c_i\}$.
931: 
932: The remaining three blocks, called $\tau$-blocks, map $\vp_1$, $\vp_2$ (or
933: $\vp_3$), and $\vp_4$ to $\vp_5$, using the control bits to distinguish
934: between the four vectors.  Block $\tau_i$ performs the permutation
935: $\tau_i = \prod_{\vp \in \F2^{n-2}} (\vp c_i, \up c_i)$, where $\up =
936: \vp \oplus \vp_i \oplus \vp_5$.  Since it can easily be checked that
937: $\tau_i = \tau_i^{-1}$, that $\pi_i = \pi_i^{-1}$, and that
938: \[\pi_1 \pi_2 \pi_4 \pi_5 \tau_1 \tau_2 \tau_4 \Core 
939:   \tau_4 \tau_2 \tau_1 \pi_5 \pi_4 \pi_2 \pi_1 = \psi_1
940: \mathrm{\ \ \ and\ \ \ }
941:   \pi_1 \pi_3 \pi_4 \pi_5 \tau_1 \tau_3 \tau_4 \Core 
942:   \tau_4 \tau_3 \tau_1 \pi_5 \pi_4 \pi_3 \pi_1 = \psi_2,\]
943: we need only describe the implementation of each of these blocks.
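
As an illustration of the first identity, the following sketch traces the four 
vectors moved by $\psi_1$ through the blocks, applied from left to right. 
Only the blocks that move the vector are shown; every omitted block fixes the 
vector at that point, either because the control bits differ from $c_i$ or 
because the first $n-2$ bits differ from $\vp_i$. If $\vpp_i = c_i$ the 
corresponding $\pi_i$ arrow is the identity.
\[
\begin{array}{l}
v_1 \stackrel{\pi_1}{\longrightarrow} \vp_1 00 
    \stackrel{\tau_1}{\longrightarrow} \vp_5 00
    \stackrel{\Core}{\longrightarrow} \vp_5 01
    \stackrel{\tau_2}{\longrightarrow} \vp_2 01
    \stackrel{\pi_2}{\longrightarrow} v_2,\\[1ex]
v_2 \stackrel{\pi_2}{\longrightarrow} \vp_2 01 
    \stackrel{\tau_2}{\longrightarrow} \vp_5 01
    \stackrel{\Core}{\longrightarrow} \vp_5 00
    \stackrel{\tau_1}{\longrightarrow} \vp_1 00
    \stackrel{\pi_1}{\longrightarrow} v_1,\\[1ex]
v_4 \stackrel{\pi_4}{\longrightarrow} \vp_4 10 
    \stackrel{\tau_4}{\longrightarrow} \vp_5 10
    \stackrel{\Core}{\longrightarrow} \vp_5 11
    \stackrel{\pi_5}{\longrightarrow} v_5,\\[1ex]
v_5 \stackrel{\pi_5}{\longrightarrow} \vp_5 11 
    \stackrel{\Core}{\longrightarrow} \vp_5 10
    \stackrel{\tau_4}{\longrightarrow} \vp_4 10
    \stackrel{\pi_4}{\longrightarrow} v_4.
\end{array}
\]
The trace for $\psi_2$ is the same with $v_3$, $\pi_3$ and $\tau_3$ in place of 
$v_2$, $\pi_2$ and $\tau_2$.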
944: 
945: Each of the blocks is implemented using $O(n)$ simple permutations.  Each
946: $\tau$-block is implemented by concatenating $n-2$ simple permutations,
where for $j = 1,\ldots,n-2$, the $j$-th simple permutation is the
948: identity if $(\vp_i)_j = (\vp_5)_j$, and otherwise flips the $j$-th bit of
949: vector $v$ if $\vpp = c_i$.
950: 
951: The implementation of the $\Core$ and $\pi$ blocks is more involved.
952: Permutation $\Core$ flips bit $(\vpp)_2$ if and only if $\vp = \vp_5$.
Barenco et al.~\cite{BBCDMSSSW95} showed how such permutations can be
954: implemented using $O(n)$ simple permutations, comprising four sub-blocks:
955: $\rho_\Top\rho_\Bot\rho_\Top\rho_\Bot$ where permutation $\rho_\Top$ flips
956: bit $(\vpp)_1$ if the first $\lceil (n-2)/2 \rceil$ bits of $\vp$ match
957: the first $\lceil (n-2)/2 \rceil$ bits of $\vp_5$, and where permutation
958: $\rho_\Bot$ flips bit $(\vpp)_2$ if the latter $\lfloor (n-2)/2 \rfloor$
959: bits of $\vp$ match the latter $\lfloor (n-2)/2 \rfloor$ bits of $\vp_5$
960: and $(\vpp)_1 = 1$.  Each sub-block uses the remaining $\lceil (n-2)/2
961: \rceil$ bits as ``scratch'', returning them to their original state by
962: the end of the sub-block.  For details about the construction of the
963: two sub-blocks see~\cite{BBCDMSSSW95} or Lemma~\ref{simulation:lemma}.
964: 
965: Each block $\pi_i$ is implemented in a similar manner using two
966: permutations that are nearly identical to the implementation of $\Core$.
967: The first (second) permutation performs the identity if $(\vpp)_1 =
968: (c_i)_1$ (respectively, $(\vpp)_2 = (c_i)_2$) and otherwise flips bit
969: $(\vpp)_1$ (respectively, $(\vpp)_2$) if $\vp = \vp_i$.
970: 
971: The length of the implementations of $\psi_1$ and $\psi_2$ is $O(n)$,
since each of the 15 blocks can be implemented using $O(n)$ simple
permutations from $\Sigma$. The randomized implementation of $(xyz)$ is
974: obtained by uniformly choosing at random a valid permutation $\phi$ and
975: the two valid random vectors $v_4,v_5$.
976: 
977: We now prove that this randomized implementation satisfies the statement of the
978: theorem. 
Let $\Omega = \{(x, y, z, v_4, v_5, \phi)\}$ 
980: be the probability space obtained by uniformly
981: choosing three distinct vectors $x$, $y$, and $z$, and then uniformly
982: choosing a corresponding implementation, which is fixed by $v_4$, $v_5$,
983: and $\phi$.  Each point $\omega=(x,y,z,v_4,v_5,\phi)\in\Omega$
984: corresponds to an implementation $\sigma_1\cdots\sigma_L$ of the 
985: 3-cycle $(xyz)$.
986: The size of $\Omega$ is $\Theta(2^{5n}n!)$, and   
987: although not uniform, the probability
988: of each point in $\Omega$ is $O(1/2^{5n}n!)$.  Thus, our problem
989: of upper-bounding 
990: $\Pr[x\sigma_1\sigma_2 \cdots \sigma_{\ell-1} = \tilde{x},\,
991:      \sigma_\ell = \tilde{\sigma}]$
992: reduces to a counting problem.
993: 
994: For all implementations $\omega\in\Omega$, the indices of the $\ell$-th 
995: permutation $\sigma_\ell$ depend {\em only} on its position, $\ell$, and $\phi$. 
996: Moreover, as we change $\phi$ the indices of the $\ell$-th permutation of the 
997: implementation of $ (x,y,z,v_4,v_5,\phi)$ agree with the indices of some 
998: fixed permutation $\tilde{\sigma}$ only on a subset of $S_n$ that is of 
999: size $O(n!/n^3)$ and depends only on $\ell$ and $\tilde{\sigma}$.
1000: 
1001: To establish the theorem we need to prove that for any given $\phi$ the 
1002: number of choices of $x$, $y$, $z$, $v_4$, and $v_5$, such that 
1003: $x\sigma_1\sigma_2\cdots\sigma_{\ell-1} = \tilde{x}$, is $O(2^{4n})$, implying
1004: the number of points in $\Omega$ that agree with $\tilde{x}$
1005: and $\tilde{\sigma}$ is $O(2^{4n}n!/n^3) = O(|\Omega|/2^n n^3)$.  This is 
1006: accomplished by the following lemma:
1007: 
1008: \begin{lemma}\label{counting:lemma}
1009: Let $\phi \in S_n$ be fixed.
Then the set of all $x,y,z,v_4,v_5$ such that the implementation
1011: corresponding to $(x,y,z,v_4,v_5,\phi)$ satisfies the equality
1012: $x \sigma_1\sigma_2 \cdots \sigma_{\ell-1} = \tilde{x}$
1013: is of size $O(2^{4n})$.
1014: \end{lemma}
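
Combined with the two observations preceding the lemma, the count translates 
into the probability bound as follows (a short sketch of the arithmetic):
\[
\Pr[x\sigma_1\sigma_2 \cdots \sigma_{\ell-1} = \tilde{x},\, \sigma_\ell = \tilde{\sigma}]
\leq 
\overset{\mbox{choices for }x,y,z,v_4,v_5}{\overbrace{O(2^{4n})}}
\cdot
\overset{\mbox{choices for }\phi}{\overbrace{O(n!/n^3)}}
\cdot
\overset{\mbox{probability of a point}}{\overbrace{O(1/2^{5n}n!)}}
= O(1/2^n n^3).
\]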
1015: 
1016: \begin{proof}(of Lemma~\ref{counting:lemma})
1017: 
1018: Let $v_1 = x\phi$, $v_2 = y\phi$, $v_3 = z\phi$, and $\tilde{v}=\tilde{x}\phi$.
1019: Let $\Omega_{\tilde{v},\ell}$ be the set of tuples $(v_1,\ldots,v_5)$ for 
1020: which $x \sigma_1\sigma_2 \cdots \sigma_{\ell-1} = \tilde{x}$ is satisfied. 
1021: Note that this set is independent of $\phi$.
1022: Then the claim is that $|\Omega_{\tilde{v},\ell}|=O(2^{4n})$.
1023: 
1024: The proof is via case analysis with respect to position $\ell$.
1025: Without loss of generality we assume that the position is in the first
1026: half of the implementation, that which realizes permutation $(v_1,v_2)
1027: (v_4,v_5)$, otherwise, swapping $v_2$ and $v_3$ allows the same argument
1028: to be reused for the latter half of the implementation.  Furthermore,
1029: due to symmetry, we assume that the position of $\ell$ is in or to the
1030: left of block $\Core$.  There are four main cases: either $\ell$ is on
1031: a boundary between two blocks, $\ell$ is in block $\tau_i$, $\ell$ is in the
1032: block $\Core$, or $\ell$ is in block $\pi_i$.
1033: 
1034: \begin{figure}[ht]
1035: \large
1036: \[\underbrace{v_1 \stackrel{\pi_1}{\longrightarrow} 
1037:         \vp_1 c_1 \stackrel{\pi_2\pi_4\pi_5}{-\!\!\!\!-\!\!\!\!\longrightarrow}
1038:         \vp_1 c_1}_{\tvp \mathrm{\ fixes\ }\vp_1}
1039:                   \stackrel{\tau_1}{\longrightarrow}
1040:   \underbrace{\vp_5 c_1 \stackrel{\tau_2\tau_4}{\longrightarrow} 
1041:         \vp_5 c_1 \stackrel{\Core}{\longrightarrow} 
1042:         \vp_5 c_2 }_{\tvp \mathrm{\ fixes\ }\vp_5}
1043:         \cdots
1044: \]
1045: \caption{The evolution of $v_1$. \label{fig:thm9}}
1046: \end{figure}
1047: 
1048: In the first case, the position, $\ell$, is on a block boundary.
1049: Since each $\pi$-block only toggles bits $(\vpp)_1$ and $(\vpp)_2$,
1050: if position $\ell$ is adjacent to a $\pi$-block, then $\tvp = \vp_1$.
1051: Thus, all but two bits of $v_1$ are fixed by $\tilde{v}$.  If position,
1052: $\ell$, is on a boundary but is not adjacent to a $\pi$-block, then it
1053: must occur after block $\tau_1$.  Since block $\tau_1$ maps $\vp_100$
1054: to $\vp_500$, and none of the remaining blocks, $\tau_i$ or $\Core$,
1055: change the $\vp$ component to any other value, we have $\tvp = \vp_5$.
1056: Thus, all but two bits of $v_5$ are fixed by $\tilde{v}$, implying that
1057: $|\Omega_{\tilde{v},\ell}| = O(2^{4n})$.
1058: 
1059: In the second case, the position, $\ell$, is inside block $\tau_i$.  If $i
1060: \not= 1$, then none of the simple permutations in block $\tau_i$ flips a
1061: bit.  Therefore, the value of $\tvp = \vp_5$; thus fixing all but two bits
1062: of $v_5$, as before.  If $i = 1$, then at position $\ell$, we know exactly
1063: how many of the $n-2$ simple permutations have already been performed.
1064: Let $j$ be this number.  Hence we know that $\tvp = (\vp_5)_1, \ldots,
1065: (\vp_5)_j, (\vp_1)_{j+1}, \ldots, (\vp_1)_{n-2}$.  Therefore, $j$ bits of
$v_5$ and $n - 2 - j$ bits of $v_1$ are fixed by $\tilde{v}$,
1067: implying that $|\Omega_{\tilde{v},\ell}| = O(2^{4n})$ as well.
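
To make the count explicit: a tuple $(v_1,\ldots,v_5)$ consists of $5n$ bits, 
and in this case $\tilde{v}$ determines $j + (n-2-j) = n-2$ of them. 
The remaining bits are free, so
\[
|\Omega_{\tilde{v},\ell}| \leq 2^{5n-(n-2)} = 2^{4n+2} = O(2^{4n}).
\]
The same count applies whenever $\tilde{v}$ fixes all but two bits of a single 
vector $v_i$.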
1068: 
1069: In the third case, the position, $\ell$, is inside block $\Core$. In this
1070: case we must look at the sub-blocks of the block $\Core$.  If the position
1071: occurs on a sub-block boundary, and since each of the sub-blocks simply
1072: toggles the bits $(\vpp)_1$ and $(\vpp)_2$, the remaining bits of $\vp_5$
1073: are fixed by $\tvp$.  If the position $\ell$ is inside a sub-block, then
1074: things are only slightly more complicated.  Assume that position $\ell$
1075: is in a $\rho_\Top$ sub-block (similar arguments hold for $\rho_\Bot$).
1076: Then, $\rho_\Top$ toggles bit $(\vpp)_1$ if the first half of $\vp$
1077: matches the first half of $\vp_5$.  The bits being matched are never
1078: modified and the other half of the bits of $\vp$ are used as ``scratch''.
1079: We know that the first half of $\vp$ and $\vp_5$ coincide throughout the
1080: block $\rho_\Top$, and therefore $\tvp$ determines this half of $\vp_5$.
The operations on the ``scratch'' half depend only on the fixed half and
1082: the position, and therefore can be reversed,  reducing the problem to the
1083: position occurring at the beginning of $\rho_\Top$.  Thus, $\tilde{v}$
1084: fixes all but two of the bits of $v_5$.
1085: 
1086: In the last case, the position, $\ell$, is inside a $\pi$-block.  Block
1087: $\pi_i$ comprises two blocks that are similar to $\Core$.  Each of the
1088: two blocks is either the identity or toggles $(\vpp)_1$ or $(\vpp)_2$ if
$\vp = \vp_i$.  If $i = 1$ then the two blocks in block $\pi_i$ behave
1090: in the same manner as block $\Core$, except that $\tilde{v}$ fixes all
1091: but two of the bits of $v_1$ rather than $v_5$.  If $i \not = 1$, then,
1092: for the most part, the argument remains the same.  We need only consider
1093: what happens if the position, $\ell$, is in one of the eight sub-blocks.
1094: As mentioned before, half of the bits of $\vp$ are not modified by the
1095: sub-block, while the other half are used as ``scratch''.  Again, without
1096: loss of generality, we assume that the sub-block does not modify the
1097: first half of $\vp$.  As before, $\tvp$ fixes the first half of $\vp_1$.
1098: We enumerate on all choices for the first half of $\vp_i$.  This enables
1099: us to reverse the operations of the sub-block on the ``scratch'', fixing
1100: the second half of $\vp_1$---as in the third case.  This implies that
1101: $|\Omega_{\tilde{v},\ell}| = O(2^{4n})$, and completes the proof.
1102: 
1103: \end{proof}
1104: 
1105: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1106: \section{Odds and Ends}\label{more:section}
1107: 
1108: \begin{proof}(of Lemma~\ref{two-wise:lemma})\\
1109: We have to prove that for all $w\ge 1$ the mixing time of
1110: $G_{2,n}^{(w)}=\schreier(\S_w,X^{(2)})$ is $O(n \log n)$.
1111: 
1112: Given a $2$ by $n$ matrix with rows $s,t$, 
1113: we change basis to $s,u$ with $u=s \oplus t$. 
1114: Let $i\in[n]$ be a random coordinate, and consider the action of a width 
$w$ permutation XORing the $i$-th bit with a random function $h$ on $w$ 
1116: distinct coordinates from $[n]\setminus\{i\}$.
1117: We claim that its action on $s,u$ is the same as XORing the $i$-th bit of 
1118: $s$ and $u$ with two {\em independent} random bits 
1119: $\alpha_s$ and $\alpha_u$ respectively.
1120: The bits $\alpha_s,\alpha_u$ are one with probability
1121: $1/2$ and $p_\ell = 1 - \prod_{j=1}^w (1-\frac{\ell}{n-j})$ respectively,
1122: where $\ell$ is the number of ones in $u$ not counting the $i$-th bit.
1123: To see that this is indeed the resulting walk we observe the fact that
1124: if $s$ and $t$ differ on one of the input bits of the random function $h$, 
1125: then the value of the $i$-th coordinate of $s$ and of $t$ change
1126: independently with probability half. Otherwise they change simultaneously
1127: with probability $1/2$.
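
For orientation, the product in $p_\ell$ is the probability that none of the $w$ 
distinct input coordinates of $h$, drawn one at a time from $[n]\setminus\{i\}$, 
falls on one of the $\ell$ coordinates where $u$ is one. 
Two small cases, given here only as an illustration:
\[
w=1:\ \ p_\ell = \frac{\ell}{n-1},
\qquad\qquad
w=2:\ \ p_\ell = 1 - \frac{(n-1-\ell)(n-2-\ell)}{(n-1)(n-2)}.
\]
In particular $p_0 = 0$, and $p_\ell = 1$ whenever $\ell \geq n-w$.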
1128: 
1129: The $u$-component of this walk is a variant of the Aldous cube, and by
1130: the comment at the end of~\cite{ChGr97} it follows that this walk
1131: mixes in $O(n \log n)$ time. We are left to show that in this time the
1132: walk on both components mixes. The way to see it is to notice that in
1133: $O(n \log n)$ time the event $A$ where the indices $i$ assume all
1134: possible values in $1,2,\ldots,n$ (coupon collector) happens with high
1135: probability. Now since the bits $\alpha_s$ are independent of
1136: $\alpha_u$, we get that even when we condition over the walk on the
1137: $u$ component, the $s$ component achieves uniform distribution
1138: conditioned on $A$, which ends the proof.
1139: \end{proof}
1140: 
1141: \begin{lemma}\label{twoeps:lemma}
1142: Let $A$ be an event such that $\Pr[A] \geq 1-\eps$,
1143: and let $Z$ be a random variable over a domain $\Omega$ such
1144: that $d(Z|A,\mbox{uniform}) \leq \eps$.
1145: Then $d(Z,\mbox{uniform}) \leq 2\eps$.
1146: \end{lemma}
1147: \begin{proof}
1148: \[d(Z,\mbox{uniform})
1149: =
1150: \max_{S \subseteq \Omega} \Pr[Z \in S]-\frac{|S|}{|\Omega|}
1151: \leq
1152: \max_{S \subseteq \Omega} \Pr[Z \in S|A]+\Pr[\overline{A}]-\frac{|S|}{|\Omega|}
1153: \leq
1154: \eps + d(Z|A,\mbox{uniform}) \leq 2\eps.
1155: \]
1156: \end{proof}
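
The first inequality in the proof uses the law of total probability. 
Spelled out, for every $S \subseteq \Omega$,
\[
\Pr[Z \in S] 
= \Pr[Z \in S|A]\cdot\Pr[A] + \Pr[Z \in S|\overline{A}]\cdot\Pr[\overline{A}]
\leq \Pr[Z \in S|A] + \Pr[\overline{A}],
\]
and $\Pr[\overline{A}] \leq \eps$ by assumption.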
1157: 
1158: \begin{lemma}\label{cond:lemma}
Let $X$ and $A$ be events with $\Pr[A]>0$. 
1160: Then $\Pr[X|A] \leq \Pr[X]/\Pr[A]$.
1161: (Follows from the definition of conditional probability.)
1162: \end{lemma}
1163: 
1164: 
1165: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1166: \section{Some concluding remarks}\label{conclude:section}
1167: 
1168: Let us review what we currently know about the spectral gap of the Markov chain
1169: $P=P_\Sigma^{(k,n)}$. 
By Theorem~\ref{k2n3:theorem}, $\gap(P) = \Omega(1/n^2k)$. 
On the other hand, $\gap(P)$ is nonincreasing in $k$ by the
lifting argument from Section~\ref{k2n3proof:section}.
Since for $k=1$, $P$ is the standard random walk on the cube,
we have that $\gap(P) = O(1/n)$.
1175: 
In general, finding a generating set $S$ for which the spectral gap is large becomes 
1177: more difficult as $k$ increases, until the largest conceivable $k$, which is
1178: $2^n-2$. In this case, this is the random walk on the Cayley graph of the 
1179: alternating group $A_N$ for $N=2^n$ with the generating set $S$.
1180: It is open whether one can find a constant size set 
for which $A_N$ is an expander~\cite[Problem 10.3.4]{Lu94}.\footnote{The 
problem of finding a constant size expanding set for 
$A_N$ or $S_N$ is equivalent.}
1184: On the other hand, by Alon and Roichman~\cite{AlRo94}, a random set 
1185: of permutations of size $O(N \cdot \log N)$ will almost surely have a constant 
1186: spectral gap. 
1187: Although smaller expanding sets for $A_N$ are not known to exist,
1188: the general belief is that such sets exist;
1189: Rozenman, Shalev, and Wigderson assume the existence of an 
$N^{1/30}$ expanding set for $A_N$~\cite[Section 1.4]{RSW04}.
1191: 
1192: Our results suggest that width $2$ permutations may be used to
1193: construct an $O(\log^3 N)$ expanding set for $A_N$. 
1194: However, several obstacles stand in the way of achieving this goal.
1195: The first one is to prove that for width $2$ permutations the spectral gap
1196: does not deteriorate with $k$, as we believe, and is $\Omega(1/n)$ for all $k$.
1197: The second problem is to achieve a constant gap.  
1198: To this end, one has to overcome the inherent and obvious weakness of the 
1199: width $2$ simple permutations.
1200: Namely, that their action depends only on two 
1201: coordinates and changes only one.  
1202: This leads to poor expansion because there is only a small chance that the 
1203: action will flip a specific bit
1204: or increase the distance between two similar vectors.
1205: One approach to avoiding this problem is to replace the standard set
1206: of generators of the cube $e_1,\ldots,e_n$ with some expanding set
1207: of size $O(n)$. Such an expanding set for the cube can readily be
1208: constructed from the generating matrix of a good code~\cite{DeSo91},
1209: and could then be used to define
1210: an $O(n^3)$ expanding set of permutations. 
1211: 
1212: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1213: 
1214: \bibliography{bib}
1215: 
1216: \end{document}
1217: 
1218: