1: \documentclass[11pt,reqno]{amsart}
2: \usepackage{amsmath, amssymb, amsthm}
3: \usepackage{graphicx}
4:
5: \numberwithin{equation}{section}
6:
7: %\renewcommand{\baselinestretch}{1.5} %1.5 spacing
8: %\pagestyle{myheadings}
9: %\markright{}
10: %\numberwithin{equation}{section}
11:
12:
13: %\parindent=1em
14: %\baselineskip 15pt
15: \hsize=14cm \textwidth=14cm
16: %\hsize=12.3cm \textwidth=12.3cm
17: %\vsize=18.5cm \textheight=18.5cm
18:
19: \newtheorem{theorem}{Theorem}[section]
20: \newtheorem{definition}[theorem]{Definition}
21: \newtheorem{proposition}[theorem]{Proposition}
22: \newtheorem{corollary}[theorem]{Corollary}
23: \newtheorem{lemma}[theorem]{Lemma}
24: \newtheorem{conjecture}[theorem]{Conjecture}
25: \newtheorem{fact}[theorem]{Fact}
26: \newtheorem*{general Gromov'}{Corollary \ref{general Gromov}$'$}
27:
28: \def \proof {\noindent {\bf Proof.}\ \ }
29: \def \remark {\noindent {\bf Remark.}\ \ }
30: \def \remarks {\noindent {\bf Remarks.}\ \ }
31: \def \example {\vspace{0.5cm} \noindent {\bf Example.}\ \ }
32: \def \endproof {{\mbox{}\nolinebreak\hfill\rule{2mm}{2mm}\par\medbreak}}
33: \newcommand{\margin}[1]{\marginpar{\scriptsize #1}}
34:
35: \DeclareMathOperator*{\Ave}{Ave}
36:
37: \def \N {\mathbb{N}}
38: \def \R {\mathbb{R}}
39: \def \C {\mathbb{C}}
40: \def \Q {\mathbb{Q}}
41: \def \Z {\mathbb{Z}}
42: \def \E {\mathbb{E}}
43: \def \F {\mathbb{F}}
44: \def \G {\mathbb{G}}
45: \def \P {\mathbb{P}}
46: \def \T {\mathbb{T}}
47: \def \I {\mathbb{I}}
48: \def \one {{\bf 1}}
49: \def \EE {\mathcal{E}}
50: \def \NN {\mathcal{N}}
51: \def \CC {\mathcal{C}}
52: \def \MM {\mathcal{M}}
53: \def \OO {\mathcal{O}}
54: \def \PP {\mathcal{P}}
55: \def \SS {\mathcal{S}}
56: \def \QQ {\mathcal{Q}}
57: \def \a {\alpha}
58: \def \b {\beta}
59: \def \g {\gamma}
60: \def \e {\varepsilon}
61: \def \eps {\varepsilon}
62: \def \d {\delta}
63: \def \D {\Delta}
64: \def \f {\varphi}
65: \def \k {\kappa}
66: \def \l {\lambda}
67: \def \L {\Lambda}
68: \def \s {\sigma}
69: \def \t {\tau}
70: \def \om {\omega}
71: \def \w {\omega}
72: \def \W {\Omega}
73: \def \< {\langle}
74: \def \> {\rangle}
75: \def \absconv {{\rm abs.conv}}
76: \def \sign {{\rm sign}}
77: \def \dist {{\rm dist}}
78: \def \diam {{\rm diam}}
79: \def \Span {{\rm span}}
80: \def \rank {{\rm rank }}
81: \def \range {{\rm range }}
82: \def \trace {{\rm trace}}
83: \def \diag {{\rm diag}}
84: \def \conv {{\rm conv}}
85: \def \lin {{\rm lin}}
86: \def \aff {{\rm aff}}
87: \def \HS {{\rm HS}}
88: \def \Prob {{\rm Prob}}
89: \def \id {{\it id}}
90: \def \im {{\rm Im}}
91: \def \vol {{\rm vol}}
92: \def \Lip {{\rm Lip}}
93: \def \supp {{\rm supp}}
94: \def \bi {B\bigl(L_\infty(\Omega)\bigr)}
95: \def \Ball {{\rm Ball}}
96: \def \const {{\rm const}}
97:
98:
99:
100:
101:
102: \begin{document}
103: \title [Geometric approach to error correcting codes and signal recovery]
104: {Geometric approach to error correcting codes
105: and reconstruction of signals}
106:
107: \author{Mark Rudelson}
\address{Department of Mathematics, University of Missouri, Columbia, MO 65211, U.S.A.}
109: \email{rudelson@math.missouri.edu}
110:
111: \author{Roman Vershynin}
\address{Department of Mathematics, University of California, Davis, CA 95616, U.S.A.}
113: \email{vershynin@math.ucdavis.edu}
114:
115: \thanks{The first author is partially supported by the NSF grant DMS 0245380.
116: The second author is partially supported by the NSF grant DMS 0401032
117: and by the Miller Scholarship from the University of
118: Missouri-Columbia. }
119:
120: \subjclass[2000]{46B07, 94B75, 68P30, 52B05}
121:
122: \begin{abstract}
123: We develop an approach through geometric functional analysis
124: to error correcting codes and to reconstruction of signals
125: from few linear measurements. An error correcting code encodes
126: an $n$-letter word $x$ into an $m$-letter word $y$
127: in such a way that $x$ can be decoded correctly when any $r$ letters
128: of $y$ are corrupted. We prove that most linear orthogonal
transformations $Q : \R^n \to \R^m$ form efficient and robust
130: error correcting codes over reals. The decoder (which corrects the corrupted
131: components of $y$) is the metric projection onto the range of $Q$
132: in the $\ell_1$ norm. An equivalent problem arises in signal processing:
133: how to reconstruct a signal that belongs to a small class from few linear measurements?
134: We prove that for most sets of Gaussian measurements, all signals
of small support can be exactly reconstructed by $\ell_1$-norm
minimization. This is a substantial improvement of recent results of Donoho and
of Candes and Tao. An equivalent problem in combinatorial geometry
is the existence of a polytope with a fixed number of facets and the maximal
number of lower-dimensional facets.
140: We prove that most sections of the cube form such polytopes.
141: \end{abstract}
142:
143: \maketitle
144:
145:
146:
147: \section{Error correcting codes and transform coding}
148: %_______________________________________________________
149:
150: Error correcting codes are used in modern technology to protect
151: information from errors. Information is formed by finite words
152: over some alphabet $\F$.
153: An encoder transforms an $n$-letter word $x$ into an $m$-letter word $y$ with $m > n$.
154: The decoder must be able to recover $x$ correctly when up to $r$ letters of $y$
155: are corrupted in any way. Such an encoder-decoder pair is called an
156: {\em $(n,m,r)$-error correcting code}.
157:
Development of algorithmically efficient error correcting codes
has been attracting the attention of engineers, computer scientists
and applied mathematicians for the past five decades.
161: Known constructions involve deep algebraic and combinatorial methods,
162: see \cite{Handbook}, \cite{Sp1}, \cite{Sp2}.
163: This paper develops a new approach to error correcting codes
164: from the viewpoint of geometric functional analysis (asymptotic convex geometry).
165: Our main focus will be on words over the alphabet $\F = \R$ or $\C$. In applications,
166: these words may be formed of the coefficients of some signal (such as image or audio)
with respect to some basis or overcomplete system (Fourier, wavelet, etc.).
168: Finite alphabets will be discussed in Section \ref{s:conclusion}.
169:
170: The simplest and most natural way to encode a vector $x \in \R^n$ into
171: a vector $y \in \R^m$ is of course a linear transform
172: \begin{equation} \label{Q}
173: y = Qx
174: \end{equation}
175: where $Q$ is given by an $m \times n$ matrix. Elementary linear
176: algebra tells us that if $m \ge n + 2r$ and the range of $Q$ is
177: generic\footnote{that is, in general position with respect to all
178: subspaces $\R^I$, $|I| = r$} then $x$ can be recovered from $y$
179: even if $r$ coordinates of $y$ are corrupted. This gives an
180: $(n,m,r)$-error correcting code. However, the decoder for this
181: code has a huge computational complexity, as it involves a search
182: through all $r$-element subsets of the components of $y$. Then the
183: problem is:
184:
185: \medskip
186:
187: \begin{quote}
188: {\em How to reconstruct a vector $y$ in an $n$-dimensional subspace $Y$
189: of $\R^m$ from a vector $y' \in \R^m$
190: that differs from $y$ in at most $r$ coordinates?}
191: \end{quote}
192:
193: \medskip
194:
195: \noindent
196: What complicates this problem is the arbitrary magnitude of errors in each
197: corrupted component of $y'$, in contrast to what happens over finite alphabets
198: such as $\F = \{0,1\}$.
199:
200: A traditional and simple approach to denoising $y'$, used in applications such
201: as signal processing, is the mean least square (MLS) minimization. One hopes
202: that $y$ is well approximated by a solution to the minimization problem
203: \begin{equation*}
204: \min_{u \in Y} \|u - y'\|_2 \tag{MLS}
205: \end{equation*}
206: where $\|x\|_2^2 = \sum_i |x_i|^2$.
207: The solution to (MLS) is simply the orthogonal projection of $y'$ onto $Y$.
This of course cannot recover $y$ exactly, and even the approximation is typically
209: poor since we have no control of the magnitude of the errors in the
210: corrupted coordinates.
211: A promising alternative approach is the {\em Basis Pursuit} (BP).
We simply replace the $2$-norm by the $1$-norm and expect $y$ to be the {\em exact}
213: and unique solution to the minimization problem
214: \begin{equation*}
215: \min_{u \in Y} \|u - y'\|_1 \tag{BP}
216: \end{equation*}
217: where $\|x\|_1 = \sum_i |x_i|$.
218: Thus a solution to (BP) is the metric projection of $y'$ onto $Y$
with respect to the $1$-norm. (BP) can be cast as a Linear Programming problem,
220: and can be attacked with a variety of methods, such as the classical simplex method
221: or more recent interior point methods that yield polynomial time algorithms
222: \cite{CDS}.
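
\remark
For illustration, here is one standard way to write (BP) as a linear program.
This is only a sketch; the auxiliary vector $t \in \R^m$ and the parametrization
$u = Qx$ of $Y$ as the range of an $m \times n$ matrix $Q$, as in \eqref{Q},
are notational choices made here.
$$
\min_{x \in \R^n, \; t \in \R^m} \ \sum_{i=1}^m t_i
\ \ \ \text{subject to} \ \ \
-t_i \le (Qx)_i - y'_i \le t_i, \ \ i = 1, \ldots, m.
$$
At an optimum one has $t_i = |(Qx)_i - y'_i|$ for all $i$, so the optimal value
equals $\min_{u \in Y} \|u - y'\|_1$. The program has $n + m$ variables
and $2m$ linear inequality constraints.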
223:
224: \begin{center}
225: \raisebox{-1 true in}{\includegraphics[height=2in]{ecc1.eps}}
226: \end{center}
227:
228: The potential of Basis Pursuit for exact reconstruction
229: is illustrated by the following heuristics, essentially due to \cite{DET}.
230: The solution $u$ to (MLS) is the contact point where the smallest Euclidean ball
231: centered at $y'$ meets the subspace $Y$. That contact point is in general
232: different from $y$. The situation is much better in (BP): typically the solution
233: coincides with $y$. The solution $u$ to (BP) is the contact point
234: where the smallest octahedron centered at $y'$ (the ball with respect to the $1$-norm)
235: meets $Y$. Because the vector $y-y'$ lies in a low-dimensional coordinate subspace,
236: the octahedron has a wedge at $y$. Thus, many subspaces $Y$ through $y$
will miss the octahedron of radius $\|y-y'\|_1$ (as opposed to the Euclidean ball).
238: This forces the solution $u$ to (BP), which is the contact point of the octahedron,
239: to coincide with $y$.
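
\example
The following toy case is included only as an illustration of the heuristics;
the specific numbers are chosen here for convenience.
Let $m = 3$, $n = 1$, $r = 1$, and let $Y$ be spanned by $(1,1,1)$.
Take $y = (a,a,a)$ and $y' = (a, a, a + \delta)$ with $\delta \ne 0$.
The solution to (MLS) is the orthogonal projection of $y'$ onto $Y$,
$$
u = \Bigl( a + \frac{\delta}{3} \Bigr) (1,1,1) \ne y,
$$
and its error grows linearly with $|\delta|$.
For (BP) we minimize over $u = (t,t,t)$ the quantity
$$
\|u - y'\|_1 = 2|t - a| + |t - a - \delta|
\ge 2|t-a| + |\delta| - |t-a| = |t - a| + |\delta|,
$$
with equality throughout if and only if $t = a$.
Thus the unique solution to (BP) is $u = y$, whatever the size of $\delta$.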
240:
241: The idea of using the $1$-norm instead of the $2$-norm for better data recovery
has been explored since the mid-seventies in various applied areas, in particular
243: geophysics and statistics (early history can be found in \cite{T 04c}).
244: With the subsequent development of fast interior point methods in Linear Programming,
245: (BP) turned into an effectively solvable problem, and was put forward
246: more recently by Donoho and his collaborators, triggering
247: massive experimental and theoretical work \cite{CDS, DH, EB, FN, DE, GN,
248: T 04a, T 04b, T 04c, DET, D 04a, D 04b, DT 04a, DT 04b, CRT, CR, CT}.
249:
250: \medskip
251:
252: The main result of this paper validates the Basis Pursuit method
253: for most subspaces $Y$ under an asymptotically sharp condition on $m,n,r$.
254: We thus prove that {\em the Basis Pursuit yields exact reconstruction for most subspaces $Y$}
in the Grassmannian.
The randomness is with respect to the normalized Haar
measure on the Grassmannian $G_{m,n}$ of $n$-dimensional subspaces of $\R^m$.
258: Positive absolute constants will be denoted throughout the paper
259: by $C, c, C_1, \ldots$.
260:
261: \begin{theorem} \label{ecc}
262: Let $m$, $n$ and $r < cm$ be positive integers such that
263: \begin{equation} \label{mnr'}
264: m = n+ R, \ \ \ \text{where $R \ge C r \log(m/r)$}.
265: \end{equation}
266: Then a random $n$-dimensional subspace $Y$ in $\R^m$ satisfies
267: the following with probability at least $1 - e^{-c R}$.
Let $y \in Y$ be an unknown vector, and suppose we are given a vector $y'$ in $\R^m$
that differs from $y$ in at most $r$ coordinates.
270: Then $y$ can be exactly reconstructed from $y'$ as the solution
271: to the minimization problem (BP).
272: \end{theorem}
273:
274: In an equivalent form, this theorem is a substantial improvement of
275: recent results of Donoho \cite{D 04a} and of Candes and Tao \cite{CT},
276: see Theorem~\ref{reconstruction} below.
277:
278:
279: \subsection{Error correcting codes.} \label{ss:ecc}
280: Theorem \ref{ecc} implies a natural $(n,m,r)$-error correcting code over $\R$.
281: The encoder \eqref{Q} is given by an $m \times n$
random orthogonal matrix\footnote{one can view it as the first $n$ columns of
a random matrix from $O(m)$ equipped with the normalized Haar measure.} $Q$.
284: Its range $Y$ is a random $n$-dimensional subspace in $\R^m$.
285: The decoder takes a corrupted vector $y'$, solves (BP) and outputs
286: $Q^T u = Q^{-1} u$. Theorem \ref{ecc} states that under the assumption \eqref{mnr'},
this encoder-decoder pair is an $(n,m,r)$-error correcting code with
288: exponentially good probability $\ge 1 - e^{-c R}$.
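
\remark
To spell out the last step of the decoder (a routine verification, added for
convenience): the columns of $Q$ are orthonormal, so $Q^T Q$ is the identity
on $\R^n$, although $Q Q^T$ is not the identity on $\R^m$ when $m > n$.
If (BP) returns $u = y = Qx$, then
$$
Q^T u = Q^T Q x = x.
$$
Thus $Q^{-1}$ above should be read as the inverse of $Q$ on its range $Y$.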
289:
290: \subsection{Sharpness.}
291: The sufficient condition \eqref{mnr'} is sharp up to an absolute
292: constant $C$ (see Section \ref{s:conclusion}) and is only slightly
293: stronger than the necessary condition $m \ge n + 2r$. The ratio
294: $\e = r/m$ in \eqref{mnr'} is the number of errors per letter in
the noisy communication channel that maps $y$ to $y'$. Thus $\e$
should be considered as a quality of the channel, which is
independent of the message. In terms of $\e$, \eqref{mnr'} is equivalent
(up to the value of the absolute constant $C$) to
298: $$
299: m \ge \Bigl(1 + C \e \log \frac{1}{\e} \Bigr) n.
300: $$
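
\remark
A short computation behind this reformulation, given as a sketch with
constants not optimized. Write $\eta := C \e \log(1/\e)$, a shorthand
introduced here. Since $r = \e m$ and $R = m - n$, condition \eqref{mnr'} reads
$$
m - n \ge C \e m \log \frac{1}{\e} = \eta m,
\ \ \ \text{that is,} \ \ \
m \ge \frac{n}{1 - \eta}.
$$
If $\e$ is small enough that $\eta \le 1/2$, then
$$
1 + \eta \le \frac{1}{1-\eta} \le 1 + 2\eta.
$$
Hence \eqref{mnr'} implies $m \ge (1 + \eta) n$ and is implied by
$m \ge (1 + 2\eta) n$, so the two conditions agree up to replacing $C$ by $2C$.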
301:
302: \subsection{Robustness.}
A natural feature of our error correcting code is its {\em robustness}.
304: Simple linear algebra yields that
305: the solution to (BP) is stable with respect to the $1$-norm -- in the same way
306: as the solution to (MLS) is stable with respect to the $2$-norm, see \cite{CT}.
307: Such robustness allows in particular quantization of the messages.
308: This immediately yields error correcting codes for finite alphabets, see
309: Section \ref{s:conclusion}.
310:
311: \subsection {Transform coding.}
In signal processing, the linear codes \eqref{Q} are known
313: as {\em transform codes}. The general paradigm about transform codes is
314: that the redundancies in the coefficients of $y$
315: that come from the excess of the dimension $m > n$ should guarantee
316: a stability of the signal with respect to noise, quantization, erasures,
317: etc. This is confirmed by an extensive experimental and some theoretical
318: work, see e.g. \cite{Da,G1,G2,GVT,GKK,KDG,BO,CK}
319: and the bibliography contained therein.
320: Theorem \ref{ecc} states that {\em most orthogonal transform codes
321: are good error-correcting codes}.
322:
\subsection* {Acknowledgement.} This work started when the second
author was visiting the University of Missouri-Columbia as a Miller
Visiting Scholar. He is grateful to UMC for the hospitality.
326:
327:
328:
329: \section{Reconstruction of signals from linear measurements.}
330: %_____________________________________________________________
331:
332:
333: The heuristic idea that guides the Statistical Learning Theory is that
334: {\em a function $f$ from a small class should be determined by few linear measurements}.
335: Linear measurements are generally given by some linear functionals $X_k$
336: in the dual space, which are fixed (in particular are independent of $f$).
337: Most common measurements are point evaluation functionals; the
338: problem there is to interpolate $f$ between known values while keeping $f$
339: in the known (small) class.
340: When the evaluation points are chosen at random, this becomes the `proper learning'
341: problem of the Statistical Learning Theory (see \cite{M}).
342:
343: We shall however be interested in general linear measurements.
The proposal to learn $f$ from general linear measurements ({\em `sensing'})
originated recently from a criticism of the current methodology
of signal compression. Most real-life signals, such as images and sounds,
seem to belong to small classes. This is because they carry much unwanted information
348: that can be discarded with almost no perceptual loss, which makes such signals
349: easily compressible. Donoho \cite{D 04c} then questions the conventional scheme of
350: signal processing, where the whole signal must be first acquired (together
351: with lots of unwanted information) and only then be compressed
352: (throwing away the unwanted part).
353: Instead, can one {\em directly acquire} (`sense') the essential part of the signal,
354: via few linear measurements? Similar issues are raised in \cite{CT}.
355: We shall operate under the assumption that some technology
356: allows us to take linear measurements in certain fixed `directions' $X_k$.
357:
358: We will assume that our signal $f$ is discrete, so we view it as a vector in $\R^m$.
359: Suppose we can take linear measurements $\< f, X_k \> $ with some fixed
360: vectors $X_1, X_2, \ldots, X_{R}$ in $\R^m$.
361: Assuming that $f$ belongs to a small class,
362: how many measurements $R$ are needed to reconstruct $f$?
363: And even when we prove that $R$ measurements do determine $f$
364: (uniquely or approximately), the algorithmic issue remains unsettled:
365: how can one reconstruct $f$ from these measurements?
366:
The previous section suggests reconstructing $f$
368: as a solution to the Basis Pursuit minimization problem
369: \begin{equation*}
370: \min \|g\|_1
  \ \ \text{subject to} \ \
372: \< g, X_k \> = \< f, X_k \> , \ \ k = 1, \ldots, R. \tag{BP$'$}
373: \end{equation*}
374: For the Basis Pursuit to work, the vectors $X_k$ must be in a good position
375: with respect to all coordinate subspaces $\R^I$, $|I| \le r$.
376: A typical choice for such vectors would be the independent standard Gaussian
377: vectors\footnote{All the components of $X_k$ are independent
378: standard Gaussian random variables.} $X_k$.
379:
380: \subsection{Functions with small support}
381: In the class of functions with small support, one can hope for exact reconstruction.
Candes and Tao \cite{CT} have proved that every {\em fixed} function $f$ with
support $|\supp f| \le r$ can be recovered correctly by (BP$'$),
with polynomial probability $1 - m^{-\text{const}}$, from
$R = C r \log m$ Gaussian measurements.
However, the polynomial probability is clearly not sufficient
to deduce that there is {\em one} set of vectors $X_k$ that can be used to
reconstruct all functions $f$ of small support.
389:
390: The following equivalent form of Theorem \ref{ecc} does
391: yield a uniform exact reconstruction.
It provides us with {\em one set} of linear measurements from which we
393: can effectively reconstruct {\em every} signal of small support.
394:
395: \begin{theorem} [Uniform Exact Reconstruction] \label{reconstruction}
396: Let $m$, $r < cm$ and $R$ be positive integers satisfying
397: $R \ge C r \log(m/r)$.
398: The independent standard Gaussian vectors $X_k$ in $\R^m$
399: satisfy the following with probability at least $1 - e^{-c R}$.
Let $f \in \R^m$ be an unknown function of small support, $|\supp f| \le r$,
and suppose we are given $R$ measurements $\< f, X_k\> $.
402: Then $f$ can be exactly reconstructed from these measurements
403: as a solution to the Basis Pursuit problem (BP$'$).
404: \end{theorem}
405:
This theorem gives uniformity in the Candes-Tao result \cite{CT}, improves the polynomial
407: probability to an exponential probability, and improves upon the number $R$
408: of measurements (which was $R \ge C r \log m$ in \cite{CT}).
409: Donoho \cite{D 04c} proved a weaker form of Theorem \ref{reconstruction}
410: with $R/r$ bounded below by some function of $m/r$.
411:
412: \medskip
413:
414: \proof
415: Write $g = f - u$ for some $u \in \R^m$. Then (BP$'$) reads as
416: \begin{equation} \label{BP uf}
417: \min \|u - f\|_1
  \ \ \text{subject to} \ \
419: \< u, X_k \> = 0, \ \ k = 1, \ldots, R.
420: \end{equation}
421: The constraints here define a random $(n = m - R)$-dimensional subspace
422: $Y$ of $\R^m$. Now apply Theorem \ref{ecc} with $y = 0$ and $y' = f$. It states
423: that the unique solution to \eqref{BP uf} is $u = 0$. Therefore, the
424: unique solution to (BP$'$) is $f$.
425: \endproof
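
\remark
The proof uses that the subspace
$Y = \{ u \in \R^m : \< u, X_k \> = 0, \ k = 1, \ldots, R \}$
is distributed according to the Haar measure on $G_{m,n}$, as required in
Theorem \ref{ecc}. We sketch the standard reason for convenience.
For every orthogonal transformation $U \in O(m)$, the vectors
$U X_1, \ldots, U X_R$ have the same joint law as $X_1, \ldots, X_R$, and
$$
\{ u \in \R^m : \< u, U X_k \> = 0, \ k = 1, \ldots, R \} = U Y.
$$
Hence the law of $Y$ is invariant under rotations.
Since $X_1, \ldots, X_R$ are almost surely linearly independent when $R \le m$,
we have $\dim Y = m - R = n$ almost surely, and the Haar measure is the unique
rotation invariant probability measure on $G_{m,n}$.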
426:
427:
428: \subsection{Compressible functions}
429: In a larger class of compressible functions \cite{D 04c}, we can only hope for
430: an approximate reconstruction. This is a class of functions $f$ that are
431: well compressible by a known orthogonal transform, such as Fourier or wavelet.
432: This means that the coefficients of $f$ with respect to a certain known
433: orthogonal basis have a power decay. By applying an appropriate rotation,
434: we can assume that this basis is the canonical basis of $\R^m$, thus
435: $f$ satisfies
436: \begin{equation} \label{compressible}
437: f^*(s) \le s^{-1/p}, \ \ \ s = 1, \ldots, m
438: \end{equation}
439: where $f^*$ denotes a nonincreasing rearrangement of $f$.
440: Many natural signals are compressible for some $0 < p < 1$,
441: such as smooth signals and signals with bounded variations (see \cite{CT}),
442: in particular most photographic images.
443: Theorem \ref{reconstruction} implies, by the argument of \cite{CT},
444: that functions compressible in some basis can be approximately
445: reconstructed from few fixed linear measurements:
446:
447: \begin{corollary}[Uniform Approximate Reconstruction]
Let $m$ and $R$ be positive integers.
449: The independent standard Gaussian vectors $X_k$ in $\R^m$
450: satisfy the following with probability at least $1 - e^{-c R}$.
451: Assume that an unknown function $f \in \R^m$ satisfies either
452: \eqref{compressible} for some $0 < p < 1$ or $\|f\|_1 \le 1$ for $p=1$.
453: Suppose that we are given $R$ measurements $\< f, X_k\> $.
454: Then $f$ can be approximately reconstructed from these measurements:
455: a unique solution $g$ to the Basis Pursuit problem (BP$'$) satisfies
456: $$
457: \|f - g\|_2
458: \le C_p \Bigl( \frac{\log(m/R)}{R} \Bigr)^{\frac{1}{p} - \frac{1}{2}}
459: $$
460: where $C_p$ depends on $p$ only.
461: \end{corollary}
462:
This corollary also gives uniformity in another Candes-Tao result from \cite{CT}
464: (see also \cite{D 04b}); it improves the polynomial probability to an
465: exponential probability, and also improves upon the approximation error.
466:
467:
468: \section{Counting low-dimensional facets of polytopes.}
469: %_____________________________________________________________
470:
Theorem \ref{ecc} turns out to be equivalent to a problem of counting
472: lower-dimensional facets of polytopes. Let $B_1^m$ denote the unit ball
473: with respect to the $1$-norm; it is sometimes called the unit octahedron.
474: The polar body is the unit cube $B_\infty^m = [-1,1]^m$.
475: The conclusion of Theorem \ref{ecc} is then equivalent to the
476: following statement: the affine subspace $z + Y$ is tangent to the unit
477: octahedron at point $z$, where $z = y' - y$. This should happen
478: for all $z$ from the coordinate subspaces $\R^I$ with $|I| = r$.
479: By the duality, this means that the subspace $Y^\perp$ intersects all
480: $(m-r)$-dimensional facets of the unit cube. The section of the cube by
481: the subspace $Y^\perp$ forms an origin-symmetric polytope of dimension $R$
482: and with $2m$ facets.
483:
484: Our problem can thus be stated as a problem of counting lower-dimensional facets
485: of polytopes.
486: \begin{quote}
487: {\em Consider an $R$-dimensional origin symmetric polytope
488: with $2m$ facets. How many $(R-r)$-dimensional facets can it have?}
489: \end{quote}
490: Clearly\footnote{Any such facet is the intersection of some $r$ facets
491: of the polytope of full dimension $R-1$; there are $m$ facets to choose from,
492: each coming with its opposite by the symmetry.}, no more than
493: $2^r \binom{m}{r}$. Does there exist a polytope with that many facets?
494: Our ability to construct such a polytope
495: is equivalent to the existence of the efficient error
496: correcting code. Indeed, looking at the canonical realization of such a
497: polytope as a section of the unit cube by a subspace $Y^\perp$,
498: we see that $Y^\perp$ intersects all the $(m-r)$-dimensional facets
499: of the cube. Thus $Y$ satisfies the conclusion of Theorem~\ref{ecc}.
500: We can thus state Theorem \ref{ecc} in the following form:
501:
502: \begin{theorem}
There exists an $R$-dimensional symmetric polytope with $2m$ facets
504: and with the maximal number of $(R-r)$-dimensional facets
505: (which is $2^r \binom{m}{r}$), provided $R \ge C r \log(m/r)$.
506: A random section of the cube forms such a polytope with probability
507: $1 - e^{-cR}$.
508: \end{theorem}
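
\example
As a sanity check on the count $2^r \binom{m}{r}$ (an illustration only),
consider the trivial case $R = m$, where the section is the whole cube
$B_\infty^m$. A face of dimension $m - r$ is obtained by choosing $r$ coordinates
and fixing each of them at $+1$ or $-1$, which gives exactly
$2^r \binom{m}{r}$ faces. For $m = 3$ this yields
$$
2 \binom{3}{1} = 6 \ \text{two-dimensional facets}, \ \ \
4 \binom{3}{2} = 12 \ \text{edges}, \ \ \
8 \binom{3}{3} = 8 \ \text{vertices}.
$$
The content of the theorem is that the same maximal count is attained by
sections of dimension $R$ much smaller than $m$.
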
509:
510: So, how can we prove that a random subspace $Y^\perp$ indeed intersects all the
511: $(m-r)$-dimensional facets of the cube? It is enough to show that
512: $Y^\perp$ intersects one such fixed facet with exponential probability
513: (bigger than $1 - 2^{-r} \binom{m}{r}^{-1}$).
514: The main difficulty here is that the concentration of measure technique
cannot be readily applied. This is because the $\infty$-norm defined
516: by the unit cube (more precisely, by its facet) has a bad Lipschitz constant.
517: To improve the Lipschitzness, we first project the facet onto a random
518: subspace (within its affine span); the random subspace parallel to which we
519: project is taken from the random directions that form $Y^\perp$.
520: This creates a big Euclidean ball inside the projected facet;
521: here we shall use the full strength of the estimate
522: of Garnaev and Gluskin \cite{GG} on Euclidean projections of a cube.
523: The existence of the Euclidean ball inside a body creates the needed
Lipschitzness, so we can now use the concentration of measure technique.
525:
526: \medskip
527:
528: The rest of the paper is organized as follows.
529: In Section \ref{s:proof} we prove Theorem \ref{ecc}.
530: In Section \ref{s:conclusion} we discuss some optimality and
531: robustness of the Basis Pursuit with applications to error correcting
532: codes over finite alphabets.
533:
534:
535:
536:
537:
538: \section{Proof} \label{s:proof}
539: %______________________________________________________________________________
540:
541: We shall use the following standard notations throughout the proof.
542: The $p$-norm ($1 \le p < \infty$) on $\R^m$ is defined by
543: $\|x\|_p^p = \sum_i |x_i|^p$, and for $p = \infty$ it is
544: $\|x\|_\infty = \max_i |x_i|$. The unit ball with respect to the
$p$-norm on $\R^m$ is denoted by $B_p^m$. When the $p$-norm is considered
546: on a coordinate subspace $\R^I$, $I \subset \{1,\ldots,m\}$,
547: the corresponding unit ball is denoted by $B_p^I$.
548:
549: The unit Euclidean sphere in a subspace $E$ is denoted by $S(E)$.
The normalized rotationally invariant Lebesgue measure on $S(E)$ is denoted
by $\sigma_E$.
The orthogonal projection onto a subspace $E$ is denoted by $P_E$.
The standard Gaussian measure on $E$ (with the identity covariance matrix)
is denoted by $\gamma_E$. When $E = \R^d$, we write $\sigma_{d-1}$ for
$\sigma_E$ and $\gamma_d$ for $\gamma_E$.
556:
557:
558:
559: \subsection{Duality}
560: We begin the proof of Theorem \ref{ecc} with a typical duality argument,
561: leading to the same reformulation of the problem as in \cite{CT}.
562: We claim that the conclusion of Theorem \ref{ecc} follows from
563: (and is actually equivalent to) the following separation condition:
\begin{equation} \label{separation}
  (z + Y) \cap \;\text{interior}\, (B_1^m) = \emptyset
  \ \ \ \text{for all} \ \ z \in \bigcup_{|I| = r} \;\text{boundary}\, B_1^I.
\end{equation}
Indeed, suppose \eqref{separation} holds. We apply it for
$$
z := \frac{y-y'}{\|y-y'\|_1}
$$
noting that $z \in \bigcup_{|I| = r} \;\text{boundary}\, B_1^I$ holds, because
$\|z\|_1 = 1$ and $y$ and $y'$ differ in at most $r$ coordinates.
574: By \eqref{separation},
575: $$
z + v \notin \;\text{interior}\, (B_1^m)
577: \ \ \ \text{for all $v \in Y$}
578: $$
579: which implies
580: $$
581: \|z + v\|_1 \ge 1
582: \ \ \ \text{for all $v \in Y$}.
583: $$
Let $u \in Y$ be arbitrary. Using the inequality above for
$v := \frac{u-y}{\|y-y'\|_1}$, we conclude that
$$
\|u-y'\|_1 \ge \|y-y'\|_1
\ \ \ \text{for all $u \in Y$}.
$$
590: This proves that $y$ is indeed a solution to (BP).
The solution to (BP) is unique with probability $1$ in the Grassmannian.
592: This follows from a direct dimension argument, see e.g. \cite{CT}.
593:
By the Hahn-Banach theorem, the separation condition \eqref{separation}
595: is equivalent to the following:
596: for every $z \in \bigcup_{|I| = r} \;\text{boundary}\, B_1^I$
597: there exists $w = w(z) \in Y^\perp$ such that
598: $$
599: \< w,z \> = \sup_{x \in B_1^m} \< w,x \> = \|w\|_\infty.
600: $$
After rescaling $w$ so that $\|w\|_\infty = 1$, this holds if and only if the components of $w$ satisfy
602: \begin{equation} \label{w}
603: \begin{cases}
604: w_j = \sign(z_j) \ \ \text{for $j \in I$}, \\
605: |w_j| \le 1 \ \ \text{for $j \in I^c$}.
606: \end{cases}
607: \end{equation}
The set of vectors $w$ in $\R^m$ that satisfy \eqref{w} forms an
$(m-r)$-dimensional facet of the unit cube $B_\infty^m$.
610: Then with $E := Y^\perp$ we can say that the conclusion
611: of Theorem \ref{ecc} is equivalent to the following:
612:
613: \medskip
614:
615: \begin{quote}
616: {\em A random $R$-dimensional subspace $E$ in $\R^m$ intersects
617: all the $(m-r)$-dimensional facets of the unit cube
618: with probability at least $1 - e^{-cR}$.}
619: \end{quote}
620:
621: \medskip
622:
623: It will be enough to show that $E$ intersects {\em one fixed}
624: facet with the probability $1 - e^{-cR}$. Indeed, since the total
625: number of the facets is $N = 2^r \binom{m}{r}$, the probability
626: that $E$ misses some facet would be at most $N e^{-cR} \le e^{-c_1 R}$
627: with an appropriate choice of the absolute constant in \eqref{mnr'}.
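
\remark
For completeness, here is one way to carry out this count; the constants are
not optimized, and we write $c_0$ for the constant in the assumption $r < c_0 m$
to distinguish it from the constant $c$ in the exponent.
Using the standard bound $\binom{m}{r} \le (em/r)^r$,
$$
\log N \le r \log \frac{2em}{r} = r \log \frac{m}{r} + r \log(2e).
$$
If $c_0 \le 1/(2e)$, then $\log(m/r) \ge \log(2e)$, hence by \eqref{mnr'}
$$
\log N \le 2 r \log \frac{m}{r} \le \frac{2}{C} R.
$$
Therefore $N e^{-cR} \le e^{-(c - 2/C) R}$, and one can take $c_1 = c/2$
as soon as $C \ge 4/c$.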
628:
629:
630: \subsection{Realizing a random subspace}
631: We are to show that a random $R$-dimensional subspace $E$ intersects one fixed
632: $(m-r)$-dimensional facet of the unit cube $B_\infty^m$ with high probability.
633: Without loss of generality, we can assume that our facet is
634: $$
635: F = \{ (w_1, \ldots, w_{m-r}, 1, \ldots, 1), \ \ \text{all $|w_j| \le 1$} \},
636: $$
637: whose center is
638: $$
639: \theta = (\underbrace{0,\ldots,0}_{m-r}, 1,\ldots,1).
640: $$
641: The probability we are interested in is
642: $$
643: P := \Prob\{ E \cap F \ne \emptyset\}.
644: $$
645: We shall restrict our attention to the linear span of $F$,
646: $$
647: \lin(F) = \{ (w_1, \ldots, w_{m-r}, t, \ldots, t),
648: \ \ \text{all $w_j \in \R$, $t \in \R$} \},
649: $$
and even to the affine span of $F$,
651: $$
652: \aff(F) = \{ (w_1, \ldots, w_{m-r}, 1, \ldots, 1),
653: \ \ \text{all $w_j \in \R$} \}.
654: $$
655: Only the random affine subspace $E \cap \aff(F)$ matters for us, because
656: $$
657: P = \Prob\Bigl\{ (E \cap \aff(F)) \cap F \ne \emptyset \Bigr\}.
658: $$
659: The dimension of that affine subspace is almost surely
660: $$
661: l := \dim (E \cap \aff(F)) = R-r.
662: $$
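
\remark
The dimension count, spelled out as a routine verification.
The subspace $\lin(F)$ has dimension $m - r + 1$, and a random $R$-dimensional
subspace $E$ is almost surely in general position with respect to it, so
$$
\dim (E \cap \lin(F)) = R + (m - r + 1) - m = R - r + 1.
$$
The affine subspace $E \cap \aff(F)$ is the slice of $E \cap \lin(F)$ by the
hyperplane $\{t = 1\}$ in the notation above. Almost surely $E \cap \lin(F)$
is not contained in $\{t = 0\}$, so the slice lowers the dimension by one,
which gives $l = R - r$. In particular $m - r - l = m - R$.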
663:
664: We can realize the random affine subspace $E \cap \aff(F)$
665: (or rather a random subspace with the same law) by the following
666: algorithm:
667:
668: \begin{enumerate}
669:
670: \item Select a random variable $D$ with the same law as
671: $\dist(\theta, E \cap \aff(F))$.
672:
\item Select a random subspace $L_0$ in the Grassmannian $G_{m-r,l}$.
674: It will realize the ``direction'' of $E \cap \aff(F)$ in $\aff(F)$.
675:
676: \item Select a random point $z$ on the Euclidean sphere $D \cdot S(L_0^\perp)$
677: of radius $D$, according to the uniform distribution on the sphere.
678: Here $L_0^\perp$ is the orthogonal complement of $L_0$ in $\R^{m-r}$.
679: The vector $z$ will realize the distance from the affine subspace
680: $E \cap \aff(F)$ to the center $\theta$ of $F$.
681:
682: \item Set $L = \theta + z + L_0$. Thus the random affine subspace $L$
683: has the same law as $E \cap \aff(F)$.
684:
685: \end{enumerate}
686:
687: \begin{center}
688: \raisebox{-1 true in}{\includegraphics[height=2in]{ecc2.eps}}
689: \end{center}
690:
691: \noindent Hence
692: $$
693: P = \Prob \{ L \cap F \ne \emptyset \}
694: = \Prob \{ (z + L_0) \cap B_\infty^{m-r} \ne \emptyset \}
695: = \Prob \{ z \in P_{L_0^\perp} B_\infty^{m-r} \}.
696: $$
The subspace $H := L_0^\perp$ is a random subspace in $G_{m-r,m-r-l} = G_{m-r,m-R}$.
698: By the rotational invariance of $z \in D \cdot S(H)$,
699: \begin{equation} \label{P=integral}
700: P = \int_{\R^+} \int_{G_{m-r,m-R}} \sigma_H (D^{-1} P_H B_\infty^{m-r})
701: \; d\nu(H) \; d\mu(D)
702: \end{equation}
703: where $\nu$ is the normalized Haar measure on $G_{m-r,m-R}$
704: and $\mu$ is the law of $D$.
705: We shall bound $P$ in two steps:
706:
707: \begin{enumerate}
708:
709: \item Prove that the distance $D$ is small with high probability;
710:
711: \item Prove that a suitable multiple of the random projection
712: $P_H B_\infty^{m-r}$ has an almost full Gaussian
713: (thus also spherical) measure.
714:
715: \end{enumerate}
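
\remark The identities leading to \eqref{P=integral} can be unpacked as follows.
This is only a sketch in the notation above, and it assumes, as formula
\eqref{P=integral} implicitly does, that $D$ and $H$ are independent.
Identify $\aff(F)$ with $\R^{m-r}$ by dropping the last $r$ coordinates, so that
$\theta$ becomes the origin and $F$ becomes $B_\infty^{m-r}$.
If $z + v \in B_\infty^{m-r}$ for some $v \in L_0$, then
$z = P_H (z+v) \in P_H B_\infty^{m-r}$ because $z \in H = L_0^\perp$.
Conversely, if $z = P_H b$ with $b \in B_\infty^{m-r}$, then
$b = z + (b - P_H b) \in z + L_0$.
Conditionally on $D$ and $H$, the point $z/D$ is uniformly distributed
on the unit sphere $S(H)$, hence
$$
\Prob \{ z \in P_H B_\infty^{m-r} \mid D, H \}
= \Prob \{ z/D \in D^{-1} P_H B_\infty^{m-r} \mid D, H \}
= \sigma_H ( D^{-1} P_H B_\infty^{m-r} ).
$$
Integrating over $D$ and $H$ gives \eqref{P=integral}.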
716:
717:
718: \subsection{The distance $D$ from the center of the facet to a random subspace}
719: We shall first relate $D$, the distance to the affine subspace $E \cap \aff(F)$,
720: to the distance to the linear subspace $E \cap \lin(F)$.
721: Equivalently, we compute the length of the projection onto $E \cap \lin(F)$.
722:
723: \begin{lemma} \label{linear vs affine}
724: $$
725: \|P_{E \cap \lin(F)} \theta \|_2 = \sqrt{\frac{r}{r+D^2}} \;
726: \|\theta\|_2.
727: $$
728: \end{lemma}
729:
730: \proof
731: Let $f$ be the multiple of the vector $P_{E \cap \lin(F)} \theta$ such that
732: $f-\theta$ is orthogonal to $\theta$. Such a multiple exists and is unique,
733: as this is a two-dimensional problem.
734:
735: \begin{center}
736: \raisebox{-1 true in}{\includegraphics[height=1.25in]{ecc3.eps}}
737: \end{center}
738:
739: Then $f \in E \cap \aff(F)$. Notice that $D= \|f-\theta\|_2$. By
740: the similarity of the triangles with the vertices $(0, \theta,
741: P_{E \cap \lin(F)} \theta)$ and $(0, f, \theta)$, we conclude that
742: $$
743: \|P_{E \cap \lin(F)} \theta \|_2 = \frac{r}{\sqrt{r+D^2}} =
744: \sqrt{\frac{r}{r+D^2}} \; \|\theta\|_2
745: $$
746: because $\|\theta\|_2 = \sqrt{r}$.
747: This completes the proof.
748: \endproof
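
\remark The similarity argument can also be checked by a direct computation.
The following is a sketch in the notation of the proof, assuming
$P_{E \cap \lin(F)} \theta \ne 0$.
Since $f - \theta$ is orthogonal to $\theta$ and $\|f-\theta\|_2 = D$,
$$
\|f\|_2^2 = \|\theta\|_2^2 + D^2 = r + D^2,
\qquad
\langle \theta, f \rangle = \|\theta\|_2^2 = r.
$$
As $f$ is a multiple of $P_{E \cap \lin(F)} \theta$, the vector
$P_{E \cap \lin(F)} \theta$ is the orthogonal projection of $\theta$
onto the line spanned by $f$, so
$$
\|P_{E \cap \lin(F)} \theta \|_2
= \frac{|\langle \theta, f \rangle|}{\|f\|_2}
= \frac{r}{\sqrt{r+D^2}}.
$$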
749:
750: \medskip
751:
752: The length of the projection of a fixed vector onto a random subspace in
753: Lemma~\ref{linear vs affine} is well known. The asymptotically sharp
754: estimate was computed by S.~Artstein \cite{A}, but we will be satisfied
755: with a much weaker elementary estimate, see e.g. \cite{Ma} 15.2.2.
756:
757: \begin{lemma} \label{l: random projection}
Let $\theta \in \R^{d}$ and let $G$ be a random subspace in $G_{d,k}$.
759: Then
760: $$
761: \Prob \Bigl\{ c \sqrt{\frac{k}{d}} \; \|\theta\|_2
762: \le \|P_G \theta\|_2
763: \le C \sqrt{\frac{k}{d}} \; \|\theta\|_2
764: \Bigr\}
765: \ge 1 - 2 e^{-ck}.
766: $$
767: \end{lemma}
768:
769: We apply this lemma for $G = E \cap \lin(F)$, which is a random subspace
in the Grassmannian of $(l+1)$-dimensional subspaces of $\lin(F)$.
771: Since $\dim \lin(F) = m-r+1$, we have
772: $$
773: \Prob \Bigl\{ \|P_{E \cap \lin(F)} \theta\|_2
774: \ge c \sqrt{\frac{l+1}{m-r+1}} \; \|\theta\|_2
775: \Bigr\}
776: \ge 1 - 2 e^{-cl}.
777: $$
778: Together with Lemma \ref{linear vs affine} this gives
779: \begin{equation} \label{D small}
780: \Prob \Bigl\{ D \le c \sqrt{m-r} \sqrt{\frac{r}{l}} \Bigr\}
781: \ge 1 - 2e^{-cl}.
782: \end{equation}
783: Note that $\sqrt{m-r}$ is the radius of the Euclidean ball circumscribed
784: on the facet $F$. The statement $D \le \sqrt{m-r}$ would only tell us
785: that the random subspace $E$ intersects the circumscribed ball, not yet the
786: facet itself. The ratio $r/l$ in \eqref{D small} will be chosen logarithmically
small, which will force $E$ to intersect the facet $F$ as well.
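
\remark For completeness, here is a sketch of how \eqref{D small} follows;
the constant is tracked only up to an absolute factor.
By Lemma \ref{linear vs affine}, the event in the probability estimate
preceding \eqref{D small} reads
$$
\frac{r}{r+D^2} \ge c^2 \, \frac{l+1}{m-r+1}.
$$
Solving for $D$ and using $\frac{m-r+1}{l+1} \le \frac{m-r}{l}$,
which is valid because $l \le m-r$, we get
$$
D^2 \le \frac{r}{c^2} \cdot \frac{m-r+1}{l+1} - r
\le \frac{1}{c^2} \, (m-r) \, \frac{r}{l}.
$$
This is \eqref{D small} with the absolute constant $1/c$ in place of $c$.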
788:
789:
790:
791: \subsection{Gaussian measure of random projections of the cube}
792: By \eqref{P=integral} and \eqref{D small},
793: $$
794: P \ge \int_{G_{m-r,m-R}}
795: \sigma_H \Bigl( \frac{c}{\sqrt{m-r}} \sqrt{\frac{l}{r}} \,
796: P_H B_\infty^{m-r} \Bigr)
797: \; d\nu(H) -2 e^{-cl}.
798: $$
799: We can replace the spherical measure $\sigma_H$ by the
800: Gaussian measure $\g_H$ via a simple lemma:
801:
802:
803: \begin{lemma} \label{spherical vs Gaussian}
804: Let $K$ be a star-shaped set in $\R^d$. Then
805: $$
806: \g_d(c \sqrt{d} \cdot K) - e^{-d}
807: \le \sigma_{d-1}(K)
808: \le \g_d(C \sqrt{d} \cdot K)\cdot (1+ e^{-d}).
809: $$
810: \end{lemma}
811:
812: \proof Passing to polar coordinates, by the rotational invariance
813: of the Gaussian measure we see that there exists a probability
814: measure $\mu$ on $\R^+$ so that the Gaussian measure of every set
815: $A$ can be computed as $\int_{\R^+} \s^t(A) \; d\mu(t)$, where
816: $\s^t$ denotes the normalized Lebesgue measure on the Euclidean
817: sphere of radius $t$ in $\R^d$. Since $K$ is star-shaped,
818: $\s^t(K)$ is a non-increasing function of $t$. Hence
819: \begin{align*}
820: \gamma_d(K)
821: & \ge \int_0^{C \sqrt{d}} \s^t(K) \, d\mu(t)
822: \ge \s^{C \sqrt{d}}(K) \cdot
823: \gamma_d( C \sqrt{d} B_2^d)
824: \intertext{and}
825: \gamma_d(K)
826: & \le \int_0^{c \sqrt{d}} d\mu(t)
827: + \s^{c \sqrt{d}}(K) \int_{c \sqrt{d}}^\infty d\mu(t)
828: \le \gamma_d(c \sqrt{d} \cdot B_2^d) + \s^{c \sqrt{d}}(K).
829: \end{align*}
The classical large deviation inequalities imply $\gamma_d(c
\sqrt{d} \cdot B_2^d) \le e^{-d}$ and $\gamma_d( C \sqrt{d} B_2^d)
\ge 1- e^{-d}/2$. Applying the second inequality above to the set
$c \sqrt{d} \cdot K$ and the first one to the set $C \sqrt{d} \cdot K$,
we conclude that $\g_d(c \sqrt{d} \cdot K) \le e^{-d} +
\sigma_{d-1}(K)$ and $\g_d(C \sqrt{d} \cdot K) \ge \sigma_{d-1}(K)
\cdot (1-e^{-d}/2)$.
836: \endproof
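
\remark Two elementary facts are used silently at the end of the proof;
they are spelled out here as a sketch.
First, by homogeneity, a dilate $tK$ of a star-shaped set is star-shaped and
$$
\s^{t}(tK) = \sigma_{d-1}(K) \qquad \text{for every $t > 0$},
$$
which is how the spherical measure on the unit sphere enters.
Second, for $0 \le x \le 1$ we have $(1 - x/2)(1+x) = 1 + x(1-x)/2 \ge 1$,
so with $x = e^{-d}$,
$$
\sigma_{d-1}(K)
\le \frac{\g_d(C \sqrt{d} \cdot K)}{1 - e^{-d}/2}
\le \g_d(C \sqrt{d} \cdot K) \cdot (1 + e^{-d}).
$$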
837:
838: \medskip
839:
840:
841: Using Lemma \ref{spherical vs Gaussian}
842: in the space $H$ of dimension $d = m-R$, we obtain
843: $$
844: P \ge \int_{G_{m-r,m-R}}
845: \gamma_H \Bigl( c \sqrt{\frac{m-R}{m-r}} \sqrt{\frac{l}{r}} \,
846: P_H B_\infty^{m-r} \Bigr)
\; d\nu(H) -2 e^{-cl} -e^{-(m-R)}.
848: $$
849: By choosing the absolute constant $c$ in the assumption $r < cm$
850: appropriately small, we can assume that $2r < R < m/2$.
851: Thus
852: \begin{equation} \label{P}
853: P \ge \int_{G_{m-r,m-R}}
854: \gamma_H \Bigl( c \sqrt{\frac{R}{r}} \,
855: P_H B_\infty^{m-r} \Bigr)
856: \; d\nu(H) -2 e^{-cR}.
857: \end{equation}
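
\remark A sketch of the bookkeeping behind \eqref{P}, with absolute constants
allowed to change from line to line.
Under $2r < R < m/2$ and $l = R-r$,
$$
\frac{m-R}{m-r} \ge \frac{m-R}{m} > \frac{1}{2},
\qquad
\frac{l}{r} = \frac{R-r}{r} > \frac{R}{2r},
$$
so the dilation factor is at least $c \sqrt{R/r}$; shrinking the dilation factor
can only decrease the Gaussian measure because $P_H B_\infty^{m-r}$ is star-shaped.
For the error terms, $l > R/2$ and $m - R > m/2 > R$ give
$$
2 e^{-cl} + e^{-(m-R)} \le 2 e^{-cR/2} + e^{-R}.
$$
Both terms decay exponentially in $R$, which is what \eqref{P} records,
up to the values of the absolute constants.
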
858: We now compute the Gaussian measure of random projections of the cube.
859:
860: \begin{proposition} \label{proj of cube}
861: Let $H$ be a random subspace in $G_{n,n-k}$, $k < n/2$.
862: Then the inequality
863: $$
864: \gamma_H \Bigl( C \sqrt{\log \frac{n}{k}} \,
865: P_H B_\infty^n \Bigr)
866: \ge 1 - e^{-ck}
867: $$
holds with probability at least $1 - e^{-ck}$ in the Grassmannian.
869: \end{proposition}
870:
871: The proof of this estimate will follow from the concentration of Gaussian measure,
872: combined with the existence of a big Euclidean ball inside a random projection
873: of the cube.
874:
875: \begin{lemma}[Concentration of Gaussian measure] \label{concentration}
876: Let $A$ be a measurable set in $\R^n$. Then for $\e > 0$,
877: $$
878: \gamma_n(A) \ge e^{-\e^2 n}
879: \ \ \ \text{implies} \ \ \
880: \gamma_n(A + C \e \sqrt{n} B_2^n ) \ge 1 - e^{-\e^2 n}.
881: $$
882: \end{lemma}
883:
884: With the stronger assumption $\gamma(A) \ge 1/2$, this lemma is the classical
885: concentration inequality, see \cite{L} 1.1. The fact that the concentration
886: holds also for exponentially small sets follows formally by a simple extension
887: argument that was first noticed by D.~Amir and V.~Milman in \cite{AM},
888: see \cite{L} Lemma 1.1.
889:
890: The optimal result on random projections of the cube
891: is due to Garnaev and Gluskin \cite{GG}.
892:
893: \begin{theorem}[Euclidean projections of the cube \cite{GG}] \label{GG lemma}
894: Let $H$ be a random subspace in $G_{n,n-k}$, where $k = \a n < n/2$.
Then with probability at least $1 - e^{-ck}$ in the Grassmannian, we have
896: $$
897: c(\a) \, P_H(\sqrt{n} B_2^n)
898: \subseteq P_H(B_\infty^n) \subseteq
899: P_H(\sqrt{n} B_2^n)
900: $$
901: where
902: $$
903: c(\a) = c \sqrt{\frac{\a}{\log(1/\a)}}.
904: $$
905: \end{theorem}
906:
907:
908: \medskip
909:
910: \noindent {\bf Proof of Proposition \ref{proj of cube}. }
911: Let $g_1, g_2, \ldots$ be independent standard Gaussian random variables.
Then for a suitable positive absolute constant $C$ and for every $0 < \e < 1/2$,
913: $$
914: \gamma_n \Bigl( C \sqrt{\log \frac{1}{\e}} \, B_\infty^n \Bigr)
= \Prob \Bigl\{ \max_{1 \le j \le n} |g_j| \le C \sqrt{\log \frac{1}{\e}} \Bigr\}
916: \ge (1 - \e^2/10)^n \ge e^{-\e^2 n}.
917: $$
918: Since for every measurable set $A$ and every subspace $H$ one has
$\gamma_H(P_H A) \ge \gamma_n(A)$, we conclude that
920: $$
921: \gamma_H \Bigl( C \sqrt{\log \frac{1}{\e}} \, P_H B_\infty^n \Bigr)
922: \ge e^{-\e^2 n}
923: \ \ \ \text{for $0 < \e < 1/2$.}
924: $$
925: Then by Lemma \ref{concentration},
926: \begin{equation} \label{cube+ball}
927: \gamma_H \Bigl( C \sqrt{\log \frac{1}{\e}} \, P_H B_\infty^n
928: + C \e \sqrt{n} \, P_H B_2^n \Bigr)
929: \ge 1 - e^{-\e^2 n}
930: \ \ \ \text{for $0 < \e < 1/2$.}
931: \end{equation}
932: Theorem \ref{GG lemma} tells us that for a random subspace $H$,
933: if $\e = c \sqrt{\a} = c \sqrt{k/n}$,
then the Euclidean ball is absorbed by the projection of the cube
935: in \eqref{cube+ball}:
936: $$
937: \e \sqrt{n} \, P_H B_2^n \subset C \sqrt{\log \frac{1}{\e}} \, P_H B_\infty^n.
938: $$
939: Hence for a random subspace $H$ and for $\e$ as above we have
940: $$
941: \gamma_H \Bigl( C \sqrt{\log \frac{1}{\e}} \, P_H B_\infty^n \Bigr)
942: \ge 1 - e^{-\e^2 n},
943: $$
944: which completes the proof.
945: \endproof
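
\remark Two computations in the proof above are left implicit.
The following sketch fills them in; the value $C = 4$ and the name $c_0$
are illustrative choices made here, not optimized and not taken from the text.
For a standard Gaussian variable $g$ and $t > 0$ one has
$\Prob\{ |g| > t \} \le 2 e^{-t^2/2}$.
With $t = 4 \sqrt{\log (1/\e)}$ and $0 < \e < 1/2$,
$$
\Prob\{ |g| > t \} \le 2 \e^{8} = 2 \e^{6} \cdot \e^2
\le \frac{2}{64} \, \e^2 \le \frac{\e^2}{10},
$$
and since $1 - x/10 \ge 1 - x/2 \ge e^{-x}$ for $0 \le x \le 1$,
independence gives $(1 - \e^2/10)^n \ge e^{-\e^2 n}$.

For the absorption step, write $\a = k/n$ and $\e = c_0 \sqrt{\a}$
with $0 < c_0 \le 1/2$, so that $\e < 1/2$.
Theorem \ref{GG lemma} gives
$\sqrt{n} \, P_H B_2^n \subseteq c(\a)^{-1} P_H B_\infty^n$, hence
$$
\e \sqrt{n} \, P_H B_2^n \subseteq \frac{\e}{c(\a)} \, P_H B_\infty^n,
\qquad
\frac{\e}{c(\a)} = \frac{c_0}{c} \sqrt{\log \frac{1}{\a}}
\le \frac{\sqrt{2} \, c_0}{c} \sqrt{\log \frac{1}{\e}},
$$
because $\log (1/\a) = 2 \log (c_0/\e) \le 2 \log (1/\e)$.
Since $P_H B_\infty^n$ is convex, $aK + bK = (a+b)K$ holds for $K = P_H B_\infty^n$
and $a, b \ge 0$, so the Minkowski sum in \eqref{cube+ball} is contained
in a multiple $C' \sqrt{\log (1/\e)} \, P_H B_\infty^n$.
Finally, $\log (1/\e) = \log (1/c_0) + \frac{1}{2} \log (n/k)$
and $\log (n/k) \ge \log 2$, so $\sqrt{\log (1/\e)} \le C'' \sqrt{\log (n/k)}$.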
946:
947: \medskip
948:
Coming back to \eqref{P}, we shall use Proposition \ref{proj of cube}
for a random subspace $H$ in the Grassmannian $G_{m-r,m-R}$.
951: We conclude that if
952: \begin{equation} \label{Rr}
953: c \sqrt{\frac{R}{r}} \ge C \sqrt{\log \frac{m-r}{R-r}},
954: \end{equation}
then with probability at least $1 - e^{-cR}$ in the Grassmannian,
956: $$
957: \gamma_H \Bigl( c \sqrt{\frac{R}{r}} \, P_H B_\infty^{m-r} \Bigr)
958: \ge 1 - e^{-cR}.
959: $$
960: Since $\frac{m-r}{R-r} \le \frac{m}{r}$, the choice of $R$ in \eqref{mnr'}
961: satisfies condition \eqref{Rr}. Thus \eqref{P} implies
962: $$
963: P \ge 1 - 3 e^{-cR}.
964: $$
965: This completes the proof.
966: \endproof
967:
968:
969:
970:
971:
972: \section{Optimality, robustness, finite alphabets} \label{s:conclusion}
973: %______________________________________________________________________________
974:
975:
976: \subsection{Optimality}
977: The logarithmic term in Theorems \ref{ecc} and
978: \ref{reconstruction} is necessary, at least in the case of small
979: $r$. Indeed, combining formula \eqref{P=integral} and Lemmas
980: \ref{linear vs affine}, \ref{l: random projection}, \ref{spherical
981: vs Gaussian}, we obtain
982: \begin{equation} \label{upper P}
983: P \le \int_{G_{m-r,m-R}}
984: \gamma_H \Bigl( c \sqrt{\frac{R}{r}} \,
985: P_H B_\infty^{m-r} \Bigr)
986: \; d\nu(H) + 2 e^{-cR}.
987: \end{equation}
988: To estimate the Gaussian measure we need the following
989: \begin{lemma} \label{l: Gaussian measure}
Let $x_1, \ldots, x_s$ be vectors in $\R^s$. Then
991: \[
992: \g_s \left (\sum_{j=1}^s [-x_j,x_j] \right )
993: \le \g_s( M \cdot B_{\infty}^s),
994: \]
where $M= \max_{j=1, \ldots, s} \|x_j\|_2$.
996: \end{lemma}
997:
998: The sum in the Lemma is understood as the Minkowski sum of sets of vectors,
999: $A+B = \{a+b \;|\; a \in A, \; b \in B\}$.
1000:
1001: \medskip
1002:
\proof Let $F= \Span (x_1, \ldots, x_{s-1})$ and let $V=F^{\perp}$.
1004: Let $v \in V$ be a unit vector. Set $Z= \sum_{j=1}^{s-1}
1005: [-x_j,x_j]$. Then
1006: \begin{align*}
1007: \g_s \Bigl(\sum_{j=1}^s [-x_j,x_j] \Bigr)
1008: &= \int_V \g_F \Bigl( \Bigl( \sum_{j=1}^s [-x_j,x_j]-tv \Bigr) \cap F
1009: \Bigr) \, d \g_V(t) \\
1010: &= \int_{[-P_V x_s, P_V x_s]} \g_F (Z+ t P_F x_s) d \g_V(t).
1011: \end{align*}
1012: By Anderson's Lemma (see \cite{Lif}),
1013: $\g_F (Z+ t P_F x_s) \le \g_F (Z)$. Thus,
1014: \[
1015: \g_s \Bigl( \sum_{j=1}^s [-x_j,x_j] \Bigr)
1016: \le \g_V([-P_V x_s, P_V x_s]) \cdot \g_F(Z)
1017: \le \g_1([-M,M]) \cdot \g_F(Z).
1018: \]
1019: The proof of the Lemma is completed by induction.
1020: \endproof
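
\remark The induction can be written out as follows. This is a sketch,
assuming that $x_1, \ldots, x_{s-1}$ are linearly independent, so that $F$
is $(s-1)$-dimensional and the statement in dimension $s-1$ applies to $Z$.
If $\g_F(Z) \le \g_1([-M,M])^{s-1}$ is known, then the last display
of the proof gives
$$
\g_s \Bigl( \sum_{j=1}^s [-x_j,x_j] \Bigr)
\le \g_1([-M,M]) \cdot \g_F(Z)
\le \g_1([-M,M])^{s}
= \g_s ( M \cdot B_\infty^s ).
$$
The final equality holds because the standard Gaussian measure on $\R^s$
is the product of its one-dimensional marginals and
$M \cdot B_\infty^s = [-M,M]^s$.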
1021:
1022: The Gaussian measure of a projection of the cube can be estimated
1023: as follows.
1024: \begin{proposition}
1025: Let $H$ be any subspace in $G_{n,n-k}$, $k < n/2$.
1026: Then
1027: \begin{equation} \label{measure of proj of cube}
1028: \gamma_H \Bigl( \frac{c}{\sqrt{k}} \sqrt{\log \frac{n}{k}} \,
1029: P_H B_\infty^n \Bigr)
1030: \le e^{-cn/k}.
1031: \end{equation}
1032: \end{proposition}
1033:
1034:
\proof Decompose the index set $I = \{1, \ldots, n\}$ into the disjoint
union of the sets $J_1, \ldots, J_{s+1}$, so that each of the sets
$J_1, \ldots, J_s$ contains $k+1$ elements and
$(k+1)s<n \le (k+1)(s+1)$. Let $1 \le j \le s$.
Let $U_j = H \cap (P_He_i, \ i \in \{1, \ldots, n\}
\setminus J_j)^{\perp}$, where $e_1, \ldots, e_n$ is the standard
basis of $\R^n$. Then $U_j$ is a one-dimensional subspace of $H$.
1041: Set
1042: \[
1043: x_j= \sum_{i \in J_j} \e_i P_He_i,
1044: \]
1045: where the signs $\e_i \in \{-1,1\}$ are chosen to maximize
$\|P_{U_j}x_j\|_2$. Let $E= \Span (x_1, \ldots, x_s)$. Since
1047: $P_{U_j} B_{\infty}^n = [-x_j,x_j]$, we get
1048: \[
1049: P_H B_{\infty}^n \cap E = \sum_{j=1}^s [-x_j,x_j],
1050: \]
1051: where the sum is understood in the sense of Minkowski addition.
Since $\|P_{U_j}\| =1$, we have $\|x_j\|_2 \le C \sqrt{k}$, so by Lemma
1053: \ref{l: Gaussian measure},
1054: \[
1055: \gamma_E \left ( \frac{\bar{c}\sqrt{\log s}}{\sqrt{k}}
1056: \sum_{j=1}^s [-x_j,x_j] \right )
1057: \le \gamma_E ( c'\sqrt{\log s} \cdot B_{\infty}^E) \le e^{-cs}
1058: \]
1059: for some appropriately chosen constant $\bar{c}$. Finally,
1060: log-concavity of the Gaussian measure implies that for any convex
1061: symmetric body $K \subset H$
1062: \[
1063: \gamma_H (K) \le \gamma_E(K \cap E).
1064: \]
1065: \endproof
1066:
1067: Combining \eqref{upper P} and \eqref{measure of proj of cube} we
1068: obtain $P \le 2e^{-cR}$, whenever $R \le c \log (m/r)$.
1069:
1070:
1071:
1072:
1073: \subsection{Robustness and codes for finite alphabets}
1074: Robustness is a well known property of the Basis Pursuit method.
1075: It states that the solution to (BP) is stable with respect to the $1$-norm.
1076: Indeed, it is not hard to show that, once Theorem \ref{ecc} holds,
1077: the unknown vector $y$ in Theorem \ref{ecc} can be approximately recovered
1078: from $y'' = y' + h$, where $h \in \R^m$ is any additional
1079: error vector of small $1$-norm (see \cite{CT}).
1080: Namely, the solution $u$ to the Basis Pursuit problem
1081: $$
1082: \min_{u \in Y} \|u - y''\|_1
1083: $$
1084: satisfies
1085: $$
1086: \|u - y\|_1 \le 4 \|h\|_1.
1087: $$
1088: This implies a possibility of quantization of the coefficients
1089: in the process of encoding and yields {\em error correcting codes over
1090: alphabets of size polynomial in $n$}.
1091:
1092: The following is the $(m,n,r)$-error correcting code under
1093: assumption \eqref{mnr'}, with input words $x$ over the alphabet
1094: $\{1,\ldots,p\}$ and the encoded words $y$ over the alphabet
1095: $\{1, \ldots, C p n^{3/2}\}$. The construction is the same as
in Section~\ref{ss:ecc}; we just introduce quantization.
1097: The encoder takes $x \in \{1,\ldots,p\}^n$, computes
$y = Qx$ and outputs the vector $\hat{y}$ whose coefficients are the quantized
1099: coefficients of $y$ with step $\frac{1}{10m}$.
1100: Then $\hat{y} \in \frac{1}{10m} \Z^m \cap [-p\sqrt{m}, p\sqrt{m}]^m$,
1101: which by rescaling can be identified with $\{1, \ldots, C p n^{3/2}\}$
1102: because we can assume that $m \le 2n$.
The decoder takes $y' \in \frac{1}{10m} \Z^m$, finds the solution $u$
1104: to (BP) with $Y = \range(Q)$, inverts to $x' = Q^T u$ and
1105: outputs $\hat{x'}$ whose coefficients are the quantized
1106: coefficients of $x'$ with step $1$.
1107:
This is indeed an $(m,n,r)$-error correcting code. If
$y'$ differs from $\hat{y}$ on at most $r$ coordinates, then this, together with
the condition $\|\hat{y} - y\|_1 \le \frac{1}{10}$, implies
by the robustness that $\|u-y\|_1 \le 0.4$. Hence
1112: $\|x'-x\|_2 = \|Q^T (u-y)\|_2 = \|u-y\|_2 \le \|u-y\|_1 \le 0.4$.
1113: Thus $\hat{x'} = x$, so the decoder recovers $x$ from $y'$ correctly.
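
\remark The numerical constants above can be traced as follows.
This is a sketch which assumes, as the identity
$\|Q^T (u-y)\|_2 = \|u-y\|_2$ suggests, that $Q$ has orthonormal columns,
and that quantization moves each coordinate by at most the step $\frac{1}{10m}$.
Then
$$
\|\hat{y} - y\|_1 \le m \cdot \frac{1}{10m} = \frac{1}{10},
\qquad
\|u - y\|_1 \le 4 \cdot \frac{1}{10} = 0.4,
$$
so every coordinate of $x'$ is within $0.4 < 1/2$ of the corresponding
integer coordinate of $x$, and rounding with step $1$ returns $x$.
For the alphabet size, $|y_i| \le \|y\|_2 = \|x\|_2 \le p \sqrt{n} \le p \sqrt{m}$,
and each coordinate of $\hat{y}$ takes at most
$$
2 p \sqrt{m} \cdot 10 m + 1 = 20 \, p \, m^{3/2} + 1
\le 20 \cdot 2^{3/2} \, p \, n^{3/2} + 1 \le 58 \, p \, n^{3/2}
$$
values when $m \le 2n$.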
1114:
1115: The robustness also implies a ``continuity'' of our error correcting
1116: codes. If the number of corrupted coordinates in the received message
1117: $y'$ is bigger than $r$ but is still a small fraction,
1118: then the $(m,n,r)$-error correcting code above can still recover $y$
1119: up to some small fraction of the coordinates.
1120:
1121: We hope to return to consequences of our method, in particular
1122: to robustness and continuity of our codes and generally to codes over
1123: finite alphabets, in a separate publication.
1124:
1125:
1126:
1127:
1128:
1129:
1130:
1131:
1132:
1133:
1134:
1135:
1136: {\small
1137: \begin{thebibliography}{S 99}
1138:
1139: \bibitem {A} S. Artstein,
1140: {\em Proportional concentration phenomena on the sphere},
1141: Israel J. Math. 132 (2002), 337--358
1142:
1143: \bibitem {AM} D. Amir, V. D. Milman,
1144: {\em Unconditional and symmetric sets in $n$-dimensional normed spaces},
1145: Israel J. Math. 37 (1980), 3--20
1146:
1147: \bibitem {BO} B. Beferull-Lozano, A. Ortega,
1148: {\em Efficient quantization for overcomplete expansions in $\R^n$},
1149: IEEE Trans. Inform. Theory 49 (2003), 129--150
1150:
\bibitem {CDS} S. Chen, D. Donoho, M. Saunders,
1152: {\em Atomic decomposition by basis pursuit},
1153: SIAM J. Sci. Comput. 20 (1998), no. 1, 33--61;
1154: reprinted in: SIAM Rev. 43 (2001), no. 1, 129--159
1155:
1156: \bibitem {CK} P.G.Casazza, J.Kovacevi\'c,
1157: {\em Equal-norm tight frames with erasures. Frames},
1158: Adv. Comput. Math. 18 (2003), 387--430
1159:
1160: \bibitem {CR} E. Candes, J. Romberg,
1161: {\em Quantitative Robust Uncertainty Principles and Optimally Sparse Decompositions},
1162: preprint
1163:
1164: \bibitem {CRT} E. Candes, J. Romberg, T. Tao,
1165: {\em Robust Uncertainty Principles: Exact Signal Reconstruction from Highly Incomplete Frequency Information},
1166: preprint
1167:
1168: \bibitem {CT} E. Candes, T. Tao,
1169: {\em Near Optimal Signal Recovery From Random Projections: Universal Encoding Strategies?},
1170: preprint
1171:
1172: \bibitem {Da} I.Daubechies,
1173: {\em Ten lectures on wavelets},
1174: SIAM, Philadelphia, 1992
1175:
1176: \bibitem {D 04a} D. Donoho,
1177: {\em For Most Large Underdetermined Systems of Linear Equations,
1178: the minimal $\ell_1$-norm solution is also the sparsest solution},
1179: preprint
1180:
1181: \bibitem {D 04b} D. Donoho,
{\em For Most Large Underdetermined Systems of Linear Equations, the minimal $\ell_1$-norm
1183: near-solution approximates the sparsest near-solution},
1184: preprint
1185:
1186: \bibitem {D 04c} D. Donoho,
1187: {\em Compressed sensing},
1188: preprint
1189:
1190: \bibitem {DET} D. Donoho, M. Elad, V. Temlyakov,
1191: {\em Stable Recovery of Sparse Overcomplete Representations in the Presence of Noise},
1192: preprint
1193:
1194: \bibitem {DE} D. Donoho, M. Elad,
{\em Optimally sparse representation in general (nonorthogonal) dictionaries via $\ell_1$
1196: minimization},
1197: Proc. Natl. Acad. Sci. USA 100 (2003), 2197--2202
1198:
1199: \bibitem {DT 04a} D. Donoho, Y. Tsaig,
{\em Extensions of compressed sensing},
1201: preprint
1202:
1203: \bibitem {DT 04b} D. Donoho, Y. Tsaig,
{\em Breakdown of Equivalence between the minimal $\ell_1$-norm Solution and the Sparsest Solution},
1205: preprint
1206:
1207: \bibitem {DH} D. Donoho, X. Huo,
1208: {\em Uncertainty principles and ideal atomic decomposition},
1209: IEEE Trans. Inform. Theory 47 (2001), 2845--2862
1210:
1211: \bibitem {EB} M. Elad, A. Bruckstein,
1212: {\em A generalized uncertainty principle and sparse representation in pairs of bases},
1213: IEEE Trans. Inform. Theory 48 (2002), 2558--2567
1214:
1215: \bibitem {FN} A. Feuer, A. Nemirovski,
1216: {\em On sparse representation in pairs of bases},
1217: IEEE Trans. Inform. Theory 49 (2003), 1579--1581
1218:
1219: \bibitem {GG} A. Yu. Garnaev, E. D. Gluskin,
1220: {\em The widths of a Euclidean ball} (Russian),
1221: Dokl. Akad. Nauk SSSR 277 (1984), 1048--1052.
1222: English translation: Soviet Math. Dokl. 30 (1984), 200--204
1223:
1224: \bibitem {G1} V.K.Goyal,
1225: {\em Theoretical Foundations of Transform Coding},
1226: IEEE Signal Processing Magazine 18 (2001), no. 5, 9--21
1227:
1228: \bibitem {G2} V.K.Goyal,
1229: {\em Multiple Description Coding: Compression Meets the Network},
1230: IEEE Signal Processing Magazine 18 (2001), no. 5, 74--93
1231:
1232: \bibitem {GKK} V.K.Goyal, J.Kovacevic, and J.A.Kelner,
1233: {\em Quantized Frame Expansions with Erasures},
1234: Applied and Computational Harmonic Analysis 10 (2001), 203--233
1235:
1236: \bibitem {GVT} V.K.Goyal, M.Vetterli, and N.T.Thao,
{\em Quantized Overcomplete Expansions in $\R^N$: Analysis, Synthesis and Algorithms},
1238: IEEE Trans. on Information Theory 44 (1998), 16--31
1239:
1240: \bibitem {GN} R. Gribonval, M. Nielsen,
1241: {\em Sparse representations in unions of bases},
1242: IEEE Trans. Inform. Theory 49 (2003), 3320--3325
1243:
1244: \bibitem {Handbook}
1245: {\em Handbook of coding theory. Vol. I, II.}
1246: Edited by V. S. Pless, W. C. Huffman and R. A. Brualdi.
1247: North-Holland, Amsterdam, 1998.
1248:
1249: \bibitem {KDG} J.~Kovacevic, P.~Dragotti, and V.~Goyal,
1250: {\em Filter Bank Frame Expansions with Erasures},
1251: IEEE Trans. on Information Theory, 48 (2002), 1439--1450
1252:
1253: \bibitem {L} M. Ledoux,
1254: {\em The concentration of measure phenomenon},
1255: Mathematical Surveys and Monographs, 89.
1256: American Mathematical Society, Providence, RI, 2001
1257:
1258: \bibitem {Lif} M. A. Lifshits,
1259: {\em Gaussian random functions},
1260: Mathematics and its Applications, 322.
1261: Kluwer Academic Publishers, Dordrecht, 1995
1262:
1263: \bibitem {Ma} J.~Matousek,
1264: {\em Lectures on discrete geometry},
1265: Graduate Texts in Mathematics, 212. Springer-Verlag, New York, 2002.
1266:
1267: \bibitem {M} S. Mendelson,
1268: {\em Geometric parameters in learning theory},
1269: Geometric aspects of functional analysis, 193--235,
1270: Lecture Notes in Mathematics, 1850, Springer, Berlin, 2004
1271:
1272: \bibitem {Sp1} D. Spielman,
1273: {\em The complexity of error-correcting codes},
1274: Fundamentals of Computation Theory, Krakow, Poland, 67--84,
1275: Lecture Notes in Computer Science 1279, Springer, Berlin, 1997
1276:
1277: \bibitem {Sp2} D. Spielman,
1278: {\em Constructing Error-Correcting Codes from Expander Graphs},
1279: Emerging applications of number theory (Minneapolis, MN, 1996), 591--600,
1280: IMA Vol. Math. Appl., 109, Springer, New York, 1999
1281:
1282: \bibitem {T 04a} J. Tropp,
1283: {\em Recovery of short, complex linear combinations via $\ell_1$ minimization},
1284: IEEE Trans. Inform. Theory, to appear
1285:
1286: \bibitem {T 04b} J. Tropp,
1287: {\em Greed is good: Algorithmic results for sparse approximation},
IEEE Trans. Inform. Theory 50 (2004), no. 10, 2231--2242
1289:
1290: \bibitem {T 04c} J. Tropp,
1291: {\em Just relax: Convex programming methods for subset selection and sparse approximation},
1292: ICES Report 04-04, UT-Austin, February 2004
1293:
1294:
1295:
1296: \end{thebibliography}
1297: \end{document}
1298: