1:
2: \documentclass[11pt,draftcls,onecolumn]{IEEEtran}
3: %\documentclass[10pt,twocolumn]{IEEEtran}
4: \usepackage{amsmath,amsfonts,amssymb,psfig,color}
5: \usepackage[breaklinks=true, colorlinks=true, linkcolor=black, urlcolor=dblue,
6: citecolor=black, pdfpagemode=None, pdfstartview=FitH]{hyperref}
7: \definecolor{gray}{cmyk}{.2,0.2,.3,.1}
8: \definecolor{dred}{cmyk}{0,0.9,0.4,0.3}
9: \definecolor{dblue}{rgb}{0,0,0.5}
10: \definecolor{dgreen}{rgb}{0,0.3,0}
11: \definecolor{dgray}{rgb}{0.3,0.3,0}
12:
13: \setlength{\unitlength}{1mm}
14: \DeclareOldFontCommand{\rm}{\normalfont\rmfamily}{\mathrm}
15: \DeclareOldFontCommand{\sf}{\normalfont\sffamily}{\mathsf}
16: \DeclareOldFontCommand{\tt}{\normalfont\ttfamily}{\mathtt}
17: \DeclareOldFontCommand{\bf}{\normalfont\bfseries}{\mathbf}
18: \DeclareOldFontCommand{\it}{\normalfont\itshape}{\mathit}
19: \DeclareOldFontCommand{\sl}{\normalfont\slshape}{\@nomath\sl}
20: \DeclareOldFontCommand{\sc}{\normalfont\scshape}{\@nomath\sc}
21: \newcommand{\rend}{\hfill$\square$}
22: \newcommand{\tend}{\hfill$\blacksquare$}
23: \newcommand{\epsfig}{\psfig}
24:
25: \title{\huge Lattice Quantization with Side Information: Codes,
26: Asymptotics, and Applications in Sensor Networks \thanks{S.\ D.\
27: Servetto is with the School of Electrical and Computer Engineering,
28: Cornell University. URL: \href{http://cn.ece.cornell.edu/}
29: {{\tt http://cn.ece.cornell.edu/}}. Work supported by the National
30: Science Foundation, under awards CCR-0227676, CCR-0238271 (CAREER),
31: CCR-0330059, and ANR-0325556. This paper is based in part on work
32: presented at the IEEE Data Compression Conference in
33: 2000~\cite{Servetto:02b}, and at the Allerton conference in
34: 2002~\cite{Servetto:02c}.}}
35: \author{Sergio D. Servetto}
36: \date{August 31, 2006.}
37:
38:
39: \begin{document}
40: \maketitle
41: \thispagestyle{empty}
42:
43: \begin{picture}(0,0)
44: \put(-8,75){\tt To appear in the IEEE Transactions on Information Theory.}
45: \end{picture}
46:
47: \vspace{-15mm}
48: \begin{abstract}
49: \noindent\it
50: We consider the problem of rate/distortion with side information
51: available only at the decoder. For the case of jointly-Gaussian
52: source $X$ and side information $Y$, and mean-squared error distortion,
53: Wyner proved in 1976 that the rate/distortion function for this problem
54: is identical to the conditional rate/distortion function $R_{X|Y}$,
55: assuming the side information $Y$ is available at the encoder. In
56: this paper we construct a structured class of asymptotically optimal
57: quantizers for this problem: under the assumption of high correlation
58: between source $X$ and side information $Y$, we show there exist
59: quantizers within our class whose performance comes arbitrarily close
60: to Wyner's bound. As an application illustrating the relevance of
61: the high-correlation asymptotics, we also explore the use of these
62: quantizers in the context of a problem of data compression for sensor
63: networks, in a setup involving a large number of devices collecting
64: highly correlated measurements within a confined area. An important
65: feature of our formulation is that, although the per-node throughput
66: of the network tends to zero as network size increases, so does the
67: amount of information generated by each transmitter. This is a
68: situation likely to be encountered often in practice, which allows
69: us to cast under new---and more ``optimistic''---light some negative
70: results on the transport capacity of large-scale wireless networks.
71: \rm
72: \end{abstract}
73:
74: \vspace{15mm}
75: \noindent {\bf Index terms:} Rate/distortion, rate/distortion with side
76: information, quantization, vector quantization, lattice quantization,
77: lattice codes, hexagonal lattice, source coding, network information
78: theory, ad-hoc networks, sensor networks, multihop radio networks, wireless
79: networks, throughput, capacity.
80: \vspace{15mm}
81:
82: \setcounter{page}{0}
83: \pagebreak
84:
85:
86: \section{Introduction}
87:
88: \subsection{Large-Scale Wireless Sensor Networks}
89:
90: Wireless networks span a wide spectrum in terms of their functionality
91: (i.e., what they are used for), organization (i.e., how the different
92: components are assembled to form a complete working system), and the
93: technologies used to build them. A long-term project currently under
94: way at Cornell deals with the design and prototyping of networks with
95: the following defining characteristics:
96: \begin{itemize}
97: \item The nodes operate under severe power constraints and support
98: relatively large data transfer rates, and their number and density
99: are large.
100: \item Once nodes are deployed, their mobility is very limited (if there
101: is any at all). Instead, the main source of uncontrolled dynamics in
102: the network is the temporary failure of individual nodes: this will
103: typically happen either due to exhaustion of the power source (and for
104: the duration of the ``refueling'' period), or due to variations in the
105: wireless medium.
106: \end{itemize}
107: In our setup of interest, the network is made up of devices whose
108: functionality is essentially that of a traditional Cisco router, with
109: the addition that they communicate over a wireless channel, their size
110: is many orders of magnitude smaller, and they may come equipped with
111: sensors that generate information locally as well. Such networks
112: would prove extremely useful in a variety of very relevant scenarios,
113: such as disaster relief operations, military and surveillance applications,
114: cell-size reduction in cellular networks, environmental monitoring, etc.
115:
116: The development of a working network of this kind requires solutions
117: to a number of technical challenges (e.g., routing, flow control,
118: source and channel coding, power control, modem design, hardware, etc.).
119: Among all these, of particular interest in this paper is the problem of
120: source coding, in a scenario in which the data collected by a large number
121: of sensors is highly correlated. When network nodes are coupled with
122: devices that sense a spatial process at different locations (e.g.,
123: concentration of ozone in the atmosphere, spread of a pathogen/pollutant
124: agent, temperature of a material, etc.), the measurements collected by
125: each node will not be independent in general, but instead will be
126: correlated, with a correlation structure determined by the corresponding
127: fluid dynamics equations. Furthermore, the higher the density of nodes
128: in the network, the higher the correlation in the measurements will be.
129: Therefore, appropriate source coding capable of removing these dependencies
130: has the potential to significantly reduce the number of bits to be
131: transmitted (and therefore the consumption of scarce power resources),
132: when compared to a coding strategy that treats all measurements as being
133: independently generated.
134:
135: The use of standard and well understood source coding techniques is not
136: appropriate in the context of highly correlated sources: the use of
137: classical source codes to remove redundancy in the measurements collected
138: by different sensors requires that data be pooled at a common node prior
139: to transmission. But this pooling action consumes valuable communication
140: resources itself, thus defeating the very same goal it tries to achieve
141: (communication efficiency). Therefore, {\em distributed} source coding
142: techniques are required, i.e., codes capable of removing correlation
143: among measurements even in the presence of uncertainty about the exact
144: value measured at remote locations.
145: To this end, we define a simple abstraction that captures the essential
146: properties of this problem. First, we consider the source of information
147: to be a random process $(X_s)_{s\in[0,1]}$, defined over a bounded set,
148: and with {\em continuous} sample paths---continuity is one simple way of
149: capturing into our model the notion of correlation among measurements
150: increasing with the number of nodes in a confined area. This process is
151: observed by a finite number of sensors, and these observations are to be
152: communicated over a wireless network, as illustrated in
153: Fig.~\ref{fig:network-model}.
154:
155: \begin{figure}[ht]
156: \centerline{\psfig{file=network-model.eps,height=7.5cm,width=13.5cm}}
157: \caption{Network model. There are three types of nodes: sources,
158: relays, and destination nodes, with $n$ nodes of each type. There is
159: a source (a random process whose statistics are known by all sources),
160: from which each of the source nodes collects a sample. These samples are
161: encoded by each source node without knowledge of the samples collected by
162: other nodes, fed into the network, and each sent to a destination node.
163: Finally, these destination nodes pool all their information at a central
164: location, at which a decoder forms an estimate of the entire sample path,
165: based on the data available from all sensors. A key aspect of our problem
166: formulation is that each source node has to decide what information to send
167: to the central decoder {\em without} explicit knowledge of the information
168: available at other nodes---only with knowledge of the statistics of that
169: correlated data.}
170: \label{fig:network-model}
171: \end{figure}
172:
173: An important aspect of this problem setup is the fact that, as we
174: increase the number of source nodes, the amount of information contained
175: in each sample tends to zero---because the source is continuous, two nearby
176: samples are almost the same. And we know from recent work on the transport
177: capacity of one class of wireless networks that, again for large networks,
178: the per-node throughput of networks in this class also tends to
179: zero~\cite{GuptaK:00}. Therefore, {\em provided that the amount of
180: information contained in each sample decays at least as fast as the
181: throughput of the network}, appropriate source coding techniques should
182: enable an accurate reconstruction of the source at the central decoder
183: of Fig.~\ref{fig:network-model}. A study of the resulting source coding
184: problem in the context of these networks is the central subject of this
185: paper.
186:
187: \subsection{Rate Distortion with Side Information}
188:
189: \subsubsection{Problem Statement}
190:
191: Let $\{(X_n,Y_n)\}_{n=1}^\infty$ be a sequence of independent drawings
192: of a pair of dependent random variables $X$ and $Y$, and let $D(x,\hat{x})$
193: denote a single-letter distortion measure. The problem of rate distortion
194: with side information at the decoder asks the question of how many bits
195: are required to encode the sequence $\{X_n\}$ under the constraint that
196: ${\tt E}D(x,\hat{x}) \leq d$, assuming the side information $\{Y_n\}$ is
197: available to the decoder but not to the
198: encoder~\cite[Ch.\ 14.9]{CoverT:91}. This problem, first
199: considered by Wyner and Ziv in~\cite{WynerZ:76}, is a special
200: case of the general problem of coding correlated information sources
201: considered by Slepian and Wolf~\cite{SlepianW:73b},
202: in that one of the sources ($\{Y_n\}$) is available {\em uncoded} at the
203: decoder. But it also generalizes the setup
204: of~\cite{SlepianW:73b}, in that coding is with respect
205: to a fidelity criterion rather than noiseless. One important motivation
206: for us to consider this problem is the fact that good quantizers with side
207: information will be used in the proof of scalability of a large sensor
208: network.
209:
210: In~\cite{Wyner:78, WynerZ:76}, Wyner and
211: Ziv derive the rate/distortion function $R^*(d)$ for this problem, for
212: general sources and general (single-letter) distortion metrics. In this
213: work, however, we restrict our attention to Gaussian sources and mean
214: squared error (MSE) distortion. This case is of special interest because,
215: under these conditions, it happens that $R^*(d) = R_{X|Y}(d)$, the
216: conditional rate/distortion function {\em assuming $Y$ is available at the
217: encoder}~\cite{Wyner:78, WynerZ:76}. We
218: are intrigued by the fact that there exist coding methods which can perform
219: as well as if they had access to the side information at the encoder, even
220: though they don't. One goal pursued in this paper, then, is the construction
221: of a family of quantizers that realizes these promised gains.
222:
223: \subsubsection{Lattice Quantization with Side Information}
224:
225: High-rate quantization theory provides much of the motivation to consider
226: lattices~\cite{GrayN:98}. Under an assumption of fine
227: quantization, the performance of an $n$-dimensional quantizer $\Lambda$
228: whose Voronoi cells are all congruent to a polytope $P$ is given by
229: \begin{equation}
230: d = G(P) \cdot e^{-\frac{2}{n}({\cal H}(\Lambda,p_X)-h(p_X))},
231: \label{eq:zador-gersho-bound}
232: \end{equation}
233: where $p_X$ is the joint source distribution in $n$ dimensions, ${\cal H}$
234: is the discrete entropy induced on the codebook $\Lambda$ by quantization of
235: the source $p_X$, $h$ is the differential entropy, and
236: \[ G(P) = \frac{\frac{1}{n}
237: \int_P ||{\bf x}-{\bf \hat{x}}||^2 \mbox{ d\bf x}}
238: { \left(\int_P \mbox{ d\bf x}\right)^{1+\frac{2}{n}} }
239: \]
240: is the normalized second moment of $P$ (using MSE as a distortion
241: measure)~\cite{gersho:quantization-asymptotics,zador:quantization-asymptotics}.
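
As a quick sanity check of~(\ref{eq:zador-gersho-bound}), consider the
simplest possible case (an illustration only; the step size $\Delta$ is
notation introduced just for this example): a scalar uniform quantizer,
$n=1$, $\Lambda=\Delta\mathbb{Z}$, with cells congruent to
$P=[-\Delta/2,\Delta/2]$ and reproduction point $\hat{x}=0$. Then
\[ G(P) = \frac{\int_{-\Delta/2}^{\Delta/2} x^2 \mbox{ d}x}
               {\left(\int_{-\Delta/2}^{\Delta/2} \mbox{ d}x\right)^{3}}
        = \frac{\Delta^3/12}{\Delta^3} = \frac{1}{12}. \]
Under fine quantization, ${\cal H}(\Lambda,p_X) \approx h(p_X)-\ln\Delta$,
so~(\ref{eq:zador-gersho-bound}) gives
\[ d \approx \frac{1}{12}\, e^{-2(-\ln\Delta)} = \frac{\Delta^2}{12}, \]
which is the familiar distortion of a uniform scalar quantizer.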
242:
243: In the problem of rate distortion with side information, for Gaussian
244: sources and MSE distortion, the goal is to attain a distortion value
245: $d$ using $R_{X|Y}(d) < R_X(d)$ nats/sample. In terms of~(\ref{eq:zador-gersho-bound}),
246: this means that, at a fixed rate $R_0$, we want to design quantizers
247: that achieve distortion
248: \[ d_0 \approx c_n \cdot e^{-\frac{2}{n}(nR_0-h(p_{X|Y}))} \]
249: when coding $X$, where $c_n \leq G(P)$ is the coefficient of quantization
250: in $n$ dimensions~\cite{gersho:quantization-asymptotics}. But since we do not
251: have access to $Y$ (we only know $p_{X|Y}$), using classical quantizers we can
252: only attain a distortion value
253: \[ d \approx c_n \cdot e^{-\frac{2}{n}(nR_0-h(p_X))} > d_0 \]
254: (because {\small $h(X|Y) < h(X)$}), or equivalently, we need to use
255: some extra rate $\rho \approx R_X-R_{X|Y}$ such that
256: \[ d_0 \approx c_n \cdot e^{-\frac{2}{n}(n(R_0+\rho)-h(p_X))}. \]
257: What makes this problem interesting is that we are only allowed to use $R_0$
258: nats/sample, not $R_0+\rho$. One way to do that has been proposed by Shamai,
259: Verd\'{u} and Zamir in~\cite{shamai-verdu-zamir:systematic-lossy-coding,
260: zamir-shamai:almost-there}, which consists of: (a) taking a codebook with
261: roughly $e^{n(R_0+\rho)}$ codewords and distortion $d_0$, (b) partitioning
262: this codebook into $e^{nR_0}$ sets of size $e^{n\rho}$ each, (c) encoding
263: only enough information to identify each one of the $e^{nR_0}$ sets, and
264: (d) using the side information $Y$ to discriminate among the $e^{n\rho}$
265: codewords collapsed into each set. One of our motivations for considering
266: lattice codes is the fact that their structure makes it particularly easy
267: to express these partitioning operations described
268: in~\cite{shamai-verdu-zamir:systematic-lossy-coding}.
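
To get a feel for the size of the extra rate $\rho$, consider the scalar
jointly Gaussian case (a worked example under the assumption $n=1$; here
$\sigma_X^2$ is the variance of $X$ and $r$ denotes the correlation
coefficient between $X$ and $Y$, a symbol used only in this example to
avoid a clash with the rate $\rho$). For $d \leq \sigma_X^2(1-r^2)$,
\[ \rho \approx R_X(d)-R_{X|Y}(d)
   = \frac{1}{2}\ln\frac{\sigma_X^2}{d}
     -\frac{1}{2}\ln\frac{\sigma_X^2(1-r^2)}{d}
   = -\frac{1}{2}\ln(1-r^2) = h(X)-h(X|Y). \]
For instance, $r=0.99$ gives $\rho\approx 1.96$ nats/sample, so each of the
$e^{nR_0}$ sets in step (b) collapses roughly $e^{n\rho}\approx 7.1^n$
codewords, which the decoder must tell apart using $Y$ alone.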
269:
270: We should also mention that another reason to consider lattices is our
271: wish to answer a challenge posed by Zamir and Shamai
272: in~\cite{zamir-shamai:almost-there}. They present an encoding procedure
273: very closely related to the one we propose here, they argue the existence
274: of good lattices to use with that procedure, they study their distortion
275: performance, but they do not present any examples of concrete
276: constructions: their paper concludes by saying that (sic) ``{\em beyond
277: the question of existence, it would be nice to find specific constructions
278: of good nested codes}''. Finding those specific constructions is one of
279: the original contributions in this work.
280:
281: \subsection{Related Work}
282:
283: Note: this section contains relevant related work as of Fall 2004.
284:
285: \subsubsection{Codes and Quantizers}
286:
287: The design of quantizers for the problem of rate distortion with side
288: information was considered recently by Shamai, Verd\'{u} and Zamir, where
289: they present design criteria for two different cases: Bernoulli sources
290: with Hamming metric, and jointly Gaussian sources with mean squared error
291: metric~\cite{shamai-verdu-zamir:systematic-lossy-coding,
292: zamir-shamai:almost-there}. The key contribution presented in that work
293: is a constructive mechanism for, given a codebook, using the side
294: information at the decoder to reduce the amount of information that needs
295: to be encoded to identify codewords, while at the same time achieving
296: essentially the distortion of the given codebook. That work provided
297: much inspiration for our work on the design of lattice codes presented in
298: this paper.
299:
300: Other work on code constructions includes the application of similar
301: codebook partitioning ideas in the context of trellis
302: codes~\cite{sandeep-kannan:discus}, a preliminary version of this
303: work~\cite{Servetto:02b}, generalizations to the case when the side
304: information may be coded as well~\cite{PradhanR:00,ZhaoE:01},
305: constructions based on LDPC codes~\cite{AaronG:02, MitranB:02,
306: TianGZ:03}, and other code constructions~\cite{LiuCLX:04,
307: RebolloMonederoZG:03}.
308:
309: \subsubsection{Information-Theoretic Performance Bounds}
310:
311: Whereas there has been some interest in recent times in the more
312: practical aspects of these problems, a significant amount of work on
313: related topics had already been done before in the context of multiuser
314: information theory. Specifically on the problem of rate/distortion
315: with side information, besides the above mentioned work of Wyner and
316: Ziv~\cite{Wyner:78, WynerZ:76}, Kaspi
317: and Berger present a summary of known results and a number of new
318: results (as of 1982) in~\cite{KaspiB:82}, leaving only a couple of
319: special cases still open. Heegard and Berger further generalize to the
320: case when there is uncertainty on whether the side information is available
321: at the decoder or not~\cite{heegard-berger:uncertain-side-info}. For
322: an arbitrary pair of sources, Zamir gives bounds on how far away the
323: conditional rate/distortion function and the Wyner-Ziv rate/distortion
324: function can be from each other~\cite{Zamir:96}.
325:
326: Closely related to the problem of rate/distortion with side information
327: is that of {\it Noiseless Coding of Distributed Correlated Sources}.
328: Slepian and Wolf formulate this problem, and
329: determine the minimum number of bits per symbol required to encode two
330: correlated sequences $\{X_n\}$ and $\{Y_n\}$ separately, such that they
331: can be faithfully reproduced by a centralized decoder, under the assumption
332: that $\{(X_n,Y_n)\}_{n=1}^\infty$ is
333: i.i.d.~\cite{SlepianW:73b}. Cover then gives a simpler
334: proof of the same result, which also generalizes to arbitrary ergodic
335: processes, countably infinite alphabets, and arbitrary number of correlated
336: sources~\cite{Cover:75b}. Wyner presents an information theoretic
337: characterization of the minimum rates required for faithful reproduction in
338: a general network with side information~\cite{Wyner:75}. Barros and
339: Servetto consider the Slepian-Wolf problem in an arbitrary network
340: setup with noisy point-to-point links~\cite{BarrosS:06}.
341:
342: A long-standing open problem in network information theory is the
343: characterization of the rate-distortion region for the {\em Multiterminal
344: Source Coding} problem, which is basically the Slepian-Wolf problem,
345: but in which a non-zero distortion is allowed in the encoding of
346: both sources. The most significant contribution to date can be
347: found in Tung's doctoral dissertation~\cite{Tung:PhD}. Berger
348: developed some useful notes for
349: a tutorial lecture on this and related problems~\cite{Berger:78}.
350:
351: Yet another closely related problem is {\it the CEO Problem}. In this
352: version, multiple sensors observe
353: {\em noisy} versions of the same signal, and must convey their observations
354: to a centralized decoder at a combined rate of not more than $R$ bits/sample.
355: This case generalizes the problem of encoding correlated observations,
356: to the case when the number of sensors is large, and to the case when the
357: signal to be communicated cannot be observed directly. Berger et al.\
358: present a solution to this problem in the general case~\cite{BergerZV:96}.
359: Viswanathan and Berger specialize the results of~\cite{BergerZV:96} to the
360: Quadratic-Gaussian case~\cite{ViswanathanB:97}: an interesting conclusion
361: in this case is that the optimal rate of decay of the error is of the form
362: $R^{-1}$ when the sensors cannot communicate prior to transmission, as
363: opposed to an exponential decay otherwise.
364:
365: An interesting duality between the problem of rate/distortion with side
366: information discussed above, and the problem of channel coding with side
367: information at the transmitter~\cite{Costa:83}, has been pointed out by
368: several groups~\cite{BarronCW:02,PradhanCR:03,SuEG:00}. Cover and Chiang
369: present a comprehensive coverage of duality issues in problems with side
370: information~\cite{CoverC:02}, and Chiang and Boyd fully develop an
371: optimization-theoretic approach to analyzing the duality of channel
372: capacity and rate distortion problems~\cite{ChiangB:04}. Merhav and
373: Shamai established a separation theorem in this context~\cite{MerhavS:03}.
374: Therefore, it should be possible to derive good codes for one problem
375: from good codes available for the other.
376:
377: Zamir et al.\ present a very interesting tutorial on noisy multiterminal
378: networks, with many useful references~\cite{ZamirSE:02}.
379:
380: \subsubsection{Performance of Wireless Networks}
381:
382: A key result in the analysis of performance of wireless networks states
383: that when $n$ non-mobile nodes are optimally placed in a disk of unit area,
384: traffic patterns are optimally assigned, and the range of each transmission
385: is optimally chosen, the total throughput that the network can carry is
386: $O(\sqrt{n})$~\cite{GuptaK:00}. As a result, the per-node throughput is
387: only $O(\frac{1}{\sqrt{n}})$, i.e., decays to zero as the number of nodes
388: in the network increases. Other results along the same lines were presented
389: in~\cite{GuptaK:03, XieK:04}.
390:
391: The work of~\cite{GuptaK:00} sparked significant interest in this problem.
392: When nodes are allowed to move, assuming transmission delays proportional
393: to the mixing time of the network, the total network throughput is $O(n)$,
394: and therefore the network can carry a non-vanishing rate per
395: node~\cite{GrossglauserT:02}. Using a linear programming formulation,
396: non-asymptotic versions of the results in~\cite{GuptaK:00} are given
397: in~\cite{ToumpisG:02}. Using pure network flow methods, similar results
398: (and generalizations thereof) have been obtained
399: in~\cite{PerakiS:03, PerakiS:04}. An alternative method for deriving
400: transport capacity was presented in~\cite{KulkarniV:04}.
401:
402: \subsection{Main Contributions and Organization of the Paper}
403:
404: This paper presents the following original contributions:
405: \begin{itemize}
406: \item The construction of lattice codes for the problem of rate/distortion
407: with side information. We propose a design procedure based on the choice
408: of a lattice that is a good quantizer for the classical rate/distortion
409: problem, and a geometrically-similar sublattice, inspired by the idea of
410: partitioning codebooks to obtain good codes for this problem proposed
411: in~\cite{shamai-verdu-zamir:systematic-lossy-coding,
412: zamir-shamai:almost-there}, and by our previous work on the design of
413: lattice quantizers for multiple description coding~\cite{VaishampayanSS:01}.
414: \item An asymptotic analysis (in rate and correlation) of the performance
415: of these codes which, to the best of our knowledge, is the first such
416: analysis for Wyner-Ziv codes. Our analysis reveals some interesting
417: shortcomings of these codes, and suggests a simple modification of the
418: construction that ensures their optimality. These optimal codes
419: effectively answer a challenge of Zamir and
420: Shamai~\cite{zamir-shamai:almost-there}.
421: \item The illustration that high correlation asymptotics in source coding
422: are indeed a new asymptotic regime with very meaningful practical
423: implications. So far source coding has considered two asymptotic
424: regimes: large block asymptotics~\cite{Shannon:59}, or high
425: rate asymptotics~\cite{zador:quantization-asymptotics}. High correlation
426: asymptotics are a new asymptotic regime that, as we will see in
427: Section~\ref{sec:sensor-networks}, proves quite relevant in the context
428: of new problems derived from sensor networking applications.
429: \item The identification of a large class of applications for which the
430: vanishing rates property of wireless networks does not pose a problem,
431: by virtue of the fact that the amount of information that each node needs
432: to transmit decays at the same rate as (or faster than) throughput does.
433: \end{itemize}
434:
435: The rest of this paper is organized as follows. In
436: Section~\ref{sec:code-design} we present the structure of lattice
437: quantizers for the problem of rate/distortion with side information,
438: and in Section~\ref{sec:asymptotics} we evaluate the performance of
439: the codes obtained, under the assumption of high correlation between
440: the source $X$ and the side information $Y$. In
441: Section~\ref{sec:sensor-networks} we illustrate how the proposed
442: codes can be used to deal effectively with the vanishing rates
443: property of an important class of large-scale sensor networks.
444: Final remarks are presented in Section~\ref{sec:conclusions}.
445:
446:
447: \section{Design of Lattice Codes with Side Information}
448: \label{sec:code-design}
449:
450: \subsection{Definitions}
451:
452: A source generates a sequence of zero-mean iid pairs
453: $(x_i,y_i)_{i=0}^\infty$, with jointly Gaussian distribution
454: \[
455: f_{X,Y}(x,y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}}\;\;
456: e^{-\frac{1}{2(1-\rho^2)}
457: \left(\frac{x^2}{\sigma_X^2}
458: -\frac{2\rho x y}{\sigma_X\sigma_Y}
459: +\frac{y^2}{\sigma_Y^2}\right)},
460: \]
461: with covariance matrix ${\bf K} = {\tiny \left[\!\begin{array}{cc}
462: \sigma_X^2 & \rho\sigma_X\sigma_Y \\
463: \rho\sigma_X\sigma_Y & \sigma_Y^2 \\
464: \end{array}\!\right]}$, and correlation coefficient $\rho$. The corresponding
465: conditional and marginal densities are denoted by $f_{Y|X}$, $f_{X|Y}$,
466: $f_X$, $f_Y$. For a set of $n$ linearly independent column vectors
467: $\{{\bf v}_1,\ldots,{\bf v}_n\}$, a {\em lattice} $\Lambda\subset\mathbb{R}^n$
468: is defined by
469: \[
470: \Lambda = \left\{ \sum_{i=1}^n c_i {\bf v}_i : c_1,\ldots,c_n\in\mathbb{Z}
471: \right\},
472: \]
473: and its {\em generator matrix} is ${\bf V}=\left[{\bf v}_1|\cdots|{\bf v}_n\right]$.
474: The volume of a polytope $P\subset\mathbb{R}^n$ is denoted by $\nu(P)$.
475: For a constant $s\in\mathbb{R}$, the {\em scaled lattice} $s\Lambda$ is the
476: lattice generated by $s{\bf V}$, where ${\bf V}$ is the generator matrix of
477: a lattice $\Lambda$. The {\em Voronoi cell} of a lattice point $\lambda$ in
478: the lattice $\Lambda$ is defined by
479: \[
480: V[\lambda\!:\!\Lambda]
481: = \{{\bf x}\in\mathbb{R}^n:||{\bf x}-\lambda||^2\leq||{\bf x}-\lambda'||^2,
482: \;\forall\lambda'\in\Lambda \}.
483: \]
484: The {\em nearest neighbor map of a lattice} is a function
485: $Q_\Lambda : \mathbb{R}^n \rightarrow \Lambda$, defined by
486: \[
487: Q_\Lambda({\bf x}) = \arg\min_{\lambda\in\Lambda} ||{\bf x}-\lambda||^2,
488: \]
489: where ties are broken arbitrarily (e.g., numbering all the $\lambda$'s,
490: and assigning ${\bf x}$ to the $\lambda$ with smallest index). From the
491: definitions it follows trivially that $V[\lambda\!:\!\Lambda] =
492: \{{\bf x}\in\mathbb{R}^n:Q_\Lambda({\bf x})=\lambda\}$, except possibly
493: for a set of measure zero. A lattice $\Lambda'$ is a {\em sublattice} of
494: a lattice $\Lambda$ if $\Lambda'\subseteq\Lambda$. The {\em quotient
495: group}~\cite[Sec.\ 6.3]{Bourbaki:58} of a lattice modulo a sublattice is
496: denoted by $\Lambda/\Lambda'$, and its order by $|\Lambda/\Lambda'|$.
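
Returning to the source model above, it is useful to recall the standard
form of the conditional density (a textbook consequence of the joint
Gaussian density, restated here for convenience rather than as a new
result):
\[ f_{X|Y}(x|y) = \frac{1}{\sqrt{2\pi\sigma_X^2(1-\rho^2)}}\;
   e^{-\frac{\left(x-\rho\frac{\sigma_X}{\sigma_Y}y\right)^2}
            {2\sigma_X^2(1-\rho^2)}}, \]
that is, given $Y=y$, $X$ is Gaussian with mean
$\rho\frac{\sigma_X}{\sigma_Y}y$ and variance $\sigma_X^2(1-\rho^2)$. As
$\rho\to 1$ this variance vanishes, which is one way to read the
high-correlation regime: the side information pins down $X$ to within a
neighborhood whose size is on the order of $\sigma_X\sqrt{1-\rho^2}$ per
sample.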
497:
498: A {\em Wyner-Ziv Lattice Vector Quantizer} (WZ-LVQ) is a triplet
499: ${\cal Q}=(\Lambda,\kappa,s)$, where:
500: \begin{itemize}
501: \item $\Lambda$ is a lattice.
502: \item $\kappa: \mathbb{R}^n \rightarrow \mathbb{R}^n$ is a linear operator
503: such that $\kappa{\bf u}\cdot\kappa{\bf v} = c\;{\bf u}\cdot{\bf v}$
504: (for some $c>0$), and such that $\kappa(\Lambda) \subseteq \Lambda$.
505: Essentially, $\kappa$ defines a {\em similar} sublattice of
506: $\Lambda$.\footnote{Two lattices $\Lambda_1$, $\Lambda_2$ (with
507: generator matrices $M_1$, $M_2$) are said to be {\em similar} when
508: there is a constant $c \neq 0$, an integer matrix $U$ with
509: $|\mbox{det}(U)| = 1$, and a real matrix $B$ with $BB^{\top} = I$,
510: such that $M_2 = c \; U M_1 B$~\cite{neil:splag}.
511: Intuitively, similar lattices ``look the same'', up to a rotation,
512: a reflection, and a change of scale.}
513: \item $s \in (0,\infty)$ is a scale factor that expands (or shrinks)
514: $\Lambda$ and $\kappa(\Lambda)$.
515: \end{itemize}
516:
517: Intuitively, the lattice $\Lambda$ is the fine codebook, the one whose
518: codewords are to be partitioned into equivalence classes. We choose to
519: implement this partition by considering a sublattice $\Lambda' \subseteq
520: \Lambda$, and then considering the resulting quotient group $\Lambda/\Lambda'$.
521: The scale factor $s$ is a constant that multiplies the generator matrices of the lattices
522: considered, which is to be adjusted as a function of the correlation
523: between the source $X$ and the side information $Y$. A justification for
524: the choice of a {\em similar} sublattice (as opposed to any other sublattice)
525: to implement the codebook partition, and a justification for the explicit
526: introduction of a scale factor $s$ as a parameter of the quantizer (as
527: opposed to having this lattice scale be determined by the coding rate, as
528: in classical quantization theory) will become apparent later, after we study
529: the rate-distortion performance of the proposed quantizers.
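
As a concrete instance of the definition (a sketch chosen for illustration;
the particular $\kappa$ below is an assumption of this example and is not
necessarily the sublattice used elsewhere in the paper), take the hexagonal
lattice $A_2$ with generator vectors ${\bf v}_1=(1,0)^\top$ and
${\bf v}_2=(\frac{1}{2},\frac{\sqrt{3}}{2})^\top$, and let
\[ \kappa = \left[\begin{array}{cc}
     \frac{3}{2} & -\frac{\sqrt{3}}{2} \\
     \frac{\sqrt{3}}{2} & \frac{3}{2} \\
   \end{array}\right]
   = \sqrt{3}\left[\begin{array}{cc}
     \cos\frac{\pi}{6} & -\sin\frac{\pi}{6} \\
     \sin\frac{\pi}{6} & \cos\frac{\pi}{6} \\
   \end{array}\right], \]
a rotation by $\pi/6$ followed by a scaling by $\sqrt{3}$. Then
$\kappa{\bf u}\cdot\kappa{\bf v} = 3\,{\bf u}\cdot{\bf v}$ (so $c=3$), and
\[ \kappa{\bf v}_1 = {\bf v}_1+{\bf v}_2, \qquad
   \kappa{\bf v}_2 = -{\bf v}_1+2{\bf v}_2, \]
so that $\kappa(\Lambda)\subseteq\Lambda$. The order of the quotient group
is $|\Lambda/\kappa(\Lambda)| = |\det\kappa| = c^{n/2} = 3$: with this
choice, the fine codebook is partitioned into three equivalence classes.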
530:
531: The question of the existence of similar sublattices arose in connection
532: with another vector quantization problem~\cite{VaishampayanSS:01}, and also
533: in the study of symmetries of
534: quasicrystals~\cite{baake-moody:similarity-submodules-semigroups}. The
535: subject is thoroughly covered in~\cite{conway-rains-neil:similar-sublattices},
536: where necessary (and in some cases sufficient) conditions are given for
537: their existence.
538:
539: \subsection{Encoding/Decoding Algorithms}
540:
541: Let $X^n$ denote a block of $n$ source samples, and $Y^n$ a block of $n$
542: side information samples. The encoder and decoder are maps
543: $f_n:\mathbb{R}^n \rightarrow s\Lambda/s\kappa(\Lambda)$ and
544: $g_n:s\Lambda/s\kappa(\Lambda)\times\mathbb{R}^n \rightarrow s\Lambda$,
545: defined by
546: \begin{equation}
547: f_n(X^n) = Q_{s\Lambda}\big(X^n-Q_{s\kappa(\Lambda)}(X^n)\big),
548: \hspace{1cm}
549: \hat{X}^n = g_n(f_n(X^n),Y^n) = Q_{s\kappa(\Lambda)+f_n(X^n)}(Y^n),
550: \label{eq:q-alg}
551: \end{equation}
552: whose operation is illustrated in Fig.~\ref{fig:encoder-decoder}, with an
553: example based on the lattice $A_2$.
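
As a concrete one-dimensional instance of~(\ref{eq:q-alg}), consider the
following sketch, in which the choices $\Lambda=\mathbb{Z}$,
$\kappa(\Lambda)=4\mathbb{Z}$ and $s=\frac 1 2$ are assumptions made purely
for illustration (they are not among the constructions studied in this
paper). Then $s\Lambda=\frac 1 2\mathbb{Z}$, $s\kappa(\Lambda)=2\mathbb{Z}$,
and $N=4$. For a source sample $x=3.3$ and side information $y=3.1$:
\begin{eqnarray*}
Q_{s\kappa(\Lambda)}(x) & = & 4, \\
f_1(x) & = & Q_{s\Lambda}(3.3-4) \;=\; Q_{s\Lambda}(-0.7) \;=\; -0.5, \\
\hat{x} & = & Q_{2\mathbb{Z}-0.5}(3.1) \;=\; 3.5 \;=\; Q_{s\Lambda}(x).
\end{eqnarray*}
The decoder thus recovers the fine-lattice description of $x$ from one of
only $N=4$ coset labels. If instead $y=2.4$, the point of the coset
$2\mathbb{Z}-0.5$ closest to $y$ is $1.5$, and a decoding error occurs.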
554:
555: \begin{figure}[ht]
556: \centerline{\psfig{file=mechanics1.eps,height=10cm}
557: \psfig{file=mechanics2.eps,height=10cm}}
558: \vspace{-2mm}
\caption{Mechanics of the proposed quantizers
(left: encoding, right: decoding). A sublattice similar to the base
lattice is chosen (circled points), matched to how far apart $X^n$ and $Y^n$
are expected to be: in this example, with high probability $X^n$ and
563: $Y^n$ are in neighboring Voronoi cells of the fine lattice. Then
564: $X^n$ is quantized first with the coarse lattice, then this coarse
565: description is subtracted from $X^n$, and this difference is quantized
again with the fine lattice; this quantized difference is then sent
to the decoder, as a representative of the set of all codewords
568: collapsed into the same equivalence class.
569: At the decoder, the entire class is recreated (all the points with a
570: thick arrow in the right picture), and among these, the point closest
571: to the side information $Y^n$ is declared to be the original quantized
572: value for $X^n$. Note that there is always a chance that a particular
573: realization of the noise process may take $Y^n$ too far away from
574: $X^n$, in which case a decoding error occurs.}
575: \label{fig:encoder-decoder}
576: \end{figure}
577:
578: \subsection{Rate Computation}
579: \label{sec:rate-computation}
580:
581: There are only $N = |\Lambda/\kappa(\Lambda)|$ possible different quantizer
outputs, each one with probability $p_k$ ($k=1,\ldots,N$) given by
583: \[
584: p_k \;=\; \sum_{\lambda\in s\Lambda}
585: \int_{V[\kappa(\lambda)+\gamma_k:s\Lambda]}
586: f_X({\bf x}) \mbox{ d\bf x},
587: \]
588: where $\gamma_k \in s\Lambda/s\kappa(\Lambda)$, and where we identify
589: the entire equivalence class with a canonical representative taken from
590: $\Lambda \; \cap \; V[{\bf 0}\!:\!\kappa(\Lambda)]$. The rate of a
591: quantizer is then given by
592: \[ R \;=\; \mbox{$\frac 1 n$} \sum_{k=1}^N p_k \ln(1/p_k), \]
593: expressed in units of nats per source sample.
594:
595: Assume now, as is standard in fine-resolution quantization theory, that
596: Voronoi cells of the quantizers under consideration are small. In this
597: case, this translates into a requirement for {\em sublattice} cells to
598: be small, for which we have that
599: \[
600: \nu(s\kappa(\Lambda)) \; = \; s^n \nu(\kappa(\Lambda))
601: \; = \; s^n \nu(N^{\frac 1 n}U\Lambda) \; = \; s^nN \nu(\Lambda)
602: \; = \; s^nN,
603: \]
604: where the second equality follows from the fact that
605: $N = |\Lambda/\kappa(\Lambda)| = c^{\frac n 2}$, where $c$ is the norm
606: of the similarity defined by
607: $\kappa$~\cite{conway-rains-neil:similar-sublattices} (and therefore
608: the corresponding scaling is $\sqrt{c}$), $U$ is unitary, and the last
609: equality follows from assuming $\Lambda$ is normalized to have determinant
610: 1~\cite{neil:splag}. Then, we see that requiring small
611: sublattice cells translates into requiring that $s^nN$ be a small
612: number. Now, under this assumption, the rate expression above admits a
613: much simpler form:
614:
\begin{eqnarray*}
1 & = & \sum_{\lambda\in s\Lambda}
\int_{V[\lambda:s\Lambda]} f_X({\bf x})\mbox{ d\bf x} \\
& = & \sum_{\gamma_k\in s\Lambda/s\kappa(\Lambda)}
\underbrace{\sum_{\lambda\in s\Lambda}
\int_{V[\kappa(\lambda)+\gamma_k:s\Lambda]}
f_X({\bf x})\mbox{ d\bf x}}_{p_k}.
\end{eqnarray*}
623: The integral of the source density in $p_k$ can be approximated by
624: \[
625: f_X(\kappa(\lambda)+\gamma_k)\;\cdot\;
626: \nu(V[\kappa(\lambda)+\gamma_k:s\Lambda]).
627: \]
Under the small-cell assumption for the sublattice (standard in
quantization theory), and since the Gaussian density $f_X$ is continuous,
$f_X$ is approximately constant within a cell of $\kappa(\Lambda)$,
and hence independent of the particular shift
632: $\gamma_k$. Furthermore, since $\Lambda$ is a lattice, all its cells are
633: congruent, and therefore their volumes are all the same, thus making $\nu$
634: also independent of the particular shift $\gamma_k$. Call $p$ this
635: (approximately) constant value for $p_k$. Therefore, we have
636: \[
637: 1 \;\; \approx \sum_{\gamma_k\in\Lambda/\kappa(\Lambda)} p
638: \;\; = \;\; |\Lambda/\kappa(\Lambda)| p,
639: \]
640: and hence,
641: \begin{eqnarray*}
642: p_k \approx \frac{1}{|\Lambda/\kappa(\Lambda)|}
643: & \hspace{1cm} \mbox{and} \hspace{1cm}
& R \approx \mbox{$\frac 1 n$} \ln |\Lambda/\kappa(\Lambda)|,
645: \end{eqnarray*}
646: independent of $s$ and $f_X$, where the approximations are tight in
647: the limit as $s^nN\to 0$.
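
For a sense of scale, consider a sublattice of $A_2$ of index $N=21$ (so
that $n=2$), as in the example of
Fig.~\ref{fig:why-similar-sublattices}. This is only a numerical
illustration of the approximation above, valid to the extent that the
small-cell assumption holds:
\[
R \;\approx\; \mbox{$\frac 1 2$}\ln 21 \;\approx\; 1.52
\mbox{ nats per sample} \;\approx\; 2.20 \mbox{ bits per sample},
\]
regardless of the value of $s$.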
648:
649: Note that, unlike in classical quantization theory, here the
650: rate of a quantizer seems to be independent of the size of its Voronoi
651: cells. In our context, a high-rate assumption translates into a large
652: value for $|\Lambda/\kappa(\Lambda)|$, i.e., cells in the fine lattice
653: are small {\em relative} to the size of cells in the coarse lattice.
But the parameter $s$, which determines the {\em absolute} size of
655: these cells, is not part of the rate expression.
656:
657: \subsection{Distortion Computation}
658: \label{sec:distortion-nonasymptotic}
659:
660: Let $\gamma_k({\bf x})$ denote the encoding of a source sequence
${\bf x}$ ($k=1,\ldots,N$), and $\gamma({\bf x},{\bf y})$ denote the
662: reconstruction codeword for a source sequence ${\bf x}$ with side
663: information ${\bf y}$. Then:
664: \begin{eqnarray}
665: \bar d
666: & \stackrel{(a)}{=} &
667: \mbox{$\frac 1 n$} \int_{{\bf x}\in\mathbb{R}^n} \int_{{\bf y}\in\mathbb{R}^n}
668: ||{\bf x} - \gamma({\bf x},{\bf y})||^2 f_{XY}({\bf x},{\bf y})
669: \mbox{d}{\bf x}\mbox{d}{\bf y} \nonumber \\
670: & = & \mbox{$\frac 1 n$}
671: \int_{{\bf x}\in\mathbb{R}^n}\left[\int_{{\bf y}\in\mathbb{R}^n}
672: ||{\bf x} - \gamma({\bf x},{\bf y})||^2 f_{Y|X}({\bf y}|{\bf x})
673: \mbox{d}{\bf y}\right] f_X({\bf x})\mbox{d}{\bf x} \nonumber \\
674: & \stackrel{(b)}{=} & \mbox{$\frac 1 n$} \int_{{\bf x}\in\mathbb{R}^n}
675: \left[\sum_{\lambda\in s\kappa(\Lambda)+\gamma_k({\bf x})}
676: \int_{{\bf y}\in V[\lambda:s\kappa(\Lambda)+\gamma_k({\bf x})]}
677: ||{\bf x} - \lambda||^2 f_{Y|X}({\bf y}|{\bf x})
678: \mbox{d}{\bf y}\right] f_X({\bf x})\mbox{d}{\bf x} \nonumber \\
679: & \stackrel{(c)}{=} & \mbox{$\frac 1 n$} \int_{{\bf x}\in\mathbb{R}^n}
680: \left[\sum_{\lambda\in s\kappa(\Lambda)+\gamma_k({\bf x})}
681: ||{\bf x} - \lambda||^2 {\tt Pr}\big({\bf y}\in
682: V[\lambda:s\kappa(\Lambda)+\gamma_k({\bf x})]\big|{\bf x}\big)
683: \right] f_X({\bf x})\mbox{d}{\bf x} \nonumber \\
684: & \triangleq & \mbox{$\frac 1 n$} \int_{{\bf x}\in\mathbb{R}^n}
685: \partial({\bf x}, s\kappa(\Lambda)+\gamma_k({\bf x}))
686: f_X({\bf x})\mbox{d}{\bf x},
687: \label{eq:distortion}
688: \end{eqnarray}
689: where:
690: \begin{itemize}
691: \item[\small (a)] is just the definition of average distortion;
692: \item[\small (b)] follows from, for each possible source sequence ${\bf x}$,
693: partitioning the set of all side information vectors ${\bf y}$ into
694: Voronoi cells of the sublattice $s\kappa(\Lambda)$, centered at location
695: $\gamma_k({\bf x})$;
696: \item[\small (c)] follows from the fact that $||{\bf x}-\lambda||^2$ can
697: be taken out of the integral, and what remains is an integral of the
698: conditional density function.
699: \end{itemize}
700: The last
701: definition is introduced to highlight the concept that in quantization
702: with side information, an entire sublattice plays the role of a single
703: codeword in classical quantization -- the average error in reconstructing
704: ${\bf x}$ is seen to take the form of an expectation of a suitably
705: defined distortion metric between source sequences and sublattices.
706: In Section~\ref{sec:asymptotics} we study the asymptotic behavior
707: of~(\ref{eq:distortion}), assuming high correlation between $X^n$
708: and $Y^n$.
709:
710: \subsection{On the Choice of Similar Sublattices}
711:
712: As we will see in Section~\ref{sec:asymptotics}, there are some
713: drawbacks to implementing quantizers for the Wyner-Ziv problem with
714: a fine quantizer that is essentially a truncated lattice, as follows
715: from the construction given here. But there are also significant
716: benefits to doing so, in terms of the simplicity of this implementation.
717: So for the time being, if we are going to use two lattices, it is
718: of interest to consider what kind of lattices should be used.
719:
720: Suppose we fix the scale factor $s$, and the code rate $\frac{1}{n}\ln(N)$.
721: Among all the sublattices of $\Lambda$ of index $N$, are there differences
722: in terms of their distortion performance? Which sublattices should we
723: choose? It follows from~(\ref{eq:distortion}) that a sensible design
criterion is to choose the sublattice that maximizes
725: ${\tt Pr}\left\{{\bf y}\in V[{\bf 0}\!:\!s\kappa(\Lambda)]\mid
726: X\!={\bf x}\right\}$, for ${\bf x}\in V[{\bf 0}\!:\!s\Lambda]$.
727:
728: Since the vectors $X$ and $Y$ are jointly Gaussian and with iid
729: components, the vector $Y|X\!=\!{\bf x}$ is also Gaussian and with iid
730: components (although the $x_i$'s and the $y_i$'s are certainly not
731: independent of each other). The pdf of $Y|X\!=\!{\bf x}$ is therefore
732: circularly symmetric, and it follows from classical arguments of coding
733: for Gaussian channels that, to maximize ${\tt Pr}({\bf y}\in V)$, we need
734: to maximize the norm of the shortest vectors in $\kappa(\Lambda)$. This
735: situation is illustrated in Fig.~\ref{fig:why-similar-sublattices}, with
736: an example based on the lattice $A_2$.
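
To make this symmetry explicit, suppose (as an assumption for this
illustration only) that $X$ and $Y$ have zero mean, with per-component
variances $\sigma_X^2$ and $\sigma_Y^2$; the symbol $\sigma_Y$ is
introduced here and is not used elsewhere. The standard formula for
conditioning of jointly Gaussian variables then gives
\[
f_{Y|X}({\bf y}|{\bf x}) \;=\;
\frac{1}{[2\pi\sigma_Y^2(1-\rho^2)]^{\frac n 2}}
\exp\left(-\frac{||{\bf y}-\rho\frac{\sigma_Y}{\sigma_X}{\bf x}||^2}
{2\sigma_Y^2(1-\rho^2)}\right),
\]
which depends on ${\bf y}$ only through its Euclidean distance to the
conditional mean $\rho\frac{\sigma_Y}{\sigma_X}{\bf x}$.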
737:
738: \begin{figure}[ht]
739: \centerline{\psfig{file=n21-expand.ps,height=7.3cm}}
740: \vspace{-2mm}
741: \caption{Two different sublattices of $A_2$, of index $N=21$. $A_2$
742: is isomorphic to the ring of Eisenstein integers
743: $\mathbb{Z}(\omega) = \{ a+b\omega\;:\;a,b\in\mathbb{Z};\;
744: \omega=[-\frac{1}{2},\frac{\sqrt{3}}{2}]=e^{2\pi i/3}\}$, and {\em ideal}
745: sublattices refer to ideals of this ring. Observe that the ideal sublattice
746: of the example has shortest vectors of norm 21, whereas in the non-ideal
747: sublattice the shortest vectors are shorter.}
748: \label{fig:why-similar-sublattices}
749: \end{figure}
750:
751: The choice of $A_2$ for illustration purposes in
752: Fig.~\ref{fig:why-similar-sublattices} is not arbitrary. In that
753: particular case, it is known that the minimal norm $\mu$ of any sublattice
754: of index $N$ in $A_2$ satisfies $\mu \leq N$, and that $\mu = N$ if and
755: only if the sublattice is ideal~\cite{bernstein-neil-pew:sublattices-of-a2}.
756: Furthermore, in two dimensions, $A_2$ is both the best classical quantizer
757: and the best channel coder~\cite{neil:splag}. Therefore, it seems clear
758: that a hexagonal lattice and a similar sublattice are the best design
759: choices in two dimensions: this combination simultaneously minimizes
760: quantization error, and minimizes the probability of a source vector being
761: decoded to an incorrect codeword.
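
A small worked instance may help fix ideas; the generators below are our
own choice for illustration, and are not necessarily those depicted in
Fig.~\ref{fig:why-similar-sublattices}. Identifying $A_2$ with the
Eisenstein integers, the norm of $a+b\omega$ is $a^2-ab+b^2$. The element
$\theta=5+\omega$ has norm $25-5+1=21$, so the ideal generated by $\theta$
is a sublattice of index 21 whose shortest vectors are the six unit
multiples of $\theta$, all of norm 21. In contrast, the sublattice
generated by $3$ and $7\omega$ also has index $3\cdot 7=21$, but it
contains the vector $3$, of norm $9<21$; it is not an ideal, since
$3\omega$ does not belong to it.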
762:
763: Another interesting example is that of very high dimensional spaces.
764: In this case, we know that good quantizers have (nearly) spherical Voronoi
765: cells. But at the same time, spherical cells maximize the minimum distance
766: between sublattice points, and therefore an optimal sublattice will have
767: to be similar to the base lattice.
768:
769: In between dimensions 2 and $\infty$, we are not able to make equally
770: strong statements---but we use the insights derived from these extreme
771: cases (a lattice with small second-order moment and a similar sublattice)
772: as guiding principles, to curb the complexity of the design task.
773:
774:
775: \section{Asymptotics of Quantizers with Side Information}
776: \label{sec:asymptotics}
777:
778: \subsection{Modeling Assumptions and Performance Metric}
779:
780: \subsubsection{Modeling Assumptions}
781:
782: Our goal in this section is to find a simpler expression for $\bar{d}$
783: than that presented in Section~\ref{sec:distortion-nonasymptotic}. To
784: do so, we work under some extra assumptions:
785: {\it\begin{itemize}
786: \item The correlation coefficient $\rho$ between $X$ and $Y$ is close
787: to 1.
788: \item The coding rate $R$ is large.
789: \item The scale factor $s$ is small.
790: \end{itemize}}
791: The effect of these assumptions is illustrated in Fig.~\ref{fig:assumptions}.
792:
793: \begin{figure}[ht]
794: \centerline{\psfig{file=assumptions.eps,height=6cm,width=12cm}}
795: \vspace{-2mm}
796: \caption{Illustration (in one dimension) of the meaning of the asymptotic
797: regime considered in this work. Working under an assumption of high
798: correlations, we have that the conditional distribution of the source
799: ${\bf x}$ given side information ${\bf y}$ is sharply concentrated around
800: its mean value ${\bf y}$ -- as a result, we can make the probability of
the source ${\bf x}$ being away from ${\bf y}$ by more than any positive
802: constant be arbitrarily small (by choosing $\rho$ close enough to 1),
803: and hence we can assume that sublattice cells, while being vanishingly
804: small themselves ($s\approx 0$), can be considered large enough to
805: contain most of the probability in $f_{X|Y}$. Then, because we take
806: $R$ large, we further partition each sublattice cell into a large
807: number of much smaller fine lattice cells.}
808: \label{fig:assumptions}
809: \end{figure}
810:
811: The basic intuition on which our analysis in this section is built is
812: very simple: by considering high enough correlations, the encoder can
813: ``roughly center'' the conditional distribution $f_{X|Y}$ at the centroid
814: of a sublattice cell, a cell that is large enough to make the probability
815: that the source vector ${\bf x}$ is not in the considered cell negligible,
816: but at the same time small enough so that tools employed in classical
817: quantization problems can be applied.
818:
819: Recall that as mentioned earlier, unlike in classical high rate asymptotics
820: where $R\to\infty$ results in $\nu(\Lambda)\to 0$, in this case we must
821: explicitly force $s\to 0$, but not ``too fast'' -- in this case, too fast
would be at a rate equal to or faster than the rate at which $f_{X|Y}$ shrinks,
823: as $|\rho|\to 1$. We will do so by setting the scale factor $s$ to be
824: $s = s(\rho)$, where $s:(-1,1)\to\mathbb{R}^+$ is such that
825: \begin{eqnarray}
826: \lim_{|\rho|\to 1} s(\rho) & = & 0, \nonumber \\
827: \lim_{|\rho|\to 1} \frac{s(\rho)}{\sigma_X\sqrt{1-\rho^2}} & = & \infty.
828: \label{eq:choice-s}
829: \end{eqnarray}
830: For example,
$s = \sigma_X\sqrt{1-\rho^2}\,\log\left(1\big/\big(\sigma_X\sqrt{1-\rho^2}\big)\right)$
832: satisfies these conditions.
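
To verify this, write $t=\sigma_X\sqrt{1-\rho^2}$ (a shorthand introduced
only for this check), so that $t\to 0^+$ as $|\rho|\to 1$. Then
\[
s \;=\; t\log(1/t) \;\to\; 0
\hspace{1cm} \mbox{and} \hspace{1cm}
\frac{s}{\sigma_X\sqrt{1-\rho^2}} \;=\; \log(1/t) \;\to\; \infty,
\]
which are the two conditions in~(\ref{eq:choice-s}).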
833:
834: \subsubsection{Performance Metric}
835:
836: Some justification seems necessary at this point for considering
837: high-correlation asymptotics (i.e., $|\rho|\to 1$), since under this
838: assumption, the side information available uncoded at the decoder
839: already contains almost all of the information about the source. And
840: indeed, once we are done with our calculations, we will confirm the
841: (hardly surprising) fact that for any fixed target distortion $D$,
842: using these proposed quantizers and as $|\rho|\to 1$, the rate required
843: to achieve $D$ vanishes. This is a condition that must be satisfied
844: by {\em any} decent quantizer. However, that is not why we are
845: interested in this analysis: instead, our goal is to evaluate
846: \begin{equation}
847: \lim_{|\rho|\to 1} \frac {\bar{d}}{D(R)},
848: \label{eq:figure-of-merit}
849: \end{equation}
850: where $\bar{d}$ is the distortion of our quantizers, and $D(R)$ is
the Wyner-Ziv rate/distortion function; that is, we wish to compare
852: the {\em slope} of the distortion function for our proposed quantizers
853: at asymptotically high correlations, with that of the Wyner-Ziv
854: bound. This {\em is} a meaningful performance metric, as it determines
855: the rate of decay of distortion relative to the fastest possible
856: decay.\footnote{This type of analysis is similar in spirit to (and
857: inspired by) that of Verd\'u for modulation schemes operating at
858: asymptotically low SNRs~\cite{Verdu:02}.}
859:
860: \subsection{Asymptotics of the Average Error With Geometrically Similar
861: Coarse and Fine Lattices}
862: \label{sec:average-error}
863:
864: \subsubsection{A Simpler Expression}
865:
866: To obtain a simpler expression for $\bar d$ than that of
867: eq.~(\ref{eq:distortion}), we start by expanding it in a different way:
868: \begin{eqnarray}
869: \bar d
870: & \stackrel{(a)}{=} &
871: \mbox{$\frac 1 n$} \int_{{\bf x}\in\mathbb{R}^n} \int_{{\bf y}\in\mathbb{R}^n}
872: ||{\bf x} - \gamma({\bf x},{\bf y})||^2 f_{XY}({\bf x},{\bf y})
873: \mbox{d}{\bf x}\mbox{d}{\bf y}
874: \nonumber \\
875: & = & \mbox{$\frac 1 n$}
876: \int_{{\bf y}\in\mathbb{R}^n}\left[\int_{{\bf x}\in\mathbb{R}^n}
877: ||{\bf x} - \gamma({\bf x},{\bf y})||^2 f_{X|Y}({\bf x}|{\bf y})
878: \mbox{d}{\bf x}\right] f_Y({\bf y})\mbox{d}{\bf y}
879: \nonumber \\
880: & \stackrel{(b)}{=} & \mbox{$\frac 1 n$}
881: \sum_{\lambda\in s\Lambda} \int_{{\bf y}\in V[\lambda:s\Lambda]}
882: \left[\int_{{\bf x}\in\mathbb{R}^n}
883: ||{\bf x} - \gamma({\bf x},{\bf y})||^2 f_{X|Y}({\bf x}|{\bf y})
884: \mbox{d}{\bf x}\right] f_Y({\bf y})\mbox{d}{\bf y}
885: \nonumber \\
886: & \stackrel{(c)}{\approx} &
887: \mbox{$\frac 1 n$} \sum_{\lambda\in s\Lambda}
888: \left[ \int_{{\bf x}\in\mathbb{R}^n}
889: ||{\bf x}-\gamma({\bf x},\lambda)||^2 f_{X|Y}({\bf x}|\lambda)
890: \mbox{d}{\bf x} \right] f_Y(\lambda)\nu(s\Lambda)
891: \nonumber \\
892: & \stackrel{(d)}{=} &
893: \mbox{$\frac 1 n$}
894: \left[ \int_{{\bf x}\in\mathbb{R}^n}
895: ||{\bf x}-\gamma({\bf x},\mathbf{0})||^2 f_{X|Y}({\bf x}|\mathbf{0})
896: \mbox{d}{\bf x} \right]
897: \left(\sum_{\lambda\in s\Lambda} f_Y(\lambda)\nu(s\Lambda)\right)
898: \nonumber \\
899: & \stackrel{(e)}{\approx} & \underbrace{\mbox{$\frac 1 n$}
900: \int_{{\bf x}\in V[{\bf 0}:s\kappa(\Lambda)]}
901: ||{\bf x}-\gamma_k({\bf x})||^2 f_{X|Y}({\bf x}|\mathbf{0})
902: \mbox{d}{\bf x}}_{\alpha}
903: \\ & & \mbox{\hspace{2mm}} + \underbrace{\mbox{$\frac 1 n$}
904: \sum_{\lambda\in s\kappa(\Lambda)\backslash\{{\bf 0}\}}
905: \int_{{\bf x}\in V[\lambda:s\kappa(\Lambda)]}
906: ||{\bf x}-\big(\lambda+\gamma_k({\bf x})\big)||^2
907: f_{X|Y}({\bf x}|\mathbf{0}) \mbox{d}{\bf x}}_{\beta}
908: \label{eq:def-alpha-beta}
909: \end{eqnarray}
910: where:
911: \begin{itemize}
912: \item[\small $(a)$] is again just the definition of average distortion;
913: \item[\small $(b)$] follows from partitioning the set of all side information
914: sequences ${\bf y}$ into Voronoi cells of the fine lattice $s\Lambda$;
915: \item[\small $(c)$] follows from the assumption that $\nu(s\Lambda)$ is
916: small, and from the continuity of $\int_{{\bf x}\in\mathbb{R}^n}
917: ||{\bf x} - \gamma({\bf x},{\bf y})||^2 f_{X|Y}({\bf x}|{\bf y})
918: \mbox{d}{\bf x}$ as a function of $\mathbf{y}$;
\item[\small $(d)$] follows from the symmetry of $f_{X|Y}$ as a function of
$\mathbf{y}$;
921: \item[\small $(e)$] follows from the fact that $f_Y$ integrates to 1, and
922: from splitting the domain of integration of ${\bf x}$ into Voronoi cells of
923: the sublattice $s\kappa(\Lambda)$.
924: \end{itemize}
925: Our next goal is to find simpler expressions for $\alpha$ and $\beta$.
926:
To simplify $\alpha$, we observe that this term denotes the MSE incurred
when quantizing samples of a distribution $f_{X|Y}({\bf x}|\xi)$
929: with an $N$-level fixed-rate {\em uniform} quantizer, if we assume that
930: the overload cells of the quantizer occur with negligible probability --
931: and this assumption is justified because, for $|\rho|\approx 1$, sublattice
932: cells are large relative to the spread of $f_{X|Y}$ due to our choice of
933: $s$ in~(\ref{eq:choice-s}). Now, again under the assumption that $R$ is
934: large, the random shift in the mean of $f_{X|Y}$ given by its dependence
935: on the unknown parameter $\xi$ is negligible compared to the size of a
936: sublattice cell. Thus, by choosing a value of $|\rho|$ close enough to
937: 1, the probability of ${\bf x}\not\in V[{\bf 0}:s\kappa(\Lambda)]$ can
938: be made arbitrarily small. This is illustrated in
939: Fig.~\ref{fig:simplify-alpha}.
940:
941: \begin{figure}[ht]
942: \centerline{\psfig{file=simplify-alpha.eps,height=5cm,width=12cm}}
943: \caption{Illustration (in one dimension) of the concept that, irrespective
944: of a small random shift in the mean introduced by
945: the unknown side information, a fine quantization of the sublattice cell
946: (thin lines in between thick lines) results in a fine quantization of
947: the unknown distribution. The true distribution could be any of those
948: illustrated for various unknown vectors $\xi_k$.}
949: \label{fig:simplify-alpha}
950: \end{figure}
951:
952: The requirement that the fine and coarse quantizers be geometrically
953: similar lattices results in cells of the coarse lattice being partitioned
954: {\em uniformly} by the fine lattice; this is the optimal quantizer for
955: a source that is uniformly distributed over a sublattice cell, not
956: distributed according to $f_{X|Y}$. Therefore, defining a new pdf
957: $p(\mathbf{x})=\frac 1{s^nN}$ if $\mathbf{x}$ is in the corresponding
958: sublattice cell, and zero otherwise, we have that
959: \[ \lim_{N\to\infty}N^{\frac 2 n}\alpha = G(\Lambda)s^2;
960: \]
961: this follows from evaluating eqn.~(81) in~\cite[Ch.\ 2]{neil:splag}
962: for the uniform distribution $p$ defined above, specialized to the
963: lattice $\Lambda$. Therefore, for $N$ large, we can (equivalently)
964: say that
965: \[ \alpha \;\;\approx\;\; G(\Lambda)s^2e^{-2R}.
966: \]
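
The step between these two forms uses only the high-rate approximation
$R\approx\frac 1 n\ln N$ of Section~\ref{sec:rate-computation}:
\[
N^{-\frac 2 n} \;=\; e^{-\frac 2 n \ln N} \;\approx\; e^{-2R},
\hspace{1cm} \mbox{so that} \hspace{1cm}
\alpha \;\approx\; G(\Lambda)s^2N^{-\frac 2 n}
\;\approx\; G(\Lambda)s^2e^{-2R}.
\]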
967:
968: Since $\beta\geq 0$, we have that $\bar d\geq\alpha$, and so
969: \begin{eqnarray}
970: \bar d
971: & \geq & G(\Lambda)\,s^2\,e^{-2R}.
972: \label{eq:distortion-aroundzero-similarcoarsefine}
973: \end{eqnarray}
974:
975: \subsubsection{Comparison Against Wyner's Rate/Distortion Bound}
976:
977: Our next step is to evaluate the figure of merit defined
978: by~(\ref{eq:figure-of-merit}). To this end, consider Wyner's
979: rate/distortion bound~\cite{Wyner:78}:\footnote{In
980: Wyner's paper, the bound is given in the form $R(d)=\frac{1}{2}\log\left(
981: \frac{\sigma_X^2\sigma_U^2}{(\sigma_X^2+\sigma_U^2)d}\right)$ (for
982: the low distortion region), where $\sigma_X^2$ is the variance of $X$,
983: and $Y=X+U$, where $U$ has variance $\sigma_U^2$. A straightforward
984: manipulation puts Wyner's expression in the form shown here.}
985: \begin{equation}
986: D(R) = \sigma_X^2(1-\rho^2)e^{-2R}.
987: \label{eq:wynerziv-rdfunction}
988: \end{equation}
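The manipulation in question can be sketched as follows, under the
assumption that $U$ is independent of $X$ and that logarithms are
natural. Since $\mbox{Cov}(X,Y)=\sigma_X^2$ and
$\mbox{Var}(Y)=\sigma_X^2+\sigma_U^2$, we have
\[
\rho^2 \;=\; \frac{\sigma_X^2}{\sigma_X^2+\sigma_U^2},
\hspace{1cm}
\sigma_X^2(1-\rho^2) \;=\;
\frac{\sigma_X^2\sigma_U^2}{\sigma_X^2+\sigma_U^2},
\]
so that $R(d)=\frac 1 2\log\big(\sigma_X^2(1-\rho^2)/d\big)$; solving
for $d$ yields~(\ref{eq:wynerziv-rdfunction}).
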
989: Plugging eqns.~\eqref{eq:distortion-aroundzero-similarcoarsefine}
990: and~\eqref{eq:wynerziv-rdfunction} into~(\ref{eq:figure-of-merit}), we get
991: \begin{eqnarray*}
992: \lim_{|\rho|\to 1} \frac {\bar{d}}{D(R)}
993: & \geq & \lim_{|\rho|\to 1}
994: \frac{G(\Lambda) s^2 e^{-2R}}
995: {\sigma_X^2(1-\rho^2)e^{-2R}} \\
996: & = & G(\Lambda)
997: \lim_{|\rho|\to 1}\frac{s^2}{\sigma_X^2(1-\rho^2)} \\
998: & = & \infty;
999: \end{eqnarray*}
the divergence of this limit follows from the choice of lattice scaling
1001: specified in eqn.~\eqref{eq:choice-s}. Therefore, when the fine
1002: quantizer is constrained to be a lattice that is geometrically similar
1003: to the coarse lattice, the performance of the resulting Wyner-Ziv
1004: quantizer is very poor in the asymptotic regime of high correlations.
1005: This observation motivates us to introduce a small modification in
1006: our code construction.
1007:
1008: \subsection{Asymptotics of the Average Error with a Coarse Lattice and
1009: an Optimal Fixed-Rate Fine Quantizer}
1010:
1011: \subsubsection{A Simpler Expression}
1012:
1013: The suboptimality of the code construction based on two geometrically
1014: similar lattices stems from the fact that sublattice cells are partitioned
1015: uniformly, but the source distribution $f_{X|Y}$ being quantized is not
1016: uniform. Therefore, we enlarge the class of codes considered:
1017: \begin{itemize}
1018: \item we keep the requirement that the coarse quantizer be a lattice;
1019: \item we keep the same quantization algorithm of eqn.~\eqref{eq:q-alg};
1020: \item but we now allow for the fine quantizer to be any arbitrary
1021: fixed-rate classical vector quantizer.
1022: \end{itemize}
1023: By removing the restriction that the fine quantizer also be a lattice,
1024: we can now choose one still with $N$ reconstruction points, but whose
1025: output point density, instead of being uniform, is matched to the
1026: distribution $f_{X|Y}(\mathbf{x}|\mathbf{0})$. As a result, we conclude
1027: that there exists a quantizer such that
1028: \[ \lim_{N\to\infty} N^{\frac 2 n}\alpha\;\;=\;\;G_n||f_{X|Y}||_{\frac{n}{n+2}},
1029: \]
1030: where $||f||_{\frac{n}{n+2}} \triangleq \big[ \int f^{\frac{n}{n+2}}(x)
1031: \mbox{d}x \big]^{\frac{n+2}{n}}$, and where $G_n$ depends only on $n$ (but
1032: not on the source distribution), and is bounded in terms of the standard
1033: $\Gamma$ function by
1034: \begin{equation}
1035: \frac 1{(n+2)\pi}\;\Gamma\Big(\frac n 2+1\Big)^{\frac 2 n}
1036: \;\;\leq\;\;
1037: G_n
1038: \;\;\leq\;\;
1039: \frac 1{n\pi}\;\Gamma\Big(\frac n 2+1\Big)^{\frac 2 n}
1040: \;\Gamma\Big(1+\frac 2 n\Big),
1041: \label{eq:bounds-Gn}
1042: \end{equation}
1043: as follows from eqns.~(81) and~(82) of~\cite[Ch.\ 2]{neil:splag}.
1044: Hence, for $|\rho|\approx 1$ and for $N$ large, we can approximate
1045: $\alpha$ by
1046: \[ \alpha\;\;\approx\;\;G_n\,||f_{X|Y}||_{\frac{n}{n+2}}\,e^{-2R}.
1047: \]
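
As a quick sanity check on the bounds~(\ref{eq:bounds-Gn}), take $n=1$.
Using $\Gamma(\frac 3 2)=\frac{\sqrt{\pi}}{2}$ and $\Gamma(3)=2$,
\[
\frac{1}{3\pi}\cdot\frac{\pi}{4} \;=\; \frac{1}{12}
\;\;\leq\;\; G_1 \;\;\leq\;\;
\frac{1}{\pi}\cdot\frac{\pi}{4}\cdot 2 \;=\; \frac{1}{2},
\]
and the lower bound coincides with the familiar normalized second moment
$\frac{1}{12}$ of a uniform scalar quantizer.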
1048:
1049: To simplify $\beta$, the following estimate is obtained in
1050: Appendix~\ref{app:trivial1}:
1051: \begin{equation}
1052: \beta \;\; \approx \;\; \mbox{$\frac 1 n$}
1053: \frac{2\nu(s\kappa(\Lambda))e_ns^2}
1054: {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}
1055: \left(\frac{e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}
1056: {1-e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}\right).
1057: \label{eq:b}
1058: \end{equation}
1059:
1060: Combining these two estimates, we arrive at a final expression for
1061: $\bar d$:
1062: \begin{eqnarray}
1063: \bar d
1064: & \approx & G_n\,||f_{X|Y}||_{\frac{n}{n+2}}\,e^{-2R}
1065: + \mbox{$\frac 1 n$}
1066: \frac{2\nu(s\kappa(\Lambda))e_ns^2}
1067: {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}
1068: \left(\frac{e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}
1069: {1-e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}\right)
1070: \label{eq:distortion-aroundzero}
1071: \end{eqnarray}
1072:
1073: \subsubsection{Comparison Against Wyner's Rate/Distortion Bound}
1074:
1075: Plugging eqns.~\eqref{eq:wynerziv-rdfunction}
1076: and~\eqref{eq:distortion-aroundzero} into~(\ref{eq:figure-of-merit}),
1077: we now get
1078: \begin{eqnarray*}
1079: \lim_{|\rho|\to 1} \frac {\bar{d}}{D(R)}
1080: & = & \lim_{|\rho|\to 1}
1081: \frac{G_n ||f_{X|Y}||_{\frac{n}{n+2}} e^{-2R}
1082: + \mbox{$\frac 1 n$}
1083: \frac{2\nu(s\kappa(\Lambda))e_ns^2}
1084: {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}
1085: \left(\frac{e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}
1086: {1-e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}\right)}
1087: {\sigma_X^2(1-\rho^2)e^{-2R}} \\
1088: & = & G_n
1089: \lim_{|\rho|\to 1}\frac{||f_{X|Y}||_{\frac{n}{n+2}}}
1090: {\sigma_X^2(1-\rho^2)}
1091: + \;\; \lim_{|\rho|\to 1} \mbox{$\frac 1 n$}
1092: \frac{2\nu(s\kappa(\Lambda))e_ns^2}
1093: {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}
1094: \left(\frac{e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}
1095: {1-e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}\right)
1096: \frac{1}{\sigma_X^2(1-\rho^2)e^{-2R}}.
1097: \end{eqnarray*}
1098:
1099: From eqn.~(57) in~\cite{zador:quantization-asymptotics}, we have that
1100: $\lim_{n\to\infty} ||f_n||_{\frac{n}{n+2}} = e^{2h(f)}$, where $f_n=(f)^n$
1101: is the $n$-dimensional source distribution, and $h$ denotes differential
entropy. We do not know of a way to simplify this expression for small
1103: $n$, so we approximate it with its limit value as $n$ gets
1104: large.\footnote{It is important to emphasize that although we consider
1105: large blocks to simplify $||f_n||_{\frac{n}{n+2}}$, this does {\em not}
1106: mean that the distortion expression thus obtained is only valid for high
1107: dimensional quantizers: we can consider long source blocks, in which
1108: small sub-blocks are quantized with low dimensional codes (for example,
1109: {\em scalar} quantizers), and this form would still apply.}
1110: For the conditional Gaussian distribution,
1111: $h(f) = \frac 1 2 \log\big(2\pi e\sigma_X^2(1-\rho^2)\big)$, and hence
1112: \[ G_n\lim_{|\rho|\to 1}
1113: \frac{\lim_{n\to\infty}||f_{X|Y}||_{\frac{n}{n+2}}}
1114: {\sigma_X^2(1-\rho^2)} \;\;=\;\; G_n\;2\pi e. \]
1115: Note as well that the second term vanishes: for $|\rho|\to 1$,
1116: from~(\ref{eq:choice-s}) we have that $s^2/\big(\sigma_X^2(1-\rho^2)\big)
1117: \to\infty$, and thus this expression is dominated by the vanishing term
1118: $e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}$.
1119: Hence, we conclude that, by explicitly scaling the quantizers with
1120: $s$ satisfying conditions~(\ref{eq:choice-s}),
1121: \[ \lim_{|\rho|\to 1} \frac{\bar{d}}{D(R)} \;\;=\;\; G_n\; 2\pi e. \]
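
To spell out why the second term vanishes, write
$t^2=\sigma_X^2(1-\rho^2)$ and $u=s/t$ (shorthands introduced only for
this remark), use $\nu(s\kappa(\Lambda))=s^nN$ from
Section~\ref{sec:rate-computation}, and assume that $e_n$, $N$ and $R$
are held fixed as $|\rho|\to 1$. Then, from~(\ref{eq:b})
and~(\ref{eq:wynerziv-rdfunction}),
\[
\frac{\beta}{D(R)} \;\;\approx\;\;
\frac{2\,e_n\,N\,e^{2R}}{n\,(2\pi)^{\frac n 2}}\;
u^{n+2}\;\frac{e^{-u^2/2}}{1-e^{-u^2/2}},
\]
which tends to 0 as $u\to\infty$, since the exponential factor decays
faster than any power of $u$ grows.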
1122:
1123: Finally, since for $n$ large the upper and lower bounds on $G_n$
1124: given in eqn.~\eqref{eq:bounds-Gn} coincide and take the value
1125: $\frac 1{2\pi e}$~\cite[pg.\ 58]{neil:splag}, we see that indeed,
1126: as $n\to\infty$, there exist high-dimensional codes for which this
1127: limit can be made arbitrarily close to 1. {\em Hence, asymptotically
1128: in rate and correlation, our code constructions achieve the Wyner-Ziv
1129: bound.}
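
The coincidence of the two bounds can be checked directly with Stirling's
formula, which gives
$\Gamma\big(\frac n 2+1\big)^{\frac 2 n}=\frac{n}{2e}\,(1+o(1))$ as
$n\to\infty$. Substituting in~(\ref{eq:bounds-Gn}),
\[
\frac{n}{n+2}\cdot\frac{1}{2\pi e}\,(1+o(1))
\;\;\leq\;\; G_n \;\;\leq\;\;
\frac{1}{2\pi e}\;\Gamma\Big(1+\frac 2 n\Big)(1+o(1)),
\]
and both sides converge to $\frac{1}{2\pi e}$, since $\Gamma(1)=1$.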
1130:
1131: \subsection{Some Intuitive Remarks}
1132:
1133: \subsubsection{On the Optimality of our Codes, in Hindsight}
1134:
1135: Informally, these are the key elements contributing to the optimality
1136: of our codes:
1137: \begin{itemize}
1138: \item The codes are scaled in a way such that, as correlation
1139: increases, the tails of the conditional distribution $f_{X|Y}$
1140: outside a cell of the coarse quantizer become increasingly light.
1141: \item At high correlations, our scaling of the codes results
1142: in the size of cells in the coarse quantizer being small. But
1143: at high rates, the size of a cell in the fine quantizer is negligible
1144: even relative to the small coarse cells. And the side information
1145: is, with high probability, ``pinned'' within one of the small fine
1146: quantizer cells.
1147: \item Because the tails of $f_{X|Y}$ are increasingly light
1148: as correlation increases, and $f_{X|Y}$ is {\em not} uniform,
1149: an optimal quantizer for a uniform distribution is mismatched
1150: to the actual statistics of the data, thus resulting in a severe
1151: penalty in rate. However, this penalty can be eliminated entirely
1152: in a very simple way: only changing the shape of the cells for
1153: the fine quantizer is enough -- if the output point density of
1154: the fine quantizer is matched to the pinned form of $f_{X|Y}$,
1155: this is an optimal code.
1156: \end{itemize}
1157: Essentially, our construction is asymptotically optimal (in rate
1158: and correlation), because we scale the lattice in a way such that
we create multiple copies of $f_{X|Y}$, one within each cell of
1160: the coarse lattice, and we use an optimal code within that cell.
1161:
1162: \subsubsection{On Why $R^*(d)=R_{X|Y}(d)$ for Gaussian Sources}
1163:
1164: This asymptotic analysis also sheds light on why there is no
1165: rate loss for Wyner-Ziv coding of Gaussian sources, at least in
1166: the asymptotic regime of high rates and high correlations. Note
1167: that the conditional distribution $f_{X|Y}$ depends on the side
1168: information $\mathbf{y}$ only in the form of a random shift: this
1169: random shift becomes negligible at high rates, but more importantly,
1170: the {\em shape} of $f_{X|Y}$ is independent of $\mathbf{y}$. As
a result, a single code can be used to quantize the copies of $f_{X|Y}$
pinned one within each cell of the coarse lattice. It is this
invariance property of the conditional Gaussian distribution that
results in $R^*(d)=R_{X|Y}(d)$, at least in the asymptotic
regime considered in this section.
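
To make this invariance concrete, the following is a sketch for the scalar
case, under the assumption (ours, for illustration only) that $X$ and $Y$ are
zero-mean and jointly Gaussian with variances $\sigma_X^2$ and $\sigma_Y^2$
and correlation coefficient $\rho$; the symbol $\sigma_Y^2$ is introduced
only for this example. In that case the conditional density is
\[
f_{X|Y}(x|y) \;=\; \frac{1}{\sqrt{2\pi\sigma_X^2(1-\rho^2)}}\,
\exp\left(-\frac{\big(x-\rho\frac{\sigma_X}{\sigma_Y}y\big)^2}
{2\sigma_X^2(1-\rho^2)}\right),
\]
so that $y$ enters only through the shift $\rho\frac{\sigma_X}{\sigma_Y}y$
of the mean, while the conditional variance $\sigma_X^2(1-\rho^2)$ does not
depend on $y$ and vanishes as $\rho\to 1$.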
1176:
1177:
1178: \section{Applications in Sensor Networks}
1179: \label{sec:sensor-networks}
1180:
1181: \subsection{Discussion}
1182:
1183: Issues in the analysis of performance of wireless networks have received
considerable attention in recent times. To a large extent, interest in
these topics has been sparked by an observation made by Gupta and Kumar:
1186: the total throughput that can be carried by one particular class of
1187: wireless networks is only $O(\sqrt{n})$,\footnote{A word on notation.
1188: In this section, $n$ denotes number of nodes in the network, and $N$
1189: denotes block length. This notation should not be confused with that
in previous sections, where $n$ was used to refer to block length, and
1191: $N$ to the number of reconstruction codewords in a code.}
1192: for a network having $O(n)$ nodes~\cite{GuptaK:00}. As a result, each
1193: source-destination pair gets a throughput of $O(1/\sqrt{n})$, i.e., the
1194: amount of information that any one individual node can inject into the
1195: network vanishes as the network size increases. The model used for
1196: performance analysis in~\cite{GuptaK:00} was conceived as an abstraction
1197: for emerging ad-hoc wireless networks, made up of small appliances (such
1198: as laptop computers or microwave ovens or door locks), interconnected
via standard air interfaces (such as Bluetooth or 802.11). In that
context, the fact that the capacity available to each node decreases as
more nodes join the network clearly poses serious problems, since
there is no reason to believe that there will be any dependencies in the
data generated by each of these devices. And these problems prompted
1204: the conclusion in~\cite{GuptaK:00} that networks with either a small
1205: number of nodes, or with a small number of connections, may be more
1206: likely to find acceptance.
1207:
In our work, we consider a different type of wireless network: we
1209: focus on {\em sensor} networks, i.e., networks of devices that collect
1210: measurements of a process that is ``regular'' in some sense. For example,
1211: if the sensors measure ozone concentration in the atmosphere, then the
1212: values of each measurement will not be independent in general, but instead
1213: will be constrained by an appropriate form of the Navier-Stokes equations.
1214: If the sensors measure temperatures at different locations of a material,
the measurements will be constrained by Fourier's heat equation. And
1216: in general, when the sensors sample values of some random process at
1217: different locations, these samples will be constrained by the correlation
1218: structure of the process (see, e.g.,~\cite{ServettoR:06}). By considering
1219: correlated sources we generalize in what we believe is a very meaningful
1220: way the setup of~\cite{GuptaK:00}: now the amount of information generated
1221: by each node is no longer a constant, but instead it depends on the size
1222: of the network itself.
1223:
1224: \subsection{Network Model}
1225:
1226: Consider the following problem setup:
1227:
1228: \begin{itemize}
\item There is a source of information, modeled by a process $X_u(k)$: for
fixed values of $k$, $X_u(k)$ is a Brownian motion in $u$ with parameter
$\sigma_X^2$; for fixed values of $u\in[0,1]$, $X_u(k)$ is an iid sequence
in $k$. That is, at a fixed location $u$, iid samples with distribution
$N(0,\sigma_X^2u)$ are collected in discrete time, and at a fixed time
slot, a Wiener process unfolds in space.
1235: \item Network nodes are represented by points on the unit square
1236: $[0,1]\times[0,1] \subset \mathbb{R}^2$, and are classified into
1237: three groups:
1238: \begin{itemize}
1239: \item There are $n$ {\em source} nodes $s$, that feed information into
1240: the network, uniformly spread on the left edge of the square.
1241: \item There are $n$ {\em destination} nodes $d$, that take information
1242: out of the network, uniformly spread on the right edge of the square.
\item There are $n$ {\em router} nodes $r$, optimally placed in
the interior of the square to maximize network throughput. These nodes
are pure routers: they neither inject information into nor extract it from
the network, and they do not apply any form of coding; they only forward
information to other nodes.
1248: \end{itemize}
1249: \item The $m$-th source collects samples of $X_{m/n}(k)$, and
1250: encodes this information prior to sending it to the $m$-th destination
($m=1,\ldots,n$). The only information available to each source is:
1252: \begin{itemize}
1253: \item The observed samples $X_{m/n}(k)$.
1254: \item The position in the square of all the nodes.
1255: \item The statistics of the entire process $X$.
1256: \end{itemize}
1257: \item Each destination node forwards whatever data it receives to a special
1258: node $d$, which {\em jointly} decodes all the data received, and computes
1259: an estimate $\hat{X}_u(k)$ of the entire sample path $X_u(k)$ based on all
1260: the decoded samples $X_{m/n}(k)$'s.
1261: \item Nodes do not move, and have an unbounded power supply.
\item A bit is successfully sent from node $v_i$ to node $v_j$ if
(a) $\|v_i-v_j\|<\Delta_i$, and (b) for all other transmitting nodes
$v_k$, $\|v_k-v_j\| \geq \Delta_k$, where $\Delta_i$ denotes the
transmission range of node $v_i$. $R$ bits per channel use can be
transmitted over any link.
1266: \item Routing and power control are optimally configured to maximize network
1267: throughput.
1268: \end{itemize}
1269: Note that in this model we explicitly rule out the possibility of
1270: source nodes exchanging information to cooperate in the encoding of their
1271: observations. Note also that routers only forward data, but do not apply
1272: any form of coding. That is, encoding is distributed among the sensors,
1273: data is carried over the network by relay nodes, and decoding is
1274: performed at a central location.
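
As a quick check on the source model, the second-order statistics of the
field can be written out explicitly. This is only a sketch: we assume the
standard normalization $X_0(k)=0$ of the Wiener process and independence
across time slots at all locations, and we write $\sigma_X^2$ for the
variance parameter of the field, as in the distortion expressions of this
section. Under these assumptions,
\[
E\big(X_u(k)\,X_v(k')\big) \;=\;
\left\{\begin{array}{ll}
\sigma_X^2\min(u,v), & k=k', \\
0, & k\neq k',
\end{array}\right.
\hspace{8mm} u,v\in[0,1].
\]
In particular, samples taken at nearby locations in the same time slot are
strongly correlated, whereas samples taken in different time slots are
uncorrelated.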
1275:
1276: We should point out that our model is different from the model of Gupta
1277: and Kumar~\cite{GuptaK:00}: whereas in their model they consider $n$ nodes
1278: which serve as transmitters/receivers/relays all in a single device, we
1279: break up each device into three pieces, and consider $n$ transmitters, $n$
1280: receivers, and $n$ relays. However, this is not a fundamental difference:
1281: as long as we keep the same number of all three types of devices,
1282: the two models are essentially the same, and therefore their results on
the property of vanishing throughputs as $n\rightarrow\infty$ still hold
1284: for our model. The idea of splitting the devices into three separate units
1285: is to model a situation in which data is captured at some location, is
1286: transported over an ad-hoc network, and an estimate of the field of
1287: measurements is formed at a remote location.
1288:
1289: \subsection{Encoding/Decoding Mechanics in Large Networks}
1290:
Clearly, a network with a finite number of nodes and with communication
links of finite capacity among nodes can transport only a finite amount
of information. Therefore, exact reconstruction of the Brownian field
$X_u(k)$ will not be possible in general, and a key issue then is that
of understanding the rate/distortion tradeoffs involved. A thorough study
of this new rate/distortion problem lies outside the scope intended for
this paper, and we will deal with this problem elsewhere. Of interest
in this paper, however, is a result that relates the ability of the central
destination node $d$ to estimate the Brownian field $X_u(k)$ to both the
1300: number of nodes in the network and the capacity of the individual network
1301: links. Indeed, we have that under the assumption of a large (but still
1302: independent of network size) link capacity $R$, for any $\epsilon>0$ and
1303: $1-\epsilon\leq\rho<1$, there exists a large enough network of size $n$
1304: nodes, such that
1305: \[
1306: D_{\frac m n} \;\stackrel{\Delta}{=}\;
1307: {\tt E}\left(||X_{\frac m n}(k)-\hat{X}_{\frac m n}(k)||^2\right) \;\leq\;
\sigma_X^2\mbox{\small$\frac{m}{n}$}(1\!-\!\rho^2)
1309: \;e^{-\frac{R}{6\sqrt{n}}}
1310: \mbox{ (a.e.)},
1311: \]
1312: uniformly for $\frac m n$ in the closed interval
1313: $\left[\frac{1}{n(1-\rho^2)},1\!\right]$, where $m\leq n$ is an integer,
1314: for all time slots $k$, and for almost all sample paths of the field
1315: $X_{\frac m n}(k)$.
1316:
1317: Essentially, what this result states is that, under the assumption of a
1318: large network and with links of high capacity, it is possible for $d$
1319: to estimate the sample paths of $X$ with arbitrarily small error. That
1320: accurate estimation is possible is indeed surprising to us, given the
1321: fact that the amount of information per sample that the network can
1322: carry vanishes~\cite{GuptaK:00}---fortunately, so does the information
1323: content per sample, and that is what we can take advantage of.
1324:
1325: \subsubsection{Placement of Nodes and Scheduling of Transmissions}
1326: \label{sec:placement-scheduling}
1327:
1328: First of all, we give one particular distribution of routers in the
1329: plane and one particular algorithm for scheduling transmissions.
1330:
1331: Assume $\ell = \sqrt{n}$ is an even integer, and define:
1332: \begin{itemize}
1333: \item The sources are located at coordinates $(0,\frac{i}{n})$, and the
destinations at coordinates $(1,\frac{i}{n})$, for $i=1,\ldots,n$.
1335: \item There are exactly $n$ routers, located at coordinates
1336: $(\frac{1}{2\ell}+\frac{i}{\ell},\frac{1}{2\ell}+\frac{j}{\ell})$,
for $i,j=0,1,\ldots,\ell-1$.
1338: \item The transmission radius for the source nodes is
1339: $\Delta=\frac{\sqrt{2}}{2\ell}$, and for the routers it is
1340: $\Delta=\frac{1}{\ell}$.\footnote{Recall that destination nodes do not
1341: communicate over the shared wireless medium with the central decoder,
1342: they only receive data that way. Therefore, no transmission range
needs to be specified in their case.}
1344: \end{itemize}
1345:
1346: In order to present an algorithm to schedule transmissions over time,
1347: we need some definitions. First, divide the square
$[0,1]\times[0,1]\subset\mathbb{R}^2$ into $\ell$ sets (horizontal strips)
defined by
\[
S^{(i)} = [0,1]\times\left[\frac{(i\!-\!1)\ell}{n},\frac{i\ell}{n}\right)
\]
$(i\!=\!1,\ldots,\ell)$. Within each $S^{(i)}$, there are:
\begin{itemize}
\item $\ell$ source nodes, at coordinates
$\left(0,\frac{(i-1)\ell+m}{n}\right)$, for $m=0,\ldots,\ell-1$.
\item $\ell$ destination nodes, at coordinates
$\left(1,\frac{(i-1)\ell+m}{n}\right)$, for $m=0,\ldots,\ell-1$.
\item $\ell$ router nodes, at coordinates
$\left(\frac{1}{2\ell}+\frac{k-1}{\ell},
\frac{1}{2\ell}+\frac{i-1}{\ell}\right)$, for $k=1,\ldots,\ell$.
\end{itemize}
1362: Next, we divide the router nodes into three groups $g_0,g_1,g_2$: a
1363: router falls in $g_j$ if its index $k$ is equal to $j$ (mod 3). Source
1364: nodes all belong to the group $g_0$. Finally, we give an algorithm to
1365: schedule transmissions:
1366: \begin{itemize}
1367: \item Time is discrete, and starts at 0. At even time slots, allow
1368: transmissions of nodes in $S^{(i)}$'s for which $i$ is even; at odd time
1369: slots, allow transmissions of nodes for odd $i$'s.
1370: \item Each $S^{(i)}$ keeps its own clock $\tau_i$, which advances only
1371: when transmissions from this $S^{(i)}$ are allowed to proceed: when
1372: $\tau_i\equiv 0$ (mod 3) then $g_0$ sends, when $\tau_i\equiv 1$ (mod 3)
1373: then $g_1$ sends, when $\tau_i\equiv 2$ (mod 3) then $g_2$ sends. And
1374: source nodes send only once every $\ell$ available slots, cycling through
1375: them in round-robin order.
1376: \end{itemize}
1377:
1378: An illustration of the placement and divisions of nodes, and of the
1379: mechanics of the algorithm, is shown in Fig.~\ref{fig:step1}.
1380:
1381: \begin{figure}[!h]
1382: \vspace{-3mm}
1383: \centerline{\psfig{file=layout-and-schedule.eps,height=10cm,width=12cm}}
1384: \vspace{-3mm}
1385: \caption{An example of the placement and division of nodes, and
1386: scheduling of transmissions, for $n=16$ ($\ell=4$). Black dots represent
1387: nodes: 16 sources on the left edge of the square, 16 routers inside the
1388: square, 16 destinations on the right edge of the square. A source sends
1389: data to a destination on the same horizontal line. Thin solid lines
1390: joining nodes are
1391: routes. The sets $S^{(i)}$ and the groups $g_i$ are indicated with dotted
1392: lines. Active transmissions are indicated with a thick arrow, and the
1393: circles around each indicate transmission ranges. The active
1394: transmissions in this picture correspond to an odd time slot (nodes only
1395: within $S^{(1)}$ and $S^{(3)}$ are sending), and the group $g_0$ is active.}
1396: \label{fig:step1}
1397: \end{figure}
1398:
1399: \subsubsection{Throughput per-Node is $\frac{R}{6\sqrt{n}}$}
1400:
1401: The calculation of throughput proceeds in three steps:
1402: \begin{enumerate}
1403: \item Each group $S^{(i)}$ is scheduled for transmission only $\frac{1}{2}$
1404: of the available time slots. Among these slots, only $\frac{1}{3}$ are
1405: available for transmission by $g_0$, the group that contains source nodes.
When this group is scheduled, only one out of every $\ell$ slots is available
to a particular node. And when a particular node finally gets its
chance to inject a message into the network, it injects $R$ bits (equal
to link capacity). Therefore, the total number of bits {\em injected} by
any one source node per unit of time is
$\frac{1}{2}\cdot\frac{1}{3}\cdot\frac{1}{\ell}\cdot R=\frac{R}{6\sqrt{n}}$.
1412: \item By construction, there is never more than one packet of $R$ bits in
1413: the buffer of any router.
1414: \item Also by construction, there is never more than one active transmission
1415: within range of any receiver.
1416: \end{enumerate}
1417: So, from 1 we have that $\frac{R}{6\sqrt{n}}$ bits per time slot are
1418: injected into the network, from 2 we have that there is no buildup of
1419: packets in any one queue, and from 3 we have that packets are never lost
1420: or delayed. Therefore, all injected bits reach destination, and hence
1421: the throughput is $\frac{R}{6\sqrt{n}}$ bits per time slot per node.
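
As an illustrative instance of this count (the numbers below are ours,
chosen only for illustration), take the network of Fig.~\ref{fig:step1},
with $n=16$ and $\ell=4$. A given source node is allowed to transmit, on
average, once every $2\cdot 3\cdot\ell = 24$ time slots, and so it injects
\[
\frac{1}{2}\cdot\frac{1}{3}\cdot\frac{1}{\ell}\cdot R
\;=\; \frac{R}{6\sqrt{16}} \;=\; \frac{R}{24}
\]
bits per time slot. For $n=10^4$ ($\ell=100$) the same count gives
$\frac{R}{600}$ bits per time slot per node: the per-node throughput decays
as $1/\sqrt{n}$, in agreement with the scaling reported
in~\cite{GuptaK:00}.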
1422:
1423: \subsubsection{Use of Codes with Side Information}
1424: \label{sec:use-lqsi}
1425:
1426: So far we have a network in which there is no loss of data, and which
1427: can carry a total of $\frac{R}{6\sqrt{n}}$ bits per time slot per node.
And we collect one sample of the Brownian field $X$ per time slot at
1429: each source node. Therefore, we have $\frac{R}{6\sqrt{n}}$ bits per
1430: sample to encode a block of $N$ samples, for which the network guarantees
1431: delivery.
1432:
1433: Consider encoding a block of samples $X_{m/n}^N
\stackrel{\Delta}{=} [X_{m/n}(0),\ldots,X_{m/n}(N-1)]$ at the $m$-th
1435: source node. Trivially, we have that
1436: $X_{m/n}^N = X_{(m-1)/n}^N +
1437: (X_{m/n}^N-X_{(m-1)/n}^N)$. From standard properties of
1438: Wiener processes, we have that $X_{m/n}^N$ and $X_{(m-1)/n}^N$
1439: are jointly Gaussian, and that the increment has distribution
1440: \[
1441: X_{m/n}^N-X_{(m-1)/n}^N
1442: \;\sim\; N\left(0,\mbox{$\frac{\sigma_X^2}{n}$}{\bf I}\right),
1443: \]
1444: independent of $X_{(m-1)/n}^N$. If $X_{(m-1)/n}^N$ were
1445: available at the $m$-th encoder, the encoding procedure would be trivial:
1446: use standard codes for an iid Gaussian source to send this increment. But
without the reference value $X_{(m-1)/n}^N$, node $m$ cannot compute that
1448: increment, which is the only ``new'' information at location
1449: $\frac{m}{n}$.
1450:
1451: Our encoding procedure is as follows: we encode $X_{m/n}^N$ using the
1452: codes developed in earlier sections, assuming the side information
1453: $X_{(m-1)/n}^N$ is available at the decoder. The relevant statistics
1454: are:
1455: \[
1456: X_{(m-1)/n}^N
1457: \sim N\left(0,\sigma_X^2(m\!-\!1)/n{\bf I}\right),\hspace{8mm}
1458: X_{m/n}^N \sim N\left(0,\sigma_X^2m/n{\bf I}\right),\hspace{8mm}
1459: \rho_{m-1,m} = \sqrt{1-1/m}.
1460: \]
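
For completeness, the value of $\rho_{m-1,m}$ can be verified as follows
(a short sketch for $m\geq 2$, using only the independence of the increment
stated above). Componentwise,
\[
\mbox{Cov}\big(X_{(m-1)/n},X_{m/n}\big)
\;=\; \mbox{Var}\big(X_{(m-1)/n}\big)
\;=\; \sigma_X^2\,\frac{m-1}{n},
\]
and therefore
\[
\rho_{m-1,m}
\;=\; \frac{\sigma_X^2\frac{m-1}{n}}
{\sqrt{\sigma_X^2\frac{m-1}{n}\cdot\sigma_X^2\frac{m}{n}}}
\;=\; \sqrt{\frac{m-1}{m}}
\;=\; \sqrt{1-\frac{1}{m}}.
\]
Consequently, the conditional variance of each component of $X_{m/n}^N$
given $X_{(m-1)/n}^N$ is
$\sigma_X^2\frac{m}{n}(1-\rho_{m-1,m}^2)=\frac{\sigma_X^2}{n}$, which is
exactly the variance of the increment.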
1461:
1462: \subsection{Distortion Computation}
1463:
1464: Next we turn to the computation of distortion for this proposed coding
1465: strategy. Note that since the side information used to decode the data
1466: generated by one node is the data available at previous nodes, and that
decoding errors can indeed occur with nonzero probability (and thus,
1468: in the large-network regime, {\em will} occur), an important issue that
1469: needs to be addressed is the effect of decoding errors on the overall
1470: achieved distortion.
1471:
1472: We proceed in two steps: first we compute the distortion resulting
1473: in the case when no decoding errors occur, and then we compute the increase
1474: in distortion due to decoding errors.
1475:
1476: \subsubsection{Distortion Assuming No Decoding Errors}
1477:
1478: Consider a fixed location $\frac m n$ ($1\leq m\leq n$), a fixed
1479: desired correlation value $\rho$ based on which a large enough value
1480: of $n$ is determined, and assume that no decoding errors occur in
decoding the samples at locations $\frac 1 n,\ldots,\frac{m-1}{n}$.
1482:
1483: In Section~\ref{sec:use-lqsi} above, we argued that we can use
1484: codes with side information to effectively approximate the performance
1485: of a genie-aided encoder capable of sending the increments at each node.
1486: We would like to point out now that in our decoder, the side information
1487: is itself quantized with the coarse lattice. As a result, as long as
$X_{\frac{m-1}{n}}$ and $\hat X_{\frac{m-1}{n}}$ fall in the same sublattice
cell, the reconstruction $\hat X_{\frac m n}$ is as good as if it were
1490: based on {\em uncoded} side information. This is illustrated in
1491: Fig.~\ref{fig:coded-sideinfo}.
1492:
1493: \begin{figure}[!ht]
1494: \centerline{\psfig{file=coded-sideinfo.eps,height=14cm}}
1495: \caption{To illustrate the robustness of the proposed quantizers
1496: to small amounts of quantization noise in the side information: as long
1497: as the side information falls within a sublattice cell (roughly indicated
1498: as the shaded region in this picture), using coded or uncoded side
1499: information does not make a difference. In this case, $X^N_{\!\frac{m-1}n}$
1500: is the sample at the previous location, used as side information for the
1501: sample $X^N_{\!\frac m n}$ at the current location.}
1502: \label{fig:coded-sideinfo}
1503: \end{figure}
1504:
1505: Thus we conclude that, provided no decoding errors occur in any of the
1506: previous samples, and based on the results in Section~\ref{sec:asymptotics},
1507: we can approximate the distortion in the reproduction of each sample
by the Wyner-Ziv rate/distortion bound:
\[
D_{\frac{m}{n}} \:\leq\:
\sigma_X^2\mbox{$\frac{m}{n}$}(1\!-\!\rho^2)\;e^{-\frac{R}{6\sqrt{n}}}.
\]
1513: Note that the inequality in this case is because there will be nodes
1514: operating with a correlation value higher than the specified $\rho$, and
1515: for these values $D_u$ will be even lower than this. The location-dependent
correlation coefficients $\rho_{m-1,m}$ between adjacent samples form a
1517: monotonically increasing sequence $\sqrt{1-1/m}\longrightarrow 1$ as
1518: $m\rightarrow\infty$. A trivial manipulation shows that for all
1519: $m\geq\frac{1}{1-\rho^2}$, $\rho\leq\rho_{m-1,m}<1$, and therefore all node
1520: locations $\frac{m}{n}$ in the closed interval $\left[\frac{1}{n(1-\rho^2)},
1521: 1\right]$ will have correlation values at least $\rho$. Now, since
1522: $m\leq n$, by choosing $n$ large enough we can make $\frac{1}{n(1-\rho^2)}$
1523: come arbitrarily close to zero. So we see that the distortion bound above
1524: holds uniformly for almost all samples in a large network.
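
The manipulation referred to above is simply (for $0\leq\rho<1$)
\[
\sqrt{1-\frac{1}{m}}\;\geq\;\rho
\;\;\iff\;\; 1-\frac{1}{m}\;\geq\;\rho^2
\;\;\iff\;\; m\;\geq\;\frac{1}{1-\rho^2}.
\]
As a numerical illustration (the values are ours, chosen only for
illustration), for $\rho=0.99$ we have
$\frac{1}{1-\rho^2}=\frac{1}{0.0199}\approx 50.3$, so all nodes with
$m\geq 51$ operate at a correlation of at least $0.99$. In a network with
$n=10^4$ nodes, these account for all but the first 50 locations, i.e.,
all but $0.5\%$ of the interval $[0,1]$.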
1525:
1526: At locations $u$ in which there is no sample collected (i.e., any location
1527: in an open interval $\left(\frac{m-1}{n},\frac{m}{n}\right)$), we need to
1528: interpolate $X_u$: we define $\hat{X}_u = \hat{X}_{(m-1)/n}$, where
1529: $(m-1)/n<u<m/n$.\footnote{Note that we could use better interpolators here
than a simple zero-order hold. But already with this rather simple-minded
1531: rule we get the sought result of vanishing estimation error, and hence we
1532: keep it for simplicity.} In this case,
1533: \[
1534: D_u \leq D_\frac{m-1}{n} + \mbox{$\frac{\sigma_X^2}{n}$},
1535: \]
1536: since the interpolation error is at most the size of an increment
1537: between samples, and this increment has variance $\sigma_X^2/n$. Assume
1538: now that the sample path $X_u(k)$ is continuous at $u$:
1539: \begin{itemize}
1540: \item Because $n$ is large, and for a fixed $k\in\mathbb{N}$, we have
1541: a dense sampling of $X_u(k)$, $0\leq u\leq 1$.
1542: \item Because $R$ is large, encoded samples $\hat{X}_u$ available at
1543: the decoder are close to the original value $X_u$, i.e.,
1544: $\hat{X}_u\rightarrow X_u$, $u=\frac{m}{n}$.
1545: \item Because $X_u$ is continuous and $n$ is large, we have that
1546: interpolated samples $X_u\approx X_{(m-1)/n}$
1547: ($\frac{m-1}{n}<u<\frac{m}{n}$), for all $0\leq u\leq 1$.
1548: \end{itemize}
1549: Therefore, $D_u \leq D_\frac{m-1}{n} + \frac{\sigma_X^2}{n}$ holds at
1550: all points of continuity of $X_u$. But finally, since almost all paths
1551: of a Wiener process are continuous~\cite{StarkW:94}, we conclude that
1552: \[
1553: D_u \;\leq\;
1554: \sigma_X^2\left(\mbox{$\frac{m-1}{n}$}(1\!-\!\rho^2)
1555: \;e^{-\frac{R}{6\sqrt{n}}}+\mbox{$\frac{1}{n}$}\right)
1556: \;\; \mbox{(a.e.),}
1557: \]
1558: where $(m-1)/n<u<m/n$, and $1\leq m\leq n$.
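
The interpolation bound used above can be seen from a simple decomposition.
The following is a sketch, per component, under the assumption (ours) that
$\hat{X}_{(m-1)/n}$ depends only on samples collected at locations up to
$\frac{m-1}{n}$. For $\frac{m-1}{n}<u<\frac{m}{n}$,
\[
X_u-\hat{X}_u \;=\;
\big(X_u-X_{(m-1)/n}\big) + \big(X_{(m-1)/n}-\hat{X}_{(m-1)/n}\big),
\]
where the first term is a Wiener increment of zero mean and variance
$\sigma_X^2\big(u-\frac{m-1}{n}\big)\leq\frac{\sigma_X^2}{n}$, independent
of the second term. The cross term therefore vanishes in expectation, and
\[
D_u \;=\; \sigma_X^2\big(u-\mbox{$\frac{m-1}{n}$}\big) + D_{\frac{m-1}{n}}
\;\leq\; D_{\frac{m-1}{n}} + \mbox{$\frac{\sigma_X^2}{n}$}.
\]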
1559:
1560: \subsubsection{Distortion Excess Due to Decoding Errors}
1561:
1562: In the subsection above we obtained an expression for the distortion
1563: in the reconstruction of the sample paths assuming that decoding errors
1564: never occur. This is clearly a lower bound on the achievable distortion.
1565: But we still need to account for the distortion increase that results
1566: from the increasingly likely (as $n\to\infty$) event of a decoding error.
1567: Our next goal is to show that, in large networks, this excess distortion
1568: is negligible compared to the distortion above induced by the quantizers.
1569:
1570: Consider two definitions:
1571: \begin{itemize}
1572: \item $\Upsilon_m$ is a random variable such that $\Upsilon_m = l$ denotes
1573: the event in which $l$ nodes (out of the $m$ right before the node at
1574: location $\frac m n$) make a decoding error. Since conditioned on the
1575: side information being correct, errors are independent at each node,
1576: $\Upsilon_m \sim \mbox{B}(m,p_n)$: a binomial distribution with parameters
1577: $m =$ number of previous nodes, and $p_n = $ probability of decoding
1578: error given that there are $n$ nodes in the network.
1579: \item We refer to the term $\beta$ defined by eqn.~(\ref{eq:def-alpha-beta})
1580: as the {\em excess distortion} at node $m$.
1581: \end{itemize}
1582: Both these definitions are illustrated in Fig.~\ref{fig:excess-distortion}.
1583:
1584: \begin{figure}[ht]
1585: \centerline{\psfig{file=excess-distortion.eps,width=15cm,height=10cm}}
1586: \caption{To illustrate the concept of excess distortion. In this picture
1587: we show the reconstruction that would result when no decoding errors
1588: occur (bottom sample path), and the effects of decoding errors (jumps
1589: of average size $\sqrt{\beta}$, as defined in eqn.~(\ref{eq:def-alpha-beta}),
1590: after each decoding error). Note that these errors do not necessarily
1591: add up coherently from node to node, as illustrated in this picture --
1592: however, taking them to behave in this way provides a valid upper bound
1593: on the total excess distortion they induce.}
1594: \label{fig:excess-distortion}
1595: \end{figure}
1596:
1597: Consider now the distortion in a reconstruction of $X_{\frac m n}$ based
1598: on coded side information:
1599: \begin{eqnarray*}
1600: E\big(||X_{\frac m n}-\hat{X}_{\frac m n}||^2\big)
1601: & \stackrel{(a)}{\approx} &
1602: \alpha_n+\sum_{l=0}^m P(\Upsilon_m = l) \left(l\sqrt{\beta_n}\right)^2 \\
1603: & = & \alpha_n+\beta_n E(\Upsilon_m^2)
1604: \;\; = \;\;
1605: \alpha_n+\beta_n \big(\mbox{Var}(\Upsilon_m)+E^2(\Upsilon_m)\big) \\
1606: & \stackrel{(b)}{=} & \alpha_n+\beta_n\big(m p_n (1-p_n) + m^2 p_n^2\big)
1607: \;\; = \;\; \alpha_n+\beta_n m p_n (1+(m-1)p_n) \\
1608: & \stackrel{(c)}{\leq} & \alpha_n+\beta_n n p_n(1+np_n)
1609: \;\; \approx \;\; \alpha_n + \beta_n n^2 p_n^2 \\
1610: & \stackrel{(d)}{\approx} & \alpha_n + e^{-\frac{n}{2\sigma_X^2}} n^2 p_n^2 \\
1611: & \stackrel{\Delta}{=} & \alpha_n + \beta'_n
1612: \end{eqnarray*}
1613: where:
1614: \begin{itemize}
1615: \item[(a)] follows from eqn.~(\ref{eq:def-alpha-beta}), and from the fact
that if $l$ errors occurred before the decoding of the $m$-th sample, on
1617: average each error contributes distortion $\beta_n$ and in the worst of
1618: cases all these errors add up coherently (the dependence of $\alpha$ and
1619: $\beta$ in eqn.~(\ref{eq:def-alpha-beta}) on $n$ is highlighted by adding
1620: the subscript);
1621: \item[(b)] follows from the binomial distribution of $\Upsilon_m$;
1622: \item[(c)] follows from the fact that the expression above must hold for
1623: all $1\leq m\leq n$;
1624: \item[(d)] follows from the fact that for $n$ large, we can neglect the
1625: polynomial terms associated with the negative exponential, and from the
1626: fact that $\rho = \sqrt{1-\frac 1 n}$.
1627: \end{itemize}
1628: Clearly, as $n\to\infty$, both $\alpha_n\to 0$ and $\beta'_n\to 0$.
1629: But again, this is not an interesting observation. The interesting
1630: observation in this case is that still in the presence of coded side
1631: information and decoding errors, in the regime of high correlations,
1632: $\beta'_n$ is negligible compared to $\alpha_n$, and
1633: $E\big(||X_{\frac m n}-\hat{X}_{\frac m n}||^2\big)\approx\alpha_n$:
1634: \[\begin{array}{ccccccccc}
1635: \lim_{n\to\infty} \frac{\alpha_n+\beta'_n}{\alpha_n}
1636: & = & 1 + \lim_{n\to\infty} \frac{\beta'_n}{\alpha_n}
1637: & \leq & 1 + \lim_{n\to\infty}
1638: \frac{\beta'_n}{\sigma_X^2\mbox{$\frac 1 n$}}
1639: & \leq & 1 + \lim_{n\to\infty}
1640: \frac{e^{-\frac{n}{2\sigma_X^2}} n^3 p_n^2}{\sigma_X^2}
1641: & < & 1+\epsilon,
1642: \end{array}\]
1643: for any $\epsilon>0$ and $n$ large enough. But we also have
1644: $\frac{\alpha_n+\beta'_n}{\alpha_n}>1$ (since $\beta'_n>0$). Thus,
1645: the excess distortion due to the use of coded side information and
1646: possible decoding errors is negligible compared to the distortion
1647: induced by the quantizers themselves.
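
The last limit above can be checked without any knowledge of how $p_n$
behaves (a short sketch; the numerical values are ours, chosen only for
illustration). Since $p_n\leq 1$,
\[
\frac{e^{-\frac{n}{2\sigma_X^2}}\,n^3p_n^2}{\sigma_X^2}
\;\leq\;
\frac{n^3e^{-\frac{n}{2\sigma_X^2}}}{\sigma_X^2}
\;\longrightarrow\; 0
\;\;\mbox{ as } n\to\infty,
\]
because the exponential decay dominates any polynomial growth. For
instance, with $\sigma_X^2=1$ and $n=100$, the right-hand side equals
$10^6e^{-50}\approx 1.9\times 10^{-16}$.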
1648:
1649: To conclude this section, we would like to point out that there is an
1650: interesting tradeoff in this analysis, that works out favorably for us.
1651: Note that by increasing the number of nodes, we increase the number of
1652: places at which errors can occur, and therefore the probability that
1653: some node will make a decoding error is increased. However, as the
1654: number of nodes increases, the correlation between their measurements
1655: increases as well, and therefore the size of errors is reduced. And
1656: as the previous analysis shows, a linear increase in the number of nodes
1657: results in an exponential decrease in the size of each error -- hence,
1658: error propagation is {\em not} a problem in this setup.
1659:
1660:
1661: \section{Conclusions}
1662: \label{sec:conclusions}
1663:
1664: In this paper we presented our work on the design and performance
1665: analysis of codes for the problem of rate distortion with side
1666: information, and on the application of those codes in the context
1667: of a problem of data compression for sensor networks. First, we
1668: gave concrete constructions for the nested codes studied by
1669: Shamai/Verd\'u/Zamir in~\cite{shamai-verdu-zamir:systematic-lossy-coding,
1670: zamir-shamai:almost-there}, effectively answering an open question
1671: raised in~\cite{zamir-shamai:almost-there}. Then we studied the
1672: distortion performance of our codes, under the assumption of high
1673: correlation between the source and the side information and of
1674: high coding rates: there we showed that our codes attain the
1675: theoretically optimal distortion decay established by Wyner and
1676: Ziv~\cite{Wyner:78, WynerZ:76}. Finally we computed an upper bound
on the error made in estimating a Brownian field based on measurements
1678: collected by very ``cheap'' devices and delivered over a wireless
1679: network. In this case, even though the per-node throughput of the
1680: network vanishes as its size increases, and even if the nodes are
1681: not allowed to exchange any information at all, we showed how
1682: arbitrarily accurate estimation of the remote field is possible.
1683: To conclude the paper, we would like to comment on some issues that
1684: follow from our work.
1685:
1686: Concerning the problem of source estimation, in the presence of constraints
1687: on the available data imposed by the wireless network:
1688:
1689: \begin{itemize}
1690:
1691: \item The Brownian model for the source considered in this work is
1692: probably one of the worst cases we could have considered, in the sense
1693: that the regularity conditions satisfied by this process are minimal.
1694: For example, almost all of its sample paths are indeed continuous at
1695: almost all points (something we did use in our analysis); but at the
1696: same time, almost all sample paths are {\em not} differentiable at almost
1697: all points. Furthermore, the crucial assumption of high-resolution
1698: quantization that enabled us to apply our codes in the presence of
1699: {\em coded} side information cannot be justified for processes with
1700: increments of variance $O(n^{-1+\epsilon})$, for any
1701: $\epsilon>0$---compare this to the $O(n^{-1})$ variance of the increments
1702: of the model we considered.
1703:
1704: \item Interesting questions arise if we consider processes more regular
1705: than Brownian motion: consider for example the case when $X_u$ is a
1706: bandlimited signal (since $X_u$ is compactly supported, take its periodic
1707: extension). If the samples $X_{m/n}$ were available at the decoder
1708: without distortion, it follows from Shannon's sampling theorem that
1709: a network of finite size is enough to achieve a reconstruction with
1710: zero distortion. However, this would require network links of infinite
1711: capacity. For any finite value of $R$, there are tradeoffs to explore
1712: between the number of nodes in the network (i.e., the sampling rate) and
1713: the capacity of the network links (i.e., the accuracy in the representation
1714: of each sample), since economic constraints may favor one or the other
1715: option. This problem has received considerable attention in the signal
1716: processing and harmonic analysis literature~\cite{CvetkovicV:98, FuchsD:00,
1717: GoyalVT:98, KrimTMD:99, ThaoV:94}.
1718:
1719: \end{itemize}
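
To illustrate the first point with a short calculation, suppose (as a
normalization adopted here only for illustration, which may differ from
the one used in the body of the paper) that $X_u$ is a Brownian motion
with $X_0=0$ and $\mbox{Var}(X_u)=\sigma^2 u$. Then, for samples taken at
the points $u=m/n$ with $m\geq 1$,
\[
  \mbox{Var}\big(X_{(m+1)/n}-X_{m/n}\big) \;=\; \frac{\sigma^2}{n},
  \qquad
  \frac{\mbox{Cov}\big(X_{m/n},X_{(m+1)/n}\big)}
       {\sqrt{\mbox{Var}(X_{m/n})\,\mbox{Var}(X_{(m+1)/n})}}
  \;=\; \sqrt{\frac{m}{m+1}}.
\]
Under this assumption the increments have variance of order $n^{-1}$, and
the correlation between neighboring samples tends to 1 as $m$ grows (for
a fixed location $u=m/n$, as $n$ grows), which is the high-correlation
regime considered in this work.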
1720:
1721: Concerning coding/quantization. Whereas our asymptotic analysis was
1722: performed only for jointly Gaussian sources and MSE distortion, it would
1723: be interesting to learn something about the performance of the proposed
1724: quantizers for sources with non-Gaussian statistics and/or other
1725: distortion measures. An interesting result of Zamir states that,
although the gap between $R_X(d)$ and $R_{X|Y}(d)$ can be unbounded, the
1727: gap between the Wyner-Ziv rate/distortion function $R^*_X(d)$ and
1728: $R_{X|Y}(d)$ is bounded, and actually quite small in some cases: 0.5
1729: bits/sample for arbitrary source statistics and MSE distortion, and 0.22
1730: bits/sample for a binary source with Hamming distortion~\cite{Zamir:96}.
1731: In our opinion this is an interesting issue because, should a result
1732: similar to Zamir's hold for the performance of our codes, this would
1733: immediately allow us to conclude that arbitrarily accurate estimation
1734: is possible not just for jointly Gaussian sources, but for any source
1735: statistics. And even if we do not have a formal proof, it certainly
1736: seems plausible to us that this may be so.
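
To make the contrast between the two gaps concrete, consider as a
textbook illustration (not a new result) a jointly Gaussian scalar pair
$(X,Y)$ with variance $\sigma_X^2$, correlation coefficient $\rho$, and
MSE distortion. For $0<d\leq\sigma_X^2(1-\rho^2)$, and measuring rates in
bits per sample, the standard closed forms are
\[
  R_X(d) \;=\; \frac{1}{2}\log_2\frac{\sigma_X^2}{d},
  \qquad
  R^*_X(d) \;=\; R_{X|Y}(d) \;=\;
  \frac{1}{2}\log_2\frac{\sigma_X^2(1-\rho^2)}{d},
\]
so that
\[
  R_X(d)-R_{X|Y}(d) \;=\; \frac{1}{2}\log_2\frac{1}{1-\rho^2},
  \qquad
  R^*_X(d)-R_{X|Y}(d) \;=\; 0.
\]
The first gap grows without bound as $|\rho|\to 1$, whereas the second
vanishes in the Gaussian case, which is consistent with the 0.5
bits/sample bound quoted above for arbitrary source statistics.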
1737:
1738: Concerning the type of asymptotics developed in this work. Tools
1739: employed for theoretical performance analysis in source coding problems
1740: can be roughly classified into two main groups:
1741: \begin{itemize}
1742: \item Large-block asymptotics, as pioneered by
1743: Shannon~\cite{Shannon:59}.
1744: \item High-rate asymptotics, as pioneered by Zador, Gersho and
1745: others~\cite{gersho:quantization-asymptotics,zador:quantization-asymptotics}.
1746: \end{itemize}
The asymptotics we considered in this work are of neither type; instead,
we focused on {\em high-correlation} asymptotics. We believe this
type of analysis is particularly well suited to a new class of source
coding problems that originate in the context of sensor networks. This
1751: paper presents one such analysis for a simple toy problem involving a
1752: Brownian process. More of our work along these lines can be found
1753: in~\cite{LilisZS:04, ScaglioneS:03, ServettoR:06}.
1754:
1755: To conclude, we would like to comment on the nature of our contributions
1756: in this paper. Since the seminal work of Gupta and Kumar~\cite{GuptaK:00},
1757: most of the theory work on wireless networks appears to have been driven
1758: by a desire to find ways to understand, and if possible circumvent, the
1759: fact that the per-node throughput of the network vanishes as the number
of nodes grows. Implicit in previous work seems to have been an
1761: assumption that each node has a constant amount of information to transmit,
1762: irrespective of the network size: in this case, the fact that the throughput
1763: per node decreases as the network size increases does indeed pose serious
1764: problems. However, we feel the asymptotic analysis of~\cite{GuptaK:00} is
1765: better suited to ``networks of small sensors'' than to ``networks of laptop
1766: computers'': whereas there are only so many laptops that one may want to
1767: have in a single room, much higher densities of small sensing nodes are
conceivable. Yet very high densities of nodes are precisely what the asymptotic
analysis of~\cite{GuptaK:00} suggests to us. Now, in the context of sensor
1770: networks, the vanishing-throughput property of some wireless networks is
1771: much less of a problem. As an application for our codes with side
1772: information, we illustrated an instance of a class of wireless networking
1773: problems in which, as the size of the network grows, the amount of
1774: information generated by each transmitter decays at the same speed as the
1775: per-node throughput does. Hence, contrary to the conclusions suggested
1776: in~\cite{GuptaK:00}, designers of these networks should be {\em encouraged}
1777: to consider very large numbers of nodes, for doing so may result in
1778: improved quality of the signals reconstructed at the receivers, and it
1779: may also make more economic sense.
1780:
1781:
1782: \bigskip
1783: \noindent {\bf Acknowledgements.} The author would like to thank
1784: Toby Berger, for much needed encouragement and guidance provided
1785: at difficult times; Anna Scaglione, for discussions which resulted
1786: in a solution to a toy problem closely related to this
1787: one~\cite{ScaglioneS:03}; Martin Vetterli, for discussions on the
1788: work of Gupta and Kumar~\cite{GuptaK:00} that greatly contributed
1789: to his understanding of that work; and the anonymous referees, for
1790: their most insightful questions and constructive feedback, which
1791: led to a much improved manuscript. The author also benefited from
1792: several conversations with V.\ A.\ Vaishampayan and N.\ J.\ A.\ Sloane,
1793: on quantization theory and lattices, in the context of some previous
1794: work~\cite{VaishampayanSS:01}.
1795:
1796:
1797: \pagebreak
1798: \appendix
1799:
1800: \subsection{Bounding $\beta$}
1801: \label{app:trivial1}
1802:
1803: Recall from Section~\ref{sec:average-error},
1804: \[
1805: \beta\;\;\stackrel{\Delta}{=}\;\; \mbox{$\frac 1 n$}
1806: \sum_{\lambda\in s\kappa(\Lambda)\backslash\{{\bf 0}\}}
1807: \int_{{\bf x}\in V[\lambda:s\kappa(\Lambda)]}
1808: ||{\bf x}-\big(\lambda+\gamma_k({\bf x})\big)||^2 f_{X|Y}({\bf x}|\xi)
1809: \mbox{d}{\bf x},
1810: \]
1811: for any $\xi\in V[{\bf 0}:s\Lambda]$. Our goal next is to give an
1812: estimate for $\beta$.
1813:
Since each term of the sum is nonnegative, we have
a trivial lower bound: $\beta \geq 0$. As for an upper bound:
1816: \begin{eqnarray}
1817: \beta
1818: & \stackrel{(a)}{=} & \mbox{$\frac 1 n$}
1819: \sum_{\lambda\in s\kappa(\Lambda)\!\setminus\{0\}}
1820: \int_{V[\lambda:s\kappa(\Lambda)]}
1821: ||{\bf x}-\big(\lambda+\gamma_k({\bf x})\big)||^2
1822: \frac{1}{[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}} \;
1823: e^{-\frac{n}{2(1-\rho^2)}||\frac{1}{\sigma_X}{\bf x}
1824: -\frac{\rho}{\sigma_Y}{\xi}||^2}
1825: \mbox{d\bf x}
1826: \nonumber \\
& \stackrel{(b)}{\leq} & \mbox{$\frac 1 n$}
    \sum_{\lambda\in s\kappa(\Lambda)\!\setminus\{0\}} \Bigg[
    \int_{V[\lambda:s\kappa(\Lambda)]}
    ||{\bf x}||^2
    \frac{1}{[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}} \;
    e^{-\frac{n}{2(1-\rho^2)}||\frac{1}{\sigma_X}{\bf x}
                               -\frac{\rho}{\sigma_Y}{\xi}||^2}
    \mbox{d\bf x}
    \nonumber \\ & & \mbox{\hspace{2cm}} +
    \int_{V[\lambda:s\kappa(\Lambda)]}
    ||\lambda+\gamma_k({\bf x})||^2
    \frac{1}{[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}} \;
    e^{-\frac{n}{2(1-\rho^2)}||\frac{1}{\sigma_X}{\bf x}
                               -\frac{\rho}{\sigma_Y}{\xi}||^2}
    \mbox{d\bf x} \Bigg]
1842: \nonumber \\
1843: & \stackrel{(c)}{\approx} & \mbox{$\frac 1 n$}
1844: \sum_{\lambda\in s\kappa(\Lambda)\!\setminus\{0\}}
1845: 2||\lambda||^2
1846: \frac{1}{[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}} \;
1847: e^{-\frac{n}{2\sigma_X^2(1-\rho^2)}||\lambda||^2}
1848: \left(\int_{V[\lambda:s\kappa(\Lambda)]}\mbox{d\bf x}\right)
1849: \nonumber \\
1850: & = & \mbox{$\frac 1 n$}
1851: \frac{2\nu(s\kappa(\Lambda))}{[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}
1852: \sum_{\lambda\in s\kappa(\Lambda)\!\setminus\{0\}}
1853: ||\lambda||^2
1854: \;e^{-\frac{n}{2\sigma_X^2(1-\rho^2)}||\lambda||^2}
1855: \nonumber \\
1856: & = & \mbox{$\frac 1 n$}
1857: \frac{2\nu(s\kappa(\Lambda))}{[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}
1858: \sum_{\lambda\in \kappa(\Lambda)\!\setminus\{0\}}
1859: ||s\lambda||^2
1860: \;e^{-\frac{n}{2\sigma_X^2(1-\rho^2)}||s\lambda||^2}
1861: \nonumber \\
1862: & \stackrel{(d)}{=} & \mbox{$\frac 1 n$}
1863: \frac{2\nu(s\kappa(\Lambda))s^2}
1864: {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}
1865: \sum_{m=1}^\infty N_m(\kappa(\Lambda))
1866: \;e^{-\frac{s^2n}{2\sigma_X^2(1-\rho^2)}m}
1867: \label{eq:b2}
1868: \end{eqnarray}
1869: where:
1870: \begin{itemize}
1871: \item[(a)] is just a substitution for the conditional Gaussian distribution;
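\item[(b)] follows from the bound $||a-b||^2 \leq 2\big(||a||^2+||b||^2\big)$, where the constant factor 2 is not carried along, since it does not affect the asymptotic estimate;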
\item[(c)] holds for two reasons: under the assumption that
sublattice cells are small, we have $||{\bf x}||^2\approx||\lambda||^2$
(when ${\bf x}\in V[\lambda:s\kappa(\Lambda)]$); and under the further
assumption that $R$ is large, $||\gamma_k({\bf x})||^2$ is negligible compared
to $||\lambda||^2$ (when $\lambda\neq{\bf 0}$), and
$||\xi||^2\approx 0$ (when $\xi\in V[{\bf 0}:s\Lambda]$);
\item[(d)] follows from defining $N_m(\kappa(\Lambda))$ as the number of
points $\lambda\in \kappa(\Lambda)$ such that
1881: $||\lambda||^2=m$.\footnote{Note: wlog, we can take norms to be integers.
1882: If this is not the case, we can always form a (countable) list of all the
1883: norms that appear in $\kappa(\Lambda)$, and take $m$ to be an index in
1884: this list.}
1885: \end{itemize}
1886:
1887: To find a useful estimate for this sum, we need to bound
1888: $N_m(\kappa(\Lambda))$. One simple such bound is:
1889: \[ N_m(\kappa(\Lambda)) \;\; \leq \;\;
\frac{\mbox{surface of an $n$-dimensional sphere of radius $m$}}
1891: {\mbox{volume of an $(n\!-\!1)$-dimensional sphere of radius
1892: $\frac{N}{2}$}}.
1893: \]
1894: This bound follows from the fact that the highest density of lattice
1895: points on the surface of a sphere cannot be higher than if we assume
1896: a perfect tessellation of this $(n\!-\!1)$-dimensional surface into
1897: $(n\!-\!1)$-dimensional spheres whose radius is $\frac{1}{2}$ of the
1898: smallest separation between sublattice points. Using standard
1899: formulas~\cite{neil:splag}, we find that
1900: \[ N_m(\kappa(\Lambda))
1901: \;\; \leq \;\; \frac{c_n m^{n-1}}{d_n \left(\frac{N}{2}\right)^{n-1}}
1902: \;\; = \;\; e_n m^{n-1},
1903: \]
1904: for appropriate constants $c_n$ and $d_n$, and
1905: $e_n \triangleq \frac{c_n}{d_n(\frac N 2)^{n-1}}$. Therefore,
1906: \begin{eqnarray}
1907: \beta
1908: & \stackrel{(a)}{\leq} & \mbox{$\frac 1 n$}
1909: \frac{2\nu(s\kappa(\Lambda))e_ns^2}
1910: {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}
1911: \sum_{m=1}^\infty m^{n-1}
1912: \;e^{-\frac{s^2n}{2\sigma_X^2(1-\rho^2)}m}
1913: \nonumber \\
1914: & = & \mbox{$\frac 1 n$}
1915: \frac{2\nu(s\kappa(\Lambda))e_ns^2}
1916: {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}
1917: \sum_{m=1}^\infty
1918: e^{-\frac{s^2n}{2\sigma_X^2(1-\rho^2)}m+(n-1)\log(m)}
1919: \nonumber \\
1920: & \stackrel{(b)}{=} & \mbox{$\frac 1 n$}
1921: \frac{2\nu(s\kappa(\Lambda))e_ns^2}
1922: {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}
1923: \left(-1+\sum_{m=0}^\infty
1924: \left(e^{-\frac{s^2n}{2\sigma_X^2(1-\rho^2)}
1925: +\frac{(n-1)\log(m)}{m}}\right)^m\right)
1926: \nonumber \\
1927: & \stackrel{(c)}{\leq} & \mbox{$\frac 1 n$}
1928: \frac{2\nu(s\kappa(\Lambda))e_ns^2}
1929: {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}
1930: \left(-1+\sum_{m=0}^\infty
1931: \left(e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}\right)^m\right)
1932: \nonumber \\
1933: & \stackrel{(d)}{=} & \mbox{$\frac 1 n$}
1934: \frac{2\nu(s\kappa(\Lambda))e_ns^2}
1935: {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}
1936: \left(\frac{e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}
1937: {1-e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}\right)
1938: \label{eq:b3} \\
1939: & \stackrel{(e)}{<} & \epsilon
1940: \nonumber
1941: \end{eqnarray}
1942: where:
1943: \begin{itemize}
1944: \item[(a)] follows from replacing the estimate for $N_m(\kappa(\Lambda))$
1945: in eqn.~(\ref{eq:b2});
1946: \item[(b)] follows from simple manipulations, and defining
1947: $\frac{\log 0}{0} = 0$;
1948: \item[(c)] follows from observing that
1949: $\frac{\log m}{m} < \frac{s^2}{2\sigma_X^2(1-\rho^2)}$, for $\rho^2$ close
1950: enough to 1;
\item[(d)] follows from evaluation of the sum of a geometric series;
\item[(e)] holds for all values of $\rho$ such that
$\rho_0 < |\rho| < 1$, for a constant $\rho_0$ that depends on
$\epsilon$: from~(\ref{eq:choice-s}), we have
$s/\big(\sigma_X\sqrt{1-\rho^2}\big)\to\infty$ as $|\rho|\to 1$, and
thus convergence to zero is exponential.
1957: \end{itemize}
1958: Thus, $0\leq \beta < \epsilon$, for all $\epsilon > 0$ and all $|\rho|$
1959: close enough to 1. Hence, eqn.~(\ref{eq:b3}) defines an asymptotically
1960: good estimate of $\beta$.
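
As a supplementary check on steps (c) and (d) above, and on the constants
in the bound for $N_m(\kappa(\Lambda))$, we sketch the elementary
calculations involved. This is only a sketch under two assumptions: that
$\log$ denotes the natural logarithm (as required for
$m^{n-1}=e^{(n-1)\log(m)}$), and that $c_n$ and $d_n$ are the usual
surface and volume constants. The shorthand $a$ and $q$ is introduced
here for convenience only. Write
$a \triangleq \frac{s^2}{2\sigma_X^2(1-\rho^2)}$ and $q \triangleq e^{-a}$.
Since $x\mapsto\frac{\log x}{x}$ attains its maximum value $\frac 1 e$ at
$x=e$, we have $\frac{\log m}{m}\leq\frac{\log 3}{3}<\frac 1 e$ for every
integer $m\geq 1$, so the condition used in step (c) holds for all $m$ as
soon as $a\geq\frac 1 e$. Step (d) then reduces to
\[
  -1+\sum_{m=0}^\infty q^m \;\;=\;\; -1+\frac{1}{1-q}
  \;\;=\;\; \frac{q}{1-q}, \qquad 0<q<1.
\]
Likewise, for a radius $r$, the standard formulas read
\[
  \mbox{surface of a sphere in $\mathbb{R}^n$}
  \;=\; \frac{2\pi^{\frac n 2}}{\Gamma\big(\frac n 2\big)}\,r^{n-1},
  \qquad
  \mbox{volume of an $(n\!-\!1)$-dimensional ball}
  \;=\; \frac{\pi^{\frac{n-1}{2}}}{\Gamma\big(\frac{n+1}{2}\big)}\,r^{n-1},
\]
which would correspond to
$c_n = 2\pi^{\frac n 2}/\Gamma\big(\frac n 2\big)$ and
$d_n = \pi^{\frac{n-1}{2}}/\Gamma\big(\frac{n+1}{2}\big)$.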
1961:
1962:
1963: \pagebreak
1964: %\bibliographystyle{plain}
1965: %\bibliography{library}
1966: \begin{thebibliography}{10}
1967:
1968: \bibitem{AaronG:02}
1969: A.~Aaron and B.~Girod.
1970: \newblock {Compression with Side Information Using Turbo Codes}.
1971: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 2002.
1972:
1973: \bibitem{baake-moody:similarity-submodules-semigroups}
1974: M.~Baake and R.~V. Moody.
1975: \newblock {Similarity Submodules and Semigroups}.
1976: \newblock In J.~Patera, editor, {\em Quasicrystals and Discrete Geometry},
1977: pages 1--13. Comm. Fields Institute, American Mathematical Society,
1978: Providence, RI, 1998.
1979:
1980: \bibitem{BarronCW:02}
1981: R.~Barron, B.~Chen, and G.~W. Wornell.
1982: \newblock {The Duality Between Information Embedding and Source Coding with
1983: Side Information and Some Applications}.
1984: \newblock {\em IEEE Trans. Inform. Theory}, 49(5):1159--1180, 2003.
1985:
1986: \bibitem{BarrosS:06}
1987: J.~Barros and S.~D. Servetto.
1988: \newblock {Network Information Flow with Correlated Sources}.
1989: \newblock {\em IEEE Trans. Inform. Theory}, 52(1):155--170, 2006.
1990:
1991: \bibitem{Berger:78}
1992: T.~Berger.
1993: \newblock {\em The Information Theory Approach to Communications (G. Longo,
1994: ed.)}, chapter Multiterminal Source Coding.
1995: \newblock Springer-Verlag, 1978.
1996:
1997: \bibitem{BergerZV:96}
1998: T.~Berger, Z.~Zhang, and H.~Viswanathan.
1999: \newblock {The CEO Problem}.
2000: \newblock {\em IEEE Trans. Inform. Theory}, 42(3):887--902, 1996.
2001:
2002: \bibitem{bernstein-neil-pew:sublattices-of-a2}
2003: M.~Bernstein, N.~J.~A. Sloane, and P.~E. Wright.
2004: \newblock {On Sublattices of the Hexagonal Lattice}.
2005: \newblock {\em Discrete Math.}, 170:29--39, 1997.
2006:
2007: \bibitem{Bourbaki:58}
2008: N.~Bourbaki.
\newblock {\em {\'El\'ements de Math\'ematique}}.
2010: \newblock Hermann, 1958.
2011: \newblock Livre II (Alg\`ebre), Chapitre 1 (Structures Alg\'ebriques).
2012:
2013: \bibitem{ChiangB:04}
2014: M.~Chiang and S.~Boyd.
2015: \newblock {Geometric Programming Duals of Channel Capacity and Rate
2016: Distortion}.
2017: \newblock {\em IEEE Trans. Inform. Theory}, 50(2):245--258, 2004.
2018:
2019: \bibitem{conway-rains-neil:similar-sublattices}
2020: J.~H. Conway, E.~M. Rains, and N.~J.~A. Sloane.
2021: \newblock {On the Existence of Similar Sublattices}.
2022: \newblock {\em Canad. J. Math.}, 51:1300--1306, 1999.
2023:
2024: \bibitem{neil:splag}
2025: J.~H. Conway and N.~J.~A. Sloane.
2026: \newblock {\em {Sphere Packings, Lattices and Groups}}.
\newblock Springer-Verlag, 3rd edition, 1998.
2028:
2029: \bibitem{Costa:83}
2030: M.~H.~M. Costa.
2031: \newblock {Writing on Dirty Paper}.
2032: \newblock {\em IEEE Trans. Inform. Theory}, IT-29(3):439--441, 1983.
2033:
2034: \bibitem{Cover:75b}
2035: T.~M. Cover.
2036: \newblock {A Proof of the Data Compression Theorem of Slepian and Wolf for
2037: Ergodic Sources}.
2038: \newblock {\em IEEE Trans. Inform. Theory}, IT-21(2):226--228, 1975.
2039:
2040: \bibitem{CoverC:02}
2041: T.~M. Cover and M.~Chiang.
2042: \newblock {Duality Between Channel Capacity and Rate Distortion with Two-Sided
2043: State Information}.
2044: \newblock {\em IEEE Trans. Inform. Theory}, 48(6):1629--1638, 2002.
2045:
2046: \bibitem{CoverT:91}
2047: T.~M. Cover and J.~Thomas.
2048: \newblock {\em {Elements of Information Theory}}.
2049: \newblock John Wiley and Sons, Inc., 1991.
2050:
2051: \bibitem{CvetkovicV:98}
Z.~Cvetkovi\'{c} and M.~Vetterli.
2053: \newblock {Error-Rate Characteristics of Oversampled Analog-to-Digital
2054: Conversion}.
2055: \newblock {\em IEEE Trans. Inform. Theory}, 44(5):1961--1964, 1998.
2056:
2057: \bibitem{FuchsD:00}
2058: J.-J. Fuchs and B.~Delyon.
2059: \newblock {Minimal $L_1$-Norm Reconstruction Function for Oversampled Signals:
2060: Applications to Time-Delay Estimation}.
2061: \newblock {\em IEEE Trans. Inform. Theory}, 46(4):1666--1673, 2000.
2062:
2063: \bibitem{gersho:quantization-asymptotics}
2064: A.~Gersho.
2065: \newblock {Asymptotically Optimal Block Quantization}.
2066: \newblock {\em IEEE Trans. Inform. Theory}, IT-25(4):373--380, 1979.
2067:
2068: \bibitem{GoyalVT:98}
2069: V.~K. Goyal, M.~Vetterli, and N.~T. Thao.
2070: \newblock {Quantized Overcomplete Expansions in $\mathbb{R}^N$: Analysis,
2071: Synthesis, and Algorithms}.
2072: \newblock {\em IEEE Trans. Inform. Theory}, 44(1):16--31, 1998.
2073:
2074: \bibitem{GrayN:98}
2075: R.~M. Gray and D.~L. Neuhoff.
2076: \newblock {Quantization}.
2077: \newblock {\em IEEE Trans. Inform. Theory}, 44(6):2325--2383, 1998.
2078:
2079: \bibitem{GrossglauserT:02}
2080: M.~Grossglauser and D.~Tse.
\newblock {Mobility Increases the Capacity of Ad Hoc Wireless Networks}.
\newblock {\em IEEE/ACM Trans. Networking}, 10(4):477--486, 2002.
2083:
2084: \bibitem{GuptaK:00}
2085: P.~Gupta and P.~R. Kumar.
2086: \newblock {The Capacity of Wireless Networks}.
2087: \newblock {\em IEEE Trans. Inform. Theory}, 46(2):388--404, 2000.
2088:
2089: \bibitem{GuptaK:03}
2090: P.~Gupta and P.~R. Kumar.
2091: \newblock {Towards an Information Theory of Large Networks: An Achievable Rate
2092: Region}.
2093: \newblock {\em IEEE Trans. Inform. Theory}, 49(8):1877--1894, 2003.
2094:
2095: \bibitem{heegard-berger:uncertain-side-info}
2096: C.~Heegard and T.~Berger.
2097: \newblock {Rate Distortion when Side Information May Be Absent}.
2098: \newblock {\em IEEE Trans. Inform. Theory}, IT-31(6):727--734, 1985.
2099:
2100: \bibitem{KaspiB:82}
2101: A.~H. Kaspi and T.~Berger.
2102: \newblock {Rate-Distortion for Correlated Sources with Partially Separated
2103: Encoders}.
2104: \newblock {\em IEEE Trans. Inform. Theory}, IT-28(6):828--840, 1982.
2105:
2106: \bibitem{KrimTMD:99}
2107: H.~Krim, D.~Tucker, S.~Mallat, and D.~Donoho.
2108: \newblock {On Denoising and Best Signal Representation}.
2109: \newblock {\em IEEE Trans. Inform. Theory}, 45(7):2225--2238, 1999.
2110:
2111: \bibitem{KulkarniV:04}
2112: S.~R. Kulkarni and P.~Viswanath.
2113: \newblock {A Deterministic Approach to Throughput Scaling in Wireless
2114: Networks}.
2115: \newblock {\em IEEE Trans. Inform. Theory}, 50(6):1041--1049, 2004.
2116:
2117: \bibitem{LilisZS:04}
2118: G.~N. Lilis, M.~Zhao, and S.~D. Servetto.
2119: \newblock {Distributed Sensing and Actuation on Wave Fields}.
2120: \newblock In {\em Proc. 2nd Sensor and Actor Networks Protocols and
2121: Applications (SANPA)}, Boston, MA, 2004.
2122:
2123: \bibitem{LiuCLX:04}
2124: Z.~Liu, S.~Cheng, A.~Liveris, and Z.~Xiong.
2125: \newblock {Slepian-Wolf Coded Nested Quantization (SWC-NQ) for Wyner-Ziv
2126: Coding: Performance Analysis and Code Design}.
2127: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 2004.
2128:
2129: \bibitem{MerhavS:03}
2130: N.~Merhav and S.~Shamai.
2131: \newblock {On Joint Source-Channel Coding for the Wyner-Ziv Source and the
2132: Gel'fand-Pinsker Channel}.
2133: \newblock {\em IEEE Trans. Inform. Theory}, 49(11):2844--2855, 2003.
2134:
2135: \bibitem{MitranB:02}
2136: P.~Mitran and J.~Bajcsy.
2137: \newblock {Coding for the Wyner-Ziv Problem with Turbo-Like Codes}.
2138: \newblock In {\em Proc. IEEE Int. Symp. Inform. Theory}, Lausanne, Switzerland,
2139: 2002.
2140:
2141: \bibitem{PerakiS:03}
2142: C.~Peraki and S.~D. Servetto.
2143: \newblock {On the Maximum Stable Throughput Problem in Random Networks with
2144: Directional Antennas}.
2145: \newblock In {\em Proc. ACM MobiHoc}, Annapolis, MD, 2003.
2146:
2147: \bibitem{PerakiS:04}
2148: C.~Peraki and S.~D. Servetto.
2149: \newblock {Capacity, Stability and Flows in Large-Scale Random Networks}.
2150: \newblock In {\em Proc. IEEE Inform. Theory Workshop (ITW)}, San Antonio, TX,
2151: 2004.
2152:
2153: \bibitem{PradhanCR:03}
2154: S.~S. Pradhan, J.~Chou, and K.~Ramchandran.
2155: \newblock {Duality Between Source Coding and Channel Coding and its Extension
2156: to the Side Information Case}.
2157: \newblock {\em IEEE Trans. Inform. Theory}, 49(5):1181--1203, 2003.
2158:
2159: \bibitem{sandeep-kannan:discus}
2160: S.~S. Pradhan and K.~Ramchandran.
2161: \newblock {Distributed Source Coding Using Syndromes (DISCUS): Design and
2162: Construction}.
2163: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 1999.
2164:
2165: \bibitem{PradhanR:00}
2166: S.~S. Pradhan and K.~Ramchandran.
2167: \newblock {Distributed Source Coding: Symmetric Rates and Applications to
2168: Sensor Networks}.
2169: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 2000.
2170:
2171: \bibitem{RebolloMonederoZG:03}
2172: D.~Rebollo-Monedero, R.~Zhang, and B.~Girod.
2173: \newblock {Design of Optimal Quantizers for Distributed Source Coding}.
2174: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 2003.
2175:
2176: \bibitem{ScaglioneS:03}
2177: A.~Scaglione and S.~D. Servetto.
2178: \newblock {On the Interdependence of Routing and Data Compression in Multi-Hop
2179: Sensor Networks}.
2180: \newblock {\em Wireless Networks}, 11(1-2):149--160, 2005.
2181: \newblock Special issue with selected (and revised) papers from ACM MobiCom
2182: 2002.
2183:
2184: \bibitem{Servetto:02b}
2185: S.~D. Servetto.
2186: \newblock {Lattice Quantization with Side Information}.
2187: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 2000.
2188:
2189: \bibitem{Servetto:02c}
2190: S.~D. Servetto.
2191: \newblock {On the Feasibility of Large-Scale Wireless Sensor Networks}.
2192: \newblock In {\em Proc. 40th Allerton Conf. on Communication, Control and
2193: Computing}, Urbana, IL, 2002.
2194:
2195: \bibitem{ServettoR:06}
2196: S.~D. Servetto and J.~M. Rosenblatt.
2197: \newblock {The Multiterminal Source Coding Problem for Spatial Waves}.
2198: \newblock In {\em Proc. UCSD Wkshp. Inform. Theory App.}, San Diego, CA, 2006.
2199: \newblock {\em Invited paper}.
2200:
2201: \bibitem{shamai-verdu-zamir:systematic-lossy-coding}
2202: S.~Shamai, S.~Verd\'{u}, and R.~Zamir.
2203: \newblock {Systematic Lossy Source/Channel Coding}.
2204: \newblock {\em IEEE Trans. Inform. Theory}, 44(2):564--579, 1998.
2205:
2206: \bibitem{Shannon:59}
2207: C.~E. Shannon.
2208: \newblock {Coding Theorems for a Discrete Source with a Fidelity Criterion}.
2209: \newblock {\em IRE Nat. Conv. Rec.}, 4:142--163, 1959.
2210:
2211: \bibitem{SlepianW:73b}
2212: D.~Slepian and J.~K. Wolf.
2213: \newblock {Noiseless Coding of Correlated Information Sources}.
2214: \newblock {\em IEEE Trans. Inform. Theory}, IT-19(4):471--480, 1973.
2215:
2216: \bibitem{StarkW:94}
2217: H.~Stark and J.~Woods.
2218: \newblock {\em {Probability, Random Processes, and Estimation Theory for
2219: Engineers (2nd ed.)}}.
2220: \newblock Prentice Hall, 1994.
2221:
2222: \bibitem{SuEG:00}
2223: J.~K. Su, J.~J. Eggers, and B.~Girod.
2224: \newblock {Channel Coding and Rate Distortion with Side Information: Geometric
2225: Interpretation and Illustration of Duality}.
2226: \newblock Submitted to the IEEE Trans. Inform. Theory.
2227:
2228: \bibitem{ThaoV:94}
2229: N.~T. Thao and M.~Vetterli.
2230: \newblock {Reduction of the MSE in $R$-times Oversampled A/D Conversion from
2231: $O(1/R)$ to $O(1/R^2)$}.
2232: \newblock {\em IEEE Trans. Signal Processing}, 42(1):200--203, 1994.
2233:
2234: \bibitem{TianGZ:03}
2235: T.~Tian, J.~Garc\'{\i}a-Fr\'{\i}as, and W.~Zhong.
2236: \newblock {Compression of Correlated Sources using LDPC Codes}.
2237: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 2003.
2238:
2239: \bibitem{ToumpisG:02}
2240: S.~Toumpis and A.~J. Goldsmith.
\newblock {Capacity Regions for Wireless Ad Hoc Networks}.
2242: \newblock {\em IEEE Trans. Wireless Comm.}, 2(4):736--748, 2003.
2243:
2244: \bibitem{Tung:PhD}
2245: S.~Y. Tung.
2246: \newblock {\em {Multiterminal Source Coding}}.
2247: \newblock PhD thesis, Cornell University, 1978.
2248:
2249: \bibitem{VaishampayanSS:01}
2250: V.~A. Vaishampayan, N.~J.~A. Sloane, and S.~D. Servetto.
2251: \newblock {Multiple Description Vector Quantization with Lattice Codebooks:
2252: Design and Analysis}.
2253: \newblock {\em IEEE Trans. Inform. Theory}, 47(5):1718--1734, 2001.
2254:
2255: \bibitem{Verdu:02}
S.~Verd\'{u}.
2257: \newblock {Spectral Efficiency in the Wideband Regime}.
2258: \newblock {\em IEEE Trans. Inform. Theory}, 48(6):1319--1343, 2002.
2259:
2260: \bibitem{ViswanathanB:97}
2261: H.~Viswanathan and T.~Berger.
2262: \newblock {The Quadratic-Gaussian CEO Problem}.
2263: \newblock {\em IEEE Trans. Inform. Theory}, 43(5):1549--1559, 1997.
2264:
2265: \bibitem{Wyner:75}
2266: A.~D. Wyner.
2267: \newblock {On Source Coding with Side Information at the Decoder}.
2268: \newblock {\em IEEE Trans. Inform. Theory}, IT-21(3):294--300, 1975.
2269:
2270: \bibitem{Wyner:78}
2271: A.~D. Wyner.
2272: \newblock {The Rate-Distortion Function for Source Coding with Side Information
2273: at the Decoder-II: General Sources}.
2274: \newblock {\em Inform. Contr.}, 38:60--80, 1978.
2275:
2276: \bibitem{WynerZ:76}
2277: A.~D. Wyner and J.~Ziv.
2278: \newblock {The Rate-Distortion Function for Source Coding with Side Information
2279: at the Decoder}.
2280: \newblock {\em IEEE Trans. Inform. Theory}, IT-22(1):1--10, 1976.
2281:
2282: \bibitem{XieK:04}
2283: L.-L. Xie and P.~R. Kumar.
2284: \newblock {A Network Information Theory for Wireless Communication: Scaling
2285: Laws and Optimal Operation}.
2286: \newblock {\em IEEE Trans. Inform. Theory}, 50(5):748--767, 2004.
2287:
2288: \bibitem{zador:quantization-asymptotics}
2289: P.~Zador.
2290: \newblock {Asymptotic Quantization Error of Continuous Signals and the
2291: Quantization Dimension}.
2292: \newblock {\em IEEE Trans. Inform. Theory}, IT-28(2):139--149, 1982.
2293:
2294: \bibitem{Zamir:96}
2295: R.~Zamir.
2296: \newblock {The Rate Loss in the Wyner-Ziv Problem}.
2297: \newblock {\em IEEE Trans. Inform. Theory}, 42(6):2073--2084, 1996.
2298:
2299: \bibitem{zamir-shamai:almost-there}
2300: R.~Zamir and S.~Shamai.
2301: \newblock {Nested Linear/Lattice Codes for Wyner-Ziv Encoding}.
2302: \newblock In {\em Proc. IEEE Inform. Theory Workshop}, Killarney, Ireland,
2303: 1998.
2304:
2305: \bibitem{ZamirSE:02}
2306: R.~Zamir, S.~Shamai, and U.~Erez.
2307: \newblock {Nested Linear/Lattice Codes for Structured Multiterminal Binning}.
2308: \newblock {\em IEEE Trans. Inform. Theory}, 48(6):1250--1276, 2002.
2309:
2310: \bibitem{ZhaoE:01}
2311: Q.~Zhao and M.~Effros.
2312: \newblock {Optimal Code Design for Lossless and Near Lossless Source Coding in
2313: Multiple Access Networks}.
2314: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 2001.
2315:
2316: \end{thebibliography}
2317:
2318:
2319: \begin{biography}{Sergio D.\ Servetto}
2320: was born in Argentina, on January 18, 1968. He
2321: received a Licenciatura en Inform\'atica from Universidad Nacional
2322: de La Plata (UNLP, Argentina) in 1992, and the M.Sc. degree in
2323: Electrical Engineering and the Ph.D. degree in Computer Science from
the University of Illinois at Urbana-Champaign (UIUC), in 1996 and
1999, respectively. Between 1999 and 2001, he worked at the \'Ecole Polytechnique
2326: F\'ed\'erale de Lausanne (EPFL), Lausanne, Switzerland. Since Fall
2327: 2001, he has been an Assistant Professor in the School of Electrical
2328: and Computer Engineering at Cornell University, and a member of the
2329: fields of Applied Mathematics and Computer Science. He was the
2330: recipient of the 1998 Ray Ozzie Fellowship, given to ``outstanding
2331: graduate students in Computer Science,'' and of the 1999 David J.
2332: Kuck Outstanding Thesis Award, for the best doctoral dissertation
2333: of the year, both from the Dept.\ of Computer Science at UIUC. He
2334: was also the recipient of a 2003 NSF CAREER Award. His research
2335: interests are centered around information theoretic aspects of
2336: networked systems, with a current emphasis on problems that arise
2337: in the context of large-scale sensor networks.
2338: \end{biography}
2339:
2340:
2341: \end{document}
2342: