
1:

2: \documentclass[11pt,draftcls,onecolumn]{IEEEtran}

3: %\documentclass[10pt,twocolumn]{IEEEtran}

4: \usepackage{amsmath,amsfonts,amssymb,psfig,color}

5: \usepackage[breaklinks=true, colorlinks=true, linkcolor=black, urlcolor=dblue,

6:   citecolor=black, pdfpagemode=None, pdfstartview=FitH]{hyperref}

7: \definecolor{gray}{cmyk}{.2,0.2,.3,.1}

8: \definecolor{dred}{cmyk}{0,0.9,0.4,0.3}

9: \definecolor{dblue}{rgb}{0,0,0.5}

10: \definecolor{dgreen}{rgb}{0,0.3,0}

11: \definecolor{dgray}{rgb}{0.3,0.3,0}

12:

13: \setlength{\unitlength}{1mm}

14: \DeclareOldFontCommand{\rm}{\normalfont\rmfamily}{\mathrm}

15: \DeclareOldFontCommand{\sf}{\normalfont\sffamily}{\mathsf}

16: \DeclareOldFontCommand{\tt}{\normalfont\ttfamily}{\mathtt}

17: \DeclareOldFontCommand{\bf}{\normalfont\bfseries}{\mathbf}

18: \DeclareOldFontCommand{\it}{\normalfont\itshape}{\mathit}

19: \makeatletter\DeclareOldFontCommand{\sl}{\normalfont\slshape}{\@nomath\sl}

20: \DeclareOldFontCommand{\sc}{\normalfont\scshape}{\@nomath\sc}\makeatother

21: \newcommand{\rend}{\hfill$\square$}

22: \newcommand{\tend}{\hfill$\blacksquare$}

23: \newcommand{\epsfig}{\psfig}

24:

25: \title{\huge Lattice Quantization with Side Information: Codes,

26:   Asymptotics, and Applications in Sensor Networks \thanks{S.\ D.\

27:   Servetto is with the School of Electrical and Computer Engineering,

28:   Cornell University.  URL: \href{http://cn.ece.cornell.edu/}

29:   {{\tt http://cn.ece.cornell.edu/}}.  Work supported by the National

30:   Science Foundation, under awards CCR-0227676, CCR-0238271 (CAREER),

31:   CCR-0330059, and ANR-0325556.  This paper is based in part on work

32:   presented at the IEEE Data Compression Conference in

33:   2000~\cite{Servetto:02b}, and at the Allerton conference in

34:   2002~\cite{Servetto:02c}.}}

35: \author{Sergio D. Servetto}

36: \date{August 31, 2006.}

37:

38:

39: \begin{document}

40: \maketitle

41: \thispagestyle{empty}

42:

43: \begin{picture}(0,0)

44: \put(-8,75){\tt To appear in the IEEE Transactions on Information Theory.}

45: \end{picture}

46:

47: \vspace{-15mm}

48: \begin{abstract}

49: \noindent\it

50: We consider the problem of rate/distortion with side information

51: available only at the decoder.  For the case of jointly-Gaussian

52: source $X$ and side information $Y$, and mean-squared error distortion,

53: Wyner proved in 1976 that the rate/distortion function for this problem

54: is identical to the conditional rate/distortion function $R_{X|Y}$,

55: assuming the side information $Y$ is available at the encoder.  In

56: this paper we construct a structured class of asymptotically optimal

57: quantizers for this problem: under the assumption of high correlation

58: between source $X$ and side information $Y$, we show there exist

59: quantizers within our class whose performance comes arbitrarily close

60: to Wyner's bound.  As an application illustrating the relevance of

61: the high-correlation asymptotics, we also explore the use of these

62: quantizers in the context of a problem of data compression for sensor

63: networks, in a setup involving a large number of devices collecting

64: highly correlated measurements within a confined area.  An important

65: feature of our formulation is that, although the per-node throughput

66: of the network tends to zero as network size increases, so does the

67: amount of information generated by each transmitter.  This is a

68: situation likely to be encountered often in practice, which allows

69: us to cast under new---and more ``optimistic''---light some negative

70: results on the transport capacity of large-scale wireless networks.

71: \rm

72: \end{abstract}

73:

74: \vspace{15mm}

75: \noindent {\bf Index terms:} Rate/distortion, rate/distortion with side

76: information, quantization, vector quantization, lattice quantization,

77: lattice codes, hexagonal lattice, source coding, network information

78: theory, ad-hoc networks, sensor networks, multihop radio networks, wireless

79: networks, throughput, capacity.

80: \vspace{15mm}

81:

82: \setcounter{page}{0}

83: \pagebreak

84:

85:

86: \section{Introduction}

87:

88: \subsection{Large-Scale Wireless Sensor Networks}

89:

90: Wireless networks span a wide spectrum in terms of their functionality

91: (i.e., what they are used for), organization (i.e., how the different

92: components are assembled to form a complete working system), and the

93: technologies used to build them.  A long-term project currently under

94: way at Cornell deals with the design and prototyping of networks with

95: the following defining characteristics:

96: \begin{itemize}

97: \item The nodes operate under severe power constraints, support

98:   relatively large data transfer rates, and their number and density

99:   are large.

100: \item Once nodes are deployed, their mobility is very limited (if there

101:   is any at all).  Instead, the main source of uncontrolled dynamics in

102:   the network is the temporary failure of individual nodes: this will

103:   typically happen either due to exhaustion of the power source (and for

104:   the duration of the ``refueling'' period), or due to variations in the

105:   wireless medium.

106: \end{itemize}

107: In our setup of interest, the network is made up of devices whose

108: functionality is essentially that of a traditional Cisco router, with

109: the addition that they communicate over a wireless channel, their size

110: is many orders of magnitude smaller, and they may come equipped with

111: sensors that generate information locally as well.  Such networks

112: would prove extremely useful in a variety of very relevant scenarios,

113: such as disaster relief operations, military and surveillance applications,

114: cell-size reduction in cellular networks, environmental monitoring, etc.

115:

116: The development of a working network of this kind requires solutions

117: to a number of technical challenges (e.g., routing, flow control,

118: source and channel coding, power control, modem design, hardware, etc.).

119: Among all these, of particular interest in this paper is the problem of

120: source coding, in a scenario in which the data collected by a large number

121: of sensors is highly correlated.  When network nodes are coupled with

122: devices that sense a spatial process at different locations (e.g.,

123: concentration of ozone in the atmosphere, spread of a pathogen/pollutant

124: agent, temperature of a material, etc.), the measurements collected by

125: each node will not be independent in general, but instead will be

126: correlated, with a correlation structure determined by the corresponding

127: fluid dynamics equations.  Furthermore, the higher the density of nodes

128: in the network, the higher the correlation in the measurements will be.

129: Therefore, appropriate source coding capable of removing these dependencies

130: has the potential to significantly reduce the number of bits to be

131: transmitted (and therefore the consumption of scarce power resources),

132: when compared to a coding strategy that treats all measurements as being

133: independently generated.

134:

135: The use of standard and well understood source coding techniques is not

136: appropriate in the context of highly correlated sources: the use of

137: classical source codes to remove redundancy in the measurements collected

138: by different sensors requires that data be pooled at a common node prior

139: to transmission.  But this pooling action consumes valuable communication

140: resources itself, thus defeating the very same goal it tries to achieve

141: (communication efficiency).  Therefore, {\em distributed} source coding

142: techniques are required, i.e., codes capable of removing correlation

143: among measurements even in the presence of uncertainty about the exact

144: value measured at remote locations.

145: To this end, we define a simple abstraction that captures the essential

146: properties of this problem.  First, we consider the source of information

147: to be a random process $(X_s)_{s\in[0,1]}$, defined over a bounded set,

148: and with {\em continuous} sample paths---continuity is one simple way of

149: capturing into our model the notion of correlation among measurements

150: increasing with the number of nodes in a confined area.  This process is

151: observed by a finite number of sensors, and these observations are to be

152: communicated over a wireless network, as illustrated in

153: Fig.~\ref{fig:network-model}.

154:

155: \begin{figure}[ht]

156: \centerline{\psfig{file=network-model.eps,height=7.5cm,width=13.5cm}}

157: \caption{Network model.  There are three types of nodes: sources,

158:   relays, and destination nodes, with $n$ nodes of each type.  There is

159:   a source (a random process whose statistics are known by all sources),

160:   from which each of the source nodes collects a sample.  These samples are

161:   encoded by each source node without knowledge of the samples collected by

162:   other nodes, fed into the network, and each sent to a destination node.

163:   Finally, these destination nodes pool all their information at a central

164:   location, at which a decoder forms an estimate of the entire sample path,

165:   based on the data available from all sensors.  A key aspect of our problem

166:   formulation is that each source node has to decide what information to send

167:   to the central decoder {\em without} explicit knowledge of the information

168:   available at other nodes---only with knowledge of the statistics of that

169:   correlated data.}

170: \label{fig:network-model}

171: \end{figure}

172:

173: An important aspect of this problem setup is the fact that, as we

174: increase the number of source nodes, the amount of information contained

175: in each sample tends to zero---because the source is continuous, two nearby

176: samples are almost the same.  And we know from recent work on the transport

177: capacity of one class of wireless networks that, again for large networks,

178: the per-node throughput of networks in this class also tends to

179: zero~\cite{GuptaK:00}.  Therefore, {\em provided that the rate at which

180: information contained in each sample decays at least as fast as the

181: throughput of the network}, appropriate source coding techniques should

182: enable an accurate reconstruction of the source at the central decoder

183: of Fig.~\ref{fig:network-model}.  A study of the resulting source coding

184: problem in the context of these networks is the central subject of this

185: paper.

186:

187: \subsection{Rate Distortion with Side Information}

188:

189: \subsubsection{Problem Statement}

190:

191: Let $\{(X_n,Y_n)\}_{n=1}^\infty$ be a sequence of independent drawings

192: of a pair of dependent random variables $X$ and $Y$, and let $D(x,\hat{x})$

193: denote a single-letter distortion measure.  The problem of rate distortion

194: with side information at the decoder asks the question of how many bits

195: are required to encode the sequence $\{X_n\}$ under the constraint that

196: ${\tt E}D(x,\hat{x}) \leq d$, assuming the side information $\{Y_n\}$ is

197: available to the decoder but not to the

198: encoder~\cite[Ch.\ 14.9]{CoverT:91}.  This problem, first

199: considered by Wyner and Ziv in~\cite{WynerZ:76}, is a special

200: case of the general problem of coding correlated information sources

201: considered by Slepian and Wolf~\cite{SlepianW:73b},

202: in that one of the sources ($\{Y_n\}$) is available {\em uncoded} at the

203: decoder.  But it also generalizes the setup

204: of~\cite{SlepianW:73b}, in that coding is with respect

205: to a fidelity criterion rather than noiseless.  One important motivation

206: for us to consider this problem is the fact that good quantizers with side

207: information will be used in the proof of scalability of a large sensor

208: network.

209:

210: In~\cite{Wyner:78, WynerZ:76}, Wyner and

211: Ziv derive the rate/distortion function $R^*(d)$ for this problem, for

212: general sources and general (single letter) distortion metrics.  In this

213: work however we restrict our attention only to Gaussian sources, and mean

214: squared error (MSE) distortion.  This case is of special interest because,

215: under these conditions, it happens that $R^*(d) = R_{X|Y}(d)$, the

216: conditional rate/distortion function {\em assuming $Y$ is available at the

217: encoder}~\cite{Wyner:78, WynerZ:76}.  We

218: are intrigued by the fact that there exist coding methods which can perform

219: as well as if they had access to the side information at the encoder, even

220: though they don't.  One goal pursued in this paper then is the construction

221: of a family of quantizers that realizes these promised gains.

222:

223: \subsubsection{Lattice Quantization with Side Information}

224:

225: High-rate quantization theory provides much of the motivation to consider

226: lattices~\cite{GrayN:98}. Under an assumption of fine

227: quantization, the performance of an $n$-dimensional quantizer $\Lambda$

228: whose Voronoi cells are all congruent to a polytope $P$ is given by

229: \begin{equation}

230:    d = G(P) \cdot e^{-\frac{2}{n}({\cal H}(\Lambda,p_X)-h(p_X))},

231:    \label{eq:zador-gersho-bound}

232: \end{equation}

233: where $p_X$ is the joint source distribution in $n$ dimensions, ${\cal H}$

234: is the discrete entropy induced on the codebook $\Lambda$ by quantization of

235: the source $p_X$, $h$ is the differential entropy, and

236: \[ G(P) = \frac{\frac{1}{n}

237:             \int_P ||{\bf x}-{\bf \hat{x}}||^2 \mbox{ d\bf x}}

238:             { \left(\int_P \mbox{ d\bf x}\right)^{1+\frac{2}{n}} }

239: \]

240: is the normalized second moment of $P$ (using MSE as a distortion

241: measure)~\cite{gersho:quantization-asymptotics,zador:quantization-asymptotics}.
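
As an illustrative check of this definition (the two cells below are our own choice of example, and we read $\hat{\bf x}$ as the centroid of $P$), take $n=1$ and $P=[-\Delta/2,\Delta/2]$ with $\hat{x}=0$:
\[ G(P) = \frac{\int_{-\Delta/2}^{\Delta/2} x^2 \mbox{ d}x}{\Delta^{3}}
        = \frac{\Delta^3/12}{\Delta^3} = \frac{1}{12} \approx 0.0833, \]
independently of the cell size $\Delta$.  For $n=2$ and $P$ a regular hexagon (the Voronoi cell of the lattice $A_2$), the same computation gives the well-known value $G(P) = \frac{5}{36\sqrt{3}} \approx 0.0802$.  Assuming an i.i.d.\ source and equal entropy per sample, the hexagonal cell would therefore lower the distortion in~(\ref{eq:zador-gersho-bound}) by roughly $10\log_{10}(0.0833/0.0802) \approx 0.17$ dB relative to the scalar quantizer.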

242:

243: In the problem of rate distortion with side information, for Gaussian

244: sources and MSE distortion, the goal is to attain a distortion value

245: $d$ using $R_{X|Y}(d) < R_X(d)$ nats/sample.  In~(\ref{eq:zador-gersho-bound})

246: this means that, at a fixed rate $R_0$, we want to design quantizers

247: that achieve distortion

248: \[ d_0 \approx c_n \cdot e^{-\frac{2}{n}(nR_0-h(p_{X|Y}))} \]

249: when coding $X$, where $c_n \leq G(P)$ is the coefficient of quantization

250: in $n$ dimensions~\cite{gersho:quantization-asymptotics}.  But since we do not

251: have access to $Y$ (we only know $p_{X|Y}$), using classical quantizers we can

252: only attain a distortion value

253: \[ d \approx c_n \cdot e^{-\frac{2}{n}(nR_0-h(p_X))} > d_0 \]

254: (because {\small $h(X|Y) < h(X)$}), or equivalently, we need to use

255: some extra rate $\rho \approx R_X-R_{X|Y}$ such that

256: \[ d_0 \approx c_n \cdot  e^{-\frac{2}{n}(n(R_0+\rho)-h(p_X))}. \]

257: What makes this problem interesting is that we are only allowed to use $R_0$

258: nats/sample, not $R_0+\rho$.  One way to do that has been proposed by Shamai,

259: Verd\'{u} and Zamir in~\cite{shamai-verdu-zamir:systematic-lossy-coding,

260: zamir-shamai:almost-there}, which consists of: (a) taking a codebook with

261: roughly $e^{n(R_0+\rho)}$ codewords and distortion $d_0$, (b) partitioning

262: this codebook into $e^{nR_0}$ sets of size $e^{n\rho}$ each, (c) encoding

263: only enough information to identify each one of the $e^{nR_0}$ sets, and

264: (d) using the side information $Y$ to discriminate among the $e^{n\rho}$

265: codewords collapsed into each set.  One of our motivations for considering

266: lattice codes is the fact that their structure makes it particularly easy

267: to express these partitioning operations described

268: in~\cite{shamai-verdu-zamir:systematic-lossy-coding}.
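
To give a sense of the size of the extra rate $\rho$, consider the scalar jointly Gaussian case with MSE distortion, and write $r$ for the correlation coefficient between $X$ and $Y$ (the symbol $r$ is introduced here only to avoid a clash with the rate $\rho$ above).  Using the standard Gaussian rate/distortion functions, for $d \leq \sigma_X^2(1-r^2)$,
\[ R_X(d) = \frac{1}{2}\ln\frac{\sigma_X^2}{d}, \hspace{1cm}
   R_{X|Y}(d) = \frac{1}{2}\ln\frac{\sigma_X^2(1-r^2)}{d}, \hspace{1cm}
   \rho \approx R_X(d)-R_{X|Y}(d) = -\frac{1}{2}\ln(1-r^2), \]
which is also equal to $h(X)-h(X|Y)$ per sample.  As an example, $r=0.99$ gives $\rho \approx 1.96$ nats/sample (about $2.83$ bits/sample), so that each set in step (b) would collapse roughly $e^{n\rho} \approx 7.09^{\,n}$ codewords, all of which have to be told apart in step (d) from the side information alone.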

269:

270: We should also mention that another reason to consider lattices is our

271: wish to answer a challenge posed by Zamir and Shamai

272: in~\cite{zamir-shamai:almost-there}.  They present an encoding procedure

273: very closely related to the one we propose here, they argue the existence

274: of good lattices to use with that procedure, they study their distortion

275: performance, but they do not present any examples of concrete

276: constructions: their paper concludes by saying that (sic) ``{\em beyond

277: the question of existence, it would be nice to find specific constructions

278: of good nested codes}''.  Finding those specific constructions is one of

279: the original contributions in this work.

280:

281: \subsection{Related Work}

282:

283: Note: this section contains relevant related work as of Fall 2004.

284:

285: \subsubsection{Codes and Quantizers}

286:

287: The design of quantizers for the problem of rate distortion with side

288: information was considered recently by Shamai, Verd\'{u} and Zamir, where

289: they present design criteria for two different cases: Bernoulli sources

290: with Hamming metric, and jointly Gaussian sources with mean squared error

291: metric~\cite{shamai-verdu-zamir:systematic-lossy-coding,

292: zamir-shamai:almost-there}.  The key contribution presented in that work

293: is a constructive mechanism for, given a codebook, using the side

294: information at the decoder to reduce the amount of information that needs

295: to be encoded to identify codewords, while at the same time achieving

296: essentially the distortion of the given codebook.  That work provided

297: much inspiration for our work on the design of lattice codes presented in

298: this paper.

299:

300: Other work on code constructions includes the application of similar

301: codebook partitioning ideas in the context of trellis

302: codes~\cite{sandeep-kannan:discus}, a preliminary version of this

303: work~\cite{Servetto:02b}, generalizations to the case when the side

304: information may be coded as well~\cite{PradhanR:00,ZhaoE:01},

305: constructions based on LDPC codes~\cite{AaronG:02, MitranB:02,

306: TianGZ:03}, and other code constructions~\cite{LiuCLX:04,

307: RebolloMonederoZG:03}.

308:

309: \subsubsection{Information-Theoretic Performance Bounds}

310:

311: Whereas there has been some recent interest in the more

312: practical aspects of these problems, a significant amount of work on

313: related topics had already been done before in the context of multiuser

314: information theory.  Specifically on the problem of rate/distortion

315: with side information, besides the above mentioned work of Wyner and

316: Ziv~\cite{Wyner:78, WynerZ:76}, Kaspi

317: and Berger present a summary of known results and a number of new

318: results (as of 1982) in~\cite{KaspiB:82}, leaving only a couple of

319: special cases still open.  Heegard and Berger further generalize to the

320: case when there is uncertainty on whether the side information is available

321: at the decoder or not~\cite{heegard-berger:uncertain-side-info}.  For

322: an arbitrary pair of sources, Zamir gives bounds on how far away the

323: conditional rate/distortion function and the Wyner-Ziv rate/distortion

324: function can be from each other~\cite{Zamir:96}.

325:

326: Closely related to the problem of rate/distortion with side information

327: is that of {\it Noiseless Coding of Distributed Correlated Sources}.

328: Slepian and Wolf formulate this problem, and

329: determine the minimum number of bits per symbol required to encode two

330: correlated sequences $\{X_n\}$ and $\{Y_n\}$ separately, such that they

331: can be faithfully reproduced by a centralized decoder, under the assumption

332: that $\{(X_n,Y_n)\}_{n=1}^\infty$ is

333: i.i.d.~\cite{SlepianW:73b}.  Cover then gives a simpler

334: proof of the same result, which also generalizes to arbitrary ergodic

335: processes, countably infinite alphabets, and arbitrary number of correlated

336: sources~\cite{Cover:75b}.  Wyner presents an information theoretic

337: characterization of the minimum rates required for faithful reproduction in

338: a general network with side information~\cite{Wyner:75}.  Barros and

339: Servetto consider the Slepian-Wolf problem in an arbitrary network

340: setup with noisy point-to-point links~\cite{BarrosS:06}.

341:

342: A long-standing open problem in network information theory is the

343: characterization of the rate-distortion region for the {\em Multiterminal

344: Source Coding} problem, which is basically the Slepian-Wolf problem,

345: but in which a non-zero distortion is allowed in the encoding of

346: both sources.  The most significant contribution to this date can be

347: found in Tung's doctoral dissertation~\cite{Tung:PhD}.  Berger

348: developed some useful notes for

349: a tutorial lecture on this and related problems~\cite{Berger:78}.

350:

351: Yet another closely related problem is {\it the CEO Problem}.  In this

352: version, multiple sensors observe

353: {\em noisy} versions of the same signal, and must convey their observations

354: to a centralized decoder at a combined rate of not more than $R$ bits/sample.

355: This case generalizes the problem of encoding correlated observations,

356: to the case when the number of sensors is large, and to the case when the

357: signal to be communicated cannot be observed directly.  Berger et al.\

358: present a solution to this problem in the general case~\cite{BergerZV:96}.

359: Viswanathan and Berger specialize the results of~\cite{BergerZV:96} to the

360: Quadratic-Gaussian case~\cite{ViswanathanB:97}: an interesting conclusion

361: in this case is that the optimal rate of decay of the error is of the form

362: $R^{-1}$ when the sensors cannot communicate prior to transmission, as

363: opposed to an exponential decay otherwise.

364:

365: An interesting duality between the problem of rate/distortion with side

366: information discussed above, and the problem of channel coding with side

367: information at the transmitter~\cite{Costa:83}, has been pointed out by

368: several groups~\cite{BarronCW:02,PradhanCR:03,SuEG:00}.  Cover and Chiang

369: present a comprehensive coverage of duality issues in problems with side

370: information~\cite{CoverC:02}, and Chiang and Boyd fully develop an

371: optimization-theoretic approach to analyzing the duality of channel

372: capacity and rate distortion problems~\cite{ChiangB:04}.  Merhav and

373: Shamai established a separation theorem in this context~\cite{MerhavS:03}.

374: Therefore, it should be possible to derive good codes for one problem

375: from good codes available for the other.

376:

377: Zamir et al.\ present a very interesting tutorial on noisy multiterminal

378: networks, with many useful references~\cite{ZamirSE:02}.

379:

380: \subsubsection{Performance of Wireless Networks}

381:

382: A key result in the analysis of performance of wireless networks states

383: that when $n$ non-mobile nodes are optimally placed in a disk of unit area,

384: traffic patterns are optimally assigned, and the range of each transmission

385: is optimally chosen, the total throughput that the network can carry is

386: $O(\sqrt{n})$~\cite{GuptaK:00}.  As a result, the per-node throughput is

387: only $O(\frac{1}{\sqrt{n}})$, i.e., decays to zero as the number of nodes

388: in the network increases.  Other results along the same lines were presented

389: in~\cite{GuptaK:03, XieK:04}.

390:

391: The work of~\cite{GuptaK:00} sparked significant interest in this problem.

392: When nodes are allowed to move, assuming transmission delays proportional

393: to the mixing time of the network, the total network throughput is $O(n)$,

394: and therefore the network can carry a non-vanishing rate per

395: node~\cite{GrossglauserT:02}.  Using a linear programming formulation,

396: non-asymptotic versions of the results in~\cite{GuptaK:00} are given

397: in~\cite{ToumpisG:02}.  Using pure network flow methods, similar results

398: (and generalizations thereof) have been obtained

399: in~\cite{PerakiS:03, PerakiS:04}.  An alternative method for deriving

400: transport capacity was presented in~\cite{KulkarniV:04}.

401:

402: \subsection{Main Contributions and Organization of the Paper}

403:

404: This paper presents the following original contributions:

405: \begin{itemize}

406: \item The construction of lattice codes for the problem of rate/distortion

407:   with side information.  We propose a design procedure based on the choice

408:   of a lattice that is a good quantizer for the classical rate/distortion

409:   problem, and a geometrically-similar sublattice, inspired by the idea of

410:   partitioning codebooks to obtain good codes for this problem proposed

411:   in~\cite{shamai-verdu-zamir:systematic-lossy-coding,

412:   zamir-shamai:almost-there}, and by our previous work on the design of

413:   lattice quantizers for multiple description coding~\cite{VaishampayanSS:01}.

414: \item An asymptotic analysis (in rate and correlation) of the performance

415:   of these codes which, to the best of our knowledge, is the first such

416:   analysis for Wyner-Ziv codes.  Our analysis reveals some interesting

417:   shortcomings of these codes, and suggests a simple modification

418:   to the construction that ensures their optimality.  These optimal codes

419:   effectively answer a challenge of Zamir and

420:   Shamai~\cite{zamir-shamai:almost-there}.

421: \item The illustration that high correlation asymptotics in source coding

422:   are indeed a new asymptotic regime with very meaningful practical

423:   implications.  So far source coding has considered two asymptotic

424:   regimes: large block asymptotics~\cite{Shannon:59}, or high

425:   rate asymptotics~\cite{zador:quantization-asymptotics}.  High correlation

426:   asymptotics are a new asymptotic regime that, as we will see in

427:   Section~\ref{sec:sensor-networks}, proves quite relevant in the context

428:   of new problems derived from sensor networking applications.

429: \item The identification of a large class of applications for which the

430:   vanishing rates property of wireless networks does not pose a problem,

431:   by virtue of the fact that the amount of information that each node needs

432:   to transmit decays at the same rate as (or faster than) throughput does.

433: \end{itemize}

434:

435: The rest of this paper is organized as follows.  In

436: Section~\ref{sec:code-design} we present the structure of lattice

437: quantizers for the problem of rate/distortion with side information,

438: and in Section~\ref{sec:asymptotics} we evaluate the performance of

439: the codes obtained, under the assumption of high correlation between

440: the source $X$ and the side information $Y$.  In

441: Section~\ref{sec:sensor-networks} we illustrate how the proposed

442: codes can be used to deal effectively with the vanishing rates

443: property of an important class of large-scale sensor networks.

444: Final remarks are presented in Section~\ref{sec:conclusions}.

445:

446:

447: \section{Design of Lattice Codes with Side Information}

448: \label{sec:code-design}

449:

450: \subsection{Definitions}

451:

452: A source generates a sequence of zero-mean iid pairs

453: $(x_i,y_i)_{i=0}^\infty$, with jointly Gaussian distribution

454: \[

455:   f_{X,Y}(x,y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}}\;\;

456:                  e^{-\frac{1}{2(1-\rho^2)}

457:                     \left(\frac{x^2}{\sigma_X^2}

458:                           -\frac{2\rho x y}{\sigma_X\sigma_Y}

459:                           +\frac{y^2}{\sigma_Y^2}\right)},

460: \]

461: with covariance matrix ${\bf K} = {\tiny \left[\!\begin{array}{cc}

462: \sigma_X^2 & \rho\sigma_X\sigma_Y \\

463: \rho\sigma_X\sigma_Y & \sigma_Y^2 \\

464: \end{array}\!\right]}$, and correlation coefficient $\rho$.  The corresponding

465: conditional and marginal densities are denoted by $f_{Y|X}$, $f_{X|Y}$,

466: $f_X$, $f_Y$.  For a set of $n$ linearly independent column vectors

467: $\{{\bf v}_1,\ldots,{\bf v}_n\}$, a {\em lattice} $\Lambda\subset\mathbb{R}^n$

468: is defined by

469: \[

470:    \Lambda = \left\{ \sum_{i=1}^n c_i {\bf v}_i : c_1,\ldots,c_n\in\mathbb{Z}

471:              \right\},

472: \]

473: and its {\em generator matrix} is ${\bf V}=\left[{\bf v}_1|\ldots|{\bf v}_n\right]$.

474: The volume of a polytope $P\subset\mathbb{R}^n$ is denoted by $\nu(P)$.

475: For a constant $s\in\mathbb{R}$, the {\em scaled lattice} $s\Lambda$ is the

476: lattice generated by $s{\bf V}$, where ${\bf V}$ is the generator matrix of

477: a lattice $\Lambda$.  The {\em Voronoi cell} of a lattice point $\lambda$ in

478: the lattice $\Lambda$ is defined by

479: \[

480:    V[\lambda\!:\!\Lambda]

481:      = \{{\bf x}\in\mathbb{R}^n:||{\bf x}-\lambda||^2\leq||{\bf x}-\lambda'||^2,

482:          \;\forall\lambda'\in\Lambda \}.

483: \]

484: The {\em nearest neighbor map of a lattice} is a function

485: $Q_\Lambda : \mathbb{R}^n \rightarrow \Lambda$, defined by

486: \[

487:    Q_\Lambda({\bf x}) = \arg\min_{\lambda\in\Lambda} ||{\bf x}-\lambda||^2,

488: \]

489: where ties are broken arbitrarily (e.g., numbering all the $\lambda$'s,

490: and assigning ${\bf x}$ to the $\lambda$ with smallest index).  From the

491: definitions it follows trivially that $V[\lambda\!:\!\Lambda] =

492: \{{\bf x}\in\mathbb{R}^n:Q_\Lambda({\bf x})=\lambda\}$, except possibly

493: for a set of measure zero.  A lattice $\Lambda'$ is a {\em sublattice} of

494: a lattice $\Lambda$ if $\Lambda'\subseteq\Lambda$.  The {\em quotient

495: group}~\cite[Sec.\ 6.3]{Bourbaki:58} of a lattice modulo a sublattice is

496: denoted by $\Lambda/\Lambda'$, and its order by $|\Lambda/\Lambda'|$.
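
As a concrete instance of these definitions (chosen by us for illustration), the hexagonal lattice $A_2\subset\mathbb{R}^2$ can be generated by
\[
   {\bf V} = \left[\begin{array}{cc} 1 & \frac{1}{2} \\ 0 & \frac{\sqrt{3}}{2} \end{array}\right],
   \hspace{1cm}
   \nu\big(V[{\bf 0}\!:\!A_2]\big) = |\det {\bf V}| = \frac{\sqrt{3}}{2}.
\]
For a general lattice $\Lambda\subset\mathbb{R}^n$ and $s>0$, the scaled lattice $s\Lambda$ has Voronoi cells of volume $s^n\,\nu(V[{\bf 0}\!:\!\Lambda])$, and for a sublattice $\Lambda'$ of full rank the order of the quotient group is the ratio of cell volumes,
\[
   |\Lambda/\Lambda'| = \frac{\nu(V[{\bf 0}\!:\!\Lambda'])}{\nu(V[{\bf 0}\!:\!\Lambda])}.
\]
For instance, $\Lambda'=2A_2$ is a sublattice of $A_2$ with $|A_2/2A_2| = 2^2 = 4$.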

497:

498: A {\em Wyner-Ziv Lattice Vector Quantizer} (WZ-LVQ) is a triplet

499: ${\cal Q}=(\Lambda,\kappa,s)$, where:

500: \begin{itemize}

501: \item $\Lambda$ is a lattice.

502: \item $\kappa: \mathbb{R}^n \rightarrow \mathbb{R}^n$ is a linear operator

503:       such that $\kappa{\bf u}\cdot\kappa{\bf v} = c\;{\bf u}\cdot{\bf v}$

504:       (for some $c>0$), and such that $\kappa(\Lambda) \subseteq \Lambda$.

505:       Essentially, $\kappa$ defines a {\em similar} sublattice of

506:       $\Lambda$.\footnote{Two lattices $\Lambda_1$, $\Lambda_2$ (with

507:       generator matrices $M_1$, $M_2$) are said to be {\em similar} when

508:       there is a constant $c \neq 0$, an integer matrix $U$ with

509:       $|\mbox{det}(U)| = 1$, and a real matrix $B$ with $BB^{\top} = I$,

510:       such that $M_2 = c \; U M_1 B$~\cite{neil:splag}.

511:       Intuitively, similar lattices ``look the same'', up to a rotation,

512:       a reflection, and a change of scale.}

513: \item $s \in (0,\infty)$ is a scale factor that expands (or shrinks)

514:       $\Lambda$ and $\kappa(\Lambda)$.

515: \end{itemize}

516:

517: Intuitively, the lattice $\Lambda$ is the fine codebook, the one whose

518: codewords are to be partitioned into equivalence classes.  We choose to

519: implement this partition by considering a sublattice $\Lambda' \subseteq

520: \Lambda$, and then considering the resulting quotient group $\Lambda/\Lambda'$.

521: $s$ is a constant that multiplies the generator matrices of the lattices

522: considered, which is to be adjusted as a function of the correlation

523: between the source $X$ and the side information $Y$.  A justification for

524: the choice of a {\em similar} sublattice (as opposed to any other sublattice)

525: to implement the codebook partition, and a justification for the explicit

526: introduction of a scale factor $s$ as a parameter of the quantizer (as

527: opposed to having this lattice scale be determined by the coding rate, as

528: in classical quantization theory) will become apparent later, after we study

529: the rate-distortion performance of the proposed quantizers.

530:

531: The question of the existence of similar sublattices arose in connection

532: with another vector quantization problem~\cite{VaishampayanSS:01}, and also

533: in the study of symmetries of

534: quasicrystals~\cite{baake-moody:similarity-submodules-semigroups}.  The

535: subject is thoroughly covered in~\cite{conway-rains-neil:similar-sublattices},

536: where necessary (and in some cases sufficient) conditions are given for

537: their existence.

538:

539: \subsection{Encoding/Decoding Algorithms}

540:

541: Let $X^n$ denote a block of $n$ source samples, and $Y^n$ a block of $n$

542: side information samples.  The encoder and decoder are maps

543: $f_n:\mathbb{R}^n \rightarrow s\Lambda/s\kappa(\Lambda)$ and

544: $g_n:s\Lambda/s\kappa(\Lambda)\times\mathbb{R}^n \rightarrow s\Lambda$,

545: defined by

546: \begin{equation}

547:    f_n(X^n) = Q_{s\Lambda}\big(X^n-Q_{s\kappa(\Lambda)}(X^n)\big),

548:    \hspace{1cm}

549:    \hat{X}^n = g_n(f_n(X^n),Y^n) = Q_{s\kappa(\Lambda)+f_n(X^n)}(Y^n),

550:   \label{eq:q-alg}

551: \end{equation}

552: whose operation is illustrated in Fig.~\ref{fig:encoder-decoder}, with an

553: example based on the lattice $A_2$.
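
A one-dimensional numerical instance of~(\ref{eq:q-alg}) may help fix
ideas.  The values below are chosen by us purely for illustration:
$\Lambda=\mathbb{Z}$, $\kappa(\lambda)=4\lambda$ (so that there are $N=4$
cosets), $s=1$, $X=9.3$, and $Y=9.8$.  Then
\begin{eqnarray*}
  Q_{s\kappa(\Lambda)}(X) & = & 8, \\
  f_1(X) & = & Q_{s\Lambda}(9.3-8) \;=\; Q_{\mathbb{Z}}(1.3) \;=\; 1, \\
  \hat{X} & = & Q_{4\mathbb{Z}+1}(9.8) \;=\; 9,
\end{eqnarray*}
since the points of the coset $4\mathbb{Z}+1$ nearest to $Y$ are
$5$, $9$ and $13$.  The decoder thus recovers $Q_{\mathbb{Z}}(X)=9$, with
reconstruction error $|X-\hat{X}|=0.3$.  If instead $Y=11.4$, the nearest
point of $4\mathbb{Z}+1$ is $13$, and a decoding error occurs.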

554:

555: \begin{figure}[ht]

556: \centerline{\psfig{file=mechanics1.eps,height=10cm}

557:             \psfig{file=mechanics2.eps,height=10cm}}

558: \vspace{-2mm}

559: \caption{To illustrate the mechanics of the proposed quantizers

560:   (left: encoding, right: decoding).  A sublattice similar to the base

561:   lattice is chosen (circled points), matched to how far apart $X^n$ and $Y^n$

562:   are expected to be: in this example, with high probability $X^n$ and

563:   $Y^n$ are in neighboring Voronoi cells of the fine lattice.  Then

564:   $X^n$ is quantized first with the coarse lattice, then this coarse

565:   description is subtracted from $X^n$, and this difference is quantized

566:   again with the fine lattice; this quantized difference is then sent to

567:   the decoder, as a representative of the set of all codewords

568:   collapsed into the same equivalence class.

569:   At the decoder, the entire class is recreated (all the points with a

570:   thick arrow in the right picture), and among these, the point closest

571:   to the side information $Y^n$ is declared to be the original quantized

572:   value for $X^n$.  Note that there is always a chance that a particular

573:   realization of the noise process may take $Y^n$ too far away from

574:   $X^n$, in which case a decoding error occurs.}

575: \label{fig:encoder-decoder}

576: \end{figure}

577:

578: \subsection{Rate Computation}

579: \label{sec:rate-computation}

580:

581: There are only $N = |\Lambda/\kappa(\Lambda)|$ possible different quantizer

582: outputs, each one with probability $p_k$ ($k=1,\ldots,N$) given by

583: \[

584:    p_k \;=\; \sum_{\lambda\in s\Lambda}

585:          \int_{V[\kappa(\lambda)+\gamma_k:s\Lambda]}

586:          f_X({\bf x}) \mbox{ d\bf x},

587: \]

588: where $\gamma_k \in s\Lambda/s\kappa(\Lambda)$, and where we identify

589: the entire equivalence class with a canonical representative taken from

590: $\Lambda \; \cap \; V[{\bf 0}\!:\!\kappa(\Lambda)]$.  The rate of a

591: quantizer is then given by

592: \[ R \;=\; \mbox{$\frac 1 n$} \sum_{k=1}^N p_k \ln(1/p_k), \]

593: expressed in units of nats per source sample.

594:

595: Assume now, as is standard in fine-resolution quantization theory, that

596: Voronoi cells of the quantizers under consideration are small.  In this

597: case, this translates into a requirement for {\em sublattice} cells to

598: be small, for which we have that

599: \[

600:    \nu(s\kappa(\Lambda)) \; = \; s^n \nu(\kappa(\Lambda))

601:      \; = \; s^n \nu(N^{\frac 1 n}U\Lambda) \; = \; s^nN \nu(\Lambda)

602:      \; = \; s^nN,

603: \]

604: where the second equality follows from the fact that

605: $N = |\Lambda/\kappa(\Lambda)| = c^{\frac n 2}$, where $c$ is the norm

606: of the similarity defined by

607: $\kappa$~\cite{conway-rains-neil:similar-sublattices} (and therefore

608: the corresponding scaling is $\sqrt{c}$), $U$ is unitary, and the last

609: equality follows from assuming $\Lambda$ is normalized to have determinant

610: 1~\cite{neil:splag}.  Then, we see that requiring small

611: sublattice cells translates into requiring that $s^nN$ be a small

612: number.  Now, under this assumption, the rate expression above admits a

613: much simpler form:

614:

615: \[

616: 1 = \sum_{\lambda\in s\Lambda}

617:            \int_{V[\lambda:s\Lambda]} f_X({\bf x})\mbox{ d\bf x} \\

618:   = \sum_{\gamma_k\in s\Lambda/s\kappa(\Lambda)}

619:            \underbrace{\sum_{\lambda\in s\Lambda}

620:             \int_{V[\kappa(\lambda)+\gamma_k:s\Lambda]}

621:                f_X({\bf x})\mbox{ d\bf x}}_{p_k}.

622: \]

623: The integral of the source density in $p_k$ can be approximated by

624: \[

625:   f_X(\kappa(\lambda)+\gamma_k)\;\cdot\;

626:                 \nu(V[\kappa(\lambda)+\gamma_k:s\Lambda]).

627: \]

628: Under the assumption of small sublattice cells (standard in quantization

629: theory), and since the Gaussian source density is continuous, $f_X$ is

630: approximately constant within a cell of $\kappa(\Lambda)$,

631: and hence independent of the particular shift

632: $\gamma_k$.  Furthermore, since $\Lambda$ is a lattice, all its cells are

633: congruent, and therefore their volumes are all the same, thus making $\nu$

634: also independent of the particular shift $\gamma_k$.  Call $p$ this

635: (approximately) constant value for $p_k$.  Therefore, we have

636: \[

637:    1 \;\; \approx \sum_{\gamma_k\in\Lambda/\kappa(\Lambda)} p

638:      \;\; = \;\; |\Lambda/\kappa(\Lambda)| p,

639: \]

640: and hence,

641: \begin{eqnarray*}

642:   p_k \approx \frac{1}{|\Lambda/\kappa(\Lambda)|}

643:   & \hspace{1cm} \mbox{and} \hspace{1cm}

644:   & R \approx \mbox{$\frac 1 n$} \ln |\Lambda/\kappa(\Lambda)|,

645: \end{eqnarray*}

646: independent of $s$ and $f_X$, where the approximations are tight in

647: the limit as $s^nN\to 0$.
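
As a numerical illustration (the figures are ours), consider a
sublattice of index $N=21$ in dimension $n=2$, as in the example of
Fig.~\ref{fig:why-similar-sublattices}.  With the rate measured in nats,
as in the definition of $R$ above,
\[
   R \;\approx\; \mbox{$\frac 1 2$}\ln 21 \;\approx\; 1.52
   \mbox{ nats per sample}
   \;\;(\approx 2.20 \mbox{ bits per sample}),
\]
whatever the value of $s$.  The scale factor only sets the absolute cell
sizes: for $\Lambda$ normalized to unit determinant and, say, $s=0.1$,
the fine cells have volume $s^n = 0.01$ and the sublattice cells have
volume $s^nN = 0.21$.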

648:

649: Note that, unlike in classical quantization theory, here the

650: rate of a quantizer seems to be independent of the size of its Voronoi

651: cells.  In our context, a high-rate assumption translates into a large

652: value for $|\Lambda/\kappa(\Lambda)|$, i.e., cells in the fine lattice

653: are small {\em relative} to the size of cells in the coarse lattice.

654: But the parameter $s$, which determines the {\em absolute} size of

655: these cells, is not part of the rate expression.

656:

657: \subsection{Distortion Computation}

658: \label{sec:distortion-nonasymptotic}

659:

660: Let $\gamma_k({\bf x})$ denote the encoding of a source sequence

661: ${\bf x}$ ($k=1,\ldots,N$), and $\gamma({\bf x},{\bf y})$ denote the

662: reconstruction codeword for a source sequence ${\bf x}$ with side

663: information ${\bf y}$.  Then:

664: \begin{eqnarray}

665: \bar d

666:   & \stackrel{(a)}{=} &

667:         \mbox{$\frac 1 n$} \int_{{\bf x}\in\mathbb{R}^n} \int_{{\bf y}\in\mathbb{R}^n}

668:         ||{\bf x} - \gamma({\bf x},{\bf y})||^2 f_{XY}({\bf x},{\bf y})

669:         \mbox{d}{\bf x}\mbox{d}{\bf y} \nonumber \\

670:   & = & \mbox{$\frac 1 n$}

671:         \int_{{\bf x}\in\mathbb{R}^n}\left[\int_{{\bf y}\in\mathbb{R}^n}

672:         ||{\bf x} - \gamma({\bf x},{\bf y})||^2 f_{Y|X}({\bf y}|{\bf x})

673:         \mbox{d}{\bf y}\right] f_X({\bf x})\mbox{d}{\bf x} \nonumber \\

674:   & \stackrel{(b)}{=} & \mbox{$\frac 1 n$} \int_{{\bf x}\in\mathbb{R}^n}

675:         \left[\sum_{\lambda\in s\kappa(\Lambda)+\gamma_k({\bf x})}

676:               \int_{{\bf y}\in V[\lambda:s\kappa(\Lambda)+\gamma_k({\bf x})]}

677:         ||{\bf x} - \lambda||^2 f_{Y|X}({\bf y}|{\bf x})

678:         \mbox{d}{\bf y}\right] f_X({\bf x})\mbox{d}{\bf x} \nonumber \\

679:   & \stackrel{(c)}{=} & \mbox{$\frac 1 n$} \int_{{\bf x}\in\mathbb{R}^n}

680:         \left[\sum_{\lambda\in s\kappa(\Lambda)+\gamma_k({\bf x})}

681:         ||{\bf x} - \lambda||^2 {\tt Pr}\big({\bf y}\in

682:         V[\lambda:s\kappa(\Lambda)+\gamma_k({\bf x})]\big|{\bf x}\big)

683:         \right] f_X({\bf x})\mbox{d}{\bf x} \nonumber \\

684:   & \triangleq & \mbox{$\frac 1 n$} \int_{{\bf x}\in\mathbb{R}^n}

685:         \partial({\bf x}, s\kappa(\Lambda)+\gamma_k({\bf x}))

686:         f_X({\bf x})\mbox{d}{\bf x},

687:         \label{eq:distortion}

688: \end{eqnarray}

689: where:

690: \begin{itemize}

691: \item[\small (a)] is just the definition of average distortion;

692: \item[\small (b)] follows from, for each possible source sequence ${\bf x}$,

693: partitioning the set of all side information vectors ${\bf y}$ into

694: Voronoi cells of the sublattice $s\kappa(\Lambda)$, centered at location

695: $\gamma_k({\bf x})$;

696: \item[\small (c)] follows from the fact that $||{\bf x}-\lambda||^2$ can

697: be taken out of the integral, and what remains is an integral of the

698: conditional density function.

699: \end{itemize}

700: The last

701: definition is introduced to highlight the concept that in quantization

702: with side information, an entire sublattice plays the role of a single

703: codeword in classical quantization -- the average error in reconstructing

704: ${\bf x}$ is seen to take the form of an expectation of a suitably

705: defined distortion metric between source sequences and sublattices.

706: In Section~\ref{sec:asymptotics} we study the asymptotic behavior

707: of~(\ref{eq:distortion}), assuming high correlation between $X^n$

708: and $Y^n$.

709:

710: \subsection{On the Choice of Similar Sublattices}

711:

712: As we will see in Section~\ref{sec:asymptotics}, there are some

713: drawbacks to implementing quantizers for the Wyner-Ziv problem with

714: a fine quantizer that is essentially a truncated lattice, as follows

715: from the construction given here.  But there are also significant

716: benefits to doing so, in terms of the simplicity of this implementation.

717: So for the time being, if we are going to use two lattices, it is

718: of interest to consider what kind of lattices should be used.

719:

720: Suppose we fix the scale factor $s$, and the code rate $\frac{1}{n}\ln(N)$.

721: Among all the sublattices of $\Lambda$ of index $N$, are there differences

722: in terms of their distortion performance?  Which sublattices should we

723: choose?  It follows from~(\ref{eq:distortion}) that a sensible design

724: criterion is to choose the sublattice that maximizes

725: ${\tt Pr}\left\{{\bf y}\in V[{\bf 0}\!:\!s\kappa(\Lambda)]\mid

726: X\!={\bf x}\right\}$, for ${\bf x}\in V[{\bf 0}\!:\!s\Lambda]$.

727:

728: Since the vectors $X$ and $Y$ are jointly Gaussian and with iid

729: components, the vector $Y|X\!=\!{\bf x}$ is also Gaussian and with iid

730: components (although the $x_i$'s and the $y_i$'s are certainly not

731: independent of each other).  The pdf of $Y|X\!=\!{\bf x}$ is therefore

732: circularly symmetric, and it follows from classical arguments of coding

733: for Gaussian channels that, to maximize ${\tt Pr}({\bf y}\in V)$, we need

734: to maximize the norm of the shortest vectors in $\kappa(\Lambda)$.  This

735: situation is illustrated in Fig.~\ref{fig:why-similar-sublattices}, with

736: an example based on the lattice $A_2$.
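
A smaller instance of the same phenomenon (our own illustration, with
the norm of a vector taken to be its squared Euclidean length) is given
by the two sublattices of index $N=2$ in $\mathbb{Z}^2$
\[
  \Lambda'_1 = \{(a,b)\in\mathbb{Z}^2 : a+b \mbox{ even}\},
  \hspace{1cm}
  \Lambda'_2 = \mathbb{Z}\times 2\mathbb{Z}.
\]
The shortest nonzero vectors of $\Lambda'_1$ are $(\pm 1,\pm 1)$, of
norm 2, whereas $\Lambda'_2$ contains $(1,0)$, of norm 1.  Only
$\Lambda'_1$ is similar to $\mathbb{Z}^2$ (it is $\mathbb{Z}^2$ rotated
by $45^\circ$ and scaled by $\sqrt{2}$).  Both Voronoi cells have area 2,
but the largest disc centered at the origin that fits inside the cell has
radius $\sqrt{2}/2$ for $\Lambda'_1$, and only $1/2$ for $\Lambda'_2$.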

737:

738: \begin{figure}[ht]

739: \centerline{\psfig{file=n21-expand.ps,height=7.3cm}}

740: \vspace{-2mm}

741: \caption{Two different sublattices of $A_2$, of index $N=21$.  $A_2$

742:   is isomorphic to the ring of Eisenstein integers

743:   $\mathbb{Z}(\omega) = \{ a+b\omega\;:\;a,b\in\mathbb{Z};\;

744:   \omega=[-\frac{1}{2},\frac{\sqrt{3}}{2}]=e^{2\pi i/3}\}$, and {\em ideal}

745:   sublattices refer to ideals of this ring.  Observe that the ideal sublattice

746:   of the example has shortest vectors of norm 21, whereas in the non-ideal

747:   sublattice the shortest vectors are shorter.}

748: \label{fig:why-similar-sublattices}

749: \end{figure}

750:

751: The choice of $A_2$ for illustration purposes in

752: Fig.~\ref{fig:why-similar-sublattices} is not arbitrary.  In that

753: particular case, it is known that the minimal norm $\mu$ of any sublattice

754: of index $N$ in $A_2$ satisfies $\mu \leq N$, and that $\mu = N$ if and

755: only if the sublattice is ideal~\cite{bernstein-neil-pew:sublattices-of-a2}.

756: Furthermore, in two dimensions, $A_2$ is both the best classical quantizer

757: and the best channel coder~\cite{neil:splag}.  Therefore, it seems clear

758: that a hexagonal lattice and a similar sublattice are the best design

759: choices in two dimensions: this combination simultaneously minimizes

760: quantization error, and minimizes the probability of a source vector being

761: decoded to an incorrect codeword.

762:

763: Another interesting example is that of very high dimensional spaces.

764: In this case, we know that good quantizers have (nearly) spherical Voronoi

765: cells.  But at the same time, spherical cells maximize the minimum distance

766: between sublattice points, and therefore an optimal sublattice will have

767: to be similar to the base lattice.

768:

769: In between dimensions 2 and $\infty$, we are not able to make equally

770: strong statements---but we use the insights derived from these extreme

771: cases (a lattice with small second-order moment and a similar sublattice)

772: as guiding principles, to curb the complexity of the design task.

773:

774:

775: \section{Asymptotics of Quantizers with Side Information}

776: \label{sec:asymptotics}

777:

778: \subsection{Modeling Assumptions and Performance Metric}

779:

780: \subsubsection{Modeling Assumptions}

781:

782: Our goal in this section is to find a simpler expression for $\bar{d}$

783: than that presented in Section~\ref{sec:distortion-nonasymptotic}.  To

784: do so, we work under some extra assumptions:

785: {\it\begin{itemize}

786: \item The correlation coefficient $\rho$ between $X$ and $Y$ is close

787:   to 1.

788: \item The coding rate $R$ is large.

789: \item The scale factor $s$ is small.

790: \end{itemize}}

791: The effect of these assumptions is illustrated in Fig.~\ref{fig:assumptions}.

792:

793: \begin{figure}[ht]

794: \centerline{\psfig{file=assumptions.eps,height=6cm,width=12cm}}

795: \vspace{-2mm}

796: \caption{Illustration (in one dimension) of the meaning of the asymptotic

797:   regime considered in this work.  Working under an assumption of high

798:   correlations, we have that the conditional distribution of the source

799:   ${\bf x}$ given side information ${\bf y}$ is sharply concentrated around

800:   its mean value ${\bf y}$ -- as a result, we can make the probability of

801:   the source ${\bf x}$ being away from ${\bf y}$ by more than any positive

802:   constant be arbitrarily small (by choosing $\rho$ close enough to 1),

803:   and hence we can assume that sublattice cells, while being vanishingly

804:   small themselves ($s\approx 0$), can be considered large enough to

805:   contain most of the probability in $f_{X|Y}$.  Then, because we take

806:   $R$ large, we further partition each sublattice cell into a large

807:   number of much smaller fine lattice cells.}

808: \label{fig:assumptions}

809: \end{figure}

810:

811: The basic intuition on which our analysis in this section is built is

812: very simple: by considering high enough correlations, the encoder can

813: ``roughly center'' the conditional distribution $f_{X|Y}$ at the centroid

814: of a sublattice cell, a cell that is large enough to make the probability

815: that the source vector ${\bf x}$ is not in the considered cell negligible,

816: but at the same time small enough so that tools employed in classical

817: quantization problems can be applied.

818:

819: Recall that as mentioned earlier, unlike in classical high rate asymptotics

820: where $R\to\infty$ results in $\nu(\Lambda)\to 0$, in this case we must

821: explicitly force $s\to 0$, but not ``too fast'' -- in this case, too fast

822: would be at a rate equal to or faster than the rate at which $f_{X|Y}$ shrinks,

823: as $|\rho|\to 1$.  We will do so by setting the scale factor $s$ to be

824: $s = s(\rho)$, where $s:(-1,1)\to\mathbb{R}^+$ is such that

825: \begin{eqnarray}

826: \lim_{|\rho|\to 1} s(\rho) & = & 0, \nonumber \\

827: \lim_{|\rho|\to 1} \frac{s(\rho)}{\sigma_X\sqrt{1-\rho^2}} & = & \infty.

828:   \label{eq:choice-s}

829: \end{eqnarray}

830: For example,

831: $s = \sigma_X\sqrt{1-\rho^2}\log\left(1\big/\sigma_X\sqrt{1-\rho^2}\right)$

832: satisfies these conditions.
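
To verify this claim, write $t=\sigma_X\sqrt{1-\rho^2}$, so that $t\to 0$
as $|\rho|\to 1$.  Then
\[
  s \;=\; t\log(1/t) \;\to\; 0,
  \hspace{1cm}
  \frac{s}{\sigma_X\sqrt{1-\rho^2}} \;=\; \log(1/t) \;\to\; \infty,
\]
for any fixed base of the logarithm.  As a numerical example (our own,
taking $\log$ to be the natural logarithm), $\sigma_X=1$ and
$\rho=0.9999$ give $t\approx 0.0141$, $s\approx 0.060$, and
$s/t\approx 4.26$.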

833:

834: \subsubsection{Performance Metric}

835:

836: Some justification seems necessary at this point for considering

837: high-correlation asymptotics (i.e., $|\rho|\to 1$), since under this

838: assumption, the side information available uncoded at the decoder

839: already contains almost all of the information about the source.  And

840: indeed, once we are done with our calculations, we will confirm the

841: (hardly surprising) fact that for any fixed target distortion $D$,

842: using these proposed quantizers and as $|\rho|\to 1$, the rate required

843: to achieve $D$ vanishes.  This is a condition that must be satisfied

844: by {\em any} decent quantizer.  However, that is not why we are

845: interested in this analysis: instead, our goal is to evaluate

846: \begin{equation}

847:   \lim_{|\rho|\to 1} \frac {\bar{d}}{D(R)},

848:   \label{eq:figure-of-merit}

849: \end{equation}

850: where $\bar{d}$ is the distortion of our quantizers, and $D(R)$ is

851: the Wyner-Ziv rate/distortion function; that is, we wish to compare

852: the {\em slope} of the distortion function for our proposed quantizers

853: at asymptotically high correlations, with that of the Wyner-Ziv

854: bound.  This {\em is} a meaningful performance metric, as it determines

855: the rate of decay of distortion relative to the fastest possible

856: decay.\footnote{This type of analysis is similar in spirit to (and

857: inspired by) that of Verd\'u for modulation schemes operating at

858: asymptotically low SNRs~\cite{Verdu:02}.}

859:

860: \subsection{Asymptotics of the Average Error With Geometrically Similar

861:   Coarse and Fine Lattices}

862: \label{sec:average-error}

863:

864: \subsubsection{A Simpler Expression}

865:

866: To obtain a simpler expression for $\bar d$ than that of

867: eq.~(\ref{eq:distortion}), we start by expanding it in a different way:

868: \begin{eqnarray}

869: \bar d

870:   & \stackrel{(a)}{=} &

871:         \mbox{$\frac 1 n$} \int_{{\bf x}\in\mathbb{R}^n} \int_{{\bf y}\in\mathbb{R}^n}

872:         ||{\bf x} - \gamma({\bf x},{\bf y})||^2 f_{XY}({\bf x},{\bf y})

873:         \mbox{d}{\bf x}\mbox{d}{\bf y}

874:         \nonumber \\

875:   & = & \mbox{$\frac 1 n$}

876:         \int_{{\bf y}\in\mathbb{R}^n}\left[\int_{{\bf x}\in\mathbb{R}^n}

877:         ||{\bf x} - \gamma({\bf x},{\bf y})||^2 f_{X|Y}({\bf x}|{\bf y})

878:         \mbox{d}{\bf x}\right] f_Y({\bf y})\mbox{d}{\bf y}

879:         \nonumber  \\

880:   & \stackrel{(b)}{=} & \mbox{$\frac 1 n$}

881:         \sum_{\lambda\in s\Lambda} \int_{{\bf y}\in V[\lambda:s\Lambda]}

882:         \left[\int_{{\bf x}\in\mathbb{R}^n}

883:         ||{\bf x} - \gamma({\bf x},{\bf y})||^2 f_{X|Y}({\bf x}|{\bf y})

884:         \mbox{d}{\bf x}\right] f_Y({\bf y})\mbox{d}{\bf y}

885:         \nonumber \\

886:   & \stackrel{(c)}{\approx} &

887:         \mbox{$\frac 1 n$} \sum_{\lambda\in s\Lambda}

888:         \left[ \int_{{\bf x}\in\mathbb{R}^n}

889:         ||{\bf x}-\gamma({\bf x},\lambda)||^2 f_{X|Y}({\bf x}|\lambda)

890:         \mbox{d}{\bf x} \right] f_Y(\lambda)\nu(s\Lambda)

891:         \nonumber \\

892:   & \stackrel{(d)}{=} &

893:         \mbox{$\frac 1 n$}

894:         \left[ \int_{{\bf x}\in\mathbb{R}^n}

895:         ||{\bf x}-\gamma({\bf x},\mathbf{0})||^2 f_{X|Y}({\bf x}|\mathbf{0})

896:         \mbox{d}{\bf x} \right]

897:         \left(\sum_{\lambda\in s\Lambda} f_Y(\lambda)\nu(s\Lambda)\right)

898:         \nonumber \\

899:   & \stackrel{(e)}{\approx} & \underbrace{\mbox{$\frac 1 n$}

900:         \int_{{\bf x}\in V[{\bf 0}:s\kappa(\Lambda)]}

901:         ||{\bf x}-\gamma_k({\bf x})||^2 f_{X|Y}({\bf x}|\mathbf{0})

902:         \mbox{d}{\bf x}}_{\alpha}

903:         \\ & & \mbox{\hspace{2mm}} + \underbrace{\mbox{$\frac 1 n$}

904:         \sum_{\lambda\in s\kappa(\Lambda)\backslash\{{\bf 0}\}}

905:         \int_{{\bf x}\in V[\lambda:s\kappa(\Lambda)]}

906:         ||{\bf x}-\big(\lambda+\gamma_k({\bf x})\big)||^2

907:         f_{X|Y}({\bf x}|\mathbf{0}) \mbox{d}{\bf x}}_{\beta}

908:         \label{eq:def-alpha-beta}

909: \end{eqnarray}

910: where:

911: \begin{itemize}

912: \item[\small $(a)$] is again just the definition of average distortion;

913: \item[\small $(b)$] follows from partitioning the set of all side information

914:   sequences ${\bf y}$ into Voronoi cells of the fine lattice $s\Lambda$;

915: \item[\small $(c)$] follows from the assumption that $\nu(s\Lambda)$ is

916:   small, and from the continuity of $\int_{{\bf x}\in\mathbb{R}^n}

917:   ||{\bf x} - \gamma({\bf x},{\bf y})||^2 f_{X|Y}({\bf x}|{\bf y})

918:   \mbox{d}{\bf x}$ as a function of $\mathbf{y}$;

919: \item[\small $(d)$] follows from the symmetry of $f_{X|Y}$ as a function of

920:   $\mathbf{y}$;

921: \item[\small $(e)$] follows from the fact that $f_Y$ integrates to 1, and

922:   from splitting the domain of integration of ${\bf x}$ into Voronoi cells of

923:   the sublattice $s\kappa(\Lambda)$.

924: \end{itemize}

925: Our next goal is to find simpler expressions for $\alpha$ and $\beta$.

926:

927: To simplify $\alpha$, we observe that this term denotes the MSE incurred

928: when quantizing samples of a distribution $f_{X|Y}({\bf x}|\xi)$

929: with an $N$-level fixed-rate {\em uniform} quantizer, if we assume that

930: the overload cells of the quantizer occur with negligible probability --

931: and this assumption is justified because, for $|\rho|\approx 1$, sublattice

932: cells are large relative to the spread of $f_{X|Y}$ due to our choice of

933: $s$ in~(\ref{eq:choice-s}).  Now, again under the assumption that $R$ is

934: large, the random shift in the mean of $f_{X|Y}$ given by its dependence

935: on the unknown parameter $\xi$ is negligible compared to the size of a

936: sublattice cell.  Thus, by choosing a value of $|\rho|$ close enough to

937: 1, the probability of ${\bf x}\not\in V[{\bf 0}:s\kappa(\Lambda)]$ can

938: be made arbitrarily small.  This is illustrated in

939: Fig.~\ref{fig:simplify-alpha}.

940:

941: \begin{figure}[ht]

942: \centerline{\psfig{file=simplify-alpha.eps,height=5cm,width=12cm}}

943: \caption{Illustration (in one dimension) of the concept that, irrespective

944:   of a small random shift in the mean introduced by

945:   the unknown side information, a fine quantization of the sublattice cell

946:   (thin lines in between thick lines) results in a fine quantization of

947:   the unknown distribution.  The true distribution could be any of those

948:   illustrated for various unknown vectors $\xi_k$.}

949: \label{fig:simplify-alpha}

950: \end{figure}

951:

952: The requirement that the fine and coarse quantizers be geometrically

953: similar lattices results in cells of the coarse lattice being partitioned

954: {\em uniformly} by the fine lattice; this is the optimal quantizer for

955: a source that is uniformly distributed over a sublattice cell, not

956: distributed according to $f_{X|Y}$.  Therefore, defining a new pdf

957: $p(\mathbf{x})=\frac 1{s^nN}$ if $\mathbf{x}$ is in the corresponding

958: sublattice cell, and zero otherwise, we have that

959: \[ \lim_{N\to\infty}N^{\frac 2 n}\alpha = G(\Lambda)s^2;

960: \]

961: this follows from evaluating eqn.~(81) in~\cite[Ch.\ 2]{neil:splag}

962: for the uniform distribution $p$ defined above, specialized to the

963: lattice $\Lambda$.  Therefore, for $N$ large, we can (equivalently)

964: say that

965: \[ \alpha \;\;\approx\;\; G(\Lambda)s^2e^{-2R}.

966: \]

967:

968: Since $\beta\geq 0$, we have that $\bar d\geq\alpha$, and so

969: \begin{eqnarray}

970: \bar d

971:   & \geq & G(\Lambda)\,s^2\,e^{-2R}.

972:   \label{eq:distortion-aroundzero-similarcoarsefine}

973: \end{eqnarray}

974:

975: \subsubsection{Comparison Against Wyner's Rate/Distortion Bound}

976:

977: Our next step is to evaluate the figure of merit defined

978: by~(\ref{eq:figure-of-merit}).  To this end, consider Wyner's

979: rate/distortion bound~\cite{Wyner:78}:\footnote{In

980: Wyner's paper, the bound is given in the form $R(d)=\frac{1}{2}\log\left(

981: \frac{\sigma_X^2\sigma_U^2}{(\sigma_X^2+\sigma_U^2)d}\right)$ (for

982: the low distortion region), where $\sigma_X^2$ is the variance of $X$,

983: and $Y=X+U$, where $U$ has variance $\sigma_U^2$.  A straightforward

984: manipulation puts Wyner's expression in the form shown here.}

985: \begin{equation}

986:   D(R) = \sigma_X^2(1-\rho^2)e^{-2R}.

987:   \label{eq:wynerziv-rdfunction}

988: \end{equation}

989: Plugging eqns.~\eqref{eq:distortion-aroundzero-similarcoarsefine}

990: and~\eqref{eq:wynerziv-rdfunction} into~(\ref{eq:figure-of-merit}), we get

991: \begin{eqnarray*}

992: \lim_{|\rho|\to 1} \frac {\bar{d}}{D(R)}

993:   & \geq & \lim_{|\rho|\to 1}

994:         \frac{G(\Lambda) s^2 e^{-2R}}

995:              {\sigma_X^2(1-\rho^2)e^{-2R}} \\

996:   & = & G(\Lambda)

997:         \lim_{|\rho|\to 1}\frac{s^2}{\sigma_X^2(1-\rho^2)} \\

998:   & = & \infty;

999: \end{eqnarray*}

1000: the divergence of this limit follows from the choice of lattice scaling

1001: specified in eqn.~\eqref{eq:choice-s}.  Therefore, when the fine

1002: quantizer is constrained to be a lattice that is geometrically similar

1003: to the coarse lattice, the performance of the resulting Wyner-Ziv

1004: quantizer is very poor in the asymptotic regime of high correlations.

1005: This observation motivates us to introduce a small modification in

1006: our code construction.
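
To see what this divergence looks like for a specific scaling, take the
example $s=t\log(1/t)$ given after~(\ref{eq:choice-s}), where
$t=\sigma_X\sqrt{1-\rho^2}$.  The lower bound above then reads
\[
  G(\Lambda)\,\frac{s^2}{\sigma_X^2(1-\rho^2)}
  \;\;=\;\; G(\Lambda)\,\log^2(1/t),
\]
which grows without bound as $t\to 0$, although only as the square of a
logarithm.  As a rough numerical check (our own, assuming natural
logarithms and the scalar value $G(\mathbb{Z})=\frac 1{12}$), $t\approx
0.0141$ gives $\log^2(1/t)\approx 18.1$, and hence a lower bound of
about $1.51$.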

1007:

1008: \subsection{Asymptotics of the Average Error with a Coarse Lattice and

1009:   an Optimal Fixed-Rate Fine Quantizer}

1010:

1011: \subsubsection{A Simpler Expression}

1012:

1013: The suboptimality of the code construction based on two geometrically

1014: similar lattices stems from the fact that sublattice cells are partitioned

1015: uniformly, but the source distribution $f_{X|Y}$ being quantized is not

1016: uniform.  Therefore, we enlarge the class of codes considered:

1017: \begin{itemize}

1018: \item we keep the requirement that the coarse quantizer be a lattice;

1019: \item we keep the same quantization algorithm of eqn.~\eqref{eq:q-alg};

1020: \item but we now allow for the fine quantizer to be any arbitrary

1021:   fixed-rate classical vector quantizer.

1022: \end{itemize}

1023: By removing the restriction that the fine quantizer also be a lattice,

1024: we can now choose one still with $N$ reconstruction points, but whose

1025: output point density, instead of being uniform, is matched to the

1026: distribution $f_{X|Y}(\mathbf{x}|\mathbf{0})$.  As a result, we conclude

1027: that there exists a quantizer such that

1028: \[ \lim_{N\to\infty} N^{\frac 2 n}\alpha\;\;=\;\;G_n||f_{X|Y}||_{\frac{n}{n+2}},

1029: \]

1030: where $||f||_{\frac{n}{n+2}} \triangleq \big[ \int f^{\frac{n}{n+2}}(x)

1031: \mbox{d}x \big]^{\frac{n+2}{n}}$, and where $G_n$ depends only on $n$ (but

1032: not on the source distribution), and is bounded in terms of the standard

1033: $\Gamma$ function by

1034: \begin{equation}

1035:    \frac 1{(n+2)\pi}\;\Gamma\Big(\frac n 2+1\Big)^{\frac 2 n}

1036:    \;\;\leq\;\;

1037:    G_n

1038:    \;\;\leq\;\;

1039:    \frac 1{n\pi}\;\Gamma\Big(\frac n 2+1\Big)^{\frac 2 n}

1040:                 \;\Gamma\Big(1+\frac 2 n\Big),

1041:    \label{eq:bounds-Gn}

1042: \end{equation}

1043: as follows from eqns.~(81) and~(82) of~\cite[Ch.\ 2]{neil:splag}.

1044: Hence, for $|\rho|\approx 1$ and for $N$ large, we can approximate

1045: $\alpha$ by

1046: \[ \alpha\;\;\approx\;\;G_n\,||f_{X|Y}||_{\frac{n}{n+2}}\,e^{-2R}.

1047: \]

1048:

1049: To simplify $\beta$, the following estimate is obtained in

1050: Appendix~\ref{app:trivial1}:

1051: \begin{equation}

1052:   \beta \;\; \approx \;\; \mbox{$\frac 1 n$}

1053:         \frac{2\nu(s\kappa(\Lambda))e_ns^2}

1054:              {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}

1055:         \left(\frac{e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}

1056:                    {1-e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}\right).

1057:   \label{eq:b}

1058: \end{equation}

1059:

1060: Combining these two estimates, we arrive at a final expression for

1061: $\bar d$:

1062: \begin{eqnarray}

1063: \bar d

1064:   & \approx & G_n\,||f_{X|Y}||_{\frac{n}{n+2}}\,e^{-2R}

1065:         + \mbox{$\frac 1 n$}

1066:         \frac{2\nu(s\kappa(\Lambda))e_ns^2}

1067:              {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}

1068:         \left(\frac{e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}

1069:                    {1-e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}\right)

1070:         \label{eq:distortion-aroundzero}

1071: \end{eqnarray}

1072:

1073: \subsubsection{Comparison Against Wyner's Rate/Distortion Bound}

1074:

1075: Plugging eqns.~\eqref{eq:wynerziv-rdfunction}

1076: and~\eqref{eq:distortion-aroundzero} into~(\ref{eq:figure-of-merit}),

1077: we now get

1078: \begin{eqnarray*}

1079: \lim_{|\rho|\to 1} \frac {\bar{d}}{D(R)}

1080:   & = & \lim_{|\rho|\to 1}

1081:         \frac{G_n ||f_{X|Y}||_{\frac{n}{n+2}} e^{-2R}

1082:               + \mbox{$\frac 1 n$}

1083:                 \frac{2\nu(s\kappa(\Lambda))e_ns^2}

1084:                      {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}

1085:                 \left(\frac{e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}

1086:                            {1-e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}\right)}

1087:              {\sigma_X^2(1-\rho^2)e^{-2R}} \\

1088:   & = & G_n

1089:         \lim_{|\rho|\to 1}\frac{||f_{X|Y}||_{\frac{n}{n+2}}}

1090:                                {\sigma_X^2(1-\rho^2)}

1091:         + \;\; \lim_{|\rho|\to 1} \mbox{$\frac 1 n$}

1092:           \frac{2\nu(s\kappa(\Lambda))e_ns^2}

1093:                {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}

1094:           \left(\frac{e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}

1095:                      {1-e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}\right)

1096:           \frac{1}{\sigma_X^2(1-\rho^2)e^{-2R}}.

1097: \end{eqnarray*}

1098:

1099: From eqn.~(57) in~\cite{zador:quantization-asymptotics}, we have that

1100: $\lim_{n\to\infty} ||f_n||_{\frac{n}{n+2}} = e^{2h(f)}$, where $f_n=(f)^n$

1101: is the $n$-dimensional source distribution, and $h$ denotes differential

1102: entropy.  We don't know of a way to simplify this expression for small

1103: $n$, so we approximate it with its limit value as $n$ gets

1104: large.\footnote{It is important to emphasize that although we consider

1105: large blocks to simplify $||f_n||_{\frac{n}{n+2}}$, this does {\em not}

1106: mean that the distortion expression thus obtained is only valid for high

1107: dimensional quantizers: we can consider long source blocks, in which

1108: small sub-blocks are quantized with low dimensional codes (for example,

1109: {\em scalar} quantizers), and this form would still apply.}

1110: For the conditional Gaussian distribution,

1111: $h(f) = \frac 1 2 \log\big(2\pi e\sigma_X^2(1-\rho^2)\big)$, and hence

1112: \[ G_n\lim_{|\rho|\to 1}

1113:   \frac{\lim_{n\to\infty}||f_{X|Y}||_{\frac{n}{n+2}}}

1114:   {\sigma_X^2(1-\rho^2)} \;\;=\;\; G_n\;2\pi e. \]

1115: Note as well that the second term vanishes: for $|\rho|\to 1$,

1116: from~(\ref{eq:choice-s}) we have that $s^2/\big(\sigma_X^2(1-\rho^2)\big)

1117: \to\infty$, and thus this expression is dominated by the vanishing term

1118: $e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}$.

1119: Hence, we conclude that, by explicitly scaling the quantizers with

1120: $s$ satisfying conditions~(\ref{eq:choice-s}),

1121: \[ \lim_{|\rho|\to 1} \frac{\bar{d}}{D(R)} \;\;=\;\; G_n\; 2\pi e. \]

1122:

1123: Finally, since for $n$ large the upper and lower bounds on $G_n$

1124: given in eqn.~\eqref{eq:bounds-Gn} coincide and take the value

1125: $\frac 1{2\pi e}$~\cite[pg.\ 58]{neil:splag}, we see that indeed,

1126: as $n\to\infty$, there exist high-dimensional codes for which this

1127: limit can be made arbitrarily close to 1.  {\em Hence, asymptotically

1128: in rate and correlation, our code constructions achieve the Wyner-Ziv

1129: bound.}
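
To give a sense of scale for finite $n$, consider the scalar case as an illustration (the numbers below are standard facts about the integer lattice, not results derived in this paper): for $n=1$ the normalized second moment is $G_1=\frac{1}{12}$, and therefore
\[ \lim_{|\rho|\to 1} \frac{\bar{d}}{D(R)} \;\;=\;\; \frac{2\pi e}{12}
   \;\;\approx\;\; 1.42, \]
that is, roughly $1.53$~dB above the Wyner-Ziv bound; this gap closes as $G_n\to\frac{1}{2\pi e}$.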

1130:

1131: \subsection{Some Intuitive Remarks}

1132:

1133: \subsubsection{On the Optimality of our Codes, in Hindsight}

1134:

1135: Informally, these are the key elements contributing to the optimality

1136: of our codes:

1137: \begin{itemize}

1138: \item The codes are scaled in a way such that, as correlation

1139:   increases, the tails of the conditional distribution $f_{X|Y}$

1140:   outside a cell of the coarse quantizer become increasingly light.

1141: \item At high correlations, our scaling of the codes results

1142:   in the size of cells in the coarse quantizer being small.  But

1143:   at high rates, the size of a cell in the fine quantizer is negligible

1144:   even relative to the small coarse cells.  And the side information

1145:   is, with high probability, ``pinned'' within one of the small fine

1146:   quantizer cells.

1147: \item Because the tails of $f_{X|Y}$ are increasingly light

1148:   as correlation increases, and $f_{X|Y}$ is {\em not} uniform,

1149:   an optimal quantizer for a uniform distribution is mismatched

1150:   to the actual statistics of the data, thus resulting in a severe

1151:   penalty in rate.  However, this penalty can be eliminated entirely

1152:   in a very simple way: only changing the shape of the cells for

1153:   the fine quantizer is enough -- if the output point density of

1154:   the fine quantizer is matched to the pinned form of $f_{X|Y}$,

1155:   this is an optimal code.

1156: \end{itemize}

1157: Essentially, our construction is asymptotically optimal (in rate

1158: and correlation), because we scale the lattice in a way such that

1159: we create multiple copies of $f_{X|Y}$, one within each cell of

1160: the coarse lattice, and we use an optimal code within that cell.

1161:

1162: \subsubsection{On Why $R^*(d)=R_{X|Y}(d)$ for Gaussian Sources}

1163:

1164: This asymptotic analysis also sheds light on why there is no

1165: rate loss for Wyner-Ziv coding of Gaussian sources, at least in

1166: the asymptotic regime of high rates and high correlations.  Note

1167: that the conditional distribution $f_{X|Y}$ depends on the side

1168: information $\mathbf{y}$ only in the form of a random shift: this

1169: random shift becomes negligible at high rates, but more importantly,

1170: the {\em shape} of $f_{X|Y}$ is independent of $\mathbf{y}$.  As

1171: a result, a single code can be used to quantize the $f_{X|Y}$'s

1172: pinned one within each cell of the coarse lattice.  It is this

1173: invariance property of the conditional Gaussian distribution that

1174: results in $R^*(d)=R_{X|Y}(d)$, at least in the asymptotic

1175: regime considered in this section.

1176:

1177:

1178: \section{Applications in Sensor Networks}

1179: \label{sec:sensor-networks}

1180:

1181: \subsection{Discussion}

1182:

1183: Issues in the analysis of performance of wireless networks have received

1184: considerable attention in recent times.  To a large extent, interest in

1185: these topics has been sparked by an observation made by Gupta and Kumar:

1186: the total throughput that can be carried by one particular class of

1187: wireless networks is only $O(\sqrt{n})$,\footnote{A word on notation.

1188: In this section, $n$ denotes the number of nodes in the network, and $N$

1189: denotes block length.  This notation should not be confused with that

1190: in previous sections, where $n$ was used to refer to block length, and

1191: $N$ to the number of reconstruction codewords in a code.}

1192: for a network having $O(n)$ nodes~\cite{GuptaK:00}.  As a result, each

1193: source-destination pair gets a throughput of $O(1/\sqrt{n})$, i.e., the

1194: amount of information that any one individual node can inject into the

1195: network vanishes as the network size increases.  The model used for

1196: performance analysis in~\cite{GuptaK:00} was conceived as an abstraction

1197: for emerging ad-hoc wireless networks, made up of small appliances (such

1198: as laptop computers or microwave ovens or door locks), interconnected

1199: via standard air interfaces (such as Bluetooth or 802.11).  In that

1200: context, the fact that the capacity available to each node decreases as

1201: more nodes join the network clearly poses serious problems, since

1202: there is no reason to believe that there will be any dependencies in the

1203: data generated by each of these devices.  And these problems prompted

1204: the conclusion in~\cite{GuptaK:00} that networks with either a small

1205: number of nodes, or with a small number of connections, may be more

1206: likely to find acceptance.

1207:

1208: In our work, we consider a different type of wireless network: we

1209: focus on {\em sensor} networks, i.e., networks of devices that collect

1210: measurements of a process that is ``regular'' in some sense.  For example,

1211: if the sensors measure ozone concentration in the atmosphere, then the

1212: values of each measurement will not be independent in general, but instead

1213: will be constrained by an appropriate form of the Navier-Stokes equations.

1214: If the sensors measure temperatures at different locations of a material,

1215: the measurements will be constrained by Fourier's heat equations.  And

1216: in general, when the sensors sample values of some random process at

1217: different locations, these samples will be constrained by the correlation

1218: structure of the process (see, e.g.,~\cite{ServettoR:06}).  By considering

1219: correlated sources we generalize in what we believe is a very meaningful

1220: way the setup of~\cite{GuptaK:00}: now the amount of information generated

1221: by each node is no longer a constant, but instead it depends on the size

1222: of the network itself.

1223:

1224: \subsection{Network Model}

1225:

1226: Consider the following problem setup:

1227:

1228: \begin{itemize}

1229: \item There is a source of information, modeled by a process $X_u(k)$: for

1230:   fixed values of $k$, $X_u(k)$ is a Brownian motion with parameter

1231:   $\sigma_X^2$; for fixed values of $u\in[0,1]$, $X_u(k)$ is an iid sequence.

1232:   That is, at a fixed location $u$, iid samples with distribution

1233:   $N(0,\sigma_X^2u)$ are collected in discrete time, and at a fixed time

1234:   slot, a Wiener process unfolds in space.

1235: \item Network nodes are represented by points on the unit square

1236:   $[0,1]\times[0,1] \subset \mathbb{R}^2$, and are classified into

1237:   three groups:

1238:   \begin{itemize}

1239:   \item There are $n$ {\em source} nodes $s$, that feed information into

1240:     the network, uniformly spread on the left edge of the square.

1241:   \item There are $n$ {\em destination} nodes $d$, that take information

1242:     out of the network, uniformly spread on the right edge of the square.

1243:   \item There are $n$ {\em router} nodes $r$, optimally placed in

1244:     the interior of the square, to maximize network throughput.  These nodes

1245:     are pure routers: they neither inject information into the network nor

1246:     extract it, and they do not apply any form of coding; they only forward

1247:     information to other nodes.

1248:   \end{itemize}

1249: \item The $m$-th source collects samples of $X_{m/n}(k)$, and

1250:   encodes this information prior to sending it to the $m$-th destination

1251:   ($m=1,\ldots,n$).  The only information available to each source is:

1252:   \begin{itemize}

1253:   \item The observed samples $X_{m/n}(k)$.

1254:   \item The position in the square of all the nodes.

1255:   \item The statistics of the entire process $X$.

1256:   \end{itemize}

1257: \item Each destination node forwards whatever data it receives to a special

1258:   node $d$, which {\em jointly} decodes all the data received, and computes

1259:   an estimate $\hat{X}_u(k)$ of the entire sample path $X_u(k)$ based on all

1260:   the decoded samples $X_{m/n}(k)$.

1261: \item Nodes do not move, and have an unbounded power supply.

1262: \item A bit is successfully sent from node $v_i$ to node $v_j$ if

1263:   (a) $||v_i-v_j||<\Delta_i$, and (b) if for all other transmitting nodes

1264:   $v_k$, $||v_k-v_j|| \geq \Delta_k$.  $R$ bits per channel use can be

1265:   transmitted over any link.

1266: \item Routing and power control are optimally configured to maximize network

1267:   throughput.

1268: \end{itemize}

1269: Note that in this model we explicitly rule out the possibility of

1270: source nodes exchanging information to cooperate in the encoding of their

1271: observations.  Note also that routers only forward data, but do not apply

1272: any form of coding.  That is, encoding is distributed among the sensors,

1273: data is carried over the network by relay nodes, and decoding is

1274: performed at a central location.

1275:

1276: We should point out that our model is different from the model of Gupta

1277: and Kumar~\cite{GuptaK:00}: whereas in their model they consider $n$ nodes

1278: which serve as transmitters/receivers/relays all in a single device, we

1279: break up each device into three pieces, and consider $n$ transmitters, $n$

1280: receivers, and $n$ relays.  However, this is not a fundamental difference:

1281: as long as we keep the same number of all three types of devices,

1282: the two models are essentially the same, and therefore their results on

1283: the property of vanishing throughputs as $n\rightarrow\infty$ still hold

1284: for our model.  The idea of splitting the devices into three separate units

1285: is to model a situation in which data is captured at some location, is

1286: transported over an ad-hoc network, and an estimate of the field of

1287: measurements is formed at a remote location.

1288:

1289: \subsection{Encoding/Decoding Mechanics in Large Networks}

1290:

1291: Clearly, a network with a finite number of nodes and with communication

1292: links of finite capacity among nodes, can transport only a finite amount

1293: of information.  Therefore, exact reconstruction of the Brownian field

1294: $X_u(k)$ will not be possible in general, and a key issue then is that

1295: of understanding the rate/distortion tradeoffs involved.  A thorough study

1296: of this new rate/distortion problem lies outside the scope intended for

1297: this paper, and we will deal with this problem elsewhere.  Of interest

1298: in this paper however is a result that relates the ability of the central

1299: destination node $d$ to estimate the Brownian field $X_u(k)$ to both the

1300: number of nodes in the network and the capacity of the individual network

1301: links.  Indeed, we have that under the assumption of a large (but still

1302: independent of network size) link capacity $R$, for any $\epsilon>0$ and

1303: $1-\epsilon\leq\rho<1$, there exists a large enough network of size $n$

1304: nodes, such that

1305: \[

1306:     D_{\frac m n} \;\stackrel{\Delta}{=}\;

1307:     {\tt E}\left(||X_{\frac m n}(k)-\hat{X}_{\frac m n}(k)||^2\right) \;\leq\;

1308:       \sigma_X^2\mbox{\small$\frac{m-1}{n}$}(1\!-\!\rho^2)

1309:                     \;e^{-\frac{R}{6\sqrt{n}}}

1310:       \mbox{ (a.e.)},

1311: \]

1312: uniformly for $\frac m n$ in the closed interval

1313: $\left[\frac{1}{n(1-\rho^2)},1\!\right]$, where $m\leq n$ is an integer,

1314: for all time slots $k$, and for almost all sample paths of the field

1315: $X_{\frac m n}(k)$.

1316:

1317: Essentially, what this result states is that, under the assumption of a

1318: large network and with links of high capacity, it is possible for $d$

1319: to estimate the sample paths of $X$ with arbitrarily small error.  That

1320: accurate estimation is possible is indeed surprising to us, given the

1321: fact that the amount of information per sample that the network can

1322: carry vanishes~\cite{GuptaK:00}---fortunately, so does the information

1323: content per sample, and that is what we can take advantage of.

1324:

1325: \subsubsection{Placement of Nodes and Scheduling of Transmissions}

1326: \label{sec:placement-scheduling}

1327:

1328: First of all, we give one particular distribution of routers in the

1329: plane and one particular algorithm for scheduling transmissions.

1330:

1331: Assume $\ell = \sqrt{n}$ is an even integer, and define:

1332: \begin{itemize}

1333: \item The sources are located at coordinates $(0,\frac{i}{n})$, and the

1334:   destinations at coordinates $(1,\frac{i}{n})$, for $i=1,\ldots,n$.

1335: \item There are exactly $n$ routers, located at coordinates

1336:   $(\frac{1}{2\ell}+\frac{i}{\ell},\frac{1}{2\ell}+\frac{j}{\ell})$,

1337:   for $i,j=0,1,\ldots,\ell-1$.

1338: \item The transmission radius for the source nodes is

1339:   $\Delta=\frac{\sqrt{2}}{2\ell}$, and for the routers it is

1340:   $\Delta=\frac{1}{\ell}$.\footnote{Recall that destination nodes do not

1341:   communicate over the shared wireless medium with the central decoder,

1342:   they only receive data that way.  Therefore, no transmission range

1343:   needs to be specified in their case.}

1344: \end{itemize}

1345:

1346: In order to present an algorithm to schedule transmissions over time,

1347: we need some definitions.  First, divide the square

1348: $[0,1]\times[0,1]\subset\mathbb{R}^2$ into $\ell$ sets defined by

1349: \[

1350:    S^{(i)} = [0,1]\times\left[\frac{(i\!-\!1)\ell}{n},\frac{i\ell}{n}\right)

1351: \]

1352: $(i\!=\!1,\ldots,\ell)$.  Within each $S^{(i)}$, there are:

1353: \begin{itemize}

1354: \item $\ell$ source nodes, at coordinates

1355:   $\left(0,\frac{(i-1)\ell+m}{n}\right)$, for $m=0,\ldots,\ell-1$.

1356: \item $\ell$ destination nodes, at coordinates

1357:   $\left(1,\frac{(i-1)\ell+m}{n}\right)$, for $m=0,\ldots,\ell-1$.

1358: \item $\ell$ router nodes, at coordinates

1359:   $\left(\frac{1}{2\ell}+\frac{k-1}{\ell},

1360:          \frac{1}{2\ell}+\frac{i-1}{\ell}\right)$, for $k=1,\ldots,\ell$.

1361: \end{itemize}

1362: Next, we divide the router nodes into three groups $g_0,g_1,g_2$: a

1363: router falls in $g_j$ if its index $k$ is equal to $j$ (mod 3).  Source

1364: nodes all belong to the group $g_0$.  Finally, we give an algorithm to

1365: schedule transmissions:

1366: \begin{itemize}

1367: \item Time is discrete, and starts at 0.  At even time slots, allow

1368:   transmissions of nodes in $S^{(i)}$'s for which $i$ is even; at odd time

1369:   slots, allow transmissions of nodes for odd $i$'s.

1370: \item Each $S^{(i)}$ keeps its own clock $\tau_i$, which advances only

1371:   when transmissions from this $S^{(i)}$ are allowed to proceed: when

1372:   $\tau_i\equiv 0$ (mod 3) then $g_0$ sends, when $\tau_i\equiv 1$ (mod 3)

1373:   then $g_1$ sends, when $\tau_i\equiv 2$ (mod 3) then $g_2$ sends.  And

1374:   source nodes send only once every $\ell$ available slots, cycling through

1375:   them in round-robin order.

1376: \end{itemize}

1377:

1378: An illustration of the placement and divisions of nodes, and of the

1379: mechanics of the algorithm, is shown in Fig.~\ref{fig:step1}.

1380:

1381: \begin{figure}[!h]

1382: \vspace{-3mm}

1383: \centerline{\psfig{file=layout-and-schedule.eps,height=10cm,width=12cm}}

1384: \vspace{-3mm}

1385: \caption{An example of the placement and division of nodes, and

1386:   scheduling of transmissions, for $n=16$ ($\ell=4$).  Black dots represent

1387:   nodes: 16 sources on the left edge of the square, 16 routers inside the

1388:   square, 16 destinations on the right edge of the square.  A source sends

1389:   data to a destination on the same horizontal line.  Thin solid lines

1390:   joining nodes are

1391:   routes.  The sets $S^{(i)}$ and the groups $g_i$ are indicated with dotted

1392:   lines.  Active transmissions are indicated with a thick arrow, and the

1393:   circles around each indicate transmission ranges.  The active

1394:   transmissions in this picture correspond to an odd time slot (nodes only

1395:   within $S^{(1)}$ and $S^{(3)}$ are sending), and the group $g_0$ is active.}

1396: \label{fig:step1}

1397: \end{figure}

1398:

1399: \subsubsection{Throughput per-Node is $\frac{R}{6\sqrt{n}}$}

1400:

1401: The calculation of throughput proceeds in three steps:

1402: \begin{enumerate}

1403: \item Each group $S^{(i)}$ is scheduled for transmission only $\frac{1}{2}$

1404:   of the available time slots.  Among these slots, only $\frac{1}{3}$ are

1405:   available for transmission by $g_0$, the group that contains source nodes.

1406:   When this group is scheduled, only once every $\ell$ slots is available

1407:   to a particular node.  And when a particular node finally gets its

1408:   chance to inject a message into the network, it injects $R$ bits (equal

1409:   to link capacity).  Therefore, the total number of bits {\em injected} by

1410:   any one source node per unit of time is

1411:   $\frac{1}{2}\frac{1}{3}\frac{1}{\ell}R=\frac{R}{6\sqrt{n}}$.

1412: \item By construction, there is never more than one packet of $R$ bits in

1413:   the buffer of any router.

1414: \item Also by construction, there is never more than one active transmission

1415:   within range of any receiver.

1416: \end{enumerate}

1417: So, from 1 we have that $\frac{R}{6\sqrt{n}}$ bits per time slot are

1418: injected into the network, from 2 we have that there is no buildup of

1419: packets in any one queue, and from 3 we have that packets are never lost

1420: or delayed.  Therefore, all injected bits reach their destinations, and hence

1421: the throughput is $\frac{R}{6\sqrt{n}}$ bits per time slot per node.
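
As a worked instance of this count, take the network of Fig.~\ref{fig:step1}, with $n=16$ and $\ell=4$.  A given source node is allowed to transmit in a fraction
\[
   \frac{1}{2}\cdot\frac{1}{3}\cdot\frac{1}{4} \;=\; \frac{1}{24}
\]
of all time slots, so it injects $\frac{R}{24}=\frac{R}{6\sqrt{16}}$ bits per time slot; put differently, each source sends, on average, one packet of $R$ bits every 24 slots.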

1422:

1423: \subsubsection{Use of Codes with Side Information}

1424: \label{sec:use-lqsi}

1425:

1426: So far we have a network in which there is no loss of data, and which

1427: can carry a total of $\frac{R}{6\sqrt{n}}$ bits per time slot per node.

1428: And we collect one sample of the Brownian field $X$ per time slot at

1429: each source node.  Therefore, we have $\frac{R}{6\sqrt{n}}$ bits per

1430: sample to encode a block of $N$ samples, for which the network guarantees

1431: delivery.

1432:

1433: Consider encoding a block of samples $X_{m/n}^N

1434: \stackrel{\Delta}{=} [X_{m/n}(0),\ldots,X_{m/n}(N-1)]$ at the $m$-th

1435: source node.  Trivially, we have that

1436: $X_{m/n}^N = X_{(m-1)/n}^N +

1437: (X_{m/n}^N-X_{(m-1)/n}^N)$.  From standard properties of

1438: Wiener processes, we have that $X_{m/n}^N$ and $X_{(m-1)/n}^N$

1439: are jointly Gaussian, and that the increment has distribution

1440: \[

1441:    X_{m/n}^N-X_{(m-1)/n}^N

1442:      \;\sim\; N\left(0,\mbox{$\frac{\sigma_X^2}{n}$}{\bf I}\right),

1443: \]

1444: independent of $X_{(m-1)/n}^N$.  If $X_{(m-1)/n}^N$ were

1445: available at the $m$-th encoder, the encoding procedure would be trivial:

1446: use standard codes for an iid Gaussian source to send this increment.  But

1447: without the reference value $X_{(m-1)/n}^N$, $m$ cannot compute that

1448: increment, which is the only ``new'' information at location

1449: $\frac{m}{n}$.

1450:

1451: Our encoding procedure is as follows: we encode $X_{m/n}^N$ using the

1452: codes developed in earlier sections, assuming the side information

1453: $X_{(m-1)/n}^N$ is available at the decoder.  The relevant statistics

1454: are:

1455: \[

1456:    X_{(m-1)/n}^N

1457:      \sim N\left(0,\mbox{$\frac{\sigma_X^2(m-1)}{n}$}{\bf I}\right),\hspace{8mm}

1458:    X_{m/n}^N \sim N\left(0,\mbox{$\frac{\sigma_X^2m}{n}$}{\bf I}\right),\hspace{8mm}

1459:    \rho_{m-1,m} = \sqrt{1-1/m}.

1460: \]
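
For completeness, here is a sketch of how these statistics follow, assuming the standard Wiener covariance $E(X_uX_v)=\sigma_X^2\min(u,v)$ for each coordinate (an assumption consistent with the marginals listed above):
\[
   \rho_{m-1,m}
     \;=\; \frac{\sigma_X^2\frac{m-1}{n}}
                {\sqrt{\sigma_X^2\frac{m-1}{n}\;\sigma_X^2\frac{m}{n}}}
     \;=\; \sqrt{\frac{m-1}{m}}
     \;=\; \sqrt{1-\frac{1}{m}},
\]
and the conditional variance of each coordinate of $X_{m/n}^N$ given $X_{(m-1)/n}^N$ is
\[
   \sigma_X^2\,\frac{m}{n}\,\big(1-\rho_{m-1,m}^2\big)
     \;=\; \sigma_X^2\,\frac{m}{n}\cdot\frac{1}{m}
     \;=\; \frac{\sigma_X^2}{n},
\]
which matches the variance of the increment.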

1461:

1462: \subsection{Distortion Computation}

1463:

1464: Next we turn to the computation of distortion for this proposed coding

1465: strategy.  Note that since the side information used to decode the data

1466: generated by one node is the data available at previous nodes, and that

1467: decoding errors can indeed occur with non zero probability (and thus,

1468: in the large-network regime, {\em will} occur), an important issue that

1469: needs to be addressed is the effect of decoding errors on the overall

1470: achieved distortion.

1471:

1472: We proceed in two steps: first we compute the distortion resulting

1473: in the case when no decoding errors occur, and then we compute the increase

1474: in distortion due to decoding errors.

1475:

1476: \subsubsection{Distortion Assuming No Decoding Errors}

1477:

1478: Consider a fixed location $\frac m n$ ($1\leq m\leq n$), a fixed

1479: desired correlation value $\rho$ based on which a large enough value

1480: of $n$ is determined, and assume that no decoding errors occur in

1481: decoding the samples at locations $\frac 1 n,\ldots,\frac{m-1}n$.

1482:

1483: In Section~\ref{sec:use-lqsi} above, we argued that we can use

1484: codes with side information to effectively approximate the performance

1485: of a genie-aided encoder capable of sending the increments at each node.

1486: We would like to point out now that in our decoder, the side information

1487: is itself quantized with the coarse lattice.  As a result, as long as

1488: $X_\frac{m-1}n$ and $\hat X_\frac{m-1}n$ fall in the same sublattice

1489: cell, the reconstruction $\hat X_\frac m n$ is as good as if it were

1490: based on {\em uncoded} side information.  This is illustrated in

1491: Fig.~\ref{fig:coded-sideinfo}.

1492:

1493: \begin{figure}[!ht]

1494: \centerline{\psfig{file=coded-sideinfo.eps,height=14cm}}

1495: \caption{To illustrate the robustness of the proposed quantizers

1496:   to small amounts of quantization noise in the side information: as long

1497:   as the side information falls within a sublattice cell (roughly indicated

1498:   as the shaded region in this picture), using coded or uncoded side

1499:   information does not make a difference.  In this case, $X^N_{\!\frac{m-1}n}$

1500:   is the sample at the previous location, used as side information for the

1501:   sample $X^N_{\!\frac m n}$ at the current location.}

1502: \label{fig:coded-sideinfo}

1503: \end{figure}

1504:

1505: Thus we conclude that, provided no decoding errors occur in any of the

1506: previous samples, and based on the results in Section~\ref{sec:asymptotics},

1507: we can approximate the distortion in the reproduction of each sample

1508: by the Wyner-Ziv rate/distortion bound:

1509: \[

1510:    D_\frac{m}{n} \:\leq\:

1511:     \sigma_X^2\mbox{$\frac{m}{n}$}(1\!-\!\rho^2)\;e^{-\frac{R}{6\sqrt{n}}}.

1512: \]

1513: Note that the inequality in this case is because there will be nodes

1514: operating with a correlation value higher than the specified $\rho$, and

1515: for these values $D_u$ will be even lower than this.  The location-dependent

1516: correlation coefficients $\rho_{m-1,m}$ between adjacent samples form a

1517: monotonically increasing sequence $\sqrt{1-1/m}\longrightarrow 1$ as

1518: $m\rightarrow\infty$.  A trivial manipulation shows that for all

1519: $m\geq\frac{1}{1-\rho^2}$, $\rho\leq\rho_{m-1,m}<1$, and therefore all node

1520: locations $\frac{m}{n}$ in the closed interval $\left[\frac{1}{n(1-\rho^2)},

1521: 1\right]$ will have correlation values at least $\rho$.  Now, since

1522: $m\leq n$, by choosing $n$ large enough we can make $\frac{1}{n(1-\rho^2)}$

1523: come arbitrarily close to zero.  So we see that the distortion bound above

1524: holds uniformly for almost all samples in a large network.
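
The manipulation referred to above can be written out as follows, for $0\leq\rho<1$ (the numerical values are only an illustrative choice):
\[
   \rho_{m-1,m}\geq\rho
   \;\iff\; 1-\frac{1}{m}\geq\rho^2
   \;\iff\; m\geq\frac{1}{1-\rho^2}.
\]
For example, with $\rho=0.99$ we get $\frac{1}{1-\rho^2}=\frac{1}{0.0199}\approx 50.3$, so every node with index $m\geq 51$ operates at a correlation of at least $0.99$, and in a network of $n=10^4$ nodes these locations cover the interval $[0.0051,1]$.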

1525:

1526: At locations $u$ in which there is no sample collected (i.e., any location

1527: in an open interval $\left(\frac{m-1}{n},\frac{m}{n}\right)$), we need to

1528: interpolate $X_u$: we define $\hat{X}_u = \hat{X}_{(m-1)/n}$, where

1529: $(m-1)/n<u<m/n$.\footnote{Note that we could use better interpolators here

1530: than a simple zero-order hold.  But already with this rather simple-minded

1531: rule we get the sought result of vanishing estimation error, and hence we

1532: keep it for simplicity.}  In this case,

1533: \[

1534:    D_u \leq D_\frac{m-1}{n} + \mbox{$\frac{\sigma_X^2}{n}$},

1535: \]

1536: since the interpolation error is at most the size of an increment

1537: between samples, and this increment has variance $\sigma_X^2/n$.  Assume

1538: now that the sample path $X_u(k)$ is continuous at $u$:

1539: \begin{itemize}

1540: \item Because $n$ is large, and for a fixed $k\in\mathbb{N}$, we have

1541:   a dense sampling of $X_u(k)$, $0\leq u\leq 1$.

1542: \item Because $R$ is large, encoded samples $\hat{X}_u$ available at

1543:   the decoder are close to the original value $X_u$, i.e.,

1544:   $\hat{X}_u\rightarrow X_u$, $u=\frac{m}{n}$.

1545: \item Because $X_u$ is continuous and $n$ is large, we have that

1546:   interpolated samples $X_u\approx X_{(m-1)/n}$

1547:   ($\frac{m-1}{n}<u<\frac{m}{n}$), for all $0\leq u\leq 1$.

1548: \end{itemize}

1549: Therefore, $D_u \leq D_\frac{m-1}{n} + \frac{\sigma_X^2}{n}$ holds at

1550: all points of continuity of $X_u$.  But finally, since almost all paths

1551: of a Wiener process are continuous~\cite{StarkW:94}, we conclude that

1552: \[

1553:    D_u \;\leq\;

1554:    \sigma_X^2\left(\mbox{$\frac{m-1}{n}$}(1\!-\!\rho^2)

1555:    \;e^{-\frac{R}{6\sqrt{n}}}+\mbox{$\frac{1}{n}$}\right)

1556:    \;\; \mbox{(a.e.),}

1557: \]

1558: where $(m-1)/n<u<m/n$, and $1\leq m\leq n$.

1559:

1560: \subsubsection{Distortion Excess Due to Decoding Errors}

1561:

1562: In the subsection above we obtained an expression for the distortion

1563: in the reconstruction of the sample paths assuming that decoding errors

1564: never occur.  This is clearly a lower bound on the achievable distortion.

1565: But we still need to account for the distortion increase that results

1566: from the increasingly likely (as $n\to\infty$) event of a decoding error.

1567: Our next goal is to show that, in large networks, this excess distortion

1568: is negligible compared to the distortion above induced by the quantizers.

1569:

1570: Consider two definitions:

1571: \begin{itemize}

1572: \item $\Upsilon_m$ is a random variable such that $\Upsilon_m = l$ denotes

1573:   the event in which $l$ nodes (out of the $m$ right before the node at

1574:   location $\frac m n$) make a decoding error.  Since conditioned on the

1575:   side information being correct, errors are independent at each node,

1576:   $\Upsilon_m \sim \mbox{B}(m,p_n)$: a binomial distribution with parameters

1577:   $m =$ number of previous nodes, and $p_n = $ probability of decoding

1578:   error given that there are $n$ nodes in the network.

1579: \item We refer to the term $\beta$ defined by eqn.~(\ref{eq:def-alpha-beta})

1580:   as the {\em excess distortion} at node $m$.

1581: \end{itemize}

1582: Both these definitions are illustrated in Fig.~\ref{fig:excess-distortion}.

1583:

1584: \begin{figure}[ht]

1585: \centerline{\psfig{file=excess-distortion.eps,width=15cm,height=10cm}}

1586: \caption{To illustrate the concept of excess distortion.  In this picture

1587:   we show the reconstruction that would result when no decoding errors

1588:   occur (bottom sample path), and the effects of decoding errors (jumps

1589:   of average size $\sqrt{\beta}$, as defined in eqn.~(\ref{eq:def-alpha-beta}),

1590:   after each decoding error).  Note that these errors do not necessarily

1591:   add up coherently from node to node, as illustrated in this picture --

1592:   however, taking them to behave in this way provides a valid upper bound

1593:   on the total excess distortion they induce.}

1594: \label{fig:excess-distortion}

1595: \end{figure}

1596:

1597: Consider now the distortion in a reconstruction of $X_{\frac m n}$ based

1598: on coded side information:

1599: \begin{eqnarray*}

1600: E\big(||X_{\frac m n}-\hat{X}_{\frac m n}||^2\big)

1601:   & \stackrel{(a)}{\approx} &

1602:     \alpha_n+\sum_{l=0}^m P(\Upsilon_m = l) \left(l\sqrt{\beta_n}\right)^2 \\

1603:   & = & \alpha_n+\beta_n E(\Upsilon_m^2)

1604:   \;\; = \;\;

1605:         \alpha_n+\beta_n \big(\mbox{Var}(\Upsilon_m)+E^2(\Upsilon_m)\big) \\

1606:   & \stackrel{(b)}{=} & \alpha_n+\beta_n\big(m p_n (1-p_n) + m^2 p_n^2\big)

1607:   \;\; = \;\; \alpha_n+\beta_n m p_n (1+(m-1)p_n) \\

1608:   & \stackrel{(c)}{\leq} & \alpha_n+\beta_n n p_n(1+np_n)

1609:   \;\; \approx \;\; \alpha_n + \beta_n n^2 p_n^2 \\

1610:   & \stackrel{(d)}{\approx} & \alpha_n + e^{-\frac{n}{2\sigma_X^2}} n^2 p_n^2 \\

1611:   & \stackrel{\Delta}{=} & \alpha_n + \beta'_n

1612: \end{eqnarray*}

1613: where:

1614: \begin{itemize}

1615: \item[(a)] follows from eqn.~(\ref{eq:def-alpha-beta}), and from the fact

1616:   that if $l$ errors occurred before the decoding of the $m$-th sample, on

1617:   average each error contributes distortion $\beta_n$ and in the worst of

1618:   cases all these errors add up coherently (the dependence of $\alpha$ and

1619:   $\beta$ in eqn.~(\ref{eq:def-alpha-beta}) on $n$ is highlighted by adding

1620:   the subscript);

1621: \item[(b)] follows from the binomial distribution of $\Upsilon_m$;

1622: \item[(c)] follows from the fact that the expression above must hold for

1623:   all $1\leq m\leq n$;

1624: \item[(d)] follows from the fact that for $n$ large, we can neglect the

1625:   polynomial terms associated with the negative exponential, and from the

1626:   fact that $\rho = \sqrt{1-\frac 1 n}$.

1627: \end{itemize}
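Steps (b) and (c) can be expanded as follows, using only the mean $mp_n$ and the variance $mp_n(1-p_n)$ of a binomial random variable:
\[
   E(\Upsilon_m^2)
     \;=\; mp_n(1-p_n)+m^2p_n^2
     \;=\; mp_n\big(1-p_n+mp_n\big)
     \;=\; mp_n\big(1+(m-1)p_n\big)
     \;\leq\; np_n(1+np_n),
\]
where the inequality uses $m\leq n$ and $m-1\leq n$.  As a purely illustrative check, $m=3$ and $p_n=\frac{1}{10}$ give $0.27+0.09=0.36=0.3\,(1+0.2)$.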

1628: Clearly, as $n\to\infty$, both $\alpha_n\to 0$ and $\beta'_n\to 0$.

1629: But again, this is not an interesting observation.  The interesting

1630: observation in this case is that still in the presence of coded side

1631: information and decoding errors, in the regime of high correlations,

1632: $\beta'_n$ is negligible compared to $\alpha_n$, and

1633: $E\big(||X_{\frac m n}-\hat{X}_{\frac m n}||^2\big)\approx\alpha_n$:

1634: \[\begin{array}{ccccccccc}

1635: \lim_{n\to\infty} \frac{\alpha_n+\beta'_n}{\alpha_n}

1636:   & = & 1 + \lim_{n\to\infty} \frac{\beta'_n}{\alpha_n}

1637:   & \leq & 1 + \lim_{n\to\infty}

1638:                \frac{\beta'_n}{\sigma_X^2\mbox{$\frac 1 n$}}

1639:   & \leq & 1 + \lim_{n\to\infty}

1640:                \frac{e^{-\frac{n}{2\sigma_X^2}} n^3 p_n^2}{\sigma_X^2}

1641:   & < & 1+\epsilon,

1642: \end{array}\]

1643: for any $\epsilon>0$ and $n$ large enough.  But we also have

1644: $\frac{\alpha_n+\beta'_n}{\alpha_n}>1$ (since $\beta'_n>0$).  Thus,

1645: the excess distortion due to the use of coded side information and

1646: possible decoding errors is negligible compared to the distortion

1647: induced by the quantizers themselves.

1648:

1649: To conclude this section, we would like to point out that there is an

1650: interesting tradeoff in this analysis, that works out favorably for us.

1651: Note that by increasing the number of nodes, we increase the number of

1652: places at which errors can occur, and therefore the probability that

1653: some node will make a decoding error is increased.  However, as the

1654: number of nodes increases, the correlation between their measurements

1655: increases as well, and therefore the size of errors is reduced.  And

1656: as the previous analysis shows, a linear increase in the number of nodes

1657: results in an exponential decrease in the size of each error -- hence,

1658: error propagation is {\em not} a problem in this setup.

1659:

1660:

1661: \section{Conclusions}

1662: \label{sec:conclusions}

1663:

1664: In this paper we presented our work on the design and performance

1665: analysis of codes for the problem of rate distortion with side

1666: information, and on the application of those codes in the context

1667: of a problem of data compression for sensor networks.  First, we

1668: gave concrete constructions for the nested codes studied by

1669: Shamai/Verd\'u/Zamir in~\cite{shamai-verdu-zamir:systematic-lossy-coding,

1670: zamir-shamai:almost-there}, effectively answering an open question

1671: raised in~\cite{zamir-shamai:almost-there}.  Then we studied the

1672: distortion performance of our codes, under the assumption of high

1673: correlation between the source and the side information and of

1674: high coding rates: there we showed that our codes attain the

1675: theoretically optimal distortion decay established by Wyner and

1676: Ziv~\cite{Wyner:78, WynerZ:76}.  Finally we computed an upper bound

1677: on the error made in estimating a Brownian field based on measurements

1678: collected by very ``cheap'' devices and delivered over a wireless

1679: network.  In this case, even though the per-node throughput of the

1680: network vanishes as its size increases, and even if the nodes are

1681: not allowed to exchange any information at all, we showed how

1682: arbitrarily accurate estimation of the remote field is possible.

1683: To conclude the paper, we would like to comment on some issues that

1684: follow from our work.

1685:

1686: Concerning the problem of source estimation, in the presence of constraints

1687: on the available data imposed by the wireless network:

1688:

1689: \begin{itemize}

1690:

1691: \item The Brownian model for the source considered in this work is

1692:   probably one of the worst cases we could have considered, in the sense

1693:   that the regularity conditions satisfied by this process are minimal.

1694:   For example, almost all of its sample paths are indeed continuous at

1695:   almost all points (something we did use in our analysis); but at the

1696:   same time, almost all sample paths are {\em not} differentiable at almost

1697:   all points.  Furthermore, the crucial assumption of high-resolution

1698:   quantization that enabled us to apply our codes in the presence of

1699:   {\em coded} side information cannot be justified for processes with

1700:   increments of variance $O(n^{-1+\epsilon})$, for any

1701:   $\epsilon>0$---compare this to the $O(n^{-1})$ variance of the increments

1702:   of the model we considered.

1703:

1704: \item Interesting questions arise if we consider processes more regular

1705:   than Brownian motion: consider for example the case when $X_u$ is a

1706:   bandlimited signal (since $X_u$ is compactly supported, take its periodic

1707:   extension).  If the samples $X_{m/n}$ were available at the decoder

1708:   without distortion, it follows from Shannon's sampling theorem that

1709:   a network of finite size is enough to achieve a reconstruction with

1710:   zero distortion.  However, this would require network links of infinite

1711:   capacity.  For any finite value of $R$, there are tradeoffs to explore

1712:   between the number of nodes in the network (i.e., the sampling rate) and

1713:   the capacity of the network links (i.e., the accuracy in the representation

1714:   of each sample), since economic constraints may favor one or the other

1715:   option.  This problem has received considerable attention in the signal

1716:   processing and harmonic analysis literature~\cite{CvetkovicV:98, FuchsD:00,

1717:   GoyalVT:98, KrimTMD:99, ThaoV:94}.

1718:

1719: \end{itemize}
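
As an illustration of the bandlimited case mentioned above, consider the
following sketch.  It rests on assumptions of ours that are not part of the
model in this paper: the periodic extension of $X_u$ has period 1 and at most
$K$ harmonics, where $K$ and the coefficients $a_k$ are symbols introduced
only for this example.  Write
\[
  X_u \;\; = \;\; \sum_{k=-K}^{K} a_k\, e^{j2\pi k u}.
\]
Then any $n\geq 2K+1$ uniformly spaced noiseless samples $X_{\frac m n}$,
$m=0,\dots,n-1$, determine the coefficients through
\[
  a_k \;\; = \;\; \mbox{$\frac 1 n$}\sum_{m=0}^{n-1}
      X_{\frac m n}\, e^{-j2\pi k m/n}, \qquad |k|\leq K,
\]
so a network of $2K+1$ nodes would suffice for zero distortion if each sample
could be conveyed exactly.  For finite $R$, each $X_{\frac m n}$ is known only
up to a quantization error, and adding nodes beyond $2K+1$ trades link
capacity for sampling density.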

1720:

1721: Concerning coding/quantization.  Whereas our asymptotic analysis was

1722: performed only for jointly Gaussian sources and MSE distortion, it would

1723: be interesting to learn something about the performance of the proposed

1724: quantizers for sources with non-Gaussian statistics and/or other

1725: distortion measures.  An interesting result of Zamir states that,

1726: although the gap between $R_X(d)$ and $R_{X|Y}(d)$ can be unbounded, the

1727: gap between the Wyner-Ziv rate/distortion function $R^*_X(d)$ and

1728: $R_{X|Y}(d)$ is bounded, and actually quite small in some cases: 0.5

1729: bits/sample for arbitrary source statistics and MSE distortion, and 0.22

1730: bits/sample for a binary source with Hamming distortion~\cite{Zamir:96}.

1731: In our opinion this is an interesting issue because, should a result

1732: similar to Zamir's hold for the performance of our codes, this would

1733: immediately allow us to conclude that arbitrarily accurate estimation

1734: is possible not just for jointly Gaussian sources, but for any source

1735: statistics.  And even if we do not have a formal proof, it certainly

1736: seems plausible to us that this may be so.

1737:

1738: Concerning the type of asymptotics developed in this work.  Tools

1739: employed for theoretical performance analysis in source coding problems

1740: can be roughly classified into two main groups:

1741: \begin{itemize}

1742: \item Large-block asymptotics, as pioneered by

1743:   Shannon~\cite{Shannon:59}.

1744: \item High-rate asymptotics, as pioneered by Zador, Gersho and

1745:   others~\cite{gersho:quantization-asymptotics,zador:quantization-asymptotics}.

1746: \end{itemize}

1747: The asymptotics we considered in this work are of neither type -- instead,

1748: we focused on {\em high-correlation} asymptotics.  And we believe this

1749: type of analysis is one particularly well suited for a new class of source

1750: coding problems, that originate in the context of sensor networks.  This

1751: paper presents one such analysis for a simple toy problem involving a

1752: Brownian process.  More of our work along these lines can be found

1753: in~\cite{LilisZS:04, ScaglioneS:03, ServettoR:06}.

1754:

1755: To conclude, we would like to comment on the nature of our contributions

1756: in this paper.  Since the seminal work of Gupta and Kumar~\cite{GuptaK:00},

1757: most of the theory work on wireless networks appears to have been driven

1758: by a desire to find ways to understand, and if possible circumvent, the

1759: fact that the per-node throughput of the network vanishes as the number

1760: of nodes grows.  Implicit in previous work seems to have been an

1761: assumption that each node has a constant amount of information to transmit,

1762: irrespective of the network size: in this case, the fact that the throughput

1763: per node decreases as the network size increases does indeed pose serious

1764: problems.  However, we feel the asymptotic analysis of~\cite{GuptaK:00} is

1765: better suited to ``networks of small sensors'' than to ``networks of laptop

1766: computers'': whereas there are only so many laptops that one may want to

1767: have in a single room, much higher densities of small sensing nodes are

1768: conceivable.  Yet it is very high densities of nodes that the asymptotic

1769: analysis of~\cite{GuptaK:00} suggests to us.  Now, in the context of sensor

1770: networks, the vanishing-throughput property of some wireless networks is

1771: much less of a problem.  As an application for our codes with side

1772: information, we illustrated an instance of a class of wireless networking

1773: problems in which, as the size of the network grows, the amount of

1774: information generated by each transmitter decays at the same speed as the

1775: per-node throughput does.  Hence, contrary to the conclusions suggested

1776: in~\cite{GuptaK:00}, designers of these networks should be {\em encouraged}

1777: to consider very large numbers of nodes, for doing so may result in

1778: improved quality of the signals reconstructed at the receivers, and it

1779: may also make more economic sense.

1780:

1781:

1782: \bigskip

1783: \noindent {\bf Acknowledgements.}  The author would like to thank

1784: Toby Berger, for much needed encouragement and guidance provided

1785: at difficult times; Anna Scaglione, for discussions which resulted

1786: in a solution to a toy problem closely related to this

1787: one~\cite{ScaglioneS:03}; Martin Vetterli, for discussions on the

1788: work of Gupta and Kumar~\cite{GuptaK:00} that greatly contributed

1789: to his understanding of that work; and the anonymous referees, for

1790: their most insightful questions and constructive feedback, which

1791: led to a much improved manuscript.  The author also benefited from

1792: several conversations with V.\ A.\ Vaishampayan and N.\ J.\ A.\ Sloane,

1793: on quantization theory and lattices, in the context of some previous

1794: work~\cite{VaishampayanSS:01}.

1795:

1796:

1797: \pagebreak

1798: \appendix

1799:

1800: \subsection{Bounding $\beta$}

1801: \label{app:trivial1}

1802:

1803: Recall from Section~\ref{sec:average-error},

1804: \[

1805:   \beta\;\;\stackrel{\Delta}{=}\;\; \mbox{$\frac 1 n$}

1806:         \sum_{\lambda\in s\kappa(\Lambda)\backslash\{{\bf 0}\}}

1807:         \int_{{\bf x}\in V[\lambda:s\kappa(\Lambda)]}

1808:         ||{\bf x}-\big(\lambda+\gamma_k({\bf x})\big)||^2 f_{X|Y}({\bf x}|\xi)

1809:         \mbox{d}{\bf x},

1810: \]

1811: for any $\xi\in V[{\bf 0}:s\Lambda]$.  Our goal next is to give an

1812: estimate for $\beta$.

1813:

1814: Since each term of the sum is positive, we have

1815: a trivial lower bound: $\beta \geq 0$.  As for an upper bound:

1816: \begin{eqnarray}

1817: \beta

1818:   & \stackrel{(a)}{=} & \mbox{$\frac 1 n$}

1819:         \sum_{\lambda\in s\kappa(\Lambda)\!\setminus\{0\}}

1820:         \int_{V[\lambda:s\kappa(\Lambda)]}

1821:         ||{\bf x}-\big(\lambda+\gamma_k({\bf x})\big)||^2

1822:         \frac{1}{[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}} \;

1823:         e^{-\frac{n}{2(1-\rho^2)}||\frac{1}{\sigma_X}{\bf x}

1824:                                    -\frac{\rho}{\sigma_Y}{\xi}||^2}

1825:         \mbox{d\bf x}

1826:         \nonumber \\

1827:   & \stackrel{(b)}{\leq} & \mbox{$\frac 1 n$}

1828:         \sum_{\lambda\in s\kappa(\Lambda)\!\setminus\{0\}}

1829:         \Bigg[\int_{V[\lambda:s\kappa(\Lambda)]}

1830:         ||{\bf x}||^2

1831:         \frac{1}{[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}} \;

1832:         e^{-\frac{n}{2(1-\rho^2)}||\frac{1}{\sigma_X}{\bf x}

1833:                                    -\frac{\rho}{\sigma_Y}{\xi}||^2}

1834:         \mbox{d\bf x}

1835:         \nonumber \\ & & \mbox{\hspace{2cm}} +

1836:         \int_{V[\lambda:s\kappa(\Lambda)]}

1837:         ||\lambda+\gamma_k({\bf x})||^2

1838:         \frac{1}{[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}} \;

1839:         e^{-\frac{n}{2(1-\rho^2)}||\frac{1}{\sigma_X}{\bf x}

1840:                                    -\frac{\rho}{\sigma_Y}{\xi}||^2}

1841:         \mbox{d\bf x}\Bigg]

1842:         \nonumber \\

1843:   & \stackrel{(c)}{\approx} & \mbox{$\frac 1 n$}

1844:         \sum_{\lambda\in s\kappa(\Lambda)\!\setminus\{0\}}

1845:         2||\lambda||^2

1846:         \frac{1}{[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}} \;

1847:         e^{-\frac{n}{2\sigma_X^2(1-\rho^2)}||\lambda||^2}

1848:         \left(\int_{V[\lambda:s\kappa(\Lambda)]}\mbox{d\bf x}\right)

1849:         \nonumber \\

1850:   & = & \mbox{$\frac 1 n$}

1851:         \frac{2\nu(s\kappa(\Lambda))}{[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}

1852:         \sum_{\lambda\in s\kappa(\Lambda)\!\setminus\{0\}}

1853:         ||\lambda||^2

1854:         \;e^{-\frac{n}{2\sigma_X^2(1-\rho^2)}||\lambda||^2}

1855:         \nonumber \\

1856:   & = & \mbox{$\frac 1 n$}

1857:         \frac{2\nu(s\kappa(\Lambda))}{[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}

1858:         \sum_{\lambda\in \kappa(\Lambda)\!\setminus\{0\}}

1859:         ||s\lambda||^2

1860:         \;e^{-\frac{n}{2\sigma_X^2(1-\rho^2)}||s\lambda||^2}

1861:         \nonumber \\

1862:   & \stackrel{(d)}{=} & \mbox{$\frac 1 n$}

1863:         \frac{2\nu(s\kappa(\Lambda))s^2}

1864:              {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}

1865:         \sum_{m=1}^\infty N_m(\kappa(\Lambda))

1866:         \;e^{-\frac{s^2n}{2\sigma_X^2(1-\rho^2)}m}

1867:         \label{eq:b2}

1868: \end{eqnarray}

1869: where:

1870: \begin{itemize}

1871: \item[(a)] is just a substitution for the conditional Gaussian distribution;

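1872: \item[(b)] follows from bounding $||a-b||^2$ in terms of $||a||^2+||b||^2$ (in general $||a-b||^2 \leq 2\big(||a||^2+||b||^2\big)$, and the extra factor of 2 only affects the constant in front of the bound);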

1873: \item[(c)] is because of two reasons: under the assumption that

1874:   sublattice cells are small, we have $||{\bf x}||^2\approx||\lambda||^2$

1875:   (when ${\bf x}\in V[\lambda:s\kappa(\Lambda)]$); and under the further

1876:   assumption that $R$ is large, $||\gamma_k||^2$ is negligible compared

1877:   to $||\lambda||^2$ (when $\lambda\neq{\bf 0}$), and

1878:   $||\xi||^2\approx 0$ (when $\xi\in V[{\bf 0}:s\Lambda]$);

1879: \item[(d)] follows from defining $N_m(\kappa(\Lambda))$ as the number of

1880:   points $\lambda\in \kappa(\Lambda)$ such that

1881:   $||\lambda||^2=m$.\footnote{Note: without loss of generality, we can take squared norms to be integers.

1882:   If this is not the case, we can always form a (countable) list of all the

1883:   norms that appear in $\kappa(\Lambda)$, and take $m$ to be an index in

1884:   this list.}

1885: \end{itemize}
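
Step (d) can be spelled out as follows.  This is a sketch in the notation
above, writing $c \triangleq \frac{n}{2\sigma_X^2(1-\rho^2)}$ as a shorthand
introduced only here.  Grouping the points of $\kappa(\Lambda)$ by their
squared norm $m$,
\[
  \sum_{\lambda\in\kappa(\Lambda)\setminus\{0\}}
      ||s\lambda||^2\, e^{-c||s\lambda||^2}
  \;\; = \;\; s^2 \sum_{m=1}^\infty m\,N_m(\kappa(\Lambda))\, e^{-c s^2 m}.
\]
Written this way, the grouped sum carries a factor $m$ in front of
$N_m(\kappa(\Lambda))$.  Since this factor is polynomial in $m$, it is
dominated by the exponential in the same way as the polynomial bound on
$N_m(\kappa(\Lambda))$ used below, so it does not appear to change the
asymptotic behavior of the estimate.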

1886:

1887: To find a useful estimate for this sum, we need to bound

1888: $N_m(\kappa(\Lambda))$.  One simple such bound is:

1889:   \[ N_m(\kappa(\Lambda)) \;\; \leq \;\;

1890:        \frac{\mbox{surface of an $n$-dimensional sphere of radius $m$}}

1891:             {\mbox{volume of an $(n\!-\!1)$-dimensional sphere of radius

1892:                    $\frac{N}{2}$}}.

1893:   \]

1894: This bound follows from the fact that the highest density of lattice

1895: points on the surface of a sphere cannot be higher than if we assume

1896: a perfect tessellation of this $(n\!-\!1)$-dimensional surface into

1897: $(n\!-\!1)$-dimensional spheres whose radius is $\frac{1}{2}$ of the

1898: smallest separation between sublattice points.  Using standard

1899: formulas~\cite{neil:splag}, we find that

1900: \[ N_m(\kappa(\Lambda))

1901:      \;\; \leq \;\; \frac{c_n m^{n-1}}{d_n \left(\frac{N}{2}\right)^{n-1}}

1902:      \;\; = \;\; e_n m^{n-1},

1903: \]

1904: for appropriate constants $c_n$ and $d_n$, and

1905: $e_n \triangleq \frac{c_n}{d_n(\frac N 2)^{n-1}}$.  Therefore,

1906: \begin{eqnarray}

1907: \beta

1908:   & \stackrel{(a)}{\leq} & \mbox{$\frac 1 n$}

1909:         \frac{2\nu(s\kappa(\Lambda))e_ns^2}

1910:              {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}

1911:         \sum_{m=1}^\infty m^{n-1}

1912:         \;e^{-\frac{s^2n}{2\sigma_X^2(1-\rho^2)}m}

1913:         \nonumber \\

1914:   & = & \mbox{$\frac 1 n$}

1915:         \frac{2\nu(s\kappa(\Lambda))e_ns^2}

1916:              {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}

1917:         \sum_{m=1}^\infty

1918:         e^{-\frac{s^2n}{2\sigma_X^2(1-\rho^2)}m+(n-1)\log(m)}

1919:         \nonumber \\

1920:   & \stackrel{(b)}{=} & \mbox{$\frac 1 n$}

1921:         \frac{2\nu(s\kappa(\Lambda))e_ns^2}

1922:              {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}

1923:         \left(-1+\sum_{m=0}^\infty

1924:                  \left(e^{-\frac{s^2n}{2\sigma_X^2(1-\rho^2)}

1925:                           +\frac{(n-1)\log(m)}{m}}\right)^m\right)

1926:         \nonumber \\

1927:   & \stackrel{(c)}{\leq} & \mbox{$\frac 1 n$}

1928:         \frac{2\nu(s\kappa(\Lambda))e_ns^2}

1929:              {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}

1930:         \left(-1+\sum_{m=0}^\infty

1931:                  \left(e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}\right)^m\right)

1932:         \nonumber \\

1933:   & \stackrel{(d)}{=} & \mbox{$\frac 1 n$}

1934:         \frac{2\nu(s\kappa(\Lambda))e_ns^2}

1935:              {[2\pi\sigma_X^2(1-\rho^2)]^{\frac{n}{2}}}

1936:         \left(\frac{e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}

1937:                    {1-e^{-\frac{s^2}{2\sigma_X^2(1-\rho^2)}}}\right)

1938:         \label{eq:b3} \\

1939:   & \stackrel{(e)}{<} & \epsilon

1940:         \nonumber

1941: \end{eqnarray}

1942: where:

1943: \begin{itemize}

1944: \item[(a)] follows from substituting the bound on $N_m(\kappa(\Lambda))$

1945:   into eqn.~(\ref{eq:b2});

1946: \item[(b)] follows from simple manipulations, and defining

1947:   $\frac{\log 0}{0} = 0$;

1948: \item[(c)] follows from observing that

1949:   $\frac{\log m}{m} < \frac{s^2}{2\sigma_X^2(1-\rho^2)}$, for $\rho^2$ close

1950:   enough to 1;

1951: \item[(d)] follows from evaluation of the sum of a power series;

1952: \item[(e)] holds for all values of $\rho$ such

1953:   that $\rho_0 < |\rho| < 1$, for a constant $\rho_0$ that depends on

1954:   $\epsilon$ since, from~(\ref{eq:choice-s}), we have

1955:   $s/\big(\sigma_X\sqrt{1-\rho^2}\big)\to\infty$, thus convergence is

1956:   exponential in $\rho$.

1957: \end{itemize}
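
Steps (c) and (d) can be made explicit with the shorthand
$a \triangleq \frac{s^2}{2\sigma_X^2(1-\rho^2)}$ and $q \triangleq e^{-a}$,
both introduced only for this remark.  Since $\frac{\log m}{m}\leq\frac 1 e$
for all $m\geq 1$, the condition in (c) holds as soon as $a>\frac 1 e$, and
then, for every $m\geq 1$,
\[
  -an + (n-1)\frac{\log m}{m} \;\; \leq \;\; -an + (n-1)a \;\; = \;\; -a ,
\]
so each term of the sum is at most $q^m$ (with equality at $m=0$, where both
sides equal 1).  Because $0<q<1$, the geometric series gives
\[
  -1+\sum_{m=0}^\infty q^m \;\; = \;\; -1+\frac{1}{1-q}
  \;\; = \;\; \frac{q}{1-q},
\]
which is the factor appearing in eqn.~(\ref{eq:b3}).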

1958: Thus, $0\leq \beta < \epsilon$, for all $\epsilon > 0$ and all $|\rho|$

1959: close enough to 1.  Hence, eqn.~(\ref{eq:b3}) defines an asymptotically

1960: good estimate of $\beta$.
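
For concreteness, the constants in the bound on $N_m(\kappa(\Lambda))$ can be
written out using the standard formulas for spheres.  We assume here, as our
reading of that bound, that $c_n$ is the surface area of the unit sphere in
$\mathbb{R}^n$ and $d_n$ is the volume of the unit ball in
$\mathbb{R}^{n-1}$:
\[
  c_n = \frac{2\pi^{n/2}}{\Gamma(n/2)}, \qquad
  d_n = \frac{\pi^{(n-1)/2}}{\Gamma\big((n+1)/2\big)}, \qquad
  e_n = 2\sqrt{\pi}\;\frac{\Gamma\big((n+1)/2\big)}{\Gamma(n/2)}
        \left(\frac{2}{N}\right)^{n-1}.
\]
For example, for $n=2$ this gives $c_2=2\pi$, $d_2=2$ and $e_2=2\pi/N$.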

1961:

1962:

1963: \pagebreak

1964: %\bibliographystyle{plain}

1965: %\bibliography{library}

1966: \begin{thebibliography}{10}

1967:

1968: \bibitem{AaronG:02}

1969: A.~Aaron and B.~Girod.

1970: \newblock {Compression with Side Information Using Turbo Codes}.

1971: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 2002.

1972:

1973: \bibitem{baake-moody:similarity-submodules-semigroups}

1974: M.~Baake and R.~V. Moody.

1975: \newblock {Similarity Submodules and Semigroups}.

1976: \newblock In J.~Patera, editor, {\em Quasicrystals and Discrete Geometry},

1977:   pages 1--13. Comm. Fields Institute, American Mathematical Society,

1978:   Providence, RI, 1998.

1979:

1980: \bibitem{BarronCW:02}

1981: R.~Barron, B.~Chen, and G.~W. Wornell.

1982: \newblock {The Duality Between Information Embedding and Source Coding with

1983:   Side Information and Some Applications}.

1984: \newblock {\em IEEE Trans. Inform. Theory}, 49(5):1159--1180, 2003.

1985:

1986: \bibitem{BarrosS:06}

1987: J.~Barros and S.~D. Servetto.

1988: \newblock {Network Information Flow with Correlated Sources}.

1989: \newblock {\em IEEE Trans. Inform. Theory}, 52(1):155--170, 2006.

1990:

1991: \bibitem{Berger:78}

1992: T.~Berger.

1993: \newblock {\em The Information Theory Approach to Communications (G. Longo,

1994:   ed.)}, chapter Multiterminal Source Coding.

1995: \newblock Springer-Verlag, 1978.

1996:

1997: \bibitem{BergerZV:96}

1998: T.~Berger, Z.~Zhang, and H.~Viswanathan.

1999: \newblock {The CEO Problem}.

2000: \newblock {\em IEEE Trans. Inform. Theory}, 42(3):887--902, 1996.

2001:

2002: \bibitem{bernstein-neil-pew:sublattices-of-a2}

2003: M.~Bernstein, N.~J.~A. Sloane, and P.~E. Wright.

2004: \newblock {On Sublattices of the Hexagonal Lattice}.

2005: \newblock {\em Discrete Math.}, 170:29--39, 1997.

2006:

2007: \bibitem{Bourbaki:58}

2008: N.~Bourbaki.

2009: \newblock {\em {El\'ements de Math\'ematiques}}.

2010: \newblock Hermann, 1958.

2011: \newblock Livre II (Alg\`ebre), Chapitre 1 (Structures Alg\'ebriques).

2012:

2013: \bibitem{ChiangB:04}

2014: M.~Chiang and S.~Boyd.

2015: \newblock {Geometric Programming Duals of Channel Capacity and Rate

2016:   Distortion}.

2017: \newblock {\em IEEE Trans. Inform. Theory}, 50(2):245--258, 2004.

2018:

2019: \bibitem{conway-rains-neil:similar-sublattices}

2020: J.~H. Conway, E.~M. Rains, and N.~J.~A. Sloane.

2021: \newblock {On the Existence of Similar Sublattices}.

2022: \newblock {\em Canad. J. Math.}, 51:1300--1306, 1999.

2023:

2024: \bibitem{neil:splag}

2025: J.~H. Conway and N.~J.~A. Sloane.

2026: \newblock {\em {Sphere Packings, Lattices and Groups}}.

2027: \newblock Springer-Verlag, 3rd edition, 1998.

2028:

2029: \bibitem{Costa:83}

2030: M.~H.~M. Costa.

2031: \newblock {Writing on Dirty Paper}.

2032: \newblock {\em IEEE Trans. Inform. Theory}, IT-29(3):439--441, 1983.

2033:

2034: \bibitem{Cover:75b}

2035: T.~M. Cover.

2036: \newblock {A Proof of the Data Compression Theorem of Slepian and Wolf for

2037:   Ergodic Sources}.

2038: \newblock {\em IEEE Trans. Inform. Theory}, IT-21(2):226--228, 1975.

2039:

2040: \bibitem{CoverC:02}

2041: T.~M. Cover and M.~Chiang.

2042: \newblock {Duality Between Channel Capacity and Rate Distortion with Two-Sided

2043:   State Information}.

2044: \newblock {\em IEEE Trans. Inform. Theory}, 48(6):1629--1638, 2002.

2045:

2046: \bibitem{CoverT:91}

2047: T.~M. Cover and J.~Thomas.

2048: \newblock {\em {Elements of Information Theory}}.

2049: \newblock John Wiley and Sons, Inc., 1991.

2050:

2051: \bibitem{CvetkovicV:98}

2052: Z.~Cvetkovi\'{c} and M.~Vetterli.

2053: \newblock {Error-Rate Characteristics of Oversampled Analog-to-Digital

2054:   Conversion}.

2055: \newblock {\em IEEE Trans. Inform. Theory}, 44(5):1961--1964, 1998.

2056:

2057: \bibitem{FuchsD:00}

2058: J.-J. Fuchs and B.~Delyon.

2059: \newblock {Minimal $L_1$-Norm Reconstruction Function for Oversampled Signals:

2060:   Applications to Time-Delay Estimation}.

2061: \newblock {\em IEEE Trans. Inform. Theory}, 46(4):1666--1673, 2000.

2062:

2063: \bibitem{gersho:quantization-asymptotics}

2064: A.~Gersho.

2065: \newblock {Asymptotically Optimal Block Quantization}.

2066: \newblock {\em IEEE Trans. Inform. Theory}, IT-25(4):373--380, 1979.

2067:

2068: \bibitem{GoyalVT:98}

2069: V.~K. Goyal, M.~Vetterli, and N.~T. Thao.

2070: \newblock {Quantized Overcomplete Expansions in $\mathbb{R}^N$: Analysis,

2071:   Synthesis, and Algorithms}.

2072: \newblock {\em IEEE Trans. Inform. Theory}, 44(1):16--31, 1998.

2073:

2074: \bibitem{GrayN:98}

2075: R.~M. Gray and D.~L. Neuhoff.

2076: \newblock {Quantization}.

2077: \newblock {\em IEEE Trans. Inform. Theory}, 44(6):2325--2383, 1998.

2078:

2079: \bibitem{GrossglauserT:02}

2080: M.~Grossglauser and D.~Tse.

2081: \newblock {Mobility Increases the Capacity of Ad Hoc Wireless Networks}.

2082: \newblock {\em IEEE/ACM Trans. Networking}, 10(4):477--486, 2002.

2083:

2084: \bibitem{GuptaK:00}

2085: P.~Gupta and P.~R. Kumar.

2086: \newblock {The Capacity of Wireless Networks}.

2087: \newblock {\em IEEE Trans. Inform. Theory}, 46(2):388--404, 2000.

2088:

2089: \bibitem{GuptaK:03}

2090: P.~Gupta and P.~R. Kumar.

2091: \newblock {Towards an Information Theory of Large Networks: An Achievable Rate

2092:   Region}.

2093: \newblock {\em IEEE Trans. Inform. Theory}, 49(8):1877--1894, 2003.

2094:

2095: \bibitem{heegard-berger:uncertain-side-info}

2096: C.~Heegard and T.~Berger.

2097: \newblock {Rate Distortion when Side Information May Be Absent}.

2098: \newblock {\em IEEE Trans. Inform. Theory}, IT-31(6):727--734, 1985.

2099:

2100: \bibitem{KaspiB:82}

2101: A.~H. Kaspi and T.~Berger.

2102: \newblock {Rate-Distortion for Correlated Sources with Partially Separated

2103:   Encoders}.

2104: \newblock {\em IEEE Trans. Inform. Theory}, IT-28(6):828--840, 1982.

2105:

2106: \bibitem{KrimTMD:99}

2107: H.~Krim, D.~Tucker, S.~Mallat, and D.~Donoho.

2108: \newblock {On Denoising and Best Signal Representation}.

2109: \newblock {\em IEEE Trans. Inform. Theory}, 45(7):2225--2238, 1999.

2110:

2111: \bibitem{KulkarniV:04}

2112: S.~R. Kulkarni and P.~Viswanath.

2113: \newblock {A Deterministic Approach to Throughput Scaling in Wireless

2114:   Networks}.

2115: \newblock {\em IEEE Trans. Inform. Theory}, 50(6):1041--1049, 2004.

2116:

2117: \bibitem{LilisZS:04}

2118: G.~N. Lilis, M.~Zhao, and S.~D. Servetto.

2119: \newblock {Distributed Sensing and Actuation on Wave Fields}.

2120: \newblock In {\em Proc. 2nd Sensor and Actor Networks Protocols and

2121:   Applications (SANPA)}, Boston, MA, 2004.

2122:

2123: \bibitem{LiuCLX:04}

2124: Z.~Liu, S.~Cheng, A.~Liveris, and Z.~Xiong.

2125: \newblock {Slepian-Wolf Coded Nested Quantization (SWC-NQ) for Wyner-Ziv

2126:   Coding: Performance Analysis and Code Design}.

2127: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 2004.

2128:

2129: \bibitem{MerhavS:03}

2130: N.~Merhav and S.~Shamai.

2131: \newblock {On Joint Source-Channel Coding for the Wyner-Ziv Source and the

2132:   Gel'fand-Pinsker Channel}.

2133: \newblock {\em IEEE Trans. Inform. Theory}, 49(11):2844--2855, 2003.

2134:

2135: \bibitem{MitranB:02}

2136: P.~Mitran and J.~Bajcsy.

2137: \newblock {Coding for the Wyner-Ziv Problem with Turbo-Like Codes}.

2138: \newblock In {\em Proc. IEEE Int. Symp. Inform. Theory}, Lausanne, Switzerland,

2139:   2002.

2140:

2141: \bibitem{PerakiS:03}

2142: C.~Peraki and S.~D. Servetto.

2143: \newblock {On the Maximum Stable Throughput Problem in Random Networks with

2144:   Directional Antennas}.

2145: \newblock In {\em Proc. ACM MobiHoc}, Annapolis, MD, 2003.

2146:

2147: \bibitem{PerakiS:04}

2148: C.~Peraki and S.~D. Servetto.

2149: \newblock {Capacity, Stability and Flows in Large-Scale Random Networks}.

2150: \newblock In {\em Proc. IEEE Inform. Theory Workshop (ITW)}, San Antonio, TX,

2151:   2004.

2152:

2153: \bibitem{PradhanCR:03}

2154: S.~S. Pradhan, J.~Chou, and K.~Ramchandran.

2155: \newblock {Duality Between Source Coding and Channel Coding and its Extension

2156:   to the Side Information Case}.

2157: \newblock {\em IEEE Trans. Inform. Theory}, 49(5):1181--1203, 2003.

2158:

2159: \bibitem{sandeep-kannan:discus}

2160: S.~S. Pradhan and K.~Ramchandran.

2161: \newblock {Distributed Source Coding Using Syndromes (DISCUS): Design and

2162:   Construction}.

2163: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 1999.

2164:

2165: \bibitem{PradhanR:00}

2166: S.~S. Pradhan and K.~Ramchandran.

2167: \newblock {Distributed Source Coding: Symmetric Rates and Applications to

2168:   Sensor Networks}.

2169: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 2000.

2170:

2171: \bibitem{RebolloMonederoZG:03}

2172: D.~Rebollo-Monedero, R.~Zhang, and B.~Girod.

2173: \newblock {Design of Optimal Quantizers for Distributed Source Coding}.

2174: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 2003.

2175:

2176: \bibitem{ScaglioneS:03}

2177: A.~Scaglione and S.~D. Servetto.

2178: \newblock {On the Interdependence of Routing and Data Compression in Multi-Hop

2179:   Sensor Networks}.

2180: \newblock {\em Wireless Networks}, 11(1-2):149--160, 2005.

2181: \newblock Special issue with selected (and revised) papers from ACM MobiCom

2182:   2002.

2183:

2184: \bibitem{Servetto:02b}

2185: S.~D. Servetto.

2186: \newblock {Lattice Quantization with Side Information}.

2187: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 2000.

2188:

2189: \bibitem{Servetto:02c}

2190: S.~D. Servetto.

2191: \newblock {On the Feasibility of Large-Scale Wireless Sensor Networks}.

2192: \newblock In {\em Proc. 40th Allerton Conf. on Communication, Control and

2193:   Computing}, Urbana, IL, 2002.

2194:

2195: \bibitem{ServettoR:06}

2196: S.~D. Servetto and J.~M. Rosenblatt.

2197: \newblock {The Multiterminal Source Coding Problem for Spatial Waves}.

2198: \newblock In {\em Proc. UCSD Wkshp. Inform. Theory App.}, San Diego, CA, 2006.

2199: \newblock {\em Invited paper}.

2200:

2201: \bibitem{shamai-verdu-zamir:systematic-lossy-coding}

2202: S.~Shamai, S.~Verd\'{u}, and R.~Zamir.

2203: \newblock {Systematic Lossy Source/Channel Coding}.

2204: \newblock {\em IEEE Trans. Inform. Theory}, 44(2):564--579, 1998.

2205:

2206: \bibitem{Shannon:59}

2207: C.~E. Shannon.

2208: \newblock {Coding Theorems for a Discrete Source with a Fidelity Criterion}.

2209: \newblock {\em IRE Nat. Conv. Rec.}, 4:142--163, 1959.

2210:

2211: \bibitem{SlepianW:73b}

2212: D.~Slepian and J.~K. Wolf.

2213: \newblock {Noiseless Coding of Correlated Information Sources}.

2214: \newblock {\em IEEE Trans. Inform. Theory}, IT-19(4):471--480, 1973.

2215:

2216: \bibitem{StarkW:94}

2217: H.~Stark and J.~Woods.

2218: \newblock {\em {Probability, Random Processes, and Estimation Theory for

2219:   Engineers (2nd ed.)}}.

2220: \newblock Prentice Hall, 1994.

2221:

2222: \bibitem{SuEG:00}

2223: J.~K. Su, J.~J. Eggers, and B.~Girod.

2224: \newblock {Channel Coding and Rate Distortion with Side Information: Geometric

2225:   Interpretation and Illustration of Duality}.

2226: \newblock Submitted to the IEEE Trans. Inform. Theory.

2227:

2228: \bibitem{ThaoV:94}

2229: N.~T. Thao and M.~Vetterli.

2230: \newblock {Reduction of the MSE in $R$-times Oversampled A/D Conversion from

2231:   $O(1/R)$ to $O(1/R^2)$}.

2232: \newblock {\em IEEE Trans. Signal Processing}, 42(1):200--203, 1994.

2233:

2234: \bibitem{TianGZ:03}

2235: T.~Tian, J.~Garc\'{\i}a-Fr\'{\i}as, and W.~Zhong.

2236: \newblock {Compression of Correlated Sources using LDPC Codes}.

2237: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 2003.

2238:

2239: \bibitem{ToumpisG:02}

2240: S.~Toumpis and A.~J. Goldsmith.

2241: \newblock {Capacity Regions for Wireless Ad Hoc Networks}.

2242: \newblock {\em IEEE Trans. Wireless Comm.}, 2(4):736--748, 2003.

2243:

2244: \bibitem{Tung:PhD}

2245: S.~Y. Tung.

2246: \newblock {\em {Multiterminal Source Coding}}.

2247: \newblock PhD thesis, Cornell University, 1978.

2248:

2249: \bibitem{VaishampayanSS:01}

2250: V.~A. Vaishampayan, N.~J.~A. Sloane, and S.~D. Servetto.

2251: \newblock {Multiple Description Vector Quantization with Lattice Codebooks:

2252:   Design and Analysis}.

2253: \newblock {\em IEEE Trans. Inform. Theory}, 47(5):1718--1734, 2001.

2254:

2255: \bibitem{Verdu:02}

2256: S.~Verd\'{u}.

2257: \newblock {Spectral Efficiency in the Wideband Regime}.

2258: \newblock {\em IEEE Trans. Inform. Theory}, 48(6):1319--1343, 2002.

2259:

2260: \bibitem{ViswanathanB:97}

2261: H.~Viswanathan and T.~Berger.

2262: \newblock {The Quadratic-Gaussian CEO Problem}.

2263: \newblock {\em IEEE Trans. Inform. Theory}, 43(5):1549--1559, 1997.

2264:

2265: \bibitem{Wyner:75}

2266: A.~D. Wyner.

2267: \newblock {On Source Coding with Side Information at the Decoder}.

2268: \newblock {\em IEEE Trans. Inform. Theory}, IT-21(3):294--300, 1975.

2269:

2270: \bibitem{Wyner:78}

2271: A.~D. Wyner.

2272: \newblock {The Rate-Distortion Function for Source Coding with Side Information

2273:   at the Decoder-II: General Sources}.

2274: \newblock {\em Inform. Contr.}, 38:60--80, 1978.

2275:

2276: \bibitem{WynerZ:76}

2277: A.~D. Wyner and J.~Ziv.

2278: \newblock {The Rate-Distortion Function for Source Coding with Side Information

2279:   at the Decoder}.

2280: \newblock {\em IEEE Trans. Inform. Theory}, IT-22(1):1--10, 1976.

2281:

2282: \bibitem{XieK:04}

2283: L.-L. Xie and P.~R. Kumar.

2284: \newblock {A Network Information Theory for Wireless Communication: Scaling

2285:   Laws and Optimal Operation}.

2286: \newblock {\em IEEE Trans. Inform. Theory}, 50(5):748--767, 2004.

2287:

2288: \bibitem{zador:quantization-asymptotics}

2289: P.~Zador.

2290: \newblock {Asymptotic Quantization Error of Continuous Signals and the

2291:   Quantization Dimension}.

2292: \newblock {\em IEEE Trans. Inform. Theory}, IT-28(2):139--149, 1982.

2293:

2294: \bibitem{Zamir:96}

2295: R.~Zamir.

2296: \newblock {The Rate Loss in the Wyner-Ziv Problem}.

2297: \newblock {\em IEEE Trans. Inform. Theory}, 42(6):2073--2084, 1996.

2298:

2299: \bibitem{zamir-shamai:almost-there}

2300: R.~Zamir and S.~Shamai.

2301: \newblock {Nested Linear/Lattice Codes for Wyner-Ziv Encoding}.

2302: \newblock In {\em Proc. IEEE Inform. Theory Workshop}, Killarney, Ireland,

2303:   1998.

2304:

2305: \bibitem{ZamirSE:02}

2306: R.~Zamir, S.~Shamai, and U.~Erez.

2307: \newblock {Nested Linear/Lattice Codes for Structured Multiterminal Binning}.

2308: \newblock {\em IEEE Trans. Inform. Theory}, 48(6):1250--1276, 2002.

2309:

2310: \bibitem{ZhaoE:01}

2311: Q.~Zhao and M.~Effros.

2312: \newblock {Optimal Code Design for Lossless and Near Lossless Source Coding in

2313:   Multiple Access Networks}.

2314: \newblock In {\em Proc. IEEE Data Compression Conf. (DCC)}, Snowbird, UT, 2001.

2315:

2316: \end{thebibliography}

2317:

2318:

2319: \begin{biography}{Sergio D.\ Servetto}

2320:   was born in Argentina, on January 18, 1968.  He

2321:   received a Licenciatura en Inform\'atica from Universidad Nacional

2322:   de La Plata (UNLP, Argentina) in 1992, and the M.Sc. degree in

2323:   Electrical Engineering and the Ph.D. degree in Computer Science from

2324:   the University of Illinois at Urbana-Champaign (UIUC), in 1996 and

2325:   1999.  Between 1999 and 2001, he worked at the \'Ecole Polytechnique

2326:   F\'ed\'erale de Lausanne (EPFL), Lausanne, Switzerland.  Since Fall

2327:   2001, he has been an Assistant Professor in the School of Electrical

2328:   and Computer Engineering at Cornell University, and a member of the

2329:   fields of Applied Mathematics and Computer Science.  He was the

2330:   recipient of the 1998 Ray Ozzie Fellowship, given to ``outstanding

2331:   graduate students in Computer Science,'' and of the 1999 David J.

2332:   Kuck Outstanding Thesis Award, for the best doctoral dissertation

2333:   of the year, both from the Dept.\ of Computer Science at UIUC.  He

2334:   was also the recipient of a 2003 NSF CAREER Award.  His research

2335:   interests are centered around information theoretic aspects of

2336:   networked systems, with a current emphasis on problems that arise

2337:   in the context of large-scale sensor networks.

2338: \end{biography}

2339:

2340:

2341: \end{document}

2342: