1: %point density function = derivative of companding function.
2: %precise assumptions
3: %distortion in terms of point density
4: %rate in terms of point density.
%derivation of ``optimal'' point densities
6: %make note of approximations and their asymptotic validity.
7: %asymptotically-optimal bit allocations
8: The quantities of fundamental interest in the analysis of companding quantizer
9: sequences are the fixed- and variable-rate distortion-rate functions $\Dfr(R;\lambda)$ and
10: $\Dvr(R;\lambda)$, which describe the distortion of fixed- and variable-rate companding 
11: quantizers with rate $R$ and point density $\lambda$.
12: %These quantities may be expressed through the
13: %distortion-resolution function $d(K;\lambda)$ and the 
14: %resolution-rate function $K(\lambda;R)$.
High-resolution analysis consists of several approximations
that allow one to derive asymptotically accurate versions of both of these functions,
denoted $\Dfrhr(R;\lambda)$ and $\Dvrhr(R;\lambda)$.  Specifically, under appropriate
18: restrictions on the source distribution we will show that
19: \beq
20: \lim_{R\rightarrow \infty} \frac{\Dfrhr(R;\lambda)}{\Dfr(R;\lambda)} =
21: \lim_{R\rightarrow \infty} \frac{\Dvrhr(R;\lambda)}{\Dvr(R;\lambda)} = 
22: 1 
23: \mbox{.} \label{eq:asymptoticaccuracy}
24: \eeq
25: 
26: In Sec. \ref{sec:BackgroundHighResolutionDistortion}, the
27: approximate distortion-resolution function $\dhr(K; \lambda)$ is derived.
28: Then, in Sec. \ref{sec:BackgroundHighResolutionRate}, the
29: approximate resolution-rate function $\Khr(R; \lambda)$ is obtained
for both fixed- and variable-rate constraints.  In
Sec.~\ref{sec:BackgroundHighDistortionRate}, these two quantities are combined to yield
the approximate distortion-rate functions $\Dfrhr(R;\lambda)$ and $\Dvrhr(R;\lambda)$.
33: The derivation we provide is left informal and
34: is not intended to prove that assumptions UO1--UO4 yield \eqref{eq:asymptoticaccuracy};
35: this follows either from Linder \cite{Linder1991} or as a special 
36: case of Theorem \ref{thm:single-distortion} in
37: Sec. \ref{sec:Single}.   For further technical details and references
38: to original sources, see~\cite{GrayN1998}.
39: 
40: Finally, in Sec. \ref{sec:review-optimal}, the approximate distortion-rate
41: functions are optimized through choice of point density (companding function).
42: The sequences of companding quantizers yielded by this optimization are shown to be
43: asymptotically fixed- or variable-rate optimal.
44: 
45: 
46: 
47: \subsubsection{The Distortion-Resolution Function}
48: \label{sec:BackgroundHighResolutionDistortion}
49: As previously defined, $d(K;\lambda)$ is the distortion of the companding quantizer with
50: resolution $K$, or the \emph{distortion-resolution function}.  We now define
51: an approximation $\dhr(K;\lambda)$, known as the approximate distortion-resolution function.
For a rigorous proof that
\beq
\lim_{K\rightarrow\infty} \dhr(K;\lambda)/d(K;\lambda) = 1 \mbox{,} \label{eq:distortionResolutionAsymptotic}
\eeq
we refer
to the main result of Linder~\cite{Linder1991}, or to
Theorem~\ref{thm:multi-distortion} with $g(x) = x$.
59: 
Let $X$ be a random variable with probability density function (pdf) $f_X(x)$,
let $Q^{\lambda}_K$ be a $K$-point companding quantizer, and suppose $\lambda$ and $f_X$
satisfy assumptions UO1--UO4\@.  Let $\{\beta_i\}_{i\in\I} = Q^{\lambda}_K([0,1])$ be the reconstruction
63: points, and let $S_i = \left(Q_K^{\lambda}\right)^{-1}(\beta_i)$, $i\in\I$, be the corresponding 
64: partition regions.
65: %For optimality, it is necessary for each set in the partition to be an interval,
66: %i.e., the quantizer is \emph{regular}~\cite[Sect.~6.2]{GershoG1992}.
67: 
68: The distortion of the quantizer is
69: \begin{eqnarray}
70:   \label{eq:general-quant-dist}
71:   d(K;\lambda) & = & \E{(X-\Xhat)^2} \nonumber \\
72:       & = & \sum_{i \in \I}
73:           \E{(X - \beta_i)^2 \mid X \in S_i}
74:           \P{X \in S_i}
75: \end{eqnarray}
76: by the law of total expectation.
77: The initial aim of high-resolution theory is to express this distortion
78: as an integral involving $f_X$.
79: To that end, we make the following approximations about the source
80: and quantizer:
81: \begin{enumerate}
82: \item[HR1.] $f_X$ may be approximated as constant on each $S_i$.  
83: \item[HR2.] The size of a cell containing $x$ is approximated with
84: the help of the point density function:
85: \beq
86:   \label{eq:cell-length}
87:   x \in S_i \quad \Rightarrow \quad \length(S_i) \sim (K\lambda(x))^{-1} \mbox{,}
88: \eeq
89: where $\sim$ means that the ratio of the two quantities goes to 1 with increasing
90: resolution $K$.  This is the meaning of ``$\sim$'' for the remainder of the paper.
91: \end{enumerate}
92: The first approximation follows from the smoothness of $f_X$ (assumptions UO1 and UO2),
93: while the second follows from the smoothness of $w(x)$ (assumption UO3).
94: 
Now we can approximate each non-boundary term in \eqref{eq:general-quant-dist}.
By HR1, $\beta_i$ should be approximately at the center of $S_i$,
and by HR2 the length of $S_i$ then makes the conditional expectation
approximately $\frac{1}{12}(K\lambda(\beta_i))^{-2}$, the mean squared error of a
uniform distribution on an interval of length $(K\lambda(\beta_i))^{-1}$.
Invoking HR1 again,
the $i$th term in the sum is approximately $\int_{x \in S_i} \frac{1}{12}(K\lambda(\beta_i))^{-2} f_X(x) \, dx$.
Summing over $i$ and replacing $\lambda(\beta_i)$ by $\lambda(x)$ on each cell, we obtain
102: \begin{eqnarray}
103: \label{eq:unoptimizedHrDist}
104: d(K;\lambda) \sim \int_0^1 \frac{(K\lambda(x))^{-2}}{12}  f_X(x) \, dx
105:   & = & \frac{1}{12K^2} \E{\lambda^{-2}(X)} \\
106:   & = & \dhr(K;\lambda) \mbox{.} \nonumber
107: \end{eqnarray}
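
As a simple check of \eqref{eq:unoptimizedHrDist}, consider the uniform point
density $\lambda(x) = 1$, for which every cell has length approximately $1/K$.
Then $\E{\lambda^{-2}(X)} = 1$ for any source density on $[0,1]$, and
\[
\dhr(K;\lambda) = \frac{1}{12K^2} \mbox{,}
\]
which is the familiar $\Delta^2/12$ approximation for a uniform quantizer,
where we write $\Delta = 1/K$ for the step size (a symbol introduced only for this remark).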
108: %This approximation holds in the sense that the ratio of the two
109: %quantities approaches 1 as the rate increases.
110: 
111: 
112: \subsubsection{The Resolution-Rate Function}
113: \label{sec:BackgroundHighResolutionRate}
114: For a fixed-rate quantizer, the resolution-rate relationship
115: is given simply by $\Kfr(R;\lambda) = \lfloor 2^R \rfloor$, and
116: it is approximated with vanishing relative error by $\Kfrhr(R;\lambda) = 2^R$.  The variable-rate resolution-rate
117: function is more difficult to approximate.
118: 
119: As long as the quantization is fine ($\lambda(x) > 0$) wherever the
120: density is positive,
121: we can approximate the output entropy of a quantizer using the
122: point density.  Defining $p(x)$  as $\P{X \in S_i}$ for $x \in S_i$,
123: and letting $h(X)$ denote the differential entropy of $X$,
124: \begin{eqnarray}
125: H(Q^{\lambda}_K(X)) & = & - \sum_{i\in\I} \P{X \in S_i} \log \P{X \in S_i} \nonumber \\
126:    & \eqlabel{a} & - \int_0^1 f_X(x) \log p(x) \, dx \nonumber \\
127:    & \stackrel{(b)}{\sim} & - \int_0^1 f_X(x) \log( f_X(x)/(K\lambda(x)) ) \, dx \nonumber \\
128:    & = & - \int_0^1 f_X(x) \log f_X(x) \, dx \nonumber \\
129:    &   & \quad + \: \int_0^1 f_X(x) \log(K\lambda(x)) \, dx \nonumber \\
130:    & = & h(X) + \log K + \E{\log \lambda(X)} \mbox{,}
131:  \label{eq:1drate}
132: \end{eqnarray}
where (a) follows from the definition of $p(x)$, and
(b) follows from approximating the source density as constant
in each cell (HR1) together with the cell-length approximation \eqref{eq:cell-length}.
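
As a sanity check of \eqref{eq:1drate}, suppose for illustration that $X$ is
uniform on $[0,1]$ and that $\lambda(x) = 1$, and assume the $K$ cells are then
the equal-length intervals of length $1/K$.  In this case $h(X) = 0$ and
$\E{\log \lambda(X)} = 0$, so the right side of \eqref{eq:1drate} reduces to
\[
h(X) + \log K + \E{\log \lambda(X)} = \log K \mbox{,}
\]
which agrees exactly with the output entropy, since each of the $K$ cells has
probability $1/K$.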
136: 
137: A generalized version of this approximation is proven rigorously
138: in \cite{LinderZZ1999}.  We state it here as a lemma.
139: \begin{lemma}
140: \label{lem:ResolutionRate}
141: Suppose the source $X$ has a density over $[0,1]$ and a finite differential
142: entropy $h(X)$.  Additionally suppose that at least one of the quantizers in the
143: companding quantizer sequence $\{Q_{K}^{\lambda}\}$ possesses finite (discrete) entropy
144: $H(Q^{\lambda}_K(X))$.  Then if $\E{\log \lambda(X)}$ is finite,
145: \[
146: \lim_{R\rightarrow\infty} \left[ H(Q^{\lambda}_{K(R;\lambda)}(X)) - \log K(R;\lambda) \right] = h(X) + \E{\log\lambda(X)} \mbox{.}
147: \]
148: \end{lemma}
149: \begin{IEEEproof}
150: Follows as a special case of Proposition 2 in \cite{LinderZZ1999}.
151: \end{IEEEproof}
152: 
153: With the insight of this approximation, we define the variable-rate
154: approximate resolution-rate function $\Kvrhr(R;\lambda)$ through
155: \[
156: \log \Kvrhr(R;\lambda) = R-h(X)-\E{\log\lambda(X)} \mbox{.}
157: \]
158: \begin{lemma}
159: \label{lem:ResRateError}
The error between the log of the variable-rate approximate resolution-rate function, $\log \Kvrhr (R;\lambda)$,
and the log of the actual resolution-rate function, $\log \Kvr(R;\lambda)$, goes to zero, i.e.,
\[ \lim_{R\rightarrow\infty} \left[ \log \Kvrhr(R;\lambda) - \log \Kvr(R;\lambda) \right] = 0 \mbox{.}\]
163: \end{lemma}
164: 
165: \begin{IEEEproof}
166: The error of the approximation $\Kvrhr$ may be written as
167: \[
168: \log \Kvr(R;\lambda) - \log \Kvrhr(R;\lambda) = \epsilon_R + H(Q^{\lambda}_{\Kvr(R;\lambda)}(X)) - R \mbox{,}
169: \]
170: where $\epsilon_R$ goes to zero by Lemma~\ref{lem:ResolutionRate}.
171: Furthermore, by definition $\Kvr(R;\lambda)$ has been chosen to be the largest resolution
172: such that $H(Q^{\lambda}_{\Kvr(R;\lambda)}(X)) \leq R$. We then have that 
173: \[ R-H(Q^{\lambda}_{\Kvr(R;\lambda)}(X) ) < H(Q^{\lambda}_{\Kvr(R;\lambda)+1}(X)) - H(Q^{\lambda}_{\Kvr(R;\lambda)}(X)) \mbox{,}\]
i.e., the second term in the rate approximation error is bounded
by the increment in entropy from an increment in resolution.
176: By Lemma~\ref{lem:ResolutionRate} once again, the increment in entropy
177: may be bounded as 
178: \beqan
179: \lefteqn{H(Q^{\lambda}_{\Kvr(R;\lambda)+1}(X)) - H(Q^{\lambda}_{\Kvr(R;\lambda)}(X))} \\ & = & 
180: h(X) +\log (\Kvr(R;\lambda)+1) + \E{\log \lambda(X)} - h(X) - \log \Kvr(R;\lambda) -\E{\log \lambda(X)} + \delta_R \\
181: & = & \log (\Kvr(R;\lambda)+1) - \log \Kvr(R;\lambda) + \delta_R \\
182: & = & \log \frac{\Kvr(R;\lambda)+1}{\Kvr(R;\lambda)} + \delta_R\mbox{,}
183: \eeqan
184: where $\delta_R$ goes to zero.  Since $\Kvr(R;\lambda)$ diverges with $R$, this error goes to zero.
185: \end{IEEEproof}
186: 
187: \subsubsection{The Distortion-Rate Functions}
188: \label{sec:BackgroundHighDistortionRate}
The high-resolution distortion-rate functions are obtained by combining
the distortion-resolution and resolution-rate functions.  For fixed-rate,
191: \begin{subequations}
192: \beq
\Dfrhr(R;\lambda) = \frac{1}{12} \E{\lambda^{-2}(X)} 2^{-2R} \mbox{,} \label{eq:BackgroundFixedRateDistortion}
194: \eeq
195: whereas for variable-rate
196: \beq
\Dvrhr(R;\lambda) = \frac{1}{12} \E{\lambda^{-2}(X)} 2^{-2(R-h(X)-\E{\log\lambda(X)})} \mbox{.}
198: \label{eq:BackgroundVariableRateDistortion}
199: \eeq
200: \end{subequations}
201: Asymptotic validity in the sense of \eqref{eq:asymptoticaccuracy}
202: follows in the fixed-rate case from \eqref{eq:distortionResolutionAsymptotic} and from
203: the fact that $\left( \Kfr(R;\lambda)/\Kfrhr(R;\lambda) \right)^2$ goes to 1\@.
In the variable-rate case, the error from use of $\Kvrhr(R;\lambda)$
in place of $\Kvr(R;\lambda)$ is bounded by a multiplicative factor of $2^{2|\log \Kvrhr(R;\lambda) - \log \Kvr(R;\lambda)|}$,
which by Lemma~\ref{lem:ResRateError} goes to 1\@.
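
For illustration, take the uniform point density $\lambda(x) = 1$, for which
$\E{\lambda^{-2}(X)} = 1$ and $\E{\log\lambda(X)} = 0$.  The two high-resolution
expressions then become
\[
\Dfrhr(R;\lambda) = \frac{1}{12}\, 2^{-2R}
\qquad \mbox{and} \qquad
\Dvrhr(R;\lambda) = \frac{1}{12}\, 2^{-2(R-h(X))} \mbox{.}
\]
Because a density supported on $[0,1]$ has $h(X) \leq 0$, with equality only
for the uniform density, the variable-rate expression is never larger than the
fixed-rate expression for this point density.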
207: 
208: 
209: \subsubsection{Asymptotically-Optimal Companding Quantizer Sequences}
210: \label{sec:review-optimal}
211: We seek asymptotically-optimal companding quantizer sequences for
212: both fixed-rate and variable-rate constraints.  By the following lemma,
213: this reduces to minimizing the high-resolution distortion-rate functions
214: of \eqref{eq:BackgroundFixedRateDistortion} and \eqref{eq:BackgroundVariableRateDistortion}.
215: 
216: \begin{lemma}
217: \label{lem:optimizationIsLegit}
218: Suppose $\lambda_{\rm fr}^*$ and $\lambda_{\rm vr}^*$ minimize $\Dfrhr(R;\lambda)$ 
219: and $\Dvrhr(R;\lambda)$ respectively.
220: Then the quantizer sequences $\{Q_{K}^{\lambda_{\rm fr}^*}\}$ and $\{Q_{K}^{\lambda_{\rm vr}^*}\}$
221: are asymptotically fixed- and variable-rate optimal.
222: \end{lemma}
223: 
224: \begin{IEEEproof}
225: As the proof is virtually identical for fixed- and variable-rate cases, we only provide it for 
226: the variable-rate case.
227: 
228: Let $\{Q_{K}^{\lambda}\}$ be any companding quantizer sequence.  We are interested in proving
229: that
230: \[
231: \limsup_{R\rightarrow \infty} \frac{\Dvr(R;\lambda_{\rm vr}^*)}{\Dvr(R;\lambda)} \leq 1 \mbox{.}
232: \]
The limit superior on the left may be factored:
234: \beqan
235: \limsup_{R\rightarrow \infty} \frac{\Dvr(R;\lambda_{\rm vr}^*)}{\Dvr(R;\lambda)} & = &
236: \limsup_{R\rightarrow \infty} \frac{\Dvr(R;\lambda_{\rm vr}^*)}{\Dvrhr(R;\lambda_{\rm vr}^*)}
237: \,
238: 							    \frac{\Dvrhr(R;\lambda_{\rm vr}^*)}{\Dvrhr(R;\lambda)}
239: \,
240: 							    \frac{\Dvrhr(R;\lambda)}{\Dvr(R;\lambda)} \\
241: & \leqlabel{a} &
242: \limsup_{R\rightarrow \infty} \frac{\Dvr(R;\lambda_{\rm vr}^*)}{\Dvrhr(R;\lambda_{\rm vr}^*)}
243: \,
244: \limsup_{R\rightarrow \infty} \frac{\Dvrhr(R;\lambda_{\rm vr}^*)}{\Dvrhr(R;\lambda)}
245: \,
246: \limsup_{R\rightarrow \infty} \frac{\Dvrhr(R;\lambda)}{\Dvr(R;\lambda)} \mbox{,}
247: \eeqan
where (a) follows because the limit superior of a product of positive sequences is upper-bounded by the product of their individual limits superior (when that product is not of the form $0 \cdot \infty$).  We can now bound each of these factors.
249: 
250: We have, by optimality
251: of $\lambda_{\rm vr}^*$, that $\Dvrhr(R;\lambda) \geq \Dvrhr(R;\lambda_{\rm vr}^*)$
252: for any $R$ and therefore that
253: \[
254: \limsup_{R\rightarrow\infty} \frac{\Dvrhr(R;\lambda_{\rm vr}^*)}{\Dvrhr(R;\lambda)} \leq 1 \mbox{.}
255: \]
256: Furthermore, by \eqref{eq:asymptoticaccuracy}, we have that
257: \[
258: \lim_{R\rightarrow\infty} \frac{\Dvr(R;\lambda_{\rm vr}^*)}{\Dvrhr(R;\lambda_{\rm vr}^*)} = 
\lim_{R\rightarrow\infty} \frac{\Dvrhr(R;\lambda)}{\Dvr(R;\lambda)} = 
260: 1 \mbox{.}
261: \]
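Substituting these three bounds into the factored expression shows that the
limit superior is at most 1, which proves the lemma.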
263: \end{IEEEproof}
264: 
265: Now we optimize the distortion-rate expressions.
266: Because analogous optimizations appear in Sections~\ref{sec:Single}
267: and~\ref{sec:Multi}, we explicitly derive
268: both the optimizing point densities and the resulting distortion-rate functions.
269: Our approach follows~\cite{GrayG1977}.
270: 
271: In the fixed-rate case, the problem is to minimize \eqref{eq:BackgroundFixedRateDistortion} for
272: a given value of $R$.
273: This minimization may be performed with the help of H\"{o}lder's inequality:
\beqan
\Dfrhr(R;\lambda) & = & 	\frac{1}{12}2^{-2R} \int_0^1 f_X(x) \lambda^{-2}(x) \, dx \\
		& = & 	\frac{1}{12}2^{-2R} \int_0^1 f_X(x) \lambda^{-2}(x) \, dx  \left( \int_0^1 \lambda(x) \, dx\right)^2\\
		& \geq & 	\frac{1}{12}2^{-2R} \left( \int_0^1 \left( f_X(x) \lambda^{-2}(x)\right)^{1/3}
									   \left(\lambda(x) \right)^{2/3}
									   \, dx \right)^3\\
		& = & \frac{1}{12}2^{-2R} \left( \int_0^1 f_X^{1/3}(x) \, dx \right)^3	\mbox{,}
\eeqan
where the second line uses the normalization $\int_0^1 \lambda(x) \, dx = 1$, and
the inequality holds with equality if and only if $\lambda(x)\propto f_X(x)^{1/3}$.
283: Thus,
284: $\Dfrhr$ is minimized by
285: \beq
286:   \label{eq:fixed-opt-lambda}
287:     \lambda(x) = f_X^{1/3}(x) / \left({\textstyle \int_0^1 f_X^{1/3}(t) \,dt}\right) \mbox{.}
288: \eeq
289: The resulting minimal distortion is  
290: \beq
291:   \label{eq:fixedHrDist}
292: \Dfrhr(R) = \frac{1}{12}2^{-2R} \left( \int_0^1 f_X^{1/3}(x) \, dx \right)^3
293:     = \frac{1}{12} \| f_X \|_{1/3} 2^{-2R} \mbox{,}
294: \eeq
295: where we have introduced a notation for the $\mathcal{L}^{1/3}$ quasinorm.
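
As a worked illustration of \eqref{eq:fixed-opt-lambda} and \eqref{eq:fixedHrDist},
consider the density $f_X(x) = 2x$ on $[0,1]$.  This density is our own choice, made
only to exercise the formulas; we do not verify assumptions UO1--UO4 for it.
Since $\int_0^1 (2x)^{1/3} \, dx = \frac{3}{4} \, 2^{1/3}$, we obtain
\[
\lambda(x) = \frac{4}{3} \, x^{1/3}
\qquad \mbox{and} \qquad
\| f_X \|_{1/3} = \left( \frac{3}{4} \, 2^{1/3} \right)^3 = \frac{27}{32} \mbox{,}
\]
so that $\Dfrhr(R) = \frac{1}{12} \cdot \frac{27}{32} \, 2^{-2R}$.
For comparison, the uniform point density $\lambda(x) = 1$ gives
$\frac{1}{12} 2^{-2R}$ in \eqref{eq:BackgroundFixedRateDistortion}, so in this
example the optimized point density lowers the high-resolution distortion by the
factor $27/32$, or about $0.74$ dB.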
296: 
297: For the variable-rate optimization, we use Jensen's inequality rather than
298:  H\"{o}lder's inequality:
\beqan
\Dvrhr(R;\lambda)  & = & \frac{1}{12}2^{-2(R-h(X))} \E{\lambda^{-2}(X)}2^{2\E{\log \lambda(X)}} \\
& \stackrel{(a)}{\geq} & \frac{1}{12}2^{-2(R-h(X))} 2^{\E{\log \lambda^{-2}(X)}}2^{2\E{\log \lambda(X)}} \\
& = & \frac{1}{12}2^{-2(R-h(X))} \; = \; \Dvrhr(R) \mbox{,}
\eeqan
where (a) follows from the convexity of $-\log(\cdot)$, which gives
$\log \E{\lambda^{-2}(X)} \geq \E{\log \lambda^{-2}(X)}$.
305: This lower bound is achieved when $\lambda(X)$ is a constant.
306: Thus
307: $\lambda(x) = 1$ is asymptotically optimal, i.e., the quantizer should be uniform.%
308: %\footnote{Recall that for the variable-rate case we are assuming
309: % $f_X$ is supported on $[0,1]$. For other bounded supports,
310: % the optimal point density would still be a constant, but perhaps
311: % different from 1\@.  Unbounded supports require the use of an
312: % unnormalized point density.}
313: %The corresponding minimal distortion is 
314: %\beq
315: %  \label{eq:varHrDist}
316: %D_{HR} \approx \frac{1}{12} 2^{2h(X)} 2^{-2R} \mbox{.}
317: %\eeq
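
As an illustration, consider the density $f_X(x) = 2x$ on $[0,1]$; this density is
our own choice, made only to exercise the formulas, and we do not verify
assumptions UO1--UO4 for it.  Taking logarithms to base 2, so that rates are in bits,
\[
h(X) = -\int_0^1 2x \log (2x) \, dx = \frac{1}{2\ln 2} - 1 \approx -0.279 \mbox{,}
\]
so that $2^{2h(X)} = e/4 \approx 0.680$.  With the uniform point density
$\lambda(x) = 1$, the variable-rate high-resolution distortion is therefore
$\frac{1}{12} 2^{-2(R-h(X))} \approx \frac{0.680}{12} \, 2^{-2R}$.
For comparison, the fixed-rate constant in \eqref{eq:fixedHrDist} for this density
is $\| f_X \|_{1/3} = 27/32 \approx 0.844$.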
318: 
319: %Note that both optimal point densities are positive on the entire
320: %support of $f_X$.  Thus, at high enough resolution, the quantization is
321: %fine \emph{pointwise over $X$}.  In the functional settings,
322: %this will be used to justify piecewise linear approximation of the function $g$.
323: Note that both variable- and fixed-rate quantization
324: have $\Theta(2^{-2R})$, or $-6$ dB/bit, dependence of distortion on rate.
325: This is a common feature of
326: ordinary quantizers, but we demonstrate in Section~\ref{sec:DontCare}
327: that certain functional scenarios can cause distortion to fall even faster
328: with the rate.
329: 
330: %One way to concretely specify a quantizer from a point density is to require
331: %$$
332: %  \Lambda(\beta_i) = i-\half, \qquad i=1,\,2,\,\ldots,\,K\mbox{,}
333: %$$
334: %where $\Lambda(x) = \int_0^x \lambda(t) \, dt$
335: %is the ``cumulative'' point density.
336: %However, analysis of quantizers through point densities does not rely on
337: %the precise placement of codewords and cell boundaries.
338: %Under the assumptions of high-resolution analysis,
339: %$o(1/K)$ deviations in the $\beta_i$s do not affect the distortion.
340: %We return to this point in Section~\ref{sec:single-discontinuous}
341: %to partially generalize the basic analysis to discontinuous functions.
342: 
343: \subsubsection{Optimal Bit Allocation}
344: As a final preparatory digression, we state the solution to a typical
345: resource allocation problem that arises several times in Section~\ref{sec:Multi}.
346: \begin{lemma}
347:   \label{lem:bit-alloc}
348: Suppose $D = \sum_{j=1}^n c_j 2^{-2R_j}$
349: for some positive constants $\{c_j\}_{j=1}^n$.
350: Then the minimum of $D$ over the choice of $\{R_j\}_{j=1}^n$
351: subject to the constraint $\sum_{j=1}^n R_j \leq nR$ is attained with
352: $$
  R_j = R + \Half \log \frac{c_j}{ \left( \prod_{i=1}^n c_i \right)^{1/n} },
354: \qquad j=1,\,2,\,\ldots,\,n,
355: $$
356: resulting in
357: $$
358:   D = n \left( {\textstyle \prod_{j=1}^n c_j} \right)^{1/n} 2^{-2R}.
359: $$
360: \end{lemma}
361: \begin{IEEEproof}
362: The result can be shown using the inequality for arithmetic and geometric means.
363: It appeared first in the context of bit allocation in~\cite{HuangS1963};
364: a full proof appears in \cite[Sect.~8.3]{GershoG1992}.
365: \end{IEEEproof}
366: 
367: The lemma does not restrict the $R_j$s to be nonnegative or to be integers.
368: Such restrictions are discussed in~\cite{FarberZ2006}.
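
To illustrate Lemma~\ref{lem:bit-alloc}, take $n = 2$ with $c_1 = 4$ and $c_2 = 1$
(values chosen here only for illustration), and interpret $\log$ as base 2.
The geometric mean of the constants is $2$, so the lemma gives
\[
R_1 = R + \frac{1}{2} \log \frac{4}{2} = R + \frac{1}{2}
\qquad \mbox{and} \qquad
R_2 = R + \frac{1}{2} \log \frac{1}{2} = R - \frac{1}{2} \mbox{.}
\]
Each component then contributes $2 \cdot 2^{-2R}$, for a total of $D = 4 \cdot 2^{-2R}$,
whereas the equal allocation $R_1 = R_2 = R$ would give $D = 5 \cdot 2^{-2R}$.
Note that $R_2 < 0$ whenever $R < \frac{1}{2}$, which is an instance of the
unconstrained solution violating nonnegativity.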
369: