1: \documentclass[12pt,a4paper]{article}
2:
3: \usepackage{ajr}
4:
5: \usepackage{defns}
6:
7: \begin{document}
8:
9: \title{Use the information dimension, not the Hausdorff}
10: \author{A.~J. Roberts\thanks{Dept.~Mathematics \& Computing, University
11: of Southern Queensland, Toowoomba, Queensland 4350, \textsc{Australia}.
12: \protect\url{mailto:aroberts@usq.edu.au}}}
13:
14: \date{Original version 1997, revised \today}
15:
16: \maketitle
17:
18: \begin{abstract}
19: Multi-fractal patterns occur widely in nature. In developing new
20: algorithms to determine multi-fractal spectra of experimental data I
am led to the conclusion that generalised dimensions~$D_q$ of order
22: $q\leq0$, including the Hausdorff dimension, are effectively
23: \emph{irrelevant}. The reason is that these dimensions are
24: extraordinarily sensitive to regions of low density in the
25: multi-fractal data. Instead, one should concentrate attention on
26: generalised dimensions~$D_q$ for $q\geq 1$, and of these the
27: information dimension~$D_1$ seems the most robustly estimated from a
28: finite amount of data.
29: \end{abstract}
30:
31: \tableofcontents
32:
33:
34: \section{Introduction}
35:
36: The characterisation of spatial distributions in terms of fractal
37: concepts~\cite{Mandelbrot79, Feder88} is becoming increasingly
38: important. In particular, many distributions in nature are found to
39: have the characteristics of a multi-fractal~\cite{Hentschel83,
40: Halsey86, Paladin87}: among many examples are galaxy
41: clustering~\cite{Borgani93, Martinez91}, strange
42: attractors~\cite{Procaccia88a}, fluid turbulence~\cite{Sreenivasan91},
43: percolation~\cite{Isichenko92}, the shapes of
44: neurons~\cite{Jelinek01, Jelinek04},
45: and plant distributions~\cite{Emmerson95} and shapes~\cite{Jones96}.
46:
47: In application, methods for estimating fractal dimensions are often
48: unreliable. One source of error lies in largely unknown biases
49: introduced by the finite size of data sets, addressed by
50: Grassberger~\cite{Grassberger88b}, and in the associated finite range
51: of length-scales inherent in gathered data. In situations where
52: thousands or tens of thousands of data points are known such biases may
53: be minor; however, in some interesting problems, for example in the
spatial clustering of underwater plants~\cite{Emmerson95}, only about
100 data points are known and confidence in the fractal
56: characterisation may be misplaced. We need to know more about factors
57: that cause errors in dimension estimates.
58:
59: Section~\ref{ss2} discusses the sensitivity of the multiplicative
60: multi-fractal process to regions of very low probability (measure).
61: Since such regions only rarely contribute a data point, an experimental
62: sample cannot discern them but such regions do affect the generalised
63: dimensions. Hence I argue that the determination from experimental
64: data of generalised dimensions,~$D_q$, for non-positive~$q$ is
65: meaningless; for $0<q<1$ computations are very sensitive to the sample;
66: and thus the most robust fractal dimension is the information
67: dimension~$D_1$. The argument is supported in Section~\ref{ss3} by a
68: maximum likelihood method~\cite{Roberts95b} of estimating the
69: multi-fractal properties of a data set. The method shows the enormous
70: sensitivity of~$D_q$ for negative~$q$. In contrast the information
71: dimension is reliably estimated.
72:
73:
74:
75:
76:
77: \section{Poor conditioning of generalised dimensions of negative order}
78: \label{ss2}
79:
80: For example, consider the Hausdorff dimension,~$D_0$, of multifractals
generated by two different ternary multiplicative processes.
82: \begin{itemize}
83: \item Consider first the process shown in Figure~\ref{ftern}(a)
    where an interval is divided into thirds and the ``mass'' of
85: the original interval is assigned as follows: a fraction $f_1>0$
86: to the left third; a fraction $f_2=1-f_1>0$ to the right third;
87: and none to the middle third. Repeat this subdivision
88: recursively. This generates a multiplicative multifractal whose
    Hausdorff dimension of $D_0=\log_32=0.6309$ is precisely the same
    as that of the Cantor set because there is no ``mass'' in the middle
91: thirds.
92:
93: \item Conversely, and perversely, consider the process shown in
94: Figure~\ref{ftern}(b) where for some small~$\epsilon$ the ``mass''
95: is assigned as follows: a fraction $f_1>0$ is assigned to the left third;
96: a fraction $f_2>0$ is assigned to the rightmost third; and a small
97: fraction $\epsilon>0$ is assigned to the middle third (such that
98: $f_1+f_2+\epsilon=1$). Repeat recursively. This generates a
99: multiplicative multi-fractal whose Hausdorff dimension is $D_0=1$
100: because there is ``mass'' everywhere along the whole interval!
101: Although the vast bulk of the ``mass'' can be covered by~$2^n$
102: intervals of length~$3^{-n}$, we definitely do need~$3^n$
103: intervals in order to ensure coverage of the thinly spread
104: ``mass'' that fills most of the original interval.
105: \end{itemize}
106: The importance of this for the analysis of an experimental data set of
107: $N$~sampled points is that one cannot tell the difference from the data
108: between these two multi-fractal generating processes for an
109: $\epsilon=\ord{1/N}$. Thus one cannot estimate the Hausdorff
dimension~$D_0$ with any accuracy since either answer, $0.6309$~or~$1$,
could be correct.
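
As a rough illustration of this indistinguishability, take $N=100$
sample points and $\epsilon=0.01$; these values, and the assumption
that the points are drawn independently from the measure, are adopted
here purely for the arithmetic. Each point then falls in the
first-level middle third with probability~$\epsilon$, so the expected
number of points there is $N\epsilon=1$, and the middle third is
entirely empty with probability
\begin{displaymath}
    (1-\epsilon)^N=0.99^{100}\approx0.37\,.
\end{displaymath}
Moreover, $N$~points occupy at most $N$~intervals, so at
resolution~$3^{-5}$ a sample of $100$~points cannot fill the $3^5=243$
intervals carrying ``mass'' in process~(b), whereas the $2^5=32$
intervals of process~(a) are within reach.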
112: \begin{figure}[tbp]
113: \centerline{{\tt \setlength{\unitlength}{0.075em}
114: \begin{picture}(402,196)
115: \thinlines \put(242,18){$\epsilon f_2$}
116: \put(170,18){$\epsilon f_1$}
117: \put(206,18){$\epsilon^2$}
118: \put(310,18){$f_2\epsilon$}
119: \put(87,18){$f_1\epsilon$}
120: \put(206,46){$\epsilon$}
121: \put(336,10){\line(-1,0){37}}
122: \put(151,10){\line(1,0){111}}
123: \put(77,10){\line(1,0){37}}
124: \put(151,40){\line(1,0){111}}
125: \put(40,10){\begin{picture}(333,62)
126: \thicklines \put(311,8){$f_2^2$}
127: \put(234,8){$f_2f_1$}
128: \put(88,8){$f_1f_2$}
129: \put(12,8){$f_1^2$}
130: \put(278,38){$f_2$}
131: \put(56,38){$f_1$}
132: \put(333,0){\line(-1,0){37}}
133: \put(222,0){\line(1,0){37}}
134: \put(111,0){\line(-1,0){37}}
135: \put(0,0){\line(1,0){37}}
136: \put(222,30){\line(1,0){111}}
137: \put(0,30){\line(1,0){111}}
138: \put(0,60){\line(1,0){333}}
139: \end{picture}}
140: \put(40,120){\begin{picture}(333,62)
141: \thicklines \put(311,8){$f_2^2$}
142: \put(234,8){$f_2f_1$}
143: \put(88,8){$f_1f_2$}
144: \put(12,8){$f_1^2$}
145: \put(278,38){$f_2$}
146: \put(56,38){$f_1$}
147: \put(333,0){\line(-1,0){37}}
148: \put(222,0){\line(1,0){37}}
149: \put(111,0){\line(-1,0){37}}
150: \put(0,0){\line(1,0){37}}
151: \put(222,30){\line(1,0){111}}
152: \put(0,30){\line(1,0){111}}
153: \put(0,60){\line(1,0){333}}
154: \end{picture}}
155: \thicklines \put(10,67){(b)}
156: \put(10,177){(a)}
157: \end{picture}}}
158: \caption{schematic diagram of the first few stages in the
159: multiplicative multi-fractal process to illustrate the sensitivity
160: of the Hausdorff dimension~$D_0$ with respect to low density
161: regions,~(b), as a perturbation of the same process with zero
162: density regions,~(a).}
163: \protect\label{ftern}
164: \end{figure}
165:
166: Similar reasoning applies to generalised dimensions with negative~$q$.
167: Elementary arguments give that the generalised
168: dimensions~\cite{Hentschel83} of the multi-fractal generated by the
169: second process above are
170: \begin{equation}
171: D_q=\left\{
172: \begin{array}{ll}
173: \frac{-1}{q-1}\log_3\left[f_1^q+f_2^q+\epsilon^q\right] & \mbox{if
174: }q\neq 1\,, \\
175: -\left[f_1\log_3f_1+f_2\log_3f_2+\epsilon\log_3\epsilon\right] &
176: \mbox{if }q=1\,.
177: \end{array}\right.
178: \end{equation}
179: It is readily appreciated that for negative order~$q$ and
180: small~$\epsilon$, the term~$\epsilon^q$ inside the logarithm
181: dominates the evaluation of the generalised dimension~$D_q$. Hence, all
182: generalised dimensions for negative~$q$ are also extremely sensitive to
183: small~$\epsilon$. In a data set obtained from experiments, one cannot
184: expect to distinguish between zero~$\epsilon$ and small non-zero
185: $\epsilon=\ord{1/N}$, and yet the generalised exponents and
186: multi-fractal spectrum are markedly different. See Figure~\ref{fthe}
187: which plots the generalised dimensions for $f_1\approx1/4$,
188: $f_2\approx3/4$ and various small~$\epsilon$.
189: \begin{figure}[tbp]
190: \centerline{\includegraphics[width=0.95\textwidth]{gendim}}
191: \caption{multi-fractal generalised dimensions~$D_q$ for the
192: ternary multi-fractal process with $f_1=(1-\epsilon)/4$,
193: $f_2=(1-\epsilon)3/4$ and $\epsilon=0$ (solid), $0.01$ (dashed)
    and $0.05$ (dotted). This figure shows that $D_q$~for negative
    order~$q$ is extraordinarily sensitive to small influences: the
    curve for the smaller non-zero~$\epsilon$ departs furthest from the
    $\epsilon=0$ curve.}
197: \protect\label{fthe}
198: \end{figure}
199:
200: We can be more precise about the sensitivity to low density regions
201: by computing the derivative of~$D_q$ with respect to~$\epsilon$. For
202: definiteness, suppose $f_1=\phi_1(1-\epsilon)$ and
203: $f_2=\phi_2(1-\epsilon)$. Then
204: \begin{equation}
205: \frac{\partial D_q}{\partial \epsilon}=\frac{-q}{q-1}
206: \frac{\epsilon^{q-1}-\left(\phi_1^q+\phi_2^q\right)(1-\epsilon)^{q-1}}%
207: {\log3\,\left[\epsilon^q+\left(\phi_1^q+\phi_2^q\right)(1-\epsilon)^q\right]}
208: \,.
209: \label{ede}
210: \end{equation}
As $\epsilon\to0$, with $\epsilon$ non-zero, this derivative asymptotes to
212: \begin{equation}
213: \frac{\partial D_q}{\partial\epsilon}\sim\frac{1}{\log3}\left\{
214: \begin{array}{ll}
215: \frac{q}{q-1} & \mbox{if }1<q\,, \\
216: \frac{q}{(1-q)(\phi_1^q+\phi_2^q)}\epsilon^{q-1} & \mbox{if
217: }0<q<1\,, \\
218: \frac{q}{1-q}\epsilon^{-1} & \mbox{if }\phantom{0<}q<0\,.
219: \end{array}
220: \right.
221: \label{easy}
222: \end{equation}
223: This derivative is unbounded as $\epsilon\to0$ for $q<1$, and so any
224: computation of~$D_q$ is only robust if $q\geq 1$.
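
For a sense of scale, the derivative may be evaluated at the
illustrative values $\epsilon=0.01$, $\phi_1=1/4$ and $\phi_2=3/4$,
chosen here for the arithmetic only. The following rounded values
compare the leading-order form~(\ref{easy}) with the full
expression~(\ref{ede}), using $1/\log3\approx0.910$:
\begin{displaymath}
    \begin{array}{rrr}
        q & \mbox{from (\ref{easy})} & \mbox{from (\ref{ede})} \\
        2   & 1.82  & 1.81  \\
        1/2 & 6.66  & 5.38  \\
        -1  & -45.5 & -43.2
    \end{array}
\end{displaymath}
Even at this modest~$\epsilon$ the sensitivity for $q=-1$ is more than
twenty times that for $q=2$. Reducing~$\epsilon$ to~$10^{-4}$ leaves
the $q=2$ value essentially unchanged, but the leading-order $q=-1$
value grows to about $-4.6\times10^{3}$.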
225:
226: The reason for this aberrant behaviour is clear. With a finite number
227: of data points, it is impossible to tell the difference between truly
228: empty space and space which is visited so rarely that no data point
229: happens to fall within it. That is, one cannot tell the difference
230: between empty space and space that should be filled in with very low
231: probability. These differences dramatically affect the generalised
232: dimensions~$D_q$ for $q<1$. Thus for any experimental data set:
233: \begin{itemize}
234: \item estimating~$D_q$ for $q\leq0$ is nonsense (including the Hausdorff
235: dimension);
236:
237: \item estimates of~$D_q$ for small positive~$q$ are sensitive; and
238:
239: \item I only recommend the reporting of dimensions~$D_q$
240: for $q\geq1$ as being robust.
241: \end{itemize}
242: Out of all the generalised dimensions for order $q\geq 1$, $D_1$~is
243: most representative of the fractal as a whole. For large order~$q$,
244: the computation of~$D_q$ is determined only by the very ``densest''
245: regions of the multi-fractal and so is not representative of the whole
246: fractal. In the above multiplicative process,
247: \begin{displaymath}
    D_q\sim-\log_3\max(f_1,f_2,\epsilon)
249: \quad\mbox{as}\quad q\to\infty\,,
250: \end{displaymath}
251: showing that the large~$q$ behaviour is dictated by the one parameter
252: of the process that determines the character of the very densest
253: clusters in the fractal. The very dense clusters occur rarely in the
254: fractal; they have low fractal dimension as seen in the low~$f$ value
255: typically associated with low values of~$\alpha$ in the multi-fractal
256: spectrum. Because of this rareness, the computation from experimental
data of~$D_q$ for large positive order~$q$ is unreliable. In
contrast, the information dimension weights the data most uniformly,
259: and so ``knows'' most about the fractal, without being overly
260: sensitive to the possible occurrence of regions of very low
261: probability. The information dimension seems most informative.
262:
263:
264:
265:
266: \section{Fractal dimensions unbiased by finite size of data sets}
267: \label{ss3}
268: Cronin \& Roberts~\cite{Roberts95b} proposed a novel method to
269: eliminate biases, caused by finite sized data sets, in determining the
multi-fractal properties of a given data set. Jelinek et
271: al.~\cite{Jelinek01, Jelinek04} used this method to explore the shape
272: of neuron cells. The method compares characteristics of the inter-point
273: distances in the data set with those of artificially generated
274: multi-fractals. By maximising the likelihood that the characteristics
275: are the same we model the multi-fractal nature of the data by the
276: parameters of the artificial multi-fractal. By searching among
277: artificial multi-fractals with precisely the same number of sample
278: points as in the data, we anticipate that biases due to the finite
279: sample size will be statistically the same in the data and in the
280: artificial multi-fractals; hence predictions based upon the fitted
281: multi-fractal parameters should be unbiased by the finite sample size.
282:
283: The method also appears to give a reliable indication of the error in
284: the estimates---a very desirable feature as also noted by Judd \&
285: Mees~\cite{Judd91}. Most importantly for this paper, I generate finite
286: size data sets with specific parameters for the following specific
287: multiplicative multi-fractal process. Given parameters
288: $\rho\in[0,0.5]$ and $\phi\in[0,0.5]$ a binary multiplicative
289: multi-fractal is generated by the recursive procedure of dividing each
290: interval into two halves, then assigning a fraction~$\phi$ of the
291: points in the interval to a random sub-interval of length~$\rho$ in the
292: left half, and the complementary fraction $\phi'=1-\phi$ to a random
293: sub-interval of length~$\rho$ in the right half. Such a process has
294: generalised dimension
295: \begin{equation}
296: D_q=\frac{\log\left(\phi^q+{\phi'}^q\right)}{(q-1)\log\rho}\,,
297: \label{emdq}
298: \end{equation}
299: and a multi-fractal spectrum $f(\alpha)$~\cite[\S4]{Halsey86} given
parametrically in terms of $0<\xi<1$ and $\xi'=1-\xi$ as
301: \begin{equation}
302: f = \frac{\xi\log{\xi}+\xi'\log{\xi'}}
303: {\log{\rho}}
304: \,, \qquad
305: \alpha = \frac{\xi\log{\phi}+\xi'\log{\phi'}}
306: {\log{\rho}} \,.
307: \label{emfs}
308: \end{equation}
Here I choose $\rho=1/3$ and $\phi=1/4$ and sample the process with
$N=100$~points; such a sample forms a finite data set whose
generating parameters we need to estimate.
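
For reference, substituting $\rho=1/3$ and $\phi=1/4$
into~(\ref{emdq}) gives the following rounded values. The $q\to1$
limit follows from l'H\^opital's rule together with $\phi+\phi'=1$;
it is written out here as a convenience and is not taken from the
cited derivation.
\begin{displaymath}
    D_0=\frac{\log2}{\log3}\approx0.631\,,\quad
    D_1=\lim_{q\to1}D_q
    =\frac{\phi\log\phi+\phi'\log\phi'}{\log\rho}\approx0.512\,,\quad
    D_2=\frac{\log(5/8)}{\log(1/3)}\approx0.428\,.
\end{displaymath}
The limits as $q\to+\infty$ and $q\to-\infty$ are
$\log\phi'/\log\rho\approx0.262$ and $\log\phi/\log\rho\approx1.262$
respectively, so the whole $D_q$~curve of this process lies between
these two values.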
312:
313: As explained in~\cite{Roberts95b}, we analyse such a sample by probing
314: it with \emph{exactly} the same multiplicative multi-fractal process,
315: and seek the best fit parameters. Here the resulting estimate of the
316: original parameters is then in error \emph{only} due to the finite size
317: of the sample of the original multi-fractal process. Because we fit
318: the data with a process which we know includes the one that generated
the data (a luxury rare in practice), there is no other
320: error. Thus the spread in errors that we see is characteristic of only
321: the errors induced by a finite sized sample, nothing else. In
322: particular, observe that the deductions of the preceding section are
323: indeed appropriate.
324:
325: \begin{figure}
326: \centerline{\includegraphics[width=0.95\textwidth]{n100means}}
327: \caption{predicted multi-fractal parameters $(\rho,\phi)$, indicated
328: by~$\circ$'s, from the maximum likelihood match to an ensemble of 16
329: different realisations, each of $N=100$ data points, of a binary
330: multiplicative multi-fractal with parameters $\rho=1/3$ and
331: $\phi=1/4$, indicated by~$+$. The mean location of the
332: predictions is indicated by a~$\times$.}
333: \protect\label{n100means}
334: \end{figure}
335:
336: I repeat the sampling of the multi-fractal followed by a maximum
337: likelihood estimate of the parameters 16~times. Figure~\ref{n100means}
338: plots the estimates of the parameters. Observe that the whole sampling
339: and estimation process appears unbiased in that the mean of the
340: predictions is reasonably close to the correct values of the
341: parameters.
342:
343: \begin{figure}
344: \centerline{\includegraphics[width=0.95\textwidth]{n100dqs}}
345: \caption{ensemble of multi-fractal generalised dimensions~$D_q$,
346: dotted, for each of the predictions plotted in
347: Figure~\protect\ref{n100means} made from samples of $N=100$ data
348: points. For comparison the generalised dimensions for the actual
    fractal are plotted as the solid line. Observe the good estimation
350: near the information dimension, but the large errors for negative
351: order~$q$.}
352: \protect\label{n100dqs}
353: \end{figure}
354:
355: Ultimately, experimenters want to examine multi-fractal properties of
356: the data. Here these will be determined from the parameters
357: $(\rho,\phi)$ of the best fit multi-fractal substituted into analytic
358: expressions such as (\ref{emdq})~and~(\ref{emfs}). For each of the
359: 16~realisations and their best-fit estimates, I plot the corresponding
360: predicted generalised dimensions~$D_q$ in Figure~\ref{n100dqs}. (The
361: corresponding graphs of the multi-fractal spectra~$f(\alpha)$ are
362: plotted in Figure~7 of~\cite{Roberts95b} along with the true
363: $f(\alpha)$~curve.) Observe that the predicted dimensions for
364: positive~$q$ (low~$\alpha$) are quite good for all realisations,
365: especially near the information dimension,~$D_1$. However, predicted
366: dimensions for negative~$q$ (high~$\alpha$) are very poor; this is
367: also the case for the Hausdorff dimension~$D_0$ (the maximum of the
368: $f(\alpha)$~curve). The negative~$q$ predictions are poor despite the
369: fitting process ``knowing'' that there are no very low probability
370: regions in this artificial process. In general applications one
371: cannot know this and I expect the negative~$q$ (large~$\alpha$)
372: predictions to be significantly worse. These numerical results
373: convincingly support the arguments of the preceding section that we
374: should use the information dimension, not the Hausdorff.
375:
376:
377:
378:
379:
380:
381: \bibliographystyle{plain}
382: \bibliography{ajr,bib}
383:
384:
385: \end{document}
386:
387:
388: