1: \documentclass[10pt]{article}
2: 
3: \title{On the Significance of Digits in Interval Notation}
4: \author{\mbox{M.H. van Emden}\\
5:         \mbox{Computer Science Dept, University of Victoria}\\
6:         \mbox{Technical Report DCS-270-IR}
7:        }
8: \date{}
9: 
10: \newtheorem{definition}{Definition}
11: 
12: %%%%%%%%%%%%%%%%%%%%%%%%%%%%% end preamble
13: 
14: \begin{document}
15: \maketitle
16: 
17: \begin{abstract}
18: To analyse the significance of the digits used for interval bounds, we
19: clarify the philosophical presuppositions of various interval
20: notations. We use information theory to determine the information
21: content of the last digit of the numeral used to denote the interval's
22: bounds. This leads to the notion of \emph{efficiency} of a decimal digit:
23: the actual value as percentage of the maximal value of its information
24: content. By taking this efficiency into account, many presentations of
25: intervals can be made more readable at the expense of negligible loss of
26: information.
27: \end{abstract}
28: 
29: \section{Introduction}
30: 
31: Once upon a time, it was a matter of professional ethics among computers
32: never to write a meaningless decimal. Since then computers have become
33: machines and thereby lost any form of ethics, professional or
34: otherwise.  The human computers of yore were helped in their ethical
35: behaviour by the fact that it took effort to write spurious decimals.
36: Now the situation is reversed: the lazy way is to use the default
37: precision of the I/O library function. As a result, we are deluged with
38: meaningless decimals.
39: 
40: Of course interval arithmetic is not guilty of such negligence. After
41: all, the very {\em raison d'\^etre} of the subject is to be explicit
42: about the precision of computed results.  Yet, even interval arithmetic
43: is plagued by superfluous decimals, albeit in a more subtle way.  In
44: this note we first review the various interval notations. We argue in
45: favour of a rarely used notation called ``tail'', or ``factored'', which
46: has the advantage of avoiding the repetition of decimals that are
47: necessarily the same.  We analyse the information content of the
48: remaining decimals.
49: 
50: \section{Philosophical implications of an interval notation} 
51: Several papers \cite{hvnn01,szwc99,vnmdn01a} discussing interval
52: notations have been published recently.  The various notations have
53: different implications, just as people have different reasons for being
54: interested in interval arithmetic.
55: 
56: For some, intervals are a way of denoting a fuzzy, or perhaps
57: probabilistic, quantity. Others use intervals to give an indication
58: of the extent to which rounding has introduced error in a
59: computation. Here we assume an interpretation of intervals that does
60: not necessarily negate the above interpretations, but differs in the
61: way it is made precise. We call it the {\em set interpretation of
62: interval arithmetic}.
63: 
64: \paragraph{The set interpretation}
65: According to the set interpretation, variables range over the real
66: numbers. These reals are represented in computer memory as sets of
67: reals. The constraint is that if variable $x$ is represented by set
68: $S$, we have $x \in S$. Thus the set interpretation differs from
69: conventional numerical analysis in the {\em absence of errors}.  It is
70: either true or false that $x$ belongs to $S$.
71: 
72: The fact that $S$ contains more than one real is not an error.  In
73: conventional numerical analysis, an error arises when, for example, a
74: real variable $x$ with value $0.1$ is represented by a floating-point
75: number $f$. An error arises because $x = f$ is false. On the other
76: hand, representing $x$ by $S$ is not an error if $x \in S$.
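
As a concrete illustration, assume the floating-point numbers are those of
IEEE~754 double precision (an assumption made here; the argument does not
depend on the format). The double nearest to $x = 0.1$ is
\begin{eqnarray*}
f & = & 0.1000000000000000055511\ldots,
\end{eqnarray*}
so that $x = f$ is false. Writing $f^-$ for the largest double below $f$
(a symbol introduced here only for illustration), we have
\begin{eqnarray*}
f^- & = & 0.0999999999999999916733\ldots,
\end{eqnarray*}
and the statement $x \in [f^-,f]$ is true: no error has been made, although
the statement carries only a finite amount of information about $x$.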
77: 
78: Of course, the statement $x \in S$ provides only a limited amount
79: of information about $x$. The larger $S$ is, the less information. In
80: the set interpretation of interval arithmetic we distinguish
81: error, which is avoidable, from the inescapable fact that the
82: amount of information yielded by a finite machine is finite.
83: 
84: \paragraph{Consequences of the set interpretation}
85: Interval arithmetic is no exception to the rule that finite machines
86: can only give a finite amount of information.  In interval arithmetic
87: the sets of reals are limited to those that are easily representable:
88: closed, connected sets of reals that have finite floating-point numbers
89: as bounds, if they have a bound at all. Unbounded closed connected sets
90: of reals use the infinities of the floating-point standard in the
obvious way. Each member of this finite set of sets of reals can be
92: represented by a pair of floating-point numbers. It is also the case
93: that for every set of reals, there exists a unique least floating-point
94: interval containing it.
95: 
96: This is the set interpretation of interval arithmetic. Its virtues
97: include that it is familiar. In fact, many people are surprised to
98: hear it given a name, as this is what they always thought intervals to
99: be. Another virtue is that, if the set interpretation is followed up
100: in all its consequences, it allows resolution of potential ambiguities
101: in interval arithmetic, especially in interval division involving
102: unbounded intervals, intervals containing zero, or intervals
103: containing nothing but zero \cite{hckvnmdn01}.
104: 
105: \section{Interval notations}
106: If one accepts the advantages of the set interpretation of interval
107: arithmetic, then one prefers a notation for an interval that suggests
a set. The traditional notation, exemplified by $[1.233,1.235]$, has this
109: advantage. Although widely used, it is not practical, as is
110: apparent\footnote{
111: I'm not making this up; see page 122 of \cite{vhlmyd97}.
112: } from the statement that an unknown real $x$ belongs to
113: \begin{eqnarray}
114:       [+0.6180339887498946804,+0.6180339887498950136]. \label{crude}
115: \end{eqnarray}
116: The problem with this ubiquitous notation is that it is hard to separate
117: two important pieces of information: \emph{where} the interval is, and
118: \emph{how wide} it is. To remedy this defect,
119: Hyv\"onen \cite{hvnn01} described a notation according to which one writes
120: instead
121: \begin{eqnarray}
122:       +0.61803398874989[46804,50136].                \label{factored}
123: \end{eqnarray}
124: The situation is similar when we are annoyed by having to write
125: \begin{eqnarray*}
126:       0.61803398874989x+0.61803398874989y,
127: \end{eqnarray*}
128: which we prefer to have in factored form: $0.61803398874989(x+y)$.
129: Hence we propose to refer to (\ref{factored}) as \emph{factored notation}
130: for intervals\footnote{
131: The notation has been occasionally used without comment in the literature;
132: see for example \cite{vnmdn99a}.
133: Credit goes to Hyv\"onen, whose paper \cite{hvnn01} was the first to
134: appear in print that drew attention to it and named it.
135: Independently I did so in \cite{vnmdn01a}.
136: Hyv\"onen called it ``tail notation''.
137: }.
138: The name is more than an analogy: in general,
139: one \emph{factors} with respect to a multiplicative infix operation,
140: of which concatenation on strings is an example.
141: 
142: In the example the bounds are in normalized scientific notation and
143: have the same exponent. In general, factored notation converts an
144: interval $[a \times 10^p,b \times 10^q]$, with normalized numerals as
145: bounds, first to $[a,b \times 10^{q-p}] \times 10^p$, where the upper
146: bound is not necessarily normalized.  When $p \not = q$, then this
147: cannot be shortened by taking an initial string of common first
148: decimals outside the brackets. It can only be shortened by limiting the
149: precision of $a$ and $b$, a topic we address later in the paper.
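
To make the conversion concrete, consider a hypothetical interval (chosen
here only for illustration) whose bounds have different exponents:
\begin{eqnarray*}
[9.87 \times 10^{2}, 1.02 \times 10^{3}]
  & = & [9.87, 1.02 \times 10^{3-2}] \times 10^{2} \\
  & = & [9.87, 10.2] \times 10^{2}.
\end{eqnarray*}
The bounds $9.87$ and $10.2$ share no initial string of decimals, so nothing
can be taken outside the brackets.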
150: 
Table~\ref{CLASSIF} contains an overview of interval notations. Most
of the table is adapted from Hyv\"onen \cite{hvnn01}.
153: In this overview we distinguish three categories: (a) those that suggest
154: a set, (b) those that suggest a number degraded by an error, and (c) those
155: that suggest a pure number.
156: \begin{table}
157: \begin{tabular}{l|l|l|}
158: Notation  &          Interval value        &   Name of notation    \\
159: \hline
160: \hline
161: $[1.233,1.235]$ &    $[1.233,1.235]$       &      Classic          \\
162: $1.23[3,5]$     &    $[1.233,1.235]$       &      Factored         \\
163: $1.234\pm 2$    &    $[1.232,1.236]$       &      Range            \\
164: $1.234$\verb|~| &    $[1.2335,1.2345]$     &      Tilde            \\
165: $1.234+$        &    $[1.234,1.235]$       &      Plus             \\
166: $1.234+[-1e-3,2e-3]$&$[1.233,1.236]$       &      Error            \\
167: $1.234*$        &    $[1.233,1.235]$       &      Star             \\
168: $1.234$         &    $[1.233,1.235]$       &      Single-Number
169: \end{tabular}
170: \caption{
171: \label{CLASSIF}
172: Overview of interval notations, adapted from Hyv\"onen \cite{hvnn01}.
173: }
174: \end{table}
175: The Classic and Factored notations belong to category (a).
176: Under category (b) we have added, in analogy with the Tilde notation, the
177: Plus notation. This latter notation is useful in the improvement of the
178: factored notation discussed later on in this paper.
179: Category (c) is in the last line. Hyv\"onen used the
180: name ``Fortran notation''. The notation is actually the
181: ``Single-number notation'' for the Fortran implementation described in
182: \cite{szwc99}.
183: 
184: The virtue of the notations in category (b) is that they make explicit
185: that a numeral is not to be interpreted according to mathematical notation,
186: by which we mean that
187: \begin{equation}
188: d_{m}d_{m-1}\ldots d_0.d_{-1}\ldots d_{-n}   \label{mathematical}
189: \end{equation}
190: denotes the number $\sum_{i= -n}^m d_i 10^i$.
191: Mathematical notation implies an infinite number of zeros after the last
192: digit when $n > 0$.
193: 
194: Mathematical notation is not the only way to interpret
195: (\ref{mathematical}).  For a long time physicists, chemists, and
196: engineers have used the convention that
197: \emph{a numeral has as meaning
198: any number that rounds to the number denoted by the numeral
199: displayed}.
200: The coexistence of mathematical notation with the physics convention
201: introduces an ambiguity that is often resolved by context.  With
202: intervals, the ambiguity becomes problematic, as we need numerals to
203: denote the bounds of an interval in the classic notation. Are these to
204: be interpreted according to mathematical notation, or according to the
205: physics convention?  It is implicit in most of the interval literature,
206: and explicit in \cite{hvnn01,vnmdn01a}, that \emph{the numerals in the
207: bounds of an interval are to be interpreted according to the
208: mathematical notation}. In this paper we follow that rule.
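
The two readings can be set side by side for the numeral $1.234$. Under
mathematical notation it denotes exactly one number,
\begin{eqnarray*}
1.234 & = & 1 \times 10^{0} + 2 \times 10^{-1} + 3 \times 10^{-2}
            + 4 \times 10^{-3},
\end{eqnarray*}
whereas under the physics convention it stands for any $x$ with
\begin{eqnarray*}
1.2335 & \leq & x \;\; \leq \;\; 1.2345,
\end{eqnarray*}
which is the interval that the Tilde notation of Table~\ref{CLASSIF} makes
explicit. Under the rule followed here, the bounds of $[1.233,1.235]$ are
therefore exactly $1233/1000$ and $1235/1000$.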
209: 
We therefore propose to avoid category (c) and to give the single-number
notation an annotation to indicate that it does not have the usual mathematical
212: meaning. This has been done by Hickey, who introduced \cite{hck00} the
213: Star notation of Table~\ref{CLASSIF}.
214: 
215: \paragraph{Difficulties of factored notation}
216: There are two problems with the classical notation. The first is the
217: {\em scanning problem}\/: one needs to scan both bounds digit by digit
218: to find the leftmost different digit.  Only then does one have an idea
of the width of the interval. The second problem, the {\em problem of
useless digits}, can also be found in (\ref{crude}): the width of the
interval is specified by no fewer than five digits.  Restricting
oneself to four digits for this purpose gives almost as much
information about $x$; the difference is so small as not to be
worth that fifth digit.  As we will show below, the same almost
always holds for all digits beyond the first two or three.
226: 
227: Factored notation solves the scanning problem; the problem of useless
228: digits remains. To solve it also, we need to study quantitatively the
229: information content of the statement that an unknown real $x$ is
230: contained in an interval $[a,b]$.
231: 
232: \section{Information theory}\label{INFOTH}
233: According to Shannon's theory of information (see for example, among
234: many textbooks, \cite{sh65}), observations can reduce the amount of
235: uncertainty about the value of an unknown quantity. The amount of
236: information yielded by an observation is \emph{the decrease (if any) in
237: the
238: amount of uncertainty}. Shannon argues that the amount of uncertainty
239: is appropriately measured by the \emph{entropy} of the probability
240: distribution over the possible values. For a uniform distribution on a
241: finite number of values, this reduces to the logarithm of the number
242: of possible values. It can be shown that the entropy for a
243: distribution over $n$ outcomes is maximized by the uniform
244: distribution over these outcomes.
245: 
246: When there are two equally probable possible values, and if one would
247: like this logarithm to come out at unity, one takes $2$ as base of the
logarithms and one calls the unit of information the {\em bit}, a
contraction of {\bf b}inary dig{\bf it}. Thus, a binary digit
carries at most one bit of
251: information. Similarly, if one works with decimal digits, then it is
convenient to use $10$ as the base of the logarithms.
253: %By analogy, we
254: %call the resulting unit of information {\em dit},
255: %from {\bf d}ecimal un{\bf it} of information. For mathematical
256: %reasons, it is sometimes convenient to use $e \in 2.71[8,9]$ as basis
257: %for the logarithms, with the {\em nit} as corresponding unit of
258: %information.
259: 
260: Thus information theory determines for each number base the maximum
261: amount of information that can be carried by a digit. Normally, if we
262: don't know what a number is, and we are only given the first $k$
263: digits of a numeral denoting that number, we have no idea what the
264: next digit should be. That is, all possibilities in
265: $\{0,1,2,3,4,5,6,7,8,9\}$ are equally probable so that the uncertainty
266: is $\log_{10}10 = 1$. As a decimal digit can only distinguish between
267: ten possibilities, the efficiency of the $(k+1)$st digit is one.
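
In symbols, and using the standard definition of entropy (the names $H$,
$n$, and $p_i$ are introduced here for illustration), a distribution
assigning probability $p_i$ to the $i$th of $n$ possible values has
uncertainty
\begin{eqnarray*}
H & = & - \sum_{i=1}^{n} p_i \log p_i,
\end{eqnarray*}
which for the uniform distribution $p_i = 1/n$ reduces to
\begin{eqnarray*}
H & = & - \sum_{i=1}^{n} \frac{1}{n} \log \frac{1}{n} \;\; = \;\; \log n.
\end{eqnarray*}
With $n = 10$ and logarithms to base $10$ this gives $\log_{10} 10 = 1$
for a decimal digit; measured in the same unit, a binary digit carries at
most $\log_{10} 2 \approx 0.301$.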
268: %Of
269: %course, this also holds if we do know what the number is, and it is
270: %$\pi$, and if $k = 100$.
271: 
272: %Things are not as straightforward for the other important use of
273: %information theory in interval arithmetic. As we indicated before,
274: In the set interpretation of interval arithmetic, we have information of
275: the form that a real $x$ belongs to a set $S$. According to information
276: theory, this represents an uncertainty equal to the entropy of the
277: probability distribution over the elements of $S$.
278: What distribution to assume?
279: We are only interested in the large differences in
280: information carried by the successive digits of factored notation.
281: These are large compared to those due to the differences among
282: plausible distributions.
283: 
284: The fact that we are only interested in sets that are bounded
285: intervals, simplifies matters considerably. Plausible distributions for
286: bounded intervals include the uniform and the beta distributions.
287: From now on, if we know that $x$ is in
288: an interval $I$, we assume that the probability of $x$ belonging to any
289: subinterval of $I$ only depends on the width of that subinterval and
290: not on where in $I$ this subinterval is located. This property is
291: implied by the uniform distribution over $I$, and this is the
292: distribution we assume for computation of the uncertainty in the
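statement $x \in I$. This uncertainty is equal to $\log_{10} w$,
in decimal units of information, where $w$
is the width of $I$. Relative to the prior knowledge that $x \in [0,1]$,
whose uncertainty is $\log_{10} 1 = 0$, the statement $x \in I$ therefore
has an information content of $- \log_{10} w$.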
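
As a small worked instance, take the hypothetical interval $0.123[3,5]$,
chosen here only for illustration. Its width is $w = 2 \times 10^{-4}$, so
that, relative to the knowledge that $x \in [0,1]$, the statement
$x \in 0.123[3,5]$ has information content
\begin{eqnarray*}
- \log_{10} w & = & - \log_{10} (2 \times 10^{-4})
              \;\; = \;\; 4 - \log_{10} 2 \;\; \approx \;\; 3.699
\end{eqnarray*}
decimal units. Halving the width adds $\log_{10} 2 \approx 0.301$ decimal
units; reducing it by a factor of ten adds exactly one.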
296: 
297: \section{Improvement of factored notation}
298: 
299: Factored notation solves the scanning problem. In this section we solve
300: the remaining problem that typically many of the digits inside the
301: brackets are useless. We do this by applying the formula found in
302: Section~\ref{INFOTH} to determine the information content of the digits
in factored notation. As factored notation is just an abbreviation of
classical notation, the results hold for classical notation as well.
305: 
306: We first consider a specific example in which we note a pattern of
307: rapidly decreasing efficiency as more digits are added. We explain this
308: phenomenon by a generally applicable formula, and use it to justify our
309: recommendation to write no more than three decimal digits inside the
310: brackets of factored notation.
311: 
312: For the example, we randomly selected an interval under the constraints
313: that both bounds have 15 digits, that the first five be the same, and
314: that the interval be nonempty. Thus we came to consider the interval
315: $[a,b]$ that is, in factored notation,
316: \begin{equation}
317: 0.389015[282749894,960538227]            \label{SUPERF}
318: \end{equation}
319: The information content is $- \log_{10}(b-a)$, which is about 6.169
320: decimal units. If we have to represent the information that a real is
321: confined to this interval, but are only allowed to use two digits
322: inside the brackets, then this interval has to be $0.389015[28,97]$.
323: This interval has information content of about 6.161. Thus we saved
324: twice seven digits and lost an amount of information equal to 0.008
325: decimal units. Note that an optimally used pair of decimal digits in
326: factored notation carries 1.000 decimal units of information.
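
The arithmetic behind these figures can be spelled out as follows, with
values rounded to six decimals:
\begin{eqnarray*}
b - a & = & 0.389015960538227 - 0.389015282749894
      \;\; = \;\; 6.77788333 \times 10^{-7}, \\
- \log_{10}(b-a) & \approx & 6.168906, \\
0.38901597 - 0.38901528 & = & 6.9 \times 10^{-7}, \\
- \log_{10}(6.9 \times 10^{-7}) & \approx & 6.161151, \\
6.168906 - 6.161151 & = & 0.007755.
\end{eqnarray*}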
327: 
328: This example suggests that two decimals inside the brackets already
329: give almost all the information contained in the statement that $x$ is
330: in (\ref{SUPERF}).  That only two decimal digits inside the brackets
331: are enough could be a misleading feature of this particular example. To
332: investigate this possibility, we analyse the information content
333: remaining for all possible ways of shortening (\ref{SUPERF}).
334: From this we will see that a pattern emerges. We show that the pattern
335: is not a peculiarity of the example.
336: Because the pattern almost always occurs, we give it a name:
337: \emph{Rule of One Tenth}.
338: Before investigating this rule, we first need to be more
339: precise about shortening the representation of an interval.
340: 
341: \paragraph{Inflation}
342: Consider the statement that $x \in [a,b]$. Let $[a^\prime,b^\prime]$
343: properly contain $[a,b]$. Now it may be the case that $x \in
344: [a^\prime,b^\prime]$ conveys almost as much information about $x$ as $x
345: \in [a,b]$ and yet $[a^\prime,b^\prime]$ requires fewer digits to
346: write. Then $[a^\prime,b^\prime]$ is a more efficient representation
347: than $[a,b]$.
348: 
349: A more efficient representation such as $[a^\prime,b^\prime]$ 
350: may be obtained by one
351: or more applications of an operation we refer to as ``inflation''.
352: \begin{definition}
353: Let $I$ be the representation of an interval of which the bounds have a
354: finite number of decimals. The operation of \emph{inflation} has as
355: result the representation of the smallest interval containing $I$ where
356: each bound has one less decimal than the corresponding bound in $I$.
357: \end{definition}
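
One way to make the definition operational, under the assumption that the
lower bound $a$ is written with $n \geq 1$ decimals and the upper bound $b$
with $m \geq 1$ decimals (the symbols $n$, $m$, $a^\prime$, and $b^\prime$
are used here for illustration), is to round the lower bound down and the
upper bound up:
\begin{eqnarray*}
a^\prime & = & 10^{-(n-1)} \lfloor 10^{n-1} a \rfloor, \\
b^\prime & = & 10^{-(m-1)} \lceil 10^{m-1} b \rceil.
\end{eqnarray*}
For $[a,b] = [0.12345, 0.134]$ we have $n = 5$ and $m = 3$, so that
\begin{eqnarray*}
a^\prime & = & 10^{-4} \lfloor 1234.5 \rfloor \;\; = \;\; 0.1234, \\
b^\prime & = & 10^{-2} \lceil 13.4 \rceil \;\; = \;\; 0.14,
\end{eqnarray*}
which in factored notation is $0.1[234,4]$.
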
358: In Table~\ref{EXAMPLES} we see some examples of inflation.
359: \begin{table}
360: \begin{tabular}{r||l|l}
361:  line number  & before inflation & after inflation              \\
362: \hline
363: \hline
364:       0       & $0.123[456,789]$  & $0.123[45,79]$                \\
365:       1       & $0.1[2345,34]$    & $0.1[234,4]$                  \\
366:       2       & $0.[1234,9999]$     & $[0.123,1.00]$                \\
367:       3       & $0.123[450,670]$  & $0.123[45,67]$                \\
368:       4       & $0.123[499,501]$  & $0.123[49,51]$
369: \end{tabular}
370: \caption{
371: Examples of inflation.
372: \label{EXAMPLES}
373: }
374: \end{table}
375: Line 0 is a typical case. Line 1 illustrates that inflation may apply
376: to intervals with an unequal number of decimals in the bounds. Line 2
377: is included to illustrate that inflation decreases the number of
378: digits, so that the four-digit $0.9999$ changes to the three-digit
379: numeral $1.00$.
380: 
381: Let us now consider the change in interval width due to inflation. In
382: line 3 of Table~\ref{EXAMPLES} we see that it can be as little as zero.
383: Line 4 shows that the width can increase by a factor of 10. In such a
384: case, the digits saved by inflation carry as much information as is
385: possible for a decimal digit.
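
Both extremes can be checked directly. Writing $w$ and $w^\prime$ for the
widths before and after inflation (symbols introduced here for
illustration), the information lost is
$(- \log_{10} w) - (- \log_{10} w^\prime) = \log_{10}(w^\prime / w)$.
For lines 3 and 4 of Table~\ref{EXAMPLES} this gives
\begin{eqnarray*}
\mbox{line 3:} & & w = 2.2 \times 10^{-4}, \;\;
   w^\prime = 2.2 \times 10^{-4}, \;\; \log_{10}(w^\prime / w) = 0, \\
\mbox{line 4:} & & w = 2 \times 10^{-6}, \;\;
   w^\prime = 2 \times 10^{-5}, \;\; \log_{10}(w^\prime / w) = 1.
\end{eqnarray*}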
386: 
387: In Table~\ref{SPREADSH} we see in the top line the bounds of interval
388: (\ref{SUPERF}).  Each next line shows the result of inflation applied
389: to the previous line. Thus it is true that $x$ is contained in each
390: interval of the table.
391: In the fourth column we see the information content of the
392: statement that $x$ belongs to the interval shown in that line. The last
393: column shows the decrease in information compared to the line before.
394: This decrease is to be compared to the information content of the
395: omitted decimal, which is $1$. Thus, the last column contains the
396: efficiency of showing the last decimal in each bound in the line
397: before.
398: 
399: \begin{table}
400: \begin{tabular}{r||l|l||l|l}
401:  &left boundary $a$ & right boundary $b$ & $-\log_{10}(b-a)$ & information \\
402:  &                &                    &           & loss                \\
403: \hline
404: \hline
405: 0  & 0.389015 282749894 & 0.389015 960538227 & 6.168905911 &               \\
406: 1  & 0.389015 28274989  & 0.389015 96053823  & 6.168905907 & 0.000000005   \\
407: 2  & 0.389015 2827498   & 0.389015 9605383   & 6.168905804 & 0.000000103   \\
408: 3  & 0.389015 282749    & 0.389015 960539    & 6.168904843 & 0.000000961   \\
409: 4  & 0.389015 28274     & 0.389015 96054     & 6.168898435 & 0.000006407   \\
410: 5  & 0.389015 2827      & 0.389015 9606      & 6.168834366 & 0.000064069   \\
411: 6  & 0.389015 282       & 0.389015 961       & 6.168130226 & 0.000704140   \\
412: 7  & 0.389015 28        & 0.389015 97        & 6.161150909 & 0.006979316   \\
413: 8  & 0.38901 52         & 0.38901 60         & 6.096910013 & 0.064240896   \\
414: 9  & 0.38901 5          & 0.38901 6          & 6           & 0.096910013   \\
415: 10 & 0.3890 1           & 0.3890 2           & 5           & 1             \\
416: 11 & 0.389 0            & 0.389 1            & 4           & 1             \\
417: 12 & 0.3 89             & 0.3 90             & 3           & 1             \\
418: 13 & 0.3 8              & 0.3 9              & 2           & 1             \\
419: 14 & 0. 3               & 0. 4               & 1           & 1             \\
420: 15 & 0                  & 1                  & 0           & 1          
421: \end{tabular}
422: \caption{
423: \label{SPREADSH}
424: Intervals $[a,b]$ containing an unknown real $x$.
425: Information loss as the result of successive inflations. Given that $x$
426: is in $[0,1]$, the information content of $x \in [a,b]$ is $-
427: \log_{10}(b-a)$. The loss due to inflation is in the last column.
428: }
429: \end{table}
430: 
431: As one goes down the table, considering successively more succinct, yet
432: true statements about $x$, one sees an interesting transition about
433: halfway. Of course something special has to happen at the point where
434: factored notation is $0.38901[5,6]$. The next more succinct intervals
435: are, successively, $0.3890[1,2]$, $0.389[0,1]$ and so on. In this
436: range, the information decrease is $1$, exactly the information content
437: of the decimal digit saved. That is, the digits that are saved here
438: are fully efficient.  Factored notation is not as useful here as it was
439: higher up in the table. In fact, it is redundant, as there is always a
440: pair of successive single decimals inside the brackets. An ad-hoc
441: notation in the style of tilde notation has a considerable advantage
442: here. I adopted the one proposed by Hickey \cite{hck00} and called
443: it ``Plus'' in Table~\ref{CLASSIF}.
444: 
445: Let us now consider the most important part of Table~\ref{SPREADSH}.
446: Suppose one considers shortening the interval in the top line to
447: $0.389015[28,97]$ and suppose one worries that too much information has
been lost. The last column in line 7 shows that the additional digits
contained in line 6 add only about one tenth of the amount of information
contained in the last digits of line 7, which (last column of line 8) is
itself already low at about 0.064 decimal units.  One can summarize
the last column from line 8 upwards by the {\bf Rule of One Tenth}:
453: \begin{quote} \emph{Each additional digit carries about one tenth
454: of the information in the previous one.} \end{quote}
The rule holds quite well from line 8 upwards. If it were exact,
the last column in line 1 would be $6 \times 10^{-9}$ instead of the
$5 \times 10^{-9}$ actually observed. Is this rule a fortuitous feature of
458: this particular example?  In the following, we will argue that it is
459: not.
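
The rule can be checked against the last column of Table~\ref{SPREADSH} by
dividing each loss by the loss in the line above it. Rounded to two
significant figures, the ratios are
\begin{eqnarray*}
\mbox{line 8 / line 7:} & & 0.064240896 / 0.006979316 \approx 9.2, \\
\mbox{line 7 / line 6:} & & 0.006979316 / 0.000704140 \approx 9.9, \\
\mbox{line 6 / line 5:} & & 0.000704140 / 0.000064069 \approx 11, \\
\mbox{line 5 / line 4:} & & 0.000064069 / 0.000006407 \approx 10, \\
\mbox{line 4 / line 3:} & & 0.000006407 / 0.000000961 \approx 6.7, \\
\mbox{line 3 / line 2:} & & 0.000000961 / 0.000000103 \approx 9.3.
\end{eqnarray*}
The geometric mean of these six ratios is about $9.2$. The ratio for
line 1 is omitted because the tabulated loss there has only one
significant digit.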
460: 
461: \paragraph{The general case}
462: In Table~\ref{SPREADSH} we see that the Rule of One Tenth only holds
463: over many lines with considerable fluctuations from line to line. In
464: fact, in Table~\ref{EXAMPLES} we saw that inflation can cause an
465: increase in interval width of as little as a factor of one and as much
466: as a factor of ten.  These factors correspond to information losses of
467: 0 and 1, respectively.  What can we say in general about interval
468: widening due to inflation?
469: 
470: \paragraph{}
471: We consider for the general case the interval shown digit by digit as
472: \begin{equation}\label{GENERIC}
473: 0.x_1 \ldots x_{j-1}[y_j \ldots y_{j+k-1}p,z_j \ldots z_{j+k-1}q],
474: \end{equation}
475: where $y_j < z_j$ and $k \geq 2$.
476: We ask whether the number of digits can be safely decreased
477: by one application of the inflation operation.
478: 
479: If $p=q=0$, width does not increase, so inflation can be applied
480: without any loss of information. The largest information loss occurs if
481: $p=9$ and $q=1$, in which case the width increases by $18 \times
482: 10^{-j-k}$.  Let us take $10^{-j-k+1}$ as a typical width increase, as
it is a convenient value near the midpoint of these extremes.
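
The two extremes follow from a short calculation, given here as a sketch
with $\Delta$ (a symbol introduced for illustration) denoting the increase
in width. The digits $p$ and $q$ have place value $10^{-j-k}$. Dropping $p$
moves the lower bound down by $p \times 10^{-j-k}$; dropping $q$ and
rounding upwards moves the upper bound up by $(10-q) \times 10^{-j-k}$ when
$q > 0$ and leaves it unchanged when $q = 0$. Hence
\begin{eqnarray*}
\Delta & = & (p + 10 - q) \times 10^{-j-k} \;\;\; \mbox{if $q > 0$}, \\
\Delta & = & p \times 10^{-j-k} \;\;\; \mbox{if $q = 0$},
\end{eqnarray*}
so that $p = q = 0$ gives $\Delta = 0$, and $p = 9$, $q = 1$ gives
$\Delta = 18 \times 10^{-j-k}$.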
484: 
485: This increase should be compared with the width $w$ of
486: (\ref{GENERIC}).  The comparison is obscured by the large variation of
487: $w$. It may be as little as $10^{-j-k}$ (see last line of
Table~\ref{EXAMPLES}) and nearly as much as $10^{-j+1}$. In the case
where (\ref{GENERIC}) is narrowest, inflation typically widens it by a
factor of ten. In that case $p$ and $q$ carry as much information as is
possible for a decimal digit, and perhaps all decimals should be kept.  In
the case where (\ref{GENERIC}) is widest, inflation widens it by a
negligible amount, and inflation is advisable.
494: 
495: Apparently it does not help to consider the extreme values of $w$, as
496: they lead to contradictory advice.  So let us consider average
497: values of $w$. We assume $k \geq 2$ (we retain at least two digits inside
498: the brackets).  If the average is in the order of $10^{-j}$, then
499: inflation causes negligible information loss.  If the average width is
500: near $10^{-j-k}$, then inflation causes the full amount of
501: information loss, so this is the worst case.  To simplify matters, we
502: make the worst case worse and assume that $w$ can range from $0$ to
503: $10^{-j+1}$. This is only a small change, as we are only interested in
$k \geq 2$, in which case the range from $0$ to $10^{-j-k}$ is negligible
505: compared to the range from $0$ to $10^{-j+1}$.
506: 
507: It is simplest to assume that the probability distribution of $w$ is
508: not far from uniform between $0$ and $10^{-j+1}$. In that case, it will
509: usually be the case that $w \in [10^{-j},10^{-j+1}]$.
510: 
511: %$10^{-j-k}$
512: %$10^{-j+1}$
513: 
514: But one may prefer not to make assumptions about the probability
515: distribution of $w$. Then one may accept the assumption that the digits
516: between the brackets in (\ref{GENERIC}) are independent random
517: variables with a uniform distribution on $\{0,\ldots,9\}$ under the
518: constraint that $y_j < z_j$. The average width of (\ref{GENERIC}) can
519: then be expressed as
520: \begin{equation}\label{AVERAGE}
521: w = \sum_{s = 0}^{9} \sum_{t = s+1}^{9} p_{st} w_{st}
522: \end{equation}
523: where $p_{st}$
524: is the probability of $y_j = s $ and $z_j = t$ and $w_{st}$ is the
525: average width under the constraint that $y_j = s$ and $z_j = t$.  For
$i$ between $0$ and $8$, if $y_j =i$, then $z_j$ can be $i+1,\ldots,9$.
527: Under the assumption about the distributions of the digits involved, we
528: have $p_{st} = 1/\sum_{i=1}^{9}i = 1/45$.
529: 
530: We are interested in a lower bound for $w_{st}$.
531: Each width is
bounded below by $(t-s-1) \times 10^{-j}$. Whatever the distribution, the
average is also bounded below by $(t-s-1) \times 10^{-j}$. Because this bound
534: depends only on $t-s$, we rewrite (\ref{AVERAGE}) as
535: \begin{eqnarray*}
536: w & = & \sum_{d=1}^{9} \sum_{a=0}^{9-d} p_{a,a+d} w_{a,a+d}
537: \end{eqnarray*}
Using $w_{a,a+d} \geq (d-1) \times 10^{-j}$ and $p_{st} = 1/45$, and keeping
for each $d$ only one of the $10-d$ nonnegative terms of the inner sum, we have
\begin{eqnarray*}
w & \geq & (1/45) \sum_{d=1}^{9}(d-1) \times 10^{-j}  \\
  & = & (36/45) \times 10^{-j} = (4/5) \times 10^{-j}
\end{eqnarray*}
543: Moreover, $w$ is bounded above by $10^{-j+1}$. So it is
544: reasonable to assume that $w$ is in the order of $10^{-j}$.
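
The lower bound above is not tight. As a sketch of a sharper count, note
that for each difference $d$ there are $10-d$ admissible values of $a$, so
that retaining all terms of the inner sum gives
\begin{eqnarray*}
w & \geq & (1/45) \sum_{d=1}^{9} (10-d)(d-1) \times 10^{-j} \\
  & = & (120/45) \times 10^{-j} \;\; = \;\; (8/3) \times 10^{-j}.
\end{eqnarray*}
Either bound supports the conclusion that $w$ is of the order of $10^{-j}$.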
545: 
546: %Let us now look at the effect of inflation acting on (\ref{GENERIC}).
547: %Dropping the
548: %last digit $p$ increases interval width by $p*10^{-j-k}$. Dropping $q$ and
549: %rounding upwards moves the upper bound by at most $(10-q)*10^{-j-k}$.
550: %Thus we have
551: %$$
552: %\begin{array}{rcccl}
553: %(10-9)*10^{-j-k} &\leq& (10-(q-p))*10^{-j-k} &\leq& (10 - (1-9))*10^{-j-k}
554: %								\\
555: %10^{-j-k}        &\leq& (10-(q-p))*10^{-j-k} &\leq& 18*10^{-j-k}
556: %\end{array}
557: %$$
558: %We take as typical value of interval widening $10^{-j-k+1}$, which is
559: %close to halfway these extremes.
560: Hence inflation widens an interval with a width of about $10^{-j}$ to
561: one that has a width of about $10^{-j}+10^{-j-k+1} =
10^{-j}(1+10^{-k+1})$.  Thus, the decrease in uncertainty due to the last
digit is of the order of $\log_{10}(1+10^{-k+1})$, which is about
$10^{-k+1}$, neglecting a factor of $1/\ln 10$.
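
Numerically, for the first few values of $k$,
\begin{eqnarray*}
k = 2: & & \log_{10}(1 + 10^{-1}) \approx 0.0414, \\
k = 3: & & \log_{10}(1 + 10^{-2}) \approx 0.00432, \\
k = 4: & & \log_{10}(1 + 10^{-3}) \approx 0.000434,
\end{eqnarray*}
in agreement with the first-order approximation
$\log_{10}(1+\epsilon) \approx \epsilon / \ln 10 \approx 0.434\,\epsilon$
for small $\epsilon$. Each additional digit thus reduces the information
gained by a factor of about ten.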
565: 
566: This is also the decrease in information gain for every additional
digit inside the brackets in factored notation.  This is also the Rule
of One Tenth observed in Table~\ref{SPREADSH} when averaging over many rows.
569: We can expect that the third decimal in a factored notation only
570: increases information by $0.01$ of the potential information in a
571: decimal digit, and is therefore of questionable value. We recommend
572: factored notation with two decimals inside the brackets, while keeping
573: in mind that the rule does not apply in rare cases such as line $4$ in
574: Table~\ref{EXAMPLES}.
575: 
576: \section{Conclusions}
577: 
578: Interval methods are coming of age. When interval software was
579: experimental, it didn't matter whether interval output was easy to
580: read.  Now that the main technical challenges have been overcome, and
581: we at least \emph{know} how to ensure that the floating-point bounds
582: include all reals that are possible values of the variable concerned,
583: we need to turn our attention to small, mundane matters, which include
584: taking care of the convenience of users. Factored notation is an
585: advance in this respect.  However, without some attention to the number
of digits inside the brackets, one runs the risk of specifying with
587: maximum accuracy not the number under consideration, but the
588: unavoidable lack of information about this number.
589: 
590: \section{Acknowledgements}
591: 
592: I owe a debt of gratitude to the anonymous referees for their valuable
593: suggestions.
594: Many thanks to Fr\'ed\'eric Goualard for helpful comments on a draft of
595: this paper. We acknowledge generous support by the University of
Victoria, the Natural Sciences and Engineering Research Council NSERC,
597: the Centrum voor Wiskunde en Informatica CWI, and the Nederlandse
598: Organisatie voor Wetenschappelijk Onderzoek NWO.
599: 
600: \begin{thebibliography}{1}
601: 
602: \bibitem{sh65}
603: R.B. Ash.
604: \newblock {\em Information Theory}.
605: \newblock Interscience, 1965.
606: 
607: \bibitem{vhlmyd97}
608: Pascal~Van Hentenryck, Laurent Michel, and Yves Deville.
609: \newblock {\em Numerica: A Modeling Language for Global Optimization}.
610: \newblock MIT Press, 1997.
611: 
612: \bibitem{hckvnmdn01}
613: T.~Hickey, Q.~Ju, and M.H. van Emden.
614: \newblock Interval arithmetic: from principles to implementation.
615: \newblock {\em Journal of the ACM}, 2001.
616: \newblock To appear.
617: 
618: \bibitem{hck00}
619: Timothy~J. Hickey.
620: \newblock {CLIP}: A {CLP}(intervals) dialect for metalevel constraint solving.
621: \newblock In {\em PADL2000}, pages 200--214. Springer-Verlag, 2000.
622: \newblock Lecture Notes in Computer Science 1753.
623: 
624: \bibitem{hvnn01}
625: E.~Hyv\"onen.
626: \newblock Interval input and output.
627: \newblock In W.~Kramer and J.W. von Gudenberg, editors, {\em Scientific
628:   Computing, Validated Numerics, Interval Methods}, pages 41--52. Kluwer, 2001.
629: 
630: \bibitem{szwc99}
631: Michael Schulte, Vitaly Zelov, G.~William Walster, and Dmitri Chiriaev.
632: \newblock Single-number interval {I/O}.
\newblock In {\em Developments in Reliable Computing}, pages 141--148. Kluwer
634:   Academic Publishers, 1999.
635: 
636: \bibitem{vnmdn01a}
637: M.H. van Emden.
638: \newblock Factored notation for interval {I/O}.
639: \newblock Technical Report DCS-264-IR, Department of Computer Science,
640:   University of Victoria.
641: \newblock Paper cs.NA/0102023 in Computing Research Repository (CoRR),
642: February 2001.
643: 
644: \bibitem{vnmdn99a}
645: M.H. van Emden.
646: \newblock Algorithmic power from declarative use of redundant constraints.
647: \newblock {\em Constraints}, pages 363--381, 1999.
648: 
649: \end{thebibliography}
650: \end{document}
651: