1: \documentclass[10pt]{article}

2:

3: \title{On the Significance of Digits in Interval Notation}

4: \author{\mbox{M.H. van Emden}\\

5:         \mbox{Computer Science Dept, University of Victoria}\\

6:         \mbox{Technical Report DCS-270-IR}

7:        }

8: \date{}

9:

10: \newtheorem{definition}{Definition}

11:

12: %%%%%%%%%%%%%%%%%%%%%%%%%%%%% end preamble

13:

14: \begin{document}

15: \maketitle

16:

17: \begin{abstract}

18: To analyse the significance of the digits used for interval bounds, we

19: clarify the philosophical presuppositions of various interval

20: notations. We use information theory to determine the information

21: content of the last digit of the numeral used to denote the interval's

22: bounds. This leads to the notion of \emph{efficiency} of a decimal digit:

23: the actual value as percentage of the maximal value of its information

24: content. By taking this efficiency into account, many presentations of

25: intervals can be made more readable at the expense of negligible loss of

26: information.

27: \end{abstract}

28:

29: \section{Introduction}

30:

31: Once upon a time, it was a matter of professional ethics among computers

32: never to write a meaningless decimal. Since then computers have become

33: machines and thereby lost any form of ethics, professional or

34: otherwise.  The human computers of yore were helped in their ethical

35: behaviour by the fact that it took effort to write spurious decimals.

36: Now the situation is reversed: the lazy way is to use the default

37: precision of the I/O library function. As a result, we are deluged with

38: meaningless decimals.

39:

40: Of course interval arithmetic is not guilty of such negligence. After

41: all, the very {\em raison d'\^etre} of the subject is to be explicit

42: about the precision of computed results.  Yet, even interval arithmetic

43: is plagued by superfluous decimals, albeit in a more subtle way.  In

44: this note we first review the various interval notations. We argue in

45: favour of a rarely used notation called ``tail'', or ``factored'', which

46: has the advantage of avoiding the repetition of decimals that are

47: necessarily the same.  We analyse the information content of the

48: remaining decimals.

49:

50: \section{Philosophical implications of an interval notation}

51: Several papers \cite{hvnn01,szwc99,vnmdn01a} discussing interval

52: notations have been published recently.  The various notations have

53: different implications, just as people have different reasons for being

54: interested in interval arithmetic.

55:

56: For some, intervals are a way of denoting a fuzzy, or perhaps

57: probabilistic, quantity. Others use intervals to give an indication

58: of the extent to which rounding has introduced error in a

59: computation. Here we assume an interpretation of intervals that does

60: not necessarily negate the above interpretations, but differs in the

61: way it is made precise. We call it the {\em set interpretation of

62: interval arithmetic}.

63:

64: \paragraph{The set interpretation}

65: According to the set interpretation, variables range over the real

66: numbers. These reals are represented in computer memory as sets of

67: reals. The constraint is that if variable $x$ is represented by set

68: $S$, we have $x \in S$. Thus the set interpretation differs from

69: conventional numerical analysis in the {\em absence of errors}.  It is

70: either true or false that $x$ belongs to $S$.

71:

72: The fact that $S$ contains more than one real is not an error.  In

73: conventional numerical analysis, an error arises when, for example, a

74: real variable $x$ with value $0.1$ is represented by a floating-point

75: number $f$. An error arises because $x = f$ is false. On the other

76: hand, representing $x$ by $S$ is not an error if $x \in S$.

77:

78: Of course, the statement $x \in S$ provides only a limited amount

79: of information about $x$. The larger $S$ is, the less information. In

80: the set interpretation of interval arithmetic we distinguish

81: error, which is avoidable, from the inescapable fact that the

82: amount of information yielded by a finite machine is finite.

83:

84: \paragraph{Consequences of the set interpretation}

85: Interval arithmetic is no exception to the rule that finite machines

86: can only give a finite amount of information.  In interval arithmetic

87: the sets of reals are limited to those that are easily representable:

88: closed, connected sets of reals that have finite floating-point numbers

89: as bounds, if they have a bound at all. Unbounded closed connected sets

90: of reals use the infinities of the floating-point standard in the

91: obvious way. Each member of this finite set of sets of reals can be

92: represented by a pair of floating-point numbers. It is also the case

93: that for every set of reals, there exists a unique least floating-point

94: interval containing it.

95:

96: This is the set interpretation of interval arithmetic. Its virtues

97: include that it is familiar. In fact, many people are surprised to

98: hear it given a name, as this is what they always thought intervals to

99: be. Another virtue is that, if the set interpretation is followed up

100: in all its consequences, it allows resolution of potential ambiguities

101: in interval arithmetic, especially in interval division involving

102: unbounded intervals, intervals containing zero, or intervals

103: containing nothing but zero \cite{hckvnmdn01}.

104:

105: \section{Interval notations}

106: If one accepts the advantages of the set interpretation of interval

107: arithmetic, then one prefers a notation for an interval that suggests

108: a set. The traditional notation, exemplified by $[1.233,1.235]$, has this

109: advantage. Although widely used, it is not practical, as is

110: apparent\footnote{

111: I'm not making this up; see page 122 of \cite{vhlmyd97}.

112: } from the statement that an unknown real $x$ belongs to

113: \begin{eqnarray}

114:       [+0.6180339887498946804,+0.6180339887498950136]. \label{crude}

115: \end{eqnarray}

116: The problem with this ubiquitous notation is that it is hard to separate

117: two important pieces of information: \emph{where} the interval is, and

118: \emph{how wide} it is. To remedy this defect,

119: Hyv\"onen \cite{hvnn01} described a notation according to which one writes

120: instead

121: \begin{eqnarray}

122:       +0.61803398874989[46804,50136].                \label{factored}

123: \end{eqnarray}

124: The situation is similar when we are annoyed by having to write

125: \begin{eqnarray*}

126:       0.61803398874989x+0.61803398874989y,

127: \end{eqnarray*}

128: which we prefer to have in factored form: $0.61803398874989(x+y)$.

129: Hence we propose to refer to (\ref{factored}) as \emph{factored notation}

130: for intervals\footnote{

131: The notation has been occasionally used without comment in the literature;

132: see for example \cite{vnmdn99a}.

133: Credit goes to Hyv\"onen, whose paper \cite{hvnn01} was the first to

134: appear in print that drew attention to it and named it.

135: Independently I did so in \cite{vnmdn01a}.

136: Hyv\"onen called it ``tail notation''.

137: }.

138: The name is more than an analogy: in general,

139: one \emph{factors} with respect to a multiplicative infix operation,

140: of which concatenation on strings is an example.

141:

142: In the example the bounds are in normalized scientific notation and

143: have the same exponent. In general, factored notation converts an

144: interval $[a \times 10^p,b \times 10^q]$, with normalized numerals as

145: bounds, first to $[a,b \times 10^{q-p}] \times 10^p$, where the upper

146: bound is not necessarily normalized.  When $p \not = q$, then this

147: cannot be shortened by taking an initial string of common first

148: decimals outside the brackets. It can only be shortened by limiting the

149: precision of $a$ and $b$, a topic we address later in the paper.
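
As a hypothetical illustration (the numbers are ours, not taken from the cited papers), consider the interval $[9.98 \times 10^{2}, 1.02 \times 10^{3}]$, so that $a = 9.98$, $p = 2$, $b = 1.02$, and $q = 3$. The conversion gives
\begin{eqnarray*}
[9.98 \times 10^{2}, 1.02 \times 10^{3}] & = & [9.98, 1.02 \times 10^{3-2}] \times 10^{2} \\
 & = & [9.98, 10.2] \times 10^{2}.
\end{eqnarray*}
The bounds $9.98$ and $10.2$ share no initial string of decimals, so nothing can be moved outside the brackets.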

150:

151: Table~\ref{CLASSIF} contains an overview of interval notations. Most

152: of the table is adopted from Hyv\"onen \cite{hvnn01}.

153: In this overview we distinguish three categories: (a) those that suggest

154: a set, (b) those that suggest a number degraded by an error, and (c) those

155: that suggest a pure number.

156: \begin{table}

157: \begin{tabular}{l|l|l|}

158: Notation  &          Interval value        &   Name of notation    \\

159: \hline

160: \hline

161: $[1.233,1.235]$ &    $[1.233,1.235]$       &      Classic          \\

162: $1.23[3,5]$     &    $[1.233,1.235]$       &      Factored         \\

163: $1.234\pm 2$    &    $[1.232,1.236]$       &      Range            \\

164: $1.234$\verb|~| &    $[1.2335,1.2345]$     &      Tilde            \\

165: $1.234+$        &    $[1.234,1.235]$       &      Plus             \\

166: $1.234+[-1e-3,2e-3]$&$[1.233,1.236]$       &      Error            \\

167: $1.234*$        &    $[1.233,1.235]$       &      Star             \\

168: $1.234$         &    $[1.233,1.235]$       &      Single-Number

169: \end{tabular}

170: \caption{

171: \label{CLASSIF}

172: Overview of interval notations, adapted from Hyv\"onen \cite{hvnn01}.

173: }

174: \end{table}

175: The Classic and Factored notations belong to category (a).

176: Under category (b) we have added, in analogy with the Tilde notation, the

177: Plus notation. This latter notation is useful in the improvement of the

178: factored notation discussed later on in this paper.

179: Category (c) is in the last line. Hyv\"onen used the

180: name ``Fortran notation''. The notation is actually the

181: ``Single-number notation'' for the Fortran implementation described in

182: \cite{szwc99}.

183:

184: The virtue of the notations in category (b) is that they make explicit

185: that a numeral is not to be interpreted according to mathematical notation,

186: by which we mean that

187: \begin{equation}

188: d_{m}d_{m-1}\ldots d_0.d_{-1}\ldots d_{-n}   \label{mathematical}

189: \end{equation}

190: denotes the number $\sum_{i= -n}^m d_i 10^i$.

191: Mathematical notation implies an infinite number of zeros after the last

192: digit when $n > 0$.

193:

194: Mathematical notation is not the only way to interpret

195: (\ref{mathematical}).  For a long time physicists, chemists, and

196: engineers have used the convention that

197: \emph{a numeral has as meaning

198: any number that rounds to the number denoted by the numeral

199: displayed}.

200: The coexistence of mathematical notation with the physics convention

201: introduces an ambiguity that is often resolved by context.  With

202: intervals, the ambiguity becomes problematic, as we need numerals to

203: denote the bounds of an interval in the classic notation. Are these to

204: be interpreted according to mathematical notation, or according to the

205: physics convention?  It is implicit in most of the interval literature,

206: and explicit in \cite{hvnn01,vnmdn01a}, that \emph{the numerals in the

207: bounds of an interval are to be interpreted according to the

208: mathematical notation}. In this paper we follow that rule.
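
As a small illustration of the difference, take the numeral $1.234$ of Table~\ref{CLASSIF}. Under mathematical notation it denotes exactly
\[
1 \times 10^{0} + 2 \times 10^{-1} + 3 \times 10^{-2} + 4 \times 10^{-3} = 1.234 .
\]
Under the physics convention it stands for any number that rounds to $1.234$, that is, for some number in $[1.2335,1.2345]$, which is the interval value given for the Tilde notation in the table.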

209:

210: We therefore propose to avoid category (c) and to give the single-number notation an

211: annotation to indicate that it does not have the usual mathematical

212: meaning. This has been done by Hickey, who introduced \cite{hck00} the

213: Star notation of Table~\ref{CLASSIF}.

214:

215: \paragraph{Difficulties of factored notation}

216: There are two problems with the classical notation. The first is the

217: {\em scanning problem}\/: one needs to scan both bounds digit by digit

218: to find the leftmost different digit.  Only then does one have an idea

219: of the width of the interval. The second problem, the {\em problem of

220: useless digits} can also be found in (\ref{crude}): the width of the

221: interval is specified by no fewer than five digits.  Restricting

222: oneself to four digits for this purpose gives almost as much

223: information about $x$; the difference is so small as not to be

224: worth that fifth digit.  As we will show below, the same holds almost

225: always for all digits beyond the first two or three.

226:

227: Factored notation solves the scanning problem; the problem of useless

228: digits remains. To solve it also, we need to study quantitatively the

229: information content of the statement that an unknown real $x$ is

230: contained in an interval $[a,b]$.

231:

232: \section{Information theory}\label{INFOTH}

233: According to Shannon's theory of information (see for example, among

234: many textbooks, \cite{sh65}), observations can reduce the amount of

235: uncertainty about the value of an unknown quantity. The amount of

236: information yielded by an observation is \emph{the decrease (if any) in

237: the

238: amount of uncertainty}. Shannon argues that the amount of uncertainty

239: is appropriately measured by the \emph{entropy} of the probability

240: distribution over the possible values. For a uniform distribution on a

241: finite number of values, this reduces to the logarithm of the number

242: of possible values. It can be shown that the entropy for a

243: distribution over $n$ outcomes is maximized by the uniform

244: distribution over these outcomes.

245:

246: When there are two equally probable possible values, and if one would

247: like this logarithm to come out at unity, one takes $2$ as base of the

248: logarithms and one calls the unit of information {\em bit}, for

249: {\bf b}inary un{\bf it} of information. Thus, the binary digits

250: carry at most one bit of

251: information. Similarly, if one works with decimal digits, then it is

252: convenient to use $10$ as the base of the logarithms.

253: %By analogy, we

254: %call the resulting unit of information {\em dit},

255: %from {\bf d}ecimal un{\bf it} of information. For mathematical

256: %reasons, it is sometimes convenient to use $e \in 2.71[8,9]$ as basis

257: %for the logarithms, with the {\em nit} as corresponding unit of

258: %information.

259:

260: Thus information theory determines for each number base the maximum

261: amount of information that can be carried by a digit. Normally, if we

262: don't know what a number is, and we are only given the first $k$

263: digits of a numeral denoting that number, we have no idea what the

264: next digit should be. That is, all possibilities in

265: $\{0,1,2,3,4,5,6,7,8,9\}$ are equally probable so that the uncertainty

266: is $\log_{10}10 = 1$. As a decimal digit can only distinguish between

267: ten possibilities, the efficiency of the $(k+1)$st digit is one.
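
To relate the two units, here is a sketch that uses only the definition above: a uniform distribution over $n$ values has uncertainty $\log_{10} n$ decimal units, so a fully efficient binary digit carries
\[
\log_{10} 2 \approx 0.301
\]
decimal units, and a fully efficient decimal digit carries $\log_{2} 10 \approx 3.32$ bits.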

268: %Of

269: %course, this also holds if we do know what the number is, and it is

270: %$\pi$, and if $k = 100$.

271:

272: %Things are not as straightforward for the other important use of

273: %information theory in interval arithmetic. As we indicated before,

274: In the set interpretation of interval arithmetic, we have information of

275: the form that a real $x$ belongs to a set $S$. According to information

276: theory, this represents an uncertainty equal to the entropy of the

277: probability distribution over the elements of $S$.

278: What distribution to assume?

279: We are only interested in the large differences in

280: information carried by the successive digits of factored notation.

281: These are large compared to those due to the differences among

282: plausible distributions.

283:

284: The fact that we are only interested in sets that are bounded

285: intervals simplifies matters considerably. Plausible distributions for

286: bounded intervals include the uniform and the beta distributions.

287: From now on, if we know that $x$ is in

288: an interval $I$, we assume that the probability of $x$ belonging to any

289: subinterval of $I$ only depends on the width of that subinterval and

290: not on where in $I$ this subinterval is located. This property is

291: implied by the uniform distribution over $I$, and this is the

292: distribution we assume for computation of the uncertainty in the

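293: statement $x \in I$. This uncertainty is equal to $\log_{10} w$,

294: in decimal units of information, where $w$

295: is the width of $I$. Given the prior knowledge that $x \in [0,1]$, the statement $x \in I$ therefore has information content $- \log_{10} w$.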
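
For example, for interval (\ref{factored}) the width is $w = 3332 \times 10^{-19} = 3.332 \times 10^{-16}$, so that
\[
- \log_{10} w = 16 - \log_{10} 3.332 \approx 15.48
\]
decimal units of information about $x$, relative to knowing only that $x \in [0,1]$.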

296:

297: \section{Improvement of factored notation}

298:

299: Factored notation solves the scanning problem. In this section we solve

300: the remaining problem that typically many of the digits inside the

301: brackets are useless. We do this by applying the formula found in

302: Section~\ref{INFOTH} to determine the information content of the digits

303: in factored notation. As factored notation is just an abbreviation of

304: it, this holds for classical notation as well.

305:

306: We first consider a specific example in which we note a pattern of

307: rapidly decreasing efficiency as more digits are added. We explain this

308: phenomenon by a generally applicable formula, and use it to justify our

309: recommendation to write no more than three decimal digits inside the

310: brackets of factored notation.

311:

312: For the example, we randomly selected an interval under the constraints

313: that both bounds have 15 digits, that the first five be the same, and

314: that the interval be nonempty. Thus we came to consider the interval

315: $[a,b]$ that is, in factored notation,

316: \begin{equation}

317: 0.389015[282749894,960538227]            \label{SUPERF}

318: \end{equation}

319: The information content is $- \log_{10}(b-a)$, which is about 6.169

320: decimal units. If we have to represent the information that a real is

321: confined to this interval, but are only allowed to use two digits

322: inside the brackets, then this interval has to be $0.389015[28,97]$.

323: This interval has information content of about 6.161. Thus we saved

324: twice seven digits and lost an amount of information equal to 0.008

325: decimal units. Note that an optimally used pair of decimal digits in

326: factored notation carries 1.000 decimal units of information.
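
The arithmetic behind these figures can be written out as a check, with $w$ and $w^\prime$ denoting the widths before and after shortening (symbols introduced here for convenience):
\begin{eqnarray*}
w & = & 0.389015960538227 - 0.389015282749894 = 6.77788333 \times 10^{-7}, \\
w^\prime & = & 0.38901597 - 0.38901528 = 6.9 \times 10^{-7}, \\
\log_{10}(w^\prime/w) & \approx & \log_{10} 1.018 \approx 0.008 .
\end{eqnarray*}
The loss depends only on the ratio of the two widths, not on where the interval is located.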

327:

328: This example suggests that two decimals inside the brackets already

329: give almost all the information contained in the statement that $x$ is

330: in (\ref{SUPERF}).  That only two decimal digits inside the brackets

331: are enough could be a misleading feature of this particular example. To

332: investigate this possibility, we analyse the information content

333: remaining for all possible ways of shortening (\ref{SUPERF}).

334: From this we will see that a pattern emerges. We show that the pattern

335: is not a peculiarity of the example.

336: Because the pattern almost always occurs, we give it a name:

337: \emph{Rule of One Tenth}.

338: Before investigating this rule, we first need to be more

339: precise about shortening the representation of an interval.

340:

341: \paragraph{Inflation}

342: Consider the statement that $x \in [a,b]$. Let $[a^\prime,b^\prime]$

343: properly contain $[a,b]$. Now it may be the case that $x \in

344: [a^\prime,b^\prime]$ conveys almost as much information about $x$ as $x

345: \in [a,b]$ and yet $[a^\prime,b^\prime]$ requires fewer digits to

346: write. Then $[a^\prime,b^\prime]$ is a more efficient representation

347: than $[a,b]$.

348:

349: A more efficient representation such as $[a^\prime,b^\prime]$

350: may be obtained by one

351: or more applications of an operation we refer to as ``inflation''.

352: \begin{definition}

353: Let $I$ be the representation of an interval of which the bounds have a

354: finite number of decimals. The operation of \emph{inflation} has as

355: result the representation of the smallest interval containing $I$ where

356: each bound has one less decimal than the corresponding bound in $I$.

357: \end{definition}

358: In Table~\ref{EXAMPLES} we see some examples of inflation.

359: \begin{table}

360: \begin{tabular}{r||l|l}

361:  line number  & before inflation & after inflation              \\

362: \hline

363: \hline

364:       0       & $0.123[456,789]$  & $0.123[45,79]$                \\

365:       1       & $0.1[2345,34]$    & $0.1[234,4]$                  \\

366:       2       & $0.[1234,9999]$     & $[0.123,1.00]$                \\

367:       3       & $0.123[450,670]$  & $0.123[45,67]$                \\

368:       4       & $0.123[499,501]$  & $0.123[49,51]$

369: \end{tabular}

370: \caption{

371: Examples of inflation.

372: \label{EXAMPLES}

373: }

374: \end{table}

375: Line 0 is a typical case. Line 1 illustrates that inflation may apply

376: to intervals with an unequal number of decimals in the bounds. Line 2

377: is included to illustrate that inflation decreases the number of

378: digits, so that the four-digit $0.9999$ changes to the three-digit

379: numeral $1.00$.

380:

381: Let us now consider the change in interval width due to inflation. In

382: line 3 of Table~\ref{EXAMPLES} we see that it can be as little as zero.

383: Line 4 shows that the width can increase by a factor of 10. In such a

384: case, the digits saved by inflation carry as much information as is

385: possible for a decimal digit.
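
These observations can be made quantitative. Writing $w$ and $w^\prime$ for the widths before and after inflation (our notation), the information lost is $\log_{10}(w^\prime/w)$ when information is measured as in Section~\ref{INFOTH}. For three lines of Table~\ref{EXAMPLES} this gives
\begin{eqnarray*}
\mbox{line 0:} & & \log_{10}\frac{3.40 \times 10^{-4}}{3.33 \times 10^{-4}} \approx 0.009, \\
\mbox{line 3:} & & \log_{10}\frac{2.2 \times 10^{-4}}{2.2 \times 10^{-4}} = 0, \\
\mbox{line 4:} & & \log_{10}\frac{2 \times 10^{-5}}{2 \times 10^{-6}} = 1 .
\end{eqnarray*}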

386:

387: In Table~\ref{SPREADSH} we see in the top line the bounds of interval

388: (\ref{SUPERF}).  Each next line shows the result of inflation applied

389: to the previous line. Thus it is true that $x$ is contained in each

390: interval of the table.

391: In the fourth column we see the information content of the

392: statement that $x$ belongs to the interval shown in that line. The last

393: column shows the decrease in information compared to the line before.

394: This decrease is to be compared to the information content of the

395: omitted decimal, which is $1$. Thus, the last column contains the

396: efficiency of showing the last decimal in each bound in the line

397: before.

398:

399: \begin{table}

400: \begin{tabular}{r||l|l||l|l}

401:  &left boundary $a$ & right boundary $b$ & $-\log_{10}(b-a)$ & information \\

402:  &                &                    &           & loss                \\

403: \hline

404: \hline

405: 0  & 0.389015 282749894 & 0.389015 960538227 & 6.168905911 &               \\

406: 1  & 0.389015 28274989  & 0.389015 96053823  & 6.168905907 & 0.000000005   \\

407: 2  & 0.389015 2827498   & 0.389015 9605383   & 6.168905804 & 0.000000103   \\

408: 3  & 0.389015 282749    & 0.389015 960539    & 6.168904843 & 0.000000961   \\

409: 4  & 0.389015 28274     & 0.389015 96054     & 6.168898435 & 0.000006407   \\

410: 5  & 0.389015 2827      & 0.389015 9606      & 6.168834366 & 0.000064069   \\

411: 6  & 0.389015 282       & 0.389015 961       & 6.168130226 & 0.000704140   \\

412: 7  & 0.389015 28        & 0.389015 97        & 6.161150909 & 0.006979316   \\

413: 8  & 0.38901 52         & 0.38901 60         & 6.096910013 & 0.064240896   \\

414: 9  & 0.38901 5          & 0.38901 6          & 6           & 0.096910013   \\

415: 10 & 0.3890 1           & 0.3890 2           & 5           & 1             \\

416: 11 & 0.389 0            & 0.389 1            & 4           & 1             \\

417: 12 & 0.3 89             & 0.3 90             & 3           & 1             \\

418: 13 & 0.3 8              & 0.3 9              & 2           & 1             \\

419: 14 & 0. 3               & 0. 4               & 1           & 1             \\

420: 15 & 0                  & 1                  & 0           & 1

421: \end{tabular}

422: \caption{

423: \label{SPREADSH}

424: Intervals $[a,b]$ containing an unknown real $x$.

425: Information loss as the result of successive inflations. Given that $x$

426: is in $[0,1]$, the information content of $x \in [a,b]$ is $-

427: \log_{10}(b-a)$. The loss due to inflation is in the last column.

428: }

429: \end{table}

430:

431: As one goes down the table, considering successively more succinct, yet

432: true statements about $x$, one sees an interesting transition about

433: halfway. Of course something special has to happen at the point where

434: factored notation is $0.38901[5,6]$. The next more succinct intervals

435: are, successively, $0.3890[1,2]$, $0.389[0,1]$ and so on. In this

436: range, the information decrease is $1$, exactly the information content

437: of the decimal digit saved. That is, the digits that are saved here

438: are fully efficient.  Factored notation is not as useful here as it was

439: higher up in the table. In fact, it is redundant, as there is always a

440: pair of successive single decimals inside the brackets. An ad-hoc

441: notation in the style of tilde notation has a considerable advantage

442: here. I adopted the one proposed by Hickey \cite{hck00} and called

443: it ``Plus'' in Table~\ref{CLASSIF}.

444:

445: Let us now consider the most important part of Table~\ref{SPREADSH}.

446: Suppose one considers shortening the interval in the top line to

447: $0.389015[28,97]$ and suppose one worries that too much information has

448: been lost. The last column in line 7 shows that the additional digits

449: contained in line 6 add only about one tenth of the amount of information

450: contained in the last digits of line 7, which is itself already low at

451: about $0.064$ decimal units (last column in line 8).  One can summarize

452: the last column above line 8 by the {\bf Rule of One Tenth}:

453: \begin{quote} \emph{Each additional digit carries about one tenth

454: of the information in the previous one.} \end{quote}

455: The rule holds quite well from line 8 upwards. If it were exact,

456: the last column in line 1 would be $6*10^{-9}$ instead of the

457: $5*10^{-9}$ actually observed. Is this rule a fortuitous feature of

458: this particular example?  In the following, we will argue that it is

459: not.
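
As a rough check on Table~\ref{SPREADSH} (our arithmetic, using the rounded values shown in the last column), the ratios of successive losses from line 8 up to line 1 are approximately
\[
9.2,\ 9.9,\ 11.0,\ 10.0,\ 6.7,\ 9.3,\ 20.6 ,
\]
where the last ratio is unreliable because the loss in line 1 is shown to only one significant digit. Their geometric mean is
\[
\left( \frac{0.064240896}{0.000000005} \right)^{1/7} \approx 10.4 ,
\]
and an exact factor of ten per line would turn the $0.064240896$ of line 8 into $0.064240896 \times 10^{-7} \approx 6 \times 10^{-9}$ in line 1.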

460:

461: \paragraph{The general case}

462: In Table~\ref{SPREADSH} we see that the Rule of One Tenth only holds

463: over many lines with considerable fluctuations from line to line. In

464: fact, in Table~\ref{EXAMPLES} we saw that inflation can cause an

465: increase in interval width of as little as a factor of one and as much

466: as a factor of ten.  These factors correspond to information losses of

467: 0 and 1, respectively.  What can we say in general about interval

468: widening due to inflation?

469:

470: \paragraph{}

471: We consider for the general case the interval shown digit by digit as

472: \begin{equation}\label{GENERIC}

473: 0.x_1 \ldots x_{j-1}[y_j \ldots y_{j+k-1}p,z_j \ldots z_{j+k-1}q],

474: \end{equation}

475: where $y_j < z_j$ and $k \geq 2$.

476: We ask whether the number of digits can be safely decreased

477: by one application of the inflation operation.

478:

479: If $p=q=0$, width does not increase, so inflation can be applied

480: without any loss of information. The largest information loss occurs if

481: $p=9$ and $q=1$, in which case the width increases by $18 \times

482: 10^{-j-k}$.  Let us take $10^{-j-k+1}$ as a typical width increase, as

483: it is a convenient value near the midpoint of these extremes.
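
In symbols (a sketch in our own notation, with $\Delta$ denoting the width increase): rounding the lower bound down adds $p \times 10^{-j-k}$ to the width, and rounding the upper bound up adds $(10-q) \times 10^{-j-k}$ when $q \neq 0$ and nothing when $q = 0$. Hence
\[
0 \leq \Delta \leq (9 + 9) \times 10^{-j-k} = 18 \times 10^{-j-k} .
\]
For line 0 of Table~\ref{EXAMPLES}, where $p = 6$, $q = 9$ and $j+k = 6$, this gives $\Delta = (6 + 1) \times 10^{-6}$, in agreement with the widths $3.33 \times 10^{-4}$ before and $3.40 \times 10^{-4}$ after inflation.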

484:

485: This increase should be compared with the width $w$ of

486: (\ref{GENERIC}).  The comparison is obscured by the large variation of

487: $w$. It may be as little as $10^{-j-k}$ (see last line of

488: Table~\ref{EXAMPLES}) and nearly as much as $10^{-j+1}$. When

489: (\ref{GENERIC}) is narrowest, inflation typically widens it by a factor of

490: ten. In that case $p$ and $q$ carry as much information as is possible

491: for a decimal digit. Perhaps all decimals should be kept.  When

492: (\ref{GENERIC}) is widest, inflation widens it by a negligible amount.

493: Inflation is advisable.

494:

495: Apparently it does not help to consider the extreme values of $w$, as

496: they lead to contradictory advice.  So let us consider average

497: values of $w$. We assume $k \geq 2$ (we retain at least two digits inside

498: the brackets).  If the average is in the order of $10^{-j}$, then

499: inflation causes negligible information loss.  If the average width is

500: near $10^{-j-k}$, then inflation causes the full amount of

501: information loss, so this is the worst case.  To simplify matters, we

502: make the worst case worse and assume that $w$ can range from $0$ to

503: $10^{-j+1}$. This is only a small change, as we are only interested in

504: $k \geq 2$, in which case the range from $0$ to $10^{-j-k}$ is negligible

505: compared to the range from $0$ to $10^{-j+1}$.

506:

507: It is simplest to assume that the probability distribution of $w$ is

508: not far from uniform between $0$ and $10^{-j+1}$. In that case, it will

509: usually be the case that $w \in [10^{-j},10^{-j+1}]$.
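
Under this uniformity assumption the claim follows from a one-line computation:
\[
\Pr\left( w \geq 10^{-j} \right) = 1 - \frac{10^{-j}}{10^{-j+1}} = 0.9 .
\]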

510:

511: %$10^{-j-k}$

512: %$10^{-j+1}$

513:

514: But one may prefer not to make assumptions about the probability

515: distribution of $w$. Then one may accept the assumption that the digits

516: between the brackets in (\ref{GENERIC}) are independent random

517: variables with a uniform distribution on $\{0,\ldots,9\}$ under the

518: constraint that $y_j < z_j$. The average width of (\ref{GENERIC}) can

519: then be expressed as

520: \begin{equation}\label{AVERAGE}

521: w = \sum_{s = 0}^{9} \sum_{t = s+1}^{9} p_{st} w_{st}

522: \end{equation}

523: where $p_{st}$

524: is the probability of $y_j = s $ and $z_j = t$ and $w_{st}$ is the

525: average width under the constraint that $y_j = s$ and $z_j = t$.  For

526: $i$ between $0$ and $8$, if $y_j =i$, then $z_j$ can be $i+1,\ldots,9$.

527: Under the assumption about the distributions of the digits involved, we

528: have $p_{st} = 1/\sum_{i=1}^{9}i = 1/45$.

529:

530: We are interested in a lower bound for $w_{st}$.

531: Each width is

532: bounded below by $(t-s-1)*10^{-j}$. Whatever the distribution, the

533: average is also bounded below by $(t-s-1)*10^{-j}$. Because this bound

534: depends only on $t-s$, we rewrite (\ref{AVERAGE}) as

535: \begin{eqnarray*}

536: w & = & \sum_{d=1}^{9} \sum_{a=0}^{9-d} p_{a,a+d} w_{a,a+d}

537: \end{eqnarray*}

538: Using $w_{a,a+d} \geq (d-1)*10^{-j}$ and $p_{st} = 1/45$ we have

539: \begin{eqnarray*}

540: w & \geq & (1/45) \sum_{d=1}^{9}(d-1)*10^{-j}  \\

541:   & \geq & (36/45)*10^{-j} = (4/5)*10^{-j}

542: \end{eqnarray*}

543: Moreover, $w$ is bounded above by $10^{-j+1}$. So it is

544: reasonable to assume that $w$ is in the order of $10^{-j}$.
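
A slightly sharper estimate is available if one keeps track of the number of terms in the inner sum; the following is our own check under the same assumptions, not a replacement for the bound above. For each difference $d$ there are $10-d$ pairs $(a,a+d)$, so that
\begin{eqnarray*}
w & \geq & (1/45) \sum_{d=1}^{9} (10-d)(d-1)*10^{-j} \\
  & = & (120/45)*10^{-j} = (8/3)*10^{-j} .
\end{eqnarray*}
Either way, $w$ lies between a small multiple of $10^{-j}$ and $10^{-j+1}$, which supports the same conclusion.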

545:

546: %Let us now look at the effect of inflation acting on (\ref{GENERIC}).

547: %Dropping the

548: %last digit $p$ increases interval width by $p*10^{-j-k}$. Dropping $q$ and

549: %rounding upwards moves the upper bound by at most $(10-q)*10^{-j-k}$.

550: %Thus we have

551: %$$

552: %\begin{array}{rcccl}

553: %(10-9)*10^{-j-k} &\leq& (10-(q-p))*10^{-j-k} &\leq& (10 - (1-9))*10^{-j-k}

554: %								\\

555: %10^{-j-k}        &\leq& (10-(q-p))*10^{-j-k} &\leq& 18*10^{-j-k}

556: %\end{array}

557: %$$

558: %We take as typical value of interval widening $10^{-j-k+1}$, which is

559: %close to halfway these extremes.

560: Hence inflation widens an interval with a width of about $10^{-j}$ to

561: one that has a width of about $10^{-j}+10^{-j-k+1} =

562: 10^{-j}(1+10^{-k+1})$.  Thus, the uncertainty decreased by the last

563: digit is in the order of $\log_{10}(1+10^{-k+1})$, which is about

564: $10^{-k+1}$, neglecting a factor of $\ln 10$.
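
For concreteness, here are the values of this estimate for small $k$, computed with the factor $\ln 10$ retained, using $\log_{10}(1+\epsilon) \approx \epsilon / \ln 10$ for small $\epsilon$:
\begin{eqnarray*}
k = 2: & & \log_{10}(1 + 10^{-1}) \approx 0.0414, \\
k = 3: & & \log_{10}(1 + 10^{-2}) \approx 0.0043, \\
k = 4: & & \log_{10}(1 + 10^{-3}) \approx 0.00043 .
\end{eqnarray*}
Each step in $k$ reduces the estimate by a factor of about ten.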

565:

566: This is also the decrease in information gain for every additional

567: digit inside the brackets in factored notation.  This is also the Rule

568: of One Tenth observed in Table~\ref{SPREADSH} when averaging over many rows.

569: We can expect that the third decimal in a factored notation only

570: increases information by $0.01$ of the potential information in a

571: decimal digit, and is therefore of questionable value. We recommend

572: factored notation with two decimals inside the brackets, while keeping

573: in mind that the rule does not apply in rare cases such as line $4$ in

574: Table~\ref{EXAMPLES}.

575:

576: \section{Conclusions}

577:

578: Interval methods are coming of age. When interval software was

579: experimental, it didn't matter whether interval output was easy to

580: read.  Now that the main technical challenges have been overcome, and

581: we at least \emph{know} how to ensure that the floating-point bounds

582: include all reals that are possible values of the variable concerned,

583: we need to turn our attention to small, mundane matters, which include

584: taking care of the convenience of users. Factored notation is an

585: advance in this respect.  However, without some attention to the number

586: of digits inside the brackets, one runs the risk of specifying in

587: maximum accuracy not the number under consideration, but the

588: unavoidable lack of information about this number.

589:

590: \section{Acknowledgements}

591:

592: I owe a debt of gratitude to the anonymous referees for their valuable

593: suggestions.

594: Many thanks to Fr\'ed\'eric Goualard for helpful comments on a draft of

595: this paper. We acknowledge generous support by the University of

596: Victoria, the Natural Sciences and Engineering Research Council NSERC,

597: the Centrum voor Wiskunde en Informatica CWI, and the Nederlandse

598: Organisatie voor Wetenschappelijk Onderzoek NWO.

599:

600: \begin{thebibliography}{1}

601:

602: \bibitem{sh65}

603: R.B. Ash.

604: \newblock {\em Information Theory}.

605: \newblock Interscience, 1965.

606:

607: \bibitem{vhlmyd97}

608: Pascal~Van Hentenryck, Laurent Michel, and Yves Deville.

609: \newblock {\em Numerica: A Modeling Language for Global Optimization}.

610: \newblock MIT Press, 1997.

611:

612: \bibitem{hckvnmdn01}

613: T.~Hickey, Q.~Ju, and M.H. van Emden.

614: \newblock Interval arithmetic: from principles to implementation.

615: \newblock {\em Journal of the ACM}, 2001.

616: \newblock To appear.

617:

618: \bibitem{hck00}

619: Timothy~J. Hickey.

620: \newblock {CLIP}: A {CLP}(intervals) dialect for metalevel constraint solving.

621: \newblock In {\em PADL2000}, pages 200--214. Springer-Verlag, 2000.

622: \newblock Lecture Notes in Computer Science 1753.

623:

624: \bibitem{hvnn01}

625: E.~Hyv\"onen.

626: \newblock Interval input and output.

627: \newblock In W.~Kramer and J.W. von Gudenberg, editors, {\em Scientific

628:   Computing, Validated Numerics, Interval Methods}, pages 41--52. Kluwer, 2001.

629:

630: \bibitem{szwc99}

631: Michael Schulte, Vitaly Zelov, G.~William Walster, and Dmitri Chiriaev.

632: \newblock Single-number interval {I/O}.

633: \newblock In {\em Developments in Reliable Computing}, pages 141--148. Kluwer

634:   Academic Publishers, 1999.

635:

636: \bibitem{vnmdn01a}

637: M.H. van Emden.

638: \newblock Factored notation for interval {I/O}.

639: \newblock Technical Report DCS-264-IR, Department of Computer Science,

640:   University of Victoria.

641: \newblock Paper cs.NA/0102023 in Computing Research Repository (CoRR),

642: February 2001.

643:

644: \bibitem{vnmdn99a}

645: M.H. van Emden.

646: \newblock Algorithmic power from declarative use of redundant constraints.

647: \newblock {\em Constraints}, pages 363--381, 1999.

648:

649: \end{thebibliography}

650: \end{document}

651: