0401:cs0401009/pr.tex

1: \chapter[Probabilistic Reasoning]{Probabilistic Reasoning%

2: \protect\footnote{Based in part on \citet{wolff_1999_prob}.}}\label{pr_chapter}

3:

4: \index{reasoning!probabilistic|(}

5:

6: \section{Introduction}

7:

8: Quoting Benjamin Franklin (``Nothing is certain but death and taxes''), Ginsberg \citeyearpar{ginsberg_1994} writes (p. 2) that: ``The view that Franklin was expressing is that virtually every conclusion we draw [in reasoning] is an uncertain one.'' He goes on to say: ``This sort of reasoning in the face of uncertainty... has ... proved to be remarkably difficult to formalise.''

9:

10: This chapter shows how various kinds of probabilistic reasoning may be performed within the SP system.

11:

12: \subsection{`Reasoning' and `inference'}\label{reasoning_and_inference_section}

13:

14: Before we proceed, some clarification is necessary for the meanings of the terms {\em reasoning}

15: and {\em inference} as they will be used in this book.

16:

17: `Reasoning' will be used to mean any process of ``going beyond the information

18: given''. In most forms of `deductive' reasoning, conclusions are either `true' or `false', whereas in `probabilistic reasoning', conclusions have some kind of `probability' or level of confidence attaching to them, e.g., ``Black clouds mean that it will probably rain''.

19:

20: All-or-nothing deductive reasoning as it appears in logic will not be considered in this

21: chapter, except in passing. Chapter \ref{maths_logic_chapter} considers how logic and mathematics may be understood in terms of information compression and, more specifically, the SP theory.

22:

23: As it is normally used, the term `inference' has three distinct but related meanings:

24:

25: \begin{itemize}

26:

27: \item It is sometimes used to mean a process of constructing a grammar or other kind of

28: knowledge structure by `induction' from a body of `raw' data.

29:

30: \item It is also often used with essentially the same meaning as probabilistic reasoning as

31: described above.

32:

33: \item It may also be used to refer to the result or product of a process of probabilistic reasoning.

34:

35: \end{itemize}

36:

37: Inference of the first kind is the main subject of Chapter \ref{learning_chapter}. Only the second and third of these notions of inference will be considered in this chapter.

38:

39: \subsection{Related research and novelty of the present proposals}

40:

41: There is now a huge literature on probabilistic reasoning and related ideas ranging over `standard' parametric and non-parametric statistics, {\em ad hoc} uncertainty measures in early expert systems, Bayesian statistics, Bayesian/belief/causal networks, Markov networks, Self-Organising Feature Maps, fuzzy set theory and `soft computing', the Dempster-Shafer theory, abductive reasoning, nonmonotonic reasoning and reasoning with default values, autoepistemic logic, defeasible logic, possibilistic and other kinds of logic designed to accommodate uncertainty, MLE, algorithmic probability and algorithmic complexity theory, truth maintenance systems, decision analysis, utility theory, and more.

42:

43: A well-known and authoritative survey of the field, with an emphasis on Bayesian networks, is provided by \citet{pearl_1988} although this book is now, perhaps, in need of some updating \citep[but see][]{pearl_2000}. A relatively short but useful review of ``uncertainty handling formalisms'' appears in \citet{parsons_hunter_1998}. Regarding the application of different kinds of `logic' to nonmonotonic and uncertain reasoning, there is a mine of useful information in the articles in \cite{gab_hog_rob_1994} covering such things as `default logic', `autoepistemic logic', `circumscription', `defeasible logic', `uncertainty logics' and `possibilistic logic'. In that volume, the chapter by \citet{ginsberg_1994} provides an excellent introduction to the problems of nonmonotonic reasoning. Papers by \cite{bondarenko_etal_1997, kern-isberner_1998, kohlas_etal_1998} are also relevant as are the papers in \citet{gammerman_1996}.

44:

45: Amongst these many approaches to probabilistic reasoning, the multiple alignment concept---as it has been developed in the SP system---is distinctive. Also distinctive is the attempt to integrate probabilistic reasoning with a wide range of concepts in artificial intelligence, computing, logic and mathematics.

46:

47: \subsection{Information compression and probabilistic reasoning}

48:

49: Naturally enough, much of the literature on probabilistic reasoning deals directly with

50: concepts of probability, especially conditional probability. Since, however, there is a close

51: connection between probability and compression (as described in Section \ref{probabilities_ic_section}), concepts of probability imply corresponding concepts of compression.

52:

53: That said, a primary emphasis on compression rather than probability provides an alternative perspective on the subject which may prove useful. Relevant sources include \citet{dagu_luby_1997, grunwald_1997, van_der_gaag_1996, watanabe_article_1972}.

54:

55: \section{Probabilistic reasoning, multiple alignment and information compression}\label{prob-reason}

56:

57: What connection is there between the formation of a multiple alignment and probabilistic reasoning? This section describes the connection and the way in which probabilities of inferences may be derived from multiple alignments.

58:

59: In the simplest terms, probabilistic reasoning arises from partial pattern recognition: if a pattern is recognised from a subset of its parts (something that people and animals are very good at doing), then, in effect, an inference is made that the unseen part or parts are `really' there. We might, for example, recognise a car from seeing only the front half because the rear half is hidden behind another car or a building. The inference that the rear half is present is probabilistic because there is always a possibility that the rear half is absent or, in some surreal world, replaced by the front half of a horse, or something equally bizarre.

60:

61: In terms of multiple alignment, probabilistic reasoning may be understood as the formation of a multiple alignment in which one or more symbols in the Old patterns are not aligned with any matching symbol or symbols in the New pattern. The probabilities of these symbols (and the inferences that they represent) are calculated as described in Section \ref{probabilities_section}. As a working hypothesis, all kinds of probabilistic reasoning may be understood in these terms.

62:

63: \subsection{Confidence in inferences}\label{confidence_in_inferences}

64:

65: The formulae and calculations presented in Section \ref{probabilities_section} may suggest that, given a system which is equipped with this mathematical machinery, we can calculate precisely the level of confidence that should be placed in inferences that the system makes.

66:

67: This is, of course, nonsense. Any knowledge-based system (including human brains!) is

68: subject to the law ``rubbish in means rubbish out''. In the case of human experts, we do or should ask whether they were given all the relevant information about a given case, how well-trained they are, how much experience they have, whether they have given full attention to the case, and so on. Artificial systems are no different.

69:

70: In general, the confidence which we may place in probabilistic inferences made by a given

71: system should be influenced by several factors including the accuracy and coverage of the

72: information supplied about a particular case, the accuracy and coverage of the knowledge stored

73: in the system, the effectiveness of the search methods used by the system, and the thoroughness of the search which has been made for a given set of data and inferences.

74:

75: \subsection{A simple example}\label{simple_pr_example}

76:

77: In order to illustrate the kinds of values that may be calculated for absolute and relative probabilities, this subsection presents a very simple example: the inference of `fire' from `smoke'. Here, we shall extend the concept of `smoke' to include anything, like mist or fog, which looks like smoke. Also, `fire' has been divided into three categories: the kind of fire used to heat a house or other building, dangerous fires that need a fire extinguisher or more, and the kind of fire inside a burning cigarette or pipe.

78:

79: Given a New pattern containing the single symbol `smoke' and the Old patterns shown in Figure \ref{associations_figure}, SP61 forms the five obvious multiple alignments of New with each of the patterns which contain the symbol `smoke'.

80:

81: \begin{figure}[!hbt]

82: \centering

83: \begin{tabular}{l}

84: clouds black rain (15,000) \\

85: dangerous fire smoke (500) \\

86: heating fire smoke (7,000) \\

87: tobacco fire smoke (10,000) \\

88: fog smoke (2,000) \\

89: stage smoke (100) \\

90: thunder lightning (5,000) \\

91: strawberries cream (1,500) \\

92: \end{tabular}

93: \caption{A small knowledge base of associations. The numbers in brackets show an imaginary frequency of occurrence of each pattern in some notional reference environment.}

94: \label{associations_figure}

95: \end{figure}

96:

97: The absolute and relative probabilities of the five multiple alignments, calculated as described in Section \ref{probabilities_section}, are shown in Table \ref{smoke_alignment_probabilities}.

98:

99: \begin{table}[!hbt]

100: \centering

101: \begin{tabular}{l l l}

102: \em Alignment & \em Absolute & \em Relative \\

103: & \em probability & \em probability \\

104: \\

105: smoke/tobacco fire smoke & 0.08718 & 0.51020 \\

106: smoke/heating fire smoke & 0.06103 & 0.35714 \\

107: smoke/fog smoke & 0.01744 & 0.10204 \\

108: smoke/dangerous fire smoke & 0.00436 & 0.02551 \\

109: smoke/stage smoke & 0.00087 & 0.00510 \\

110: \end{tabular}

111: \caption{Absolute and relative probabilities of each of the five reference multiple alignments formed between `smoke' in New and the patterns shown in Figure \ref{associations_figure} in Old. In this example, the relative probability of each pattern from Old is the same as that of the multiple alignment in which it appears.}

112: \label{smoke_alignment_probabilities}

113: \end{table}
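
As an arithmetic check on the relative probabilities in Table \ref{smoke_alignment_probabilities} (and not a restatement of the method in Section \ref{probabilities_section}), it may be noted that, in this simple case, they coincide with the frequencies in Figure \ref{associations_figure} normalised over the five patterns that contain `smoke'. The symbol $f_i$, for the frequency of pattern $i$, is introduced here purely for illustration:

\[
\sum_i f_i = 10{,}000 + 7{,}000 + 2{,}000 + 500 + 100 = 19{,}600
\]
\[
\frac{10{,}000}{19{,}600} \approx 0.51020, \quad
\frac{7{,}000}{19{,}600} \approx 0.35714, \quad
\frac{2{,}000}{19{,}600} \approx 0.10204, \quad
\frac{500}{19{,}600} \approx 0.02551, \quad
\frac{100}{19{,}600} \approx 0.00510
\]

\noindent This agreement should not be expected in general, where a multiple alignment may contain more than one pattern from Old.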

114:

115: In this very simple example, the relative probability of each pattern from Old is the same as for the multiple alignment in which it appears. However, the same cannot be said of individual symbol types. The relative probabilities of the symbol types that appear in any of the five reference multiple alignments are shown in Table \ref{smoke_symbol_probabilities}. The main points to notice about the relative probabilities shown in this table are:

116:

117: \begin{itemize}

118:

119: \item The relative probability of `smoke' is 1.0. This is because it is a `fact' which appears in New, so there is no uncertainty attaching to it.

120:

121: \item Of the other symbol types from Old, the one with the highest probability relative to the other symbols is `fire', and this relative probability is higher than the relative probability of any of the patterns from Old (Table \ref{smoke_alignment_probabilities}). This is because `fire' appears in three of the reference multiple alignments.

122:

123: \end{itemize}

124:

125: \begin{table}[!hbt]

126: \centering

127: \begin{tabular}{ll}

128: \em Symbol & \em Relative \\

129:  & \em probability \\

130: \\

131: smoke & 1.00000 \\

132: fire & 0.89286 \\

133: tobacco & 0.51020 \\

134: heating & 0.35714 \\

135: fog & 0.10204 \\

136: dangerous & 0.02551 \\

137: stage & 0.00510 \\

138: \end{tabular}

139: \caption{The relative probabilities of the symbol types from Old that appear in any of the reference set of multiple alignments shown in Table \ref{smoke_alignment_probabilities}.}

140: \label{smoke_symbol_probabilities}

141: \end{table}
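
The value for `fire' in Table \ref{smoke_symbol_probabilities} can be checked by hand, on the reading that the relative probability of a symbol type is the sum of the relative probabilities of the reference multiple alignments in which it appears. Using the rounded values from Table \ref{smoke_alignment_probabilities} for the alignments containing `tobacco fire smoke', `heating fire smoke' and `dangerous fire smoke':

\[
0.51020 + 0.35714 + 0.02551 = 0.89285
\]

\noindent which agrees with the tabulated value of 0.89286 up to rounding of the three terms.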

142:

143: In this example, we have ignored all the subtle cues that people would use in practice to infer the origin of smoke: the smell, colour and volume of smoke, associated noises, behaviour of other people, and so on. Allowing for this, and allowing for the probable inaccuracy of the frequency values which have been used, the relative probabilities of multiple alignments, patterns and symbols seem to reflect the subjective probability which we might assign to the five alternative sources of smoke-like matter in everyday situations.

144:

145: \section{One-step `deductive' reasoning}\label{one_step_deductive_reasoning}

146:

147: \index{reasoning!deductive|(}

148:

149: Consider a `standard' example of {\em modus ponens} syllogistic reasoning:

150:

151: \begin{enumerate}

152:

153: \item $\forall x$: bird($x$) $\implies$ canfly($x$).

154:

155: \item bird(Tweety).

156:

157: \item $\therefore$ canfly(Tweety).

158:

159: \end{enumerate}

160:

161: \noindent which, in English, may be interpreted as:

162:

163: \begin{enumerate}

164:

165: \item If something is a bird then it can fly.

166:

167: \item Tweety is a bird.

168:

169: \item Therefore, Tweety can fly.

170:

171: \end{enumerate}

172:

173: In classical logic, a `material implication' like $(p \implies q)$ (``If something is a bird then it can fly'') is equivalent to $\neg(p \land \neg q)$ (``It is not true that something is a bird and it cannot fly'') and to $(\neg q \implies \neg p)$ (``If something cannot fly then it is not a bird'') and also to $(\neg p \lor q)$ (``Either something is not a bird or it can fly'').
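
These equivalences can be confirmed with a truth table, given here as a routine check (T for true, F for false). The four right-hand columns agree in every row:

\begin{center}
\begin{tabular}{cc|cccc}
$p$ & $q$ & $p \implies q$ & $\neg(p \land \neg q)$ & $\neg q \implies \neg p$ & $\neg p \lor q$ \\
\hline
T & T & T & T & T & T \\
T & F & F & F & F & F \\
F & T & T & T & T & T \\
F & F & T & T & T & T \\
\end{tabular}
\end{center}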

174:

175: However, there is a more relaxed, `everyday' kind of `deduction' which, in terms of our example, may be expressed as: ``If something is a bird then, {\em probably}, it can fly. Tweety is a bird. Therefore, {\em probably}, Tweety can fly.''

176:

177: This kind of probabilistic `deduction' differs from material implication because it does not have the same equivalencies as the classical form. If our focus of interest is in describing and reasoning about the real world rather than exploring the properties of abstract systems of symbols, the probabilistic kind of `deduction' seems to be more appropriate. With regard to birds, we know that there are flightless birds, and for most other examples of a similar kind, an ``all or nothing'' logical description would not be an accurate reflection of the facts.

178:

179: With a pattern of symbols, we may record the fact that birds can fly and, in a very natural way, we may record all the other attributes of a bird in the same pattern. The pattern may look something like this:

180:

181: \begin{center}

182: \begin{BVerbatim}

183: Bd bird name #name canfly warm-blooded wings feathers ... #Bd,

184: \end{BVerbatim}

185: \end{center}

186:

187: \noindent or the attributes of a bird may be described in the more elaborate way described in Section \ref{class_part_inheritance}.

188:

189: This pattern and others of a similar kind may be stored in `Old', together with patterns like `name Tweety \#name', `name George \#name', `name Susan \#name' and so on which define the range of possible names. Also, the pattern, `bird Tweety', corresponding to the proposition ``Tweety is a bird'' may be supplied as New. Given patterns like these in New and Old, the best multiple alignment found by SP61 is the one shown in Figure \ref{bird_tweety_alignment}.

190:

191: \begin{figure}[!hbt]

192: \centering

193: \begin{BVerbatim}

194: 0        1        2

195:

196:                   Bd

197: bird ------------ bird

198:          name --- name

199: Tweety - Tweety

200:          #name -- #name

201:                   canfly

202:                   warm-blooded

203:                   wings

204:                   feathers

205:                   ...

206:                   #Bd

207:

208: 0        1        2

209: \end{BVerbatim}

210: \caption{The best multiple alignment found by SP61 with the pattern `bird Tweety' in New and other patterns in Old as described in the text.}

211: \label{bird_tweety_alignment}

212: \end{figure}

213:

214: As before, the inferences which are expressed by this multiple alignment are represented by the unmatched symbols in the multiple alignment. The fact that Tweety is a bird allows us to infer that Tweety can fly but it also allows us to infer that Tweety is warm-blooded, has wings and feathers and all the other attributes of birds. These inferences arise directly from the pattern describing the attributes of birds.

215:

216: In this case, there is only one multiple alignment which encodes all the symbols in New. Therefore, the relative probability of the multiple alignment is 1.0, the relative probability of `canfly' is 1.0, and likewise for all the other symbols in the multiple alignment, both those which are matched to New and those which are not.

217:

218: At this point readers may wonder whether the SP scheme can handle nonmonotonic reasoning: the fact that additional information about penguins, kiwis and other flightless birds would invalidate the inference that something being a bird means that it can fly. The way in which the SP system can perform nonmonotonic reasoning is described in Section \ref{nonmonotonic_reasoning_section}, below.%

219: \index{reasoning!deductive|)}

220:

221: \section{Abductive reasoning}\label{abductive_reasoning_section}

222:

223: \index{reasoning!abductive|(}

224:

225: In the SP system, any subsequence of a pattern may function as what is `given' in reasoning, with the complementary subsequence functioning as the inference. Thus, it is just as easy to reason in a `backwards', abductive manner as it is to reason in a `forwards', deductive manner. We can also reason from the middle of a pattern outwards, from the ends of a pattern to the middle, and many other possibilities. In short, the SP system allows seamless integration of probabilistic `deductive' reasoning with abductive reasoning and other kinds of reasoning which are not commonly recognised.

226:

227: Figure \ref{tweety_canfly_alignment} shows the best multiple alignment and the other member of its reference set of multiple alignments which are formed by SP61 with the same patterns in Old as were used in the example of `deductive' reasoning (Section \ref{one_step_deductive_reasoning}) and with the pattern `Tweety canfly' in New.

228:

229: By contrast with the example of `deductive' reasoning, there are two multiple alignments in the reference set of multiple alignments that encode all the symbols in New. These two multiple alignments represent two alternative sets of abductive inferences that may be drawn from this combination of New and Old.

230:

231: \begin{figure}[!hbt]

232: \fontsize{09.00pt}{10.80pt}

233: \centering

234: \begin{BVerbatim}

235: 0        1        2

236:

237:                   Bd

238:                   bird

239:          name --- name

240: Tweety - Tweety

241:          #name -- #name

242: canfly ---------- canfly

243:                   warm-blooded

244:                   wings

245:                   feathers

246:                   ...

247:                   #Bd

248:

249: 0        1        2

250:

251: (a)

252:

253: 0        1        2

254:

255:                   Bt

256:                   bat

257:          name --- name

258: Tweety - Tweety

259:          #name -- #name

260: canfly ---------- canfly

261:                   furry

262:                   echo-sounding

263:                   ...

264:                   #Bt

265:

266: 0        1        2

267:

268: (b)

269: \end{BVerbatim}

270: \caption{The best multiple alignment (a), and the other member of its reference set of multiple alignments (b), formed by SP61 with patterns in Old as described in Section \ref{one_step_deductive_reasoning} and with `Tweety canfly' in New.}

271: \label{tweety_canfly_alignment}

272: \end{figure}

273:

274: With regard to the first multiple alignment (Figure \ref{tweety_canfly_alignment} (a)), `Tweety' could be a bird with all the attributes of birds, including the ability to fly. The relative probability of the multiple alignment is 0.74, as is the relative probability of the pattern for `bird' and every other symbol in that pattern (apart from the `name' and `\#name' symbols where the relative probability is 1.0).

275:

276: Alternatively, we may infer from the second multiple alignment (Figure \ref{tweety_canfly_alignment} (b)) that `Tweety' could be a bat. But in this case the relative probability of the multiple alignment, the pattern for `bat' and all the symbols in that pattern (apart from the `name \#name' symbols) is only 0.26.%

277: \index{reasoning!abductive|)}

278:

279: \section{Reasoning with probabilistic decision networks and decision trees}\label{probabilistic_decision_network}

280:

281: \index{decision network or tree|(}

282:

283: So far, we have considered examples of reasoning in a single step. One of the simplest kinds of system that supports reasoning in more than one step (as well as single-step reasoning) is a `decision network' or a `decision tree'. In such a system, a path is traced through the network or tree from a start node to one of two or more alternative destination nodes, depending on the answers to multiple-choice questions at intermediate nodes. Any such network or tree may be given a probabilistic dimension by attaching a value for probability or frequency to each of the alternative answers to questions at the intermediate nodes.

284:

285: Figure \ref{decision_tree_rules} shows a set of patterns, each of which represents a non-terminal node of a decision network from a car maintenance manual for diagnosing faults in a car engine. To save space, the text associated with each pattern has been omitted. Terminal nodes have also been omitted because, without the text, each one would contain nothing but a number symbol. Figure \ref{decision_tree_rules_sample} shows a sample of the patterns for non-terminal and terminal nodes of the network with the text for each node included.

286:

287: In Figure \ref{decision_tree_rules}, each pattern except the pattern for the start node is identified by the number symbol which appears at the beginning of the pattern, together with the `yes' or `no' answer to the question in the parent node. The number symbol at the end of each pattern links the node to its children, that is, to the patterns which begin with that number symbol.

288:

289: As usual with an SP knowledge base, each pattern in Figure \ref{decision_tree_rules} has a frequency of occurrence, shown to the right of each pattern. In the present case, each frequency is a guestimated frequency of a symptom or symptoms in the domain of car repair.

290:

291: \begin{figure}[!hbt]

292: \fontsize{07.00pt}{08.40pt}

293: \centering

294: \begin{BVerbatim}

295: Start 43 (1202)

296:    43 yes 44 (1043)

297:       44 yes 19 (1009)

298:          19 yes 58 (46)

299:          19 no 1 (963)

300:              1 yes 2 (691)

301:                 2 yes 4 (92)

302:                    4 yes 58 (36)

303:                    4 no 23 (56)

304:                 2 no 5 (599)

305:                    5 yes 58 (62)

306:                    5 no 27 (537)

307:                       27 yes 21 (293)

308:                          21 yes 61 (84)

309:                          21 no 22 (209)

310:                             22 yes 24 (122)

311:                             22 no 22a (87)

312:                                22a yes 35 (44)

313:                                   35 yes 58 (19)

314:                                   35 no 26 (25)

315:                                22a no 36 (43)

316:                                   36 yes 58 (21)

317:                                   36 no 26 (22)

318:                       27 no 60 (244)

319:              1 no 3 (272)

320:                 3 yes 6 (36)

321:                 3 no 7 (236)

322:                    7 yes 8 (71)

323:                       8 yes 8a (55)

324:                       8 no 9 (16)

325:                    7 no 10 (165)

326:                       10 yes 11 (115)

327:                          11 yes 13 (14)

328:                          11 no 14 (101)

329:                             14 yes 15 (21)

330:                             14 no 16 (80)

331:                       10 no 12 (50)

332:       44 no 51 (34)

333:          51 yes 56 (14)

334:          51 no 59 (20)

335:    43 no 45 (159)

336:       45 yes 46 (145)

337:          46 yes 52 (120)

338:             52 yes 53 (86)

339:             52 no 54 (34)

340:                54 yes 55 (22)

341:                54 no 50 (12)

342:          46 no 62 (25)

343:       45 no 47 (14)

344:          47 yes 48 (8)

345:          47 no 49 (6)

346: \end{BVerbatim}

347: \caption{A set of patterns representing the non-terminal nodes of a decision network for the diagnosis of

348: faults in a car engine. To save space, the text associated with each pattern has been

349: omitted. The full version of some of these patterns can be seen in Figure \ref{decision_tree_rules_sample}.}

350: \label{decision_tree_rules}

351: \end{figure}
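
To illustrate how the frequencies in Figure \ref{decision_tree_rules} give the network a probabilistic dimension, consider the two answers to the question at node 43. The calculation below is a sketch which assumes that each frequency may be read as a count of cases reaching the corresponding pattern; the notation $P(\cdot \mid 43)$ is introduced here for illustration and is not part of the SP calculations described in Section \ref{probabilities_section}:

\[
P(\mbox{yes} \mid 43) = \frac{1043}{1202} \approx 0.868, \qquad
P(\mbox{no} \mid 43) = \frac{159}{1202} \approx 0.132
\]

\noindent The two frequencies sum to the frequency of the parent pattern ($1043 + 159 = 1202$), so the two values sum to 1.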

352:

353: \begin{figure}[!hbt]

354: \fontsize{10.00pt}{12.00pt}

355: \centering

356: \begin{BVerbatim}

357: Start 43 Does the starter turn the engine?

358:    43 yes 44 Does starter turn the engine briskly?

359:       44 yes 19 Slowly press throttle to floor, return

360:             choke and try again. Does engine start?

361:          19 yes 58 Problem is solved.

362:          19 no 1 Is there a spark at the plug leads?

363:             1 yes 2 Remove and examine plugs.

364:                   Are they black, oily or wet?

365:                2 yes 4 Thoroughly clean and

366:                      dry [plugs], or renew, check

367:                      gap and refit. Try again.

368:                      Does engine start?

369:                   4 yes 58 Problem is solved.

370:                   4 no 23 Check for water in fuel.

371:                         Drain off to clear.

372:                2 no 5 Clean, check gap and replace plugs.

373:                      Does engine start?

374: (etc)

375: \end{BVerbatim}

376: \caption{A sample of the patterns representing nodes of the decision network for diagnosing faults

377: in car engines, including the text associated with each pattern.}

378: \label{decision_tree_rules_sample}

379: \end{figure}

380:

381: \subsection{Networks and trees: convergence and divergence of links}

382:

383: The patterns in the full network of non-terminal and terminal nodes represent a network

384: rather than a tree because there is both divergence and convergence of links. Divergence appears

385: when a node has two or more children while convergence appears where a node has two or more

386: parents (e.g., node 58). Clearly, a decision tree can be represented by using simple patterns in the

387: same manner as this example of a decision network, except that there must not be any patterns

388: which have two or more parents.

389:

390: \subsection{Forming multiple alignments: sequential processing of New}

391:

392: How can this network be used to diagnose a fault in a car engine? In principle, this means

393: putting the `Start' symbol into New followed by a sequence of `yes' and `no' symbols, putting

394: the patterns from Figure \ref{decision_tree_rules} into Old, and then searching for the multiple alignment which represents the best encoding of New in terms of Old.

395:

396: This abstract description of how the system may be used is not very practical---because it

397: leads to inefficient searching (in terms both of speed and success in finding the `correct' multiple alignment)

398: and because it does not allow the questions which are posed by the system to be answered in the

399: kind of progressive interactive way which is natural for this kind of system.

400:

401: For these reasons, SP61 has been designed so that, where necessary or appropriate, it can

402: be set to search for a good multiple alignment by processing any pattern in New in sections or `windows',

403: one window at a time, in left-to-right order, forming intermediate multiple alignments at each stage, as described in Section \ref{windows_section}. The size of the window can vary from one symbol to the whole of New.

404:

405: In this mode of operation, with each window containing only one symbol, it is possible to

406: start the search with only the `Start' symbol in New and then add `yes' and `no' symbols to it,

407: one symbol at a time, in response to the questions which are posed in each of the best intermediate

408: multiple alignments.

409:

410: \subsection{Multiple alignment and engine fault diagnosis}

411:

412: Figure \ref{decision_tree_alignment_1} shows the best multiple alignment found by SP61 with the patterns from Figure \ref{decision_tree_rules} in Old and the pattern `Start yes yes no no no yes no', supplied symbol-by-symbol to New.

413:

414: \begin{figure}[!hbt]

415: \fontsize{10.00pt}{12.00pt}

416: \centering

417: \begin{BVerbatim}

418: 0 Start    yes    yes    no   no   no   yes   no   0

419:     |       |      |     |    |    |     |    |

420: 1 Start 43  |      |     |    |    |     |    |    1

421:         |   |      |     |    |    |     |    |

422: 2       43 yes 44  |     |    |    |     |    |    2

423:                |   |     |    |    |     |    |

424: 3              44 yes 19 |    |    |     |    |    3

425:                       |  |    |    |     |    |

426: 4                     19 no 1 |    |     |    |    4

427:                             | |    |     |    |

428: 5                           1 no 3 |     |    |    5

429:                                  | |     |    |

430: 6                                3 no 7  |    |    6

431:                                       |  |    |

432: 7                                     7 yes 8 |    7

433:                                             | |

434: 8                                           8 no 9 8

435: \end{BVerbatim}

436: \caption{The best multiple alignment found by SP61 with the pattern `Start yes yes no no no yes

437: no' in New and the patterns from Figure \ref{decision_tree_rules} in Old. SP61 was set to operate in the

438: progressive manner described in the text.}

439: \label{decision_tree_alignment_1}

440: \end{figure}

441:

442: As usual, every symbol in a pattern from Old which does not form a hit with a

443: symbol in New represents an inference which can be drawn from the multiple alignment. However, the most

444: important of these is the last one, the symbol `9' which represents a terminal node containing the

445: text ``There is an HT fault or, possibly, a capacitor fault''. This is the diagnosis which the system

446: offers in response to the sequence of yes/no symbols in New representing answers to questions

447: posed by the system. All the other `inferences'---the symbols `43', `44', `19' etc---merely

448: represent the questions which have been answered by the yes/no symbols in New.

449:

450: \subsection{Probabilities}

451:

452: What about probabilities? In this case, the reference set of multiple alignments (which encode all

453: and only the same symbols from New as the best multiple alignment) contains only one member---the best

454: multiple alignment itself. Consequently, the relative probability of the diagnosis is 1.0. In other words, this

455: probabilistic version of the system provides an all-or-nothing answer in the same manner as the

456: non-probabilistic chart from which it was derived. Of course, this probability value, like any other, is subject to the qualifications that were noted in Section \ref{confidence_in_inferences}.
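
For comparison, the frequencies in Figure \ref{decision_tree_rules} also permit a rough estimate of how often this diagnosis would be reached before any questions have been answered. The following sketch assumes, as an interpretation which is not stated in the text, that each frequency counts the cases passing through the corresponding pattern. The conditional proportions along the path `Start yes yes no no no yes no' then multiply as:

\[
\frac{1043}{1202} \times \frac{1009}{1043} \times \frac{963}{1009} \times \frac{272}{963} \times \frac{236}{272} \times \frac{71}{236} \times \frac{16}{71}
= \frac{16}{1202} \approx 0.0133
\]

\noindent On this reading, about 1.3\% of all cases end at node 9. This differs from the relative probability of 1.0, which is conditional on the full sequence of answers supplied in New.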

457:

458: \subsection{So what?}

459:

460: Regarding the example which has been discussed, readers may object that the SP scheme is a

461: long-winded way to achieve something which is done perfectly adequately with a conventional

462: expert system or even the original flow chart on paper from which the example was derived. Has

463: anything been gained by re-casting the example in an unfamiliar form?

464:

465: In this case, the answer is ``probably not''. The main reason for including the example in

466: this chapter is to show that multiple alignment as it has been developed in the SP system

467: has a much broader scope than may, at first sight, be assumed.

468:

469: For any particular domain, a system with this kind of broad scope may not have any

470: particular advantage over a system which is dedicated to that domain and can function effectively

471: only in that domain. The benefits of using a generic system rather than different systems

472: for each area of application are described in Section \ref{simplification_of_computing_systems}.

473:

474: \subsection{Information which is incomplete}\label{information_which_is_incomplete}

475:

476: Another possible response to the ``So what?'' question, above, is that the SP system, unlike

477: most conventional systems, does not depend exclusively on input which is both complete and

478: accurate. One of the strengths of the SP system is that it can bridge gaps in information

479: supplied to it (as New) and can compensate for symbols which have been added or substituted in

480: the input, provided there are not too many.

481:

482: It is not immediately obvious why this kind of capability might be useful for a diagnostic

483: system like the example above but it is interesting to see that SP61 can produce a plausible result

484: even when parts of the input (in New) are missing or when there is addition or substitution of

485: `wrong' symbols.

486:

487: SP61 has been run in sequential processing mode with the patterns from Figure \ref{decision_tree_rules} in Old and the pattern `Start 43 yes 44 7 yes no' in New. Apart from the addition of some

488: `correct' number symbols (for a reason which is given in a moment), this pattern in New is the same

489: as in Figure \ref{decision_tree_alignment_1} except that the middle sequence of yes/no symbols (`yes no no no') is missing. By contrast with the example in Figure \ref{decision_tree_alignment_1}, it is necessary in this case to include in New some of the number symbols from the relevant patterns in Old, otherwise there is too much ambiguity about which questions are being answered by the symbols `yes' and `no'.

490:

491: Figure \ref{decision_tree_alignment_2} shows the best multiple alignment produced by SP61 with Old and New as just described. No other multiple alignment is produced which encodes all the symbols in New. The multiple alignment in this figure successfully bridges four steps in the multiple alignment where information from New is missing and

492: arrives at the same diagnostic conclusion as the multiple alignment in Figure \ref{decision_tree_alignment_1}. By bridging this gap in New, the multiple alignment has, in effect, inferred what questions should have been asked in places where relevant information is missing and what the answers to those questions should have been.
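
The bridging of the gap can be imitated, very roughly, by searching for a chain of patterns that connects the last question known before the gap to the first question known after it. The sketch below is hypothetical and is not how SP61 works internally: it uses breadth-first search rather than multiple alignment, the names `RULES' and `bridge' are invented here, and only the patterns visible in the multiple alignments of this section are included.

```python
from collections import deque

# Hypothetical sketch, not SP61. Each Old pattern `question answer next' is
# stored as (question, answer) -> next; only patterns visible in the
# alignments of this section are listed.
RULES = {
    ("43", "yes"): "44",
    ("44", "yes"): "19",
    ("19", "no"): "1",
    ("1", "no"): "3",
    ("3", "no"): "7",
    ("7", "yes"): "8",
    ("8", "no"): "9",
}


def bridge(source, target):
    """Breadth-first search for the shortest chain of patterns from source to target."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        node, steps = queue.popleft()
        if node == target:
            return steps
        for (question, answer), nxt in RULES.items():
            if question == node and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + [(question, answer, nxt)]))
    return None  # no chain of patterns connects the two symbols


# New is `Start 43 yes 44 7 yes no': nothing is supplied between `44' and `7'.
missing = bridge("44", "7")
for question, answer, nxt in missing:
    print(question, answer, nxt)
```

Each step that is returned names a question that was not represented in New together with the answer that would lead on towards the observed symbol `7'.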

493:

494: \begin{figure}[!hbt]

495: \fontsize{10.00pt}{12.00pt}

496: \centering

497: \begin{BVerbatim}

498: 0 Start 43 yes 44                     7 yes 8 no   0

499:     |   |   |  |                      |  |  | |

500: 1   |   |   |  |                      7 yes 8 |    1

501:     |   |   |  |                      |     | |

502: 2   |   |   |  |                 3 no 7     | |    2

503:     |   |   |  |                 |          | |

504: 3   |   |   |  |            1 no 3          | |    3

505:     |   |   |  |            |               | |

506: 4 Start 43  |  |            |               | |    4

507:         |   |  |            |               | |

508: 5       43 yes 44           |               | |    5

509:                |            |               | |

510: 6              44 yes 19    |               | |    6

511:                       |     |               | |

512: 7                     19 no 1               | |    7

513:                                             | |

514: 8                                           8 no 9 8

515: \end{BVerbatim}

516: \caption{The best multiple alignment produced by SP61 with the patterns from Figure \ref{decision_tree_rules} in Old and the pattern `Start 43 yes 44 7 yes no' in New.}

517: \label{decision_tree_alignment_2}

518: \end{figure}

519:

520: \subsection{Information containing additions or substitutions}

521:

522: What about adding `wrong' symbols to New or substituting `wrong' symbols for `correct'

523: symbols? With respect to the pattern in New shown in Figure \ref{decision_tree_alignment_1}, adding `no' or `yes' at any point, or substituting `no' for `yes' (or {\em vice versa}), has one of two effects. If the modified sequence still corresponds to a path through the decision network, it leads to a different conclusion which is correct in terms of the modified version of New. If the sequence of `yes' and `no' symbols does not correspond to any possible sequence in the decision network, the results are unpredictable because of the many alternative ways in which partially `correct' multiple alignments may be formed.

524:

525: The kind of `errors' that the system can cope with most easily are ones which are clearly

526: wrong such as, for example, the addition or substitution in New of symbols which do not match

527: any of the symbols in Old. Provided there are not too many of them, SP61 can bridge these kinds of

528: error in much the same way as we saw in the example of `spelling checking and correction' in

529: Section \ref{fuzzy_pattern_recognition}.

530:

531: As noted above, bridging gaps in information or by-passing inaccurate information is not

532: the kind of thing that conventional decision networks or decision trees can do. Although this may

533: be nothing more than a `party trick' at present, there may turn out to be situations where these kinds

534: of capability would be useful in a decision network or tree.%

535: \index{decision network or tree|)}

536:

537: \section{Reasoning with `rules'}\label{reasoning_with_rules_section}

538:

539: \index{reasoning!rules|(}

540:

541: The rules in a typical expert system express associations between things in the form `IF condition THEN consequence (or action)'. As we saw with the example in Section \ref{simple_pr_example}, we can express an association quite simply as a pattern like `fire smoke' without the need to make a formal distinction between the `condition' and the `consequence' or `action'. And, as we saw in Sections \ref{one_step_deductive_reasoning} and \ref{abductive_reasoning_section}, it is possible to use patterns like these quite freely in both a `forwards' and a `backwards' direction. As was noted in Section \ref{abductive_reasoning_section}, the SP system allows inferences to be drawn from patterns in a totally flexible way: any subsequence of the symbols in a pattern may function as a condition, with the complementary subsequence as the corresponding inference.

542:

543: It is easy to form a chain of inference like ``If A then B, if B then C'' from patterns like `A B' and `B C'. But if the patterns are `A B' and `C B', the relative positions of `A' and `C' in the multiple alignment are undefined and they form a `mismatch' as described in Section \ref{mismatches_section}. This means that, in the SP61 model, the multiple alignment is treated as being illegal and is discarded.

544:

545: A way round this problem which can be used with the current models is to adopt a convention that the symbols in every pattern are arranged in some arbitrary sequence, e.g., alphabetical, and to include a `framework' pattern as described in Sections \ref{ordering_of_symbols_and_patterns} and \ref{ordering_of_symptoms}. This has the effect of ensuring that every symbol type always has a column to itself. This avoids the kind of problem described above and allows patterns to be treated as if they were unordered associations of symbols.

546:

547: Figure \ref{arson_associations} shows a small set of patterns representing well-known associations (such as the association of fire with smoke or the association of black clouds with rain), together with one pattern (`1 suspect 2 7 petrol 8') representing the fact that a `suspect' person has been seen with petrol, another pattern (`4 destroy 5 11 the\_barn 12') representing the fact that `the barn' has been destroy(ed) and a framework pattern (`1 2 3 4 5 6 7 8 9 10 11 12') as mentioned above. In every pattern except the last, the alphabetic symbols are arranged in alphabetical order. Every alphabetic symbol has its own slot in the framework, e.g., `black\_clouds' has been assigned to the position between the service symbols `2' and `3'. Every alphabetic symbol is flanked by the service symbols representing its slot.

548:

549: \begin{figure}[!hbt]

550: \fontsize{10.00pt}{12.00pt}

551: \centering

552: \begin{BVerbatim}

553: 1 suspect 2 7 petrol 8 (1)

554: 4 destroy 5 11 the_barn 12 (1)

555: 5 fire 6 9 smoke 10 (500)

556: 4 destroy 5 fire 6 (100)

557: 5 fire 6 matches 7 petrol 8 (300)

558: 2 black_clouds 3 8 rain 9 (2000)

559: 2 black_clouds 3 cold 4 10 snow 11 (1000)

560: 1 2 3 4 5 6 7 8 9 10 11 12 (7000)

561: \end{BVerbatim}

562: \caption{Patterns in Old representing well-known associations, together with two `facts' (`1 suspect 2 7 petrol 8' and `4 destroy 5 11 the\_barn 12') and a `framework' pattern (`1 2 3 4 5 6 7 8 9 10 11 12') as described in the text.}

563: \label{arson_associations}

564: \end{figure}

565:

566: Figure \ref{arson_alignment} shows the best multiple alignment found by SP61 with the pattern `suspect matches smoke the\_barn' in New and the patterns from Figure \ref{arson_associations} in Old. The pattern in New may be taken to represent the key points in an allegation that a person suspected of arson has been seen with matches near `the\_barn' (which has been destroyed) and that smoke was seen at the same time.

567:

568: \begin{figure}[!hbt]

569: \fontsize{10.00pt}{12.00pt}

570: \centering

571: \begin{BVerbatim}

572: 0          1          2            3         4         5

573:

574:                       1 ---------- 1

575: suspect -------------------------- suspect

576:                       2 ---------- 2

577:                       3

578:            4 -------- 4 ------------------------------ 4

579:            destroy ----------------------------------- destroy

580:            5 -------- 5 -- 5 --------------- 5 ------- 5

581:                            fire ------------ fire ---- fire

582:                       6 -- 6 --------------- 6 ------- 6

583: matches ------------------------------------ matches

584:                       7 ---------- 7 ------- 7

585:                                    petrol -- petrol

586:                       8 ---------- 8 ------- 8

587:                       9 -- 9

588: smoke -------------------- smoke

589:                       10 - 10

590:            11 ------- 11

591: the_barn - the_barn

592:            12 ------- 12

593:

594: 0          1          2            3         4         5

595: \end{BVerbatim}

596: \caption{The best multiple alignment formed by SP61 with a pattern in New representing allegations about someone suspected of arson and patterns in Old as shown in Figure \ref{arson_associations}.}

597: \label{arson_alignment}

598: \end{figure}

599:

600: The alleged facts about the suspect do not, in themselves, show that he/she is guilty of arson. For the jury to find the suspected person guilty, they must understand the connections between the suspect, petrol, smoke and the destruction of the barn. With this example, the inferences are so simple (for people) that the prosecuting lawyer would hardly need to spell them out. But the inferences still need to be made.

601:

602: The multiple alignment shown in Figure \ref{arson_alignment} may be interpreted as a piecing together of the argument that the suspect used matches to start a fire (which was not witnessed by anyone), and that the fire explains why smoke was seen and why the barn was destroyed. Part of the argument is that the suspect was known to be in possession of petrol. Of course, in a more realistic example, there would be many other clues to the existence of a fire (e.g., charred wood) but the example, as shown, gives an indication of the way in which evidence and inferences may be connected together in the SP paradigm.%

603: \index{reasoning!rules|)}
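
The role of the framework pattern, and the way in which unmatched symbols become inferences, can be illustrated with a small Python sketch. It is a hypothetical simplification and not SP61: instead of building multiple alignments it merely collects the patterns that are linked, directly or through shared symbols, to the symbols in New, and then orders the symbols by their slots. The names `OLD', `slots' and `connected' are invented here, and frequencies are ignored.

```python
# Hypothetical sketch, not SP61. Patterns are copied from the figure of Old
# patterns (frequencies omitted); the framework pattern is represented
# implicitly by the slot numbers.
OLD = [
    "1 suspect 2 7 petrol 8",
    "4 destroy 5 11 the_barn 12",
    "5 fire 6 9 smoke 10",
    "4 destroy 5 fire 6",
    "5 fire 6 matches 7 petrol 8",
    "2 black_clouds 3 8 rain 9",
    "2 black_clouds 3 cold 4 10 snow 11",
]


def slots(pattern):
    """Map each alphabetic symbol to the service symbol on its left (its slot)."""
    tokens = pattern.split()
    return {tok: int(tokens[i - 1]) for i, tok in enumerate(tokens) if not tok.isdigit()}


def connected(new_symbols):
    """Collect the symbols of all patterns linked, directly or indirectly, to New."""
    found = set(new_symbols)
    changed = True
    while changed:
        changed = False
        for pattern in OLD:
            symbols = set(slots(pattern))
            if symbols & found and not symbols <= found:
                found |= symbols
                changed = True
    return found


new = ["suspect", "matches", "smoke", "the_barn"]
slot_of = {}
for pattern in OLD:
    slot_of.update(slots(pattern))

chain = sorted(connected(new), key=slot_of.get)  # symbols in framework order
inferred = [s for s in chain if s not in new]    # symbols not supplied in New
print(chain)
print(inferred)
```

Because every alphabetic symbol has a fixed slot, the symbols from different patterns fall into a single consistent order, and the patterns about black clouds, which share no symbol with New, are left out.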

604:

605: \section{Nonmonotonic reasoning and reasoning with default values}\label{nonmonotonic_reasoning_section}

606:

607: \index{reasoning!nonmonotonic|(}

608:

609: The concepts of {\em monotonic} and {\em nonmonotonic} reasoning are well explained by \citet{ginsberg_1994}. In brief, conventional deductive inference is {\em monotonic} because deductions made on the strength of current knowledge cannot be invalidated by new knowledge. The conclusion that ``Socrates is mortal'', deduced from ``All humans are mortal'' and ``Socrates is human'' remains true for all time, regardless of anything we learn later. By contrast, the inference that ``Tweety can probably fly'' from the propositions that ``Most birds fly'' and ``Tweety is a bird'' is {\em nonmonotonic} because it may be changed if, for example, we learn that Tweety is a penguin (unless he/she is an astonishing new kind of penguin that can fly).

610:

611: This section presents some simple examples which show how the SP system can accommodate nonmonotonic reasoning.

612:

613: \subsection{Typically, birds fly}\label{typically_birds_fly}

614:

615: In Sections \ref{one_step_deductive_reasoning} and \ref{abductive_reasoning_section}, the idea that (all) birds can fly was expressed with the pattern `Bd bird name \#name canfly warm-blooded wings feathers ... \#Bd'. This, of course, is an oversimplification of the real-world facts because, while it is true that the majority of birds fly, we know that there are also flightless birds like ostriches, penguins and kiwis.

616:

617: In order to model these facts more closely, we need to modify the pattern that describes birds to be something like this:

618:

619: \begin{center}

620: \begin{BVerbatim}

621: Bd bird name #name f #f warm-blooded wings feathers ... #Bd.

622: \end{BVerbatim}

623: \end{center}

624:

625: \noindent And, to our database of Old patterns, we need to add patterns like this:

626:

627: \begin{center}

628: \begin{BVerbatim}

629: Default Bd f canfly #f #Bd #Default

630: P penguin Bd f cannotfly #f #Bd ... #P

631: O ostrich Bd f cannotfly #f #Bd ... #O.

632: \end{BVerbatim}

633: \end{center}

634:

635: Now, the pair of symbols `f \#f' in `Bd bird name \#name f \#f warm-blooded wings feathers ... \#Bd' functions like a `variable' that may take the value `canfly' when a given class of birds can fly and `cannotfly' when a type of bird cannot fly. The pattern `P penguin Bd f cannotfly \#f \#Bd ... \#P' shows that penguins cannot fly and, likewise, the pattern `O ostrich Bd f cannotfly \#f \#Bd ... \#O' shows that ostriches cannot fly. The pattern `Default Bd f canfly \#f \#Bd \#Default', which has a substantially higher frequency than the other two patterns, represents the default value for the variable which is `canfly'. Notice that all three of these patterns contain the pair of symbols `Bd ... \#Bd' showing that the corresponding classes are all subclasses of birds.

636:

637: \subsection{Tweety is a bird so, probably, Tweety can fly}\label{tweety-flies}

638:

639: When SP61 is run with `bird Tweety' in New and the same patterns in Old as before, modified as just described, the three best multiple alignments found are those shown in Figure \ref{nonmon_figure_1}. The first multiple alignment, which has a relative probability of 0.66, confirms that Tweety is a bird and tells us that he or she can fly.

640:

641: \begin{figure}[!hbt]

642: \fontsize{05.00pt}{06.00pt}

643: \centering

644: \begin{BVerbatim}

645: 0        1        2              3

646:

647:                                  Default

648:                   Bd ----------- Bd

649: bird ------------ bird

650:          name --- name

651: Tweety - Tweety

652:          #name -- #name

653:                   f ------------ f

654:                                  canfly

655:                   #f ----------- #f

656:                   warm-blooded

657:                   wings

658:                   feathers

659:                   ...

660:                   #Bd ---------- #Bd

661:                                  #Default

662:

663: 0        1        2              3

664:

665: (a)

666:

667: 0        1        2              3

668:

669:                                  O

670:                                  ostrich

671:                   Bd ----------- Bd

672: bird ------------ bird

673:          name --- name

674: Tweety - Tweety

675:          #name -- #name

676:                   f ------------ f

677:                                  cannotfly

678:                   #f ----------- #f

679:                   warm-blooded

680:                   wings

681:                   feathers

682:                   ...

683:                   #Bd ---------- #Bd

684:                                  ...

685:                                  #O

686:

687: 0        1        2              3

688:

689: (b)

690:

691: 0        1        2              3

692:

693:                                  P

694:                                  penguin

695:                   Bd ----------- Bd

696: bird ------------ bird

697:          name --- name

698: Tweety - Tweety

699:          #name -- #name

700:                   f ------------ f

701:                                  cannotfly

702:                   #f ----------- #f

703:                   warm-blooded

704:                   wings

705:                   feathers

706:                   ...

707:                   #Bd ---------- #Bd

708:                                  ...

709:                                  #P

710:

711: 0        1        2              3

712:

713: (c)

714: \end{BVerbatim}

715: \caption{The best three multiple alignments formed by SP61 with `bird Tweety' in New and patterns in Old as described in the text. The relative probabilities of (a), (b) and (c) are 0.66, 0.22 and 0.12, respectively.}

716: \label{nonmon_figure_1}

717: \end{figure}

718:

719: The second multiple alignment, with a relative probability of 0.22, tells us that Tweety might be an ostrich and, as such, he or she would not be able to fly. Likewise, the third multiple alignment tells us that, with a relative probability of 0.12, Tweety might be a penguin and would not be able to fly. The values for probabilities in this simple example are derived from frequencies that are, almost certainly, not ornithologically correct.

720:

721: \subsection{Tweety is a penguin, so Tweety cannot fly}\label{tweety_is_penguin}

722:

723: Figure \ref{nonmon_figure_2} shows the best multiple alignment found by SP61 when it is run again, with `penguin Tweety' in New instead of `bird Tweety'. This time, there is only one multiple alignment in the reference set and its relative probability is 1.0. Correspondingly, all inferences that we can draw from this multiple alignment have a probability of 1.0. In particular, we can be confident, within the limits of the available knowledge, that Tweety cannot fly.

724:

725: \begin{figure}[!hbt]

726: \fontsize{10.00pt}{12.00pt}

727: \centering

728: \begin{BVerbatim}

729: 0         1        2              3

730:

731:                                   P

732: penguin ------------------------- penguin

733:                    Bd ----------- Bd

734:                    bird

735:           name --- name

736: Tweety -- Tweety

737:           #name -- #name

738:                    f ------------ f

739:                                   cannotfly

740:                    #f ----------- #f

741:                    warm-blooded

742:                    wings

743:                    feathers

744:                    ...

745:                    #Bd ---------- #Bd

746:                                   ...

747:                                   #P

748:

749: 0         1        2              3

750: \end{BVerbatim}

751: \caption{The best multiple alignment formed by SP61 with `penguin Tweety' in New and patterns in Old as described in the text. The relative probability of this multiple alignment is 1.0.}

752: \label{nonmon_figure_2}

753: \end{figure}
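
The nonmonotonic effect in these two examples can be mimicked with a toy Python sketch. It is hypothetical and is not SP61: it does not build multiple alignments or compute probabilities, but only shows how the symbols in New restrict which of the three subclass patterns can take part. The names `BIRD', `SUBCLASSES', `CLASS\_SYMBOLS' and `flight\_values' are invented here, and the elided parts of the patterns (shown as `...' in the text) are simply left out.

```python
# Hypothetical sketch, not SP61: it only mimics how the symbols in New
# restrict which subclass patterns can appear in the reference set.
BIRD = "Bd bird name #name f #f warm-blooded wings feathers #Bd".split()
SUBCLASSES = [
    "Default Bd f canfly #f #Bd #Default".split(),
    "P penguin Bd f cannotfly #f #Bd #P".split(),
    "O ostrich Bd f cannotfly #f #Bd #O".split(),
]
CLASS_SYMBOLS = {"bird", "penguin", "ostrich"}


def flight_values(new):
    """For each compatible subclass pattern, report the value between `f' and `#f'."""
    wanted = CLASS_SYMBOLS & set(new)
    result = {}
    for pattern in SUBCLASSES:
        # A pattern is compatible if, together with the bird pattern, it can
        # match every class symbol supplied in New.
        if wanted <= set(BIRD) | set(pattern):
            result[pattern[0]] = pattern[pattern.index("f") + 1]
    return result


print(flight_values(["bird", "Tweety"]))
print(flight_values(["penguin", "Tweety"]))
```

With `bird Tweety' all three patterns remain candidates, so `canfly' is only the most probable value; with `penguin Tweety' only the penguin pattern survives and the earlier conclusion is withdrawn.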

754:

755: \index{reasoning!nonmonotonic|)}

756:

757: \section{Explaining away `explaining away': the SP system as an alternative to Bayesian networks}\label{explaining_away}

758:

759: \index{reasoning!explaining away|(}\index{reasoning!causal|(}

760:

761: In recent years, {\em Bayesian networks} (otherwise known as {\em causal nets}, {\em influence diagrams}, {\em probabilistic networks} and other names) have become popular as a means of representing probabilistic knowledge and for probabilistic reasoning \citep[see][]{pearl_1988}.

762:

763: A Bayesian network is a directed, acyclic graph like the one shown in Figure \ref{alarm_bayesian_network} (below) where each node has zero or more `inputs' (connections with nodes that can influence the given node) and zero or more `outputs' (connections to other nodes that the given node can influence).

764:

765: Each node contains a set of conditional probability values, each one the probability of a given output value for a given input value. With this information, conditional probabilities of alternative outputs for any node may be computed for any given {\em combination} of inputs. By combining these calculations for sequences of nodes, probabilities may be propagated through the network from one or more `start' nodes to one or more `finishing' nodes.

766:

767: This section shows how the SP system provides an alternative to the Bayesian network explanation of the phenomenon of ``explaining away''.

768:

769: \subsection{A Bayesian network explanation of ``explaining away''}\label{bayesian_network}

770:

771: In the words of Judea Pearl \citeyearpar[p. 7]{pearl_1988}, the phenomenon of `explaining away' may be characterised as: ``If A implies B, C implies B, and B is true, then finding that C is true makes A {\em less} credible. In other words, finding a second explanation for an item of data makes the first explanation less credible.'' (his italics). Here is an example:

772:

773: \begin{quotation}

774:

775: ``Normally an alarm sound alerts us to the possibility of a burglary. If somebody calls you at the office and tells you that your alarm went off, you will surely rush home in a hurry, even though there could be other causes for the alarm sound. If you hear a radio announcement that there was an earthquake nearby, and if the last false alarm you recall was triggered by an earthquake, then your certainty of a burglary will diminish.'' \citep[][pp. 8-9]{pearl_1988}.

776:

777: \end{quotation}

778:

779: Although it is not normally presented as an example of nonmonotonic reasoning, this kind of effect in the way we react to new information is similar to the example we considered in Section \ref{nonmonotonic_reasoning_section} because new information has an impact on inferences that we formed on the basis of information that was available earlier.

780:

781: The causal relationships in the example just described may be captured in a Bayesian network like the one shown in Figure \ref{alarm_bayesian_network}.

782:

783: \begin{figure}[!hbt]

784: \centering

785: \includegraphics[width=0.9\textwidth]{07_PR/bayes1.ps}

786: \caption{A Bayesian network representing causal relationships discussed in the text. In this diagram, ``Phone call'' means ``a phone call about the alarm going off'' and ``Radio announcement'' means ``a radio announcement about an earthquake''.}

787: \label{alarm_bayesian_network}

788: \end{figure}

789:

790: Pearl argues that, with appropriate values for conditional probabilities, the phenomenon of ``explaining away'' can be explained in terms of this network (representing the case where there is a radio announcement of an earthquake) compared with the same network without the node for ``radio announcement'' (representing the situation where there is no radio announcement of an earthquake).

791:

792: \subsection{Representing contingencies with patterns and frequencies}\label{contingencies}

793:

794: To see how this phenomenon may be understood in terms of the SP theory, consider, first, the set of patterns shown in Figure \ref{alarm_patterns}, which are to be stored in Old. The first four patterns in the figure show events which occur together in some notional sample of the `World' together with their frequencies of occurrence in the sample.

795:

796: Like other knowledge-based systems, an SP system would normally be used with a `closed-world' assumption that, for some particular domain, the knowledge stored in the knowledge base is comprehensive. Thus, for example, a travel booking clerk using a database of all flights between cities will assume that, if no flight is shown between, say, Edinburgh and Paris, then no such flight exists. Of course, the domain may be only `flights provided by one particular airline', in which case the booking clerk would need to check databases for other airlines. In systems like Prolog, the closed-world assumption is the basis of `negation as failure': if a proposition cannot be proved with the clauses provided in a Prolog program then, in terms of that store of knowledge, the proposition is assumed to be false.

797:

798: In the present case, we shall assume that the closed-world assumption applies so that the absence of any pattern may be taken to mean that the corresponding pattern of events did not occur, at least not with a frequency greater than one would expect by chance.
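
The closed-world assumption itself is easy to express in code. The following Python sketch is purely illustrative: the flights listed and the names `FLIGHTS' and `flight\_exists' are invented for the example.

```python
# Hypothetical data: the flights listed here are invented for illustration.
FLIGHTS = {("Edinburgh", "London"), ("London", "Paris")}


def flight_exists(origin, destination):
    """Closed-world lookup: a flight that is not in the database is taken not to exist."""
    return (origin, destination) in FLIGHTS


print(flight_exists("Edinburgh", "Paris"))  # no entry, so assumed not to exist
```

The lookup never answers `unknown': the absence of an entry is itself treated as information, which is how the absence of a pattern is treated in the example that follows.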

799:

800: \begin{figure}[!hbt]

801: \fontsize{10.00pt}{12.00pt}

802: \centering

803: \begin{BVerbatim}

804: alarm phone_alarm_call (980)

805: earthquake alarm (20)

806: earthquake radio_earthquake_announcement (40)

807: burglary alarm (1000)

808: e1 earthquake e2 (40)

809: \end{BVerbatim}

810: \caption{A set of patterns to be stored in Old in an example of `explaining away'. The symbol `phone\_alarm\_call' is intended to represent a phone call conveying news that the alarm sounded; `radio\_earthquake\_announcement' represents an announcement on the radio that there has been an earthquake. The symbols `e1' and `e2' represent other contexts for `earthquake' besides the contexts `alarm' and `radio\_earthquake\_announcement'.}

811: \label{alarm_patterns}

812: \end{figure}

813:

814: The fourth pattern shows that there were 1000 occasions when there was a burglary and the alarm went off and the second pattern shows just 20 occasions when there was an earthquake and the alarm went off (presumably triggered by the earthquake). Thus we have assumed that burglaries are much more common than earthquakes. Since there is no pattern showing the simultaneous occurrence of an earthquake, burglary and alarm, we shall infer from the closed-world assumption that this constellation of events was not recorded during the sampling period.

815:

816: The first pattern shows that, out of the 1020 cases when the alarm went off, there were 980 cases where a telephone call about the alarm was made. Since there is no pattern showing telephone calls (about the alarm) in any other context, the closed-world assumption allows us to assume that there were no false positives (including hoaxes): telephone calls about the alarm when no alarm had sounded.

817:

818: Some of the frequencies shown in Figure \ref{alarm_patterns} are intended to reflect the two probabilities suggested for this example in \citet[p. 49]{pearl_1988}: ``... the [alarm] is sensitive to earthquakes and can be accidentally (P = 0.20) triggered by one. ... if an earthquake had occurred, it surely (P = 0.40) would be on the [radio] news.''

819:

820: In our example, the frequency of `earthquake alarm' is 20, the frequency of `earthquake radio\_earthquake\_announcement' is 40 and the frequency of `earthquake' in other contexts is 40. Since there is no pattern like `earthquake alarm radio\_earthquake\_announcement' or `earthquake radio\_earthquake\_announcement alarm' representing cases where an earthquake triggers the alarm and also leads to a radio announcement, we may assume that cases of that kind have not occurred. As before, this assumption is based on the closed-world assumption that the set of patterns is a reasonably comprehensive representation of non-random associations in this small world.
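
As a check on these figures, and assuming that the three patterns containing `earthquake' account for every earthquake in the sample (as the closed-world assumption suggests), the frequencies reproduce the two probabilities quoted from Pearl:

\[
P(\mbox{alarm} \mid \mbox{earthquake}) = \frac{20}{20 + 40 + 40} = 0.20, \qquad
P(\mbox{radio announcement} \mid \mbox{earthquake}) = \frac{40}{20 + 40 + 40} = 0.40 .
\]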

821:

822: The pattern at the bottom, with its frequency, shows that an earthquake has occurred on 40 occasions in contexts where the alarm did not ring and there was no radio announcement.

823:

824: \subsection{Approximating the temporal order of events}\label{temporal-order}

825:

826: In these patterns and in the multiple alignments shown below, the left-to-right order of symbols may be regarded as an approximation to the order of events in time. Thus, in the first pattern, `phone\_alarm\_call' (a phone call to say the alarm has gone off) follows `alarm' (the alarm itself); in the second pattern, `alarm' follows `earthquake' (the earthquake which, we may guess, triggered the alarm); and so on. A single dimension can only approximate the order of events in time because it cannot represent events which overlap in time or which occur simultaneously. However, this kind of approximation has little or no bearing on the points to be illustrated here.

827:

828: \subsection{Other considerations}\label{other_explaining_away_considerations}

829:

830: Other points relating to the patterns shown in Figure \ref{alarm_patterns} include:

831:

832: \begin{itemize}

833:

834: \item No attempt has been made to represent the idea that ``the last false alarm you recall was triggered by an earthquake'' \citep[][p. 9]{pearl_1988}. At some stage in the development of the SP system, there will be a need to take account of recency (see Section \ref{recency_section}).

835:

836: \item With these imaginary frequency values, it has been assumed that burglaries (with a total frequency of occurrence of 1000) are much more common than earthquakes (with a total frequency of 100). As we shall see, this difference reinforces the belief that there has been a burglary when it is known that the alarm has gone off (but without additional knowledge of an earthquake).

837:

838: \item In accordance with Pearl's example (p. 49) (but contrary to the phenomenon of looting during earthquakes), it has been assumed that earthquakes and burglaries are independent. If there was some association between them, then, in accordance with the closed-world assumption, there should be a pattern in Figure \ref{alarm_patterns} representing the association.

839:

840: \end{itemize}

841:

842: \subsection{Formation of alignments: the burglar alarm has sounded}\label{burglar_alarm_sounded}

843:

844: Receiving a phone call to say that one's house alarm has gone off may be represented by placing the symbol `phone\_alarm\_call' in New. Figure \ref{alarm_alignments_1} shows, at the top, the best multiple alignment formed by SP61 in this case with the patterns from Figure \ref{alarm_patterns} in Old. The other two multiple alignments in the reference set are shown below the best multiple alignment, in order of $CD$ value and relative probability. The actual values for $CD$ and relative probability are given in the caption to Figure \ref{alarm_alignments_1}.

845:

846: \begin{figure}[!hbt]

847: \fontsize{10.00pt}{12.00pt}

848: \centering

849: \begin{BVerbatim}

850: 0       phone_alarm_call 0

851:                |

852: 1 alarm phone_alarm_call 1

853:

854: (a)

855:

856: 0                phone_alarm_call 0

857:                         |

858: 1          alarm phone_alarm_call 1

859:              |

860: 2 burglary alarm                  2

861:

862: (b)

863:

864: 0                  phone_alarm_call 0

865:                           |

866: 1            alarm phone_alarm_call 1

867:                |

868: 2 earthquake alarm                  2

869:

870: (c)

871: \end{BVerbatim}

872: \caption{The best multiple alignment (at the top) and the other two multiple alignments in its reference set formed by SP61 with the pattern `phone\_alarm\_call' in New and the patterns from Figure \ref{alarm_patterns} in Old. In order from the top, the values for $CD$ with relative probabilities in brackets are: 19.91 (0.6563), 18.91 (0.3281), 14.52 (0.0156).}

873: \label{alarm_alignments_1}

874: \end{figure}

875:

876: The unmatched symbols in these multiple alignments represent inferences made by the system. The probabilities for these inferences which are calculated by SP61 (using the method described in Section \ref{probabilities_section}) are shown in Table \ref{symbol_probabilities_table}. These probabilities do not add up to 1 and we should not expect them to because any given multiple alignment can contain two or more of these symbols.
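
One way to read these figures, offered as a sketch on the assumption that the probability of a symbol is the sum of the relative probabilities of the multiple alignments in which it appears, is as follows. Here $p_a$, $p_b$ and $p_c$ are labels introduced for the relative probabilities of multiple alignments (a), (b) and (c) in Figure \ref{alarm_alignments_1}:
\[
\begin{array}{lcl}
P(\mbox{alarm}) & = & p_a + p_b + p_c = 0.6563 + 0.3281 + 0.0156 = 1.0, \\
P(\mbox{burglary}) & = & p_b = 0.3281, \\
P(\mbox{earthquake}) & = & p_c = 0.0156.
\end{array}
\]
Because `alarm' is counted in all three multiple alignments, the three symbol probabilities total more than 1.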

877:

878: The most probable inference is the rather trivial inference that the alarm has indeed sounded. This reflects the fact that there is no pattern in Figure \ref{alarm_patterns} representing false positives for telephone calls about the alarm. Apart from the inference that the alarm has sounded, the most probable inference (p = 0.3281) is that there has been a burglary. However, there is a distinct possibility that there has been an earthquake---but the probability in this case (p = 0.0156) is much lower than the probability of a burglary.

879:

880: \begin{table}

881: \centering

882: \begin{tabular}{ll}

883: \em Symbol & \em Probability \\

884: \\

885: alarm & 1.0 \\

886: burglary & 0.3281 \\

887: earthquake & 0.0156 \\

888: \end{tabular}

889: \caption{The probabilities of unmatched symbols, calculated by SP61 for the three multiple alignments shown in Figure \ref{alarm_alignments_1}. The probability of `phone\_alarm\_call' is 1.0 because it is supplied as a `fact' in New.}

890: \label{symbol_probabilities_table}

891: \end{table}

892:

893: These inferences and their relative probabilities seem to accord quite well with what one would naturally think following a telephone call to say that the burglar alarm at one's house has gone off (given that one was living in a part of the world where earthquakes were not vanishingly rare).

894:

895: \subsection{Formation of alignments: the burglar alarm has sounded and there is a radio announcement of an earthquake}\label{radio-announcement}

896:

897: In this example, the phenomenon of `explaining away' occurs when you learn not only that the burglar alarm has sounded but that there has been an announcement on the radio that there has been an earthquake. In terms of the SP model, the two events (the phone call about the alarm and the announcement about the earthquake) can be represented in New by a pattern like this:

898:

899: \begin{center}

900: \begin{BVerbatim}

901: phone_alarm_call radio_earthquake_announcement

902: \end{BVerbatim}

903: \end{center}

904:

905: \noindent or `radio\_earthquake\_announcement phone\_alarm\_call'. The order of the two symbols does not matter because it makes no difference to the result, except for the order in which columns appear in the best multiple alignment.

906:

907: \begin{figure}[!hbt]

908: \fontsize{09.00pt}{10.80pt}

909: \centering

910: \begin{BVerbatim}

911: 0                  phone_alarm_call radio_earthquake_announcement 0

912:                           |                       |

913: 1            alarm phone_alarm_call               |               1

914:                |                                  |

915: 2 earthquake alarm                                |               2

916:       |                                           |

917: 3 earthquake                        radio_earthquake_announcement 3

918:

919: (a)

920:

921: 0 phone_alarm_call radio_earthquake_announcement 0

922:                                  |

923: 1 earthquake       radio_earthquake_announcement 1

924:

925: (b)

926:

927: 0       phone_alarm_call radio_earthquake_announcement 0

928:                |

929: 1 alarm phone_alarm_call                               1

930:

931: (c)

932:

933: 0                phone_alarm_call radio_earthquake_announcement 0

934:                         |

935: 1          alarm phone_alarm_call                               1

936:              |

937: 2 burglary alarm                                                2

938:

939: (d)

940:

941: 0                  phone_alarm_call radio_earthquake_announcement 0

942:                           |

943: 1            alarm phone_alarm_call                               1

944:                |

945: 2 earthquake alarm                                                2

946:

947: (e)

948: \end{BVerbatim}

949: \caption{At the top, the best multiple alignment formed by SP61 with the pattern `phone\_alarm\_call radio\_earthquake\_announcement' in New and the patterns from Figure \ref{alarm_patterns} in Old. Other multiple alignments formed by SP61 are shown below. From the top, the $CD$ values are: 74.64, 54.72, 19.92, 18.92, and 14.52.}

950: \label{alarm_alignments_2}

951: \end{figure}

952:

953: In this case, there is only one multiple alignment (shown at the top of Figure \ref{alarm_alignments_2}) that can `explain' all the information in New. Since there is only this one multiple alignment in the reference set for the best multiple alignment, the associated probabilities of the inferences that can be read from the multiple alignment (`alarm' and `earthquake') are 1.0.

954:

955: These results show broadly how `explaining away' may be explained in terms of the SP theory. The main point is that the multiple alignment (or multiple alignments) providing the best `explanation' of a telephone call to say that one's burglar alarm has sounded differs from the multiple alignment (or multiple alignments) that best explains the same telephone call coupled with an announcement on the radio that there has been an earthquake. In the latter case, the best explanation is that the earthquake triggered the alarm. Other possible explanations have lower probabilities.

956:

957: \subsection{Other possible alignments}\label{other_possible_alignments}

958:

959: The foregoing account of `explaining away' in terms of the SP theory is not entirely satisfactory because it does not say enough about alternative explanations of what has been observed. This subsection tries to plug this gap. What is missing from the account of `explaining away' in the previous subsection is any consideration of such other possibilities as, for example:

960:

961: \begin{itemize}

962:

963: \item A burglary (which triggered the alarm) and, at the same time, an earthquake (which led to a radio announcement), or

964:

965: \item An earthquake that triggered the alarm and led to a radio announcement and, at the same time, a burglary that did not trigger the alarm.

966:

967: \item And many other unlikely possibilities of a similar kind.

968:

969: \end{itemize}

970:

971: Alternatives of this kind may be created by combining multiple alignments shown in Figure \ref{alarm_alignments_2} with each other, or with patterns or symbols from Old, or both these things. The two examples just mentioned are shown in Figure \ref{alarm_alignments_3}.

972:

973: \begin{figure}[!hbt]

974: \fontsize{09.00pt}{10.80pt}

975: \centering

976: \begin{BVerbatim}

977: 0                phone_alarm_call radio_earthquake_announcement 0

978:                         |                       |

979: 1          alarm phone_alarm_call               |               1

980:              |                                  |

981: 2 burglary alarm                                |               2

982:                                                 |

983: 3 earthquake                      radio_earthquake_announcement 3

984:

985: (a)

986:

987: 0                  phone_alarm_call radio_earthquake_announcement 0

988:                           |                       |

989: 1            alarm phone_alarm_call               |               1

990:                |                                  |

991: 2 earthquake alarm                                |               2

992:       |                                           |

993: 3 earthquake                        radio_earthquake_announcement 3

994:

995: 4 burglary                                                        4

996:

997: (b)

998: \end{BVerbatim}

999: \caption{Two multiple alignments discussed in the text. (a) A multiple alignment created by combining the second and fourth multiple alignments from Figure \ref{alarm_alignments_2}. $CD$ = 73.64, Absolute P = 5.5391e-5. (b) A multiple alignment created from the first multiple alignment in Figure \ref{alarm_alignments_2} and the symbol `burglary'. $CD$ = 72.57, Absolute P = 2.6384e-5.}

1000: \label{alarm_alignments_3}

1001: \end{figure}

1002:

1003: Any multiple alignment created by combining multiple alignments as just described may be evaluated in exactly the same way as the multiple alignments formed directly by SP61. $CD$s and absolute probabilities for the two example multiple alignments are shown in the caption to Figure \ref{alarm_alignments_3}.

1004:

1005: Given the existence of multiple alignments like those shown in Figure \ref{alarm_alignments_3}, values for relative probabilities of multiple alignments will change. The best multiple alignment from Figure \ref{alarm_alignments_2} and the two multiple alignments from Figure \ref{alarm_alignments_3} constitute a reference set because they all `encode' the same symbols from New. However, there are probably several other multiple alignments that one could construct that would belong in the same reference set.

1006:

1007: Given a reference set containing the first multiple alignment in Figure \ref{alarm_alignments_2} and the two multiple alignments in Figure \ref{alarm_alignments_3}, values for relative probabilities are shown in Table \ref{absolute_and_relative_probabilities}, together with the absolute probabilities from which they were derived. Whichever measure is used, the multiple alignment which was originally judged to represent the best interpretation of the available facts has not been dislodged from this position.

1008:

1009: \begin{table}

1010: \centering

1011: \begin{tabular}{lll}

1012: \em Alignment & \em Absolute & \em Relative \\

1013:  & \em probability & \em probability \\

1014: \\

1015: (a) in Figure \ref{alarm_alignments_2} & 1.1052e-4 & 0.5747 \\

1016: (a) in Figure \ref{alarm_alignments_3} & 5.5391e-5 & 0.2881 \\

1017: (b) in Figure \ref{alarm_alignments_3} & 2.6384e-5 & 0.1372 \\

1018: \end{tabular}

1019: \caption{\small Values for absolute and relative probability for the best multiple alignment in Figure \ref{alarm_alignments_2} and the two multiple alignments in Figure \ref{alarm_alignments_3}.}

1020: \label{absolute_and_relative_probabilities}

1021: \end{table}
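
The relative probabilities for this reference set can be checked by normalising the absolute probabilities. As a worked sketch, with $P_j$ denoting (a label introduced here) the absolute probability of the $j$th multiple alignment in Table \ref{absolute_and_relative_probabilities}:
\[
\sum_j P_j = (1.1052 + 0.55391 + 0.26384) \times 10^{-4} = 1.92295 \times 10^{-4},
\]
\[
\frac{1.1052}{1.92295} \approx 0.5747, \qquad
\frac{0.55391}{1.92295} \approx 0.2881, \qquad
\frac{0.26384}{1.92295} \approx 0.1372 .
\]
This calculation assumes that the reference set contains only these three multiple alignments. As noted above, others could probably be constructed, and including them would lower all three values.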

1022:

1023: \index{reasoning!explaining away|)}

1024:

1025: \section{Causal diagnosis}\label{causal_diagnosis_section}

1026:

1027: \index{diagnosis!causal|(}\index{diagnosis!fault finding|(}

1028:

1029: As we saw in Section \ref{medical_diagnosis_section}, medical diagnosis may be viewed as a process of pattern recognition but the diagnostic process may also involve reasoning about the causes of a patient's symptoms (Section \ref{medical_causal_reasoning}). Causal reasoning is, perhaps, even more prominent in the process of diagnosing faults in artificial systems (cars, televisions etc), probably because these systems are simpler than the human body and better understood.

1030:

1031: In this section, we consider a simple example of fault diagnosis in an electronic circuit---described by \citet[pp. 263--272]{pearl_1988}. Figure \ref{electronic_circuit_figure} shows the circuit with inputs on the left, outputs on the right and, in between, three multipliers ($M_1$, $M_2$, and $M_3$) and two adders ($M_4$ and $M_5$). For the given inputs on the left, it is clear that output F is false and output G is correct.

1032:

1033: \begin{figure}[!hbt]

1034: \centering

1035: \includegraphics[width=0.9\textwidth]{07_PR/circuit_1.ps}

1036: \caption{An electronic circuit containing three multipliers, $M_1$, $M_2$, and $M_3$, and two adders, $M_4$ and $M_5$ (Redrawn from \citet[p. 263]{pearl_1988}).}

1037: \label{electronic_circuit_figure}

1038: \end{figure}

1039:

1040: Figure \ref{electronic_circuit_network} shows a causal network derived from the electronic circuit in Figure \ref{electronic_circuit_figure} (from \citet[p. 264]{pearl_1988}). In this diagram, the nodes $X$, $Y$, $Z$, $F$ and $G$ represent the outputs of components $M_1$, $M_2$, $M_3$, $M_4$ and $M_5$, respectively. In each case, there are three causal influences on the output: the two inputs to the component and the state of the component which may be `good' or `bad'. These influences are shown by lines with arrows connecting the source of the influence to the target node. Thus, for example, the two inputs of component $M_1$ are represented by $A$ and $C$ in Figure \ref{electronic_circuit_network}, the good or bad state of component $M_1$ is represented by the node labelled $M_1$, and their causal influences on node $X$ are shown by the three arrows pointing at that node.

1041:

1042: \begin{figure}[!hbt]

1043: \centering

1044: \includegraphics[width=0.9\textwidth]{07_PR/circuit_2.ps}

1045: \caption{A causal network derived from the electronic circuit in Figure \ref{electronic_circuit_figure} (Redrawn from \citet[p. 264]{pearl_1988}).}

1046: \label{electronic_circuit_network}

1047: \end{figure}

1048:

1049: Given a causal analysis like this, and given appropriate information about conditional probabilities, it is possible to derive one or more alternative diagnoses of which components are good and which are bad. In Pearl's example, it is assumed that components of the same type have the same prior probability of failure and that the probability of failure of multipliers is greater than for adders. Given these and some subsidiary assumptions together with the inputs and outputs (but not the intermediate values) shown in Figure \ref{electronic_circuit_figure}, the best diagnosis derived from the causal network is that the $M_1$ component is bad and the second best diagnosis is that $M_4$ is bad. Pearl indicates that some third-best interpretations may be retrievable (e.g., $M_2$ and $M_5$ are bad) ``... but in general, it is not guaranteed that interpretations beyond the second-best will be retrievable.'' (p. 272).

1050:

1051: \subsection{An SP approach to causal diagnosis}\label{sp_causal_diagnosis}

1052:

1053: The main elements of the SP analysis presented here are as follows:

1054:

1055: \begin{itemize}

1056:

1057: \item The input-output relations of any component may be represented as a set of patterns, each one with a measured or estimated frequency of occurrence.

1058:

1059: \item With suitable extensions, these patterns may serve to transfer the output of one component to the input of another.

1060:

1061: \item A framework pattern (as described in Section \ref{ordering_of_symbols_and_patterns} and elsewhere) is needed to ensure that appropriate multiple alignments can be built.

1062:

1063: \end{itemize}

1064:

1065: Figure \ref{sp_causal_diagnosis_patterns} shows a set of patterns for the circuit shown in Figure \ref{electronic_circuit_figure}. In the figure, the patterns that start with the symbol `M1' represent I/O relations for component $M_1$, those that start with `M2' represent I/O relations for the $M_2$ component and likewise for the other patterns except the last one (starting with the symbol `frame') which is the framework pattern mentioned above. For each initial symbol there is a corresponding `terminating' symbol with an initial `\#' character. For reasons explained shortly, there may be other symbols following the `terminating' symbol.

1066:

1067: \begin{figure}[!hbt]

1068: \fontsize{10.00pt}{12.00pt}

1069: \centering

1070: \begin{BVerbatim}

1071: M1 M1GOOD TM1I1 TM1I2 TM1O #M1 TM4I2 (500000)

1072: M1 M1BAD TM1I1 TM1I2 TM1O #M1 TM4I2 (4)

1073: M1 M1BAD TM1I1 TM1I2 FM1O #M1 FM4I2 (96)

1074: M2 M2GOOD TM2I1 TM2I2 TM2O #M2 TM4I1 TM5I2 (500000)

1075: M2 M2BAD TM2I1 TM2I2 TM2O #M2 TM4I1 TM5I2 (4)

1076: M2 M2BAD TM2I1 TM2I2 FM2O #M2 FM4I1 FM5I2 (96)

1077: M3 M3GOOD TM3I1 TM3I2 TM3O #M3 TM5I1 (500000)

1078: M3 M3BAD TM3I1 TM3I2 TM3O #M3 TM5I1 (4)

1079: M3 M3BAD TM3I1 TM3I2 FM3O #M3 FM5I1 (96)

1080: M4 M4GOOD TM4I1 TM4I2 TM4O #M4 (250000)

1081: M4 M4GOOD TM4I1 FM4I2 FM4O #M4 (250000)

1082: M4 M4GOOD FM4I1 TM4I2 FM4O #M4 (250000)

1083: M4 M4GOOD FM4I1 FM4I2 FM4O #M4 (250000)

1084: M4 M4BAD TM4I1 TM4I2 FM4O #M4 (24)

1085: M4 M4BAD TM4I1 FM4I2 FM4O #M4 (24)

1086: M4 M4BAD FM4I1 TM4I2 FM4O #M4 (24)

1087: M4 M4BAD FM4I1 FM4I2 FM4O #M4 (24)

1088: M4 M4BAD TM4I1 TM4I2 TM4O #M4 (1)

1089: M4 M4BAD TM4I1 FM4I2 TM4O #M4 (1)

1090: M4 M4BAD FM4I1 TM4I2 TM4O #M4 (1)

1091: M4 M4BAD FM4I1 FM4I2 TM4O #M4 (1)

1092: M5 M5GOOD TM5I1 TM5I2 TM5O #M5 (250000)

1093: M5 M5GOOD TM5I1 FM5I2 FM5O #M5 (250000)

1094: M5 M5GOOD FM5I1 TM5I2 FM5O #M5 (250000)

1095: M5 M5GOOD FM5I1 FM5I2 FM5O #M5 (250000)

1096: M5 M5BAD TM5I1 TM5I2 FM5O #M5 (24)

1097: M5 M5BAD TM5I1 FM5I2 FM5O #M5 (24)

1098: M5 M5BAD FM5I1 TM5I2 FM5O #M5 (24)

1099: M5 M5BAD FM5I1 FM5I2 FM5O #M5 (24)

1100: M5 M5BAD TM5I1 TM5I2 TM5O #M5 (1)

1101: M5 M5BAD TM5I1 FM5I2 TM5O #M5 (1)

1102: M5 M5BAD FM5I1 TM5I2 TM5O #M5 (1)

1103: M5 M5BAD FM5I1 FM5I2 TM5O #M5 (1)

1104: frame M1 #M1 M2 #M2 M3 #M3 M4 #M4 M5 #M5 #frame (1)

1105: \end{BVerbatim}

1106: \caption{A set of SP patterns modelling I/O relations in the electronic circuit shown in Figure \ref{electronic_circuit_figure}. They were supplied as Old patterns to SP61 for the building of  the multiple alignment shown in Figure \ref{sp_causal_diagnosis_alignment}. {\em Key}: T = true (information is correct); F = false (information is incorrect); M1, M2, M3, M4, M5 = components of the circuit; GOOD, BAD indicates whether a component is good or bad; I1, I2 = First and second inputs of a component; O = Output of a component.}

1107: \label{sp_causal_diagnosis_patterns}

1108: \end{figure}

1109:

1110: Let us now consider the first pattern in the figure (`M1 M1GOOD TM1I1 TM1I2 TM1O \#M1 TM4I2')

1111:  representing I/O relations for component $M_1$ when that component is good, as indicated by the symbol `M1GOOD'. In this pattern, the symbols `TM1I1', `TM1I2' and `TM1O' represent the two inputs and the output of the component, `\#M1' is the terminating symbol, and `TM4I2' serves to transfer the output of $M_1$ to the second input of component $M_4$ as will be explained. In a symbol like `TM1I1', `T' indicates that the input is true, `M1' identifies the component, and `I1' indicates that this is the first input of the component. Other symbols may be interpreted in a similar way, following the key given in the caption of Figure \ref{sp_causal_diagnosis_patterns}. In effect, this pattern says that, when the component is working correctly, true inputs yield a true output. The pattern has a relatively high frequency of occurrence (500000) reflecting the idea that the component will normally work correctly.

1112:

1113: The other two patterns for component $M_1$ (`M1 M1BAD TM1I1 TM1I2 TM1O \#M1 TM4I2' and

1114: `M1 M1BAD TM1I1 TM1I2 FM1O \#M1 FM4I2') describe I/O relations when the component is bad. The first one describes the situation where true inputs to a faulty component yield a true result, a possibility noted by Pearl (p. 265). The second pattern---with a higher frequency---describes the more usual situation where true inputs to a faulty component yield a false result. Both these bad patterns have much lower frequencies than the good pattern.

1115:

1116: The other patterns in Figure \ref{sp_causal_diagnosis_patterns} may be interpreted in a similar way. Components $M_1$, $M_2$ and $M_3$ have only three patterns each because it is assumed that inputs to the circuit will always be true so it is not necessary to include patterns describing what happens when one or both of the inputs are false. By contrast, there are 4 good  patterns and 8 bad patterns for each of $M_4$ and $M_5$ because either of these components may receive faulty input.

1117:

1118: For each of the five components, the frequencies of the bad patterns sum to 100. However, for each of the components $M_1$, $M_2$, and $M_3$, the total frequency of the good patterns is 500,000 compared with 1,000,000 for the set of good patterns associated with each of the components $M_4$ and $M_5$. These figures accord with the assumptions in Pearl's example that components of the same type have the same probability of failure and that the probability of failure of multipliers ($M_1$, $M_2$, and $M_3$) is greater than the probability of failure of adders ($M_4$ and $M_5$).
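
As a rough illustration, and assuming that these frequencies may be read directly as relative frequencies (the calculations made by SP61, described in Section \ref{probabilities_section}, may differ in detail), the prior probabilities of failure implied by Figure \ref{sp_causal_diagnosis_patterns} are approximately:
\[
P(\mbox{multiplier bad}) = \frac{100}{500000 + 100} \approx 2.0 \times 10^{-4}, \qquad
P(\mbox{adder bad}) = \frac{100}{1000000 + 100} \approx 1.0 \times 10^{-4},
\]
so that, on this reading, a multiplier is about twice as likely to be faulty as an adder.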

1119:

1120: \subsection{Multiple alignments in causal diagnosis}

1121:

1122: Given appropriate patterns, SP61 constructs multiple alignments from which diagnoses may be obtained. Figure \ref{sp_causal_diagnosis_alignment} shows the best multiple alignment created by SP61 with the Old patterns shown in Figure \ref{sp_causal_diagnosis_patterns} and `TM1I1 TM1I2 TM2I1 TM2I2 TM3I1 TM3I2 FM4O TM5O' as the New pattern. The first six symbols in this pattern express the idea that all the inputs for components $M_1$, $M_2$ and $M_3$ are true. The penultimate symbol (`FM4O') shows that the output of $M_4$ is false and the last symbol (`TM5O') shows that the output of $M_5$ is true---in accordance with the outputs shown in Figure \ref{electronic_circuit_figure}.

1123:

1124: \begin{figure}[!hbt]

1125: \fontsize{10.00pt}{12.00pt}

1126: \centering

1127: \begin{BVerbatim}

1128: 0       1        2        3        4        5       6

1129:

1130:                  frame

1131:                  M1 ----------------------- M1

1132:                                             M1BAD

1133: TM1I1 ------------------------------------- TM1I1

1134: TM1I2 ------------------------------------- TM1I2

1135:                                             FM1O

1136:                  #M1 ---------------------- #M1

1137:                  M2 ------------------------------- M2

1138:                                                     M2GOOD

1139: TM2I1 --------------------------------------------- TM2I1

1140: TM2I2 --------------------------------------------- TM2I2

1141:                                                     TM2O

1142:                  #M2 ------------------------------ #M2

1143:                  M3 ----- M3

1144:                           M3GOOD

1145: TM3I1 ------------------- TM3I1

1146: TM3I2 ------------------- TM3I2

1147:                           TM3O

1148:                  #M3 ---- #M3

1149:                  M4 -------------- M4

1150:                                    M4GOOD

1151:                                    TM4I1 ---------- TM4I1

1152:                                    FM4I2 -- FM4I2

1153: FM4O ----------------------------- FM4O

1154:                  #M4 ------------- #M4

1155:         M5 ----- M5

1156:         M5GOOD

1157:         TM5I1 ----------- TM5I1

1158:         TM5I2 ------------------------------------- TM5I2

1159: TM5O -- TM5O

1160:         #M5 ---- #M5

1161:                  #frame

1162:

1163: 0       1        2        3        4        5       6

1164: \end{BVerbatim}

1165: \caption{The best multiple alignment found by SP61 with `TM1I1 TM1I2 TM2I1 TM2I2 TM3I1 TM3I2 FM4O TM5O' in New and the patterns shown in Figure \ref{sp_causal_diagnosis_patterns} in Old.}

1166: \label{sp_causal_diagnosis_alignment}

1167: \end{figure}

1168:

1169: From the multiple alignment in Figure \ref{sp_causal_diagnosis_alignment} it can be inferred that component $M_1$ is bad and all the other components are good. A total of seven alternative diagnoses can be derived from those multiple alignments created by SP61 that encode all the symbols in New. These diagnoses are shown in Table \ref{circuit_diagnoses_and_probabilities}, each with its relative probability.

1170:

1171: \begin{table}

1172: \centering

1173: \begin{tabular}{ll}

1174: \em Bad Component(s) & \em Relative Probability \\

1175: \\

1176: M1 & 0.6664 \\

1177: M4 & 0.3332 \\

1178: M1, M3 & 0.00013 \\

1179: M1, M2 & 0.00013 \\

1180: M1, M4 & 6.664e-5 \\

1181: M3, M4 & 6.664e-5 \\

1182: M1, M2, M3 & 2.666e-8 \\

1183: \end{tabular}

1184: \caption{Seven alternative diagnoses of faults in the circuit shown in Figure \ref{electronic_circuit_figure}, derived from multiple alignments created by SP61 with `TM1I1 TM1I2 TM2I1 TM2I2 TM3I1 TM3I2 FM4O TM5O' in New and the patterns from Figure \ref{sp_causal_diagnosis_patterns} in Old. The relative probability of each diagnosis is shown in the second column.}

1185: \label{circuit_diagnoses_and_probabilities}

1186: \end{table}
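
Where the state of an individual component is of interest, the figures in Table \ref{circuit_diagnoses_and_probabilities} can be combined. As a sketch, assuming that the probability that a component is bad is the sum of the relative probabilities of the diagnoses in which it is listed:
\[
P(M_1 \mbox{ bad}) \approx 0.6664 + 2(0.00013) + 6.664 \times 10^{-5} + 2.666 \times 10^{-8} \approx 0.6667,
\]
\[
P(M_4 \mbox{ bad}) \approx 0.3332 + 2(6.664 \times 10^{-5}) \approx 0.3333 .
\]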

1187:

1188: It is interesting to see that the best diagnosis derived by SP61 ($M_1$ is bad) and the second best diagnosis ($M_4$ is bad) are in accordance with the first two diagnoses obtained by Pearl's method. The remaining five diagnoses derived by SP61 are different from the one obtained by Pearl's method ($M_2$ and $M_5$ are bad) but this is not altogether surprising because detailed figures are different from Pearl's example and there are differences in the assumptions that have been made.%

1189: \index{diagnosis!causal|)}\index{diagnosis!fault finding|)}\index{reasoning!causal|)}

1190:

1191: \section{Reasoning which is not supported by evidence}

1192:

1193: In Section \ref{reasoning_and_inference_section}, `reasoning', including `probabilistic reasoning', was characterised as a process of ``going beyond the information given''. All the examples of reasoning in the SP framework that we have considered thus far have exhibited this feature in the form of symbols and sequences of symbols from Old that are not matched to anything in New.

1194:

1195: What happens if there is little or no information in New or if the process of `reasoning' goes

1196: so far beyond the information in New that alternative lines of reasoning lose their support?

1197: Something like this seems to happen in ordinary thinking when, in considering alternative

1198: scenarios in the future, we think and sometimes worry about possibilities which are very unlikely

1199: to occur (winning the lottery and what we might do with the `loot', falling under the proverbial bus, etc). Hence the patronising advice, ``Don't worry, it may never happen!'' or, a little more helpfully, ``Let's climb that mountain [or `cross that bridge'] when we get to it''---the problem may never arise.

1200:

1201: Figure \ref{start_alignments} shows the first seven multiple alignments formed by SP61 with the patterns from Figure \ref{decision_tree_rules} in Old and only the symbol `Start' in New. If it were allowed to proceed without any check that the multiple alignments formed are actually or potentially useful, the program would carry on creating multiple alignments corresponding to the large number of possible multiple alignments which are implicit in the patterns in Old.

1202:

1203: \begin{figure}[!hbt]

1204: \fontsize{08.00pt}{09.60pt}

1205: \centering

1206: \begin{BVerbatim}

1207: 0 Start    0                         0 Start           0

1208:     |                                    |

1209: 1 Start 43 1                         1 Start 43        1

1210:                                              |

1211:                                      2       43 yes 44 2

1212:

1213: (a)                                  (b)

1214:

1215:

1216: 0 Start          0                   0 Start                  0

1217:     |                                    |

1218: 1 Start 43       1                   1 Start 43               1

1219:         |                                    |

1220: 2       43 no 45 2                   2       43 yes 44        2

1221:                                                     |

1222:                                      3              44 yes 19 3

1223:

1224: (c)                                  (d)

1225:

1226: 0 Start                 0            0 Start                 0

1227:     |                                    |

1228: 1 Start 43              1            1 Start 43              1

1229:         |                                    |

1230: 2       43 no 45        2            2       43 yes 44       2

1231:               |                                     |

1232: 3             45 yes 46 3            3              44 no 51 3

1233:

1234: (e)                                  (f)

1235:

1236: 0 Start                0

1237:     |

1238: 1 Start 43             1

1239:         |

1240: 2       43 no 45       2

1241:               |

1242: 3             45 no 47 3

1243:

1244: (g)

1245: \end{BVerbatim}

1246: \caption{Alignments formed by SP61 with the patterns from Figure \ref{decision_tree_rules} in Old and only the symbol `Start' in New. SP61 was set to stop searching when two `unsupported' inferential steps had been made. Without this check, many more multiple alignments would be formed.}

1247: \label{start_alignments}

1248: \end{figure}

1249:

1250: Out of all these many multiple alignments, the first one (at the top of Figure \ref{start_alignments}) has the highest $CD$ because it encodes New completely and contains the fewest patterns from Old. Amongst the other multiple alignments, $CD$ values decrease as the number of patterns from Old in each multiple alignment increases.

1251:

1252: \subsection{Escaping from `local peaks' in the search space}

1253:

1254: Is there any use for this kind of `reasoning' without supporting evidence? In many cases,

1255: ``no'', but in any kind of problem where there are `local peaks' in the search space (if we regard heuristic search as a form of `hill climbing'), the program must be able to explore regions of the search space which are sub-optimal from a local perspective but which may provide the means of escaping from a local peak and finding a result elsewhere which is better from a broader perspective (a higher `peak').

1256:

1257: The example described in Section \ref{information_which_is_incomplete} (Figure \ref{decision_tree_alignment_2}) illustrates this kind of reasoning. Compared with the example in Figure \ref{decision_tree_alignment_1}, this example contains a gap in New because the substring `yes no

1258: no no' is missing. Working left-to-right in sequential mode, SP61 is able to bridge this gap in

1259: New because, after it has found a good multiple alignment for the first part of New, it is able to continue forming relatively poor multiple alignments until it finds one which bridges the gap and allows the right-hand section of New to be included. The multiple alignment shown in Figure \ref{decision_tree_alignment_2} has a higher $CD$ than any of the earlier multiple alignments, including the best of the multiple alignments for the left-hand section of New.

1260:

1261: \section{Conclusion}

1262:

1263: In this chapter we have seen how probabilistic inferences can be drawn from any multiple alignment which contains one or more Old symbols that are not matched to any symbol in New. Probabilities can be derived using the method described in Section \ref{probabilities_section}. The versatility of the SP system has been seen in examples showing how the system can model one-step `deductive' reasoning, abductive reasoning, reasoning with probabilistic networks and trees, reasoning with `rules', and nonmonotonic reasoning with default values. The system provides an alternative to causal networks in modelling the phenomenon of `explaining away' and in the diagnosis of faults in electronic circuits and similar systems. The system can also model kinds of conceptual exploration that are not constrained by empirical evidence.

1264:

1265: It is in the nature of the SP system that it blurs many distinctions that are prevalent in computing and artificial intelligence. In particular, there is no clear boundary in the SP scheme between fuzzy pattern recognition, information retrieval and probabilistic reasoning. As we see in chapters that follow, much the same can be said about the boundary between this recognition-retrieval-and-reasoning amalgam and other areas of artificial intelligence.%

1266: \index{reasoning!probabilistic|)}

1267: