0211:cs0211035/wip.tex

1: \documentclass{amsart}

2: \usepackage{diagrams}

3:  \newtheorem{thm}{Theorem}[subsection]

4:  \newtheorem{cor}[thm]{Corollary}

5:  \newtheorem{lem}[thm]{Lemma}

6:  \newtheorem{prop}[thm]{Proposition}

7:  \theoremstyle{definition}

8:  \newtheorem{defn}[thm]{Definition}

9:  \theoremstyle{remark}

10:  \newtheorem{rem}[thm]{Remark}

11:

12: \begin{document}

13: \title[Monadic Control Constructs for Inference]

14:  {Monadic Style Control Constructs \\for Inference Systems}

15:

16: \author{ Jean-Marie Chauvet }

17:

18: \address{Dassault D\'{e}veloppement, Paris, France}

19:

20: \email{jmc@neurondata.org}

21:

22: \subjclass{Primary 68Q55, 18C50; Secondary 18C15, 18C20}

23:

24: \keywords{categorical semantics, triples, monads, continuation,

25: computation, control, inference systems}

26:

27: \date{ August 4, 2002.}

28:

29: \dedicatory{}

30:

31:

32: %%% ----------------------------------------------------------------------

33:

34: \begin{abstract}

35: Recent advances in programming languages study and design have

36: established a standard way of grounding computational systems

37: representation in category theory. These formal results led to a

38: better understanding of issues of control and side-effects in functional and imperative

39: languages. Another benefit is a better way of modelling computational effects in

40: logical frameworks. With this analogy in mind, we embark on an

41: investigation of inference systems based on considering inference

42: behaviour as a form of computation. We delineate a categorical

43: formalisation of control constructs in inference systems. This

44: representation emphasises the parallel between the modular articulation of the

45: categorical building blocks (triples) used to account for the inference

46: architecture and the modular composition of cognitive processes.

47: \end{abstract}

48:

49: %%% ----------------------------------------------------------------------

50: \maketitle

51: %%% ----------------------------------------------------------------------

52:

53: \section{Introduction}

54: Originally grounded in research work in Artificial

55: Intelligence (AI) and, more specifically in AI in medicine and

56: production systems \cite{Rappaport_1984_77}, \emph{Nexpert Object}

57: became a commercial success in the late eighties and

58: nineties. Its implementation architecture broke new ground in

59: graphical user interfaces, in portability across operating

60: environments and in distributed object systems. But powerful

61: innovations also went deep into the design of the inference system

62: itself (NXP), which can be traced back to that seminal research

63: work in AI.

64:

65: In this and a companion paper we offer a formal account of this

66:  inference systems, using recent theoretical

67: advances in the understanding of programming language design.  In

68: particular, some of the computational effects introduced by the NXP

69: design show a denotational semantics best captured with the

70: current tools in \emph{continuation-passing} or

71: \emph{monadic} styles.

72:

73:

74: \emph{Continuations.} The idea of transforming programs to

75: continuation-passing style (CPS) appeared in the mid-sixties

76: \cite{sabry92reasoning}. The transformation was formally codified

77: by Fischer and Reynolds in 1972, yielding a standard CPS

78: representation of call-by-value lambda calculus. In the context of

79: denotational semantics \cite{Stoy}, described as the theory of

80: meaning for programs, denotations are usually built with the help

81: of functions in some mathematical structures. In CPS these

82: functions are explicitly passed an additional argument $\kappa$

83: representing "the rest of the computation", a first step towards

84: full reification of the notion of computation itself. Although

85: most of the work on continuations involves a functional language,

86: usually a fragment of typed or untyped lambda calculus, CPS is

87: also useful in understanding imperative languages. Continuations

88: were found to be a major tool for the design of interpreters

89: and compilers for many languages, most prominently Scheme, ML and Haskell.

90:

91: As such, continuations appear as the raw material of control.

92: Operations on continuations control of the

93: unfolding of a computation---that this translation happens in a standard way

94: is a major result of the theoretical work on continuations.

95: Similarly, in inference systems, of which production systems are a

96: well-studied example \cite{PDIS}, designing a proper behaviour

97: relies on the delicate interweaving of \emph{goal-driven} and

98: \emph{data-driven} control. Moreover the importance of

99: the relative data-driven nature of the flow of control sets inference

100: systems apart from both functional and imperative

101: programming languages. In inference systems, both side-effects and

102: procedure calls are mediated through operations on data.

103: This paper shows the relevance

104: of the CPS representation for analysis and assessment of inference

105: systems.

106:

107:

108:

109: \emph{Monads and triples.} Originally introduced by Moggi \cite{

110: moggi91notions} in computer science, monads or triples, a notion

111: from category theory \cite{ttt}, were shown to generalise the

112: continuation-passing style transformation. Monads can model a wide

113: variety of features, including continuations, state, exceptions,

114: input-output, non-determinism, and parallelism \cite{

115: wadler93monads,wadler94monads}. The \emph{monadic style} essentially

116: distinguishes between values and computations ; it sets up a

117: uniform infrastructure for representing and manipulating

118: computations with effects as first-class objects \cite{

119: filinski94representing}.  More specifically, monads introduce a

120: type constructor $T$ through which the type $T \tau$ represents a

121: computation that yields a result of type $\tau$ and may have

122: side-effects. Cast in this background, continuations arise as a

123: special case of monad translation.

124:

125:

126:

127: \emph{Inference systems.} Inference systems are characterised by a

128: program organization based on data or event-driven operations.

129: Such an organisation can be described as \emph{pattern directed}:

130: patterns occurring in the data select pieces of code in the system

131: to be activated. Systems organised this way have been used

132: primarily for deductive inference \cite{PDIS}, and, to a lesser

133: extent, for inductive inference or learning

134: \cite{Rappaport_1984_77}. An inference system has three

135: components: a collection of substructures, called pattern-directed

136: modules, which can be activated or fired by patterns in the data;

137: one or more data structures that may be examined and modified by

138: the pattern-directed modules; and an executive, or interpreter,

139: that controls the selection and activation of the modules. In

140: traditional production systems the pattern-directed modules are

141: called \emph{production rules} and reside in \emph{production

142: memory} while the data structures are stored in the \emph{working

143: memory}. The interpreter uses various strategies in a

144: \emph{recognize-act cycle} to select candidate production rules

145:  and execute their action part.

146:

147: In the \emph{NXP architecture}, this mechanism is supplemented by

148: a goal driven process. Informally, goals are investigated by

149: actively looking for patterns in data, a search usually called

150: \emph{backward chaining} as it involves a recursive descent in the

151: generalized and-or tree representing linked production rules. Such

152: goals are confirmed, rejected or considered ``not known'' as the

153: data collected during the recursive descent trigger some of the

154: patterns of the production rules, a process dually called

155: \emph{forward chaining}. In addition goals are said to be

156: \emph{evoked} when, hitherto uninvestigated, they also depend on

157: data collected during the backward chaining descent from initial

158: goals. Eventually every goal sharing a triggered data pattern with

159: an initial goal ends up evoked. These evoked goals are queued for

160: later evaluation once the current agenda of goal investigation or

161: data collection completes, their investigation is suspended or

162: \emph{postponed} until the current one terminates. Considering

163: backward and forward chainings in NXP as computations in the

164: previously mentioned sense, goal evocation can be viewed as an

165: \emph{effect}, comparable, for instance, to the {\tt escape}

166: construct of functional programming languages. Following sections

167: of this paper show how to capture backward and forward chainings

168: as computations and account for goal evocation in

169: continuation-passing and monadic styles.

170:

171:

172: \emph{Logical frameworks.} Most of the work we are presenting in

173: this and its companion paper has been designed for validation and

174: checking with Twelf \cite{ pfenning99}, a logical framework

175: developed at Carnegie-Mellon University. In a logical framework

176: such as this, a deductive system is used as a device for

177: establishing semantic properties of the mathematical or

178: computational object under study. Following Martin-L\"{o}f

179: introduction of \emph{judgements} as the foundation for

180: constructive mathematics and computer science, logical frameworks

181: represent judgements as types enabling proof-theoretic derivations

182: by type reconstruction. Logical frameworks provide a uniform

183: \emph{judgements-as-types} representation of proofs of semantic

184: properties. In this paper we use the Twelf deductive system

185: to investigate the semantic properties of computations with

186: effects in the NXP inference system (which is itself a deductive

187: system at a different level of abstraction). Obviously terms such

188: as \emph{type}, \emph{deduction} and \emph{inference} have

189: overloaded meanings and we will specify precisely in which context

190: or system we use the latter when there is a risk of confusion.

191:

192: The rest of this paper is organised in three sections. The first one

193: explores a continuation-passing style representation

194: of the NXP computational architecture. The second presents

195: generalisation to a  monadic style

196: description of various effects in the NXP architecture. The last

197: section uses the monadic results of the previous exploration to

198: suggest an abstract implementation of the NXP architecture in the

199: form of the \emph{NXP abstract machine} with implications for

200: building theoretically sound NXP interpreters or compilers.

201:

202: \section{Representing inference control with continuations}

203:

204: \subsection{Terminology}

205: In the following we will consider simple languages representing

206: various aspects of inference systems, and, more specifically of

207: the NXP architecture. Such languages consist of a syntax, defining

208: the set of well-formed types and well-typed terms, and a

209: semantics, assigning some notion of meaning to terms. We expect a

210: semantics to provide a notion of \emph{program evaluation} or

211: \emph{inference derivation}, usually represented as a (partial)

212: function from a suitable set of terms to some set of observable

213: results, say boolean values.

214:

215: In \emph{denotational semantics} the semantics is expressed as

216: functions mapping syntactic constructs in the program to

217: mathematical objects in a mathematical domain. Analyses of

218: programs are actually conducted in the mathematical domain, in

219: which properties assessed about denotations of programs translate

220: to properties about programs. In contrast, the \emph{operational

221: semantics} defines an abstract machine with a state, possibly

222: several components, and some set of primitive instructions. The

223: \emph{axiomatic approach} associates an ``axiom'' with each kind of

224: statement in the programming language stating what we may

225: assert after execution of that statement in terms of what was true

226: beforehand. There is usually a gap between a denotational

227: semantics and an operational semantics for a language and the main

228: formal work in designing interpreters and compilers is to prove

229: their soundness and faithfulness by proving equivalence

230: between these semantics. Bridging this gap requires to proceed in

231: several stages: at each stage an alternative semantic definition

232: of the language is introduces, embodying successively more and more

233: implementation details. These intermediate stage semantics are

234: called \emph{non-standard semantics} to reflect their instrumental

235: status. With these sequence of steps, denotational techniques

236: often drift into program transformations and non-standard

237: semantics have often been used as intermediate languages in

238: compilers.

239:

240: \subsection{Continuation-passing style transformations}

241: Practically all programming languages have some form of control

242: structure or jumping; the more advanced forms of control

243: structures tend to resemble function calls, so much so they are

244: rarely describes as jumps. The library function {\tt exit} in C,

245: for example, may be called with an argument like a function. Its

246: whole purpose, however, is utterly non functional: it jumps out of

247: arbitrarily many surrounding blocks and pending function calls.

248: Such a non-returning function or jump with arguments is an example

249: of why continuations are needed.

250:

251: In  continuation-passing style, a function call is transformed

252: into a jump with arguments to the callee such that one of these is

253: a continuation that enables the callee to jump back to the caller.

254: This idea has been formalised into a standard CPS transformation

255: for lambda calculus.

256: \begin{defn} \label{d:Lambda-calculus}

257: The pure untyped lambda calculus $\Lambda$ is defined by \cite{

258: Barendregt-1997}: A set of terms, $M$, inductively generated over an

259: infinite set of variables $Vars$,

260: \begin{itemize}

261: \item Terms $M ::= V |\: M\: M $; \item Values $V ::= x |\:\lambda

262: x.M $,  $x$ in $Vars$.

263: \end{itemize}

264: \end{defn}

265:

266: The standards CPS translation for $\Lambda$ in denotational

267: semantics is given by the map $[[\_]]_{F} : \Lambda \rightarrow

268: \Lambda$ originally described by Fisher:

269: \begin{defn} \label{d:FisherCPS}

270: Let $k$, $m$, $n$ in $Vars$ be variables that do not occur in the

271: argument to $[[\_]]_{F}$.

272: \begin{itemize}

273: \item $[[V]]_{F} = \lambda k.k \psi (V)$ \item $[[M N]]_{F}

274: = \lambda k.[[M]]_{F}(\lambda m.[[N]]_{F} \lambda n.(m k)n)$ \item

275: $\psi (x) = x$ \item $\psi (\lambda x.M) = \lambda k.\lambda

276: x.[[M]]_{F}k$

277: \end{itemize}

278: \end{defn}

279:

280: We will consider a much simpler fragment for the purpose of

281: analysis of effects and control in inference systems and, more

282: specifically, in the NXP architecture. The syntax is made up of

283: simple expressions of boolean arithmetic:

284: $$ E ::= true \:| false\: | E\: or\: E\: | E\: and\: E $$

285: In this simplified core model of the inference computation, goals

286: are represented by and/or tree expressions, the leaves of which are

287: the working memory elements. Keeping with the essentials,

288: these working memory elements are valued in ${\bf B} = \{0, 1\}$

289: the boolean algebra with boolean operations $\&$ (logical and) and

290: $|$ (logical or). This will be relaxed later as we take into

291: account other effects in the NXP architecture. In this toy model,

292: the pattern-directed modules or production rules of inference

293: systems are captured by expressions in the language, and the

294: executive system is naturally a boolean arithmetic evaluator.

295:

296: Following the denotational semantics approach, the standard

297: semantics for the simple and/or language is defined by:

298: \begin{defn} \label{d:stdsemNXP}

299: Let $B = \{true, false\}$ be the set of constants, $Exp$ the set

300: of expressions inductively defined as in the previous paragraph,

301: and ${\bf B}$ the classical boolean algebra, we define

302: the semantic function $\mathcal{B}: B \rightarrow {\bf B}$ and the

303: evaluation function $[[\_]]_{0}: Exp \rightarrow {\bf B}$ by:

304: \begin{itemize}

305: \item $[[B]]_{0} = \mathcal{B}(B)$ \item $[[E_{1}\:or\:E_{2}]]_{0}

306: = [[E_{1}]]_{0} \:|\: [[E_{2}]]_{0}$ \item

307: $[[E_{1}\:and\:E_{2}]]_{0} = [[E_{1}]]_{0} \:\&\: [[E_{2}]]_{0}$

308: \end{itemize}

309: \end{defn}

310:

311: This definition does not specify a particular order for evaluation

312: of boolean expressions, nor does it refer directly to

313: continuations. As such it is really denotational and only calls

314: for several steps of refinement to be turned into an operational

315: semantics. It does, however, reflect the basic principles of

316: inference systems introduced earlier and, properly extended, will

317: be an adequate basis on which to build the analysis. The next step

318: towards operational semantics now consists of providing a non-standard

319: semantics for the simple language and proving its denotational

320: equivalence with definition \ref{d:stdsemNXP}.

321:

322: \subsection{Two non-standard semantics of computations in the NXP architecture}

323: \subsubsection{Continuations to order goal investigations}

324: The first non-standard semantics introduced here makes explicit

325: reference to continuations. In order to better model the behaviour

326: of inference systems, we add a new operator to our simple and/or

327: tree language to capture sequential investigation of two or more

328: goals:

329: $$ E ::= true \:| false\: | E\: or\: E\: | E\: and\: E | E\:;\:E$$

330: where $E$ are expressions as in definition \ref{d:stdsemNXP}, and

331: $E_{1}\:;E_{2}$ describes investigation of goal $E_{1}$

332: followed by investigation of goal $E_{2}$. In imperative

333: programming languages, this is simply the succession of

334: computations of expressions $E_{1}$ and $E_{2}$. We can now define

335: the following non-standard semantics for this new language as

336: follows:

337: \begin{defn} \label{d:nonstdsemNXP1}

338: Let ${\bf K}$ be the set of continuations, in our case the set of

339: functions ${\bf B} \rightarrow O$ where $O$ is the type of output

340: objects with a distinguished element ${\tt exit}: {\bf B}

341: \rightarrow O$, and $B$, $Exp$ and $\mathcal{B}$ as in definition

342: \ref{d:stdsemNXP}, we define the evaluation function $[[\_]]_{1}:

343: Exp \rightarrow {\bf K} \rightarrow {\bf K}$ by:

344: \begin{itemize}

345: \item $[[B]]_{1}\:k = k\:\mathcal{B}(B)$ \item

346: $[[E_{1}\:or\:E_{2}]]_{1}\:k = [[E_{1}]]_{1}(\lambda

347: \epsilon_{1}.[[E_{2}]]_{1}(\lambda\epsilon_{2}.

348: k(\epsilon_{1}\:|\:\epsilon_{2})))$ \item

349: $[[E_{1}\:and\:E_{2}]]_{1}\:k = [[E_{1}]]_{1}(\lambda

350: \epsilon_{1}.[[E_{2}]]_{1}(\lambda\epsilon_{2}.

351: k(\epsilon_{1}\:\&\:\epsilon_{2})))$ \item

352: $[[E_{1}\:;\:E_{2}]]_{1}\:k = [[E_{1}]]_{1}(\:[[E_{2}]]_{1}\:k)$

353: \end{itemize}

354: \end{defn}

355:

356: The continuations introduced here follow the convention of mapping

357: results of computations, here values of goals inferred by the

358: system, to a specific output type. With these conventions in

359: place, the top level {\tt read-eval-print} loop of traditional

360: interpreters is captured by the simple semantics $$[[E]]_{1}\: {\tt

361: exit}$$ which basically executes the computation of the expression

362: $E$ and pass the result to the continuation that yields control

363: back to the user. In addition, definition \ref{d:nonstdsemNXP1}

364: introduces a left to right ordering of goal investigation and a

365: left to right ordering of the data collection process: this

366: appears in the semantics denotation in places where the left

367: subexpression of a composed expression is always evaluated first

368: with a continuation built from the computation of the right

369: subexpression. This of course, is already a form of design choice

370: and alternative semantics are easily produced to represent other

371: options. The embedded design choice then goes one step further

372: away from the standard denotational semantics down the road of an

373: implementation-ready operational semantics.

374:

375: \subsubsection{Sequence semantics for continuations}

376: With the refined non-standard semantics introduced in this section,

377: we are setting up a framework for representing the semantics of

378: control of in the NXP architecture. The following definitions

379: refine the operational semantics introduced in

380: \ref{d:nonstdsemNXP1}:

381: \begin{defn} \label{d:stacks}

382: We will use the notation $s = \langle B_{1},B_{2}...B_{n}\rangle$

383: for finite sequences of boolean values, $s$ in ${\bf B}^{*}$ the

384: set of finite sequences of boolean values; we will also use

385: indifferently ${\tt nil}$ and $\langle\:\rangle$ to designate the

386: empty sequence. The finite sequence abstract data type is

387: characterised by the following operations:

388: $$s\downarrow i = B_{i} , 0 < i \leq n \;(selection)$$

389: $$s\uparrow i = \langle B_{i+1},...B_{n}\rangle, 0 < i < n; {\tt nil}, i

390: = n \;(rest)$$

391: $$length(s) = n$$

392: $$s\:+\:s' = \langle

393: B_{1},B_{2}...B_{n},B'_{1},B'_{2}...B'_{m}\rangle\;(concatenation)$$

394: $$Or(s) = \langle\;s\downarrow1\:|\:s\downarrow2\;\rangle+s\uparrow2, n \geq 2$$

395: $$And(s) = \langle\;s\downarrow1\:\&\:s\downarrow2\;\rangle+s\uparrow2, n \geq 2$$

396: \end{defn}

397: The sequence abstract data type is comparable to the stack data

398: type introduced by Stoy \cite{Stoy} for operational semantics of

399: programming languages. In the NXP architecture the sequence

400: semantics captures the succession of goals investigated and evoked by

401: the systems. Based on definition \ref{d:stacks} we are now able to

402: present the (non-standard) sequence semantics for our simple

403: representation of the NXP architecture:

404: \begin{defn}\label{d:nonstdsemNXP2}

405: Let ${\bf B}^{*}$ be the set of finite boolean values  sequences

406: with the previously defined abstract data type

407: operations, let $Exp$ and ${\bf B}$ be as in definition

408: \ref{d:nonstdsemNXP1} let $\mathcal{B}$ be, as before, the

409: semantic function of definition \ref{d:stdsemNXP}, we introduce

410: the new semantic evaluation function $[[\_]]_{2}: Exp \rightarrow

411: {\bf B}^{*} \rightarrow {\bf B}^{*}$:

412: \begin{itemize}

413: \item $[[B]]_{2}s = \langle\mathcal{B}(B)\rangle+s$ \item

414: $[[E_{1}\:or\:E_{2}]]_{2}\:s =

415: Or([[E_{2}]]_{2}([[E_{1}]]_{2}\:s))$ \item

416: $[[E_{1}\:and\:E_{2}]]_{2}\:s =

417: And([[E_{2}]]_{2}([[E_{1}]]_{2}\:s))$ \item

418: $[[E_{1}\:;\:E_{2}]]_{2}\:s = [[E_{2}]]_{2}([[E_{1}]]_{2}\:s)$

419: \end{itemize}

420: \end{defn}

421: \begin{rem}\label{r:left-to-right-ordering}

422: Note that, even though the notation is reversed, compared to

423: definition \ref{d:nonstdsemNXP1}, the previous evaluation function

424: still enforces a left to right order of evaluation on expressions

425: and subexpressions. The left subexpression is always evaluated

426: first and pushed on the sequence before evaluation of the right

427: subexpression. This is particularly important in the specific case

428: of expressions like $E_{1}\:;\:E_{2}$.

429: \end{rem}

430: \begin{rem}

431: In the sequence semantics of the simple NXP language,

432: continuations are actually ``implemented'' as finite sequences,

433: initially with type operations similar to stacks (sequence and

434: rest, for instance). In the NXP architecture, the boolean values

435: leaves of and/or trees are determined from examination of the

436: working memory or from interaction with the user (asking

437: questions). This operation is not captured in the simple language

438: we have presented so far but will be discussed later. It is important

439: however it somehow constrains when

440: operations are actually performed on a finite sequence $s$ of

441: ${\bf B}^{*}$. Namely, we consider that boolean values in a

442: sequence $s$ are valued \emph{just in time} when they are passed

443: as arguments to the sequence operations $And$, $Or$, and the like.

444: (This can be compared to lazy evaluation in languages like Scheme

445: or Haskell.) Boolean values in sequences of definition

446: \ref{d:nonstdsemNXP2} are then representations of \emph{delayed}

447: accesses to the working memory. The reifications of those delayed

448: executions, usually called \emph{thunks}, are

449: triggered by the goal investigation process as needed. The lazy

450: evaluation convention though does not affect the inference computation

451: nor the control of its unfolding.

452: \end{rem}

453: We ground the previous definition with the following result:

454: \begin{prop} \label{p:congruence}

455: The  semantics \ref{d:stdsemNXP} and \ref{d:nonstdsemNXP2} are

456: \emph{congruent} i.e.

457: $$[[E]]_{2}\:s = \langle[[E]]_{0}\rangle+\:s$$ for all $E$ and for

458: all $s$.

459: \end{prop}

460: \begin{proof}

461: The proof is by induction on expressions. Obviously :

462: $$ [[B]]_{2}\:s = \langle\mathcal{B}(B)\rangle +\:s$$ by

463: definition. For the other syntactic forms of expressions :

464: \begin{align*}

465: [[E_{1}\:or\:E_{2}]]_{2}\:s &=

466: Or([[E_{2}]]_{2}([[E_{1}]]_{2}\:s)) \\

467:  &= Or([[E_{2}]]_{2}(\langle[[E_{1}]]_{0}\rangle+\:s)) \\

468:  &=

469:  Or(\langle[[E_{2}]]_{0}\rangle+\:(\langle[[E_{1}]]_{0}\rangle+\:s))

470:  \\

471:  &= Or(\langle[[E_{2}]]_{0},[[E_{1}]]_{0} \rangle+\:s) \\

472:  &= \langle[[E_{2}]]_{0}\:|\:[[E_{1}]]_{0} \rangle+\:s \\

473:  &= \langle[[E_{1}\:or\:E_{2}]]_{0}\rangle +\:s

474: \end{align*}

475: and similarly for the $and$ expression.

476: \end{proof}

477:

478: \subsubsection{Formalisation in a Logical Framework (Twelf)}

479: The previous approach lends itself to formalisation in a Logical

480: Framework. In this section, we use the codification of

481: formalisation techniques into a meta-language offered by Twelf, a

482: logical framework developed at Carnegie-Mellon University \cite{

483: pfenning99}, to specify proofs of inference systems properties.

484: This is done in three stages : The first one is the representation

485: of the abstract syntax of the simple language under investigation;

486: the second stage is the representation of the language semantics

487: (both notions of values and types, and notions of computations,

488: i.e. operational semantics); the last stage is the representation

489: of the properties of the language (for instance, congruence of

490: semantics and type preservation).

491:

492: We base the representation of the simple language expressions on

493: abstract syntax (rather than concrete) in order to expose the

494: essential structure and focus on the semantics and its properties

495: rather than on lexical analysis and parsing---although the Logical

496: Framework could also help here. The representation technique is

497: called \emph{higher-order abstract syntax} and basically uses

498: types in Twelf to capture all syntax.

499: \begin{defn} \label{d:TwelfSyntax}

500: Going back to definition \ref{d:stdsemNXP} of the simple NXP

501: language, we then declare a new Twelf type: $${\tt exp : type.}$$

502: to represent expressions of $Exp$, intending that every Twelf

503: object $M$ of type {\tt exp} represents a simple language

504: expression and vice-versa. Similarly, we declare boolean values as

505: a new Twelf type: $${\tt bool : type.}$$ and declare two

506: distinguished instances of this newly defined {\tt bool} type:

507: $${\tt true : bool.}$$ $${\tt false : bool.}$$ With these basic

508: types in place we now turn to boolean values expressions built

509: from $\&$ and $|$ operators. Twelf captures these

510: operators as \emph{expression constructors} which translate into

511: constant functional types:

512: $${\tt \& : bool \:->\: bool \:->\: bool. }$$

513: $${\tt | : bool \:->\: bool \:->\: bool. }$$

514: for which we will use infix mode for better readability.

515: \end{defn}

516:

517: In a second stage we introduce a representation of deductions

518: following the idea of \emph{judgements-as-types}. In Twelf,

519: deductions are objects and judgements are types, so that proofs in

520: the semantics space are in fact type reconstructions in Twelf. In

521: order to do so, we introduce a type family {\tt eval0} indexed by

522: representations of expressions ($Exp$) and boolean values (${\bf

523: B}$):

524: $${\tt eval0 : exp \:->\: bool \:->\: type.}$$

525: such that we have types such as {\tt eval0 e b} which depend on

526: objects.

527:

528: Axioms are simply represented as types such as these,

529: while semantics operations can be viewed as constructors which,

530: given deductions of their arguments, yield a deduction of

531: their result.

532: \begin{defn} \label{d:TwelfStdSemantics}

533: For the simple NXP language we introduce:

534: $${\tt constant : bool \:->\: exp.}$$

535: $${\tt and : exp \:->\: exp \:->\: exp.}$$

536: $${\tt or  : exp \:->\: exp \:->\: exp.}$$

537: and the following representation of their respective evaluations:

538: $${\tt eval0cst \;:\; eval0\; (constant B)\; B.}$$

539: $${\tt eval0or  \;:\; eval0\; (E2 \;or\; E1)\; (V1 \;|\; V2) \;<-\; eval0\; E1\; V1\; <- \;eval0\; E2\;

540: V2.}$$ $${\tt eval0and \;:\; eval0\; (E2\; and\; E1)\; (V1\; \&\;

541: V2)\; <- \;eval0\; E1\; V1\; <-\; eval0\; E2\; V2.}$$ which

542: capture the standard semantics of definition \ref{d:stdsemNXP}.

543: \end{defn}

544: \begin{rem}

545: Using Twelf built-in deductive capabilities we can now explicit

546: deductions such as in the following example:

547: \begin{verbatim}

548: eval0 (constant true and constant true or constant false) V.

549: ---------- Solution 1 ----------

550:  V = (false | true) & false.

551:  A = eval0_and eval0_cst (eval0_or eval0_cst eval0_cst).

552: \end{verbatim}

553: which simply shows the result of the boolean operation as {\tt V}

554: and the deduction as {\tt A}. Note that the deduction {\tt A} may

555: be considered as a logical proof of the value {\tt V} or

556: alternatively as the logical program implementing {\tt eval0}

557: viewed as a function call.

558: \end{rem}

559:

560: The non-standard semantics of the NXP architecture are captured

561: in the same way, with the introduction of an additional

562: Twelf type, {\tt bool\_seq} for sequences of boolean values.

563: \begin{defn} \label{d:TwelfNonStdSem2}

564: The Twelf representation of the sequence semantics for NXP consists

565: firstly of an evaluation function which builds a functional

566: programming style representation of expressions of the simple

567: language :

568: \begin{verbatim}

569: %%% The sequence abstract data type

570: nil    : bool_seq.

571: atom   : bool -> bool_seq.

572:

573: select : bool_seq -> int -> bool.

574: tail   : bool_seq -> int -> bool_seq.

575: length : bool_seq -> int.

576: +      : bool_seq -> bool_seq -> bool_seq. %infix right 10 +.

577:

578: %%% The functional programming style representation

579: eval2 : exp -> bool_seq -> bool_seq -> type.

580:

581: eval2_cst : eval2 (constant B) nil (atom B).

582: eval2_cst2: eval2 (constant B) S (atom B + S).

583: eval2_or_s : eval2 (E1 or E2) S (or_s S2)

584:                 <- eval2 E1 S S1 <- eval2 E2 S1 S2.

585: eval2_and_s : eval2 (E1 and E2) S (and_s S2)

586:                 <- eval2 E1 S S1 <- eval2 E2 S1 S2.

587: \end{verbatim}

588: and secondly of an evaluation function that performs the

589: functionally specified operations on the input sequence :

590: \begin{verbatim}

591: eval2a : bool_seq -> bool_seq -> type.

592:

593: eval2a_atom : eval2a (atom B) (atom B).

594: eval2a_atom_2 : eval2a (atom B + S) (atom B + S')

595:          <- eval2a S S'.

596: eval2a_or_s1 : eval2a (or_s (S'')) (atom(B1 | B2) + S')

597:            <- eval2a S'' (atom B1 + atom B2 + S)

598:            <- eval2a S S'.

599: eval2a_or_s2 : eval2a (or_s (S'')) (atom(B1 | B2))

600:            <- eval2a S'' (atom B1 + atom B2).

601: eval2a_or_s_nil : eval2a (or_s (A1 + A2 )) (atom(B1 | B2) )

602:            <- eval2a A1 (atom B1) <- eval2a A2 (atom B2).

603: eval2a_and_s1 : eval2a (and_s (S'')) (atom(B1 & B2) + S')

604:            <- eval2a S'' (atom B1 + atom B2 + S)

605:            <- eval2a S S'.

606: eval2a_and_s2 : eval2a (and_s (S'')) (atom(B1 & B2))

607:            <- eval2a S'' (atom B1 + atom B2).

608: eval2a_and_s_nil : eval2a (and_s (A1 + A2 )) (atom(B1 & B2) )

609:            <- eval2a A1 (atom B1) <- eval2a A2 (atom B2).

610: \end{verbatim}

611: The Twelf representation of the sequence semantics function is simply

612: the composition of the previously shown functions :

613: \begin{verbatim}

614: eval2b : exp -> bool_seq -> bool_seq -> type.

615: eval2b_1 : eval2b E S S' <- eval2 E S S'' <- eval2a S'' S'.

616: \end{verbatim}

617: \end{defn}

618: \begin{rem}

619: There are many ways the non-standard semantics of the NXP

620: architecture could have been mapped to Twelf's metalanguage.

621: Definition \ref{d:TwelfNonStdSem2} is in fact more ``operational''

622: than really needed. It separates a functional programming

623: equivalent of the expression parsed from the execution of this

624: representation. The previous definition emphasises the distinction

625: between \emph{instructions} and a \emph{machine} executing

626: instructions. The impact of this formal distinction will be

627: discussed in the implementation section of this paper.

628: \end{rem}

629: \begin{rem}

630: Theorem-proving capabilities of the Twelf environment are also

631: extremely useful in assessing properties of the simple language.

632: For instance, as a verification of the congruence of semantics we

633: can query Twelf as follows :

634: \begin{verbatim}

635: %query 1 1

636: eval2b ((constant true and constant false)

637:     or constant true

638:     or constant false) nil

639:    S.

640: ---------- Solution 1 ----------

641: S = atom ((false | true) | false & true).

642: A =

643:    eval2b_1

644:       (eval2a_or_s2

645:           (eval2a_or_s1 eval2a_atom

646:               (eval2a_atom_2

647:                   (eval2a_atom_2

648:             (eval2a_and_s2

649:                 (eval2a_atom_2 eval2a_atom))))))

650:       (eval2_or_s (eval2_or_s eval2_constant2 eval2_constant2)

651:           (eval2_and_s eval2_constant2 eval2_constant)).

652:

653: %query 1 1

654: eval0 ((constant true and constant false)

655:     or constant true

656:     or constant false) V.

657: ---------- Solution 1 ----------

658: V = (false | true) | false & true.

659: A = eval0_or

660:     (eval0_and eval0_cst eval0_cst)

661:         (eval0_or eval0_cst eval0_cst).

662:

663: \end{verbatim}

664: which basically shows deductions of proposition \ref{p:congruence} for

665: a particular expression.

666: \end{rem}

667:

668: Within this framework we are now ready to set forth the semantics

669: of some of the major inference features of the NXP architecture,

670: and, with the Logical Framework mapping introduced in this

671: section, ground some of its behavioural properties into

672: mathematical logic.

673:

674: \subsubsection{Sequence semantics for goal evocation}

675: The sequence semantics previously presented was designed to make

676: it relatively natural and simple to express goal evocation in the

677: NXP inference architecture. Informally goal evocation, which is

678: triggered by the data collection process, \emph{queues} a goal for

679: postponed investigation, once the current threads of

680: investigation complete. In turn, we will see that this sequence

681: semantics is suggestive of an operational semantics closer

682: to implementation.

683: \begin{defn} \label{d:nxpsyntax2}

684: Let us refine the simple NXP language with the following: $Exp$,

685: expressions: $$ E ::= b\;| E \;or\; E | E \;and\;E | E \;;\;E | b

686: \;post\;E$$ with $b$ in ${\bf B}$ and where the new expression $b

687: \;post\;E$ collects data value $b$ and evokes goal $E$ (which, as

688: an expression, is itself an arbitrarily long sequence of

689: expressions). The other expressions are as in definition

690: \ref{d:nonstdsemNXP2}.

691: \end{defn}

692: The semantics of goal evocation is captured by extending the

693: previous definition \ref{d:nonstdsemNXP2} of sequence semantics

694: with the additional mapping:

695: \begin{defn}\label{f:nonstdsemNXP3}

696: Let ${\bf B}^{*}$ be the set of finite boolean values  sequences

697: with the previously defined selection, rest and concatenation

698: operations, let $Exp$ and ${\bf B}$ be as in definition

699: \ref{d:nonstdsemNXP2} let $\mathcal{B}$ be, as before, the

700: semantic function of definition \ref{d:stdsemNXP}, we extend the

701:  semantic evaluation function $[[\_]]_{2}: Exp \rightarrow {\bf

702: B}^{*} \rightarrow {\bf B}^{*}$ as follows:

703: \begin{itemize}

704: \item $[[B]]_{2}s = \langle\mathcal{B}(B)\rangle+s$ \item

705: $[[E_{1}\:or\:E_{2}]]_{2}\:s =

706: Or([[E_{2}]]_{2}([[E_{1}]]_{2}\:s))$ \item

707: $[[E_{1}\:and\:E_{2}]]_{2}\:s =

708: And([[E_{2}]]_{2}([[E_{1}]]_{2}\:s))$ \item

709: $[[E_{1}\:;\:E_{2}]]_{2}\:s = [[E_{2}]]_{2}([[E_{1}]]_{2}\:s)$

710: \item $[[B\;post\;E]]_{2}\:s =

711: \langle\mathcal{B}(B)\rangle+\:s\:+[[E]]_{2}\langle\rangle$

712: \end{itemize}

713: \end{defn}

714: \begin{rem}

715: The definition of the $post$ semantic function captures the postponed

716: execution of the goal investigation. It effectively replaces the

717: current continuation with the same \emph{followed} by the evaluation

718: of a goal or a sequence of goals. The effect is of completing first the

719: current goal evaluation process and then pursue with the postponed goals

720: evoked during that initial process.

721: \end{rem}

722: \begin{rem}

723: Although it is implicit in the definition of sequences, the previous

724: definition relies on a left-to-right ordering of operation on

725: sequences as in remark \ref{r:left-to-right-ordering}. If we required

726: an operational semantics closer to implementation, we would enforce

727: this by letting sequences hold unevaluated expressions, or thunks, as

728: well as instructions of a virtual machine, transforming the sequence

729: abstract data type into a conventional stack abstract data type.

730: \end{rem}

731: \begin{rem}

732: In a refined operational semantics, we would need to be a bit more

733: specific than the previous definition \ref{f:nonstdsemNXP3}. More

734: formally, we already mentioned that boolean values in goal

735: expressions are actually evaluated by accessing the working memory

736: of the inference system (possibly causing further interaction with

737: the user). The working memory is represented as an environment

738: $\rho$, a map from a set of variables $Vars$ to boolean values

739: {\bf B}. Goal expressions then refer to variables and the

740: semantic function is expressed in term of $\rho$. The map $\rho$

741: is partial in the sense that its value is not necessarily defined on all

742: variables at all times. In the occurrence of one of these unvalued variables,

743: the goal evaluation process triggers an atomic transaction with

744: the user, requesting the value which is then assigned in $\rho$. In

745: programming languages this caching effect is also known as \emph{memoizing} and can be

746: captured in different ways in an operational semantics. The

747: postponed evaluation of goals then proceeds in the environment as

748: resulting from the preceding evaluation cycle. In this, postponed

749: evaluation differs from re-evaluation that would, in contrast,

750: proceed from scratch in a new environment $\rho_{0}$ totally

751: undefined.

752: \end{rem}

753: With this semantics we can review some elementary results

754: formalising the intuitions underlying goal evocation in the NXP

755: architecture.

756: \begin{prop}\label{p:commutative}

757: Sequencing and postponing goals are commutative operations in the

758: following restricted sense:

759: $$[[B_{1}\;post\;E_{1}\;;\;B_{2}\;post\;E_{2}]]s =

760: [[B_{1}\;post\;(E_{2}\;;\;E_{1})\;;\;B_{2}]]s$$

761: and $$[[B_{1}\;post\;E_{1}\;;\;B_{2}\;post\;E_{2}]]s =

762: [[B_{1}\;;\;B_{2}\;post;(E_{2}\;;\;E_{1})]]s$$

763: \end{prop}

764: \begin{proof}

765: By simple application of the concatenation operation on sequences.

766: \end{proof}

767: This simply states that a sequence of goals can be locally evoked either as a

768: group or individually without affecting the overall behaviour. In the

769: accompanying paper, this result will be set in the context of

770: knowledge acquisition mechanisms concurrently running with the

771: evaluation process. More specifically, this restricted commutativity

772: will be cast in the context of the chunking processes described by

773: Newell in the Soar architecture \cite{Soar}. From an implementation perspective,

774: proposition \ref{p:commutative} may be used for optimisation of the

775: execution by preprocessing goal expressions, gathering groups of

776: evoked goals rather than processing them individually. This would seem

777: particularly useful when building a lazy interpreter for the NXP

778: architecture.

779: \begin{rem}\label{r:evocation-as-exception}

780: Goal evocation in the NXP architecture is reminiscent of

781: \emph{exceptions} in functional of imperative programming

782: languages. In the latter the current continuation is replaced

783: immediately with another one. The behaviour is then to abandon the

784: current process and switch to another one. In the former the

785: continuation is just \emph{added} after the current one; the resulting

786: behaviour being to successively go through the processes to their

787: completion. The analogy will be better illustrated in the categorical

788: investigation of the NXP architecture in the next section.

789: \end{rem}

790: Goal evocation as exemplified by the NXP architecture is an

791: instance of associative cognitive mechanisms. Theses associative

792: processes are at work in the background of the evaluation process

793: itself interacting with it by driving the \emph{focus of

794: attention} of the inference system. While the backward and forward

795: chaining of  the goal evaluation process could be

796: described as \emph{strong} or \emph{local} focus of attention,

797: goal evocation represents a form of \emph{global},

798: \emph{weaker} forward chaining. More generally if \emph{learning}

799: processes have traditionally been studied separately from the

800: performance component of inference systems, in recent resaerch work

801: emphasis has been put on accounting for interactions between

802: performance and learning \cite{Soar}. In this perspective

803: understanding and representing of inference control are critical

804: as control is the operational juncture between the two subsystems.

805: Furthermore the representation semantics is of direct service in

806: specifying the implementation of inference systems.

807:

808: \section{A categorical investigation of inference control}

809: \subsection{Motivations}

810: Category theory has in the past few years become a tool of choice for

811: to investigate properties of programming languages. Because it

812: focuses on very high-level properties abstracted from a number of

813: subdomains of mathematics (domain theory, topology, set theory, etc.) its results are

814: characterized by a wide range of applicability.

815:

816: Composition if of the essence of categories. It is only natural

817: that sequencing and its variations, which are the basic building

818: blocks of inference systems control constructs---as seen in the

819: previous section---, find a well-fit mathematical expression in

820: categorical terms. In this section, we suggest such a categorical

821: construction for analyses of control in inference systems.

822: Starting from the original insight of Moggi, we use triples, a

823: categorical construction with its origins in algebra, to define

824: effects and controls in the NXP architecture. Following Wadler's lines of

825: exploration we produce a category to account for goal evaluation

826: and goal evocation in the NXP architecture. Finally, other results

827: in \emph{monadic style} representation of computations offer

828: interesting analogies for the representation of behaviours of

829: inference systems, particularly in \emph{layering} effects around

830: the core categorical representation of the architecture. The

831: categorical study of knowledge acquisition in inference

832: systems---in its multi-faceted forms---constitutes the major part

833: of the companion paper covering the NXP architecture.

834:

835: \subsection{Categorical background}

836: This section defines the basic notions from category theory that

837: we need in the formalisation of the NXP architecture in

838: monadic style. Readers are referred to \cite{ttt} for a

839: comprehensive presentation of categories and triples. Let $C$ be a

840: category, we denote by $Obj(C)$ the objects of $C$ and by

841: $Hom(A,B)$ the set of arrows with source object $A$ and target

842: object $B$.

843: \begin{defn} \label{d:functor}

844: If ${\bf C}$ and ${\bf D}$ are categories, a functor $F:{\bf C}

845: \rightarrow {\bf D}$ is a map for which:

846: \begin{itemize}

847: \item  If $f: A \rightarrow B$ is an arrow of ${\bf C}$, then $Ff:

848: FA \rightarrow FB$ is an arrow of ${\bf D}$; \item $F(id{_A}) =

849: id_{FA}$; and \item If $g: A \rightarrow B$, then $F(g \circ f) =

850: Fg \circ Ff$.

851: \end{itemize}

852: \end{defn}

853: A functor is a morphism of categories, a map which takes objects

854: to objects, arrows to arrows, and preserves source, target,

855: identities and composition. More generally $F$ {\em preserves} a

856: property P that an arrow $f$ may have if $F(f)$ has property P

857: whenever $f$ has. It {\em reflects} property P if $f$ has the

858: property whenever $F(f)$ has.

859:

860: A natural transformation is defined as a "deformation" of one

861: functor to another.

862: \begin{defn}

863: If $F:{\bf C} \rightarrow {\bf D}$ and $G:{\bf C} \rightarrow {\bf

864: D}$ are two functors, $\lambda : F \rightarrow G$ is a natural

865: transformation from $F$ to $G$ if $\lambda$ is a collection of

866: arrows $\lambda C \rightarrow \lambda D$, one for each object $C$

867: of ${\bf C}$, such that for each arrow $g: C \rightarrow C'$ of

868: ${\bf C}$ the following diagram commutes:

869:

870: \def\ComponentOne{{\lambda C}}

871: \def\FuncOne{{Fg}}

872: \def\FuncTwo{{Gg}}

873: \def\ComponentTwo{{\lambda C'}}

874: \begin{diagram}

875: FC & \rTo^{\ComponentOne} & GC \\

876: \dTo<\FuncOne & & \dTo>\FuncTwo \\

877: FC' & \rTo^{\ComponentTwo} & GC'\\

878: \end{diagram}

879:

880: The arrows $\lambda C$ are the components of $\lambda$. The

881: natural transformation $\lambda$ is a {\em natural equivalence} if

882: each component of $\lambda$ is an isomorphism in ${\bf D}$.

883: \end{defn}

884:

885: \subsection{Triples, monads and categories for computations}

886: Triple or monads are, from one point of view, abstraction of

887: certain properties of algebraic structures, namely monoids. They

888: are categorical constructs that originally arose in homotopy

889: theory and were used in algebraic theory. Moggi

890: \cite{moggi91notions} was the first to discover the connection

891: between triples and semantics of effects in programming language

892: design. Since then the monadic style has pervaded theoretical

893: research on denotational and operational semantics.

894: \begin{defn} \label{d:triple}

895: A triple ${\bf T} = (T, \eta, \mu)$ on a category ${\bf C}$ is an

896: endofunctor $T: {\bf C} \rightarrow {\bf C}$ together with two

897: natural transformations $\eta: id{_{{\bf C}}} \rightarrow T$,

898: $\mu: TT \rightarrow T$ subject to the following commutative

899: diagrams:

900:

901: \def\mut{{\mu T}}

902: \def\tmu{{T \mu}}

903: \begin{diagram}

904: TTT & \rTo^\mut & TT \\

905: \dTo<\tmu & {\rm associativity} & \dTo>{\mu} \\

906: TT & \rTo^{\mu} & T \\

907: \end{diagram}

908:

909: expressing associative identity, and:

910:

911: \def\etat{{\eta T}}

912: \def\teta{{T \eta}}

913: \begin{diagram}

914: T   & \rTo^\etat    & TT        & \lTo^\teta    & T \\

915:     & \rdTo<{=}     & \dTo>{\mu}& \ldTo>{=}     &   \\

916:     &               & T         &               &   \\

917: \end{diagram}

918:

919: expressing left and right unitary identities.  The component of

920: $\mu T$ at an object $X$ is the component of $\mu$ at $TX$, while

921: the component of $T \mu$ at X is $T(\mu X)$; similar descriptions

922: apply to $\eta$.

923: \end{defn}

924:

925: \begin{rem} \label{d:monad}

926: There is an alternate way of defining a triple based on a result

927: due to Manes.

928:

929: Let ${\bf C}$ be a category with:

930: \begin{itemize}

931: \item A function $T : Obj(C) \rightarrow Obj(C)$; \item for each

932: pair of objects $C$ and $D$, a function $Hom(C,TD) \rightarrow

933: Hom(TC,TD)$, denoted $f \rightarrow f^{*}$; \item for each object

934: $C$ of ${\bf C}$ a morphism $\eta C : C \rightarrow TC$;

935: \end{itemize}

936: subject to the following conditions:

937: \begin{itemize}

938: \item For $f : C \rightarrow TD$, $f = \eta TD \circ f^{*}$; \item

939: for any object $C$, $(\eta C)^{*} = id_{TC}$; \item  for $f : C

940: \rightarrow TD$ and $g : D \rightarrow TE$, $(g^{*}\circ f)^{*} =

941: g^{*} \circ f^{*}$; \end{itemize} is equivalent to a triple on

942: ${\bf C}$.

943: \end{rem}

944:

945: The equivalence results from constructions of triples from adjoint

946: pairs separately discovered by Eilenberg-More and by Kleisli. The

947: function $ (\_)^{*}: Hom(C,TD) \rightarrow Hom(TC,TD)$ is also

948: known as the Kleisli star. This alternate definition emphasises

949: the connection between a triple and a monoid, an algebraic

950: structure with an associative operation and a unit element. Wadler

951: suggested a straightforward interpretation of the Kleisli star in

952: programming language semantics. In this context, the purpose of

953: the star operation is to combine two computations, where the

954: second computation may depend on a value yielded by the first

955: \cite{wadler94monads}. More precisely if $m$ is a computation of

956: type $T \tau_{1}$ and $k$ a function from values to computations

957: (such as a continuation) $\tau_{1} \rightarrow T\tau_{2}$, then

958: $k^{*}(m)$, or $m*k$, is of type $T\tau_{2}$ and represents the

959: computation that performs computation $m$, applies $k$ to the

960: value yielded by the computation, and then performs the

961: computations that results. It binds the result of computation $m$

962: in computation $k$. Different definitions for the triple $T$ and

963: the star operation then give rise to different monads to represent

964: different control operators such as {\tt escape/exit}, {\tt

965: call/cc}, {\tt prompt/control} or {\tt shift/reset}

966: \cite{wadler93monads,queinnec93library}.

967:

968: As noted by Wadler and others, monadic and continuation-passing

969: styles appear closely related \cite{wadler94monads}. The actual

970: correspondence, however, is formally quite involved. Filinski has

971: shown the remarkable result that \emph{any} monadic effect whose

972: definition is itself expressible in a functional language can be

973: synthetised from just two constructs: first-class continuations

974: and a storage cell \cite{filinski94representing,

975: filinski96controlling}.  This has direct consequences on the

976: feasibility of various implementations of the NXP architecture as

977: investigated in the last section of this paper.

978:

979: \subsection{Monadic style semantics for capturing side-effects}

980: In most programming languages, evaluation may have implicit

981: side-effects that are not predicted by the type of the expression.

982: This is typically the case with goal evocation in the NXP

983: architecture where collecting data may trigger goals for further,

984: delayed evaluation. From their first introduction in the world of

985: programming languages triples, or monads, were precisely used to

986: distinguish between \emph{values}, and \emph{computations} whose

987: evaluation may have side-effects. The semantic separation leads to

988: a stratified style where a pure functional language, for instance,

989: is used to to express the manipulation of values, and one or

990: several monadic sublanguages are used to express manipulation of

991: computations \cite{ wadler94monads}. Layering several monadic

992: sublanguages is formalised using \emph{triple morphisms} as in

993: Filinski's \cite{ filinski96controlling}. Interactions between

994: the core functional language and the monadic extensions are

995: mediated by the type system which keeps track of the computational

996: effects and their propagation.

997:

998: Each side-effect introduces a new type of computation associated

999: to a particular triple $T$. For instance, triples have been defined for

1000: effects such as exceptions, state, and input/output. Generally

1001: speaking, in monadic style semantics the type $T a$ designates a

1002: computation yielding a value of type $a$ with possible

1003: side-effects. The semantics of these side-effects is captured by

1004: proper definition of the natural transformations of triple

1005: $T$ (its unit and Kleisli star). Formally and following Manes'

1006: construction, the unit of the triple turns a value into a

1007: computation that returns that value without side-effect:

1008: $$\eta : a \rightarrow Ta.$$

1009: The Kleisli star applies a function of type $a \rightarrow Tb$ to

1010: a computation of type $Ta$, basically chaining two computations in

1011: succession. Following Wadler's popular notation---of using $*$ as

1012: an infix operator---the star operation:

1013: $$ (*) : Ta \rightarrow (a \rightarrow Tb) \rightarrow Tb$$

1014: represents application of a continuation to a computation of the

1015: given type. In the following paragraphs we will make the analogy

1016: even more explicit by using the notation:

1017: $$m*\lambda x.k$$

1018: where $m$ and $k$ are expressions and $x$ is a variable. The above

1019: can be read as follows: perform the computation $m$, bind $x$ to

1020: the resulting value and perform computation $k$. In so-called

1021: impure languages, the above notation is similar to:

1022: $${\tt let\:x\:=\:m\:in\:k.}$$

1023: Note that, however, that the latter does not properly

1024: distinguishes pure types (no side-effect) from computation types

1025: (with possible side-effects).

1026:

1027: We are now ready to present the monadic style definition of the

1028: previous standard and non-standard evaluators for the NXP

1029: architecture.

1030:

1031: \subsection{A monadic style presentation of the NXP architecture}

1032: \subsubsection{The standard evaluation triple} The new

1033: presentation is simply a rework of the semantic evaluation

1034: functions of definition \ref{d:stdsemNXP} in terms of the

1035: appropriate triple, unit, and Kleisli star.

1036: \begin{defn}\label{d:msStdSem}

1037: The standard monadic style semantic of the simple and/or

1038: expression language is captured by the map $eval$:

1039: \begin{align*}

1040: eval &: Exp \rightarrow T {\bf B}  \\

1041: eval(b) &= \eta b \\

1042: eval(E_{1}\;or\;E_{2}) &= eval(E_{1})*\lambda

1043: x.eval(E_{2})*\lambda y.\eta(x\;|\;y) \\

1044: eval(E_{1}\;and\;E_{2}) &= eval(E_{1})*\lambda

1045: x.eval(E_{2})*\lambda y.\eta(x\;\&\;y) \\

1046: \end{align*}

1047: \end{defn}

1048: \begin{rem}

1049: In the definition above lambda abstraction binds less tightly and

1050: the application star binds more tightly, so that we can get rid of

1051: parentheses.

1052: \end{rem}

1053: \begin{rem}

1054: The above evaluation is much more flexible in that is separates

1055: values from computations. Values are evaluated through the unit

1056: semantic function $\eta$ of the triple, while the sequencing of

1057: computations is specified by the application star. The variations

1058: in denotational semantics we explored in the successive

1059: definitions of the previous section are investigated here by

1060: simply changing definitions of unit and application.

1061: \end{rem}

1062:

1063: \subsubsection{The non-standard evaluation triple}

1064: In the sequence triple, a computation accepts an initial sequence

1065: and returns a value paired with the final sequence.

1066: \begin{defn}\label{d:msSeqSem}

1067: The sequence triple $T : \tau \rightarrow \tau\times {\bf B}^{*}$

1068: with unit $\eta$ defined by $$\eta(b) = \lambda x.(b, \langle b

1069: \rangle + x)$$ where sequence operations are as in definition

1070: \ref{d:stacks}; and with application star $$(*) : T\tau_{1}

1071: \rightarrow (\tau_{1} \rightarrow T \tau_{2}) \rightarrow

1072: T\tau_{2}$$ such that:

1073: $$

1074: m*k  =\lambda x.{\tt let }\:(a, S_{1})\;=\;m\;x\: {\tt in}\:

1075:       {\tt let }\:(b, S_{2})\;=\;k\;a\;x \:{\tt in }\: (b,

1076:       S_{2})

1077: $$

1078: and the evaluator as in definition \ref{d:msStdSem} captures the

1079: non-standard sequence semantics of the NXP architecture.

1080: \end{defn}

1081: \begin{rem}

1082: The sequence triple is similar to the \emph{state triple}

1083: introduced by Moggi and by Wadler to represent programming

1084: languages with instructions operating on a global state. The

1085: seemingly awkward definition of the application star simply

1086: expresses the intermediary sequences and results of first

1087: computing $m$ and then applying $k$ to the result.

1088: \end{rem}

1089: \begin{rem}

1090: For evaluation purposes, the $eval$ semantic function of the

1091: generic interpreter of definition \ref{d:msStdSem} has $T\;{\bf

1092: B}$ as its domain, meaning that computations with the sequence

1093: triple are of type ${\bf B}\times {\bf B}^{*}$ as expected.

1094: \end{rem}

1095: In order to introduce goal evocation as a last touch to the

1096: sequence triple, we note that goal evocation is \emph{only} a

1097: side-effect and does not interfere with values. This critical

1098: characteristic which was somewhat implicit in the previous

1099: continuation passing style presentation can now be made explicit

1100: in the monadic style. We formalise this separation by defining an

1101: additional semantic function in the sequence triple $T$: $post$ of

1102: type $T\;()$, i.e. $() \rightarrow () \times {\bf B}^{*}$. The

1103: $()$ type signals that the $post$ function is only concerned with

1104: effects on sequences and basically ignores pure values.

1105: \begin{defn}

1106: The evocation function for the sequence triple is defined by:

1107: $$post : T\;()$$

1108: and

1109: $$post\; E = \lambda s.((),s + {\tt inr}(E))$$

1110: where {\tt inr} is the right injection ${\bf B}\times {\bf B}^{*}

1111: \rightarrow {\bf B}^{*}$.

1112: \end{defn}

1113: With the evocation function thus defined, we extend the definition

1114: of the standard interpreter to accommodate the goal evocation

1115: expression in the following final definition of the sequence

1116: triple.

1117: \begin{defn}\label{d:msNonStdSeqSem}

1118: The monadic style sequence semantic of the simple and/or

1119: expression language is captured by the map $eval$ in the sequence

1120: triple $T$:

1121: \begin{align*}

1122: eval &: Exp \rightarrow T {\bf B}  \\

1123: eval(b) &= \eta b \\

1124: eval(E_{1}\;or\;E_{2}) &= eval(E_{1})*\lambda

1125: x.eval(E_{2})*\lambda y.\eta(x\;|\;y) \\

1126: eval(E_{1}\;and\;E_{2}) &= eval(E_{1})*\lambda

1127: x.eval(E_{2})*\lambda y.\eta(x\;\&\;y) \\

1128: eval(b \;post\; E) &= post E * \eta b \\

1129: \end{align*}

1130: \end{defn}

1131: In that definition we simply replaced $\eta b$ with $post \;E *

1132: \eta b$ in the computation. Properties of natural transformations

1133:  assure that the call to $post$ indeed

1134: percolates through higher-level goal evaluations. In the

1135: definition \ref{d:msNonStdSeqSem}, as in the previous section

1136: definitions we insist on a left-to-right evaluation ordering of

1137: sequences.

1138: \begin{rem}

1139: Note that the definition of $post$ could be viewed as a family of

1140: semantic functions indexed by goals, $E$ in the representation. A

1141: better notation in this case would be $post_{E}$ to capture goal

1142: indices. Although equivalent in the monadic style presentation

1143: suggested above for the NXP architecture, this alternative

1144: notation will help formalise the knowledge acquisition part of the

1145: NXP inference architecture.

1146: \end{rem}

1147:

1148: \section{Operational semantics and implementation issues}

1149: \subsection{Layering effects in the NXP architecture}

1150: Filinski's and Wadler's work

1151: \cite{filinski96controlling,wadler94monads} have emphasised the

1152: compositional nature of triples in representing computations.

1153: Filinski's results in particular are geared towards presenting a

1154: framework for computational effects which makes it possible to

1155: describe effects in a \emph{modular} way---an idea which permeates

1156: the design of the Haskell functional programming language, to

1157: mention only one. Drawing on the latter framework, we are able to

1158: add \emph{inference effects}, so critical to the NXP architecture

1159: and to inference systems at large, \emph{incrementally}. Following

1160: this approach, the NXP architecture is actually specified by a

1161: sequence of definitional translations, each one of which

1162: ``translates away'' one level of inference effects. For example,

1163: we can refine the description of the NXP architecture with goal

1164: evocation and dynamic working memory by specifying it as a

1165: composition of a goal evocation and a working memory translation.

1166: Informally we will also talk of \emph{composing} a goal evocation

1167: and a working memory triple.

1168:

1169: The monadic composition relies on the following general idea:

1170: assume we have two triples $T$ and $U$ over a base language $L$,

1171: where $U$ is in a certain sense ``more general'' than $T$. There

1172: is a standard translation of effects, $L^{M}$ into the base

1173: language given their monadic representation. Using this

1174: monadic translation we can give two different translations from

1175: $L^{T}$ to $L$: the original monadic translation for $T$ and a

1176: variant translation using $U$-representations of $T$-effects,

1177: inducing the same evaluation semantics. The reader is referred to

1178: \cite{filinski96controlling} for a formal proof of these

1179: results---interestingly enough the crux of the demonstration relies

1180: on the result that continuations are in a precise sense a

1181: \emph{universal} effect: any definable triple can be simulated by

1182: a continuation triple on a language with only first-class

1183: continuations (a la Scheme) and typed states.

1184:

1185: We present here other control constructs of the NXP

1186: architecture and propose candidates for their formal monadic

1187: representation. The working memory, a major element of the NXP

1188: architecture, is detailed by itself in the next section.

1189:

1190: \subsubsection{Associative links, context}

1191: In the NXP architecture, the data-triggered goal evocation mechanism

1192: is complemented by second goal-triggered goal evocation process based

1193: on associative links between goals and goals or subgoals.

1194: Associative links are defined by a binary relation: the

1195: \emph{contextual} relation.

1196:

1197: \begin{defn}\label{d:nxpsyntax3}

1198: The simple base language for inference systems expression is extended

1199: with the {\tt context} expression. Let us define $Exp$, the set of

1200: expressions: $$ E ::= b\;| E \;or\; E | E \;and\;E | E \;;\;E | b

1201: \;post\;E | E\;context\;E$$ with $b$ in ${\bf B}$ and where the new expression $E

1202: \;context\;E$ evaluates the first expression and evokes the second (which, as

1203: an expression, is itself an arbitrarily long sequence of

1204: expressions). The other expressions are as in definition

1205: \ref{d:nxpsyntax2}.

1206: \end{defn}

1207: \begin{rem}

1208: The goal-triggered goal evocation has lower priority than the

1209: data-triggered one. In the concrete expression

1210: $E_{1}\;{\tt context }\;E_{2}$, any goal evoked by the evaluation of

1211: $E_{1}$ through {\tt post} subexpressions will be evaluated

1212: before actual evaluation of $E_{2}$. Priorities aside {\tt context}

1213: and {\tt post} have the same semantics.

1214: \end{rem}

1215: We reuse the sequence triple to capture the semantics of the

1216: context links in the NXP architecture.

1217: \begin{defn}\label{d:msContextSem}

1218: Associative links semantics in the NXP architecture extend the $eval$

1219: semantic map in the sequence triple as follows:

1220: $$eval(E_{1} \;context\; E_{2}) = post E_{2} * eval(E_{1}) $$

1221: \end{defn}

1222: \begin{rem}

1223: Note that the $post$ primitive is now used twice: in the triple

1224: sequence used to capture goal evocation, and in the second triple

1225: sequence used to capture context links.

1226: \end{rem}

1227:

1228:

1229: In some respect, associative links are higher level control

1230: constructs in inference systems. Compared to pattern-directed modules

1231: of the canonical architecture, they do not rely on actual patterns in

1232: the working memory elements but rather on the behaviour of the

1233: inference process itself, posting goals for later evaluation when the

1234: inference, seen as computation, reaches some particular goal or

1235: subgoal. Alternatively  the composition

1236: of identical sequence triples could also be viewed as the more general

1237: triple defined by  $T : \tau \rightarrow \tau\times {\bf B}^{*} \times

1238: {\bf B}^{*}$ where the second sequence is in fact used for

1239: contextually linked goals.

1240:

1241: \subsubsection{The {\tt reset} action}

1242: Independently of the various actions operating on the working memory,

1243: calling for a special treatment as explained in the next section,

1244: the NXP production rules support a {\tt reset} action on a variable

1245: with the immediate effect of setting its value to ``unknown'' regardless

1246: of its previous assignment.

1247:

1248: From the behavioural standpoint, the representation of the computation

1249: of such an expression is simply to \emph{unevaluate} it. When {\tt

1250: reset} operates on a data, the semantics is simply to reevaluate it

1251: should the computation need it at some later time. When {\tt reset}

1252: operates on a goal (an expression in the simple NXP language) however,

1253: the situation is quite different: indeed the reset goal, considered

1254: as data in a later computation, should be reevaluated but what of its

1255: antecedents? And what if further computations do not explicitly need

1256: the reset goal?

1257:

1258: In the NXP architecture the effect of the {\tt reset} action

1259: propagates back from the goal to its antecedents but does not affect

1260: posted goals from evocation or contextual links. This propagation only

1261: interferes with the computation process by forcing a reevaluation,

1262: i.e. a new expansion of the reset expression. Its semantics is handled

1263: by the working memory executive which mediates all accesses to values.

1264:

1265: \subsection{Working memory and user interaction}

1266: In the semantics definition of the previous sections we

1267: oversimplified an essential element of an inference system, namely

1268: the \emph{working memory}. The working memory is holding the data

1269: structures on which the pattern-directed modules operate. In the

1270: traditional view of inference systems, data is often said to

1271: \emph{enter} or \emph{leave} working memory as it is added or

1272: deleted by the action part of those modules. In the cognitive

1273: sciences framework, the working memory is the representation in

1274: inference systems of the \emph{short-term memory}. Inflow and

1275: outflow of data in working memory capture the essentially local

1276: characteristic of short term memory which can only hold a limited

1277: number of items (``the magic number seven'') and only for a short

1278: period of time.

1279:

1280: In most inference systems, and particularly in traditional

1281: production systems where pattern-directed modules are production

1282: rules, the working memory is usually structured as a database of

1283: records. Working memory elements are instances of pre-defined or

1284: user-declared data structures with typed attributes. The

1285: pattern parts of the pattern-directed modules simply express

1286: boolean conditions over these attributes or, in some cases such as in

1287: the OPS series of production system languages \cite{OPS5}, the

1288: presence or absence of particular records.

1289:

1290: As such the basic structure is similar to the concept of an

1291: \emph{environment} in denotational semantics. An environment tells what

1292: the identifiers mean in an expression: it says what values the

1293: identifiers denote. In order to capture the level of indirections, it

1294: it usual to introduce $Ide$ a set of identifiers (variable names) and

1295: the domain $U$ of environments as $U : Ide \rightarrow {\bf B}$. With

1296: this the simple NXP language complies with the following definition.

1297: \begin{defn}\label{d:nxpSyntax4}

1298: $Exp$ is now the set of expressions : $$ E ::= x\;| E \;or\; E | E

1299: \;and\;E | E \;;\;E |\; x \;post\;E |\; E \;context\;E $$ with $x$

1300: in $Ide$.

1301: \end{defn}

1302:

1303: In the NXP architecture, the working memory additionally plays a

1304: much more dynamic role. In a major departure from the \emph{closed

1305: world} assumption of former inference systems, the working

1306: memory of the NXP architecture is also the locus of interaction

1307: with the external environment, whether data is captured through

1308: interactions with users or from other

1309: applications and systems. Indeed the historical

1310: implementation, \emph{Nexpert Object}, is famous for its

1311: pioneering use of API (Application Programming Interface) and

1312: middleware to integrate inference capabilities to entreprise

1313: applications and information systems in heterogeneous

1314: environments. In order to fulfil this role, the NXP working memory

1315: has its own low-level simplified executive module whose

1316: responsibility is to plan for the acquisition of the value of a

1317: particular data required by the inference process. In

1318: most cases, the plan is a simple look-up of the value for

1319: the requested data in memory. If this fails, either because the

1320: value is unknown yet or because a previous inference operation has

1321: reset that particular data item, the value is sought outside the

1322: NXP system either by asking the user (this in

1323: the form of a popup \emph{question} window) or by sending the

1324: request to the middleware layer to distant databases and

1325: applications (\emph{Nexpert Object} has a wide variety of foreign

1326: request mechanisms ranging from file loading and SQL queries up to

1327: COM or Corba-based dynamic queries to application servers). Beyond

1328: the marshalling of the returned value, its type conversion, the

1329: working memory executive is also able to synchronise the request

1330: for value with the higher-level chaining and goal-driven

1331: processes. It may pre-process group of

1332: requests and use caching in order to optimise usage of costly data

1333: access channels, for instance.

1334:

1335: A second dimension of dynamic behaviour that sets the NXP working

1336: memory apart from former production systems architecture is the

1337: evolutive nature of the working memory structure. This

1338: aspect is fully developed in the companion paper, but let us

1339: simply state here that the NXP architecture allows for

1340: pre-processing of the working memory elements based on previous

1341: runs of the inference systems. This pre-processing helps determine

1342: the initial \emph{focus of attention} of the system which, in

1343: turn, assigns priorities in the goal-driven process. This

1344: experience-based \emph{skill acquisition} process

1345: differs from historically studied machine learning processes concerned

1346: with alteration of the collection of pattern-directed modules, or

1347: production memory, e.g. chunking \cite{Soar}.

1348:

1349: Operations of the working memory mini-executive are transparent to

1350: the semantic definitions of previous sections. An operational

1351: semantics that would be closer to actual implementation would

1352: replace the simple set of boolean constants, ${\bf B}$, with a

1353: family of semantic functions $get_{B}$, indexed on a set of

1354: variables, representing working memory elements and capturing the

1355: simple semantics of request-response interactions with the external

1356: environment. Those functions could be further refined either in an

1357: asynchronous framework, accounting for message-based communication

1358: channels for instance, or in a lazy evaluation framework in which

1359: the mini-executive is implemented as a lazy interpreter.

1360:

1361: The monadic style representation of the semantics of NXP

1362: architecture's working memory follows the layered approach

1363: discussed in the previous subsection. By analogy with Haskell's

1364: I/O monads \cite{HaskellBi98}, we introduce the following

1365: definition.

1366: \begin{defn}\label{d:wmTriple}

1367: The working memory monad $W$ describes a computation as a series of

1368: channels paired with the value returned.

1369: \begin{align*}

1370: {\bf type}\: W\;b =& \langle C_{1},\cdots,C_{n}, b\rangle \\

1371: \eta :& b \rightarrow W b \\

1372: \eta x =& \langle \star_{1},\cdots,\star_{n} , x \rangle \\

1373: (*) :& W b_{1} \rightarrow (b_{1} \rightarrow WM\;b_{2})

1374: \rightarrow W\;b_{2} \\

1375: m*k =& {\tt let}\; \langle csx_{1},\cdots,csx_{n},bx \rangle = m

1376: \;{\tt in}\\

1377:   & {\tt let}\; \langle csy_{1},\cdots,csy_{n},by \rangle

1378: = k\;bx \;{\tt in}\\

1379:   & \langle csx_{1};csy_{1},\cdots,csx_{n};csy_{n}, by\rangle \\

1380: \end{align*}

1381: where channels, $C_{i}$, represent distinct request-response

1382: communication channels; $csx_{i}$ denotes state $x$ of channel

1383: $C_{i}$; $\star_{i}$ denotes the undefined (in the strict sense)

1384: state of channel $C_{i}$ ; $bx$ and $by$ denote boolean values

1385: from {\bf B}, and $csx_{i};csy_{i}$ denotes the state of channel

1386: $C_{i}$ after stepping through state $csx_{i}$ then $csy_{i}$.

1387: \end{defn}

1388: \begin{rem}

1389: The previous definition does not specify the operational semantics of

1390: communication channels. In an operational semantics, channels

1391: would use the state monad which simply captures that requests and

1392: responses flowing through the channel change its state. The

1393: definition also leaves the number of channels unspecified. In the

1394: NXP architecture, at least one of these channels is identified as

1395: the user interaction channel through which question-answer based

1396: interactive transactions can be set up.

1397: \end{rem}

1398: \begin{rem}

1399: This definition also leaves the choice of channel state semantics

1400: open. In particular, different semantics are called for synchronous

1401: and asynchronous communication patterns.

1402: Similarly, definition \ref{d:wmTriple} excludes

1403: interaction between communication channels oversimplifying some of

1404: the capabilities of the mini-executive in the \emph{Nexpert

1405: Object} reference implementation of the NXP architecture.

1406: \end{rem}

1407: The advantage of the monadic style is in this instance twofold.

1408: Firstly, in the layered approach outlined in the previous

1409: paragraphs, the composition of monads adequately reflects the

1410: juncture of the inference processes, chainings and goal-driven, with

1411: interactive processes. Here this articulation is

1412: naturally represented by the composition of the NXP monad from

1413: definition \ref{d:msNonStdSeqSem} with the working memory monad

1414: from definition \ref{d:wmTriple}. This architectural pattern will

1415: be reused in the companion paper to present the composition of

1416: inference with associative and dynamic memory processes.

1417:

1418: Secondly, it brings forward the nature of the working memory

1419: mini-executive itself and suggests various operational

1420: implementations. In particular, lazy interpretation could be

1421: useful in the case of continuous data streams channels, for

1422: instance, or for combined management of interacting communication

1423: channels---both situations quite frequent in real-time enterprise

1424: application integration.

1425:

1426: \subsection{Operational semantics: an NXP machine}

1427: As a last step towards an operational semantic for the NXP

1428: architecture, we will think of the computation as described by the

1429: operation of a state-based machine: $${\tt until}\: term(\sigma)\:{\tt

1430: do}\: \sigma = step(\sigma)$$

1431: At each stage of the machine's operation the state, $\sigma$, is

1432: modified to a new state as specified by the single step state

1433: transformation function $step$; this is repeated until a terminal

1434: state, recognised by the predicate $term$, is reached. The operation

1435: is captured by a formal definition.

1436: \begin{defn}

1437: With $S$ a set of states, $step : S \rightarrow S$ a function from

1438: states to states, and $term : S \rightarrow {\bf B}$ a function from

1439: states to truth values, we define the general {\tt machine} function :

1440: $${\tt machine}(step, term) = fix(\lambda f

1441: \lambda\sigma.{\tt if}\;term(\sigma)\;{\tt then}\;\sigma\;{\tt

1442: else}\;f\circ step(\sigma))$$

1443: where the expression {\tt if ... then ... else} has the usual meaning

1444: and where $fix$ is the fixed point operator (here acting on $f$).

1445: \end{defn}

1446: \begin{rem}

1447: With this very general definition a machine is simply identified by

1448: the couple of functions $step$ and $term$.

1449: \end{rem}

1450: In order to exhibit an example of an NXP machine, we will

1451: transform the sequences of the previously described non standard

1452: semantics into operational sequences. Here the

1453: denotational semantics naturally suggests an operational

1454: semantics: instead of queuing up denoted values in sequences,

1455: we now queue \emph{instructions} for the NXP machine sequences. Each

1456: expression of the simple language is \emph{compiled} into a

1457: sequence of mixed instructions and arguments, constituting a

1458: program then passed to an abstract machine (with proper $term$ and

1459: $step$ definitions).

1460: \begin{defn}\label{d:programs}

1461: We introduce the syntactic domains: ${\bf B}$ of boolean values, $Ins$

1462: of instructions, $Prg = Ins^{*}$ of finite sequences of instructions,

1463: or programs.

1464:

1465: The syntax for instructions is:

1466: $$ I ::= {\tt get}\; x\;| {\tt and}\; | {\tt or}\; | {\tt reset}\; x$$

1467: The semantics of this machine code will basically arrange that, as

1468: might be expected, {\tt get} accesses the working memory to

1469: retrieve the value of a particular identifier---possibly triggering

1470: side-effects in the working memory executive---and place it on top

1471: of the sequence; {\tt or} will form the logical or of the top two

1472: elements on the sequence; similarly for {\tt and}; and {\tt reset}

1473: which causes the working memory executive to reset values used

1474: in the evaluation of a variable--possibly triggering side-effects.

1475: \end{defn}

1476: \begin{rem}

1477: With this definition in place we have shifted emphasis from a semantic

1478: function mapping expressions to denotations to a compiling function

1479: mapping expressions to syntactic values, namely sequences of instructions.

1480: \end{rem}

1481: The compiling function transforms expression into programs fit for the

1482: NXP machine.

1483: \begin{defn}\label{d:compiler}

1484: Following the lines of the non standard semantics, we introduce the

1485: compiling function $C : Exp \times Prg \times Prg \rightarrow Prg \times Prg$.

1486: \begin{align*}

1487: C(x,s,p) &= s+\langle {\tt get}\;x \rangle, p \\

1488: C(E_{1}\;or\;E_{2},s,p) &= {\tt let}\; s', p' =

1489: C(E_{2},C(E_{1},s,p))\;{\tt in}\; s'+\langle {\tt or}\rangle, p' \\

1490: C(E_{1}\;and\;E_{2},s,p) &= {\tt let}\; s', p' =

1491: C(E_{2},C(E_{1},s,p))\;{\tt in}\; s'+\langle {\tt and}\rangle, p' \\

1492: C(E_{1}\;;\;E_{2},s,p) &= C(E_{2},C(E_{1},s,p)) \\

1493: C(x\;post\;E,s,p) &= {\tt let}\; s', p' = C(E,\langle\rangle,\langle\rangle) \;{\tt

1494: in}\; s'+\langle {\tt get }\;x \rangle, p' + p \\

1495: C( E_{1}\;context\;E_{2},s,p) &= {\tt let}\; s', p' =

1496: C(E_{2},\langle\rangle,\langle\rangle) \;{\tt in}\;

1497: C(E_{1},s,p),p'+s'+p \\

1498: \end{align*}

1499: \end{defn}

1500: \begin{rem}

1501: In this definition we need two operational sequences $s$ and $p$. The

1502: first one represents direct continuations as processed by

1503: evaluation and evocation of goals. The second one is used to

1504: process contextual links and goal associations.

1505: \end{rem}

1506: \begin{rem}

1507: As remarked in the previous section, the  compilation

1508: process could be captured by a specific triple then blurring the formal

1509: distinction between compilation and semantic  evaluatiation of NXP simple

1510: language expressions.

1511: \end{rem}

1512: We now complete the definition of an operational semantics for the NXP

1513: architecture by displaying a $term$ and a $step$ function for the

1514: compiled code definitions \ref{d:compiler} and \ref{d:programs}.

1515: \begin{defn}

1516: Let $S =  Prg \times {\bf N} \times {\bf B}^{*}$ be a set of

1517: states, and $\sigma = (\Pi,n,s)$ be a typical member of $S$.

1518: Simply enough, we specify the effect of any single instruction on

1519: the stack by defining a function $I: Ins \rightarrow {\bf B}^{*}

1520: \rightarrow {\bf B}^{*}$ as follows:

1521: \begin{align*}

1522:  I( {\tt get}\;x, s)  = &\: \langle get(x) \rangle + s   \\

1523:  I( {\tt reset}\;x, s)  = &\: s   \\

1524:  I( {\tt or}\;,s)  = &\: Or(s)  \\

1525:  I( {\tt and}\;,s)  = &\: And(s)

1526: \end{align*}

1527: \end{defn}

1528: \begin{rem}

1529: In this definition $get$ and $reset$ are invocations of the working

1530: memory executive. In fact $reset: Ide \rightarrow \langle\rangle$

1531: always returns the empty stack while $get: Ide \rightarrow {\bf B}$

1532: returns the boolean value denoted by identifier $x$---both calls having

1533: possible side-effects. The semantics of these calls is represented by the

1534: working memory triple.

1535: \end{rem}

1536: Finally we define an instance of the abstract machine.

1537: \begin{defn}

1538: The NXP virtual machine is defined by the set $S$ of states, and the

1539: $term$ and $step$ functions:

1540: $$step(\Pi,n,s) = (\Pi,n+1,I(\Pi\downarrow i,s))$$

1541: and

1542: $$term(\Pi,n,s) = n \geq length(\Pi)$$

1543: \end{defn}

1544: \begin{rem}

1545: Thus when the machine executes a program $\Pi$ with the

1546: starting stack $s$--usually the empty stack $\langle\rangle$--the

1547: final stack is obtained as specified by the semantic function $M$,

1548: defined by:

1549: \begin{align*}

1550: M & : Prg \rightarrow {\bf B}^{*} \rightarrow {\bf B}^{*} \\

1551: M(\Pi,s) &= in_{3}({\tt machine}(step,term)(\Pi,1,s)) \\

1552: \end{align*}

1553: where $in_{3}$ is the canonical projection of the third element of the

1554: triplet.

1555: \end{rem}

1556: \begin{rem}

1557: Note that the final program passed to the NXP machine is the

1558: concatenation of the two compiled operational sequences, accounting

1559: for the lower priority of the associative links. This is captured in

1560: the following congruence proposition.

1561: \end{rem}

1562: \begin{prop}\label{p:congruenceCompiler}

1563: The effect of running a compiled expression is the one given by the

1564: sequence semantics, more specifically:

1565: $$M(\Pi,s) = [[E]]_{2}s$$

1566: where

1567: $$\Pi = \;{\tt let}\; s',p' = C(E,s,\langle\rangle)\;{\tt in}\:p'+s'$$ for

1568: all $E$, $s$.

1569: \end{prop}

1570: Interestingly enough the compiler is not the only operational

1571: semantics suggested by the non standard semantics studied

1572: above. Building an interpreter for the NXP architecture is also

1573: possible along the same lines as the compiler, but instructions are

1574: then executed on the fly. More generally several implementations can still

1575: be chosen at this stage, somehow independently of the operational

1576: semantics: eager or lazy evaluation, for instance, is applicable to

1577: both operational semantics.

1578:

1579: \section{Conclusions and further research}

1580: Inference may be considered as a special form of computing. This

1581: basic tenet of the whole Artificial Intelligence theoretical field

1582: has been comprehensively captured by Newell's \emph{knowledge

1583: level} hypothesis \cite{Newell93}, and if the distinction between

1584: behaviour and beliefs is well established, the nature of behaviour

1585: as computation is still much debated.

1586:

1587: Inspired by the analogy between the latter distinction and the one usually

1588: drawn between computations and values in the study and design of

1589: programming languages, we have presented an architectural view of

1590: control constructs in inference systems and, more specifically, of

1591: \emph{Nexpert Object} as a canonical inference system. The resulting

1592: NXP architecture has grounded semantics expressed in categorical terms

1593: using the monadic style of presentation.

1594:

1595: The benefits of this formal representation is twofold. On the

1596: one hand, the categorical foundation of this representation helps

1597: study of the NXP architecture in a logical framework, such as Twelf,

1598: and formally assess its architectural properties. On the

1599: other hand, the denotational (and derived non standard) semantics

1600: strongly suggest operational semantics for the implementation of the

1601: NXP architecture. Formally provable congruence relation henceforth

1602: ascertain the soundness of the resulting implementation.

1603:

1604: Finally the categorical foundation of the NXP architecture extends to

1605: the formal representation of other cognitive processes beyond

1606: inference. In particular the monadic style can be used

1607: to articulate the working memory process to inference in a modular

1608: way. It can also be used to express relationship between inference and

1609: learning or adaptative processes reflecting the long term effect of experience

1610: on inference.

1611:

1612: \bibliographystyle{amsplain}

1613: \bibliography{wip}

1614: \end{document}

1615: