1: \documentclass{amsart}
2: \usepackage{diagrams}
3: \newtheorem{thm}{Theorem}[subsection]
4: \newtheorem{cor}[thm]{Corollary}
5: \newtheorem{lem}[thm]{Lemma}
6: \newtheorem{prop}[thm]{Proposition}
7: \theoremstyle{definition}
8: \newtheorem{defn}[thm]{Definition}
9: \theoremstyle{remark}
10: \newtheorem{rem}[thm]{Remark}
11:
12: \begin{document}
13: \title[Monadic Control Constructs for Inference]
14: {Monadic Style Control Constructs \\for Inference Systems}
15:
16: \author{ Jean-Marie Chauvet }
17:
18: \address{Dassault D\'{e}veloppement, Paris, France}
19:
20: \email{jmc@neurondata.org}
21:
22: \subjclass{Primary 68Q55, 18C50; Secondary 18C15, 18C20}
23:
24: \keywords{categorical semantics, triples, monads, continuation,
25: computation, control, inference systems}
26:
27: \date{ August 4, 2002.}
28:
29: \dedicatory{}
30:
31:
32: %%% ----------------------------------------------------------------------
33:
34: \begin{abstract}
35: Recent advances in programming languages study and design have
36: established a standard way of grounding computational systems
37: representation in category theory. These formal results led to a
38: better understanding of issues of control and side-effects in functional and imperative
39: languages. Another benefit is a better way of modelling computational effects in
40: logical frameworks. With this analogy in mind, we embark on an
41: investigation of inference systems based on considering inference
42: behaviour as a form of computation. We delineate a categorical
43: formalisation of control constructs in inference systems. This
44: representation emphasises the parallel between the modular articulation of the
45: categorical building blocks (triples) used to account for the inference
46: architecture and the modular composition of cognitive processes.
47: \end{abstract}
48:
49: %%% ----------------------------------------------------------------------
50: \maketitle
51: %%% ----------------------------------------------------------------------
52:
53: \section{Introduction}
54: Originally grounded in research work in Artificial
55: Intelligence (AI) and, more specifically in AI in medicine and
56: production systems \cite{Rappaport_1984_77}, \emph{Nexpert Object}
57: became a commercial success in the late eighties and
58: nineties. Its implementation architecture broke new ground in
59: graphical user interfaces, in portability across operating
60: environments and in distributed object systems. But powerful
61: innovations also went deep into the design of the inference system
62: itself (NXP), which can be traced back to that seminal research
63: work in AI.
64:
65: In this and a companion paper we offer a formal account of this
66: inference systems, using recent theoretical
67: advances in the understanding of programming language design. In
68: particular, some of the computational effects introduced by the NXP
69: design show a denotational semantics best captured with the
70: current tools in \emph{continuation-passing} or
71: \emph{monadic} styles.
72:
73:
74: \emph{Continuations.} The idea of transforming programs to
75: continuation-passing style (CPS) appeared in the mid-sixties
76: \cite{sabry92reasoning}. The transformation was formally codified
77: by Fischer and Reynolds in 1972, yielding a standard CPS
78: representation of call-by-value lambda calculus. In the context of
79: denotational semantics \cite{Stoy}, described as the theory of
80: meaning for programs, denotations are usually built with the help
81: of functions in some mathematical structures. In CPS these
82: functions are explicitly passed an additional argument $\kappa$
83: representing "the rest of the computation", a first step towards
84: full reification of the notion of computation itself. Although
85: most of the work on continuations involves a functional language,
86: usually a fragment of typed or untyped lambda calculus, CPS is
87: also useful in understanding imperative languages. Continuations
88: were found to be a major tool for the design of interpreters
89: and compilers for many languages, most prominently Scheme, ML and Haskell.
90:
91: As such, continuations appear as the raw material of control.
92: Operations on continuations control of the
93: unfolding of a computation---that this translation happens in a standard way
94: is a major result of the theoretical work on continuations.
95: Similarly, in inference systems, of which production systems are a
96: well-studied example \cite{PDIS}, designing a proper behaviour
97: relies on the delicate interweaving of \emph{goal-driven} and
98: \emph{data-driven} control. Moreover the importance of
99: the relative data-driven nature of the flow of control sets inference
100: systems apart from both functional and imperative
101: programming languages. In inference systems, both side-effects and
102: procedure calls are mediated through operations on data.
103: This paper shows the relevance
104: of the CPS representation for analysis and assessment of inference
105: systems.
106:
107:
108:
109: \emph{Monads and triples.} Originally introduced by Moggi \cite{
110: moggi91notions} in computer science, monads or triples, a notion
111: from category theory \cite{ttt}, were shown to generalise the
112: continuation-passing style transformation. Monads can model a wide
113: variety of features, including continuations, state, exceptions,
114: input-output, non-determinism, and parallelism \cite{
115: wadler93monads,wadler94monads}. The \emph{monadic style} essentially
116: distinguishes between values and computations ; it sets up a
117: uniform infrastructure for representing and manipulating
118: computations with effects as first-class objects \cite{
119: filinski94representing}. More specifically, monads introduce a
120: type constructor $T$ through which the type $T \tau$ represents a
121: computation that yields a result of type $\tau$ and may have
122: side-effects. Cast in this background, continuations arise as a
123: special case of monad translation.
124:
125:
126:
127: \emph{Inference systems.} Inference systems are characterised by a
128: program organization based on data or event-driven operations.
129: Such an organisation can be described as \emph{pattern directed}:
130: patterns occurring in the data select pieces of code in the system
131: to be activated. Systems organised this way have been used
132: primarily for deductive inference \cite{PDIS}, and, to a lesser
133: extent, for inductive inference or learning
134: \cite{Rappaport_1984_77}. An inference system has three
135: components: a collection of substructures, called pattern-directed
136: modules, which can be activated or fired by patterns in the data;
137: one or more data structures that may be examined and modified by
138: the pattern-directed modules; and an executive, or interpreter,
139: that controls the selection and activation of the modules. In
140: traditional production systems the pattern-directed modules are
141: called \emph{production rules} and reside in \emph{production
142: memory} while the data structures are stored in the \emph{working
143: memory}. The interpreter uses various strategies in a
144: \emph{recognize-act cycle} to select candidate production rules
145: and execute their action part.
146:
147: In the \emph{NXP architecture}, this mechanism is supplemented by
148: a goal driven process. Informally, goals are investigated by
149: actively looking for patterns in data, a search usually called
150: \emph{backward chaining} as it involves a recursive descent in the
151: generalized and-or tree representing linked production rules. Such
152: goals are confirmed, rejected or considered ``not known'' as the
153: data collected during the recursive descent trigger some of the
154: patterns of the production rules, a process dually called
155: \emph{forward chaining}. In addition goals are said to be
156: \emph{evoked} when, hitherto uninvestigated, they also depend on
157: data collected during the backward chaining descent from initial
158: goals. Eventually every goal sharing a triggered data pattern with
159: an initial goal ends up evoked. These evoked goals are queued for
160: later evaluation once the current agenda of goal investigation or
161: data collection completes, their investigation is suspended or
162: \emph{postponed} until the current one terminates. Considering
163: backward and forward chainings in NXP as computations in the
164: previously mentioned sense, goal evocation can be viewed as an
165: \emph{effect}, comparable, for instance, to the {\tt escape}
166: construct of functional programming languages. Following sections
167: of this paper show how to capture backward and forward chainings
168: as computations and account for goal evocation in
169: continuation-passing and monadic styles.
170:
171:
172: \emph{Logical frameworks.} Most of the work we are presenting in
173: this and its companion paper has been designed for validation and
174: checking with Twelf \cite{ pfenning99}, a logical framework
175: developed at Carnegie-Mellon University. In a logical framework
176: such as this, a deductive system is used as a device for
177: establishing semantic properties of the mathematical or
178: computational object under study. Following Martin-L\"{o}f
179: introduction of \emph{judgements} as the foundation for
180: constructive mathematics and computer science, logical frameworks
181: represent judgements as types enabling proof-theoretic derivations
182: by type reconstruction. Logical frameworks provide a uniform
183: \emph{judgements-as-types} representation of proofs of semantic
184: properties. In this paper we use the Twelf deductive system
185: to investigate the semantic properties of computations with
186: effects in the NXP inference system (which is itself a deductive
187: system at a different level of abstraction). Obviously terms such
188: as \emph{type}, \emph{deduction} and \emph{inference} have
189: overloaded meanings and we will specify precisely in which context
190: or system we use the latter when there is a risk of confusion.
191:
192: The rest of this paper is organised in three sections. The first one
193: explores a continuation-passing style representation
194: of the NXP computational architecture. The second presents
195: generalisation to a monadic style
196: description of various effects in the NXP architecture. The last
197: section uses the monadic results of the previous exploration to
198: suggest an abstract implementation of the NXP architecture in the
199: form of the \emph{NXP abstract machine} with implications for
200: building theoretically sound NXP interpreters or compilers.
201:
202: \section{Representing inference control with continuations}
203:
204: \subsection{Terminology}
205: In the following we will consider simple languages representing
206: various aspects of inference systems, and, more specifically of
207: the NXP architecture. Such languages consist of a syntax, defining
208: the set of well-formed types and well-typed terms, and a
209: semantics, assigning some notion of meaning to terms. We expect a
210: semantics to provide a notion of \emph{program evaluation} or
211: \emph{inference derivation}, usually represented as a (partial)
212: function from a suitable set of terms to some set of observable
213: results, say boolean values.
214:
215: In \emph{denotational semantics} the semantics is expressed as
216: functions mapping syntactic constructs in the program to
217: mathematical objects in a mathematical domain. Analyses of
218: programs are actually conducted in the mathematical domain, in
219: which properties assessed about denotations of programs translate
220: to properties about programs. In contrast, the \emph{operational
221: semantics} defines an abstract machine with a state, possibly
222: several components, and some set of primitive instructions. The
223: \emph{axiomatic approach} associates an ``axiom'' with each kind of
224: statement in the programming language stating what we may
225: assert after execution of that statement in terms of what was true
226: beforehand. There is usually a gap between a denotational
227: semantics and an operational semantics for a language and the main
228: formal work in designing interpreters and compilers is to prove
229: their soundness and faithfulness by proving equivalence
230: between these semantics. Bridging this gap requires to proceed in
231: several stages: at each stage an alternative semantic definition
232: of the language is introduces, embodying successively more and more
233: implementation details. These intermediate stage semantics are
234: called \emph{non-standard semantics} to reflect their instrumental
235: status. With these sequence of steps, denotational techniques
236: often drift into program transformations and non-standard
237: semantics have often been used as intermediate languages in
238: compilers.
239:
240: \subsection{Continuation-passing style transformations}
241: Practically all programming languages have some form of control
242: structure or jumping; the more advanced forms of control
243: structures tend to resemble function calls, so much so they are
244: rarely describes as jumps. The library function {\tt exit} in C,
245: for example, may be called with an argument like a function. Its
246: whole purpose, however, is utterly non functional: it jumps out of
247: arbitrarily many surrounding blocks and pending function calls.
248: Such a non-returning function or jump with arguments is an example
249: of why continuations are needed.
250:
251: In continuation-passing style, a function call is transformed
252: into a jump with arguments to the callee such that one of these is
253: a continuation that enables the callee to jump back to the caller.
254: This idea has been formalised into a standard CPS transformation
255: for lambda calculus.
256: \begin{defn} \label{d:Lambda-calculus}
257: The pure untyped lambda calculus $\Lambda$ is defined by \cite{
258: Barendregt-1997}: A set of terms, $M$, inductively generated over an
259: infinite set of variables $Vars$,
260: \begin{itemize}
261: \item Terms $M ::= V |\: M\: M $; \item Values $V ::= x |\:\lambda
262: x.M $, $x$ in $Vars$.
263: \end{itemize}
264: \end{defn}
265:
266: The standards CPS translation for $\Lambda$ in denotational
267: semantics is given by the map $[[\_]]_{F} : \Lambda \rightarrow
268: \Lambda$ originally described by Fisher:
269: \begin{defn} \label{d:FisherCPS}
270: Let $k$, $m$, $n$ in $Vars$ be variables that do not occur in the
271: argument to $[[\_]]_{F}$.
272: \begin{itemize}
273: \item $[[V]]_{F} = \lambda k.k \psi (V)$ \item $[[M N]]_{F}
274: = \lambda k.[[M]]_{F}(\lambda m.[[N]]_{F} \lambda n.(m k)n)$ \item
275: $\psi (x) = x$ \item $\psi (\lambda x.M) = \lambda k.\lambda
276: x.[[M]]_{F}k$
277: \end{itemize}
278: \end{defn}
279:
280: We will consider a much simpler fragment for the purpose of
281: analysis of effects and control in inference systems and, more
282: specifically, in the NXP architecture. The syntax is made up of
283: simple expressions of boolean arithmetic:
284: $$ E ::= true \:| false\: | E\: or\: E\: | E\: and\: E $$
285: In this simplified core model of the inference computation, goals
286: are represented by and/or tree expressions, the leaves of which are
287: the working memory elements. Keeping with the essentials,
288: these working memory elements are valued in ${\bf B} = \{0, 1\}$
289: the boolean algebra with boolean operations $\&$ (logical and) and
290: $|$ (logical or). This will be relaxed later as we take into
291: account other effects in the NXP architecture. In this toy model,
292: the pattern-directed modules or production rules of inference
293: systems are captured by expressions in the language, and the
294: executive system is naturally a boolean arithmetic evaluator.
295:
296: Following the denotational semantics approach, the standard
297: semantics for the simple and/or language is defined by:
298: \begin{defn} \label{d:stdsemNXP}
299: Let $B = \{true, false\}$ be the set of constants, $Exp$ the set
300: of expressions inductively defined as in the previous paragraph,
301: and ${\bf B}$ the classical boolean algebra, we define
302: the semantic function $\mathcal{B}: B \rightarrow {\bf B}$ and the
303: evaluation function $[[\_]]_{0}: Exp \rightarrow {\bf B}$ by:
304: \begin{itemize}
305: \item $[[B]]_{0} = \mathcal{B}(B)$ \item $[[E_{1}\:or\:E_{2}]]_{0}
306: = [[E_{1}]]_{0} \:|\: [[E_{2}]]_{0}$ \item
307: $[[E_{1}\:and\:E_{2}]]_{0} = [[E_{1}]]_{0} \:\&\: [[E_{2}]]_{0}$
308: \end{itemize}
309: \end{defn}
310:
311: This definition does not specify a particular order for evaluation
312: of boolean expressions, nor does it refer directly to
313: continuations. As such it is really denotational and only calls
314: for several steps of refinement to be turned into an operational
315: semantics. It does, however, reflect the basic principles of
316: inference systems introduced earlier and, properly extended, will
317: be an adequate basis on which to build the analysis. The next step
318: towards operational semantics now consists of providing a non-standard
319: semantics for the simple language and proving its denotational
320: equivalence with definition \ref{d:stdsemNXP}.
321:
322: \subsection{Two non-standard semantics of computations in the NXP architecture}
323: \subsubsection{Continuations to order goal investigations}
324: The first non-standard semantics introduced here makes explicit
325: reference to continuations. In order to better model the behaviour
326: of inference systems, we add a new operator to our simple and/or
327: tree language to capture sequential investigation of two or more
328: goals:
329: $$ E ::= true \:| false\: | E\: or\: E\: | E\: and\: E | E\:;\:E$$
330: where $E$ are expressions as in definition \ref{d:stdsemNXP}, and
331: $E_{1}\:;E_{2}$ describes investigation of goal $E_{1}$
332: followed by investigation of goal $E_{2}$. In imperative
333: programming languages, this is simply the succession of
334: computations of expressions $E_{1}$ and $E_{2}$. We can now define
335: the following non-standard semantics for this new language as
336: follows:
337: \begin{defn} \label{d:nonstdsemNXP1}
338: Let ${\bf K}$ be the set of continuations, in our case the set of
339: functions ${\bf B} \rightarrow O$ where $O$ is the type of output
340: objects with a distinguished element ${\tt exit}: {\bf B}
341: \rightarrow O$, and $B$, $Exp$ and $\mathcal{B}$ as in definition
342: \ref{d:stdsemNXP}, we define the evaluation function $[[\_]]_{1}:
343: Exp \rightarrow {\bf K} \rightarrow {\bf K}$ by:
344: \begin{itemize}
345: \item $[[B]]_{1}\:k = k\:\mathcal{B}(B)$ \item
346: $[[E_{1}\:or\:E_{2}]]_{1}\:k = [[E_{1}]]_{1}(\lambda
347: \epsilon_{1}.[[E_{2}]]_{1}(\lambda\epsilon_{2}.
348: k(\epsilon_{1}\:|\:\epsilon_{2})))$ \item
349: $[[E_{1}\:and\:E_{2}]]_{1}\:k = [[E_{1}]]_{1}(\lambda
350: \epsilon_{1}.[[E_{2}]]_{1}(\lambda\epsilon_{2}.
351: k(\epsilon_{1}\:\&\:\epsilon_{2})))$ \item
352: $[[E_{1}\:;\:E_{2}]]_{1}\:k = [[E_{1}]]_{1}(\:[[E_{2}]]_{1}\:k)$
353: \end{itemize}
354: \end{defn}
355:
356: The continuations introduced here follow the convention of mapping
357: results of computations, here values of goals inferred by the
358: system, to a specific output type. With these conventions in
359: place, the top level {\tt read-eval-print} loop of traditional
360: interpreters is captured by the simple semantics $$[[E]]_{1}\: {\tt
361: exit}$$ which basically executes the computation of the expression
362: $E$ and pass the result to the continuation that yields control
363: back to the user. In addition, definition \ref{d:nonstdsemNXP1}
364: introduces a left to right ordering of goal investigation and a
365: left to right ordering of the data collection process: this
366: appears in the semantics denotation in places where the left
367: subexpression of a composed expression is always evaluated first
368: with a continuation built from the computation of the right
369: subexpression. This of course, is already a form of design choice
370: and alternative semantics are easily produced to represent other
371: options. The embedded design choice then goes one step further
372: away from the standard denotational semantics down the road of an
373: implementation-ready operational semantics.
374:
375: \subsubsection{Sequence semantics for continuations}
376: With the refined non-standard semantics introduced in this section,
377: we are setting up a framework for representing the semantics of
378: control of in the NXP architecture. The following definitions
379: refine the operational semantics introduced in
380: \ref{d:nonstdsemNXP1}:
381: \begin{defn} \label{d:stacks}
382: We will use the notation $s = \langle B_{1},B_{2}...B_{n}\rangle$
383: for finite sequences of boolean values, $s$ in ${\bf B}^{*}$ the
384: set of finite sequences of boolean values; we will also use
385: indifferently ${\tt nil}$ and $\langle\:\rangle$ to designate the
386: empty sequence. The finite sequence abstract data type is
387: characterised by the following operations:
388: $$s\downarrow i = B_{i} , 0 < i \leq n \;(selection)$$
389: $$s\uparrow i = \langle B_{i+1},...B_{n}\rangle, 0 < i < n; {\tt nil}, i
390: = n \;(rest)$$
391: $$length(s) = n$$
392: $$s\:+\:s' = \langle
393: B_{1},B_{2}...B_{n},B'_{1},B'_{2}...B'_{m}\rangle\;(concatenation)$$
394: $$Or(s) = \langle\;s\downarrow1\:|\:s\downarrow2\;\rangle+s\uparrow2, n \geq 2$$
395: $$And(s) = \langle\;s\downarrow1\:\&\:s\downarrow2\;\rangle+s\uparrow2, n \geq 2$$
396: \end{defn}
397: The sequence abstract data type is comparable to the stack data
398: type introduced by Stoy \cite{Stoy} for operational semantics of
399: programming languages. In the NXP architecture the sequence
400: semantics captures the succession of goals investigated and evoked by
401: the systems. Based on definition \ref{d:stacks} we are now able to
402: present the (non-standard) sequence semantics for our simple
403: representation of the NXP architecture:
404: \begin{defn}\label{d:nonstdsemNXP2}
405: Let ${\bf B}^{*}$ be the set of finite boolean values sequences
406: with the previously defined abstract data type
407: operations, let $Exp$ and ${\bf B}$ be as in definition
408: \ref{d:nonstdsemNXP1} let $\mathcal{B}$ be, as before, the
409: semantic function of definition \ref{d:stdsemNXP}, we introduce
410: the new semantic evaluation function $[[\_]]_{2}: Exp \rightarrow
411: {\bf B}^{*} \rightarrow {\bf B}^{*}$:
412: \begin{itemize}
413: \item $[[B]]_{2}s = \langle\mathcal{B}(B)\rangle+s$ \item
414: $[[E_{1}\:or\:E_{2}]]_{2}\:s =
415: Or([[E_{2}]]_{2}([[E_{1}]]_{2}\:s))$ \item
416: $[[E_{1}\:and\:E_{2}]]_{2}\:s =
417: And([[E_{2}]]_{2}([[E_{1}]]_{2}\:s))$ \item
418: $[[E_{1}\:;\:E_{2}]]_{2}\:s = [[E_{2}]]_{2}([[E_{1}]]_{2}\:s)$
419: \end{itemize}
420: \end{defn}
421: \begin{rem}\label{r:left-to-right-ordering}
422: Note that, even though the notation is reversed, compared to
423: definition \ref{d:nonstdsemNXP1}, the previous evaluation function
424: still enforces a left to right order of evaluation on expressions
425: and subexpressions. The left subexpression is always evaluated
426: first and pushed on the sequence before evaluation of the right
427: subexpression. This is particularly important in the specific case
428: of expressions like $E_{1}\:;\:E_{2}$.
429: \end{rem}
430: \begin{rem}
431: In the sequence semantics of the simple NXP language,
432: continuations are actually ``implemented'' as finite sequences,
433: initially with type operations similar to stacks (sequence and
434: rest, for instance). In the NXP architecture, the boolean values
435: leaves of and/or trees are determined from examination of the
436: working memory or from interaction with the user (asking
437: questions). This operation is not captured in the simple language
438: we have presented so far but will be discussed later. It is important
439: however it somehow constrains when
440: operations are actually performed on a finite sequence $s$ of
441: ${\bf B}^{*}$. Namely, we consider that boolean values in a
442: sequence $s$ are valued \emph{just in time} when they are passed
443: as arguments to the sequence operations $And$, $Or$, and the like.
444: (This can be compared to lazy evaluation in languages like Scheme
445: or Haskell.) Boolean values in sequences of definition
446: \ref{d:nonstdsemNXP2} are then representations of \emph{delayed}
447: accesses to the working memory. The reifications of those delayed
448: executions, usually called \emph{thunks}, are
449: triggered by the goal investigation process as needed. The lazy
450: evaluation convention though does not affect the inference computation
451: nor the control of its unfolding.
452: \end{rem}
453: We ground the previous definition with the following result:
454: \begin{prop} \label{p:congruence}
455: The semantics \ref{d:stdsemNXP} and \ref{d:nonstdsemNXP2} are
456: \emph{congruent} i.e.
457: $$[[E]]_{2}\:s = \langle[[E]]_{0}\rangle+\:s$$ for all $E$ and for
458: all $s$.
459: \end{prop}
460: \begin{proof}
461: The proof is by induction on expressions. Obviously :
462: $$ [[B]]_{2}\:s = \langle\mathcal{B}(B)\rangle +\:s$$ by
463: definition. For the other syntactic forms of expressions :
464: \begin{align*}
465: [[E_{1}\:or\:E_{2}]]_{2}\:s &=
466: Or([[E_{2}]]_{2}([[E_{1}]]_{2}\:s)) \\
467: &= Or([[E_{2}]]_{2}(\langle[[E_{1}]]_{0}\rangle+\:s)) \\
468: &=
469: Or(\langle[[E_{2}]]_{0}\rangle+\:(\langle[[E_{1}]]_{0}\rangle+\:s))
470: \\
471: &= Or(\langle[[E_{2}]]_{0},[[E_{1}]]_{0} \rangle+\:s) \\
472: &= \langle[[E_{2}]]_{0}\:|\:[[E_{1}]]_{0} \rangle+\:s \\
473: &= \langle[[E_{1}\:or\:E_{2}]]_{0}\rangle +\:s
474: \end{align*}
475: and similarly for the $and$ expression.
476: \end{proof}
477:
478: \subsubsection{Formalisation in a Logical Framework (Twelf)}
479: The previous approach lends itself to formalisation in a Logical
480: Framework. In this section, we use the codification of
481: formalisation techniques into a meta-language offered by Twelf, a
482: logical framework developed at Carnegie-Mellon University \cite{
483: pfenning99}, to specify proofs of inference systems properties.
484: This is done in three stages : The first one is the representation
485: of the abstract syntax of the simple language under investigation;
486: the second stage is the representation of the language semantics
487: (both notions of values and types, and notions of computations,
488: i.e. operational semantics); the last stage is the representation
489: of the properties of the language (for instance, congruence of
490: semantics and type preservation).
491:
492: We base the representation of the simple language expressions on
493: abstract syntax (rather than concrete) in order to expose the
494: essential structure and focus on the semantics and its properties
495: rather than on lexical analysis and parsing---although the Logical
496: Framework could also help here. The representation technique is
497: called \emph{higher-order abstract syntax} and basically uses
498: types in Twelf to capture all syntax.
499: \begin{defn} \label{d:TwelfSyntax}
500: Going back to definition \ref{d:stdsemNXP} of the simple NXP
501: language, we then declare a new Twelf type: $${\tt exp : type.}$$
502: to represent expressions of $Exp$, intending that every Twelf
503: object $M$ of type {\tt exp} represents a simple language
504: expression and vice-versa. Similarly, we declare boolean values as
505: a new Twelf type: $${\tt bool : type.}$$ and declare two
506: distinguished instances of this newly defined {\tt bool} type:
507: $${\tt true : bool.}$$ $${\tt false : bool.}$$ With these basic
508: types in place we now turn to boolean values expressions built
509: from $\&$ and $|$ operators. Twelf captures these
510: operators as \emph{expression constructors} which translate into
511: constant functional types:
512: $${\tt \& : bool \:->\: bool \:->\: bool. }$$
513: $${\tt | : bool \:->\: bool \:->\: bool. }$$
514: for which we will use infix mode for better readability.
515: \end{defn}
516:
517: In a second stage we introduce a representation of deductions
518: following the idea of \emph{judgements-as-types}. In Twelf,
519: deductions are objects and judgements are types, so that proofs in
520: the semantics space are in fact type reconstructions in Twelf. In
521: order to do so, we introduce a type family {\tt eval0} indexed by
522: representations of expressions ($Exp$) and boolean values (${\bf
523: B}$):
524: $${\tt eval0 : exp \:->\: bool \:->\: type.}$$
525: such that we have types such as {\tt eval0 e b} which depend on
526: objects.
527:
528: Axioms are simply represented as types such as these,
529: while semantics operations can be viewed as constructors which,
530: given deductions of their arguments, yield a deduction of
531: their result.
532: \begin{defn} \label{d:TwelfStdSemantics}
533: For the simple NXP language we introduce:
534: $${\tt constant : bool \:->\: exp.}$$
535: $${\tt and : exp \:->\: exp \:->\: exp.}$$
536: $${\tt or : exp \:->\: exp \:->\: exp.}$$
537: and the following representation of their respective evaluations:
538: $${\tt eval0cst \;:\; eval0\; (constant B)\; B.}$$
539: $${\tt eval0or \;:\; eval0\; (E2 \;or\; E1)\; (V1 \;|\; V2) \;<-\; eval0\; E1\; V1\; <- \;eval0\; E2\;
540: V2.}$$ $${\tt eval0and \;:\; eval0\; (E2\; and\; E1)\; (V1\; \&\;
541: V2)\; <- \;eval0\; E1\; V1\; <-\; eval0\; E2\; V2.}$$ which
542: capture the standard semantics of definition \ref{d:stdsemNXP}.
543: \end{defn}
544: \begin{rem}
545: Using Twelf built-in deductive capabilities we can now explicit
546: deductions such as in the following example:
547: \begin{verbatim}
548: eval0 (constant true and constant true or constant false) V.
549: ---------- Solution 1 ----------
550: V = (false | true) & false.
551: A = eval0_and eval0_cst (eval0_or eval0_cst eval0_cst).
552: \end{verbatim}
553: which simply shows the result of the boolean operation as {\tt V}
554: and the deduction as {\tt A}. Note that the deduction {\tt A} may
555: be considered as a logical proof of the value {\tt V} or
556: alternatively as the logical program implementing {\tt eval0}
557: viewed as a function call.
558: \end{rem}
559:
560: The non-standard semantics of the NXP architecture are captured
561: in the same way, with the introduction of an additional
562: Twelf type, {\tt bool\_seq} for sequences of boolean values.
563: \begin{defn} \label{d:TwelfNonStdSem2}
564: The Twelf representation of the sequence semantics for NXP consists
565: firstly of an evaluation function which builds a functional
566: programming style representation of expressions of the simple
567: language :
568: \begin{verbatim}
569: %%% The sequence abstract data type
570: nil : bool_seq.
571: atom : bool -> bool_seq.
572:
573: select : bool_seq -> int -> bool.
574: tail : bool_seq -> int -> bool_seq.
575: length : bool_seq -> int.
576: + : bool_seq -> bool_seq -> bool_seq. %infix right 10 +.
577:
578: %%% The functional programming style representation
579: eval2 : exp -> bool_seq -> bool_seq -> type.
580:
581: eval2_cst : eval2 (constant B) nil (atom B).
582: eval2_cst2: eval2 (constant B) S (atom B + S).
583: eval2_or_s : eval2 (E1 or E2) S (or_s S2)
584: <- eval2 E1 S S1 <- eval2 E2 S1 S2.
585: eval2_and_s : eval2 (E1 and E2) S (and_s S2)
586: <- eval2 E1 S S1 <- eval2 E2 S1 S2.
587: \end{verbatim}
588: and secondly of an evaluation function that performs the
589: functionally specified operations on the input sequence :
590: \begin{verbatim}
591: eval2a : bool_seq -> bool_seq -> type.
592:
593: eval2a_atom : eval2a (atom B) (atom B).
594: eval2a_atom_2 : eval2a (atom B + S) (atom B + S')
595: <- eval2a S S'.
596: eval2a_or_s1 : eval2a (or_s (S'')) (atom(B1 | B2) + S')
597: <- eval2a S'' (atom B1 + atom B2 + S)
598: <- eval2a S S'.
599: eval2a_or_s2 : eval2a (or_s (S'')) (atom(B1 | B2))
600: <- eval2a S'' (atom B1 + atom B2).
601: eval2a_or_s_nil : eval2a (or_s (A1 + A2 )) (atom(B1 | B2) )
602: <- eval2a A1 (atom B1) <- eval2a A2 (atom B2).
603: eval2a_and_s1 : eval2a (and_s (S'')) (atom(B1 & B2) + S')
604: <- eval2a S'' (atom B1 + atom B2 + S)
605: <- eval2a S S'.
606: eval2a_and_s2 : eval2a (and_s (S'')) (atom(B1 & B2))
607: <- eval2a S'' (atom B1 + atom B2).
608: eval2a_and_s_nil : eval2a (and_s (A1 + A2 )) (atom(B1 & B2) )
609: <- eval2a A1 (atom B1) <- eval2a A2 (atom B2).
610: \end{verbatim}
611: The Twelf representation of the sequence semantics function is simply
612: the composition of the previously shown functions :
613: \begin{verbatim}
614: eval2b : exp -> bool_seq -> bool_seq -> type.
615: eval2b_1 : eval2b E S S' <- eval2 E S S'' <- eval2a S'' S'.
616: \end{verbatim}
617: \end{defn}
618: \begin{rem}
619: There are many ways the non-standard semantics of the NXP
620: architecture could have been mapped to Twelf's metalanguage.
621: Definition \ref{d:TwelfNonStdSem2} is in fact more ``operational''
622: than really needed. It separates a functional programming
623: equivalent of the expression parsed from the execution of this
624: representation. The previous definition emphasises the distinction
625: between \emph{instructions} and a \emph{machine} executing
626: instructions. The impact of this formal distinction will be
627: discussed in the implementation section of this paper.
628: \end{rem}
629: \begin{rem}
630: Theorem-proving capabilities of the Twelf environment are also
631: extremely useful in assessing properties of the simple language.
632: For instance, as a verification of the congruence of semantics we
633: can query Twelf as follows :
634: \begin{verbatim}
635: %query 1 1
636: eval2b ((constant true and constant false)
637: or constant true
638: or constant false) nil
639: S.
640: ---------- Solution 1 ----------
641: S = atom ((false | true) | false & true).
642: A =
643: eval2b_1
644: (eval2a_or_s2
645: (eval2a_or_s1 eval2a_atom
646: (eval2a_atom_2
647: (eval2a_atom_2
648: (eval2a_and_s2
649: (eval2a_atom_2 eval2a_atom))))))
650: (eval2_or_s (eval2_or_s eval2_constant2 eval2_constant2)
651: (eval2_and_s eval2_constant2 eval2_constant)).
652:
653: %query 1 1
654: eval0 ((constant true and constant false)
655: or constant true
656: or constant false) V.
657: ---------- Solution 1 ----------
658: V = (false | true) | false & true.
659: A = eval0_or
660: (eval0_and eval0_cst eval0_cst)
661: (eval0_or eval0_cst eval0_cst).
662:
663: \end{verbatim}
664: which basically shows deductions of proposition \ref{p:congruence} for
665: a particular expression.
666: \end{rem}
667:
668: Within this framework we are now ready to set forth the semantics
669: of some of the major inference features of the NXP architecture,
670: and, with the Logical Framework mapping introduced in this
671: section, ground some of its behavioural properties into
672: mathematical logic.
673:
674: \subsubsection{Sequence semantics for goal evocation}
675: The sequence semantics previously presented was designed to make
676: it relatively natural and simple to express goal evocation in the
677: NXP inference architecture. Informally goal evocation, which is
678: triggered by the data collection process, \emph{queues} a goal for
679: postponed investigation, once the current threads of
680: investigation complete. In turn, we will see that this sequence
681: semantics is suggestive of an operational semantics closer
682: to implementation.
683: \begin{defn} \label{d:nxpsyntax2}
684: Let us refine the simple NXP language with the following: $Exp$,
685: expressions: $$ E ::= b\;| E \;or\; E | E \;and\;E | E \;;\;E | b
686: \;post\;E$$ with $b$ in ${\bf B}$ and where the new expression $b
687: \;post\;E$ collects data value $b$ and evokes goal $E$ (which, as
688: an expression, is itself an arbitrarily long sequence of
689: expressions). The other expressions are as in definition
690: \ref{d:nonstdsemNXP2}.
691: \end{defn}
692: The semantics of goal evocation is captured by extending the
693: previous definition \ref{d:nonstdsemNXP2} of sequence semantics
694: with the additional mapping:
695: \begin{defn}\label{f:nonstdsemNXP3}
696: Let ${\bf B}^{*}$ be the set of finite boolean values sequences
697: with the previously defined selection, rest and concatenation
698: operations, let $Exp$ and ${\bf B}$ be as in definition
699: \ref{d:nonstdsemNXP2} let $\mathcal{B}$ be, as before, the
700: semantic function of definition \ref{d:stdsemNXP}, we extend the
701: semantic evaluation function $[[\_]]_{2}: Exp \rightarrow {\bf
702: B}^{*} \rightarrow {\bf B}^{*}$ as follows:
703: \begin{itemize}
704: \item $[[B]]_{2}s = \langle\mathcal{B}(B)\rangle+s$ \item
705: $[[E_{1}\:or\:E_{2}]]_{2}\:s =
706: Or([[E_{2}]]_{2}([[E_{1}]]_{2}\:s))$ \item
707: $[[E_{1}\:and\:E_{2}]]_{2}\:s =
708: And([[E_{2}]]_{2}([[E_{1}]]_{2}\:s))$ \item
709: $[[E_{1}\:;\:E_{2}]]_{2}\:s = [[E_{2}]]_{2}([[E_{1}]]_{2}\:s)$
710: \item $[[B\;post\;E]]_{2}\:s =
711: \langle\mathcal{B}(B)\rangle+\:s\:+[[E]]_{2}\langle\rangle$
712: \end{itemize}
713: \end{defn}
714: \begin{rem}
715: The definition of the $post$ semantic function captures the postponed
716: execution of the goal investigation. It effectively replaces the
717: current continuation with the same \emph{followed} by the evaluation
718: of a goal or a sequence of goals. The effect is of completing first the
719: current goal evaluation process and then pursue with the postponed goals
720: evoked during that initial process.
721: \end{rem}
722: \begin{rem}
723: Although it is implicit in the definition of sequences, the previous
724: definition relies on a left-to-right ordering of operation on
725: sequences as in remark \ref{r:left-to-right-ordering}. If we required
726: an operational semantics closer to implementation, we would enforce
727: this by letting sequences hold unevaluated expressions, or thunks, as
728: well as instructions of a virtual machine, transforming the sequence
729: abstract data type into a conventional stack abstract data type.
730: \end{rem}
731: \begin{rem}
732: In a refined operational semantics, we would need to be a bit more
733: specific than the previous definition \ref{f:nonstdsemNXP3}. More
734: formally, we already mentioned that boolean values in goal
735: expressions are actually evaluated by accessing the working memory
736: of the inference system (possibly causing further interaction with
737: the user). The working memory is represented as an environment
738: $\rho$, a map from a set of variables $Vars$ to boolean values
739: {\bf B}. Goal expressions then refer to variables and the
740: semantic function is expressed in term of $\rho$. The map $\rho$
741: is partial in the sense that its value is not necessarily defined on all
742: variables at all times. In the occurrence of one of these unvalued variables,
743: the goal evaluation process triggers an atomic transaction with
744: the user, requesting the value which is then assigned in $\rho$. In
745: programming languages this caching effect is also known as \emph{memoizing} and can be
746: captured in different ways in an operational semantics. The
747: postponed evaluation of goals then proceeds in the environment as
748: resulting from the preceding evaluation cycle. In this, postponed
749: evaluation differs from re-evaluation that would, in contrast,
750: proceed from scratch in a new environment $\rho_{0}$ totally
751: undefined.
752: \end{rem}
753: With this semantics we can review some elementary results
754: formalising the intuitions underlying goal evocation in the NXP
755: architecture.
756: \begin{prop}\label{p:commutative}
757: Sequencing and postponing goals are commutative operations in the
758: following restricted sense:
759: $$[[B_{1}\;post\;E_{1}\;;\;B_{2}\;post\;E_{2}]]s =
760: [[B_{1}\;post\;(E_{2}\;;\;E_{1})\;;\;B_{2}]]s$$
761: and $$[[B_{1}\;post\;E_{1}\;;\;B_{2}\;post\;E_{2}]]s =
762: [[B_{1}\;;\;B_{2}\;post;(E_{2}\;;\;E_{1})]]s$$
763: \end{prop}
764: \begin{proof}
765: By simple application of the concatenation operation on sequences.
766: \end{proof}
767: This simply states that a sequence of goals can be locally evoked either as a
768: group or individually without affecting the overall behaviour. In the
769: accompanying paper, this result will be set in the context of
770: knowledge acquisition mechanisms concurrently running with the
771: evaluation process. More specifically, this restricted commutativity
772: will be cast in the context of the chunking processes described by
773: Newell in the Soar architecture \cite{Soar}. From an implementation perspective,
774: proposition \ref{p:commutative} may be used for optimisation of the
775: execution by preprocessing goal expressions, gathering groups of
776: evoked goals rather than processing them individually. This would seem
777: particularly useful when building a lazy interpreter for the NXP
778: architecture.
779: \begin{rem}\label{r:evocation-as-exception}
780: Goal evocation in the NXP architecture is reminiscent of
781: \emph{exceptions} in functional of imperative programming
782: languages. In the latter the current continuation is replaced
783: immediately with another one. The behaviour is then to abandon the
784: current process and switch to another one. In the former the
785: continuation is just \emph{added} after the current one; the resulting
786: behaviour being to successively go through the processes to their
787: completion. The analogy will be better illustrated in the categorical
788: investigation of the NXP architecture in the next section.
789: \end{rem}
790: Goal evocation as exemplified by the NXP architecture is an
791: instance of associative cognitive mechanisms. Theses associative
792: processes are at work in the background of the evaluation process
793: itself interacting with it by driving the \emph{focus of
794: attention} of the inference system. While the backward and forward
795: chaining of the goal evaluation process could be
796: described as \emph{strong} or \emph{local} focus of attention,
797: goal evocation represents a form of \emph{global},
798: \emph{weaker} forward chaining. More generally if \emph{learning}
799: processes have traditionally been studied separately from the
800: performance component of inference systems, in recent resaerch work
801: emphasis has been put on accounting for interactions between
802: performance and learning \cite{Soar}. In this perspective
803: understanding and representing of inference control are critical
804: as control is the operational juncture between the two subsystems.
805: Furthermore the representation semantics is of direct service in
806: specifying the implementation of inference systems.
807:
808: \section{A categorical investigation of inference control}
809: \subsection{Motivations}
810: Category theory has in the past few years become a tool of choice for
811: to investigate properties of programming languages. Because it
812: focuses on very high-level properties abstracted from a number of
813: subdomains of mathematics (domain theory, topology, set theory, etc.) its results are
814: characterized by a wide range of applicability.
815:
816: Composition if of the essence of categories. It is only natural
817: that sequencing and its variations, which are the basic building
818: blocks of inference systems control constructs---as seen in the
819: previous section---, find a well-fit mathematical expression in
820: categorical terms. In this section, we suggest such a categorical
821: construction for analyses of control in inference systems.
822: Starting from the original insight of Moggi, we use triples, a
823: categorical construction with its origins in algebra, to define
824: effects and controls in the NXP architecture. Following Wadler's lines of
825: exploration we produce a category to account for goal evaluation
826: and goal evocation in the NXP architecture. Finally, other results
827: in \emph{monadic style} representation of computations offer
828: interesting analogies for the representation of behaviours of
829: inference systems, particularly in \emph{layering} effects around
830: the core categorical representation of the architecture. The
831: categorical study of knowledge acquisition in inference
832: systems---in its multi-faceted forms---constitutes the major part
833: of the companion paper covering the NXP architecture.
834:
835: \subsection{Categorical background}
836: This section defines the basic notions from category theory that
837: we need in the formalisation of the NXP architecture in
838: monadic style. Readers are referred to \cite{ttt} for a
839: comprehensive presentation of categories and triples. Let $C$ be a
840: category, we denote by $Obj(C)$ the objects of $C$ and by
841: $Hom(A,B)$ the set of arrows with source object $A$ and target
842: object $B$.
843: \begin{defn} \label{d:functor}
844: If ${\bf C}$ and ${\bf D}$ are categories, a functor $F:{\bf C}
845: \rightarrow {\bf D}$ is a map for which:
846: \begin{itemize}
847: \item If $f: A \rightarrow B$ is an arrow of ${\bf C}$, then $Ff:
848: FA \rightarrow FB$ is an arrow of ${\bf D}$; \item $F(id{_A}) =
849: id_{FA}$; and \item If $g: A \rightarrow B$, then $F(g \circ f) =
850: Fg \circ Ff$.
851: \end{itemize}
852: \end{defn}
853: A functor is a morphism of categories, a map which takes objects
854: to objects, arrows to arrows, and preserves source, target,
855: identities and composition. More generally $F$ {\em preserves} a
856: property P that an arrow $f$ may have if $F(f)$ has property P
857: whenever $f$ has. It {\em reflects} property P if $f$ has the
858: property whenever $F(f)$ has.
859:
860: A natural transformation is defined as a "deformation" of one
861: functor to another.
862: \begin{defn}
863: If $F:{\bf C} \rightarrow {\bf D}$ and $G:{\bf C} \rightarrow {\bf
864: D}$ are two functors, $\lambda : F \rightarrow G$ is a natural
865: transformation from $F$ to $G$ if $\lambda$ is a collection of
866: arrows $\lambda C \rightarrow \lambda D$, one for each object $C$
867: of ${\bf C}$, such that for each arrow $g: C \rightarrow C'$ of
868: ${\bf C}$ the following diagram commutes:
869:
870: \def\ComponentOne{{\lambda C}}
871: \def\FuncOne{{Fg}}
872: \def\FuncTwo{{Gg}}
873: \def\ComponentTwo{{\lambda C'}}
874: \begin{diagram}
875: FC & \rTo^{\ComponentOne} & GC \\
876: \dTo<\FuncOne & & \dTo>\FuncTwo \\
877: FC' & \rTo^{\ComponentTwo} & GC'\\
878: \end{diagram}
879:
880: The arrows $\lambda C$ are the components of $\lambda$. The
881: natural transformation $\lambda$ is a {\em natural equivalence} if
882: each component of $\lambda$ is an isomorphism in ${\bf D}$.
883: \end{defn}
884:
885: \subsection{Triples, monads and categories for computations}
886: Triple or monads are, from one point of view, abstraction of
887: certain properties of algebraic structures, namely monoids. They
888: are categorical constructs that originally arose in homotopy
889: theory and were used in algebraic theory. Moggi
890: \cite{moggi91notions} was the first to discover the connection
891: between triples and semantics of effects in programming language
892: design. Since then the monadic style has pervaded theoretical
893: research on denotational and operational semantics.
894: \begin{defn} \label{d:triple}
895: A triple ${\bf T} = (T, \eta, \mu)$ on a category ${\bf C}$ is an
896: endofunctor $T: {\bf C} \rightarrow {\bf C}$ together with two
897: natural transformations $\eta: id{_{{\bf C}}} \rightarrow T$,
898: $\mu: TT \rightarrow T$ subject to the following commutative
899: diagrams:
900:
901: \def\mut{{\mu T}}
902: \def\tmu{{T \mu}}
903: \begin{diagram}
904: TTT & \rTo^\mut & TT \\
905: \dTo<\tmu & {\rm associativity} & \dTo>{\mu} \\
906: TT & \rTo^{\mu} & T \\
907: \end{diagram}
908:
909: expressing associative identity, and:
910:
911: \def\etat{{\eta T}}
912: \def\teta{{T \eta}}
913: \begin{diagram}
914: T & \rTo^\etat & TT & \lTo^\teta & T \\
915: & \rdTo<{=} & \dTo>{\mu}& \ldTo>{=} & \\
916: & & T & & \\
917: \end{diagram}
918:
919: expressing left and right unitary identities. The component of
920: $\mu T$ at an object $X$ is the component of $\mu$ at $TX$, while
921: the component of $T \mu$ at X is $T(\mu X)$; similar descriptions
922: apply to $\eta$.
923: \end{defn}
924:
925: \begin{rem} \label{d:monad}
926: There is an alternate way of defining a triple based on a result
927: due to Manes.
928:
929: Let ${\bf C}$ be a category with:
930: \begin{itemize}
931: \item A function $T : Obj(C) \rightarrow Obj(C)$; \item for each
932: pair of objects $C$ and $D$, a function $Hom(C,TD) \rightarrow
933: Hom(TC,TD)$, denoted $f \rightarrow f^{*}$; \item for each object
934: $C$ of ${\bf C}$ a morphism $\eta C : C \rightarrow TC$;
935: \end{itemize}
936: subject to the following conditions:
937: \begin{itemize}
938: \item For $f : C \rightarrow TD$, $f = \eta TD \circ f^{*}$; \item
939: for any object $C$, $(\eta C)^{*} = id_{TC}$; \item for $f : C
940: \rightarrow TD$ and $g : D \rightarrow TE$, $(g^{*}\circ f)^{*} =
941: g^{*} \circ f^{*}$; \end{itemize} is equivalent to a triple on
942: ${\bf C}$.
943: \end{rem}
944:
945: The equivalence results from constructions of triples from adjoint
946: pairs separately discovered by Eilenberg-More and by Kleisli. The
947: function $ (\_)^{*}: Hom(C,TD) \rightarrow Hom(TC,TD)$ is also
948: known as the Kleisli star. This alternate definition emphasises
949: the connection between a triple and a monoid, an algebraic
950: structure with an associative operation and a unit element. Wadler
951: suggested a straightforward interpretation of the Kleisli star in
952: programming language semantics. In this context, the purpose of
953: the star operation is to combine two computations, where the
954: second computation may depend on a value yielded by the first
955: \cite{wadler94monads}. More precisely if $m$ is a computation of
956: type $T \tau_{1}$ and $k$ a function from values to computations
957: (such as a continuation) $\tau_{1} \rightarrow T\tau_{2}$, then
958: $k^{*}(m)$, or $m*k$, is of type $T\tau_{2}$ and represents the
959: computation that performs computation $m$, applies $k$ to the
960: value yielded by the computation, and then performs the
961: computations that results. It binds the result of computation $m$
962: in computation $k$. Different definitions for the triple $T$ and
963: the star operation then give rise to different monads to represent
964: different control operators such as {\tt escape/exit}, {\tt
965: call/cc}, {\tt prompt/control} or {\tt shift/reset}
966: \cite{wadler93monads,queinnec93library}.
967:
968: As noted by Wadler and others, monadic and continuation-passing
969: styles appear closely related \cite{wadler94monads}. The actual
970: correspondence, however, is formally quite involved. Filinski has
971: shown the remarkable result that \emph{any} monadic effect whose
972: definition is itself expressible in a functional language can be
973: synthetised from just two constructs: first-class continuations
974: and a storage cell \cite{filinski94representing,
975: filinski96controlling}. This has direct consequences on the
976: feasibility of various implementations of the NXP architecture as
977: investigated in the last section of this paper.
978:
979: \subsection{Monadic style semantics for capturing side-effects}
980: In most programming languages, evaluation may have implicit
981: side-effects that are not predicted by the type of the expression.
982: This is typically the case with goal evocation in the NXP
983: architecture where collecting data may trigger goals for further,
984: delayed evaluation. From their first introduction in the world of
985: programming languages triples, or monads, were precisely used to
986: distinguish between \emph{values}, and \emph{computations} whose
987: evaluation may have side-effects. The semantic separation leads to
988: a stratified style where a pure functional language, for instance,
989: is used to to express the manipulation of values, and one or
990: several monadic sublanguages are used to express manipulation of
991: computations \cite{ wadler94monads}. Layering several monadic
992: sublanguages is formalised using \emph{triple morphisms} as in
993: Filinski's \cite{ filinski96controlling}. Interactions between
994: the core functional language and the monadic extensions are
995: mediated by the type system which keeps track of the computational
996: effects and their propagation.
997:
998: Each side-effect introduces a new type of computation associated
999: to a particular triple $T$. For instance, triples have been defined for
1000: effects such as exceptions, state, and input/output. Generally
1001: speaking, in monadic style semantics the type $T a$ designates a
1002: computation yielding a value of type $a$ with possible
1003: side-effects. The semantics of these side-effects is captured by
1004: proper definition of the natural transformations of triple
1005: $T$ (its unit and Kleisli star). Formally and following Manes'
1006: construction, the unit of the triple turns a value into a
1007: computation that returns that value without side-effect:
1008: $$\eta : a \rightarrow Ta.$$
1009: The Kleisli star applies a function of type $a \rightarrow Tb$ to
1010: a computation of type $Ta$, basically chaining two computations in
1011: succession. Following Wadler's popular notation---of using $*$ as
1012: an infix operator---the star operation:
1013: $$ (*) : Ta \rightarrow (a \rightarrow Tb) \rightarrow Tb$$
1014: represents application of a continuation to a computation of the
1015: given type. In the following paragraphs we will make the analogy
1016: even more explicit by using the notation:
1017: $$m*\lambda x.k$$
1018: where $m$ and $k$ are expressions and $x$ is a variable. The above
1019: can be read as follows: perform the computation $m$, bind $x$ to
1020: the resulting value and perform computation $k$. In so-called
1021: impure languages, the above notation is similar to:
1022: $${\tt let\:x\:=\:m\:in\:k.}$$
1023: Note that, however, that the latter does not properly
1024: distinguishes pure types (no side-effect) from computation types
1025: (with possible side-effects).
1026:
1027: We are now ready to present the monadic style definition of the
1028: previous standard and non-standard evaluators for the NXP
1029: architecture.
1030:
1031: \subsection{A monadic style presentation of the NXP architecture}
1032: \subsubsection{The standard evaluation triple} The new
1033: presentation is simply a rework of the semantic evaluation
1034: functions of definition \ref{d:stdsemNXP} in terms of the
1035: appropriate triple, unit, and Kleisli star.
1036: \begin{defn}\label{d:msStdSem}
1037: The standard monadic style semantic of the simple and/or
1038: expression language is captured by the map $eval$:
1039: \begin{align*}
1040: eval &: Exp \rightarrow T {\bf B} \\
1041: eval(b) &= \eta b \\
1042: eval(E_{1}\;or\;E_{2}) &= eval(E_{1})*\lambda
1043: x.eval(E_{2})*\lambda y.\eta(x\;|\;y) \\
1044: eval(E_{1}\;and\;E_{2}) &= eval(E_{1})*\lambda
1045: x.eval(E_{2})*\lambda y.\eta(x\;\&\;y) \\
1046: \end{align*}
1047: \end{defn}
1048: \begin{rem}
1049: In the definition above lambda abstraction binds less tightly and
1050: the application star binds more tightly, so that we can get rid of
1051: parentheses.
1052: \end{rem}
1053: \begin{rem}
1054: The above evaluation is much more flexible in that is separates
1055: values from computations. Values are evaluated through the unit
1056: semantic function $\eta$ of the triple, while the sequencing of
1057: computations is specified by the application star. The variations
1058: in denotational semantics we explored in the successive
1059: definitions of the previous section are investigated here by
1060: simply changing definitions of unit and application.
1061: \end{rem}
1062:
1063: \subsubsection{The non-standard evaluation triple}
1064: In the sequence triple, a computation accepts an initial sequence
1065: and returns a value paired with the final sequence.
1066: \begin{defn}\label{d:msSeqSem}
1067: The sequence triple $T : \tau \rightarrow \tau\times {\bf B}^{*}$
1068: with unit $\eta$ defined by $$\eta(b) = \lambda x.(b, \langle b
1069: \rangle + x)$$ where sequence operations are as in definition
1070: \ref{d:stacks}; and with application star $$(*) : T\tau_{1}
1071: \rightarrow (\tau_{1} \rightarrow T \tau_{2}) \rightarrow
1072: T\tau_{2}$$ such that:
1073: $$
1074: m*k =\lambda x.{\tt let }\:(a, S_{1})\;=\;m\;x\: {\tt in}\:
1075: {\tt let }\:(b, S_{2})\;=\;k\;a\;x \:{\tt in }\: (b,
1076: S_{2})
1077: $$
1078: and the evaluator as in definition \ref{d:msStdSem} captures the
1079: non-standard sequence semantics of the NXP architecture.
1080: \end{defn}
1081: \begin{rem}
1082: The sequence triple is similar to the \emph{state triple}
1083: introduced by Moggi and by Wadler to represent programming
1084: languages with instructions operating on a global state. The
1085: seemingly awkward definition of the application star simply
1086: expresses the intermediary sequences and results of first
1087: computing $m$ and then applying $k$ to the result.
1088: \end{rem}
1089: \begin{rem}
1090: For evaluation purposes, the $eval$ semantic function of the
1091: generic interpreter of definition \ref{d:msStdSem} has $T\;{\bf
1092: B}$ as its domain, meaning that computations with the sequence
1093: triple are of type ${\bf B}\times {\bf B}^{*}$ as expected.
1094: \end{rem}
1095: In order to introduce goal evocation as a last touch to the
1096: sequence triple, we note that goal evocation is \emph{only} a
1097: side-effect and does not interfere with values. This critical
1098: characteristic which was somewhat implicit in the previous
1099: continuation passing style presentation can now be made explicit
1100: in the monadic style. We formalise this separation by defining an
1101: additional semantic function in the sequence triple $T$: $post$ of
1102: type $T\;()$, i.e. $() \rightarrow () \times {\bf B}^{*}$. The
1103: $()$ type signals that the $post$ function is only concerned with
1104: effects on sequences and basically ignores pure values.
1105: \begin{defn}
1106: The evocation function for the sequence triple is defined by:
1107: $$post : T\;()$$
1108: and
1109: $$post\; E = \lambda s.((),s + {\tt inr}(E))$$
1110: where {\tt inr} is the right injection ${\bf B}\times {\bf B}^{*}
1111: \rightarrow {\bf B}^{*}$.
1112: \end{defn}
1113: With the evocation function thus defined, we extend the definition
1114: of the standard interpreter to accommodate the goal evocation
1115: expression in the following final definition of the sequence
1116: triple.
1117: \begin{defn}\label{d:msNonStdSeqSem}
1118: The monadic style sequence semantic of the simple and/or
1119: expression language is captured by the map $eval$ in the sequence
1120: triple $T$:
1121: \begin{align*}
1122: eval &: Exp \rightarrow T {\bf B} \\
1123: eval(b) &= \eta b \\
1124: eval(E_{1}\;or\;E_{2}) &= eval(E_{1})*\lambda
1125: x.eval(E_{2})*\lambda y.\eta(x\;|\;y) \\
1126: eval(E_{1}\;and\;E_{2}) &= eval(E_{1})*\lambda
1127: x.eval(E_{2})*\lambda y.\eta(x\;\&\;y) \\
1128: eval(b \;post\; E) &= post E * \eta b \\
1129: \end{align*}
1130: \end{defn}
1131: In that definition we simply replaced $\eta b$ with $post \;E *
1132: \eta b$ in the computation. Properties of natural transformations
1133: assure that the call to $post$ indeed
1134: percolates through higher-level goal evaluations. In the
1135: definition \ref{d:msNonStdSeqSem}, as in the previous section
1136: definitions we insist on a left-to-right evaluation ordering of
1137: sequences.
1138: \begin{rem}
1139: Note that the definition of $post$ could be viewed as a family of
1140: semantic functions indexed by goals, $E$ in the representation. A
1141: better notation in this case would be $post_{E}$ to capture goal
1142: indices. Although equivalent in the monadic style presentation
1143: suggested above for the NXP architecture, this alternative
1144: notation will help formalise the knowledge acquisition part of the
1145: NXP inference architecture.
1146: \end{rem}
1147:
1148: \section{Operational semantics and implementation issues}
1149: \subsection{Layering effects in the NXP architecture}
1150: Filinski's and Wadler's work
1151: \cite{filinski96controlling,wadler94monads} have emphasised the
1152: compositional nature of triples in representing computations.
1153: Filinski's results in particular are geared towards presenting a
1154: framework for computational effects which makes it possible to
1155: describe effects in a \emph{modular} way---an idea which permeates
1156: the design of the Haskell functional programming language, to
1157: mention only one. Drawing on the latter framework, we are able to
1158: add \emph{inference effects}, so critical to the NXP architecture
1159: and to inference systems at large, \emph{incrementally}. Following
1160: this approach, the NXP architecture is actually specified by a
1161: sequence of definitional translations, each one of which
1162: ``translates away'' one level of inference effects. For example,
1163: we can refine the description of the NXP architecture with goal
1164: evocation and dynamic working memory by specifying it as a
1165: composition of a goal evocation and a working memory translation.
1166: Informally we will also talk of \emph{composing} a goal evocation
1167: and a working memory triple.
1168:
1169: The monadic composition relies on the following general idea:
1170: assume we have two triples $T$ and $U$ over a base language $L$,
1171: where $U$ is in a certain sense ``more general'' than $T$. There
1172: is a standard translation of effects, $L^{M}$ into the base
1173: language given their monadic representation. Using this
1174: monadic translation we can give two different translations from
1175: $L^{T}$ to $L$: the original monadic translation for $T$ and a
1176: variant translation using $U$-representations of $T$-effects,
1177: inducing the same evaluation semantics. The reader is referred to
1178: \cite{filinski96controlling} for a formal proof of these
1179: results---interestingly enough the crux of the demonstration relies
1180: on the result that continuations are in a precise sense a
1181: \emph{universal} effect: any definable triple can be simulated by
1182: a continuation triple on a language with only first-class
1183: continuations (a la Scheme) and typed states.
1184:
1185: We present here other control constructs of the NXP
1186: architecture and propose candidates for their formal monadic
1187: representation. The working memory, a major element of the NXP
1188: architecture, is detailed by itself in the next section.
1189:
1190: \subsubsection{Associative links, context}
1191: In the NXP architecture, the data-triggered goal evocation mechanism
1192: is complemented by second goal-triggered goal evocation process based
1193: on associative links between goals and goals or subgoals.
1194: Associative links are defined by a binary relation: the
1195: \emph{contextual} relation.
1196:
1197: \begin{defn}\label{d:nxpsyntax3}
1198: The simple base language for inference systems expression is extended
1199: with the {\tt context} expression. Let us define $Exp$, the set of
1200: expressions: $$ E ::= b\;| E \;or\; E | E \;and\;E | E \;;\;E | b
1201: \;post\;E | E\;context\;E$$ with $b$ in ${\bf B}$ and where the new expression $E
1202: \;context\;E$ evaluates the first expression and evokes the second (which, as
1203: an expression, is itself an arbitrarily long sequence of
1204: expressions). The other expressions are as in definition
1205: \ref{d:nxpsyntax2}.
1206: \end{defn}
1207: \begin{rem}
1208: The goal-triggered goal evocation has lower priority than the
1209: data-triggered one. In the concrete expression
1210: $E_{1}\;{\tt context }\;E_{2}$, any goal evoked by the evaluation of
1211: $E_{1}$ through {\tt post} subexpressions will be evaluated
1212: before actual evaluation of $E_{2}$. Priorities aside {\tt context}
1213: and {\tt post} have the same semantics.
1214: \end{rem}
1215: We reuse the sequence triple to capture the semantics of the
1216: context links in the NXP architecture.
1217: \begin{defn}\label{d:msContextSem}
1218: Associative links semantics in the NXP architecture extend the $eval$
1219: semantic map in the sequence triple as follows:
1220: $$eval(E_{1} \;context\; E_{2}) = post E_{2} * eval(E_{1}) $$
1221: \end{defn}
1222: \begin{rem}
1223: Note that the $post$ primitive is now used twice: in the triple
1224: sequence used to capture goal evocation, and in the second triple
1225: sequence used to capture context links.
1226: \end{rem}
1227:
1228:
1229: In some respect, associative links are higher level control
1230: constructs in inference systems. Compared to pattern-directed modules
1231: of the canonical architecture, they do not rely on actual patterns in
1232: the working memory elements but rather on the behaviour of the
1233: inference process itself, posting goals for later evaluation when the
1234: inference, seen as computation, reaches some particular goal or
1235: subgoal. Alternatively the composition
1236: of identical sequence triples could also be viewed as the more general
1237: triple defined by $T : \tau \rightarrow \tau\times {\bf B}^{*} \times
1238: {\bf B}^{*}$ where the second sequence is in fact used for
1239: contextually linked goals.
1240:
1241: \subsubsection{The {\tt reset} action}
1242: Independently of the various actions operating on the working memory,
1243: calling for a special treatment as explained in the next section,
1244: the NXP production rules support a {\tt reset} action on a variable
1245: with the immediate effect of setting its value to ``unknown'' regardless
1246: of its previous assignment.
1247:
1248: From the behavioural standpoint, the representation of the computation
1249: of such an expression is simply to \emph{unevaluate} it. When {\tt
1250: reset} operates on a data, the semantics is simply to reevaluate it
1251: should the computation need it at some later time. When {\tt reset}
1252: operates on a goal (an expression in the simple NXP language) however,
1253: the situation is quite different: indeed the reset goal, considered
1254: as data in a later computation, should be reevaluated but what of its
1255: antecedents? And what if further computations do not explicitly need
1256: the reset goal?
1257:
1258: In the NXP architecture the effect of the {\tt reset} action
1259: propagates back from the goal to its antecedents but does not affect
1260: posted goals from evocation or contextual links. This propagation only
1261: interferes with the computation process by forcing a reevaluation,
1262: i.e. a new expansion of the reset expression. Its semantics is handled
1263: by the working memory executive which mediates all accesses to values.
1264:
1265: \subsection{Working memory and user interaction}
1266: In the semantics definition of the previous sections we
1267: oversimplified an essential element of an inference system, namely
1268: the \emph{working memory}. The working memory is holding the data
1269: structures on which the pattern-directed modules operate. In the
1270: traditional view of inference systems, data is often said to
1271: \emph{enter} or \emph{leave} working memory as it is added or
1272: deleted by the action part of those modules. In the cognitive
1273: sciences framework, the working memory is the representation in
1274: inference systems of the \emph{short-term memory}. Inflow and
1275: outflow of data in working memory capture the essentially local
1276: characteristic of short term memory which can only hold a limited
1277: number of items (``the magic number seven'') and only for a short
1278: period of time.
1279:
1280: In most inference systems, and particularly in traditional
1281: production systems where pattern-directed modules are production
1282: rules, the working memory is usually structured as a database of
1283: records. Working memory elements are instances of pre-defined or
1284: user-declared data structures with typed attributes. The
1285: pattern parts of the pattern-directed modules simply express
1286: boolean conditions over these attributes or, in some cases such as in
1287: the OPS series of production system languages \cite{OPS5}, the
1288: presence or absence of particular records.
1289:
1290: As such the basic structure is similar to the concept of an
1291: \emph{environment} in denotational semantics. An environment tells what
1292: the identifiers mean in an expression: it says what values the
1293: identifiers denote. In order to capture the level of indirections, it
1294: it usual to introduce $Ide$ a set of identifiers (variable names) and
1295: the domain $U$ of environments as $U : Ide \rightarrow {\bf B}$. With
1296: this the simple NXP language complies with the following definition.
1297: \begin{defn}\label{d:nxpSyntax4}
1298: $Exp$ is now the set of expressions : $$ E ::= x\;| E \;or\; E | E
1299: \;and\;E | E \;;\;E |\; x \;post\;E |\; E \;context\;E $$ with $x$
1300: in $Ide$.
1301: \end{defn}
1302:
1303: In the NXP architecture, the working memory additionally plays a
1304: much more dynamic role. In a major departure from the \emph{closed
1305: world} assumption of former inference systems, the working
1306: memory of the NXP architecture is also the locus of interaction
1307: with the external environment, whether data is captured through
1308: interactions with users or from other
1309: applications and systems. Indeed the historical
1310: implementation, \emph{Nexpert Object}, is famous for its
1311: pioneering use of API (Application Programming Interface) and
1312: middleware to integrate inference capabilities to entreprise
1313: applications and information systems in heterogeneous
1314: environments. In order to fulfil this role, the NXP working memory
1315: has its own low-level simplified executive module whose
1316: responsibility is to plan for the acquisition of the value of a
1317: particular data required by the inference process. In
1318: most cases, the plan is a simple look-up of the value for
1319: the requested data in memory. If this fails, either because the
1320: value is unknown yet or because a previous inference operation has
1321: reset that particular data item, the value is sought outside the
1322: NXP system either by asking the user (this in
1323: the form of a popup \emph{question} window) or by sending the
1324: request to the middleware layer to distant databases and
1325: applications (\emph{Nexpert Object} has a wide variety of foreign
1326: request mechanisms ranging from file loading and SQL queries up to
1327: COM or Corba-based dynamic queries to application servers). Beyond
1328: the marshalling of the returned value, its type conversion, the
1329: working memory executive is also able to synchronise the request
1330: for value with the higher-level chaining and goal-driven
1331: processes. It may pre-process group of
1332: requests and use caching in order to optimise usage of costly data
1333: access channels, for instance.
1334:
1335: A second dimension of dynamic behaviour that sets the NXP working
1336: memory apart from former production systems architecture is the
1337: evolutive nature of the working memory structure. This
1338: aspect is fully developed in the companion paper, but let us
1339: simply state here that the NXP architecture allows for
1340: pre-processing of the working memory elements based on previous
1341: runs of the inference systems. This pre-processing helps determine
1342: the initial \emph{focus of attention} of the system which, in
1343: turn, assigns priorities in the goal-driven process. This
1344: experience-based \emph{skill acquisition} process
1345: differs from historically studied machine learning processes concerned
1346: with alteration of the collection of pattern-directed modules, or
1347: production memory, e.g. chunking \cite{Soar}.
1348:
1349: Operations of the working memory mini-executive are transparent to
1350: the semantic definitions of previous sections. An operational
1351: semantics that would be closer to actual implementation would
1352: replace the simple set of boolean constants, ${\bf B}$, with a
1353: family of semantic functions $get_{B}$, indexed on a set of
1354: variables, representing working memory elements and capturing the
1355: simple semantics of request-response interactions with the external
1356: environment. Those functions could be further refined either in an
1357: asynchronous framework, accounting for message-based communication
1358: channels for instance, or in a lazy evaluation framework in which
1359: the mini-executive is implemented as a lazy interpreter.
1360:
1361: The monadic style representation of the semantics of NXP
1362: architecture's working memory follows the layered approach
1363: discussed in the previous subsection. By analogy with Haskell's
1364: I/O monads \cite{HaskellBi98}, we introduce the following
1365: definition.
1366: \begin{defn}\label{d:wmTriple}
1367: The working memory monad $W$ describes a computation as a series of
1368: channels paired with the value returned.
1369: \begin{align*}
1370: {\bf type}\: W\;b =& \langle C_{1},\cdots,C_{n}, b\rangle \\
1371: \eta :& b \rightarrow W b \\
1372: \eta x =& \langle \star_{1},\cdots,\star_{n} , x \rangle \\
1373: (*) :& W b_{1} \rightarrow (b_{1} \rightarrow WM\;b_{2})
1374: \rightarrow W\;b_{2} \\
1375: m*k =& {\tt let}\; \langle csx_{1},\cdots,csx_{n},bx \rangle = m
1376: \;{\tt in}\\
1377: & {\tt let}\; \langle csy_{1},\cdots,csy_{n},by \rangle
1378: = k\;bx \;{\tt in}\\
1379: & \langle csx_{1};csy_{1},\cdots,csx_{n};csy_{n}, by\rangle \\
1380: \end{align*}
1381: where channels, $C_{i}$, represent distinct request-response
1382: communication channels; $csx_{i}$ denotes state $x$ of channel
1383: $C_{i}$; $\star_{i}$ denotes the undefined (in the strict sense)
1384: state of channel $C_{i}$ ; $bx$ and $by$ denote boolean values
1385: from {\bf B}, and $csx_{i};csy_{i}$ denotes the state of channel
1386: $C_{i}$ after stepping through state $csx_{i}$ then $csy_{i}$.
1387: \end{defn}
1388: \begin{rem}
1389: The previous definition does not specify the operational semantics of
1390: communication channels. In an operational semantics, channels
1391: would use the state monad which simply captures that requests and
1392: responses flowing through the channel change its state. The
1393: definition also leaves the number of channels unspecified. In the
1394: NXP architecture, at least one of these channels is identified as
1395: the user interaction channel through which question-answer based
1396: interactive transactions can be set up.
1397: \end{rem}
1398: \begin{rem}
1399: This definition also leaves the choice of channel state semantics
1400: open. In particular, different semantics are called for synchronous
1401: and asynchronous communication patterns.
1402: Similarly, definition \ref{d:wmTriple} excludes
1403: interaction between communication channels oversimplifying some of
1404: the capabilities of the mini-executive in the \emph{Nexpert
1405: Object} reference implementation of the NXP architecture.
1406: \end{rem}
1407: The advantage of the monadic style is in this instance twofold.
1408: Firstly, in the layered approach outlined in the previous
1409: paragraphs, the composition of monads adequately reflects the
1410: juncture of the inference processes, chainings and goal-driven, with
1411: interactive processes. Here this articulation is
1412: naturally represented by the composition of the NXP monad from
1413: definition \ref{d:msNonStdSeqSem} with the working memory monad
1414: from definition \ref{d:wmTriple}. This architectural pattern will
1415: be reused in the companion paper to present the composition of
1416: inference with associative and dynamic memory processes.
1417:
1418: Secondly, it brings forward the nature of the working memory
1419: mini-executive itself and suggests various operational
1420: implementations. In particular, lazy interpretation could be
1421: useful in the case of continuous data streams channels, for
1422: instance, or for combined management of interacting communication
1423: channels---both situations quite frequent in real-time enterprise
1424: application integration.
1425:
1426: \subsection{Operational semantics: an NXP machine}
1427: As a last step towards an operational semantic for the NXP
1428: architecture, we will think of the computation as described by the
1429: operation of a state-based machine: $${\tt until}\: term(\sigma)\:{\tt
1430: do}\: \sigma = step(\sigma)$$
1431: At each stage of the machine's operation the state, $\sigma$, is
1432: modified to a new state as specified by the single step state
1433: transformation function $step$; this is repeated until a terminal
1434: state, recognised by the predicate $term$, is reached. The operation
1435: is captured by a formal definition.
1436: \begin{defn}
1437: With $S$ a set of states, $step : S \rightarrow S$ a function from
1438: states to states, and $term : S \rightarrow {\bf B}$ a function from
1439: states to truth values, we define the general {\tt machine} function :
1440: $${\tt machine}(step, term) = fix(\lambda f
1441: \lambda\sigma.{\tt if}\;term(\sigma)\;{\tt then}\;\sigma\;{\tt
1442: else}\;f\circ step(\sigma))$$
1443: where the expression {\tt if ... then ... else} has the usual meaning
1444: and where $fix$ is the fixed point operator (here acting on $f$).
1445: \end{defn}
1446: \begin{rem}
1447: With this very general definition a machine is simply identified by
1448: the couple of functions $step$ and $term$.
1449: \end{rem}
1450: In order to exhibit an example of an NXP machine, we will
1451: transform the sequences of the previously described non standard
1452: semantics into operational sequences. Here the
1453: denotational semantics naturally suggests an operational
1454: semantics: instead of queuing up denoted values in sequences,
1455: we now queue \emph{instructions} for the NXP machine sequences. Each
1456: expression of the simple language is \emph{compiled} into a
1457: sequence of mixed instructions and arguments, constituting a
1458: program then passed to an abstract machine (with proper $term$ and
1459: $step$ definitions).
1460: \begin{defn}\label{d:programs}
1461: We introduce the syntactic domains: ${\bf B}$ of boolean values, $Ins$
1462: of instructions, $Prg = Ins^{*}$ of finite sequences of instructions,
1463: or programs.
1464:
1465: The syntax for instructions is:
1466: $$ I ::= {\tt get}\; x\;| {\tt and}\; | {\tt or}\; | {\tt reset}\; x$$
1467: The semantics of this machine code will basically arrange that, as
1468: might be expected, {\tt get} accesses the working memory to
1469: retrieve the value of a particular identifier---possibly triggering
1470: side-effects in the working memory executive---and place it on top
1471: of the sequence; {\tt or} will form the logical or of the top two
1472: elements on the sequence; similarly for {\tt and}; and {\tt reset}
1473: which causes the working memory executive to reset values used
1474: in the evaluation of a variable--possibly triggering side-effects.
1475: \end{defn}
1476: \begin{rem}
1477: With this definition in place we have shifted emphasis from a semantic
1478: function mapping expressions to denotations to a compiling function
1479: mapping expressions to syntactic values, namely sequences of instructions.
1480: \end{rem}
1481: The compiling function transforms expression into programs fit for the
1482: NXP machine.
1483: \begin{defn}\label{d:compiler}
1484: Following the lines of the non standard semantics, we introduce the
1485: compiling function $C : Exp \times Prg \times Prg \rightarrow Prg \times Prg$.
1486: \begin{align*}
1487: C(x,s,p) &= s+\langle {\tt get}\;x \rangle, p \\
1488: C(E_{1}\;or\;E_{2},s,p) &= {\tt let}\; s', p' =
1489: C(E_{2},C(E_{1},s,p))\;{\tt in}\; s'+\langle {\tt or}\rangle, p' \\
1490: C(E_{1}\;and\;E_{2},s,p) &= {\tt let}\; s', p' =
1491: C(E_{2},C(E_{1},s,p))\;{\tt in}\; s'+\langle {\tt and}\rangle, p' \\
1492: C(E_{1}\;;\;E_{2},s,p) &= C(E_{2},C(E_{1},s,p)) \\
1493: C(x\;post\;E,s,p) &= {\tt let}\; s', p' = C(E,\langle\rangle,\langle\rangle) \;{\tt
1494: in}\; s'+\langle {\tt get }\;x \rangle, p' + p \\
1495: C( E_{1}\;context\;E_{2},s,p) &= {\tt let}\; s', p' =
1496: C(E_{2},\langle\rangle,\langle\rangle) \;{\tt in}\;
1497: C(E_{1},s,p),p'+s'+p \\
1498: \end{align*}
1499: \end{defn}
1500: \begin{rem}
1501: In this definition we need two operational sequences $s$ and $p$. The
1502: first one represents direct continuations as processed by
1503: evaluation and evocation of goals. The second one is used to
1504: process contextual links and goal associations.
1505: \end{rem}
1506: \begin{rem}
1507: As remarked in the previous section, the compilation
1508: process could be captured by a specific triple then blurring the formal
1509: distinction between compilation and semantic evaluatiation of NXP simple
1510: language expressions.
1511: \end{rem}
1512: We now complete the definition of an operational semantics for the NXP
1513: architecture by displaying a $term$ and a $step$ function for the
1514: compiled code definitions \ref{d:compiler} and \ref{d:programs}.
1515: \begin{defn}
1516: Let $S = Prg \times {\bf N} \times {\bf B}^{*}$ be a set of
1517: states, and $\sigma = (\Pi,n,s)$ be a typical member of $S$.
1518: Simply enough, we specify the effect of any single instruction on
1519: the stack by defining a function $I: Ins \rightarrow {\bf B}^{*}
1520: \rightarrow {\bf B}^{*}$ as follows:
1521: \begin{align*}
1522: I( {\tt get}\;x, s) = &\: \langle get(x) \rangle + s \\
1523: I( {\tt reset}\;x, s) = &\: s \\
1524: I( {\tt or}\;,s) = &\: Or(s) \\
1525: I( {\tt and}\;,s) = &\: And(s)
1526: \end{align*}
1527: \end{defn}
1528: \begin{rem}
1529: In this definition $get$ and $reset$ are invocations of the working
1530: memory executive. In fact $reset: Ide \rightarrow \langle\rangle$
1531: always returns the empty stack while $get: Ide \rightarrow {\bf B}$
1532: returns the boolean value denoted by identifier $x$---both calls having
1533: possible side-effects. The semantics of these calls is represented by the
1534: working memory triple.
1535: \end{rem}
1536: Finally we define an instance of the abstract machine.
1537: \begin{defn}
1538: The NXP virtual machine is defined by the set $S$ of states, and the
1539: $term$ and $step$ functions:
1540: $$step(\Pi,n,s) = (\Pi,n+1,I(\Pi\downarrow i,s))$$
1541: and
1542: $$term(\Pi,n,s) = n \geq length(\Pi)$$
1543: \end{defn}
1544: \begin{rem}
1545: Thus when the machine executes a program $\Pi$ with the
1546: starting stack $s$--usually the empty stack $\langle\rangle$--the
1547: final stack is obtained as specified by the semantic function $M$,
1548: defined by:
1549: \begin{align*}
1550: M & : Prg \rightarrow {\bf B}^{*} \rightarrow {\bf B}^{*} \\
1551: M(\Pi,s) &= in_{3}({\tt machine}(step,term)(\Pi,1,s)) \\
1552: \end{align*}
1553: where $in_{3}$ is the canonical projection of the third element of the
1554: triplet.
1555: \end{rem}
1556: \begin{rem}
1557: Note that the final program passed to the NXP machine is the
1558: concatenation of the two compiled operational sequences, accounting
1559: for the lower priority of the associative links. This is captured in
1560: the following congruence proposition.
1561: \end{rem}
1562: \begin{prop}\label{p:congruenceCompiler}
1563: The effect of running a compiled expression is the one given by the
1564: sequence semantics, more specifically:
1565: $$M(\Pi,s) = [[E]]_{2}s$$
1566: where
1567: $$\Pi = \;{\tt let}\; s',p' = C(E,s,\langle\rangle)\;{\tt in}\:p'+s'$$ for
1568: all $E$, $s$.
1569: \end{prop}
1570: Interestingly enough the compiler is not the only operational
1571: semantics suggested by the non standard semantics studied
1572: above. Building an interpreter for the NXP architecture is also
1573: possible along the same lines as the compiler, but instructions are
1574: then executed on the fly. More generally several implementations can still
1575: be chosen at this stage, somehow independently of the operational
1576: semantics: eager or lazy evaluation, for instance, is applicable to
1577: both operational semantics.
1578:
1579: \section{Conclusions and further research}
1580: Inference may be considered as a special form of computing. This
1581: basic tenet of the whole Artificial Intelligence theoretical field
1582: has been comprehensively captured by Newell's \emph{knowledge
1583: level} hypothesis \cite{Newell93}, and if the distinction between
1584: behaviour and beliefs is well established, the nature of behaviour
1585: as computation is still much debated.
1586:
1587: Inspired by the analogy between the latter distinction and the one usually
1588: drawn between computations and values in the study and design of
1589: programming languages, we have presented an architectural view of
1590: control constructs in inference systems and, more specifically, of
1591: \emph{Nexpert Object} as a canonical inference system. The resulting
1592: NXP architecture has grounded semantics expressed in categorical terms
1593: using the monadic style of presentation.
1594:
1595: The benefits of this formal representation is twofold. On the
1596: one hand, the categorical foundation of this representation helps
1597: study of the NXP architecture in a logical framework, such as Twelf,
1598: and formally assess its architectural properties. On the
1599: other hand, the denotational (and derived non standard) semantics
1600: strongly suggest operational semantics for the implementation of the
1601: NXP architecture. Formally provable congruence relation henceforth
1602: ascertain the soundness of the resulting implementation.
1603:
1604: Finally the categorical foundation of the NXP architecture extends to
1605: the formal representation of other cognitive processes beyond
1606: inference. In particular the monadic style can be used
1607: to articulate the working memory process to inference in a modular
1608: way. It can also be used to express relationship between inference and
1609: learning or adaptative processes reflecting the long term effect of experience
1610: on inference.
1611:
1612: \bibliographystyle{amsplain}
1613: \bibliography{wip}
1614: \end{document}
1615: