cs0107013/cs0107013
1: \documentclass{article}
2: \newcommand{\C}[1]{\mbox{$\{{#1}\}$}}           % curly braces
3: \newcommand{\la}{\mbox{$\:\leftarrow\:$}}
4: \newcommand{\ra}{\mbox{$\:\rightarrow\:$}}
5: \newcommand{\LL}{\mbox{$\ldots$}}
6: \newcommand{\NI}{\noindent}
7: \newcommand{\VV}{\vspace{5 mm}}
8: \newcommand{\II}{\vspace{2 mm}}
9: \newcommand{\vect}[1]{{\bf #1}}
10: \newcommand{\Seq}[1]{{\bf #1}}
11: 
12: \usepackage{latexsym}
13: \usepackage{alltt}
14: 
15: %\setlength{\textwidth}{160mm}
16: %\setlength{\textheight}{240mm}
17: %\setlength{\oddsidemargin}{0in}
18: %\setlength{\topskip}{1cm}
19: %\setlength{\topmargin}{-0.3in}
20: %\raggedbottom
21: %\setlength{\abovedisplayskip}{3mm}
22: %\setlength{\belowdisplayskip}{3mm}
23: %\setlength{\abovedisplayshortskip}{0mm}
24: %\setlength{\belowdisplayshortskip}{2mm}
25: %\setlength{\normalbaselineskip}{12pt}
26: %\normalbaselines
27: 
28: 
29: 
30: \begin{document}
31: \date{}
32: 
33: \title{The Logic Programming Paradigm and Prolog}
34: \author{Krzysztof R. Apt}
35: 
36: \maketitle
37: 
38: \begin{abstract}
39:   This is a tutorial on logic programming and Prolog appropriate for a
40:   course on programming languages for students familiar with
41:   imperative programming.
42: \end{abstract}
43: 
44: 
45: \tableofcontents
46: 
47: \newpage
48: 
49: \section{History of Logic Programming}
50: \label{sec:history}
51: 
52: The logic programming paradigm has its roots in automated theorem
53: proving from which it took the notion of a deduction. What is new is
54: that in the process of deduction some values are computed.  The
55: creation of this programming paradigm is the outcome of a long history
56: that for most of its course ran within logic and only later inside
57: computer science.  Logic programming is based on the syntax of
58: first-order logic, that was originally proposed in the second half of
59: 19th century by Gottlob Frege and later modified to the currently used
60: form by Giuseppe Peano and Bertrand Russell.
61: 
62: In the 1930s Kurt G\"{o}del and Jacques Herbrand studied the notion of
63: computability based on derivations. These works can be viewed as the
64: origin of the ``computation as deduction'' paradigm.  Additionally,
65: Herbrand discussed in his PhD thesis a set of rules for manipulating
66: algebraic equations on terms that can be viewed now as a sketch of a
67: unification algorithm.
68: Some thirty years later in 1965 Alan Robinson published his
69: fundamental paper \cite{Rob65} that lies at the foundations of the field of
70: automated deduction.  In this paper he introduced the resolution
71: principle, the notion of unification and a unification algorithm.
72: Using the resolution method one can prove theorems of first-order
73: logic, but another step was needed to see how one could compute within
74: this framework.  
75: 
76: This was eventually achieved in 1974 by Robert Kowalski
77: \index{Kowalski, R.A.}  in his paper \cite{Kow74} in which logic
78: programs with a restricted form of resolution were introduced. The
79: difference between this form of resolution and the one proposed by
80: Robinson is that the syntax is more restricted, but proving now has a
81: side effect in the form of a satisfying substitution. This
82: substitution can be viewed as a result of a computation and
83: consequently certain logical formulas can be interpreted as programs.
84: In parallel, Alain Colmerauer \index{Colmerauer, A.}  with his
85: colleagues worked on a programming language for natural language
86: processing based on automated theorem proving.  This ultimately led to
87: creation of Prolog in 1973.  Kowalski and Colmerauer with his team
88: often interacted in the period 1971--1973.  This influenced their
89: views and helped them to crystallize the ideas.
90: 
91: Prolog can be seen as a practical realization of the idea of logic
92: programs.  It started as a programming language for applications in
93: natural language processing, but soon after it was found that it can be
94: used as a general purpose programming language, as well.  A number of
95: other attempts to realize the computation as deduction paradigm were
96: proposed around the same time, notably by Cordell Green and Carl
97: Hewitt, but the logic programming proposal, probably because it was the
98: simplest and most versatile, became most successful.
99: 
100: Originally, Prolog was implemented by Philippe Roussel,
101: \index{Roussel, Ph.}  a colleague of Colmerauer, in the form of an
102: interpreter written in Algol-W.  An important step forward was
103: achieved by David H. Warren \index{Warren. D.H}
104: who proposed in 1983 an abstract machine,
105: now called WAM (Warren Abstract Machine), that consists of a machine
106: architecture with an instruction set which serves as a target for
107: machine independent Prolog compilers.  WAM became a standard basis for
108: implementing Prolog and other logic programming languages.
109: 
110: The logic programming paradigm influenced a number of developments in
111: computer science. Already in the seventies it led to the creation of
112: deductive databases that extend the relational databases by providing
113: deduction capabilities.  A further impetus to the subject came
114: unexpectedly from the Japanese Fifth Generation Project for
115: intelligent computing systems (1982--1991) in which logic programming
116: was chosen as its basis.
117: More recently, this paradigm led to constraint logic
118: programming that realizes a general approach to computing in which the
119: programming process is limited to a generation of constraints
120: (requirements) and a solution of them, and to inductive logic
121: programming, a logic based approach to machine learning.
122: 
123: The above account of history of logic programming and Prolog shows its
124: roots in logic and automated deduction. In fact, Colmerauer and
125: Roussel write in \cite{CR96}: ``There is no question that Prolog is
126: essentially a theorem prover `\`{a} la Robinson.'  Our contribution
127: was to transform that theorem prover into a programming language.''
128: This origin of the logic paradigm probably impeded its acceptance
129: within computer science in times when imperative programming 
130: got impetus thanks to the creation of Pascal and C, the fields of
131: verification and semantics of imperative programs gained ground and
132: when the artificial intelligence community already adopted Lisp as
133: the language of their choice.
134: 
135: Here we offer an alternative presentation of the subject by focusing
136: on the ordinary programming concepts (often implicitly) present in
137: logic programming and by relating various of its ideas to those
138: present in the imperative and functional programming paradigms.
139: 
140: \section{Brief Overview of the Logic Programming Paradigm}
141: 
142: The logic programming paradigm substantially differs from other
143: programming paradigms.  When stripped to the bare essentials it can be
144: summarized by the following three features:
145: 
146: \begin{itemize}
147: 
148: \item computing takes place over the domain of all terms 
149: defined over a ``universal'' alphabet.
150: 
151: \item values are assigned to variables by means of automatically generated
152: substitutions, called {\em most general unifiers}. These values may
153: contain variables, called \emph{logical variables},
154: 
155: \item the control is provided by a single mechanism: automatic {\em backtracking}. 
156: 
157: 
158: \end{itemize}
159: 
160: In our exposition of this programming paradigm we shall stress the
161: above three points.  Even such a brief summary shows both the strength
162: and weakness of the logic programming paradigm.  Its strength lies in
163: an enormous simplicity and conciseness; its weakness has to do with
164: the restrictions to one control mechanism and the use of a single data
165: type.
166: 
167: So this framework has to be modified and enriched to accommodate it
168: to the customary needs of programming, for example by
169: providing various control constructs and by introducing the data type of
170: integers with the customary arithmetic operations.  This can be done
171: and in fact Prolog and constraint logic programming languages are
172: examples of such a customization of this framework.
173: 
174: \paragraph{Declarative programming} 
175: Two additional features of logic programming are important to note.
176: First, in its pure form it supports {\em declarative programming}.  A
177: {\em declarative program\/} admits two interpretations.  The first
178: one, called a {\em procedural interpretation\/}, \index{procedural
179:   interpretation} explains {\em how\/} the computation takes place,
180: whereas the second one, called a {\em declarative interpretation\/},
181: \index{declarative interpretation} is concerned with the question {\em
182:   what\/} is being computed.
183: 
184: Informally, the procedural interpretation is concerned with the {\em
185:   method}, whereas the declarative interpretation is concerned with
186: the {\em meaning}.  In the procedural interpretation a declarative
187: program is viewed as a description of an algorithm that can be
188: executed.  In the declarative interpretation a declarative program is
189: viewed as a formula, and one can reason about its correctness without
190: any reference to the underlying computational mechanism.  This makes
191: declarative programs easier to understand and to develop.
192: 
193: As we shall see, in some situations the specification of a problem in
194: the logic programming format already forms an algorithmic solution to
195: the problem. So logic programming supports declarative programming and
196: allows us to write {\em executable specifications}. It should be added
197: however, that in practice the Prolog programs obtained in this way are
198: often inefficient, so this approach to programming has to be combined
199: with various optimization techniques, and an appropriate understanding
200: of the underlying computation mechanism is indispensable.  To clarify
201: this point we shall present here a number of Prolog programs that are
202: declarative and eliminate from them various sources of inefficiency.
203: 
204: This dual interpretation of declarative programs also accounts for the
205: double use of logic programming ---as a formalism for programming and
206: for knowledge representation, and explains the importance of logic
207: programming in the field of artificial intelligence.
208: 
209: \paragraph{Interactive Programming}
210: Another important feature of logic programming is that it supports
211: {\em interactive programming}.  That is, the user can write a single
212: program and interact with it by means of various queries of interest
213: to which answers are produced.  The Prolog systems greatly support
214: such an interaction and provide simple means to compute one or more
215: solutions to the submitted query, to submit another query, and to
216: trace the execution by setting up, if desired, various check points,
217: all within the same ``interaction loop''.  This leads to a flexible
218: style of programming.
219: 
220: This is completely analogous to the way functional programs are used
221: where the interaction is achieved by means of expressions that need to
222: be evaluated using a given collection of function definitions.
223: 
224: \medskip
225: 
226: In what follows we shall introduce Prolog, the best known programming
227: language based on the logic programming paradigm. Prolog is then based
228: on a subset of first-order logic. We explain here how Prolog uses this
229: syntax in a novel way (this characteristic is called \emph{ambivalent
230:   syntax}) and extends it by a number of interesting features, notably
231: by supporting infix notation and by providing so-called
232: \emph{anonymous} and \emph{meta-variables}. These extensions amount to
233: more than syntactic sugar.  In fact, they make it possible to realize
234: in Prolog higher-order programming and meta-programming in a simple
235: way.
236: 
237: When discussing Prolog it is useful to abstract from the programming
238: language and first consider the underlying conceptual model provided
239: by logic programming.
240: 
241: 
242: \section{Equations Solved by Unification as Atomic Actions}
243: \label{sec:unification}
244: 
245: We begin by explaining how computing takes place at the ``atomic
246: level''.  In logic programming the atomic actions are equations
247: between terms (arbitrary expressions).  They are executed by means of
248: the unification process that attempts to solve such equations. In the
249: process of solving values are assigned to variables. These values can
250: be arbitrary terms.  In fact, the variables are all of one type that
251: consists of the set of all terms.
252: 
253: This informal summary shows that the computation process in 
254: logic programming is governed by different principles
255: than in the other programming paradigms.
256: 
257: \subsection{Terms}
258: \label{subsec:terms}
259: 
260: In a more rigorous explanation let us start by introducing an alphabet
261: that consists of the following disjoint classes of symbols:
262: \begin{itemize}
263: \item
264: {\em variables\/}, denoted by $x,y,z,\dots$ possibly with subscripts,
265: \item
266: {\em function symbols\/}, \index{function symbol}
267: \item
268: {\em parentheses\/}, ``('' and ``)'',
269: \item
270: {\em comma\/}, ``,''.
271: \end{itemize}
272: 
273: We also postulate that each function symbol has a fixed {\em
274: arity\/}, that is the number of arguments associated
275: with it.  0-ary function symbols are called
276: {\em constants\/},
277: and are usually denoted by $a,b,c,d,\dots$.
278: Below we denote function symbols of positive arity 
279: by $f,g,h,\dots$.
280: 
281: Finally, {\em terms\/} are defined inductively as follows:
282: \begin{itemize}
283: \item
284: a variable is a term,
285: \item
286: if $f$ is an $n$-ary function symbol and $t_1,\dots,t_n$ are terms, then
287: $f(t_1,\dots,t_n)$ is a term.
288: \end{itemize}
289: In particular every constant is a term.  Variable-free terms are usually
290: called {\em ground terms}. 
291: Below we denote terms by
292: $s,t,u,w, \dots$.  
293: 
294: For example, if $a$ is a constant, $x$ and $y$ are variables,
295: $f$ is a binary function symbol and $g$ a unary function symbol, then
296: $f(f(x,g(b)),y)$ is a term.
297: 
298: Terms are fundamental concepts in mathematical logic but at first
299: sight they seem to be less common in computer science.  However, they
300: can be seen as a generalization of the concept of a string familiar
301: from the theory of formal languages. In fact, strings can be viewed as
302: terms built out of an alphabet the only function symbols of which are
303: the concatenation operations in each arity (or alternatively, out of
304: an alphabet the only function symbol of which is the binary
305: concatenation operation assumed to be associative, say to the right).
306: Another familiar example of terms are arithmetic expressions. These
307: are terms built out of an alphabet in which as the function symbols we
308: take the usual arithmetic operations of addition, subtraction,
309: multiplication, and, say, integer division, and as constants 0, -1, 1, \LL .
310: 
311: In logic programming no specific alphabet is assumed. In fact, it is
312: convenient to assume that in each arity an infinite supply of function
313: symbols exists and that all terms are written in this ``universal
314: alphabet''.  These function symbols can be in particular the
315: denotations of arithmetic operations but no meaning is attached to
316: these function symbols. This is in contrast to most of the imperative
317: programming languages, in which for example the use of ``+'' in an
318: expression implies that we refer to the addition operation.  The other
319: consequence of this choice is that no types are assigned to terms.  In
320: fact, no types are assumed and consequently there is no distinction
321: between, say, arithmetic expressions, Boolean expressions, and terms
322: denoting lists.  All these terms are considered as being of one type.
323: 
324: \subsection{Substitutions}
325: \label{subsec:substitutions}
326: 
327: Unlike in imperative programming, in logic programming the variables
328: can be uninitialized.  Moreover, the possible values of variables are
329: terms. So to explain properly the computation process we need to
330: reexamine the notion of a state.
331: 
332: At any moment during the computation there will be only a finite
333: number of variables that are \emph{initialized} ---these are
334: variables to which in the considered computation some value was
335: already assigned. Since these values are terms, we are naturally led
336: to consider \emph{substitutions}. These are finite mappings from
337: variables to terms such that no variable is mapped to itself.  So
338: substitution provides information about which variables are initialized.
339: (Note that no variable can be initialized to itself which explains the
340: restriction that no variable is mapped to itself.)
341: 
342: Substitutions then form a counterpart of the familiar notion of a
343: {\em state\/} used in imperative programming.
344: We denote a substitution by $\C{x_1 / t_1, \dots, x_n / t_n}$.
345: This notation implies that $x_1, \dots, x_n$ are different variables,
346: $t_1, \dots, t_n$ are terms and that no term $t_i$ equals the variable $x_i$.
347: We say then that the substitution $\C{x_1 / t_1, \dots, x_n / t_n}$
348: \emph{binds} the variable $x_i$ to the term $t_i$. 
349: 
350: Using a substitution we can evaluate a term in much the same way as
351: using a state we can evaluate an expression in imperative programming
352: languages.  This process of evaluation is called an \emph{application}
353: of a substitution to a term.  It is the outcome of a simultaneous
354: replacement of each variable occurring in the domain of the
355: substitution by the corresponding term.  So for example the
356: application of the substitution $\C{x/f(z), y/g(z)}$ to the term
357: $h(x,y)$ yields the term $h(f(z),g(z))$. Here the variable $x$ was
358: replaced by the term $f(z)$ and the variable $y$ by the term $g(z)$.
359: In the same way we define an application of a substitution to an atom,
360: query, or a clause.
361: 
362: So an evaluation of a term using a substitution yields again a term.
363: This is in contrast to imperative programming where an evaluation
364: of an expression using a state yields a value that belongs to the type
365: of this expression.  
366: 
367: \subsection{Most General Unifiers}
368: \label{subsec:mgu}
369: 
370: As already mentioned, in logic programming the atomic actions are equations
371: between terms and the unification process is used to determine their meaning.
372: Before we discuss these matters in detail let us consider some
373: obvious examples of how solving equations can be used as an assignment.
374: 
375: We assume that all mentioned variables are uninitialized.  By writing
376: $x = a$ we assign the constant $a$ to the variable $x$. Since in logic
377: programming the equality ``='' is symmetric, the same effect is
378: achieved by writing $a = x$.  More interestingly, by writing $x =
379: f(y)$ (or, equivalently, $f(y) = x$) we assign the term $f(y)$ to the
380: variable $x$. Since $f(y)$ is a term with a variable, we assigned to
381: the variable $x$ an expression with a variable in it. Recall that a
382: variable that occurs in a value assigned to another variable is called
383: a logical variable. So $y$ is a logical variable here.  The use of
384: logical variables is an important distinguishing feature of logic
385: programming and we devote the whole Subsection \ref{subsec:logical} to
386: an explanation of their use.  Finally, by writing $f(y) = f(g(a))$ we
387: assign the term $g(a)$ to the variable $y$, as this is the way
388: to make these two terms equal.
389: 
390: These examples show that the equality ``='' in logic programming
391: and the assignment in C, also written using ``='', are totally
392: different concepts.
393: 
394: Intuitively, \emph{unification} is the process of solving an equation
395: between terms (i.e., of making two terms equal) in a least
396: constraining way. The resulting substitution (if it exists) is called
397: a \emph{most general unifier} (\emph{mgu}).  For example, the equation $x =
398: f(y)$ can be solved (i.e., the terms $x$ and $f(y)$ \emph{unify}) in a
399: number of different ways, for instance by means of each of the
400: substitutions $\C{x/f(y)}$, $\C{x/f(a), y/a}$, $\C{x/f(a), y/a,
401:   z/g(b)}$, Clearly only the first one is ``least constraining''. In
402: fact, out of these three substitutions the first one is the only most
403: general unifier of the equation $x = f(y)$.
404: The notion of a least constraining substitution can be made precise by
405: defining an order on substitutions. In this order the substitution
406: $\C{x/f(y)}$ is more general than $\C{x/f(a), y/a}$, etc.
407: 
408: Note that we made the terms $x$ and $f(y)$ equal by instantiating only
409: one of them.  Such a special case of the unification is called
410: \emph{matching} which is the way of assigning values in functional
411: programming languages.  Unification is more general than matching as
412: the following, slightly less obvious, example shows.  Consider the
413: equation $f(x,a) = f(b,y)$. Here the most general unifier is $\C{x/b,
414:   y/a}$.  In contrast to the previous example it is now not possible to
415: make these two terms equal by instantiating only one of them.
416: 
417: The problem of deciding whether an equation between terms has a
418: solution is called the {\em unification problem}. \index{unification
419:   problem} Robinson showed in \cite{Rob65} that the unification
420: problem is decidable. More precisely, he introduced a unification
421: algorithm with the following property.  If an equation between terms
422: has a solution, the algorithm produces an mgu and otherwise it reports
423: a failure.  An mgu of an equation is unique up to renaming of the
424: variables.
425: 
426: \subsection{A Unification Algorithm} \label{subsec:mm}
427: 
428: In what follows we discuss the unification process in more detail using
429: an elegant unification algorithm introduced in Martelli
430: and Montanari \cite{MM82}.
431: This algorithm takes 
432: as input a finite set of term equations $\C{s_1 = t_1, \LL, s_n = t_n}$
433: and tries to produces an mgu of them.
434: \VV
435: 
436: \NI
437: {\sc Martelli--Montanari Algorithm}
438: \II
439: 
440: \NI
441: Nondeterministically choose from the set of equations an equation of a
442: form below and perform the associated action.
443: \begin{tabbing}
444: \= (2') \=  $x=t$ where $x$ occurs in $t$ and $x$ differs from $t$x \= \kill \\
445: \> (1) \> $f(s_1 ,...,s_n) = f(t_1 ,...,t_n)$                           \> {\em
446: replace by the equations}  \\
447: \>     \>                                                               \> $s_1
448: = t_1 ,...,s_n = t_n$, \\[2mm]
449: \> (2) \> $f(s_1 ,...,s_n) = g(t_1 ,...,t_m)$ where $f \neq g$          \> {\em
450: halt with failure}, \\[2mm]
451: \> (3) \> $x=x$                                                         \> {\em
452: delete the equation}, \\[2mm]
453: \> (4) \> $t=x$ where $t$ is not a variable                             \> {\em
454: replace by the equation} $x=t$, \\[2mm]
455: \> (5) \> $x=t$ where $x$ does not occur in $t$                         \> {\em
456:  apply the substitution \C{x/t}} \\
457: \>     \> and $x$ occurs elsewhere                                      \> {\em
458:  to all other equations}\\[2mm]
459: \> (6) \> $x=t$ where $x$ occurs in $t$ and $x$ differs from $t$        \> {\em
460:  halt with failure.} 
461: \end{tabbing}
462: \medskip
463: 
464: The algorithm terminates when no action can be performed or when
465: failure arises. In case of success, by changing in the final set of
466: equations all occurrences of ``$=$'' to ``$/$'' we obtain the desired
467: mgu.  Note that action (1) includes the case $c=c$ for every constant
468: $c$ which leads to deletion of such an equation. In addition, action
469: (2) includes the case of two different constants.  Finally, note that
470: no action is performed when the selected equation is of the form $x=t$
471: where $x$ does not occur occur elsewhere (so a fortiori does not occur
472: in $t$). In fact, in case of success all equations will be of such a
473: form.
474: 
475: To illustrate the operation of this algorithm reconsider the equation
476: $f(x,a) = f(b,y)$.  Using action (1) it rewrites to the set of two
477: equations, $\C{x = b, a = y}$. By action (4) we now get the set 
478: $\C{x = b, y = a}$. At this moment the algorithm terminates and we obtain the mgu
479: $\C{x/b, y/a}$.
480: 
481: So by interpreting the equality symbol as the request to find
482: a most general unifier of the considered pair of terms,
483: each equation is turned into
484: an atomic action that either produces a substitution
485: (a most general unifier) or \emph{fails}.
486: This possibility of a failure at the level of an atomic
487: action is another distinguishing feature of logic programming.
488: 
489: By writing a sequence of equations we can create very
490: simple logic programs that either succeed and produce
491: as output a substitution or fail.
492: It is important to understand how the computation then proceeds.
493: We illustrate it by means of three progressively more complex
494: examples.
495: 
496: First, consider the sequence 
497: \[
498: f(x,a) = f(g(z),y), \ h(u) = h(d).
499: \]
500: The first equation yields first the intermediate substitution
501: $\C{x/g(z), y/a}$ and the second one the substitution $\C{u/d}$. By
502: combining these two substitutions we obtain the substitution
503: $\C{x/g(z), y/a, u/d}$ produced by this logic program.
504: 
505: As a slightly less obvious example consider the sequence 
506: \[
507: f(x,a) = f(g(z),y), \ h(x,z) = h(u,d).
508: \]
509: Here the intermediate substitution $\C{x/g(z), y/a}$ binds the
510: variable $x$ that also occurs in the second equation.  This second
511: equation needs to be evaluated first in the ``current state'', here
512: represented by the substitution $\C{x/g(z), y/a}$, before being
513: executed. This evaluation produces the equation $h(g(z),z) = h(u,d)$.
514: This equation yields the most general unifier $\C{u/g(d), z/d}$ and
515: the resulting final substitution is here $\C{x/g(d), y/a, u/g(d),
516:   z/d}$.
517: 
518: What happened here is that the substitution $\C{u/g(d), z/d}$ was
519: \emph{applied} to the intermediate substitution $\C{x/g(z), y/a}$.
520: The effect of an application of one substitution, say
521: $\delta$, to another, say $\gamma$,
522: (or of \emph{composition} of the substitutions) is obtained by 
523: \begin{itemize}
524: \item applying 
525: $\delta$ to each of the terms that appear in the range of $\gamma$,
526: 
527: \item adding to the resulting substitution the bindings to the 
528: variables that are in the domain of $\delta$ but not in the domain of $\gamma$.
529: \end{itemize}
530: In the above example the first step yields the substitution
531: $\C{x/g(d), y/a}$ while the second step adds the bindings $u/g(d)$ and
532: $z/d$ to
533: the final substitution.  This process of substitution composition
534: corresponds to an \emph{update of a state} in imperative programming
535: and that is how we shall refer to it in the sequel.
536: 
537: As a final example consider the sequence  
538: \[
539: f(x,a) = f(g(z),y), \ h(x,z) = h(d,u).
540: \]
541: It yields a failure. Indeed, after executing the first equation the
542: variable $x$ is bound to $g(z)$, so the evaluation of the second
543: equation yields $h(g(z),z) = h(d,u)$ and no substitution makes equal
544: (unifies) the terms $h(g(z),z)$ and $h(d,u)$.
545: 
546: It is useful to compare solving equations by unification with the
547: assignment command.  First, note that, in contrast to assignment,
548: unification can assign an arbitrary term to a variable. Also it
549: can fail, something the assignment cannot do.  On the other hand, using
550: assignment one can modify the value of a variable, something
551: unification can perform in a very limited way: by further
552: instantiating the term used as a value.  So these atomic actions are
553: incomparable.
554: 
555: \section{Clauses as Parts of Procedure Declarations}
556: \label{sec:lp}
557: 
558: Logic programming is a rule based formalism and Prolog is a rule based language.
559: In this context the rules are called clauses.
560: To better understand the relationship between logic programming and
561: imperative programming we proceed in two steps and introduce
562: a restricted form of clauses first.
563: 
564: \subsection{Simple Clauses}
565: \label{subsec:simple-clauses}
566: 
567: Using unification we can execute only extremely simplistic programs
568: that consist of sequences of equations.  We now enrich this framework
569: by adding procedures.  In logic programming they are modelled by means
570: of {\em relation symbols}, \index{relation symbol} sometimes called
571: {\em predicates\/}.  Below we denote relation symbols by
572: $p,q,r,\dots$.  As in the case of the function symbols, we assume that
573: each relation symbol has a fixed arity associated with it. When the
574: arity is 0, the relation symbol is usually called a {\em propositional
575:   symbol}. \index{propositional symbol}
576: 
577: If $p$ is an $n$-ary relation symbol and $t_1,\dots,t_n$ are terms,
578: then we call $p(t_1,\dots,t_n )$ an {\em atom}. \index{atom} When $n =
579: 0$ the propositional symbols coincide with atoms.  Interestingly, as
580: we shall see, such atoms are useful.  Intuitively, a relation symbol
581: corresponds to a \emph{procedure identifier} and an atom to a
582: \emph{procedure call}.  The equality symbol ``='' is a binary relation
583: symbol written in an infix form, so each equation is also an atom.
584: However, the meaning of equality is determined, so it can be viewed as
585: a built-in procedure, i.e., a procedure with a predefined meaning.
586: 
587: We still need to define the procedure declarations
588: and to clarify the parameter mechanism used.
589: Given an $n$-ary relation symbol $p$ and atoms $A_1, \LL, A_k$ we call
590: an expression of the form
591: \[
592: p(x_1, \LL, x_n) :- \: A_1, \LL, A_k.
593: \]
594: a \emph{simple clause}. \index{simple clause} $p(x_1, \LL, x_n)$ is called
595: the \emph{head} of the clause and $A_1, \LL, A_k$ its \emph{body}.
596: The fullstop ``.'' at the end of the clause is important: it
597: signals to the compiler (or interpreter) that the end of the clause is
598: encountered.
599: 
600: The procedural interpretation of a simple clause $p(x_1, \LL, x_n) :-
601: \: A_1, \LL, A_k$ is: ``to establish $p(x_1, \LL, x_n)$ establish
602: $A_1, \LL, A_k$'', while the declarative interpretation is: ``$p(x_1,
603: \LL, x_n)$ is true if $A_1, \LL, A_k$ is true''.  The declarative
604: interpretation explains why in the logic programming theory the reversed
605: implication symbol ``$\la$'' is used instead of ``{\tt :-}''.
606: 
607: Finally a \emph{simple logic program} \index{simple logic program} is
608: a finite set of clauses.  Such a program is activated by providing an
609: initial \emph{query}, \index{query} which is a sequence of atoms.  In
610: the imperative programming jargon a query is then a program and a
611: simple logic program is a set of procedure declarations.  Intuitively, given
612: a simple program, the set of its simple clauses with the same relation symbol
613: in the head corresponds to the procedure declaration in the imperative
614: languages.  One of the syntactic confusions is
615: that in logic programming the comma ``,'' is used as a separator
616: between the atoms constituting a query, whereas in the imperative
617: programming the semicolon ``;'' is used for this purpose.
618: 
619: 
620: \subsection{Computation Process}
621: \label{subsec:computing}
622: 
623: A nondeterminism is introduced into this framework by
624: allowing \emph{multiple clauses} with the same relation symbol in the head.
625: In the logic programming theory this form of nondeterminism (called
626: {\em don't know nondeterminism\/}) \index{don't know nondeterminism}
627: is retained by considering all computations that can be generated by 
628: means of multiple clauses and by retaining the ones that lead to
629: a success. ``Don't know'' refers to the fact that in general we do not know
630: which computation will lead to a success.
631: 
632: In Prolog this computation process is made deterministic by ordering
633: the clauses by the textual ordering and by employing automatic
634: backtracking to recover from failures.  Still, when designing Prolog
635: programs it is useful to have the don't know nondeterminism in mind.
636: In fact, in explanations of Prolog programs phrases like ``this
637: program nondeterministically guesses an element such that \LL'' are
638: common.  Let us explain now more precisely how the computing takes
639: place in Prolog.  To this end we need to clarify the procedure
640: mechanism used and the role played by the use of multiple clauses.
641: 
642: The procedure mechanism associated with the simple clauses introduced
643: above is \emph{call-by-name} according to which the formal parameters
644: are simultaneously substituted by the actual ones. So this procedure
645: mechanism can be simply explained by means of substitutions: given a
646: simple clause $p(x_1, \LL, x_n) :- \: A_1, \LL, A_k .$ a procedure
647: call $p(t_1,\dots,t_n )$ leads to an execution of the statement $(A_1,
648: \LL, A_k) \C{x_1 / t_1, \dots, x_n / t_n}$ obtained by applying the
649: substitution $\C{x_1 / t_1, \dots, x_n / t_n}$ to the statement $A_1,
650: \LL, A_k$.
651: (We assume here that the
652: variables of the clauses are appropriately renamed to avoid variable
653: clashes.)  Equivalently, we can say that the procedure call
654: $p(t_1,\dots,t_n)$ leads to an execution of the statement $A_1, \LL,
655: A_k$ in the state (represented by a substitution) updated by 
656: the substitution $\C{x_1 / t_1, \dots, x_n / t_n}$.
657: 
658: The clauses are tried in the order they appear in the program text.
659: The depth-first strategy is implied by the fact that a procedure call
660: leads directly to an execution of the body of the selected simple
661: clause.  If at a certain stage a failure arises, the computation
662: backtracks to the last choice point (a point in the computation at
663: which one out of more applicable clauses was selected) and the
664: subsequent simple clause is selected.  If the selected clause was the
665: last one, the computation backtracks to the previous choice point. If
666: no choice point is left, a failure arises.  Backtracking implies that
667: the state is restored, so all the state updates performed since
668: the creation of the last choice point are undone.
669: 
670: Let us illustrate now this definition of Prolog's computation
671: process by considering the most known Prolog program the purpose of
672: which is to append two lists.  In Prolog the empty list is denoted by
673: \texttt{[]} and the list with head \texttt{h} and tail \texttt{t} by
674: \texttt{[h | t]}. The term \texttt{[a | [b | s]]} abbreviates to a
675: more readable form \texttt{[a,b | s]}, the list \texttt{[a | [b |
676:   []]]} abbreviates to \texttt{[a,b]} and similarly with longer
677: lists. This notation can be used both for lists and for arbitrary
678: terms that start with the list formation operator \texttt{[.|..]}.
679: 
680: Then the following logic program defines by induction w.r.t. the first
681: argument how to append two lists. Here and elsewhere we follow
682: Prolog's syntactic conventions and denote variables by strings
683: starting with an upper case letter.  The names ending with ``{\tt s}''
684: are used for the variables meant to be instantiated to lists.
685: 
686: \begin{verbatim}
687: % append(Xs, Ys, Zs) :- Zs is the result of concatenating 
688: %                       the lists Xs and Ys. 
689:   append(Xs, Ys, Zs) :- Xs = [], Zs = Ys.
690:   append(Xs, Ys, Zs) :- Xs = [H | Ts], Zs = [H | Us], append(Ts, Ys, Us).
691: \end{verbatim}
692: 
693: In Prolog the answers are generated as substitutions written in an equational
694: form (as in the Martelli--Montanari algorithm presented above).
695: In what follows we display a query $Q.$ as \texttt{?-} $Q$. . Here
696: ``?-'' is the system prompt and the fullstop ``.'' 
697: signals the end of the query.
698: 
699: One can check then that the query 
700: \medskip
701: 
702: \noindent
703: \texttt{?- append([jan,feb,mar], [april,may], Zs).}
704: \medskip
705: 
706: \noindent
707: yields \texttt{Zs = [jan,feb,mar,april,may]} as the answer and that
708: the query 
709: \medskip
710: 
711: \noindent
712: \texttt{?- append([jan,feb,mar], [april,may], [jan,feb,mar,april,may]).}
713: \medskip
714: 
715: \noindent
716: succeeds and yields the empty substitution as the answer.
717: In contrast, the query 
718: \medskip
719: 
720: \noindent
721: \texttt{?- append([jan,feb,mar], [april,may], [jan,feb,mar,april]).} 
722: \medskip
723: 
724: \noindent
725: fails. Indeed, the computation leads to the subsequent procedure calls
726: \medskip
727: 
728: \texttt{append([feb,mar], [april,may], [feb,mar,april])},
729: 
730: \texttt{append([mar], [april,may], [mar,april])} and
731: 
732: \texttt{append([], [april,may], [april])}, 
733: \medskip
734: 
735: \noindent
736: and the last one fails
737: since the terms \texttt{[april,may]} and \texttt{[april]} don't
738: unify. 
739: 
740: \subsection{Clauses}
741: \label{subsec:clauses}
742: 
743: The last step in defining logic programs consists of allowing
744: arbitrary atoms as heads of the clauses. 
745: Formally, given atoms $H, A_1, \LL, A_k$, we call
746: an expression of the form
747: \[
748: H :- \: A_1, \LL, A_k.
749: \]
750: a \emph{clause}. \index{clause} If $k=0$, that is if the clause's body
751: is empty, such a clause is called a \emph{fact} \index{fact} and the
752: ``{\tt :-}'' symbol is then omitted.  If $k>0$, that is, if the
753: clause's body is non-empty, such a clause is called a \emph{rule}.
754: \index{rule} A \emph{logic program} \index{logic program} is then a
755: finite set of clauses and a \emph{pure Prolog program} is a finite
756: sequence of clauses.
757: 
758: Given a pure Prolog program, we call the set of its clauses with
759: the relation $p$ in the head the \emph{definition of $p$}.
760: Definitions correspond to the procedure declarations in imperative
761: programming and to the function definitions in functional
762: programming.
763: Variables that occur in the body of a clause but not in its head are
764: called \emph{local}. They correspond closely to the variables that are
765: local to the procedure bodies in the imperative languages with the
766: difference that in logic programs their declaration is implicit.
767: Logic programming, like Pascal, does not have a block statement.
768: 
769: To explain how the computation process takes place for pure Prolog
770: programs we simply view a clause of the form
771: \[
772: p(s_1, \LL, s_n) :- \: A_1, \LL, A_k.
773: \]
774: as a shorthand for the simple clause
775: \[
776: p(x_1, \LL, x_n) :- \: (x_1, \LL, x_n) = (s_1, \LL, s_n), A_1, \LL, A_k.
777: \]
778: where $x_1, \LL, x_n$ are fresh variables.  We use here Prolog's
779: syntactic facility according to which given a sequence $s_1,\dots,s_n$
780: of terms $(s_1,\dots,s_n )$ is also a term.
781: 
782: So given a procedure call $p(t_1,\dots,t_n )$ if the above clause 
783: $p(s_1, \LL, s_n) :- \: A_1, \LL, A_k$ is
784: selected, an attempt is made to unify $(t_1,\dots,t_n )$ with $(s_1,
785: \LL, s_n)$.  (As before we assume here that no variable clashes arise.
786: Otherwise the variables of the clause should be appropriately renamed.)
787: If the unification succeeds and produces a substitution $\theta$,
788: the state (represented by a substitution) is updated by applying to it
789: $\theta$ and the computation continues with the statement $A_1, \LL,
790: A_k$ in this new state. Otherwise a failure arises and the next clause
791: is selected.
792: 
793: So due to the use of clauses instead of simple clauses, unification is
794: effectively lifted to a parameter mechanism.  As a side effect this
795: makes the explicit use of unification, modelled by means of ``='',
796: superfluous.  As an example reconsider the above program appending two
797: lists.  Using the clauses it can be written in a much more
798: succinct way, as the following program \texttt{APPEND}:
799: 
800: \begin{verbatim}
801: % append(Xs, Ys, Zs) :- Zs is the result of concatenating 
802: %                       the lists Xs and Ys. 
803:   append([], Ys, Ys).
804:   append([X | Xs], Ys, [X | Zs]) :- append(Xs, Ys, Zs).
805: \end{verbatim}
806: Here the implicit case analysis present in the previous program is in
807: effect moved into the heads of the clauses. 
808: The use of terms in the heads of the clauses is completely
809: analogous to the use of \emph{patterns} in function definitions in functional
810: programming.
811: 
812: To summarize, the characteristic elements of procedure declarations in
813: logic programming, in contrast to imperative programming, are: the use
814: of multiple rules and use of patterns to select among these rules.
815: 
816: \section{Prolog's Approach to Programming}
817: 
818: The power and originality of the Prolog programming style lies in the combination of
819: automatic backtracking with the use of relations and 
820: logical variables.
821: 
822: \subsection{Multiple Uses of a Single Program}
823: \label{subsec:multiple}
824: 
825: As a first illustration of the novelty of Prolog's approach to programming
826: we illustrate the possibility of using the same program for different purposes.
827: The perhaps simplest example involves the following program {\tt MEMBER}.
828: We use in it a useful feature of Prolog, so-called {\em anonymous
829:   variable\/}, \index{variable!anonymous} written as an
830: ``underscore'' character ``\_'' .  Each occurrence of ``\_'' in a
831: query or in a clause is interpreted as a {\em different\/} variable.
832: Anonymous variables are analogous to the \emph{wildcard pattern}
833: feature of the Haskell language.  
834: 
835: \begin{verbatim}
836: % member(X, Xs):-  X is a member of the list Xs.
837:   member(X, [X | _]).
838:   member(X, [_ | Xs]):-  member(X, Xs).
839: \end{verbatim}
840: 
841: {\tt MEMBER} can be used both for testing and for computing:
842: 
843: \begin{verbatim}
844: ?- member(wed, [mon, wed, fri]).
845: 
846: yes
847: 
848: ?-  member(X, [mon, wed, fri]).
849: 
850: Xs = mon ;
851: 
852: Xs = wed ;
853: 
854: Xs = fri ;
855: 
856: no
857: \end{verbatim}
858: Here ``;'' is the user's request to produce the next answer. If this
859: request fails, the answer ``{\tt no}'' is printed.
860: 
861: Consequently, given a variable {\tt X} and two lists {\tt s} and {\tt t},
862: the query {\tt member(X, s), member(X, t).} generates all elements
863: that are present both in {\tt s} and {\tt t}.
864: Operationally, the first call generates all members of {\tt s}
865: and the second call tests for each of them the membership in {\tt t}.
866: 
867: Also the \texttt{APPEND} program can be used for a number of purposes,
868: in particular to concatenate two lists and to split a list in all
869: possible ways. For example we have
870: 
871: \begin{verbatim}
872: ?- append(Xs, Ys, [mon, wed, fri]).
873: 
874: Xs = []
875: Ys = [mon, wed, fri] ;
876: 
877: Xs = [mon]
878: Ys = [wed, fri] ;
879: 
880: Xs = [mon, wed]
881: Ys = [fri] ;
882: 
883: Xs = [mon, wed, fri]
884: Ys = [] ;
885: 
886: no
887: \end{verbatim}
888: 
889: This cannot be achieved with any functional programming version of the
890: {\tt APPEND} procedure.  The difference comes from the fact that in
891: logic programming procedures are defined by means of relations
892: whereas in functional programming functions are used. In fact, there
893: is no distinction between input and output arguments in the procedures
894: in logic programs.
895: 
896: To see two uses of {\tt append} in a single program consider a program
897: that checks whether one list is a consecutive sublist of another one.
898: The one line program \texttt{SUBLIST} that follows formalizes the following
899: definition of a sublist:
900: \begin{itemize}
901: \item the list {\tt Xs} is a sublist of the list {\tt Ys} if {\tt Xs}
902: is a prefix of a suffix of {\tt Ys}.
903: \end{itemize}
904: 
905: \begin{verbatim}
906: % sublist(Xs, Ys) :- Xs is a sublist of the list Ys.
907:   sublist(Xs, Ys) :- append(_, Zs, Ys), append(Xs, _, Zs).
908: \end{verbatim}
909: 
910: Here both anonymous variables and {\tt Zs} are local.
911: In this rule {\tt Zs} is a suffix of {\tt Ys} and {\tt Xs} is a prefix
912: of {\tt Zs}.  This relation is illustrated in Figure
913: \ref{fig:sublist}. 
914: 
915: \begin{figure}[htbp]
916: \medskip
917: 
918: \catcode`@=11
919: \expandafter\ifx\csname graph\endcsname\relax \alloc@4\box\chardef\insc@unt\graph\fi
920: \expandafter\ifx\csname graphtemp\endcsname\relax \alloc@1\dimen\dimendef\insc@unt\graphtemp\fi
921: \catcode`@=12
922: \setbox\graph=\vtop{%
923:   \vbox to0pt{\hbox{%
924:     \special{pn 8}%
925:     \special{pn 13}%
926:     \special{pa 1500 400}%
927:     \special{pa 0 400}%
928:     \special{fp}%
929:     \special{pa 0 450}%
930:     \special{pa 0 350}%
931:     \special{fp}%
932:     \special{pa 1500 450}%
933:     \special{pa 1500 350}%
934:     \special{fp}%
935:     \graphtemp=.6ex \advance\graphtemp by 0.48in
936:     \rlap{\kern 0.75in\lower\graphtemp\hbox to 0pt{\hss \tt Ys\hss}}%
937:     \special{pa 1500 225}%
938:     \special{pa 533 225}%
939:     \special{fp}%
940:     \special{pa 533 275}%
941:     \special{pa 533 175}%
942:     \special{fp}%
943:     \special{pa 1500 275}%
944:     \special{pa 1500 175}%
945:     \special{fp}%
946:     \graphtemp=.6ex \advance\graphtemp by 0.305in
947:     \rlap{\kern 1.016in\lower\graphtemp\hbox to 0pt{\hss \tt Zs\hss}}%
948:     \special{pa 533 50}%
949:     \special{pa 1178 50}%
950:     \special{fp}%
951:     \special{pa 533 100}%
952:     \special{pa 533 0}%
953:     \special{fp}%
954:     \special{pa 1178 100}%
955:     \special{pa 1178 0}%
956:     \special{fp}%
957:     \graphtemp=.6ex \advance\graphtemp by 0.13in
958:     \rlap{\kern 0.855in\lower\graphtemp\hbox to 0pt{\hss \tt Xs\hss}}%
959:     \kern 1.5in
960:   }\vss}%
961:   \kern 0.616in
962: }
963: \centerline{\box\graph}
964: \caption{Xs is a sublist of the list Ys \label{fig:sublist} }
965: \end{figure}
966: 
967: Operationally, given two lists, \texttt{as} and
968: \texttt{bs}, the query \texttt{sublist(as, bs).} leads to a generation
969: of splits of the list \texttt{bs} through the call 
970: \texttt{append(\_,  Zs, bs)}.  Then for each generated suffix \texttt{Zs} 
971: of \texttt{bs} it is checked whether for some list, denoted by the anonymous 
972: variable \_ , the call \texttt{append(as, \_, Zs)} succeeds.
973: This happens when \texttt{as} is a prefix of \texttt{Zs}.
974: So a typical use of this program involves backtracking.
975: 
976: \subsection{Logical Variables}
977: \label{subsec:logical}
978: 
979: Let us return now to the logical variables. They are an important
980: feature of logic programming but it is easy to overlook their use. For
981: example, they already appear in the computations involving the first
982: version of the list concatenation program, and consequently, because
983: of the way we defined the computation process, in the computations of
984: the {\tt APPEND} program. Indeed, given the query {\tt
985:   append([jan,feb,mar], [april,may], Zs).} the rule 
986: \begin{verbatim}
987:   append(Xs, Ys, Zs) :- Xs = [H | Ts], Zs = [H | Us], append(Ts, Ys, Us).  
988: \end{verbatim}
989: leads to the binding of the variable {\tt Zs} to the term {\tt [jan |
990:   Us]}.  The value of the variable {\tt Us} is computed later, by
991: means of the call {\tt append([feb,mar], [april,may], Us)}. This call
992: first binds {\tt Us} to the term {\tt [feb | U1s]}, where {\tt U1s} is
993: a fresh variable, and hence {\tt Zs} to the term {\tt [jan, feb |
994:   U1s]}.  This progressive building of the output using the logical
995: variables is typical for Prolog.
996: 
997: The real power of logical variables should become apparent after
998: considering the following three original Prolog programs.
999: 
1000: \subsubsection*{A Type Assignment}
1001: 
1002: Consider the typed lambda calculus and Curry's system of type
1003: assignment. It involves statements of the form $s: \tau$ which should be
1004: read as ``term $s$ has type $\tau$''.  Finite sequences of such
1005: statements with $s$ being a variable are called {\em environments\/}
1006: are denoted below by $E$.  A statement of the form $E \vdash s: \tau$
1007: should be read as ``in the environment $E$ the term $s$ has type $\tau$''.
1008: The following three rules define by induction on the structure of
1009: lambda terms how to assign types to lambda terms: 
1010: \medskip
1011: 
1012: \[ \frac{ x:t \in E                }
1013:         { E \vdash  x:t            }
1014: \]
1015: 
1016: \[ \frac{ E \vdash m : s\ra t \ , \ \  E \vdash n : s}
1017:         { E \vdash  (m \ n) : t}\]
1018: 
1019: \[ \frac{ E \; , \; x:s \vdash  m:t        }
1020:         { E \vdash  (\lambda x. m) : s\ra t}\]
1021: 
1022: \VV
1023: 
1024: To encode the lambda terms as usual ``first-order'' terms we use
1025: the unary function symbol {\tt var} and two binary function symbols,
1026: {\tt lambda} and {\tt apply}.  The lambda term $x$ (a variable) is
1027: translated to the term {\tt var(x)}, the lambda term $(m \ n)$ to the
1028: term {\tt apply(m, n)}, and the lambda term $\lambda x. m$ to the term
1029: {\tt lambda(x, m)}.  For example, the lambda term $\lambda x. \: (x \ 
1030: x)$ translates to {\tt lambda(x, apply(var(x), var(x)))}.  The subtle
1031: point is that according to Prolog convention, lower case letters stand
1032: for constants, so for example {\tt var(x)} is a ground term (i.e. a
1033: term without variables).
1034: 
1035: The above rules directly translate into the following Prolog program
1036: that refers to the previously defined {\tt member} relation.
1037: 
1038: \begin{verbatim}
1039: :- op(1100, yfx, arrow).
1040: 
1041: % type(E, S, T):-  lambda term S has type T in the environment E.
1042:   type(E, var(X), T):-  member([X, T], E).
1043:   type(E, apply(M, N), T):-  type(E, M, S arrow T), type(E, N, S).
1044:   type(E, lambda(X, M), (S arrow T)):-  type([[X, S] | E], M, T).
1045: \end{verbatim}
1046: 
1047: For readability we use here \texttt{arrow} as a binary function symbol
1048: written in infix notation.  The first line declares this use of
1049: \texttt{arrow} together with a certain associativity and priority
1050: information (The details of this archaic, though useful, Prolog
1051: notation are not relevant here.)
1052: 
1053: As expected, the above program can be used to check whether a given
1054: (representation of a) lambda term has a given type.  Less expected is
1055: that this program can also be used to compute a type assignment to a
1056: lambda term, if such an assignment exists, and to report a failure if
1057: no such assignment exists. To this end, given a lambda term $s$, it
1058: suffices to use the query {\tt type([], t, T).}, where the empty list
1059: {\tt []} denotes the empty environment and where {\tt t} is the
1060: translation of $s$ to a first-order term.  For instance, the query
1061: \medskip
1062: 
1063: \texttt{?- type([], lambda(x, apply(var(x), var(x))), T).}
1064: \medskip
1065: 
1066: \noindent
1067: fails. In fact, no type can be assigned to the lambda term $\lambda x. \: (x \ x)$. 
1068: The computation first leads to the call
1069: \medskip
1070: 
1071: \texttt{type([[x, S]], apply(var(x), var(x)), T)} 
1072: \medskip
1073: 
1074: \noindent
1075: and then to the call
1076: \medskip
1077: 
1078: \texttt{type([[x, S]], var(x), S arrow T)}.
1079: 
1080: \medskip
1081: 
1082: \noindent
1083: This in turn leads to the call
1084: \medskip
1085: 
1086: \texttt{member([x, S arrow T], [[x, S]])}
1087: \medskip
1088: 
1089: \noindent
1090: which fails since the terms \texttt{S arrow T} and \texttt{S} do not unify.
1091: In the above computation \texttt{T} is used as a logical variable.
1092: 
1093: The problem of computing a type assignment for lambda terms was posed
1094: and solved by Curry (see Curry and Feys \cite{CF58}). It is an
1095: important topic in the theory of lambda calculus that is of relevance
1096: for type inference in functional programming.  The solution in Prolog
1097: given above is completely elementary.  A typical use of this program
1098: does not involve backtracking. In fact, its power relies on
1099: unification.
1100: 
1101: \subsubsection*{A Sequence Program}
1102: 
1103: Next, consider the following problem:
1104: arrange three 1s, three 2s, ..., three 9s
1105: in sequence so that for all $i \in [1,9]$ there
1106: are exactly $i$ numbers between successive occurrences
1107: of $i$.
1108: An example of such a sequence is
1109: \begin{center}
1110: 1, 9, 1, 2, 1, 8, 2, 4, 6, 2, 7, 9, 4, 5, 8, 6, 3, 4, 7, 5, 3, 9, 6, 8, 3, 5, 7.
1111: \end{center}
1112: 
1113: The desired program is an almost verbatim
1114: formalization of the problem in Prolog.
1115: \begin{verbatim}
1116: % sequence(Xs) :- Xs is a list of 27 variables.
1117:   sequence([_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_]).
1118: 
1119: % question(Ss) :- Ss is a solution to the problem.
1120:   question(Ss) :-
1121:           sequence(Ss),
1122:           sublist([9,_,_,_,_,_,_,_,_,_,9,_,_,_,_,_,_,_,_,_,9], Ss),
1123:           sublist([8,_,_,_,_,_,_,_,_,8,_,_,_,_,_,_,_,_,8], Ss),
1124:           sublist([7,_,_,_,_,_,_,_,7,_,_,_,_,_,_,_,7], Ss),
1125:           sublist([6,_,_,_,_,_,_,6,_,_,_,_,_,_,6], Ss),
1126:           sublist([5,_,_,_,_,_,5,_,_,_,_,_,5], Ss),
1127:           sublist([4,_,_,_,_,4,_,_,_,_,4], Ss),
1128:           sublist([3,_,_,_,3,_,_,_,3], Ss),
1129:           sublist([2,_,_,2,_,_,2], Ss),
1130:           sublist([1,_,1,_,1], Ss).
1131: \end{verbatim}
1132: Note how the anonymous variables dramatically improve the readability of the program.
1133: 
1134: Operationally, the query \texttt{?- question(Ss).} leads to the
1135: procedure call \texttt{sequence(Ss)} that instantiates the variable
1136: \texttt{Ss} to the list of 27 anonymous (so different) variables.
1137: Then each of the nine calls of the \texttt{sublist} procedure
1138: enforces an existence of a specific sublist pattern on \texttt{Ss}.
1139: Each pattern involves syntactically anonymous variables, each of them
1140: representing operationally a logical variable.
1141: 
1142: In spite of the fact that the program is simple and transparent 
1143: the resulting computation is involved because of a extensive
1144: use of backtracking. The query generates all six solutions
1145: to the problem.
1146: 
1147: \subsubsection*{Difference Lists}
1148: 
1149: One of the drawbacks of the concatenation of lists performed by the
1150: {\tt APPEND} program is that for lists {\tt s} and {\tt t} the execution of the
1151: query {\tt append(s, t, Z)} takes the number of steps that is
1152: proportional to the length of the first list, {\tt s}. This is
1153: obviously inefficient. In an imperative setting if one represents a
1154: list as a link list, to concatenate two lists it suffices to adjust
1155: one pointer.
1156: 
1157: {\em Difference list\/} is a generalization of the concept of a
1158: list that allows us to perform  concatenation in constant time.
1159: The fact that many programs  rely explicitly on list
1160: concatenation explains the importance of this concept.
1161: 
1162: In what follows we use the subtraction operator
1163: ``{\tt -}'' written in the infix form. Its use has nothing to do 
1164: with arithmetic, though intuitively one should
1165: read it as the ``difference''. 
1166: Formally, a {\em difference list\/} is a construct of the form
1167: ${\tt [a_1, ..., a_m | x] - x}$, where {\tt x} is a variable and 
1168: where we used the notation introduced in Subsection \ref{subsec:computing}.
1169: It {\em represents\/} the list ${\tt [a_1, ..., a_m]}$ in a form amenable to a different
1170: definition of concatenation.
1171: Namely, consider two difference lists 
1172: ${\tt [a_1, ..., a_m | x] - x}$ and ${\tt [b_1, ..., b_n | y] - y}$.
1173: Then their concatenation is the difference list 
1174: ${\tt [a_1, ..., a_m, b_1, ..., b_n | y] - y}$.
1175: 
1176: This concatenation process is achieved by the following one line {\tt APPEND\_DL} program:
1177: 
1178: \begin{verbatim}
1179: % append(Xs, Ys, Zs) :- the difference list Zs is the result of 
1180: %                       concatenating the difference lists Xs and Ys.
1181:   append_dl(X-Y, Y-Z, X-Z). 
1182: \end{verbatim}
1183: 
1184: For example, we have:
1185: \begin{verbatim}
1186: ?- append_dl([a,b|X]-X, [c,d|Y]-Y, U). 
1187: 
1188: U = [a,b,c,d|Y]-Y,
1189: X = [c,d|Y]
1190: \end{verbatim}
1191: which shows that {\tt U} became instantiated to the difference list
1192: representing the list {\tt [a,b,c,d]}.
1193: We shall illustrate the use of difference lists in Subsection \ref{subsec:acr}.
1194: 
1195: \section{Arithmetics in Prolog}
1196: \label{sec:arithmetic}
1197: 
1198: The Prolog programs presented so far are \emph{declarative} since
1199: they admit a dual reading as a formula.
1200: The treatment of arithmetic in Prolog compromises to some
1201: extent its declarative underpinnings. However, it is difficult
1202: to come up with a better solution than the one offered by
1203: the original designers of the language.
1204: The shortcomings of Prolog's treatment of arithmetic are overcome
1205: in the constraint logic programming languages.
1206: 
1207: \subsection{Arithmetic Operators}
1208: \label{subsec:arithmetic-operators}
1209: 
1210: Prolog provides integers and floating point numbers as built-in data
1211: structures, with the typical operations on them. These operations
1212: include the usual arithmetic operators such as {\tt +}, {\tt -},
1213: {\tt *} (multiplication), and {\tt //} (integer division).
1214: 
1215: Now, according to the usual notational convention of logic programming and
1216: Prolog, the relation and function symbols are written in the
1217: {\em prefix form\/}, \index{prefix form} that is in front of the arguments. In
1218: contrast, in accordance with their usage in arithmetic, the binary
1219: arithmetic operators are written in {\em infix form\/},
1220: \index{infix form} that is between the arguments.
1221: Moreover,
1222: negation of a natural number can be written in the {\em bracket-less
1223:   prefix form}, \index{bracket-less prefix form} that is without brackets
1224: surrounding its argument.
1225: 
1226: This discrepancy in the syntax is resolved by considering the
1227: arithmetic operators as built-in function symbols written in the infix
1228: or bracket-less prefix form with information about their associativity
1229: and binding power that allows us to disambiguate the arithmetic
1230: expressions.
1231: 
1232: Actually, Prolog provides a means to declare an {\em arbitrary\/}
1233: function symbol as an infix binary symbol or as a bracket-less prefix
1234: unary symbol, with a fixed {\em priority\/} \index{priority} that
1235: determines its binding power and a certain {\em mnemonics\/} that
1236: implies some (or no) form of associativity.  An example of such a
1237: declaration was the line \texttt{:- op(1100, yfx, arrow).} used in the
1238: above type assignment program. 
1239: Function symbols declared in this way are called
1240: {\em operators}. \index{operator}
1241: Arithmetic operators can be thus viewed as operators
1242: predeclared in the language ``prelude''. 
1243: 
1244: In addition to the arithmetic operators we also have at our disposal 
1245: infinitely many integer constants
1246: and infinitely many floating point numbers. 
1247: In what follows by a {\em number} we mean either an integer
1248: constant or a floating point number.
1249: The arithmetic operators and the set of all numbers uniquely
1250: determine a set of terms.  We call terms defined in this language {\em
1251:   arithmetic expressions\/} \index{expression!arithmetic} and
1252: introduce the abbreviation {\em gae\/} \index{gae (ground arithmetic
1253:   expression)} for ground (i.e., variable free) arithmetic
1254: expressions.
1255: 
1256: \subsection{Arithmetic Comparison Relations}
1257: \label{subsec:acr}
1258: 
1259: With each gae we can uniquely associate its {\em value},
1260: \index{gae!value of} computed in the expected way.  Prolog allows us
1261: to compare the values of gaes by means of the customary six {\em
1262:   arithmetic comparison relations\/} \index{arithmetic!comparison
1263:   relation}
1264: \medskip
1265: 
1266: \verb+ <+, 
1267: \verb+ =<+,
1268: \verb+ =:=+ (``equal''),
1269: \verb+ =\=+, (``different''),
1270: \verb+ >=+,
1271: and \verb+ >+.
1272: 
1273: \medskip
1274: 
1275: \noindent
1276: The ``equal'' relation ``\verb+ =:=+'' should not be confused with the ``is
1277: unifiable with'' relation ``='' discussed in Section
1278: \ref{sec:unification}.
1279: 
1280: The arithmetic comparison relations work on gaes and produce the expected outcome.
1281: For instance, \verb+ >+ compares the values of two gaes and succeeds if
1282: the value of the first argument is larger than the value of the second and
1283: fails otherwise.
1284: Thus, for example
1285: 
1286: \begin{verbatim}
1287: ?- 6*2 =:= 3*4. 
1288: 
1289: yes
1290: 
1291: ?- 7 > 3+4.
1292: 
1293: no
1294: \end{verbatim}
1295: 
1296: However, when one of the arguments of the arithmetic comparison relations
1297: is  not a gae, the computation
1298: {\em ends in an error\/}. 
1299: For example, we have
1300: 
1301: \begin{verbatim}
1302: ?- [] < 5.
1303: 
1304: error in arithmetic expression: [] is not a number.
1305: \end{verbatim}
1306: 
1307: As a simple example of the use of the arithmetic comparison relations,
1308: consider the following program which checks whether a list of numbers
1309: is ordered.
1310: 
1311: \begin{verbatim}
1312: % ordered(Xs) :- Xs is an =<-ordered list of numbers
1313:   ordered([]).
1314:   ordered([_]).
1315:   ordered([X, Y | Xs]) :- X =< Y, ordered([Y | Xs]).
1316: \end{verbatim}
1317: 
1318: Recall that \texttt{[X, Y | Xs])} is the abbreviated Prolog notation for
1319: \texttt{[X | [Y | Xs]])}.
1320: We now have
1321: \begin{verbatim}
1322: ?- ordered([1,1,2,3]).
1323: 
1324: yes
1325: \end{verbatim}
1326: but also
1327: \begin{verbatim}
1328: ?- ordered([1,X,1]).
1329: 
1330: instantiation fault in 1 =< X
1331: \end{verbatim}
1332: 
1333: Here a run-time error took place because at a certain stage the
1334: comparison relation ``\verb+ =<+'' was applied to an argument that is not
1335: a number. 
1336: 
1337: As another example consider Prolog's version of the {\em quicksort\/} procedure
1338: of C.A.R. Hoare \index{Hoare, C.A.R.}.
1339: According to this sorting procedure, 
1340: a list is first partitioned into two sublists using an element {\tt X} of it, 
1341: one consisting of the elements smaller than {\tt X} and the other consisting
1342: of the elements larger or equal than {\tt X}.
1343: Then each sublist is quicksorted and the resulting
1344: sorted sublists are appended with the element {\tt X} put in the middle.
1345: This can be expressed in Prolog by means of the following 
1346: {\tt QUICKSORT} program, where {\tt X} is chosen to be the
1347: first element of the given list:
1348: 
1349: \begin{verbatim}
1350: % qs(Xs, Ys) :- Ys is an ordered permutation of the list Xs.
1351:   qs([], []).
1352:   qs([X | Xs], Ys) :-
1353:           part(X, Xs, Littles, Bigs),
1354:           qs(Littles, Ls),
1355:           qs(Bigs, Bs),
1356:           append(Ls, [X | Bs], Ys).
1357: 
1358: % part(X, Xs, Ls, Bs) :- Ls is a list of the elements of Xs which are < X, 
1359: %                        Bs is a list of the elements of Xs which are >= X.
1360:   part(_, [], [], []).
1361:   part(X, [Y | Xs], [Y | Ls], Bs) :- X > Y, part(X, Xs, Ls, Bs).
1362:   part(X, [Y | Xs], Ls, [Y | Bs]) :- X =< Y, part(X, Xs, Ls, Bs).
1363: \end{verbatim}
1364: 
1365: We now have for example
1366: \begin{verbatim}
1367: ?- qs([7,9,8,1,5], Ys).
1368: 
1369: Ys = [1, 5, 7, 8, 9]
1370: \end{verbatim}
1371: and also
1372: \begin{verbatim}
1373: ?- qs([7,9,8,1,5], [1,5,7,9,8]).
1374: 
1375: no
1376: \end{verbatim}
1377: 
1378: The {\tt QUICKSORT} program uses the {\tt append} relation to
1379: concatenate the lists. Consequently, its efficiency can be improved
1380: using the difference lists introduced in Subsection
1381: \ref{subsec:logical}.  Conceptually, the calls of the {\tt
1382:   append} relation are first replaced by the corresponding calls of the {\tt
1383:   append\_dl} relation.  This yields a program defining the {\tt qs\_dl}
1384: relation.  Then unfolding the calls of {\tt append\_dl}
1385: leads to a program that does not use the {\tt APPEND\_DL} program
1386: anymore and performs the list concatenation ``on the fly''.  
1387: This results in the program {\tt QUICKSORT\_DL} in which the definition of the
1388: {\tt qs} relation is replaced by:
1389: 
1390: \begin{verbatim}
1391: % qs(Xs, Ys) :- Ys is an ordered permutation of the list Xs.
1392:   qs(Xs, Ys)  :- qs_dl(Xs, Ys - []).
1393: % qs_dl(Xs, Y) :-  Y is a difference list representing the 
1394: %                  ordered permutation of the list Xs.
1395:   qs_dl([], Xs - Xs). 
1396:   qs_dl([X | Xs], Ys - Zs) :-
1397:      part(X, Xs, Littles, Bigs),
1398:      qs_dl(Littles, Ys - [X | Y1s]),
1399:      qs_dl(Bigs, Y1s - Zs). 
1400: \end{verbatim}
1401: The first rule links the {\tt qs} relation with the {\tt qs\_dl} relation.
1402: 
1403: \subsection{Evaluation of Arithmetic Expressions}
1404: \label{subsec:eval}
1405: 
1406: So far we have presented programs that use ground arithmetic
1407: expressions, but have not yet introduced any means of evaluating them.
1408: For example, no facilities have been introduced so far to evaluate
1409: {\tt 3+4}. All we can do at this stage is to check that the outcome is
1410: {\tt 7} by using the comparison relation {\tt =:=} and the query {\tt
1411:   7 =:= 3+4}.  But using the comparison relations it is not possible
1412: to {\em assign\/} the value of {\tt 3+4}, that is {\tt 7}, to a
1413: variable, say {\tt X}. Note that the query {\tt X =:= 3+4.} ends in an
1414: error, while the query {\tt X = 3+4.} instantiates {\tt X} to the term
1415: {\tt 3+4}.
1416:  
1417: To overcome this problem
1418: the binary {\em arithmetic evaluator \/} \index{arithmetic!evaluator}
1419: {\tt is} is used in Prolog.
1420: {\tt is} is an infix operator defined as follows.
1421: Consider the call {\tt s is t}. 
1422: Then {\tt t} has to be a ground arithmetic expression (gae).
1423: The call of {\tt s is t} results in the unification of
1424: the {\em value\/} of the gae {\tt t} with {\tt s}.
1425: If {\tt t} is not a gae then a run-time error arises.
1426: So for example we have
1427: \begin{verbatim}
1428: ?- 7 is 3+4.   
1429: 
1430: yes
1431: 
1432: 8 is 3+4.
1433: 
1434: no 
1435: 
1436: ?- X is 3+4.
1437: 
1438: X = 7
1439: 
1440: ?- X is Y+1.
1441: 
1442: ! Error in arithmetic expression: not a number
1443: \end{verbatim}
1444: 
1445: As an example of the use of an arithmetic evaluator
1446: consider the proverbial factorial function. It can be computed using the 
1447: following program {\tt FACTORIAL}:
1448: 
1449: \begin{verbatim}
1450: % factorial(N, F) :- F is N!.
1451:   factorial(0, 1).
1452:   factorial(N, F) :- N > 0, N1 is N-1, factorial(N1, F1), F is N*F1.
1453: \end{verbatim}
1454: 
1455: Note the use of a local variable {\tt N1} in the atom
1456: {\tt N1 is N-1} to compute the decrement of {\tt N} and the use of a
1457: local variable {\tt F1} to compute the value of {\tt N1} factorial.
1458: The atom {\tt N1 is N-1} corresponds to the assignment command {\tt N := N-1}
1459: of imperative programming. 
1460: The difference is that a new variable needs
1461: to be used to compute the value of \texttt{N-1}.  Such uses of local
1462: variables are typical when computing with integers in Prolog.
1463: 
1464: As another example consider a Prolog program
1465: that computes the length of a list.
1466: 
1467: \begin{verbatim}
1468: % length(Xs, N) :- N is the length of the list Xs.
1469:   length([], 0).
1470:   length([_ | Ts], N) :- length(Ts, M), N is M+1.
1471: \end{verbatim}
1472: We then have
1473: \begin{verbatim}
1474: ?- length([a,b,c], N).
1475: 
1476: N = 3
1477: \end{verbatim}
1478: 
1479: An intuitive but incorrect version would use as the second
1480: clause
1481: \begin{verbatim}
1482:   length([_ | Ts], N+1) :- length(Ts, N).
1483: \end{verbatim}
1484: With such definition we would get the following
1485: nonintuitive outcome:
1486: \begin{verbatim}
1487: ?- length([a,b,c], N).
1488: 
1489: N = 0 + 1 + 1 + 1
1490: \end{verbatim}
1491: The point is that the generated ground arithmetic expressions
1492: are not automatically evaluated in Prolog.
1493: 
1494: We conclude that arithmetic facilities in Prolog are quite subtle
1495: and require good insights to be properly used.
1496: 
1497: \section{Control, Ambivalent Syntax and Meta-Variables}
1498: \label{sec:control}
1499: 
1500: In the framework discussed so far no control constructs are present.
1501: Let us see now how they could be simulated by means of the 
1502: features explained so far. Consider the customary 
1503: \textbf{if} \texttt{B} \textbf{then} \texttt{S} \textbf{else} \texttt{T} \textbf{fi} 
1504: construct. It can be modelled by means of the following two clauses:
1505: \medskip
1506: 
1507: {\texttt{p}(\textbf{x}) \texttt{:- B, S}.}
1508: 
1509: \texttt{p}(\textbf{x}) \texttt{:- not B, T.}
1510: 
1511: \medskip
1512: 
1513: \noindent
1514: where \texttt{p} is a new procedure identifier and all the variables
1515: of \texttt{B, S} and \texttt{T} are collected in \textbf{x}.  To see
1516: how inefficiency creeps into this style of programming consider two
1517: cases.
1518: 
1519: First, suppose that the first clause is selected and that \texttt{B} is
1520: true (i.e., succeeds). Then the computation continues with \texttt{S}.
1521: But in general \texttt{B} is an arbitrary query and because of the
1522: implicit nondeterminism present \texttt{B} can succeed in many ways.
1523: If the computation of \texttt{S} fails, these alternative ways of
1524: computing \texttt{B} will be automatically tried even though we know already
1525: that \texttt{B} is true.
1526: 
1527: Second, suppose that the first clause is selected and that \texttt{B}
1528: is false (that is fails). Then backtracking takes place and the
1529: second clause is tried.  The computation proceeds by evaluating
1530: \texttt{not B}. This is completely unneeded since we know at this
1531: stage that \texttt{not B} is true (that is succeeds).  
1532: 
1533: Note that omitting \texttt{not B} in the second rule would cause a
1534: problem in case a success of \texttt{B} were followed by a failure of
1535: \texttt{S}. Then upon backtracking \texttt{T} would be executed.
1536: 
1537: \subsection{Cut}
1538: 
1539: To deal with such problems Prolog provides a low level built-in nullary
1540: relation symbol called \emph{cut} and denoted by ``!''.
1541: To explain its meaning we rewrite first the above clauses using cut:
1542: \medskip
1543: 
1544: {\texttt{p}(\textbf{x}) \texttt{:- B, !, S}.}
1545: 
1546: {\texttt{p}(\textbf{x}) \texttt{:- T}.}
1547: \medskip
1548: 
1549: \noindent
1550: In the resulting analysis two possibilities arise, akin to the above
1551: case distinction.  First, if \texttt{B} is true (i.e., succeeds), then
1552: the cut is encountered. Its execution
1553: \begin{itemize}
1554: \item discards all alternative ways of computing \texttt{B},
1555: 
1556: \item discards the second clause, {\texttt{p}(\textbf{x}) \texttt{:- T}.}, 
1557: as a backtrackable alternative to the current selection of the first clause.
1558: \end{itemize}
1559: Both items have an effect that in the current computation some clauses are not
1560: anymore available.
1561: 
1562: Second, if \texttt{B} is false (i.e., fails), then backtracking
1563: takes place and the second clause is tried.  The computation proceeds now by
1564: directly evaluating \texttt{T}.
1565: So using the cut and the above rewriting we achieved the intended
1566: effect and modelled the \textbf{if} \texttt{B} \textbf{then}
1567: \texttt{S} \textbf{else} \texttt{T} \textbf{fi} construct in the
1568: desired way.
1569: 
1570: The above explanation of the effect of cut is a good starting point
1571: to provide its definition in full generality.
1572: Consider the following definition of a relation {\tt p}:
1573: \begin{center}\begin{tabular}{l}
1574: {\tt p($\Seq{s}_1$) :- ${\bf A}_1$}. \\
1575: \dots \\
1576: {\tt p($\Seq{s}_i$)   :- {\bf B},!,{\bf C}}. \\
1577: \dots \\
1578: {\tt p($\Seq{s}_k$) :- ${\bf A}_k$}. \\
1579: \end{tabular}\end{center}
1580: Here, the $i$-th clause contains a cut atom. Now, suppose that during
1581: the execution of a query a call  {\tt p($\Seq{t}$)} is encountered and
1582: eventually the $i$-th clause is used and the indicated occurrence of
1583: the cut is executed.  Then the indicated occurrence
1584: of !  succeeds immediately, but additionally
1585: \begin{enumerate}
1586: \item
1587: all alternative ways of computing $\Seq{B}$ are discarded, and
1588: \item all computations of {\tt p($\Seq{t}$)} using the $i+1$-th to
1589:   $k$-th clause for {\tt p} are discarded as backtrackable
1590:   alternatives to the current selection of the $i$-clause.
1591: \end{enumerate}
1592: 
1593: The cut was introduced to improve the implicit control present through the
1594: combination of backtracking and the textual ordering of the clauses.
1595: Because of the use of patterns in the clause heads the potential
1596: source of inefficiency can be sometimes hidden somewhat deeper in the
1597: program text.  Reconsider for example the {\tt QUICKSORT} program of
1598: Section \ref{sec:arithmetic} and the query \texttt{?- qs([7,9,8,1,5],
1599:   Ys)}.  To see that the resulting computation is inefficient note
1600: that the second clause defining the \texttt{part} relation fails when 7
1601: is compared with 9 and subsequently the last, third, clause is tried. At
1602: this moment 7 is again compared with 9. The same redundancy occurs
1603: when 1 is compared with 5.  To avoid such inefficiencies the
1604: definition of \texttt{part} can be rewritten using cut as follows:
1605: 
1606: \begin{verbatim}
1607:   part(_, [], [], []).
1608:   part(X, [Y | Xs], [Y | Ls], Bs) :- X > Y, !, part(X, Xs, Ls, Bs).
1609:   part(X, [Y | Xs], Ls, [Y | Bs]) :- part(X, Xs, Ls, Bs).
1610: \end{verbatim}
1611: Of course, this improvement can be also applied to 
1612: the {\tt QUICKSORT\_DL} program.
1613: 
1614: Cut clearly compromises the declarative reading of the Prolog
1615: programs.  It has been one of the most criticized features of Prolog.
1616: In fact, a proper use of cut requires a good understanding of Prolog's
1617: computation mechanism and a number of thumb rules were developed to
1618: help a Prolog programmer to use it correctly.  A number of
1619: alternatives to cut were proposed. The most interesting of them,
1620: called \emph{commit}, entered various constraint and parallel logic
1621: programming languages but is not present in standard Prolog.
1622: 
1623: \subsection{Ambivalent Syntax and Meta-variables}
1624: \label{subsec:meta-variables}
1625: 
1626: Before we proceed let us review first the basics of Prolog syntax mentioned so far:
1627: 
1628: \begin{itemize}
1629: \item Variables are denoted by strings starting with an upper case
1630:   letter or ``\_'' (underscore).  In particular, Prolog allows
1631:   so-called anonymous variables, written as ``\_'' (underscore).
1632:   
1633: \item Relation symbols (procedure identifiers), function symbols
1634:   and non-numeric constants are denoted by strings starting with a
1635:   lower case letter.
1636: 
1637: \item Binary and unary function symbols can be declared 
1638: as infix or bracket-less prefix operators.
1639: 
1640: \end{itemize}
1641: 
1642: Now, in contrast to first-order logic, in Prolog the {\em same\/} name
1643: can be used both for function symbols and for relation symbols.
1644: Moreover, the same name can be used for function or relation symbols
1645: of different arity.  This facility is called {\em ambivalent
1646:   syntax\/}. \index{ambivalent syntax} A function or a relation symbol
1647: {\tt f} of arity {\tt n} is then referred to as {\tt f/n}.  So in a
1648: Prolog program we can use both a relation symbol {\tt p/2} and
1649: function symbols {\tt p/1} and {\tt p/2} and build syntactically legal
1650: terms or atoms like {\tt p(p(a,b),c,p(X))}.  
1651: 
1652: In presence of the ambivalent syntax, the distinction between function
1653: symbols and relation symbols and between terms and
1654: atoms disappears, but in the context of queries and clauses it is
1655: clear which symbol refers to which syntactic category.  
1656: 
1657: The ambivalent syntax together with Prolog's facility to declare
1658: binary function symbols (and thus also binary relation symbols) as
1659: infix operators allows us to pass queries, clauses and programs as
1660: arguments. In fact, ``{\tt :-}/2'' is declared internally as an infix
1661: operator and so is the comma ``,/2'' between the atoms, so each clause
1662: is actually a term. This facilitates {\em meta-programming}, that is,
1663: writing programs that use other programs as data.
1664: 
1665: In what follows we shall explain how meta-programming can be realized
1666: in Prolog. To this end we need to introduce one more syntactic feature.
1667: Prolog permits the use of variables in the positions of atoms, both in
1668: the queries and in the clause bodies. Such a variable is called then a
1669: \index{meta-variable} {\em meta-variable\/}.  Computation in the
1670: presence of the meta-variables is defined as before since
1671: the mgus employed can also bind the meta-variables.  So, for
1672: example, given the legal, albeit unusual, Prolog program (that uses
1673: the ambivalent syntax facility)
1674: \begin{verbatim}
1675:   p(a).
1676:   a.
1677: \end{verbatim}
1678: the execution of the Prolog query {\tt p(X), X.} first leads to the
1679: query {\tt a.} and then succeeds.  Here {\tt a} is both a constant and
1680: a nullary relation symbol.
1681: 
1682: Prolog requires that the meta-variables are properly instantiated before
1683: they are executed. That is, they need to evaluate to a non-numeric term
1684: at the moment they are encountered in an execution. Otherwise a
1685: run-time error arises. For example, for the above program and the
1686: query {\tt p(X), X, Y.} the Prolog computation ends up in error once
1687: the query {\tt Y.} is encountered.
1688: 
1689: \subsection{Control Facilities}
1690: 
1691: Let us see now how the ambivalent syntax in conjunction with
1692: meta-variables supports meta-programming.  In this section we limit
1693: ourselves to (meta-)programs that show how to introduce new control
1694: facilities.  We discuss here three examples, each introducing a
1695: control facility actually available in Prolog as a built-in.
1696: More meta-programs will be presented in the next section
1697: once we introduce other features of Prolog.
1698: 
1699: \paragraph{Disjunction}
1700: To start with we can define disjunction \index{disjunction} by means
1701: of the following simple program:
1702: \begin{verbatim}
1703:   or(X,Y) :- X. 
1704:   or(X,Y) :- Y.
1705: \end{verbatim}
1706: A typical query is then \texttt{or(Q,R)}, where \texttt{Q} and
1707: \texttt{R} are ``conventional queries''.  Disjunction is a Prolog's
1708: built-in declared as an infix operator ``{\tt ;}/2'' and defined by
1709: means of the above two rules, with ``{\tt or}'' replaced by ``;''. So
1710: instead of \texttt{or(Q,R)} one writes \texttt{Q ; R}.
1711: 
1712: \paragraph{if-then-else}
1713: The other two examples involve the cut operator.
1714: The already discussed \textbf{if} \texttt{B} \textbf{then}
1715: \texttt{S} \textbf{else} \texttt{T} \textbf{fi} construct can be
1716: introduced by means of the by now familiar program
1717: \begin{verbatim}
1718:   if_then_else(B, S, T) :-  B,!,S.
1719:   if_then_else(B, S, T) :-  T.
1720: \end{verbatim}
1721: 
1722: In Prolog {\tt if\_then\_else} is a built-in defined internally by the
1723: above two rules.  {\tt if\_then\_else(B, S, T)} is written as {\tt B
1724:   -> S;T}, where ``{\tt \ra}/2'' is a
1725: built-in declared as an infix operator.
1726: As an example of its use let us rewrite yet again the definition of the {\tt
1727:   part} relation used in the {\tt QUICKSORT} program, this time using Prolog's
1728: {\tt B -> S;T}.  To enforce the correct parsing we need to enclose the
1729: {\tt B -> S;T} statement in brackets:
1730: 
1731: \begin{verbatim}
1732:   part(_, [], [], []).
1733:   part(X, [Y | Xs], Ls, Bs) :- 
1734:      ( X > Y ->
1735:        Ls = [Y | L1s], part(X, Xs, L1s, Bs)
1736:      ;
1737:        Bs = [Y | B1s], part(X, Xs, Ls, B1s)
1738:      ).
1739: \end{verbatim}
1740: 
1741: Note that we had to dispense here with the use of patterns in the
1742: ``output'' positions of {\tt part} and reintroduce the explicit use
1743: of unification in the procedure body.  By introducing yet another {\tt
1744:   B -> S;T} statement to deal with the case analysis in the second
1745: argument we obtain a definition of the {\tt part} relation that very
1746: much resembles a functional program:
1747: \begin{verbatim}
1748:   part(X, X1s, Ls, Bs) :- 
1749:      ( X1s = [] ->
1750:        Ls = [], Bs = []
1751:      ;
1752:        X1s = [Y | Xs], 
1753:        ( X > Y ->
1754:          Ls = [Y | L1s], part(X, Xs, L1s, Bs)
1755:        ;
1756:          Bs = [Y | B1s], part(X, Xs, Ls, B1s)
1757:        )
1758:      ).
1759: \end{verbatim}
1760: In fact, in this program all uses of unification boil down to matching and 
1761: its use does not involve backtracking.
1762: This example explains how the use of patterns often hides an implicit
1763: case analysis. By making this case analysis explicit using the {\bf
1764:   if-then-else} construct we end up with longer programs.
1765: In the end the original solution with the cut seems to be closer
1766: to the spirit of the language.
1767: \paragraph{Negation}
1768: 
1769: Finally, consider the negation operation {\tt not} that is
1770: supposed to reverse failure with success. That is, the intention
1771: is that the query {\tt not Q.} succeeds iff the query {\tt Q.} fails.
1772: This operation can be easily implemented by means of meta-variables and cut
1773: as follows:
1774: 
1775: \begin{verbatim}
1776:   not(X) :- X, !, fail.
1777:   not(_).
1778: \end{verbatim}
1779: {\tt fail}/0 is Prolog's built-in with the empty definition. Thus
1780: the call of the parameterless procedure {\tt fail} always fails.
1781: 
1782: This cryptic two line program employs several discussed features of
1783: Prolog.  In the first line {\tt X} is used as a meta-variable. 
1784: Consider now the call {\tt not(Q)}, where {\tt Q} is a query.
1785: If {\tt Q} succeeds, then the cut is performed. This has the effect that all
1786: alternative ways of computing {\tt Q} are discarded and also the
1787: second clause is discarded. Next the built-in {\tt fail} is executed and
1788: a failure arises. Since the only alternative clause was just discarded, the
1789: query {\tt not(Q)} fails.  If on the other hand the query {\tt Q}
1790: fails, then backtracking takes place and the second clause, {\tt
1791:   not(\_)} is selected. It immediately succeeds and so the initial query
1792: {\tt not(Q)} succeeds. So this definition of {\tt not} achieves the desired effect.
1793: 
1794: {\tt not/1} is defined internally by the above two line
1795: definition augmented with the appropriate declaration of
1796: it as a bracket-less prefix unary symbol.
1797: 
1798: \paragraph{Call}
1799: Finally, let us mention that Prolog also provides an indirect way of
1800: using meta-variables by means of a built-in relation {\tt call/1}.
1801: {\tt call/1} is defined internally by the rule
1802: 
1803: \begin{verbatim}
1804:   call(X) :- X.  
1805: \end{verbatim}
1806: {\tt call/1} is often used to ``mask'' the explicit use of meta-variables,
1807: but the outcome is the same.
1808: 
1809: \subsection{Negation as Failure}
1810: 
1811: The distinction between successful and failing computations
1812: is one of the unique features of logic programming and Prolog.
1813: In fact, no counterpart of failing computations exists
1814: in other programming paradigms.
1815: 
1816: The most natural way of using failing computations is by employing the
1817: negation operator \texttt{not} that allows us to turn failure into
1818: success, by virtue of the fact that the query \texttt{not Q.} succeeds
1819: iff the query \texttt{ Q.} fails.  This way we can use \texttt{not} to
1820: represent negation of a Boolean expression. In fact, we already
1821: referred informally to this use of negation at the beginning of
1822: Section \ref{sec:control}.
1823: 
1824: This suggests a declarative interpretation of the \texttt{not}
1825: operator as a classical negation. This interpretation is correct only
1826: if the negated query always terminates and is ground.  Note in
1827: particular that given the procedure \texttt{p} defined by the single
1828: rule \texttt{p :- p.} the query \texttt{not p.} does not terminate.
1829: Also, for the query \texttt{not(X = 1).} we get the following
1830: counterintuitive outcome:
1831: 
1832: \begin{verbatim}
1833: ?- not(X = 1).
1834: 
1835: no
1836: \end{verbatim}
1837: Thus to generate all elements of a list \texttt{Ls} that differ from 1
1838: the correct query is \texttt{member(X, Ls), not(X = 1).} and not
1839: \texttt{not(X = 1), member(X, Ls).}
1840: 
1841: One usually refers to the way negation is used in Prolog as ``negation
1842: as failure''.  When properly used it is a powerful feature as testified
1843: by the following jewel program.  
1844: We consider the problem of determining a winner in a two-person finite
1845: game.  Suppose that the moves in the game are represented by a
1846: relation \texttt{move}. The game is assumed to be finite, so we
1847: postulate that given a position \texttt{pos} the query
1848: \texttt{move(pos, Y).} generates finitely many answers which are all
1849: possible moves from \texttt{pos}.  A player loses if he is in a
1850: position \texttt{pos} from which no move exists, i.e., if the query
1851: \texttt{move(pos, Y).}  fails.
1852: 
1853: A position is a winning one when a move exists which
1854: leads to a losing, i.e., non-winning position. Using the negation
1855: operator this can be written as
1856: \begin{verbatim}
1857: % win(X) :- X is a winning position in the two-person finite game 
1858: %           represented by the relation move.
1859:   win(X) :- move(X, Y), not win(Y).
1860: \end{verbatim}
1861: So this remarkably concise program has a simple declarative
1862: interpretation.  In contrast, the procedural interpretation is quite
1863: complex: the query \texttt{win(pos).} determines whether \texttt{pos}
1864: is a winning position by performing a minimax search on the 0-1 game
1865: tree represented by the relation \texttt{move}. In this recursive
1866: procedure the base case appears when the call to \texttt{move}
1867: fails---then the corresponding call of \texttt{win} also fails.
1868: 
1869: \subsection{Higher-Order Programming and Meta-Programming in Prolog}
1870: 
1871: Thanks to the ambivalent syntax and meta-variables higher-order
1872: programming and another form of meta-programming can be easily
1873: realized in Prolog.  To explain this we need two more built-ins.
1874: Each of them belongs to a different category.
1875: 
1876: \subsubsection*{Term Inspection Facilities}
1877: \label{subsec:tif}
1878: 
1879: Prolog offers a number of built-in relations that allow us to inspect,
1880: compare and decompose terms.  One of them is \texttt{=../2} (pronounced
1881: \emph{univ}) that allows us to switch between a term and its representation as
1882: a list.  Instead of describing precisely its meaning we just
1883: illustrate one of its uses by means the following query:
1884: 
1885: \begin{verbatim}
1886: ?- Atom =.. [square, [1,2,3,4], Ys].
1887: 
1888: Atom = square([1,2,3,4], Ys).
1889: \end{verbatim}
1890: So the left-hand side, here \texttt{Atom}, is unified with the term
1891: (or, equivalently, the atom), here \texttt{square([1,2,3,4], Ys)},
1892: represented by a list on the right-hand side, here \texttt{[square,
1893:   [1,2,3,4], Ys]}. In this list representation of a term the head of
1894: the list is the leading function symbol and the tail is the list of the
1895: arguments.
1896: 
1897: So using \emph{univ} one can construct terms and pass them as arguments.
1898: More interestingly, one can construct atoms and execute them using the
1899: meta-variable facility. This way it is possible to realize
1900: higher-order programming in Prolog in the sense that relations can be
1901: passed as arguments.  To illustrate this point consider the following
1902: program \texttt{MAP}:
1903: 
1904: \begin{verbatim}
1905: % map(P, Xs, Ys) :- the list Ys is the result of applying P 
1906: %                   elementwise to the list Xs.
1907:   map(P, [], []).
1908:   map(P, [X | Xs] , [Y | Ys]) :- apply(P, [X, Y]), map(P, Xs, Ys).
1909: 
1910: % apply(P, [X1, ..., Xn]) :- execute the atom P(X1, ..., Xn).
1911:   apply(P, Xs) :- Atom =.. [P|Xs], Atom.
1912: \end{verbatim}
1913: In the last rule \emph{univ} is used to construct an atom.
1914: Note the use of the meta-variable \texttt{Atom}.
1915: \texttt{MAP} is Prolog's counterpart of the familiar higher-order
1916: functional program and it behaves in the expected way.  For example,
1917: given the program
1918: \begin{verbatim}
1919: % square(X, Y) :- Y is the square of X.
1920:   square(X, Y) :- Y is X*X.
1921: \end{verbatim}
1922: we get
1923: 
1924: \begin{verbatim}
1925: ?- map(square, [1,2,3,4], Ys).
1926: 
1927: Ys = [1, 4, 9, 16]
1928: \end{verbatim}
1929: 
1930: \subsubsection*{Program Manipulation Facilities}
1931: 
1932: Another class of Prolog built-ins makes it possible to access and
1933: modify the program during its execution.  We consider here a single
1934: built-in in this category, {\tt clause/2} \index{{\tt clause/2}}, that
1935: allows us to access the definitions of the relations present in the
1936: considered program. Again, consider first an example of its use in
1937: which we refer to the program {\tt MEMBER} of Subsection
1938: \ref{subsec:multiple}.
1939: 
1940: \begin{verbatim}
1941: ?- clause(member(X,Y), Z).
1942: 
1943: Y = [X|_A],
1944: Z = true ;
1945: 
1946: Y = [_A|_B],
1947: Z = member(X,_B) ;
1948: 
1949: no
1950: \end{verbatim}
1951: 
1952: In general, the call {\tt clause(head, body)} leads to a unification
1953: of the term {\tt head :- body} with the successive clauses forming the
1954: definition of the relation in question.  This relation, here
1955: \texttt{member}, is the leading symbol of the first argument of
1956: {\tt clause/2} that has to be a non-variable.
1957: 
1958: This built-in assumes that {\tt true} is the body of a fact, here
1959: \texttt{member(X, [X | \_])}.  {\tt true}/0 is Prolog's built-in that
1960: succeeds immediately. So its definition consists just of the fact
1961: \texttt{true}.  This explains the first answer.  The second answer is
1962: the result of unifying the term \texttt{member(X,Y) :- Z} with (a
1963: renaming of) the second clause defining \texttt{member}, namely
1964: \texttt{member(X, [\_ | Xs]):- member(X, Xs)}.
1965: 
1966: Using {\tt clause/2} we can construct Prolog interpreters written in
1967: Prolog, that is, \emph{meta-interpreters}.  Here is the simplest one.
1968: 
1969: \begin{verbatim}
1970: % solve(Q) :- the query Q succeeds for the program accessible by clause/2.
1971:   solve(true) :- !. 
1972:   solve((A,B)) :- !, solve(A), solve(B). 
1973:   solve(A) :- clause(A, B), solve(B).
1974: \end{verbatim}
1975: 
1976: Recall that {\tt (A,B)} is a legal Prolog term (with no leading function symbol).
1977: To understand this program one needs to know that the comma ``,''
1978: between the atoms is declared internally as a right associative infix
1979: operator, so the query {\tt A,B,C,D} actually stands for the term {\tt
1980:   (A,(B,(C,D)))}, etc.
1981: 
1982: The first clause states that the built-in {\tt true} succeeds immediately.
1983: The second clause states that a query of the form
1984: $A, \vect{B}$ can be solved if $A$ can be solved and $\vect{B}$
1985: can be solved. Finally, the last clause states that an atomic query
1986: $A$ can be solved if there exists a clause of the form $A :- \ \vect{B}$
1987: such that the query $\vect{B}$ can be solved.
1988: The cuts are used here to enforce the a ``definition by cases'':
1989: either the argument of {\tt solve} is {\tt true} or a non-atomic
1990: query or else an atomic one.
1991: 
1992: To illustrate the behavior of the above meta-interpreter
1993: assume that {\tt MEMBER} is a part of the considered
1994: program. We then have
1995: \begin{verbatim}
1996: ?- solve(member(X, [mon, wed, fri])).
1997: 
1998: X = mon ;
1999: 
2000: X = wed ;
2001: 
2002: X = fri ;
2003: 
2004: no
2005: \end{verbatim}
2006: This meta-program forms a basis for building various types of interpreters
2007: for larger fragments of Prolog or for its extensions.
2008: 
2009: \section{Assessment of Prolog}
2010: 
2011: Prolog, due to its origin in automated theorem proving, is an unusual
2012: programming language.  It leads to a different style of programming
2013: and to a different view of programming.  A number of elegant Prolog
2014: programs presented here speak for themselves. We also noted that the
2015: same Prolog program can often be used for different purposes ---for
2016: computing, testing or completing a solution, or for computing all
2017: solutions.  Such programs cannot be easily written in other
2018: programming paradigms. Logical variables are a unique and, as we saw,
2019: very useful feature of logic programming.  Additionally, pure Prolog
2020: programs have a dual interpretation as logical formulas. In this sense
2021: Prolog supports declarative programming.
2022: 
2023: Both through the development of a programming methodology and ingenious
2024: implementations a great care was taken to overcome possible sources
2025: of inefficiency.  On the programming level we already discussed cut
2026: and the difference lists.  Programs such as {\tt FACTORIAL} of Subsection
2027: \ref{subsec:eval} can be optimized by means of tail recursion.  On
2028: the implementation level the efficiency is improved by such techniques
2029: as the last call optimization that can be used to optimize tail recursion,
2030: indexing that deals with the presence of multiple clauses, and a default
2031: omission of the occur-check (the test ``$x$ does not occur in $t$'' in
2032: clause (5) of the Martelli--Montanari algorithm) that speeds up the
2033: unification process (though on rare occasions makes it unsound).
2034: 
2035: Prolog's only data type, the terms, is implicitly present in many
2036: areas of computer science. In fact, whenever the objects of interest
2037: are defined by means of grammars, for example first-order formulas,
2038: digital circuits, programs in any programming language, or sentences
2039: in some formal language, these objects can be naturally defined as
2040: terms.  Prolog programs can then be developed starting with this
2041: representation of the objects as terms.  Prolog's support for handling
2042: terms by means of unification and various term inspection facilities
2043: comes then handy. In short, symbolic data can be naturally handled in
2044: Prolog.
2045: 
2046: Automatic backtracking becomes very useful when dealing with
2047: search. Search is of paramount importance in many artificial
2048: intelligence applications and backtracking itself is most natural when
2049: dealing with NP-complete problems.  Moreover, the principle of
2050: ``computation as deduction'' underlying Prolog's computation process
2051: facilitates formalization of various forms of reasoning in Prolog. In
2052: particular, Prolog's negation operator \texttt{not} can be naturally
2053: used to support non-monotonic reasoning.  All this explains why Prolog
2054: is a natural language for programming artificial intelligence
2055: applications, such as automated theorem provers, expert systems and
2056: machine learning programs where reasoning needs to be combined with
2057: computing, game playing programs, and various decision support
2058: systems.
2059: 
2060: Prolog is also an attractive language for computational linguistics
2061: applications and for compiler writing. In fact, Prolog provides
2062: support for so-called definite clause grammars (DCG).  Thanks to this
2063: a grammar written in the DCG form is already a Prolog program that
2064: forms a parser for this grammar.  The fact that Prolog allows one to
2065: write executable specifications makes it also a useful language
2066: for rapid prototyping, in particular in the area of meta-programming.
2067: 
2068: For the sake of a balanced presentation let us discuss now Prolog's shortcomings.
2069: 
2070: \paragraph{Lack of Types}
2071: 
2072: Types are used in programming languages to structure the data manipulated
2073: by the program and to ensure its correct use.
2074: In Prolog one can define various types like binary trees and
2075: records.  Moreover, the language provides a notation for lists and offers a
2076: limited support for the type of all numbers by means of the arithmetic
2077: operators and arithmetic comparison relations.  However, Prolog does
2078: not support types in the sense that it does not check whether the
2079: queries use the program in the intended way.  
2080: 
2081: Because of this absence of type checking, type errors are
2082: easy to make but difficult to find.  For example, even though the
2083: \texttt{APPEND} program was meant to be used to concatenate two lists
2084: it can also be used with non-lists as arguments:
2085: \begin{verbatim}
2086: ?-  append([a,b], f(c), Zs). 
2087: 
2088: Zs = [a, b|f(c)]
2089: \end{verbatim}
2090: and no error is reported.  In fact, almost every Prolog program can be
2091: misused.  Moreover, because of lack of type checking some improvements
2092: of the efficiency of the implementation cannot be carried out and various
2093: run-time errors cannot be prevented.
2094: 
2095: \paragraph{Subtle Arithmetic}
2096: We discussed already the subtleties arising in presence of arithmetic
2097: in Section \ref{sec:arithmetic}.  We noted that Prolog's facilities
2098: for arithmetic easily lead to run-time errors. It would be desirable
2099: to discover such errors at compile time but this is highly non-trivial.
2100: 
2101: \paragraph{Idiosyncratic control}
2102: 
2103: Prolog's control mechanisms are difficult to master by programmers
2104: accustomed to the imperative programming style.  One of the reasons is
2105: that both bounded iteration (the {\tt for} statement) and unbounded
2106: iteration (the {\tt while} statement) need to be implemented by means
2107: of recursion. So for example a nested {\tt for} statement is
2108: implemented by means of nested tail recursion that is less easy to
2109: understand. Of course, one can introduce both constructs by means of
2110: meta-programming but then their proper use is not enforced due to
2111: the lack of types.
2112: Additionally, as already mentioned, cut is a low level mechanism
2113: that is not easy to understand.
2114: 
2115: \paragraph{Complex semantics of various built-ins}
2116: 
2117: Prolog offers a large number of built-ins. In fact, the ISO Prolog
2118: Standard \cite{Iso95} describes 102 built-ins.  Several of them are
2119: quite subtle.  For example, the query \texttt{not(not Q).}  tests
2120: whether the query \texttt{Q.} succeeds and this test is carried out
2121: without changing the state, i.e., without binding any of the
2122: variables. Moreover, it is not easy to describe precisely the
2123: meaning of some of the built-ins. For example, in the ISO Prolog
2124: Standard the operational interpretation of the {\bf if-then-else}
2125: construct consists of 17 steps.
2126: 
2127: \paragraph{No Modules and no Objects}
2128: 
2129: Finally, even though modules exist in many widely used Prolog
2130: versions, neither modules nor objects are present in ISO Prolog
2131: Standard. This makes it difficult to properly structure Prolog
2132: programs and to reuse them as components of other Prolog programs.  It
2133: should be noted that thanks to Prolog's support for meta-programming
2134: the object-programming style can be mimicked in Prolog in a pretty
2135: simple way. But no compile-time checking of its proper use is then
2136: enforced and errors in the program design will be discovered at
2137: best at the run-time.  The same critique applies to Prolog's approach
2138: to higher-order programming and to meta-programming.  \medskip
2139: 
2140: Of course, these limitations of Prolog were recognized by many
2141: researchers who came up with various good proposals how to improve
2142: Prolog's control, how to add to it (or how to infer) types, and how to
2143: provide modules and objects. Research in the field of logic
2144: programming has also dealt with the precise relation between the
2145: procedural and declarative interpretation of logic programs and a
2146: declarative account of various aspects of Prolog, including negation
2147: and meta-programming. Also verification of Prolog programs and its
2148: semantics were extensively studied.
2149: 
2150: However, no single programming language proposal emerged yet that
2151: could be seen as a natural successor to Prolog in which the above
2152: shortcomings are properly taken care of.  The language that comes
2153: closest to this ideal is Mercury (see
2154: \verb+http://www.cs.mu.oz.au/research/mercury/+).  Colmerauer
2155: \index{Colmerauer, A.} himself designed a series of successors of
2156: Prolog, Prolog II, III and IV that incorporated various forms of
2157: constraint processing into this programming style.
2158: 
2159: When assessing Prolog it is useful to have in mind that it is a
2160: programming language designed in the early seventies (and standardized in
2161: the nineties).  The fact that it is still widely used and that new
2162: applications for it keep being found testifies to its originality.  No
2163: other programming language succeeded to embrace first-order logic in
2164: such an effective way.
2165: 
2166: \section{Bibliographic Remarks}
2167: 
2168: For those interested to learn in detail the origins of logic
2169: programming and of Prolog there is no better place to start than to
2170: read the fascinating account in \cite{CR96}.
2171: There a number of excellent books on programming in Prolog.  The two
2172: deservedly most successful are \index{Bratko, I.} Bratko \cite{Bra01}
2173: and \index{Sterling, L.} \index{Shapiro, E.}  Sterling and Shapiro
2174: \cite{SS94}.  The book of O'Keefe \cite{O'K90} \index{O'Keefe, R.A.}
2175: discusses in depth the efficiency and pragmatics of programming in
2176: Prolog.  A{\"{\i}t}-Kaci \cite{Ait91} \index{A{\"{\i}t}-Kaci, H.} is
2177: an outstanding tutorial on the implementation of Prolog.
2178: 
2179: \section{Summary}
2180: We discussed here the logic programming paradigm and its realization
2181: in Prolog.  This paradigm has contributed a number of novel ideas
2182: in the area of programming languages. It introduced unification as a
2183: computation mechanism and it realized the concept of ``computation as
2184: deduction''.  Additionally, it showed that a fragment of first-order
2185: logic can be used as a programming language and that declarative
2186: programming is an interesting alternative to structured programming in
2187: the imperative programming style.
2188: 
2189: Prolog is a rule based language but thanks to a large number of
2190: built-ins it is a general purpose programming language.  Programming
2191: in Prolog substantially differs from programming in the imperative
2192: programming style. The following table may help to relate the underlying
2193: concepts used in both programming styles.
2194: 
2195: \medskip
2196: 
2197: \begin{center}
2198:   
2199: \begin{tabular}{|l|l|}
2200: \hline
2201: logic programming                  & imperative programming \\ \hline
2202: equation solved by unification    & assignment \\
2203: relation symbol                    & procedure identifier \\
2204: term                               & expression \\
2205: atom                               & procedure call \\
2206: query                              & program \\
2207: definition of a relation           & procedure declaration \\
2208: local variable of a rule           & local variable of a procedure \\
2209: logic program                      & set of procedure declarations \\
2210: ``,'' between atoms                & sequencing (``;'')  \\ 
2211: substitution                       & state \\ 
2212: composition of substitutions       & state update \\
2213: \hline
2214: \end{tabular}  
2215: \end{center}
2216: \medskip
2217: 
2218: \subsection*{Acknowledgements}
2219: Maarten van Emden and Jan Smaus provided us with useful comments on
2220: this article.
2221: 
2222: 
2223: \begin{thebibliography}{10}
2224: 
2225: \bibitem{Ait91}
2226: H.~A{\"{\i}t}-Kaci.
2227: \newblock {\em Warren's Abstract Machine}.
2228: \newblock MIT Press, 1991.
2229: \newblock Out of print. Available via
2230:   \verb+http://www.isg.sfu.ca/~hak/documents/wam.html+.
2231: 
2232: \bibitem{Bra01}
2233: I.~Bratko.
2234: \newblock {\em {PROLOG Programming for Artificial Intelligence}}.
2235: \newblock International Computer Science Series. Addison-Wesley, third edition,
2236:   2001.
2237: 
2238: \bibitem{CR96}
2239: A.~Colmerauer and Ph. Roussel.
2240: \newblock The birth of {Prolog}.
2241: \newblock In Thomas~J. Bergin and Richard~G. Gibson, editors, {\em History of
2242:   Programming Languages}, pages 331--367. ACM Press/Addison-Wesley, 1996.
2243: 
2244: \bibitem{CF58}
2245: H.B. Curry and R.~Feys.
2246: \newblock {\em Combinatory Logic, Volume I, Studies in Logic and the Foundation
2247:   of Mathematics}.
2248: \newblock North-Holland, Amsterdam, 1958.
2249: 
2250: \bibitem{Iso95}
2251: {I}nternational {S}tandard. {I}nformation {T}echnology --- {P}rogramming
2252:   {L}anguage --- {P}rolog, {P}art 1: {G}eneral {C}ore, 1995.
2253: \newblock ISO/IEC DIS 13211-1:1995(E).
2254: 
2255: \bibitem{Kow74}
2256: R.A. Kowalski.
2257: \newblock Predicate logic as a programming language.
2258: \newblock In {\em Proceedings IFIP'74}, pages 569--574. North-Holland, 1974.
2259: 
2260: \bibitem{MM82}
2261: A.~Martelli and U.~Montanari.
2262: \newblock An efficient unification algorithm.
2263: \newblock {\em ACM Transactions on Programming Languages and Systems},
2264:   4:258--282, 1982.
2265: 
2266: \bibitem{O'K90}
2267: R.A. O'Keefe.
2268: \newblock {\em The Craft of {P}rolog}.
2269: \newblock MIT Press, 1990.
2270: 
2271: \bibitem{Rob65}
2272: J.A. Robinson.
2273: \newblock A machine-oriented logic based on the resolution principle.
2274: \newblock {\em Journal of the ACM}, 12(1):23--41, 1965.
2275: 
2276: \bibitem{SS94}
2277: L.~Sterling and E.~Shapiro.
2278: \newblock {\em The Art of {P}rolog}.
2279: \newblock The MIT Press, Cambridge, Mass., second edition, 1994.
2280: 
2281: \end{thebibliography}
2282: 
2283: %\bibliographystyle{plain}
2284: 
2285: %\bibliography{/ufs/apt/bib/01}
2286: \end{document}
2287: 
2288: 
2289: 
2290: 
2291: 
2292: 
2293: 
2294: 
2295: 
2296: