cs0304006/cmplg.tex
1: \documentclass[10pt]{article}
2: 
3: 
4: \usepackage{hltnaacl03}
5: \usepackage{times}
6: \usepackage{latexsym}
7: \usepackage{epsfig}
8: \usepackage{xspace}
9: \newcommand{\epsfscaledbox}[2]{\centerline{\psfig{figure=#1,width=#2}}}
10: \newcommand{\omt}[1]{}
11: \newcommand{\bibsnip}{\vspace*{-.11in}}
12: \newcommand{\proc}{Proc.\xspace}
13: \newcommand{\U}[1]{\underline{#1}}
14: \newcommand{\UU}[1]{\underline{\underline{#1}}}
15: \newcommand{\comment}[1]{{\bf !!- - - #1 - - -  !!}}
16: \newcommand{\Lattice}{Lattice\xspace}
17: \newcommand{\lattice}{lattice\xspace}
18: \newcommand{\lattices}{lattices\xspace}
19: \newcommand{\slotlat}{slotted \lattice}
20: \newcommand{\slotlats}{slotted \lattices}
21: \newcommand{\template}[1]{{\sf #1}}
22: \newcommand{\corpus}{C}
23: 
24: \newenvironment{frameit}[1]
25:   {\begin{tabular}{|p{#1}|}\hline}{\\\hline\end{tabular}}
26: 
27: \newcommand{\textexample}[1]{
28:   {\noindent
29:     \begin{center}
30:       \fbox{\parbox{0.45\textwidth}{\small\sf #1}}
31:     \end{center}}}
32: 
33: \setlength\titlebox{6.5cm}
34: \title{\vspace{-75pt}
35: {\normalsize {\it \hfill Proceedings of HLT/NAACL 2003}} \\ \mbox{}\\Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment}
36: 
37: \author{Regina Barzilay \and Lillian Lee \\ 
38: Department of Computer Science \\
39: Cornell University \\
40: Ithaca, NY 14853-7501 \\
41: \{regina,llee\}@cs.cornell.edu}
42: 
43: \date{}
44: 
45: \begin{document}
46: \maketitle
47: \begin{abstract}
48:   We address the text-to-text generation problem of sentence-level paraphrasing
49:   --- a phenomenon distinct from and more difficult than word- or phrase-level
50:   paraphrasing.  Our approach applies {\em multiple-sequence alignment} to
51:   sentences gathered from unannotated comparable corpora: it learns a set of
52:   paraphrasing patterns represented by {\em word lattice} pairs and
53:   automatically determines how to apply these patterns to rewrite new
54:   sentences.  The results of our evaluation experiments show that the system
55:   derives accurate paraphrases, outperforming baseline systems.
56: \end{abstract}
57: 
58: 
59: 
60: \section{Introduction}
61: 
62: \begin{quote}
63: {\em This is a late parrot! It's a stiff! Bereft of life, it rests in
64: peace! If you hadn't nailed him to the perch he would be pushing up
65: the daisies! Its metabolical processes are of interest only to
66: historians! It's hopped the twig! It's shuffled off this mortal coil!
67: It's rung down the curtain and joined the choir invisible! This is
68: an EX-PARROT!} --- Monty Python, ``Pet Shop'' 
69: \end{quote}
70: 
71: A mechanism for automatically generating multiple paraphrases of a
72: given sentence would be of significant practical import for
73: text-to-text generation systems.  Applications include summarization
74: \cite{Knight&Marcu:2000a} and rewriting
75: \cite{Chandrasekar+Srinivas:97a}: both could employ such a mechanism
76: to produce candidate sentence paraphrases
77: that other
78: system components  would filter for length, sophistication level, and
79: so forth.\footnote{Another interesting application,
80:   somewhat tangential to generation, would be
81: to expand existing corpora by providing several versions of their
82: component sentences.  
83: This could, for example, aid machine-translation evaluation, where it has
84: become common to evaluate systems by comparing their output against a bank of
85: several reference translations for the same sentences
86: \cite{Papineni&al:2002a}.
87: See \newcite{Bangalore&Murdock&Riccardi:2002a} and
88: \newcite{Barzilay&Lee:2002a} for other uses of such data.}
89: Not surprisingly, therefore,  
90: paraphrasing has been a focus of generation
91: research for quite some time
92: \cite{McKeown:79a,Meteer+Shaked:88a,Dras:1999a}. 
93: 
94: One might initially suppose that sentence-level paraphrasing is simply the
95: result of word-for-word or phrase-by-phrase substitution applied in a domain-
96: and context-independent fashion.  However, in studies of paraphrases across
97: several domains
98: \cite{Iordanskaja&Kittredge&Polguere:1991a,Robin-phd,McKeown&Kukich&Shaw:1994a},
99: this was generally not the case.
100: For instance, consider the following two sentences (similar to
101: examples found in  \newcite{Smadja&McKeown:1991a}):
102:   \begin{center}
103:     \begin{frameit}{0.9\columnwidth}
104:     {\small    After the latest Fed rate cut, stocks rose across the board.}
105:       \\\hline
106:       {\small Winners strongly outpaced losers after Greenspan cut
107:       interest rates again.}
108:     \end{frameit}
109:   \end{center}
110:   Observe that ``Fed'' (Federal Reserve) and ``Greenspan'' are interchangeable
111:   only in the domain of US financial matters.  Also, note that one cannot draw
112:   one-to-one correspondences between single words or phrases.  For instance,
113:   nothing in the second sentence is really equivalent to ``across the board'';
114:   we can only say that the entire clauses ``stocks rose across the board'' and
115:   ``winners strongly outpaced losers'' are paraphrases.  This evidence suggests
116:   two consequences: (1) we cannot rely solely on generic domain-independent
117:   lexical resources for the task of paraphrasing, and (2) {\em sentence-level}
118:   paraphrasing is an important problem extending beyond that of paraphrasing
119:   smaller lexical units.
120:   
121:   {\em Our work presents a novel knowledge-lean algorithm that uses {\em
122:       multiple-sequence alignment} (MSA) to {\em learn} to generate
123:     sentence-level paraphrases essentially from unannotated corpus data alone.}
124:   In contrast to previous work using MSA for generation
125:   \cite{Barzilay&Lee:2002a}, we need neither parallel data nor explicit
126:   information about sentence semantics.  Rather, we use two {\em comparable
127:     corpora}, in our case, collections of articles produced by two different
128:   newswire agencies about the same events.  The use of related corpora is key:
129:   we can capture paraphrases that on the surface bear little resemblance but
130:   that, by the nature of the data, must be descriptions of the same
131:   information.  Note that we also acquire paraphrases from each of the
132:   individual corpora; but the lack of clues as to sentence equivalence in
133:   single corpora means that we must be more conservative, only selecting as
134:   paraphrases items that are structurally very similar.
135:   
136:   Our approach has three main steps.  First, working on each of the comparable
137:   corpora separately, we compute {\em \lattices} --- compact graph-based
138:   representations --- to find commonalities within (automatically derived)
139:   groups of structurally similar sentences.  Next, we identify pairs of
140:   lattices from the two different corpora that are paraphrases of each other;
141:   the identification process checks whether the lattices take similar
142:   arguments.  Finally, given an input sentence to be paraphrased, we match it
143:   to a lattice and use a paraphrase from the matched lattice's mate to generate
144:   an output sentence.  The key features of this approach are:
145: 
146: \noindent
147: \textbf{Focus on paraphrase generation.} In contrast to earlier work, we not
148: only extract paraphrasing rules, but also automatically determine which of the
149: potentially relevant rules to apply to an input sentence and produce a revised
150: form using them.
151: 
152: \noindent
153: \textbf{Flexible paraphrase types.} Previous approaches to paraphrase
154: acquisition focused on certain rigid types of paraphrases, for instance,
155: limiting the number of arguments.  In contrast, our method is not limited to a
156: set of {\it a priori}-specified paraphrase types.
157: 
158: \noindent
159: \textbf{Use of comparable corpora and minimal use of knowledge resources.}  In
160: addition to the advantages mentioned above, comparable corpora can be easily
161: obtained for many domains, whereas previous approaches to paraphrase
162: acquisition (and the related problem of phrase-based machine translation
163: \cite{Wang:1998a,Och&Tillman&Ney:1999a,Vogel&Ney:2000a}) required parallel
164: corpora.  We point out that one such approach, recently proposed by
165: \newcite{Pang+Knight+Marcu:03a}, also represents paraphrases by lattices,
166: similarly to our method, although their lattices are derived using parse
167: information.
168: 
169: 
170: Moreover, our algorithm does not employ knowledge resources such as parsers or
171: lexical databases, which may not be available or appropriate for all domains
172: --- a key issue since paraphrasing is typically domain-dependent.  Nonetheless,
173: our algorithm achieves good performance.
174: 
175: 
176: 
177: \section{Related work}
178: Previous work on automated paraphrasing has considered different levels of
179: paraphrase granularity.  Learning synonyms via distributional similarity has
180: been well-studied \cite{Pereira&Tishby&Lee:1993a,Grefenstette:94a,Lin:1998a}.
181: \newcite{Jacquemin:l999a} and \newcite{Barzilay&McKeown:01a} identify
182: phrase-level paraphrases, while \newcite{Lin&Pantel:2001a} and
183: \newcite{Shinyama&al:2002a} acquire structural paraphrases encoded as
184: templates.  These latter are the most closely related to the sentence-level
185: paraphrases we desire, and so we focus in this section on template-induction
186: approaches.
187: 
188: \newcite{Lin&Pantel:2001a} extract inference rules, which are related to
189: paraphrases (for example, \template{X wrote Y} implies \template{X is the
190:   author of Y}), to improve question answering.  They assume that {\em paths}
191: in dependency trees that take similar arguments (leaves) are close in meaning.
192: However, only two-argument templates are considered.
193: \newcite{Shinyama&al:2002a} also use dependency-tree information to extract
194: templates of a limited form (in their case, determined by the underlying
195: information extraction application).  Like us (and unlike Lin and Pantel, who
196: employ a single large corpus), they use articles written about the same event
197: in different newspapers as data.
198: 
199: Our approach shares two characteristics with the two methods just described:
200: pattern comparison by analysis of the patterns' respective arguments, and use
201: of non-parallel corpora as a data source.  However, {\em extraction} methods
202: are not easily extended to {\em generation} methods.  One problem is that their
203: templates often only match small fragments of a sentence.  While this is
204: appropriate for other applications, deciding whether to use a given template to
205: generate a paraphrase requires information about the surrounding context
206: provided by the entire sentence.
207: 
208: 
209: \newcommand{\slot}{slot\xspace}
210: \newcommand{\slots}{slots\xspace}
211: \newcommand{\findclusters}{Sentence clustering}
212: \newcommand{\families}{clusters\xspace}
213: \newcommand{\Families}{Clusters\xspace}
214: \newcommand{\family}{cluster\xspace}
215: \newcommand{\famlat}{\lattice}
216: \newcommand{\famlats}{\lattices}
217: \newcommand{\msg}{pattern\xspace}
218: 
219: \newcommand{\patterninformal}{pattern\xspace}
220: \newcommand{\patternsinformal}{patterns\xspace}
221: \newcommand{\Patterninformal}{Pattern\xspace}
222: \newcommand{\surprise}{surprise\xspace}
223: \newcommand{\backbone}{backbone\xspace}
224: \newcommand{\numtoken}{NUM}
225: \newcommand{\nametoken}{NAME}
226: \newcommand{\datetoken}{DATE}
227: 
228: 
229: 
230: \section{Algorithm}
231: 
232: 
233: \paragraph{Overview} We first sketch the algorithm's broad outlines. The subsequent subsections provide
234: more detailed descriptions of the individual steps.
235: 
236: The major goals of our algorithm are to learn: 
237: \begin{itemize}
238: \item  recurring {\patternsinformal} in the data, such as  \template{X
239: (injured/wounded) Y people, Z seriously}, where the capital letters 
240: represent variables; 
241: \item
242: pairings between such \patternsinformal that represent paraphrases, for
243: example, between the \patterninformal \template{X (injured/wounded) Y people,
244: Z of them seriously} and the \patterninformal \template{Y were
245: (wounded/hurt) by X, among them Z were in serious condition}.
246: \end{itemize}
247: 
248: Figure~\ref{fig:arch} illustrates the main stages of our approach.  During
249: training, \patterninformal induction is first applied independently to the two
250: datasets making up a pair of {comparable corpora}.  Individual
251: \patternsinformal are learned by applying {\em multiple-sequence alignment} to
252: \families of sentences describing approximately similar events; these
253: \patternsinformal are represented compactly by {\em \lattices} (see Figure
254: \ref{fig:lattice}).  We then check for \lattices from the two different corpora
255: that tend to take the same arguments; these \lattice pairs are taken to be
256: paraphrase \patternsinformal.
257: 
258: \begin{figure}
259: \begin{center}
260: \epsfscaledbox{arch.eps}{2.2in}
261: \end{center}
262: \vspace*{-.2in}
263: \caption{\label{fig:arch} System architecture.}
264: \end{figure}
265: 
266: Once training is done, we can generate paraphrases as follows: given the
267: sentence ``The \surprise bombing injured twenty people, five of them
268: seriously'', we match it to the lattice \template{X (injured/wounded) Y people,
269:   Z of them seriously} which can be rewritten as \template{Y were
270:   (wounded/hurt) by X, among them Z were in serious condition}, and so by
271: substituting arguments we can generate ``Twenty were wounded by the \surprise
272: bombing, among them five were in serious condition'' or ``Twenty were hurt by
273: the \surprise bombing, among them five were in serious condition''.
274: 
275: \begin{figure}
276: \newcounter{sentexample}\setcounter{sentexample}{1}
277: \newcommand{\sentex}[1]{{\footnotesize (\thesentexample)~#1 \stepcounter{sentexample}}}
278: \fbox{
279: \begin{minipage}{3in}
280:   \sentex{\textbf{A Palestinian suicide bomber blew himself up in} a southern
281:     city Wednesday, \textbf{killing} two other \textbf{people}
282:     \textbf{and wounding} 27.} \\
283:   \sentex{\textbf{A suicide bomber blew himself up in} the settlement of Efrat,
284:     on Sunday, \textbf{killing} himself \textbf{and injuring}
285:     seven people.} \\
286:   \sentex{\textbf{A suicide bomber blew himself up in} the coastal resort of
287:     Netanya on Monday, \textbf{killing} three other \textbf{people} 
288:     \textbf{and wounding} dozens more.} \\
289:   \sentex{\textbf{A Palestinian suicide bomber blew himself up in} a garden
290:     cafe on Saturday, \textbf{killing} 10 \textbf{people}  \textbf{and wounding}
291:     54.} \\
292:   \sentex{\textbf{A suicide bomber blew himself up in} the centre of Netanya on
293:     Sunday, \textbf{killing} three \textbf{people} as well as himself 
294:     \textbf{and injuring} 40. }
295: \end{minipage}
296: }
297: \caption{\label{fig:cluster} Five sentences (without date, number,
298:   and name substitution) from a \family of 49, similarities emphasized.  }
299: \end{figure} 
300: 
301: 
302: 
303: \begin{figure*}
304:   \psfig{figure=msa-new.ps,width=6.5in}
305: \caption{\label{fig:lattice} \Lattice and 
306:   \slotlat for the five sentences from Figure \ref{fig:cluster}.  Punctuation
307:   and articles removed for clarity.}
308: \end{figure*}
309: 
310: \subsection{\findclusters}
311: 
312: Our first step is to cluster sentences into groups from which to learn useful
313: patterns; for the multiple-sequence techniques we will use, this means that the
314: sentences within \families should describe similar events and have similar
315: structure, as in the sentences of Figure \ref{fig:cluster}.  This is
316: accomplished by applying hierarchical complete-link clustering to the sentences
317: using a similarity metric based on word n-gram overlap ($n=1,2,3,4$).  The only
318: subtlety is that we do not want mismatches on sentence details (e.g., the
319: location of a raid) to cause sentences describing the same type of occurrence
320: (e.g., a raid) to be separated, as this might yield \families too
321: fragmented for effective learning to take place. (Moreover, variability in the
322: {\em arguments} of the sentences in a cluster is needed for our learning
323: algorithm to succeed; see below.)  We therefore first
324: replace all appearances of dates, numbers, and proper names\footnote{Our crude
325:   proper-name identification method was to flag every phrase (extracted by a
326:   noun-phrase chunker) appearing capitalized in a non-sentence-initial position
327:   sufficiently often.  }  with generic tokens.  \Families with fewer than ten
328: sentences are discarded.
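
The clustering criterion above can be made concrete as follows.  The exact
form of the n-gram overlap metric is not given here, so the Jaccard-style
average below, and the symbols $G_n$, $\sigma$, and $\theta$, are assumptions
introduced purely for illustration.  Writing $G_n(s)$ for the set of word
$n$-grams of a sentence $s$ after token substitution, one could take
\[
  \sigma(s,t) = \frac{1}{4} \sum_{n=1}^{4}
    \frac{|G_n(s) \cap G_n(t)|}{|G_n(s) \cup G_n(t)|},
\]
counting a term as zero when neither sentence has any $n$-grams of that
length.  Complete-link clustering scores two \families $A$ and $B$ by their
least similar members,
\[
  \sigma(A,B) = \min_{s \in A,\; t \in B} \sigma(s,t),
\]
and repeatedly merges the highest-scoring pair until no pair scores at least a
threshold $\theta$.  Every two sentences in a resulting \family are then at
least $\theta$-similar.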
329: 
330: 
331: \newcommand{\art}[1]{}
332: \newcommand{\monthtoken}{MONTH\xspace}
333: \newcommand{\mayseven}{\datetoken~\numtoken \xspace}
334: \newcommand{\palestinian}{\nametoken\xspace}
335: \newcommand{\southta}{\nametoken\xspace}
336: \newcommand{\marchnine}{\datetoken~\numtoken \xspace}
337: \newcommand{\jerusalem}{\nametoken\xspace}
338: \newcommand{\ipmasharon}{\nametoken\xspace}
339: \newcommand{\marchten}{\datetoken~\numtoken \xspace}
340: \newcommand{\saturday}{\datetoken\xspace}
341: \newcommand{\marchthirtyone}{\datetoken~\numtoken \xspace}
342: \newcommand{\afpsource}{\nametoken\xspace}
343: \newcommand{\jewish}{\nametoken\xspace}
344: \newcommand{\efratwwbbeth}{\nametoken\xspace}
345: \newcommand{\sunday}{\datetoken\xspace}
346: \newcommand{\juneeighteen}{\datetoken~\numtoken \xspace}
347: \newcommand{\fifteen}{\numtoken1\xspace}
348: \newcommand{\tuesday}{\datetoken\xspace}
349: \newcommand{\seven}{{\numtoken1}\xspace}
350: \newcommand{\eleven}{{\numtoken1}\xspace}
351: \newcommand{\fifty}{\numtoken2\xspace}
352: \newcommand{\eighteen}{\numtoken1\xspace}
353: \newcommand{\fortyeight}{\numtoken2\xspace}
354: \newcommand{\locone}{{in \art{a} crowded hall in \southta}\xspace}
355: \newcommand{\loconeshort}{in \art{a} crowded hall$\ldots$\xspace}
356: \newcommand{\loctwo}{into \art{a} crowded \jerusalem cafe [sic] \ipmasharon's residence\xspace}
357: \newcommand{\loctwoshort}{into \art{a} crowded $\ldots$ residence\xspace}
358: \newcommand{\locthree}{{in \art{the} \jewish
359:  settlement of \efratwwbbeth}\xspace}
360: \newcommand{\locthreeshort}{in \art{the} \jewish
361:  settlement $\ldots$ \xspace}
362: \newcommand{\locfour}{{aboard \art{a} crowded bus in \jerusalem}\xspace}
363: \newcommand{\locfourshort}{{aboard $\ldots$ \jerusalem}\xspace}
364: \newcommand{\synone}{injuring\xspace}
365: \newcommand{\syntwo}{wounding\xspace}
366: 
367: 
368: 
369: 
370: \subsection{Inducing \patternsinformal}
371: 
372: \newcommand{\simfn}{\textrm{sim}} \newcommand{\alphabet}{\Sigma}
373: \newcommand{\underscore}{\underline{~}} In order to learn \patternsinformal, we
374: first compute a {\em multiple-sequence alignment} (MSA) of the sentences in a
375: given \family.  Pairwise MSA takes two sentences and a scoring function giving
376: the similarity between words; it determines the highest-scoring way to perform
377: insertions, deletions, and changes to transform one of the sentences into the
378: other.  Pairwise MSA can be extended efficiently to multiple sequences via
379: iterative pairwise alignment, a polynomial-time method commonly used in
380: computational biology \cite{Durbin+Eddy+al:98a}.\footnote{Scoring function:
381:   aligning two identical words scores 1; inserting a word scores -0.01, and
382:   aligning two different words scores -0.5 (parameter values taken from
383:   \newcite{Barzilay&Lee:2002a}).}  \omt{ $$\simfn(x,y) = \left\{ \begin{array}{ll} 1 & \mbox{if } x = y,\ x \in
384:   \alphabet; \\ -0.01 & \mbox{if exactly one of } x,y \mbox{ is } \underscore; \\ -0.5 &
385:   \mbox{otherwise (mismatch)} \end{array} \right.$$
386:   }
387: The results can be represented in an intuitive form via a word {\em \lattice}
388: (see Figure \ref{fig:lattice}), which compactly represents (n-gram) structural
389: similarities between the \family's sentences.
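
The pairwise step can be written as a standard dynamic program.  The following
is a sketch under assumptions rather than a transcription of the
implementation: it supposes global (end-to-end) alignment, and that deleting a
word is scored like inserting one.  Let $x_1 \ldots x_m$ and $y_1 \ldots y_n$
be the two sentences, let $\underscore$ denote a gap, and let
\[
  \simfn(x,y) = \left\{ \begin{array}{ll}
      1     & \mbox{if $x = y$ and neither is a gap,} \\
      -0.01 & \mbox{if exactly one of $x$, $y$ is a gap,} \\
      -0.5  & \mbox{otherwise (mismatch).}
    \end{array} \right.
\]
With $A(0,0) = 0$, $A(i,0) = -0.01\,i$, and $A(0,j) = -0.01\,j$, the best score
$A(m,n)$ follows from
\[
  A(i,j) = \max \left\{ \begin{array}{l}
      A(i-1,j-1) + \simfn(x_i, y_j), \\
      A(i-1,j) + \simfn(x_i, \underscore), \\
      A(i,j-1) + \simfn(\underscore, y_j).
    \end{array} \right.
\]
For example, for ``bomber injured seven'' and ``bomber wounded seven'',
aligning ``injured'' with ``wounded'' scores $1 - 0.5 + 1 = 1.5$, whereas
placing each of them opposite a gap scores $1 - 0.01 - 0.01 + 1 = 1.98$, so
under these parameter values the recurrence prefers the latter.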
390: 
391: To transform \lattices into generation-suitable \patternsinformal requires some
392: understanding of the possible varieties of \lattice structures.  The most
393: important part of the transformation is to determine which words are actually
394: instances of arguments, and so should be replaced by {\em slots} (representing
395: variables).  The key intuition is that because the sentences in the \family
396: represent the same {\em type} of event, such as a bombing, but generally refer
397: to different {\em instances} of said event (e.g., a bombing in Jerusalem versus
398: in Gaza), areas of large variability in the \lattice should correspond to
399: arguments.
400: 
401: To quantify this notion of variability, we first formalize its opposite:
402: commonality.  We define {\em \backbone} nodes as those shared by more than 50\%
403: of the \family's sentences.  The choice of 50\% is not arbitrary --- it can be
404: proved using the pigeonhole principle that our strict-majority criterion
405: imposes a unique linear ordering of the backbone nodes that respects the word
406: ordering within the sentences, thus guaranteeing at least a degree of
407: well-formedness and avoiding the problem of how to order backbone nodes
408: occurring on parallel ``branches'' of the lattice.
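
A sketch of how such an argument might go, under the assumptions that the
\lattice is acyclic and that each sentence of the \family corresponds to one
path through it (the symbols $m$ and $S(\cdot)$ are introduced here for
illustration): let the \family contain $m$ sentences and let $S(v)$ be the set
of sentences whose paths pass through node $v$, so that $v$ is a \backbone node
exactly when $|S(v)| > m/2$.  For any two \backbone nodes $u$ and $v$,
\[
  |S(u) \cap S(v)| = |S(u)| + |S(v)| - |S(u) \cup S(v)|
    > \frac{m}{2} + \frac{m}{2} - m = 0,
\]
so at least one sentence passes through both and fixes their relative order.
Acyclicity ensures that every other sentence containing both visits them in
the same order, so the pairwise orders are consistent and together give a
single linear ordering of the \backbone nodes.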
409: 
410: 
411: Once we have identified the \backbone nodes as points of strong commonality,
412: the next step is to identify the regions of variability (or, in \lattice terms,
413: many parallel disjoint paths) between them as (probably) corresponding to the
414: arguments of the propositions that the sentences represent.  For example, at
415: the top of Figure \ref{fig:lattice}, the phrases ``southern city'', ``settlement of
416: NAME'', ``coastal resort of NAME'', etc.\ all correspond to the location of an
417: event and could be replaced by a single {\slot}.
418: Figure \ref{fig:lattice} shows an example of a \lattice and the derived
419: \slotlat; we give the details of the slot-induction process in the Appendix.
420: 
421: 
422: \subsection{Matching \famlats}
423: 
424: Now, if we were using a parallel corpus, we could employ
425: sentence-alignment information to determine which lattices correspond
426: to paraphrases.  Since we do not have this information, we essentially
427: approximate the parallel-corpus situation by correlating information
428: from descriptions of (what we hope are) the same event occurring in
429: the two different corpora.
430: 
431: Our method works as follows.  Once \lattices for each corpus in our
432: comparable-corpus pair are computed, we identify \lattice paraphrase pairs,
433: using the idea that paraphrases will tend to take the same values as arguments
434: \cite{Shinyama&al:2002a,Lin&Pantel:2001a}. More specifically, we take a pair of
435: \lattices from different corpora, look back at the sentence clusters from which
436: the two lattices were derived, and compare the slot values of those
437: cross-corpus sentence pairs that appear in articles written on the {\em same
438:   day} on the same topic; we pair the \lattices if the degree of matching is
439: over a threshold tuned on held-out data.  For example, suppose we have two
440: (linearized) lattices \template{{slot1} bombed slot2} and \template{slot3 was
441:   bombed by slot4} drawn from different corpora.  If in the first lattice's
442: sentence cluster we have the sentence ``the plane bombed the town'', and in the
443: second lattice's sentence cluster we have a sentence written on the same day
444: reading ``the town was bombed by the plane'', then the corresponding lattices
445: may well be paraphrases, where \template{slot1} is identified with
446: \template{slot4} and \template{slot2} with \template{slot3}.
447: 
448: 
449: To compare the set of argument values of two lattices, we simply count their
450: word overlap, giving double weight to proper names and numbers and discarding
451: auxiliaries (we purposely ignore order because paraphrases can consist of word
452: re-orderings).
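
One way to write this comparison down is given below; the weight function
$\omega$ and the names $V_1$, $V_2$ are notation assumed here for
illustration, and any normalization applied before thresholding is left
unspecified, as in the description above.  If $V_1$ and $V_2$ are the sets of
words occurring in the slot values of the two sentences being compared, the
overlap is
\[
  \mathrm{overlap}(V_1,V_2) = \sum_{w \in V_1 \cap V_2} \omega(w),
  \qquad
  \omega(w) = \left\{ \begin{array}{ll}
      2 & \mbox{if $w$ is a proper name or a number,} \\
      0 & \mbox{if $w$ is an auxiliary,} \\
      1 & \mbox{otherwise.}
    \end{array} \right.
\]
As a hypothetical example, if the shared words are ``Jerusalem'', ``bus'', and
``was'', the overlap is $2 + 1 + 0 = 3$.  Because the sum ranges over a set
intersection, it is unaffected by word order.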
453: 
454: \subsection{Generating paraphrase sentences}
455: 
456: Given a sentence to paraphrase, we first need to identify to which, if any, of our
457: previously computed sentence \families the new sentence most strongly
458: belongs. We do this by finding the best alignment of the sentence to the existing
459: \famlats.\footnote{ To facilitate this process, we add ``insert'' nodes between
460:   \backbone nodes; these nodes can match any word sequence and thus account for
461:   new words in the input sentence.  Then, we perform multiple-sequence
462:   alignment where insertions score \mbox{-0.1} and all other node alignments
463:   receive a score of unity.}  If a matching \famlat is found, we choose one of
464: its comparable-corpus paraphrase \lattices to rewrite the sentence,
465: substituting in the argument values of the original sentence.  This yields as
466: many paraphrases as there are lattice paths.
467: 
468: 
469: 
470: \section{Evaluation}
471: \label{sec:eval}
472: 
473: 
474: All evaluations involved judgments by native speakers of
475: English who were not familiar with the paraphrasing systems
476: under consideration.
477: 
478: \begin{figure*}
479: \epsfscaledbox{templateeval4.eps}{6.4in}
480: \caption{\label{msa-dirt-accuracy} Correctness and agreement results.
481: Columns = instances; each grey box represents a judgment of ``valid''
482: for the instance.  For each method, a good, middling, and poor
483: instance is shown.  (Results separated by algorithm for clarity; the
484: blind evaluation presented instances from the two algorithms in random
485: order.)
486: } 
487: \end{figure*}
488: 
489: We implemented our system on a pair of comparable corpora consisting of
490: articles produced between September 2000 and August 2002 by the Agence
491: France-Presse (AFP) and Reuters news agencies.  Given our interest in
492: domain-dependent paraphrasing, we limited attention to 9MB of articles,
493: collected using a TDT-style document clustering system, concerning individual
494: acts of violence in Israel and army raids on the
495: Palestinian territories.  From this data (after removing 120 articles as a
496: held-out parameter-training set), we extracted 43 \slotlats from the AFP corpus
497: and 32 \slotlats from the Reuters corpus, and found 25 cross-corpus matching
498: pairs; since \lattices contain multiple paths, these yielded 6,534 template
499: pairs.\footnote{The extracted paraphrases are available at \texttt{http://www.cs.cornell.edu/Info/Projects/\\NLP/statpar.html}}
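
To see how 25 \lattice pairs can give rise to thousands of template pairs,
suppose (as an assumption about how the count arises, not a statement of the
exact procedure) that the $k$-th matched pair consists of \lattices with $p_k$
and $q_k$ distinct paths and that every path of one can be paired with every
path of the other.  The number of template pairs is then
\[
  \sum_{k=1}^{25} p_k \, q_k ,
\]
and a total of 6,534 corresponds to about $6534 / 25 \approx 261$ template
pairs per \lattice pair on average.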
500: 
501: 
502: \subsection{Template Quality Evaluation}
503: 
504: Before evaluating the quality of the rewritings produced by our templates and
505: \lattices, we first tested the quality of a random sample of just the template
506: pairs.  In our instructions to the judges, we defined two {text units} (such as
507: sentences or snippets) to be paraphrases if one of them can generally be
508: substituted for the other without great loss of information (but not
509: necessarily vice versa).\footnote{We switched to this ``one-sided''
510:   definition because in initial tests judges found it excruciating to decide on
511:   equivalence.
512: %LL-post
513:   Also, in applications such as summarization some information loss is
514:   acceptable.}  Given a pair of {\em templates} produced by a system, the
515: judges marked them as paraphrases if for many instantiations of the templates'
516: variables, the resulting text units were paraphrases.  (Several labelled
517: examples were provided to supply further guidance.)
518: 
519: To put the evaluation results into context, we wanted to compare against
520: another system, but we are not aware of any previous work creating templates
521: precisely for the task of generating paraphrases.  Instead, we made a
522: good-faith effort to adapt the DIRT system \cite{Lin&Pantel:2001a} to the
523: problem, selecting the 6,534 highest-scoring templates it produced when run on
524: our datasets. (The system of \newcite{Shinyama&al:2002a} was unsuitable for
525: evaluation purposes because their paraphrase extraction component is too
526: tightly coupled to the underlying information extraction system.)  Several
527: caveats apply to this comparison, the most
528: prominent being that DIRT was not designed with sentence-paraphrase generation
529: in mind --- its templates are much shorter than ours, which may have affected
530: the evaluators' judgments --- and was originally implemented on much larger
531: data sets.\footnote{To cope with the corpus-size issue, DIRT was trained on an
532:   84MB corpus of Middle-East news articles, a strict superset of the 9MB we
533:   used.  Other issues include the fact that DIRT's output needed to be
534:   converted into English: it produces paths like ``N:of:N
535:   $\langle$tide$\rangle$ N:nn:N'', which we transformed into ``Y tide of X'' so
536:   that its output format would be the same as ours.  } The point of this
537: evaluation is simply to determine whether another corpus-based
538: paraphrase-focused approach could easily achieve the same performance level.
539: 
540: 
541: In brief, the DIRT system works as follows. Dependency trees are
542: constructed from parsing a large corpus.  Leaf-to-leaf paths are
543: extracted from these dependency trees, with the leaves serving as
544: slots.  Then, pairs of paths in which the slots tend to be filled by
545: similar values, where the similarity measure is based on the mutual
546: information between the value and the slot, are deemed to be
547: paraphrases.
548: 
549: 
550: We randomly extracted 500 pairs from the two algorithms' output sets.  Of
551: these, 100 paraphrases (50 per system) made up a ``common'' set evaluated by
552: all four judges, allowing us to compute agreement rates; in addition, each
553: judge evaluated an ``individual'' set, seen only by him- or herself,
554: consisting of another 100 pairs (50 per system). The ``individual'' sets
555: allowed us to broaden our sample's coverage of the corpus.\footnote{Each judge
556:   took several hours at the task, making it infeasible to expand the sample
557:   size further.}
558: The pairs were presented in random order, and the judges were
559: not told which system produced a given pair.
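
The sample sizes fit together as follows.  With four judges,
\[
  \underbrace{100}_{\mbox{\scriptsize common}}
  + \underbrace{4 \times 100}_{\mbox{\scriptsize individual}} = 500
\]
pairs were drawn in total, each judge assessed $100 + 100 = 200$ pairs, and
half of every set came from each system, so each judge saw $50 + 50 = 100$
pairs per system.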
560: 
561: As Figure~\ref{msa-dirt-accuracy} shows, our system outperforms the DIRT
562: system, with a consistent performance gap for all the judges of about 38\%,
563: although the absolute scores vary (for example, Judge 4 seems lenient).  The
564: judges' assessments of correctness were fairly constant between the full
565: 100-instance set and the 50-instance common set alone.
566: 
567:  In terms of agreement, the Kappa value (measuring pairwise agreement
568: discounting chance occurrences\footnote{One issue is that the Kappa
569: statistic does not account for varying difficulty among instances.  For
570: this reason, we actually asked judges to indicate for each instance
571: whether making the validity decision was difficult.  However, the
572: judges generally did not agree on difficulty.  Post hoc analysis
573: indicates that perception of difficulty depends on each judge's
574: individual ``threshold of similarity'', not just the instance itself.
575: }) on the common set was 0.54, which corresponds to moderate
576: agreement~\cite{Landis&Koch:1977a}.  Multiway agreement is depicted in
577: Figure~\ref{msa-dirt-accuracy} --- there, we see that in 86 of 100
578: cases, at least three of the judges gave the same correctness
579: assessment, and in 60 cases all four judges concurred.
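
For reference, the Kappa statistic has the general form
\[
  \kappa = \frac{P_o - P_e}{1 - P_e},
\]
where $P_o$ is the observed proportion of agreement and $P_e$ the proportion
expected by chance; values between 0.41 and 0.60 fall in the ``moderate'' band
of \newcite{Landis&Koch:1977a}.  How $P_e$ was estimated and how the pairwise
values were combined across judges is not detailed here.  Purely as an
illustration, if chance agreement on a binary valid/invalid decision were
$P_e = 0.5$, then $\kappa = 0.54$ would correspond to
$P_o = 0.54 \times 0.5 + 0.5 = 0.77$.  The multiway figures also decompose
simply: with four judges and a binary decision, the $100 - 86 = 14$ remaining
instances are two-against-two splits, and $86 - 60 = 26$ instances are
three-against-one.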
580: 
581: 
582: \subsection{Evaluation of the generated paraphrases}
583: 
584: Finally, we evaluated the quality of the paraphrase sentences generated by our
585: system, thus (indirectly) testing all the system components: pattern selection,
586: paraphrase acquisition, and generation.  We are not aware of another system
587: generating sentence-level paraphrases.  Therefore, we used as a baseline a
simple paraphrasing system that replaces words with randomly chosen WordNet
synonyms (using the most frequent sense of the word for which WordNet lists
synonyms). The number of substitutions was set
proportional to the number of words our method replaced in the same sentence.
The point of this comparison is to check whether simple synonym substitution
yields results comparable to those of our algorithm.\footnote{We chose not
  to employ a language model to re-rank either system's output because such an
  addition would make it hard to isolate the contribution of the paraphrasing
  component itself.}
597: 
598: 
599: 
600: \begin{figure*}[htpb]\footnotesize
601: \hspace*{-.2in}
602:    \begin{tabular}{|l|l|}    \hline
603:       Original (1) & {\em The caller identified the bomber as Yussef Attala, 20, from the
604:       Balata refugee camp near Nablus.}                  \\\hline
605:       MSA  &  The caller named the bomber as 20-year old Yussef Attala from the
606:       Balata refugee camp near Nablus.                    \\\hline
607:       Baseline & The company placed the bomber as Yussef Attala, 20, from the
608:       Balata refugee camp near Nablus.    \\\hline \hline
609:       Original (2) & {\em A spokesman for the group claimed responsibility for the attack
610:       in a phone call to AFP in this northern West Bank town}. \\\hline
611:       MSA  & The attack in a phone call to AFP in this northern West Bank town
612:       was claimed by a spokesman of the group.                     \\\hline
613:       Baseline & \parbox[t]{6in}{A spokesman for the grouping laid claim
614:       responsibility for the onslaught in a phone call to AFP  
615:       in this northern West Bank town. } \\\hline
616:     \end{tabular}
617:     \caption{Example sentences and generated paraphrases. Both judges felt 
618:     MSA preserved the meaning of (1) but not (2), and that neither
619:     baseline paraphrase was meaning-preserving.}
620:     \label{fig:WordNet}
621: \end{figure*}
622: 
623: 
624: 
625: \newcommand{\results}[2]{#2\xspace} For this experiment, we randomly selected
626: 20 AFP articles about violence in the Middle East published later than the
627: articles in our training corpus.  Out of 484 sentences in this set, our system
628: was able to paraphrase 59 (12.2\%).  (We chose parameters that optimized
629: precision rather than recall on our small held-out set.)  We found that after
630: proper name substitution, only seven sentences in the test set appeared in the
631: training set,\footnote{Since we are doing unsupervised paraphrase acquisition,
632:   train-test overlap is allowed.}  which implies that \lattices boost the
633: generalization power of our method significantly: from seven to 59 sentences.
634: Interestingly, the coverage of the system varied significantly with article
635: length.  For the eight articles of ten or fewer sentences, we paraphrased
636: 60.8\% of the sentences per article on average, but for longer articles only
637: 9.3\% of the sentences per article on average were paraphrased.  Our analysis
638: revealed that long articles tend to include large portions that are unique to
639: the article, such as personal stories of the event participants, which explains
640: why our algorithm had a lower paraphrasing rate for such articles.
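
For concreteness, the overall coverage figure follows directly from the counts
given above, and the gain over literal lookup in the training set can be
summarized by a simple ratio (this ratio is only a restatement of the two
counts, not an additional measurement):
\[
  \frac{59}{484} \approx 0.122 = 12.2\%, \qquad \frac{59}{7} \approx 8.4 .
\]
Note also that the per-article figures (60.8\% and 9.3\%) are means of
per-article rates, whereas 12.2\% is pooled over all 484 sentences; since long
articles contribute most of the sentences, the pooled rate lies much closer to
the long-article average.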
641: 
642: 
643: 
644: All 118 instances (59 per system) were presented in random order to two judges,
645: who were asked to indicate whether the meaning had been preserved.  Of the
646: paraphrases generated by our system, the two evaluators deemed
647: \results{59-11}{81.4\%} and \results{59-13}{78\%}, respectively, to be valid,
648: whereas for the baseline system, the correctness results were
649: \results{59-18}{69.5\%} and \results{59-20}{66.1\%}, respectively. Agreement
650: according to the Kappa statistic was 0.6.  Note that judging full sentences is
651: inherently easier than judging templates, because template comparison requires
652: considering a variety of possible slot values, while sentences are
653: self-contained units.
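
Assuming each percentage is taken over the 59 paraphrases per system (the
counts below are inferred from the reported percentages rather than stated
separately in the text), the results correspond to:
\[
  \begin{array}{lcc}
             & \mbox{Judge 1}        & \mbox{Judge 2}        \\
    \mbox{MSA}      & 48/59 \approx 81.4\%  & 46/59 \approx 78.0\%  \\
    \mbox{Baseline} & 41/59 \approx 69.5\%  & 39/59 \approx 66.1\%
  \end{array}
\]
that is, a difference of seven paraphrases (about 12 percentage points) for
each judge.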
654: 
Figure~\ref{fig:WordNet} shows two example sentences, one where our MSA-based
656: paraphrase was deemed correct by both judges, and one where both judges deemed
657: the MSA-generated paraphrase incorrect.  Examination of the results indicates
658: that the two systems make essentially orthogonal types of errors. The baseline
659: system's relatively poor performance supports our claim that whole-sentence
660: paraphrasing is a hard task even when accurate word-level paraphrases are
661: given.
662: 
663: 
664: \section{Conclusions}
665: 
We presented an approach for generating sentence-level
paraphrases, a task not addressed previously. Our method learns
668: structurally similar patterns of expression from data and identifies
669: paraphrasing pairs among them using a comparable corpus. A flexible 
670: pattern-matching procedure allows us to paraphrase an unseen sentence by
671: matching it to one of the induced patterns. Our approach
672: generates both lexical and structural paraphrases.
673: 
674: Another contribution is the induction of MSA lattices from non-parallel data.
675: Lattices have proven advantageous in a number of NLP contexts
676: \cite{Mangu&Brill&Stolcke:00a,Bangalore&Murdock&Riccardi:2002a,Barzilay&Lee:2002a,Pang+Knight+Marcu:03a},
677: but were usually produced from \mbox{(multi-)p}arallel data, which may not be
678: readily available for many applications.  We showed that word lattices can be
679: induced from a type of corpus that can be easily obtained for many domains,
680: broadening the applicability of this useful representation.
681: 
682: \vspace*{-.1in}
683: 
684: 
685: \section*{Acknowledgments}
686: 
687: {\footnotesize{
688:     We are grateful to many people for helping us in this work.  We thank
689:     Stuart Allen, Itai Balaban, Hubie Chen, Tom Heyerman, Evelyn Kleinberg,
690:     Carl Sable, and Alex Zubatov for acting as judges.  Eric Breck helped us
691:     with translating the output of the DIRT system.  We had numerous very
692:     useful conversations with all those mentioned above and with Eli Barzilay,
693:     Noemie Elhadad, Jon Kleinberg (who made the ``pigeonhole'' observation),
694:     Mirella Lapata, Smaranda Muresan and Bo Pang.  We are very grateful to
695:     Dekang Lin for providing us with DIRT's output.  We thank the Cornell NLP
696:     group, especially Eric Breck, Claire Cardie, Amanda Holland-Minkley, and Bo
697:     Pang, for helpful comments on previous drafts.  This paper is based upon
698:     work supported in part by the National Science Foundation under ITR/IM
699:     grant IIS-0081334 and a Sloan Research Fellowship.  Any opinions, findings,
700:     and conclusions or recommendations expressed above are those of the authors
701:     and do not necessarily reflect the views of the National Science Foundation
702:     or the Sloan Foundation.
703: 
704: \vspace*{-.2in}
705: 	
706: 
707: 
708: 
709: \bibliographystyle{llee-fullname}
710: 
711: \begin{thebibliography}{}
712: 
713: \bibitem[\protect\citename{Bangalore, Murdock, and
714:   Riccardi}2002]{Bangalore&Murdock&Riccardi:2002a}
715: Bangalore, Srinivas, Vanessa Murdock, and Giuseppe Riccardi.
716: \newblock 2002.
717: \newblock Bootstrapping bilingual data using consensus translation for a
718:   multilingual instant messaging system.
719: \newblock In {\em \proc of COLING}.
720: 
721: \bibsnip
722: 
723: \bibitem[\protect\citename{Barzilay and Lee}2002]{Barzilay&Lee:2002a}
724: Barzilay, Regina and Lillian Lee.
725: \newblock 2002.
726: \newblock Bootstrapping lexical choice via multiple-sequence alignment.
727: \newblock In {\em \proc of EMNLP}, pages 164--171.
728: 
729: \bibsnip
730: 
731: \bibitem[\protect\citename{Barzilay and McKeown}2001]{Barzilay&McKeown:01a}
732: Barzilay, Regina and Kathleen McKeown.
733: \newblock 2001.
734: \newblock Extracting paraphrases from a parallel corpus.
735: \newblock In {\em \proc of the ACL/EACL}, pages 50--57.
736: 
737: \bibsnip
738: 
739: \bibitem[\protect\citename{Chandrasekar and
740:   Bangalore}1997]{Chandrasekar+Srinivas:97a}
741: Chandrasekar, Raman and Srinivas Bangalore.
742: \newblock 1997.
743: \newblock Automatic induction of rules for text simplification.
744: \newblock {\em Knowledge-Based Systems}, 10(3):183--190.
745: 
746: \bibsnip
747: 
748: \bibitem[\protect\citename{Dras}1999]{Dras:1999a}
749: Dras, Mark.
750: \newblock 1999.
751: \newblock {\em Tree Adjoining Grammar and the Reluctant Paraphrasing of Text}.
752: \newblock {Ph.D.} thesis, Macquarie University.
753: 
754: \bibsnip
755: 
756: \bibitem[\protect\citename{Durbin \bgroup et al.\egroup
757:   }1998]{Durbin+Eddy+al:98a}
758: Durbin, Richard, Sean Eddy, Anders Krogh, and Graeme Mitchison.
759: \newblock 1998.
760: \newblock {\em Biological Sequence Analysis}.
761: \newblock Cambridge University Press, Cambridge, UK.
762: 
763: \bibsnip
764: 
765: \bibitem[\protect\citename{Grefenstette}1994]{Grefenstette:94a}
766: Grefenstette, Gregory.
767: \newblock 1994.
768: \newblock {\em Explorations in Automatic Thesaurus Discovery}, volume 278.
769: \newblock Kluwer.
770: 
771: \bibsnip
772: 
773: \bibitem[\protect\citename{Iordanskaja, Kittredge, and
774:   Polguere}1991]{Iordanskaja&Kittredge&Polguere:1991a}
775: Iordanskaja, L., R.~Kittredge, and A.~Polguere.
776: \newblock 1991.
777: \newblock Lexical selection and paraphrase in a meaning-text generation model.
778: \newblock In C.~Paris, W.~Swartout, and W.~Mann, editors, {\em Natural Language
779:   Generation in Artificial Intelligence and Computational Linguistics}. Kluwer,
780:   chapter~11.
781: 
782: \bibsnip
783: 
784: \bibitem[\protect\citename{Jacquemin}1999]{Jacquemin:l999a}
785: Jacquemin, Christian.
786: \newblock 1999.
787: \newblock Syntagmatic and paradigmatic representations of term variations.
788: \newblock In {\em \proc of the ACL}, pages 341--349.
789: 
790: \bibsnip
791: 
792: \bibitem[\protect\citename{Knight and Marcu}2000]{Knight&Marcu:2000a}
793: Knight, Kevin and Daniel Marcu.
794: \newblock 2000.
795: \newblock Statistics-based summarization --- {Step} one: Sentence compression.
796: \newblock In {\em \proc of AAAI}.
797: 
798: \bibsnip
799: 
800: \bibitem[\protect\citename{Landis and Koch}1977]{Landis&Koch:1977a}
801: Landis, J.~Richard and Gary~G. Koch.
802: \newblock 1977.
803: \newblock The measurement of observer agreement for categorical data.
804: \newblock {\em Biometrics}, 33:159--174.
805: 
806: \bibsnip
807: 
808: \bibitem[\protect\citename{Lin}1998]{Lin:1998a}
809: Lin, Dekang.
810: \newblock 1998.
811: \newblock {Automatic retrieval and clustering of similar words}.
812: \newblock In {\em \proc of ACL/COLING}, pages 768--774.
813: 
814: \bibsnip
815: 
816: \bibitem[\protect\citename{Lin and Pantel}2001]{Lin&Pantel:2001a}
817: Lin, Dekang and Patrick Pantel.
818: \newblock 2001.
819: \newblock Discovery of inference rules for question-answering.
820: \newblock {\em Natural Language Engineering}, 7(4):343--360.
821: 
822: \bibsnip
823: 
824: \bibitem[\protect\citename{Mangu, Brill, and
825:   Stolcke}2000]{Mangu&Brill&Stolcke:00a}
826: Mangu, Lidia, Eric Brill, and Andreas Stolcke.
827: \newblock 2000.
828: \newblock Finding consensus in speech recognition: Word error minimization and
829:   other applications of confusion networks.
\newblock {\em Computer Speech and Language}, 14(4):373--400.
831: 
832: \bibsnip
833: 
834: \bibitem[\protect\citename{McKeown}1979]{McKeown:79a}
835: McKeown, Kathleen~R.
836: \newblock 1979.
837: \newblock Paraphrasing using given and new information in a question-answer
838:   system.
839: \newblock In {\em \proc of the ACL}, pages 67--72.
840: 
841: \bibsnip
842: 
843: \bibitem[\protect\citename{McKeown, Kukich, and
844:   Shaw}1994]{McKeown&Kukich&Shaw:1994a}
845: McKeown, Kathleen~R., Karen Kukich, and James Shaw.
846: \newblock 1994.
847: \newblock Practical issues in automatic documentation generation.
848: \newblock In {\em \proc of ANLP}, pages 7--14.
849: 
850: \bibsnip
851: 
852: \bibitem[\protect\citename{Meteer and Shaked}1988]{Meteer+Shaked:88a}
853: Meteer, Marie and Varda Shaked.
854: \newblock 1988.
855: \newblock Strategies for effective paraphrasing.
856: \newblock In {\em \proc of COLING}, pages 431--436.
857: 
858: \bibsnip
859: 
\bibitem[\protect\citename{Och, Tillmann, and Ney}1999]{Och&Tillman&Ney:1999a}
Och, Franz~Josef, Christoph Tillmann, and Hermann Ney.
862: \newblock 1999.
863: \newblock Improved alignment models for statistical machine translation.
864: \newblock In {\em \proc of EMNLP}, pages 20--28.
865: 
866: \bibsnip
867: 
868: \bibitem[\protect\citename{Pang, Knight, and Marcu}2003]{Pang+Knight+Marcu:03a}
869: Pang, Bo, Kevin Knight, and Daniel Marcu.
870: \newblock 2003.
871: \newblock Syntax-based alignment of multiple translations: Extracting
872:   paraphrases and generating new sentences.
\newblock In {\em \proc of HLT/NAACL}.
874: 
875: \bibsnip
876: 
877: \bibitem[\protect\citename{Papineni \bgroup et al.\egroup
878:   }2002]{Papineni&al:2002a}
879: Papineni, Kishore~A., Salim Roukos, Todd Ward, and Wei-Jing Zhu.
880: \newblock 2002.
881: \newblock Bleu: A method for automatic evaluation of machine translation.
882: \newblock In {\em \proc of the ACL}, pages 311--318.
883: 
884: \bibsnip
885: 
886: \bibitem[\protect\citename{Pereira, Tishby, and
887:   Lee}1993]{Pereira&Tishby&Lee:1993a}
888: Pereira, Fernando, Naftali Tishby, and Lillian Lee.
889: \newblock 1993.
890: \newblock Distributional clustering of {English} words.
891: \newblock In {\em \proc of the ACL}, pages 183--190.
892: 
893: \bibsnip
894: 
895: \bibitem[\protect\citename{Robin}1994]{Robin-phd}
896: Robin, Jacques.
897: \newblock 1994.
898: \newblock {\em Revision-Based Generation of Natural Language Summaries
899:   Providing Historical Background: Corpus-Based Analysis, Design,
900:   Implementation, and Evaluation}.
901: \newblock {Ph.D.} thesis, Columbia University.
902: 
903: \bibsnip
904: 
905: \bibitem[\protect\citename{Shinyama \bgroup et al.\egroup
906:   }2002]{Shinyama&al:2002a}
907: Shinyama, Yusuke, Satoshi Sekine, Kiyoshi Sudo, and Ralph Grishman.
908: \newblock 2002.
909: \newblock Automatic paraphrase acquisition from news articles.
910: \newblock In {\em \proc of HLT}, pages 40--46.
911: 
912: \bibsnip
913: 
914: \bibitem[\protect\citename{Smadja and McKeown}1991]{Smadja&McKeown:1991a}
915: Smadja, Frank and Kathleen McKeown.
916: \newblock 1991.
917: \newblock Using collocations for language generation.
918: \newblock {\em Computational Intelligence}, 7(4).
919: \newblock Special issue on natural language generation.
920: 
921: \bibsnip
922: 
923: \bibitem[\protect\citename{Vogel and Ney}2000]{Vogel&Ney:2000a}
924: Vogel, Stephan and Hermann Ney.
925: \newblock 2000.
926: \newblock Construction of a hierarchical translation memory.
927: \newblock In {\em \proc of COLING}, pages 1131--1135.
928: 
929: \bibsnip
930: 
931: \bibitem[\protect\citename{Wang}1998]{Wang:1998a}
932: Wang, Ye-Yi.
933: \newblock 1998.
934: \newblock {\em Grammar Inference and Statistical Machine Translation}.
935: \newblock {Ph.D.} thesis, CMU.
936: 
937: \end{thebibliography}
938: 
939: }
940: }
941: 
942: \section*{Appendix}
943: 
944: In this appendix, we describe how we  insert slots into  \lattices to
945: form \slotlats.
946: 
947: Recall that the backbone nodes in our \lattices represent words appearing in
948: many of the sentences from which the lattice was built.  As mentioned above,
949: the intuition is that areas of high variability between backbone nodes may
950: correspond to arguments, or slots.  But the key thing to note is that there are
951: actually two different phenomena giving rise to multiple parallel paths: {\em
952:   argument variability}, described above, and {\em synonym variability}.  For
953: example, Figure \ref{fig:variability}(b) contains parallel paths corresponding
954: to the synonyms ``injured'' and ``wounded''.  Note that we want to {\em remove}
955: argument variability so that we can generate paraphrases of sentences with
956: arbitrary arguments; but we want to {\em preserve} synonym variability in order
957: to generate a variety of sentence rewritings.
958: 
To distinguish these two situations, we analyze the {\em split
level} of \backbone nodes that begin regions with multiple paths. The
basic intuition is that there is probably more variability associated
with arguments than with
synonymy: for example, as datasets grow, the number of locations
mentioned rises faster than the number of synonyms appearing. We use a
{\em synonymy threshold} $s$ (set to 30 by parameter tuning on held-out
data), as follows.

968:  
969: \begin{itemize}
970: \item If no more than $s$\% of all the edges out of a \backbone node
971:  lead to the same next node, we have high enough variability to
972: warrant inserting a {\slot} node. 
973: \item Otherwise, we incorporate reliable synonyms\footnote{While our original
974:     implementation, evaluated in Section~\ref{sec:eval}, identified only
975:     single-word synonyms, phrase-level synonyms can similarly be acquired by
976:     considering chains of nodes connecting backbone nodes.}  into the \backbone
977:   structure by preserving all nodes that are reached by at least $s$\% of the
978:   sentences passing through the two neighboring \backbone nodes.
979: \end{itemize} 
980: Furthermore, all \backbone nodes
981: labelled with our special generic tokens are
982: also replaced with \slot nodes, since they, too, probably represent arguments
983: (we condense adjacent \slots into one).  Nodes with in-degree lower than the
984: synonymy threshold are removed under the assumption that they probably
985: represent idiosyncrasies of individual sentences.  See Figure
986: \ref{fig:variability} for examples.
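
Stated more formally, and under the assumption that edges are counted with
multiplicity, one per sentence traversing them (the symbols $c$, $\rho$ and
$r$ are introduced here only for illustration): for a backbone node $b$, let
$c(b,v)$ be the number of edges from $b$ to node $v$, and let
\[
  \rho(b) = \frac{\max_{v} c(b,v)}{\sum_{v'} c(b,v')} .
\]
A slot node is inserted after $b$ when $\rho(b) \le s/100$.  Otherwise, a node
$v$ lying between neighboring backbone nodes $b$ and $b'$ is preserved as a
synonym when
\[
  r(v) = \frac{\#\{\mbox{sentences passing through } b,\ v \mbox{ and } b'\}}
              {\#\{\mbox{sentences passing through } b \mbox{ and } b'\}}
         \ge \frac{s}{100} .
\]
For instance, with $s = 30$ and seven sentences leaving $b$, the cutoff is
$0.3 \times 7 = 2.1$: a slot is inserted if no successor receives more than two
of the seven edges, and otherwise a node must be reached by at least three of
the seven sentences to be kept as a synonym.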
987: 
988: Figure \ref{fig:lattice} shows an example of a
989: \lattice and the  \slotlat derived via the process just described.
990: 
991: 
992: \begin{figure}[h]
993: \epsfscaledbox{variability.eps}{2.8in}
994: \caption{\label{fig:variability} Simple seven-sentence examples of two types of
995: variability.  The double-boxed nodes are \backbone nodes; edges show
996: consecutive words in some sentence. The synonymy threshold (set to 30\%
997: in this example)
998: determines the type of variability. }
999: \end{figure}
1000: 
1001: 
1002: 
1003: 
1004: 
1005: \end{document}
1006: