cs0111007/final.tex
1: \documentclass[11pt]{article}
2: 
3: %\usepackage{ijcai01}
4: %\usepackage{fullpage,palatino}
5: \usepackage{fullpage,url}
6: \setlength{\oddsidemargin}{-0.25in}
7: \setlength{\evensidemargin}{-0.25in}
8: \setlength{\topmargin}{0.5in}
9: \setlength{\headheight}{0pt}
10: \setlength{\headsep}{0pt}
11: \setlength{\footskip}{0.35in}
12: \setlength{\textheight}{8.75in}
13: \setlength{\textwidth}{7in}
14: \setlength{\itemindent}{-0.5cm}
15: \setlength{\marginparwidth}{0in}
16: \setlength{\marginparsep}{0in}
17: %\renewcommand{\baselinestretch}{1.62}   % Double-space
18: \hyphenation{inform-ation-seeking inform-ation}
19: \newenvironment{descit}[1]{\begin{quote} \textit{#1}}{\end{quote}}
20: 
21: \input{psfig-dvips}
22: 
23: \newif\ifpdf
24: \ifx\pdfoutput\undefined
25:   \pdffalse
26: \else
27:   \pdfoutput=1
28:   \pdftrue
29: \fi
30: 
31: \ifpdf
32:   \usepackage[pdftex]{graphicx}
33:   \usepackage[pdftex]{color}
34:   \DeclareGraphicsExtensions{.pdf,.png,.jpg}
35: \else
36:   \usepackage[dvips]{graphicx}
37:   \usepackage[dvips]{color}
38:   \DeclareGraphicsExtensions{.eps,.epsi,.ps}
39: \fi
40: 
41: \usepackage{times}
42: %\usepackage{fancyheadings}
43: 
44: \def\midv{\mathop{\,|\,}}
45: \newtheorem{defn}{Definition}
46: \long\def\cbk#1{{\color{red}[CBK: #1]}}
47: \newlength\colwidth \setlength\colwidth{3.25in}
48: 
49: \title{Explaining Scenarios for Information Personalization}
50: \author{Naren Ramakrishnan, Mary Beth Rosson, and John M. Carroll\\
51: Department of Computer Science\\
52: Virginia Tech, VA 24061\\
53: Email: \{naren,rosson,carroll\}@cs.vt.edu}
54: \begin{document}
55: 
56: \maketitle
57: \begin{abstract}
58: \noindent
59: Personalization customizes information access. The PIPE (`Personalization is
60: Partial Evaluation') modeling methodology represents interaction with
61: an information space as a program. The program is then specialized to a
62: user's known interests or information seeking activity by the technique
63: of partial evaluation. In this paper, we elaborate PIPE by considering
64: requirements analysis in the personalization lifecycle. We investigate the use 
65: of scenarios as a means of identifying and analyzing personalization 
66: requirements. As our first result, we show how designing a PIPE representation
67: can be cast as a search within a space of PIPE models, organized along 
68: a partial order. This allows us to view the design of a personalization 
69: system, itself, as specialized interpretation of an 
70: information space. We then exploit the underlying equivalence 
71: of explanation-based generalization (EBG) and partial evaluation to realize
72: high-level goals and needs identified in scenarios;
73: in particular, we specialize (personalize) an information space based 
74: on the explanation of a user scenario in that information space, just as 
75: EBG specializes a theory based on the explanation of an example in that theory.
76: In this approach, personalization becomes the transformation of information
77: spaces to support the explanation of usage scenarios. An
78: example application is described.\\
79: 
80: \noindent
81: {\bf Word counts:} Abstract (183 words), Main Text (10200), 
82: Bibliography (1092), Appendix (2375).\\
83: 
84: \noindent
85: {\bf Keywords:} Personalization, Partial Evaluation, Scenario-Based Design,
86: Explanation-Based Generalization.
87: \end{abstract}
88: %\pagestyle{empty}
89: 
90: \newpage
91: \tableofcontents
92: \newpage
93: \section{Introduction}
94: Personalization constitutes the mechanisms and technologies required to
95: customize information access to the end-user.  It can be defined as
96: the automatic adjustment of information content, structure, and presentation
97: tailored to an individual user. With the rapid increase in the amount of
98: information
99: being placed online, the scope of personalization today
100: extends to many different forms of information content and
101: delivery~\cite{cacm-broader,cacm-kantor,cacm-streams}, not just web pages.
102: It is
103: estimated that by the year 2003, personalization services
104: will constitute the major component of the Internet industry~\cite{appian}.
105: 
106: There are undoubtedly `personal views of personalization'~\cite{cacm-personal}.
107: This is evident both from the numerous ways in which the term is informally
108: interpreted as well as the various choices available for designing, building,
109: and targeting personalization systems~\cite{adaptive-sites,specissue,rus}.
110: A simple form of personalization is where a web portal such as {\tt
111: myCNN.\hskip0ex com}
112: allows a user to customize newsfeeds, colors, and layouts to create a
113: personal gateway
114: to the Internet~\cite{manber}. This example abstracts the personalization
115: problem to a point where
116: the burden of completing the personalization task is shifted to the user,
117: who must specify
118: the settings. Another form of personalization involves a web browser that
119: automatically
120: `hides' hyperlinks that will not lead to interesting pages. This example relies
121: on more
122: sophisticated user modeling; for example browsing history may be used to
123: predict pages of interest.
124: A third example is
125: the recommendation facility at {\tt amazon.com} that suggests books
126: according to
127: similarities in purchase behavior.
128: 
Notwithstanding this variety, a core body of personalization
algorithms and techniques has emerged.
131: For instance, the mining
132: of web user logs to identify browsing patterns has matured into a
133: well-abstracted data mining
134: problem~\cite{cacm-mulvenna}. Similarly, algorithms for determining
135: similarities between
136: buying patterns have
137: been studied and scaled to realistic dimensions.
138: However, the process of analyzing and specifying
139: requirements for personalization and
140: designing a system that achieves
141: the desired functional goals is still
142: an ill-understood and under-emphasized research issue. In fact, the {\it
143: lifecycle}
144: underlying design and deployment of personalization systems has not been
145: articulated well enough to enable the investigation of these issues.
146: 
147: It is difficult to capture specifications of requirements
148: independent of particular personalization algorithms or techniques. Beyond the
149: familiar cognitive gap between specifying and implementing requirements,
150: this is due to the dynamic nature of Internet technologies,
151: where a new development (e.g., cookies~\cite{cookies}) enables a form of
152: personalization that was
153: not possible before. Consequently, the lifecycle for personalization systems
154: tends to reflect the
155: solution-first strategy of inventing a specific technique and then implementing
156: it in a demonstration system. The hazards of such an approach are well
157: documented~\cite{jack-making-use}.
158: 
Our goal in this paper is to begin building a bridge between the high-level
design goals and
functional requirements for personalization on the one hand, and the
specific techniques and algorithms used
to realize these goals and requirements on the other. Our approach is
164: motivated by the recent development of a modeling methodology for
165: personalization
166: systems --- PIPE (`Personalization is Partial
167: Evaluation')~\cite{naren-ic,pipe-tois}.
168: Personalization systems are designed and implemented in PIPE by modeling an
169: information seeking
170: interaction in a programmatic representation.  PIPE helps realize
171: a variety of individual personalization algorithms and
172: enables the view of personalization as specializing representations. However,
173: PIPE currently supports only the interaction modeling required of a
174: personalization
175: system; it does not address earlier stages in the lifecycle of
176: personalization system
177: design (such as requirements analysis) or later stages (such as
178: verification and validation).
179: 
180: We elaborate how to integrate the use of PIPE with the early stages of the
181: personalization lifecycle, in particular
182: capturing requirements and translating them into the design of
183: software. In extending PIPE, we employ
184: scenario-based methods for analyzing and representing
185: usage tasks.
186: Scenarios and scenario-based methods are ideally suited for our purposes
187: because they
188: help identify personalization opportunities and organize design rationale
189: for a system in terms of its constituent facilities.
190: Two key contributions emerge from our approach. First, we relate PIPE to
191: the context
192: in which personalization scenarios are envisioned, abstracted, and realized
193: in an information
194: system, and thus contribute to a better understanding of the lifecycle
195: underlying personalization
196: system design. In particular, we relate personalization to the ability to
197: operationalize the {\it explanation} of a scenario 
198: of (intended) usage. Second, we provide new
199: techniques for
200: managing and reasoning with scenarios that not only aid in personalization
201: system design
202: but also find applications in other situations that involve transformation
203: of representations.
204: For instance, the construction of simplified views of systems for training
205: and demonstration purposes
206: can be expressed using these methods.
207: 
208: \subsection*{Reader's Guide}
209: The balance of the paper is organized as follows. In
210: Section~\ref{pipe-modeling}, we introduce
211: the PIPE modeling methodology for personalization. We describe its
capabilities and shortcomings, and
213: relate PIPE to other projects that represent and reason about
214: information seeking.
215: Section~\ref{newsec}
216: takes the first steps toward reasoning from scenarios (as representations
217: of requirements) to
218: modeling opportunities in PIPE. Section~\ref{explain} further
219: describes how scenario-based methods can extend PIPE to apply to the
220: earlier stages in the
221: lifecycle of personalization system design, such as requirements analysis
222: and high-level specification
223: of goals. Finally, 
224: Section~\ref{discuss}
225: identifies opportunities for
226: future research in both scenario-based methods and personalization systems
227: design. A case study that illustrates many of
228: the ideas introduced in this paper is provided in the Appendix. 
229: 
230: \section{PIPE: Personalization by Partial Evaluation}
231: \label{pipe-modeling}
232: 
233: As a methodology,
234: PIPE~\cite{naren-ic} makes no commitments to a particular
235: algorithm, format for information resources,
236: type of information seeking activities or, more basically, the nature
237: of personalization delivered. Instead, it emphasizes the modeling of an
238: information space in a way where descriptions of
239: information seeking activities can be
240: represented as partial information. Such
241: partial information is then exploited (in the model) by
242: {\it partial evaluation}, a technique popular in the programming languages
243: community~\cite{jones}.
244: 
245: \subsection{Example: Personalizing a Browsing Hierarchy}
246: \label{eggs}
247: It is easy to illustrate the basic concepts of PIPE by describing its
248: application to personalizing a browsing hierarchy. Consider a congressional
249: web site, organized in a hierarchical fashion, that provides information
250: about US Senators, Representatives, their party and
251: state affiliations (Fig.~\ref{senator} (left)).
252: Assume further that we
253: wish to personalize the site so that a reduced or restructured hierarchy is
254: made available for
255: each user.
256: The first step to
257: modeling in PIPE involves thinking of information as being organized along
258: a motif
259: of interaction sequences. We can identify two such organizations ---
260: the site's layout and design that influences how a user interacts with it,
261: and the user's
262: mental model that indicates
263: how best her information seeking goals are specified and realized. In
264: Fig.~\ref{senator} (left),
265: the designer has made a somewhat arbitrary partition, with type of politician
266: as the root level dichotomy, the party as the second level, and state at
267: the third.
However, the user might think of politicians by party first, a viewpoint
269: that is not
270: supported by the current site design. Site designs that are hardwired to
271: disable some
272: interaction sequences can be called `unpersonalized' with respect to
273: the user's mental model.
274: 
275: One typical personalization solution involves anticipating every type of
276: interaction sequence beforehand,
277: and implementing customized interfaces (algorithms) for all of
278: them~\cite{hearst-setting}.
279: For independent levels
280: of classification (such as in Fig.~\ref{senator} (left)), this usually implies
281: creating
282: and storing separate trees of information hierarchies. Sometimes, the site
283: designer
284: chooses an intermediate solution that places a prior constraint
285: on the types and forms of
286: interaction sequences supported. This is frequently implemented by
287: directing the user to one of several
288: predefined categories (e.g., `to search by State, click here.'). It is clear
289: that such solutions can involve
290: an exponential space of possibilities and lead to correspondingly
291: cumbersome site designs.
292: 
293: \begin{figure}
294: \centering
295: \begin{tabular}{cc}
296: & \mbox{\psfig{figure=senators.eps,width=5in}}
297: \end{tabular}
298: \caption{Personalizing a browsing hierarchy. (left)
299: Original information resource, depicting information about members
300: of the US Congress. The labels on edges represent choices and selections
301: made by a navigator. (right) Personalized hierarchy with respect to the
302: criterion `Democrats.' Notice that not only the pages, but also their structure is 
303: customized for (further browsing by) the user.}
304: \label{senator}
305: \end{figure}
306: 
307: \begin{figure}
308: \centering
309: \begin{tabular}{|l|l|} \hline
310: {\tt int pow(int base, int exponent) \{} & {\tt int pow2(int base) \{} \\
311: \,\,\,\,\,{\tt int prod = 1;} & \,\,\,\,\,{\tt return (base * base);} \\
312: \,\,\,\,\,{\tt for (int i=0;i<exponent;i++)} &  \} \\
313: \,\,\,\,\,\,\,\,\,\,{\tt prod = prod * base;} & \\
314: \,\,\,\,\,{\tt return (prod);} & \\
315: \} & \\
316: \hline
317: \end{tabular}
318: \caption{Illustration of the partial evaluation technique.
319: A general purpose {\tt pow}er function written in C (left) and
320: its specialized version (with {\tt exponent} statically set to 2) to handle
321: squares
322: (right). Such specializations are performed automatically by partial
323: evaluators such as C-Mix.}
324: \label{pe}
325: \end{figure}
326: 
327: \begin{figure}
328: \centering
329: \begin{tabular}{|l|l|} \hline
330: {\tt if (Sen)} & \\
331: \,\,\,\,{\tt if (Dem)} & \\
332: \,\,\,\,\,\,\,\,{\tt if (CA)} & \\
333: \,\,\,\,\,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$} & {\tt if (Sen)}\\
334: \,\,\,\,\,\,\,\,{\tt else if (NY)} & \,\,\,\,{\tt if (CA)} \\
335: \,\,\,\,\,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$} & \,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$}\\
336: \,\,\,\,{\tt else if (Rep)} & \,\,\,\,{\tt else if (NY)}\\
337: \,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$} & \,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$} \\
{\tt else if (Repr)} & {\tt else if (Repr)} \\
339: \,\,\,\,{\tt if (Dem)} &  \,\,\,\,{$\cdots \cdots \cdots$} \\
340: \,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$} & \\
341: \,\,\,\,{\tt else if (Rep)} & \\
342: \,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$} & \\
343: \hline
344: \end{tabular}
345: \caption{Using partial evaluation for personalization. (left) Programmatic input
346: to partial evaluator, reflecting the organization of information in Fig.~\ref{senator} (left).
347: (right) Specialized program from the partial evaluator, used to create the personalized
348: information space shown in Fig.~\ref{senator} (right).}
349: \label{sen1}
350: \end{figure}
351: 
352: The approach in PIPE is to create a programmatic representation of the
353: space of possible interaction sequences, and then to use the technique of
354: partial evaluation to realize individual interaction sequences. PIPE models
355: the information space as a program,
356: partially evaluates the program with respect to (any) user input, and
357: recreates a personalized
358: information space from the specialized program.
359: 
360: The input to a partial evaluator
361: is a program and (some) static information about its arguments. Its
362: output is a specialized version of this program (typically in the same
363: language),
364: that uses the static information to `pre-compile' as many operations
365: as possible. A simple example is how the C function {\tt pow}
366: can be specialized to create a new function, say
367: {\tt pow2}, that computes the square of an integer.
Consider, for example,
369: the definition of a {\tt pow}er function shown in the left part of
370: Fig.~\ref{pe}
371: (grossly simplified for presentation purposes).
372: If we knew that a particular user will utilize it
373: only for computing squares of
374: integers, we could specialize it (for that user) to produce the {\tt pow2}
375: function.
376: Thus, {\tt pow2} is obtained automatically (not by a human programmer)
377: from {\tt pow} by precomputing all expressions that involve {\tt exponent},
378: unfolding the for-loop, and by various other compiler transformations such as
379: {\it copy propagation} and {\it forward substitution}.
380: Automatic program specializers are available for C, FORTRAN, PROLOG, LISP,
381: and several other important
382: languages. The interested reader is referred to~\cite{jones} for a good
383: introduction.
384: While the traditional motivation for using partial evaluation is to achieve
385: speedup
386: and/or remove interpretation overhead~\cite{jones}, it can also be viewed
387: as a technique
388: for simplifying program presentation, by removing inapplicable, unnecessary,
389: and `uninteresting' information (based on user criteria) from a program.
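
\noindent
As a hand-written sketch (not the output of C-Mix or of any other specializer),
one plausible sequence of stages between {\tt pow} and {\tt pow2} can be spelled
out in C. The names {\tt pow\_general} and {\tt pow2\_unfolded} are hypothetical
and chosen only for illustration; the general function is renamed so that it
does not clash with the C library's {\tt pow}.

\begin{verbatim}
```c
#include <assert.h>

/* General-purpose power function (cf. the left part of the figure). */
int pow_general(int base, int exponent) {
    assert(exponent >= 0);          /* negative exponents are not handled */
    int prod = 1;
    for (int i = 0; i < exponent; i++)
        prod = prod * base;
    return prod;
}

/* Intermediate stage: exponent is known to be 2, so the loop test can be
   precomputed and the loop body unfolded exactly twice. */
int pow2_unfolded(int base) {
    int prod = 1;
    prod = prod * base;
    prod = prod * base;
    return prod;
}

/* Final stage: forward substitution and simplification leave one expression. */
int pow2(int base) {
    return base * base;
}
```
\end{verbatim}

\noindent
For any {\tt base}, all three functions return the same value when the exponent
is 2; they differ only in how much work is left for run time.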
390: 
391: Thus we can abstract the situation in Fig.~\ref{senator} (left) by the program of
392: Fig.~\ref{sen1} (left) whose structure models the information resource (in
393: this case, a hierarchy of web pages) and whose control-flow models
394: the information seeking activity within it (in
395: this case, browsing through the hierarchy by making individual selections).
396: The link
397: labels are represented as program variables and semantic dependencies
398: between links
399: are captured by the mutually-exclusive {\tt if..else} dichotomies. To
personalize this site
for, say, `Democrats,' this program is partially evaluated with
402: respect to
403: the variable {\tt Dem} (setting it to one and all
404: conflicting variables
405: such as {\tt Rep} to zero). This produces
406: the simplified program in the right part of Fig.~\ref{sen1}
407: which can be used to recreate web pages with personalized web content (shown in
408: Fig.~\ref{senator} (right)).
409: For hierarchies such as in Fig.~\ref{senator}, the representation afforded
410: by PIPE (notice the nesting of conditionals in Fig.~\ref{sen1}, left)
411: is typically much smaller than expressing the same as
412: a union of all possible interaction sequences.
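
\noindent
The two programs of Fig.~\ref{sen1} can be written out as compilable C to check
that the specialized program behaves like the original once {\tt Dem} is fixed
to one and {\tt Rep} to zero. This is a minimal sketch under assumptions of our
own: the elided branches are filled with placeholder page names (e.g.,
{\tt sen-dem-ca}), the {\tt Repr} branch stops at the party level, and the
function names {\tt browse}, {\tt browse\_dem}, and {\tt residual\_agrees} are
hypothetical. The residual function is written by hand rather than produced by
a partial evaluator.

\begin{verbatim}
```c
#include <assert.h>
#include <string.h>

/* Original program: branch of Congress, then party, then state. Returns the
   page reached by the given selections, or NULL if they lead nowhere. */
const char *browse(int Sen, int Repr, int Dem, int Rep, int CA, int NY) {
    if (Sen) {
        if (Dem) {
            if (CA) return "sen-dem-ca";
            else if (NY) return "sen-dem-ny";
        } else if (Rep) {
            if (CA) return "sen-rep-ca";
            else if (NY) return "sen-rep-ny";
        }
    } else if (Repr) {
        if (Dem) return "repr-dem";
        else if (Rep) return "repr-rep";
    }
    return NULL;
}

/* Residual program for Dem = 1, Rep = 0: the party level has disappeared. */
const char *browse_dem(int Sen, int Repr, int CA, int NY) {
    if (Sen) {
        if (CA) return "sen-dem-ca";
        else if (NY) return "sen-dem-ny";
    } else if (Repr) {
        return "repr-dem";
    }
    return NULL;
}

/* Returns 1 if the residual program agrees with the original on every
   assignment of the remaining (dynamic) variables, and 0 otherwise. */
int residual_agrees(void) {
    for (int bits = 0; bits < 16; bits++) {
        int Sen = bits & 1, Repr = (bits >> 1) & 1;
        int CA = (bits >> 2) & 1, NY = (bits >> 3) & 1;
        const char *full = browse(Sen, Repr, 1, 0, CA, NY);
        const char *spec = browse_dem(Sen, Repr, CA, NY);
        if ((full == NULL) != (spec == NULL)) return 0;
        if (full != NULL && strcmp(full, spec) != 0) return 0;
    }
    return 1;
}
```
\end{verbatim}

\noindent
The residual function takes two fewer arguments and has one less level of
nesting, which mirrors the shallower hierarchy of Fig.~\ref{senator} (right).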
413: 
414: Since the partial evaluation of a program results in another program, the PIPE
415: personalization operator is {\it closed.} In terms of interaction, this
416: means that
417: any modes of information seeking (such as browsing, in Fig.~\ref{sen1})
418: originally modeled in the program are preserved. In the above example,
419: personalizing a browsable
420: hierarchy returns another browsable hierarchy.  The closure property also
421: means that the
422: original information seeking activity (browsing) and personalization can be
423: interleaved in
424: any order. Executing the program in the form and order in which it was
425: modeled amounts
426: to the system-initiated mode of browsing. `Jumping ahead' to nested
427: program segments by
428: partially evaluating the program amounts to the user-directed mode of
429: personalization.
430: In Fig.~\ref{sen1} (right), the simplified program can be rendered and browsed in
431: the traditional sense,
432: or partially evaluated further with additional user inputs.  PIPE's use of
433: partial
434: evaluation is thus central to realizing a {\it mixed-initiative} mode of
435: information seeking~\cite{pipe-pepm},
436: without explicitly hardwiring all possible interaction sequences.
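
\noindent
The closure property can be illustrated with a small tree transformer. The
sketch below is not PIPE's implementation: it works on an explicit tree rather
than on a program, and the names {\tt Node}, {\tt make}, {\tt party},
{\tt congress}, {\tt specialize}, and {\tt free\_tree} are hypothetical. Page
names are placeholders, and the {\tt Repr} subtree is assumed to mirror the
{\tt Sen} subtree.

\begin{verbatim}
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A browsing hierarchy: a node is reached by following the link `label`;
   leaves carry a page name, interior nodes offer two further links. */
typedef struct Node {
    const char *label;
    const char *page;        /* non-NULL only at leaves */
    int nkids;               /* 0 or 2 in this sketch */
    struct Node *kids[2];
} Node;

Node *make(const char *label, const char *page, Node *a, Node *b) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->label = label;
    n->page = page;
    n->nkids = (a != NULL && b != NULL) ? 2 : 0;
    n->kids[0] = a;
    n->kids[1] = b;
    return n;
}

Node *party(const char *label, const char *ca_page, const char *ny_page) {
    return make(label, NULL, make("CA", ca_page, NULL, NULL),
                             make("NY", ny_page, NULL, NULL));
}

Node *congress(void) {
    Node *sen = make("Sen", NULL, party("Dem", "sen-dem-ca", "sen-dem-ny"),
                                  party("Rep", "sen-rep-ca", "sen-rep-ny"));
    Node *repr = make("Repr", NULL, party("Dem", "repr-dem-ca", "repr-dem-ny"),
                                    party("Rep", "repr-rep-ca", "repr-rep-ny"));
    return make("Congress", NULL, sen, repr);
}

/* Specialize hierarchy t with respect to one link label. The result is a
   freshly allocated hierarchy, so it can be browsed or specialized again. */
Node *specialize(const Node *t, const char *choice) {
    for (int i = 0; i < t->nkids; i++) {
        if (strcmp(t->kids[i]->label, choice) == 0) {
            /* The chosen branch replaces the whole dichotomy at t. */
            Node *r = specialize(t->kids[i], choice);
            r->label = t->label;
            return r;
        }
    }
    if (t->nkids == 0)
        return make(t->label, t->page, NULL, NULL);
    return make(t->label, NULL, specialize(t->kids[0], choice),
                                specialize(t->kids[1], choice));
}

void free_tree(Node *t) {
    for (int i = 0; i < t->nkids; i++)
        free_tree(t->kids[i]);
    free(t);
}
```
\end{verbatim}

\noindent
In this sketch, specializing first by `Dem' and then by `CA' yields the same
hierarchy as the reverse order, and each intermediate result is still a
browsable tree. Supplying `Sen' as well collapses the root to a single page,
so that no interaction remains.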
437: 
438: \subsection{Modeling in PIPE}
439: \label{factors}
440: Modeling an information space as a program that encapsulates the underlying
441: information
442: seeking activity is key to the successful application of PIPE. For browsing
443: hierarchies, a programmatic
444: model can be trivially built by a depth-first crawl of the site. 
445: In addition, a variety of other information spaces and corresponding
446: information seeking activities can be modeled in PIPE.
447: In~\cite{naren-ic,pipe-tois},
448: we have described modeling options for representing
449: information integration,
450: abstracting within a web page, interacting with recommender systems, 
451: modeling clickable maps, representing
452: computed information,
453: and capturing syntactic and semantic constraints
454: pertaining
455: to browsing hierarchies. Opportunities to curtail the cost of partial
456: evaluation for
457: large sites are also described in~\cite{pipe-tois}. We will not address
458: such modeling aspects here
459: except to say that the effectiveness of a PIPE implementation depends on the
460: particular modeling choices made {\it within} the programmatic
461: representation (akin
462: to~\cite{rabbit}).
We cannot overemphasize this aspect: an
464: example such as Fig.~\ref{sen1} can be made `more personalized' by conducting
465: a more sophisticated modeling
466: of the underlying domain. For example, individual
467: politicians' web pages at the leaves of Fig.~\ref{senator} could be modeled
468: by a deeper nesting of conditionals involving address, education, precinct,
469: and other
470: attributes of the individual. In other words, a single page could be
471: further modeled
472: as a browsable hierarchy and `attached' (functionally invoked) at various
473: places
474: in the program of Fig.~\ref{sen1} (left).
475: Conversely, the example in Fig.~\ref{senator} can be made
476: `less personalized' by requiring categorical information along with user input.
477: For instance, replacing {\tt if (Dem)} in Fig.~\ref{sen1}
478: with {\tt if (Party=Dem)} implies that the specification of the type of
479: input (namely that `Democrat' refers to the `name of the party') is required
480: in order for the statement to be partially evaluated.
481: Personalization systems built with PIPE can thus be distinguished by what
482: they model and the forms of customization enabled by applying
483: partial evaluation to such a modeling.
484: 
485: \subsection{Reasoning about Representations}
486: \label{reason}
487: Not all information spaces (and information seeking activities) can be
488: effectively modeled in PIPE. For example, a depth-first crawl of a 
489: site based on social network navigation
490: (e.g., [{\tt www.\hskip0ex imdb.\hskip0ex com}]) will result in spaghetti
491: code. In such cases, we
492: need a more complete understanding of the processes by which an online
493: information resource is created,
494: expressed, validated, and used. Even for sites that are
495: easily personalized, PIPE requires that they be modeled so that 
496: all information seeking activities are expressible as partial inputs. 
497: For instance, consider the following three information seeking activities 
498: in the context of Fig.~\ref{senator} (left).
499: 
500: \vspace{-0.3in}
501: \begin{descit}{}
502: \begin{description}
503: \item [User 1:] I will specify a party name first; then I will specify the name of
504: a state; finally, I will browse through any remaining links at the site. 
505: \item [User 2:] I would like to see the list of possible states first. So, the top level 
506: of the site should present me links for all the possible states. 
507: \item [User 3:] Show me information about the Democratic Senators of California.
508: \end{description}
509: \end{descit}
510: 
511: The information seeking activity of User 1 can be easily realized in the representation
512: of Fig.~\ref{sen1} (left), since we can partially evaluate the representation
513: with respect to the user's choice
514: of party and state. We say that the representation is well-factored for this activity and
515: that it is {\it personable} for this activity. However, the activity of User 2 cannot
be accommodated in Fig.~\ref{sen1} (left)
517: since it requires restructuring operations that are not describable as
518: partial evaluations. Applying partial evaluation to the representation of Fig.~\ref{sen1} (left)
519: can simplify interactions and allow User 2 
520: to make a choice of state out-of-turn. But it
521: cannot change the default order in which the interactions are modeled, 
522: which is by a 
523: branch-of-congress-party-state hierarchy. In this case, we say that the representation is
524: under-factored (for User 2's activity) and, equivalently, is {\it unpersonable} for it.
The reader should note that this does not mean that User 2's request can never be satisfied
526: in a PIPE model; see~\cite{pipe-tois} for an alternate representation of the information space
527: that is personable for User 2's activity (and is hence, well-factored for it).
528: 
529: Now, consider how we will satisfy User 3's request. 
530: This user has specified choices for all possible program variables --- 
531: involving state, party, and branch of Congress. This amounts to 
532: a {\it complete evaluation}, rather than a partial evaluation. Complete 
533: evaluation in a PIPE model
534: implies that every possible aspect of interaction is specified 
535: in advance, obviating the need for any interaction! 
536: Since PIPE emphasizes the specialization of interaction by partial evaluation,
537: the representation of
538: Fig.~\ref{sen1} (left) offers no particular advantages for User 3's activity.
539: In this case, we say that the representation is over-factored and, 
540: again, is unpersonable (by partial evaluation) for 
541: User 3's activity. Thus, both under-factorization and over-factorization
542: lead to unpersonable representations of information spaces.
543: The interesting representations are in between.
544: 
545: %It is difficult to create a representation that is well-factored, at once, 
546: %for a variety of
547: %information-seeking interactions. The easiest way to
548: %create representations --- by anticipating all possible information-seeking
549: %goals --- is over-factored for all interactions! For instance,
550: %an over-factored design for all of the above three users would be a
551: %web site that has the following top-level prompt:
552: %
553: %\vspace{-0.3in}
554: %\begin{descit}{}
555: %\vspace{-0.1in}
556: %\begin{description}
557: %\item click [here] if you are the user who likes to specify party name first, 
558: %then name of a state, and then browses through remaining links.
559: %\item click [here] if you are the user who would like to see list of 
560: %possible states first.  
561: %\item click [here] if you are the user who knows values for all three attributes 
562: %of politicians.
563: %\end{description}
564: %\end{descit}
565: %
566: %\noindent
567: %Such solutions bucket all users into well-defined categories by
568: %over-specifying the personalization problem.
569: In practice, it is acceptable to 
570: have a few situations that involve 
571: complete evaluation, as long as they are a small fraction of the total 
number of information seeking activities (that are to be accommodated in
573: a PIPE model). 
574: More discussion about representations and their factorizations 
575: is available in~\cite{pipe-tois}; for the purposes of this paper, it suffices 
576: to note that we have various possibilities for representing information 
577: spaces in PIPE and that it is important to choose a representation that is 
578: well-factored.
579:  
580: \subsection{Related Research}
581: In name and spirit, PIPE's personalization by partial evaluation is similar
582: to RABBIT's~\cite{rabbit}
583: retrieval by reformulation. Both these approaches involve the modeling of
584: information seeking in a setting
585: that emphasizes (i) reconciling the mismatch
586: between how an information space is organized and how a particular user
587: forages in it,
588: (ii) closure properties of the transformation operators, and (iii)
589: the design of information systems in ways that highlight new evaluation
590: criteria. Like RABBIT, PIPE
591: assumes that `the user knows more about the generic structure of the
592: [information space] than [PIPE]
593: does, although [PIPE] knows more about the particulars ([web
594: pages])~\cite{rabbit}.' For
595: instance, personalization by partial evaluation is only as effective as the
596: ease with
which program variables can be set (on or off) based on information
598: supplied by the user.
599: As such, PIPE has no semantic understanding of the representation.
600: 
601: PIPE also differs from RABBIT in important ways. It emphasizes the modeling
602: of an information
603: space as well as an information seeking activity in a unified programmatic
604: representation.
605: Its single transformation operator (partial evaluation) provides a basis to
606: reason about the design of
607: personalization systems. Since partial evaluation works best for highly
608: parameterized and structured spaces, the PIPE viewpoint relates the 
609: personalizability of an information resource to the factorizability 
610: of its representation. A well-factored information
611: space is thus a personable
612: one, since information seeking activities are expressible as partial inputs.
613: %This means that for an information seeking task that can be modeled
614: %programmatically,
615: %partial evaluation can be used as a theoretical way of assessing {\it any}
616: %personalization system
617: %designed for that task, not just ones designed by PIPE.
618: 
619: Research at the intersection of information systems and HCI has a strong
620: tradition, with many other
621: prominent examples. Both the Scatter/Gather~\cite{scatter-gather} and
622: Dynamic Taxonomies~\cite{tkde-navigation}
623: projects rely on defining a set of operations under which transformations
624: made on an information
625: space are closed. These projects concentrate on retrieval and navigation,
626: respectively. While
627: there has been considerable research in web
628: personalization~\cite{adomavicius01expert-driven,
629: ira, fab, cacm-hirsh, grouplens, cacm-jaideep, phoaks},
630: many of these algorithms/systems (or in some cases their results)
631: are usefully viewed as modeling choices to be made in a PIPE
632: implementation. For instance,
633: the graph-theoretic recommendation algorithm described in~\cite{ira} can
634: be modeled as a function in PIPE, so that the results of the function are
635: used to set
636: values for program variables, which can in turn be `linked' to more
637: detailed information about the
638: recommended artifacts.
639: There have also
640: been attempts at defining theories of information access, suitable for the
641: design of personalization
642: systems~\cite{pirolli-card}. Pirolli~\cite{pirolli-chapter}
643: explains the idea of `information foraging' and analyzes projects such
644: as Scatter/Gather in this context.
645: 
646: Empirical research involving usage modeling
647: and information capture is also relevant here. Drawing ideas from the ACT-R
648: theory of cognition, Pirolli
649: et al.~\cite{forage2} describe how a quantitative model of information
650: foraging
can be defined. Tools for capturing the history of interaction
652: in information foraging are also well
653: studied~\cite{holland-hill,footprints}. Mining
654: web user logs
655: has become a popular technique for obtaining models of site
656: navigation~\cite{cacm-jaideep,
657: cacm-mulvenna,cacm-myra}. While this strand of research
658: has arrived at rich, quantitative
659: models of site usage and navigation, there is a persistent gap between what
660: could
661: be mined from site usage and how the site could be automatically
662: transformed to conform
663: to any identified needs. Typical approaches to bridging
664: from the results of site usage studies
665: to opportunities for site restructuring
666: are heuristic (see for instance~\cite{cacm-myra}) and are limited in the
667: transformations they employ~\cite{adaptive-sites}.
668: 
669: \section{From Scenarios to Modeling Choices in PIPE}
670: \label{newsec}
671: As mentioned earlier, PIPE's modeling methodology requires a programmatic
672: representation (such as Fig.~\ref{sen1}, left) for partial evaluation.
673: Where do such representations come from? In this section, we analyze how
674: personalization requirements originate in usage contexts, and 
675: how they can help to build the representations of information spaces needed 
676: in PIPE. In this sense, we are extending the PIPE methodology `upstream' 
677: in the personalization system design life cycle, to include requirements 
678: analysis and specification.
679: 
680: While even a very general notion of requirements gathering applies in our 
681: situation~\cite{jack-require,
682: kieras-crc,human1,sommerville-crc}, personalization offers the unique
683: viewpoint of interpreting a general, existing information resource in a 
684: specialized manner (and thus, indirectly improving it). Studies in 
685: traditional IR contexts (e.g., see~\cite{human3,rabbit}) have
686: shown that one way to achieve such specialized interpretation is to support
687: the iterative reformulation of information requests. Besides reconciling
688: the mental mismatch between user expectations and the facilities 
689: afforded by an information system, reformulation engages the
690: user in an active dialog with the system, using both system 
691: features and user input to complete the information seeking activity.
692: 
693: We propose that such active search and reformulation episodes can be 
694: anticipated, revealed, and modeled by usage scenarios. These scenarios can
695: then form the basis of a scenario-based analysis and design (SBD)
696: process~\cite{jmc1,rosson-jmc}. Current practice is observed and
697: described in scenarios, and such scenarios help analyze how designers 
698: and users think about complex information resources. 
699: By helping to reason about the tradeoffs and design rationale associated 
700: with system design decisions, scenarios can aid in identifying 
701: opportunities for personalization. How to systematically proceed from 
702: high-level goals and needs identified in a scenario to a programmatic 
703: model in PIPE is the subject of the balance of this paper.
704: 
705: %As we will show later, the design rationale is a key 
706: %research result by itself, since it serves as a basis for understanding 
707: %what it means to design and use partial evaluation 
708: %techniques for information personalization.
709: %
710: %purposes, the design rationale can serve as a key research 
711: %These scenarios can then
712: %form the basis of a scenario-based analysis and
713: %design (SBD) process~\cite{jmc1,rosson-jmc}.
714: %SBD methods involve a combination of analytic and empirical activities ---
715: %current practice is observed and described in scenarios, and these
716: %scenarios are analyzed, transformed, refined, and
717: %evaluated in a continuing cycle of design and analysis. Central design products
718: %include narratives of realistic use that motivate the design of a system and
719: %an accompanying analysis of the tradeoffs or design rationale associated with
720: %system design decisions~\cite{jmc1}. For personalization systems, these
721: %scenarios and design rationale explain how designers and users think
722: %about complex information resources. As we will show later, the design
723: %rationale is a key research result by itself, since it serves as a 
724: %basis for understanding what it means to design and use partial evaluation 
725: %techniques for information personalization.
726: 
727: Before we describe our approach, it is important to make some preliminary
728: remarks. Let us revisit the two PIPE models in Fig.~\ref{sen1}. The
729: model in Fig.~\ref{sen1} (right) is the result of partially evaluating
730: Fig.~\ref{sen1} (left) with respect to `Democrats.' However, the model of
731: Fig.~\ref{sen1} (left) can itself be viewed as the result of a partial
732: evaluation (say, of a model that provides information about
733: all US politicians, with respect to `congressional officials'). In other words,
734: Fig.~\ref{sen1} (left) is personalized for information about
735: members of the US Congress. Likewise, the model of Fig.~\ref{sen1} (right) 
736: can be viewed as the starting point of interaction with an (unpersonalized)
737: information system, one that is designed for people who are interested in only 
738: Democrats. It should thus be clear that there is actually a continuum of 
739: PIPE models (see Fig.~\ref{continuum}), organized along a partial order 
740: (where the specialization relation is partial evaluation). 
741: 
742: \begin{figure}
743: \centering
744: \begin{tabular}{llllll}
745: $\cdots$
746: \begin{tabular}{|l|} \hline
747: PIPE Model\\
748: of US Politics\\
749: \hline
750: \end{tabular}
751: $\cdots$
752: \begin{tabular}{|l|} \hline
753: PIPE Model\\
754: of US Elected\\
755: Officials \\
756: \hline
757: \end{tabular}
758: $\cdots$ &
759: \begin{tabular}{|l|} \hline
760: {\tt if (Sen)} \\
761: \,\,\,\,{\tt if (Dem)} \\
762: \,\,\,\,\,\,\,\,{\tt if (CA)} \\
763: \,\,\,\,\,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$} \\
764: \,\,\,\,\,\,\,\,{\tt else if (NY)} \\
765: \,\,\,\,\,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$} \\
766: \,\,\,\,{\tt else if (Rep)} \\
767: \,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$} \\
768: {\tt else if (Repr)} \\
769: \,\,\,\,{\tt if (Dem)} \\
770: \,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$} \\
771: \,\,\,\,{\tt else if (Rep)} \\
772: \,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$} \\
773: \hline
774: \end{tabular}
775: & $\cdots$ & 
776: \begin{tabular}{|l|} \hline
777: {\tt if (Sen)}\\
778: \,\,\,\,{\tt if (CA)} \\
779: \,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$}\\
780: \,\,\,\,{\tt else if (NY)}\\
781: \,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$} \\
{\tt else if (Repr)} \\
783: \,\,\,\,{$\cdots \cdots \cdots$} \\
784: \hline
785: \end{tabular}
786: & $\cdots$
787: \end{tabular}
788: \caption{A space of PIPE models organized by the partial evaluation
789: relation.}
790: \label{continuum}
791: \end{figure}
792: 
793: Given this observation, we can cast our requirements analysis problem as
794: a search within a space of PIPE models, such as Fig.~\ref{continuum}. But
795: we can go further. Every model in this space can be
796: thought of as the result of a partial evaluation or, equally, as a starting
797: point for partial evaluation. This means that the task of selecting
798: a PIPE representation for subsequent personalization 
799: can actually be viewed as a problem of partially
800: evaluating (personalizing) a more general representation! Designing a 
801: personalization system is thus reduced 
802: to a problem of personalization (of a general, and perhaps ineffective,
803: information space). A PIPE representation can be seen as
804: `freezing' some aspects
805: of interaction and making available some other aspects to
806: model users' information seeking activities. In Fig.~\ref{sen1}
807: (left), the model is the result of partial evaluation with respect to congressional 
808: officials, but program variables
809: pertaining to party, type, and state are available 
810: to represent users' personalization objectives. This viewpoint 
811: reinforces our idea that both designing and using personalization systems 
812: involve specialized interpretation of information spaces. 
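
The `freezing' view can be illustrated with a minimal sketch in Python. The nested-tuple encoding, the function name {\tt partial\_eval}, and the leaf labels are hypothetical choices made for illustration only; they mirror the shape of the models in Fig.~\ref{continuum} and are not the representation used by PIPE itself. A decision node is a pair of a program variable and its branches, and a leaf is a string standing for some page content.

```python
def partial_eval(model, known):
    """Specialize `model` with respect to the partial input `known`.

    `model` is either a leaf (str) or a pair (variable, {value: submodel}).
    `known` maps the frozen (static) variables to their values.
    Returns the residual model, or None if no branch survives.
    """
    if isinstance(model, str):            # leaf: nothing left to specialize
        return model
    variable, branches = model
    if variable in known:                 # frozen variable: follow one branch
        chosen = branches.get(known[variable])
        return None if chosen is None else partial_eval(chosen, known)
    residual = {}                         # free variable: keep the test
    for value, sub in branches.items():
        specialized = partial_eval(sub, known)
        if specialized is not None:       # drop branches that became empty
            residual[value] = specialized
    return (variable, residual) if residual else None


# Hypothetical model of congressional officials (leaf names are placeholders).
congress = ("branch", {
    "Sen": ("party", {
        "Dem": ("state", {"CA": "sen-dem-ca", "NY": "sen-dem-ny"}),
        "Rep": ("state", {"CA": "sen-rep-ca"}),
    }),
    "Repr": ("party", {
        "Dem": ("state", {"CA": "repr-dem-ca"}),
        "Rep": ("state", {"NY": "repr-rep-ny"}),
    }),
})

# Freeze `party`; `branch` and `state` stay available for later interaction.
dems = partial_eval(congress, {"party": "Dem"})
```

Under these assumptions, {\tt dems} is itself a model that can serve as the starting point for further partial evaluation (for instance, with respect to the branch of Congress), which is the sense in which every model in the space is both a result and a starting point.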
813: 
814: Clearly this cyclic argument has to end somewhere, so what is the `starting
815: model' for partial evaluation? And where does it come from?
816: Our approach is to relate opportunities 
817: identified in usage scenarios to a characterization of the space of PIPE 
models, by qualifying the most-specific and most-general elements of
819: the space. We then
820: define an evaluation function 
821: to express our preference for one 
822: model over another. Before we can formally describe our methodology,
823: we must broaden our view of partial evaluation, 
824: moving from its algorithmic 
825: details as a specialization function, to larger contexts that recognize the
826: space of models induced by the partial order.
827: 
828: %our method demands that requirements identified in
829: %this fashion are representable as partial inputs in a PIPE model.
830: %This is difficult because partial evaluation is simply a program
831: %manipulation technique, so it does not provide any guidance on where 
832: %partial inputs (or the program) can come from, {\it only that they 
833: %are assumed given.} One way to view the challenge is as
834: %a need to make partial evaluation well-specified (for personalization 
835: %purposes), given the high-level goals and needs identified in a scenario. 
836: %To address this problem we must broaden our view of partial evaluation,
837: %moving from its algorithmic details to examples of how similar
838: %techniques have been applied in other contexts.
839: 
840: One such context is the work on explanation-based generalization 
841: (EBG) in AI~\cite{dejong}. Just as partial evaluation addresses the
842: specialization of programs, EBG addresses the specialization of domain
theories. In fact, van Harmelen and Bundy have observed~\cite{EBG_PE} that
844: when programs and domain theories are both represented in Prolog notation,
845: partial evaluation and EBG are essentially equivalent. With respect to
846: our scenario modeling problem, EBG makes the important addition of recognizing
847: the space of models induced by the partial evaluation
848: relation: the space is first defined by a systematic process of 
849: `explaining' observations and reasoning about features that are 
850: relevant to the observation. The vocabulary for conducting the explanation is 
851: provided by the domain theory; the relevant features thus identified 
852: help characterize the search space. Next, EBG provides a search criterion for 
evaluating models, one that emphasizes the utility of
the ensuing representations.
855: 
856: We can borrow this idea of explanation, using it to bridge from usage 
857: scenarios to the models and representational choices required by PIPE.
858: Just as an existing domain theory supports the construction of an explanation
859: for an example observation (which then guides the specialization of 
860: the theory), an existing information space can support the construction 
861: of an explanation for a scenario (of intended usage), which can then
862: guide the personalization of the information space (in our case, thus 
863: helping to design a personalization system).
864: 
865: \subsection{Explanation-Based Generalization}
866: To understand how we can bridge the high-level requirements uncovered
867: through scenarios and the programmatic modeling required by PIPE, we briefly 
868: review the basic ideas of EBG in an everyday context.
869: Consider a non-native speaker of English (Linus)
870: visiting the
871: United States. He is attempting to learn conversational constructs for
872: `being polite.'
873: The essence of EBG is that it is easier for Linus (at first) to verify or
874: explain why a particular conversation is
875: an example of politeness, than to describe or define politeness in a vacuum.
876: Thus, Linus observes instances of politeness and generalizes from them by
877: explaining
878: why they appear to be polite. For instance,
879: he witnesses the following dialog between two people:
880: 
881: \begin{descit}
882: {\bf Person 1:} Sir, I was wondering if you could point me in the direction
883: of Central Park.\\
884: {\bf Person 2:} Sure. Make a right two blocks after the gas station.
885: \end{descit}
886: 
887: At this point, Linus can infer that this is a valid example of politeness
888: (from Person 2's helpful response) and proceeds to explain the observation. 
889: By analyzing the structure of Person 1's query
890: and using his knowledge of how English sentences are constructed, Linus
891: constructs an explanation of this observation. DeJong~\cite{dejong} shows 
892: that an explanation can be viewed as a tree where each leaf is a 
893: property of the example being explained, each internal node models 
894: an inference procedure applied to its children, and the root is the
895: final conclusion supported by the explanation (namely, that the above was
896: an example of politeness). The explanation tree proves that
897: the conversation is polite and 
898: helps separate out the 
899: relevant and incidental parts of the above conversation; any attribute of 
900: the conversation that does not participate in the `proof' does not contribute
901: to politeness.
902: 
903: Using this structure (and his knowledge of English), Linus can then study 
904: how it can be generalized to other situations.
905: For instance, he can reason that the
906: phrase `Sir, I was wondering if' is what confers politeness onto the whole
907: sentence. He can also conclude that `Central Park' is not a 
908: property of politeness per se, but a feature of the
909: particular request. Notice that Linus could have arrived at the phrase
910: `Sir, I was wondering if' himself (without the above example), but that
911: would have required
912: a lot of imagination (computation, for an AI system~\cite{russel-norvig})
913: on his part. This is the
914: essence of EBG --- namely that we don't `actually learn anything factually
new from the instance'~\cite{russel-norvig}, but such examples
916: point us in the direction in which to specialize our unmanageable domain
917: theory (in this
918: case, Linus's rules of grammar and knowledge of how English sentences are
919: constructed).
920: 
921: The reader might notice a disconnect between the G in EBG (which stands for
922: {\it generalization}) and our statement that EBG is really a technique for
923: {\it specializing}
924: domain theories. This can be understood by noticing that the most common
925: usage of EBG is in
926: learning concept descriptors from individual examples~\cite{dejong}. Thus,
927: while it is the domain theory that is being specialized, the example is 
928: being generalized by throwing away 
929: parts of its explanation structure. In other words, the domain theory
930: constitutes the prior knowledge
931: that is useful for generalization~\cite{russel-norvig}. Linus can then
932: exhibit his newly acquired politeness in
933: a different situation such as: `Sir, I was wondering if you could hold open
934: the elevator for me.' Or
935: even further, he might generalize `Sir' to include `Madam' and `Lady.'
936: 
937: In concept learning, 
938: the level to which Linus generalizes a particular explanation is influenced 
939: by {\it operationality}. For instance, 
940: if he doesn't generalize beyond `Sir, I was wondering if you could point me 
941: in the direction of,' then his learning can only be applied to, say, 
942: situations when he is lost. At the other extreme, Linus might reason that there are many other ways of
943: being polite (such as
944: `Could you please ...?') and conclude that
945: any well mannered phrase prefixed to a request constitutes an instance of
946: politeness. Such
947: an over-generalization is however less operational, since it assumes that
948: Linus has some other way of
949: deciding what makes a phrase `well mannered.' Operationality is thus related
950: to the utility of the induced generalization.
951: 
952: %\begin{table}
953: %\centering
954: %\begin{tabular}{|lcl|} \hline
955: %Learning to be polite & = & Concept Learning (in EBG) \\
956: %                        & = & Program Specialization (in PE)\\
957: %			& & \\
958: %Rules of Conversational English & = &  Domain Theory (in EBG) \\
959: %			& = & Original Program (in PE) \\
960: %			& & \\
961: %%Central Park Conversation & = & Example of Goal Concept (in EBG) \\
962: %%			& = &
963: %%\fbox{\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,}
964: %%(in PE) \\
965: %%			& & \\
966: %Essential Aspects of Central Park  Conversation & = & Features of Training
967: %Example (in EBG) \\
968: %& = & Values for Static Variables (in PE) \\
969: %			& & \\
970: %Rules for Politeness    & = & Reformulated Goal Concept (in EBG) \\
971: %                        & = & Specialized Program (in PE) \\ \hline
972: %\end{tabular}
973: %\caption{A comparison of explanation-based generalization (EBG) and partial
974: %evaluation (PE).}
975: %\label{ebgispe}
976: %\end{table}
977: 
978: %With these observations, we can make a mapping between EBG and partial
979: %evaluation
980: %(see Table~\ref{ebgispe})
981: %by cross-referencing terms and concepts between the two areas. We mention
982: %that our mappings are
983: %more conservative than the stronger claims made in~\cite{EBG_PE} but the
984: %latter work is presented in
985: %the context of a first-order logic representation for the domain theory. 
986: 
987: \begin{figure}
988: \centering
989: \begin{tabular}{|ll|} \hline
990: {\bf Most general:} & $\prec$polite conversational construct$\succ$.\\
991: & ....\\
992: &$\prec$well-mannered phrase$\succ$ $\prec$Linus's desire$\succ$.\\
993: & ....\\
994: & $\prec$address$\succ$, I was wondering if $\prec$Linus's desire$\succ$.\\
995: & ....\\
996: & Sir, I was wondering if $\prec$Linus's desire$\succ$.\\
997: & ....\\
998: {\bf Most specific:} & Sir, I was wondering if you could point me in the direction of Central Park.\\
999: \hline
1000: \end{tabular}
1001: \caption{Example generalizations for the Central Park conversation.}
1002: \label{linusgen}
1003: \end{figure}
1004: 
1005: \subsection{Using EBG in Personalization}
1006: \label{ebginpe}
1007: Keller~\cite{keller-op} shows how we can think of EBG as a search 
1008: through a concept description space such as Fig.~\ref{linusgen}.
1009: The operationality consideration is then the objective function
1010: used to evaluate entries in the concept description space. 
1011: The most specific construct simply records the conversation 
and can only be replayed in an identical situation. The effort to
1013: instantiate the construct is thus minimal but many such constructs will 
1014: likely be needed to cover a realistic set of situations.
1015: The most general construct involves no learning on Linus's part and 
1016: merely restates his desire to learn polite constructs. If Linus adopts this
1017: construct, he can have one single explanation structure to support
1018: all situations but he has to expend the effort to instantiate it (effectively,
1019: constructing the proof) every time he needs to be polite. 
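
The tradeoff between the two ends of the concept description space can be sketched in Python as follows. The regular expressions, the names {\tt CONCEPTS}, {\tt coverage}, and {\tt slots}, and the use of a slot count as a stand-in for instantiation effort are all assumptions made for illustration. Only the three more specific entries of Fig.~\ref{linusgen} are encoded; the more general ones (`well-mannered phrase,' `polite conversational construct') are left out precisely because they come with no executable test, which is what makes them less operational.

```python
import re

# Concept descriptions, from most specific to more general.
# Each named group is a slot that must be filled in at use time.
CONCEPTS = [
    ("most specific",
     r"Sir, I was wondering if you could point me in the direction of Central Park\."),
    ("fixed address",
     r"Sir, I was wondering if (?P<desire>.+)\."),
    ("any address",
     r"(?P<address>Sir|Madam), I was wondering if (?P<desire>.+)\."),
]

# Situations in which Linus would like to be polite.
UTTERANCES = [
    "Sir, I was wondering if you could point me in the direction of Central Park.",
    "Sir, I was wondering if you could hold open the elevator for me.",
    "Madam, I was wondering if you could hold open the elevator for me.",
]


def coverage(pattern, utterances):
    """Number of utterances accepted by the concept description."""
    return sum(1 for u in utterances if re.fullmatch(pattern, u))


def slots(pattern):
    """Number of slots to instantiate (a crude, assumed proxy for effort)."""
    return re.compile(pattern).groups


# One (name, coverage, slots) triple per concept description.
summary = [(name, coverage(p, UTTERANCES), slots(p)) for name, p in CONCEPTS]
```

In this toy setting, moving toward the general end of the space increases both the number of situations covered and the number of slots to be filled, which is the tension that an operationality criterion has to resolve.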
1020: 
1021: 
Analogously, obtaining a PIPE model can be viewed as a
1023: search through a space of possible PIPE models, ordered by the partial 
1024: evaluation operator. 
1025: Explaining a user's successful interaction at a site (or collection of sites)
1026: with respect to a domain theory (more on this later) will help identify 
1027: the parts of the interaction that contribute to achieving 
1028: the personalization objectives. The explanation tree thus serves to define 
1029: the search space of PIPE models. The operationality boundary is then the 
1030: point at which certain parameters are fixed in our representation of the 
1031: information space and certain others are available to model users' 
1032: information seeking interactions. 
1033: %For instance, the operationality boundary in the example given in
1034: %Section~\ref{eggs} is our requirement that the personalization system 
1035: %cater to information about members of the US Congress {\it only.} Stated 
1036: %alternatively, operationalization here is intended to support the subsequent
1037: %partial evaluation of the information space with respect to the type, 
1038: %party, and state of political officials; the representation is
1039: %hence parameterized in terms of these variables.
1040: 
1041: \begin{figure}
1042: \centering
1043: \begin{tabular}{|ll|} \hline
1044: {\bf Inputs}&---- Functional Description of Personalization Problem \\
1045: & \,\,\,\,\,\,\,\,\,(i.e., a non-operative definition)\\ 
1046:  & ---- Domain Theory \\
1047: & \,\,\,\,\,\,\,\,\,(describing site layout, task models, browsing semantics; \\
1048: & \,\,\,\,\,\,\,\,\,must support the construction of an explanation) \\
1049: & ---- Usage Scenario  \\
1050: & \,\,\,\,\,\,\,\,\,(showing successful realization of personalization goal;\\
1051: & \,\,\,\,\,\,\,\,\,typically a sequence of interactions which achieves goal)\\
1052: & ---- Operationality Criterion\\
1053: & \,\,\,\,\,\,\,\,\,(helps evaluate alternative PIPE models;\\
1054: & \,\,\,\,\,\,\,\,\,outlines which aspects of explanation structure can be fixed,\\
1055: & \,\,\,\,\,\,\,\,\,and which should be available for capturing interactions with
1056: users)\\
1057: {\bf Output}&---- PIPE model (that defines a personalization system) \\
1058: \hline
1059: \end{tabular}
1060: \caption{Formalizing requirements analysis in the personalization lifecycle
1061: as an EBG problem.}
1062: \label{funcos}
1063: \end{figure}
1064: 
1065: \begin{figure}
1066: \centering
1067: \begin{tabular}{cc}
1068: \mbox{\psfig{figure=lifecycle.eps,height=4in}}
1069: \end{tabular}
1070: \caption{From scenarios to modeling choices in PIPE: toward a lifecycle of personalization
1071: system design.}
1072: \label{lifecycle}
1073: \end{figure}
1074: 
1075: Adapting from~\cite{ebg-alternate}, we can formalize requirements analysis
1076: for PIPE as shown in Fig.~\ref{funcos}.
1077: Our methodology requires the specification
1078: of four inputs. A functional description of the personalization problem is
assumed so that we can distinguish scenarios in which the user
successfully achieved his objectives from those in which he did not.
1081: The domain theory is the most critical aspect of the methodology and encodes
1082: knowledge about site layout, browsing semantics, task models, and any other
1083: information that is relevant for reasoning about the user's
1084: personalization objectives. In addition, the domain theory language
1085: should support inference procedures (e.g., deduction, rewriting) that 
1086: enable the construction of explanations. Usage scenarios, narratives, and think-aloud records constitute
1087: the third input and together with the domain theory, drive the 
1088: explanation construction process. Finally, operationality serves as the
1089: criterion for evaluating the space of models induced by generalizing an
1090: explanation.
1091: 
1092: Procedurally, our methodology consists of a three-stage approach 
1093: (see Fig.~\ref{lifecycle}):
1094: (i) constructing explanations from scenarios of use; this reveals how 
1095: a given site (or collection of sites) helps the user to complete his 
1096: information seeking activity, (ii) operationalizing the explanations in 
1097: terms of site facilities; this allows us to assess the most relevant aspects 
1098: of the explanation structure that we would like to retain and express 
1099: in a personalization system, and (iii) expressing the operationalized 
1100: explanation in a PIPE model (which will allow its subsequent personalization 
1101: for future users). We now proceed to study these steps in greater detail.
1102: 
1103: \section{Explaining Personalization as Partial Evaluation}
\label{explain}
\subsection{Constructing Explanations from Scenarios}
The SBD process begins with an analysis of current usage practices. This
would typically take place through field work that includes observation of
work sessions, interviews or surveys of domain
1105: experts, and collection and analysis of work artifacts. The goals and
1106: concerns of current use are synthesized and contextualized as problem
1107: scenarios.
1108: 
1109: \begin{figure}
1110: %\centering
1111: %\framebox{
1112: \begin{tabular}{|l|}\hline
1113: \begin{minipage}{2\colwidth}
1114: %\small
1115: \begin{description}
1116: \item [Nancy's scenario]: Nancy, a resident of 
1117: North Carolina, is interested in determining the committees that 
1118: the junior senator from her state is a member of.
1119: She uses the PoliticalInfo web site to perform her information 
1120: seeking activity. The top level of the site features various categories
1121: of political information, organized according to the offices of government
1122: (such as `President,' `Congress,' and `State Offices'). She clicks on
1123: `Congress' and reaches a page that prompts her to choose a state. Upon
1124: selecting `North Carolina,' the site refreshes to a selection involving
1125: branch of Congress (Senate or the House of Representatives). Nancy selects
1126: `Senate,' and the system now requests information on whether the senator
1127: occupies the junior or senior seat.
1128: By mistake, 
1129: she clicks on an advertisement banner for campaign finance reform, which 
1130: causes a new browser window to be opened up, soliciting information from 
1131: Nancy for an opinion poll. She hurriedly closes
1132: this new window, goes back to her browsing session in progress, and clicks
1133: on `Junior Seat.' Nancy scrolls down the displayed homepage of the individual,
1134: eyeballs the various headings, and finally spots the information about
1135: committee memberships. She notes that there are four committees that
1136: the senator is a member of. Satisfied with the results of her information
1137: seeking, Nancy exits from her browser program.
1138: \end{description}
1139: \end{minipage}
1140: \\\hline
1141: \end{tabular}
1142: \caption{Fictitious narrative of a problem scenario synthesized from field observation.}
1143: \label{nancy}
1144: \end{figure} 
1145: 
1146: \begin{figure}
1147: \centering
1148: \begin{tabular}{|rrl|} \hline
1149: R1: & politicalinfo($x$) & $\Leftarrow$ complete($x$) \\
1150: R2: & complete($x$) & $\Leftarrow$ officeselect($x$,``Congress'') $\wedge$ member($x$) $\wedge$ aspect($x$) \\
1151: R3: & complete($x$) & $\Leftarrow$ officeselect($x$,``President'') $\wedge$ aspect($x$) \\
1152: $\cdots$ & & \\
1153: R25: & member($x$) & $\Leftarrow$ representative($x$) \\
1154: R26: & member($x$) & $\Leftarrow$ senator($x$) \\
1155: $\cdots$ & & \\
1156: R32: & senator($x$) & $\Leftarrow$ branchselect($x$,``Senate'') $\wedge$ stateselect($x$,$s$) $\wedge$ seatselect($x$,``Junior Seat'') \\
1157: R33: & senator($x$) & $\Leftarrow$ branchselect($x$,``Senate'') $\wedge$ stateselect($x$,$s$) $\wedge$ seatselect($x$,``Senior Seat'') \\
1158: $\cdots$ & & \\
1159: R48: & aspect($x$) & $\Leftarrow$ aspectselect($x$,``Education'') \\
1160: R49: & aspect($x$) & $\Leftarrow$ aspectselect($x$,``Committee Memberships'') \\
1161: R50: & aspect($x$) & $\Leftarrow$ aspectselect($x$,``Home City'') \\
1162: $\cdots$ & & \\
1163: S1: & stateselect($x$,$s$) & $\Leftarrow$ clickedon($x$,$p$,$s$) $\wedge$
1164: congresslevel($p$) $\wedge$ hyperlink($s$) \\
1165: S2: & adselect($x$,$a$) & $\Leftarrow$ clickedon($x$,$p$,$a$) $\wedge$
1166: advertisement($a$) \\
1167: $\cdots$ & & \\
1168: %L1: & $\forall x,p,l,q$ follow($x$,$p$,$l$) $\wedge$ graph($p$,$l$,$q$) $\Rightarrow$ visited($x$,$q$) \\ 
1169: %L2: & $\forall x,p,l$ follow($x$,$p$,$l$) $\Rightarrow$ clickedon($x$,$l$) \\
1170: %$\cdots$ & \\
1171: %N1: & $\forall x1,x2,p,l,q$ follow($x1$,$p$,$l$) $\wedge$ advertisement($l$)
1172: %$\wedge$ graph($p$,$l$,$q$) $\wedge$ newsession($x1$,$x2$) $\Rightarrow$
1173: %visited($x2$,$q$) \\
1174: %$\cdots$ & \\
1175: \hline
1176: \end{tabular}
1177: \caption{An example domain theory for reasoning about interactions at
1178: the fictitious PoliticalInfo site. All variables are assumed to be
1179: universally quantified (syntax adapted from~\cite{dejong}).}
1180: \label{egdomain}
1181: \end{figure}
1182: 
1183: \begin{figure}
1184: \centering
1185: \begin{tabular}{|ll|} \hline
1186: F1: & officeselect($x47$,``Congress''). \\
1187: F2: & stateselect($x47$,``North Carolina'').\\
1188: F3: & branchselect($x47$,``Senate'').\\
F4: & adselect($x47$,``Campaign Finance Reform Advertisement'').\\
1190: F5: & seatselect($x47$,``Junior Seat'').\\
1191: F6: & aspectselect($x47$,``Committee Memberships'').\\
1192: \hline
1193: \end{tabular}
1194: \caption{Description of Nancy's scenario for subsequent explanation.}
1195: \label{nancy2}
1196: \end{figure}
1197: 
1198: \begin{figure}
1199: \centering
1200: \begin{tabular}{cc}
1201: \mbox{\psfig{figure=nancy-tree.eps,width=4.5in}}
1202: \end{tabular}
1203: \caption{Constructing an explanation for Nancy's scenario.
1204: Following DeJong~\cite{dejong}, the arrows indicate
1205: influence of rule antecedents on consequents and the parallel
1206: lines indicate unification constraints.}
1207: \label{nancy-tree}
1208: \end{figure}
1209: 
1210: As an example, consider the analysis and development of scenarios for 
1211: personalizing information about US political officials.
1212: We might observe users' interactions with various web sites
1213: for several hours, then interview them and analyze their decisions or 
1214: other artifacts that they generate. Empirical techniques for
1215: analyzing web logs and for the automatic sessionizing of
1216: user access patterns might also be used
1217: (see~\cite{jaideep-kais} for methods of data preparation and
1218: gathering for this activity).
1219: The analysis might point to a recurring scenario in which users 
1220: select a particular site, navigate the pages in the
site by making various selections on the category of the political
official, and finally look up individual web pages to obtain
1223: specific information about a particular official. For instance,
1224: Fig.~\ref{nancy} describes a positive instance of our modeling goal
1225: (i.e., personalizing information about political officials at the PoliticalInfo
1226: web site).
1227: 
We now turn our attention to the domain theory, which consists of three parts:
a model of the information seeking goal in terms of an underlying
schema; our understanding of PoliticalInfo's layout (its site structure
and how link choices specify information seeking attributes); and
1232: %constraints on interaction that can be inferred from how the political 
1233: %system is structured (e.g., `every state has only two senators but the
1234: %number of representatives varies'), and;
aspects that capture how browsing interactions take place at the site (such
as `clicking on an advertisement banner causes a new browser window to
be opened up').
1238: While EBG is sometimes proposed as a possible computational model of human
1239: concept formation and cognition, it is important to note that the purpose 
1240: of a domain theory in our methodology is {\it not} to explain
1241: users' behavior at this level. Rather, we seek to encode information in the 
1242: domain theory that helps relate interactions with information systems to 
1243: the realization of information seeking objectives. This requires that
1244: the domain theory be sufficient to reason deductively why an observed
1245: sequence of interactions achieves the desired personalization objectives. Our 
1246: experience is that for a variety of information spaces that support a 
1247: goal-oriented view of information seeking 
1248: (e.g., browsing hierarchies involving taxonomic relationships), this 
1249: assumption of the availability of a domain theory can indeed be satisfied. 
1250: 
1251: %, semantics of 
1252: %how attributes of political officials relate to one another (e.g., `choice of
1253: %state and choice of branch of US congress are independent of each other'),
1254: %taxonomic and cardinality constraints (e.g., `all states have only two senators,
1255: %but the number of representatives can vary from one in a state like
1256: %South Dakota to 52 in a state like California'), and aspects that capture 
1257: %how browsing interactions take place (such as the fact that new
1258: %browser windows can be opened and particular spatial locations on maps can be
1259: %clicked). 
1260: 
1261: For instance, a domain theory for interactions at PoliticalInfo can be organized
1262: as shown in Fig.~\ref{egdomain}.
1263: The first part of the theory describes an underlying schema for achieving
1264: the politicalinfo information seeking goal. Rule R1, in particular, is 
1265: our functional specification of the personalization problem. It is 
1266: non-operative and merely states that an interaction session ($x$) that is 
1267: complete can help satisfy the personalization objectives. Rules R2 and R3 
1268: describe two specific ways in which an
1269: interaction can be complete, namely by either
1270: concentrating on aspects dealing with members of the US Congress or 
1271: on the President. Members of the US Congress, in turn, are defined as
1272: either senators or representatives (rules R25 and R26). Notice that there
1273: are many possibilities for defining member($x$) and these are encoded
1274: in the domain theory.
1275: The second part of the domain theory describes 
1276: how primitive actions carried out by the user correspond to 
1277: the specification of information seeking attributes. For instance, the 
1278: selection of a state (rule S1) can be made by clicking on an
available hyperlink from the congresslevel page. Similarly, selections
1280: of advertisements can be made by clicking on advertisements from any page
1281: (rule S2).
1282: 
1283: Using the predicates in the domain theory, we can represent Nancy's scenario
1284: ($x47$) as shown in Fig.~\ref{nancy2}. For ease of presentation, 
1285: Fig.~\ref{nancy2} describes only the high-level selections inferred from
1286: Nancy's interactions instead of assertions
1287: at the level of clicks and hyperlinks. We now proceed to show
1288: that Nancy's scenario is a `correct'
1289: example of accessing information about political officials. In other
1290: words, we attempt to prove that politicalinfo($x47$) is true.
1291: The explanation tree constructed by 
1292: resolution is shown in 
1293: Fig.~\ref{nancy-tree} and identifies
1294: the salient aspects of the scenario that 
1295: contribute to realizing Nancy's information seeking goals. In particular,
1296: rules R26 and R32 have been used to prove that Nancy's selection of attributes 
1297: defines a particular political official. The explanation tree also reveals
1298: that the interactions relating to the advertisement for
1299: campaign finance reform do not contribute to Nancy's objectives. 
1300: We exclude the aspect of Nancy recording the details of the 
1301: committee memberships; her satisfaction with the information seeking 
1302: activity is assumed to be implicit in the completion of the proof.
1303: 
1304: %\begin{figure}
1305: %\centering
1306: %\begin{tabular}{cc}
1307: %\mbox{\psfig{figure=finance.epsi,width=3.5in}}
1308: %\end{tabular}
1309: %\caption{Using a scenario editor to define and
1310: %write narratives for Bill's personalization scenario.}
1311: %\label{bill-scenario}
1312: %\end{figure}
1313: %
1314: \subsubsection{A Note about Domain Theories}
1315: Before we describe the next stage in our methodology, it is pertinent to make
1316: some observations about the domain theory. First, we are not constrained to
1317: a predicate logic representation for the domain theory. The only 
1318: requirement is that `the representation language support the
1319: construction of an explanation'~\cite{dejong}. Second, alternate domain
1320: theories might permit the explanation of the same scenarios, but in
1321: qualitatively different ways. 
1322: This is a useful feature since it 
1323: %out the interconnection between a usage scenario and the domain theory and
1324: prevents overgeneralizing from the observed features of the scenario.
1325: In addition, it allows us to compare and contrast domain theories
1326: and determine if the resulting explanations are acceptable. Third,
1327: we can start with a `coarse' domain theory and revise it 
1328: by focusing on particular scenarios and situations~\cite{restructure-theories}. 
1329: Such {\it theory revision} research is an active area of EBG, where 
1330: explanation-based techniques are augmented with more empirically-based 
1331: methods to address the problem of imperfect prior knowledge. Finally,
while the availability of a general schema
1333: aids in the construction of a domain theory, external
1334: guidance (from users and think-aloud records) can support the construction
1335: of an explanation and augment domain theories that are incomplete. 
1336: 
1337: This last feature is especially useful when we extend our approach to more
1338: complex situations, such as multiple information resources. Consider 
1339: a scenario where Nancy is seeking information about financial investments.
1340: During analysis and design, some sub-goals and decisions can be
1341: inferred by think-aloud protocols while Nancy forages in 
1342: various web sites to address her financial analysis goal. For instance, Nancy 
1343: might report that she conducted a mental calculation of dollar amounts from
1344: Euro currency in analyzing some merger stocks, and hence we might model a
1345: procedure for unit conversion as an intermediate goal in our domain
1346: theory. Similarly, Nancy might have performed a manual information
1347: integration by copying text from one browser window to another or might
1348: have performed a mapping from company names (e.g., `Microsoft')
1349: to ticker symbols (`MSFT'). We would represent these as unification 
1350: constraints or semantic mappings in our domain theory, respectively. By
1351: augmenting a domain theory in this manner, we can summarize scenarios
1352: collected as field data into appropriate explanation structures.
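
As a rough illustration of such augmentations (a sketch under our own
assumptions, not part of the domain theory in Fig.~\ref{egdomain}), a semantic
mapping and a unit-conversion subgoal could be written in C as follows. The
names \verb|ticker_for|, \verb|euros_to_dollars|, and \verb|check_mappings|
are hypothetical, only the Microsoft/MSFT pair comes from the example above,
and the exchange rate is left as a parameter since no rate is given.

\begin{verbatim}
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical semantic mapping from company names to ticker symbols.
   Only the Microsoft/MSFT pair is taken from the text. */
struct ticker_entry {
    const char *company;
    const char *ticker;
};

static const struct ticker_entry ticker_table[] = {
    { "Microsoft", "MSFT" },
};

/* Return the ticker for a company name, or NULL if it is unknown. */
const char *ticker_for(const char *company)
{
    size_t n = sizeof ticker_table / sizeof ticker_table[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(ticker_table[i].company, company) == 0)
            return ticker_table[i].ticker;
    }
    return NULL;
}

/* Intermediate goal: convert an amount in euros to dollars.
   The rate is supplied by the caller; none is assumed here. */
double euros_to_dollars(double euros, double dollars_per_euro)
{
    return euros * dollars_per_euro;
}

/* Usage: the mapping resolves known names and rejects unknown ones. */
void check_mappings(void)
{
    assert(strcmp(ticker_for("Microsoft"), "MSFT") == 0);
    assert(ticker_for("Unlisted Co") == NULL);
}
\end{verbatim}

An explanation that needs the ticker symbol can then treat the lookup as one
more subgoal to be proved, failing cleanly when the mapping is missing.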
1353: 
1354: We also recognize the possibility that a domain theory might
1355: be ineffective in explaining scenarios. Consider the case when
1356: Nancy is seeking information about the `Democratic senator from
1357: North Carolina' and is unsure if the senator occupies the junior or senior 
1358: seat. If the site does not allow the direct specification of her request,
1359: Nancy might resort to trying both choices of seats to determine 
1360: the one occupied by the Democrat. An ineffective domain theory might 
1361: incorrectly infer that Nancy's information seeking was 
1362: focused on both senators! We thus need to discount some steps in the
scenario as being tentative or exploratory. This is a well-studied problem
1364: in EBG and various strategies for reducing dependence on such `brittle
1365: theories' (such as induction over explanations) 
1366: have been proposed~\cite{dejong,flann}. 
1367: 
1368: The reader will also note that the explanation in Fig.~\ref{nancy-tree}
1369: (or the domain theory) does not capture the order in which the 
1370: attributes were specified in Nancy's scenario. In this particular
1371: instance, the temporal sequencing of subgoals is not critical to completing
1372: the explanation; Nancy could have selected `Senate' first (if the site
1373: allowed it) before the choice of state was made.
1374: In a different application, the domain theory might need to support the
1375: construction of explanations that recognize the ordering of interactions.
1376: 
1377: \subsection{Operationalizing Explanations}
1378: Fig.~\ref{nancy-tree}'s explanation of Nancy's scenario, while
1379: identifying relevant parts of the domain theory, is too specific to be
1380: used as the basis of a personalization system.
1381: The next step in our methodology is thus to determine the parts of the
1382: explanation structure that we would like to retain and express in a PIPE
1383: model. A trivial step of identity elimination is first done to eliminate 
1384: dependence on the particular scenario of Nancy (i.e.,
1385: the $x47$ in Fig.~\ref{nancy-tree} is replaced by just $x$).
1386: Operationalization can then be thought of as drawing a cutting plane
1387: through the explanation tree. Every node below the plane is too specific
1388: to be assumed to be part of all scenarios. The structure above the plane
1389: is considered the persistent feature of all usage scenarios and is
1390: expressed in the personalization system design. The user is then
1391: expected to supply the details of the structure below the plane so that
1392: the proof can be completed.
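
The cutting-plane view can be sketched in C as below. This is only an
illustration: the six-node tree is made up and follows the shape of
Fig.~\ref{nancy-tree} only loosely, and the predicate and function names
(\verb|above_plane|, \verb|is_open_goal|, \verb|count_open_goals|) are our
assumptions. A node is an `open goal' if it lies above the plane but has a
child below it, so its subproof must be supplied by the user.

\begin{verbatim}
#include <assert.h>

struct node {
    const char *goal;
    int parent;   /* index of the parent node, -1 for the root */
    int depth;    /* the root has depth 0 */
};

/* Hypothetical explanation tree, after identity elimination. */
static const struct node tree[] = {
    { "politicalinfo(x)", -1, 0 },
    { "complete(x)",       0, 1 },
    { "member(x)",         1, 2 },
    { "aspect(x)",         1, 2 },
    { "senator(x)",        2, 3 },
    { "committees(x)",     3, 3 },
};
enum { TREE_SIZE = sizeof tree / sizeof tree[0] };

/* The plane sits just below the given depth. */
int above_plane(int i, int plane)
{
    return tree[i].depth <= plane;
}

/* A retained node with a child below the plane is left to the user. */
int is_open_goal(int i, int plane)
{
    if (!above_plane(i, plane))
        return 0;
    for (int j = 0; j < TREE_SIZE; j++) {
        if (tree[j].parent == i && !above_plane(j, plane))
            return 1;
    }
    return 0;
}

int count_open_goals(int plane)
{
    int count = 0;
    for (int i = 0; i < TREE_SIZE; i++)
        count += is_open_goal(i, plane);
    return count;
}

void check_planes(void)
{
    assert(count_open_goals(0) == 1);  /* only the root goal is fixed */
    assert(count_open_goals(2) == 2);  /* member(x) and aspect(x) open */
    assert(count_open_goals(3) == 0);  /* leaves included: all frozen */
}
\end{verbatim}

In this toy tree, a plane at depth 0 fixes nothing but the top-level goal, a
plane at depth 2 leaves member($x$) and aspect($x$) to the user, and a plane
that includes the leaves leaves nothing open.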
1393: 
1394: \begin{figure}
1395: \centering
1396: \begin{tabular}{|ll|} \hline
1397: %\begin{descit}{} 
1398: %\begin{description}
1399: click [here] & if you are the user who seeks information about committee \\
1400: & memberships of the junior senator from North Carolina. \\
1401: click [here] & if you are the user who likes information about the \\
1402: & educational background of the President. \\
1403: click [here] & if you are the user who seeks details about bills proposed by the\\
1404: & Republican representative from Littletown constituency of Virginia.\\
1405: click [here] & $\cdots$ \\ 
1406: \hline
1407: %\end{description} 
1408: %\end{descit}
1409: \end{tabular}
1410: \caption{Operationalizing multiple explanations at the leaf level leads to
1411: an over-factored representation in PIPE.}
1412: \label{over-factor}
1413: \end{figure}
1414: 
1415: \begin{figure}
1416: \centering
1417: \begin{tabular}{cc}
1418: \mbox{\psfig{figure=nancy-tree-op3.eps,width=4.5in}}
1419: \end{tabular}
1420: \caption{Operationalizing the explanation for Nancy's scenario.}
1421: \label{nancy-tree-op3}
1422: \end{figure}
1423: 
1424: For instance, if we draw the cutting plane just below politicalinfo($x$)
1425: then this is equivalent to no personalization 
1426: at all. A single explanation tree can accommodate all possible 
1427: information seeking activities but really provides no support as a 
1428: personalization system.
1429: 
1430: If the cutting 
1431: plane is drawn to include the leaves, then this amounts to freezing 
1432: all aspects of Nancy's scenario so that it can be replayed in full. In such 
1433: a case, it is unlikely that a personalization system modeled after one
1434: explanation tree will satisfy all users. We could freeze many more such
1435: explanation trees and the design of
1436: the personalization system then reduces to providing a top-level prompt for
1437: the correct explanation (see Fig.~\ref{over-factor}).
1438: This solution anticipates all forms of interactions and 
1439: over-specifies the personalization problem.
1440: As is well-known in EBG, such a design
1441: is inefficient since a new user has to search for the
1442: correct explanation that is appropriate for his information seeking
1443: activity. From Section~\ref{reason}, we also know that such a design would 
1444: be over-factored and unpersonable under PIPE. This
1445: is because the resulting PIPE model has only one argument (namely, the
1446: choice of the correct explanation) and all invocations of such a model have
1447: to involve complete evaluation!
1448: 
1449: Fig.~\ref{nancy-tree-op3} describes an intermediate solution where some aspects
1450: of the explanation are fixed but some other aspects are available for 
1451: addressing users' interactions. This operationalization induces
1452: a system that personalizes information about
1453: congressional officials. It
1454: assumes that, like Nancy, a new user will invoke rule R2 from Fig.~\ref{egdomain} but, unlike Nancy, could be interested in other members of Congress
1455: besides senators. In addition,
1456: the part of the tree specifying the aspect of interest is also available
1457: for specification by the user. The reader should note that such an 
1458: operationalization will not cover scenarios where the user is interested in,
1459: say, information about the President. To accommodate this case, we could either
1460: move our operationality boundary or create and operationalize
1461: another explanation tree (for
1462: an appropriate scenario). Studying the tradeoff between these two possibilities 
1463: constitutes the crux of operationality research. For the purposes of
1464: this article, it suffices to note that two dimensions of operationality
1465: are: how many explanation trees are operationalized, and 
1466: where the boundaries are drawn in each tree.
1467: 
1468: EBG's ability to induce general constructs by explaining 
1469: scenarios can be a drawback as well as an advantage.
1470: If we generate a lot of templates,
1471: then users' interactions with the personalization system can 
1472: get burdened by a mushrooming of choices. At the same time, EBG provides
1473: a systematic way to cluster the space of users and to determine dense regions
1474: of repetitive interactions that could be supported. A case in point is
1475: a web site such as [{\tt amazon.com}] that distinguishes between returning 
1476: customers and new customers. A top-level prompt at the site makes this
1477: distinction (sometimes, this is automated with cookies) and transfers are made
1478: to different interaction sequences, based on the results of this choice.
1479: For returning customers, questions about mode of payment and mailing address
1480: are skipped because the parts of the proof dealing
1481: with those aspects are already subsumed in the design of the system.
1482: Another example is a web site that provides links from the top-level
1483: page to `the top 10 frequently accessed pages at our site.' In this case, 
1484: popular explanations have been operationalized at the leaf level and 
1485: presented so that new users can directly access them. 
1486: %Our methodology can thus aid in the creation of such personalized views 
1487: %of systems for users.
1488: 
1489: %\subsubsection*{Extensions of the Basic Idea}
1490: Operationalization is only one way to generalize an explanation to other
1491: situations. A variety of other generalization approaches are prevalent
1492: in the EBG literature. There are techniques that conduct generalization across 
1493: multiple explanation trees simultaneously, by identifying recurring 
1494: subtrees~\cite{flann}. To some extent, this can help overcome sensitivity
1495: to initially explained scenarios and also address shortcomings in the
1496: representation of the domain theory. There are also approaches that model and 
1497: generalize temporal interactions 
1498: and ones that help acquire iterative concepts. Iterative concepts are useful, 
for instance, when users of our PoliticalInfo site seek multiple aspects
1500: of information about a political official. Nancy
1501: was interested in only committee memberships but a different user could
1502: have been interested in committee memberships as well as the educational
1503: profile. Generalizing to $n$ such aspects can be achieved by acquiring
1504: an iterative concept. We can also generalize to acquire
1505: recursive formulations; this is useful if information seeking has
1506: an exploratory nature to it. Linus might have visited a sports site
1507: four times in a single scenario, whose correct generalization could be
1508: `keep visiting the page to see if the scores have been updated.' Finally, 
1509: since the choice of operationality is primarily driven by empirical and 
1510: usability concerns, a variety of existing methodologies for utility 
1511: analysis and estimation can be employed here~\cite{dejong}.
1512: 
1513: \begin{figure}
1514: \centering
1515: \begin{tabular}{|ll|} \hline
1516: &$\cdots \cdots \cdots$\\
1517: L1: &{\tt if (Senator)}\\
1518: L2: &\,\,\,\,{\tt if (JuniorSeat)} \\
1519: L3: &\,\,\,\,\,\,\,\,{\tt if (NC)} \\
1520: L4: &\,\,\,\,\,\,\,\,\,\,\,\,{\tt if (CommitteeMemberships)} \\
1521: L5: &\,\,\,\,\,\,\,\,\,\,\,\,\,\,{\tt /* Details about committees */} \\
1522: &\,\,\,\,\,\,\,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$} \\
1523: &\,\,\,\,\,\,\,\,\,\,\,\,{\tt else if (Education)} \\
1524: &\,\,\,\,\,\,\,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$} \\
1525: &\,\,\,\,{\tt else if (SeniorSeat)} \\
1526: &\,\,\,\,\,\,\,\,\,\,\,\,{$\cdots \cdots \cdots$} \\
1527: &$\cdots \cdots \cdots$\\
1528: &{\tt else if (Representative)} \\
1529: &$\cdots \cdots \cdots$ \\
1530: \hline
1531: \end{tabular}
1532: \caption{Designing a PIPE representation from the
1533: operationalized explanation in Fig.~\ref{nancy-tree-op3}.}
1534: \label{pipecone}
1535: \end{figure}
1536: 
1537: \subsection{Designing a PIPE Representation}
1538: The last step is to express the operationalized explanation in a PIPE
1539: representation. We can think of this stage as designing an information system
1540: that provides all the necessary facilities to complete the proof. 
1541: The part of the proof {\it above} the cutting plane is to be performed by 
1542: the system, whereas the user has to supply the details of the proof 
1543: {\it below} the cutting plane (in our case, for member($x$) and
1544: aspect($x$)).
1545: 
1546: Ideally, the user should be able to supply her part of the proof
1547: in as expressive a manner as possible. For instance, 
1548: just saying `North Dakota' and `Representative' in the current political 
1549: landscape defines a unique member of Congress. To achieve this effect 
1550: in a PIPE model, we must ensure that all possible ways of completing the
1551: proof are describable in terms of interaction sequences. Alternatively, we
1552: might choose to support only certain possibilities of completing the
1553: proof. The PIPE model should thus be parameterized in terms of
1554: variables that help define member($x$) and aspect($x$). A representation
1555: in C is shown in Fig.~\ref{pipecone}. Lines L1, L2, and L3 help define
1556: member($x$) for Nancy's scenario and Line L4 helps define aspect($x$).
1557: Since PIPE representations can be partially
1558: evaluated, the user can specify the underlying attributes of the proof in any
1559: order.
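
To make the role of partial evaluation concrete, the following sketch
transcribes the visible branches of Fig.~\ref{pipecone} into a C function and
writes out, by hand, the residual program that a partial evaluator could
produce once the user states `NC' out of turn. The struct, the returned
strings, and the function names are assumptions made for illustration, and the
branches elided in the figure are reduced to placeholders.

\begin{verbatim}
#include <assert.h>
#include <string.h>

/* Interaction variables, named as in the figure. */
struct choices {
    int Senator, Representative;
    int JuniorSeat, SeniorSeat;
    int NC;
    int CommitteeMemberships, Education;
};

/* Full model; elided branches return placeholders. */
const char *politicalinfo(struct choices c)
{
    if (c.Senator) {
        if (c.JuniorSeat) {
            if (c.NC) {
                if (c.CommitteeMemberships)
                    return "committees of the junior senator from NC";
                else if (c.Education)
                    return "education of the junior senator from NC";
            }
        } else if (c.SeniorSeat) {
            return "(senior seat branches elided)";
        }
    } else if (c.Representative) {
        return "(representative branches elided)";
    }
    return "(no page selected yet)";
}

/* Hand-written residual for NC = 1: the test on NC has been removed,
   and the remaining variables can still be supplied in any order. */
const char *politicalinfo_NC(struct choices c)
{
    if (c.Senator) {
        if (c.JuniorSeat) {
            if (c.CommitteeMemberships)
                return "committees of the junior senator from NC";
            else if (c.Education)
                return "education of the junior senator from NC";
        } else if (c.SeniorSeat) {
            return "(senior seat branches elided)";
        }
    } else if (c.Representative) {
        return "(representative branches elided)";
    }
    return "(no page selected yet)";
}

/* The residual must agree with the full model whenever NC = 1. */
void check_residual(void)
{
    for (unsigned bits = 0; bits < 64; bits++) {
        struct choices c = {
            .Senator = (int)(bits & 1u),
            .Representative = (int)((bits >> 1) & 1u),
            .JuniorSeat = (int)((bits >> 2) & 1u),
            .SeniorSeat = (int)((bits >> 3) & 1u),
            .NC = 1,
            .CommitteeMemberships = (int)((bits >> 4) & 1u),
            .Education = (int)((bits >> 5) & 1u),
        };
        assert(strcmp(politicalinfo(c), politicalinfo_NC(c)) == 0);
    }
}
\end{verbatim}

The check enumerates every setting of the remaining variables, which is
feasible only because the sketch is tiny; it is meant to show what `agreeing
with the original program' means, not how a partial evaluator is validated.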
1560: 
1561: It is important that we also model terminal code that gets triggered upon
1562: completion of the proof (or subproofs). In our running example, the 
1563: terminal code (e.g., line L5) represents the results presented to Nancy 
1564: upon successful completion of the proof (i.e., information about
1565: committee memberships). 
1566: 
1567: The reader should recall that we have a variety of modeling options
1568: for creating the PIPE representation. Fig.~\ref{pipecone} models the
1569: interaction as a browsing hierarchy, similar to Fig.~\ref{sen1}. Instead we
1570: could have modeled the interaction as a sequence of forms to be filled by
1571: the user. 
1572: Our representation also assumes that we have only one operationalized
1573: explanation structure. If we have multiple explanation structures, an extra
1574: program variable can be introduced that identifies the
1575: explanation a given user is interested in. The PIPE model in this 
1576: case would be a sequence of representations such 
1577: as Fig.~\ref{pipecone}, joined together by a top-level
1578: {\tt switch} construct. A complete description of an example application
1579: developed by our methodology is given in the Appendix.
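
A minimal sketch of the multi-explanation case is given below, assuming a
second, hypothetical operationalized explanation for the President alongside
the congressional one. The enumeration, the struct, and the function names are
illustrative assumptions, and each sub-model is reduced to a stub.

\begin{verbatim}
#include <assert.h>
#include <string.h>

/* The extra program variable: which explanation the user wants. */
enum explanation { EXPLAIN_CONGRESS, EXPLAIN_PRESIDENT };

struct inputs {
    int Senator, Representative;  /* used by the congressional model */
    int Education;                /* used by the presidential model */
};

/* Stub for a representation such as the one sketched for Congress. */
static const char *congress_model(struct inputs in)
{
    if (in.Senator)
        return "(senator branches elided)";
    else if (in.Representative)
        return "(representative branches elided)";
    return "(member not yet specified)";
}

/* Stub for a second, hypothetical operationalized explanation. */
static const char *president_model(struct inputs in)
{
    if (in.Education)
        return "educational background of the President";
    return "(aspect not yet specified)";
}

/* Top-level switch joining the individual representations. */
const char *pipe_model(enum explanation which, struct inputs in)
{
    switch (which) {
    case EXPLAIN_CONGRESS:
        return congress_model(in);
    case EXPLAIN_PRESIDENT:
        return president_model(in);
    }
    return "(unknown explanation)";
}

void check_dispatch(void)
{
    struct inputs in = { 0, 0, 1 };
    assert(strcmp(pipe_model(EXPLAIN_PRESIDENT, in),
                  "educational background of the President") == 0);
    assert(strcmp(pipe_model(EXPLAIN_CONGRESS, in),
                  "(member not yet specified)") == 0);
}
\end{verbatim}

Because the selector is an ordinary program variable, fixing it is itself a
partial evaluation step that discards the sub-models not chosen.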
1580: 
1581: Programmatic PIPE models obtained by our methodology can be viewed as 
1582: compact representations of all pertinent scenarios and, in this sense, 
1583: are more expressive than scenario grammars~\cite{hsia} and scenario
1584: schema~\cite{potts}. Using program compaction techniques~\cite{debray-toplas,
1585: pipe-tois}, we can further curtail the explosion of scenario possibilities.
1586: Any program analysis technique can then be applied to aid in scenario
1587: analysis. For example, the technique of {\it program slicing}~\cite{slicing}
1588: can help reason about the program parts that will be affected by changes
1589: in a given scenario. 
1590: By comparing these effects to those deduced before
1591: the change, we can reason about the orthogonality of scenarios.
1592: 
1593: \subsection{Related Research}
1594: As narrative descriptions of use, scenario-based methods have become prevalent
1595: in various applications, including requirements
1596: analysis~\cite{jack-require,jarke,decl-scenarios} and user interface
1597: design~\cite{jack-making-use}.
1598: The design of software systems from scenarios (as opposed to purely functional,
1599: solution-first specifications) is the cornerstone of our approach. Most
1600: relevant to our presentation is Potts's distinction between inducing scenarios
1601: (from interaction) and deducing scenarios (from specifications)~\cite{potts}.
1602: The operational definition of goal-achieving actions emphasized
1603: in~\cite{potts}
1604: is similar to our explanation structure. For instance, Potts employs a goal
1605: hierarchy
1606: where leaves are associated with user actions and which serve to
1607: operationalize
1608: goals. In addition, we are able
1609: to mechanically transform an information space using attributes of such an
1610: explanation
1611: structure (by partially evaluating the programmatic representation of the
1612: operationalized
1613: explanation).
1614: Other applications in automated software
1615: engineering~\cite{hall-se} and information pattern
1616: extraction~\cite{scenario-pattern-extract}, while supporting explanation-based
1617: views of scenario analysis, have not been connected to the partial
1618: evaluation aspect, which is critical for the
1619: PIPE methodology of personalization.
1620: 
1621: The PIPE approach is also related to the use of task models in
1622: software design. Traditionally, such integration has been achieved by
symbolic modeling techniques, motivated by object-oriented (OO)
1624: design~\cite{intTaskObj}
1625: and languages such as UML~\cite{jack-editorial}. More recent efforts in
1626: personalization applications can be found
1627: in~\cite{li-catalog,schwabe2,human1,schwabe1}.
1628: In~\cite{schwabe2}, the
1629: derivation of models of interaction from use cases is presented.
1630: Kramer et al.~\cite{human1} emphasize the importance of task analysis and
1631: advocate end-user analysis of algorithms and tools employed in
1632: personalization systems.
1633: In~\cite{schwabe1}, the authors emphasize
1634: an OO modeling of an information system, where personalization is introduced
1635: as a function from the conceptual design stage.
1636: PIPE's support for personalization, on the other hand, is built into the
1637: programmatic model of the information space and doesn't require any special
1638: handling. It also emphasizes properties such as the closure of personalization
1639: operators and the factorizability of information spaces, that help
1640: relate design decisions to needs identified through scenario analysis.
1641: While these same issues are pertinent in~\cite{schwabe2,schwabe1}, the
1642: approach there is more reminiscent of design patterns (and integrating
1643: requirements via OO analysis), whereas our idea is to use explanations to
1644: identify opportunities for providing personalization facilities. PIPE's
1645: approach also makes more effective use of domain-specific knowledge, both
1646: embodied in the scenarios and assumed in the modeling of the
1647: information seeking
1648: activity.
1649: 
1650: As Russell and Norvig point out, empirical analysis of efficiency
1651: is central to EBG~\cite{russel-norvig}. They emphasize 
1652: that `the efficiency of an [information system factorization] is actually its 
1653: average-case complexity
on a population of [scenarios that are likely to be encountered].' Being
too specific when operationalizing explanations creates more distinctions
than it removes, which reduces orthogonality (salience,
1657: as used in~\cite{potts})
1658: among scenarios. Defining operationality~\cite{keller-op} carefully
1659: in the personalization context is an area for future research.
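
One way to make the quoted criterion concrete, using notation that is ours and
not Russell and Norvig's, is the following. Let $S$ be the population of
scenarios, $p(s)$ the probability of encountering scenario $s$, and $c_M(s)$
the number of interaction steps that a model $M$ requires to complete the
proof for $s$. The efficiency of $M$ can then be read as
\[
\bar{c}(M) = \sum_{s \in S} p(s)\, c_M(s).
\]
For example, under the simplifying assumption that a leaf-level design such as
Fig.~\ref{over-factor} lists $k$ frozen explanations that are equally likely
to be the one sought, and that the user scans the list from the top, the
expected number of entries examined is
\[
\sum_{i=1}^{k} \frac{1}{k}\, i = \frac{k+1}{2},
\]
which grows linearly with the number of operationalized trees.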
1660: 
1661: \section{Discussion}
1662: \label{discuss}
1663: This research makes contributions to the state-of-the-art in both
1664: personalization systems and scenario-based design. For personalization, 
1665: we have clarified the aspects of requirements specification and high-level 
1666: elucidation of goals, showing how explanations from usage scenarios can provide 
1667: models for PIPE. In particular, the problem of designing a representation has 
1668: been formalized as a search through a space of PIPE models, driven
1669: by the operationality criterion. The techniques presented in this paper can 
1670: also be used to analyze existing personalization facilities, by determining if 
1671: they address the requirements and opportunities of observed usage scenarios.
1672: 
1673: Our methodology of developing explanations of scenarios also adds value 
1674: to the overall effort of creating scenario-based descriptions
1675: of software and systems and is a further argument for adopting scenario-based
1676: design (SBD) methods. By adopting the EBG view of operationalization, and 
1677: for applications such as personalization, we can use a strong domain theory 
1678: to reason about how scenarios can be effectively supported. In particular, our 
1679: methodology helps to propositionalize information system designs.  
1680: 
1681: From the SBD viewpoint, PIPE emphasizes programmatic 
approaches to transforming between representations. We can extend our
1683: approach to investigate other opportunities for partial evaluation and
1684: also to include other program transformation techniques. 
1685: This will support the provision of views on scenarios, performing
1686: tree-manipulation operations, and propagating effects of changes through 
1687: a representation. For instance, the paradigm of 
1688: `training wheels in a user interface'~\cite{jack-wheels} which relies on 
1689: masking functionality can be expressed using such methods. 
1690: 
It should be remarked that PIPE and the explanation-based view of
operationalization are two {\it specific} choices that we have made in
1693: understanding the early stages in the lifecycle of personalization
1694: systems. Programs and partial evaluation serve the role of a
1695: modeling methodology and transformation technique for 
1696: information spaces; explanations supply the mechanism
1697: that connects needs and requirements identified from SBD to
1698: modeling choices in PIPE. While this is admittedly only one (and 
1699: to our knowledge, the first) approach, it provides a glimpse into how 
1700: other lifecycle models can be organized and how they will 
1701: differ from models for general software systems. 
1702: 
1703: This investigation suggests many interesting avenues for future research.
1704: It is good EBG tradition to identify novel ways in which explanations can
1705: be generalized and personalization is fertile with opportunities.
1706: For instance, PIPE models allow for out-of-turn 
1707: interactions (by partial evaluation). This helps overcome the mismatch between
1708: the user's mental model and the facilities available for describing the
1709: information seeking goal. In a non-PIPE implementation, the user has to
1710: manually reconcile this mismatch and perform an exploratory mode of seeking
1711: before being able to use the system to satisfy the goal. We would
1712: like to generalize such scenarios to recognize 
1713: when `exploratory steps have been used because out-of-turn interaction was 
1714: not possible.' For instance, this would allow us to explain that `Nancy
1715: clicked on all links at the top-level page, not because this is what she
1716: wanted but because she was exploring to see which one of them 
1717: led to her choice of link at the second-level.' 
1718: Another form of generalization pertains to the constructive induction of
1719: intermediate subgoals in explanation structures. Recall that we employed
1720: think-aloud protocols to augment our domain theory to support certain
1721: explanations. Automating the induction of subgoals for recurring patterns
1722: of interaction (such as manual information integration) is a possible
1723: direction for future work.
1724: 
1725: The concept of operationality can be explored more carefully in the context
1726: of personalization. Our study has exploited only two dimensions of 
1727: operationality, namely the number of explanation trees and the operationality
1728: boundaries in each. Once again EBG research~\cite{keller-op} suggests
1729: other important dimensions --- such as variability, granularity, and
1730: certainty --- which can be used to define an `operationality assessment
1731: procedure.' Studying these concepts for information systems will allow 
1732: the characterization of personalization applications in terms of 
1733: operationality dimensions. For instance, differences between news-feed
1734: customization services (e.g., {\tt myCNN.com}, PointCast) 
1735: can be expressed in terms of operationality. Such characterizations will also 
1736: aid 
1737: in clarifying the concept of utility (and usability) of personalization systems, since
1738: operationality is primarily concerned with empirical efficiency of models.
1739: 
1740: The notion of a personalization lifecycle can be usefully extended, to support
1741: iterative improvement of PIPE models and to include stages like verification 
1742: and validation. Support for iterative refinement is important in 
1743: extending and composing existing personalization systems. It requires a tighter integration
1744: between the explanation construction procedure and the way in which scenarios
1745: are selected for explanation. Iterative improvement can also benefit from 
1746: existing approaches to scenario repair and related EBG techniques such
as `learning by failing to explain'~\cite{hall-fail}.
1748: Methodologies for verification and validation can be incorporated 
1749: in our framework, in the form of analytic and empirical frameworks for
1750: utility analysis~\cite{dejong}. Such frameworks can take advantage of 
1751: the characterization of the space of PIPE models produced by explaining 
1752: scenarios and prior knowledge of the distribution of problem scenarios.
1753: 
1754: An emerging frontier involves modeling {\it
1755: context} in information systems.
1756: Consider:
1757: \begin{descit}
1758: \noindent
1759: {\bf Person:} Remember the hotel where we hosted the annual convention?\\
1760: {\bf Secretary:} Yes.\\
1761: {\bf Person:} Reserve it for next Friday's event.
1762: \end{descit}
1763: Creating a personalization system that exploits context
1764: amounts to storing
1765: and retrieving smaller (or partial) explanations for use in constructing
1766: larger-scope explanations.
1767: Scenarios that do not permit complete explanations 
1768: can also be interpreted as activities
1769: for building and organizing context, for use in later situations. 
1770: The explanation-based view of scenarios allows the decomposition of 
1771: structures to aid in such reasoning.
1772: 
1773: Our systems-oriented view of personalization will find greater acceptance
1774: if tools and software are available for automating various aspects of
1775: the methodology. For instance, scenario management tools and explanation
1776: engines can be prototyped for targeted information spaces. Specific
1777: techniques for web mining and modeling user interactions can be incorporated
1778: as reusable sub-explanations. This will allow us to design personalization
1779: systems around existing system infrastructure. To aid in the maintainability
of PIPE models, specific scenario libraries (called `chunking' in AI~\cite{chunking}) and
`frequently used explanations' can also be designed.
1781: 
1782: The central role played by the domain theory in our
1783: methodology signifies a back-to-basics
1784: approach in personalization system design. For a given information seeking
1785: activity, a domain theory is characterized by its `explanatory power' and
1786: how effectively it allows us to define the parameters of a personalization
1787: space. This suggests that we should aim for a more fundamental understanding
1788: of how domain theories characterize information spaces and the
1789: situations to which they can be usefully applied. Our methodology is the 
1790: first in which such questions can be directly expressed. Extending work
1791: in these directions will help us to architect an information resource
1792: for personalization and to provide rigorous metrics for evaluating the 
applicability of PIPE in a new situation. Together, these extensions will be
important steps toward establishing a lifecycle of personalization system design.
1795: 
1796: \section*{Acknowledgements}
1797: Saverio Perugini helped implement the study
1798: presented in the Appendix.
1799: Marcos Gon\c{c}alves identified several pertinent references. Robert
1800: Capra helped make connections from our work to
1801: mixed-initiative interaction and contextual abstractions. All three colleagues
1802: read drafts of this paper and provided important comments.
1803: 
1804: \bibliographystyle{plain}
1805: \bibliography{final}
1806: 
1807: \newpage
1808: \section{Appendix: An Example Application}
1809: \label{case}
1810: \begin{figure}
1811: \centering
1812: \begin{tabular}{cc}
1813: \mbox{\psfig{figure=pigments.epsi,width=5in}}
1814: \end{tabular}
1815: \caption{Original layout of the `Pigments through the Ages' website.}
1816: \label{pigmentspicture}
1817: \end{figure}
1818: 
1819: As a demonstrator of the ideas presented in this paper, we describe the
1820: personalization of the `Pigments through the Ages' website at
1821: [{\tt www.webexhibits.org/pigments}], a public service that uses
1822: pigment analysis catalogs to identify and reveal the palettes of
1823: painters in different eras and genres. As shown in the top-level interface
1824: depicted in Fig.~\ref{pigmentspicture}, a variety of information resources
1825: are modeled in this site. Users can search for paintings by artist, style,
1826: period, or by membership in a particular pigment group. Notice also that the
1827: interface in Fig.~\ref{pigmentspicture} provides some `hardwired' scenarios such
1828: as comparing palette similarity tables or analyzing pigment usage in a certain
1829: age. However, even a simple query such as `What is the influence of colors
1830: from the baroque era on the neo-classic styles of paintings?'
1831: cannot be accommodated without manual information integration because the
1832: interaction sequences are hardwired. 
1833: 
1834: \begin{figure}
1835: \centering
1836: \framebox{
1837: \begin{minipage}{2\colwidth}
1838: \small
1839: \begin{description}
1840: \item[Jeremy's scenario]: Jeremy is attempting to compare how colors from
1841: the baroque
1842: era were used in the neo-classic paintings. In particular, he is interested
1843: in usage graphs
1844: for pigments in neo-classic that are most similar to ones used in baroque.
1845: Jeremy surveys the existing facilities at
1846: the site and chooses `by Artist, Style, or Period' as the {\tt search
1847: method}. Next,
1848: he specifies `neo-classic baroque' in the text box for {\tt painting
1849: keywords.} He
1850: (correctly) reasons that this specification identifies the paintings to be
1851: used for analysis.
1852: Next, he chooses `All pigments' from the {\tt specify display} dropbox and
1853: selects
1854: `Usage graphs' as the {\tt analysis kind}. He (incorrectly) assumes that
1855: this will
1856: compare every painting from baroque with every painting from neo-classic
1857: and that the
system will present a usage
1859: graph for each such comparison.
1860: On inspecting the results Jeremy notices,
1861: instead, that the usage graphs are for {\it all} pigments used in the set
1862: \{neo-classic $\cup$ baroque\}, not quite
1863: what he had in mind. He wonders for 5 minutes and realizes that the site
1864: does not provide any direct
1865: interface to specify his form of analysis.
1866: 
1867: Jeremy pursues an alternative strategy. He is going to first find the
1868: common colors across
1869: baroque and neo-classic, and then determine their usage patterns in
1870: neo-classic.
1871: He opens an additional browser window. In the new one,
1872: he specifies `by Artist, Style, or Period' as the {\tt search method} and
1873: `neo-classic baroque' as the {\tt painting keywords.} He clicks on the
1874: palette similarity
1875: table checkbox and obtains a matrix of values that indicate how colors from
1876: one period were
1877: utilized in another. As he expected, this time the specialized interface
1878: interprets
1879: that the two groups he specified in {\tt painting keywords} have to be
1880: compared with each other.
1881: The results page provides a matrix whose entries denote similarity levels.
1882: He picks
1883: out the pigments corresponding to the highest similarity levels and shifts
1884: control to his old
1885: browser window. There, he types in these pigments in the {\tt pigment
1886: keywords} textbox and,
this time, types only `neo-classic' in the {\tt painting keywords} textbox.
1888: All other settings were
1889: as he left them (including the `Usage graphs' request). This time, the
1890: output screen provides a
1891: histogram of the usage of the baroque pigments in neo-classic, which
1892: satisfies Jeremy's
1893: information seeking goal.
1894: \end{description}
1895: \end{minipage}}
1896: \caption{A `Pigments through the Ages' scenario whose explanation was
1897: subsequently operationalized.}
1898: \label{exam-scenario}
1899: \end{figure}
1900: 
1901: \vspace{-0.1in}
1902: \subsubsection*{Problem Scenario Development}
1903: A group of 10 participants were identified and instructed to explore the
1904: layout and organization of information at this site. After a 
1905: period of acquainting themselves with the site, they were asked to 
1906: identify one specific query (or analysis) and use
the facilities at the site to answer their query. The exact interaction
sequences (including clicked hyperlinks and manual information integration) were
1909: recorded for all the participants. One such scenario is described in
1910: Fig.~\ref{exam-scenario}.
1911: 
1912: \vspace{-0.1in}
1913: \subsubsection*{Domain Theory}
1914: The domain theory for this application was obtained from three sources.
1915: The first was an explicit crawl of the site that outlined how interactions
1916: result in specification of information seeking attributes. The second was
1917: a `Background' webpage at [{\tt http://\hskip0ex webexhibits.\hskip0ex org/\hskip0ex pigments/\hskip0ex intro/\hskip0ex index.\hskip0ex html}]
1918: that outlined a schema for how the website should
1919: be used and how to browse through the various sections. For instance,
1920: one mode of operation suggested at the site was to choose the `Usage Research'
1921: category and see which pigments were used in different paintings. Another
1922: mode of operation was to jump to a particular pigment page and then
1923: browse through categories of information outlining technical details. 
1924: All of these forms of navigation at the site were modeled as possibilities
for satisfying the top-level personalization goal. The third source
was an analysis of user interactions that revealed opportunities for
1927: information integration across multiple pages of the site. Once again these
1928: were modeled as specific possibilities of instantiating
1929: the top-level personalization
1930: goal. The domain theory was 
1931: represented in CLIPS but only certain portions of the theory were materialized
when conducting explanations. We ensured that all 10 scenarios could
1933: be explained by the domain theory.
1934: 
1935: \vspace{-0.1in}
1936: \subsubsection*{Constructing and Analyzing Explanations} 
1937: Explanations of user interactions revealed that starting from
1938: either artists, paintings, or eras, the users systematically browsed
1939: through subcategories or compared palettes to arrive at the relevant
1940: pigments (used by that artist, in the painting, or in that era, 
1941: respectively). Furthermore, all pigments share common modes of 
1942: information seeking, such as browsing through their history of use, procedures
1943: for preparation, and technical details of their chemical composition. 
1944: 
1945: \vspace{-0.1in}
1946: \subsubsection*{Operationalization}
1947: We hence operationalized the explanation structure(s) as two function
1948: invocations in sequence, the first to determine an appropriate pigment
1949: category, and the second to browse through the entries in that category by
1950: various means. 
1951: %The latter web source could be factored out and
1952: %procedurally invoked at any point in the program where the name of the
1953: %pigment has been resolved by other means. 
1954: %
1955: We thus arrived at a single
structure in support of all 10 scenarios.
1957: The factorization implied by the structure permits the following analysis:
1958: \begin{descit}{}
1959: For the pigment categories defined by $X$, provide the details involving $Y$.
1960: \end{descit}
1961: $X$ denotes information such as a genre, a style, a painting, or
1962: a particular artist. $Y$ denotes features of pigments such as usage history,
1963: chemical composition, and dyeing processes.
Each of $X$ and $Y$ could be defined either directly or
in terms of attributes of other entities that relate to them. For instance,
1966: $X$ could be a painting keyword such as `Rembrandt' (which means that we
1967: are interested in pigments used by Rembrandt) or it could be the result of
1968: the palette similarity function applied on two painting styles (which would
1969: mean that we are interested in pigments that satisfy some acceptable 
1970: threshold for similarity). In addition, there are dependencies among the 
1971: allowed entries in $X$ and $Y$. 
1972: %Since PIPE supports partial information, the above query
1973: %can be invoked without any knowledge of $X$; this would just mean that
1974: %a particular type of information is requested of all pigment categories
1975: %(and hence, pigments).
1976: %The operationalization implied by the above analysis indicated
1977: %to us that we include ways to infer attributes of parent entities, given
1978: %properties of sub-entities. For instance, the `bone black' category
1979: %indicates that both the {\tt bone\_black} and {\tt black} program
1980: %variables should be set. (This was possible only in structures that
1981: %involved subtrees.)
1982: Jeremy's scenario
1983: satisfies the above template where
1984: $X$ denotes the result of applying the palette similarity function to
1985: `neo-classic baroque' and $Y$ denotes `usage graphs.'
1986: %The above template for personalization
1987: %is thus not currently supported by the facility, automatically.
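
As a rough sketch in our own notation (the function names $\mathit{category}$,
$\mathit{details}$, and $\mathit{similarity}$ are illustrative and are not
identifiers taken from the implementation), the template can be read as a
composition of the two invocations:
\[
\mathit{result}(X, Y) \;=\; \mathit{details}\bigl(\mathit{category}(X),\, Y\bigr).
\]
Under this reading, Jeremy's scenario would correspond to
\[
\mathit{details}\bigl(\mathit{category}(\mathit{similarity}(\mbox{baroque}, \mbox{neo-classic})),\; \mbox{usage graphs}\bigr),
\]
where $\mathit{similarity}$ stands for the palette similarity function.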
1988: 
1989: \vspace{-0.1in}
1990: \subsubsection*{Representation in PIPE}
1991: To support the user in defining $X$ and $Y$, we modeled various information
1992: sources such as
the catalog contents (which contain paintings from 950 to 1981),
1994: the palette similarity table (which is just a function in our program),
1995: citations of paintings, and auxiliary information such
1996: as images, histories, where the painting is housed, and other legends.
1997: Overlaps of painting styles across different periods contribute to sources 
1998: of semistructure in this site and a corresponding reduction in the 
1999: composite program size. Our composite
2000: program was represented in 2369 lines of C code involving hundreds of program
2001: variables that could be turned on or off with user input. Approaches
2002: for modeling the various elements in this study are 
2003: described in~\cite{pipe-tois}. We did not implement the mappings from the
2004: (specialized) program back to the information space because we only wanted
2005: to evaluate the effectiveness of our modeling (more on this below).
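
For reference, the specialization step can be summarized by the standard
defining equation of partial evaluation. The symbols $P$, $s$, $d$, and $P_s$
are notation assumed for this sketch and are not names from the implementation.
If the composite program $P$ takes the user's known choices $s$ and the
remaining choices $d$ as input, a partial evaluator produces a specialized
program $P_s$ such that
\[
P_s(d) \;=\; P(s, d) \quad \mbox{for all } d.
\]
Turning a program variable on or off fixes part of $s$; the variables left
unspecified remain as inputs $d$ to the specialized program.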
2006: 
2007: \vspace{-0.1in}
2008: \subsubsection*{Evaluation}
2009: The evaluation of systems designed with PIPE is an interesting issue
2010: in itself; we address this topic in greater detail in~\cite{pipe-tois}.
While user satisfaction surveys show convincing results (see, for
instance,~\cite{naren-ic}), PIPE is more a modeling methodology for
2013: personalization, and not a system per se. As such, its effectiveness
2014: depends on what is modeled (and how). The research presented in this
2015: paper gives us a direct way to assess the modeling capability of PIPE.
2016: 
2017: We identified a test group of 15 users (different from those
2018: who participated in the original scenario analysis)
2019: and asked them to experiment with
2020: the unpersonalized pigments site. Each of them
was then asked to identify and carefully describe 2--3 personalization
2022: scenarios. In total, 35 scenarios were identified.
2023: An example is the following analysis:
2024: \begin{descit}{}
2025: What are the symbolic connotations of pigments used by artists in the
2026: Renaissance era?
2027: \end{descit}
2028: (One of the answers to this query is a web page that describes the
2029: interpretation of red as invincibility
2030: in Jan van Eyck's 1434 classic {\it Arnolfini Wedding}.)
2031: We then evaluated our PIPE representation by the fraction
2032: of scenarios that can be described in our modeling (and are hence
2033: amenable to personalization by partial evaluation). 
2034: All scenarios
2035: except two passed our test. The two unmodelable scenarios involved
2036: the `Orpiment' pigment which was listed in both
2037: the `Yellow' and `Orange' categories and was variously referred to by users
2038: as belonging to one, but not the other.
2039: This ambiguity implies that our modeling did not contain
2040: sufficient information to complete the proof (i.e., it
2041: could not uniquely distinguish between these two distinct specifications
2042: involving $X$). More contextual information needs to be encoded in our
2043: modeling so that this ambiguity is resolved. 
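
In numerical terms, and using only the counts reported above, the modeling
covers
\[
\frac{35 - 2}{35} \;=\; \frac{33}{35} \;\approx\; 0.94
\]
of the test scenarios, i.e., about 94\%.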
2044: 
2045: A full listing of the scenarios used
in evaluation follows. Except for scenarios~\ref{orp1} and~\ref{orp2},
all others can be
supported. Scenarios~\ref{first1},~\ref{orp2},~\ref{pref2},~\ref{pref3},
and~\ref{pref4} indicate preferences for presentation which
2050: can be addressed
2051: when we recreate the personalized pages from the specialized program.
2052: Scenario~\ref{strange} states preferences
2053: for interactions at many levels. 
2054: This amounts to repeated partial
2055: evaluations of the information
2056: space, in the order of attributes stated by that user.
Scenarios such as scenario~27 imply a
2058: desire to use complete evaluation with the designed PIPE model, not
2059: partial evaluation. 
2060: \begin{enumerate}
2061: \item 
2062: \label{first1}
2063: What are the symbolic connotations of pigments used by artists in the
2064: Renaissance era? Arrange the results on a single page,
2065: in alphabetical order of pigments.
2066: \item (similar to Jeremy's scenario) I would like to see how colors used in
2067: 1800-1900 have influenced paintings in the early part of the 20th
2068: century. Show usage graphs for pigments that are similar across these
2069: eras.
2070: \item I would like to specify a pigment choice, not based on
2071: a property of
2072: painting, style, or era. Rather, I want all pigments for which descriptions of
2073: chemical composition are available. So, if a pigment does not have this
2074: information,
2075: it should not be listed.
2076: \item I would like to browse through pigment details at the root page, not
2077: go through
2078: information about paintings or painters.
2079: \item \label{orp1} I
2080: am interested to see how usage of pigments of a subcategory compares with that
2081: of pigments in the parent category. For instance, does Orpiment usage
2082: correlate with
2083: usage of Yellow in 1800-1900 paintings?
2084: \item The site facility lists only at most 10 palettes at a time. If I need
2085: more, I have to
2086: carefully pose multiple subqueries so that each of them does not involve
2087: more than
2088: 10 paintings. Can you fix this problem so that I can see all palettes?
2089: \item My period specifications don't seem to work at the site. When I manually
2090: browse the site, I see annotations such as `1900-2000' and `1650-1750.' But
2091: when
2092: I pose my range as `1875-1925,' the site doesn't seem to understand.
2093: Should I have to break my range up into these prespecified ranges? That seems
2094: cumbersome.
2095: \item \label{orp2}
2096: Can I get a listing of pigments used by the Impressionists
2097: along with their parent categories, side-by-side on a single page?
2098: \item I would like the pigments arranged by history (e.g., middle ages),
2099: followed
2100: by an organization along countries.
2101: \item I would like to be able to
2102: choose a pigment according to ease of preparation.
2103: \item I would like to search for pigments using a combination of
2104: two criteria (such as geographical use and time period), but the site
2105: allows only
2106: one at a time.
2107: \item I would like the interaction to proceed as follows: At the first level,
2108: I will make a choice of history of usage, after which I will browse the site in
2109: the traditional manner.
2110: \item \label{pref2}
2111: I would like pigments of the Blue category to be identified in alphabetical
2112: order.
2113: Then I would like to see the swatches of paints from the top five to be placed
2114: alongside swatches of paint from the bottom five. This will show me the
2115: range of intensities
2116: of Blue available.
2117: \item I would like a listing of pigments by their chemical name, not their
2118: colloquial names.
2119: \item I find myself repeatedly browsing pigments' pages to study the
2120: fascinating stories of
2121: how these pigments originate. Some of these pages don't seem to have any
2122: stories. Can you
2123: provide me a listing of only those pigments that have stories?
2124: \item Which are the pigments that have German names or equivalents?
2125: \item I would like to see the descriptions of pigments that have pictures in
2126: them. I am not interested in purely textual descriptions.
2127: \item I would like to directly select a specific pigment from a list, on the
2128: first page.
2129: \item Which pigments have been used by Alchemists?
2130: \item \label{pref3}
2131: Can you cascade the brief descriptions of pigments and remove all the other
2132: information
2133: pertaining to preparation and technical details?
2134: \item Which pigments have citations to them? I would like a listing of only
2135: those.
2136: \item \label{pref4}
2137: I am interested in the Green earth pigment.
2138: I would like a page that has pictures of paintings and along with each, a
2139: picture of the
2140: swatch of paint from Green earth. This is just so that it is visually easy
2141: to see how
2142: much the painting emphasizes Green earth.
2143: \item I would like pigments arranged by the year in which they were first
2144: introduced.
2145: \item I am interested in making pigments.
2146: Can you please instruct me how to make every
2147: pigment in the purple category?
2148: \item \label{strange} I am interested in the citation lists for green pigments.
2149: However, I would like to browse them by first making a selection of artist.
2150: Then I will
2151: select a period. And finally I will select among titles, if there are
2152: choices. For the green pigments
2153: used in these titles, I would like to see the citations.
2154: \item How is the name for the Azurite pigment derived? What is the word origin?
2155: \item Produce the 3D model for Titanium dioxide.
2156: \item I know that Kandinsky has suggested that black indicates an inner
2157: harmony of silence.
2158: Can I see which forms of blacks were used in his paintings?
2159: \item Can you give information about how pigments are used for body art,
2160: tattoos
2161: and other non-conventional forms of paintings?
2162: \item What are the time periods when Lemon Yellow was used?
2163: \item For paintings by Monet, can you display the top five most frequent
2164: pigments?
2165: \item What forms of white pigments have been used in paintings? Which ages were
2166: they introduced in?
2167: \item Give a histogram of how chrome yellow has been used over the times.
2168: \item Arrange histograms of all purple pigments used after the 17th century.
2169: \item How do pigments used by Picasso compare in usage with pigments used
2170: in the 1920s,
2171: in general?
2172: \end{enumerate}
2173: \end{document}
2174: 
2175: