cs0404025/main.tex
1: \documentclass[10pt, a4paper]{article}
2: \usepackage{lrec2000}
3: \usepackage{graphicx}
4: 
5: \title{Test Collections for Patent-to-Patent Retrieval and \\
6:   Patent Map Generation in NTCIR-4 Workshop}
7: 
8: \name{Atsushi Fujii$^{\ast}$, Makoto Iwayama$^{\dagger}$, Noriko Kando$^{\ddagger}$}
9: 
10: \address{$^{\ast}$Institute of Library and Information Science \\
11: University of Tsukuba \\
12: 1-2 Kasuga, Tsukuba, 305-8550, Japan \\
13: fujii@slis.tsukuba.ac.jp \\ \\
14: $^{\dagger}$Hitachi, Ltd. \\
15: 1-280 Higashi-Kougakubo, Kokubunji, 185-8601, Japan \\
16: iwayama@crl.hitachi.co.jp \\ \\
17: $^{\ddagger}$
18: National Institute of Informatics \\
19: 2-1-2 Hitotsubashi, Chiyoda-ku, 101-8430, Japan \\
20: kando@nii.ac.jp}
21: 
22: \abstract{This paper describes the Patent Retrieval Task in the Fourth
23:   NTCIR Workshop, and the test collections produced in this task. We
24:   perform the invalidity search task, in which each participant group
25:   searches a patent collection for the patents that can invalidate the
26:   demand in an existing claim. We also perform the automatic patent
27:   map generation task, in which the patents associated with a specific
28:   topic are organized in a multi-dimensional matrix.}
29: 
30: \begin{document}
31: 
32: \maketitleabstract
33: 
34: \section{Introduction}
35: \label{sec:introduction}
36: 
37: In the Third NTCIR Workshop (NTCIR-3), which is a TREC-style
38: evaluation forum for research and development on information retrieval
39: and natural language processing, the authors of this paper organized
40: the Patent Retrieval
41: Task~\cite{iwayama:sigir-2003,iwayama:ntcir-2003}. This was the first
42: serious effort to produce a test collection for evaluating patent
43: retrieval systems.
44: 
45: The process of patent retrieval differs significantly depending on the
46: purpose of retrieval. In the NTCIR-3 Workshop, the ``technology survey''
47: task was performed, in which patents are regarded as technical
48: publications rather than legal documents.
49: In practice, given a query, which was a clipping of a newspaper
50: article related to a specific technology, two years of patent
51: publications were searched for the documents relevant to the query.
52: Search topics were in five languages: the same content in Japanese,
53: English, Korean, and traditional and simplified Chinese was used to
54: perform cross-language retrieval.
55: 
56: Given the success of the NTCIR-3 Workshop, the authors are also
57: organizing the Patent Retrieval Task in the NTCIR-4 Workshop, which is
58: held from January 2003 to June 2004. However, unlike in NTCIR-3, we are
59: focusing on the ``invalidity search'' and ``patent map generation''
60: tasks.  This paper describes the test collections for both tasks.
61: 
62: Because the NTCIR-4 Workshop runs for only one and a half years, it is
63: difficult to explore long-term research topics, such as the patent map
64: generation task.  Thus, while we perform the invalidity search task,
65: which resembles the traditional ad-hoc IR task, as the main task, we
66: perform the patent map generation task as a feasibility study, for
67: which no quantitative evaluation is conducted.
68: 
69: \section{Invalidity Search Task}
70: 
71: \subsection{Overview}
72: 
73: The purpose of invalidity search is to find the patents that can
74: invalidate the demand in an existing claim. This is an associative
75: patent (patent-to-patent) retrieval task. In the real world, invalidity
76: search is usually performed by examiners in a government patent office
77: and searchers of the intellectual property division in private
78: companies.
79: 
80: The task was performed as follows. First, the task organizers (i.e.,
81: the authors of this paper) provided each participant group with the
82: document collection and search topics.
83: 
84: Second, each group submitted the retrieval results obtained for the
85: topics.  In a single retrieval result, the top 1000 retrieved
86: documents must be sorted by the relevance score. However, because
87: patent documents are long, it is effective to indicate the important
88: passages (i.e., fragments) in a relevant document. Thus, for each
89: retrieved document, all passages in the document must be sorted by the
90: extent to which each provides grounds for judging the document relevant.
91: 
92: Third, human experts performed relevance judgment for the submitted
93: results and produced a list of relevant documents and passages, on a
94: topic-by-topic basis. Finally, the list was used to evaluate each
95: submitted result.
96: 
97: In the dry run, which was performed from June to September in 2003,
98: seven topics were produced and used for a preliminary evaluation.  In
99: the formal run, 103 search topics were produced and the evaluation
100: results for each group will be released at the workshop final meeting
101: in June 2004.  The analysis of the formal run results has not been
102: completed and is beyond the scope of this paper.
103: 
104: After the workshop final meeting, we will complete a test collection
105: consisting of the search topics, the document collection, and the
106: relevance judgments for each topic.
107: 
108: \subsection{Document Sets}
109: 
110: The document set used as a target collection consists of five years of
111: unexamined Japanese patent applications published in 1993--1997. The
112: file size and number of documents are approximately 40GB and 1.7M,
113: respectively.
114: 
115: For the sake of passage-based evaluation, the passages in each
116: document were standardized. In Japanese patent applications,
117: paragraphs are identified and annotated with the specific tags by
118: applicants. We used these paragraphs as passages, and therefore the
119: passage identification process was fully automated.
120: 
121: The English patent abstracts, which are human translations of the
122: Japanese Patent Abstracts published in 1993--1997, were also provided
123: for training English-to-Japanese cross-language IR systems.
124: 
125: \subsection{Search Topics}
126: 
127: A search topic is a Japanese patent application rejected by the
128: Japanese Patent Office.  For each topic patent, one or more citations
129: were identified by examiners to invalidate the demand in the topic
130: patent. If these citations are included in our document collection,
131: they can be used as relevant documents for the topic.
132: 
133: We asked 12 members of the Intellectual Property Information Search
134: Committee in the Japan Intellectual Property Association (JIPA) to
135: produce seven topics for the dry run and 34 topics for the formal run.
136: Each JIPA member belongs to the intellectual property division in the
137: company he or she works for, and they are all experts in patent
138: searching. The JIPA members also performed relevance judgments to
139: enlarge the set of relevant documents.
140: 
141: A search topic file includes a number of additional SGML-style tags.
142: The claim as a target of invalidation is specified by \verb|<CLAIM>|.
143: 
144: A claim usually consists of multiple components (e.g., parts of a
145: machine and substances of a chemical compound) and relevance judgment
146: is performed on a component-by-component basis in real-world cases. To
147: simulate this scenario, human experts annotate each component with
148: \verb|<COMP>|.
149: 
150: To invalidate an invention in a topic patent, relevant documents must
151: be the ``prior art'', which had been open to the public before the
152: topic patent was filed. Thus, the date of filing is specified by
153: \verb|<FDATE>| and only the patents published before the topic was
154: filed can potentially be relevant.
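
The prior-art constraint can be sketched as a simple date filter. The following Python fragment is illustrative only: the function name \verb|prior_art_candidates|, the document identifiers, and the dates are hypothetical, and no assumption is made about how \verb|<FDATE>| is actually encoded in the topic files.

```python
from datetime import date


def prior_art_candidates(documents, filing_date):
    """Keep the documents published strictly before the topic's filing date.

    `documents` is an iterable of (doc_id, publication_date) pairs.
    """
    return [doc_id for doc_id, published in documents if published < filing_date]


# Hypothetical identifiers and dates, used only to exercise the filter.
docs = [
    ("doc-1", date(1993, 5, 11)),
    ("doc-2", date(1996, 2, 20)),
    ("doc-3", date(1994, 8, 2)),
]
candidates = prior_art_candidates(docs, filing_date=date(1994, 8, 2))
print(candidates)
```

A document published on the filing date itself is excluded here, because only patents published before the topic was filed can be prior art.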
155: 
156: To perform cross-language retrieval, the claims translated into
157: English and simplified Chinese are also used. Thus, the topic language
158: is specified by \verb|<LANG>|.  However, the translated claims do not
159: maintain the order of phrases and sentences in Japanese claims and
160: thus do not include the \verb|<COMP>| tags.  Figure~\ref{fig:topic}
161: shows an example topic claim translated into English.
162: 
163: \begin{figure}[htbp]
164:   \begin{center}
165:     \leavevmode
166:     \small
167:     \begin{quote}
168:       \verb|<TOPIC>|\\
169:       \verb|<NUM>008</NUM>| \\
170:       \verb|<CLAIM>|(Claim 1) A sensor device, characterized in that
171:       an open recessed part is formed on a box-shaped forming base, a
172:       conductive film of a designated pattern is formed on the surface
173:       of the forming base including the inner surface of the recessed
174:       part, an element for a sensor is bonded to the recessed part,
175:       and the forming base is closed with a cover.\verb|</CLAIM>| \\
176:       \verb|</TOPIC>|
177:     \end{quote}
178:     \caption{The claim in an English search topic (008).}
179:     \label{fig:topic}
180:   \end{center}
181: \end{figure}
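
As a minimal sketch of how such a topic might be read by a participating system, the following Python fragment extracts the tagged fields with regular expressions. The function name \verb|parse_topic| is hypothetical, the sample text is copied from Figure~\ref{fig:topic}, and real topic files may contain further tags or a different layout.

```python
import re


def parse_topic(text):
    """Extract the SGML-style fields of one topic into a dictionary.

    Only tags that actually occur in `text` appear in the result;
    whitespace inside a field is collapsed to single spaces.
    """
    fields = {}
    for tag in ("NUM", "LANG", "FDATE", "CLAIM"):
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
        if match:
            fields[tag] = " ".join(match.group(1).split())
    return fields


# Sample text taken from the English topic 008 shown in the figure.
sample = """<TOPIC>
<NUM>008</NUM>
<CLAIM>(Claim 1) A sensor device, characterized in that
an open recessed part is formed on a box-shaped forming base, a
conductive film of a designated pattern is formed on the surface
of the forming base including the inner surface of the recessed
part, an element for a sensor is bonded to the recessed part,
and the forming base is closed with a cover.</CLAIM>
</TOPIC>"""

fields = parse_topic(sample)
print(fields["NUM"])
print(fields["CLAIM"][:40])
```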
182: 
183: Through a preliminary study in collaboration with JIPA, we found that
184: for invalidity search the number of relevant documents for a single
185: topic is small, compared with existing IR test collections.
186: Consequently, the evaluation results obtained with our collection can
187: potentially be unstable.
188: 
189: The same problem has been identified in the question answering task,
190: where hundreds of questions are usually used to resolve
191: it~\cite{voorhees:sigir-2000}.
192: 
193: To increase the number of topics at a limited cost, we produced an
194: additional 69 topics for which only the citations provided by the
195: Japanese Patent Office were used as the relevant documents. However,
196: because the validity of each rejection was verified manually, the
197: process of producing the additional topics was not fully automated.
198: 
199: \subsection{Submissions}
200: 
201: Each group was allowed to submit one or more retrieval results, in
202: which at least one result must be obtained using only the
203: \verb|<CLAIM>| and \verb|<FDATE>| fields. For the remaining results,
204: any information in a topic file, such as the International Patent
205: Classification (IPC) codes, can be used.
206: 
207: The results of the dry run showed that for specific topics, an
208: IPC-based system successfully retrieved relevant patents that could not
209: be retrieved by the text-based systems.
210: 
211: \subsection{Relevance Judgments}
212: 
213: The relevance degree of a document with respect to a topic is
214: determined on the basis of the relevance degrees of the document with
215: respect to each component in the topic. Relevance judgment for patents
216: is performed based on the following two ranks:
217: \begin{itemize}
218: \item patent that can invalidate a topic claim (A)
219: \item patent that can invalidate a topic claim, when used with other
220:   patents (B)
221: \end{itemize}
222: The documents that can invalidate the demands of all essential
223: components in a target claim were judged as ``A''. The documents that
224: can invalidate demands of most of the essential components in a target
225: claim (but not all essential components) were judged as ``B''.
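
The two ranks can be illustrated by the following Python sketch, which derives a document-level grade from component-level judgments. In the task itself the grades were assigned by human experts; the function name \verb|grade_document| and the numeric reading of ``most'' as more than half of the essential components are assumptions made only for this illustration.

```python
def grade_document(invalidated, essential, most_threshold=0.5):
    """Return "A", "B", or None for one document and one topic claim.

    `essential` is the set of essential components of the claim and
    `invalidated` the set of components the document can invalidate.
    `most_threshold` is an assumed cut-off for "most" components.
    """
    essential = set(essential)
    if not essential:
        raise ValueError("a claim needs at least one essential component")
    covered = essential & set(invalidated)
    if covered == essential:
        return "A"  # all essential components are invalidated
    if len(covered) / len(essential) > most_threshold:
        return "B"  # most, but not all, essential components
    return None     # not relevant under this sketch


components = {"c1", "c2", "c3"}  # hypothetical component labels
print(grade_document({"c1", "c2", "c3"}, components))
print(grade_document({"c1", "c2"}, components))
print(grade_document({"c1"}, components))
```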
226: 
227: For the main 34 topics, to identify relevant documents exhaustively,
228: the pooling method and manual search were used. The human experts who
229: produced the topics performed manual searches to collect as many
230: relevant patents as possible. The experts were allowed to use any
231: systems and resources, so that we were able to obtain a patent
232: document set retrieved under the circumstances of their daily patent
233: searching. The citations provided by the Japanese Patent Office were
234: also used as the relevant documents.
235: 
236: For the 34 topics, the resultant numbers of A and B documents were 159
237: and 185, respectively.  We analyzed details of the number of relevant
238: documents obtained by the different sources. In
239: Figure~\ref{fig:diagram}, ``C'', ``J'', and ``S'' denote the sets of
240: relevant documents (A and B) obtained from the citations, the manual
241: searches by the JIPA members, and the 30 systems that participated in
242: the pooling, respectively.
243: 
244: It should be noted that because the JIPA members collected the
245: citations before the manual search, $|$C $\cap$ J$|$ is always zero.
246: As the figure shows, each source was independently effective in
247: collecting the relevant documents. While $|$C$|$ and $|$J$|$ were almost
248: equivalent, $|$S$|$ was comparable with \mbox{$|$C $\cup$ J$|$}.
249: 
250: \begin{figure}[htbp]
251:   \bigskip
252:   \begin{center}
253:     \leavevmode
254:     \includegraphics[height=1.5in]{diagram.eps}
255:     \caption{Details of the number of relevant documents.}
256:     \label{fig:diagram}
257:   \end{center}
258: \end{figure}
259: 
260: The evaluation score is fundamentally determined by the conventional
261: mean average precision. At the same time, each group is encouraged to
262: propose new evaluation measures effective for patent IR systems.
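
For reference, the standard definition of mean average precision can be written as follows. The notation is introduced here only for illustration and is not taken from the task guidelines: $T$ is the set of topics, $R_t$ is the number of relevant documents for topic $t$, $P_t(k)$ is the precision of the top $k$ documents retrieved for $t$, and $\mathrm{rel}_t(k)$ is 1 if the document at rank $k$ is relevant and 0 otherwise. The upper limit of 1000 reflects the number of documents in a single retrieval result.
\[
  \mathrm{AP}(t) = \frac{1}{R_t} \sum_{k=1}^{1000} P_t(k)\,\mathrm{rel}_t(k),
  \qquad
  \mathrm{MAP} = \frac{1}{|T|} \sum_{t \in T} \mathrm{AP}(t)
\]
Whether only A documents or both A and B documents are counted as relevant is a parameter of the evaluation and is left open in this formulation.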
263: 
264: In addition to the conventional document-based evaluation, we also
265: explore the passage-based evaluation. Relevant passages were
266: determined based on the following criteria:
267: \begin{itemize}
268: \item If a single passage can be grounds to judge the document in
269:   question as relevant (either A or B), this passage is judged as
270:   relevant.
271: \item If a ``group'' of passages can be grounds to judge the document
272:   in question as relevant, this passage group is judged as relevant.
273: \end{itemize}
274: The experts exhaustively identified all relevant passages and passage
275: groups.
276: 
277: It should be noted that a relevant passage group is as informative
278: as a single relevant passage. In other words, we newly
279: introduce the concept of ``combinational relevance''.
280: 
281: This feature provides a salient contrast to the conventional IR
282: evaluation method, in which all relevant passages or documents are
283: independently important and thus combinations of partially relevant
284: documents are not considered.
285: 
286: The evaluation score for each system is determined by the number of
287: passages that would have to be searched until a user obtains
288: sufficient grounds to judge the document as relevant.
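
One possible reading of this measure is sketched below in Python. It is an interpretation under stated assumptions rather than the official scoring procedure: the function name \verb|passages_to_read| and the passage identifiers are hypothetical, a single relevant passage is represented as a group of size one, and the score is taken to be the smallest rank at which every member of at least one relevant group has been seen.

```python
def passages_to_read(ranked_passages, relevant_groups):
    """Return the number of passages a user reads, in ranked order,
    before holding every passage of at least one relevant group.

    Returns None if no relevant group is fully contained in the ranking.
    """
    position = {p: rank for rank, p in enumerate(ranked_passages, start=1)}
    best = None
    for group in relevant_groups:
        if group and all(p in position for p in group):
            # The group is complete once its lowest-ranked member is read.
            depth = max(position[p] for p in group)
            if best is None or depth < best:
                best = depth
    return best


# Hypothetical passage identifiers for one retrieved document.
ranking = ["p4", "p1", "p7", "p2"]
groups = [{"p2"}, {"p1", "p7"}]  # one single passage and one passage group
score = passages_to_read(ranking, groups)
print(score)
```

In this example the group of two passages is completed at rank 3, before the single relevant passage at rank 4, which reflects the idea that a relevant passage group is treated like a single relevant passage.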
289: 
290: \section{Patent Map Generation Task}
291: 
292: In principle, the purpose of the patent map generation task is to
293: generate a patent map driven by a specific theme, such as automobiles,
294: by a (semi-)automatic method. This can be seen as a text mining task.
295: 
296: In practice, the organizers provided participants with the patent
297: documents retrieved for a specific topic, and participants were
298: requested to organize those documents in a two-dimensional matrix.
299: The x and y axes can vary depending on the topic, but they are usually
300: ``problems to be solved'' and ``solutions'', respectively.
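
The target structure can be sketched in Python as a mapping from matrix cells to lists of patents. The fragment covers only the final organization step and assumes that each patent has already been labelled with a problem and a solution, which is the hard part of the task. The function name \verb|build_patent_map|, the patent identifiers, and the axis labels are hypothetical.

```python
from collections import defaultdict


def build_patent_map(assignments):
    """Group patents into the cells of a two-dimensional patent map.

    `assignments` is an iterable of (patent_id, problem, solution)
    triples; the result maps each (problem, solution) cell to the
    list of patents placed in it, in input order.
    """
    cells = defaultdict(list)
    for patent_id, problem, solution in assignments:
        cells[(problem, solution)].append(patent_id)
    return dict(cells)


# Hypothetical labels; in the task they depend on the topic.
labelled = [
    ("P1", "noise", "filter"),
    ("P2", "noise", "filter"),
    ("P3", "cost", "filter"),
]
patent_map = build_patent_map(labelled)
print(patent_map)
```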
301: 
302: To produce the topics and documents, we used the test collection
303: produced for the NTCIR-3 Patent Retrieval Task. We selected six search
304: topics for which more than 100 relevant documents were identified.
305: The NTCIR-3 collection includes the following three document sets:
306: \begin{itemize}
307: \item two years' worth of unexamined Japanese patent applications
308:   published in 1998 and 1999,
309: \item Japanese abstracts, the JAPIO Patent Abstracts, which are
310:   human-edited abstracts for the above applications,
311: \item English abstracts, the Patent Abstracts of Japan (PAJ), which
312:   are human translations of the JAPIO Patent Abstracts.
313: \end{itemize}
314: Any document set can be used for patent map generation purposes.
315: Because the search topics are available in each of the five languages (see
316: Section~\ref{sec:introduction}), cross-language patent map generation
317: can also be performed.
318: 
319: However, the patent map generation task is a feasibility study, and
320: thus human experts evaluated the submitted maps subjectively.
321: 
322: \section{Conclusion}
323: 
324: We built test collections for the patent-to-patent invalidity search
325: and automatic patent map generation tasks in the NTCIR-4 Workshop.  After
326: the NTCIR-4 final meeting, the test collection will be available to
327: the public for research
328: purposes\footnote{http://www.slis.tsukuba.ac.jp/\~{}fujii/ntcir4/cfp-en.html}.
329: 
330: The test collections can directly be used for the following research
331: purposes:
332: \begin{itemize}
333: \item retrieval of very long semi-structured documents,
334: \item associative document retrieval,
335: \item passage retrieval,
336: \item evaluation of retrieval systems on the basis of combinational
337:   relevance,
338: \item classification and text mining.
339: \end{itemize}
340: 
341: Future work would include exploiting patent documents in different
342: applications, as follows:
343: \begin{itemize}
344: \item term recognition
345:   
346:   patent documents are associated with inventions and thus include a
347:   large number of new and technical terms.
348: 
349: \item sub-language studies
350:   
351:   claims in patent applications are written in a controlled language.
352: 
353: \item machine translation and cross-language retrieval
354:   
355:   inventions filed in multiple languages (i.e., patent families) can
356:   be used to extract translation lexicons.
357: 
358: \end{itemize}
359: 
360: \section{Acknowledgments}
361: 
362: The authors would like to thank the Japan Intellectual Property
363: Association for their support in the NTCIR-4 Patent Retrieval Task.
364: 
365: \bibliographystyle{lrec2000}
366: \begin{thebibliography}{3}
367: \expandafter\ifx\csname natexlab\endcsname\relax\def\natexlab#1{#1}\fi
368: 
369: \bibitem[Iwayama et~al., 2003{\natexlab{a}}]{iwayama:sigir-2003}
370: Iwayama, Makoto, Atsushi Fujii, Noriko Kando, and Yuzo Marukawa,
371:   2003{\natexlab{a}}.
372: \newblock An empirical study on retrieval models for different document genres:
373:   Patents and newspaper articles.
374: \newblock In {\em Proceedings of the 26th Annual International ACM SIGIR
375:   Conference on Research and Development in Information Retrieval\/}.
376: 
377: \bibitem[Iwayama et~al., 2003{\natexlab{b}}]{iwayama:ntcir-2003}
378: Iwayama, Makoto, Atsushi Fujii, Noriko Kando, and Akihiko Takano,
379:   2003{\natexlab{b}}.
380: \newblock Overview of patent retrieval task at {NTCIR}-3.
381: \newblock In {\em Proceedings of the Third NTCIR Workshop on Research in
382:   Information Retrieval, Automatic Text Summarization and Question
383:   Answering\/}.
384: 
385: \bibitem[Voorhees and Tice, 2000]{voorhees:sigir-2000}
386: Voorhees, Ellen~M. and Dawn~M. Tice, 2000.
387: \newblock Building a question answering test collection.
388: \newblock In {\em Proceedings of the 23rd Annual International ACM SIGIR
389:   Conference on Research and Development in Information Retrieval\/}.
390: 
391: \end{thebibliography}
392: 
393: \end{document}
394: