Prior Art Search and its Evaluation
Prior Art Search is an information-seeking task in which searchers, for instance patent examiners, search published literature to determine whether the invention claimed in a patent application is novel. In Prior Art Search, search tasks are often time-sensitive and involve rich information needs with multiple aspects or subtopics.

In this thesis, we explore information retrieval techniques and evaluation metrics for prior art search. The work consists of three parts. First, given a patent application document, we retrieve relevant patent documents by performing retrieval at both the document level and the passage level. At the document level, we focus on automatically formulating effective search queries. The queries are formed by extracting terms from the claims, titles and hyphenated phrases in a patent application, and are refined using Inverse Document Frequency (IDF) and Part-of-Speech (POS) tagging. At the passage level, we propose a TF-IDF-based retrieval algorithm that calculates a relevance score for each passage and selects the most relevant passages.

Second, we propose a novel evaluation metric for prior art search. The new metric, termed the Cube Test (CT), is based on a proposed conceptual user utility model, the Water Filling Model, which describes the process of prior art search. We compare our metric with existing prior art search evaluation metrics, as well as existing Web search evaluation metrics, in terms of correlation and discriminative power. Experimental results show that our metric effectively captures the characteristics of prior art search.

In the third part of this thesis, we model the learning curve in complex search using a Sigmoid function and improve the CT metric by taking into account the Eureka Effect shown in the learning curve. Our experiments demonstrate that there are differences between user-perceived relevance and user-received relevance, and that the refined CT metric is able to reflect the learning effect in complex search.

Part of this thesis research has been published in the 22nd International Conference on Information and Knowledge Management (CIKM 2013), and the prior art search system described here won first place in the 2013 Conference and Labs of the Evaluation Forum (CLEF 2013) Prior Art Search Track evaluation.