CLEF-IP 2010: Building strategies, a year later

W. Alink, R. Cornacchia, and A.P. de Vries

Centrum Wiskunde & Informatica, Science Park 123, 1098 XG Amsterdam, Netherlands
{alink,cornacchia}@spinque.com, arjen@cwi.nl
http://www.cwi.nl/, http://www.spinque.com/

Abstract. Our scores in last year's CLEF-IP (2009) evaluation benchmark were rather low. The CLEF-IP 2010 PAC task enabled us to correct some experiments and obtain better results, using essentially the same techniques (almost the same BM25-category strategy as last year), improved strategy builder software, and less computing hardware at our disposal. The results are now comparable with those of other participants. As last year, no feature extraction techniques have been applied, and the queries only used the structural information provided in the XML format of the patent documents. Furthermore, we participated in the new CLS task, which, although the scores were rather low, again shows the flexibility of our approach. The low scores can be explained by the straightforward method applied: searching the patent-document collection using keywords from the topic patent, and returning the IPCR classifications extracted from the retrieved documents as results.

1 Introduction

The main objective of this research is to demonstrate the importance of flexibility in expressing strategies for patent-document retrieval. While last year's submission focused on flexibility, this year scalability and retrieval quality have also been taken into account. The results of our system are comparable with those of other participants, while the system can also be operated interactively, which makes it a powerful tool.

The paper gives an overview of the system (Section 2) and the techniques used to generate the runs (Section 3). Afterwards the results are evaluated (Section 4). Finally, a conclusion is drawn (Section 5).

2 System overview

We created our submission for the CLEF-IP 2010 evaluation benchmark using Spinque's strategy builder interface. The setup is similar to last year's, only the system has matured. The hardware requirements have dropped from a supercomputer to a single server, querying could be performed on a desktop machine, and query speed has even increased. Last year the strategies were still optimized by hand; this year, no performance tweaking was done between strategy definition and the benchmark runs. The reader is referred to last year's paper [1] for more details about the setup.

2.1 Index creation

The index was created on a high-end server: a 2.4 GHz 4-core processor, 36 GB RAM, and 5x 2 TB SATA disks in RAID-5 configuration. All querying was done on a 3-year-old desktop with average computing specs: a 2.4 GHz 4-core processor, 8 GB RAM, and 2x 500 GB SATA disks in RAID-0. Creating a generic index for the whole collection took about 3 days. Creating a full run over 2000 topics took about 12 hours. A SQL dump of the resulting database is roughly 300 GB uncompressed (81 GB bzipped).

3 CLEF-IP Experiments

This section reports on the experiments conducted for the official submission. Fine-tuning of all parameters used for the PAC task was performed on the training set provided. The parameters for the CLS task have not been trained.
Instead of merging patent documents belonging to the same patent into a single document (as suggested in the CLEF-IP instructions), we have indexed the original documents (as in last year's submission), and aggregate scores from different patent documents into patents as part of the search strategies. Two runs have been submitted for CLEF-IP 2010.

3.1 Prior-Art Candidate Search Task

The strategy can be explained as follows: first make a selection of patents in the corpus that have at least one classification code in common with the topic patent, or have the same assignee. Search this selection using 26 keywords from the topic patent with the BM25 model. As a last step in the strategy, and motivated by the evaluation measures used for CLEF-IP 2010, all patent documents within the same simple family have been given the same score as the best-scoring patent document of that family. See Figure 1. The differences from last year are the search for patents by the same assignee, and the hard selection based on classification and assignee before the keyword search, instead of a mixture of keywords, classifications, and assignees.

Fig. 1: Prior-Art Candidate search strategy (CLEF-IP 2009)

3.2 Classification Task

We used the same strategy builder as for the PAC run, without additional coding or reconfiguration. The strategy was to search documents using 26 keywords from the topic patent, and then extract the classifications of the retrieved documents as results. See Figure 2.

Fig. 2: IPCR Classification strategy

4 Evaluation and analysis

We learned from the problems found during last year's participation. Like last year's contribution, the strategy building approach shows flexibility without re-programming, re-indexing, or re-configuring the system. Last year the system was still under heavy development, which caused the software to perform far from optimally. This year, most of the issues have been resolved.

4.1 Prior-Art Candidate Search Task

The results are more in line with the expected results than last year. Participants with similar strategies appear to perform similarly.

Our run did not use information beyond the structural information in the XML documents in which the patent documents were provided. Lopez and Romary [2] showed that an absolute increase in MAP of 0.10 can be achieved when using citations extracted from the topic patent. Notice that MAP, recall, and PRES are all quite stable over the different languages of the topic patent, while some other participants show much larger fluctuations between them. A possible explanation for this phenomenon is the lack of language-specific optimizations.

In contrast to last year, we did not merge the results of the classification search with the results of the keyword search, but filtered the results of the classification search with the results of the keyword search. This had a major consequence: documents that have no classification in common with the topic document were never retrieved. Recall is therefore likely lower than it would have been had the results been mixed; notice that the average number of results per topic is not the maximum (1000), but somewhat lower. Results in [3] confirm this effect of hard (facet) selections. However, it has been much easier to define a balance between the weight of the classification and the weight of the keywords, which took a lot of time in last year's submission.
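To make this discussion concrete, the sketch below outlines the PAC strategy of Section 3.1 in plain Python. It is only an illustration under our own assumptions, not the actual Spinque strategy builder implementation (which is expressed as database queries over the index of Section 2.1): the data structures, the BM25 parameter values, and the keyword selection shown here are hypothetical, and only the overall flow follows the description above: hard selection on classification and assignee, BM25 ranking with 26 topic keywords, aggregation of patent-document scores into patent scores, and propagation of the best score within each simple family.

# Minimal, hypothetical sketch of the PAC strategy; all names, data
# structures and parameter values are illustrative assumptions.
import math
from collections import Counter, defaultdict

K1, B = 1.2, 0.75  # common BM25 defaults; the values actually used are not reported here


def bm25_score(query_terms, doc_terms, doc_freq, n_docs, avg_len):
    """Score one patent document against the topic keywords with Okapi BM25."""
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        if term not in tf:
            continue
        idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
        norm = tf[term] * (K1 + 1) / (tf[term] + K1 * (1 - B + B * len(doc_terms) / avg_len))
        score += idf * norm
    return score


def pac_run(topic, corpus, doc_freq, avg_len, top_k=1000):
    """corpus: doc_id -> {'terms': [...], 'ipcr': set, 'assignee': str,
    'patent': str, 'family': str}; topic has the same fields plus 'keywords'."""
    # 1. Hard selection: at least one IPCR code in common, or the same assignee.
    selection = {d: m for d, m in corpus.items()
                 if m['ipcr'] & topic['ipcr'] or m['assignee'] == topic['assignee']}

    # 2. BM25 keyword search over the selection (26 keywords from the topic patent).
    doc_scores = {d: bm25_score(topic['keywords'][:26], m['terms'],
                                doc_freq, len(corpus), avg_len)
                  for d, m in selection.items()}

    # 3. Aggregate patent-document scores into patent scores (best document wins),
    #    then give every patent in a simple family the best score of that family.
    patent_scores, families = defaultdict(float), defaultdict(set)
    for d, s in doc_scores.items():
        p = selection[d]['patent']
        patent_scores[p] = max(patent_scores[p], s)
        families[selection[d]['family']].add(p)
    for members in families.values():
        best = max(patent_scores[p] for p in members)
        for p in members:
            patent_scores[p] = best

    return sorted(patent_scores.items(), key=lambda x: -x[1])[:top_k]

The hard selection in the first step is exactly what limits recall as discussed above: a patent that shares neither a classification code nor an assignee with the topic patent can never enter the result list, whereas a mixed strategy would merely rank it lower.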
More work has to be done on automatically tuning the weights of the various components of the search for a specific task; this would likely yield better results, both in terms of MAP and recall.

4.2 Classification Task

Our classification run got low scores for both precision and MAP. The recall (at 25 and at 50) was reasonable to high compared to other participants, most probably because for each topic we tried to provide the full 1000 results. Other, dedicated systems clearly outperform the strategy we have built. It would be interesting to see what happens to the precision scores if a cut-off were applied to the results based on the computed probabilities (using a fixed cut-off value for all topics). Other improvements could probably be made by using more aspects of the patent, such as the patent citations present in the topic patent and information on the inventor and assignee. Due to the limited time available, such runs have not been created.

5 Conclusion

Participation in the CLEF-IP 2010 evaluation track has been easier than last year. Compared to last year, our results for the PAC task seem to have improved relative to other participants. The results for the CLS task show that there is still a lot of room for improvement. Also, more work is needed on the automatic tuning of strategy parameters.

Acknowledgements: We would like to thank the MonetDB kernel developers and Spinque for their support.

References

1. Wouter Alink, Roberto Cornacchia, and Arjen de Vries. Running CLEF-IP experiments using a graphical query builder. In Lecture Notes in Computer Science (to appear), 2009.
2. Patrice Lopez and Laurent Romary. Multiple retrieval models and regression models for prior art search. In CLEF Working Notes, 2009.
3. Lanbo Zhang and Yi Zhang. Interactive retrieval based on faceted feedback. In Proceedings of the 33rd ACM SIGIR Conference, 2010.