<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Linked Relevance Feedback for the ImageCLEF Photo Task</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ray R. Larson</string-name>
          <email>ray@sims.berkeley.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Information</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of California</institution>
          ,
          <addr-line>Berkeley</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
<p>In this paper we describe Berkeley's approach to the ImageCLEFphoto task for CLEF 2007. Once again (as in ImageCLEFphoto for CLEF 2006) we used entirely text-based methods for retrieval. For some runs this year, however, we exploited the basic similarity of the topics and database to those from 2006 to acquire the metadata descriptions of the “example images” in the 2007 queries, and used that metadata to expand the query content for each topic. The results speak for themselves: what amounts to relevance feedback based on image metadata is much more effective than unexpanded queries, and it even provides a method of cross-language retrieval for unknown languages when parallel topics and example images can be established. We submitted 19 runs for ImageCLEFphoto this year, of which 8 were monolingual English, German, and Spanish, and the remaining 11 were bilingual from various source languages to English, German, and Spanish.</p>
      </abstract>
      <kwd-group>
        <kwd>Cheshire II</kwd>
        <kwd>Logistic Regression</kwd>
        <kwd>Relevance Feedback</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        This paper discusses the retrieval methods and evaluation results for Berkeley’s participation in the
ImageCLEFphoto task. This year we used only text-based retrieval methods for ImageCLEFphoto,
ignoring the images themselves, but using, for some runs, metadata associated with the reference
images specified in the queries from 2006. We have not yet been able to convert the BlobWorld
software that we wanted to use for combined text and image processing approaches, as used in
some previous work (see [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]), but hope to be able to do so for future work.
      </p>
<p>This year Berkeley submitted 19 runs, of which 3 were English monolingual, 3 German
monolingual, and 2 Spanish monolingual. The remaining 11 runs were all bilingual, including
German⇒English, German⇒Spanish, English⇒German, English⇒Spanish, Spanish⇒German,
Spanish⇒English, French⇒English, Russian⇒German, Russian⇒English, Chinese⇒German, and
Chinese⇒English.</p>
      <p>This paper first describes the retrieval methods used, including our blind feedback method for
text, followed by a discussion of our official submissions and the methods used for query expansion.
Finally we present some discussion of the results and our conclusions.
</p>
    </sec>
    <sec id="sec-2">
      <title>The Retrieval Algorithms</title>
      <p>(Note, this section repeats information provided in our 2006 Notebook paper, since the basic
retrieval algorithms used and the approaches to indexing the content have not been changed since
then.)</p>
      <p>
        The basic form and variables of the Logistic Regression (LR) algorithm used for all of our
submissions was originally developed by Cooper, et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. As originally formulated, the LR
model of probabilistic IR attempts to estimate the probability of relevance for each document
based on a set of statistics about a document collection and a set of queries in combination
with a set of weighting coefficients for those statistics. The statistics to be used and the values
of the coefficients are obtained from regression analysis of a sample of a collection (or similar
test collection) for some set of queries where relevance and non-relevance has been determined.
More formally, given a particular query and a particular document in a collection, P (R | Q, D)
is calculated and the documents or components are presented to the user ranked in order of
decreasing values of that probability. To avoid invalid probability values, the usual calculation of
P (R | Q, D) uses the “log odds” of relevance given a set of S statistics, si, derived from the query
and database, such that:
      </p>
      <p>log O(R | Q, D) = b0 + Σ (i = 1, ..., S) bi si (1)

where b0 is the intercept term and the bi are the coefficients obtained from the regression analysis of
the sample collection and relevance judgements. The final ranking is determined by the conversion
of the log odds form to probabilities:</p>
      <p>P (R | Q, D) = e^(log O(R|Q,D)) / (1 + e^(log O(R|Q,D))) (2)</p>
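      <p>The conversion between Equations 1 and 2 amounts to applying the logistic function to the fitted log odds. A minimal sketch follows; the coefficients and statistics in the example call are invented for illustration, not the fitted values used in our runs:</p>
      <preformat>
```python
import math

def log_odds(b0, coeffs, stats):
    # Equation 1: log O(R | Q, D) = b0 + sum of bi * si over the S statistics
    return b0 + sum(b * s for b, s in zip(coeffs, stats))

def probability_of_relevance(lo):
    # Equation 2: map the log odds back to a probability between 0 and 1
    return math.exp(lo) / (1.0 + math.exp(lo))

# Invented coefficients and statistics, purely for illustration.
lo = log_odds(-1.0, [0.5, 2.0], [1.2, 0.3])
p = probability_of_relevance(lo)
```
      </preformat>
      <p>Because the logistic function is monotonic, ranking by probability is equivalent to ranking by the log odds themselves.</p>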
      <sec id="sec-2-1">
        <title>TREC2 Logistic Regression Algorithm</title>
        <p>
          For all of our ImageCLEF submissions this year we used a version of the Logistic Regression (LR)
algorithm that has been used very successfully in Cross-Language IR by Berkeley researchers for a
number of years [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and which is also used in our GeoCLEF and Domain Specific submissions. For
the ImageCLEF task we used the Cheshire II information retrieval system implementation of this
algorithm. One of the current limitations of this implementation is the lack of decompounding for
German document and query terms. As noted in our other CLEF notebook papers, the Logistic
Regression algorithm used was originally developed by Cooper et al. [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] for text retrieval from the
TREC collections for TREC2. The basic formula is:
log O(R|C, Q) = log [ p(R|C, Q) / (1 − p(R|C, Q)) ]
= c0 + c1 · (1/√(|Qc| + 1)) · Σ qtfi / (ql + 35)
+ c2 · (1/√(|Qc| + 1)) · Σ log( tfi / (cl + 80) )
− c3 · (1/√(|Qc| + 1)) · Σ log( ctfi / Nt )
+ c4 · |Qc|
        </p>
        <p>where each sum Σ runs over the i = 1, ..., |Qc| matching terms, and:
C denotes a document component (i.e., an indexed part of a document which may be the
entire document) and Q a query, R is a relevance variable,
p(R|C, Q) is the probability that document component C is relevant to query Q (the
probability that it is not relevant is 1.0 − p(R|C, Q)),
|Qc| is the number of matching terms between a document component and a query,
qtfi is the within-query frequency of the ith matching term,
tfi is the within-document frequency of the ith matching term,
ctfi is the occurrence frequency in a collection of the ith matching term,
ql is query length (i.e., number of terms in a query like |Q| for non-feedback situations),
cl is component length (i.e., number of terms in a component),
Nt is collection length (i.e., number of terms in a test collection), and
ck are the coefficients obtained through the regression analysis.</p>
        <p>If stopwords are removed from indexing, then ql, cl, and Nt are the query length, document
length, and collection length, respectively. If the query terms are re-weighted (in feedback, for
example), then qtfi is no longer the original term frequency, but the new weight, and ql is the
sum of the new weight values for the query terms. Note that, unlike the document and collection
lengths, query length is the “optimized” relative frequency without first taking the log over the
matching terms.</p>
        <p>
          The coefficients were determined by fitting the logistic regression model specified in log O(R|C, Q)
to TREC training data using a statistical software package. The coefficients, ck, used for our
official runs are the same as those described by Chen[
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. These were: c0 = −3.51, c1 = 37.4,
c2 = 0.330, c3 = 0.1937 and c4 = 0.0929. Further details on the TREC2 version of the Logistic
Regression algorithm may be found in Cooper et al. [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
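        <p>As a concrete illustration, the TREC2 formula can be computed as in the following sketch, using the coefficients listed above; the per-term statistics in the example call are invented for illustration only:</p>
        <preformat>
```python
import math

# Coefficients from Chen [1], as used in the official runs.
C0, C1, C2, C3, C4 = -3.51, 37.4, 0.330, 0.1937, 0.0929

def trec2_log_odds(qtf, tf, ctf, ql, cl, Nt):
    """qtf, tf, ctf are parallel lists over the matching terms;
    ql, cl, Nt are the query, component, and collection lengths."""
    M = len(qtf)                      # |Qc|, the number of matching terms
    norm = 1.0 / math.sqrt(M + 1)
    x1 = norm * sum(q / (ql + 35.0) for q in qtf)
    x2 = norm * sum(math.log(t / (cl + 80.0)) for t in tf)
    x3 = norm * sum(math.log(c / Nt) for c in ctf)
    return C0 + C1 * x1 + C2 * x2 - C3 * x3 + C4 * M

# Invented example: two matching terms in a four-term query.
score = trec2_log_odds(qtf=[1, 1], tf=[3, 5], ctf=[120, 40],
                       ql=4, cl=200, Nt=1000000)
```
        </preformat>
        <p>Note that the ctfi term enters with a negative coefficient, so very common terms in the collection lower the estimated log odds of relevance.</p>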
      </sec>
      <sec id="sec-2-2">
        <title>Blind Relevance Feedback</title>
        <p>
          In addition to the direct retrieval of documents using the TREC2 logistic regression algorithm
described above, we have implemented a form of “blind relevance feedback” as a supplement to the
basic algorithm. The algorithm used for blind feedback was originally developed and described by
Chen [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Blind relevance feedback has become established in the information retrieval community
due to its consistent improvement of initial search results as seen in TREC, CLEF and other
retrieval evaluations [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. The blind feedback algorithm is based on the probabilistic term relevance
weighting formula developed by Robertson and Sparck Jones [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
        <p>Blind relevance feedback is typically performed in two stages. First, an initial search using
the original topic statement is performed, after which a number of terms are selected from some
number of the top-ranked documents (which are presumed to be relevant). The selected terms
are then weighted and then merged with the initial query to formulate a new query. Finally the
reweighted and expanded query is submitted against the same collection to produce a final ranked
list of documents. Obviously there are important choices to be made regarding the number of
top-ranked documents to consider, and the number of terms to extract from those documents. For
ImageCLEF this year, having no prior data to guide us, we chose to use the top 10 terms from 10
top-ranked documents. The terms were chosen by extracting the document vectors for each of the
10 documents and computing the Robertson and Sparck Jones term relevance weight for each term. This
weight is based on a contingency table of the counts for the four combinations of (assumed)
relevance and whether or not the term occurs in a document. Table 1 shows this contingency
table.</p>
        <p>Table 1: Contingency table for term relevance weighting

                      Relevant    Not Relevant
        In doc        Rt          Nt − Rt            Nt
        Not in doc    R − Rt      N − Nt − R + Rt    N − Nt
                      R           N − R              N</p>
        <p>The relevance weight is calculated using the assumption that the first 10 documents are relevant
and all others are not. For each term in these documents the following weight is calculated:

        wt = log [ ( Rt / (R − Rt) ) / ( (Nt − Rt) / (N − Nt − R + Rt) ) ] (3)</p>
        <p>The 10 terms (including those that appeared in the original query) with the highest wt are
selected and added to the original query terms. For terms not in the original query, the new
“term frequency” (qtfi in the TREC2 formula above) is set to 0.5. Terms that were in the original
query but are not in the top 10 terms are left with their original qtfi. For terms both in the top 10
and in the original query, the new qtfi is set to 1.5 times the original qtfi. The new query
is then processed using the same TREC2 LR algorithm, and the ranked results are
returned as the response for that topic.</p>
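        <p>The term weighting and query reweighting rules above can be sketched as follows; the counts and terms in the example are invented, and this is an illustration of the rules rather than the Cheshire II implementation:</p>
        <preformat>
```python
import math

def rsj_weight(Rt, R, Nt, N):
    # Robertson and Sparck Jones relevance weight (Equation 3), computed
    # under the assumption that the top-ranked documents are relevant.
    return math.log((Rt / (R - Rt)) / ((Nt - Rt) / (N - Nt - R + Rt)))

def reweight_query(query_qtf, top_terms):
    # Merge the highest-weighted terms into the query: terms new to the
    # query get qtfi = 0.5; terms in both the query and the top terms get
    # 1.5 times their original qtfi; other query terms are unchanged.
    new_qtf = dict(query_qtf)
    for term in top_terms:
        if term in new_qtf:
            new_qtf[term] = 1.5 * new_qtf[term]
        else:
            new_qtf[term] = 0.5
    return new_qtf
```
        </preformat>
        <p>For example, a term occurring in 5 of the 10 presumed-relevant documents but rarely elsewhere receives a large positive wt and is a strong expansion candidate.</p>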
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Approaches for ImageCLEFphoto</title>
      <p>In this section we describe the specific approaches taken for our official submitted runs for the
ImageCLEFphoto task. First we describe the indexing and term extraction methods used, and
then the search features we used for the submitted runs.
</p>
      <sec id="sec-3-1">
        <title>Indexing and Term Extraction</title>
        <p>Although the Cheshire II system uses the XML structure of documents and extracts selected
portions of the record for indexing and retrieval, for the submitted runs this year we used only a
single one of these indexes that contains the entire content of the document.</p>
        <p>Table 2: Cheshire II indexes for the ImageCLEFphoto database

Name      Description          Content Tags   Used
docno     Document ID          DOCNO          no
title     Article Title        TITLE          no
topic     All Content Terms    DOC            yes
date      Date of Image        DATE           no
geoname   Image Place names    LOCATION       no</p>
        <p>Table 2 lists the indexes created for the ImageCLEF database and the document elements
from which the contents of those indexes were extracted. The “Used” column in Table 2 indicates
whether or not a particular index was used in the submitted ImageCLEF runs. Note that the
database from 2006 was also maintained with the same indexes, although it was not used directly
for retrieval since the full extent of changes to that database was not known. The metadata from the
2006 database was, however, used indirectly for query expansion as described in Section 3.2 below.</p>
        <p>For all indexing we used language-specific stoplists to exclude function words and very common
words from the indexing and searching. The German language runs, however, did not use
decompounding in the indexing and querying processes to generate simple word forms from compounds.
Last year (after the official runs were submitted) we found that the metadata associated with
the “relevant images” included in the topics could be used to expand the query with very good
results.</p>
        <p>This year, metadata associated with the topic “relevant images” was removed, but since the
image ids had not changed from the 2006 collection for the same images, we were able to extract
the 2006 metadata for the same relevant images and use it for query expansion. It should be
noted that we did not use any actual relevance data from 2006, and only used the 2006 metadata
associated with the images provided with the ImageCLEFphoto 2007 images in the topic “image”
tags.</p>
        <p>For the runs that use this form of query expansion we add to the queries data from the “TITLE”
and “LOCATION” elements of the 2006 metadata annotations, in the appropriate target language,
associated with images included in the “image” element of the 2007 topics (rather like image-based
processing approaches, but using only the associated metadata text). We did not use the Description
or Notes fields of the metadata, since testing with 2006 data showed that including them in the
query expansion tended to reduce performance when compared to title and location alone.</p>
        <p>Searching the ImageCLEF collection used Cheshire II scripts to parse the topics and submit the
title or title and narrative from the topics to the “topic” index containing all terms from the
documents. For the monolingual search tasks we used the topics in the appropriate language
(English, German, or Spanish), and for bilingual tasks the topics were translated from the source
language to the target language using LEC Power Translator (a PC and web-based program and service).
This was the first time that we have used this translation tool, which was primarily selected for
the broad coverage of languages (including Russian and Chinese), and the apparent accuracy of
translation using 2006 data. Because of the method used for query expansion, the expanded
queries in target languages were largely the same, with variation only in the translated titles.</p>
        <p>Because no narrative was provided for the topics we used only the titles for searching in
unexpanded queries. In all cases the “topic” index mentioned above was used, and probabilistic
searches were carried out. Two forms of the TREC2 logistic regression algorithm were used. One
used the basic algorithm as described above, and the other used the TREC2 algorithm with blind
feedback using the top 10 terms from the 10 top-ranked documents in the initial retrieval.</p>
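        <p>The two retrieval variants can be sketched as a small pipeline. The retrieval function below is a toy stand-in that ranks token lists by query-term overlap rather than the TREC2 LR score, and it selects expansion terms by raw frequency rather than the Robertson and Sparck Jones weight; it illustrates the two-stage control flow only:</p>
        <preformat>
```python
from collections import Counter

def search(docs, query_terms):
    # Toy stand-in for the probabilistic search: rank documents
    # (token lists) by query-term overlap, highest first.
    def score(doc):
        return sum(doc.count(t) for t in query_terms)
    return sorted(docs, key=score, reverse=True)

def blind_feedback_search(docs, query_terms, n_docs=10, n_terms=10):
    # Stage 1: initial search; presume the top-ranked documents relevant.
    top_docs = search(docs, query_terms)[:n_docs]
    # Select expansion terms from those documents (here by raw frequency).
    counts = Counter(t for doc in top_docs for t in doc)
    top_terms = [t for t, _ in counts.most_common(n_terms)]
    # Stage 2: merge, deduplicate, and search again with the expanded query.
    expanded = list(dict.fromkeys(list(query_terms) + top_terms))
    return search(docs, expanded)
```
        </preformat>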
      <p>Our expanded query runs did use the example images in the topics: this involved retrieving the
associated metadata records from the 2006 database for the example image ids included in the
queries and using their titles and location information to expand the basic query.</p>
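      <p>The expansion step can be sketched as follows; the metadata table, image ids, and field contents here are hypothetical stand-ins for the 2006 annotation database:</p>
      <preformat>
```python
# Hypothetical 2006 metadata store, keyed by image id (invented records).
METADATA_2006 = {
    "1234": {"TITLE": "Machu Picchu ruins", "LOCATION": "Cusco, Peru"},
    "5678": {"TITLE": "Beach at sunset", "LOCATION": "Bahia, Brazil"},
}

def expand_query(title_terms, example_image_ids, metadata=METADATA_2006):
    # Append TITLE and LOCATION terms from the 2006 records for each
    # example image id given in a 2007 topic.
    expanded = list(title_terms)
    for image_id in example_image_ids:
        record = metadata.get(image_id)
        if record:
            expanded += record["TITLE"].lower().split()
            expanded += record["LOCATION"].lower().split()
    return expanded
```
      </preformat>
      <p>Because the appended terms are already in the target language, this step also supplies target-language vocabulary when the topic title itself is untranslated.</p>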
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results for Submitted Runs</title>
      <p>The summary results (as Mean Average Precision) for the official submitted monolingual and
bilingual runs for the English, German, and Spanish target languages are shown in Table 3; the
Recall-Precision curves for these runs are shown in Figures 1 (for monolingual) and 2 (for
bilingual). In Figures 1 and 2 the run names are abbreviated as indicated in the “Abbrev.” column.</p>
      <p>Table 3 shows all of our submitted runs for the ImageCLEF Photo task. Precision and recall
curves for the runs are shown in Figures 1 and 2.</p>
    </sec>
    <sec id="sec-5">
      <title>Discussion and Conclusions</title>
      <p>Our officially submitted runs using query expansion as described above were presented separately
in the provided evaluations along with others who apparently used 2006 topic descriptions, notes
and qrels. As noted above we used ONLY the provided 2007 topics, but looked up the example
images provided in the 2007 topics in the 2006 metadata. It is worth noting that this approach
is related to our Digital Library work where we link to multiple resources based on metadata
provided in other databases (e.g., linking library catalog records to Wikipedia articles based solely
on the catalog data).</p>
      <p>If all runs were to be considered on the basis of effectiveness alone, then the query expansion
runs described above would be the top-ranked runs of all types.</p>
      <p>In Table 4 we compare our best performing runs, all using blind relevance feedback, for the
monolingual English and German search tasks with and without query expansion. As the “Percent
Improv.” column shows, query expansion provided a 140% and 132% boost for German and
English, respectively. This is a strong indication that expanding queries using the metadata of
relevant images is a very good strategy for the ImageCLEFphoto task.</p>
      <p>We would argue that this query expansion approach is directly comparable to image-based
approaches that use the example images as the basis for their queries.</p>
      <p>As a further experiment, we decided to try this expansion method for bilingual retrieval
without any translation of the original topic; that is, we used the title in the original language
(in this case French) and did our expansion using the appropriate metadata for the target language
(English).</p>
      <p>Table 3: Submitted runs for the ImageCLEFphoto task

Abbrev.   Task
FB        Mono. German
NOFB      Mono. German
EXPFB     Mono. German
FB        Mono. English
NOFB      Mono. English
EXPFB     Mono. English
FB        Mono. Spanish
NOFB      Mono. Spanish

Run Name                    Abbrev.       Task
BERK-DE-EN-AUTO-QEFB-TXT    DEEN-EXPFB    German⇒English
BERK-DE-ES-AUTO-FB-TXT      DEES-FB       German⇒Spanish
BERK-EN-DE-AUTO-QEFB-TXT    ENDE-EXPFB    English⇒German
BERK-EN-ES-AUTO-FB-TXT      ENES-FB       English⇒Spanish
BERK-ES-EN-AUTO-QEFB-TXT    ESEN-EXPFB    Spanish⇒English
BERK-ES-DE-AUTO-QEFB-TXT    ESDE-EXPFB    Spanish⇒German
BERK-FR-EN-AUTO-QEFB-TXT    FREN-EXPFB    French⇒English
BERK-RU-EN-AUTO-QEFB-TXT    RUEN-EXPFB    Russian⇒English
BERK-RU-DE-AUTO-QEFB-TXT    RUDE-EXPFB    Russian⇒German
BERK-ZH-EN-AUTO-QEFB-TXT    ZHEN-EXPFB    Chinese⇒English
BERK-ZH-DE-AUTO-QEFB-TXT    ZHDE-EXPFB    Chinese⇒German</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Aitao</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <article-title>Multilingual information retrieval using english and chinese queries</article-title>
          . In Carol Peters, Martin Braschler, Julio Gonzalo, and Michael Kluck, editors,
          <source>Evaluation of CrossLanguage Information Retrieval Systems: Second Workshop of the Cross-Language Evaluation Forum</source>
          , CLEF-2001, Darmstadt, Germany,
          <year>September 2001</year>
          , pages
          <fpage>44</fpage>
          -
          <lpage>58</lpage>
          . Springer Computer Science Series LNCS 2406,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Aitao</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <source>Cross-Language Retrieval Experiments at CLEF</source>
          <year>2002</year>
          , pages
          <fpage>28</fpage>
          -
          <lpage>48</lpage>
          . Springer (LNCS #2785),
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Aitao</given-names>
            <surname>Chen</surname>
          </string-name>
          and
          <string-name>
            <given-names>Fredric C.</given-names>
            <surname>Gey</surname>
          </string-name>
          .
          <article-title>Multilingual information retrieval using machine translation, relevance feedback and decompounding</article-title>
          .
          <source>Information Retrieval</source>
          ,
          <volume>7</volume>
          :
          <fpage>149</fpage>
          -
          <lpage>182</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>W. S.</given-names>
            <surname>Cooper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F. C.</given-names>
            <surname>Gey</surname>
          </string-name>
          .
          <article-title>Full Text Retrieval based on Probabilistic Equations with Coefficients fitted by Logistic Regression</article-title>
          .
          <source>In Text REtrieval Conference (TREC-2)</source>
          , pages
          <fpage>57</fpage>
          -
          <lpage>66</lpage>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>William S.</given-names>
            <surname>Cooper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Fredric C.</given-names>
            <surname>Gey</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Daniel P.</given-names>
            <surname>Dabney</surname>
          </string-name>
          .
          <article-title>Probabilistic retrieval based on staged logistic regression</article-title>
          .
          <source>In 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , Copenhagen, Denmark, June 21-24, pages
          <fpage>198</fpage>
          -
          <lpage>210</lpage>
          , New York,
          <year>1992</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Ray R.</given-names>
            <surname>Larson</surname>
          </string-name>
          .
          <article-title>Probabilistic retrieval, component fusion and blind feedback for XML retrieval</article-title>
          .
          <source>In INEX 2005</source>
          , pages
          <fpage>225</fpage>
          -
          <lpage>239</lpage>
          .
          <source>Springer (Lecture Notes in Computer Science, LNCS 3977)</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Ray R.</given-names>
            <surname>Larson</surname>
          </string-name>
          and
          <string-name>
            <given-names>Chad</given-names>
            <surname>Carson</surname>
          </string-name>
          .
          <article-title>Information access for a digital library: Cheshire II and the Berkeley environmental digital library</article-title>
          . In Larry Woods, editor,
          <source>Knowledge: Creation, Organization and Use: Proceedings of the 62nd ASIS Annual Meeting</source>
          , Medford, NJ, pages
          <fpage>515</fpage>
          -
          <lpage>535</lpage>
          . Information Today,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S. E.</given-names>
            <surname>Robertson</surname>
          </string-name>
          and
          <string-name>
            <given-names>K. Sparck</given-names>
            <surname>Jones</surname>
          </string-name>
          .
          <article-title>Relevance weighting of search terms</article-title>
          .
          <source>Journal of the American Society for Information Science</source>
          , pages
          <fpage>129</fpage>
          -
          <lpage>146</lpage>
          , May-June
          <year>1976</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>