<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The University of Amsterdam at the CLEF 2008 Domain Specific Track</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Edgar Meij</string-name>
          <email>emeij@science.uva.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maarten de Rijke</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ISLA, University of Amsterdam</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>We describe our participation in the CLEF 2008 Domain Specific track. The research questions we address are threefold: (i) what are the effects of estimating and applying relevance models to the domain specific collection used at CLEF 2008, (ii) what are the results of parsimonizing these relevance models, and (iii) what are the results of applying concept models for blind relevance feedback? Parsimonization is a technique by which the term probabilities in a language model may be re-estimated based on a comparison with a reference model, making the resulting model more sparse and to the point. Concept models are term distributions over vocabulary terms, based on the language associated with concepts in a thesaurus or ontology and are estimated using the documents which are annotated with concepts. Concept models may be used for blind relevance feedback, by first translating a query to concepts and then back to query terms. We find that applying relevance models helps significantly for the current test collection, in terms of both mean average precision and early precision. Moreover, parsimonizing the relevance models helps mean average precision on title-only queries and early precision on title+narrative queries. Our concept models are able to significantly outperform a baseline query-likelihood run, both in terms of mean average precision and early precision on both title-only and title+narrative queries.</p>
      </abstract>
      <kwd-group>
        <kwd>Parsimonious Models</kwd>
        <kwd>Language Models</kwd>
        <kwd>Relevance Feedback</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        We describe our participation in the 2008 CLEF Domain Specific track. Our main motivation for
participating was to evaluate, on the CLEF Domain Specific test collection, the retrieval models we have
developed for another, very similar domain. Our concept models have thus far been developed and evaluated
on the TREC Genomics test collections, which also consist of documents that are manually annotated
using concepts from a thesaurus [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ].
      </p>
      <p>The main idea behind our approach is to model the language use associated with concepts from a
thesaurus or ontology. To this end we use the document annotations as a bridge between vocabulary terms
and the concepts in the knowledge source at hand. We model the language use around concepts using
a generative language modeling framework, which provides theoretically sound estimation methods and
builds upon a solid statistical background.</p>
      <p>
        Our concept models may be used to determine semantic relatedness or to generate navigational
suggestions, either in the form of concepts or vocabulary terms. These can then be used as suggestions for
the user or for blind relevance feedback [
        <xref ref-type="bibr" rid="ref13 ref14">13, 14, 18</xref>
        ]. In order to apply blind relevance feedback using our
models, we perform a double translation. First we estimate the most likely concepts given a query and then
we use the most distinguishing terms from these concepts to formulate a new query. In a sense we are using
the concepts as a pivot language [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. To find the most distinguishing terms given a concept, we apply a
technique called parsimonization [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Parsimonization is an algorithm based on expectation-maximization
(EM) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and may be used to re-estimate probabilities of one model with respect to another. Events that are
well-predicted by the latter model will lose probability mass, which in turn will be given to the remaining
events. Recently, we have successfully applied parsimonization to the estimation of relevance models on
a variety of tasks and collections [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. In all of these cases, as well as with our concept models, we find
that interpolating the newly found query with the original one yields the best performance—an observation
which is in line with the literature [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        The research questions we address are threefold: (i) what are the effects of estimating and applying
relevance models to the collection used at the CLEF 2008 Domain Specific track [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], (ii) what are the
results of parsimonizing these relevance models, and (iii) what are the results of applying our concept
models for blind relevance feedback?
      </p>
      <p>We find that applying relevance models helps significantly for the current test collection, in terms of
both mean average precision and early precision. Moreover, we find that parsimonizing the relevance
models helps mean average precision on title-only queries and early precision on title+narrative queries.
Our concept models are able to significantly outperform a baseline query-likelihood run, both in terms of
mean average precision and in terms of early precision on both title-only and title+narrative queries.</p>
      <p>The remainder of this paper is organized as follows. In Section 2 we introduce the retrieval framework
we have used for our submission, i.e., statistical language modeling. In Sections 3 and 4 we introduce the
specifics of our models and techniques. In Section 5 we describe the experimental setup, our parameter
settings, and the preprocessing steps we performed on the collection. In Section 6 we discuss our experimental
results, and we end with a concluding section.</p>
    </sec>
    <sec id="sec-2">
      <title>Language Modeling</title>
      <p>
        Language modeling is a relatively new framework in the context of information retrieval [
        <xref ref-type="bibr" rid="ref7">7, 16, 20</xref>
        ]. It is
centered around the assumption that a query as issued by a user is a sample generated from some underlying
term distribution, the information need. The documents in the collection are modeled in a similar fashion
and are usually considered to be a mixture of a document-specific model and a more general background
model. At retrieval time, each document is ranked according to the likelihood of its model having generated the
query (query-likelihood).
      </p>
      <p>
        Lafferty and Zhai [
        <xref ref-type="bibr" rid="ref11">11</xref>
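        ] propose to generalize the query likelihood model to the KL-divergence scoring
method, in which the query is modeled separately. Scoring documents then comes down to measuring the
divergence between a query model <inline-formula><tex-math>P(t|\theta_Q)</tex-math></inline-formula> and each document model <inline-formula><tex-math>P(t|\theta_D)</tex-math></inline-formula>, in which the divergence
is negated for ranking purposes. When the query model is generated using the empirical
maximum-likelihood estimate (MLE) on the original query as follows:
      </p>
      <p>
        <disp-formula>
          <label>(1)</label>
          <tex-math>P(t|\tilde{\theta}_Q) = \frac{n(t;Q)}{|Q|},</tex-math>
        </disp-formula>
      </p>
      <p>where <inline-formula><tex-math>n(t;Q)</tex-math></inline-formula> is the number of occurrences of term <inline-formula><tex-math>t</tex-math></inline-formula> in query <inline-formula><tex-math>Q</tex-math></inline-formula> and <inline-formula><tex-math>|Q|</tex-math></inline-formula> the length of the query, it can
be shown that documents are ranked in the same order as using the query likelihood model [20]. More
formally, the score for each document given a query using the KL-divergence retrieval model is:</p>
      <p>
        <disp-formula>
          <label>(2)</label>
          <tex-math>\mathrm{Score}(Q, D) = -\mathrm{KL}(\theta_Q \,\|\, \theta_D) = \sum_{t \in V} P(t|\theta_Q) \log P(t|\theta_D) - \sum_{t \in V} P(t|\theta_Q) \log P(t|\theta_Q),</tex-math>
        </disp-formula>
      </p>
      <p>where <inline-formula><tex-math>V</tex-math></inline-formula> denotes the vocabulary. The entropy of the query model, <inline-formula><tex-math>-\sum_{t \in V} P(t|\theta_Q) \log P(t|\theta_Q)</tex-math></inline-formula>, remains constant
per query and can be ignored for ranking purposes.</p>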
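      <p>The rank equivalence between query likelihood and the negated KL-divergence with an MLE query model can be checked on a toy example. The following Python sketch is only an illustration under stated assumptions: the function names, the two documents, and their probabilities are hypothetical and are not taken from our experiments.</p>
      <preformat><![CDATA[
```python
import math
from collections import Counter


def mle_query_model(query_terms):
    """Empirical maximum-likelihood query model: n(t; Q) / |Q| (Eq. 1)."""
    counts = Counter(query_terms)
    total = sum(counts.values())
    return {t: n / total for t, n in counts.items()}


def neg_kl_score(query_model, doc_model):
    """Score(Q, D) = -KL(theta_Q || theta_D), summed over query-model terms.

    Terms with zero query probability contribute nothing to either sum.
    doc_model must give every query term a non-zero probability, which is
    what smoothing guarantees.
    """
    cross = sum(p * math.log(doc_model[t]) for t, p in query_model.items())
    entropy = -sum(p * math.log(p) for p in query_model.values())
    return cross + entropy


def query_log_likelihood(query_terms, doc_model):
    """Log query likelihood: sum of log P(t | theta_D) over the query tokens."""
    return sum(math.log(doc_model[t]) for t in query_terms)


# Hypothetical, already-smoothed document models (only query-related terms shown).
docs = {
    "d1": {"social": 0.30, "policy": 0.20, "youth": 0.05},
    "d2": {"social": 0.10, "policy": 0.05, "youth": 0.40},
}
query = ["social", "policy", "social"]
theta_q = mle_query_model(query)

by_kl = sorted(docs, key=lambda d: neg_kl_score(theta_q, docs[d]), reverse=True)
by_ql = sorted(docs, key=lambda d: query_log_likelihood(query, docs[d]), reverse=True)
```
]]></preformat>
      <p>Both orderings coincide because the cross-entropy sum equals the query log-likelihood divided by the query length, while the entropy term is the same for every document.</p>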
      <sec id="sec-2-1">
        <title>Smoothing</title>
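        <p>Each document model <inline-formula><tex-math>P(t|\theta_D)</tex-math></inline-formula> is estimated as the MLE of each term in the document, <inline-formula><tex-math>P(t|\tilde{\theta}_D)</tex-math></inline-formula>, linearly interpolated
with a background language model <inline-formula><tex-math>P(t)</tex-math></inline-formula>, which in turn is calculated as the likelihood of observing <inline-formula><tex-math>t</tex-math></inline-formula> in a
sufficiently large collection, such as the document collection:</p>
        <p>
          <disp-formula>
            <label>(3)</label>
            <tex-math>P(t|\theta_D) = \lambda_D P(t|\tilde{\theta}_D) + (1 - \lambda_D) P(t).</tex-math>
          </disp-formula>
        </p>
        <p>
          We smooth using Bayesian smoothing with a Dirichlet prior and set <inline-formula><tex-math>\lambda_D = \frac{|D|}{|D| + \mu}</tex-math></inline-formula>
and <inline-formula><tex-math>(1 - \lambda_D) = \frac{\mu}{|D| + \mu}</tex-math></inline-formula>,
where <inline-formula><tex-math>\mu</tex-math></inline-formula> is the Dirichlet prior that controls the influence of smoothing [
          <xref ref-type="bibr" rid="ref2">2, 22</xref>
          ].
        </p>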
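        <p>A minimal Python sketch of this smoothing step is given below. It assumes that the vocabulary is exactly the set of terms in the background model and that the Dirichlet prior is positive; the function name and the toy numbers are hypothetical.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def dirichlet_smoothed_model(doc_terms, background, mu):
    """Mix the document MLE with the background model (Eq. 3).

    The document weight is |D| / (|D| + mu), so longer documents rely less
    on the background model. Assumes mu > 0.
    """
    counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    lam = doc_len / (doc_len + mu)
    return {
        t: lam * (counts[t] / doc_len if doc_len else 0.0) + (1.0 - lam) * p_bg
        for t, p_bg in background.items()
    }


# Hypothetical background model and a four-token document.
background = {"a": 0.5, "b": 0.3, "c": 0.2}
model = dirichlet_smoothed_model(["a", "a", "b", "a"], background, mu=4)
```
]]></preformat>
        <p>With a document length of 4 and a prior of 4, both components receive a weight of 0.5, so the unseen term c obtains half of its background probability.</p>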
      </sec>
      <sec id="sec-2-2">
        <title>Query Modeling</title>
        <p>
          Relevance feedback can be applied to better capture a user’s information need [
          <xref ref-type="bibr" rid="ref1 ref12">1, 12, 19</xref>
          ]. In the context
of statistical language modeling, this is usually performed by estimating a new query model, viz. <inline-formula><tex-math>P(t|\theta_Q)</tex-math></inline-formula>
in Eq. 2 [16, 21]. Automatically reformulating queries (or blind relevance feedback) entails looking at the
terms in some set of (pseudo-)relevant documents and selecting the most informative ones with respect to
the set or the collection. These terms may then be reweighted based on information pertinent to the query
or to the documents and, in a language modeling setting, be used to estimate a query model.
        </p>
        <p>
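          Relevance modeling is one specific technique by which to estimate a query model given a set of
(pseudo-)relevant documents <inline-formula><tex-math>\mathcal{D}_Q</tex-math></inline-formula>. The query and documents are both taken to be samples of an
underlying generative model, the relevance model. There are several ways by which to estimate the parameters
of this model given the observed data (query and documents), each following a different independence
assumption [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. For our current experiments we use method 2, which is formulated as:
        </p>
        <p>
          <disp-formula>
            <label>(4)</label>
            <tex-math>P(t|\hat{\theta}_Q) \propto P(t) \prod_{q_i \in Q} \sum_{D_i \in \mathcal{D}_Q} P(q_i|\theta_{D_i}) P(\theta_{D_i}|t),</tex-math>
          </disp-formula>
        </p>
        <p>where <inline-formula><tex-math>q_1, \ldots, q_k</tex-math></inline-formula> are the query terms, <inline-formula><tex-math>D</tex-math></inline-formula> a document, and <inline-formula><tex-math>t</tex-math></inline-formula> a term. Bayes’ rule is used to estimate the term
<inline-formula><tex-math>P(\theta_D|t)</tex-math></inline-formula>:</p>
        <p>
          <disp-formula>
            <label>(5)</label>
            <tex-math>P(\theta_D|t) = \frac{P(t|\theta_D) P(\theta_D)}{P(t)},</tex-math>
          </disp-formula>
        </p>
        <p>where we assume the document prior <inline-formula><tex-math>P(\theta_D)</tex-math></inline-formula> to be uniform. Similar to Eq. 3, the term <inline-formula><tex-math>P(t|\theta_D)</tex-math></inline-formula> may be
interpreted as a way of accounting for the fact that the (pseudo-)relevant documents contain terms related
to the information need as well as terms from a more general model. We set it to the following mixture:</p>
        <p>
          <disp-formula>
            <label>(6)</label>
            <tex-math>P(t|\theta_D) = \lambda_R \frac{n(t;D)}{|D|} + (1 - \lambda_R) P(t),</tex-math>
          </disp-formula>
        </p>
        <p>
          where <inline-formula><tex-math>P(t)</tex-math></inline-formula> is the probability of observing <inline-formula><tex-math>t</tex-math></inline-formula> in a sufficiently large collection such as the entire document
collection. Query models obtained using relevance models perform better when they are subsequently
interpolated with the initial query using a mixing weight <inline-formula><tex-math>\lambda_Q</tex-math></inline-formula> [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]:
        </p>
        <p>
          <disp-formula>
            <label>(7)</label>
            <tex-math>P(t|\theta_Q) = \lambda_Q P(t|\tilde{\theta}_Q) + (1 - \lambda_Q) P(t|\hat{\theta}_Q).</tex-math>
          </disp-formula>
        </p>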
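        <p>To make the estimation steps concrete, the Python sketch below computes a relevance model according to method 2 and interpolates it with the original query. It is an illustration under assumptions rather than the code used for our runs: the function names, the mixing weight of 0.8, and the two feedback documents are hypothetical, and the vocabulary is taken to be the set of terms in the background model.</p>
        <preformat><![CDATA[
```python
def relevance_model(query_terms, feedback_docs, background, lam_r):
    """Estimate the relevance model with method 2 (Eq. 4), normalized to sum to one.

    feedback_docs maps a document id to its term-count dictionary.
    """
    def p_t_given_d(t, counts):
        # Eq. 6: mixture of the document MLE and the background model.
        length = sum(counts.values())
        return lam_r * counts.get(t, 0) / length + (1.0 - lam_r) * background[t]

    def p_d_given_t(t, counts):
        # Eq. 5 with a uniform document prior of 1 / |D_Q|.
        return p_t_given_d(t, counts) / (len(feedback_docs) * background[t])

    weights = {}
    for t in background:
        w = background[t]
        for q in query_terms:
            w *= sum(p_t_given_d(q, c) * p_d_given_t(t, c) for c in feedback_docs.values())
        weights[t] = w
    norm = sum(weights.values())
    return {t: w / norm for t, w in weights.items()}


def interpolate(original, expanded, lam_q):
    """Eq. 7: lam_q * original query model + (1 - lam_q) * expanded query model."""
    terms = set(original) | set(expanded)
    return {t: lam_q * original.get(t, 0.0) + (1.0 - lam_q) * expanded.get(t, 0.0) for t in terms}


# Hypothetical background model and pseudo-relevant documents.
background = {"social": 0.2, "policy": 0.2, "youth": 0.2, "the": 0.4}
feedback_docs = {
    "d1": {"social": 2, "policy": 1, "the": 1},
    "d2": {"social": 1, "youth": 2, "the": 1},
}
rm = relevance_model(["social"], feedback_docs, background, lam_r=0.8)
mixed = interpolate({"social": 1.0}, rm, lam_q=0.5)
```
]]></preformat>
        <p>In this toy example the query term receives the largest weight, but the general term "the" comes second, which illustrates why making the estimates more sparse can be worthwhile.</p>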
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Concept Models</title>
      <p>
        In order to leverage the explicit knowledge encapsulated in the GIRT/CSASA thesauri, we perform blind
relevance feedback using the concepts defined therein. We use the dual document representations,
concepts and terms, to create a generative language model for each concept, which bridges the gap
between terms and concepts. Related work has also used textual representations to represent concepts, see
e.g., [
        <xref ref-type="bibr" rid="ref4">4, 17</xref>
        ]. We, in contrast, use statistical language modeling techniques to parametrize the concept models,
again by leveraging the dual representation of the documents.
      </p>
      <p>
        To incorporate concepts in the retrieval process, we propose a conceptual query model which is an
interpolation of the initial query with another query model. This model is obtained from a double concept
translation. In this translation, concepts are used as a pivot language [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]; the initial query is translated to
concepts and back to expanded query terms:
      </p>
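      <p>
        <disp-formula>
          <label>(8)</label>
          <tex-math>P(t|\theta_Q) = \lambda_Q P(t|\tilde{\theta}_Q) + (1 - \lambda_Q) \sum_{c} P(t|c) P(c|Q).</tex-math>
        </disp-formula>
      </p>
      <p>Note that we assume that the probability of selecting a term is no longer dependent on the query once we
have selected a concept given that query. Then, two components need to be estimated: the probability of
a concept given a query <inline-formula><tex-math>P(c|Q)</tex-math></inline-formula> and of a term given a concept <inline-formula><tex-math>P(t|c)</tex-math></inline-formula>. To acquire <inline-formula><tex-math>P(t|c)</tex-math></inline-formula>, we use the
assignments of GIRT/CSASA thesaural concepts to the documents in the collection and aggregate over the
documents <inline-formula><tex-math>\mathcal{D}_c</tex-math></inline-formula> which are labeled with a particular concept <inline-formula><tex-math>c</tex-math></inline-formula>:</p>
      <p>
        <disp-formula>
          <tex-math>P(t|c) = \sum_{D \in \mathcal{D}_c} P(t|D, c) P(D|c).</tex-math>
        </disp-formula>
      </p>
      <p>We drop the conditional dependence of <inline-formula><tex-math>t</tex-math></inline-formula> on <inline-formula><tex-math>c</tex-math></inline-formula> given <inline-formula><tex-math>D</tex-math></inline-formula>, again assume the document prior to be uniform,
and apply Bayes’ rule to obtain:</p>
      <p>
        <disp-formula>
          <label>(9)</label>
          <tex-math>P(t|c) = \frac{1}{P(c)} \sum_{D \in \mathcal{D}_c} P(t|\theta_D) P(c|\theta_D),</tex-math>
        </disp-formula>
      </p>
      <p>where <inline-formula><tex-math>P(c)</tex-math></inline-formula> is a maximum likelihood (ML) estimate on the collection:</p>
      <p>
        <disp-formula>
          <tex-math>P(c) = \frac{\sum_{D} n(c;D)}{\sum_{c'} \sum_{D'} n(c';D')},</tex-math>
        </disp-formula>
      </p>
      <p>and <inline-formula><tex-math>P(c|\theta_D)</tex-math></inline-formula> is determined using the ML estimate of the concepts associated with that document:</p>
      <p>
        <disp-formula>
          <label>(10)</label>
          <tex-math>P(c|\theta_D) = \frac{n(c;D)}{\sum_{c'} n(c';D)}.</tex-math>
        </disp-formula>
      </p>
      <p>Next, we also need a way of estimating concepts for each query, which means that we are looking for
a set of concepts <inline-formula><tex-math>C_Q</tex-math></inline-formula> such that the concepts <inline-formula><tex-math>c \in C_Q</tex-math></inline-formula> have the highest posterior probability <inline-formula><tex-math>P(c|Q)</tex-math></inline-formula>. We approach this by
looking at the assignment of concepts to documents and again consider documents which are related to the
original query, by using the top ranked documents <inline-formula><tex-math>\mathcal{D}_Q</tex-math></inline-formula> from the initial retrieval run:</p>
      <p>
        <disp-formula>
          <label>(11)</label>
          <tex-math>P(c|Q) = \sum_{D \in \mathcal{D}_Q} P(c|\theta_D) P(D|Q),</tex-math>
        </disp-formula>
      </p>
      <p>where <inline-formula><tex-math>P(D|Q)</tex-math></inline-formula> is determined using the retrieval scores. Note that we again assume that the probability of
observing a concept is independent of the query, once we have selected a document given the query, i.e.,
<inline-formula><tex-math>P(c|D, Q) = P(c|\theta_D)</tex-math></inline-formula>. This enables us to directly use Eq. 10.</p>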
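      <p>The double translation, from query to concepts and back to terms, can be sketched in Python as follows. This is a simplified illustration under assumptions: both document-level factors are plain maximum-likelihood estimates without smoothing, the division by P(c) is realized by renormalizing each concept model, and the documents, concept labels, and P(D|Q) values are hypothetical.</p>
      <preformat><![CDATA[
```python
from collections import Counter, defaultdict


def concept_term_models(docs):
    """P(t|c) from the sum over annotated documents of P(t|D) * P(c|D).

    docs maps a document id to a (terms, concepts) pair. Each concept model
    is renormalized to sum to one, which plays the role of the 1 / P(c) factor.
    """
    models = defaultdict(Counter)
    for terms, concepts in docs.values():
        term_counts, concept_counts = Counter(terms), Counter(concepts)
        for c, n_c in concept_counts.items():
            p_c_d = n_c / len(concepts)
            for t, n_t in term_counts.items():
                models[c][t] += (n_t / len(terms)) * p_c_d
    return {c: {t: w / sum(m.values()) for t, w in m.items()} for c, m in models.items()}


def concepts_for_query(ranked, docs):
    """P(c|Q) as the sum over top-ranked documents of P(c|D) * P(D|Q).

    ranked maps each top-ranked document id to P(D|Q).
    """
    p_c_q = Counter()
    for doc_id, p_d_q in ranked.items():
        concepts = docs[doc_id][1]
        for c, n_c in Counter(concepts).items():
            p_c_q[c] += (n_c / len(concepts)) * p_d_q
    return dict(p_c_q)


def translate_back(p_c_q, p_t_c):
    """Second translation step: sum over concepts of P(t|c) * P(c|Q)."""
    expanded = Counter()
    for c, p_c in p_c_q.items():
        for t, p_t in p_t_c[c].items():
            expanded[t] += p_t * p_c
    return dict(expanded)


# Hypothetical annotated collection and initial retrieval result.
docs = {
    "d1": (["youth", "unemployment", "policy"], ["Labour market"]),
    "d2": (["youth", "culture"], ["Youth", "Culture"]),
    "d3": (["unemployment", "benefits"], ["Labour market"]),
}
ranked = {"d1": 0.6, "d3": 0.4}

p_t_c = concept_term_models(docs)
p_c_q = concepts_for_query(ranked, docs)
expanded = translate_back(p_c_q, p_t_c)
```
]]></preformat>
      <p>The resulting term distribution would then be interpolated with the original query model, in the same way as for relevance models.</p>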
    </sec>
    <sec id="sec-4">
      <title>Parsimonization</title>
      <fig>
        <label>Figure 1</label>
        <caption>
          <p>Precision-recall plot (vertical axis: precision, 0.0 to 0.8) for the runs UAmsBaseline, UAmsConceptModels, UAmsParsRelModels, and UAmsRelModels.</p>
        </caption>
      </fig>
      <p>
        If <inline-formula><tex-math>P(t|\theta_D)</tex-math></inline-formula> and <inline-formula><tex-math>P(c|\theta_D)</tex-math></inline-formula> in Eq. 6 and Eq. 10 are estimated based on MLE, general terms and concepts may
acquire too much probability mass, simply because they occur more frequently. Parsimonization may be
used to reduce the amount and probability mass of non-specific terms in a language model by iteratively
adjusting the individual term probabilities based on a comparison with a large reference corpus, such as the
collection [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. While one of the introduced models may already contain a way of incorporating a reference
corpus, viz. Eq. 6, we propose to make the estimates more sparse. Doing so enables more specific terms to
receive more probability mass, thus making the resulting model more to the point. In order to achieve this,
we consider both models to be a mixture of a document model <inline-formula><tex-math>P(x|\theta_D)</tex-math></inline-formula> and a background model <inline-formula><tex-math>P(x)</tex-math></inline-formula>,
where <inline-formula><tex-math>x \in \{t, c\}</tex-math></inline-formula>, and we “parsimonize” the estimates by applying the following EM algorithm until
the estimates no longer change significantly:
      </p>
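      <sec id="sec-4-1">
        <title>E-step:</title>
        <p>
          <disp-formula>
            <tex-math>e_x = n(x;D) \cdot \frac{\lambda P(x|\theta_D)}{(1 - \lambda) P(x) + \lambda P(x|\theta_D)}</tex-math>
          </disp-formula>
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>M-step:</title>
        <p>
          <disp-formula>
            <tex-math>P(x|\theta_D) = \frac{e_x}{\sum_{x'} e_{x'}}</tex-math>
          </disp-formula>
        </p>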
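        <p>The two steps can be iterated as in the Python sketch below. It is a minimal illustration, not the code used for our runs: the pruning threshold, the convergence tolerance, the iteration cap, and the toy counts are assumptions, and the background model is assumed to give every event a non-zero probability.</p>
        <preformat><![CDATA[
```python
def parsimonize(counts, background, lam=0.15, tol=1e-6, max_iter=100, threshold=1e-4):
    """Re-estimate P(x|D) against a background model with EM.

    counts maps each event x (a term or a concept) to n(x; D). Events whose
    probability falls below `threshold` are dropped, which is what makes the
    resulting model sparse.
    """
    total = sum(counts.values())
    model = {x: n / total for x, n in counts.items()}
    for _ in range(max_iter):
        # E-step: expected count of x attributed to the document model.
        e = {
            x: counts[x] * lam * p / ((1.0 - lam) * background[x] + lam * p)
            for x, p in model.items()
        }
        # M-step: renormalize, then prune events with negligible probability.
        norm = sum(e.values())
        updated = {x: v / norm for x, v in e.items() if v / norm >= threshold}
        norm = sum(updated.values())
        updated = {x: v / norm for x, v in updated.items()}
        change = max(abs(updated.get(x, 0.0) - p) for x, p in model.items())
        model = updated
        if change < tol:
            break
    return model


# Hypothetical counts: "the" is frequent in the document but also in the collection.
counts = {"the": 10, "unemployment": 5, "youth": 5}
background = {"the": 0.5, "unemployment": 0.01, "youth": 0.02}
sparse = parsimonize(counts, background)
```
]]></preformat>
        <p>In this toy example the general term, which the background model already predicts well, loses its probability mass to the two more specific terms.</p>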
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Experimental Setup</title>
      <p>We did not perform any preprocessing on the document collection, besides replacing German characters as
well as HTML entities. To estimate our concept models, we have used the CONTROLLED-TERM-EN field
in the documents. We submitted the following runs: UAmsBaseline (query-likelihood baseline), UAmsRelModels
(relevance models), UAmsParsRelModels (parsimonious relevance models), and UAmsConceptModels (concept models).
An error in the submitted UAmsParsRelModels run, which is based on parsimonious relevance
models, made it identical to the UAmsRelModels run. In this paper we report on the corrected version.</p>
      <fig>
        <label>Figure 2</label>
        <caption>
          <p>Per-topic difference in average precision with respect to the query-likelihood baseline (vertical axis from -0.35 to 0.45; topics on the horizontal axis).</p>
        </caption>
      </fig>
      <p>
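        In all runs which use blind relevance feedback, we use the 5 terms with the highest probability from the
10 highest ranked documents to estimate our query models. We have then used the 2007 CLEF Domain
Specific topics to find the optimal parameter settings for the mixture weight in Eq. 6 and the interpolation
weight in Eq. 7 and Eq. 8. For our current
experiments we set the Dirichlet prior <inline-formula><tex-math>\mu = 50</tex-math></inline-formula> and fix the parsimonization weight <inline-formula><tex-math>\lambda = 0.15</tex-math></inline-formula> [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>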
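      <p>Restricting a feedback model to its most probable terms can be sketched in Python as follows. Renormalizing the retained terms so that they sum to one is an assumption of this sketch, as are the function name and the toy probabilities.</p>
      <preformat><![CDATA[
```python
def top_k_query_model(term_probs, k=5):
    """Keep the k most probable terms and renormalize them to sum to one."""
    top = sorted(term_probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    norm = sum(p for _, p in top)
    return {t: p / norm for t, p in top}


# Hypothetical feedback model over six terms.
probs = {"a": 0.40, "b": 0.20, "c": 0.15, "d": 0.10, "e": 0.08, "f": 0.07}
top5 = top_k_query_model(probs, k=5)
```
]]></preformat>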
    </sec>
    <sec id="sec-6">
      <title>Results and Discussion</title>
      <p>
        Table 1 lists the results of our runs. On the 2007 data, we found that adding the narrative field of the topics
helps retrieval effectiveness. For comparative purposes we have included results for both title-only and
title+narrative runs. On the 2008 topics, we do not find the same improvement when adding the narrative
field, besides slightly improving precision@10. When looking at the longer topics (title+narrative),
applying parsimonization to the relevance models hurts mean average precision, but helps early precision. This
precision-enhancing effect is in line with earlier results [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
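      <p>For reference, the two measures used throughout this section can be computed as in the following Python sketch. It follows the usual definitions of precision at a cutoff and of (mean) average precision; the function names and the toy ranking are hypothetical, and the figures reported in this paper were not produced with this code.</p>
      <preformat><![CDATA[
```python
def precision_at_k(ranking, relevant, k=10):
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(1 for d in ranking[:k] if d in relevant) / k


def average_precision(ranking, relevant):
    """Sum of the precision at each rank where a relevant document appears,
    divided by the total number of relevant documents."""
    hits, total = 0, 0.0
    for rank, d in enumerate(ranking, start=1):
        if d in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(rankings, qrels):
    """Mean of the per-topic average precision values."""
    return sum(average_precision(rankings[q], qrels[q]) for q in qrels) / len(qrels)


# Hypothetical ranking and relevance judgments for one topic.
ranking = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}
ap = average_precision(ranking, relevant)
p2 = precision_at_k(ranking, relevant, k=2)
```
]]></preformat>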
      <p>The proposed concept models improve significantly over the query-likelihood baseline, both in terms
of mean average precision and precision@10 and for both title-only and title+narrative topics. From the
precision-recall plot in Figure 1 (title-only) it is clear that our concept model improves slightly in early
precision and that the biggest gain is obtained between recall levels 0.2 and 0.7. It also shows that the
relevance modeling approaches mainly help to improve recall and not so much precision.</p>
      <p>Figure 2 displays a per-topic comparison between the query-likelihood run and each of the other runs.
From these contrastive plots it emerges that topics 205 and 225 are hurt most when using relevance
models. Further analysis should indicate which characteristics of these topics are responsible for this result.
Interestingly, these two topics are hurt less when we apply our concept models, whereas topic 219 is hurt
most in this run. On the other side of the graph, there are quite a few topics which are helped by either
relevance models or concept models. Topic 216 in particular is improved when applying parsimonious
relevance models (an increase of more than 0.25 in mean average precision). The positive difference when applying concept
models is even more pronounced; topic 217 improves by nearly 0.5 in mean average precision.</p>
    </sec>
    <sec id="sec-7">
      <title>Conclusion</title>
      <p>We have described our participation in this year’s CLEF Domain Specific track. Our aim was to evaluate
blind relevance feedback models as well as concept models on the CLEF Domain Specific test collection.
The results of our experiments show that applying relevance modeling techniques has a significant positive
effect on the current topics, in terms of both mean average precision and precision@10. Moreover, we
find that parsimonizing the relevance models helps mean average precision on title-only queries and early
precision on title+narrative queries. When we apply concept models for blind relevance feedback, we
observe an even larger, and significant, improvement over the query-likelihood baseline, again in terms
of mean average precision and early precision. Moreover, unlike (parsimonious) relevance models, our
concept model improves title-only as well as title+narrative queries on both measures.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgements</title>
      <p>This work was carried out in the context of the Virtual Laboratory for e-Science project, which is supported
by a BSIK grant from the Dutch Ministry of Education, Culture and Science (OC&amp;W) and is part of the
ICT innovation program of the Ministry of Economic Affairs (EZ). Maarten de Rijke was supported by the
E.U. IST programme of the 6th FP for RTD under project MultiMATCH contract IST-033104, and by the
Netherlands Organisation for Scientific Research (NWO) under project numbers 220-80-001, 017.001.190,
640.001.501, 640.002.501, 612.066.512, STE-07-012, 612.061.814, 612.061.815.</p>
      <p>[16] J. M. Ponte and W. B. Croft. A language modeling approach to information retrieval. In SIGIR ’98,
1998.</p>
      <p>[17] D. R. Recupero. A new unsupervised method for document clustering by using wordnet lexical and
conceptual relations. Inf. Retr., 10(6):563–579, 2007.</p>
      <p>[18] D. Trieschnigg, E. Meij, M. de Rijke, and W. Kraaij. Measuring concept relatedness using language
models. In SIGIR ’08, 2008.</p>
      <p>[19] J. Xu and W. B. Croft. Query expansion using local and global document analysis. In SIGIR ’96,
pages 4–11, 1996.</p>
      <p>[20] C. Zhai. Risk Minimization and Language Modeling in Text Retrieval. PhD thesis, Carnegie Mellon
University, 2002.</p>
      <p>[21] C. Zhai and J. Lafferty. Model-based feedback in the language modeling approach to information
retrieval. In CIKM ’01, pages 403–410, New York, NY, USA, 2001. ACM Press.</p>
      <p>[22] C. Zhai and J. Lafferty. A study of smoothing methods for language models applied to information
retrieval. ACM Trans. Inf. Syst., 22(2):179–214, April 2004.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P.</given-names>
            <surname>Anick</surname>
          </string-name>
          .
          <article-title>Using terminological feedback for web search refinement: a log-based study</article-title>
          .
          <source>In SIGIR '03</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S. F.</given-names>
            <surname>Chen</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Goodman</surname>
          </string-name>
          .
          <article-title>An empirical study of smoothing techniques for language modeling</article-title>
          .
          <source>In ACL</source>
          , pages
          <fpage>310</fpage>
          -
          <lpage>318</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A. P.</given-names>
            <surname>Dempster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. M.</given-names>
            <surname>Laird</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>D. B.</given-names>
            <surname>Rubin</surname>
          </string-name>
          .
          <article-title>Maximum likelihood from incomplete data via the EM algorithm</article-title>
          .
          <source>Journal of the Royal Statistical Society. Series B (Methodological)</source>
          ,
          <volume>39</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>38</lpage>
          ,
          <year>1977</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>E.</given-names>
            <surname>Gabrilovich</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Markovitch</surname>
          </string-name>
          .
          <article-title>Computing semantic relatedness using wikipedia-based explicit semantic analysis</article-title>
          .
          <source>In IJCAI'07</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>W.</given-names>
            <surname>Hersh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. T.</given-names>
            <surname>Bhupatiraju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Roberts</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Hearst</surname>
          </string-name>
          .
          <article-title>TREC 2005 Genomics track overview</article-title>
          .
          <source>In Proceedings of the 14th Text Retrieval Conference. NIST</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>W.</given-names>
            <surname>Hersh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cohen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Roberts</surname>
          </string-name>
          .
          <article-title>TREC 2007 Genomics track overview</article-title>
          .
          <source>In TREC '07</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Hiemstra</surname>
          </string-name>
          .
          <article-title>A linguistically motivated probabilistic model of information retrieval</article-title>
          .
          <source>In ECDL '98</source>
          , pages
          <fpage>569</fpage>
          -
          <lpage>584</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Hiemstra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Robertson</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Zaragoza</surname>
          </string-name>
          .
          <article-title>Parsimonious language models for information retrieval</article-title>
          .
          <source>In SIGIR '04</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>W.</given-names>
            <surname>Kraaij</surname>
          </string-name>
          and
          <string-name>
            <given-names>F.</given-names>
            <surname>de Jong</surname>
          </string-name>
          .
          <article-title>Transitive probabilistic CLIR models</article-title>
          .
          <source>In Proceedings of RIAO 2004</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>O.</given-names>
            <surname>Kurland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Domshlak</surname>
          </string-name>
          .
          <article-title>Better than the real thing?: iterative pseudo-query processing using cluster-based language models</article-title>
          .
          <source>In SIGIR '05</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lafferty</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhai</surname>
          </string-name>
          .
          <article-title>Document language models, query models, and risk minimization for information retrieval</article-title>
          .
          <source>In SIGIR '01</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>V.</given-names>
            <surname>Lavrenko</surname>
          </string-name>
          and
          <string-name>
            <given-names>W. B.</given-names>
            <surname>Croft</surname>
          </string-name>
          .
          <article-title>Relevance based language models</article-title>
          .
          <source>In SIGIR '01</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>E.</given-names>
            <surname>Meij</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>de Rijke</surname>
          </string-name>
          .
          <article-title>Thesaurus-based feedback to support mixed search and browsing environments</article-title>
          .
          <source>In ECDL '07</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>E.</given-names>
            <surname>Meij</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Trieschnigg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>de Rijke</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Kraaij</surname>
          </string-name>
          .
          <article-title>Parsimonious concept modeling</article-title>
          .
          <source>In SIGIR '08</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>E.</given-names>
            <surname>Meij</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Weerkamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>de Rijke</surname>
          </string-name>
          .
          <article-title>Parsimonious relevance models</article-title>
          .
          <source>In SIGIR '08</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>