<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Microblog Search Task at CLEF 2017: Query Generation using IR and LDA Topic Modeling Combination</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Malek</forename><surname>Hajjem</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Sciences of Tunis</orgName>
								<orgName type="laboratory">LIPAH research Laboratory</orgName>
								<orgName type="institution">Tunis EL Manar Univeristy</orgName>
								<address>
									<addrLine>Campus Universitaire Farhat Hached B.P. n94</addrLine>
									<postCode>1068</postCode>
									<settlement>Tunis</settlement>
									<country key="TN">Tunisia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Chiraz</forename><surname>Latiri</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Sciences of Tunis</orgName>
								<orgName type="laboratory">LIPAH research Laboratory</orgName>
								<orgName type="institution">Tunis EL Manar Univeristy</orgName>
								<address>
									<addrLine>Campus Universitaire Farhat Hached B.P. n94</addrLine>
									<postCode>1068</postCode>
									<settlement>Tunis</settlement>
									<country key="TN">Tunisia</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Microblog Search Task at CLEF 2017: Query Generation using IR and LDA Topic Modeling Combination</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">167A0091D803D0605269712A28DBDCEC</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T20:29+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>CLEF</term>
					<term>Microblogs Search</term>
					<term>LDA</term>
					<term>Information Retrieval</term>
					<term>Aggregation</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The microblogs search task at CLEF 2017 consists of developing a system to search the most relevant microblogs for cultural query in a collection about festivals in all languages. Our general approach to get this objective is the following: we propose to generate from the initial tweet queries, provided for the task, extended queries able to get an answer-rich set of microblogs. This is achieved using a thematic representation of tweet query extracted from microblog corpus. We investigate in this paper a novel method to improve topics learned from Twitter content without modifying the basic machinery of LDA. This latter is based on Information Retrieval (IR) process to generate a query-specific set of similar tweets. The result then represent the input of a basic LDA topic modeling process. Finally, the output thematic cluster serves as our source of expansion for the initial queries.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The microblog search is the second task<ref type="foot" target="#foot_0">1</ref> at CLEF 2017 <ref type="bibr" target="#b7">[8]</ref>(Conference and Labs of the Evaluation Forum) from Cultural Microblog Contextualization track. This task consists in developing a system to search the most relevant microblogs for cultural query in a collection about festivals in all languages. Topics, announcing some cultural event, were gathered from different sources<ref type="foot" target="#foot_1">2</ref>  <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b3">4]</ref>. The goal is to retrieve relevant and diverse tweets related to each event from a dataset of 70 000 000 microblogs. This corpus dates from May to September 2015 and is about the keyword "Festival". We consider that using topic models such as Latent Dirichlet Allocation (LDA) could be useful in this microblog search task. Taking into account that "topic model is often employed to mine "latent topics" from high dimensionality of terms in text" <ref type="bibr" target="#b1">[2]</ref>, these topics can be used to describe the content of a collection or a query. In fact, the high probability topics and words within the topics can be viewed as a loose description of a text data. The contribution of this work is to identifying the "topic" or topics being discussed in a query, and then using this knowledge of topics to include semantically related words. To better match with ambiguous nature of tweet query, topics are extracted from a microblog corpus. A novel method to improve topics learned from Twitter content without modifying the basic machinery of LDA is investigated. Following this introduction, the paper is organized as follows. In Section 2, a state of the art is shown. In Section 3, the methodology used is presented. The Conclusion section wraps up the paper.</p><p>2 Related works 2.1 Topic modeling for short texts Topic models are used to uncover the latent semantic structure from text corpus. A topic consists of a cluster of words that frequently occur together. Using contextual clues, topic models can connect words with similar meanings and distinguish between uses of words with multiple meanings. Traditional topic models, like LDA, rely on co-occurrence patterns of words in documents to learn latent topics <ref type="bibr" target="#b1">[2]</ref>. Due to the messy nature of short texts, applying directly conventional topic models (e.g. LDA and pLSA) on such short texts is not efficient. Indeed, naive topic models implicitly capture the document-level word co-occurrence patterns to reveal topics. Thus, to avoid data sparsity, works such as <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b8">9]</ref> have applied topic models to tweets based on a pooling strategy. It consists in aggregating similar short texts in one document. In <ref type="bibr" target="#b11">[12]</ref>, authors proposed tweet pooling startegy which was based on user aggregated messages. Authors in <ref type="bibr" target="#b6">[7]</ref> also have experimented several schemes to train a standard topic model and compare their quality and effectiveness. In <ref type="bibr" target="#b0">[1]</ref>, authors have proposed to gather tweets occurring in a same user-to-user conversation and show that this new technique outperforms other pooling methods in terms of clustering quality and document retrieval. Closer to our work, authors in <ref type="bibr" target="#b8">[9]</ref> have proposed a new method of tweet pooling using hashtags where documents with common hashtag were gathered. All these works have proved that by training a topic model on aggregated messages, they obtained a higher quality of learned model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Topic model for IR: query expansion</head><p>There are two obvious approaches to including topic models in IR. In the first, a document is represented by itself and the topics to which it belongs. A second approach is to calculate a query related topic by using topic models and use it for query expansion. In this case, queries are reformulated (i.e. usually expanded) to improve the retrieval effectiveness. Authors in <ref type="bibr" target="#b12">[13]</ref> proposed a method to find a good query-related topic based on LDA where experiments confirm that query expansion based on the derived topics achieved statistically significant improvements. Others, in <ref type="bibr" target="#b10">[11]</ref> implement one of most common local approach of query expansion -Pseudo Relevance Feedback (PRF). In this last, top k documents are considered to be relevant and extracts their topic's terms to extend queries.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Topic model for microblog search</head><p>With the rapid development of microblogs, microblog search has become one of the most trendy research areas in recent years. In contrast to traditional text retrieval, microblog search significantly differs. In fact, microblogs users often issue short queries to find relevant information. Moreover, the restriction of messages length lead to a problem in discriminating terms within a given item. To improve retrieval effectiveness in microblogs researches tend to use query expansion techniques. The goal is reducing the usual query/document mismatch. In this area, using topic model like, LDA, could be useful. However, unlike regular text, Topic modeling is not good at processing short text. Rare are works which try to enhance microblog search using topic modeling such as LDA. We cite <ref type="bibr" target="#b9">[10]</ref>, where authors present a method of contextualization of short messages using a thematic representation extracted from Wikipedia. This representation allows to extend the vocabulary of short messages by a set of thematically related words. The results show the contribution of this method to a better understanding of short messages. Other works like <ref type="bibr" target="#b2">[3]</ref> propose a novel approach to locating the microblogging experts on a given query. First they define the experts by social influence and content relevance. For the social influence, they present a global-ranking algorithm as GUserRank and a topicranking algorithm as TUserRank after applying the LDA topic model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Methodology</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Ressources and Data pre-proceeding</head><p>To build a robust LDA model, a large amount of data is needed. From this perspective, two tweet corpora are used in different runs. We note that we have choose to use explicitly microblog corpus to respect the noisy nature of tweet query. The firt corpus is a comparable tweet corpus about Arab spring collected through Twitter's API<ref type="foot" target="#foot_2">3</ref> in Arabic and French languages. Basically, accessing Twitter data is done by collecting tweets that contain specific keywords. More information about this corpus could be found here <ref type="bibr" target="#b5">[6]</ref>. The second tweet corpus is provided to the participants in the evaluation campaign. It is composed of 70 000 000 microblogs. This corpus dates from May to September 2015 and is about the keyword "Festival". Notice that before we applied LDA, redundancy was eliminated by deleting retweets. A language detection was also performed using a java library<ref type="foot" target="#foot_3">4</ref> to remove foreign language tweets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Information retrieval based approach for tweet pooling</head><p>In this work, an unsupervised topic model based on aggregating tweets that are thematically closed is presented. The goal is to adapt the LDA basic process to short tweet text. This will lead to improve the quality of topics latent discovered. To perform tweet pooling, we propose to use information retrieval strategy and hierarchical classification in order to avoid data sparsity in short texts as illustrated in Figure <ref type="figure" target="#fig_1">1</ref>. Our approach represents an alternative of state of the art methods based on tweet pooling via meta data (hashtags, user information, etc). Indeed, such methods are highly dependent on the meta data content of the tweet corpus. Our approach relies on three main steps, namely:</p><p>-Step 1: Preliminary set generation: For each tweet i , we propose to retrieve a set γ i of matched tweets out of a large tweet collection C of n tweets, using an information retrieval system. Thus, for a tweet tweet i , performed as a query, several tweets in C may match it with different similarity degrees.  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Description of Runs submitted</head><p>The submitted runs follow three steps (see Figure <ref type="figure" target="#fig_0">2</ref>):</p><p>-Extract Topics using the combined method of IR and aggregation strategy as described above section 3.2 -Project the resulted Topic on the tweet text to detect the subset of thematic relevant terms -Reformulate the initial query using the subset of thematic terms as enriched features in form of indri query -RUN 1: Extend the initial query using the latent topic extracted from comparable tweet corpus <ref type="bibr" target="#b4">[5]</ref> through IR-LDA -RUN 2: Extend the initial query using latent topic extracted from the Festival tweet corpus through IR-LDA</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Conclusion</head><p>A large multilingual collection of posts have become publicly available due to the phenomenal growth of using social networks and microblogs all over the world. This makes from Microblogs valuable information sources. However little is known about how search socially-generated content in effective way. In this paper, we present a method to expand short messages using a thematic representation. A novel method for aggregating tweets in order to improve the quality of LDA-based topic modeling for short text is used to improve the quality of latent topics.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>-Step 2 :</head><label>2</label><figDesc>Pooled set construction: The idea is to aggregate a set of tweets in Γ partitions by gathering preliminary sets, resulting from the IR process, according to a combination criterion. If same search results are assigned to different tweets then the tweets are considered thematically close. -Step 3: Topic extraction: It consists on learning latent topics from aggregating tweets via LDA.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 1 :</head><label>1</label><figDesc>Fig. 1: IR for tweet aggregation</figDesc><graphic coords="4,181.68,389.51,251.97,188.98" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 2 :</head><label>2</label><figDesc>Fig. 2: Query expansion using latent topic inferred through IR-LDA method</figDesc><graphic coords="5,181.68,222.57,251.97,188.98" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="6,139.68,115.84,336.00,252.00" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://mc2.talne.eu/spip/Tasks/2-microblog-search/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">spanish query: http://www.jornada.unam.mx/2017/05/26/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">Otterapi a real time search engine that indexes the most influential tweets search api https://dev.twitter.com/rest/public/search</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://code.google.com/p/language-detection/</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Topic modeling in twitter: Aggregating tweets by conversations</title>
		<author>
			<persName><forename type="first">D</forename><surname>Alvarez-Melis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Saveski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Tenth International AAAI Conference on Web and Social Media</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Latent dirichlet allocation</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">M</forename><surname>Blei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">Y</forename><surname>Ng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">I</forename><surname>Jordan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J. Mach. Learn. Res</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="993" to="1022" />
			<date type="published" when="2003-03">Mar. 2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Locating query-oriented experts in microblog search</title>
		<author>
			<persName><forename type="first">Q</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>He</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of Workshop on Semantic Matching in Information Retrieval co-located with the 37th international ACM SIGIR conference on research and development in information retrieval, SMIR@SIGIR 2014</title>
				<meeting>Workshop on Semantic Matching in Information Retrieval co-located with the 37th international ACM SIGIR conference on research and development in information retrieval, SMIR@SIGIR 2014<address><addrLine>Queensland, Australia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2014-07-11">July 11, 2014. 2014</date>
			<biblScope unit="page" from="16" to="23" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Contextualisation de messages courts :l&apos;importance des métadonnées</title>
		<author>
			<persName><forename type="first">J.-V</forename><surname>Cossu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gaillard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T.-M</forename><surname>Juan-Manuel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">El</forename><surname>Bèze</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">EGC&apos;2013 13e Conférence Francophone sur l&apos;Extraction et la Gestion des connaissances</title>
				<meeting><address><addrLine>Toulouse, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-01">Jan. 2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Features extraction to improve comparable tweet corpora building</title>
		<author>
			<persName><forename type="first">M</forename><surname>Hajjem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Latiri</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">JADT Acte</title>
				<meeting><address><addrLine>, nice, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2016-06">Juin 2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Twitter as a multilingual source of comparable corpora</title>
		<author>
			<persName><forename type="first">M</forename><surname>Hajjem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">C</forename><surname>Latiri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Slimani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 12th International Conference on Advances in Mobile Computing and Multimedia</title>
				<meeting>the 12th International Conference on Advances in Mobile Computing and Multimedia<address><addrLine>Kaohsiung, Taiwan</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2014">December 8-10, 2014. 2014</date>
			<biblScope unit="page" from="342" to="345" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Empirical study of topic modeling in twitter</title>
		<author>
			<persName><forename type="first">L</forename><surname>Hong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">D</forename><surname>Davison</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the First Workshop on Social Media Analytics, SOMA &apos;10</title>
				<meeting>the First Workshop on Social Media Analytics, SOMA &apos;10<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="80" to="88" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Clef 2017 microblog cultural contextualization lab overview</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M P M J</forename></persName>
		</author>
		<author>
			<persName><forename type="first">-Y</forename><forename type="middle">N</forename><surname>Liana Ermakova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lorraine</forename><surname>Goeuriot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sanjuan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference of the Cross-Language Evaluation Forum for European Languages Proceedings</title>
		<title level="s">LNCS volume</title>
		<meeting><address><addrLine>CLEF; Dublin.</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2017">2017. 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Improving lda topic models for microblogs via tweet pooling and automatic labeling</title>
		<author>
			<persName><forename type="first">R</forename><surname>Mehrotra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Sanner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Buntine</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Xie</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR &apos;13</title>
				<meeting>the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR &apos;13<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="889" to="892" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Combinaison de thèmes latents pour la contextualisation de Tweets</title>
		<author>
			<persName><forename type="first">M</forename><surname>Morchid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Dufour</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Linarès</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">EGC&apos;2013 13e Conférence Francophone sur l&apos;Extraction et la Gestion des connaissances</title>
				<meeting><address><addrLine>Toulouse, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-01">Jan. 2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Implement topic relevance model for query expansion</title>
		<author>
			<persName><forename type="first">L</forename><surname>Ren</surname></persName>
		</author>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Twitterrank: Finding topic-sensitive influential twitterers</title>
		<author>
			<persName><forename type="first">J</forename><surname>Weng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E.-P</forename><surname>Lim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>He</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Third ACM International Conference on Web Search and Data Mining, WSDM &apos;10</title>
				<meeting>the Third ACM International Conference on Web Search and Data Mining, WSDM &apos;10<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="261" to="270" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">A comparative study of utilizing topic models for information retrieval</title>
		<author>
			<persName><forename type="first">X</forename><surname>Yi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Allan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 31th European Conference on IR Research on Advances in Information Retrieval, ECIR &apos;09</title>
				<meeting>the 31th European Conference on IR Research on Advances in Information Retrieval, ECIR &apos;09<address><addrLine>Berlin, Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer-Verlag</publisher>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="29" to="41" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
