<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">An Ontological Representation of Documents and Queries for Information Retrieval Systems</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Mauro</forename><surname>Dragoni</surname></persName>
							<email>mauro.dragoni@unimi.it</email>
							<affiliation key="aff0">
								<orgName type="department">Dipartimento di Tecnologie dell&apos;Informazione</orgName>
								<orgName type="institution">Università degli Studi di Milano</orgName>
								<address>
									<addrLine>Via Bramante 65</addrLine>
									<postCode>I-26013</postCode>
									<settlement>Crema (CR)</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Célia</forename><surname>Da Costa Pereira</surname></persName>
							<affiliation key="aff1">
								<orgName type="department">Dipartimento di Tecnologie dell&apos;Informazione</orgName>
								<orgName type="institution">Università degli Studi di Milano</orgName>
								<address>
									<addrLine>Via Bramante 65</addrLine>
									<postCode>I-26013</postCode>
									<settlement>Crema (CR)</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Andrea</forename><forename type="middle">G B</forename><surname>Tettamanzi</surname></persName>
							<email>andrea.tettamanzi@unimi.it</email>
							<affiliation key="aff2">
								<orgName type="department">Dipartimento di Tecnologie dell&apos;Informazione</orgName>
								<orgName type="institution">Università degli Studi di Milano</orgName>
								<address>
									<addrLine>Via Bramante 65</addrLine>
									<postCode>I-26013</postCode>
									<settlement>Crema (CR)</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">An Ontological Representation of Documents and Queries for Information Retrieval Systems</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">81E5EA33603E2F0804B37C2F7F19A755</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T04:41+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper presents a vector space model approach, for representing documents and queries, using concepts instead of terms and WordNet as a light ontology. This way, information overlap is reduced with respect to the classic semantic expansion techniques. Experiments undertaken on Much-More benchmark showed the effectiveness of the approach.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">INTRODUCTION</head><p>This paper presents an ontology-based approach for a conceptual representation of documents. The approach is inspired by an idea recently presented in <ref type="bibr" target="#b8">[9]</ref>, and uses an adapted version of that method to standardize the representation of documents and queries. The proposed approach is somewhat similar to the classic query expansion technique. However, additional considerations have been taken into account and some improvements have been applied, as explained below.</p><p>Query expansion is an approach used in Information Retrieval (IR) to improve the system's performance. It consists of expanding the content of the query by adding terms that are semantically correlated with the original terms of the query <ref type="bibr" target="#b11">[12]</ref>. Several works have demonstrated the enhanced performance of IR systems that implement query expansion approaches <ref type="bibr" target="#b18">[19]</ref> <ref type="bibr" target="#b2">[3]</ref> <ref type="bibr" target="#b4">[5]</ref>. However, query expansion has to be used carefully because, as demonstrated in <ref type="bibr" target="#b7">[8]</ref>, expansion might degrade the performance of some individual queries. This is because an incorrect choice of terms and concepts for the expansion task might harm the retrieval process by drifting it away from the optimal answer.</p><p>Document expansion applied to IR has been recently proposed in <ref type="bibr" target="#b1">[2]</ref>. In that work a sub-tree approach has been implemented to represent concepts in documents and queries. However, when using a tree structure there is a redundancy of information because more general concepts may be represented implicitly by using only the leaf concepts they subsume. 
The key idea behind representing documents by using concepts is that documents and queries may be represented in the same way. This reduces the risk of omitting some related terms, as may happen in the classical query expansion technique. However, it is necessary to use a language resource that covers a large number of terms in order to avoid information loss.</p><p>This paper presents a new representation for documents and queries. The proposed approach exploits the structure of the well-known machine-readable dictionary WordNet in order to reduce the redundancy of information generally contained in a concept-based document representation. A second improvement is the reduction of the computational time needed to compare documents and queries represented by using concepts. This representation has been applied to the ad-hoc retrieval problem. The approach has been evaluated on the MuchMore<ref type="foot" target="#foot_0">1</ref> Collection <ref type="bibr" target="#b3">[4]</ref> and the results demonstrate its viability.</p><p>Section 2 gives an overview of the environment in which ontologies have been used. Section 3 presents the tools used for this work. Section 4 illustrates the proposed approach to represent information, while Section 5 compares this approach with two other well-known approaches used in conceptual representation of documents. Section 6 discusses the results obtained from the test campaign. Finally, Section 7 concludes.</p><p>http://ims.dei.unipd.it/websites/iir10/index.html</p><p>Copyright owned by the authors.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">RELATED WORKS</head><p>An increasing number of recent information retrieval systems make use of ontologies to help users clarify their information needs and to come up with semantic representations of documents. Many ontology-based information retrieval systems and models have been proposed in the last decade. An interesting review of IR techniques based on ontologies is presented in <ref type="bibr" target="#b10">[11]</ref>, while in <ref type="bibr" target="#b15">[16]</ref> the author studies the application of ontologies to a large-scale IR system for web purposes. A model for the exploitation of ontology-based knowledge bases is presented in <ref type="bibr" target="#b6">[7]</ref>. The aim of this model is to improve search over large document repositories. The model includes an ontology-based scheme for the annotation of documents, and a retrieval model based on an adaptation of the classic vector-space model <ref type="bibr" target="#b14">[15]</ref>. Another information retrieval system based on ontologies is presented in <ref type="bibr" target="#b13">[14]</ref>. The authors propose an information retrieval system built on a landmark information database that encodes hierarchical structures and the semantic meanings of the features and characteristics of the landmarks.</p><p>The implementation of ontology models has also been investigated by using fuzzy models <ref type="bibr" target="#b5">[6]</ref>.</p><p>In IR, user queries are usually not detailed enough to return satisfactory results. Query expansion can help to solve this problem. However, common query expansion techniques do not yield stable retrieval results. Ontologies play a key role in query expansion research. 
A common use of ontologies in query expansion is to enrich the resources with some well-defined meaning to enhance the search capabilities of existing web searching systems.</p><p>In <ref type="bibr" target="#b17">[18]</ref> the authors propose and implement a query expansion method which combines a domain ontology with the frequency of terms. The ontology is used to describe domain knowledge; a logic reasoner and the frequency of terms are used to choose fitting expansion words. This way, higher recall and precision can be obtained for the user's queries.</p><p>In <ref type="bibr" target="#b9">[10]</ref> the authors present an approach to expand queries that consists of searching for the terms of the topic query in an ontology in order to add similar terms.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">PRELIMINARIES</head><p>The roadmap to prove the viability of a concept-based representation of documents and queries consists of two main tasks:</p><p>- to choose a method that represents all document terms by using the same set of concepts;</p><p>- to implement an approach that indexes and evaluates each concept, in both documents and queries, with the appropriate weight.</p><p>To represent documents, the method described in Section 4 has been used, combined with the WordNet machine-readable dictionary. From the WordNet database, the set of terms that have no hyponyms has been extracted; each such term is called a "base concept". A vector, called the "base vector", has been created and a base concept has been assigned to each component of the vector. This way, each term is represented by using the base vector of the WordNet ontology.</p><p>The representation described above has been implemented on top of the Apache Lucene open-source API.<note place="foot" n="2">See URL http://lucene.apache.org/.</note> In the pre-indexing phase, each document has been converted into its ontological representation. After the calculation of the importance of each concept in a document, only concepts with a degree of importance higher than a fixed cut-value have been maintained, while the others have been discarded. The cut-value used in these experiments is 0.01. This choice has a drawback, namely that discarding some minor concepts makes the representation of the information approximate. However, we have experimentally verified that this approximation does not affect the final results.</p><p>During the evaluation activity, queries have also been converted into the ontological representation. Consequently, a weight has to be assigned to each concept so that all concepts are evaluated in the right proportion. One of the features of Lucene is the possibility of assigning a payload to each term of the query. 
Therefore, for each element of the concept-based representation of the query, its concept weight has been used as the boost value.</p></div>
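<div xmlns="http://www.tei-c.org/ns/1.0"><p>The pruning and boosting steps described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the experiments: the function names and the concept identifiers are hypothetical, and rendering the weights with the caret notation of the Lucene classic query syntax is an assumption, since the paper does not state how the boost values were passed to Lucene.</p><p><code lang="python"><![CDATA[
```python
CUT_VALUE = 0.01  # cut-value reported for the experiments


def prune(weights, cut=CUT_VALUE):
    """Keep only the concepts whose importance is higher than the cut-value."""
    return {concept: w for concept, w in weights.items() if w > cut}


def to_boosted_query(weights):
    """Render concept weights as 'concept^boost' clauses (assumed query format)."""
    return " ".join(f"{concept}^{w:g}" for concept, w in sorted(weights.items()))


# Hypothetical concept weights for one document or query.
doc_weights = {"a": 0.5, "b": 0.01, "c": 0.005, "d": 0.25}
kept = prune(doc_weights)
query = to_boosted_query(kept)
print(kept)
print(query)
```
]]></code></p><p>Concepts whose weight equals the cut-value are dropped here because the text requires a degree of importance strictly higher than the threshold.</p></div>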
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">DOCUMENT REPRESENTATION</head><p>Conventional IR approaches represent documents as vectors of term weights. Such representations use a vector with one component for every significant term that occurs in the document. This has several limitations, for example:</p><p>1. different vector positions may be allocated to the synonyms of the same term; this causes an information loss because the importance of a given concept is distributed among different vector components;</p><p>2. the size of a document vector has to be at least equal to the total number of words of the language used to write the document;</p><p>3. every time a new set of terms is introduced (which is a high-probability event), all document vectors must be reconstructed; the size of a repository thus grows not only as a function of the number of documents that it contains, but also of the size of the representation vectors.</p><p>To overcome these weaknesses of term-based representations, an ontology-based representation, recently proposed in <ref type="bibr" target="#b8">[9]</ref>, has been used. It exploits the hierarchical is-a relation among concepts, i.e., the meanings of words. For example, to describe documents containing the three words "animal", "dog", and "cat" with a term-based representation, a vector of three elements is needed; with an ontology-based representation, since "animal" subsumes both "dog" and "cat", it is possible to use a vector with only two elements, related to the "dog" and "cat" concepts, that can also implicitly contain the information given by the presence of the "animal" concept. 
Moreover, by defining an ontology base, which is a set of independent concepts that covers the whole ontology, an ontology-based representation allows the system to use fixed-size document vectors, consisting of one component per base concept.</p><p>Calculating term importance is a fundamental aspect of representing documents in conventional information retrieval approaches. It is usually determined through term frequency-inverse document frequency (TF-IDF). When using an ontology-based representation, the usual definition of term frequency cannot be applied because one does not operate on keywords, but on concepts. This is why the concept-based document representation proposed in <ref type="bibr" target="#b8">[9]</ref>, which is a concept-based adaptation of TF-IDF, has been adopted.</p><p>In this paper, an adaptation of the approach proposed in <ref type="bibr" target="#b8">[9]</ref> is presented. The original approach was proposed for domain-specific ontologies and does not always consider all the possible concepts in the considered ontology, in the sense that it assumes a cut at a given specificity level. Instead, the proposed approach has been adapted for more general-purpose ontologies and it takes into account all independent concepts contained in the considered ontology. This way, the information associated with each concept is more precise and the problem of choosing a suitable level at which to apply the cut is overcome. The quantity of information given by the presence of concept z in a document depends on the depth of z in the ontology graph, on how many times it appears in the document, and on how many times it occurs in the whole document repository. These two frequencies also depend on the number of concepts which subsume or are subsumed by z. Let us consider a concept x which is a descendant of another concept y which has q children including x. Concept y is a descendant of a concept z which has k children including y. 
Concept x is a leaf of the graph representing the used ontology. For instance, considering a document containing only "xy", the occurrence of x in the document is 1 + (1/q). In the document "xyz", the occurrence of x is 1 + (1/q)(1 + 1/k). As can be seen, the contribution of each ancestor to the number of occurrences of a leaf is inversely proportional to the product of the numbers of children of the concepts on the path up to that ancestor. Explicit and implicit concepts are taken into account by using the following formula:</p><formula xml:id="formula_0" notation="TeX">N(c) = \mathrm{occ}(c) + \sum_{c \in \mathrm{Path}(c, \ldots, \top)} \; \sum_{i=2}^{\mathrm{depth}(c)} \frac{\mathrm{occ}(c_i)}{\prod_{j=2}^{i} \|\mathrm{children}(c_j)\|}, <label>(1)</label></formula><p>where N(c) is the number of occurrences, both explicit and implicit, of concept c, occ(c) is the number of lexicalizations of c occurring in the document, and c_i is the i-th concept on the path from c (with c_1 = c) up to the root of the ontology. The value N(c) is the weight associated with the concept c.</p><p>Given the ontology base I = {b_1, ..., b_n}, where the b_i are the base concepts, the quantity of information, info(b_i), pertaining to base concept b_i in a document is:</p><formula xml:id="formula_1" notation="TeX">\mathrm{info}(b_i) = \frac{N_{doc}(b_i)}{N_{rep}(b_i)}, <label>(2)</label></formula><p>where N_doc(b_i) is the number of explicit and implicit occurrences of b_i in the document, and N_rep(b_i) is the total number of its explicit and implicit occurrences in the whole document repository. 
This way, every component of the representation vector gives a value of the importance relation between a document and the relevant base concept.</p><p>A concrete example can be explained starting from the light ontology represented in Figures <ref type="figure" target="#fig_1">1 and 2</ref>, and by considering a document D1 containing concepts "xxyyyz".</p><p>In this case the ontology base is:</p><formula xml:id="formula_2">I = {a, b, c, d, x}</formula><p>and, for each concept in the ontology, the vectors N_doc are: </p></div>
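<div xmlns="http://www.tei-c.org/ns/1.0"><p>The occurrence counting of Equation (1) and the ratio of Equation (2) can be checked numerically with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the toy hierarchy (x under y under z, with q = 2 and k = 4) is chosen only for illustration and is not the ontology of Figures 1 and 2, and a single path from the leaf to the root is assumed.</p><p><code lang="python"><![CDATA[
```python
from collections import Counter


def concept_occurrences(path, children, occ):
    """Equation (1) along a single path (assumption: one path to the root).

    path     -- concepts from c (path[0], i.e. c_1) up to the root
    children -- number of children of each ancestor on the path
    occ      -- explicit occurrence counts in the document
    """
    total = float(occ.get(path[0], 0))
    denominator = 1
    for ancestor in path[1:]:  # c_2, ..., c_depth(c)
        denominator *= children[ancestor]
        total += occ.get(ancestor, 0) / denominator
    return total


def info(n_doc, n_rep):
    """Equation (2): document count divided by repository count, per base concept."""
    return {b: (n / n_rep[b] if n_rep.get(b) else 0.0) for b, n in n_doc.items()}


# Toy hierarchy (hypothetical): x is one of q = 2 children of y,
# and y is one of k = 4 children of z.
children = {"y": 2, "z": 4}
path_x = ["x", "y", "z"]

n_xy = concept_occurrences(path_x, children, Counter("xy"))    # 1 + 1/q
n_xyz = concept_occurrences(path_x, children, Counter("xyz"))  # 1 + (1/q)(1 + 1/k)
print(n_xy, n_xyz)
```
]]></code></p><p>With q = 2 and k = 4 the two calls reproduce the values 1 + 1/q and 1 + (1/q)(1 + 1/k) from the text. The info function returns 0 for a base concept that never occurs in the repository, which is a design choice of this sketch to avoid a division by zero.</p></div>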
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">REPRESENTATION COMPARISON</head><p>Section 4 described the approach used to represent information. This section shows the improvements obtained by applying the proposed approach and compares it with two other approaches commonly used in conceptual document representation. The expansion technique is generally used to enrich the information content of queries. However, in recent years some authors have also applied the expansion technique to represent documents <ref type="bibr" target="#b1">[2]</ref>. As in <ref type="bibr" target="#b12">[13]</ref> <ref type="bibr" target="#b1">[2]</ref>, we propose an approach that uses WordNet to extract concepts from terms.</p><p>The two main improvements obtained by the application of the ontology-based approach are illustrated below.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Information Redundancy.</head><p>Approaches that expand documents and queries use correlated concepts to expand the original terms of documents and queries. A problem with expansion is that the information is redundant and there is no real improvement in the representation of the document (or query) content. With the proposed representation this redundancy is eliminated because only independent concepts are taken into account to represent documents and queries. Another positive aspect is that the vector representing document content by using concepts is generally smaller than the vector representing document content by using terms.</p><p>An example of a technique that shows this drawback is presented in <ref type="bibr" target="#b12">[13]</ref>. In that work the authors propose an indexing technique that takes into account WordNet synsets instead of terms. For each term in the documents, the synsets associated with that term are extracted and then used as tokens for the indexing task. This way, the computational time needed to perform a query is not increased; however, there is a significant overlap of information because different synsets might be semantically correlated. An example is given by the terms "animal" and "pet": these terms have two different synsets, but in the WordNet lattice the term "pet" is linked to the term "animal" by an "is-a" relation. Therefore, when a document contains both terms, the same conceptual information is repeated: even though "animal" and "pet" are not represented by the same synset, they are semantically correlated because "pet" is a sub-concept of "animal". In such a document, the presence of the term "animal" should contribute to the importance of the concept "pet" instead of being represented by a different token.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Computational Time.</head><p>When IR approaches are applied in a real-world environment, the computational time needed to evaluate the match between documents and the submitted query has to be considered. It is known that systems using the vector space model have higher efficiency. Concept-based approaches, such as the one presented in <ref type="bibr" target="#b1">[2]</ref>, generally implement a non-vectorial data structure which needs a higher computational time than a vector space model representation. The approach proposed in this paper overcomes this issue because the document content is represented by a vector; therefore, the computational time needed to compute the document score is comparable to that of the vector space model.</p></div>
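<div xmlns="http://www.tei-c.org/ns/1.0"><p>To illustrate why a vector representation keeps the matching cost low, the following Python sketch scores a document against a query with a sparse cosine similarity over base-concept weights. The choice of cosine similarity and all names are assumptions made for illustration; the paper relies on the scoring provided by Lucene and does not specify this function.</p><p><code lang="python"><![CDATA[
```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse concept-weight vectors (dicts)."""
    dot = sum(w * v[c] for c, w in u.items() if c in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical base-concept weights for a document and a query.
document = {"a": 3.0, "b": 4.0}
query = {"a": 3.0, "b": 4.0, "x": 0.0}
score = cosine(document, query)
print(score)
```
]]></code></p><p>Only the concepts shared by the two vectors contribute to the dot product, so the cost of a comparison grows with the number of non-zero components rather than with the size of the ontology.</p></div>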
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">EXPERIMENTS</head><p>In this section, the impact of the ontology-based document and query representation is evaluated. The evaluation method follows the TREC protocol <ref type="bibr" target="#b16">[17]</ref>. For each query the first 1000 retrieved documents have been considered and the precision of the system has been calculated at different points: 5, 10, 15, and 30 documents retrieved. Moreover, the precision/recall graph has been calculated.</p><p>The experimental campaign has been performed by using the MuchMore collection, which consists of 7823 abstracts of medical papers and 25 queries with their relevance judgments. One of the particular features of this collection is that it contains many medical terms. As a result, a term-based representation has an advantage over a semantic representation, because specific terms present in documents (for example "Arthroscopic") are highly discriminant. Indeed, semantic expansion may cause problems because, generally, the machine-readable dictionary (MRD) and thesauri used to expand terms do not contain some domain-specific terms.</p><p>The precision/recall graph shown in Figure <ref type="figure" target="#fig_2">3</ref> illustrates the comparison between the proposed approach (gray curve with circle marks), the classical term-based representation (black curve), and the synset representation method <ref type="bibr" target="#b12">[13]</ref> (light gray curve with square marks). As expected, for all recall values, the proposed approach obtained better results than the term-based representation. The best gain of the concept-based representation is at recall levels 0.0, 0.2, and 0.4. 
For recall values between 0.6 and 1.0, the concept-based precision curve lies close to the other two curves.</p><p>A possible explanation for this scenario is that, for documents that are well related to a particular topic, the adopted ontology representation is able to improve the representation of the document contents. However, for documents that are partially related to a topic or that contain many ambiguous terms, the proposed approach is not able to maintain a high precision of the results. Some issues that may be responsible for this behaviour are discussed at the end of this section.</p><p>In Table <ref type="table" target="#tab_0">1</ref> the three different representations are compared for the Precision@X and MAP values. The results show that the proposed approach obtains better results at all precision levels and also for the MAP value. An in-depth study of this first experimental campaign has been performed, and we have noticed that for some queries the concept-based representation obtained results that are below our expectations. By inspecting the implemented model, some issues have been noticed and are currently under analysis:</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>- Absence of some terms in the ontology: some terms, in particular terms related to specific domains (biomedical, mechanical, business, etc.), are not defined in the machine-readable dictionary used to define the concept-based version of the documents. This way there is, in some cases, a loss of information that affects the final retrieval result.</p><p>- Proper names have not been considered: proper names of persons, geographical locations, industries, etc., are not present in the concept-based index. Observing the content of some documents and topics, proper names turn out to be a discriminant feature in some cases.</p><p>- Verbs and adjectives are not covered either: the concept representation of terms, described in Section 4, does not take into account verbs and adjectives.</p><p>This happens because verbs and adjectives are structured in a different way than nouns. The hyperonymy and hyponymy relations (that make an MRD comparable with ontologies) are not defined for verbs and adjectives; therefore, another approach will be studied and implemented to overcome this drawback.</p><p>- Term ambiguity: the concept-based representation introduces an error because no word sense disambiguation algorithm is used. With such a method, concepts associated with incorrect senses would be discarded or weighted less. The concept-based representation of each word would therefore be finer, and the information contained in a document would be represented with more precision.</p><p>Improving the current model with the above features would certainly yield significantly better results in the next experimental campaign. This positive view is motivated by the fact that, in spite of these issues, the preliminary goal of outperforming the precision of the term-based representation has been accomplished.</p></div>
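<div xmlns="http://www.tei-c.org/ns/1.0"><p>For readers who want to reproduce measures of the kind reported in Table 1, the Precision@X and MAP values can be computed as in the following Python sketch. It uses the standard definitions of precision at a cut-off and of average precision; the function names, document identifiers, and relevance judgments are hypothetical and are not taken from the MuchMore collection.</p><p><code lang="python"><![CDATA[
```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the first k retrieved documents that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def average_precision(ranked, relevant):
    """Mean of the precision values at the ranks of the relevant documents."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """MAP over (ranked list, relevant set) pairs, one pair per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical ranked result list and relevance judgments for one query.
ranked = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3"}
p2 = precision_at_k(ranked, relevant, 2)
ap = average_precision(ranked, relevant)
print(p2, ap)
```
]]></code></p><p>In the experiments the cut-offs are 5, 10, 15, and 30 documents, and MAP is averaged over the 25 queries of the collection.</p></div>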
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">CONCLUSION</head><p>In this paper we have discussed an approach to index documents and to represent queries for information retrieval purposes which exploits a conceptual representation based on ontologies.</p><p>Experiments have been performed on the MuchMore Collection to validate the approach with respect to problems like term synonymy in documents.</p><p>Preliminary experimental results show that the proposed representation improves the ranking of the documents. Analysis of the results suggests that further improvement could be obtained by integrating WSD techniques like the one discussed in <ref type="bibr" target="#b0">[1]</ref> to avoid the error introduced by considering incorrect word senses, and by a better usage and interpretation of WordNet to overcome the loss of information caused by the absence of proper nouns, verbs, and adjectives.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Ontology representation for concept 'z'.</figDesc><graphic coords="3,101.62,54.03,143.40,120.70" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Ontology representation for concept 'y'.</figDesc><graphic coords="3,364.63,54.03,143.40,120.42" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Precision/recall results.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Comparisons table between semantic expansion approaches.</figDesc><table><row><cell></cell><cell cols="4">Precisions</cell><cell></cell></row><row><cell>Systems</cell><cell>P5</cell><cell>P10</cell><cell>P15</cell><cell>P30</cell><cell>MAP</cell></row><row><cell>Term-Based</cell><cell>0.544</cell><cell>0.480</cell><cell>0.405</cell><cell>0.273</cell><cell>0.449</cell></row><row><cell>Synset-Indexing [13]</cell><cell>0.648</cell><cell>0.484</cell><cell>0.403</cell><cell>0.309</cell><cell>0.459</cell></row><row><cell>Concept-Based</cell><cell>0.744</cell><cell>0.544</cell><cell>0.478</cell><cell>0.394</cell><cell>0.507</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://muchmore.dfki.de</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Evolving neural networks for word sense disambiguation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Azzini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dragoni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Da Costa Pereira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tettamanzi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of HIS &apos;08</title>
				<meeting>of HIS &apos;08<address><addrLine>Barcelona, Spain</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2008-09">September 10-12, 2008</date>
			<biblScope unit="page" from="332" to="337" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">An information retrieval driven by ontology: from query to document expansion</title>
		<author>
			<persName><forename type="first">M</forename><surname>Baziz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Boughanem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Pasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Prade</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">RIAO. CID</title>
				<editor>
			<persName><forename type="first">D</forename><surname>Evans</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Furui</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Soulé-Dupuy</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Techniques for efficient query expansion</title>
		<author>
			<persName><forename type="first">B</forename><surname>Billerbeck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zobel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SPIRE</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">A</forename><surname>Apostolico</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Melucci</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2004">2004</date>
			<biblScope unit="volume">3246</biblScope>
			<biblScope unit="page" from="30" to="42" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Mercure at trec7</title>
		<author>
			<persName><forename type="first">M</forename><surname>Boughanem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Dkaki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mothe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Soulé-Dupuy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">TREC</title>
				<imprint>
			<date type="published" when="1998">1998</date>
			<biblScope unit="page" from="355" to="360" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Automatic query expansion based on divergence</title>
		<author>
			<persName><forename type="first">D</forename><surname>Cai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Van Rijsbergen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Jose</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CIKM</title>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2001">2001</date>
			<biblScope unit="page" from="419" to="426" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">A fuzzy ontology-approach to improve semantic information retrieval</title>
		<author>
			<persName><forename type="first">S</forename><surname>Calegari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sanchez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">URSW</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<editor>
			<persName><forename type="first">F</forename><surname>Bobillo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Da Costa</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>d&apos;Amato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Fanizzi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Fung</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Lukasiewicz</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Martin</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Nickles</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Peng</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Pool</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Smrz</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Vojtás</surname></persName>
		</editor>
		<imprint>
			<publisher>CEUR-WS.org</publisher>
			<date type="published" when="2007">2007</date>
			<biblScope unit="volume">327</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">An adaptation of the vector-space model for ontology-based information retrieval</title>
		<author>
			<persName><forename type="first">P</forename><surname>Castells</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Fernández</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Vallet</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Knowl. Data Eng</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="261" to="272" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">A framework for selective query expansion</title>
		<author>
			<persName><forename type="first">S</forename><surname>Cronen-Townsend</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Croft</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CIKM</title>
		<editor>
			<persName><forename type="first">D</forename><surname>Grossman</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Gravano</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Zhai</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">O</forename><surname>Herzog</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Evans</surname></persName>
		</editor>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2004">2004</date>
			<biblScope unit="page" from="236" to="237" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">An ontology-based method for user model acquisition</title>
		<author>
			<persName><forename type="first">C</forename><surname>Da Costa Pereira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">G B</forename><surname>Tettamanzi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Soft computing in ontologies and semantic Web</title>
		<title level="s">Studies in fuzziness and soft computing</title>
		<editor>
			<persName><forename type="first">Zongmin</forename><surname>Ma</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer</publisher>
			<pubPlace>Berlin</pubPlace>
			<date type="published" when="2006">2006</date>
			<biblScope unit="page" from="211" to="227" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Integrating MeSH ontology to improve medical information retrieval</title>
		<author>
			<persName><forename type="first">M</forename><surname>Díaz-Galiano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">G</forename><surname>Cumbreras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Martín-Valdivia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Ráez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Ureña-López</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2007">2007</date>
			<biblScope unit="volume">5152</biblScope>
			<biblScope unit="page" from="601" to="606" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Ontology-based information retrieval: Overview and new proposition</title>
		<author>
			<persName><forename type="first">O</forename><surname>Dridi</surname></persName>
		</author>
	</analytic>
	<monogr>
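		<title level="m">RCIS</title>
		<editor>
			<persName><forename type="first">O</forename><surname>Pastor</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Flory</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J.-L</forename><surname>Cavarero</surname></persName>
		</editor>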
		<imprint>
			<biblScope unit="page" from="421" to="426" />
			<date type="published" when="2008">2008</date>
			<publisher>IEEE</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Query expansion</title>
		<author>
			<persName><forename type="first">E</forename><surname>Efthimiadis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Annual review of information science and technology</title>
		<editor>
			<persName><forename type="first">M</forename><surname>Williams</surname></persName>
		</editor>
		<imprint>
			<publisher>Information Today Inc</publisher>
			<pubPlace>Medford, NJ</pubPlace>
			<date type="published" when="1996">1996</date>
			<biblScope unit="volume">31</biblScope>
			<biblScope unit="page" from="121" to="187" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Indexing with WordNet synsets can improve text retrieval</title>
		<author>
			<persName><forename type="first">J</forename><surname>Gonzalo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Verdejo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Chugur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Cigarrán</surname></persName>
		</author>
		<idno>CoRR, cmp-lg/9808002</idno>
		<imprint>
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Ichigen-san: An ontology-based information retrieval system</title>
		<author>
			<persName><forename type="first">T</forename><surname>Hattori</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Hiramatsu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Okadome</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Parsia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sirin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">APWeb</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">X</forename><surname>Zhou</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Shen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Kitsuregawa</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2006">2006</date>
			<biblScope unit="volume">3841</biblScope>
			<biblScope unit="page" from="1197" to="1200" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">A vector space model for automatic indexing</title>
		<author>
			<persName><forename type="first">G</forename><surname>Salton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Wong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Yang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Commun. ACM</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="issue">11</biblScope>
			<biblScope unit="page" from="613" to="620" />
			<date type="published" when="1975">1975</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Research on ontology-driven information retrieval</title>
		<author>
			<persName><forename type="first">S</forename><surname>Tomassen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">OTM Workshops (2)</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">R</forename><surname>Meersman</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Z</forename><surname>Tari</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Herrero</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2006">2006</date>
			<biblScope unit="volume">4278</biblScope>
			<biblScope unit="page" from="1460" to="1468" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Overview of the sixth text retrieval conference (TREC-6)</title>
		<author>
			<persName><forename type="first">E</forename><surname>Voorhees</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Harman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">TREC</title>
		<imprint>
			<date type="published" when="1997">1997</date>
			<biblScope unit="page" from="1" to="24" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Design and implementation of ontology-based query expansion for information retrieval</title>
		<author>
			<persName><forename type="first">F</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Fu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CONFENIS (1)</title>
		<editor>
			<persName><forename type="first">L</forename><surname>Xu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Tjoa</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Chaudhry</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2007">2007</date>
			<biblScope unit="volume">254</biblScope>
			<biblScope unit="page" from="293" to="298" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Query expansion using local and global document analysis</title>
		<author>
			<persName><forename type="first">J</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Croft</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SIGIR</title>
		<editor>
			<persName><forename type="first">H.-P</forename><surname>Frei</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Harman</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Schäuble</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Wilkinson</surname></persName>
		</editor>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="1996">1996</date>
			<biblScope unit="page" from="4" to="11" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
