<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Semantic Answer Validation in Question Answering Systems for Reading Comprehension Tests</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Helena</forename><surname>Gómez-Adorno</surname></persName>
							<email>helena.gomez@cs.buap.mx</email>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Computer Science</orgName>
							<orgName type="institution" key="instit1">Benemérita Universidad Autónoma de Puebla</orgName>
								<address>
									<addrLine>Av. San Claudio y 14 Sur</addrLine>
									<postBox>C.P. 72570</postBox>
									<settlement>Puebla</settlement>
									<country key="MX">Mexico</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">David</forename><surname>Pinto</surname></persName>
							<email>dpinto@cs.buap.mx</email>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Computer Science</orgName>
							<orgName type="institution" key="instit1">Benemérita Universidad Autónoma de Puebla</orgName>
								<address>
									<addrLine>Av. San Claudio y 14 Sur</addrLine>
									<postBox>C.P. 72570</postBox>
									<settlement>Puebla</settlement>
									<country key="MX">Mexico</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Darnes</forename><surname>Vilariño</surname></persName>
							<email>darnes@cs.buap.mx</email>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Computer Science</orgName>
							<orgName type="institution" key="instit1">Benemérita Universidad Autónoma de Puebla</orgName>
								<address>
									<addrLine>Av. San Claudio y 14 Sur</addrLine>
									<postBox>C.P. 72570</postBox>
									<settlement>Puebla</settlement>
									<country key="MX">Mexico</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Semantic Answer Validation in Question Answering Systems for Reading Comprehension Tests</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">9EA34E2A69ED289201BED5422141F83A</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T03:41+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Question answering system</term>
					<term>reading comprehension</term>
					<term>information retrieval</term>
					<term>semantic similarity</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper presents a methodology for tackling the problem of answer validation in question answering for reading comprehension tests. The implemented system accepts a document as input and answers multiple-choice questions about it based on semantic similarity measures. It uses the Lucene information retrieval engine to carry out information extraction, together with additional automated linguistic processing such as stemming, anaphora resolution, and part-of-speech tagging. The proposed approach validates the answers by comparing the text retrieved by Lucene for each question against its candidate answers, using a validation step based on semantic similarity. We evaluated the proposed methodology on a corpus widely used in international evaluation forums. The results show that the proposed system selects the correct answer to a given question 12% more often than a validation based only on lexical similarity.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Reading comprehension is the ability of a human reader to understand the main ideas written in a text. To evaluate the quality of reading comprehension, there exist tests that require readers to read a story or article and answer a list of questions about it. For the automatic evaluation of reading comprehension tests, it is necessary to take advantage of the techniques developed in the framework of question answering (QA).</p><p>In this paper we present experiments exploring answer validation in question answering architectures that can be applied to reading comprehension tests as an evaluation method for language understanding (machine reading) systems. Such tests take the form of standardized multiple-choice diagnostic reading skill tests.</p><p>The main idea behind QA systems for reading comprehension tests is to answer questions based on a single document. This differs from traditional QA systems, which search a very large corpus for the requested information, and in some cases it implies a very different system architecture.</p><p>The rest of the paper is organized as follows. Section 2 presents the related work. Section 3 describes the system architecture. Section 4 presents the evaluation results on a collection of documents from the QA4MRE task at CLEF 2011. Finally, Section 5 presents the conclusions and outlines some future work directions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Work</head><p>The field of QA for reading comprehension tests was inactive for a long time, due to the lack of agreement on how system evaluation should be carried out <ref type="bibr" target="#b0">[1]</ref>. In 2011, and again in 2012, the CLEF conference<ref type="foot" target="#foot_0">1</ref> proposed a QA task for the evaluation of Machine Reading (MR) systems, called QA4MRE. The task consists of reading a document and identifying the answers to a set of questions about information that is expressed or implied in the text. The questions are multiple choice; each question has five options, and only one option is the correct answer. Detecting the correct answer is specifically designed to require various types of inference and the consideration of prior knowledge acquired from a collection of reference documents <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3]</ref>.</p><p>The QA4MRE task encourages interest in this research line because it provides a single evaluation platform for experimenting with new techniques and methodologies aimed at solving this problem. In this sense, we can take the systems presented at this conference as the state of the art for this research field.</p><p>However, other research works <ref type="bibr" target="#b3">[4]</ref><ref type="bibr" target="#b4">[5]</ref><ref type="bibr" target="#b5">[6]</ref> have also dealt with the problem of QA for reading comprehension tests in the past, unfortunately with low accuracy.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">System Architecture</head><p>The proposed architecture is made up of three main modules: Document processing, Information Extraction and Answer validation. Each of these modules is described in the following subsections.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Document Processing</head><p>First, we analyze the questions associated with each document, applying a Part-Of-Speech (POS) tagger in order to identify the "question keyword" (what, where, when, who, etc.); the result is passed to the hypothesis generation module (explained in more detail in Section 3.2).</p><p>Afterwards, we perform anaphora resolution on the documents associated with the questions using the JavaRAP<ref type="foot" target="#foot_1">2</ref> system. It has been observed that applying anaphora resolution in QA systems improves the results obtained, in terms of precision <ref type="bibr" target="#b6">[7]</ref>. Given that JavaRAP does not resolve first-person pronoun anaphora, we added the following process for these cases:</p><p>1. Identify the author of the document, which is usually the first name appearing in the document; for this purpose, the Stanford POS tagger<ref type="foot" target="#foot_2">3</ref> was used. 2. Each personal pronoun in the first-person set PRP = {"I", "me", "my", "myself"} generally refers to the author. 3. Replace each term of the document that is in the PRP set with the author's name identified in step 1.</p></div>
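The first-person pronoun resolution steps above can be sketched as follows; the whitespace tokenization and punctuation handling are simplifying assumptions for illustration, not the actual implementation:

```python
# Sketch of the first-person pronoun resolution step described above.
# Assumes the author's name has already been identified (step 1).
# Tokenization is deliberately naive (whitespace split); the real system
# relies on the Stanford POS tagger output.

PRP = {"I", "me", "my", "myself"}

def resolve_first_person(text: str, author: str) -> str:
    """Replace every first-person pronoun with the document author's name."""
    resolved = []
    for token in text.split():
        # Strip trailing punctuation so "me," still matches "me".
        core = token.rstrip(".,;:!?")
        if core in PRP:
            token = token.replace(core, author, 1)
        resolved.append(token)
    return " ".join(resolved)
```

For example, `resolve_first_person("He thanked me.", "Ana")` yields `"He thanked Ana."`; a production version would also need to handle case variants and contractions.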
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Information Extraction</head><p>Next, we extract the meaningful information by means of two submodules: Hypothesis Generation and Information Retrieval.</p><p>The first submodule (Hypothesis Generation) receives as input the set of questions with their multiple-choice answers, processed in the previous module. We construct what we call a hypothesis: the concatenation of the question with each of its possible answers. This hypothesis becomes the input to the Information Retrieval (IR) module, i.e., the query. To generate the hypotheses, the "question keyword" is first identified and then replaced by each of the five possible answers, thereby obtaining five hypotheses per question. For example, given the question "Where was Elizabeth Pisani's friend incarcerated?" and the possible answer "in the Philippines", the obtained hypothesis is "in the Philippines was Elizabeth Pisani's friend incarcerated".</p><p>The benefit of using these hypotheses as queries for the IR module is that it searches for passages containing words from both the question and the multiple-choice answer, instead of passages containing words from the question and the answer independently.</p><p>The second submodule (Information Retrieval, IR) was built using the Lucene<ref type="foot" target="#foot_3">4</ref> IR library. It is responsible for indexing the document collection and for the subsequent passage retrieval, given a hypothesis as a query.</p><p>The IR module returns a relevant passage for each hypothesis, which is used as a support text to decide whether or not the hypothesis can be the right answer. For each hypothesis, only the first (highest-ranked) passage returned is taken, which is considered the most relevant one. This process generates a pair "Hypothesis + Passage (H-P)", along with a lexical similarity score calculated by Lucene.</p></div>
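The hypothesis-generation step can be sketched as follows, using the paper's own example; the keyword list and the fallback for questions without a recognized keyword are illustrative assumptions:

```python
# Sketch of hypothesis generation: the question keyword is replaced by
# each candidate answer, yielding one hypothesis (IR query) per answer.
# The keyword list below is an assumption based on the examples in the text.

QUESTION_KEYWORDS = ("what", "where", "when", "who", "why", "how", "which")

def generate_hypotheses(question, answers):
    words = question.rstrip("?").split()
    for i, w in enumerate(words):
        if w.lower() in QUESTION_KEYWORDS:
            # Substitute the keyword with each candidate answer.
            return [" ".join(words[:i] + [a] + words[i + 1:]) for a in answers]
    # No keyword found: prepend each answer instead (fallback assumption).
    return [a + " " + " ".join(words) for a in answers]

hyps = generate_hypotheses(
    "Where was Elizabeth Pisani's friend incarcerated?",
    ["in the Philippines", "in Indonesia"],
)
# hyps[0] is the hypothesis used as the Lucene query for the first answer.
```

Each hypothesis in `hyps` is then submitted to the IR submodule as a query, so that retrieved passages must match words from the question and the candidate answer jointly.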
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Answer validation</head><p>Finally, the answer validation module assigns a score based on semantic similarity to each pair H-P generated in the Information Retrieval module. This measure is included because the lexical similarity score given by Lucene cannot capture the similarity between the hypothesis and the support text when they do not share the same words. To overcome this problem, two things can be done: 1) include a query expansion module that adds synonyms, hypernyms, etc., in order to obtain a higher lexical similarity; or 2) add a semantic similarity algorithm that can determine the degree of similarity between two sentences even when they do not share exactly the same words. For example, for the hypothesis "she esteems him is Annie Lennox's opinion about Nelson Mandela", the retrieved passage is "Everyone in the world respects Nelson Mandela, everyone reveres Nelson Mandela"; the score assigned by Lucene is too low, so the system does not select that answer as the correct one. Adding a semantic similarity score raises the score of this pair of phrases and selects the correct answer, because it can find the relation between the words "esteems", "reveres" and "respects".</p><p>In order to determine whether or not the passage P is similar to a hypothesis H, we implemented an approach based on <ref type="bibr" target="#b7">[8]</ref>.</p><p>The similarity measure used in that approach <ref type="bibr" target="#b8">[9]</ref> weights each word of a sentence according to its degree of specificity; for example, the words catastrophe and disaster receive more weight than the words could and should. Word-to-word similarities between the two sentences are integrated into this measure. 
The two similarity measures proposed are corpus-based (PMI-IR) and knowledge-based (WordNet <ref type="bibr" target="#b9">[10]</ref>) measures.</p><p>The similarity between two sentences S1 and S2 is given by Equation 1:</p><formula xml:id="formula_0">sim(S1, S2) = (1/2) * ( Σ_{w∈S1} maxSim(w, S2)·idf(w) / Σ_{w∈S1} idf(w) + Σ_{w∈S2} maxSim(w, S1)·idf(w) / Σ_{w∈S2} idf(w) )<label>(1)</label></formula><p>To compute maxSim we used two semantic similarity measures between words, described as follows:</p><p>-Mutual information (PMI-IR) measure. It comes from the pointwise mutual information formula suggested by <ref type="bibr" target="#b10">[11]</ref> as an unsupervised measure of the semantic similarity of words. It is based on statistical data collected by an information retrieval engine over a very large corpus (i.e., the web). Given two words w1 and w2, their PMI-IR is measured by:</p><formula xml:id="formula_1">PMI-IR(w1, w2) = log2( p(w1 &amp; w2) / (p(w1) * p(w2)) )<label>(2)</label></formula><p>-WordNet measure. It is based on the shortest path connecting two concepts in the taxonomy (hypernyms, hyponyms) extracted from WordNet, a lexical database that groups words into sets of synonyms called "synsets". The score lies in the interval 0 to 1, where 1 indicates that the two concepts are identical.</p></div>
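Equation 1 can be sketched in code as follows. The `word_sim` function here uses exact match plus a tiny hand-made relatedness table as a stand-in for the PMI-IR and WordNet path measures, and missing idf values default to 1; both are illustrative assumptions, not the actual measures:

```python
# Toy instantiation of Equation 1 (Mihalcea et al. sentence similarity).
# word_sim is a placeholder for the PMI-IR or WordNet path measure; the
# RELATED table below is hand-made for illustration only.

RELATED = {("esteems", "respects"): 0.8, ("esteems", "reveres"): 0.8}

def word_sim(w1, w2):
    if w1 == w2:
        return 1.0
    return RELATED.get((w1, w2), RELATED.get((w2, w1), 0.0))

def sentence_sim(s1, s2, idf):
    """sim(S1, S2) per Equation 1; unseen words get idf 1.0."""
    def directed(a, b):
        # Each word in a is matched to its most similar word in b,
        # weighted by idf (specific words contribute more).
        num = sum(max(word_sim(w, v) for v in b) * idf.get(w, 1.0) for w in a)
        den = sum(idf.get(w, 1.0) for w in a)
        return num / den
    t1, t2 = s1.lower().split(), s2.lower().split()
    return 0.5 * (directed(t1, t2) + directed(t2, t1))
```

With a real idf table, a specific word such as "catastrophe" would carry more weight than "could", exactly as described above; here `sentence_sim("she esteems him", "everyone respects him", {})` scores well above zero even though the sentences share only one word.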
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Experimental results</head><p>This section describes the data set used for evaluating the proposed methodology, and reports and discusses the results of the experiments carried out.</p><p>In order to determine the performance of the proposed system, we used the corpus provided by the QA4MRE task at CLEF 2011. The features of the test data set are detailed in Table <ref type="table" target="#tab_0">1</ref>. Table <ref type="table" target="#tab_1">2</ref> presents the results obtained in terms of the number of correctly answered questions. It shows that the semantic similarity measures find some answers that the lexical similarity measure alone cannot. The PMI measure finds 15 correct answers on its own and the Path measure 8; the lexical similarity finds 18 on its own, and 21 correct answers are found by both the lexical and the semantic similarity measures. In total, the number of correct answers given by the combined similarity measures is 54 (45%). This precision exceeds the 32% achieved by the approach that uses only the lexical similarity measure. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusion and Future Work</head><p>In this paper we have presented a methodology for tackling the problem of question answering for reading comprehension tests, with emphasis on the validation step. We presented two semantic similarity measures, one based on PMI and the other based on WordNet, specifically the shortest-path measure.</p><p>We compared the performance of the presented system using the lexical and the semantic similarity measures, and observed that the semantic similarity measures are able to discover answers that the lexical similarity measure could not.</p><p>As future work we would like to determine automatically which questions are more suitable for validation by a semantic measure and which by a lexical measure. Automating this decision would improve the overall precision of the methodology.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>Features of the test data set (QA4MRE 2011 task)</figDesc><table><row><cell>Features</cell><cell>2011</cell></row><row><cell>Topics</cell><cell>3</cell></row><row><cell>Topic details</cell><cell>Climate Change,</cell></row><row><cell></cell><cell>Music &amp; Society and AIDS</cell></row><row><cell>Reading tests (documents)</cell><cell>4</cell></row><row><cell>Questions per document</cell><cell>10</cell></row><row><cell>Multiple-choice answers per question</cell><cell>5</cell></row><row><cell>Total of questions</cell><cell>120</cell></row><row><cell>Total of answers</cell><cell>600</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 .</head><label>2</label><figDesc>Comparison of the number of correct answers obtained with different similarity measures</figDesc><table><row><cell>Similarity Measures</cell><cell>2011</cell></row><row><cell>PMI</cell><cell>15</cell></row><row><cell>Path</cell><cell>8</cell></row><row><cell>Lexical</cell><cell>18</cell></row><row><cell>Both</cell><cell>21</cell></row><row><cell>Total (PMI + Lexical + Both)</cell><cell>54</cell></row><row><cell>Precision (Lexical + Both)</cell><cell>32%</cell></row><row><cell cols="2">Precision (PMI + Lexical + Both) 45%</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">The Cross-Lingual Evaluation Forum: http://www.clef-initiative.eu</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">http://wing.comp.nus.edu.sg/ qiu/NLPTools/JavaRAP.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">http://nlp.stanford.edu/software/tagger.shtml</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">http://lucene.apache.org/core/</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Natural language question answering: the view from here</title>
		<author>
			<persName><forename type="first">L</forename><surname>Hirschman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Gaizauskas</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nat. Lang. Eng</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="275" to="300" />
			<date type="published" when="2001-12">December 2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Overview of QA4MRE at CLEF 2011: Question answering for machine reading evaluation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Peñas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">H</forename><surname>Hovy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Forner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Á</forename><surname>Rodrigo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">F E</forename><surname>Sutcliffe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Forascu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Sporleder</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF</title>
				<imprint>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Overview of QA4MRE at CLEF 2012: Question answering for machine reading evaluation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Peñas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">H</forename><surname>Hovy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Forner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Á</forename><surname>Rodrigo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">F E</forename><surname>Sutcliffe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Sporleder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Forascu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Benajiba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Osenova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF</title>
				<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Deep Read: a reading comprehension system</title>
		<author>
			<persName><forename type="first">L</forename><surname>Hirschman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Light</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Breck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Burger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics. ACL &apos;99</title>
		<title level="s">Association for Computational Linguistics</title>
		<meeting>the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics. ACL &apos;99<address><addrLine>Stroudsburg, PA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1999">1999</date>
			<biblScope unit="page" from="325" to="332" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A rule-based question answering system for reading comprehension tests</title>
		<author>
			<persName><forename type="first">E</forename><surname>Riloff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Thelen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the ANLP/NAACL Workshop on Reading comprehension tests as evaluation for computer-based language understanding systems</title>
		<title level="s">Association for Computational Linguistics</title>
		<meeting>the ANLP/NAACL Workshop on Reading comprehension tests as evaluation for computer-based language understanding systems<address><addrLine>Stroudsburg, PA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2000">2000</date>
			<biblScope unit="page" from="13" to="19" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">A machine learning approach to answering questions for reading comprehension tests</title>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">T</forename><surname>Ng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">H</forename><surname>Teo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L P</forename><surname>Kwan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of EMNLP/VLC-2000 at ACL-2000</title>
				<meeting>EMNLP/VLC-2000 at ACL-2000</meeting>
		<imprint>
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Importance of pronominal anaphora resolution in question answering systems</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Vicedo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ferrandez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics (ACL</title>
				<meeting>the 38th Annual Meeting of the Association for Computational Linguistics (ACL</meeting>
		<imprint>
			<date type="published" when="2000">2000</date>
			<biblScope unit="page" from="555" to="562" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">FCC: Three approaches for semantic textual similarity</title>
		<author>
			<persName><forename type="first">M</forename><surname>Carrillo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Vilariño</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Pinto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Tovar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>León</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Castillo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">proceedings of Semeval 2012</title>
		<title level="s">Association for Computational Linguistics</title>
		<meeting>Semeval 2012<address><addrLine>Montréal, Canada</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="631" to="634" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Corpus-based and knowledge-based measures of text semantic similarity</title>
		<author>
			<persName><forename type="first">R</forename><surname>Mihalcea</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Corley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Strapparava</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 21st national conference on Artificial intelligence -Volume 1. AAAI&apos;06</title>
				<meeting>the 21st national conference on Artificial intelligence -Volume 1. AAAI&apos;06</meeting>
		<imprint>
			<publisher>AAAI Press</publisher>
			<date type="published" when="2006">2006</date>
			<biblScope unit="page" from="775" to="780" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">WordNet: a lexical database for the English language</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">A</forename><surname>Miller</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1995">1995</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Word association norms, mutual information, and lexicography</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">W</forename><surname>Church</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Hanks</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Comput. Linguist</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="22" to="29" />
			<date type="published" when="1990-03">March 1990</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
