<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Translation inference across dictionaries via a combination of graph-based methods and co-occurrence statistics</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Thomas</forename><surname>Proisl</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Corpus Linguistics Group</orgName>
								<orgName type="institution">Friedrich-Alexander-Universität Erlangen-Nürnberg</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Philipp</forename><surname>Heinrich</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Corpus Linguistics Group</orgName>
								<orgName type="institution">Friedrich-Alexander-Universität Erlangen-Nürnberg</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Stefan</forename><surname>Evert</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Corpus Linguistics Group</orgName>
								<orgName type="institution">Friedrich-Alexander-Universität Erlangen-Nürnberg</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Besim</forename><surname>Kabashi</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Corpus Linguistics Group</orgName>
								<orgName type="institution">Friedrich-Alexander-Universität Erlangen-Nürnberg</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">Translation inference across dictionaries via a combination of graph-based methods and co-occurrence statistics</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">71E564B18C38912E93EA0498934DAAA0</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-23T23:24+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This system description explains how to use several bilingual dictionaries and aligned corpora in order to create translation candidates for novel language pairs. It proposes (1) a graph-based approach which does not depend on cyclical translations and (2) a combination of this method with a collocation-based model using the multilingually aligned Europarl corpus.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Translation of lexical items is a fundamental problem in computational linguistics. It plays an important role not only in machine translation but also in more specific tasks such as the mapping of queries, tags, denotators, and the like across different languages. As ever more bilingual lexicons become electronically available for some language pairs, the question arises of how to use them to create new bilingual dictionaries.</p><p>The organizers of the shared task on Translation Inference Across Dictionaries (TIAD) provided partial bilingual dictionaries for the following four language chains, covering the eight languages German (de), English (en), Portuguese (pt), Japanese (ja), Spanish (es), Dutch (nl), Danish (da), and French (fr):</p><p>1. German-English-Portuguese 2. German-Japanese-Spanish-Portuguese 3. German-Danish-French-Spanish-Portuguese 4. German-Dutch-Spanish-Danish-French-Portuguese</p><p>The resulting language graph is visualized in Figure <ref type="figure" target="#fig_0">1</ref>. In addition, the four chains include Portuguese-German dictionaries for "closing the loop" (dashed edge). According to the task guidelines, use of the Portuguese-German dictionaries is limited to validation purposes. The objective of the task is to create three new dictionaries (dotted edges): German-Portuguese, Danish-Spanish and Dutch-French.</p><p>A naïve approach to that problem would be to recursively collect all translation candidates: For each source word, take all translations of that word from the source-pivot₁ dictionary; then, for each translation, take all translations from the pivot₁-pivot₂ dictionary, and so on until the target language is reached. The problem with this approach is that it results in very noisy and divergent dictionaries.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>A common solution to that problem is to make use of cycles (cf. Section 2), in this case by utilizing the Portuguese-German dictionaries. We opted for a novel approach: Instead of relying on cycles, we apply a weighting scheme. We also experiment with combining the translation candidates found via this graph-based approach with candidates extracted from parallel corpora.<ref type="foot" target="#foot_0">1</ref> </p></div>
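<div xmlns="http://www.tei-c.org/ns/1.0"><p>The naïve approach described in Section 1 can be illustrated with a minimal sketch. It is not the code of any submitted system: the function name naive_candidates and the toy dictionaries are hypothetical, each dictionary is assumed to be a simple mapping from a word to a list of translations, and the collection is written iteratively rather than recursively.</p>
<p><code lang="python">
```python
def naive_candidates(word, dictionaries):
    """Follow every translation of `word` through a chain of dictionaries."""
    current = {word}
    for dictionary in dictionaries:
        # Replace each word by all of its translations in the next dictionary.
        current = {t for w in current for t in dictionary.get(w, ())}
    return current


# Hypothetical toy dictionaries (not taken from the TIAD data).
de_en = {"Bank": ["bank", "bench"]}
en_pt = {"bank": ["banco", "margem"], "bench": ["banco", "bancada"]}

candidates = naive_candidates("Bank", [de_en, en_pt])
print(sorted(candidates))
```
</code></p>
<p>Every sense of every pivot word contributes candidates, so the result set grows with each additional pivot language. This is the noise that motivates the weighting scheme.</p></div>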
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related work</head><p>The automatic creation of multilingual dictionaries, especially the macro-structure of their entries and annotation interfaces <ref type="bibr" target="#b4">(Kernerman, 2011)</ref> as well as the exploitation of resources such as aligned corpora and existing bilingual dictionaries, has attracted commercial and academic research projects for obvious reasons. <ref type="bibr" target="#b11">Tanaka and Umemura (1994)</ref>, for example, construct a bilingual dictionary using a third language as a pivot language by utilizing the structure of dictionaries and the lexical entries (nouns). They measure the nearness of the meaning of the lexical entries to distinguish between true translation equivalents and spurious ones introduced as a result of ambiguity in the pivot language. Similarly, <ref type="bibr" target="#b3">Kaji et al. (2008)</ref> construct a Japanese-Chinese dictionary using English as intermediate language. They use monolingual corpora of the first and second language to eliminate the spurious translations caused by the ambiguity of the third language. The wide-coverage monolingual corpora provide the basis for extracting word associations in one language and translation candidates in the target language. This method enables generating domain-specific translation candidates. <ref type="bibr" target="#b13">Villegas et al. (2016)</ref> infer new translations for the languages in a graph of as many as 22 bilingual dictionaries. They consider translation candidates up to three languages away and assign a confidence score to those candidates, which is based on the density of cycles containing the potential target. A cycle is a translation chain which starts and ends at the same lexical item (for a formal definition of translation chains, see Section 3.1). Similarly, <ref type="bibr" target="#b7">Mausam et al. (2009)</ref> rely on cycles ("translation circuits" in their terms) to match senses probabilistically, and <ref type="bibr" target="#b8">Saralegi et al. (2011)</ref> improve precision in pivot-based automatic creation of bilingual dictionaries by inverse consultation, i. e. by looking up translation candidates for all the possible candidates in the target language in the source language. This, however, only works if dictionaries in both directions are available. <ref type="bibr" target="#b2">Haghighi et al. (2008)</ref>, on the other hand, do not use a third language at all: They learn bilingual dictionaries only using monolingual corpora and word features in each language. Last but not least, using noisy dictionaries as input, <ref type="bibr" target="#b9">Shezaf and Rappoport (2010)</ref> present a method for generating higher-quality dictionaries: their method requires two (noisy) bilingual dictionaries (from the source language to the target language and vice versa) and two comparable monolingual corpora (one in the source language and one in the target language) as input and calculates similarity scores for translation candidates based on the number of words co-occurring with the source word that can be translated into words co-occurring with the target word.</p><p>The collocation-based approach described in the present paper, on the other hand, employs an idea similar to that of <ref type="bibr" target="#b6">Kovář et al. (2016)</ref>, who use a transformation of the Dice coefficient for extracting translation candidates from parallel corpora with sentence alignment.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">System description</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Graph-based approach</head><p>As mentioned in Section 1, we opted for a novel graph-based approach that does not rely on cycles. Instead, we use a weighting scheme. As an additional, self-imposed constraint, we do not make use of the Portuguese-German dictionaries at all.</p><p>For our weighting scheme, we do not only use the four paths provided by the task organizers but all available simple chains from a source language to a target language. Simple chains are paths that ignore the orientation of the edges and where no vertex can occur twice. We distinguish between language chains, i. e. chains from one language to another, as illustrated in Figure <ref type="figure" target="#fig_0">1</ref>, and translation chains, i. e. chains from one word to another, via the languages in a given language chain.</p><p>Formally, let $L_{s,t}$ denote the set of language chains from source language $s$ to target language $t$. Each language chain $\ell \in L_{s,t}$ is assigned a weight</p><formula xml:id="formula_0">$w_\ell = \frac{1}{|\ell| + |r_\ell|}$, <label>(1)</label></formula><p>where $|\ell|$ is the length of the chain and $|r_\ell|$ is the number of edges in $\ell$ that are traversed in reverse. The weights are normalized such that</p><formula xml:id="formula_2">$\sum_{\ell \in L_{s,t}} w_\ell = 1$. <label>(2)</label></formula><p>The intuition behind these weights is that the more intervening languages we have and the more dictionaries we use in reverse, the more the quality suffers. Therefore, short chains should get a higher weight than long ones and using a dictionary in reverse should be penalized.</p><p>Let $R_{w,\ell}$ denote the set of translation chains from word $w$ in the source language of a language chain $\ell$ to words in the target language of that language chain. Each translation chain $r \in R_{w,\ell}$ connects $w$ to a potential translation equivalent $e = \tau(r)$. Each translation equivalent $e$ in the set of translation equivalents</p><formula xml:id="formula_4">$E_{w,\ell} = \{\tau(r) \mid r \in R_{w,\ell}\}$ <label>(3)</label></formula><p>is assigned a weight</p><formula xml:id="formula_5">$w_{e,\ell} = \frac{|\{r \in R_{w,\ell} \mid \tau(r) = e\}|}{|R_{w,\ell}|}$. <label>(4)</label></formula><p>This weight corresponds to the relative frequency of translation chains from $w$ to $e$ via the languages in language chain $\ell$. Now that we have weights for all language chains and for all translations along a language chain, we can obtain all translation equivalents in the target language $t$ for word $w$ from the source language $s$, i. e. $E_w = \bigcup_{\ell \in L_{s,t}} E_{w,\ell}$. Each translation equivalent $e \in E_w$ is assigned a weight</p><formula>$w_e = \sum_{\ell \in L_{s,t}} w_\ell \, w_{e,\ell}$. <label>(5)</label></formula><p>The weights are normalized such that $\sum_{e \in E_w} w_e = 1$. Now we can simply select the $n$ translation equivalents with the highest weights. But what is a suitable value for $n$, i. e. how can we determine the best number of translation equivalents for a given word? Let $R_w = \bigcup_{\ell \in L_{s,t}} R_{w,\ell}$ be the set of all chains from word $w$ in the source language $s$ to words in the target language $t$. Then, we set</p><formula xml:id="formula_7">$n = \left\lceil |E_w|^{1/c} \right\rceil$, <label>(6)</label></formula><p>where $\lceil x \rceil$ is the ceiling function and $c = \sum_{r \in R_w} |r| / |R_w|$. This means we approximate $n$ by the average number of translations for each word along the translation chains for word $w$.</p></div>
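<div xmlns="http://www.tei-c.org/ns/1.0"><p>The weighting scheme of Section 3.1 can be traced on a small example. The following sketch is an illustration under stated assumptions, not the authors' implementation: all function names and the toy chains are hypothetical, a translation chain is represented as a tuple of words from source to target, and the length of a translation chain is assumed to be its number of edges.</p>
<p><code lang="python">
```python
import math
from collections import Counter


def chain_weights(language_chains):
    """Normalized weights w_l from (length, reversed_edges) per language chain."""
    raw = {name: 1.0 / (length + rev)
           for name, (length, rev) in language_chains.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


def equivalent_weights(translation_chains, lang_weights):
    """Weights w_e: relative chain frequencies, mixed by language-chain weight."""
    scores = Counter()
    for name, chains in translation_chains.items():
        targets = Counter(chain[-1] for chain in chains)  # tau(r) is the last word
        for equivalent, count in targets.items():
            scores[equivalent] += lang_weights[name] * count / len(chains)
    total = sum(scores.values())
    return {equivalent: score / total for equivalent, score in scores.items()}


def number_of_equivalents(translation_chains):
    """n = ceil(|E_w| ** (1 / c)), with c the mean chain length in edges (assumed)."""
    all_chains = [chain for chains in translation_chains.values() for chain in chains]
    equivalents = {chain[-1] for chain in all_chains}
    c = sum(len(chain) - 1 for chain in all_chains) / len(all_chains)
    return math.ceil(len(equivalents) ** (1.0 / c))


# Hypothetical toy input: two language chains, neither traversed in reverse.
language_chains = {"de-en-pt": (2, 0), "de-ja-es-pt": (3, 0)}
translation_chains = {
    "de-en-pt": [
        ("Haus", "house", "casa"),
        ("Haus", "home", "casa"),
        ("Haus", "home", "lar"),
        ("Haus", "house", "moradia"),
    ],
    "de-ja-es-pt": [
        ("Haus", "ie", "casa", "casa"),
        ("Haus", "ie", "hogar", "lar"),
    ],
}

w_l = chain_weights(language_chains)
w_e = equivalent_weights(translation_chains, w_l)
n = number_of_equivalents(translation_chains)
best = sorted(w_e, key=lambda e: (-w_e[e], e))[:n]
print(w_l, w_e, n, best)
```
</code></p>
<p>In this toy setting the shorter chain receives weight 0.6 and the longer one 0.4. The equivalent reached by most translation chains in both language chains therefore ranks first, and the cut-off n keeps two of the three equivalents.</p></div>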
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Collocation-based approach</head><p>We make use of the Europarl corpus (see <ref type="bibr">Koehn, 2005: release v7</ref>) in its preprocessed and sentence-aligned form <ref type="bibr" target="#b12">(Tiedemann, 2012)</ref>.<ref type="foot" target="#foot_1">2</ref> As a further preprocessing step, all monolingual corpora except for the Portuguese one are lemmatized with off-the-shelf algorithms. Unfortunately, we were unable to lemmatize the Portuguese corpus in time. For the language pair de-pt, our procedure thus yields lexical surface realizations as translation candidates (see below). We retrieve translation candidates by analyzing first-order (syntagmatic) collocations. The procedure is implemented via the R package wordspace <ref type="bibr" target="#b1">(Evert, 2014)</ref>.<ref type="foot" target="#foot_2">3</ref> For each language pair, lemmata (or, in the case of Portuguese, types) are extracted together with their alignment beads from the corpus in order to create lemma-sentence matrices with the intersection of alignment beads as columns. As an example, the French corpus contains 28,100 lemmata, the Dutch one 36,048, and there is an intersection of 2,003,463 alignment beads.</p><p>These matrices are then transformed into one term-term co-occurrence matrix for each language pair. The nl-fr co-occurrence matrix from the example above thus has 36,048 rows and 28,100 columns. Subsequently, the Dice score is calculated for each lemma of the source language (if it occurs in the corpus) and each target term. The Dice score is a de-facto standard for the determination of translation candidates <ref type="bibr" target="#b10">(Smadja et al., 1996)</ref> and represents the harmonic mean of the conditional probabilities $P(\text{source} \mid \text{target})$ and $P(\text{target} \mid \text{source})$. Let $O_{11}$ denote the co-occurrence frequency of source and target term, $R_1$ the marginal frequency of the target term and $C_1$ the marginal frequency of the source term (notation and formula taken from <ref type="bibr" target="#b0">Evert, 2008)</ref>, then the Dice score can be calculated by means of</p><formula xml:id="formula_9">$\operatorname{dice}(O_{11}, R_1, C_1) = \frac{2 O_{11}}{R_1 + C_1}$. <label>(7)</label></formula><p>The higher its value, the stronger the association between source and target term. Thus, for every source term, the target terms with the highest Dice scores serve as translation candidates. Note that in this step we ignore all candidates which consist solely of punctuation marks and/or digits in order to improve translation quality.</p></div>
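<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Dice-based candidate extraction of Section 3.2 can be sketched in a few lines. The system itself is implemented with the R package wordspace. The following Python code is only an illustration: the function names and the toy alignment beads are hypothetical, and each term is assumed to be counted at most once per alignment bead.</p>
<p><code lang="python">
```python
import string
from collections import Counter
from itertools import product


def dice_scores(beads):
    """Dice score for every co-occurring (source, target) pair of terms."""
    src_freq, tgt_freq, joint = Counter(), Counter(), Counter()
    for src_lemmas, tgt_lemmas in beads:
        src_set, tgt_set = set(src_lemmas), set(tgt_lemmas)
        src_freq.update(src_set)                 # marginal frequency C_1
        tgt_freq.update(tgt_set)                 # marginal frequency R_1
        joint.update(product(src_set, tgt_set))  # co-occurrence frequency O_11
    return {
        (s, t): 2 * o11 / (tgt_freq[t] + src_freq[s])
        for (s, t), o11 in joint.items()
    }


def is_noise(term):
    """True if the term consists solely of punctuation marks and/or digits."""
    return all(ch.isdigit() or ch in string.punctuation for ch in term)


def top_candidates(scores, source, k):
    """The k target terms with the highest Dice scores for one source term."""
    ranked = sorted(
        ((score, t) for (s, t), score in scores.items()
         if s == source and not is_noise(t)),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [t for _, t in ranked[:k]]


# Hypothetical nl-fr alignment beads: (source lemmata, target lemmata).
beads = [
    (["huis", "groot"], ["maison", "grand", "."]),
    (["huis"], ["maison", "."]),
    (["groot", "boek"], ["grand", "livre", "."]),
]

scores = dice_scores(beads)
print(top_candidates(scores, "huis", 2))
```
</code></p>
<p>In the toy data the full stop co-occurs with every source term and would otherwise rank second for the source term huis, which illustrates why purely punctuation or digit candidates are filtered out.</p></div>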
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Combination of collocation-based and graph-based approaches</head><p>Lacking an evaluation measure that determines the trade-off between precision and recall of the translation candidates, we opted for a very simple combination of the two approaches above: the final list of candidates is obtained as the union of the graph-based candidates and four collocation-based candidates.<ref type="foot" target="#foot_3">4</ref> </p></div>
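<div xmlns="http://www.tei-c.org/ns/1.0"><p>The combination step of Section 3.3 amounts to a set union. The sketch below is hypothetical in its function name, its candidate lists and its ordering of the result. It assumes that both lists are ranked best first and that the four collocation-based candidates are the four highest-ranked ones.</p>
<p><code lang="python">
```python
def combine(graph_candidates, collocation_candidates, k=4):
    """Union of all graph-based and the first k collocation-based candidates."""
    combined = list(graph_candidates)
    for candidate in collocation_candidates[:k]:
        if candidate not in combined:  # keep each candidate only once
            combined.append(candidate)
    return combined


# Hypothetical candidate lists for one German source word.
graph_based = ["casa", "lar"]
collocation_based = ["casa", "moradia", "lar", "predio", "edificio"]

final = combine(graph_based, collocation_based)
print(final)
```
</code></p>
<p>With an overlap of two candidates, as in this toy input, the collocation-based list contributes two additional candidates.</p></div>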
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Evaluation</head><p>The evaluation procedure was announced after submission of the translation candidates and takes only precision (not recall)<ref type="foot" target="#foot_4">5</ref> into account. For each language pair and system, 100 source-target candidates were sampled. Subsequently, each translation pair was reviewed manually to determine whether the target term was a correct (possible) translation of the source term. Two scalar performance measures are given (see Table <ref type="table" target="#tab_0">1</ref>): Precision is the percentage of (manually determined) correct translations among the proposed candidates. Additionally, "gold-precision" counts as "true positives" only those candidates that can be found in the organizers' (undisclosed) gold standard of translations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Results and discussion</head><p>Results for the graph-based approach ("graph") and the combination of collocation-based and graph-based approaches ("combined") can be found in Table <ref type="table" target="#tab_0">1</ref>. Two findings seem noteworthy: Firstly, the purely graph-based method consistently outperforms the combined approach for both evaluation measures. Secondly, in the nl-fr language-pair setting, both systems are substantially outperformed by the baseline, whereas in the other two settings both systems outperform the baseline.</p><p>Obviously, our strategy of providing multiple translation candidates proved to be suboptimal for the official task evaluation, which only focused on precision. Note, however, that our system is easily adaptable in case a reasonable evaluation measure is given a priori: both graph-based and collocation-based methods yield n-best lists of candidates with a scalar score function, enabling a more sophisticated selection of actual candidates. Advantages of our proposed graph-based system are twofold: Firstly, it does not require cycles, i. e. it can be applied in a greater variety of settings. Secondly, the weighting scheme takes into account the number of dictionaries involved and the directionality in which they are used on the one hand, and, on the other, the relative frequency of translation chains leading to a translation equivalent; thus, the system automatically determines a suitable number of translation equivalents.</p><p>The proposal of further candidates retrieved from the Europarl corpus has turned out to be counterproductive for the reasons elaborated above. Nevertheless, given more realistic settings in which recall of all (or most) possible translations is important, retrieval of candidates not contained in any of the bilingual corpora (or of those with atypical translation paths) seems desirable. Future work will thus use more sophisticated methods for combining graph-based and collocation-based candidates, e. g. by using the Borda count or the Schulze method.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. The language graph. Numbers on the edges show which language chains in the above enumeration are using the respective edge. Dotted edges indicate the desired new direct translation paths.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>Evaluation measures for all language pairs for both submitted systems (based on samples of size 100). Precision is the percentage of correct translations among the sampled candidates; gold-precision is the percentage of correct translations that were also part of the organizers' gold standard. The baseline figures were provided by the task organizers and are based on a depth-first search for cycles of translations which include the desired source and target languages.</figDesc><table><row><cell>6</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">We are talking about candidates, since automatic translation techniques yield n-best lists of terms. Both the evaluation function which ranks the candidate terms as well as the precise value of n are at the very core of lexical translation research.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">http://opus.lingfil.uu.se/Europarl.php</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">http://wordspace.r-forge.r-project.org/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">The graph-based method yields between two and three candidates on average depending on the language pair. Assuming an overlap of one or two candidates between both methods, this heuristic guarantees that the collocation-based approach delivers approximately two additional candidates.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">Note that recall is not well-defined in the case of lexical translation: while human experts may easily agree on some unambiguous translations (thus making it feasible to create a gold-standard for calculating precision), they might disagree quickly on particular or unusual translations (thus making it impossible to create a gold-standard for measuring recall).</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_5">That is to say: if the system is to focus on precision, a very small number of candidates should be given, and their selection should be based on the distribution of the score functions of both the graph-based and the collocation-based candidate lists.</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Corpora and collocations</title>
		<author>
			<persName><forename type="first">Stefan</forename><surname>Evert</surname></persName>
		</author>
		<ptr target="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.159.6220&amp;rep=rep1&amp;type=pdf" />
	</analytic>
	<monogr>
		<title level="m">Corpus Linguistics. An International Handbook</title>
				<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="1212" to="1248" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Distributional Semantics in R with the wordspace Package</title>
		<author>
			<persName><forename type="first">Stefan</forename><surname>Evert</surname></persName>
		</author>
		<ptr target="http://aclweb.org/anthology/C14-2024" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: System Demonstrations</title>
				<meeting>COLING 2014, the 25th International Conference on Computational Linguistics: System Demonstrations</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Learning Bilingual Lexicons from Monolingual Corpora</title>
		<author>
			<persName><forename type="first">Aria</forename><surname>Haghighi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Percy</forename><surname>Liang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Taylor</forename><surname>Berg-Kirkpatrick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dan</forename><surname>Klein</surname></persName>
		</author>
		<ptr target="http://www.aclweb.org/anthology/P08-1#page=815" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies</title>
				<meeting>the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies</meeting>
		<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="771" to="779" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Automatic Construction of a Japanese-Chinese Dictionary via English</title>
		<author>
			<persName><forename type="first">Hiroyuki</forename><surname>Kaji</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shin'ichi</forename><surname>Tamamura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dashtseren</forename><surname>Erdenebat</surname></persName>
		</author>
		<ptr target="http://www.lrec-conf.org/proceedings/lrec2008/pdf/175_paper.pdf" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC&apos;08)</title>
				<meeting>the Sixth International Conference on Language Resources and Evaluation (LREC&apos;08)</meeting>
		<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="699" to="706" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">From dictionary to database: Creating a global multi-language series</title>
		<author>
			<persName><forename type="first">Ilan</forename><surname>Kernerman</surname></persName>
		</author>
		<ptr target="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.698.6117&amp;rep=rep1&amp;type=pdf" />
	</analytic>
	<monogr>
		<title level="m">Electronic Lexicography in the 21st Century: New Applications for New Users. Proceedings of eLex 2011</title>
				<meeting>eLex 2011<address><addrLine>Bled</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2011-11-12">10-12 November 2011</date>
			<biblScope unit="page" from="113" to="121" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Europarl: A parallel corpus for statistical machine translation</title>
		<author>
			<persName><forename type="first">Philipp</forename><surname>Koehn</surname></persName>
		</author>
		<ptr target="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.459.5497&amp;rep=rep1&amp;type=pdf" />
	</analytic>
	<monogr>
		<title level="m">MT summit</title>
				<imprint>
			<date type="published" when="2005">2005</date>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page" from="79" to="86" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Sketch Engine for Bilingual Lexicography</title>
		<author>
			<persName><forename type="first">Vojtěch</forename><surname>Kovář</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Vít</forename><surname>Baisa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Miloš</forename><surname>Jakubíček</surname></persName>
		</author>
		<idno type="DOI">10.1093/ijl/ecw029</idno>
		<ptr target="https://doi.org/10.1093/ijl/ecw029" />
	</analytic>
	<monogr>
		<title level="j">International Journal of Lexicography</title>
		<idno type="ISSN">0950-3846</idno>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="339" to="352" />
			<date type="published" when="2016-09">September 2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Compiling a massive, multilingual dictionary via probabilistic inference</title>
		<author>
			<persName><surname>Mausam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Stephen</forename><surname>Soderland</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Oren</forename><surname>Etzioni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Daniel</forename><forename type="middle">S</forename><surname>Weld</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michael</forename><surname>Skinner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jeff</forename><surname>Bilmes</surname></persName>
		</author>
		<ptr target="http://aclweb.org/anthology/P09-1030" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP</title>
				<meeting>the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP</meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2009">2009</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="262" to="270" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Analyzing methods for improving precision of pivot based bilingual dictionaries</title>
		<author>
			<persName><forename type="first">Xabier</forename><surname>Saralegi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Iker</forename><surname>Manterola</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Iñaki</forename><surname>San Vicente</surname></persName>
		</author>
		<ptr target="http://aclweb.org/anthology/D11-1078" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Conference on Empirical Methods in Natural Language Processing</title>
				<meeting>the Conference on Empirical Methods in Natural Language Processing</meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="846" to="856" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Bilingual lexicon generation using non-aligned signatures</title>
		<author>
			<persName><forename type="first">Daphna</forename><surname>Shezaf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ari</forename><surname>Rappoport</surname></persName>
		</author>
		<ptr target="http://aclweb.org/anthology/P10-1011" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics</title>
				<meeting>the 48th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="98" to="107" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Translating collocations for bilingual lexicons: A statistical approach</title>
		<author>
			<persName><forename type="first">Frank</forename><surname>Smadja</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kathleen</forename><forename type="middle">R</forename><surname>McKeown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Vasileios</forename><surname>Hatzivassiloglou</surname></persName>
		</author>
		<ptr target="http://aclweb.org/anthology/J96-1001" />
	</analytic>
	<monogr>
		<title level="j">Computational Linguistics</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="38" />
			<date type="published" when="1996">1996</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Construction of a bilingual dictionary intermediated by a third language</title>
		<author>
			<persName><forename type="first">Kumiko</forename><surname>Tanaka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kyoji</forename><surname>Umemura</surname></persName>
		</author>
		<ptr target="http://aclweb.org/anthology/C94-1048" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 15th conference on Computational linguistics</title>
				<meeting>the 15th conference on Computational linguistics</meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="1994">1994</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="297" to="303" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Parallel Data, Tools and Interfaces in OPUS</title>
		<author>
			<persName><forename type="first">Jörg</forename><surname>Tiedemann</surname></persName>
		</author>
		<ptr target="http://www.lrec-conf.org/proceedings/lrec2012/pdf/463_Paper.pdf" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC&apos;12)</title>
				<meeting>the Eighth International Conference on Language Resources and Evaluation (LREC&apos;12)</meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="2214" to="2218" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Leveraging RDF graphs for crossing multiple bilingual dictionaries</title>
		<author>
			<persName><forename type="first">Marta</forename><surname>Villegas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Maite</forename><surname>Melero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Bel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gracia</surname></persName>
		</author>
		<ptr target="http://www.lrec-conf.org/proceedings/lrec2016/pdf/613_Paper.pdf" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC&apos;16)</title>
				<meeting>the Tenth International Conference on Language Resources and Evaluation (LREC&apos;16)</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="868" to="876" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
