<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using Machine Learning for Translation Inference Across Dictionaries</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kathrin Donandt</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christian Chiarcos</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maxim Ionov</string-name>
          <email>ionovg@cs.uni-frankfurt.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Goethe-Universitat Frankfurt</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper describes our contribution to the closed track of the Shared Task Translation Inference across Dictionaries (TIAD-2017),<sup>1</sup> held in conjunction with the first Conference on Language Data and Knowledge (LDK-2017). In our approach, we use supervised machine learning to predict high-quality candidate translation pairs. We train a Support Vector Machine using several features, mostly of the translation graph, but also taking into consideration string similarity (Levenshtein distance). As the closed track does not provide manual training data, we define positive training examples as translation candidate pairs which occur in a cycle in which there is a direct connection.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>back to a word in the source language. This means that data originally provided
‘to close the loop’ contributes to the training data, an approach approved in
coordination with the Shared Task organizers.<sup>2</sup></p>
    </sec>
    <sec id="sec-2">
      <title>Data and preprocessing</title>
      <p>The data used in the Shared Task is provided by KDictionaries Ltd. and
consists of excerpts of bilingual learner dictionaries. Dictionary fragments for the
following language pairs are provided:
– German (de) ↦ Danish (da), Dutch (nl), English (en), Japanese (jp)
– Danish (da) ↦ French (fr)
– Dutch (nl) ↦ Spanish (es)
– French (fr) ↦ Spanish (es), Brazilian Portuguese (pt-BR)
– Japanese (jp) ↦ Spanish (es)
– Spanish (es) ↦ Brazilian Portuguese (pt-BR), Danish (da)
– English (en) ↦ Brazilian Portuguese (pt-BR)
The dictionary information is not exhaustive, but limited to sample data
accounting for a selection of German and Brazilian Portuguese words along the
following paths:
– de ↦ en ↦ pt-BR (↦ de)<sup>3</sup>
– de ↦ jp ↦ es ↦ pt-BR (↦ de)
– de ↦ da ↦ fr ↦ es ↦ pt-BR (↦ de)
– de ↦ nl ↦ es ↦ da ↦ fr ↦ pt-BR (↦ de)
The task is to produce dictionaries for three novel language combinations: de ↦
pt-BR, da ↦ es, and nl ↦ fr.</p>
      <p>Following the baseline system,<sup>4</sup> we represent dictionary entries in a graph:
– The original bilingual dictionaries are given in a tabular format, with one
row comprising seven attributes (word, part of speech and example phrase
for the source word and the target word, respectively, as well as an ID containing
the source and target language abbreviations).
– For every source word and every target word, we create a node which contains
the word itself, its language and its part of speech.
– Two nodes are connected by a directed edge if they are given as source and
target in the original data.
– For all languages except the source and target language of the extrapolated
dictionary, we also add an edge in the opposite direction.<sup>5</sup>
– Multiple nodes with the same attributes are unified. Thus, words which
appear in several dictionaries are connected to several target words.
The baseline builds on cycles retrieved from this data, which may involve one or
more pivot languages, e.g. 'skrivebord'@da → 'bureau'@fr →
'escrivaninha'@pt-BR → 'escritorio'@es → 'skrivebord'@da. The TIAD baseline implements a
depth-first search for cycles over this data and returns the first cycle
encountered. In our approach, we extract all possible cycles using a modified version
of this search. For the source and target languages of the dictionary to be
inferred, we additionally extract all paths from any source language word to any target
language word. In the absence of manually devised gold data, we use these cycles
and paths as training and test data for the machine learning.
<sup>2</sup> ‘Closing the loop’ is the core strategy of the provided baseline system, meaning
that translation candidates wA →? wC are pre-filtered to instances where a
back-translation wC →? wA can be extrapolated from the data. In our experiments,
we found that using this back-translation information outperforms any approach
operating on features of the translation path from wA to wC alone.
<sup>3</sup> The Portuguese-German sets are provided for the sole purpose of ‘closing the loop’,
i.e., selecting valid and invalid translation pairs as in the baseline implementation,
but not to be reversed.
<sup>4</sup> https://gitlab.com/kd-public/tiad-2017 baseline</p>
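      <p>The graph construction and the exhaustive search for cycles and paths can be sketched as follows. This is a minimal illustration under stated assumptions, not the baseline code or our actual implementation: the function names build_graph and all_paths, the six-column row layout, the max_nodes bound and the reading of the edge-reversal rule (an edge is reversed only if neither endpoint belongs to a language of the dictionary to be inferred) are all chosen for the example.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict


def build_graph(rows, no_reverse_langs):
    """Build a directed graph from dictionary rows.

    rows: (src_word, src_lang, src_pos, tgt_word, tgt_lang, tgt_pos) tuples
    (hypothetical layout). Nodes are (word, language, part_of_speech) tuples,
    so entries with identical attributes are unified automatically.
    """
    graph = defaultdict(set)
    for sw, sl, sp, tw, tl, tp in rows:
        src, tgt = (sw, sl, sp), (tw, tl, tp)
        graph[src].add(tgt)
        graph[tgt]  # register the target node even if it has no outgoing edge
        # Assumed reading of the reversal rule: add the opposite edge only if
        # neither language belongs to the dictionary that is being inferred.
        if sl not in no_reverse_langs and tl not in no_reverse_langs:
            graph[tgt].add(src)
    return graph


def all_paths(graph, start, goal, max_nodes=7):
    """Yield every simple path from start to goal (depth-first, exhaustive).

    With start == goal the generator yields cycles through start. Paths that
    already contain max_nodes nodes are not extended any further.
    """
    stack = [[start]]
    while stack:
        path = stack.pop()
        for nxt in graph[path[-1]]:
            if nxt == goal:
                yield path + [nxt]
            elif nxt not in path and len(path) < max_nodes:
                stack.append(path + [nxt])
```
]]></preformat>
      <p>Unlike a search that stops at the first cycle, this generator enumerates all of them, which is what makes the procedure expensive on larger graphs.</p>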
    </sec>
    <sec id="sec-3">
      <title>Approach</title>
      <p>We use a simple Support Vector Machine (SVM) for classifying a
pair of source word and target word as a valid or invalid translation. In addition, we let the SVM
determine both the likelihood of a pair belonging to the positive class and that
of it belonging to the negative class.</p>
      <sec id="sec-3-1">
        <title>Defining training data</title>
        <p>
          We define positive instances as those pairs which occur in a cycle in which
there is a direct connection (length 1) between the target word and the source
word, i.e.,
1. we only consider pairs which occur in the provided dictionary as positive
examples, to be sure that the pair really is a valid translation,<sup>6</sup> and
2. we do not automatically consider every pair occurring in a cycle as a valid
translation.<sup>7</sup>
This approach formalizes the observation that the cycle criterion implemented
in the baseline is likely to yield invalid translations in the case of correlated
polysemy, cf. the following cycle where the correlated polysemy of Esperanto pojno
and Spanish muñeca leads to a wrong translation pair 'doll'@en and
'Handgelenk'@de (i.e., ‘wrist’) [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]:
        </p>
        <p>We define negative instances as the word pairs which occur in a path but not in
a cycle; these amount to 17 373 pairs in total. From these, we randomly sample 1 080
training instances, matching the number of positive training examples.
<sup>5</sup> This corresponds to the direction reversal as implemented by the baseline algorithm.
<sup>6</sup> For nl-fr, we therefore do not have positive examples for the SVM training, as there
is no nl-fr dictionary in the test data.
<sup>7</sup> The baseline treats a pair as a valid translation if at least one cycle is found during
the depth-first search.
        </p>
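        <p>The definition of positive and negative instances and the balanced sampling of negatives might be expressed as in the following sketch. The function names, the representation of pairs as (source, target) tuples and the fixed random seed are assumptions made for illustration only.</p>
        <preformat><![CDATA[
```python
import random


def label_pairs(cycle_pairs, path_pairs, edges):
    """Split candidate (source, target) pairs into positives and negatives."""
    # Positive: the pair occurs in a cycle and the dictionary data contain a
    # direct edge from the target word back to the source word.
    positives = {(s, t) for (s, t) in cycle_pairs if (t, s) in edges}
    # Negative: the pair occurs on some path but in no cycle.
    negatives = set(path_pairs) - set(cycle_pairs)
    return positives, negatives


def balanced_training_set(positives, negatives, seed=0):
    """Sample as many negatives as there are positives and attach labels."""
    rng = random.Random(seed)  # seed is an assumption, for reproducibility
    sampled = rng.sample(sorted(negatives), k=len(positives))
    return [(pair, 1) for pair in sorted(positives)] + [(pair, 0) for pair in sampled]
```
]]></preformat>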
      </sec>
      <sec id="sec-3-2">
        <title>Training features</title>
        <p>For any given translation pair (wsrc, wtgt), we consider the following features:</p>
      </sec>
      <sec id="sec-3-3">
        <title>Number of paths from wsrc to wtgt (NumP)</title>
        <p>A large number of such paths may indicate that the
translation pair is valid, as there are many ways to get from the
source word to the target word, varying in the number, the order and
the language of the pivot words.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Frequency of source word in a dictionary (MaxOccDict)</title>
        <p>Different translation possibilities for a source word in a dictionary indicate high
polysemy of this word, which makes its translation more difficult. For words
occurring as source word in several dictionaries, we take the maximum number
of occurrences.</p>
      </sec>
      <sec id="sec-3-5">
        <title>Minimum/maximum path length (PLen)</title>
        <p>A short path makes a change of meaning between source and target word
less probable, whereas a long path makes it more likely. The existence of a
very short path might thus be a good sign. We take both the minimum and the
maximum path length, as the maximum length might help the SVM to
identify bad pairs.</p>
      </sec>
      <sec id="sec-3-6">
        <title>Difference of paths (PDiff)</title>
        <p>We look at how the paths between wsrc and wtgt (P in the following) differ:
a) For every path p ∈ P, we retrieve the set of languages involved. We then
count the number of distinct sets (language difference).
b) For every path p ∈ P, we retrieve the set of words (nodes of the graph)
involved. For every word set w, Pw ⊆ P is the set of paths with the same
word set. The number of switches sw(Pw) is the number of paths in Pw,
except that for paths for which a reversed version also exists, only one is
counted. If the same set of words occurs on different paths, this is
likely to indicate reliable translation pairs. We return the maximum of sw(Pw) over all Pw as
the word sequence difference excluding reversal.</p>
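        <p>Both variants of the path difference can be sketched as follows. The code assumes that a path is a list of (word, language) nodes and that the ‘reversed version’ of a path is a path visiting the same pivot words in the opposite order; this reading, like the function names, is an assumption and not necessarily the implementation used for the submission.</p>
        <preformat><![CDATA[
```python
from collections import defaultdict


def language_difference(paths):
    """Number of distinct language sets among the paths (variant a)."""
    return len({frozenset(lang for _, lang in path) for path in paths})


def word_sequence_difference(paths):
    """Largest number of paths sharing one word set, reversals counted once (variant b)."""
    groups = defaultdict(set)
    for path in paths:
        inner = tuple(path[1:-1])           # pivot words between source and target
        canonical = min(inner, inner[::-1])  # a pivot order and its reversal coincide
        groups[frozenset(path)].add(canonical)
    return max((len(orders) for orders in groups.values()), default=0)
```
]]></preformat>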
      </sec>
      <sec id="sec-3-7">
        <title>Minimum/maximum path probability (PProb)</title>
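        <p>For two words wA and wB with a direct connection in a dictionary A ↦ B, we
calculate their probability as
<disp-formula><tex-math>P(w_A \to w_B) = \left|\{\, w'_B \mid w_A \to w'_B \in A \mapsto B \,\}\right|^{-1}.</tex-math></disp-formula>
The probability of any given path p = w0 → w1 → … → wn between
source word w0 and target word wn is
<disp-formula><tex-math>P(w_0 \to \dots \to w_n) = \prod_{i=1}^{n} P(w_{i-1} \to w_i).</tex-math></disp-formula></p>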
        <p>We return minimum and maximum path probabilities.</p>
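        <p>A minimal sketch of this feature is given below. It assumes that the graph maps each (word, language) node to the set of its direct translations and that all translations of a word into one language stem from the same dictionary; the function names are hypothetical.</p>
        <preformat><![CDATA[
```python
from math import prod


def edge_probability(graph, a, b):
    """Inverse of the number of translations of a into the language of b."""
    # Assumption: the target language identifies the dictionary A -> B.
    same_dictionary = [t for t in graph[a] if t[1] == b[1]]
    return 1.0 / len(same_dictionary)


def path_probability(graph, path):
    """Product of the edge probabilities along the path."""
    return prod(edge_probability(graph, a, b) for a, b in zip(path, path[1:]))


def min_max_path_probability(graph, paths):
    """Minimum and maximum path probability over all paths of a pair."""
    probabilities = [path_probability(graph, path) for path in paths]
    return min(probabilities), max(probabilities)
```
]]></preformat>
        <p>For a source word with two English translations, each of the two edges receives probability 0.5, so every additional ambiguous pivot further lowers the probability of the path.</p>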
      </sec>
      <sec id="sec-3-8">
        <title>Levenshtein distance on a path (Lev)</title>
        <p>With the exception of Japanese, the provided dictionaries involve only two
language families, Romance (French, Portuguese, Spanish) and Germanic
(English, German, Dutch, Danish). As it is likely that cognates occupy the
same semantic fields in languages descending from a common source, we
calculate the pairwise relative Levenshtein distance<sup>8</sup> for all words per language
family on a path and return their average value as the Levenshtein distance for
each path from wsrc to wtgt. As a feature for the pair wsrc, wtgt, we take the
average of all these path Levenshtein distances.</p>
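        <p>The following sketch illustrates one way to compute this feature, with relative Levenshtein taken as the edit distance divided by the summed length of both strings. The family table, the function names and the choice to return None when a path contains no two words of the same family are assumptions made for the example.</p>
        <preformat><![CDATA[
```python
from itertools import combinations

# Language families as named in the text; Japanese is left out on purpose.
FAMILY = {"fr": "romance", "pt-BR": "romance", "es": "romance",
          "en": "germanic", "de": "germanic", "nl": "germanic", "da": "germanic"}


def levenshtein(a, b):
    """Classical edit distance with unit costs (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def relative_levenshtein(a, b):
    """Edit distance divided by the sum of the lengths of both strings."""
    return levenshtein(a, b) / (len(a) + len(b))


def path_levenshtein(path):
    """Average relative distance over all same-family word pairs on a path."""
    scores = [relative_levenshtein(w1, w2)
              for (w1, l1), (w2, l2) in combinations(path, 2)
              if l1 in FAMILY and FAMILY[l1] == FAMILY.get(l2)]
    return sum(scores) / len(scores) if scores else None


def pair_levenshtein(paths):
    """Average of the path values over all paths between a word pair."""
    values = [v for v in map(path_levenshtein, paths) if v is not None]
    return sum(values) / len(values) if values else None
```
]]></preformat>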
        <p>We employ this feature set for training an SVM classifier,<sup>9</sup> and we also train
SVMs with every feature in isolation to assess the impact of individual features
on our gold data (valid translation pairs are pairs occurring in cycles and having
a direct connection from the target word back to the source word, and invalid pairs are
those which do not occur in a cycle). The classification is done by assigning the
labels 1 or 0 to candidate translation pairs.<sup>10</sup></p>
        <p>Table 1 lists precision, recall and F1 measure of our internal evaluation of
SVM performance and individual features, using 80% of the data we defined
as our gold data for training and the remaining 20% as test set. As mentioned
above, our definition of gold data does not include nl → fr pairs; thus, the results
for this pair are not present in the table. The last row contains the scores obtained when all
features are used to train the SVM.</p>
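        <p>The evaluation protocol (an 80/20 split followed by precision, recall and F1 on the held-out part) can be sketched with the standard library alone. The shuffling seed and the function names are assumptions; the sketch does not reproduce how the split was actually drawn.</p>
        <preformat><![CDATA[
```python
import random


def split_80_20(instances, seed=0):
    """Shuffle the instances and return (training 80%, test 20%)."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # seed is an assumption
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def precision_recall_f1(gold, predicted):
    """Scores for the positive class (label 1) of a binary classification."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```
]]></preformat>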
        <p>In general, classification performance for German and Portuguese is
relatively low, with classification using the path length (PLen) alone being roughly on a
par with the full feature set. Path length is a dominating factor for Danish and
Spanish. For this translation pair, the combination of all features yields higher
scores than any individual feature. For German and Portuguese, training the SVM
on all features gives a higher precision than any individual feature (or, in the case
of PLen, an equal one). However, the combination could
not outperform the single features in terms of F1 and recall.</p>
        <p>Reliable conclusions about the performance of individual factors apparently
require more substantial data, in order to counterbalance specific characteristics of
individual dictionaries, to generalize beyond the apparent noise in the data and
to avoid overfitting due to insufficient amounts of training data.</p>
        <p>
          As a general pattern, the Levenshtein-based approach seems to perform
poorly on both datasets. One reason may lie in the structure of the
predefined paths, where languages within one language family are usually adjacent.
On longer or more homogeneous paths, Levenshtein may have a greater impact.
Linguistic homogeneity (i.e., whether adjacent languages belong to the same
language family) may actually explain why Levenshtein is more successful for
Danish and Spanish, as the shortest path between them involves only a single
pivot language (French), which is historically related to Spanish, whereas the
shortest German-Portuguese path connects both languages via English, which is
(due to Romance influence) rather remote from other Germanic languages. In
this case, one may also ask whether the restriction of Levenshtein to predefined language
families should be extended to known language contact phenomena in order
to accommodate the special ties between English and Romance.
<sup>8</sup> Relative Levenshtein is defined here as the Levenshtein distance divided by the sum of
the lengths of both strings.
<sup>9</sup> We use the C-SVM [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] implementation of libsvm [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], accessed via the scikit-learn package
for machine learning in Python [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], with an RBF kernel, γ = 0.10, C = 1, equally
weighted classes and ε = 0.001.
<sup>10</sup> Our submission assigns probabilities, see below.
        </p>
        <p>
          A second aspect is that Levenshtein is actually a poor approximation of
phonological similarity (which would be a criterion for cognates and thus for semantic
overlap), as all kinds of character replacements are regarded as equally likely, whereas
sound change usually tends to preserve phonological characteristics (e.g., it is less
likely that /o/ corresponds to /t/ in a related language than that it corresponds
to /u/, but both substitutions are weighted equally in classical Levenshtein). An
alternative implementation with weighted Levenshtein would thus be advisable.
However, most of the modifications, e.g. the Damerau-Levenshtein algorithm [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ],
use additional data to obtain the optimal weights, which may contradict the
nature of the closed track of the Shared Task, i.e. generating translation pairs
without additional resources.
        </p>
        <p>Finally, alternative strategies to aggregate Levenshtein distance metrics may
produce different results, too. It is possible that such different adaptations of
Levenshtein would show a similar range of performance as the path-based
metrics, so that our results reveal little about the applicability of form-based
factors in general. Within the scope of the shared task, however, these could not
be explored to a greater extent.</p>
        <p>After training the SVM model, we let the model predict the probability of a
candidate translation pair being a valid translation. Following the baseline
implementation, we only consider pairs occurring in a cycle. We remove
all pairs for which the SVM returns a probability of less than 75%.</p>
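        <p>The final filtering step amounts to a simple threshold on the predicted probability of the positive class. In the sketch below, the dictionary of probabilities stands in for the output of the trained classifier; its layout and the function name are assumptions.</p>
        <preformat><![CDATA[
```python
def filter_candidates(probabilities, threshold=0.75):
    """Keep candidate pairs whose predicted probability reaches the threshold.

    probabilities: mapping from (source, target) pairs to the probability of
    the positive class, as predicted by the classifier (hypothetical layout).
    """
    return {pair: p for pair, p in probabilities.items() if p >= threshold}
```
]]></preformat>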
        <p>The TIAD Shared Task comes with two modes of evaluation: an
evaluation against existing dictionaries as gold data (Gold in the table), and a manual
assessment of precision on sample data instances (Manual in the table), both
provided by KDictionaries. According to the Shared Task results,<sup>11</sup> our
system outperforms the other participating systems in terms of manual and gold
precision.</p>
        <p>Table 2 lists the results, i.e. the precision values, of our system and the
baseline implementation, as calculated by the organizers. Recall was not calculated,
because it would require determining all possible valid translations, which is a
rather difficult endeavour and was not in the scope of the Shared Task.<sup>12</sup> For
the 'Gold' evaluation, only inferences that were in the gold standard data were
considered. The 'Manual' evaluation results were obtained by taking both the
gold standard data and human translators' evaluation into consideration. For
the 'Manual' evaluation, our system outperforms the baseline for both the da ↦ es
and the nl ↦ fr case. It should be noted, however, that according to the evaluation
on the gold standard data it is outperformed by the baseline for the da ↦ es and
de ↦ pt-BR pairs. The reason for this difference probably lies in borderline
cases which were not present in the existing dictionaries but were labeled as
correct by a human annotator.</p>
        <p>
          An obvious reason for the good performance of the baseline (and, for that
matter, our system) is that its prediction relies heavily on the existence of
cycles in the data, by which the potential noise from polysemy is eliminated
relatively efficiently. This result is very much in line with earlier research [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
However, it is also a slightly artificial scenario, as it is effectively only applicable
to languages for which at least two bilingual dictionaries with two other
languages already exist [
          <xref ref-type="bibr" rid="ref5 ref7">5, 7</xref>
          ]: In the practical reality of language documentation
or NLP research on low-resource languages, for which bootstrapping inferred
dictionaries would be particularly useful, normally only one major dictionary
(between the minority language and the national language or English) is
available, and if there are more, they are often limited in coverage (which limits the
value of such languages as pivot languages).<sup>13</sup> Most related research therefore
focuses on translation inference from simple paths rather than cycles [
          <xref ref-type="bibr" rid="ref1 ref10 ref14 ref15 ref8">14, 15, 1,
10, 8</xref>
          ]. Our internal evaluation (Tab. 1), which abstracts from this meta-factor,
indicates that path-based factors can successfully separate probable from
less probable candidate translations as measured against the cycle criterion, but
also that the combination of multiple path-based factors by means of machine
learning is likely to outperform ‘intuitive’ metrics such as path probability.
<sup>11</sup> Cf. https://tiad2017.wordpress.com/data/.
<sup>12</sup> Justification according to the organizers' note regarding the evaluation results.
        </p>
        <p>
          Along with path-based factors, etymological closeness has been considered a
major factor in such studies [
          <xref ref-type="bibr" rid="ref11 ref12">12, 11</xref>
          ]. Here, this factor has been approximated by
a relative Levenshtein distance metric. The unsatisfactory performance of this
metric in our scenario has been discussed above. It should be noted, however,
that conventional Levenshtein metrics are no longer considered to be state of
the art, and more elaborate approaches to detecting cognates are to be tested
in follow-up experiments.
        </p>
        <p>A third category of features, involving semantic or grammatical information,
requires external resources and was thus beyond our Shared Task contribution.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>This paper described our contribution to the closed track of the Shared Task on
Translation Inference across Dictionaries (TIAD-2017).</p>
      <p>
        Further improvement of our approach is expected to be achieved by modification
and inclusion of additional features for the SVM training. The features
used here were mainly properties of the path from source to target word in the
translation graph. A possible extension lies in the inclusion of word context
features [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]: As the example phrases in the provided dictionaries only constitute
a limited context, text corpora for each language should be consulted. Another
possibility of including context information would be the use of a
distributional semantics approach. Word embeddings trained on different corpora would
have to be mapped to a common semantic space in order to calculate vector
distances.
      </p>
      <p>In the overall evaluation, our system yields marginal (if any) improvement
over the cycle-based baseline. It is, however, more demanding in terms of
resources: The exhaustive search for cycles and paths in the graph is
computationally expensive and therefore not feasible for larger datasets. At the same time, as we
use a machine learning approach, the availability of massive training data is crucial,
so a computationally acceptable alternative to the exhaustive search should be
preferred.</p>
      <p>
        It should be noted that the setup of the task and the provided data favor
systems that make use of cycles or loops in translation chains. For low-resource
languages, where translation inference across dictionaries is probably most
relevant, this scenario is, however, rather unlikely, as multiple, large-coverage
dictionaries are mostly available for major languages. Our observations regarding
the impact of path-based factors (and possible limitations of Levenshtein-based
methods) are nevertheless also relevant for the more general case where only
non-cyclical sequences of dictionaries are available. For future editions of the
Shared Task, we would thus be interested in exploring a path-based rather than
cycle-based setup for translation inference across dictionaries.
<sup>13</sup> As a representative example of such low-resource dictionaries, one may consider the
Intercontinental Dictionary Series (IDS), which provides data for up to only 1310 (!)
entries per language [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <sec id="sec-4-1">
        <title>Acknowledgments</title>
        <p>The research described in this paper was conducted in the project ‘Linked Open
Dictionaries’ (LiODi, 2015-2020), funded by the German Ministry for Education
and Research (BMBF) as an Early Career Research Group on eHumanities.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bond</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ogura</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Combining linguistic resources to create a machine-tractable Japanese-Malay dictionary</article-title>
          .
          <source>Language Resources and Evaluation</source>
          <volume>42</volume>
          (
          <issue>2</issue>
          ),
          <fpage>127</fpage>
          –
          <lpage>136</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>C.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C.J.:</given-names>
          </string-name>
          <article-title>LIBSVM: A library for Support Vector Machines</article-title>
          .
          <source>ACM Transactions on Intelligent Systems and Technology (TIST)</source>
          <volume>2</volume>
          (
          <issue>3</issue>
          ),
          <fpage>1</fpage>
          –
          <lpage>27</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Cortes</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vapnik</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Support-vector networks</article-title>
          .
          <source>Machine Learning</source>
          <volume>20</volume>
          (
          <issue>3</issue>
          ),
          <fpage>273</fpage>
          –
          <lpage>297</lpage>
          (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Damerau</surname>
            ,
            <given-names>F.J.:</given-names>
          </string-name>
          <article-title>A technique for computer detection and correction of spelling errors</article-title>
          .
          <source>Commun. ACM</source>
          <volume>7</volume>
          (
          <issue>3</issue>
          ),
          <fpage>171</fpage>
          –
          <lpage>176</lpage>
          (Mar
          <year>1964</year>
          ), http://doi.acm.org/10.1145/363958.363994
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Istvan</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shoichi</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Bilingual dictionary generation for low-resourced language pairs</article-title>
          .
          <source>In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP-2009)</source>
          . pp.
          <fpage>862</fpage>
          –
          <lpage>870</lpage>
          . Edinburgh, Scotland
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Key</surname>
            ,
            <given-names>M.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Comrie</surname>
            ,
            <given-names>B</given-names>
          </string-name>
          . (eds.):
          <article-title>Intercontinental Dictionary Series (IDS). Max Planck Institute for Evolutionary Anthropology</article-title>
          , Leipzig (
          <year>2015</year>
          ), http://ids.clld.org/
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Lam</surname>
            ,
            <given-names>K.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Al Tarouti</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalita</surname>
            ,
            <given-names>J.K.</given-names>
          </string-name>
          :
          <article-title>Automatically creating a large number of new bilingual dictionaries</article-title>
          .
          <source>In: Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI-2015)</source>
          . pp.
          <fpage>2174</fpage>
          –
          <lpage>2180</lpage>
          . Austin, Texas
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Mairidan</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ishida</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hirayama</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Bilingual dictionary induction as an optimization problem</article-title>
          .
          <source>In: Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC-2014)</source>
          . pp.
          <fpage>2122</fpage>
          –
          <lpage>2129</lpage>
          . Reykjavik, Iceland (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Pedregosa</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varoquaux</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gramfort</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michel</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thirion</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grisel</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blondel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prettenhofer</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weiss</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dubourg</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          , et al.:
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          (Oct),
          <fpage>2825</fpage>
          –
          <lpage>2830</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Saralegi</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manterola</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vicente</surname>
            ,
            <given-names>I.S.</given-names>
          </string-name>
          :
          <article-title>Analyzing methods for improving precision of pivot based bilingual dictionaries</article-title>
          .
          In:
          <source>Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP-2011)</source>
          . pp.
          <fpage>846</fpage>
          –
          <lpage>856</lpage>
          .
          <publisher-loc>Edinburgh, Scotland</publisher-loc>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Schulz</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marko</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sbrissia</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nohama</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hahn</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Cognate mapping: A heuristic strategy for the semi-supervised acquisition of a Spanish lexicon from a Portuguese seed lexicon</article-title>
          .
          In:
          <source>Proceedings of the 20th International Conference on Computational Linguistics (COLING-2004)</source>
          . pp.
          <fpage>813</fpage>
          –
          <lpage>819</lpage>
          .
          <publisher-loc>Geneva, Switzerland</publisher-loc>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Skoumalová</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Bridge dictionaries as bridges between languages</article-title>
          .
          <source>International Journal of Corpus Linguistics</source>
          <volume>6</volume>
          (
          <issue>11</issue>
          ),
          <fpage>95</fpage>
          –
          <lpage>105</lpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Soderland</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Etzioni</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weld</surname>
            ,
            <given-names>D.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Skinner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bilmes</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , et al.:
          <article-title>Compiling a massive, multilingual dictionary via probabilistic inference</article-title>
          .
          In:
          <source>Proceedings of the Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the AFNLP (ACL-IJCNLP 2009)</source>
          . pp.
          <fpage>262</fpage>
          –
          <lpage>270</lpage>
          .
          <publisher-loc>Suntec, Singapore</publisher-loc>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Tanaka</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Umemura</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Construction of a bilingual dictionary intermediated by a third language</article-title>
          .
          In:
          <source>Proceedings of the 15th Conference on Computational Linguistics (COLING-1994)</source>
          . pp.
          <fpage>297</fpage>
          –
          <lpage>303</lpage>
          .
          <publisher-loc>Stroudsburg, PA, USA</publisher-loc>
          (
          <year>1994</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Tsuchiya</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Purwarianti</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wakita</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nakagawa</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Expanding Indonesian-Japanese small translation dictionary using a pivot language</article-title>
          .
          In:
          <source>Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics</source>
          . pp.
          <fpage>197</fpage>
          –
          <lpage>200</lpage>
          .
          <publisher-loc>Prague, Czech Republic</publisher-loc>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Villegas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Melero</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bel</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gracia</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Leveraging RDF graphs for crossing multiple bilingual dictionaries</article-title>
          .
          In:
          <source>Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC-2016)</source>
          .
          <publisher-loc>Portorož, Slovenia</publisher-loc>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Yujie</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Isahara</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Automatic construction of Japanese-Chinese translation dictionary using English as intermediary</article-title>
          .
          <source>Journal of Natural Language Processing</source>
          <volume>12</volume>
          (
          <issue>2</issue>
          ),
          <fpage>63</fpage>
          –
          <lpage>85</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>