<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A study of the saturation of analogical grids agnostically extracted from texts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rashel Fam</string-name>
          <email>fam.rashel@fuji.waseda.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yves Lepage</string-name>
          <email>yves.lepage@waseda.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>IPS, Waseda University 2-7 Hibikino</institution>
          ,
          <addr-line>Wakamatsu-ku, Kitakyushu-shi, 808-0135 Fukuoka-ken</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <fpage>13</fpage>
      <lpage>22</lpage>
      <abstract>
        <p>Analogical grids aim to capture the organization of the lexicon of a language. We conduct experiments on analogical grids extracted in four different languages with different morphological richness. We study the saturation of analogical grids against their size. We observe that the logarithm of the saturation of an analogical grid is linear in the logarithm of its size. More surprisingly, the coefficients of this log-log linear relation are extremely close across all four languages, even when the size or the genre of the corpus varies.</p>
      </abstract>
      <kwd-group>
        <kwd>analogical grids</kwd>
        <kwd>saturation</kwd>
        <kwd>organization of lexicon</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction and background</title>
      <preformatted>
show  : shows : showing  : showed
walk  : walks : walking  : walked
open  : opens : opening  :
study :       : studying :
read  : reads : reading  :

makan : dimakan : memakan : makanan
minum : diminum : meminum : minuman
main  :         :         : mainan
beli  : dibeli  :         :
</preformatted>
      <p>Figure 1: Two analogical grids, in English (left, shown first) and in Indonesian (right, shown second).</p>
      <p>Figure 1 shows two examples of analogical grids, one in English, the other in Indonesian. Such analogical grids may be automatically constructed from the set of words contained in a text. Each cell in an analogical grid either contains a word form or is empty. As exemplified in Figure 1 (left), a column (or a row) in an analogical grid usually exhibits similar word forms for different words: e.g., infinitive, present 3rd person singular, present participle, etc. for different English verbs on the left of Figure 1. Analogical grids are not paradigm tables, i.e., they are not the result of a linguistic formalization with explicit lexemes and exponents as in standard works in morphology, but they constitute a preliminary step in that direction. Like paradigm tables, analogical grids give a compact view of the organization of the lexicon, but they are the output of an empirical procedure, e.g., the one introduced in [4].</p>
      <p>This work was supported by a JSPS Grant, Number 15K00317 (Kakenhi C), entitled Language productivity: efficient extraction of productive analogical clusters and their evaluation using statistical machine translation.</p>
      <p>Copyright © 2017 for this paper by its authors. Copying permitted for private and academic purposes. In Proceedings of the ICCBR 2017 Workshops. Trondheim, Norway.</p>
      <p>Analogical grids can be used to study word productivity in a given language
as in [12, 9, 6]. They can also be used to make comparisons across languages
as in [4], where the goal is to explain unseen words by using analogical grids
automatically built from the set of all words contained in texts in 12 different
languages.</p>
      <p>In this paper, we report an interesting phenomenon observed when building
analogical grids in various languages using the method in [4]. This
phenomenon relates the saturation of the obtained analogical grids to their size. The
experimental results suggest that the coefficients which characterize the relation
are not influenced by the size, the genre or the language of the texts used.</p>
      <p>The paper is organized as follows: Section 2 introduces basic notions related
to analogical grids. Section 3 presents our experiments on four languages with
different richness in morphology. It analyzes the results and explores the
relationship between the saturation and the size of analogical grids. Section 4 presents
further experiments to investigate this relation. Section 5 concludes.</p>
    </sec>
    <sec id="sec-2">
      <title>Basic notions</title>
      <p>In this section, we mathematically define the basic notions related to analogical
grids. The method to extract such analogical grids has already been presented
elsewhere [8, 4].</p>
      <sec id="sec-2-1">
        <title>Illustration with toy data</title>
        <p>Anto memakan nasi dan meminum air. Nasi itu dibeli di pasar. Di pasar,
Anto melihat mainan. Anto senang main bola. Setelah main, Anto suka
minum es dan makan cilok. Makanan dan minuman itu juga dia beli di
pasar. Es dan cilok memang enak dimakan dan diminum selesai olahraga.</p>
        <p>air anto beli bola cilok dan di dia dibeli dimakan diminum enak es itu juga
main mainan makan makanan melihat memakan memang meminum
minum minuman nasi olahraga pasar selesai senang setelah suka</p>
        <p>Figure 2: A constructed example text in Indonesian (top) and the list of words extracted from it, sorted in lexicographic order (bottom).</p>
        <p>The top of Figure 2 is a forged example text in Indonesian, a language which
is known for its relative richness in derivational morphology. We intentionally do
not give its translation into English to place the reader in the agnostic position
of the computer in front of such data. The list of words, sorted in lexicographic
order, that can be extracted from this text, is given at the bottom of Figure 2.</p>
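        <p>The extraction of the word list from the text of Figure 2 can be sketched as follows. This is only an illustration: the tokenization rule (lowercasing, then keeping maximal runs of ASCII letters) and the names TEXT, word_list and WORDS are assumptions made here, not a description of the tool actually used.</p>

```python
import re

# Toy Indonesian text of Figure 2, copied from the paper.
TEXT = (
    "Anto memakan nasi dan meminum air. Nasi itu dibeli di pasar. "
    "Di pasar, Anto melihat mainan. Anto senang main bola. "
    "Setelah main, Anto suka minum es dan makan cilok. "
    "Makanan dan minuman itu juga dia beli di pasar. "
    "Es dan cilok memang enak dimakan dan diminum selesai olahraga."
)


def word_list(text: str) -> list:
    """Return the distinct lowercased word forms of text in lexicographic order."""
    # Assumption: a word is a maximal run of ASCII letters after lowercasing.
    return sorted(set(re.findall(r"[a-z]+", text.lower())))


WORDS = word_list(TEXT)
print(len(WORDS), "distinct word forms")
print(" ".join(WORDS))
```

        <p>Under this tokenization assumption, the printed list can be compared directly with the bottom of Figure 2.</p>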
        <p>From this word list, some commonalities between words can be identified at
a glance. An example is the pair of words makan and makanan. Another is the
pair bola and beli, which share the same consonants in the same order: b and
l. However, a single pair is not enough evidence
that two words are actually related to each other. In contrast, for
the words makan and makanan, the same ratio is seen to hold between
several other word pairs from the same text, like minum and minuman, or main
and mainan. These pairs actually reflect a phenomenon in Indonesian morphology:
the suffix -an, which builds a noun from an active verb.</p>
        <p>In standard linguistics, a systematization of these relationships between word
forms is given by paradigm tables, which are the result of linguistic formalisation.
Here, we agnostically extract analogical grids relying on a formal relationship
between words, proportional analogy. The right part of Figure 1 shows the
analogical grid extracted from the set of words given in Figure 2.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Analogical grids</title>
        <p>An analogical grid is a table of dimension n × m (n rows and m columns) as defined by Formula (1).
As illustrated by Figure 1, analogical grids extracted from texts usually contain
empty cells. (Caution: the order of rows or columns is of no importance.)</p>
        <disp-formula>
          <label>(1)</label>
          <tex-math><![CDATA[
\begin{array}{ccccccc}
P_1^1 & : & P_1^2 & : & \cdots & : & P_1^m \\
P_2^1 & : & P_2^2 & : & \cdots & : & P_2^m \\
\vdots &  & \vdots &  &        &   & \vdots \\
P_n^1 & : & P_n^2 & : & \cdots & : & P_n^m
\end{array}
\quad \Longleftrightarrow \quad
\begin{array}{l}
\forall (i,k) \in \{1,\ldots,n\}^2, \\
\forall (j,l) \in \{1,\ldots,m\}^2,
\end{array}
\quad P_i^j : P_i^l :: P_k^j : P_k^l
]]></tex-math>
        </disp-formula>
        <p>The definition of analogical grids in Formula (1) implies that any four
word forms at the intersection of two rows and two columns form a proportional
analogy between sequences of characters [7, 13]. A proportional analogy is defined
as a relationship between four objects where two properties are met:
(a) equality of ratios (defined hereafter) between the first and the second terms
on the one hand, and the third and the fourth terms on the other hand, and
(b) exchange of the means (the second and the third terms can always be
exchanged).</p>
        <disp-formula>
          <label>(2)</label>
          <tex-math><![CDATA[
A : B :: C : D
\quad \Longleftrightarrow \quad
\left\{
\begin{array}{l}
A : B = C : D \\
A : C = B : D
\end{array}
\right.
]]></tex-math>
        </disp-formula>
        <p>According to Formula (1), we can get many analogies from the analogical grids
in Figure 1. Figure 3 shows three of them.</p>
        <p>We define the ratio between two words in Formula (3) as a vector of features
made up of all the differences in number of occurrences in the two words, for all
the characters, whatever the writing system, plus the distance between the two
words.</p>
        <preformatted>
makan : makanan :: main  : mainan
makan : memakan :: minum : meminum
minum : diminum :: beli  : dibeli
</preformatted>
        <p>Figure 3: Three analogies obtained from the analogical grids of Figure 1.</p>
        <disp-formula>
          <label>(3)</label>
          <tex-math><![CDATA[
A : B =
\begin{pmatrix}
|A|_a - |B|_a \\
|A|_b - |B|_b \\
\vdots \\
|A|_z - |B|_z \\
d(A, B)
\end{pmatrix}
]]></tex-math>
        </disp-formula>
        <p>In Formula (3), the notation |S|<sub>c</sub> stands for the number of occurrences of
character c in string S. The last dimension, written as d(A, B), is the edit distance
between the two strings. This indirectly gives the number of common characters
appearing in the same order in A and B.<sup>1</sup></p>
        <p>The above definition of ratios captures prefixing and suffixing. Although we
do not show it here, this definition also captures parallel infixing or
interdigitation, well-known phenomena in Semitic languages [1, 14]. However, reduplication
or repetition (e.g. consonant spreading) are not captured by this definition.</p>
        <p>Figure 4: The ratios makan : makanan and main : mainan are the same feature vector, which implies makan : makanan :: main : mainan.</p>
        <p>This formal definition of word ratio in Formula (3) gives the same vector for
the ratios makan : makanan, makan : namakan, and makan : mnaakan. This
is due to the use of insertion and deletion as the only edit operations.</p>
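        <p>As an illustration, the ratio of Formula (3) can be sketched in a few lines. This is a minimal sketch and not the implementation used for the experiments: the function names lcs_length, distance and ratio are introduced here, the vector is stored sparsely (only characters with a non-zero difference are kept, all others being implicitly 0), and the distance follows the insertion/deletion definition of footnote 1.</p>

```python
from collections import Counter


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def distance(a: str, b: str) -> int:
    """Edit distance with insertion and deletion only: |A| + |B| - 2 * s(A, B)."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


def ratio(a: str, b: str) -> tuple:
    """Ratio A : B as (character-count differences, distance)."""
    diff = Counter(a)
    diff.subtract(Counter(b))
    # Sparse form of the vector: characters whose difference is 0 are left out.
    counts = tuple(sorted((c, n) for c, n in diff.items() if n != 0))
    return counts, distance(a, b)


# Equal ratios for two word pairs related by the suffix -an.
print(ratio("makan", "makanan") == ratio("main", "mainan"))
# The three candidate forms discussed in the text are not told apart by the ratio.
print(ratio("makan", "makanan") == ratio("makan", "namakan") == ratio("makan", "mnaakan"))
```

        <p>Both comparisons are expected to print True, in agreement with the observations made in the text.</p>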
        <p>The purpose of working with analogical grids, and not only with individual
analogies, is that Formula (1) imposes more constraints for a word form to enter
a grid: a word form in a grid must satisfy all analogy relationships with all
surrounding word forms in the grid. The word form makanan in the analogical grid
of Figure 1 (right) is the only word form which fits in, among makanan, namakan,
or mnaakan. For example, as shown below, using the words main and mainan
from the analogical grid, the inequality between the ratios makan : main and
namakan : mainan implies that there is no analogy between these four words.
The same holds for the word form mnaakan. In all these cases, the inequality
comes from different edit distance values.</p>
        <disp-formula>
          <tex-math><![CDATA[
\text{makan} : \text{main} \neq \text{namakan} : \text{mainan}
\quad \Rightarrow \quad
\text{makan} : \text{main} \mathrel{\not{::}} \text{namakan} : \text{mainan}
]]></tex-math>
        </disp-formula>
        <p><sup>1</sup> The only two edit operations used are insertion and deletion, hence d(A, B) = |A| + |B| − 2 × s(A, B). |S| denotes the length of a string S and s(A, B) is the length of the longest common subsequence (LCS) between A and B.</p>
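        <p>The constraint that a word form must be in analogy with all surrounding word forms can be sketched as follows. The helper names is_analogy and fits, and the encoding of empty cells as None, are choices made for this illustration only; the ratio is the one of Formula (3) with the distance of footnote 1, and the grid is the Indonesian grid of Figure 1 with the cell of makanan left empty for the test.</p>

```python
from collections import Counter


def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def ratio(a, b):
    """Ratio A : B: non-zero character-count differences plus the distance."""
    diff = Counter(a)
    diff.subtract(Counter(b))
    counts = tuple(sorted((c, n) for c, n in diff.items() if n != 0))
    return counts, len(a) + len(b) - 2 * lcs_length(a, b)


def is_analogy(a, b, c, d):
    """A : B :: C : D holds when A : B = C : D and A : C = B : D."""
    return ratio(a, b) == ratio(c, d) and ratio(a, c) == ratio(b, d)


def fits(grid, row, col, candidate):
    """True if candidate at (row, col) is in analogy with all filled neighbours."""
    for k, other in enumerate(grid):
        for l, corner in enumerate(other):
            if k == row or l == col:
                continue
            same_row, same_col = grid[row][l], other[col]
            if None in (corner, same_row, same_col):
                continue  # empty cells impose no constraint
            if not is_analogy(candidate, same_row, same_col, corner):
                return False
    return True


# Indonesian grid of Figure 1; the cell under test (row 0, column 3) is emptied.
GRID = [
    ["makan", "dimakan", "memakan", None],
    ["minum", "diminum", "meminum", "minuman"],
    ["main", None, None, "mainan"],
    ["beli", "dibeli", None, None],
]

for candidate in ("makanan", "namakan", "mnaakan"):
    print(candidate, fits(GRID, 0, 3, candidate))
```

        <p>With this sketch, only makanan is expected to be accepted; namakan and mnaakan are rejected by the row containing main and mainan, as argued above.</p>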
        <p>The above discussion shows that there should be a relationship between the
size of an analogical grid and the freedom in filling its empty cells.</p>
      </sec>
      <sec id="sec-2-3">
        <title>Size and saturation of analogical grids</title>
        <p>We simply define the size of an analogical grid as its number of rows multiplied
by its number of columns. The analogical grids in Figure 1 have sizes of 4 × 5 = 20
(left) and 4 × 4 = 16 (right) respectively.</p>
        <p>Let us now turn to the number of empty cells of an analogical grid, or rather
the proportion of non-empty cells, which we call its saturation<sup>2</sup>. We compute it
using Formula (4), which gives a saturation of 80 % (left) and 75 % (right) for
Figure 1.</p>
        <disp-formula>
          <label>(4)</label>
          <tex-math><![CDATA[
\text{Saturation} = 100 - \frac{\text{Number of empty cells}}{\text{Total number of cells}} \times 100
]]></tex-math>
        </disp-formula>
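        <p>As a quick check of these definitions, size and saturation can be computed on the two grids of Figure 1. This is a minimal sketch: the representation of a grid as a list of rows with None for empty cells, and the names size, saturation, ENGLISH and INDONESIAN, are assumptions of this illustration.</p>

```python
def size(grid) -> int:
    """Size of a grid: number of rows multiplied by number of columns."""
    return len(grid) * len(grid[0])


def saturation(grid) -> float:
    """Saturation in percent: 100 - (empty cells / total cells) * 100."""
    empty = sum(cell is None for row in grid for cell in row)
    return 100 - 100 * empty / size(grid)


# The two grids of Figure 1; None marks an empty cell.
ENGLISH = [
    ["show", "shows", "showing", "showed"],
    ["walk", "walks", "walking", "walked"],
    ["open", "opens", "opening", None],
    ["study", None, "studying", None],
    ["read", "reads", "reading", None],
]
INDONESIAN = [
    ["makan", "dimakan", "memakan", "makanan"],
    ["minum", "diminum", "meminum", "minuman"],
    ["main", None, None, "mainan"],
    ["beli", "dibeli", None, None],
]

print(size(ENGLISH), saturation(ENGLISH))        # 20 80.0
print(size(INDONESIAN), saturation(INDONESIAN))  # 16 75.0
```

        <p>The values agree with the sizes and saturations given in the text for Figure 1.</p>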
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experiments</title>
      <sec id="sec-3-1">
        <title>Data used</title>
        <p>We carried out experiments on a multilingual parallel corpus created from the
translations of the Bible collected by Christodoulopoulos<sup>3</sup> [10]. We selected four
languages with different richness in morphology: English, Russian, Modern Greek,
and Indonesian. The reason for using a multilingual parallel corpus is the need
to draw conclusions across different languages in a reliable way. Table 1 presents
statistics on the corpus. For each text in each language, we first extracted the
list of all words, and finally built all analogical grids.</p>
        <p><sup>2</sup> In [2, p. 79], saturation is the maximal proportion of word forms attested for any
one lemma of a given paradigm. Here we use the term for each entire grid.</p>
        <p><sup>3</sup> http://homepages.inf.ed.ac.uk/s0787820/bible/</p>
        <p>Figure 5 (bottom): Number of analogical grids (y-axis, logarithmic scale) against analogical grid size (x-axis, logarithmic scale), one panel per language.</p>
        <p>The graphs at the bottom of Figure 5 show the number of analogical grids with
the same sizes in each language. Most of the analogical grids have a small size.
The number of analogical grids with the same size decreases gradually as the
size increases. Languages with a richer morphology produce bigger analogical
grids on average and also more analogical grids for a given size. All of this matches
intuition.</p>
        <p>Figure 6: Saturation (y-axis, in percent) against size for the analogical grids of each language, one panel per language (e.g., English).</p>
        <p>We now turn to the study of the saturation of analogical grids compared
to their size. The top of Figure 6 shows saturation against size for analogical
grids in each language. Analogical grids with smaller sizes tend to have higher
saturation. Some tables are extremely sparse. Because of the logarithmic scale
on the y-axis, the bottom half is for tables with a saturation less than 1 %.</p>
        <p>In all cases, the plots exhibit a similar linear shape in logarithmic scale across
all languages. This would correspond to Formula (5). We confirmed the similarity
by the computation of the coefficients a and b for each language, as obtained by
the least squares method. These coefficients are presented in Table 2. They are
almost the same in all languages.</p>
        <disp-formula>
          <label>(5)</label>
          <tex-math><![CDATA[
\log(\text{saturation}) = a \times \log(\text{size}) + b
]]></tex-math>
        </disp-formula>
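        <p>The estimation of a and b by the least squares method can be sketched as follows. This is only an illustration under stated assumptions: base-10 logarithms are used (the slope a does not depend on the base, the intercept b does), the function name fit_loglog is introduced here, and the data points are hypothetical values chosen for the example, not measurements from the experiments.</p>

```python
import math


def fit_loglog(sizes, saturations):
    """Least-squares fit of log10(saturation) = a * log10(size) + b; returns (a, b)."""
    xs = [math.log10(s) for s in sizes]
    ys = [math.log10(s) for s in saturations]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Hypothetical (size, saturation in %) points, NOT taken from the paper.
SIZES = [4, 16, 64, 256, 1024]
SATURATIONS = [100.0, 50.0, 25.0, 12.5, 6.25]

a, b = fit_loglog(SIZES, SATURATIONS)
print("a =", round(a, 3), "b =", round(b, 3))
```

        <p>On real data, one point would be produced per analogical grid, and the fitted a and b would be the quantities compared across languages.</p>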
        <p>As mentioned in Section 2.2, intuitively, analogical grids with higher
saturation are more reliable to fill in because there are more word forms around the
empty cells as supporting evidence. However, this may not always be the case. For
instance, an analogical grid for regular English verbs extracted from any text is
very hollow, but its empty cells can be filled in a reliable way.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Discussion and further experiments</title>
      <p>Let us make a first remark on the type of the observed relation. This is not yet
another instance of a Zipfian law, because, in the present case, the objects are
not ranked individually according to their frequency (number of occurrences).
In a Zipfian law, the x-axis stands for the list of individual objects ranked by
frequency. Recall also that our analogical grids do not encapsulate any
information about the frequency of individual words whatsoever. In our graphs, two
analogical grids with the same size have the same abscissa. If they also have the
same saturation, they have the same ordinate and are thus plotted as the same
point.</p>
      <p>Table 2: Coefficients a and b of Formula (5), by language: English, Indonesian, Modern Greek, Russian. (The values are not preserved in this version.)</p>
      <p>The interesting fact that comes to light is not so much that the
relation between size and saturation of analogical grids is a log-log relation, but
that it exhibits very similar slopes in all four languages. A reasonable
explanation is that these coefficients are independent of the language because
they characterize the corpus used. The corpus is defined by its size and its genre.</p>
      <p>We first investigated whether the coefficients depend on the size of the corpus
used. We performed the same experiment in English and let the size of the corpus
vary: a half, a quarter, an eighth of the original size. The computation of the
coefficients led to very similar results, as shown in Table 2.</p>
      <p>We then investigated the influence of the genre and performed the same
experiment with the same size of text, again in English. We chose the Europarl corpus
for this experiment. Again, the computation of the linear coefficients led to very
similar results, as shown in Table 2.</p>
      <p>Further experiments with more parameters varying are required to confirm
that the coefficients of the relationship between saturation and size are
always very similar. However, for the time being, we observe that the parameters
are relatively close at least for these four languages with different richness in
morphology.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>We studied analogical grids in different languages with different morphological
richness. These analogical grids were automatically built from actual texts,
using a technique which has been presented in previous work. Unsurprisingly,
languages known to be richer in morphology produce bigger and more
analogical grids than languages less rich in morphology. Empty cells in such analogical
grids are interesting because they could be filled by words that should then be
tested against the actual language.</p>
      <p>We studied the relation between size and saturation in analogical grids.
Experimental results clearly showed that the logarithm of the saturation of an
analogical grid depends linearly on the logarithm of its size. This is not so
surprising. More interestingly, the computation of the coefficients characterizing
this log-log linear relation led to the result that, across all four languages
used, and even when size and genre vary within one language, these
coefficients are almost always the same: the relation between the saturation and the
size of an analogical grid appears to be almost independent of the size, the genre and
the language of a text.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Beesley</surname>
            ,
            <given-names>K.R.</given-names>
          </string-name>
          :
          <article-title>Consonant spreading in Arabic stems</article-title>
          .
          <source>In: Proceedings of COLING-ACL'98</source>
          . vol. I, pp.
          <fpage>117</fpage>
          –
          <lpage>123</lpage>
          . Montreal (Aug
          <year>1998</year>
          ), http://www.aclweb.org/anthology/P98-1018
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Chan</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Structures and distributions in morphology learning</article-title>
          .
          <source>Ph.D. thesis</source>
          , University of Pennsylvania. (
          <year>2008</year>
          ), http://nlp.cs.swarthmore.edu/~richardw/papers/chan2008-structures.pdf
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Dreyer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eisner</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Discovering morphological paradigms from plain text using a Dirichlet process mixture model</article-title>
          .
          <source>In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP 2011)</source>
          . pp.
          <fpage>616</fpage>
          –
          <lpage>627</lpage>
          . Association for Computational Linguistics, Edinburgh, Scotland, UK (
          <year>2011</year>
          ), https://www.cs.jhu.edu/~jason/papers/dreyer+eisner.emnlp11.pdf
        </mixed-citation>
      </ref>
      <ref id="ref3b">
        <mixed-citation>
          4.
          <string-name>
            <surname>Fam</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lepage</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Morphological predictability of unseen words using computational analogy</article-title>
          .
          <source>In: Proceedings of the Computational Analogy Workshop at the 24th International Conference on Case-Based Reasoning (ICCBR-CA-16)</source>
          . pp.
          <fpage>51</fpage>
          –
          <lpage>60</lpage>
          . Atlanta, Georgia (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          5.
          <string-name>
            <surname>Goldsmith</surname>
          </string-name>
          , J.:
          <article-title>Unsupervised learning of the morphology of a natural language</article-title>
          .
          <source>Computational Linguistics</source>
          <volume>27</volume>
          ,
          <fpage>153</fpage>
          –
          <lpage>198</lpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          6.
          <string-name>
            <surname>Hathout</surname>
          </string-name>
          , N.:
          <article-title>Acquisition of the morphological structure of the lexicon based on lexical similarity and formal analogy</article-title>
          .
          <source>In: Proceedings of the 3rd Textgraphs workshop on Graph-based Algorithms for Natural Language Processing</source>
          . pp.
          <fpage>1</fpage>
          –
          <lpage>8</lpage>
          . Coling 2008 Organizing Committee, Manchester, UK (August
          <year>2008</year>
          ), http://www.aclweb.org/anthology/W08-2001
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          7.
          <string-name>
            <surname>Langlais</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yvon</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Scaling up analogical learning</article-title>
          .
          <source>In: Coling</source>
          <year>2008</year>
          : Companion volume:
          <source>Posters</source>
          . pp.
          <fpage>51</fpage>
          –
          <lpage>54</lpage>
          . Coling 2008 Organizing Committee, Manchester, UK (August
          <year>2008</year>
          ), http://www.aclweb.org/anthology/C08-2013
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          8.
          <string-name>
            <surname>Lepage</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Analogies between binary images: Application to Chinese characters</article-title>
          . In: Prade, H., Richard, G. (eds.)
          <source>Computational Approaches to Analogical Reasoning: Current Trends</source>
          , pp.
          <fpage>25</fpage>
          –
          <lpage>57</lpage>
          . Springer, Berlin, Heidelberg (
          <year>2014</year>
          ), http://dx.doi.org/10.1007/978-3-642-54516-0_2
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          9.
          <string-name>
            <surname>Neuvel</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fulop</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          :
          <article-title>Unsupervised learning of morphology without morphemes</article-title>
          .
          <source>In: Proceedings of the ACL-02 Workshop on Morphological and Phonological Learning</source>
          . pp.
          <fpage>31</fpage>
          –
          <lpage>40</lpage>
          . Association for Computational Linguistics (July
          <year>2002</year>
          ), http://www.aclweb.org/anthology/W02-0604
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          10.
          <string-name>
            <surname>Resnik</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Olsen</surname>
            ,
            <given-names>M.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Diab</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The Bible as a parallel corpus: Annotating the `book of 2000 tongues'</article-title>
          .
          <source>Computers and the Humanities</source>
          <volume>33</volume>
          (
          <issue>1</issue>
          ),
          <fpage>129</fpage>
          –
          <lpage>153</lpage>
          (
          <year>1999</year>
          ), http://dx.doi.org/10.1023/A:1001798929185
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          11.
          <string-name>
            <surname>Schone</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jurafsky</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Knowledge-free induction of morphology using latent semantic analysis</article-title>
          .
          <source>In: Proceedings of CoNLL-2000 and LLL-2000</source>
          . pp.
          <fpage>67</fpage>
          –
          <lpage>72</lpage>
          . Lisbon, Portugal (
          <year>2000</year>
          ), http://web.stanford.edu/~jurafsky/W00-0712.pdf
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          12.
          <string-name>
            <surname>Singh</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ford</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>In praise of Sakatayana: some remarks on whole word morphology</article-title>
          . In: Singh, R. (ed.)
          <source>The Yearbook of South Asian Languages and Linguistics200</source>
          . Sage, Thousand Oaks
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          13.
          <string-name>
            <surname>Stroppa</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yvon</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>An analogical learner for morphological analysis</article-title>
          .
          <source>In: Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005)</source>
          . pp.
          <fpage>120</fpage>
          –
          <lpage>127</lpage>
          . Association for Computational Linguistics, Ann Arbor, Michigan (June
          <year>2005</year>
          ), http://www.aclweb.org/anthology/W/W05/W05-0616
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          14.
          <string-name>
            <surname>Wintner</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <source>Natural Language Processing of Semitic Languages, chap. Morphological Processing of Semitic Languages</source>
          , pp.
          <fpage>43</fpage>
          –
          <lpage>66</lpage>
          . Springer, Berlin, Heidelberg (
          <year>2014</year>
          ), http://dx.doi.org/10.1007/978-3-642-45358-8_2
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>