<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Modeling Syntactic Dependency Relationships in Wikidata Lexicographical Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mahir Morshed</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Illinois at Urbana-Champaign</institution>
          ,
          <addr-line>Urbana, IL 61801</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We present a scheme with which a lexeme on Wikidata consisting of multiple parts may be annotated to denote syntactic dependencies among its parts. The scheme is sufficiently general to accommodate many dependency grammar frameworks and can take advantage of Wikidata lexemes' structure to reduce redundancy in representation while still being flexible enough for further qualification. While we note some challenges in adjustments to the scheme for particular phenomena, we contend that adopting this scheme will aid syntactic parsing efforts in other general domains as well as text generation systems for the Abstract Wikipedia project.</p>
      </abstract>
      <kwd-group>
        <kwd>lexicographical data</kwd>
        <kwd>syntax</kwd>
        <kwd>dependency grammar</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The projects under the Wikimedia Foundation's umbrella have frequently been
used for various natural language processing tasks, including disambiguating
word senses [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], recognizing named entities [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and for low-resourced languages
potentially many others [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Some efforts at syntactic annotation of text from
these projects also exist [
        <xref ref-type="bibr" rid="ref3 ref6">3, 6</xref>
        ], but these typically infer grammatical information
from the text ingested based on systems with some prior acquired syntactic
reasoning, rather than retrieve this information directly from textual elements.
      </p>
      <p>
        The under-construction Abstract Wikipedia project [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] has as a goal the
ability to generate text in any natural language from a representation
constructed purely of abstract concepts, these concepts transformed via
language-specific renderers into some textual representation. The building blocks of this
text are planned to be Wikidata lexemes: objects corresponding to units of
linguistic meaning (primarily words, but also expressions with multiple parts such
as compound words, idioms, and proverbs). These lexemes are structured
similarly to Wikidata items, but they are modeled in a separate namespace, have
special fields for lemmata, language, and lexical category, and have separate
substructures for different meanings (senses) and inflectional realizations (forms).
      </p>
      <p>Copyright © 2021 for this paper by its author. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        For these building blocks to be useful, some mappings from concepts to
lexemes must first exist, which presently consist of synonym and translation
linkages between senses and correspondences between Wikidata items and senses.
A concept that in one language is representable with one word may need
multiple words in another, however; depending on the sort of multi-part expression
used, the syntactic information needed for adjustment of that expression in
different contexts may differ. The English verb `evade' has a correspondence with
the South American Spanish phrase `hacer el quite' [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], for example; `el quite'
appears to play a role similar to an object of the verb `hacer' and could be
marked and adjusted as such within a sentence. Not only may other equivalents
between languages behave even more differently, but composing
lexemes to represent more complex concepts can only add to the potential
syntactic differences.
      </p>
      <p>
        Databases of multi-component expressions have previously been developed
for individual languages [
        <xref ref-type="bibr" rid="ref4 ref5 ref8">4, 5, 8</xref>
        ], most primarily focusing on annotating phrase
structure constituency relations, but some also doing so with dependency
grammar relations [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Although the multilingual structured nature of Wikidata's
lexicographical data makes it attractive as a place to aggregate such databases,
the structure of Wikidata lexemes and their statements makes annotating
constituencies directly difficult: storing each of a phrase structure
tree's intermediate levels, whether as separate lexeme statements or as entirely
separate objects, incurs an overhead that, compared to storing dependency information, may be much
greater than necessary for the Abstract Wikipedia project and other language
generation applications.
      </p>
      <p>We thus propose here a compact representation of syntactic dependencies
within the structures of Wikidata lexicographical data, generally applicable to
different flavors of dependency grammar, but here demonstrated with respect
to Universal Dependencies (UD). We contend that the marking up of this
information is useful even for modeling structures of multi-part elements that may
be regarded in some languages as words, and that it permits lexemes with
syntax represented this way to form parts of other lexemes which, as single units,
take part in other dependency relations. We recognize too that modifications
to handle special syntactic cases may not necessarily be immediately acceptable
to those annotating relevant lexemes. We nevertheless believe that the greater
portion of what may be annotated of multi-part lexemes with this representation
will not only make those lexemes usable in the syntactic parsing of other texts,
but will also considerably ease the generation of text through the manipulation
of underlying dependency graphs.</p>
    </sec>
    <sec id="sec-2">
      <title>Implementation</title>
      <p>What follows is an outline of the proposed dependency representation within
Wikidata lexemes. RDF predicates for Wikidata properties and qualifiers<sup>1</sup> are
provided in monospace, as are Wikidata items, lexemes, and their forms and
senses, which are left unprefixed.<fn id="fn1"><label>1</label><p>https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format (section
"Full list of prefixes") lists the RDF prefixes used herein.</p></fn></p>
      <table-wrap id="fig1">
        <label>Figure 1</label>
        <table>
          <thead>
            <tr><th /><th>what (L333986)</th><th>go (L3006)</th><th>around (L333609)</th><th>come (L3210)</th><th>around (L333609)</th></tr>
          </thead>
          <tbody>
            <tr><td>`series ordinal'</td><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td></tr>
            <tr><td>`object form'</td><td>what (L333986-F1)</td><td>goes (L3006-F2)</td><td>around (L333609-F1)</td><td>comes (L3210-F2)</td><td>around (L333609-F1)</td></tr>
            <tr><td>`object sense'</td><td>that which (L333986-S1)</td><td>to move away (L3006-S2)</td><td>at varied places (L333609-S1)</td><td>to approach (L3210-S1)</td><td>at varied places (L333609-S1)</td></tr>
            <tr><td>`head position'</td><td>2</td><td>4</td><td>2</td><td>0</td><td>4</td></tr>
            <tr><td>`head relationship'</td><td>relativizer (Q56870226)</td><td>subject clause (Q19708532)</td><td>location adverbial (Q12724480)</td><td>root (Q1757074)</td><td>location adverbial (Q12724480)</td></tr>
            <tr><td>UD equivalent of `head relationship'</td><td>mark</td><td>csubj</td><td>obl:lmod</td><td>root</td><td>obl:lmod</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>Each component of the surface form of a multi-part lexeme is represented by a
use of the `combines' property (p:P5238) linking to a lexeme for said
component. The form in which the component appears is specified on each
statement with an `object form' qualifier (pq:P5548), and the position of each
part with a `series ordinal' qualifier (pq:P1545). For completeness, the
statement may also note the sense of the component being used with
`object sense' (pq:P5980); language-specific machinery for text generation may
find utility in treating a constituent differently based on the meaning it
expresses, although we do not further consider uses of that qualifier here.</p>
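      <p>As a minimal sketch of the statement structure just described, the components of the example proverb can be held in memory and put back in order by their `series ordinal' values. The Component class, its field names, and the copied surface strings are hypothetical conveniences rather than part of Wikibase, and the sketch assumes numeric ordinals.</p>
      <preformat preformat-type="code">
```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Component:
    """One `combines' (P5238) statement with its qualifiers (hypothetical shape)."""
    lexeme: str            # main value of the statement, e.g. "L3006"
    series_ordinal: str    # P1545 value, kept as a string
    object_form: str       # P5548 value, e.g. "L3006-F2"
    surface: str           # representation of that form, copied here for illustration
    object_sense: Optional[str] = None  # P5980 value, optional


# Statements for "what goes around comes around", deliberately out of order.
PROVERB = [
    Component("L3210", "4", "L3210-F2", "comes", "L3210-S1"),
    Component("L333986", "1", "L333986-F1", "what", "L333986-S1"),
    Component("L333609", "5", "L333609-F1", "around", "L333609-S1"),
    Component("L3006", "2", "L3006-F2", "goes", "L3006-S2"),
    Component("L333609", "3", "L333609-F1", "around", "L333609-S1"),
]


def surface_form(components):
    """Join component forms in `series ordinal' order (numeric ordinals assumed)."""
    ordered = sorted(components, key=lambda c: int(c.series_ordinal))
    return " ".join(c.surface for c in ordered)


print(surface_form(PROVERB))  # what goes around comes around
```
</preformat>
      <p>A language needing non-numeric ordinals would have to replace the integer sort key with its own ordering.</p>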
      <p>An example of these qualifiers in action is shown in the first three rows of
Figure 1 for the proverb "what goes around comes around". Note that `series
ordinal' values need not be numeric; a separate ordering to handle infixes,
circumfixes, and other non-sequential phenomena may well be desired for some
languages.</p>
      <p>Many dependency treebanks optionally store information about the part of
speech of a word token and the grammatical features it bears in the context in
which it appears. In a Wikidata lexeme, the former is a top-level feature
(via wikibase:lexicalCategory), while the latter resides on the
individual forms of lexemes (via wikibase:grammaticalFeature). As a result, this
information for the components of a multi-part lexeme generally does not need
to be reproduced on the multi-part lexeme itself; it may be programmatically
retrieved from the parts themselves based on the main `combines' values and
`object form' qualifiers respectively.</p>
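      <p>The retrieval described above might look roughly like the following sketch, in which two dictionaries stand in for lexeme data fetched from Wikidata. The dictionaries, the describe function, and the category and feature labels are illustrative assumptions and are not copied from the live lexemes.</p>
      <preformat preformat-type="code">
```python
# Hypothetical local snapshot: lexical category per lexeme, grammatical
# features per form. Labels are illustrative placeholders only.
LEXICAL_CATEGORY = {"L3006": "verb", "L333609": "adverb"}
GRAMMATICAL_FEATURES = {
    "L3006-F2": ["third person", "singular", "present tense"],
    "L333609-F1": [],
}


def describe(lexeme_id, form_id):
    """Look up category via the `combines' value and features via `object form'."""
    return LEXICAL_CATEGORY[lexeme_id], list(GRAMMATICAL_FEATURES[form_id])


# One (`combines' value, `object form' qualifier) pair per component.
components = [("L3006", "L3006-F2"), ("L333609", "L333609-F1")]
for lexeme_id, form_id in components:
    print(form_id, describe(lexeme_id, form_id))
```
</preformat>
      <p>Nothing is copied onto the multi-part lexeme itself; both values are resolved from the parts at read time.</p>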
      <sec id="sec-2-1">
        <title>Syntactic annotation</title>
        <p>A dependency relationship may be thought of as a directed edge between a
dependent and a head, so that specifying both ends of the relationship and the
type of relationship suffices to define the dependency, and so that the resulting
set of dependencies for a multi-component lexeme resembles a tree. To define
dependency relationships between parts of a lexeme within Wikidata's
existing structure, a set of qualifier properties is applied to the `combines'
statement for a relationship's dependent.</p>
        <p>The first of these qualifiers, `head position' (pq:P9764), indicates the `series
ordinal' of the head of a dependency relationship. The second, `head
relationship' (pq:P9763), indicates the type of said relationship. Both
qualifiers, having been created in late July 2021, are as yet little used beyond
the author's additions to existing lexemes, and user scripts to
facilitate their addition to new lexemes on Wikidata are yet to be written.
Documentation of appropriate `head relationship' values is slowly being developed,
however<sup>2</sup>.</p>
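        <p>Because `series ordinal', `head position', and `head relationship' together carry the same information as the ID, HEAD, and DEPREL columns of a CoNLL-U row, one possible downstream use is a direct export. The sketch below assumes that the `head relationship' items have already been mapped to UD labels; the to_conllu function and the tuple layout are hypothetical.</p>
        <preformat preformat-type="code">
```python
# (series ordinal, form, lemma, head position, UD relation) for the example
# proverb; columns with no stored value are written as "_".
ROWS = [
    ("1", "what", "what", "2", "mark"),
    ("2", "goes", "go", "4", "csubj"),
    ("3", "around", "around", "2", "obl:lmod"),
    ("4", "comes", "come", "0", "root"),
    ("5", "around", "around", "4", "obl:lmod"),
]


def to_conllu(rows):
    """Render rows as ten tab-separated CoNLL-U columns."""
    lines = []
    for ordinal, form, lemma, head, deprel in rows:
        # ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
        columns = [ordinal, form, lemma, "_", "_", "_", head, deprel, "_", "_"]
        lines.append("\t".join(columns))
    return "\n".join(lines)


print(to_conllu(ROWS))
```
</preformat>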
        <p>[Figure 2: dependency diagram for `what goes around comes around', with arcs labeled `subjective clause' (goes), `location adverbial' (around), `root' (comes), and `location adverbial' (around).]</p>
        <p>The relationships in the proverb `what goes around comes around' are defined
in the fourth and fifth rows of Figure 1, with UD equivalents below them. A
potential diagram generated with `series ordinal' and the two new qualifiers is
shown in Figure 2. The link from "what" to "goes" is defined by the edge from
`series ordinal' value `1' to `series ordinal' value `2', where this latter value is
specified via the `head position' value on "what". Most other relationships are
marked up similarly. As a convention, the `head position' of the root part is `0'
and the `head relationship' is the item for `root'.<fn id="fn2"><label>2</label><p>With respect to UD, see https://www.wikidata.org/wiki/Wikidata:Lexicographical_data/Universal_Dependencies.</p></fn></p>
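        <p>The edge construction and the root convention can be sketched as follows. The rows repeat the `head position' values given for the proverb, with UD labels standing in for the `head relationship' items; the function names edges and is_tree are hypothetical.</p>
        <preformat preformat-type="code">
```python
# (series ordinal, surface form, head position, relation) for the proverb.
ROWS = [
    (1, "what", 2, "mark"),
    (2, "goes", 4, "csubj"),
    (3, "around", 2, "obl:lmod"),
    (4, "comes", 0, "root"),
    (5, "around", 4, "obl:lmod"),
]


def edges(rows):
    """Return (dependent, head, relation) triples, skipping the root's link to 0."""
    return [(dep, head, rel) for dep, _, head, rel in rows if head != 0]


def is_tree(rows):
    """Check for exactly one root and that every part reaches it without a cycle."""
    head_of = {dep: head for dep, _, head, _ in rows}
    roots = [dep for dep, head in head_of.items() if head == 0]
    if len(roots) != 1:
        return False
    for start in head_of:
        seen, node = set(), start
        while node != 0:
            if node in seen or node not in head_of:
                return False
            seen.add(node)
            node = head_of[node]
    return True


print(edges(ROWS))
print(is_tree(ROWS))  # True
```
</preformat>
        <p>A check of this kind could flag annotation slips such as a `head position' pointing at a `series ordinal' that does not exist.</p>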
        <p>The two qualifiers, as two separate properties defining parts of the same
relationship, were proposed because Wikidata statements and qualifiers cannot have
statements or qualifiers themselves as values, much less annotate connections
between them, without major modifications to Wikidata's Wikibase software and
its Query Service. While such connections might also be stored in an entirely
separate Wikibase instance, in the absence of an implicit federation system
between Wikibase triple stores, querying such connections becomes more costly for
end users and downstream applications than necessary.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Potential challenges</title>
      <p>With this dependency representation, a large number of syntactic phenomena in
multi-part lexemes can be faithfully modeled. There nevertheless remain some
situations that an application of this representation by itself would not
satisfactorily handle, and for which the introduction of certain changes might be
controversial. We outline some of these challenging aspects here.</p>
      <sec id="sec-3-1">
        <title>Elided components</title>
        <p>In some parallel constructions, one may decide to omit in later parts of a phrase
portions common to earlier parts of the same phrase. The sentence "I wrote the
book, he wrote the story, and she wrote the poem" may be shortened without
loss of understanding by omitting the latter two occurrences of `wrote'. The
resulting subgraphs of the latter phrases may be marked differently as well; in
UD the special orphan relation would link `he' and `the story', as well as `she'
and `the poem', in the shortened version of the sentence.</p>
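        <table-wrap id="fig3">
          <label>Figure 3</label>
          <table>
            <thead>
              <tr><th /><th>I</th><th>write</th><th>the</th><th>book</th><th>and</th><th>he</th><th>no value</th><th>the</th><th>story</th></tr>
            </thead>
            <tbody>
              <tr><td>`series ordinal'</td><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td><td>6</td><td>7</td><td>8</td><td>9</td></tr>
              <tr><td>`head position'</td><td>2</td><td>0</td><td>4</td><td>2</td><td>7</td><td>7</td><td>2</td><td>9</td><td>7</td></tr>
              <tr><td>`head relationship'</td><td>subject</td><td>root</td><td>determiner</td><td>direct object</td><td>conjunction</td><td>subject</td><td>conjunct</td><td>determiner</td><td>direct object</td></tr>
              <tr><td>UD equivalent of `head relationship'</td><td>nsubj</td><td>root</td><td>det</td><td>obj</td><td>cc</td><td>nsubj</td><td>conj</td><td>det</td><td>obj</td></tr>
            </tbody>
          </table>
        </table-wrap>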
        <p>UD alternatively defines, in its specifications of entirely optional `enhanced
dependencies', the concept of `orphan nodes' which represent elided words and
can take part in those dependency relationships originally substituted with
orphan. To mimic this concept, elided words might be similarly indicated by setting
the value of the `combines' statement to the special Wikibase value "no value"
and otherwise marking up relationships with respect to that elided word. (In the
interest of preserving some contextual information, the `object form' qualifier
might still refer to the form the elided part would take on were it still present
in the lexeme.) An example of an elided word's use is shown in Figure 3.</p>
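        <p>A consumer of such statements would need to skip "no value" components when producing the surface form while still keeping them as nodes in the graph. The following sketch assumes None as a stand-in for the Wikibase "no value" marker, uses lemmas in place of lexeme IDs (which the figure does not give), and assumes the past-tense form `wrote' from the example sentence; the function names are hypothetical.</p>
        <preformat preformat-type="code">
```python
NO_VALUE = None  # stand-in for the special Wikibase "no value" marker

# (series ordinal, `combines' value or NO_VALUE, form named by `object form').
# The elided component keeps the form it would have taken if present.
PARTS = [
    (1, "I", "I"), (2, "write", "wrote"), (3, "the", "the"),
    (4, "book", "book"), (5, "and", "and"), (6, "he", "he"),
    (7, NO_VALUE, "wrote"), (8, "the", "the"), (9, "story", "story"),
]


def surface_form(parts):
    """Join the forms of realized components only, in ordinal order."""
    ordered = sorted(parts, key=lambda p: p[0])
    return " ".join(form for _, value, form in ordered if value is not NO_VALUE)


def realized_count(parts):
    """Count components that actually appear in the surface form."""
    return sum(1 for _, value, _ in parts if value is not NO_VALUE)


print(surface_form(PARTS))    # I wrote the book and he the story
print(realized_count(PARTS))  # 8
```
</preformat>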
        <p>The insertion of extra `nonexistent' components to a multi-part lexeme may
appear to some to be a repurposing of the `combines' property, given that these
components do not appear in the surface form of that lexeme and that any
counts of components of such a lexeme may appear inflated. We might
alternately contend that this is merely a syntactic analogue of when grammatical
features added to a lexeme form may not necessarily change the form despite
those features' importance (as, for example, when further inflecting a
Hindustani adjective that is already inflected for female gender), and that in counting
components filtering out "no value" statements is a small addition to a query.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Cross-clausal relationships</title>
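        <table-wrap id="fig4">
          <label>Figure 4</label>
          <table>
            <thead>
              <tr><th /><th>She</th><th>and</th><th>I</th><th>visited</th><th>Japan</th><th>and</th><th>Korea</th></tr>
            </thead>
            <tbody>
              <tr><td>`series ordinal'</td><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td><td>6</td><td>7</td></tr>
              <tr><td>`head position'</td><td>4</td><td>3</td><td>1</td><td>0</td><td>4</td><td>7</td><td>5</td></tr>
              <tr><td>`head relationship'</td><td>subject</td><td>conjunction</td><td>conjunct</td><td>root</td><td>object</td><td>conjunction</td><td>conjunct</td></tr>
              <tr><td>UD equivalent of `head relationship'</td><td>nsubj</td><td>cc</td><td>conj</td><td>root</td><td>obj</td><td>cc</td><td>conj</td></tr>
              <tr><td>`object has role'</td><td /><td /><td>subject</td><td /><td /><td /><td>object</td></tr>
            </tbody>
          </table>
        </table-wrap>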
        <p>A clause may contain multiple instances of the same type of element, such
as two subjects or two objects, to which other components in the clause apply
equally. "She and I visited Japan and Korea" contains two subjects, each of
which applies to two objects, and "They washed and combed the dog" contains
two predicates. The graph of the first phrase may directly connect `she' to the
predicate and connect `I' to `she' (the approach taken by UD), and the graph
of the second phrase may alternately group the actions `washed' and `combed'
into an umbrella predicate to which `they' and `the dog' directly attach. Either
of these approaches, however, adds distance between components that we might
regard as being close syntactically.</p>
        <p>UD's `enhanced dependencies' also allow multiple relationships to share the
same dependent, so that in the first example the word `I' points both to `she'
(as a conjunct) and `visited' (as a subject), turning the resulting syntactic tree
into a general directed (possibly cyclic) graph. Since multiple `head position'
and `head relationship' qualifiers on a single `combines' statement cannot be
separated to refer to different dependency relationships, one possible solution is
to mark out the semantic roles more explicitly, and in some cases redundantly,
using the `object has role' (pq:P3831) or `has quality' (pq:P1552) qualifiers on
the `combines' statements of conjuncts. An example using the first phrase and
the `object has role' qualifier is shown in Figure 4.</p>
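        <p>With roles marked explicitly on conjuncts, all parts filling a given role can be found in a single pass, without following conjunct edges. The sketch below uses the values for "She and I visited Japan and Korea"; the tuple layout and the with_role function are hypothetical, and the row for `Korea' is partly reconstructed from the sentence.</p>
        <preformat preformat-type="code">
```python
# (series ordinal, form, head position, head relationship, `object has role').
# None means the qualifier is absent on that `combines' statement.
PARTS = [
    (1, "She", 4, "subject", None),
    (2, "and", 3, "conjunction", None),
    (3, "I", 1, "conjunct", "subject"),
    (4, "visited", 0, "root", None),
    (5, "Japan", 4, "object", None),
    (6, "and", 7, "conjunction", None),
    (7, "Korea", 5, "conjunct", "object"),
]


def with_role(parts, role):
    """Forms whose relationship or explicit role matches, with no graph traversal."""
    return [form for _, form, _, rel, has_role in parts if role in (rel, has_role)]


print(with_role(PARTS, "subject"))  # ['She', 'I']
print(with_role(PARTS, "object"))   # ['Japan', 'Korea']
```
</preformat>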
        <p>The extra marking of syntactic roles may appear to some to simply duplicate
information, especially since a conjunct of a particular part very frequently has
the same role as that part. We might instead say that the explicit marking of
the relationships expressed by conjuncts allows them to be queried more readily,
so that traversing conjunct paths becomes unnecessary.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Echo words and reduplication</title>
        <p>In many South Asian languages, it is a frequently productive process for a word
and a nonce word rhyming with it, when taken together, to refer to something
related to said word. In Bengali the noun "ranna" is the act of cooking, while
"rannabanna" refers to `cooking and related activities'; in Hindi "samna" is
`to encounter', while "amna samna" is the act of encountering. The extra echo
components that result from this productive process do not themselves have any
meaning on their own, however, and so creating lexemes for those components
would not be appropriate.</p>
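        <table-wrap id="fig5">
          <label>Figure 5</label>
          <table>
            <thead>
              <tr><th /><th>ranna</th><th>some value</th></tr>
            </thead>
            <tbody>
              <tr><td>`series ordinal'</td><td>1</td><td>2</td></tr>
              <tr><td>`head position'</td><td>0</td><td>1</td></tr>
              <tr><td>`head relationship'</td><td>root</td><td>echo word</td></tr>
              <tr><td>UD equivalent of `head relationship'</td><td>root</td><td>compound:redup</td></tr>
              <tr><td>`stated as'</td><td /><td>`banna'</td></tr>
            </tbody>
          </table>
        </table-wrap>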
        <p>Just as Wikidata statements and qualifiers can specify that "no value" exists
as an object of their relationships, so too can they specify that the special
Wikibase value "some value" exists; the implication desired here is that a component
is present but that no separate Wikidata lexeme exists for it. Since, unlike the
elided word case, something is still being realized in the surface form, the
qualifier `stated as' (pq:P1932) on the `combines' statement can provide that form.
An example of "some value" with a `stated as' qualifier is shown in Figure 5.</p>
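        <p>When assembling the surface form, a "some value" component can fall back to its `stated as' string. In the sketch below the SOME_VALUE sentinel, the dictionary keys, and the placeholder lexeme identifier are hypothetical; only the forms `ranna' and `banna' come from the example.</p>
        <preformat preformat-type="code">
```python
SOME_VALUE = object()  # stand-in for the special Wikibase "some value" marker

PARTS = [
    # "L-ranna" is a placeholder, not a real lexeme ID.
    {"combines": "L-ranna", "ordinal": 1, "form": "ranna"},
    # No lexeme exists for the echo component; `stated as' supplies its form.
    {"combines": SOME_VALUE, "ordinal": 2, "stated_as": "banna"},
]


def surface_form(parts, sep=""):
    """Join forms in ordinal order, using `stated as' for "some value" parts."""
    pieces = []
    for part in sorted(parts, key=lambda p: p["ordinal"]):
        if part["combines"] is SOME_VALUE:
            pieces.append(part["stated_as"])
        else:
            pieces.append(part["form"])
    return sep.join(pieces)


print(surface_form(PARTS))  # rannabanna
```
</preformat>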
        <p>The addition of statements with "some value" may appear to some
particularly inflexible, given that individual lexeme forms can have pronunciation and
grammatical information and be tied to usage examples, and that these rump
`combines' statements might become particularly unwieldy if that information
were admitted there. At the same time, for languages with multiple spelling
conventions (where otherwise these might be handled with separate representations
using different language codes), the use of the datatype that `stated as' expects,
which does not allow attaching a language code, might lead to confusion when
selecting which of a number of alternatives to use. While we cannot quite counter
the latter beyond suggesting that a new qualifier with a new datatype be created, for the
former we might contend instead that the contribution that these echo words
have outside the scope of the lexeme is especially minimal and that such a word, if one so desires,
can be derived from the word it echoes without considerable difficulty.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>Wikidata lexicographical data has the potential to support storing information
about the syntactic structure of multi-part lexemes, and we have provided a
scheme using existing Wikidata qualifiers on an existing Wikidata property for
this structured information, noting as well ways in which said scheme could be
improved and potential issues in pursuing those ways. Although the examples
provided herein used Universal Dependencies as a basis, the scheme is by no means
limited to that particular framework; conversion between frameworks to
accommodate different use cases is just as possible with Wikidata lexemes as without
them. We envision the possibility of full treebanks being constructed using
Wikibase and some variant of this scheme as a starting point, as well as the
introduction of new structured datatypes to better handle the sorts of connections
and specifications that have been handled by this scheme, all in addition to the
downstream task improvements that have the potential to benefit from resources
using this scheme.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. Diccionario de americanismos. Asociación de Academias de la Lengua Española, Penguin Random House Grupo Editorial,
          <string-name>
            <surname>Barcelona</surname>
          </string-name>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. Darģis, R.,
          <string-name>
            <surname>Auzina</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bojars</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paikens</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Znotins</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Annotation of the corpus of the Saeima with multilingual standards</article-title>
          . In: Fiser,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Eskevich</surname>
          </string-name>
          , M., de Jong, F. (eds.)
          <source>Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC</source>
          <year>2018</year>
          ).
          <source>European Language Resources Association (ELRA)</source>
          , Paris, France (May
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Flickinger</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oepen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Ytrestøl, G.:
          <article-title>WikiWoods: Syntacto-semantic annotation for English Wikipedia</article-title>
          .
          <source>In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)</source>
          .
          <source>European Language Resources Association (ELRA)</source>
          , Valletta, Malta (May
          <year>2010</year>
          ), http://www.lrec-conf.org/proceedings/lrec2010/pdf/432_Paper.pdf
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Gantar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krek</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Slovene lexical database</article-title>
          .
          <source>Natural language processing, multilinguality</source>
          , pp.
          <fpage>72</fpage>
          –
          <lpage>80</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Grégoire</surname>
          </string-name>
          , N.:
          <article-title>DuELME: a Dutch electronic lexicon of multiword expressions</article-title>
          .
          <source>Language Resources and Evaluation</source>
          <volume>44</volume>
          (
          <issue>1</issue>
          ),
          <fpage>23</fpage>
          –
          <lpage>39</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Haverinen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ginter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Laippala</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Viljanen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Salakoski</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Dependency annotation of Wikipedia: First steps towards a Finnish treebank</article-title>
          . In: Eighth International Workshop on Treebanks and Linguistic Theories. p.
          <volume>95</volume>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>K.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ock</surname>
          </string-name>
          , C.Y.:
          <article-title>Using Wiktionary to improve lexical disambiguation in multiple languages</article-title>
          . In: Gelbukh,
          <string-name>
            <surname>A</surname>
          </string-name>
          . (ed.)
          <source>Computational Linguistics and Intelligent Text Processing</source>
          . pp.
          <fpage>238</fpage>
          –
          <lpage>248</lpage>
          . Springer Berlin Heidelberg, Berlin, Heidelberg (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Shudo</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kurahone</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tanabe</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>A comprehensive dictionary of multiword expressions</article-title>
          .
          <source>In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies</source>
          . pp.
          <fpage>161</fpage>
          –
          <lpage>170</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Turki</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vrandecic</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hamdi</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adel</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Using Wikidata as a multi-lingual multi-dialectal dictionary for Arabic dialects</article-title>
          .
          <source>In: 2017 IEEE/ACS 14th International Conference on Computer Systems and Applications (AICCSA)</source>
          . pp.
          <fpage>437</fpage>
          –
          <lpage>442</lpage>
          (
          <year>2017</year>
          ). https://doi.org/10.1109/AICCSA.2017.115
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Vondricka</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Design of a multiword expressions database</article-title>
          .
          <source>The Prague Bulletin of Mathematical Linguistics</source>
          <volume>112</volume>
          (
          <issue>1</issue>
          ),
          <fpage>83</fpage>
          –
          <lpage>101</lpage>
          (Apr
          <year>2019</year>
          ). https://doi.org/10.2478/pralin-2019-0003
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Vrandecic</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Architecture for a multilingual Wikipedia</article-title>
          . CoRR abs/2004.04733 (
          <year>2020</year>
          ), https://arxiv.org/abs/2004.04733
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>