<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The Lemma Bank of the LiITA Knowledge Base of Interoperable Resources for Italian</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Eleonora Litta</string-name>
          <email>eleonoramaria.litta@unicatt.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Passarotti</string-name>
          <email>marco.passarotti@unicatt.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paolo Brasolin</string-name>
          <email>paolo.brasolin@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Moretti</string-name>
          <email>giovanni.moretti@unicatt.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Mambrini</string-name>
          <email>francesco.mambrini@unicatt.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valerio Basile</string-name>
          <email>valerio.basile@unito.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Di Fabio</string-name>
          <email>andrea.difabio@unito.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cristina Bosco</string-name>
          <email>cristina.bosco@unito.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CIRCSE Research Centre, Università Cattolica del Sacro Cuore</institution>
          ,
          <addr-line>Largo Gemelli 1, 20123 Milano</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Università degli Studi di Torino - Dipartimento di Informatica</institution>
          ,
          <addr-line>Corso Svizzera 185, 10149 Torino</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The paper introduces the LiITA Knowledge Base of interoperable linguistic resources for Italian. After describing the principles of the Linked Data paradigm, on which LiITA is grounded, the paper presents the lemma-centred architecture of the Knowledge Base and details its core component, consisting of a large collection of Italian lemmas (called the Lemma Bank) used to interlink distributed lexical and textual resources.</p>
      </abstract>
      <kwd-group>
        <kwd>Linked Open Data</kwd>
        <kwd>Linguistic Resources</kwd>
        <kwd>Italian</kwd>
        <kwd>Interoperability</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>When considering the number of digital linguistic resources, either lexical or textual, Italian is among the richest languages: e.g., at the time of writing, a search on the CLARIN Virtual Language Observatory,<sup>1</sup> filtered for the Italian language, returns more than 8 000 results. Like other high-resource languages, Italian is provided with a large set of fundamental resources, including WordNets ([<xref ref-type="bibr" rid="ref1">1</xref>] and [<xref ref-type="bibr" rid="ref2">2</xref>]), a few treebanks available from the Universal Dependencies collection,<sup>2</sup> historical corpora<sup>3,4</sup> and reference corpora of written (e.g., CORIS/CODIS [<xref ref-type="bibr" rid="ref3">3</xref>]) and spoken language (e.g., KIParla [<xref ref-type="bibr" rid="ref4">4</xref>]).</p>
      <p>However, as is the case for many other languages, most linguistic resources for Italian vary in terms of data format, annotation criteria, and/or adopted tagsets. Such variation hinders full interaction between the (meta)data provided by the many available resources, with a negative impact on the empirical study of the language and on resource usability. Indeed, different resources may provide different information, or use a different granularity of information, about the same common object, namely words, which appear as occurrences in corpora and as entries in dictionaries or lexicons. Making this wealth of information interact represents one of today's main challenges, to best leverage the huge asset of (meta)data collected over decades of work.</p>
      <p>As a consequence, a very active line of research currently focuses on the so-called Linguistic Linked Open Data (LLOD), aiming to define common practices for the representation and publication of linguistic resources according to the principles of the Linked Data paradigm, which underpins the Semantic Web.<sup>5</sup> A recently concluded COST Action (Nexus Linguarum<sup>6</sup>) resulted both in the creation of a large and cohesive scientific community and in the definition of a set of shared vocabularies for linguistic knowledge description. Some of these vocabularies have been widely applied in the LiLa Knowledge Base (KB), which is probably the main LLOD use case currently available. LiLa (Linking Latin) is a KB of Latin linguistic resources made interoperable through their representation and publication according to the Linked Data principles. Thanks to its streamlined and language-independent architecture, LiLa is today a reference model for projects aiming to achieve online interoperability between distributed linguistic resources.</p>
      <p>Building on the experience of LiLa and reusing its architecture, the LiITA (Linking Italian)<sup>7</sup> project has started the creation of a KB of interoperable linguistic resources for Italian published as Linked Data. This paper describes the development of the fundamental component of the LiITA KB, which consists of a collection of Italian lemmas (called the Lemma Bank) that serves as the connection point between word occurrences and their entries in the corpora and lexical resources that will be published in the KB.</p>
      <p><bold>2. Linguistic Linked Data</bold></p>
      <p>Introduced by Tim Berners-Lee et al. [<xref ref-type="bibr" rid="ref5">5</xref>], the concept of the Semantic Web is based on the assumption that documents published on the World Wide Web are associated with information and metadata structured in such a way as to allow their querying and semantic interpretation not only by humans but also by automated agents.</p>
      <p>This structuring is implemented in the form of Linked Data, which are the pillars of the Semantic Web. Unlike a web made of hypertexts, where links are not semantically interpretable, the Semantic Web consists of links between “objects” associated with a unique and persistent identifier (URI: Uniform Resource Identifier). The links between objects are semantically interpretable as they are represented through vocabularies for knowledge description recorded in the form of ontologies.</p>
      <p>The Linked Data paradigm is founded on four principles defined by Berners-Lee himself:<sup>8</sup></p>
      <list list-type="order">
        <list-item><p>Use URIs as “names for things” to identify them uniquely and persistently. The “things” dealt with when handling linguistic (meta)data in Linked Data are linguistic objects, such as occurrences of words in texts, lexical entries in dictionaries, or sets of parts of speech;</p></list-item>
        <list-item><p>Use HTTP URIs to allow people (and machines) to look up things on the Web;</p></list-item>
        <list-item><p>Use standards such as RDF and SPARQL to provide useful information about what is identified by a URI, for the purpose of representation and retrieval of (meta)data. RDF (Resource Description Framework) [<xref ref-type="bibr" rid="ref6">6</xref>] is the data model that underlies the Semantic Web. According to this model, information in the Semantic Web is organised and represented in terms of triples, i.e., relationships between a Subject and an Object through a Property. The classes to which Subjects and Objects belong, as well as the semantics of Properties, are established by ontologies shared by the different communities that enrich and use the Semantic Web. SPARQL (SPARQL Protocol And RDF Query Language)<sup>9</sup> is a query language for (meta)data represented in RDF;</p></list-item>
        <list-item><p>Include links to other URIs to allow people (and machines) to discover more things.</p></list-item>
      </list>
      <p>Applying the principles of the Linked Data paradigm to (meta)data derived from linguistic resources and publishing them on the Web offers several benefits [<xref ref-type="bibr" rid="ref7">7</xref>]. Firstly, as for representation and modelling of (meta)data, RDF is a very versatile model, suitable for representing metadata such as those conveyed by the various levels of annotation available in linguistic resources (morphology, syntax, lemmatisation, etc.). Moreover, the adoption of a common data model (RDF) enables both structural (or syntactic) interoperability, which is the ability of different systems to process exchanged data using shared protocols and formats (such as HTTP and URI), and conceptual (or semantic) interoperability, which is the ability of a system to automatically and semantically interpret the exchanged information using a common set of classes and data categories defined in ontologies and vocabularies [<xref ref-type="bibr" rid="ref8">8</xref>]. The Italian language is no stranger to this paradigm,<sup>10,11,12</sup> but this is the first attempt to create a resource of this kind in the form of a lemma bank for Italian.</p>
      <p><bold>3. The LiITA Knowledge Base</bold></p>
      <p>This Section introduces the fundamental architecture of the LiITA KB and details its core component, i.e., a collection of canonical citation forms (lemmas) for the Italian language.<sup>13</sup> The base URI of the resource is http://www.liita.it/data/, a namespace we reserved by registering the domain, which also serves as a URL, e.g., for the project website.</p>
      <p><bold>3.1. The Architecture of LiITA</bold></p>
      <p>The architecture of the LiITA KB resembles that of the LiLa KB for Latin,<sup>14</sup> which is based on the assumption that the sources of the (meta)data that the KB makes interoperable are all related to words. These sources are linguistic resources and specifically:</p>
      <list list-type="bullet">
        <list-item><p>lexical resources, such as dictionaries or lexicons, which describe the properties of words and consist of lexical entries;</p></list-item>
        <list-item><p>textual resources, such as corpora and digital libraries, which provide texts and are made of occurrences of words (tokens).</p></list-item>
      </list>
      <p><sup>7</sup> http://www.liita.it/
<sup>8</sup> https://www.w3.org/DesignIssues/LinkedData
<sup>9</sup> https://www.w3.org/TR/rdf-sparql-query/
<sup>10</sup> http://hdl.handle.net/20.500.11752/ILC-1007
<sup>11</sup> http://hdl.handle.net/20.500.11752/ILC-66
<sup>12</sup> http://hdl.handle.net/20.500.11752/ILC-558
<sup>13</sup> https://github.com/LiITA-LOD
<sup>14</sup> https://lila-erc.eu/data-page/</p>
      <p>Lexical entries and word occurrences coming from
distributed resources are made interoperable in LiITA
by linking them to their respective lemmas. This makes
it possible to perform federated searches on the
different linguistic resources that LiITA makes interoperable.</p>
      <p>For example, one can search for all occurrences (tokens)
of the same lemma in multiple textual corpora; or
extract from multiple corpora all those tokens that have
certain lexical properties provided by one or more lexical
resources.</p>
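      <p>The lemma-centred linking just described can be illustrated with a minimal sketch in plain Python, with triples stored as tuples rather than in an actual RDF store. The base URI and the property names lila:hasLemma, ontolex:canonicalForm and ontolex:writtenRep are those used in this paper; the lemma, token and entry identifiers, the example.org resources and the semanticField property are hypothetical and serve only as illustration. In the KB itself, such searches would be expressed as SPARQL queries.</p>

```python
# Namespaces named in the paper.
LIITA = "http://www.liita.it/data/"
LILA = "http://lila-erc.eu/ontologies/lila/"
ONTOLEX = "http://www.w3.org/ns/lemon/ontolex#"

HAS_LEMMA = LILA + "hasLemma"
CANONICAL_FORM = ONTOLEX + "canonicalForm"
WRITTEN_REP = ONTOLEX + "writtenRep"

# Hypothetical identifiers and data, invented for this sketch only.
CADERE = LIITA + "id/lemma/cadere"
VELOCE = LIITA + "id/lemma/veloce"
ENTRY = "http://example.org/lexicon/entry/cadere"
SEMANTIC_FIELD = "http://example.org/lexicon/semanticField"

# Each triple is (Subject, Property, Object).
TRIPLES = [
    (CADERE, WRITTEN_REP, "cadere"),
    (VELOCE, WRITTEN_REP, "veloce"),
    ("http://example.org/corpusA/token/17", HAS_LEMMA, CADERE),
    ("http://example.org/corpusA/token/18", HAS_LEMMA, VELOCE),
    ("http://example.org/corpusB/token/4", HAS_LEMMA, CADERE),
    (ENTRY, CANONICAL_FORM, CADERE),
    (ENTRY, SEMANTIC_FIELD, "motion"),
]


def match(s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def tokens_of(written_rep):
    """All tokens, in any corpus, linked to a lemma with this written form."""
    lemmas = [s for s, _, _ in match(p=WRITTEN_REP, o=written_rep)]
    return sorted(tok for lemma in lemmas
                  for tok, _, _ in match(p=HAS_LEMMA, o=lemma))


def tokens_with_lexical_property(prop, value):
    """Tokens whose lemma is the canonical form of an entry having prop=value."""
    entries = [s for s, _, _ in match(p=prop, o=value)]
    lemmas = [o for entry in entries
              for _, _, o in match(s=entry, p=CANONICAL_FORM)]
    return sorted(tok for lemma in lemmas
                  for tok, _, _ in match(p=HAS_LEMMA, o=lemma))
```

      <p>Calling tokens_of("cadere") gathers tokens from both hypothetical corpora because they point to the same lemma, while tokens_with_lexical_property(SEMANTIC_FIELD, "motion") reaches the same tokens by passing through a lexical entry. In both cases the lemma is the only meeting point between the resources.</p>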
      <p>Given the central role played by lemmas in the
architecture of LiITA, the core component of the KB is
a collection of conventional citation forms (lemmas) of
Italian words, called the Lemma Bank.</p>
      <fig id="fig-1">
        <label>Figure 1</label>
        <caption>
          <p>The OntoLex-Lemon model.</p>
        </caption>
      </fig>
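      <p>As a rough illustration of what such a collection of citation forms involves, the following sketch models a tiny lemma bank in memory, anticipating the modelling choices detailed in Section 3.2: one lemma per written form and part of speech, symmetric variant links, and hypolemmas pointing to their main lemma. The class and method names (Lemma, LemmaBank, link_variants, set_hypolemma, pos_of) are hypothetical and are not part of the LiITA code base; the actual Lemma Bank is published as RDF, not as Python objects.</p>

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

Key = Tuple[str, str]  # (written representation, Universal PoS tag)


@dataclass
class Lemma:
    written_rep: str
    upos: str
    variants: Set[Key] = field(default_factory=set)  # cf. lila:lemmaVariant
    hypolemma_of: Optional[Key] = None               # cf. lila:isHypolemma


class LemmaBank:
    """Hypothetical in-memory stand-in for a lemma bank."""

    def __init__(self) -> None:
        self.lemmas: Dict[Key, Lemma] = {}

    def add(self, written_rep: str, upos: str) -> Lemma:
        # One lemma per (form, PoS) pair: a form with several PoS
        # yields several distinct lemmas.
        return self.lemmas.setdefault((written_rep, upos),
                                      Lemma(written_rep, upos))

    def link_variants(self, a: Key, b: Key) -> None:
        # The variant relation is symmetric, so it is stored on both sides.
        self.lemmas[a].variants.add(b)
        self.lemmas[b].variants.add(a)

    def set_hypolemma(self, hypo: Key, main: Key) -> None:
        self.lemmas[hypo].hypolemma_of = main

    def pos_of(self, written_rep: str) -> List[str]:
        return sorted(k[1] for k in self.lemmas if k[0] == written_rep)


bank = LemmaBank()

# "sopra" is usually assigned four PoS, hence four lemmas.
for upos in ("ADP", "ADV", "ADJ", "NOUN"):
    bank.add("sopra", upos)

# A plurale tantum lemmatised either in the plural or in the singular.
bank.add("pantaloni", "NOUN")
bank.add("pantalone", "NOUN")
bank.link_variants(("pantaloni", "NOUN"), ("pantalone", "NOUN"))

# A past participle treated as a hypolemma of its verb.
bank.add("cadere", "VERB")
bank.add("caduto", "ADJ")
bank.set_hypolemma(("caduto", "ADJ"), ("cadere", "VERB"))
```

      <p>Keying lemmas by the pair of written form and part of speech is a simplification made for this sketch. In the KB each lemma is identified by its own URI.</p>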
      <p>In the LiLa KB, lemmas are described with the help of a custom ontology.<sup>15</sup> This ontology, on the one hand, provides detailed information on some morphological and linguistic features of the lemmas (e.g. the part of speech, the grammatical gender for nouns and the inflectional class), relying on the OLiA annotation model [<xref ref-type="bibr" rid="ref9">9, 151-155</xref>]. On the other hand, the LiLa ontology defines classes and properties to model the task of lemmatisation, such as the property lila:hasLemma,<sup>16</sup> which links corpus tokens to lemmas. The class lila:Lemma<sup>17</sup> is defined as a subclass of ontolex:Form (see Section 3.2), so that the LiLa KB is not a lexical resource in itself, but rather a collection of canonical forms that can be used either to lemmatise texts or to index lexical entries.</p>
      <p><bold>3.2. The LiITA Lemma Bank</bold></p>
      <p><bold>Data modelling</bold></p>
      <p>The Lemma Bank of LiITA consists of a collection of lemmas of the Italian language, i.e., lexical citation forms adopted (more or less conventionally) in linguistic resources. These lemmas are the names of entries in (most) lexical resources and the forms chosen to gather all occurrences of a particular word in (lemmatised) textual resources. As mentioned above, the Lemma Bank plays a fundamental role in the LiITA KB, acting as the connection point between entries in various lexical resources and word occurrences in textual resources.</p>
      <p>Following the principles of the Linked Data paradigm, conceptual interoperability among the distributed resources connected in LiITA is achieved by applying a vocabulary for knowledge description commonly used in the world of Linguistic Linked Open Data. In the specific case of the Lemma Bank, this means adopting the vocabulary defined by OntoLex-Lemon [<xref ref-type="bibr" rid="ref10">10</xref>], one of the most widely used models for representing and publishing lexical resources as Linked Data. Figure 1 shows the OntoLex-Lemon model.</p>
      <p>In Figure 1, the Classes of OntoLex-Lemon are graphically represented within rectangles. The relationships between Classes are shown as arrows associated with the name of the Property that connects two Classes.</p>
      <p>The main Class of OntoLex-Lemon is ontolex:LexicalEntry,<sup>18</sup> understood as the unit of lexicon analysis that gathers one or more forms (ontolex:Form<sup>19</sup>) and one or more lexical senses (ontolex:LexicalSense<sup>20</sup>), lexical concepts (ontolex:LexicalConcept<sup>21</sup>) or entities from ontologies.</p>
      <p>Lexical senses are lexicalised senses: a sense belongs to exactly one lexical entry. Semantic aspects that can be expressed by multiple words are represented through lexical concepts, which can therefore have more than one lexicalisation. A typical example of a lexical concept is the synset in a resource like WordNet, which groups multiple words related by a conceptual synonymy relationship.</p>
      <p>Forms can have one or more graphical variants (written representations), represented through the Data Property ontolex:writtenRep,<sup>22</sup> and possibly one or more phonetic variants (Property ontolex:phoneticRep<sup>23</sup>). One of these forms, the object of the ontolex:canonicalForm Property,<sup>24</sup> is the form that is conventionally chosen to represent the entire set of inflected forms of a lexical entry. The Lemma Bank of LiITA is a collection of such forms, modelled as individuals of the Class lila:Lemma,<sup>25</sup> which is a subclass of ontolex:Form originally created for the LiLa project and adopted accordingly in the LiITA Lemma Bank. The lemmas of the LiITA Lemma Bank are unbound by any relationship with a lexical entry, as the Lemma Bank is not a lexical resource consisting of lexical entries but a set of canonical citation forms. This reflects the role of the Lemma Bank in LiITA as a collection of lemmas used to make resources interoperable.</p>
      <p>The LiITA Lemma Bank makes textual resources for Italian interoperable through the lila:hasLemma Property,<sup>26</sup> which links a token in a corpus with its lemma in the Lemma Bank. Lexical resources, on the other hand, are connected to the Lemma Bank through the ontolex:canonicalForm Property, which links a lexical entry in the resource to its corresponding lemma in the Lemma Bank.</p>
      <p>By using the Property lila:hasPOS,<sup>27</sup> each lemma in the Lemma Bank is assigned one part of speech, following the Universal PoS tagset [<xref ref-type="bibr" rid="ref11">11</xref>].</p>
      <p>In the case of words that are assigned multiple PoS tags in lexical resources, multiple lemmas are created in the Lemma Bank. For instance, the word sopra ‘over’ is usually assigned four PoS: preposition, adverb, adjective and noun. Thus, four distinct lemmas are created in the Lemma Bank with four different PoS represented via the lila:hasPOS Property.</p>
      <p>To harmonise different lemmatisation criteria that may be found in linguistic resources, the Lemma Bank of LiITA includes two specific Properties. The symmetric Property lila:lemmaVariant<sup>28</sup> connects different forms of the inflectional paradigm of a word that can be used as lemmas. A typical case is that of pluralia tantum, which can be lemmatised either in the plural form or in the singular form. This model allows, for example, for both the lila:Lemma pantaloni and the lila:Lemma pantalone, which are linked to each other by the lila:lemmaVariant Property.</p>
      <p>While lila:lemmaVariant links lemmas that are assigned the same part of speech, the Property lila:hasHypolemma<sup>29</sup> (and its inverse property lila:isHypolemma<sup>30</sup>) connects lemmas that can be used for the same word but have different parts of speech. This is the case for adjectives used as adverbs, e.g. veloce, which can be interpreted (and lemmatised) either as a form of an adjective (hence modelled as a lila:Lemma) or as an adverb (hence modelled as a lila:Hypolemma,<sup>31</sup> a subclass of lila:Lemma).</p>
      <p>Past participles are another kind of hypolemma (e.g. caduto ‘fallen’), which in the Lemma Bank are assigned the part of speech Adjective. Participles are modelled as individuals of the lila:Hypolemma Class and are connected to their verbal lemma (cadere ‘to fall’) through the lila:isHypolemma Property. Even when two resources lemmatise participles according to different criteria (namely, one under the participial lemma and the other under the verbal lemma), the two lemmatisations are harmonised in the Lemma Bank.</p>
      <p><bold>Data acquisition</bold></p>
      <p>The lemmas and PoS that constitute the Lemma Bank are based on the lexical base of an online version of the dictionary Nuovo De Mauro,<sup>32</sup> which amounts to about 145 000 entries; out of these, 13 000 multi-word expressions were excluded because they were deemed unnecessary, as lemmatisers usually deal with single tokens. About 94 000 lemmas were derived from the remaining 131 000 entries. The most numerically abundant PoS with which the Lemma Bank was populated are listed in Table 1.</p>
      <table-wrap id="tab-1">
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>Number of lemmas</th><th>PoS</th></tr>
          </thead>
          <tbody>
            <tr><td>56 575</td><td>Nouns</td></tr>
            <tr><td>19 912</td><td>Adjectives</td></tr>
            <tr><td>15 885</td><td>Verbs</td></tr>
            <tr><td>359</td><td>Proper Nouns</td></tr>
            <tr><td>311</td><td>Adverbs</td></tr>
            <tr><td>111026 [two counts fused in the source]</td><td>Pronouns, Conjunctions</td></tr>
            <tr><td>40</td><td>Prepositions</td></tr>
            <tr><td>58</td><td>Articles</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>This population process was not an easy task for two main reasons. Firstly, the online version of Nuovo De Mauro is tailored for visualisation: data is mixed with graphical information. Secondly, Nuovo De Mauro stems from one of the greatest efforts in Italian lexicographic history, namely GRADIT (Grande dizionario italiano dell’uso [<xref ref-type="bibr" rid="ref12">12</xref>]). The resource includes information that is especially hard to handle computationally: De Mauro and colleagues described for every lemma not only the usual lexicographic metadata (meaning, PoS, examples, etc.) but also frequency, semantic domain, grouping of senses, multi-word expressions and more. In practice, the extraction of data is hindered by information that must be filtered out, either because it is not relevant for the purpose of building a lemma bank or because it is provided in non-homogeneous forms. Therefore, in order to ease this initial work, we decided to first extract the aforementioned PoS, leaving out some of the minor lexical categories like acronyms (e.g. NASA, FBI), exclamation marks, or unit symbols (e.g. cm, kg), and setting them aside for future developments of LiITA.</p>
      <p>For the time being, the PoS categorisation of Nuovo De Mauro was adopted with some in-house adjustments and mapped to the UPOS tagset. The original tagging follows the Italian grammatical tradition, hence some tags had to be adapted. Conjunctions are a case in point: Nuovo De Mauro does not distinguish between subordinating and coordinating conjunctions, so we manually aligned each of the dictionary’s conjunctions to the corresponding UPOS tag. For the rest of De Mauro’s PoS, we manually established the correspondence with the UPOS tagset.</p>
      <p><sup>15</sup> http://lila-erc.eu/ontologies/lila/
<sup>16</sup> http://lila-erc.eu/ontologies/lila/hasLemma
<sup>17</sup> http://lila-erc.eu/ontologies/lila/Lemma
<sup>18</sup> http://www.w3.org/ns/lemon/ontolex#LexicalEntry
<sup>19</sup> http://www.w3.org/ns/lemon/ontolex#Form
<sup>20</sup> http://www.w3.org/ns/lemon/ontolex#LexicalSense
<sup>21</sup> http://www.w3.org/ns/lemon/ontolex#LexicalConcept
<sup>22</sup> http://www.w3.org/ns/lemon/ontolex#writtenRep
<sup>23</sup> http://www.w3.org/ns/lemon/ontolex#phoneticRep
<sup>24</sup> http://www.w3.org/ns/lemon/ontolex#canonicalForm
<sup>25</sup> http://lila-erc.eu/ontologies/lila/Lemma
<sup>26</sup> http://lila-erc.eu/ontologies/lila/hasLemma
<sup>27</sup> http://lila-erc.eu/ontologies/lila/hasPOS
<sup>28</sup> http://lila-erc.eu/ontologies/lila/lemmaVariant
<sup>29</sup> http://lila-erc.eu/ontologies/lila/hasHypolemma
<sup>30</sup> http://lila-erc.eu/ontologies/lila/isHypolemma
<sup>31</sup> http://lila-erc.eu/ontologies/lila/Hypolemma
<sup>32</sup> https://dizionario.internazionale.it/. PoS tags were converted automatically into the Universal tagset, adopted in the Lemma Bank.</p>
    </sec>
    <sec id="sec-2">
      <title>4. Conclusion and Future Work</title>
      <p>In this paper we presented the first steps towards the publication as LLOD of a collection of canonical citation forms (lemmas) for Italian. This Lemma Bank is the core component of LiITA, a knowledge base of interoperable linguistic resources for Italian inspired by the LiLa knowledge base for Latin. LiITA aims to compensate for the current lack of interoperability between Italian resources, as well as to become the pivot to interlink all the present and future lexicons and corpora for Italian. To this aim, the Lemma Bank is modelled such that it can harmonise different lemmatisation criteria found in lexical and textual resources, following a bottom-up approach rather than a top-down one.</p>
      <p>Building a Lemma Bank to make distributed resources interoperable in Linked Data is an open-ended process. As the linking of more and more resources to the KB might require the inclusion of new lemmas, the LiITA Lemma Bank will keep on growing, both through the extraction of lemmas from other lexical sources and in a resource-driven fashion.</p>
      <p>Besides extending the Lemma Bank and linking the first resources, the LiITA project will develop online services, following what has been done for LiLa [<xref ref-type="bibr" rid="ref13">13</xref>]. The process of linking a text or corpus in the KB must be supported by an accessible tool performing automatic lemmatisation, PoS-tagging and linking. Currently, a new Stanza model [<xref ref-type="bibr" rid="ref14">14</xref>] has been trained combining all the existing Italian treebanks. This model will serve as the foundation for the linkage process of textual resources to be included in the LiITA KB.<sup>33</sup> The advanced interrogation of data offered by all the resources interlinked in LiITA will be eased by a graphical interface which will help with the task of writing complex SPARQL queries.</p>
      <p>Finally, given its language-independent architecture and the use of common vocabularies for knowledge description, LiITA promises to have a substantial methodological impact on how linguistic resources are published and made interoperable as Linked Data.</p>
      <p><sup>33</sup> The current model’s performance is presented in Table 2 in the Appendix. The model can be found at https://github.com/LiITA-LOD/LiITA</p>
      <p><bold>Acknowledgments</bold></p>
      <p>This contribution is funded by the European Union - Next Generation EU, Mission 4 Component 1 CUP J53D2301727OOO1. The PRIN 2022 PNRR project “LiITA: Interlinking Linguistic Resources for Italian via Linked Data” is carried out jointly by the Università Cattolica del Sacro Cuore, Milano and the Università di Torino.</p>
    </sec>
    <sec id="sec-3">
      <title>Appendix</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E.</given-names>
            <surname>Pianta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bentivogli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Girardi</surname>
          </string-name>
          ,
          <article-title>Multiwordnet: developing an aligned multilingual database</article-title>
          ,
          <source>in: First international conference on global WordNet</source>
          ,
          <year>2002</year>
          , pp.
          <fpage>293</fpage>
          -
          <lpage>302</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Roventini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Marinelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bertagna</surname>
          </string-name>
          ,
          <article-title>ItalWordNet v.2</article-title>
          ,
          <year>2016</year>
          . URL: http://hdl.handle.net/20.500.11752/ILC-62, ILC-CNR for CLARIN-IT repository hosted at Institute for Computational Linguistics "A. Zampolli", National Research Council, in Pisa.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R. R.</given-names>
            <surname>Favretti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Tamburini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>De Santis</surname>
          </string-name>
          ,
          <article-title>CORIS/CODIS: A corpus of written Italian based on a defined and a dynamic model</article-title>
          ,
          <source>A rainbow of corpora: Corpus linguistics and the languages of the world</source>
          (
          <year>2002</year>
          )
          <fpage>27</fpage>
          -
          <lpage>38</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Mauri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ballarè</surname>
          </string-name>
          , E. Goria,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cerruti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Suriano</surname>
          </string-name>
          , et al.,
          <article-title>Kiparla corpus: a new resource for spoken italian</article-title>
          ,
          <source>in: CEUR WORKSHOP PROCEEDINGS</source>
          , SunSITE Central Europe,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hendler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Lassila</surname>
          </string-name>
          ,
          <article-title>The semantic web</article-title>
          ,
          <source>Scientific american 284</source>
          (
          <year>2001</year>
          )
          <fpage>34</fpage>
          -
          <lpage>43</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <article-title>An introduction to the resource description framework</article-title>
          ,
          <source>Journal of library administration 34</source>
          (
          <year>2001</year>
          )
          <fpage>245</fpage>
          -
          <lpage>255</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Chiarcos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Moran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. N.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nordhof</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Littauer</surname>
          </string-name>
          ,
          <article-title>Building a linked open data cloud of linguistic resources: Motivations and developments</article-title>
          ,
          <source>The People's Web Meets NLP: Collaboratively Constructed Language Resources</source>
          (
          <year>2013</year>
          )
          <fpage>315</fpage>
          -
          <lpage>348</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>N.</given-names>
            <surname>Ide</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pustejovsky</surname>
          </string-name>
          ,
          <article-title>What does interoperability mean, anyway? toward an operational definition of interoperability for language technology</article-title>
          ,
          <source>in: Proceedings of the Second International Conference on Global Interoperability for Language Resources. Hong Kong</source>
          , China,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>P.</given-names>
            <surname>Cimiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chiarcos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gracia</surname>
          </string-name>
          ,
          <source>Linguistic Linked Data: Representation, Generation and Applications</source>
          , Springer, Cham,
          <year>2020</year>
          . URL: https://www.springer.com/gp/book/9783030302245. doi:10.1007/978-3-030-30225-2.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bosque-Gil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gracia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Buitelaar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cimiano</surname>
          </string-name>
          ,
          <article-title>The OntoLex-Lemon model: development and applications</article-title>
          ,
          <source>in: Proceedings of eLex 2017 conference</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>19</fpage>
          -
          <lpage>21</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Petrov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>McDonald</surname>
          </string-name>
          ,
          <article-title>A Universal Part-of-Speech Tagset</article-title>
          , in: N. Calzolari (Conference Chair),
          <string-name>
            <given-names>K.</given-names>
            <surname>Choukri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Declerck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. U.</given-names>
            <surname>Doğan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Maegaard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mariani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moreno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Odijk</surname>
          </string-name>
          , S. Piperidis (Eds.),
          <source>Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)</source>
          , European Language Resources Association (ELRA), Istanbul, Turkey,
          <year>2012</year>
          , pp.
          <fpage>2089</fpage>
          -
          <lpage>2096</lpage>
          . URL: http://www.lrec-conf.org/proceedings/lrec2012/pdf/274_Paper.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>T.</given-names>
            <surname>De Mauro</surname>
          </string-name>
          ,
          <article-title>Grande dizionario italiano dell'uso (Gradit)</article-title>
          , UTET,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Passarotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mambrini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Moretti</surname>
          </string-name>
          ,
          <article-title>The services of the LiLa knowledge base of interoperable linguistic resources for Latin</article-title>
          ,
          <source>in: Proceedings of the 9th Workshop on Linked Data in Linguistics @ LREC-COLING 2024</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>75</fpage>
          -
          <lpage>83</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bolton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>Stanza: A Python natural language processing toolkit for many human languages</article-title>
          ,
          <source>in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations</source>
          ,
          <year>2020</year>
          . URL: https://nlp.stanford.edu/pubs/qi2020stanza.pdf.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>