<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The Role of a Computational Lexicon for Query Expansion in Full-Text Search</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Emiliano Giovannetti</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Davide Albanesi</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Bellandi</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Simone Marchi</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mafalda Papini</string-name>
        </contrib>
        <aff>Via G. Moruzzi, Pisa <email>name.surname@ilc.cnr.it</email></aff>
      </contrib-group>
      <abstract>
        <p>This work describes the first experiments conducted with a computational lexicon of Italian in a context of query expansion for full-text search. An application, composed of a graphical user interface and backend services to access the lexicon and the database containing the corpus to be queried, was developed. The text was morphologically analysed to improve the precision of the search process. Some examples of queries are given to show the potential of a text search approach supported by a complex and stratified lexical resource.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
      </p>
      <p>
        The need for techniques that go beyond mere “search by keyword” in the querying of textual resources dates back to the dawn of computational linguistics. Seminal works of the 1960s on the development of the very first question answering (QA) systems already included linguistic resources as support datasets. To cite some “old school” examples, the “General Inquirer” QA system
        <xref ref-type="bibr" rid="ref17">(Stone et al., 1962)</xref>
        used a thesaurus for “coding words as to concept membership”, while Simmons’s “Protosynthex” was equipped with a synonym dictionary
        <xref ref-type="bibr" rid="ref16">(Simmons et al., 1963)</xref>
        to “expand the meaning of the question's words to any desired level”. One of the first works specifically focussed on the use of a lexical resource for NLP tasks was about COMPLEX (for “COMPutational LEXicon”), a resource developed at IBM
        <xref ref-type="bibr" rid="ref9">(Klavans, 1988)</xref>
        .
      </p>
      <p>
        The support of linguistic resources has
proved its potential in the field of information
retrieval (IR) too, as highlighted in many of Bill
Woods’ works, culminating in the introduction
of his conceptual indexing technique and the
conceptual taxonomy resource
        <xref ref-type="bibr" rid="ref19">(Woods, 1997)</xref>
        and later refined in an article entitled
“Linguistic Knowledge can Improve
Information Retrieval”
        <xref ref-type="bibr" rid="ref20">(Woods, et al, 2000)</xref>
        .
More recently, other researchers have stressed the importance of the availability of a “Lexical Knowledge Base” (another way to refer to a computational lexicon) in tasks such as Word Sense Disambiguation, since its use, in some contexts, can outperform supervised systems
        <xref ref-type="bibr" rid="ref1">(Agirre et al., 2009)</xref>
        .
      </p>
      <p>
        The use of linguistic resources in the QA systems of the earliest period of computational linguistics can be considered the precursor of “query expansion” (QE), the technique that Manning, Raghavan, and Schütze describe as the most used “local method” in IR to tackle those situations in which “the same concept may be referred to using different words”
        <xref ref-type="bibr" rid="ref10">(Manning et al., 2008)</xref>
        .
      </p>
      <p>Though QE may be obtained in different
ways (among which query reformulations based
on query log mining) we are here interested in
those applications that make use of lexical
resources.</p>
      <p>
        Most of the works published from the 1990s to the present (proving that QE is still being investigated) exploit WordNet
        <xref ref-type="bibr" rid="ref5">(Fellbaum, 1998)</xref>
        , the de facto standard and most widespread ontological (or lexical, depending on the point of view) multilingual resource. Ellen Voorhees was one of the first to do so, using WordNet’s IS_A relations to improve text retrieval (Voorhees, 1993). Moving directly to the most recent works, WordNet has been used with all its ontological features to expand queries in a semantic text search context in
        <xref ref-type="bibr" rid="ref11">(Ngo et al., 2018)</xref>
        , while in
        <xref ref-type="bibr" rid="ref6">(Azad and Deepak, 2019)</xref>
        the authors combined WordNet and Wikipedia for QE, exploiting the former to expand individual terms and the latter to expand phrase terms.
      </p>
      <p>The research work illustrated here places itself in the context of full-text search carried out using a lexical resource-driven QE technique. However, unlike the cited works, the focus of this research is not on the specific QE technique and its evaluation, but on the resource we chose to exploit in place of WordNet, introduced in the next section, and on the frontend and backend technologies implemented to query the text, described in detail in Section 3. The advantages derived from the adoption of a rich and highly structured computational lexicon are also illustrated through some query examples in Section 4. The developed application can be freely accessed and used to query the corpus (https://klab.ilc.cnr.it/talmudSearch/).</p>
    </sec>
    <sec id="sec-2">
      <title>The Context and the Resource</title>
      <p>This work stems from the activities conducted by the Institute of Computational Linguistics of CNR (ILC-CNR) in the context of the Talmud Translation Project (https://www.talmud.it/). The need to provide a way to query the Italian translation of the Talmud on a linguistic basis was the initial spark that led to the idea of experimenting with the use of a computational lexicon for Italian. As a matter of fact, this resource (described below) represents a “linguistic mine” which has never been exploited for tasks of full-text search or information retrieval. The corpus queried here is limited to eight tractates of the Babylonian Talmud: Rosh Hashanah, Berakhot, Ta'anit, Kiddushin, Chagigah, Beitza, Sukkah, and Megillah.</p>
      <sec id="sec-2-1">
        <sec id="sec-2-1-1">
          <title>The Parole-Simple-Clips Lexicon</title>
          <p>
            “PAROLE-SIMPLE-CLIPS” (PSC) is a computational lexicon of Italian, developed from 1996 to 2003 by ILC-CNR
            <xref ref-type="bibr" rid="ref15">(Ruimy et al., 2002)</xref>
            . Currently, the resource is stored as a MySQL database available on CLARIN (https://dspace-clarin-it.ilc.cnr.it/repository/xmlui/handle/20.500.11752/ILC-88), and represents a unicum among the available linguistic resources for Italian, thanks to the richness and articulated structure of its data. Based on the Generative Lexicon theory
            <xref ref-type="bibr" rid="ref13">(Pustejovsky, 1995)</xref>
            , the schema in which the linguistic information is encoded is composed of four distinct but strictly interconnected layers of analysis: phonology, morphology, syntax, and semantics.
          </p>
          <p>In these features lies the motivation of this work, since the available linguistic information may be combined in ways that go well beyond what resources such as WordNet allow in the context of text search support. Even considering semantics alone, the information in PSC is detailed with fine-grained features that are not described in WordNet’s network of synsets: PSC encodes the meaning of each lexical sense as an array of information, including “templates” (see below), semantic traits, semantic roles, and argument structures.</p>
          <p>In this work, we document the first steps in the use of PSC for QE. At this stage we used: i) the Morphological Units, classified according to their POS, which represent the lemmas of the computational lexicon; ii) the Phonological Units, which represent the inflected forms of the lemmas; iii) the Semantic Units (SemUs), which describe the senses expressed by the words. Furthermore, we considered the following morphological and semantic information: i) morphological traits (e.g. gender, number); ii) relations between SemUs (at the moment limited to synonymy and hyponymy); iii) the association between SemUs and “templates”, representing sets of senses, labeled according to one of the types represented in the Simple Ontology
            <xref ref-type="bibr" rid="ref7">(Lenci et al., 2000)</xref>
            . The other parts of linguistic information will be the subject of future work, according to an incremental approach.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>The Process and the Application</title>
      <p>The whole search process involves a series of
steps that can be summarized as follows (see
Fig. 1 for a schematic functional architecture of
the application):
i) the user inserts a first set of data to
formulate the desired query in the Graphical
User Interface;
ii) the interface requests, via Web API, the
lexicon backend services which return the
linguistic data matching the initial query;
iii) the user completes the query taking into
account the linguistic data and starts the
search;
iv) the interface executes the query
expansion and requests, via Web API, the
text backend services which collect, tag, and
return the matching textual portions of the
Talmud;
v) the interface shows the results to the user.</p>
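      <p>The flow of steps i to v can be sketched as follows. This is only a minimal illustration under stated assumptions: the two backends are replaced by hypothetical in-memory stand-ins (LEXICON, SEGMENTS), the function names are ours, the sentences are invented, and the interactive refinement of step iii is skipped.</p>
      <preformat>
```python
# Hypothetical in-memory stand-ins for the two backends (the real ones are Web APIs).
LEXICON = {"insegnamento": ["insegnamento", "insegnamenti"]}
SEGMENTS = [  # invented toy sentences, not text from the corpus
    "Questo insegnamento viene tramandato.",
    "Gli insegnamenti dei saggi sono molti.",
    "Il giorno era sereno.",
]


def lexicon_service(word):
    """Step ii: return the inflected forms known for the word entered by the user."""
    return LEXICON.get(word.lower(), [])


def text_service(forms):
    """Step iv: return the segments that contain at least one of the forms."""
    wanted = {form.lower() for form in forms}
    hits = []
    for segment in SEGMENTS:
        tokens = {token.strip(".,;:!?").lower() for token in segment.split()}
        if not tokens.isdisjoint(wanted):
            hits.append(segment)
    return hits


def search(word):
    """Steps i to v chained together, without the interactive refinement of step iii."""
    forms = lexicon_service(word)   # linguistic data for the initial query
    return text_service(forms)      # expanded query sent to the text backend
```
      </preformat>
      <p>In this sketch, calling search("insegnamento") returns the two toy segments that contain a form of the lemma, while a word missing from the toy lexicon returns an empty list.</p>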
      <p>First of all, to make the lexicon efficiently queryable, it needed to be transformed from relational data into linked data (Section 3.1). At the same time, a list of services to query both PSC and the database storing the Italian translation of the Talmud needed to be developed in order to answer the interface requests (Section 3.2). The interface itself was designed on the basis of the available linguistic information exposed from PSC and developed accordingly (Section 3.3). Finally, to improve the precision of the search process, the queried corpus was also POS-tagged (Section 3.4).</p>
      <sec id="sec-3-1">
        <sec id="sec-3-1-1">
          <title>A First Conversion of PSC</title>
          <p>The first phase of our work was to consider the relational database of PSC as the data source for the generation of a first Linked Data (LD) conversion. Two main reasons led to the need for a conversion of PSC: i) easing the reuse of the lexicon itself, by virtue of the intrinsic nature of LD; ii) the possibility of performing automated reasoning on the data, if appropriately modeled according to ontological principles, for example to compute inferred closures or to infer new knowledge on the basis of class taxonomies, property hierarchies, and so on. In accordance with the LD principles, we first had to look for existing vocabularies for the modeling of lexicons.</p>
          <p>
            In the context of the Semantic Web, the de facto standard for representing lexical information is the lemon model
            <xref ref-type="bibr" rid="ref3">(Cimiano et al., 2016)</xref>
            . Its core module, called OntoLex, allows the representation of grammatical, basic morphological, and semantic information by means of three main classes: Lexical Entry, Form (lemma and inflected forms), and Lexical Sense. Lemon relies on external vocabularies to define semantic relations between senses: in this conversion we modelled PSC’s synonymy and hyponymy with the LexInfo ontology (https://lexinfo.net/). Currently, the converted resource includes 72006 lexical entries (48735 nouns, 6522 verbs, and 11830 adjectives), 469726 inflected forms, and 57130 senses. Explicit lexico-semantic relations include 1803 meronyms, 4060 synonyms, and 44487 hyponyms. This initial conversion of PSC as Linked Data was purely functional to the linguistic querying of the Italian translation of the Babylonian Talmud: we remark that the conversion of PSC Simple is not the focus of this work, but it was necessary for performing linguistic search experiments on the Italian translation of the Talmud. Therefore, it was decided to convert only a selected subset of the linguistic data, to be exploited for the process of query expansion. At the time of writing, a complete conversion of PSC as LOD (Linked Open Data) is in progress. This complete conversion will also take full advantage of the already available works on the resource, as documented in
            <xref ref-type="bibr" rid="ref8">(Khan et al., 2018)</xref>
            and
            <xref ref-type="bibr" rid="ref4">(Del Gratta et al., 2015)</xref>
            .
          </p>
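          <p>To give an idea of what an entry modelled with OntoLex and LexInfo may look like, the following sketch builds the triples for one noun. It is an illustration only and not the authors' conversion code: the base namespace, the IRI naming scheme, the LexInfo namespace version, and the direction chosen for the hyponym link are all assumptions.</p>
          <preformat>
```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ONTOLEX = "http://www.w3.org/ns/lemon/ontolex#"
LEXINFO = "http://www.lexinfo.net/ontology/2.0/lexinfo#"  # version is an assumption
BASE = "http://example.org/psc/"  # hypothetical namespace, not the real PSC one


def entry_triples(lemma, pos, forms, hyponym_of=None):
    """Return (subject, predicate, object) triples for one lexical entry.

    `forms` maps each written form to a LexInfo number value ("singular" or
    "plural"); the first form is treated as the canonical (lemma) form.
    """
    entry = BASE + lemma
    sense = BASE + lemma + "_sense1"
    triples = [
        (entry, RDF_TYPE, ONTOLEX + "LexicalEntry"),
        (entry, LEXINFO + "partOfSpeech", LEXINFO + pos),
        (entry, ONTOLEX + "sense", sense),
    ]
    for index, (written, number) in enumerate(forms.items()):
        form = f"{entry}_form{index}"
        link = "canonicalForm" if index == 0 else "otherForm"
        triples += [
            (entry, ONTOLEX + link, form),
            (form, ONTOLEX + "writtenRep", f'"{written}"@it'),
            (form, LEXINFO + "number", LEXINFO + number),
        ]
    if hyponym_of:
        # Assumption: the link goes from the narrower sense to the broader one.
        broader = BASE + hyponym_of + "_sense1"
        triples.append((sense, LEXINFO + "hyponym", broader))
    return triples
```
          </preformat>
          <p>For instance, entry_triples("istruzione", "noun", {"istruzione": "singular", "istruzioni": "plural"}, hyponym_of="insegnamento") yields one entry with two forms, one sense, and one hyponymy link.</p>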
        </sec>
        <sec id="sec-3-1-2">
          <title>Setting up the Backend</title>
          <p>
            Once the computational lexicon was converted, the implementation of the querying system continued with the creation of the backend services needed to access both the lexicon and the database storing the text to be queried. Regarding the lexicon, a GraphDB repository containing all the converted data was set up; Ontotext GraphDB is a highly efficient and robust graph database with RDF/OWL and SPARQL support (https://graphdb.ontotext.com/documentation/free/free/graphdb-free.html). Access to the repository was implemented with a set of REST services that can be invoked from any web client; their source code is available at https://github.com/andreabellandi/LexO-backend. The services are based on the already available backend of LexO, a collaborative web tool for the creation and editing of lemon lexical resources
            <xref ref-type="bibr" rid="ref2">(Bellandi, 2021)</xref>
            . At the same time, a list of analogous services was made available to retrieve the textual portions of the corpus matching the expanded queries coming from the frontend of the system. The Italian translation of the Babylonian Talmud is currently stored as a MySQL database, where each segment of text appears both in its original and in its POS-tagged version (see Section 3.4).
          </p>
        </sec>
        <sec id="sec-3-1-3">
          <title>The Graphical User Interface</title>
          <p>The GUI (Fig. 2) set up to query the corpus was developed using Angular (https://angular.io/), one of the most widespread frameworks for frontend Web development, which provides high levels of portability and scalability. In this first version of the search system, the interface was conceived as a sort of “hub” of the whole architecture: on the one side it interacts with the user, and on the other side it invokes the services exposed by GraphDB and the Talmud database. The interface is divided into two sections. In the left-hand column, the available tractates of the Talmud that can be queried are represented as a tree, allowing the user to specify the search context at different levels of granularity. The right-hand section contains the search parameters, where the user can choose between three types of search using the available tabs: Keyword, Form/Lemma, or Semantic Traits.</p>
          <p>The first one is the classic keyword-based search. The second type, via the Form/Lemma tab, allows the user to search for a specific word form or for the set of inflected forms of a given lemma by specifying some morphological constraints. When a word is entered in the text field, the GUI invokes the lexicon backend services to retrieve the lemmas corresponding to the indicated parameters and displays them with their different senses. Users can then proceed with the search, or they can select one or more lemmas and apply morphological constraints to them by clicking on the three-dot icon on their right. The selection of at least one of the senses enables the semantic extension search feature: a drop-down menu allows users to look for all the other senses in the lexicon appearing as hypernyms, hyponyms, or synonyms at a specified distance. The forms obtained with this extension are subject to the propagation of the morphological constraints applied to the lexical entry to which they are linked, whether explicit (entered from the interface) or implicit (in the case of a search by form). Finally, the “semantic traits” tab provides two template trees on which multiple selections are possible: the first click selects a template with all its descendants, the second deselects the descendants, and the third deselects the node itself. When the selection changes, the lexicon is queried to obtain the list of senses linked to the chosen templates. Users can then select the desired senses, which will be used to retrieve the forms of the related lemmas to be used in the QE.</p>
          <p>All the entered data are used to compose the expanded query, which consists of all the inflected forms provided by the lexicon that match the indicated morphological constraints, semantic extension, or templates.</p>
          <p>The results coming from the backend services accessing the Talmud database are then displayed in a table on the right-hand side, above which a panel lists the forms retrieved from the lexicon and used for the QE.</p>
        </sec>
        <sec id="sec-3-1-4">
          <title>POS-Tagging of the Text</title>
          <p>For the purpose of reducing the lexical
ambiguity in cases where a searched word could
match with homographs, the corpus was
automatically analyzed and annotated with
morphological information.</p>
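          <p>A morphological annotation step of this kind can be sketched with the Stanza toolkit, which is the tool named in the next paragraph. The helper names below are ours, and the package name "isdt" and the processor list are assumptions about how the pipeline might be configured; running build_pipeline requires the stanza package and a model download.</p>
          <preformat>
```python
def parse_feats(feats):
    """Turn a UD feature string such as "Gender=Masc|Number=Plur" into a dict."""
    if not feats:
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def build_pipeline():
    """Create an Italian pipeline for tokenization, MWT expansion and POS tagging."""
    import stanza  # imported lazily so that parse_feats works without Stanza installed

    processors = "tokenize,mwt,pos"
    stanza.download("it", package="isdt", processors=processors)
    return stanza.Pipeline("it", package="isdt", processors=processors)


def annotate(nlp, text):
    """Return (form, upos, features) triples for every word of `text`."""
    doc = nlp(text)
    return [
        (word.text, word.upos, parse_feats(word.feats))
        for sentence in doc.sentences
        for word in sentence.words
    ]
```
          </preformat>
          <p>Storing the features as key/value pairs next to each form makes it straightforward to compare them later with the constraints selected in the interface.</p>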
          <p>
            In particular, we parsed all the sentences of the eight tractates of the Babylonian Talmud with Stanford's Stanza tools
            <xref ref-type="bibr" rid="ref14">(Qi et al., 2020)</xref>
            using the pre-trained model based on the UD Italian ISDT treebank (https://universaldependencies.org/treebanks/it_isdt/index.html). The tool was configured to use the processors for tokenization, multi-word token expansion, and Part-of-Speech tagging, which also includes the attribution of morphological traits. Each morphologically annotated textual segment was then stored in the MySQL database, so that only the forms matching the morphological constraints coming from the GUI are returned.
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Examples of Queries</title>
      <p>In this last section, we show a concrete application of the approach by introducing some query examples. Each query can also be tested by the reader by accessing the available application.</p>
      <p>The first two examples show the search for words with specific morphological traits and the application of semantic extension. In these cases, the “Form/Lemma” type of search is selected. In the first example, the word “insegnamento” (teaching) is inserted as a lemma. The system finds it in the lexicon and shows it as a noun with one single sense. The user then adds a morphological constraint by setting the “number” trait to “plural”. Finally, the user extends the search to direct hyponyms (distance = 1) and submits the query.</p>
      <p>This is a simple case of propagation of the morphological traits through semantics. The lexicon contains two key pieces of information: i) the fact that the sense of “insegnamento” has three hyponyms: “erudizione” (erudition), “istruzione” (instruction), and “catechesi” (catechesis); ii) all the inflected forms and the related morphological traits of the searched word and its three hyponyms. On the basis of these data, the system composes the final query, which searches for all the plural forms of the four lemmas as nouns. As a result, 103 textual segments are retrieved, containing the words “insegnamenti” (97 matches) and “istruzioni” (6 matches) (Fig. 2).</p>
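      <p>The propagation described in this example can be sketched as follows. The hyponyms are those named in the text, but the data layout is hypothetical, and the sketch works on lemmas for brevity, whereas the actual system links senses and obtains the data from the lexicon services.</p>
      <preformat>
```python
from collections import deque

# Toy excerpt of the lexicon; the dictionary layout is an assumption.
HYPONYMS = {"insegnamento": ["erudizione", "istruzione", "catechesi"]}
FORMS = {
    "insegnamento": [("insegnamento", "singular"), ("insegnamenti", "plural")],
    "erudizione": [("erudizione", "singular"), ("erudizioni", "plural")],
    "istruzione": [("istruzione", "singular"), ("istruzioni", "plural")],
    "catechesi": [("catechesi", "singular"), ("catechesi", "plural")],  # invariable noun
}


def lemmas_within(lemma, distance):
    """Breadth-first walk of the hyponymy links, up to `distance` steps away."""
    seen = {lemma}
    queue = deque([(lemma, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == distance:
            continue  # do not go below the requested distance
        for child in HYPONYMS.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return seen


def expand(lemma, number, distance=1):
    """Forms of the lemma and of its hyponyms that satisfy the number constraint."""
    return {
        form
        for entry in lemmas_within(lemma, distance)
        for form, form_number in FORMS[entry]
        if form_number == number
    }
```
      </preformat>
      <p>With these toy data, expand("insegnamento", "plural") produces the four plural forms “insegnamenti”, “erudizioni”, “istruzioni”, and “catechesi”; which of them actually occur in the corpus is then decided by the text search.</p>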
      <p>The second example involves the verb “permettere” (to permit/allow), searched as a lemma, with morphological constraints restricting it to the finite moods (“indicative”, “subjunctive”, “imperative”, “conditional”). In addition, the user selects just one of the two available senses of the verb (the one with the definition “dare a qlcu la possibilità di fare qlco”, to give sb the chance to do smth) and then extends the search to its synonyms. In this case, the lexicon proposes two synonyms of the selected sense: the (single) senses of the words “concedere” and “consentire”. The resulting expanded query retrieves from the database a total of 405 matches, containing 334 strings of “permettere” (for 131 available forms in the lexicon), 44 strings of “concedere” (for 45 available forms), and 27 strings of “consentire” (for 41 forms).</p>
      <p>The last type of search is structured as a more explorative querying of the corpus. In the semantic traits tab, the user can choose one or more noun/verb or adjectival templates (groups of senses) to look for all the words related to a specific semantic field, such as objects, weather verbs, metalanguage, etc.</p>
      <p>In this example, the user selects the template
“Air animal”, which appears as a “leaf” of the
sub-tree under the parent-node “Entity”. Once
the template is chosen, the system retrieves
from the lexicon all the relative senses and
shows them in a window. It is then possible to
select all the available 165 senses or just some
of them. Finally, the user can run the search: the
system composes the expanded query and
retrieves 226 textual segments of the Talmud
containing words (both as lemmas and inflected
forms) with senses referring to the semantic
field of “Air animal”: “uccello” (bird), “mosca”
(fly), “cavallette” (grasshoppers), and so on.</p>
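      <p>The template-based selection can be sketched as a walk over the template tree followed by a lookup of the linked senses. The fragment below is illustrative: the intermediate node “Animal” and the assignment of senses to templates are assumptions, and the real tree and sense lists come from the lexicon.</p>
      <preformat>
```python
# Illustrative fragment of a template tree and of the senses linked to it.
CHILDREN = {
    "Entity": ["Animal"],
    "Animal": ["Air animal", "Earth animal"],
}
SENSES = {
    "Air animal": ["uccello", "mosca", "cavalletta"],
    "Earth animal": ["vacca", "gatta"],
}


def with_descendants(template):
    """Return the template together with every template below it in the tree."""
    selected = [template]
    for child in CHILDREN.get(template, []):
        selected.extend(with_descendants(child))
    return selected


def senses_for(templates):
    """Collect the senses linked to any of the selected templates, in order."""
    return [sense for name in templates for sense in SENSES.get(name, [])]
```
      </preformat>
      <p>Selecting a leaf such as “Air animal” returns only its own senses, while selecting an inner node also gathers the senses of all the templates below it, mirroring the first click described for the interface.</p>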
      <p>Among future developments, a feature for a “grouped” selection of multiple templates will be added, which will make it possible to search for textual segments containing co-occurrences of words referring to the specified templates. For example, the grouped selection of the templates “Color” and “Earth animal” will retrieve segments containing multiword expressions such as “vacca rossa” (red cow), “gatta nera” (black she-cat), “oche bianche” (white geese), etc.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>As shown in this paper, the availability of a rich and structured linguistic resource (such as the computational lexicon we have taken into account) seems to provide an edge over the standard query expansion techniques for full-text search based on WordNet. Now that a very first portion of the resource has been made available (though with a preliminary conversion) and the web application has been implemented, the road is cleared for the next steps.</p>
      <p>The first critical issue that will need to be
faced involves the limitedness of the resource,
covering most - but not all - the lemmas, forms,
and senses of standard contemporary Italian and
that lacks many domain-related terms or senses.
To fill this gap the resource will have to be
updated and enriched with more entries.</p>
      <p>At the same time, as anticipated, a more in-depth and rigorous conversion of PSC will have to be carried out, a process that will probably take considerable time and research effort and that, for the sake of this first experiment, would have been premature and unnecessary. As soon as the whole conversion is ready, the rest of the information encoded in the lexicon will be made available and integrated in the search process.</p>
      <p>Though the benefits of the availability of a computational lexicon with respect to WordNet (or a similar resource) may seem obvious in a context of QE for full-text search, an empirical evaluation would be desirable. However, setting up a benchmark conceived for this purpose appears anything but easy, mainly due to the lack of comparable works or evaluation campaigns focussing on the role of linguistic resources as support.</p>
      <p>In conclusion, we believe that these first experiments, carried out by querying the Talmudic text, appear promising, especially considering that only a small part of the lexicon has been used. In addition, the support in disambiguation provided by the POS tagging of the text suggests that a hybridization of a resource-driven QE technique with a deeper stochastic annotation of the corpus to be queried may constitute an interesting experimental field to be investigated.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work was conducted in the context of the TALMUD project and the scientific cooperation between S.c.a r.l. PTTB and ILC-CNR.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Eneko</given-names>
            <surname>Agirre</surname>
          </string-name>
          , Oier Lopez de Lacalle, and
          <string-name>
            <given-names>Aitor</given-names>
            <surname>Soroa</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Knowledge-Based WSD on Specific Domains: Performing better than Generic Supervised WSD</article-title>
          .
          <source>In IJCAI'09: Proceedings of the 21st international joint conference on Artificial intelligence</source>
          .
          <fpage>1501</fpage>
          -
          <lpage>1506</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Bellandi</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>LexO: An Open-source System for Managing OntoLex-Lemon Resources</article-title>
          . Language Resources &amp; Evaluation. https://doi.org/10.1007/s10579-021-09546-4
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          Ontology-Lexicon Community Group (W3C).
          <string-name>
            <given-names>Philipp</given-names>
            <surname>Cimiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>John P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          , and Paul Buitelaar (eds).
          <year>2016</year>
          .
          <article-title>Lexicon Model for Ontologies: Community Report</article-title>
          . https://www.w3.org/2016/05/ontolex/#overview
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Riccardo Del Gratta</surname>
            ,
            <given-names>Francesca</given-names>
          </string-name>
          <string-name>
            <surname>Frontini</surname>
            , Fahad Khan, and
            <given-names>Monica</given-names>
          </string-name>
          <string-name>
            <surname>Monachini</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Converting the PAROLE SIMPLE CLIPS Lexicon into RDF with lemon</article-title>
          .
          <source>Semantic web</source>
          <volume>6</volume>
          :
          <fpage>387</fpage>
          -
          <lpage>392</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Christiane</given-names>
            <surname>Fellbaum</surname>
          </string-name>
          .
          <year>1998</year>
          .
          <article-title>WordNet: An electronic lexical database</article-title>
          . MA: MIT Press.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Hiteshwar Kumar</given-names>
            <surname>Azad</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Akshay</given-names>
            <surname>Deepak</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>A new approach for query expansion using Wikipedia and WordNet</article-title>
          .
          <source>Information sciences</source>
          <volume>492</volume>
          :
          <fpage>147</fpage>
          -
          <lpage>163</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Lenci</surname>
          </string-name>
          , Nuria Bel, Federica Busa, Nicoletta Calzolari, Elisabetta Gola, Monica Monachini, Antoine Ogonowski, Ivonne Peters, Wim Peters, Nilda Ruimy, Marta Villegas, and Antonio Zampolli.
          <year>2000</year>
          .
          <article-title>SIMPLE: A General Framework for the Development of Multilingual Lexicons</article-title>
          .
          <source>International Journal of Lexicography</source>
          <volume>13</volume>
          (
          <issue>4</issue>
          ):
          <fpage>249</fpage>
          -
          <lpage>263</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Fahad</given-names>
            <surname>Khan</surname>
          </string-name>
          , Andrea Bellandi, Francesca Frontini, and
          <string-name>
            <given-names>Monica</given-names>
            <surname>Monachini</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>One Language to rule them all: Modelling Morphological Patterns in a Large Scale Italian Lexicon with SWRL</article-title>
          .
          <source>In Proceedings of the 11th International Conference on Language Resources and Evaluation - LREC2018</source>
          , Miyazaki, Japan. hal-01832652
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Judith</given-names>
            <surname>Klavans</surname>
          </string-name>
          .
          <year>1988</year>
          .
          <article-title>COMPLEX: a computational lexicon for natural language systems</article-title>
          .
          <source>In COLING '88: Proceedings of the 12th conference on Computational Linguistics</source>
          . https://doi.org/10.3115/991719.991802
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Christopher D.</given-names>
            <surname>Manning</surname>
          </string-name>
          , Prabhakar Raghavan, and
          <string-name>
            <given-names>Hinrich</given-names>
            <surname>Schütze</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <source>Introduction to Information Retrieval</source>
          . Cambridge University Press.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Vuong M.</given-names>
            <surname>Ngo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Tru H.</given-names>
            <surname>Cao</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Tuan M. V.</given-names>
            <surname>Le</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>WordNet-Based Information Retrieval Using Common Hypernyms and Combined Features</article-title>
          . arXiv preprint arXiv:1807.05574.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>James</given-names>
            <surname>Pustejovsky</surname>
          </string-name>
          .
          <year>1995</year>
          .
          <article-title>The Generative Lexicon</article-title>
          . Cambridge, MA: MIT Press.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Peng</given-names>
            <surname>Qi</surname>
          </string-name>
          , Yuhao Zhang, Yuhui Zhang, Jason Bolton, and
          <string-name>
            <given-names>Christopher D.</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Stanza: A Python Natural Language Processing Toolkit for Many Human Languages</article-title>
          .
          <source>In Association for Computational Linguistics (ACL) System Demonstrations</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Nilda</given-names>
            <surname>Ruimy</surname>
          </string-name>
          , Monica Monachini, Raffaella Distante, Elisabetta Guazzini, Stefano Molino, Marisa Ulivieri,
          <string-name>
            <given-names>Nicoletta</given-names>
            <surname>Calzolari</surname>
          </string-name>
          , and Antonio Zampolli.
          <year>2002</year>
          .
          <article-title>CLIPS, a multi-level Italian computational lexicon: A glimpse to data</article-title>
          .
          <source>In Proceedings of the Third International Conference on Language Resources and Evaluation (LREC02).</source>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Robert F.</given-names>
            <surname>Simmons</surname>
          </string-name>
          , Sheldon Klein, and
          <string-name>
            <given-names>Keren</given-names>
            <surname>McConlogue</surname>
          </string-name>
          .
          <year>1963</year>
          .
          <article-title>Indexing and dependency logic for answering English questions</article-title>
          .
          <source>American Documentation</source>
          <volume>15</volume>
          (
          <issue>3</issue>
          ):
          <fpage>196</fpage>
          -
          <lpage>204</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Philip J.</given-names>
            <surname>Stone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Robert F.</given-names>
            <surname>Bales</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Zvi</given-names>
            <surname>Namenwirth</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Ogilvie</surname>
          </string-name>
          .
          <year>1962</year>
          .
          <article-title>The general inquirer: A computer system for content analysis and retrieval based on the sentence as a unit of information</article-title>
          .
          <source>Behavioral Science</source>
          <volume>7</volume>
          (
          <issue>4</issue>
          ):
          <fpage>484</fpage>
          -
          <lpage>498</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Ellen M.</given-names>
            <surname>Voorhees</surname>
          </string-name>
          .
          <year>1993</year>
          .
          <article-title>Using WordNet to disambiguate word senses for text retrieval</article-title>
          .
          <source>In SIGIR '93: Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval</source>
          . https://doi.org/10.1145/160688.160715
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>William A.</given-names>
            <surname>Woods</surname>
          </string-name>
          .
          <year>1997</year>
          .
          <article-title>Conceptual indexing: A better way to organize knowledge</article-title>
          .
          <source>Technical Report SMLI TR-97-61</source>
          , Sun Microsystems Laboratories, Mountain View, CA, April. www.sun.com/research/techrep/1997/abstract61.html.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>William A.</given-names>
            <surname>Woods</surname>
          </string-name>
          , Lawrence A. Bookman, Ann Houston,
          <string-name>
            <given-names>Robert J.</given-names>
            <surname>Kuhns</surname>
          </string-name>
          , Paul Martin, and
          <string-name>
            <given-names>Stephen</given-names>
            <surname>Green</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Linguistic knowledge can improve information retrieval</article-title>
          .
          <source>In ANLC '00: Proceedings of the sixth conference on Applied natural language processing</source>
          . https://doi.org/10.3115/974147.974183
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>