<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Analysis of lexical semantic changes in corpora with the Diachronic Engine</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pierluigi Cassotti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pierpaolo Basile</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco de Gemmis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Semeraro</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science University of Bari Aldo Moro Bari</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>With the growing availability of digitized diachronic corpora, the need for tools capable of taking into account the diachronic component of corpora becomes ever more pressing. Recent work on diachronic embeddings shows that computational approaches to the diachronic analysis of language are promising, but they are not user-friendly for people without a technical background. This paper presents the Diachronic Engine, a system for the diachronic analysis of the lexical features of corpora. Diachronic Engine computes word frequencies, concordances and collocations taking into account the temporal dimension. It can also compute temporal word embeddings and time-series that can be exploited for lexical semantic change detection.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Synchronic corpora are widely used in linguistics to derive, through statistical approaches, a set of abstract rules that govern the language under analysis. The same methodology can be adopted for analyzing the evolution of word meanings over time in diachronic corpora. However, this process can be very time-consuming. Linguists usually rely on software tools that make it easy to explore and clean the corpus while highlighting the most relevant linguistic features. Sketch Engine<sup>1</sup>
        <xref ref-type="bibr" rid="ref12 ref13">(Kilgarriff et al., 2004; Kilgarriff et al., 2014)</xref>
        is the leading tool in the corpus analysis field. Among several other features, Sketch Engine includes Trends
        <xref ref-type="bibr" rid="ref14 ref24">(Kilgarriff et al., 2015)</xref>
        , which allows for diachronic analysis based on the frequency distribution of words. Trends relies on frequency features alone, ignoring word usage information. Moreover, the Sketch Engine interface does not provide temporal information about concordances and collocations. NoSketchEngine<sup>2</sup> is an open-source version of Sketch Engine. It requires technical expertise for the setup and, unlike Sketch Engine, it does not support word sketches, terminology, thesaurus, n-grams, trends and corpus building. An interesting system is DiaCollo<sup>3</sup>
        <xref ref-type="bibr" rid="ref11">(Jurish, 2015)</xref>
        , a software tool for the discovery, comparison, and interactive visualization of target word combinations. Combinations can be requested for a particular time period, or for a direct comparison between different time periods. However, DiaCollo focuses exclusively on the extraction and visualization of collocations from diachronic corpora.
      </p>
      <p>
        In recent work on computational diachronic linguistics, techniques based on word embeddings have produced promising results. In SemEval-2020 Task 1
        <xref ref-type="bibr" rid="ref23">(Schlechtweg et al., 2020)</xref>
        , for instance, type embeddings reach high performance on both subtasks. However, these techniques are not included in any of the aforementioned linguistic tools. To bridge this gap, we built a tool that includes approaches for the analysis of diachronic embeddings. The result of our work is Diachronic Engine (DE), an engine for the management of diachronic corpora that provides tools for detecting lexical semantic change from a frequentist perspective. DE includes tools for extracting diachronic collocations and concordances in different time periods, as well as for computing semantic change time-series by exploiting both word frequencies and the similarity of word embeddings over time.
      </p>
      <p>The rest of the paper is organized as follows: Section 2 describes the technical details of DE, while Section 3 shows some use cases of our engine that involve time-series. We also present the results of a preliminary evaluation of the system’s usability in Section 4. Conclusions and future work close the paper.</p>
      <p><sup>2</sup> https://nlp.fi.muni.cz/trac/noske</p>
      <p><sup>3</sup> https://www.clarin.eu/showcase/diacollo</p>
    </sec>
    <sec id="sec-2">
      <title>Diachronic Engine</title>
      <p>Diachronic Engine (DE) is a web application for
lexical semantic change analysis in diachronic
corpora. The DE pipeline needs diachronic corpora to
compute statistics about the corpus. A diachronic
corpus must include a temporal feature (e.g., year
or timestamp of the publication date); DE exploits
that feature to sort the documents.</p>
      <p>We adopt the vertical format to represent word
information, as specified for the IMS Corpus
Workbench (CWB). In a vertical corpus, each
word is in a new line. In each line, fields, called
p-attributes, are separated by tabs. In DE the
default p-attributes are word, lemma, PoS tag and
syntactic dependency. Non-recursive XML tags
(s-attributes) on a separate line can be used for
representing sentences, paragraphs and documents.</p>
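      <p>As an illustration of the vertical format described above, the following minimal sketch reads token lines and structural tags into sentences. The column order in P_ATTRIBUTES, the use of an s tag for sentences, and the function name parse_vertical are assumptions made for this example; they are not taken from the DE code base.</p>
      <preformat><![CDATA[
```python
from typing import Dict, Iterator, List

# Default p-attributes named in the text; the column order is an assumption.
P_ATTRIBUTES = ["word", "lemma", "pos", "dep"]


def parse_vertical(lines: List[str]) -> Iterator[List[Dict[str, str]]]:
    """Yield each sentence of a vertical corpus as a list of token dicts."""
    sentence: List[Dict[str, str]] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("<") and line.endswith(">"):
            # Structural tag (s-attribute) on its own line.
            if line == "</s>" and sentence:
                yield sentence
                sentence = []
            continue
        # Token line: one token with tab-separated p-attributes.
        fields = line.split("\t")
        sentence.append(dict(zip(P_ATTRIBUTES, fields)))
    if sentence:  # corpus without a closing sentence tag
        yield sentence


# Hypothetical two-token Italian sentence in vertical format.
sample = [
    "<doc>",
    "<s>",
    "La\til\tDET\tdet",
    "casa\tcasa\tNOUN\troot",
    "</s>",
    "</doc>",
]
sentences = list(parse_vertical(sample))
print(sentences[0][1]["lemma"])
```
]]></preformat>
      <p>Keeping one token per line makes it straightforward to add or drop p-attributes: only the list of column names changes.</p>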
      <p>
        Corpora can be provided in vertical format<sup>4</sup> or as plain text; in the latter case, the plain text is converted to the vertical format using the spaCy-UDPipe<sup>5</sup>
        <xref ref-type="bibr" rid="ref25">(Straka, 2018)</xref>
        tool, which splits the plain text into sentences and then predicts the PoS tag, the lemma and the syntactic dependency for each token. UDPipe is a dependency parser that provides models for several languages. Models are built by using the Universal Dependencies<sup>6</sup> datasets as training data. The names of the input files must contain the temporal tag of the period to which they refer. DE automatically detects temporal patterns in the file names. In particular, the last sequence of digits in the file name is used to sort the documents.
      </p>
      <p><sup>4</sup> https://www.sketchengine.eu/my_keywords/vertical/</p>
      <p><sup>5</sup> https://pypi.org/project/spacy-udpipe/</p>
      <p><sup>6</sup> http://universaldependencies.org</p>
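      <p>The file-name rule above (the last sequence of digits is the temporal tag) can be sketched as follows. The function names and the example file names are hypothetical, and skipping files without any digits is an assumption of this sketch rather than documented DE behaviour.</p>
      <preformat><![CDATA[
```python
import re
from typing import List, Optional


def temporal_tag(file_name: str) -> Optional[int]:
    """Return the last run of digits in a file name, or None if absent."""
    runs = re.findall(r"\d+", file_name)
    return int(runs[-1]) if runs else None


def sort_by_period(file_names: List[str]) -> List[str]:
    """Sort file names chronologically; names without digits are skipped."""
    tagged = [(temporal_tag(name), name) for name in file_names]
    kept = [(tag, name) for tag, name in tagged if tag is not None]
    return [name for tag, name in sorted(kept)]


# Hypothetical input files; only the trailing digit run matters.
files = ["unita_1970.txt", "unita_1948.txt", "v2_corpus_1969.txt", "notes.txt"]
ordered = sort_by_period(files)
print(ordered)
```
]]></preformat>
      <p>Note that with this rule any digits in the file extension would be picked up as the tag, so file names are best kept free of trailing version numbers.</p>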
      <p>
        Corpora are stored and managed by the CWB,
a tool for the manipulation of large, linguistically
annotated corpora. In particular, DE relies on
the Corpus Query Processor (CQP)
        <xref ref-type="bibr" rid="ref6">(Christ et al.,
1999)</xref>
        , a specialized search engine for linguistic
research.
      </p>
      <p>
        To build temporal word embeddings, DE exploits Temporal Random Indexing (TRI)
        <xref ref-type="bibr" rid="ref1 ref2">(Basile et al., 2014; Basile et al., 2016)</xref>
        , which computes a word vector for each time period by summing random vectors that are shared across all the periods. TRI produces aligned word embeddings in a single step and is based on Random Indexing
        <xref ref-type="bibr" rid="ref21">(Sahlgren, 2005)</xref>
        , where the word vector (word embedding) <inline-formula><tex-math>sv_j^{T_k}</tex-math></inline-formula> for the word <inline-formula><tex-math>w_j</tex-math></inline-formula> at time <inline-formula><tex-math>T_k</tex-math></inline-formula> is the sum of the random vectors <inline-formula><tex-math>r_i</tex-math></inline-formula> assigned to the co-occurring words, taking into account only documents <inline-formula><tex-math>d_l \in T_k</tex-math></inline-formula>. Co-occurring words are defined as the set of m words that precede and follow the word <inline-formula><tex-math>w_j</tex-math></inline-formula>. Random vectors are initialized randomly and shared across all time slices, so that the word spaces are comparable.
      </p>
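      <p>The idea behind TRI can be sketched in a few lines of plain Python. This is a simplified illustration, not the implementation used in DE: the dimensionality, the number of non-zero entries, the window size and all function names are assumptions, and refinements such as frequency weighting are omitted.</p>
      <preformat><![CDATA[
```python
import random
from collections import defaultdict
from typing import Dict, List

DIM = 100      # vector dimensionality (assumed)
NONZERO = 4    # non-zero entries per random vector (assumed)
WINDOW = 2     # m context words on each side (assumed)


def random_vector(rng: random.Random) -> List[int]:
    """Sparse ternary random vector with a few +1/-1 entries."""
    vec = [0] * DIM
    for pos in rng.sample(range(DIM), NONZERO):
        vec[pos] = rng.choice((-1, 1))
    return vec


def temporal_random_indexing(corpus: Dict[str, List[List[str]]], seed: int = 0):
    """Map each period label to a {word: vector} space.

    corpus maps a period label to its tokenized sentences.
    """
    rng = random.Random(seed)
    # One random vector per word, shared by every time period.
    vocab = sorted({w for sents in corpus.values() for s in sents for w in s})
    r = {w: random_vector(rng) for w in vocab}
    spaces = {}
    for period, sentences in corpus.items():
        space = defaultdict(lambda: [0] * DIM)
        for sent in sentences:
            for j, target in enumerate(sent):
                lo = max(0, j - WINDOW)
                context = sent[lo:j] + sent[j + 1:j + 1 + WINDOW]
                # Add the random vectors of the co-occurring words.
                for c in context:
                    space[target] = [a + b for a, b in zip(space[target], r[c])]
        spaces[period] = dict(space)
    return spaces


# Toy corpus with two periods (hypothetical data).
corpus = {
    "1960": [["la", "guerra", "fredda"]],
    "1970": [["la", "guerra", "civile"]],
}
spaces = temporal_random_indexing(corpus)
```
]]></preformat>
      <p>Because the random vectors are drawn once and reused in every period, two words that occur in the same contexts in different periods receive identical vectors, which is what makes the spaces directly comparable without a separate alignment step.</p>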
      <p>
        Future versions will include other approaches,
such as Procrustes
        <xref ref-type="bibr" rid="ref10">(Hamilton et al., 2016)</xref>
        ,
Dynamic Word Embeddings
        <xref ref-type="bibr" rid="ref27">(Yao et al., 2018)</xref>
        ,
Dynamic Bernoulli Embeddings
        <xref ref-type="bibr" rid="ref19 ref22">(Rudolph and Blei,
2018)</xref>
        and Temporal Referencing
        <xref ref-type="bibr" rid="ref7">(Dubossarsky et
al., 2019)</xref>
        .
      </p>
      <p>The DE architecture is based on the client-server paradigm. The back-end of DE has been developed with Flask, a web framework written in Python. Concordances are retrieved by CQP, which indexes the corpus as soon as it is uploaded to the server, while collocations and frequencies are computed in Python. The back-end exposes a set of services through a REST API whose input and output are JSON messages.</p>
      <p>The back-end consists of three macro components: User Handler, Corpus Handler and Diachronic Operations. The User Handler manages information about registered users, such as usernames and passwords. The operations allowed on users are create, read, update and delete. The Corpus Handler manages corpus information such as name, language, the list of fields in the vertical files and corpus visibility. Moreover, it deals with corpus types: each corpus has a label indicating whether it is synchronic or diachronic. For diachronic corpora, the temporal range is also stored. The operations allowed on corpora are create, update, delete, search and read. The Diachronic Operations component provides frequency lists, collocations of words, time-series, change points and concordances. This component relies on CWB, which indexes the vertical files.</p>
      <p>The Diachronic Operations component
architecture is sketched in Figure 2.</p>
      <p>The front-end of DE has been developed with JHipster<sup>7</sup>, using Spring<sup>8</sup> for server-side applications and Angular for client-side applications. The front-end communicates with the back-end by means of the REST API.</p>
      <p>The front-end design is inspired by Google’s Material Design and the Sketch Engine interface. The user interface currently supports Italian and English, and we plan to extend it to other languages.</p>
      <p>This architecture keeps the back-end and the front-end independent: it is possible to develop a different front-end, or to connect the front-end to a different implementation of the back-end. The only constraint is the REST API interface.</p>
      <p>A screenshot of the DE homepage is provided in Figure 1. The homepage provides easy access to all corpora owned by the logged-in user, with links to the available tools. The front-end also provides tools for creating and managing users and corpora. In particular, it is possible to define different access permissions for each corpus.</p>
      <p>The tool is distributed as open-source software under the GNU v3 license<sup>9</sup>.</p>
      <p><sup>7</sup> https://www.jhipster.tech/</p>
      <p><sup>8</sup> https://spring.io/</p>
      <p><sup>9</sup> https://github.com/swapUniba/Diachronic-Engine</p>
      <p>DE provides a set of tools for managing and querying diachronic corpora. The core of the back-end is based on the IMS Open Corpus Workbench (CWB)<sup>10</sup>, which allows querying the indexed corpora by using the powerful CQP. Other tools have been integrated to facilitate the analysis of a diachronic corpus:</p>
      <p>
        <bold>Word frequency.</bold> Many works show a correlation between lexical semantic change and frequency differences between time periods. Google Ngram Viewer
        <xref ref-type="bibr" rid="ref17">(Michel et al., 2011)</xref>
        uses n-gram frequencies over time to show the change in the semantics of n-grams. Sketch Engine provides the Trends tool, which uses a linear regression of frequencies to identify words that appear to have changed. In DE, queries can be filtered by part of speech as well as by time period. We use normalized frequencies, which can be filtered by time period.
      </p>
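      <p>A per-period normalized frequency with an optional part-of-speech filter could be computed along these lines. The text does not specify the normalization, so the per-million scaling, the data layout and the function name are assumptions of this sketch.</p>
      <preformat><![CDATA[
```python
from typing import Dict, List, Optional, Tuple

Token = Tuple[str, str]  # (lemma, PoS tag)


def normalized_frequency(
    corpus: Dict[int, List[Token]],
    lemma: str,
    pos: Optional[str] = None,
    per: int = 1_000_000,
) -> Dict[int, float]:
    """Occurrences of a lemma per `per` tokens, for each time period."""
    series = {}
    for period in sorted(corpus):
        tokens = corpus[period]
        hits = sum(
            1 for l, p in tokens if l == lemma and (pos is None or p == pos)
        )
        # Normalize by the size of the period so that periods are comparable.
        series[period] = per * hits / len(tokens) if tokens else 0.0
    return series


# Toy corpus keyed by year (hypothetical data).
corpus = {
    1960: [("guerra", "NOUN"), ("fredda", "ADJ"), ("guerra", "NOUN"), ("la", "DET")],
    1970: [("guerra", "NOUN"), ("la", "DET"), ("casa", "NOUN"), ("il", "DET")],
}
freq = normalized_frequency(corpus, "guerra", pos="NOUN")
print(freq)
```
]]></preformat>
      <p>Normalizing by the number of tokens in each period prevents larger periods from dominating the frequency series.</p>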
      <p>
        <bold>Collocations.</bold> Collocations have been shown to be an effective tool in diachronic analysis
        <xref ref-type="bibr" rid="ref3">(Basile et al., 2019)</xref>
        . A collocation is a sequence of words that occurs more often than would be expected by chance. To compute the collocation strength, we use logDice
        <xref ref-type="bibr" rid="ref20">(Rychlý, 2008)</xref>
        :
        <disp-formula><tex-math>\mathrm{logDice} = 14 + \log_2 \frac{2 f_{xy}}{f_x + f_y}</tex-math></disp-formula>
        logDice takes into account the frequency of the word <inline-formula><tex-math>f_x</tex-math></inline-formula>, of the collocate <inline-formula><tex-math>f_y</tex-math></inline-formula> and the frequency of the whole collocation <inline-formula><tex-math>f_{xy}</tex-math></inline-formula>. Collocation results can be grouped by PoS tag.
      </p>
      <p>
        <bold>Concordances.</bold> Concordances offer a way to find “the evidence” directly in the text by exploiting the context. The Concordances tool lists instances of a word with its immediate left and right context and the time period each instance belongs to. An example of concordances from “L’Unità”
        <xref ref-type="bibr" rid="ref4">(Basile et al., 2020)</xref>
        is shown in Figure 3.
      </p>
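      <p>The logDice score used for collocation strength can be computed directly from three counts. The sketch below follows the usual formulation of logDice, 14 plus the base-2 logarithm of the Dice coefficient; the function name and the error handling are illustrative choices, not part of DE.</p>
      <preformat><![CDATA[
```python
import math


def log_dice(f_xy: int, f_x: int, f_y: int) -> float:
    """logDice from the collocation frequency and the two word frequencies."""
    if f_xy <= 0 or f_x <= 0 or f_y <= 0:
        raise ValueError("all frequencies must be positive")
    dice = 2 * f_xy / (f_x + f_y)
    return 14 + math.log2(dice)


# A pair that always occurs together reaches the maximum score of 14.
print(log_dice(10, 10, 10))
```
]]></preformat>
      <p>Since the score depends only on the frequencies of the word, the collocate and the pair, and not on the corpus size, values computed on time periods of different sizes remain comparable.</p>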
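      <p><bold>Time-series.</bold> A time-series <inline-formula><tex-math>\Gamma(w)</tex-math></inline-formula> of a word w is an ordered sequence of cosine similarities between the word vector at time k, <inline-formula><tex-math>v_w^{k}</tex-math></inline-formula>, and the previous one at time k-1, <inline-formula><tex-math>v_w^{k-1}</tex-math></inline-formula>:
        <disp-formula><tex-math>\Gamma(w)_k = \frac{v_w^{k} \cdot v_w^{k-1}}{|v_w^{k}|\,|v_w^{k-1}|}</tex-math></disp-formula>
      </p>
      <p><sup>10</sup> http://cwb.sourceforge.net/</p>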
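      <p>The time-series of a word defined above can be sketched as the cosine similarity between consecutive period vectors. The function names and the toy vectors are hypothetical, and returning 0.0 for a zero vector is an assumption of this example.</p>
      <preformat><![CDATA[
```python
import math
from typing import Dict, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def time_series(vectors_by_period: Dict[int, Sequence[float]]) -> Dict[int, float]:
    """Similarity between each period vector and the previous one."""
    periods = sorted(vectors_by_period)
    return {
        k: cosine(vectors_by_period[k], vectors_by_period[prev])
        for prev, k in zip(periods, periods[1:])
    }


# Toy word vectors for three periods (hypothetical data).
vectors = {1960: [1.0, 0.0], 1970: [1.0, 0.0], 1980: [0.0, 1.0]}
series = time_series(vectors)
print(series)
```
]]></preformat>
      <p>A value close to 1 means that the contexts of the word are stable between two consecutive periods, while a drop signals a possible change in usage.</p>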
      <p>
        Diachronic Engine relies on word vectors
computed by Temporal Random Indexing,
but it is possible to integrate other
approaches. In order to detect change points,
we use the Mean Shift algorithm
        <xref ref-type="bibr" rid="ref26">(Taylor,
2000)</xref>
        . According to this model, we define the mean shift of a general time series <inline-formula><tex-math>\Gamma</tex-math></inline-formula> pivoted at time period j as:
      </p>
      <p>
        <disp-formula><tex-math>K(\Gamma) = \frac{1}{l-j} \sum_{k=j+1}^{l} \Gamma_k - \frac{1}{j} \sum_{k=1}^{j} \Gamma_k \qquad (1)</tex-math></disp-formula>
      </p>
      <p>
        To assess whether a mean shift is statistically significant at time j, we adopt a bootstrapping
        <xref ref-type="bibr" rid="ref8">(Efron and Tibshirani, 1994)</xref>
        approach under the null hypothesis that there is no change in the mean. In particular, statistical significance is computed by first constructing B bootstrap samples by permuting the values <inline-formula><tex-math>\Gamma(t_i)</tex-math></inline-formula> of the time series. Second, for each bootstrap sample P, K(P) is calculated to provide its corresponding bootstrap statistic, and the statistical significance (p-value) of observing the mean shift at time j is computed against the null distribution. Finally, we estimate the change point by considering the time point j with the minimum p-value. The output of this process is a ranking of words that have potentially changed meaning. The Time-series tool can compare multiple words at the same time and allows words to be filtered by time period.
      </p>
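      <p>The change-point procedure can be sketched as a mean shift computed at every pivot, followed by a permutation-based significance estimate. This is a minimal illustration under stated assumptions: the test is two-sided (absolute shifts are compared), B defaults to 1000, ties are resolved in favour of the earliest pivot, and all names are hypothetical.</p>
      <preformat><![CDATA[
```python
import random
from typing import List, Sequence, Tuple


def mean_shift(series: Sequence[float], j: int) -> float:
    """Mean of the values after pivot j minus the mean of the first j values."""
    l = len(series)
    return sum(series[j:]) / (l - j) - sum(series[:j]) / j


def change_point(
    series: Sequence[float], n_boot: int = 1000, seed: int = 0
) -> Tuple[int, float]:
    """Return the pivot j with the smallest bootstrap p-value, and that p-value."""
    rng = random.Random(seed)
    l = len(series)
    pivots = list(range(1, l))
    observed = [mean_shift(series, j) for j in pivots]
    exceed: List[int] = [0] * len(pivots)
    values = list(series)
    for _ in range(n_boot):
        rng.shuffle(values)  # permutation under the no-change hypothesis
        for idx, j in enumerate(pivots):
            # Small tolerance guards against floating-point noise on ties.
            if abs(mean_shift(values, j)) >= abs(observed[idx]) - 1e-12:
                exceed[idx] += 1
    p_values = [count / n_boot for count in exceed]
    best = min(range(len(pivots)), key=lambda i: p_values[i])
    return pivots[best], p_values[best]


# Toy series with an abrupt drop after the fifth value (hypothetical data).
series = [1.0] * 5 + [0.5] * 5
pivot, p_value = change_point(series)
print(pivot, p_value)
```
]]></preformat>
      <p>Ranking words by the minimum p-value of their series gives the list of candidates for semantic change mentioned above.</p>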
    </sec>
    <sec id="sec-3">
      <title>Use cases</title>
      <p>
        In this section, we describe two use cases concerning both historical and computational linguistics. DE is an extension of existing tools for synchronic corpora. It shares many of the use cases already available in those tools, such as applications in lexicography, terminology and linguistics. Lexical semantic changes can reveal aspects of real-world events, such as global armed conflicts
        <xref ref-type="bibr" rid="ref15">(Kutuzov et al., 2017)</xref>
        . DE provides several tools to support event detection through time-series: (i) the comparison of two time-series, for highlighting potential correlations between lexical-semantic changes; (ii) the plot of the time-series of cosine similarity between two word vectors over time, showing how the relatedness between two words changes over time; (iii) the detected change points, which can bring out hidden information.
      </p>
      <p>Figure 4 shows the time-series of “terrorismo” (terrorism). The time-series appears to be influenced by real-world events happening in Italy. In particular, we can observe a decrease in similarity starting in 1968 and culminating in 1970, during a crucial period in Italian history: the “Anni di piombo” (Years of Lead), years marked by terrorism and violent clashes carried out by political activists.</p>
      <p>
        <bold>Annotation of semantic shifts.</bold>
        The manual annotation of lexical-semantic shifts can be very expensive. Although robust frameworks
        <xref ref-type="bibr" rid="ref22">(Schlechtweg et al., 2018)</xref>
        for the
annotations already exist and are successfully used in
evaluation tasks
        <xref ref-type="bibr" rid="ref23">(Schlechtweg et al., 2020)</xref>
        , no
tools for facilitating the annotation are available
yet.
      </p>
      <p>
        DE can provide useful tools for the annotation of semantic shifts:
        1. Frequencies over time can be used as a preliminary filter to select words that have good coverage in the years under analysis;
        2. Change points in time-series offer an overall and intuitive idea of the potential semantic shifts;
        3. Diachronic concordances and collocations can support the identification of the type of change
        <xref ref-type="bibr" rid="ref5">(Blank, 2012)</xref>
        , such as when a word gains or loses a meaning.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Evaluation</title>
      <p>
        We place particular emphasis on the usability of our tool, with the aim of providing a satisfying user experience. To understand the strengths and weaknesses of the user interface, we conducted a preliminary usability test according to the eGLU protocol
        <xref ref-type="bibr" rid="ref24">(Simone et al.,
2015)</xref>
        . The test involved 21 participants. As a first step of the evaluation, we tested the system’s usability by measuring the task success rate: the ratio of users able to accomplish a set of predefined tasks. We asked participants to perform four tasks and computed the average task success over all 21 participants. During the evaluation, all participants completed their tasks without difficulty, except for the “show frequency list” task, where they had some problems with corpus selection. We have already fixed this issue: if no corpus is selected, the user is now warned to choose one from those available.
      </p>
      <p>Results of the evaluation are reported in Table
1.</p>
      <sec id="sec-4-1">
        <title>Task</title>
        <p>User registration
Login and show
user information
Add a corpus
Show frequency
list
Overall</p>
      </sec>
      <sec id="sec-4-2">
        <title>Avg. task success</title>
        <p>1
1</p>
        <p>Moreover, we designed and administered a questionnaire for measuring user satisfaction. The questionnaire is composed of ten questions about the usability and the design of DE, each answered on a five-point Likert scale. The questionnaire yields an average score of 84.05/100, which suggests that participants found the system pleasant to use.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>In this paper, we presented the Diachronic Engine, a tool for the analysis of lexical semantic change. DE integrates and extends current tools for corpus analysis, enabling the study of the diachronic features of a corpus. DE includes tools not available in other systems, such as time-series and change-point detection based on diachronic word embeddings.</p>
      <p>
        As future work, we plan to provide pre-loaded corpora, such as Google Ngram and Diacoris
        <xref ref-type="bibr" rid="ref18">(Onelli et al., 2006)</xref>
        , and to integrate other approaches for computing diachronic word embeddings. Moreover, we plan to add a tool for the annotation of lexical-semantic shifts inspired by DURel
        <xref ref-type="bibr" rid="ref22">(Schlechtweg et al., 2018)</xref>
        .
      </p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>
        The authors would like to thank Dr. Ferrante and
Dr. Lopatriello for supporting the preliminary
development of the Diachronic Engine
        <xref ref-type="bibr" rid="ref16 ref9">(Ferrante,
2019; Lopatriello, 2020)</xref>
        . This research has been
partially funded by ADISU Puglia under the
postgraduate programme “Emotional city: a
location-aware sentiment analysis platform for mining
citizen opinions and monitoring the perception of
quality of life”.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Pierpaolo</given-names>
            <surname>Basile</surname>
          </string-name>
          , Annalina Caputo, and
          <string-name>
            <given-names>Giovanni</given-names>
            <surname>Semeraro</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Analysing word meaning over time by exploiting temporal random indexing</article-title>
          .
          <source>In First Italian Conference on Computational Linguistics</source>
          CLiC-it (CLiC-it
          <year>2014</year>
          ). CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Pierpaolo</given-names>
            <surname>Basile</surname>
          </string-name>
          , Annalina Caputo, Roberta Luisi, and
          <string-name>
            <given-names>Giovanni</given-names>
            <surname>Semeraro</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Diachronic analysis of the Italian language exploiting google ngram</article-title>
          .
          <source>In Proceedings of the Third Italian Conference on Computational Linguistics</source>
          (CLiC-it
          <year>2016</year>
          ),
          <article-title>page 56</article-title>
          . CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Pierpaolo</given-names>
            <surname>Basile</surname>
          </string-name>
          , Giovanni Semeraro, and
          <string-name>
            <given-names>Annalina</given-names>
            <surname>Caputo</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Kronos-it: a dataset for the Italian semantic change detection task</article-title>
          .
          <source>In Proceedings of the 6th Italian Conference on Computational Linguistics</source>
          (CLiC-it
          <year>2019</year>
          ). CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Pierpaolo</given-names>
            <surname>Basile</surname>
          </string-name>
          , Annalina Caputo, Tommaso Caselli, Pierluigi Cassotti, and
          <string-name>
            <given-names>Rossella</given-names>
            <surname>Varvara</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>A diachronic Italian corpus based on “L'Unità”</article-title>
          .
          <source>In Proceedings of the 7th Italian Conference on Computational Linguistics</source>
          (CLiC-it
          <year>2020</year>
          ). CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Blank</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Prinzipien des lexikalischen Bedeutungswandels am Beispiel der romanischen Sprachen</article-title>
          , volume
          <volume>285</volume>
          . Walter de Gruyter.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Oliver</given-names>
            <surname>Christ</surname>
          </string-name>
          , Bruno M Schulze,
          <string-name>
            <given-names>Anja</given-names>
            <surname>Hofmann</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Esther</given-names>
            <surname>Koenig</surname>
          </string-name>
          .
          <year>1999</year>
          .
          <article-title>The IMS Corpus Workbench: Corpus Query Processor (CQP): User's manual</article-title>
          . University of Stuttgart,
          <volume>8</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Haim</given-names>
            <surname>Dubossarsky</surname>
          </string-name>
          , Simon Hengchen, Nina Tahmasebi, and
          <string-name>
            <given-names>Dominik</given-names>
            <surname>Schlechtweg</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>TimeOut: Temporal Referencing for Robust Modeling of Lexical Semantic Change</article-title>
          .
          <source>In 57th Annual Meeting of the Association for Computational Linguistics</source>
          , pages
          <fpage>457</fpage>
          -
          <lpage>470</lpage>
          .
          <article-title>Association for Computational Linguistics (ACL).</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Bradley</given-names>
            <surname>Efron</surname>
          </string-name>
          and
          <string-name>
            <given-names>RJ</given-names>
            <surname>Tibshirani</surname>
          </string-name>
          .
          <year>1994</year>
          .
          <article-title>An Introduction to the Bootstrap</article-title>
          . CRC Press.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Francesco</given-names>
            <surname>Ferrante</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Diachronic-engine: Un tool per la gestione dei corpora diacronici</article-title>
          .
          <source>B.Sc. degree Thesis in Metodi per il Ritrovamento dell'Informazione.</source>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>William L.</given-names>
            <surname>Hamilton</surname>
          </string-name>
          , Jure Leskovec, and
          <string-name>
            <given-names>Dan</given-names>
            <surname>Jurafsky</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Diachronic word embeddings reveal statistical laws of semantic change</article-title>
          .
          <source>In 54th Annual Meeting of the Association for Computational Linguistics</source>
          ,
          <article-title>ACL 2016 - Long Papers</article-title>
          , volume
          <volume>3</volume>
          , pages
          <fpage>1489</fpage>
          -
          <lpage>1501</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Bryan</given-names>
            <surname>Jurish</surname>
          </string-name>
          and Berlin-Brandenburgische Akademie der Wissenschaften.
          <year>2015</year>
          .
          <article-title>Diacollo: On the trail of diachronic collocations</article-title>
          .
          <source>In Proceedings of the CLARIN Annual Conference</source>
          , pages
          <fpage>28</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Adam</given-names>
            <surname>Kilgarriff</surname>
          </string-name>
          , Pavel Rychlý, Pavel Smrž, and
          <string-name>
            <given-names>David</given-names>
            <surname>Tugwell</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>ITRI-04-08 The Sketch Engine</article-title>
          .
          <source>Information Technology</source>
          ,
          <volume>105</volume>
          :
          <fpage>116</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Adam</given-names>
            <surname>Kilgarriff</surname>
          </string-name>
          , Vít Baisa, Jan Bušta, Miloš Jakubíček, Vojtěch Kovář,
          <string-name>
            <given-names>Jan</given-names>
            <surname>Michelfeit</surname>
          </string-name>
          , Pavel Rychlý, and Vít Suchomel.
          <year>2014</year>
          .
          <article-title>The sketch engine: ten years on</article-title>
          .
          <source>Lexicography</source>
          ,
          <volume>1</volume>
          (
          <issue>1</issue>
          ):
          <fpage>7</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Adam</given-names>
            <surname>Kilgarriff</surname>
          </string-name>
          , Ondřej Herman, Jan Bušta, Vojtěch Kovář, et al.
          <year>2015</year>
          .
          <article-title>Diacran: a framework for diachronic analysis</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Andrey</given-names>
            <surname>Kutuzov</surname>
          </string-name>
          , Erik Velldal, and
          <string-name>
            <given-names>Lilja</given-names>
            <surname>Øvrelid</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Tracing armed conflicts with diachronic word embedding models</article-title>
          .
          <source>In Proceedings of the Events and Stories in the News Workshop</source>
          , pages
          <fpage>31</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Gabriele</given-names>
            <surname>Lopatriello</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Diachronic engine: A tool for the management of diachronic corpora</article-title>
          .
          <source>Master Thesis in Intelligent Information Access and Natural Language Processing.</source>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Jean-Baptiste</given-names>
            <surname>Michel</surname>
          </string-name>
          , Yuan Kui Shen, Aviva Presser Aiden, Adrian Veres, Matthew K Gray, Joseph P Pickett, Dale Hoiberg, Dan Clancy,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Norvig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jon</given-names>
            <surname>Orwant</surname>
          </string-name>
          , et al.
          <year>2011</year>
          .
          <article-title>Quantitative analysis of culture using millions of digitized books</article-title>
          .
          <source>Science</source>
          ,
          <volume>331</volume>
          (
          <issue>6014</issue>
          ):
          <fpage>176</fpage>
          -
          <lpage>182</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Corinna</given-names>
            <surname>Onelli</surname>
          </string-name>
          , Domenico Proietti, Corrado Seidenari, and
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Tamburini</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>The DiaCORIS project: a diachronic corpus of written Italian</article-title>
          .
          <source>In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC 2006)</source>
          , pages
          <fpage>1212</fpage>
          -
          <lpage>1215</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>Maja</given-names>
            <surname>Rudolph</surname>
          </string-name>
          and
          <string-name>
            <given-names>David</given-names>
            <surname>Blei</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Dynamic embeddings for language evolution</article-title>
          .
          <source>In Proceedings of the 2018 World Wide Web Conference</source>
          , pages
          <fpage>1003</fpage>
          -
          <lpage>1011</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>Pavel</given-names>
            <surname>Rychlý</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>A lexicographer-friendly association score</article-title>
          .
          <source>RASLAN 2008 Recent Advances in Slavonic Natural Language Processing, page 6.</source>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>Magnus</given-names>
            <surname>Sahlgren</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>An introduction to random indexing</article-title>
          .
          <source>In Methods and applications of semantic indexing workshop at the 7th international conference on terminology and knowledge engineering.</source>
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <given-names>Dominik</given-names>
            <surname>Schlechtweg</surname>
          </string-name>
          , Sabine Schulte im Walde, and
          <string-name>
            <given-names>Stefanie</given-names>
            <surname>Eckmann</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Diachronic usage relatedness (DURel): A framework for the annotation of lexical semantic change</article-title>
          .
          <source>In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>2</volume>
          (
          <issue>Short Papers</issue>
          ), pages
          <fpage>169</fpage>
          -
          <lpage>174</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <given-names>Dominik</given-names>
            <surname>Schlechtweg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Barbara</given-names>
            <surname>McGillivray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Simon</given-names>
            <surname>Hengchen</surname>
          </string-name>
          , Haim Dubossarsky, and
          <string-name>
            <given-names>Nina</given-names>
            <surname>Tahmasebi</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>SemEval-2020 Task 1: Unsupervised lexical semantic change detection</article-title>
          .
          <source>In Proceedings of the 14th International Workshop on Semantic Evaluation</source>
          .
          <publisher-name>Association for Computational Linguistics</publisher-name>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <given-names>Simone</given-names>
            <surname>Borsci</surname>
          </string-name>
          , Maurizio Boscarol,
          <string-name>
            <given-names>Alessandra</given-names>
            <surname>Cornero</surname>
          </string-name>
          , et al.
          <year>2015</year>
          .
          <source>Il Protocollo eGLU 2.1. Il Protocollo eGLU-M</source>
          .
          <article-title>Come realizzare test di usabilità semplificati per i siti web e i servizi online delle PA</article-title>
          .
          <article-title>Glossario dell'usabilità</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <given-names>Milan</given-names>
            <surname>Straka</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>UDPipe 2.0 prototype at CoNLL 2018 UD shared task</article-title>
          .
          <source>In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies</source>
          , pages
          <fpage>197</fpage>
          -
          <lpage>207</lpage>
          , Brussels, Belgium. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <given-names>Wayne A</given-names>
            <surname>Taylor</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Change-point analysis: a powerful new tool for detecting changes</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <given-names>Zijun</given-names>
            <surname>Yao</surname>
          </string-name>
          , Yifan Sun, Weicong Ding,
          <string-name>
            <given-names>Nikhil</given-names>
            <surname>Rao</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Hui</given-names>
            <surname>Xiong</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Dynamic word embeddings for evolving semantic discovery</article-title>
          .
          <source>In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining (WSDM 2018)</source>
          , pages
          <fpage>673</fpage>
          -
          <lpage>681</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>