<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Extracting Relations from Italian Wikipedia Using Self-Training</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lucia Siciliani</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pierluigi Cassotti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pierpaolo Basile</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco de Gemmis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pasquale Lops</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Semeraro</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dipartimento di Informatica, Università degli Studi di Bari Aldo Moro</institution>
          ,
          <addr-line>Bari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we describe a supervised approach for extracting relations from Wikipedia. In particular, we exploit a self-training strategy for enriching a small number of manually labeled triples with new self-labeled examples. We integrate the supervised stage in WikiOIE, an existing framework for unsupervised extraction of relations from Wikipedia. We rely on WikiOIE and its unsupervised pipeline for extracting the initial set of unlabelled triples. An evaluation involving different algorithms and parameters proves that self-training helps to improve performance. Finally, we provide a dataset of about three million triples extracted from the Italian version of Wikipedia and perform a preliminary evaluation conducted on a sample dataset, obtaining promising results.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>The goal of an Open Information Extraction (Open IE) system is to extract the relations occurring within a text written in natural language. Each relation is structured as a triple composed of three elements, i.e., (arg1; rel; arg2). More specifically, arg1 and arg2 can be nouns or phrases, while rel is a phrase that denotes the semantic relation between them. Open IE finds application in several NLP tasks, such as Question Answering, Knowledge Graph Acquisition, Knowledge Graph Completion, and Text Summarization. For this reason, Open IE is gaining ever-growing attention as a research topic.</p>
      <p>Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        Given the nature of the task, approaches for Open IE are deeply intertwined with the language of
the corpora to be analyzed. Due to the availability of English corpora, the majority of state-of-the-art works are specific to that language. As for the Italian language, the model proposed by Guarasci et al. (2020) relies on verbal behavior patterns based on Lexicon-Grammar features. In a previous work, we proposed WikiOIE
        <xref ref-type="bibr" rid="ref2">(Cassotti et al., 2021)</xref>
        , a framework in which Open IE methods for the Italian language can be easily developed, with the aim of encouraging researchers to conduct further work on under-represented languages as well. The first solutions developed in WikiOIE are unsupervised and rely solely on PoS-tag patterns and dependency relations. In Cassotti et al. (2021), the triples extracted by WikiOIE underwent an in-depth error analysis, which revealed syntactic errors, such as a missing subject or incomplete object information, and semantic errors, such as a generic subject or relation. In this work, we propose a supervised approach to automatically filter out the non-relevant triples provided by WikiOIE, together with a self-training strategy. Self-training
        <xref ref-type="bibr" rid="ref12">(Yarowsky, 1995)</xref>
        works iteratively: a classification model is trained on labeled data; the trained model is used to classify unlabeled data, producing pseudo-labels; and the classification model is then retrained on the labeled data together with the high-confidence pseudo-labels. Specifically, we manually annotate a small number of triples extracted by WikiOIE. Afterward, the annotated triples are augmented using self-training. Finally, the set of triples obtained through self-training is used to train a supervised model. The paper is structured as follows: after a brief overview of state-of-the-art methods for Open IE, Section 3 provides details about the self-training strategy and the supervised model behind our methodology. Section 4 reports the results of the evaluation, while Section 5 closes the paper.
      </p>
      <p>At first, the IE task was performed by extracting from the text relations that were defined a priori. However, the increasing amount of corpora available nowadays makes this process unfeasible, creating the need for novel solutions to tackle this problem.
      </p>
      <p>The Open IE task was defined by Etzioni et al. (2008). Three elements characterize this task: it is domain independent, meaning that the text from which relations must be extracted can concern any topic; the extraction must be unsupervised; and approaches to this task must take into account the amount of data available and must be scalable.</p>
      <p>Along with the definition of the new task, the authors proposed a model called TextRunner, whose approach is composed of three main modules. The first is a learner that exploits a parser to label the training data as trustworthy or not and then uses the extracted information to train a Naive Bayes classifier. Next, the extractor uses PoS-tag features to obtain a set of candidate tuples from the corpus, and only those labeled as trustworthy are kept. Finally, a module called the assessor assigns a probability score to the tuples extracted at the previous step, based on their number of occurrences in the corpus.</p>
      <p>
        The learning-based approach used in
TextRunner has also been applied by several other
systems like WOE
        <xref ref-type="bibr" rid="ref10">(Wu and Weld, 2010)</xref>
        , OLLIE
        <xref ref-type="bibr" rid="ref6">(Mausam et al., 2012)</xref>
        , and ReNoun
        <xref ref-type="bibr" rid="ref11">(Yahya et al.,
2014)</xref>
        . In particular, WOE exploits Wikipedia-based bootstrapping: the system extracts the sentences matching the attribute-value pairs available within the infoboxes of Wikipedia articles. This data is then used to build two versions of the system: the first is based on PoS-tags, regular expressions, and other shallow features of the sentence; the second is based on features of dependency-parse trees and obtains better results than the first, at the cost of lower speed.
      </p>
      <p>
        In recent works, OIE has been treated as a
sequence labeling task. In this setting, models
are trained to extract triple elements, i.e., subject,
predicate, and object using a modified BIO tag
schema
        <xref ref-type="bibr" rid="ref7">(Ratinov and Roth, 2009)</xref>
        that involves particular prefixes to represent the triple elements, i.e., A0, P, and A1. Hohenecker et al. (2020) provide an evaluation of different training strategies and different neural network architectures, such as bidirectional Long Short-Term Memory (BiLSTM), Convolutional Neural Networks (CNNs), and Transformers, improving the state of the art on the OIE16 benchmark
        <xref ref-type="bibr" rid="ref8">(Stanovsky and Dagan,
2016)</xref>
        which focuses on the English language.
      </p>
    </sec>
    <sec id="sec-2">
      <title>3 Methodology</title>
      <p>
        In this section, we describe our supervised approach based on self-training, which is integrated into the information extraction system called WikiOIE<sup>1</sup>. Before discussing the details of the supervised approach, it is necessary to recap how WikiOIE works. The input of the pipeline is the textual format of the Wikipedia dump obtained through the WikiExtractor tool<sup>2</sup>
        <xref ref-type="bibr" rid="ref1">(Attardi, 2015)</xref>
        .
The text is extracted from the Wikipedia dump and processed using the UDPipe tool
        <xref ref-type="bibr" rid="ref9">(Straka and Straková, 2017)</xref>
        . For this task, we use version 1 of UDPipe with version 2.5 of the ISDT-Italian model. We opt for UDPipe since it is trained using Universal Dependencies data for over 100 languages. In this way, our system can potentially be used on the Wikipedia dumps of several languages. WikiOIE directly calls the REST API provided by UDPipe, so that it is easy to change the endpoint and the model/language. Another advantage of using Universal Dependencies is the common tag-set defined for all the languages: PoS-tags<sup>3</sup> and syntactic dependencies<sup>4</sup> are annotated with shared sets of labels. This feature, too, allows the system to be independent of the language. The Wikipedia dump is read line by line. Each line contains a fragment (passage) of text that is processed using UDPipe. The output of this process is a set of sentences, each annotated with syntactic dependencies. Each sentence is transformed into a dependency graph that is the input of the Wiki Extractor module. This module extracts facts from the sentence in the form of triples (subject, predicate, object) and assigns a score.
      </p>
      <p>1 The code is available on GitHub: https://github.com/pippokill/WikiOIE.</p>
      <p>2 https://github.com/attardi/wikiextractor/wiki/File-Format</p>
      <p>3 https://universaldependencies.org/u/pos/</p>
      <p>4 https://universaldependencies.org/u/dep/</p>
      <p>As mentioned above, each sentence occurring in the text is annotated by UDPipe, which provides annotations following the CoNLL-U format<sup>5</sup>. As shown in Figure 1, each token in the sentence is denoted by an index (first column) corresponding to the token position in the sentence (starting from 1). The other columns store the features extracted by UDPipe, such as the token, the lemma, the universal PoS-tag, the head of the current word, and the universal dependency relation to the head. If the head of the current word is equal to 0, the token is the head of the whole sentence, and its universal dependency relation is root. Figure 1 also reports the dependency graph of the sentence, which is used by the Wiki Extractor module for extracting triples. We use an unsupervised pipeline based on both PoS-tags and dependencies to extract the first set of triples.</p>
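      <p>As a minimal sketch of how CoNLL-U output of this kind can be read, the Python snippet below parses the tab-separated columns described above and collects, for each token, the indices of its dependents. The sample sentence, its hand-written annotations, and the names Token and parse_conllu are illustrative assumptions and are not taken from WikiOIE.</p>
      <preformat>
```python
from typing import NamedTuple


class Token(NamedTuple):
    idx: int      # 1-based position in the sentence (ID column)
    form: str     # surface token (FORM column)
    lemma: str    # lemma (LEMMA column)
    upos: str     # universal PoS-tag (UPOS column)
    head: int     # index of the head token, 0 for the sentence root (HEAD column)
    deprel: str   # universal dependency relation to the head (DEPREL column)


def parse_conllu(block: str) -> list[Token]:
    """Parse one CoNLL-U sentence block into a list of tokens."""
    tokens = []
    for line in block.strip().splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (e.g. 3-4) and empty nodes (e.g. 5.1)
        tokens.append(Token(int(cols[0]), cols[1], cols[2], cols[3], int(cols[6]), cols[7]))
    return tokens


# Hypothetical sentence, annotated by hand for illustration only.
SAMPLE = "\n".join([
    "# text = Dante nacque a Firenze",
    "1\tDante\tDante\tPROPN\t_\t_\t2\tnsubj\t_\t_",
    "2\tnacque\tnascere\tVERB\t_\t_\t0\troot\t_\t_",
    "3\ta\ta\tADP\t_\t_\t4\tcase\t_\t_",
    "4\tFirenze\tFirenze\tPROPN\t_\t_\t2\tobl\t_\t_",
])

tokens = parse_conllu(SAMPLE)
root = next(t for t in tokens if t.head == 0)
# Map every token index to the indices of its direct dependents.
children = {t.idx: [d.idx for d in tokens if d.head == t.idx] for t in tokens}
print(root.form, root.deprel)  # nacque root
print(children[root.idx])      # [1, 4]
```
      </preformat>
      <p>The children mapping is one simple way to represent the dependency graph; an actual implementation may use a richer graph structure.</p>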
      <p>The first step of the extraction process consists of identifying sequences of PoS-tags that match verbs, as reported in Table 1. In Table 1, the first column reports the PoS-tag patterns, while the second one reports an example of pattern usage. The sentence shown in Figure 1 matches the last pattern (VERB, fondò).</p>
      <p>5 https://universaldependencies.org/format.html</p>
      <sec id="sec-2-1">
        <title>PoS-tag Pattern Example</title>
        <table-wrap>
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th>PoS-tag pattern</th><th>Example</th></tr>
            </thead>
            <tbody>
              <tr><td>AUX VERB ADP</td><td>... è nato nel ...</td></tr>
              <tr><td>AUX VERB</td><td>... è nato ...</td></tr>
              <tr><td>AUX=(essere, to be)</td><td>... è ...</td></tr>
              <tr><td>VERB ADP</td><td>... nacque nel ...</td></tr>
              <tr><td>VERB</td><td>... acquisì ...</td></tr>
            </tbody>
          </table>
        </table-wrap>
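        <p>The pattern matching of Table 1 can be sketched as follows. This is only an illustration under stated assumptions: trying longer patterns before shorter ones, scanning the sentence left to right, and the function names match_predicate and find_predicates are choices made for this sketch and are not documented behavior of WikiOIE.</p>
        <preformat>
```python
# Patterns from Table 1, ordered so that longer patterns are tried first
# (the ordering is an assumption of this sketch).
PATTERNS = [
    ("AUX", "VERB", "ADP"),
    ("AUX", "VERB"),
    ("VERB", "ADP"),
    ("AUX",),    # only valid when the auxiliary is "essere" (to be)
    ("VERB",),
]


def match_predicate(tags, lemmas, start):
    """Return the exclusive end index of the first pattern matching at start, or None."""
    for pattern in PATTERNS:
        end = start + len(pattern)
        if tuple(tags[start:end]) != pattern:
            continue
        if pattern == ("AUX",) and lemmas[start] != "essere":
            continue  # a lone auxiliary counts only if it is "essere"
        return end
    return None


def find_predicates(tags, lemmas):
    """Scan left to right and collect non-overlapping (start, end) predicate spans."""
    spans, i = [], 0
    while len(tags) > i:
        end = match_predicate(tags, lemmas, i)
        if end is None:
            i += 1
        else:
            spans.append((i, end))
            i = end
    return spans


# Hypothetical sentence "Dante è nato a Firenze", tagged by hand.
tags = ["PROPN", "AUX", "VERB", "ADP", "PROPN"]
lemmas = ["Dante", "essere", "nascere", "a", "Firenze"]
print(find_predicates(tags, lemmas))  # [(1, 4)]
```
        </preformat>
        <p>Indices in this sketch are 0-based list positions, so the span (1, 4) covers the tokens "è nato a".</p>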
        <p>When the information extraction algorithm finds a valid predicate pattern, it checks for a candidate subject and a candidate object for the predicate. A valid subject/object candidate must satisfy the following constraints:</p>
        <list list-type="order">
          <list-item><label>1.</label><p>the candidate must be composed of a sequence of tokens belonging to the following PoS-tags: noun, adjective, number, determiner, adposition, proper noun;</p></list-item>
          <list-item><label>2.</label><p>the sequence of tokens composing the candidate can contain only one determiner and/or one adposition.</p></list-item>
        </list>
        <p>The candidate subject must precede the verb, while the candidate object must follow the predicate pattern. For the sentence in Figure 1, the candidate subject is “Nakamura”, while the candidate object is “il quartier generale di il Kyokushin Karate”<sup>6</sup>.</p>
        <p>After identifying the candidate subject and object, the triple is accepted only if both the subject and the object have a syntactic relation with the verb. In particular, at least one of the tokens belonging to the subject/object must be in a dependency relation with a token of the verb pattern.</p>
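        <p>A minimal sketch of this acceptance check is given below. It assumes one possible reading of the constraint, namely that an argument is linked to the predicate when at least one of its tokens has its head inside the verb pattern. The heads mapping, the token indices, and the function names are hypothetical.</p>
        <preformat>
```python
def is_linked(span, verb_span, heads):
    """True if some token in span has its head inside verb_span.

    heads maps each 1-based token index to the index of its head (0 = root).
    """
    verb = set(verb_span)
    return any(heads[i] in verb for i in span)


def accept_triple(subject, predicate, obj, heads):
    """Accept the triple only if both arguments attach to the predicate."""
    return is_linked(subject, predicate, heads) and is_linked(obj, predicate, heads)


# Hypothetical parse of "Dante nacque a Firenze":
# Dante -> nacque (nsubj), nacque -> root, a -> Firenze (case), Firenze -> nacque (obl)
heads = {1: 2, 2: 0, 3: 4, 4: 2}
print(accept_triple([1], [2, 3], [4], heads))  # True

# Same spans, but the object now hangs off the subject instead of the verb.
detached = {1: 2, 2: 0, 3: 4, 4: 1}
print(accept_triple([1], [2, 3], [4], detached))  # False
```
        </preformat>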
        <p>More details about the unsupervised extraction of triples are reported in Cassotti et al. (2021).</p>
      </sec>
      <sec id="sec-2-2">
        <title>3.1 Self-Training</title>
        <p>
          Using the unsupervised approach, we obtain 3,562,803 triples. We randomly select a subset of 200 triples whose predicate occurs at least 20 times. Then, each triple is annotated by two experts as relevant (valid) or not-relevant. Details on this dataset and the results of the annotation process are reported in our previous work
          <xref ref-type="bibr" rid="ref2">(Cassotti et al., 2021)</xref>
          . For the self-training, we select only the triples on which the two experts agree. This yields a set of 137 triples that we call L.
        </p>
        <p>From the whole set of about 3.5M triples, we randomly select 1% of the unlabeled triples whose predicate occurs at least 20 times. This subset is denoted as U. The set L is split into two subsets: Lt for training and Lv for validation. In particular, Lt is used as the initial dataset for the self-training procedure, while Lv is used for setting the initial parameter values of the learning algorithm.</p>
        <p>As a preliminary step, we search for the best parameters using Lt for training and Lv for validating the performance. We use the macro-averaged F1 score since our dataset is highly unbalanced: 82% of the triples are labeled as relevant.</p>
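        <p>To illustrate why the macro-averaged F1 is a sensible choice on unbalanced data, the sketch below computes it from scratch on a toy label set. The data and the function names are invented for this example; a classifier that always answers "relevant" is 80% accurate on this toy set but obtains a much lower macro-averaged F1, because class 0 is never recognized.</p>
        <preformat>
```python
def f1_per_class(gold, pred, label):
    """Precision, recall and F1 for a single class label."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def macro_f1(gold, pred, labels=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_per_class(gold, pred, c)[2] for c in labels) / len(labels)


# Toy data: 8 relevant (1) and 2 not-relevant (0) triples.
gold = [1] * 8 + [0] * 2
always_relevant = [1] * 10  # trivial classifier that never predicts class 0

print(round(macro_f1(gold, always_relevant), 3))  # 0.444
```
        </preformat>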
        <p>6 It is important to note that UDPipe splits the articulated preposition “del” into “di:ADP” and “il:DET”.</p>
        <p>The self-training process works as follows:</p>
        <list list-type="order">
          <list-item><label>1.</label><p>from the set U, we randomly select p triples;</p></list-item>
          <list-item><label>2.</label><p>we train a supervised model using the labeled triples in Lt;</p></list-item>
          <list-item><label>3.</label><p>the p triples are labeled using the trained model, and a confidence score is assigned to each classified triple;</p></list-item>
          <list-item><label>5.</label><p>if U contains at least p triples, go to step 1; otherwise, stop. The self-training loop can also be terminated when a specific number of iterations is reached.</p></list-item>
        </list>
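        <p>The loop above can be sketched in Python as follows. Several points are assumptions of this sketch rather than details given in the text: the step between labeling and the stopping test is taken to be "add to Lt the pseudo-labeled triples whose confidence reaches a threshold", in line with the description of self-training in the introduction; selected triples are removed from U; low-confidence triples are simply discarded; and fit_nearest_mean is a toy stand-in for a real classifier, used only to keep the example self-contained.</p>
        <preformat>
```python
import random


def self_train(labeled, unlabeled, fit, p, threshold, max_iter, seed=0):
    """Grow the labeled set with confident pseudo-labels.

    labeled   -- list of (example, label) pairs, the initial Lt
    unlabeled -- list of examples, the pool U
    fit       -- callable that trains on labeled pairs and returns a function
                 mapping an example to a (label, confidence) pair
    """
    rng = random.Random(seed)
    labeled, pool = list(labeled), list(unlabeled)
    iteration = 0
    while len(pool) >= p and max_iter > iteration:
        rng.shuffle(pool)
        batch, pool = pool[:p], pool[p:]           # step 1: draw p triples from U
        predict = fit(labeled)                     # step 2: train on Lt
        scored = [(x, predict(x)) for x in batch]  # step 3: label and score
        # Assumed step: keep only pseudo-labels at or above the threshold.
        labeled += [(x, y) for x, (y, conf) in scored if conf >= threshold]
        iteration += 1                             # step 5: loop again or stop
    return labeled


def fit_nearest_mean(pairs):
    """Toy classifier on numbers: predict the class with the nearest mean."""
    means = {}
    for label in {y for _, y in pairs}:
        values = [x for x, y in pairs if y == label]
        means[label] = sum(values) / len(values)

    def predict(x):
        dist = {label: abs(x - m) for label, m in means.items()}
        best = min(dist, key=dist.get)
        total = sum(dist.values())
        conf = 1.0 if total == 0 else 1.0 - dist[best] / total
        return best, conf

    return predict


seed_set = [(0.0, 0), (1.0, 0), (9.0, 1), (10.0, 1)]
pool = [0.5, 9.5, 5.0, 0.2, 9.8, 4.9]
grown = self_train(seed_set, pool, fit_nearest_mean, p=3, threshold=0.85, max_iter=20)
print(len(grown))
```
        </preformat>
        <p>In this toy run, the ambiguous values 5.0 and 4.9 receive a confidence close to 0.5 and are never added, while the values near the two class means are absorbed into the training set.</p>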
        <p>The resulting set of labeled triples Lt is used to
train the final model, which is employed to
classify all the triples extracted using the unsupervised
approach.</p>
        <p>More details about both the parameter values and the training algorithm are reported in Section 4.</p>
      </sec>
      <sec id="sec-2-3">
        <title>3.2 Supervised Approach</title>
        <p>For both the self-training and the classification of triples, we exploit algorithms provided by LibLinear<sup>7</sup>. In particular, we use both logistic regression and support vector classification: the former can provide a confidence score, while the latter cannot.</p>
        <p>The set of features is selected by taking into account the supervised approaches already developed for English. In particular, we use:</p>
        <list list-type="bullet">
          <list-item><p>the PoS-tags occurring in the subject, object, and predicate;</p></list-item>
          <list-item><p>the sequence of PoS-tags that compose the predicate. This feature is also computed for both the subject and the object;</p></list-item>
          <list-item><p>the n-gram that composes the predicate;</p></list-item>
          <list-item><p>the set of dependencies that link the subject to the predicate;</p></list-item>
          <list-item><p>the set of dependencies that link the object to the predicate.</p></list-item>
        </list>
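        <p>As a rough illustration of how the features listed above could be turned into sparse, string-valued features for a linear classifier, consider the sketch below. The dictionary layout of a triple, the feature naming scheme, and the function name extract_features are hypothetical and chosen only for this example.</p>
        <preformat>
```python
def extract_features(triple):
    """Map a triple to a set of string-valued (one-hot style) features.

    triple is a dict with keys "subj", "pred" and "obj", each holding a list of
    (form, upos) tokens, plus "subj_deps" and "obj_deps" with the dependency
    labels linking each argument to the predicate. This layout is hypothetical.
    """
    feats = set()
    for part in ("subj", "pred", "obj"):
        tags = [upos for _, upos in triple[part]]
        feats.update(f"{part}_pos={t}" for t in tags)    # PoS-tags occurring in the part
        feats.add(f"{part}_pos_seq={'_'.join(tags)}")    # full PoS-tag sequence
    words = [form for form, _ in triple["pred"]]
    feats.add("pred_ngram=" + "_".join(words))           # n-gram composing the predicate
    feats.update(f"subj_dep={d}" for d in triple["subj_deps"])
    feats.update(f"obj_dep={d}" for d in triple["obj_deps"])
    return feats


# Hypothetical triple for "Dante nacque a Firenze".
example = {
    "subj": [("Dante", "PROPN")],
    "pred": [("nacque", "VERB"), ("a", "ADP")],
    "obj": [("Firenze", "PROPN")],
    "subj_deps": ["nsubj"],
    "obj_deps": ["obl"],
}
features = extract_features(example)
print(sorted(features))
```
        </preformat>
        <p>Each distinct feature string would then be mapped to an integer index to obtain the sparse vectors expected by a linear model.</p>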
        <p>The C value of the learning algorithm is
determined by performing a grid search using Lt for
training and Lv for validating. Due to the small
size of the original set L, we perform a 50/50 split.
More details are reported in Section 4.</p>
        <p>7 https://www.csie.ntu.edu.tw/~cjlin/liblinear/</p>
      </sec>
      <sec id="sec-2-4">
        <title>Method</title>
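        <table-wrap>
          <label>Table 2</label>
          <table>
            <thead>
              <tr><th>Method</th><th>C</th><th>P<sub>0</sub></th><th>R<sub>0</sub></th><th>F1<sub>0</sub></th></tr>
            </thead>
            <tbody>
              <tr><td>Slog</td><td>10</td><td>.54</td><td>.58</td><td>.56</td></tr>
              <tr><td>Ssvc</td><td>8</td><td>.60</td><td>.75</td><td>.66</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>The goal of the evaluation is twofold: 1) to measure the performance and the contribution of the self-training; 2) to evaluate the quality of the extracted triples. For the first goal, we evaluate how the new instances added to the initial training set affect the performance. For the second goal, we manually annotate a small subset of the extracted triples in order to evaluate their quality.</p>
      </sec>
      <sec id="sec-2-5">
        <title>4.1 Evaluate Self-Training</title>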
        <p>The first step is to determine the best parameters for the learning algorithm. We use two algorithms: L2-regularized logistic regression (Slog) and L2-regularized L2-loss support vector classification (Ssvc). For both algorithms, we perform a grid search to select the best value of the parameter C. The results of the grid search are reported in Table 2, where we report the best C value for each approach. We denote with 0 the class of not-relevant triples, while 1 denotes the relevant ones. F1 refers to the macro-averaged F1. The results show that the classifiers perform poorly in recognizing class 0, since the dataset is both small and unbalanced.</p>
        <p>We perform two self-training runs (one for each learning algorithm) using p = 1,000 and a maximum of 20 iterations. For the logistic regression, we set the confidence threshold to 0.85. After the self-training step, we obtain a new training set that contains new instances. For each learning approach, Table 3 reports the size of the new training set and the performance computed on the validation set. Moreover, the last column reports the increment of F1 with respect to the performance obtained before the self-training.</p>
        <p>Experiments using self-training show that Slog is able to improve its performance (+19%), while self-training has a negative impact on the performance of Ssvc (-18%). This is probably because, when Ssvc is involved, it is not possible to set a threshold for selecting well-classified instances during the self-training. After observing the overall performance in both Tables 2 and 3, we select as the training set for extracting triples the one obtained by Slog during the self-training. Slog is also able both to exceed the performance of Ssvc obtained without self-training and to achieve an improvement in F1<sub>0</sub>.</p>
        <p>After the extraction and classification process, we obtain 2,974,374 triples<sup>8</sup>, as reported in Table 4. The original set of triples extracted by the unsupervised approach contained 3,562,803 triples; this means that 16.52% of the unsupervised triples were classified as not valid. Table 4 also reports the number of distinct subjects, objects, and predicates for both the unsupervised and supervised datasets. The supervised dataset is released in the same JSON format described in Cassotti et al. (2021).</p>
      </sec>
      <sec id="sec-2-6">
        <title>4.2 Evaluate Triples</title>
        <p>8 The triples are available on Zenodo: https://zenodo.org/record/5655028. The triples obtained by the unsupervised approach are available here: https://doi.org/10.5281/zenodo.5498034.</p>
        <table-wrap>
          <label>Table 4</label>
          <table>
            <thead>
              <tr><th>Dataset</th><th>#triples</th><th>#dist. subj</th><th>#dist. pred</th><th>#dist. obj</th></tr>
            </thead>
            <tbody>
              <tr><td>unsupervised</td><td>3,562,803</td><td>1,298,481</td><td>269,551</td><td>2,030,742</td></tr>
              <tr><td>supervised (Slog)</td><td>2,974,374</td><td>1,189,648</td><td>241,053</td><td>1,720,348</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <table-wrap>
          <label>Table 5</label>
          <table>
            <thead>
              <tr><th>Dataset</th><th>#valid (exp 1)</th><th>#ratio (exp 1)</th><th>#valid (exp 2)</th><th>#ratio (exp 2)</th></tr>
            </thead>
            <tbody>
              <tr><td>unsupervised</td><td>115</td><td>0.64</td><td>161</td><td>0.81</td></tr>
              <tr><td>supervised (Slog)</td><td>158</td><td>0.79</td><td>163</td><td>0.82</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>For the evaluation, we follow the same methodology proposed in Cassotti et al. (2021). In particular, we sample a subset of 200 triples from the final set of classified triples. The selected triples are those whose predicate occurs at least 20 times. Then, each triple is annotated by two experts as relevant (valid) or not-relevant. We use Cohen’s Kappa coefficient (K) to measure the pairwise agreement between the two experts. K is a more robust measure than a simple percent agreement calculation, since it takes into account the agreement occurring by chance. Higher values of K correspond to higher inter-rater reliability.</p>
        <p>
          The Open IE task lacks a formal definition of triple relevance; thus, for the annotation process, we adopt the concept of triple relevance reported in
          <xref ref-type="bibr" rid="ref8">(Stanovsky and Dagan, 2016)</xref>
          , which is based on assertiveness, minimalism, and completeness. This ensures that the extracted triples still enclose the semantics of the original sentence (assertiveness) and that each element of the triple is as compact as possible, without any unnecessary information (minimalism). In our evaluation, we decide to give less weight to minimalism and focus more on the completeness of the extraction. After the annotation, we compute the ratio of relevant triples (column #ratio in Table 5) for each dataset and expert. Specifically, the ratio is computed by dividing the number of triples annotated as relevant by the number of sampled triples.
        </p>
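        <p>The two quantities used in this evaluation, the ratio of relevant triples and Cohen's Kappa, can be computed as in the sketch below. The annotation vectors are invented toy data, not the experts' actual labels, and the function names are illustrative; only the final division reuses figures from Table 5 (158 relevant triples out of 200 sampled).</p>
        <preformat>
```python
def relevance_ratio(labels):
    """Fraction of sampled triples annotated as relevant (label 1)."""
    return sum(labels) / len(labels)


def cohen_kappa(a, b):
    """Cohen's Kappa for two annotators labeling the same items."""
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    # Chance agreement: product of the annotators' marginal label frequencies.
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in set(a) | set(b))
    if expected == 1.0:
        return 1.0  # both annotators used one identical label; convention of this sketch
    return (observed - expected) / (1 - expected)


# Toy annotations for 10 triples (1 = relevant, 0 = not-relevant).
expert_1 = [1, 1, 1, 1, 0, 0, 1, 1, 0, 1]
expert_2 = [1, 1, 1, 0, 0, 0, 1, 1, 1, 1]

print(relevance_ratio(expert_1))                   # 0.7
print(round(cohen_kappa(expert_1, expert_2), 3))   # 0.524
print(158 / 200)                                   # 0.79
```
        </preformat>
        <p>In the toy data the two experts agree on 8 out of 10 triples, yet K is only about 0.52, because a large share of that agreement is expected by chance when most labels are "relevant".</p>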
        <p>The results of the evaluation are reported in Table 5, which also includes the previous results on the set of unsupervised triples. It is important to highlight that the two datasets are not directly comparable, since they are composed of different triples.</p>
        <p>In particular, a small subset of the unsupervised dataset is used to train the supervised one, as explained in Section 3. Cohen’s Kappa coefficient for each dataset is provided in the last column of Table 5.</p>
        <p>We obtain a good result in terms of number of
valid triples. In particular, the supervised model
provides a set of triples that improve the
agreement between annotators. The supervised
approach removes noisy and ambiguous triples since
the initial subset Lt used for self-training contains
only triples for which the annotators agree.</p>
        <p>In this task, it is not always possible to compute standard metrics such as recall, since the “open” nature of the task makes it hard to determine the total number of valid triples. As future work, we plan to extend the number of manually annotated triples in order to perform a more rigorous evaluation and comparison of different information extraction methods for Italian.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5 Conclusions and Future Work</title>
      <p>We propose a self-training strategy for
implementing a supervised open information extraction
system for the Italian version of Wikipedia. Our
approach exploits a small set of manually
labeled triples for expanding the training set. We
integrate this system into WikiOIE, which is a
framework for open information extraction on
Wikipedia dumps. WikiOIE exploits UDPipe as
a tool for processing and annotating the text and
can be extended by adding several information
extraction approaches.</p>
      <p>We perform an extensive evaluation to measure the impact of self-training on the overall classification performance. The results show that self-training is able to improve the classification performance and helps to identify not-relevant triples.</p>
      <p>Finally, we sample a subset of the extracted triples, which is evaluated by two experts. The number of relevant triples increases when the self-training strategy is used, and the agreement between annotators improves as well.</p>
      <p>As future work, we plan to extend the
evaluation to a larger scale study, exploit several
learning algorithms, and explore the application of the
approach to other languages.</p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgments</title>
      <p>This research was partially funded by the
INTERREG-MEDITERRANEAN Social and
Creative project, priority axis 1: Promoting
Mediterranean innovation capacities to develop
smart and sustainable growth, Programme specific
objective 1.1 To increase transnational activity of
innovative clusters and networks of key sectors of
the MED area (2019-2022).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Giuseppe</given-names>
            <surname>Attardi</surname>
          </string-name>
          .
          <year>2015</year>
          . Wikiextractor. https://github.com/attardi/wikiextractor.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Pierluigi</given-names>
            <surname>Cassotti</surname>
          </string-name>
          , Lucia Siciliani, Pierpaolo Basile, Marco de Gemmis, and
          <string-name>
            <given-names>Pasquale</given-names>
            <surname>Lops</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>Extracting Relations from Italian Wikipedia using Unsupervised Information Extraction</article-title>
          . In Vito Walter Anelli, Tommaso Di Noia, Nicola Ferro, and Fedelucio Narducci, editors,
          <source>Proceedings of the 11th Italian Information Retrieval Workshop 2021 (IIR 2021)</source>
          . CEUR-WS. http://ceur-ws.org/Vol-2947/paper2.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Oren</given-names>
            <surname>Etzioni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Michele</given-names>
            <surname>Banko</surname>
          </string-name>
          , Stephen Soderland, and
          <string-name>
            <given-names>Daniel S.</given-names>
            <surname>Weld</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Open information extraction from the web</article-title>
          .
          <source>Commun. ACM</source>
          ,
          <volume>51</volume>
          (
          <issue>12</issue>
          ):
          <fpage>68</fpage>
          -
          <lpage>74</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Raffaele</given-names>
            <surname>Guarasci</surname>
          </string-name>
          , Emanuele Damiano, Aniello Minutolo, Massimo Esposito, and Giuseppe De Pietro.
          <year>2020</year>
          .
          <article-title>Lexicon-Grammar based open information extraction from natural language sentences in Italian</article-title>
          .
          <source>Expert Syst. Appl.</source>
          ,
          <volume>143</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Hohenecker</surname>
          </string-name>
          , Frank Mtumbuka, Vid Kocijan, and
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Lukasiewicz</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Systematic comparison of neural architectures and training approaches for open information extraction</article-title>
          .
          <source>In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          , pages
          <fpage>8554</fpage>
          -
          <lpage>8565</lpage>
          , Online, November. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Mausam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Schmitz</surname>
          </string-name>
          , Stephen Soderland, Robert Bart, and
          <string-name>
            <given-names>Oren</given-names>
            <surname>Etzioni</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Open Language Learning for Information Extraction</article-title>
          . In Jun'ichi Tsujii, James Henderson, and Marius Pasca, editors,
          <source>Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, EMNLP-CoNLL</source>
          , pages
          <fpage>523</fpage>
          -
          <lpage>534</lpage>
          , Jeju Island, Korea, 7. ACL.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Lev</given-names>
            <surname>Ratinov</surname>
          </string-name>
          and
          <string-name>
            <given-names>Dan</given-names>
            <surname>Roth</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Design challenges and misconceptions in named entity recognition</article-title>
          .
          <source>In Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL-2009)</source>
          , pages
          <fpage>147</fpage>
          -
          <lpage>155</lpage>
          , Boulder, Colorado, June. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Gabriel</given-names>
            <surname>Stanovsky</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ido</given-names>
            <surname>Dagan</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Creating a Large Benchmark for Open Information Extraction</article-title>
          . In Jian Su, Xavier Carreras, and Kevin Duh, editors,
          <source>Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016</source>
          , pages
          <fpage>2300</fpage>
          -
          <lpage>2305</lpage>
          , Austin, Texas, USA,
          <volume>11</volume>
          . The Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Milan</given-names>
            <surname>Straka</surname>
          </string-name>
          and Jana Straková.
          <year>2017</year>
          .
          <article-title>Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe</article-title>
          . In Jan Hajic and Dan Zeman, editors,
          <source>Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies</source>
          , pages
          <fpage>88</fpage>
          -
          <lpage>99</lpage>
          , Vancouver, Canada, 8. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Fei</given-names>
            <surname>Wu</surname>
          </string-name>
          and
          <string-name>
            <given-names>Daniel S.</given-names>
            <surname>Weld</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Open Information Extraction Using Wikipedia</article-title>
          . In Jan Hajic, Sandra Carberry, and Stephen Clark, editors,
          <source>ACL 2010, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics</source>
          , pages
          <fpage>118</fpage>
          -
          <lpage>127</lpage>
          , Uppsala, Sweden, 7. The Association for Computer Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Mohamed</given-names>
            <surname>Yahya</surname>
          </string-name>
          , Steven Whang, Rahul Gupta, and
          <string-name>
            <given-names>Alon Y.</given-names>
            <surname>Halevy</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>ReNoun: Fact Extraction for Nominal Attributes</article-title>
          . In Alessandro Moschitti, Bo Pang, and Walter Daelemans, editors,
          <source>Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing</source>
          , EMNLP, pages
          <fpage>325</fpage>
          -
          <lpage>335</lpage>
          , Doha, Qatar, 10. ACL.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>David</given-names>
            <surname>Yarowsky</surname>
          </string-name>
          .
          <year>1995</year>
          .
          <article-title>Unsupervised word sense disambiguation rivaling supervised methods</article-title>
          .
          <source>In 33rd Annual Meeting of the Association for Computational Linguistics</source>
          , pages
          <fpage>189</fpage>
          -
          <lpage>196</lpage>
          , Cambridge, Massachusetts, USA, June. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>