<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Leveraging Wikipedia for Ontology Pattern Population from Text</article-title>
      </title-group>
      <contrib-group>
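        <contrib contrib-type="author">
          <string-name>Michelle Cheatham</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>James Lambert</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>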
        <contrib contrib-type="author">
          <string-name>Charles Vardeman II</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Notre Dame, 111C Information Technology Center</institution>
          ,
          <addr-line>Notre Dame, IN 46556</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Wright State University</institution>
          ,
          <addr-line>3640 Colonel Glenn Hwy., Dayton, OH 45435</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <permissions>
        <copyright-statement>Copyright held by the author(s). In A. Martin, K. Hinkelmann, A. Gerber, D. Lenat, F. van Harmelen, P. Clark (Eds.), Proceedings of the AAAI 2019 Spring Symposium on Combining Machine Learning with Knowledge Engineering (AAAI-MAKE 2019). Stanford University, Palo Alto, California, USA, March 25-27, 2019.</copyright-statement>
      </permissions>
      <abstract>
        <p>Traditional approaches to populating ontology design patterns from unstructured text often involve using a dictionary, rules, or machine learning approaches that have been established based on a training set of annotated documents. While these approaches are quite effective in many cases, performance can suffer over time as the nature of the text documents changes to reflect advances in the domain of interest. This is particularly true when attempting to populate patterns related to fast-changing domains such as technology, medicine, or law. This paper explores the use of Wikipedia as a source of continually updated background knowledge to facilitate ontology pattern population as the domain changes over time.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Two of the central underpinnings of scientific inquiry are
the need to verify results through reproduction of the
experiments involved and the importance of “building on the
shoulders of giants.” In order for these things to be
possible, experimental results need to be both discoverable and
reproducible. Important steps have been made recently in
pursuit of this, including the relaxation of page restrictions
on “methodology” sections in many academic journals and
the requirements by some funding agencies that
investigators make any data they collect publicly available. However,
in order for previous work to be truly verifiable and reusable,
researchers must be able to not only access the results of
those efforts but also to understand the context in which they
were created. A key element of this is the need to preserve
the underlying computations and analytical process that led
to prior results in a generic machine-readable format.</p>
      <p>
        In previous work towards this goal, we developed an
ontology design pattern (ODP) to represent the
computational environment in which an analysis was performed. This
model is briefly described in the Computational Environment
Representation section of this work, and more
detail is available in
        <xref ref-type="bibr" rid="ref3">(Cheatham et al. 2017)</xref>
        . This paper
describes our work on the next step: development of an
automated approach to populate the ODP based on data
extracted from academic articles. We explore the performance
of two common approaches to this task and show that, due to
the fast-changing nature of computer technology, they lose
their effectiveness over time. We then evaluate the utility of
using a continuously updated, manually curated knowledge base to
mitigate this performance degradation. The results illustrate
that this method holds some promise.
      </p>
      <p>The remainder of this paper is organized as follows.
The section Computational Environment Representation
presents the schema of the ontology we seek to populate,
while the Dataset section describes the collection of
academic articles we use as our training and test sets. The
approach and results are presented and analyzed next, and
finally some conclusions and ideas for future work in this area
are discussed.</p>
    </sec>
    <sec id="sec-2">
      <title>Computational Environment Representation</title>
      <p>The Computational Environment ODP was developed over
the course of several working sessions by a group of
ontological modeling experts, library scientists, and domain
scientists from different fields, including computational
chemists and high-energy physicists interested in preserving
analysis of data collected from the Large Hadron Collider
at CERN. Our goal was to arrive at an ontology design
pattern that is capable of answering the following competency
questions:</p>
      <list list-type="bullet">
        <list-item><p>What environment do I need to put in place in order to replicate the work in Paper X?</p></list-item>
        <list-item><p>There has been an error found in Script Y. Which analyses need to be re-run?</p></list-item>
        <list-item><p>Based on recent research in Field Z, what tools and resources should new students work to become familiar with?</p></list-item>
        <list-item><p>Are the results from Study A and Study B comparable from a computational environment perspective?</p></list-item>
      </list>
      <p>We focused on creating a model to capture the actual
environment present during a computational analysis.
Representing all possible environments in which it is feasible for
the analysis to be executed is outside of the scope of our
current effort. We also do not include the runtime
configuration and parameters as part of the environment. The
rationale is that to some extent the same environment should
be applicable to many computational analyses in the same
field of study, but this would not be true if we included such
analysis-specific information as runtime parameters as part
of the environment. Data sources were considered outside
of the confines of the computational environment for similar
reasons. External web services and similar resources were
not included because they are not inter-related in the same
way as the environmental elements are. For instance,
deciding to use a different operating system often necessitates
using a different version of drivers, libraries, and software
applications, whereas in most cases the entire hardware
configuration could be changed with no impact on external
services.</p>
      <p>
        Figure 1 shows the schema that we targeted for population
in this study. It is a slightly modified version of the ODP
presented in
        <xref ref-type="bibr" rid="ref3">(Cheatham et al. 2017)</xref>
        – the overall goal and
competencies of the pattern remain the same, but some entities
have been omitted, and properties related to the
manufacturer, make and model of computers and hardware
components have been added, as well as an entity related to
programming language, because this information is important
for our current application goals.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Dataset</title>
      <p>The dataset consists of 100 academic articles published in
2012 or later (called “current”)<sup>1</sup> and 20 published in or
before 2002 (called “old”). There are 20 current papers and
four old ones from each of five fields of study: biology,
chemistry, engineering, mathematics, and physics. Twenty
of the current articles were randomly selected to make up
the current training set while the remainder form the test set.
The documents were collected by searching Google Scholar
using the query &lt;field of study&gt; AND algorithm (e.g.
biology and algorithm, chemistry and algorithm). Patents and
citations were omitted from the search criteria. Any paper
with a PDF download link was searched for the terms cpu
and processor, in order to quickly determine if the paper had
any content related to a computational environment. If the
paper contained either term, it was retained in the dataset.</p>
      <p>
        We then manually created a gold standard for the dataset
by going through each paper and listing any information
relevant to the ODP shown in Figure 1. This process was
completed by a single person, but in the few cases in which a
question arose (e.g. whether an R4000 is a make or a model),
the opinions of others and external knowledge sources such
as manufacturers’ websites and websites about historical
computing technologies were consulted. A typical entry in
the gold standard is shown in Listing 1. Note that the
entries given do not directly correspond to entities in the
ontology. For example, the single tag memory size unit: GB
corresponds to an instance of the Memory class in the
ontology with a hasSize of some Amount that in turn hasUnit
GB. Tagging based on each entity within the ontology,
including those such as Memory that represent blank nodes,
would have made the tagging process more arduous,
however, and the information can be readily expanded to match
the desired schema using SPARQL construct queries, as
described in
        <xref ref-type="bibr" rid="ref17">(Zhou et al. 2018)</xref>
        .
        <fn id="fn1"><label>1</label><p>Three articles from the “current” set had to be discarded at a
later stage due to irregularities that made them unparseable
by multiple PDF parsers.</p></fn>
      </p>
      <p>Listing 1: Sample entry from the gold standard</p>
      <preformat>bio 15
cpu make: HT Xeon
cpu frequency value: 3.2
cpu frequency unit: GHz
cpu num cores: 16
memory type: RAM
memory size value: 8
memory size unit: GB
programming language: C/C++
os kernel name: Linux
os distribution name: Red Hat</preformat>
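      <p>The expansion from flat tags to the structure of the ontology can be illustrated with a small sketch. The paper performs this step with SPARQL construct queries; the Python below is only a stand-in that emits plain triples. Memory, Amount, hasSize and hasUnit come from the text above, while the ex: prefix, the hasMemory and hasValue properties, and the node naming scheme are assumptions made purely for illustration.</p>
      <preformat>
```python
# Hypothetical namespace prefix; the real ODP IRIs are not given in the text.
EX = "ex:"


def parse_entry(text):
    """Parse one gold-standard entry ('tag: value' lines) into a dict."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    entry = {"article": lines[0]}  # first line is the article identifier
    for line in lines[1:]:
        tag, _, value = line.partition(":")
        entry[tag.strip()] = value.strip()
    return entry


def expand_memory(entry):
    """Expand the flat memory tags into Memory and Amount triples."""
    env = EX + "env_" + entry["article"].replace(" ", "_")
    memory = env + "_memory"
    amount = memory + "_size"
    triples = [
        (env, EX + "hasMemory", memory),  # assumed property name
        (memory, "rdf:type", EX + "Memory"),
        (memory, EX + "hasSize", amount),
        (amount, "rdf:type", EX + "Amount"),
    ]
    if "memory size value" in entry:
        # hasValue is an assumed property name
        triples.append((amount, EX + "hasValue", entry["memory size value"]))
    if "memory size unit" in entry:
        triples.append((amount, EX + "hasUnit", entry["memory size unit"]))
    return triples


SAMPLE = """
bio 15
memory type: RAM
memory size value: 8
memory size unit: GB
"""

if __name__ == "__main__":
    for triple in expand_memory(parse_entry(SAMPLE)):
        print(triple)
```
      </preformat>
      <p>Keeping the gold standard flat and deferring this expansion means the annotator never has to tag intermediate nodes such as Memory or Amount by hand.</p>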
      <p>
        A preliminary analysis of the training and test datasets
indicates that approaches trained on the older articles may face
difficulties. For example, Figure 2 shows the number of
distinct values for each tag in the test set (left), current training
set (middle) and old training set (right). From this we can
see that the relative number of distinct values for each tag is
about the same in the test set and the current training set (i.e.
the tags with the largest number of distinct values in the test
set are also those with the largest number of distinct values
in the training set). This is less true for the old training set
– for instance, the older articles never talk about the
number of CPU cores (presumably because they were all single
core) and the only programming language they mention is
Fortran. It therefore may be difficult for models trained on
the old training set to correctly extract information relevant
to these tags in the test set. New entities becoming relevant
over time are sometimes termed “emergent entities” and are
particularly challenging for many NER systems
        <xref ref-type="bibr" rid="ref14">(Nakashole,
Tylenda, and Weikum 2013)</xref>
        .
      </p>
      <p>Figure 3 shows the number of times each tag was used
in the test set and both training sets. It is evident that the
older articles talk more at the level of computer
manufacturer, make and model, while more current articles mention
more details such as information about the CPU, the
graphics card, and the amount of memory. Looking at both
figures (2 and 3), we see that some tags, such as CPU
manufacturer, are key for this ontology population task, because
there are only a few distinct values that must be recognized,
but these few values occur many times within the test set.
Models trained on the old training set may be at a particular
disadvantage in these cases, if the values for those key tags
in the older documents are not reflective of those in the test
set.</p>
    </sec>
    <sec id="sec-4">
      <title>Approach and Results</title>
      <p>
        In this work we formulate the problem of populating the
computational environment pattern from text solely as a
Named Entity Recognition (NER) task. This is in
contrast to many other approaches that divide this process into
two steps: entity recognition/typing and relation extraction
        <xref ref-type="bibr" rid="ref16">(Petasis et al. 2011)</xref>
        . In other words, we are trying to directly
arrive at the tags specified in the gold standard as described
in the Dataset section, with the intention of later creating
SPARQL construct queries to expand these tags into the
terminology used by the ODP. If more than one computational
environment is described in an article, the appropriate
relations between instances can then be determined based on
proximity in the underlying text.
      </p>
      <p>[Figures 2 and 3. Panel labels in each figure: Test set, Current, Old.]</p>
      <sec id="sec-4-1">
        <title>Preprocessing</title>
        <p>
          In academic articles, the computational environment tends
to be discussed in a small number of isolated points within
a document, often a single paragraph, sentence, footnote,
or caption. Because some of the techniques we employ in
our approach, particularly the use of Wikipedia, are
time-intensive, we begin our ontology population task by
attempting to identify the key portions of each document (which we
generically refer to as the “key paragraph”, with the
understanding that it may actually be a footnote, caption or other
element). Our goal in doing this was to arrive at an
ontology-agnostic approach with high recall. The approach we take
is quite basic: a Python script is given the ontology (in the
form of an OWL file) and the name of the academic article
in which to identify the key paragraphs. The script parses
the names of all entities from the ontology, splits the entire
content of the academic article (including footnotes, captions, etc.)
into paragraphs, and counts the number of times any
ontology term appears in each paragraph. Any paragraph with a
count within 90 percent of the maximum count for that
article is considered a key paragraph. This approach produces a
recall of .94 and a precision of .99, for an F-measure of .96.
The remainder of the discussion in this section assumes that
the key paragraphs have been successfully identified prior to
invoking the approach under consideration. The text in each
key paragraph is split into sentences, tokenized, lemmatized,
and tagged with parts of speech using the Stanford
NLP pipeline
          <xref ref-type="bibr" rid="ref11 ref2">(Manning et al. 2014)</xref>
          .
        </p>
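        <p>The key-paragraph selection described above can be sketched as follows. This is a minimal illustration and not the script used in the study: the ontology terms are passed in as a plain list rather than parsed from an OWL file, matching is case-insensitive on whole words, and "within 90 percent of the maximum" is interpreted as a count of at least 0.9 times the maximum. All three choices are assumptions.</p>
        <preformat>
```python
import re


def count_terms(paragraph, terms):
    """Count whole-word occurrences of any ontology term in a paragraph."""
    text = paragraph.lower()
    total = 0
    for term in terms:
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        total += len(re.findall(pattern, text))
    return total


def key_paragraphs(paragraphs, terms, ratio=0.9):
    """Return paragraphs whose term count is at least ratio * the maximum count."""
    counts = [count_terms(p, terms) for p in paragraphs]
    best = max(counts, default=0)
    if best == 0:
        return []  # no ontology term appears anywhere in the article
    return [p for p, c in zip(paragraphs, counts) if c >= ratio * best]


# Hypothetical ontology terms and article text, for illustration only.
TERMS = ["cpu", "memory", "operating system"]
PARAGRAPHS = [
    "We propose a new alignment algorithm for protein sequences.",
    "All experiments ran on a CPU with 8 GB of memory under a Linux operating system.",
    "The memory footprint of the algorithm is small.",
]

if __name__ == "__main__":
    for paragraph in key_paragraphs(PARAGRAPHS, TERMS):
        print(paragraph)
```
        </preformat>
        <p>Because the threshold is relative to the best paragraph in each article, the same setting can be reused across articles that mention the computational environment in very different amounts of detail.</p>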
      </sec>
      <sec id="sec-4-2">
        <title>Machine Learning</title>
        <p>
          NER techniques generally fall into two categories:
machine learning-based and rules-based
          <xref ref-type="bibr" rid="ref13 ref6">(Nadeau and Sekine
2007)</xref>
          . Machine learning-based NER systems typically
involve training a classifier by manually tagging input
documents. The classifier then attempts to learn how to correctly
recognize and type entities from new documents based on
the training set and a set of features such as a word’s part
of speech, position in a document, prefix/suffix, frequency,
etc. Various approaches have been employed to model the
relationship between the features and the entities, including
Support Vector Machines
          <xref ref-type="bibr" rid="ref4">(Isozaki and Kazawa 2002)</xref>
          ,
Maximum Entropy
          <xref ref-type="bibr" rid="ref4">(Chieu and Ng 2002)</xref>
          , and neural networks
          <xref ref-type="bibr" rid="ref10">(Lample et al. 2016)</xref>
          . Because the task of manually
generating training data is onerous, there are also semi-supervised
and unsupervised machine learning-based NER approaches,
but these often only rival, rather than exceed, the
performance of supervised systems
          <xref ref-type="bibr" rid="ref13 ref6">(Nadeau and Sekine 2007)</xref>
          and
so are not considered here.
        </p>
        <p>
          In this work we applied a Conditional Random Field
(CRF) based classifier to the task of tagging information
relevant to the computational environment ontology. CRF was
chosen due to its long-standing popularity for information
extraction from unstructured text
          <xref ref-type="bibr" rid="ref1">(Kristjansson et al. 2004;
Bundschus et al. 2008)</xref>
          . We again used the Stanford NLP
group’s implementation
          <xref ref-type="bibr" rid="ref5">(Finkel, Grenager, and Manning
2005)</xref>
          . This classifier uses features such as word order,
n-grams, part of speech, and word shape to create a
probabilistic model to predict the tags in previously unseen
documents. We developed models based on both the current and
old training sets, an example of which is shown in Listing 2.
        </p>
        <p>We then used each model to tag the articles in the test set
and assessed the performance in terms of precision, recall
and F-measure (Table 1). The F-measure of the approach
on the training data is not quite 1.0 because tagging in the
Stanford NLP pipeline only happens at the level of tokens,
and some of the articles contain malformed tokens, such as
IntelCorei7, that contain information about more than one
tag. Still, the performance of both models is quite good on
the training data. The F-measure using the model trained on
current documents drops considerably on the test set, but
0.66 may be good enough to be of some use in many
applications; others have found that the performance of NER in
technical domains such as biomedicine is in the 58-75 range
(Zhang and Ciravegna 2011). Conversely, the performance
using the model trained on older articles is abysmal.</p>
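        <p>For readers less familiar with the metrics, the sketch below shows one common way to compute precision, recall and F-measure over tagged tokens. It assumes the balanced F-measure (the harmonic mean of precision and recall) and represents each annotation as a hypothetical (token index, tag) pair; the evaluation code used in the study may differ.</p>
        <preformat>
```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f(gold, predicted):
    """Score predicted (token_index, tag) pairs against the gold standard."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold.intersection(predicted))
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f_measure(precision, recall)


# Hypothetical annotations for a single key paragraph.
GOLD = {(3, "cpu make"), (5, "memory size value"),
        (6, "memory size unit"), (9, "os kernel name")}
PREDICTED = {(3, "cpu make"), (6, "memory size unit"),
             (9, "os distribution name")}

if __name__ == "__main__":
    print(precision_recall_f(GOLD, PREDICTED))
    # Consistency check on the key-paragraph figures reported earlier.
    print(round(f_measure(0.99, 0.94), 2))
```
        </preformat>
        <p>Under this definition a token with the wrong tag counts against both precision and recall, which is one reason a malformed token such as IntelCorei7 lowers the score even on training data.</p>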
      </sec>
      <sec id="sec-4-3">
        <title>Rules</title>
        <p>
Rules-based NER systems allow users to craft rules using a
regular expression language that can incorporate text, part
of speech, and previously assigned tags. An example rule,
which states that any noun phrase that is between an
operating system kernel name and the word “version” should be
tagged as an operating system distribution name, is shown
in Listing 3.</p>
        <p>Listing 3: Sample rule</p>
        <preformat>{
  ruleType: "tokens",
  pattern: ( [ { ner:os kernel name } ]
             ( [ ( { pos:NN } | { pos:NNP } ) &amp;
                 !{ word:/(?i)version/ } ] ) ),
  action: ( Annotate( $1, ner,
                      "os distribution name" ) ),
  stage: 2
}</preformat>
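        <p>To make the logic of the sample rule concrete, the following sketch re-expresses it in plain Python. It is not TokensRegex and does not reproduce its matching semantics in full; it only mirrors the single-token pattern visible in Listing 3. Tokens are represented as hypothetical dictionaries with word, pos and ner keys.</p>
        <preformat>
```python
import re


def apply_distribution_rule(tokens):
    """Tag a noun that directly follows an OS kernel name as the distribution name."""
    for prev, cur in zip(tokens, tokens[1:]):
        follows_kernel = prev.get("ner") == "os kernel name"
        is_noun = cur["pos"] in ("NN", "NNP")
        is_version = re.fullmatch(r"(?i)version", cur["word"]) is not None
        if follows_kernel and is_noun and not is_version:
            cur["ner"] = "os distribution name"
    return tokens


# Hypothetical pre-tagged tokens, as an earlier rule stage might produce them.
TOKENS = [
    {"word": "Linux", "pos": "NNP", "ner": "os kernel name"},
    {"word": "Ubuntu", "pos": "NNP", "ner": "O"},
    {"word": "version", "pos": "NN", "ner": "O"},
    {"word": "12.04", "pos": "CD", "ner": "O"},
]

if __name__ == "__main__":
    for token in apply_distribution_rule(TOKENS):
        print(token["word"], token["ner"])
```
        </preformat>
        <p>The rule depends on the kernel name having been tagged in an earlier stage, which is the kind of inter-rule dependency that makes rule sets sensitive to changes in vocabulary.</p>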
        <p>
          Since the results of the machine learning-based approach
were mediocre, we also implemented a rules-based approach
using the Stanford NLP group’s TokensRegex system
          <xref ref-type="bibr" rid="ref11 ref2">(Chang
and Manning 2014)</xref>
          . A strong effort was made to establish a
set of rules for each training set that was as general as
possible while still maximizing F-measure. Of course, the more
general the rules are, the more likely they are to conflict with
one another at some point. Developing these rule sets was
therefore quite a difficult task, with each one taking about
two days to create. These rule sets were then used to tag the
articles from the test set. The performance is shown in Table
2.
        </p>
        <p>We again see that the performance of the rule set based
on current articles drops on the test set, though it remains
significantly higher than that of the machine learning-based
approach (.73 versus .66 F-measure). Additionally, the
performance of the rule set based on older articles, while only
a little more than half that of the current rules, is much
more reasonable than that of the CRF classifier using the
model trained on old articles (.41 versus .02 F-measure). We
tried combining the machine learning- and rules-based
approaches, but the combined performance was not any better
than that of the rules-based approach alone. We therefore
did not consider the machine learning-based approach any
further for this ontology population task.</p>
      </sec>
      <sec id="sec-4-4">
        <title>Enhancing Rules with Background Knowledge</title>
        <p>As shown in the previous section, the performance of named
entity recognition in this domain degrades considerably over
time. In this case, the rules created from articles at least ten
years older than those being tagged were 44 percent less
effective than rules based on contemporaneous articles (.41
versus .73 F-measure). In looking into the specifics of this
issue, one underlying problem is that some tags’ rules are
based on other tags. For example, a computer’s make can
often be recognized because it follows a computer’s
manufacturer, as in Lenovo ThinkPad. If Lenovo is not recognized
as a computer manufacturer there is a double hit to
performance, because not only is that word not tagged correctly,
but neither is ThinkPad. The common computer
manufacturers a decade ago (e.g. Silicon Graphics, Compaq, DEC)
are not common today, which leads to the observed
performance degradation. Even more problematic is that over
time completely new technologies become relevant to
descriptions of computational environments. A good example
is GPUs, which have only become popular for parallel
processing relatively recently.</p>
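        <p>The "double hit" described above can be reproduced with a toy example. The sketch below is an assumption-laden simplification: it uses a hypothetical single-token manufacturer lexicon and one dependent rule (a proper noun directly after a manufacturer is a make), which is far cruder than the actual rule sets.</p>
        <preformat>
```python
def tag_manufacturer_and_make(tokens, manufacturers):
    """tokens: list of (word, pos) pairs. Returns a dict of token index to tag."""
    tags = {}
    for i, (word, pos) in enumerate(tokens):
        if word in manufacturers:
            tags[i] = "computer manufacturer"
        elif pos == "NNP" and tags.get(i - 1) == "computer manufacturer":
            # The make rule only fires when the manufacturer was recognized.
            tags[i] = "computer make"
    return tags


# Hypothetical lexicons derived from an old and a current training set.
OLD_MANUFACTURERS = {"Compaq", "DEC"}
CURRENT_MANUFACTURERS = {"Lenovo", "Dell"}

SENTENCE = [("a", "DT"), ("Lenovo", "NNP"), ("ThinkPad", "NNP"), ("laptop", "NN")]

if __name__ == "__main__":
    print(tag_manufacturer_and_make(SENTENCE, OLD_MANUFACTURERS))
    print(tag_manufacturer_and_make(SENTENCE, CURRENT_MANUFACTURERS))
```
        </preformat>
        <p>With the old lexicon neither token is tagged, whereas the current lexicon recovers both, so a single missing manufacturer costs two correct annotations.</p>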
        <p>Our goal in this work was to reduce the performance
degradation seen as rules age by leveraging a source of
background knowledge that is continuously updated. In addition,
we sought to develop an approach that does not require the
types of time-consuming training typical of machine
learning and rules-based methods. Ideally, the approach should
take the ontology (or desired set of tags) as input and require
no additional configuration.</p>
        <p>The solution we developed uses Wikipedia as the
background knowledge source. We developed a custom NER
module that fits into the Stanford NLP pipeline. The
module takes the list of desired tags as input. We limit this list
to the tags that are not related to numeric values (i.e. we
omit tags like num-processors and cpu num-cores) because
querying Wikipedia for a number like “4” or “8” is not going
to produce any useful information. We also provide common
synonyms for tags (e.g. “processor” for CPU and “company”
for manufacturer). When tagging a document, the module
queries Wikipedia for each proper noun in the key
paragraph(s) and, if a page is returned, determines which tag, if
any, is most relevant by counting the number of times each
tag appears in the first three sentences of the returned page and
weighting tags that appear earlier more than those that
appear later. After the Wikipedia annotator finishes, the
rules-based annotator is run.</p>
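        <p>The tag-selection step can be sketched as below. This is only an illustration under several assumptions: the Wikipedia lookup itself is omitted and the page text is passed in as a string, sentences are split naively on punctuation, and earlier mentions are favoured by weighting the first, second and third sentences by 1, 1/2 and 1/3. The paper does not specify the weighting scheme, and the tag names, synonyms and page text here are hypothetical.</p>
        <preformat>
```python
import re


def split_sentences(text):
    """Naive sentence splitter; a real pipeline would use an NLP toolkit."""
    return [s for s in re.split(r"[.!?]\s+", text.strip()) if s]


def score_tags(page_text, tag_synonyms, max_sentences=3):
    """Weighted count of tag words in the first few sentences of a page."""
    scores = {}
    for rank, sentence in enumerate(split_sentences(page_text)[:max_sentences]):
        weight = 1.0 / (rank + 1)  # assumed weighting: 1, 1/2, 1/3
        lowered = sentence.lower()
        for tag, synonyms in tag_synonyms.items():
            hits = sum(
                len(re.findall(r"\b" + re.escape(s.lower()) + r"\b", lowered))
                for s in synonyms
            )
            if hits:
                scores[tag] = scores.get(tag, 0.0) + weight * hits
    return scores


def best_tag(page_text, tag_synonyms):
    """Return the highest-scoring tag, or None when no tag word appears."""
    scores = score_tags(page_text, tag_synonyms)
    if not scores:
        return None
    return max(scores, key=scores.get)


# Hypothetical tags with synonyms, and made-up page text (not real Wikipedia).
TAGS = {
    "computer manufacturer": ["manufacturer", "company"],
    "cpu make": ["cpu", "processor"],
}
PAGE = (
    "Lenovo is a technology company that designs and sells computers. "
    "Its products include laptops with a range of processor options. "
    "The company was founded in 1984."
)

if __name__ == "__main__":
    print(best_tag(PAGE, TAGS))
```
        </preformat>
        <p>Restricting the count to the opening sentences relies on the convention that a Wikipedia article states what its subject is at the very start, which keeps incidental mentions later in the page from dominating the score.</p>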
        <p>
          Other researchers have also leveraged Wikipedia for
various aspects of the NER and ontology population tasks. Many
of these efforts are focused on using Wikipedia to do
multilingual NER
          <xref ref-type="bibr" rid="ref15 ref7">(Nothman et al. 2013; Kim, Toutanova, and
Yu 2012)</xref>
          . More related to our current work, Kazama and
Torisawa determine a candidate tag for an entity by
extracting the first noun after a form of the word “be” within the
introductory sentence of its Wikipedia page and use that as
one of the features in a CRF classifier
          <xref ref-type="bibr" rid="ref13 ref6">(Kazama and
Torisawa 2007)</xref>
          . Kliegr et al. take a similar approach but rather
than using a classifier, they leverage WordNet to find
synonyms and hypernyms of the type identified from Wikipedia
in order to arrive at one of the tags of interest
          <xref ref-type="bibr" rid="ref8">(Kliegr et
al. 2008)</xref>
          . A more thorough survey of NER approaches that
utilize Wikipedia can be found in (Zhang and Ciravegna
2011). The key difference between our work and existing
approaches is that our results have been analyzed in a way
that allows the ability of Wikipedia to mitigate the atrophy
of a rules-based technique to be specifically evaluated.
        </p>
        <p>Our method leads to the results shown in Table 3. While
this approach is relatively basic, we see that it improves the
performance of the old rule set by 20 percent (0.49 versus
0.41 F-measure). The performance of the current rule set is
reduced by 4 percent (0.70 versus 0.73 F-measure),
indicating that this approach should not be used unless it is
warranted by changes in the domain of interest and the age of
the model used for ontology population.</p>
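        <p>The percentages quoted in this section are relative changes in F-measure rather than differences in percentage points. The short calculation below, which uses only the F-measures reported in the text, shows how the 44, 20 and 4 percent figures arise.</p>
        <preformat>
```python
def relative_change(new, baseline):
    """Relative change of `new` with respect to `baseline`."""
    return (new - baseline) / baseline


# F-measures reported in the text.
CURRENT_RULES = 0.73
OLD_RULES = 0.41
OLD_RULES_WITH_WIKIPEDIA = 0.49
CURRENT_RULES_WITH_WIKIPEDIA = 0.70

if __name__ == "__main__":
    # Old rules relative to current rules (degradation over time).
    print(round(100 * relative_change(OLD_RULES, CURRENT_RULES)))
    # Effect of adding Wikipedia to the old rule set.
    print(round(100 * relative_change(OLD_RULES_WITH_WIKIPEDIA, OLD_RULES)))
    # Effect of adding Wikipedia to the current rule set.
    print(round(100 * relative_change(CURRENT_RULES_WITH_WIKIPEDIA, CURRENT_RULES)))
```
        </preformat>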
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Future Work</title>
      <p>
        In this work we consider the problem of populating an
ontology from a fast-changing domain: computational
environments. We show that the performance of two popular
approaches degrades significantly over time and propose
the use of a continuously updated background knowledge
source to mitigate this performance degradation. A basic
implementation of this idea improved the performance of
a rules-based approach based on older documents by 20
percent. It remains to be determined if this technique can
achieve similar results in other fast-changing fields, such
as medicine or law. In addition, it is possible that this
result can be improved upon by a more advanced use of
Wikipedia, such as through neural network based methods
like word2vec
        <xref ref-type="bibr" rid="ref12">(Mikolov et al. 2013)</xref>
        . We plan to explore this
in our future work on this topic.
      </p>
      <p>All of the materials used in this project, including
the article set, answer set, ontology, machine learning
models, rules, and code, have been published to GitHub
(https://github.com/mcheatham/compEnv-extraction).</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The first and third authors acknowledge partial support by
the National Science Foundation under award PHY-1247316
DASPOS: Data and Software Preservation for Open
Science. The third author would like to acknowledge partial
support from Notre Dame’s Center for Research
Computing.</p>
      <p>Finkel, J. R.; Grenager, T.; and Manning, C. 2005. Incorporating
non-local information into information extraction systems by Gibbs
sampling. In Proceedings of the 43rd Annual
Meeting of the Association for Computational Linguistics,
363–370. Association for Computational Linguistics.</p>
      <p>Zhang, Z., and Ciravegna, F. 2011. Named entity
recognition for ontology population using background knowledge
from Wikipedia. In Ontology Learning and Knowledge
Discovery Using the Web: Challenges and Recent Advances.
IGI Global. 79–104.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Bundschus</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Dejori</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Stetter</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Tresp</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ; and Kriegel, H.-P.
          <year>2008</year>
          .
          <article-title>Extraction of semantic biomedical relations from text using conditional random fields</article-title>
          .
          <source>BMC bioinformatics 9</source>
          (1):
          <fpage>207</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>A. X.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C. D.</given-names>
          </string-name>
          <year>2014</year>
          .
          <article-title>TokensRegex: Defining cascaded regular expressions over tokens</article-title>
          .
          <source>Technical Report CSTR 2014-02</source>
          , Department of Computer Science, Stanford University.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Cheatham</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Vardeman</surname>
            <suffix>II</suffix>
            ,
            <given-names>C.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Karima</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Computational environment: An ODP to support finding and recreating computational analyses</article-title>
          .
          <source>In 8th Workshop on Ontology Design and Patterns - WOP2017.</source>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Chieu</surname>
            ,
            <given-names>H. L.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>H. T.</given-names>
          </string-name>
          <year>2002</year>
          .
          <article-title>Named entity recognition: a maximum entropy approach using global information</article-title>
          .
          <source>In Proceedings of the 19th International Conference on Computational Linguistics</source>
          , volume
          <volume>1</volume>
          ,
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Finkel</surname>
            ,
            <given-names>J. R.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Grenager</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2005</year>
          .
          <article-title>Incorporating non-local information into information extraction systems by Gibbs sampling</article-title>
          .
          <source>In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics</source>
          ,
          <fpage>363</fpage>
          -
          <lpage>370</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
        <mixed-citation>
          <string-name>
            <surname>Isozaki</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Kazawa</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <year>2002</year>
          .
          <article-title>Efficient support vector classifiers for named entity recognition</article-title>
          .
          <source>In Proceedings of the 19th International Conference on Computational Linguistics</source>
          , volume
          <volume>1</volume>
          ,
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Kazama</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Torisawa</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <year>2007</year>
          .
          <article-title>Exploiting Wikipedia as external knowledge for named entity recognition</article-title>
          .
          <source>In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Toutanova</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <year>2012</year>
          .
          <article-title>Multilingual named entity recognition using parallel data and metadata from Wikipedia</article-title>
          .
          <source>In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume</source>
          <volume>1</volume>
          ,
          <fpage>694</fpage>
          -
          <lpage>702</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Kliegr</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Chandramouli</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Nemrava</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Svatek</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Izquierdo</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <year>2008</year>
          .
          <article-title>Combining image captions and visual analysis for image concept classification</article-title>
          .
          <source>In Proceedings of the 9th International Workshop on Multimedia Data Mining: held in conjunction with the ACM SIGKDD 2008</source>
          ,
          <fpage>8</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <year>2004</year>
          .
          <article-title>Interactive information extraction with constrained conditional random fields</article-title>
          .
          <source>In AAAI</source>
          , volume
          <volume>4</volume>
          ,
          <fpage>412</fpage>
          -
          <lpage>418</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Lample</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ballesteros</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Subramanian</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Kawakami</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Dyer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>Neural architectures for named entity recognition</article-title>
          .
          <source>arXiv preprint arXiv:1603.01360</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C. D.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Surdeanu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Bauer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Finkel</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Bethard</surname>
            ,
            <given-names>S. J.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>McClosky</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <year>2014</year>
          .
          <article-title>The Stanford CoreNLP natural language processing toolkit</article-title>
          .
          <source>In Association for Computational Linguistics (ACL) System Demonstrations</source>
          ,
          <fpage>55</fpage>
          -
          <lpage>60</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Corrado</surname>
            ,
            <given-names>G. S.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Dean</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2013</year>
          .
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          .
          <source>In Advances in Neural Information Processing Systems</source>
          ,
          <fpage>3111</fpage>
          -
          <lpage>3119</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Nadeau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Sekine</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>2007</year>
          .
          <article-title>A survey of named entity recognition and classification</article-title>
          .
          <source>Lingvisticae Investigationes</source>
          <volume>30</volume>
          (
          <issue>1</issue>
          ):
          <fpage>3</fpage>
          -
          <lpage>26</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Nakashole</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Tylenda</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Weikum</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <year>2013</year>
          .
          <article-title>Fine-grained semantic typing of emerging entities</article-title>
          .
          <source>In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          , volume
          <volume>1</volume>
          ,
          <fpage>1488</fpage>
          -
          <lpage>1497</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Nothman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ringland</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Radford</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Murphy</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Curran</surname>
            ,
            <given-names>J. R.</given-names>
          </string-name>
          <year>2013</year>
          .
          <article-title>Learning multilingual named entity recognition from Wikipedia</article-title>
          .
          <source>Artificial Intelligence</source>
          <volume>194</volume>
          :
          <fpage>151</fpage>
          -
          <lpage>175</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Petasis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Karkaletsis</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Paliouras</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Krithara</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Zavitsanos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <year>2011</year>
          .
          <article-title>Ontology population and enrichment: State of the art</article-title>
          .
          <source>In Knowledge-driven multimedia information extraction and ontology evolution</source>
          ,
          <fpage>134</fpage>
          -
          <lpage>166</lpage>
          . Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Cheatham</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Krisnadhi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <article-title>A complex alignment benchmark: Geolink dataset</article-title>
          .
          <source>In International Semantic Web Conference</source>
          ,
          <fpage>273</fpage>
          -
          <lpage>288</lpage>
          . Springer.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>