<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Community graph and linguistic analysis to validate relationships for knowledge base population</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rashedur Rahman</string-name>
          <email>rashedur.rahman@irt-systemx.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Brigitte Grau</string-name>
          <email>brigitte.grau@limsi.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sophie Rosset</string-name>
          <email>sophie.rosset@limsi.fr</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>IRT SystemX, LIMSI, CNRS, Université Paris-Saclay</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>LIMSI</institution>
          ,
          <addr-line>CNRS, ENSIIE</addr-line>
          ,
          <institution>Université Paris-Saclay</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>LIMSI, CNRS, Université Paris-Saclay</institution>
        </aff>
      </contrib-group>
      <fpage>133</fpage>
      <lpage>143</lpage>
      <abstract>
        <p>Relation extraction between entities from text plays an important role in information extraction and knowledge discovery tasks. Relation extraction systems produce a large number of candidates, many of which are not correct. A relation validation method justifies a claimed relation based on the information provided by a system. In this paper, we propose features that analyze the community graphs of entities to account for some form of world knowledge. The proposed features significantly improve the validation of relations when they are combined with voting and state-of-the-art linguistic features.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Extracting relations from texts is important for
different information extraction and knowledge
discovery tasks such as knowledge base
population and question answering. This task requires
natural language understanding of pieces of text,
which is particularly complex when searching for
a large number of semantic relations that describe
entities in the open domain. The relationship types
can concern the family of a person (spouse,
children, parents, etc.) or the characteristics of a
company (founder, top_members_or_employees, etc.).
This task, named slot filling, is evaluated in
the KBP evaluation, in which systems must
extract instances of around 40 relation types related
to several kinds of entities (person, organization,
location and their different sub-types). In
order to take advantage of the capabilities of several
systems and improve results, a final step can be added
that validates the results of the systems. The
method described in this paper addresses
this validation task: given an entity and a
response provided by a system (its value and a
text segment that justifies the claimed relation), it has to
decide whether the value is correct or not. We
focus on relations that occur between two entities.</p>
      <p>
        Different approaches have been studied for
validating relations, particularly by evaluating the
confidence that can be placed in the source of the
response, i.e. the document that justifies the
response
        <xref ref-type="bibr" rid="ref33">(Yu et al., 2014)</xref>
        , and the confidence score
of the system
        <xref ref-type="bibr" rid="ref28">(Viswanathan et al., 2015)</xref>
        .
Nevertheless, other criteria are needed to
validate the semantics of a relation through linguistic
characteristics
        <xref ref-type="bibr" rid="ref14 ref22 ref24 ref32 ref9">(Niu et al., 2012; Hoffmann et al., 2011;
Yao et al., 2011; Riedel et al., 2010)</xref>
        ; these criteria are
similar to those used in the relation extraction task.
      </p>
      <p>
        However, in most cases, relation
validation methods do not take into account the
global information that can be computed on the
collection of text documents. Collection-level
global information about the object of a relation
and the words around the mentions has been taken
into account for web relation extraction by
        <xref ref-type="bibr" rid="ref1">(Augenstein, 2016)</xref>
        . Such information makes it possible to
introduce some form of world knowledge for
making choices based on criteria that are independent
of how a relation is expressed in the text segment.
We hypothesize that two entities having a true
relationship should be linked to more common
entities than a pair of entities in a proposed false
relationship. For example, a
person will share more places and relationships with
his/her spouse than with other people. Therefore,
we extracted a graph of entities from the
collection that allowed us to propose new
characterizations of the relations by graph-based features
        <xref ref-type="bibr" rid="ref13">(Han
et al., 2011)</xref>
        ,
        <xref ref-type="bibr" rid="ref9">(Friedl et al., 2010)</xref>
        , (Solá et al.,
2013). We also introduce information-theoretic
measurements on the graph of entities, some of
which have been successfully used in other tasks,
such as entropy for knowledge detection in
publishing networks
        <xref ref-type="bibr" rid="ref15">(Holzinger et al., 2013)</xref>
        and
mutual information for the validation of responses in
question answering systems
        <xref ref-type="bibr" rid="ref19 ref6">(Magnini et al., 2002;
Cui et al., 2005)</xref>
        . Additionally, we propose
dependency pattern edit-distance for capturing the
syntactic evidence of relations. Word-embeddings
have also been explored to detect the unknown
triggers of relation expression.
      </p>
      <p>The relation validation method we propose is
thus based on three categories of information:
linguistic information associated with the expression
of the relations in texts, information coming from
the graphs of entities built on the collection, and
finally information related to the systems and the
proposals they make. We evaluated our relation
validation system on a sub-part of the KBP CSSF-2016
corpus and show that the validation step achieves
around 5% to 8% higher accuracy than the
baseline features when they are combined
with the proposed graph-based features.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Related Works</title>
      <p>Relation validation methods have studied different
kinds of features to decide if a type of relation
exists or not.</p>
      <p>
        Existence and semantic assessment of relation
candidates rely on linguistic features, such as
syntactic paths or the existence of trigger words between
the pair of entity mentions. The dependency tree
        <xref ref-type="bibr" rid="ref7">(Culotta and Sorensen, 2004)</xref>
        ,
        <xref ref-type="bibr" rid="ref3">(Bunescu and Mooney,
2005)</xref>
        ,
        <xref ref-type="bibr" rid="ref10">(Fundel et al., 2007)</xref>
        provides clues for
deciding the presence of a relation in unsupervised
relation extraction. Gamallo et al. (2012)
proposed rule-based dependency parsing for open
information extraction. They defined some patterns
of relation by parsing the dependencies and
discovering verb-clauses in the sentences.
      </p>
      <p>
        Syntactic analysis cannot characterize the type
of a relation. Therefore, words around the
entity mentions in sentences have been analyzed to
characterize the semantics of a relation
        <xref ref-type="bibr" rid="ref22">(Niu et al.,
2012)</xref>
        ,
        <xref ref-type="bibr" rid="ref14">(Hoffmann et al., 2011)</xref>
        ,
        <xref ref-type="bibr" rid="ref32">(Yao et al., 2011)</xref>
        ,
        <xref ref-type="bibr" rid="ref24 ref9">(Riedel et al., 2010)</xref>
        ,
        <xref ref-type="bibr" rid="ref21">(Mintz et al., 2009)</xref>
        .
Chowdhury et al. (2012) proposed a hybrid kernel by
combining dependency patterns and trigger words
for bio-medical relation extraction. Thus we
explored these different kinds of linguistic features
for validating relationships.
      </p>
      <p>
        It can be difficult to identify the trigger words
for different types of relation in the open
domain. Therefore, neural network based
methods have recently become popular for the relation
classification task
        <xref ref-type="bibr" rid="ref29">(Vu et al., 2016)</xref>
        ,
        <xref ref-type="bibr" rid="ref8">(Dligach et al.,
2017)</xref>
        ,
        <xref ref-type="bibr" rid="ref35">(Zheng et al., 2016)</xref>
        . These methods use
word-embeddings for automatically learning the
patterns and semantics of relations without using
any handcrafted features. Dependency based
neural networks have also been proposed
        <xref ref-type="bibr" rid="ref4">(Cai et al.,
2016)</xref>
        ,
        <xref ref-type="bibr" rid="ref18">(Liu et al., 2015)</xref>
        to capture features on the
shortest path.
      </p>
      <p>
        A voting method has been proposed by
        <xref ref-type="bibr" rid="ref26">(Sammons et al., 2014)</xref>
        for ensemble systems to
validate the outcomes that are proposed by
multiple systems from different information sources.
This method shows good results and remains
stable from one dataset to another.
      </p>
      <p>
        Several graph based methods
        <xref ref-type="bibr" rid="ref12 ref17 ref25">(Gardner and
Mitchell, 2015)</xref>
        ,
        <xref ref-type="bibr" rid="ref17">(Lao et al., 2015)</xref>
        ,
        <xref ref-type="bibr" rid="ref31 ref4">(Wang et al.,
2016)</xref>
        have been proposed for the knowledge base
completion task by applying the Path Ranking
Algorithm
        <xref ref-type="bibr" rid="ref16 ref24">(Lao and Cohen, 2010)</xref>
        . These methods
use the relationships that already exist in a
knowledge base to learn an inference model and create new
relations with it. Yu and Ji (2016)
proposed a graph-based method for trigger-word
identification in the slot filling task by using
PageRank and affinity propagation on a graph built at
sentence level.
      </p>
      <p>
        Information-theoretic measurements on graphs
have been successfully used in some related tasks.
        <xref ref-type="bibr" rid="ref15">(Holzinger et al., 2013)</xref>
        measured
entropy to discover knowledge in publication
networks. Some question-answering systems
measured point-wise mutual information
        <xref ref-type="bibr" rid="ref19">(Magnini
et al., 2002)</xref>
        ,
        <xref ref-type="bibr" rid="ref6">(Cui et al., 2005)</xref>
        to exploit
redundancy. In order to find the important and
influential nodes in a social network, centralities of the
nodes have been measured
        <xref ref-type="bibr" rid="ref9">(Friedl et al., 2010)</xref>
        .
Solá et al. (2013) explored the concept of
eigenvector centrality in multiplex networks. In
order to validate the proposed relationships, we
apply these different measures to graphs of entities
constructed from the text collection.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3 Community Graph of Entities</title>
      <sec id="sec-3-1">
        <title>3.1 Definition of the Graph</title>
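        <p>Let <inline-formula><tex-math>G = (E, R)</tex-math></inline-formula> be a graph, <inline-formula><tex-math>r_q</tex-math></inline-formula> a query relation (slot),
<inline-formula><tex-math>e_q \in E</tex-math></inline-formula> a query entity, and <inline-formula><tex-math>E_c = \{e_{c_1}, e_{c_2}, \ldots, e_{c_n}\} \subseteq E</tex-math></inline-formula>
the set of candidate responses, where <inline-formula><tex-math>r_q = r(e_q, e_c) \in R</tex-math></inline-formula>.</p>
        <p>The list of candidates is generated by different
relation extraction systems. Let <inline-formula><tex-math>r_o \in R</tex-math></inline-formula> denote the other relations,
where <inline-formula><tex-math>r_o \neq r_q</tex-math></inline-formula>. We characterize whether
a candidate entity <inline-formula><tex-math>e_{c_i}</tex-math></inline-formula> of <inline-formula><tex-math>E_c</tex-math></inline-formula> is correct or not for
a query relation <inline-formula><tex-math>r_q</tex-math></inline-formula> by analyzing the
communities <inline-formula><tex-math>X_q</tex-math></inline-formula> and <inline-formula><tex-math>X_c</tex-math></inline-formula> formed by the query entity and
each candidate response. A community <inline-formula><tex-math>X_i</tex-math></inline-formula>
contains the neighbors of <inline-formula><tex-math>e_i</tex-math></inline-formula>, up to several
possible steps.</p>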
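        <p>The notion of a community as the neighbors of an entity up to several steps can be sketched as a breadth-first traversal. This is a minimal illustration, not the authors' implementation: the function name community, the dictionary-based adjacency structure and the toy entities are assumptions made for the example, as is the choice to exclude the entity itself from its own community.</p>
        <preformat>
```python
from collections import deque


def community(adjacency, entity, max_steps):
    """Return the entities reachable from `entity` within `max_steps` hops.

    `adjacency` maps each entity to the set of entities it co-occurs with
    (hypothetical structure chosen for this sketch).
    """
    seen = {entity}
    frontier = deque([(entity, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_steps:
            continue  # do not expand beyond the requested number of steps
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    seen.discard(entity)  # assumption: the community holds neighbors only
    return seen


# Toy co-occurrence graph (invented data, for illustration only).
toy_graph = {
    "Barack Obama": {"Michelle Robinson", "Hillary Clinton", "Chicago"},
    "Michelle Robinson": {"Barack Obama", "Chicago", "Princeton"},
    "Hillary Clinton": {"Barack Obama", "New York"},
    "Chicago": {"Barack Obama", "Michelle Robinson"},
    "Princeton": {"Michelle Robinson"},
    "New York": {"Hillary Clinton"},
}

one_step = community(toy_graph, "Barack Obama", 1)
two_steps = community(toy_graph, "Barack Obama", 2)
```
        </preformat>
        <p>With one step the community contains only the direct neighbors; increasing max_steps adds the neighbors of those neighbors.</p>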
        <p>[Figure 1: example community graph around the query entity Barack Obama.]</p>
        <p>Fig. 1 shows an example of such a graph,
where the entity of a query, its type and the
relationship name are Barack Obama, person and
spouse, respectively. The candidate responses are
Michelle Robinson and Hillary Clinton, which are
linked to Barack Obama by the spouse relation
hypothesis. The objective is to classify Michelle
Robinson as the correct response based on the
community analysis. The communities of Barack
Obama (green rectangle), Michelle Robinson
(purple circle) and Hillary Clinton (orange ellipse)
are defined by the in_same_sentence relation, which
means that the pair of entities is mentioned in the
same sentence in the text. The graph is thus
constructed from untyped semantic relationships
based on co-occurrences. It would also be
possible to use typed semantic relationships provided
by a relation extraction system.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2 Construction of the Community Graph</title>
        <p>The graph of entities as illustrated in Fig. 1 is
created from a graph representing the knowledge
extracted from the texts (lower part of Fig. 2) called
knowledge graph. This knowledge graph is
generated after applying systems of named entity
recognition (NER) and sentence splitting.</p>
        <p>
          Recognition of named entities is done
using the Stanford system
          <xref ref-type="bibr" rid="ref20">(Manning et al., 2014)</xref>
          and
Luxid (http://www.expertsystem.com/fr/). Luxid is a rule-based NER system that
uses external information sources such as
Freebase and GeoNames and performs with high
precision. It is able to decompose entity
mentions into components, such as first name, last
name and title for a person named entity, and it
classifies location named entities into country,
state/province and city. When the two systems
disagree, as in (Stanford: location, Luxid: person),
we choose the annotation produced by Luxid
because it provides more precise information about
the detected entity than Stanford does.
        </p>
        <p>The knowledge graph represents documents,
sentences, mentions and entities as nodes and the
edges between these nodes represent relationships
between these elements.</p>
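        <p>[Figure 2: knowledge graph (lower part), in which Doc-1 and Sentence-1 are linked to the mentions barack obama and michelle obama, their components (name.first, name.last) and an IN_SAME_SENTENCE relation, and the community graph derived from it (upper part).]</p>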
        <p>Multiple mentions of the same entity found in
the same document are connected to the same
entity node in the knowledge graph, based on the
textual similarity of the references and their
possible components, which corresponds to a first step
of entity linking on local criteria. This operation
is performed by Luxid. However, an entity can
be mentioned in different documents under
different forms (e.g., Barack Obama, President Barack
Obama, President Obama), which creates
redundant nodes in the knowledge graph. Entities
are therefore grouped according to the similarity of
their names and the similarity of their neighboring
entities calculated by Eq. 2. This step groups
similar entities into a single entity in the
community graph (upper part of Fig. 2). The community graph
is constructed from the information on the entities
and relations present in the knowledge graph, and
the link with the documents is always maintained.
It is thus possible to know the number of
occurrences of each entity and each relation. The graph
is stored in Neo4j, a graph-oriented
database, which makes it possible to extract the
subgraphs linked to an entity by queries. We only
consider entities of type person, location and
organization as members of the communities.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4 Relation Validation</title>
      <p>In order to predict whether a relationship is
correct or not, we treat this problem as a binary
classification task based on three categories of
information. We calculate a set of features using the
graphs (see section 4.1), to which we add features
based on a linguistic analysis of the text that
justifies the candidate and describes the relationship
(see section 4.2) and an estimation of trust in the
candidates according to their frequencies in
the responses to each query (see section 4.3).
Table 1 summarizes all the features used for the
classification task.</p>
      <sec id="sec-4-1">
        <title>4.1 Graph-based Features</title>
        <p>We assume that a correct candidate of a query is an
important member of the community of the query
entity. A community Xe of an entity is defined by
the sub-graph formed by its neighbors up to
several levels. A merging of the communities of two
particular entities includes all the neighbors of that
pair of entities. We, therefore, define different
features related to this hypothesis.</p>
        <p>We hypothesize that the network density (Eq. 1)
of the community of a correct candidate merged
with the community of the query entity must be
higher than the density of an incorrect candidate’s
community merged with that of the same entity:
<disp-formula><label>(1)</label><tex-math>\rho_{X_e} = \frac{\text{number of existing edges with } e}{\text{number of possible edges}}</tex-math></disp-formula>
According to Fig. 1, the merged community
of Michelle Robinson and Barack Obama is
denser than the merged community of Hillary
Clinton and Barack Obama.</p>
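        <p>The density comparison can be illustrated with a short sketch. Eq. 1 leaves the exact counting of edges open, so the code below adopts one common reading as an assumption: the density of the undirected sub-graph induced by the merged community, i.e. the number of existing edges divided by n(n-1)/2 possible edges. The function name merged_density and the adjacency dictionary are hypothetical.</p>
        <preformat>
```python
from itertools import combinations


def merged_density(adjacency, community_a, community_b):
    """Density of the sub-graph induced by the union of two communities.

    Assumption: the graph is undirected and `adjacency` is symmetric, so
    each unordered pair of nodes is counted once.
    """
    nodes = set(community_a) | set(community_b)
    n = len(nodes)
    if n == 0 or n == 1:
        return 0.0  # no pair of nodes, hence no possible edge
    existing = sum(
        1
        for u, v in combinations(sorted(nodes), 2)
        if v in adjacency.get(u, ())
    )
    possible = n * (n - 1) / 2
    return existing / possible


# Toy graphs (invented data): a triangle and a simple path.
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}

dense = merged_density(triangle, {"a", "b"}, {"b", "c"})
sparse = merged_density(path, {"a", "b"}, {"b", "c"})
```
        </preformat>
        <p>Under this reading, a candidate whose merged community is more tightly connected receives a higher density score.</p>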
        <p>We compute the network similarity (Eq. 2)
between two communities and hypothesize that the
score of the network similarity between the
communities of a query entity and a correct candidate
would be higher than the score between that query
entity and a wrong candidate.</p>
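        <p><disp-formula><label>(2)</label><tex-math>\mathrm{similarity} = \frac{|X_q \cap X_c|}{\sqrt{|X_q|\,|X_c|}}</tex-math></disp-formula>
where <inline-formula><tex-math>X_q</tex-math></inline-formula> and <inline-formula><tex-math>X_c</tex-math></inline-formula> are the community
members of the query entity and of a candidate entity,
respectively.</p>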
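        <p>Eq. 2 translates directly into a set computation. The sketch below is only an illustration: the function name network_similarity and the choice to return 0.0 for an empty community are assumptions, not details given in the text.</p>
        <preformat>
```python
import math


def network_similarity(community_q, community_c):
    """Size of the intersection over the geometric mean of the two sizes."""
    xq, xc = set(community_q), set(community_c)
    if not xq or not xc:
        return 0.0  # assumption: an empty community shares nothing
    shared = xq.intersection(xc)
    return len(shared) / math.sqrt(len(xq) * len(xc))


# Toy communities (invented data).
half_overlap = network_similarity({"a", "b", "c", "d"}, {"c", "d", "e", "f"})
identical = network_similarity({"a", "b"}, {"a", "b"})
disjoint = network_similarity({"a", "b"}, {"c", "d"})
```
        </preformat>
        <p>The score ranges from 0 for disjoint communities to 1 for identical ones.</p>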
        <p>
          The eigenvector centrality
          <xref ref-type="bibr" rid="ref2">(Bonacich and
Lloyd, 2001)</xref>
          measures the influence of a node in
a network. A node will be even more
influential if it is connected to other influential nodes.
We hypothesize that the query-entity should be
more influenced by the correct candidate than by
other candidates. We measure the influences of the
candidates in the community of the query-entity
by calculating the absolute difference between the
score of eigenvector centrality of the query-entity
and that of each candidate. We, therefore, assume
that this difference should be smaller for a correct
candidate than for an incorrect candidate. Suppose
A = (ai,j ) is the adjacency matrix of a graph G.
The eigenvector centrality xi of node i is
calculated recursively by Eq. 3.
        </p>
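        <p><disp-formula><label>(3)</label><tex-math>x_i = \frac{1}{\lambda} \sum_{k} a_{k,i}\, x_k</tex-math></disp-formula>
where <inline-formula><tex-math>\lambda \neq 0</tex-math></inline-formula> is a constant; the equation
can be expressed in matrix form as <inline-formula><tex-math>\lambda x = xA</tex-math></inline-formula>.</p>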
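        <p>The recursive definition of Eq. 3 is usually solved by power iteration. The following sketch is an assumption about how such a score could be computed, not the authors' code: it works on an undirected graph given as a symmetric adjacency dictionary, adds the identity to the adjacency matrix to stabilize the iteration (a shift that leaves the eigenvectors unchanged), and exposes the absolute difference used as a feature through the hypothetical helper centrality_gap.</p>
        <preformat>
```python
import math


def eigenvector_centrality(adjacency, iterations=200, tol=1e-10):
    """Power iteration on (A + I) for an undirected graph.

    Every neighbor must also appear as a key of `adjacency`.
    """
    nodes = sorted(adjacency)
    scores = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # One multiplication by (A + I): own score plus the neighbors' scores.
        updated = {
            node: scores[node] + sum(scores[m] for m in adjacency[node])
            for node in nodes
        }
        norm = math.sqrt(sum(value * value for value in updated.values()))
        updated = {node: value / norm for node, value in updated.items()}
        delta = sum(abs(updated[node] - scores[node]) for node in nodes)
        scores = updated
        if tol > delta:
            break  # the vector no longer changes
    return scores


def centrality_gap(adjacency, query, candidate):
    """Absolute difference between the centralities of two entities."""
    scores = eigenvector_centrality(adjacency)
    return abs(scores[query] - scores[candidate])


# Toy star graph (invented data): "c" is connected to every other node.
star = {"c": {"a", "b", "d"}, "a": {"c"}, "b": {"c"}, "d": {"c"}}
star_scores = eigenvector_centrality(star)
```
        </preformat>
        <p>In the star graph the hub receives the highest score, and two leaves have a gap of zero because they occupy symmetric positions.</p>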
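        <p>We also hypothesize that mutual information
(Eq. 4) between the pair of communities of a
query entity and a correct candidate must be
higher than that computed between the
communities of the query entity and an incorrect candidate:
<disp-formula><label>(4)</label><tex-math>MI(X_q, X_c) = H(X_q) + H(X_c) - H(X_q, X_c)</tex-math></disp-formula>
where
<disp-formula><tex-math>H(X) = -\sum_{i=1}^{n} p(e_i) \log_2 p(e_i), \qquad p(e) = \frac{\text{number of edges of } e}{\text{number of edges of } X},</tex-math></disp-formula>
and <inline-formula><tex-math>X_q</tex-math></inline-formula> and <inline-formula><tex-math>X_c</tex-math></inline-formula> are the communities of a
query entity and a candidate, respectively.</p>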
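        <p>The entropy terms of Eq. 4 can be sketched as follows. Several details are not specified in the text and are assumptions of this example: each community is given as a mapping from entity to its number of edges, probabilities are normalized by the sum of these counts so that they form a distribution, and the joint entropy is computed over the merged community. The function names are hypothetical.</p>
        <preformat>
```python
import math


def entropy(edge_counts):
    """Shannon entropy (base 2) with p(e) = edges of e / total edge count."""
    total = sum(edge_counts.values())
    if total == 0:
        return 0.0
    probabilities = [count / total for count in edge_counts.values() if count > 0]
    return -sum(p * math.log2(p) for p in probabilities)


def mutual_information(edge_counts_q, edge_counts_c):
    """H(Xq) + H(Xc) - H(Xq, Xc), with the joint term taken on the union.

    Assumption: an entity present in both communities has the same edge
    count in each mapping.
    """
    merged = dict(edge_counts_q)
    merged.update(edge_counts_c)
    return entropy(edge_counts_q) + entropy(edge_counts_c) - entropy(merged)


# Toy communities (invented data): entity mapped to its number of edges.
overlapping = mutual_information({"a": 1, "b": 1}, {"b": 1, "c": 1})
disjoint = mutual_information({"a": 1, "b": 1}, {"c": 1, "d": 1})
identical = mutual_information({"a": 1, "b": 1}, {"a": 1, "b": 1})
```
        </preformat>
        <p>With these toy inputs the score is zero for disjoint communities and grows as the two communities share more members.</p>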
        <p>The community of an entity (query entity or
candidate) is extended up to the third level to
measure eigenvector centrality and mutual
information.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2 Linguistic Features</title>
        <p>To assess whether a relation exists between the pair
of entity mentions, we define syntactic features.
To characterize the semantics of the relation, we
represent it by seed words and analyze the
sentence at the lexical level.</p>
        <p>
          Syntactic features are calculated from
dependency analysis, i.e. the parser
          <xref ref-type="bibr" rid="ref20">(Manning et al.,
2014)</xref>
          provides a tree in which nodes are the
words of the sentence and the edges between
them are labeled by their syntactic role.
        </p>
        <table-wrap>
          <label>Table 1</label>
          <caption>
            <p>Features used for the classification task.</p>
          </caption>
          <table>
            <thead>
              <tr><th>Feature group</th><th>Feature</th></tr>
            </thead>
            <tbody>
              <tr><td>Graph</td><td>Network density</td></tr>
              <tr><td>Graph</td><td>Eigenvector centrality</td></tr>
              <tr><td>Graph</td><td>Mutual information</td></tr>
              <tr><td>Graph</td><td>Network similarity</td></tr>
              <tr><td>Linguistic</td><td>Minimum edit distance between dependency patterns</td></tr>
              <tr><td>Linguistic</td><td>Dependency pattern length</td></tr>
              <tr><td>Linguistic</td><td>Are the query and filler mentions found in the same clause</td></tr>
              <tr><td>Linguistic</td><td>Has trigger word between mentions</td></tr>
              <tr><td>Linguistic</td><td>Has trigger word in dependency path</td></tr>
              <tr><td>Linguistic</td><td>Has trigger word in minimum subtree</td></tr>
              <tr><td>Linguistic</td><td>Is trigger based relation</td></tr>
              <tr><td>Baseline (voting)</td><td>Filler credibility</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <sec id="sec-4-2-12">
          <p>We collected a list of dependency patterns for each
relation from annotated examples. For example,
in the sentence "Paola, Queen of the Belgians is
the wife of King Albert of Belgium.", the
dependency pattern between Paola and King
Albert is [nn, nsubj, prep_of] and the
dependencies are nn(Queen, Paola), nsubj(wife, Queen),
prep_of(wife, Albert). We simplify the pattern
[nn, nsubj, prep_of] to [nsubj, prep_of] by
removing leading and trailing nn labels for noise reduction.
We notice that the dependency
patterns sometimes contain consecutive identical labels, as in [nsubj, dobj,
prep_of, prep_of, poss]. In such cases, we
simplify the pattern by replacing the repeated
labels with a single label, which turns
[nsubj, dobj, prep_of, prep_of, poss] into [nsubj,
dobj, prep_of, poss]. This simplification
generalizes the dependency patterns.</p>
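          <p>The two simplification rules described above can be sketched in a few lines. The function name simplify_pattern is hypothetical, and stripping every leading and trailing nn label (rather than only one) is an assumption of this illustration.</p>
          <preformat>
```python
def simplify_pattern(labels):
    """Drop leading/trailing "nn" labels and collapse repeated neighbors."""
    labels = list(labels)
    # Rule 1: remove "nn" at both ends of the pattern.
    while labels and labels[0] == "nn":
        labels.pop(0)
    while labels and labels[-1] == "nn":
        labels.pop()
    # Rule 2: keep a single copy of consecutive identical labels.
    simplified = []
    for label in labels:
        if not simplified or simplified[-1] != label:
            simplified.append(label)
    return simplified


first = simplify_pattern(["nn", "nsubj", "prep_of"])
second = simplify_pattern(["nsubj", "dobj", "prep_of", "prep_of", "poss"])
```
          </preformat>
          <p>Both examples from the text are reproduced: the first call returns [nsubj, prep_of] and the second returns [nsubj, dobj, prep_of, poss].</p>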
          <p>The acquired patterns are compared to the
simplified dependency path of a sentence by
computing edit distances. Suppose that the pre-annotated
dependency patterns for a relation R are (a,b,c), (a,c,d) and (b,c,d),
and that the dependency pattern (a,c,b)
is extracted from the relation provenance sentence
between the query and the filler mention for a
claimed relation R. We calculate the edit
distance for each of the pairs [(a,c,b), (a,b,c)],
[(a,c,b), (a,c,d)] and [(a,c,b), (b,c,d)], and keep the
minimum edit distance as a feature.</p>
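          <p>The minimum edit distance feature can be illustrated as follows. The text does not say which edit distance is used, so this sketch assumes the Levenshtein distance (insertions, deletions and substitutions of cost 1) over sequences of dependency labels. The function names are hypothetical.</p>
          <preformat>
```python
def edit_distance(source, target):
    """Levenshtein distance between two label sequences."""
    previous = list(range(len(target) + 1))
    for i, source_label in enumerate(source, start=1):
        current = [i]
        for j, target_label in enumerate(target, start=1):
            cost = 0 if source_label == target_label else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def min_pattern_distance(pattern, known_patterns):
    """Smallest distance between `pattern` and any known pattern."""
    return min(edit_distance(pattern, known) for known in known_patterns)


# The example from the text: patterns known for R and an extracted pattern.
known = [("a", "b", "c"), ("a", "c", "d"), ("b", "c", "d")]
feature = min_pattern_distance(("a", "c", "b"), known)
```
          </preformat>
          <p>Under this assumption the three distances are 2, 1 and 2, so the feature value for the example is 1.</p>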
          <p>Since relations are often expressed in short
dependency paths, the length of the simplified
pattern is considered as a feature.</p>
          <p>The semantic analysis is based on
trigger words associated with the relation types.
We encode semantic features as boolean values
by defining two types of trigger words: positive
triggers and negative triggers. Positive trigger words
are keywords that strongly support a
particular relation, while negative triggers strongly
negate the claimed relation. For example, wife,
husband and married are positive triggers, while
parent, children and brother are negative triggers for a
spouse relation. We collected these seed words
from the assessed dataset of the TAC KBP 2014 slot
filling task. In total we collected around 250
triggers and 553 dependency patterns for 41 relations
from 3,579 annotated snippets.</p>
          <p>
            Since the relations are expressed by a variety
of words it is hard to collect all the trigger words
for a relation. Therefore, we associate a word
embedding to each trigger by using a pre-trained
GloVe
            <xref ref-type="bibr" rid="ref23">(Pennington et al., 2014)</xref>
            model. Thus,
deciding whether a word is a trigger relies on the
similarity between its embedding and those of the known
triggers. Suppose a and b are
two words between the query and filler mentions of
a claimed relation R, and x, y and z are the positive triggers
for the claimed relation. We compute the
similarity between the vectors of each pair (a,x), (a,y),
(a,z), (b,x), (b,y), (b,z). If the similarity score for
a or b reaches a predefined threshold
(0.7), we consider that there exists a trigger word.
We check whether there is any positive and/or
negative trigger word in three cases for validating a
claimed relation: 1) between the mentions at
surface level, 2) in the dependency path and 3) in the
minimum subtree as in
            <xref ref-type="bibr" rid="ref11 ref22 ref5">(Chowdhury and Lavelli,
2012)</xref>
            .
          </p>
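          <p>The embedding-based trigger test can be sketched as below. The similarity measure is not named in the text, so cosine similarity is an assumption here, and the two-dimensional vectors are invented toy values standing in for pre-trained GloVe embeddings. The function names are hypothetical; only the 0.7 threshold comes from the text.</p>
          <preformat>
```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def has_trigger(words, triggers, embeddings, threshold=0.7):
    """True if any word is close enough to any known trigger."""
    for word in words:
        if word not in embeddings:
            continue  # out-of-vocabulary words cannot be compared
        for trigger in triggers:
            if trigger not in embeddings:
                continue
            if cosine(embeddings[word], embeddings[trigger]) >= threshold:
                return True
    return False


# Toy embeddings (invented values, not real GloVe vectors).
toy_embeddings = {
    "wife": [1.0, 0.1],
    "spouse": [0.9, 0.2],
    "prize": [0.0, 1.0],
}

found = has_trigger(["spouse"], ["wife"], toy_embeddings)
missed = has_trigger(["prize"], ["wife"], toy_embeddings)
```
          </preformat>
          <p>The same function can be applied to the words between the mentions, on the dependency path, or in the minimum subtree, once with the positive and once with the negative trigger list.</p>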
          <p>Some relations can be expressed without using
any trigger word. For example, the snippet "Mr.
David, from California won the prize" expresses
the per:city_of_residence relation without
explicitly using any trigger word. We classify the
relation types into two classes, according to whether or not
they can be expressed without a trigger word, and use a boolean flag
(is_trigger-based_relation) as a feature.</p>
        </sec>
      </sec>
      <sec id="sec-4-3">
        <title>4.3 Voting: Filler Credibility</title>
        <p>We calculate a credibility score for
candidates based on all the responses given by
different systems to a query.
Let F be the candidates of a query Q supplied
by the systems S. The credibility of a candidate
<inline-formula><tex-math>F_i</tex-math></inline-formula> is computed by Eq. 5:
<disp-formula><label>(5)</label><tex-math>\text{filler credibility}(F_i, Q) = \frac{\text{number of occurrences of } F_i}{\text{number of occurrences of all the candidates}}</tex-math></disp-formula></p>
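        <p>Eq. 5 amounts to a relative vote count, as the following sketch shows. The function name filler_credibility and the representation of the system outputs as a flat list of filler strings are assumptions made for the example.</p>
        <preformat>
```python
from collections import Counter


def filler_credibility(responses):
    """Map each candidate filler to its share of all responses to a query."""
    counts = Counter(responses)
    total = sum(counts.values())
    return {filler: count / total for filler, count in counts.items()}


# Toy responses from four hypothetical systems to one spouse query.
votes = [
    "Michelle Robinson",
    "Michelle Robinson",
    "Michelle Robinson",
    "Hillary Clinton",
]
credibility = filler_credibility(votes)
```
        </preformat>
        <p>Here three of the four responses agree on one candidate, which therefore receives a credibility of 0.75, while the other candidate receives 0.25.</p>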
        <p>
          The filler credibility counts the relative vote of
a candidate which indicates the degree of
agreement by different systems to consider the
candidate as correct. Since we can assume that
systems already performed some linguistic and
probabilistic analysis to make the responses, filler
credibility holds strong evidence for a candidate to be
correct. Some slot filling and slot filler
validation methods have used the system credibility
          <xref ref-type="bibr" rid="ref33">(Yu
et al., 2014)</xref>
          and confidence score
          <xref ref-type="bibr" rid="ref25 ref28 ref30">(Wang et al.,
2013; Viswanathan et al., 2015; Rodriguez et al.,
2015)</xref>
          of the responses for validating relations, but
these features are highly system and data
dependent. Therefore, we use only filler credibility as
the baseline.
        </p>
      </sec>
      <sec id="sec-4-4">
        <title>5.1 Data</title>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5 Experiments and Results</title>
      <p>We perform our experiments using the TAC-KBP
English cold start slot filling (CSSF) datasets of
2015 and 2016. TAC provided a reference
corpus for the English CSSF-2015 evaluation task
that consisted of 45,000 documents. These
documents include texts from newswires and
discussion_forums. We parsed these texts to build
the knowledge graph. We compiled our
training data from the assessments of the English
CSSF-2015 responses of slot filling systems. There were
9,339 round-1 queries for English CSSF-2015, and
in total 330,314 round-2 queries were generated
by all the slot filling systems based on the
responses to round-1 queries. NIST assessed the SF
responses to around 2,000 round-1 and 2,500
round-2 queries. Many queries were
answered with only wrong responses; we therefore
do not take these queries into account when
building our training corpus. We selected only queries
that were answered with both correct and wrong
responses. This subset contains 1,296 (1,080
round-1 and 216 round-2) slot filling queries.</p>
      <p>We extracted the answers corresponding to those
queries from the system assessment file, which
contains the assessments of the filler values and of the
relation provenance offsets. The
relation provenance offsets refer to the document
ids and the begin-end positions of the text segments
in the evaluation corpus. A filler
assessment can be correct (C), wrong (W) or
inexact (X), while a relation
provenance assessment can be correct (C), wrong (W), short
(S) or long (L), where S and L are considered
inexact. We only take into account the C and
W filler assessments and separate the correct and
wrong responses according to the relation
provenance assessment. When the relation provenance
assessment is C, the filler assessment can be
either C or X but not W. This results in 68,076
responses. Several features have to be computed on
complete sentences, not on sentence excerpts.
As the relation provenance offset of an SF response
is not guaranteed to cover a complete sentence, we
extract the complete sentence corresponding to the
relation provenance snippet from the source
document.</p>
      <p>The linguistic features are calculated from the
analyzed sentences, in which the mentions of the pair
of entities (query and candidate) must be
identified. However, our system cannot find both
mentions in all selected sentences. This happens when
either the query entity or the candidate entity is
mentioned by a pronoun or nominal anaphora, as
we do not use any co-reference resolution. In
addition, the named entity detection system, which
combines two efficient systems, does not
detect all the entity mentions (of queries and
candidates) present in the queries and hypotheses. This
restriction also applies to the computation of
features based on the graph, which is constructed over
the recognized named entities. This behavior
corresponds to a trend generally observed in named
entity recognition systems when they are applied to
documents different from those on which they were
trained (here web documents and blogs instead of
newspaper articles or Wikipedia pages).</p>
      <p>Moreover, the constraint of finding the two
entities in the same sentence causes a further
decrease in the number of responses that can be processed. A total of 55,276
hypotheses (of the initial 68,076) could be processed
to compute the linguistic features for the responses
to the 1,296 queries that were answered with
both correct and wrong candidates. Our system
can extract graph-based features for only a limited
number of query responses because of the NER
limitation and the in_same_sentence constraint. In
summary, we can extract both the linguistic and graph
features for 4,321 responses from 260 queries
(213 from round-1 and 47 from round-2). Since
there are many more wrong responses than
correct responses, we remove the duplicate responses and
then randomly take a subset of
the wrong responses from the CSSF-2015
dataset for training, so that the ratio of correct to
wrong responses for each query is approximately 1 : 2.
After applying this filtering process, the
training dataset contains in total 3,481 (1,268
positive and 2,213 negative) instances.</p>
      <p>A similar process has been applied to the TAC KBP
CSSF-2016 dataset that we use for testing. The
CSSF-2016 dataset consists of around 30,000
documents. In total, 34,267 responses (to 925 queries)
have been assessed as correct or wrong by TAC.
Our system could compute graph-based features
for 3,884 responses to 352 queries that
were answered with both correct and wrong
answers. There are 699 correct and 3,185 wrong
responses among the 3,884 responses with
graph-based features in our test dataset. The statistics
of the training and test datasets are shown in
columns 1 to 4 of Table 4.</p>
      <p>We trained models using several
classifiers and evaluated the relation validation method by
the standard precision, recall, F-measure and accuracy
of the different models.</p>
      <p>In this experiment we show the contribution of
the proposed graph-based features for validating
relations. Since community-graph-based analysis
does not account for the semantics but holds some
evidence of how the entities are associated with each
other, we expect a significant gain in precision from
our relation validation method, so that a better
F-score can be achieved.</p>
      <p>We compare the classification performance of
different classifiers, e.g. LibLinear, SVM, Naive
Bayes, MaxEnt and Random Forest, based on the
best combination of the features, as shown in
Table 2. We obtain the best precision (32.2),
recall (48.1), F-score (38.5) and accuracy (72.4) with the
Random Forest classifier. The second highest
precision (29.0) and accuracy (72.35) are obtained
by MaxEnt, while Naive Bayes yields the second
highest recall (45.8) and F-score (38.5). Since
Random Forest gives the best scores among the
classifiers, we examine the performance of the
different feature sets with this classifier.</p>
      <p>
        Table 3 presents the classification performance
of the different feature sets with the Random Forest
classifier. Filler credibility as a single feature obtains
an F-score of 34.6 and an accuracy of 62.1.
It represents a strong baseline, as shown in
        <xref ref-type="bibr" rid="ref26">(Sammons et al., 2014)</xref>
        . The combination of filler
credibility and linguistic features boosts the
accuracy by around 6 points,
though the F-score drops slightly. When filler
credibility is combined with the graph features, it
gains significantly in precision (29.0) and accuracy
(70.3), which are around 4 and 8 points higher,
respectively. This combination also increases the
F-score slightly. This may seem surprising, as the
graphs are the same for hypotheses about pairs
of identical entities linked by different
relationships. It should be noted, however, that the relation
hypotheses are already the results of different
systems based on the semantics of relations. The
graph-based features account for the global context of
the relation hypotheses, and the experimental
results show their significant contribution.
      </p>
      <p>The best precision (32.2), F-score (38.5) and
accuracy (72.4) are achieved by combining filler
credibility, linguistic and graph features. This
combination improves the precision, F-score and
accuracy by around 7, 4 and 10 points over
filler credibility alone. The precision gain comes
from a low number of false positives, which indicates that the system
classifies only a small number of wrong responses as
correct. These results strongly support the contribution
of graph-based features for validating the claimed
relations.</p>
      <sec id="sec-5-1">
        <title>Classifier</title>
        <sec id="sec-5-1-1">
          <title>LibLinear SVM</title>
        </sec>
        <sec id="sec-5-1-2">
          <title>Naive Bayes</title>
        </sec>
        <sec id="sec-5-1-3">
          <title>MaxEnt</title>
        </sec>
        <sec id="sec-5-1-4">
          <title>Random Forest</title>
        </sec>
      </sec>
      <sec id="sec-5-2">
        <title>Feature Groups</title>
        <p>We also investigated the classification
performance of the Fc+Linguistic+Graph model, relation by
relation, as shown in Table 4.</p>
        <p>We notice that the classification performance
differs across relations in terms of
F-score and accuracy, although we train
a single model on the responses of all
relations. However, we do not have a similar
number of training instances for all the relations
(e.g. per-city_of_birth and org-subsidiaries),
as shown in column 2, which may affect the
results. Additionally, in the test data, some
relations have very few positive responses
compared to negative ones (see column 5),
which causes inconsistency in
classification performance. For example, the
distributions of positive and negative responses of
org-top_members_employees, per-city_of_birth
and per-parents are more balanced than those of
per-countries_of_residence, org-subsidiaries,
per-employee_or_member_of and org-country_of_headquarters.
Consequently, the scores of these two sets of
relations differ clearly. Also, some
relations (org-parents, per-country_of_birth and
per-stateorprovince_of_birth) have a very small
number of positive instances, all of which
have been classified as wrong, so each of these relations
counts zero true positives. Therefore,
their precision, recall and F-score are zero.
However, most of the negative instances of
these relations have been correctly classified
as wrong. Moreover, the test dataset contains
only negative instances for some relations
(per-cities_of_residence, per-schools_attended,
per-children, org-alternate_names and
org-founded_by). All of these instances have been
correctly classified as wrong, which results in 100%
accuracy for these relations. Interestingly, we
notice that the proposed relation validation method
discards all the wrong instances of the per-children
relation although the training dataset does not
contain any positive or negative instance of this
relation. A similar result has been observed
for the per-country_of_death relation, for which the
relation validation system obtains an accuracy
of 86.7. These results
suggest that a relation validation system trained
on instances of different relations is able to
predict whether an instance of a relation
is correct or wrong even when the system has not been
trained on instances of that particular relation.</p>
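        <p>The relation-by-relation evaluation discussed above can be sketched as follows. This is a minimal illustration, assuming each test instance is available as a (relation, gold label, predicted label) tuple; the function names and the toy instances are hypothetical and are not the paper's data. The sketch shows why a relation with no true positive gets zero precision, recall and F-score, and why a relation with only negative instances can still reach 100% accuracy.</p>
        <preformat>
```python
from collections import defaultdict


def safe_div(numerator, denominator):
    """Return 0.0 instead of failing when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def per_relation_scores(instances):
    """Compute precision, recall, F-score and accuracy for each relation.

    `instances` is an iterable of (relation, gold, predicted) tuples with
    boolean labels; this layout is an assumption made for the illustration.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for relation, gold, predicted in instances:
        if gold and predicted:
            counts[relation]["tp"] += 1
        elif gold:
            counts[relation]["fn"] += 1
        elif predicted:
            counts[relation]["fp"] += 1
        else:
            counts[relation]["tn"] += 1

    scores = {}
    for relation, c in counts.items():
        precision = safe_div(c["tp"], c["tp"] + c["fp"])
        recall = safe_div(c["tp"], c["tp"] + c["fn"])
        scores[relation] = {
            "precision": precision,
            "recall": recall,
            "f_score": safe_div(2 * precision * recall, precision + recall),
            "accuracy": safe_div(c["tp"] + c["tn"], sum(c.values())),
        }
    return scores


# Hypothetical toy instances (not taken from the CSSF-2016 test set).
toy = [
    ("per-children", False, False),  # only negatives, all rejected
    ("per-children", False, False),
    ("org-parents", True, False),    # the single positive is missed
    ("org-parents", False, False),
    ("org-parents", False, True),
]
scores = per_relation_scores(toy)
```
        </preformat>
        <p>For the toy per-children relation, the accuracy is 1.0 while precision, recall and F-score are 0.0, which mirrors the behaviour reported for the relations that have only negative test instances.</p>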
        <p>The relations listed in Table 4 are:
org-top_members_employees, per-city_of_birth,
statesorprovinces_of_residence, org-city_of_headquarters,
per-parents, per-country_of_death,
per-countries_of_residence, org-country_of_headquarters,
stateorprovince_of_headquarters, org-subsidiaries,
per-employee_or_member_of, org-parents,
per-country_of_birth, per-stateorprovince_of_birth,
per-cities_of_residence, per-schools_attended,
per-children, org-alternate_names and org-founded_by,
followed by an All Together row.</p>
        <p>Table 5 shows more clearly that our system is
better than the baseline system at discarding
negative responses. It presents the confusion matrix
(true positives (TP), false negatives (FN), false
positives (FP) and true negatives (TN)) of the
classification task for the different feature sets. The
Fc+Linguistic+Graph combination discards 77.7% of the
wrong responses, which is around 14 percentage points higher than
the baseline. All the experimental results support
the contribution of the proposed features to the
task of relation validation.</p>
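        <p>As a rough consistency check of the aggregate figures, the sketch below derives the evaluation measures from a confusion matrix. The counts are assumptions reconstructed by rounding from numbers reported in the text (699 correct and 3,185 wrong test responses, a recall of 48.1 and a discard rate of 77.7% for Fc+Linguistic+Graph); the actual counts are those of Table 5 and may differ by a few instances. The function name <monospace>metrics_from_confusion</monospace> is hypothetical.</p>
        <preformat>
```python
def metrics_from_confusion(tp, fn, fp, tn):
    """Standard measures computed from the four confusion-matrix cells."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f_score": 2 * precision * recall / (precision + recall),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        # Share of wrong responses that the classifier rejects.
        "discard_rate": tn / (tn + fp),
    }


# Test set sizes reported in the text.
n_correct, n_wrong = 699, 3185

# Assumption: counts rebuilt from the reported recall and discard rate.
tp = round(0.481 * n_correct)
tn = round(0.777 * n_wrong)
fn = n_correct - tp
fp = n_wrong - tn

metrics = metrics_from_confusion(tp, fn, fp, tn)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}")
```
        </preformat>
        <p>With these reconstructed counts, the computed precision, F-score and accuracy fall within about 0.1 point of the reported 32.2, 38.5 and 72.4, which suggests that the reported measures are mutually consistent.</p>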
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>This paper presents a method for validating
relationships from system outputs. We have
introduced features on the linked entities that
are computed at the global level of the
collection. We have proposed to exploit the
community graphs of entities, which make it possible
to account for general knowledge about
entities that have true relationships. Experimental
results have shown that our proposed features
significantly improve a baseline constructed from the
votes on the responses of different systems. The
proposed method outperforms the baseline at
discarding wrong relationships.</p>
      <p>The computation of the different features
depends on the parsing of the texts and, in particular,
on the results of the NER system. This part has to
be improved in order to evaluate the contribution
of the community graph on more responses. Although
the proposed method yields a better F-score and
accuracy than the baseline, it also
discards some positive responses, which lowers the
recall; we therefore have to overcome this limitation.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Isabelle</given-names>
            <surname>Augenstein</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Web Relation Extraction with Distant Supervision</article-title>
          .
          <source>Ph.D. thesis</source>
          , University of Sheffield.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Phillip</given-names>
            <surname>Bonacich</surname>
          </string-name>
          and
          <string-name>
            <given-names>Paulette</given-names>
            <surname>Lloyd</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>Eigenvector-like measures of centrality for asymmetric relations</article-title>
          .
          <source>Social networks</source>
          <volume>23</volume>
          (
          <issue>3</issue>
          ):
          <fpage>191</fpage>
          -
          <lpage>201</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
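          <string-name>
            <given-names>Razvan C</given-names>
            <surname>Bunescu</surname>
          </string-name>
          and
          <string-name>
            <given-names>Raymond J</given-names>
            <surname>Mooney</surname>
          </string-name>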
          .
          <year>2005</year>
          .
          <article-title>A shortest path dependency kernel for relation extraction</article-title>
          .
          <source>In Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing. Association for Computational Linguistics</source>
          , pages
          <fpage>724</fpage>
          -
          <lpage>731</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Rui</given-names>
            <surname>Cai</surname>
          </string-name>
          , Xiaodong Zhang, and
          <string-name>
            <given-names>Houfeng</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Bidirectional recurrent convolutional neural network for relation classification</article-title>
          .
          <source>In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          . Association for Computational Linguistics
          , Berlin, Germany, pages
          <fpage>756</fpage>
          -
          <lpage>765</lpage>
          . http://www.aclweb.org/anthology/P16-1072.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Md Faisal Mahbub</given-names>
            <surname>Chowdhury</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alberto</given-names>
            <surname>Lavelli</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Combining tree structures, flat features and patterns for biomedical relation extraction</article-title>
          .
          <source>In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics</source>
          , pages
          <fpage>420</fpage>
          -
          <lpage>429</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Hang</given-names>
            <surname>Cui</surname>
          </string-name>
          , Renxu Sun,
          <string-name>
            <given-names>Keya</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Min-Yen</given-names>
            <surname>Kan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Tat-Seng</given-names>
            <surname>Chua</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Question answering passage retrieval using dependency relations</article-title>
          .
          <source>In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval. ACM</source>
          , pages
          <fpage>400</fpage>
          -
          <lpage>407</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Aron</given-names>
            <surname>Culotta</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Sorensen</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>Dependency tree kernels for relation extraction</article-title>
          .
          <source>In Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics</source>
          , page
          <fpage>423</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Dmitriy</given-names>
            <surname>Dligach</surname>
          </string-name>
          , Timothy Miller,
          <string-name>
            <given-names>Chen</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Steven</given-names>
            <surname>Bethard</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Guergana</given-names>
            <surname>Savova</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Neural temporal relation extraction</article-title>
          .
          <source>EACL 2017</source>
          , page
          <fpage>746</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Bettina</given-names>
            <surname>Friedl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Julia</given-names>
            <surname>Heidemann</surname>
          </string-name>
          , et al.
          <year>2010</year>
          .
          <article-title>A critical review of centrality measures in social networks</article-title>
          .
          <source>Business &amp; Information Systems Engineering</source>
          <volume>2</volume>
          (
          <issue>6</issue>
          ):
          <fpage>371</fpage>
          -
          <lpage>385</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Katrin</given-names>
            <surname>Fundel</surname>
          </string-name>
          , Robert Küffner, and
          <string-name>
            <given-names>Ralf</given-names>
            <surname>Zimmer</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Relex-relation extraction using dependency parse trees</article-title>
          .
          <source>Bioinformatics</source>
          <volume>23</volume>
          (
          <issue>3</issue>
          ):
          <fpage>365</fpage>
          -
          <lpage>371</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Pablo</given-names>
            <surname>Gamallo</surname>
          </string-name>
          , Marcos Garcia, and
          <string-name>
            <given-names>Santiago</given-names>
            <surname>Fernández-Lanza</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Dependency-based open information extraction</article-title>
          .
          <source>In Proceedings of the joint workshop on unsupervised and semi-supervised learning in NLP. Association for Computational Linguistics</source>
          , pages
          <fpage>10</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Matt</given-names>
            <surname>Gardner</surname>
          </string-name>
          and Tom M Mitchell.
          <year>2015</year>
          .
          <article-title>Efficient and expressive knowledge base completion using subgraph feature extraction</article-title>
          .
          <source>In EMNLP</source>
          . pages
          <fpage>1488</fpage>
          -
          <lpage>1498</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Xianpei</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Le</given-names>
            <surname>Sun</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jun</given-names>
            <surname>Zhao</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Collective entity linking in web text: a graph-based method</article-title>
          .
          <source>In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval. ACM</source>
          , pages
          <fpage>765</fpage>
          -
          <lpage>774</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Raphael</given-names>
            <surname>Hoffmann</surname>
          </string-name>
          , Congle Zhang, Xiao Ling, Luke Zettlemoyer, and Daniel S Weld.
          <year>2011</year>
          .
          <article-title>Knowledge-based weak supervision for information extraction of overlapping relations</article-title>
          .
          <source>In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1</source>
          . Association for Computational Linguistics, pages
          <fpage>541</fpage>
          -
          <lpage>550</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Holzinger</surname>
          </string-name>
          , Bernhard Ofner, Christof Stocker, André Calero Valdez, Anne Kathrin Schaar, Martina Ziefle, and
          <string-name>
            <given-names>Matthias</given-names>
            <surname>Dehmer</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>On graph entropy measures for knowledge discovery from publication network data</article-title>
          .
          <source>In Availability, reliability, and security in information systems and HCI</source>
          , Springer, pages
          <fpage>354</fpage>
          -
          <lpage>362</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Ni</given-names>
            <surname>Lao</surname>
          </string-name>
          and William W Cohen.
          <year>2010</year>
          .
          <article-title>Relational retrieval using a combination of path-constrained random walks</article-title>
          .
          <source>Machine learning</source>
          <volume>81</volume>
          (
          <issue>1</issue>
          ):
          <fpage>53</fpage>
          -
          <lpage>67</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Ni</given-names>
            <surname>Lao</surname>
          </string-name>
          , Einat Minkov, and William W Cohen.
          <year>2015</year>
          .
          <article-title>Learning relational features with backward random walks</article-title>
          .
          <source>In ACL (1)</source>
          . pages
          <fpage>666</fpage>
          -
          <lpage>675</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Yang</given-names>
            <surname>Liu</surname>
          </string-name>
          , Furu Wei,
          <string-name>
            <given-names>Sujian</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Heng</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ming</given-names>
            <surname>Zhou</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Houfeng</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>A dependency-based neural network for relation classification</article-title>
          .
          <source>In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)</source>
          . Association for Computational Linguistics
          , Beijing, China, pages
          <fpage>285</fpage>
          -
          <lpage>290</lpage>
          . http://www.aclweb.org/anthology/P15-2047.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>Bernardo</given-names>
            <surname>Magnini</surname>
          </string-name>
          , Matteo Negri, Roberto Prevete, and
          <string-name>
            <given-names>Hristo</given-names>
            <surname>Tanev</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Is it the right answer?: exploiting web redundancy for answer validation</article-title>
          .
          <source>In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics</source>
          , pages
          <fpage>425</fpage>
          -
          <lpage>432</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>Christopher D.</given-names>
            <surname>Manning</surname>
          </string-name>
          , Mihai Surdeanu, John Bauer, Jenny Finkel,
          <string-name>
            <given-names>Steven J.</given-names>
            <surname>Bethard</surname>
          </string-name>
          , and
          <string-name>
            <given-names>David</given-names>
            <surname>McClosky</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>The Stanford CoreNLP natural language processing toolkit</article-title>
          .
          <source>In Association for Computational Linguistics (ACL) System Demonstrations</source>
          . pages
          <fpage>55</fpage>
          -
          <lpage>60</lpage>
          . http://www.aclweb.org/anthology/P/P14/P14-5010.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>Mike</given-names>
            <surname>Mintz</surname>
          </string-name>
          , Steven Bills, Rion Snow, and
          <string-name>
            <given-names>Dan</given-names>
            <surname>Jurafsky</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Distant supervision for relation extraction without labeled data</article-title>
          .
          <source>In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2-Volume 2. Association for Computational Linguistics</source>
          , pages
          <fpage>1003</fpage>
          -
          <lpage>1011</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <given-names>Feng</given-names>
            <surname>Niu</surname>
          </string-name>
          , Ce Zhang, Christopher Ré, and Jude W Shavlik.
          <year>2012</year>
          .
          <article-title>Deepdive: Web-scale knowledgebase construction using statistical learning and inference</article-title>
          .
          <source>VLDS</source>
          <volume>12</volume>
          :
          <fpage>25</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Pennington</surname>
          </string-name>
          , Richard Socher, and
          <string-name>
            <given-names>Christopher D.</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Glove: Global vectors for word representation</article-title>
          .
          <source>In Empirical Methods in Natural Language Processing (EMNLP)</source>
          . pages
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          . http://www.aclweb.org/anthology/D14-1162.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Riedel</surname>
          </string-name>
          , Limin Yao, and
          <string-name>
            <given-names>Andrew</given-names>
            <surname>McCallum</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Modeling relations and their mentions without labeled text</article-title>
          .
          <source>In Machine Learning and Knowledge Discovery in Databases</source>
          , Springer, pages
          <fpage>148</fpage>
          -
          <lpage>163</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <given-names>Miguel</given-names>
            <surname>Rodriguez</surname>
          </string-name>
          , Sean Goldberg, and Daisy Zhe Wang.
          <year>2015</year>
          .
          <article-title>University of Florida DSR Lab system for KBP slot filler validation 2015</article-title>
          .
          <source>In Proceedings of the Eighth Text Analysis Conference (TAC2015).</source>
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <given-names>Mark</given-names>
            <surname>Sammons</surname>
          </string-name>
          , Yangqiu Song, Ruichen Wang, Gourab Kundu,
          <string-name>
            <given-names>Chen-Tse</given-names>
            <surname>Tsai</surname>
          </string-name>
          , Shyam Upadhyay, Siddarth Ancha, Stephen Mayhew, and
          <string-name>
            <given-names>Dan</given-names>
            <surname>Roth</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Overview of ui-ccg systems for event argument extraction, entity discovery and linking, and slot filler validation</article-title>
          .
          <source>Urbana</source>
          <volume>51</volume>
          :
          <fpage>61801</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <given-names>Luis</given-names>
            <surname>Solá</surname>
          </string-name>
          , Miguel Romance, Regino Criado, Julio Flores,
          Alejandro Garcia del Amo, and Stefano Boccaletti
          .
          <year>2013</year>
          .
          <article-title>Eigenvector centrality of nodes in multiplex networks</article-title>
          .
          <source>Chaos: An Interdisciplinary Journal of Nonlinear Science</source>
          <volume>23</volume>
          (
          <issue>3</issue>
          ):
          <fpage>033131</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <string-name>
            <given-names>Vidhoon</given-names>
            <surname>Viswanathan</surname>
          </string-name>
          , Nazneen Fatema Rajani, Yinon Bentor, and
          <string-name>
            <given-names>Raymond</given-names>
            <surname>Mooney</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Stacked ensembles of information extractors for knowledgebase population</article-title>
          .
          <source>In Proceedings of ACL.</source>
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <string-name>
            <given-names>Ngoc Thang</given-names>
            <surname>Vu</surname>
          </string-name>
          , Heike Adel, Pankaj Gupta, and
          <string-name>
            <given-names>Hinrich</given-names>
            <surname>Schütze</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Combining recurrent and convolutional neural networks for relation classification</article-title>
          .
          <source>In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          . Association for Computational Linguistics, San Diego, California, pages
          <fpage>534</fpage>
          -
          <lpage>539</lpage>
          . http://www.aclweb.org/anthology/N16-1065.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          <string-name>
            <given-names>I-Jeng</given-names>
            <surname>Wang</surname>
          </string-name>
          , Edwina Liu, Cash Costello, and
          <string-name>
            <given-names>Christine</given-names>
            <surname>Piatko</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Jhuapl tac-kbp2013 slot filler validation system</article-title>
          .
          <source>In Proceedings of the Sixth Text Analysis Conference (TAC 2013)</source>
          , volume
          <volume>24</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          <string-name>
            <given-names>Quan</given-names>
            <surname>Wang</surname>
          </string-name>
          , Jing Liu, Yuanfei Luo,
          <string-name>
            <given-names>Bin</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C</given-names>
            <surname>Lin</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Knowledge base completion via coupled path ranking</article-title>
          .
          <source>In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics</source>
          . pages
          <fpage>1308</fpage>
          -
          <lpage>1318</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          <string-name>
            <given-names>Limin</given-names>
            <surname>Yao</surname>
          </string-name>
          , Aria Haghighi,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Riedel</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Andrew</given-names>
            <surname>McCallum</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Structured relation discovery using generative models</article-title>
          .
          <source>In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics</source>
          , pages
          <fpage>1456</fpage>
          -
          <lpage>1466</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          <string-name>
            <given-names>Dian</given-names>
            <surname>Yu</surname>
          </string-name>
          , Hongzhao Huang, Taylor Cassidy, Heng Ji, Chi Wang,
          <string-name>
            <given-names>Shi</given-names>
            <surname>Zhi</surname>
          </string-name>
          , Jiawei Han,
          <string-name>
            <given-names>Clare R</given-names>
            <surname>Voss</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Malik</given-names>
            <surname>Magdon-Ismail</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>The wisdom of minority: Unsupervised slot filling validation based on multi-dimensional truth-finding</article-title>
          .
          <source>In COLING</source>
          . pages
          <fpage>1567</fpage>
          -
          <lpage>1578</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          <string-name>
            <given-names>Dian</given-names>
            <surname>Yu</surname>
          </string-name>
          and
          <string-name>
            <given-names>Heng</given-names>
            <surname>Ji</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Unsupervised person slot filling based on graph mining</article-title>
          .
          <source>In ACL.</source>
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          <string-name>
            <given-names>Suncong</given-names>
            <surname>Zheng</surname>
          </string-name>
          , Jiaming Xu,
          <string-name>
            <given-names>Peng</given-names>
            <surname>Zhou</surname>
          </string-name>
          , Hongyun Bao, Zhenyu Qi, and
          <string-name>
            <given-names>Bo</given-names>
            <surname>Xu</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>A neural network framework for relation extraction: Learning entity semantic and relation pattern</article-title>
          .
          <source>Knowledge-Based Systems</source>
          <volume>114</volume>
          :
          <fpage>12</fpage>
          -
          <lpage>23</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>