<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>April</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Knowledge Base Augmentation using Tabular Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yoones A. Sekhavat</string-name>
          <email>yoones.a.s@ualberta.ca</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco di Paolo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Denilson Barbosa</string-name>
          <email>denilson@ualberta.ca</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paolo Merialdo</string-name>
          <email>merialdo@dia.uniroma3.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Roma Tre University</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Alberta</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2014</year>
      </pub-date>
      <volume>8</volume>
      <issue>2014</issue>
      <abstract>
        <p>Large linked data repositories have been built by leveraging semi-structured data in Wikipedia (e.g., DBpedia) and through extracting information from natural language text (e.g., YAGO). However, the Web contains many other vast sources of linked data, such as structured HTML tables and spreadsheets. Often, the semantics in such tables is hidden, preventing one from extracting triples from them directly. This paper describes a probabilistic method that augments an existing knowledge base with facts from tabular data by leveraging a Web text corpus and natural language patterns associated with relations in the knowledge base. A preliminary evaluation shows high potential for this technique in augmenting linked data repositories.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        This paper focuses on the problem of identifying plausible
relations between pairs of entities that appear in the same
row of a table. Our main assumption is that if someone went
to the trouble of juxtaposing these entities on the same row,
then there must be a relation between them. Moreover, we
seek to augment an existing repository with new instances of
relations already defined. In other words, the list of possible
relations is part of the input. We also assume the entities can
be resolved and linked to a linked open data repository or
knowledge base. However, unlike previous work (e.g., [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]),
our method does not require such links.
      </p>
      <p>Overview. As an example, assume our input table has only
columns 1 and 3 in Figure 1, and assume we have two
candidate relations: plays-for and lives-in. We start from a
list of textual patterns that are associated with each
relation; such patterns are detected by automatic methods for
building knowledge bases from natural language. For
example, patterns for the relation plays-for could be "scored
for" and "signed contract with". In practice, thousands of
patterns are associated with a single relation; conversely,
the same pattern may be associated with multiple relations.
Thus, we build a probabilistic model to estimate the
posterior probability of a relation, given a set of observed text
patterns. To make a prediction for a given pair of entities,
we: (1) collect all sentences containing both entities from
a large text corpus; (2) extract the text in between them;
(3) match those texts against the list of patterns; and (4)
estimate the posterior probability of all candidate relations.
Paper Outline. Section 2 illustrates our model to extract
triples from tabular data. Section 3 discusses our first
implementation of the model, and Section 4 reports the results of
experiments on this first implementation. Section 5 presents
related work, and Section 6 concludes the paper and presents
ideas to improve our model.</p>
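      <p>The four prediction steps above can be sketched in code. The following is a minimal illustration only: the corpus, entity names, and pattern lists are hypothetical toy stand-ins, and raw match counts stand in for the Bayesian posterior developed in Section 2.</p>
      <preformat>
```python
# Sketch of the four-step prediction pipeline (hypothetical toy data).
import re

# Patterns associated with each candidate relation (toy lists).
PATTERNS = {
    "plays-for": ["scored for", "signed contract with"],
    "lives-in": ["moved to", "resides in"],
}

def predict(e1, e2, corpus):
    # (1) collect all sentences containing both entities
    sentences = [s for s in corpus if e1 in s and e2 in s]
    # (2) extract the text in between the two entities
    texts = []
    for s in sentences:
        m = re.search(re.escape(e1) + r"\s+(.*?)\s+" + re.escape(e2), s)
        if m:
            texts.append(m.group(1))
    # (3) match those texts against the pattern lists
    matches = {r: sum(1 for t in texts for p in pats if p in t)
               for r, pats in PATTERNS.items()}
    # (4) rank candidate relations; raw counts stand in for the posterior
    return max(matches, key=matches.get)

corpus = ["Messi scored for Barcelona yesterday",
          "Messi signed contract with Barcelona"]
```
      </preformat>
      <p>With the toy corpus above, predict("Messi", "Barcelona", corpus) selects plays-for, since both matched patterns belong to that relation.</p>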
    </sec>
    <sec id="sec-2">
      <title>2. TRIPLE EXTRACTION</title>
      <p>In general terms, the problem we address is as follows:
given a table T, with n rows and k columns, whose cells
contain mentions of entities, and a set R = {r1, ..., rm} of
relations of interest, we aim to produce all triples
⟨ti[x], rj, ti[y]⟩ for all ti ∈ T, rj ∈ R, where ti[x] denotes
the value of column x in row i.</p>
      <p>Furthermore, we seek to assign relation rj to pairs of columns
in T, in the sense that the predicate rj holds for all rows of
T. The problem boils down to the case of an input table with
just two columns x and y; without loss of generality,
we address this case here.</p>
      <p>
        Understanding semantic relations between entities using text
corpora is a challenging task because a relation can be
expressed with different textual (surface) forms. For example,
the relation plays-for can be inferred from sentences with
the patterns "scored for" and "signed contract with". On the
other hand, a pattern may express more than one relation.
For example, the pattern "played in" may represent the
relations plays-for and performed-at (e.g., "Pink Floyd played
in Pompeii"). Throughout, we use the patterns of the PATTY system [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>Another challenge, which has been the subject of substantial
work recently, is entity linking. For example, "Barcelona"
may represent the football club or the city. Evidently, the
choice of entity linking approach will have an impact on
the quality of the triples produced by a method like ours.
To avoid this factor in our study of the relation assignment
problem, we use exact matching to "link" the entities. In
effect, this is akin to assuming the entities are already linked.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1 Relation between a Pair of Entities</title>
      <p>We start by determining whether a relation r applies to
two entities. In the absence of further information, a
reasonable approach is to search the (textual) Web for all sentences
connecting the entities and determine whether r follows from
those sentences. While textual entailment and other
reasoning methods could be used for this purpose, we turn this into
a probabilistic inference as follows. We start from a known
set of textual patterns p1, ..., pk that are commonly
associated with relation r, thus giving us prior probabilities for
each pattern expressing the relation. To determine if
entities e1 and e2 belong in relation r, we search the corpus for
all sentences containing both e1 and e2, extract all words
in between them, and match those words against the list
of patterns. We can use Bayesian inference to compute the
posterior probability of r given the observed patterns.
In this model, relation r is a categorical class variable whose
domain is R, and the patterns p1, ..., pn are binary evidence
variables, representing patterns observed between entities in
the text corpus. The model is thus:
Pr(r | p1, ..., pn) = Pr(p1, ..., pn | r) Pr(r) / Pr(p1, ..., pn).  (1)</p>
      <p>We assume the evidence variables {p1, ..., pn} are conditionally
independent given the relation, which is reasonable since the
probability that pattern pi represents a relation does not depend on the
probability that another pattern pj also represents that relation.
Given this assumption and using the chain rule, we have:
Pr(r | p1, ..., pn) ∝ Pr(r) Π_i Pr(pi | r).  (2)</p>
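      <p>Under the conditional-independence assumption, computing the posterior reduces to a naive Bayes product. A minimal sketch, with hypothetical toy priors and likelihoods in place of the YAGO/PATTY estimates of Section 3:</p>
      <preformat>
```python
# Naive Bayes posterior over relations given observed patterns.
# PRIOR and LIKELIHOOD are hypothetical toy numbers, not YAGO/PATTY values.
PRIOR = {"plays-for": 0.5, "lives-in": 0.5}           # Pr(r)
LIKELIHOOD = {                                        # Pr(p | r)
    "plays-for": {"scored for": 0.6, "moved to": 0.1},
    "lives-in": {"scored for": 0.05, "moved to": 0.7},
}

def posterior(observed):
    scores = {}
    for r in PRIOR:
        score = PRIOR[r]
        for p in observed:
            # small default probability for patterns unseen with r
            score = score * LIKELIHOOD[r].get(p, 0.01)
        scores[r] = score
    z = sum(scores.values())  # normalize so the scores sum to one
    return {r: s / z for r, s in scores.items()}

post = posterior(["scored for"])
```
      </preformat>
      <p>Observing "scored for" shifts the posterior mass toward plays-for, exactly as Equation 2 prescribes.</p>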
    </sec>
    <sec id="sec-4">
      <title>2.2 Relation between Two Lists of Entities</title>
      <p>Consider now the case of a table with n rows (i.e., a list of n
entity pairs). There are two main approaches, illustrated in
Figure 2: we can find one relation assignment for each row
and compute an aggregate from all these assignments, or we
can observe all text patterns for all rows at once and obtain
a global assignment. We discuss each next.</p>
      <p>Rank Aggregation. In this approach, we first obtain, for each
pair of entities in the table, a ranking of all relations in R
sorted in decreasing likelihood according to the model above.
Next, we combine these ranked lists to find a single
best relation for all entity pairs.</p>
      <p>
        Rank aggregation [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] is a well-studied problem: given n
ranked lists l1, ..., ln, and a distance measure d, the problem
is to find a list l such that Σ_{i=1..n} d(l, li) is minimized. Among
the different distance measures for aggregating ranked lists, we use
Spearman's Footrule (SP) and Kendall's tau (KE), which
were shown to outperform other approaches [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Kendall's
tau distance between two permutations of a list is the
number of pairs from the list which are not in the
same order in the two permutations. This distance is also
known as the number of exchanges needed in a bubble sort
to convert one permutation into the other. Spearman's
Footrule distance, in turn, is the sum of the distances
between the positions of each item in the two
permutations. We also employ a simple average score method to
aggregate ranked lists, called Mean Ranking (MR), which
works as follows: given a sorted (descending) list of
probabilities Pr(r | p1, ..., pk) generated for each pair ⟨xi, yi⟩, and
m possible relations in the knowledge base, we assign a score
sci(r) = m - pos(r), where pos(r) is the position of relation
r in this sorted list.
      </p>
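      <p>The two distance measures and the MR scoring rule can be illustrated concretely. A sketch over toy permutations; the m - pos(r) score follows the definition above:</p>
      <preformat>
```python
from itertools import combinations

def kendall_tau(perm_a, perm_b):
    # number of item pairs that appear in a different order in the
    # two permutations (the bubble-sort exchange count)
    pos = {x: i for i, x in enumerate(perm_b)}
    return sum(1 for x, y in combinations(perm_a, 2) if pos[x] > pos[y])

def footrule(perm_a, perm_b):
    # sum of the positional displacements of each item
    pos = {x: i for i, x in enumerate(perm_b)}
    return sum(abs(i - pos[x]) for i, x in enumerate(perm_a))

def mean_ranking(ranked_lists):
    # score m - pos(r) per ranked list, summed over all entity pairs
    m = len(ranked_lists[0])
    totals = {}
    for lst in ranked_lists:
        for p, r in enumerate(lst):
            totals[r] = totals.get(r, 0) + (m - p)
    return max(totals, key=totals.get)
```
      </preformat>
      <p>For instance, reversing a three-element list gives a Kendall's tau distance of 3 (every pair is exchanged) and a Footrule distance of 4.</p>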
      <p>Global Ranking. Another approach is to feed the patterns
for all entity pairs in the relation as evidence to the
probabilistic model at once. In this Global Ranking (GR) approach, all
observed patterns for all entity pairs simultaneously
contribute to the selection of the most probable relation.</p>
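      <p>The difference from rank aggregation is that GR pools the evidence before a single inference. A sketch with a hypothetical counting scorer standing in for the Bayesian posterior:</p>
      <preformat>
```python
# Global Ranking: the observed patterns of every entity pair are
# pooled and scored once (toy scorer in place of the posterior).
PATTERNS = {"plays-for": {"scored for", "signed contract with"},
            "lives-in": {"moved to"}}

def score(pooled):
    # count how many pooled patterns each relation explains
    return max(PATTERNS,
               key=lambda r: sum(1 for p in pooled if p in PATTERNS[r]))

def global_ranking(rows_patterns):
    # union of evidence across all rows of the table
    pooled = [p for row in rows_patterns for p in row]
    return score(pooled)

rows = [["scored for"], ["moved to"], ["signed contract with"]]
```
      </preformat>
      <p>Even though one row only exhibits "moved to", the pooled evidence still selects plays-for for the column pair as a whole.</p>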
    </sec>
    <sec id="sec-5">
      <title>3. IMPLEMENTATION</title>
      <p>This section describes one instantiation of the probabilistic
model developed in the previous section, using off-the-shelf,
real-world knowledge bases and text corpora.</p>
      <p>
        Data. We use YAGO [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and PATTY [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] to obtain
relations and patterns. YAGO is a popular knowledge base
with about 10 million entities and 120 million facts, and
PATTY, developed by the same research group, associates
22,779 patterns mined from New York Times articles and
Wikipedia with 25 relations. Of those, 24 exist in YAGO
and were used here. Our text corpus is the NELL
Subject-Verb-Object triple corpus [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], with about 604 million triples.
      </p>
      <p>As mentioned before, we link entities mentioned in the tables
to those in the text corpus with exact string matching. That
is, given entities e1 and e2 in a table row, we find all triples in
the NELL corpus which have e1 as the subject and e2 as the
object. Similarly, we match the verbs of the resulting triples
against the PATTY patterns (to obtain the observations).
In effect, our system uses the "intersection" of PATTY
patterns and NELL triples, consisting of 4,357 unique patterns
and 108,699,400 triples. The YAGO relation
has-academic-advisor was discarded as none of its patterns are found in
the NELL corpus. Table 1 shows the relations used and the
number of patterns in each of them. Note the wide variation
in the number of patterns per relation.</p>
      <p>Estimating Prior Probabilities. Recall Equation 2. We
estimate the prior probability of each of the 24 relations
as Pr(r) = |r| / Σ_{ri ∈ R} |ri|, where |r| is the number of
instances of relation r, and R is the set of all relations in
YAGO. As for Pr(p|r), the prior probability that pattern p
occurs among instances of relation r, we use the associations
between YAGO relations and textual patterns in PATTY:
Pr(p|r) = |p| / Σ_{pi ∈ PT(r)} |pi|, where |p| is the number of
occurrences of pattern p and PT(r) is the set of
patterns associated with r. To avoid zero probabilities, we use
the add-one Laplace smoothing technique.</p>
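      <p>These estimates amount to relative-frequency counts with add-one smoothing. A sketch with hypothetical counts; smoothing over the full pattern vocabulary is one plausible reading of the add-one technique mentioned above:</p>
      <preformat>
```python
# Prior estimation with add-one (Laplace) smoothing; all counts are
# hypothetical stand-ins for the YAGO/PATTY statistics.
REL_INSTANCES = {"plays-for": 900, "lives-in": 100}   # |r|
PATTERN_COUNTS = {                                    # |p| within PT(r)
    "plays-for": {"scored for": 30, "signed contract with": 10},
    "lives-in": {"moved to": 5},
}

def prior(r):
    # Pr(r) = |r| divided by the sum of |ri| over all relations
    return REL_INSTANCES[r] / sum(REL_INSTANCES.values())

def likelihood(p, r):
    # Pr(p|r) with add-one smoothing over the full pattern vocabulary,
    # so patterns never seen with r still get nonzero probability
    vocab = set(pat for pats in PATTERN_COUNTS.values() for pat in pats)
    count = PATTERN_COUNTS[r].get(p, 0)
    total = sum(PATTERN_COUNTS[r].values())
    return (count + 1) / (total + len(vocab))
```
      </preformat>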
    </sec>
    <sec id="sec-6">
      <title>4. EXPERIMENTAL EVALUATION</title>
      <p>Since we do not query YAGO to make predictions, we use
some of its facts to build a ground truth to test the
accuracy of our model. We extracted facts from YAGO relations
where both entities can be matched exactly in the NELL
corpus. The number of facts from YAGO that can be found
in the NELL triples using exact matching varies across
relations. The lower bound was 25 facts, for the relation
is-known-for. For the other relations, we picked several random
samples of 25 pairs and tested each separately. The difference
in accuracy was negligible, so we used 25 facts per relation.
Effectiveness. We performed experiments to evaluate the
effectiveness of three different rank aggregation techniques
(i.e., Spearman's Footrule (SP), Kendall's tau (KE), and
Mean Ranking (MR)) as well as the Global Ranking
approach (GR). The ranking of the correct relation generated
by the system is considered a measure of success in
annotating that relation. The best result is achieved when the
correct relation appears at the top of the ranked list for the
facts in that relation in the ground truth. Table 2 shows the
number of relations ranked in the top 3 positions, as well as
anywhere below the 3rd position.</p>
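      <p>This success measure can be computed directly from the rank at which each correct relation appears. A sketch with hypothetical ranks:</p>
      <preformat>
```python
# Tally where the correct relation lands in each ranked output.
# ranks[i] is the (hypothetical) position of the correct relation
# for the i-th test relation; position 1 means ranked first.
def topk_counts(ranks, k=3):
    within = sum(1 for r in ranks if r in range(1, k + 1))
    beyond = len(ranks) - within
    return within, beyond

ranks = [1, 1, 2, 5, 3, 9]
```
      </preformat>
      <p>Here four of the six correct relations land in the top 3 and two land below it.</p>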
      <p>As one can see, GR outperforms the rank aggregation
techniques in identifying correct relations. Among the rank
aggregation techniques, MR works slightly better than SP and
KE in this test. However, based on an analysis of variance, the
differences between the results of the rank aggregation
techniques are not statistically significant. Another
observation is that the number of PATTY patterns has an effect
on the accuracy, and this effect is more pronounced for the
rank aggregation methods. As shown in Table 1, relations
associated with fewer patterns are less likely to be identified
correctly by rank aggregation techniques. We argue this is
due to the lack of sufficient evidence (patterns) for each pair. It
follows that rank aggregation techniques require more
evidence in order to infer correct relations. On the other hand,
GR performs better for relations associated with fewer
patterns. This happens because many relevant patterns are more
likely to appear in the union of the patterns of all pairs
than in those of individual entity pairs.</p>
      <p>We also filtered out relations with 200 or fewer patterns in
our corpus, recomputed the prior probabilities, and
re-executed the experiments on the same dataset.
Table 3 shows the results of this experiment. Although MR
performed slightly better than the other techniques, a
statistical test reveals that the differences are not significant.
What is important to note is that, as expected, filtering out
less popular relations leads to higher overall accuracy.
Applications using our technique can exploit this trade-off to
set an appropriate threshold.</p>
      <p>Performance. For efficiency reasons, we index patterns
using an in-memory suffix tree. The average execution times in
milliseconds for processing a pair of entities (taken over 20
executions) are: 1688 for SP; 1868 for KE; 1729.4 for MR;
and 1719 for GR. As one can see, there are no considerable
differences among the methods. In fact, we observe
that the majority of the time is spent on matching the
entities against the NELL corpus.</p>
    </sec>
    <sec id="sec-7">
      <title>4.1 Towards Knowledge Augmentation</title>
      <p>The ultimate goal of our technique is knowledge
augmentation: generating new instances of relations from tabular
data that are not already in the knowledge base. We
performed preliminary experiments to assess whether our system
could accomplish this goal.</p>
      <p>Our first test was with a spreadsheet including song data
available at www.aardvarkdjservices.co.uk (a website
specialized in music services). We looked at 48 ⟨singer, song⟩
pairs from 2 albums, with 24 songs from Elvis Presley and
24 songs from Frank Sinatra. Every approach returned
created as the best relation between those entities. We
manually verified the 48 facts in this case, and found that only
31 were already in YAGO. In another experiment, we used
a spreadsheet with data about NBA players extracted from
www.espn.go.com, and tested our system with 100 ⟨player,
team⟩ pairs. YAGO had 92 of these pairs in the is-affiliated-to
relation. Every configuration of our system identified all
100 pairs as instances of the plays-for relation, which, one
can argue, is a suitable and more specific relation for these
entities.</p>
    </sec>
    <sec id="sec-8">
      <title>5. RELATED WORK</title>
      <p>
        Much work has been done on understanding
tables, within text or online. Some have attempted to exploit
column headers to identify relations between two columns
(e.g., [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]), which is akin to schema-based data integration.
We make no assumption about the existence of such
information, as it may not be
available for all tables; this makes our method more similar to an
instance-based data integration approach.
      </p>
      </p>
      <p>
        A probabilistic model is proposed in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] to associate class
labels with columns and identify relations between entity
columns and the remaining columns. Recently, joint
inference techniques have been used to simultaneously annotate table cells,
table columns, and relations between columns. In [
        <xref ref-type="bibr" rid="ref10 ref8">8, 10</xref>
        ],
graphical models are employed to annotate column headers,
table cells, and relations between columns. Our work is
similar, to some extent, to [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], in which Wikipedia's tables are
used to generate new triples using DBpedia as a knowledge
base. Unlike our method, these techniques require linking
entities to one or more linked open data repositories.
The problem has also been considered in terms of extracting
a schema for tabular data. In [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], an extraction system is
proposed to convert data stored in spreadsheets into relational
tuples. In [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], a set of row classes representing common
features of individual rows in a table is identified. Then,
Conditional Random Field techniques are used to generate
a sequence of row labels.
      </p>
      </p>
      <p>
        Our work is also related to relation extraction techniques
for text corpora. In supervised learning (e.g., [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]),
manually labelled relations are used to train a model for labeling
relations. On the other hand, in unsupervised approaches
(e.g., [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]), strings between entities in a text corpus are
clustered and then simplified to generate relations. In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], a
classifier is trained using textual features of sentences
between known entities in Freebase. That technique generates
instances of new relations, while ours generates
instances of relations linked to existing relations in linked open
data repositories.
      </p>
    </sec>
    <sec id="sec-9">
      <title>6. CONCLUSION</title>
      <p>In this paper, we described a probabilistic approach for
augmenting linked open data repositories using tabular data,
thus tapping into these under-explored sources of valuable
information. Unlike prior methods that focus on natural
language understanding to determine whether two named
entities are even related, we start from the (reasonable)
assumption that all entities in the same row of a table are
related by construction. Unlike previous methods that
attempt to understand tabular data, we take a more pragmatic
and effective stance: we label pairs of columns in the table
with relations coming from an established knowledge base.
By doing so, all facts we extract can be interpreted in the
same way as those in the knowledge base.</p>
      <p>
        We described a first implementation of our model using
linked open data resources (YAGO, PATTY, and NELL) and
showed experimentally that the approach is effective, despite
the limitations in the way we match entities. We also showed
that it is rather easy to find new knowledge with our model.
Yet, in our opinion, we have only sketched a research direction rich in
opportunities to improve knowledge building and linking.
There are some limitations that we aim to address
in future work. Instead of a limited number of YAGO
relations, we aim to use a wide range of relations (e.g., those in
Freebase). Moreover, we can increase recall by using proper
entity linking techniques such as those in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>Another interesting line of future work would be to estimate
how many new triples could be extracted from tabular data
on the entire Web, and how accurate they would be. To
do so, one needs a systematic approach and some
machinery to automatically check whether the new facts already exist in
the knowledge base, as well as whether or not these facts
are accurate. Different notions of accuracy apply here. For
example, it may be that the new facts contradict existing
knowledge, or it could be that they are expressed at a
different granularity, as was the case for our experiment with
NBA players. One could also use both quantitative and
qualitative metrics to chart which websites provide the best
data.</p>
      <p>Acknowledgements. This work was supported in part by
grants from the Natural Sciences and Engineering Research
Council of Canada and the IBM Alberta Centre for
Advanced Studies.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Adelfio</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Samet</surname>
          </string-name>
          .
          <article-title>Schema extraction for tabular data on the web</article-title>
          .
          <source>Proc. VLDB Endow.</source>
          ,
          <volume>6</volume>
          (
          <issue>6</issue>
          ):
          <fpage>421</fpage>
          -
          <lpage>432</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Heath</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          .
          <article-title>Linked data-the story so far</article-title>
          .
          <source>Int. J. Sem. Web Inf. Sys.</source>
          ,
          <volume>5</volume>
          (
          <issue>3</issue>
          ),
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K.</given-names>
            <surname>Bollacker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Evans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Paritosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Sturge</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Taylor</surname>
          </string-name>
          . Freebase:
          <article-title>A collaboratively created graph database for structuring human knowledge</article-title>
          .
          <source>In SIGMOD'08 Conf. Proc.</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Cafarella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Halevy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          , E. Wu, and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <article-title>Webtables: Exploring the power of tables on the web</article-title>
          .
          <source>Proc. VLDB Endow</source>
          .,
          <volume>1</volume>
          (
          <issue>1</issue>
          ),
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Cafarella</surname>
          </string-name>
          .
          <article-title>Automatic web spreadsheet data extraction</article-title>
          .
          <source>In SSW'13 Conf. Proc.</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>DiFranzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Graves</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Michaelis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. L.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and J.</given-names>
            <surname>Hendler</surname>
          </string-name>
          .
          <article-title>Data-gov wiki: Towards linking government data</article-title>
          .
          <source>In AAAI Spring Symp.: Linked data meets AI</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kumar</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Sivakumar</surname>
          </string-name>
          .
          <article-title>Comparing top k lists</article-title>
          .
          <source>In SODA'03 Conf. Proc.</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>G.</given-names>
            <surname>Limaye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sarawagi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Chakrabarti</surname>
          </string-name>
          .
          <article-title>Annotating and searching web tables using entities, types and relationships</article-title>
          .
          <source>Proc. VLDB Endow</source>
          .,
          <volume>3</volume>
          (
          <issue>1-2</issue>
          ):
          <fpage>1338</fpage>
          -
          <lpage>1347</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mintz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bills</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Snow</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Jurafsky</surname>
          </string-name>
          .
          <article-title>Distant supervision for relation extraction without labeled data</article-title>
          .
          <source>In ACL'09 Conf. Proc.</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>V.</given-names>
            <surname>Mulwad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Finin</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Joshi</surname>
          </string-name>
          .
          <article-title>Semantic Message Passing for Generating Linked Data from Tables</article-title>
          .
          <source>In ISWC'13 Conf. Proc.</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>E.</given-names>
            <surname>Munoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hogan</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Mileo</surname>
          </string-name>
          .
          <article-title>Triplifying wikipedia's tables</article-title>
          .
          <source>In LD4IE'13 Workshop Proceedings</source>
          , ISWC. CEUR,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>N.</given-names>
            <surname>Nakashole</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Weikum</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Suchanek</surname>
          </string-name>
          .
          <article-title>Patty: A taxonomy of relational patterns with semantic types</article-title>
          .
          <source>In EMNLP-CoNLL'12 Conf. Proc.</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shinyama</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Sekine</surname>
          </string-name>
          .
          <article-title>Preemptive information extraction using unrestricted relation discovery</article-title>
          .
          <source>In HLT-NAACL'06 Conf. Proc.</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Suchanek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kasneci</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Weikum</surname>
          </string-name>
          .
          <article-title>Yago: A core of semantic knowledge</article-title>
          .
          <source>In WWW'07 Conf. Proc.</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>P. P.</given-names>
            <surname>Talukdar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wijaya</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Mitchell</surname>
          </string-name>
          .
          <article-title>Acquiring temporal constraints between relations</article-title>
          .
          <source>In CIKM'12 Conf. Proc.</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>P.</given-names>
            <surname>Venetis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Halevy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Madhavan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pasca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Miao</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <article-title>Recovering semantics of tables on the web</article-title>
          .
          <source>Proc. VLDB Endow.</source>
          ,
          <volume>4</volume>
          (
          <issue>9</issue>
          ):
          <fpage>528</fpage>
          -
          <lpage>538</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Yosef</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hoffart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Bordino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Spaniol</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Weikum</surname>
          </string-name>
          .
          <article-title>AIDA: An online tool for accurate disambiguation of named entities in text and tables</article-title>
          .
          <source>Proc. VLDB Endow.</source>
          ,
          <volume>4</volume>
          (
          <issue>12</issue>
          ):
          <fpage>1450</fpage>
          -
          <lpage>1453</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>G.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ji</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zhu</surname>
          </string-name>
          .
          <article-title>Tree Kernel-Based relation extraction with Context-Sensitive structured parse tree information</article-title>
          .
          <source>In EMNLP-CoNLL'07 Conf. Proc.</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>