<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>BOEMIE: Reasoning-based Information Extraction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Georgios Petasis</string-name>
          <email>petasis@iit.demokritos.gr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ralf Moller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vangelis Karkaletsis</string-name>
          <email>vangelis@iit.demokritos.gr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute for Software Systems (STS), Hamburg University of Technology</institution>
          <addr-line>Schwarzenbergstr. 95, 21073 Hamburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications, National Centre for Scientific Research (N.C.S.R.) "Demokritos"</institution>
          ,
          <addr-line>P.O. BOX 60228, GR-153 10 Aghia Paraskevi, Athens</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents a novel approach for exploiting an ontology in an ontology-based information extraction system, which substitutes part of the extraction process with reasoning, guided by a set of automatically acquired rules.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Information extraction (IE) is the task of automatically extracting structured
information from unstructured documents, mainly natural language texts. Due
to the ambiguity of the term "structured information", information extraction
covers a broad range of research, from simple data extraction from Web pages
using patterns and regular grammars, to the semantic analysis of language for
extracting meaning, such as the research areas of word sense disambiguation
or sentiment analysis. The basic idea behind information extraction (the
concentration of important information from a document into a structured format,
mainly in the form of a table) is fairly old, with early approaches appearing in
the 1950s, when the applicability of information extraction was proposed by
Zellig Harris for sub-languages, with the first practical systems appearing at
the end of the 1970s, such as Roger Schank's systems [
        <xref ref-type="bibr" rid="ref27 ref28">27, 28</xref>
        ], which exported
"scripts" from newspaper articles. The ease of evaluation of information
extraction systems in comparison to other natural language processing technologies
such as machine translation or summarisation, where evaluation is still an open
research issue, made IE systems quite popular and led to the Message
Understanding Conferences (MUC) [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] that redefined this research field.
      </p>
      <p>
        Ontology-Based Information Extraction (OBIE) has recently emerged as a
subfield of information extraction. This synergy between IE and ontologies aims
at alleviating some of the shortcomings of traditional IE systems in areas such as the efficient
representation of domain knowledge, portability into new thematic domains, and
interoperability in the era of the Semantic Web [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Ontologies are a means for
sharing and re-using knowledge, a container for capturing semantic information of a
particular domain. A widely accepted definition of ontology in the information
technology and AI community is that of "a formal, explicit specification of a shared
conceptualization" [
        <xref ref-type="bibr" rid="ref10 ref29">29, 10</xref>
        ], where "formal implies that the ontology should be
machine-readable and shared that it is accepted by a group or community" [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
According to [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ], an ontology-based information extraction system is a system
that "processes unstructured or semi-structured natural language text through
a mechanism guided by ontologies to extract certain types of information and
presents the output using ontologies". This definition suggests that the main
differences between traditional IE systems and OBIEs are: a) OBIEs present their
output using ontologies, and b) OBIEs use an information extraction process
that is "guided" by an ontology. In all OBIE systems the extraction process is
guided or driven by the ontology to extract things such as classes, properties
and instances [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ], in a process known as ontology population [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ].
      </p>
      <p>
        However, the way the extraction process is guided by an ontology in all OBIEs
has not changed much with respect to traditional information extraction systems.
According to a fairly recent survey [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ], OBIEs do not employ new extraction
methods, but they rather employ existing methods to identify the components
of an ontology. Current research in the field investigates the development of
"reusable extraction components" that are tied to the ontology portions they are
able to identify and populate [
        <xref ref-type="bibr" rid="ref11 ref30">30, 11</xref>
        ]. In this paper we propose an alternative
approach that tries to minimise the use of traditional information extraction
components, and substitute their effect with reasoning. The motivation behind
the work presented in this paper is to propose a new "kind" of ontology-based
information extraction system, which further integrates ontologies and traditional
information extraction approaches, through the use of reasoning for "guiding"
the extraction process, instead of heuristics, rules, or machine learning. The
proposed approach splits a traditional OBIE in two parts, the first part of which
deals with the gathering of evidence from documents (in the form of ontology
property instances and relation instances among them), while the second part
employs reasoning to interpret the extracted evidence, driven by plausible
explanations for the observed relations. Thus, the innovative aspects of the presented
approach include a) the use of an ontology through reasoning as a substitute for
the embedded knowledge usually found in the extraction components of OBIEs,
b) a proposal of how reasoning can be applied for extracting information from
documents, and c) an approach for inferring the required interpretation rules
even when the ontology evolves with the addition of new concepts and relations.
      </p>
      <p>The rest of this paper is organised as follows: In section 2 related work is
presented in order to place our approach within the current state-of-the-art. In
section 3 the proposed approach is presented, detailing both the interpretation
process and the automatic reasoning rule acquisition. Finally, section 4 concludes
this paper and outlines interesting directions for further research.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Ontology-based information extraction has recently emerged as a subfield of
information extraction that tries to bring together traditional information
extraction and ontologies, which provide formal and explicit specifications of
conceptualizations, and acquire a crucial role in the information extraction process.
Several recent surveys analyse the state of the art in the
research fields of OBIEs [
        <xref ref-type="bibr" rid="ref13 ref14 ref31">13, 14, 31</xref>
        ] and ontology learning/evolution [
        <xref ref-type="bibr" rid="ref23 ref24">24, 23</xref>
        ], a
relevant research field since many OBIE systems also perform ontology
evolution/learning. OBIE systems can be classified according to the way they acquire
the ontology to be used for information extraction. One approach is to consider
the ontology as an input to the system: the OBIE is guided by a manually
constructed ontology or by an "off-the-shelf" ontology. Most OBIE systems
appear to adopt this approach [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ]. Such systems include SOBA [
        <xref ref-type="bibr" rid="ref3 ref5">5, 3</xref>
        ], KIM [
        <xref ref-type="bibr" rid="ref25 ref26">25,
26</xref>
        ], the implementation by Li and Bontcheva [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], PANKOW [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and Artequakt [
        <xref ref-type="bibr" rid="ref1 ref15 ref2">15, 2, 1</xref>
        ]. The other approach is to construct an ontology as a part of the
information extraction process, either starting from scratch or by evolving an initial,
seed ontology. Such systems include Text-To-Onto [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], the implementation by
Hwang [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], Kylin [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ], the work by Maedche et al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], and the work of Dung and
Kameyama [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. However, all the aforementioned systems employ traditional
information extraction methods to identify elements of the ontology, and none
attempts to employ reasoning, as the work presented in this paper suggests.
      </p>
    </sec>
    <sec id="sec-3">
      <title>The BOEMIE approach</title>
      <p>
        The work presented in this paper has been developed in the context of the
BOEMIE project. It advocates an ontology-driven multimedia content analysis,
i.e. semantics extraction from images, video, text, audio/speech, through a novel
synergistic method that combines multimedia extraction and ontology evolution
in a bootstrapping fashion. This method involves, on one hand, the continuous
extraction of knowledge from multimedia content sources in order to populate
and enrich the ontologies and, on the other hand, the deployment of these
ontologies to enhance the robustness of the multimedia information extraction system.
More details about BOEMIE can be found in [
        <xref ref-type="bibr" rid="ref23 ref6">6, 23</xref>
        ].
      </p>
      <p>As already mentioned, the proposed approach splits a traditional OBIE in
two parts, the first part of which deals with the gathering of evidence from
documents (in the form of ontology property instances and relation instances
among them), while the second part employs reasoning to interpret the extracted
evidence, driven by plausible explanations for the observed relations. As a result,
the typical extraction process is also split in two phases: "low-level analysis"
(where traditional extraction techniques such as machine learning are used) and
"semantic interpretation", where the results of the analysis are explained, according to the
ontology, through reasoning. Each of the two phases identifies different elements
of the ontology, whose elements are also split in two groups: the "mid-level
concepts" (MLCs), which are identified by low-level analysis, and the "high-level concepts"
(HLCs), which are identified through semantic interpretation.</p>
      <p>The implications of this separation are significant: the low-level analysis
cannot assume that a Person/Athlete/Journalist has been found in a multimedia
document, just because a name has been identified. Instead the low-level analysis
reports that a name, an age, a nationality, a performance, etc. has been found,
and reports how all these are related through binary relations, extracted from
modality-specific information (i.e. linguistic events for texts, spatial relations for
images/videos, etc.). The identification of Person/Athlete/Journalist instances
is done through reasoning, using the ontology and the reasoning (interpretation)
rules, as low-level analysis cannot know how the Person or Athlete concepts are
defined in the ontology (i.e. what their properties/axioms/restrictions are). In
essence, BOEMIE proposes a novel approach for constructing an OBIE, by
keeping the named-entity extraction phase from traditional IE systems, modifying
relation extraction to reflect modality-specific relations at the ontological level,
and implementing the remaining phases of traditional IE systems through
reasoning. For example, low-level analysis of an image is responsible for reporting
only that a few tens of faces have been detected (i.e. the faces of athletes and
the audience, represented as MLC instances), along with a human body (i.e.
the body of an athlete, represented as an MLC instance), a pole, a mattress,
two vertical bars, a horizontal bar, etc. (all these are instances of MLCs). After
MLC instances have been identified, the low-level analysis is expected to
identify relational information about these MLC instances. For example, the low-level
analysis is expected to identify that a specific face is adjacent to a human body
and both are adjacent to the pole and the horizontal bar. The low-level analysis
is expected to report the extracted relational information through suitable
binary relations between each pair of related MLC instances. On the other hand,
the low-level analysis is not expected to interpret its findings and hypothesise instances
of HLCs, such as the existence of athletes and their number. It is up
to the second phase, the semantic interpretation, to identify how many athletes
are involved (each one represented as an instance of the "Athlete" HLC), and to
interpret the scene shown in the image as an instance of the "Pole Vault" HLC,
effectively explaining the image.</p>
      <sec id="sec-3-1">
        <title>Definitions</title>
        <p>The approach presented in this paper organises the ontology into four main
ontological modules, the "low-level features", the "mid-level concepts", the
"high-level concepts", and the "interpretation rules", which are employed through
reasoning in order to provide one or more "interpretations" of a multimedia
document.</p>
        <p>Definition 1 (low-level features). Low-level features are concepts related to
the decomposition of a multimedia document (i.e. the description of an HTML
page into text, images or other objects), and concepts that describe surface forms
on single modality documents, such as segments in text and audio documents,
polygons in image and video frames, etc.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Definition 2 (Mid-Level Concept (MLC)). Mid-level concepts are concepts</title>
        <p>that can be materialised (i.e. have surface forms) on documents of a single
modality. Anything that can be extracted by an OBIE that has a surface form on a
document, is an MLC.</p>
        <p>For example, the names of persons, locations, etc. in texts, the faces and bodies of
persons in images and the sound events (e.g. applause) in audio tracks are all
MLCs. The BOEMIE OBIE extracts only instances of MLCs (MLCIs)
and relations (e.g. spatial) among them.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Definition 3 (High-Level Concept (HLC)). High-level concepts are compound</title>
        <p>concepts formed from MLCs. HLCs cannot be directly identified in a
multimedia document, as they cannot be associated with a single surface form (i.e.
segment).</p>
        <p>For example, the concept "Person" is an HLC that groups several MLCs
(properties), such as "PersonName", "Age", "Nationality", "PersonFace", "PersonBody",
etc. Instances of HLCs (HLCIs) in the BOEMIE OBIE are identified through
reasoning over MLC instances (MLCIs) in the ontology, guided by a set of rules,
in a process known as "interpretation".</p>
        <p>Definition 4 (interpretation). Interpretation is the identification of one or
more HLC instances (HLCIs) in a multimedia document.</p>
        <p>An OBIE can have identified several MLC instances (MLCIs) and relations
between them in a multimedia document. If these MLC instances satisfy the axioms
of the ontology and the interpretation rules are able to generate one or more HLC
instances (HLCIs), then this multimedia document is considered as interpreted
(or explained) by the ontology, with the HLC instances (HLCIs) constituting the
interpretation of the document. If the same MLC instances (MLCIs) are involved
in more than one HLC instance (HLCI) of the same HLC, then the document
is considered to have multiple interpretations, usually due to ambiguity.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Semantic Extraction</title>
        <p>
          The extraction engine is responsible for extracting instances of concept
descriptions that can be directly identified in corpora of different modalities. These
concept descriptions are mid-level concepts (MLCs). For example, in the textual
modality the name or the age of a person is an MLC, as instances of these
concepts are associated directly with relevant text segments. On the other hand, the
concept person is not an MLC, as it is a "compound", or "aggregate" concept in
such a way that instances of this concept are related to instances of name, age,
gender or maybe instances of other compound concepts. Compound concepts are
referred to as high-level concepts (HLCs), and instances of such concepts
cannot be directly identified in a multimedia document, and thus associated with
a content segment. Thus, such instances and also relationships between these
instances have to be hypothesized. In particular, this engine implements a
modular approach [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] that comprises the following three levels of abstraction: 1. The
low-level analysis, which includes a set of modality-specific (image, text, video,
audio) content analysis tools. 2. A modality-specific semantic interpretation
engine. 3. A fusion engine, which combines interpretations from each modality<sup>3</sup>.
        </p>
        <p>
          The first two levels implement ontology-driven, modality-specific information
extraction, while the last one fuses the information obtained from the previous
levels of analysis. The first level involves the identification of "primitive"
concepts (MLCs), as well as instances of binary relations amongst them. The second
level involves the semantic interpretation engine, responsible for hypothesizing
instances of high-level concepts (HLCs) representing the interpretation of (parts
of) a document. Semantic interpretation operates on the instances of MLCs
and relations between them extracted by the information extraction engine. The
goal of semantic interpretation is to explain why certain instances of MLCs are
observed in certain relations according to the background knowledge (domain
ontology and a set of interpretation rules) [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], by creating instances of
high-level concepts and relating these instances. Semantic interpretation is performed
through calls to a non-standard reasoning service, known as explanation
derivation via abduction. The semantic interpretation is performed on the extracted
information (MLC/relation instances) from a single modality in order to form
modality-specific HLC instances. The fact that content analysis is separated from
semantic interpretation, along with the fact that semantic interpretation is
performed through reasoning using rules from the ontology, allows single-modality
extraction to be adaptable to changes in the ontology.
        </p>
        <p>Once a multimedia document has been decomposed into single-modality
elements and each element has been analysed and semantically interpreted
separately, the various interpretations must be fused into one or more alternative
interpretations of the multimedia document as a whole. This process is
performed at a third level, where the modality-specific HLC instances are fused
in order to produce HLC instances that are not modality-specific, and contain
information extracted from all involved modalities. Fusion is also formalized as
explanation generation via abductive reasoning.</p>
        <p>
          Example: the OBIE for the text modality. The low-level analysis system
implemented in the context of BOEMIE exploits the infrastructure offered by
the Ellogon<sup>4</sup> platform [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ], and the Conditional Random Fields [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] machine
learning algorithm, in order to build an adaptable named-entity recognition and
classification (NERC) system, able to identify MLC instances (MLCIs) and
relations between MLCIs. Both NERC and relation extraction components operate
in a supervised manner, using MLC instances that populate the (seed or evolved)
ontology as training material (whose surface forms are available through their
low-level features). The fact that both components use the populated ontology
as training source allows them to adapt to ontology changes, and improve their
extraction performance over time, as the ontology evolves. The performance of
the NERC and relation extraction components has been measured at about 85%
and 70% (F-measure) respectively, in the thematic domain of athletics, involving news items
and biographies from official sites like IAAF<sup>5</sup> (International Association of
Athletics Federations).
          <fn><label>3</label><p>The fusion engine will not be described in this paper, as it is similar to the semantic
interpretation engine. More information can be found at [
          <xref ref-type="bibr" rid="ref23 ref6">6, 23</xref>
          ].</p></fn>
          <fn><label>4</label><p>http://www.ellogon.org</p></fn>
More details about the low-level analysis system for the text
modality can be found in [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
        </p>
        <p>
          The modality-specific interpretation engine (not only for text, but for all
modalities) is a process for generating instances of HLCs, by combining instances
of MLCs, through reasoning over instances. Abduction is used for this task, a
type of reasoning where the goal is to derive explanations (causes) for
observations (effects). In the framework of this work we regard as explanations the
high-level semantics of a document, given the middle-level semantics, that is, we
use the extracted MLCIs in order to find HLCIs [
          <xref ref-type="bibr" rid="ref9">9</xref>
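          ]. The reasoning process is
guided by a set of rules of two kinds, deductive and abductive.
Assuming a knowledge base <inline-formula><tex-math>\Sigma = (T, A)</tex-math></inline-formula> (i.e. an ontology), and a set of assertions
<inline-formula><tex-math>\Gamma</tex-math></inline-formula> (i.e. the assertions of the semantic interpretation of a document), abduction
tries to derive all sets of assertions <inline-formula><tex-math>\Delta</tex-math></inline-formula> (interpretations) such that <inline-formula><tex-math>\Sigma \cup \Delta \models \Gamma</tex-math></inline-formula>,
while the following conditions must be satisfied: (a) <inline-formula><tex-math>\Sigma \cup \Delta</tex-math></inline-formula> is satisfiable, and
(b) <inline-formula><tex-math>\Delta</tex-math></inline-formula> is a minimal explanation for <inline-formula><tex-math>\Gamma</tex-math></inline-formula>, i.e. there exists no other explanation <inline-formula><tex-math>\Delta'</tex-math></inline-formula>
(<inline-formula><tex-math>\Delta' \subseteq \Delta</tex-math></inline-formula>) such that <inline-formula><tex-math>\Sigma \cup \Delta' \models \Gamma</tex-math></inline-formula> holds. For example, assuming the following ontology
(containing both a "terminological component", the TBox, and a set of rules):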
        </p>
        <sec id="sec-3-4-1">
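          <title><inline-formula><tex-math>\mathit{Jumper} \sqsubseteq \mathit{Human}</tex-math></inline-formula></title>
        </sec>
        <sec id="sec-3-4-2">
          <title><inline-formula><tex-math>\mathit{Pole} \sqsubseteq \mathit{SportsEquipment}</tex-math></inline-formula></title>
        </sec>
        <sec id="sec-3-4-3">
          <title><inline-formula><tex-math>\mathit{Bar} \sqsubseteq \mathit{SportsEquipment}</tex-math></inline-formula>; <inline-formula><tex-math>\mathit{Pole} \sqcap \mathit{Bar} \sqsubseteq \bot</tex-math></inline-formula></title>
        </sec>
        <sec id="sec-3-4-4">
          <title><inline-formula><tex-math>\mathit{Pole} \sqcap \mathit{Jumper} \sqsubseteq \bot</tex-math></inline-formula></title>
        </sec>
        <sec id="sec-3-4-5">
          <title><inline-formula><tex-math>\mathit{Jumper} \sqcap \mathit{Bar} \sqsubseteq \bot</tex-math></inline-formula></title>
        </sec>
        <sec id="sec-3-4-6">
          <title><inline-formula><tex-math>\mathit{JumpingEvent} \sqsubseteq \exists_{\leq 1} \mathit{hasParticipant}.\mathit{Jumper}</tex-math></inline-formula></title>
        </sec>
        <sec id="sec-3-4-7">
          <title><inline-formula><tex-math>\mathit{PoleVault} \sqsubseteq \mathit{JumpingEvent} \sqcap \exists \mathit{hasPart}.\mathit{Pole} \sqcap \exists \mathit{hasPart}.\mathit{Bar}</tex-math></inline-formula></title>
        </sec>
        <sec id="sec-3-4-8">
          <title><inline-formula><tex-math>\mathit{HighJump} \sqsubseteq \mathit{JumpingEvent} \sqcap \exists \mathit{hasPart}.\mathit{Bar}</tex-math></inline-formula></title>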
          <p><disp-formula><tex-math>\mathit{near}(Y, Z) \leftarrow \mathit{PoleVault}(X), \mathit{hasPart}(X, Y), \mathit{Bar}(Y), \mathit{hasPart}(X, W), \mathit{Pole}(W), \mathit{hasParticipant}(X, Z), \mathit{Jumper}(Z)</tex-math></disp-formula>
<disp-formula><tex-math>\mathit{near}(Y, Z) \leftarrow \mathit{HighJump}(X), \mathit{hasPart}(X, Y), \mathit{Bar}(Y), \mathit{hasParticipant}(X, Z), \mathit{Jumper}(Z)</tex-math></disp-formula></p>
          <p>And a document (i.e. an image) describing a pole vault event, whose analysis
results contain instances of the MLCs "Pole", "Human", "Bar" and a relation
that the human is near the bar:</p>
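          <p><disp-formula><tex-math>\begin{array}{l}
\mathit{pole1} : \mathit{Pole} \\
\mathit{human1} : \mathit{Human} \\
\mathit{bar1} : \mathit{Bar} \\
(\mathit{bar1}, \mathit{human1}) : \mathit{near}
\end{array}</tex-math></disp-formula>
<fn><label>5</label><p>http://www.iaaf.org/</p></fn></p>
          <p>The interpretation process splits the set of analysis assertions into two
subsets: (a) <inline-formula><tex-math>\Gamma_1</tex-math></inline-formula> (bona fide assertions): <inline-formula><tex-math>\{\mathit{pole1} : \mathit{Pole}, \mathit{human1} : \mathit{Human}, \mathit{bar1} : \mathit{Bar}\}</tex-math></inline-formula>,
which are assumed to be true by default, and (b) <inline-formula><tex-math>\Gamma_2</tex-math></inline-formula> (fiat assertions):
<inline-formula><tex-math>\{(\mathit{bar1}, \mathit{human1}) : \mathit{near}\}</tex-math></inline-formula>, containing the assertions aimed to be explained. Since
<inline-formula><tex-math>\Gamma_1</tex-math></inline-formula> is always true, <inline-formula><tex-math>\Sigma \cup \Delta \models \Gamma</tex-math></inline-formula> can be expressed as <inline-formula><tex-math>\Sigma \cup \Gamma_1 \cup \Delta \models \Gamma_2</tex-math></inline-formula>. Then, a query
<inline-formula><tex-math>Q_1</tex-math></inline-formula> is formed from each fiat assertion (<inline-formula><tex-math>\Gamma_2</tex-math></inline-formula>), such as <inline-formula><tex-math>Q_1 := \{() \mid \mathit{near}(\mathit{bar1}, \mathit{human1})\}</tex-math></inline-formula>.
Executing the query, a set of possible explanations (interpretations) is retrieved:
<disp-formula><tex-math>\Delta_1 = \{\mathit{NewInd1} : \mathit{PoleVault}, (\mathit{NewInd1}, \mathit{bar1}) : \mathit{hasPart}, (\mathit{NewInd1}, \mathit{NewInd2}) : \mathit{hasPart}, \mathit{NewInd2} : \mathit{Pole}, (\mathit{NewInd1}, \mathit{human1}) : \mathit{hasParticipant}, \mathit{human1} : \mathit{Jumper}\}</tex-math></disp-formula>
<disp-formula><tex-math>\Delta_2 = \{\mathit{NewInd1} : \mathit{PoleVault}, (\mathit{NewInd1}, \mathit{bar1}) : \mathit{hasPart}, (\mathit{NewInd1}, \mathit{pole1}) : \mathit{hasPart}, (\mathit{NewInd1}, \mathit{human1}) : \mathit{hasParticipant}, \mathit{human1} : \mathit{Jumper}\}</tex-math></disp-formula>
<disp-formula><tex-math>\Delta_3 = \{\mathit{NewInd1} : \mathit{HighJump}, (\mathit{NewInd1}, \mathit{bar1}) : \mathit{hasPart}, (\mathit{NewInd1}, \mathit{human1}) : \mathit{hasParticipant}, \mathit{human1} : \mathit{Jumper}\}</tex-math></disp-formula></p>
          <p>
Each interpretation is scored, according to a heuristic based on the number of
hypothesized entities and the number of involved <inline-formula><tex-math>\Gamma_1</tex-math></inline-formula> assertions used, and the best
scoring interpretations are kept. For the example interpretations shown above,
<inline-formula><tex-math>\Delta_2</tex-math></inline-formula> is the best scoring explanation, as <inline-formula><tex-math>\Delta_1</tex-math></inline-formula> has an excessive hypothesized entity
(<inline-formula><tex-math>\mathit{NewInd2}</tex-math></inline-formula>), and <inline-formula><tex-math>\Delta_3</tex-math></inline-formula> does not use the "Pole" instance from <inline-formula><tex-math>\Gamma_1</tex-math></inline-formula>. More details about
interpretation through abduction can be found in [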
            <xref ref-type="bibr" rid="ref6">6</xref>
            ] and [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ]. The language of
the rules is SROIQV, and they can be written in OWL using nominal schemas
[
            <xref ref-type="bibr" rid="ref16">16</xref>
            ]. For instance, the first rule of the TBox shown above can
be written as follows:
          </p>
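          <p>The scoring step described above can be sketched as follows. This is only an illustrative sketch, not the BOEMIE implementation: the tuple encoding of assertions and the scoring formula (number of observed individuals used minus number of hypothesised individuals) are assumptions made here, since the text states only that the heuristic depends on these two quantities.</p>
          <preformat>
```python
# Assumed encoding: ("concept", individual, concept_name)
#                or ("role", subject, object, role_name).
OBSERVED = {"pole1", "human1", "bar1"}  # individuals of the bona fide assertions

DELTA_1 = [
    ("concept", "NewInd1", "PoleVault"),
    ("role", "NewInd1", "bar1", "hasPart"),
    ("role", "NewInd1", "NewInd2", "hasPart"),
    ("concept", "NewInd2", "Pole"),
    ("role", "NewInd1", "human1", "hasParticipant"),
    ("concept", "human1", "Jumper"),
]
DELTA_2 = [
    ("concept", "NewInd1", "PoleVault"),
    ("role", "NewInd1", "bar1", "hasPart"),
    ("role", "NewInd1", "pole1", "hasPart"),
    ("role", "NewInd1", "human1", "hasParticipant"),
    ("concept", "human1", "Jumper"),
]
DELTA_3 = [
    ("concept", "NewInd1", "HighJump"),
    ("role", "NewInd1", "bar1", "hasPart"),
    ("role", "NewInd1", "human1", "hasParticipant"),
    ("concept", "human1", "Jumper"),
]


def individuals(delta):
    """Collect every individual mentioned in an explanation."""
    found = set()
    for assertion in delta:
        if assertion[0] == "concept":
            found.add(assertion[1])
        else:
            found.update(assertion[1:3])
    return found


def score(delta, observed):
    """Assumed heuristic: reward observed individuals, penalise invented ones."""
    mentioned = individuals(delta)
    used = len(mentioned.intersection(observed))
    hypothesised = len(mentioned.difference(observed))
    return used - hypothesised


candidates = {"delta1": DELTA_1, "delta2": DELTA_2, "delta3": DELTA_3}
scores = {name: score(delta, OBSERVED) for name, delta in candidates.items()}
best = max(scores, key=scores.get)
```
          </preformat>
          <p>Under this assumed formula the second explanation ranks highest, which agrees with the outcome reported in the text; a different weighting of the two quantities could of course be chosen.</p>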
        </sec>
        <sec id="sec-3-4-9">
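          <title><inline-formula><tex-math>\mathit{Bar} \sqcap \{Y\} \sqcap \exists \mathit{hasPart}^{-}.(\mathit{PoleVault} \sqcap \{X\} \sqcap (\exists \mathit{hasPart}.(\mathit{Pole} \sqcap \{W\})) \sqcap (\exists \mathit{hasParticipant}.(\mathit{Jumper} \sqcap \{Z\}))) \sqsubseteq \{Y\} \sqcap \exists \mathit{near}.\{Z\}</tex-math></inline-formula></title>
          <p>
where <inline-formula><tex-math>\{Z\}</tex-math></inline-formula> is a nominal variable [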
            <xref ref-type="bibr" rid="ref16">16</xref>
            ]. However, we are going to use a more
"comfortable" notation for rules throughout this paper.
          </p>
        </sec>
      </sec>
      <sec id="sec-3-5">
        <title>The Role of Interpretation Rules</title>
        <p>Rules are considered part of the ontology TBox and their role is to provide
guidance to the interpretation process. Their main responsibility is to provide
additional knowledge on how analysis results (specified through MLCIs and
relations between MLCIs) can be mapped into HLCIs within a single modality, and
how HLCIs from various modalities can be fused. As a result, rules can be split
in two categories: rules for semantic interpretation, and rules for fusion. Both
categories follow the same design pattern for rules: each rule is built around a
specific instance or a relation between two instances in the "head" of the rule,
followed by a set of statements or restrictions in the "body" of the rule. When
a rule is applied by the semantic interpretation engine, instances can be created
to satisfy the rule, either for concepts/relations of the head (forward rules) or
for concepts of the body (backward rules).</p>
        <p>Forward rules. Forward rules perform an action (usually the addition of a
relation between two instances) described in the head of the rule, if the restrictions
contained in the body have been satisfied. For example, consider the following
ABox fragment:
<disp-formula><tex-math>\begin{array}{l}
\mathit{personName1} : \mathit{PersonName} \\
\mathit{ranking1} : \mathit{Ranking} \\
\mathit{person1} : \mathit{Person} \\
(\mathit{personName1}, \text{``Jaroslav Rybakov''}) : \mathit{hasValue} \\
(\mathit{ranking1}, \text{``1''}) : \mathit{hasValue} \\
(\mathit{person1}, \mathit{personName1}) : \mathit{hasPersonName} \\
(\mathit{personName1}, \mathit{ranking1}) : \mathit{personNameToRanking}
\end{array}</tex-math></disp-formula></p>
        <p>This ABox fragment describes the situation where the semantics extraction
engine has identified two MLCIs, a person name ("Jaroslav Rybakov") and a
ranking ("1"), connected with the "personNameToRanking" relation. Also, a "Person"
instance exists that relates only to the "PersonName" instance, but not to the
"Ranking" instance. Despite the fact that the personName1 MLCI is related to
the ranking1 MLCI, the person1 HLCI that aggregates personName1 is not
related to the "Ranking" instance. In order for the "Person" instance to be related
to the "Ranking" instance, a forward rule like the following one must be present
during interpretation:
<disp-formula><tex-math>\mathit{personToRanking}(X, Z) \leftarrow \mathit{Person}(X), \mathit{PersonName}(Y), \mathit{hasPersonName}(X, Y), \mathit{personNameToRanking}(Y, Z)</tex-math></disp-formula></p>
        <p>This rule can be interpreted as follows: if a "Person" instance X and a
"PersonName" instance Y are found connected with a hasPersonName(X, Y) relation,
and a relation "personNameToRanking" exists between the "PersonName" Y and
any instance Z, then add a relation between the "Person" instance X and the
instance Z. The fact that the rule is applied in a forward way suggests that all
restrictions in the body have to be met for the relation "personToRanking" in
the head to be added to an ABox.</p>
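        <p>A minimal sketch of how the forward rule above could be applied to the example ABox is given below. The dictionary-of-sets representation and the function name apply_person_to_ranking are hypothetical and serve only as illustration; in the approach described here the rule is evaluated by a reasoner, and the hasValue assertions are omitted because the rule does not use them.</p>
        <preformat>
```python
def apply_person_to_ranking(abox):
    """Return the personToRanking assertions licensed by the forward rule.

    abox["concepts"] holds (individual, concept) pairs and
    abox["roles"] holds (subject, object, role) triples (assumed encoding).
    """
    concepts = abox["concepts"]
    roles = abox["roles"]
    derived = set()
    for (x, y, role) in roles:
        # Body: Person(X), PersonName(Y), hasPersonName(X, Y)
        if role != "hasPersonName":
            continue
        if (x, "Person") not in concepts or (y, "PersonName") not in concepts:
            continue
        # Body: personNameToRanking(Y, Z); head: personToRanking(X, Z)
        for (y2, z, role2) in roles:
            if role2 == "personNameToRanking" and y2 == y:
                derived.add((x, z, "personToRanking"))
    return derived


abox = {
    "concepts": {
        ("person1", "Person"),
        ("personName1", "PersonName"),
        ("ranking1", "Ranking"),
    },
    "roles": {
        ("person1", "personName1", "hasPersonName"),
        ("personName1", "ranking1", "personNameToRanking"),
    },
}

new_assertions = apply_person_to_ranking(abox)
abox["roles"] |= new_assertions  # the head is added only when the whole body holds
```
        </preformat>
        <p>For the example fragment the only derived assertion relates person1 to ranking1, which is the promotion of the relation from the property to the aggregating instance that the rule is meant to achieve.</p>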
        <p>Backward rules. Backward rules on the other hand assume that the
restriction described by the head is already satisfied by the ABox (i.e. instances and
relations exist in the ABox), and that the action involves the addition of (one or
more) missing instances or relations to satisfy the body. Consider for example
the following ABox fragment from the image modality:
<disp-formula><tex-math>\begin{array}{l}
\mathit{personBody1} : \mathit{PersonBody} \\
\mathit{personFace1} : \mathit{PersonFace} \\
(\mathit{personBody1}, \mathit{personFace1}) : \mathit{isAdjacent}
\end{array}</tex-math></disp-formula>
This ABox fragment describes two MLCIs (a person face and a person body)
that are found adjacent inside an image. Also, suppose that the TBox contains
a backward rule like the following one:
<disp-formula><tex-math>\mathit{isAdjacent}(Y, Z) \leftarrow \mathit{Person}(X), \mathit{PersonBody}(Y), \mathit{PersonFace}(Z), \mathit{hasPart}(X, Y), \mathit{hasPart}(X, Z)</tex-math></disp-formula></p>
        <p>This rule roughly suggests that if a person face and a person body instance
are aggregated by a person instance (and thus both body parts are related to
the person instance with the "hasPart" relation), then the two body parts must
be adjacent to each other. However, since the relation isAdjacent(personBody1,
personFace1) already exists in the ABox and the rule is a backward one, it will
try to hypothesise a "Person" instance X, and aggregate the two body parts.</p>
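        <p>The backward application of the rule above can be illustrated with the toy sketch below. It is not the abductive reasoning service used in BOEMIE: the set-based ABox encoding, the function name hypothesise_persons and the naming scheme for fresh individuals are assumptions introduced here. The sketch only shows the direction of the inference, from an observed head to hypothesised body assertions.</p>
        <preformat>
```python
import itertools


def hypothesise_persons(concepts, roles, fresh_ids):
    """Apply the backward rule once per unexplained isAdjacent assertion.

    concepts: set of (individual, concept) pairs (assumed encoding)
    roles:    set of (subject, object, role) triples (assumed encoding)
    """
    created = []
    for (body, face, role) in sorted(roles):
        # Head of the rule: isAdjacent(Y, Z) with the expected argument types
        if role != "isAdjacent":
            continue
        if (body, "PersonBody") not in concepts or (face, "PersonFace") not in concepts:
            continue
        body_owners = {s for (s, o, r) in roles if r == "hasPart" and o == body}
        face_owners = {s for (s, o, r) in roles if r == "hasPart" and o == face}
        shared = body_owners.intersection(face_owners)
        if any((owner, "Person") in concepts for owner in shared):
            continue  # already explained by an existing Person instance
        # Hypothesise a new Person X and add the missing body assertions
        person = "newPerson%d" % next(fresh_ids)
        concepts.add((person, "Person"))
        roles.add((person, body, "hasPart"))
        roles.add((person, face, "hasPart"))
        created.append(person)
    return created


concepts = {("personBody1", "PersonBody"), ("personFace1", "PersonFace")}
roles = {("personBody1", "personFace1", "isAdjacent")}
fresh_ids = itertools.count(1)

first_pass = hypothesise_persons(concepts, roles, fresh_ids)
second_pass = hypothesise_persons(concepts, roles, fresh_ids)
```
        </preformat>
        <p>The first pass hypothesises one new "Person" individual that aggregates the body and the face; the second pass adds nothing, because the adjacency is then already explained.</p>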
      </sec>
      <sec id="sec-3-6">
        <title>Rules for Semantic Interpretation</title>
        <p>One domain of rule application is the semantic interpretation of the results
obtained from the low-level analysis performed on multimedia resources. During
this interpretation process, the MLCIs and the relations among MLCIs are
examined in order to aggregate the MLCIs into HLCIs. Then, relations that hold
between MLCIs are promoted to the HLCIs that aggregate the corresponding
MLCIs. Finally, an iterative process starts, which tries to aggregate the HLCIs
into other HLCIs and again promote the relations, until no other instances can
be added to an ABox. As a result, two types of rules are required during
interpretation: rules that aggregate concept instances (either MLCIs or HLCIs) into
instances of HLCs, and rules that promote relations. However, not all relations
must be promoted: only relations that hold between a property instance of an
HLCI and an instance that is not a property of the HLCI should be promoted
to the HLCI. The aggregation of instances into HLCIs is performed with
backward rules, which imply the use of abduction to hypothesise instances not contained
in the original ABox, while the promotion of relations from properties to the
aggregating HLCI is performed with forward rules.</p>
      </sec>
      <sec id="sec-3-7">
        <title>Acquiring Rules</title>
        <p>When the ontology changes (e.g. through the addition of a new concept), the
interpretation rules must be modified accordingly. We tried to automate this
task by monitoring ontological changes: the actions performed by an ontology
expert on the ontology are monitored and reflected in the interpretation rules,
following a transformation-based approach. Taking as input what an ABox
can contain without the current concept definition available, and as output the
instances that can be generated from the concept once it is defined, the rule generation
approach tries to find a set of rules that can transform the input into the desired
output. To perform this transformation, it is split into
a set of more primitive "operations" that can be easily transformed into rules.</p>
        <p>Assume the set of all possible concepts $C$, the set of all possible
relations $R$, a set of predefined operations $O$ on a single concept $c \in C$, and a
modification $M$ over $c$, where $M = \{m_i(c, c_i, r_i)\}_{1}^{N}$, $m_i \in O$, $c_i \in C$, $c_i \neq c$,
$r_i \in R$. The target is to calculate a rule set $S = \{r_i\}_{1}^{N}$, $r_i = T_{m_i}(c, c_i, r_i)$, that
corresponds to the modification $M$. $T_{m_i}$ is a function that transforms a
hypothesized initial state $(c', c_i, r_i')$ to the desired state $(c, c_i, r_i)$ for modification $m_i$,
$T_{m_i} : (c', c_i, r_i') \rightarrow (c, c_i, r_i)$, $c' \in C$, $r_i' \in R$. Each function $T_{m_i}$ depends not only
on $m_i$ and the two states, but also on the interpretation engine. Since the
objective of rule generation was to eliminate manual supervision, a pattern-based
approach was selected for representing each $T_{m_i}$. Each pattern is responsible for
generating the rules required for transforming the initial state $(c', c_i, r_i')$ to a
final state $(c, c_i, r_i)$ for each operation in $O$, possibly biased towards the specific
interpretation model.</p>
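        <p>The mapping from a modification to a rule set can be sketched as follows. This is only an illustration under stated assumptions: the names Step, PATTERNS and rule_set, the operation labels, and the textual rule syntax are hypothetical. Each pattern here returns a list of rules rather than a single rule, because some operations produce no rules at all.</p>
        <preformat><![CDATA[
```python
from typing import NamedTuple


class Step(NamedTuple):
    """One element m_i(c, c_i, r_i) of a modification M over a concept c."""
    op: str    # name of the operation m_i (hypothetical labels)
    c_i: str   # the other concept involved
    r_i: str   # the relation that links c and c_i


# Hypothetical pattern table: one function per operation, standing in
# for the transformation functions T_m. Each returns a list of rules.
PATTERNS = {
    "define_mlc": lambda c, c_i, r_i: [],
    "define_hlc": lambda c, c_i, r_i: [f"{c_i}(X) <- {c}(Y), {r_i}(Y, X)"],
}


def rule_set(c, modification):
    """Collect the rules generated for every step of the modification."""
    rules = []
    for step in modification:
        rules.extend(PATTERNS[step.op](c, step.c_i, step.r_i))
    return rules


modification = [Step("define_hlc", "PersonName", "hasPersonName")]
print(rule_set("Person", modification))
```
]]></preformat>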
        <p>Operations over a single concept. A set of predefined operations $O$ has
been defined that captures all modifications that can be performed on a concept
$c$ within the BOEMIE system. This set contains the following operations:
- Definition of a new MLC $c$: This operation reflects the addition of a new MLC
to the ontology TBox, an action that does not modify
the rules associated with the TBox. For this operation, $T = \{\}$.
- Definition of a new HLC $c$ that aggregates a single concept $c'$: This operation
describes the definition of an HLC based on the presence of
either an MLC or an HLC. It is typically used when a new HLC
$c$ has been defined that aggregates another concept $c'$, and $c'$ is enough to
define $c$. In such a case, it is assumed that during interpretation
an instance of $c$ should be created for every instance of $c'$ found in an ABox.
Thus the set of rules $T$ should create an instance of $c$ for every instance of $c'$.
An example of this operation is the definition of "Person" ($c$) that aggregates
either "PersonName" or "PersonFace" ($c'$).
- Addition of a single concept $c'$ to an existing HLC $c$: This operation deals
with the extension of an existing HLC $c$ with a concept $c'$, i.e. the addition of a
new property to an existing HLC. In such a case, $T$ should contain rules that
aggregate instances of concept $c'$ with instances of concept $c$, and promote
all relations between the instance of $c'$ and instances not aggregated by the
instance of $c$ to the $c$ instance. Examples include the extension of "Person"
with properties like "Age", "Gender", or "PersonBody", and of "SportsEvent"
with "Date" or "Location".
- Removal of a single concept $c'$ from an HLC $c$: This operation handles
property removals from HLCs. The rule set $T$ is identical to that of the operation of
adding a property to an HLC, with the difference that each rule in $T$ is
located and removed from the TBox rules, instead of extending them.
- Removal of an HLC $c$ that aggregates a single concept $c'$: This operation
is the inverse of creating a new HLC that aggregates a single concept.
Thus, the rule set $T$ is identical for the two operations, but
this operation causes the removal of all rules in $T$ from the TBox.
- Removal of an MLC $c$: Similar to the addition of a new MLC, this
operation has no effect on the TBox rule set, i.e. no rules are removed.</p>
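        <p>The effect of the six operations on the TBox rule set can be summarised in a short Python sketch. The enumeration, the function update_rules and the stand-in template function are assumptions introduced for illustration; only the add/remove behaviour follows the list above.</p>
        <preformat><![CDATA[
```python
from enum import Enum, auto


class Operation(Enum):
    DEFINE_MLC = auto()
    DEFINE_HLC = auto()       # new HLC that aggregates a single concept
    ADD_PROPERTY = auto()     # add a single concept to an existing HLC
    REMOVE_PROPERTY = auto()
    REMOVE_HLC = auto()
    REMOVE_MLC = auto()


# Each removal shares its rule set T with the corresponding addition.
ADDITION_OF = {
    Operation.REMOVE_PROPERTY: Operation.ADD_PROPERTY,
    Operation.REMOVE_HLC: Operation.DEFINE_HLC,
}


def update_rules(tbox_rules, operation, generate):
    """Return the TBox rule set after applying `operation`.

    `generate` maps an addition-type operation to its rule set T and
    stands in for the rule templates (hypothetical interface)."""
    if operation in (Operation.DEFINE_MLC, Operation.REMOVE_MLC):
        return set(tbox_rules)  # T is empty, the rules are unchanged
    if operation in ADDITION_OF:
        return set(tbox_rules) - generate(ADDITION_OF[operation])
    return set(tbox_rules) | generate(operation)


def example_templates(operation):
    # Hypothetical stand-in for the rule templates, fixed to one example.
    if operation is Operation.DEFINE_HLC:
        return {"PersonName(X) <- Person(Y), hasPersonName(Y, X)"}
    raise ValueError(f"no example template for {operation}")


rules = update_rules(set(), Operation.DEFINE_HLC, example_templates)
rules = update_rules(rules, Operation.DEFINE_MLC, example_templates)
print(rules)
rules = update_rules(rules, Operation.REMOVE_HLC, example_templates)
print(rules)
```
]]></preformat>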
      </sec>
      <sec id="sec-3-8">
        <title>Rule templates for concept definition operations</title>
        <p>In this subsection the templates for generating rules are described for the
operations whose rule set $T$ is not empty and that are not removals; removals
share the same $T$ with the corresponding addition operations.</p>
        <p>Definition of a new HLC $c$ that aggregates a single concept $c'$. The rule set $T$
for the definition of a new HLC $c$ from a concept $c'$ should contain rules
that create instances of $c$ from instances of $c'$ found in the ABox of a multimedia
resource. In the interpretation model used in BOEMIE, this can be accomplished
by a single backward rule, which can be described with the following pattern:
$$\langle c' \rangle(X) \leftarrow \langle c \rangle(Y),\ \mathit{has}\langle c' \rangle(Y, X)$$
For example, if $c$ is "Person" and $c'$ is "PersonName", the following rule can be
generated from this pattern:
$$\mathit{PersonName}(X) \leftarrow \mathit{Person}(Y),\ \mathit{hasPersonName}(Y, X)$$</p>
        <p>Addition of a single concept $c'$ to an existing HLC $c$. The rule set $T$ for the
addition of a property $c'$ to an HLC $c$ should contain rules that relate instances
of $c$ with instances of $c'$ found in the ABox of a multimedia resource. In addition,
it should contain rules that promote the relations of a $c'$ instance with all
instances not aggregated by $c$ onto the $c$ instance. This operation reflects an action
performed on the definition of concept $c$, from which the "final" state $(c, c', r)$ is
known. The state $(c, c', r)$ is the part of the concept definition that relates to how
$c$ aggregates $c'$. For example, if "Person" in the image modality is defined as
having only a single property ($\mathit{hasPersonFace} : \mathit{PersonFace}$), and the operation
is to extend it with "PersonBody" through the role "hasPersonBody", then
$(c, c', r) = (\mathit{Person}, \mathit{PersonBody}, \mathit{hasPersonBody})$. According to the adopted
interpretation model, $c'$ can be aggregated with $c$ only if $c'$ is related to some
property of $c$. If $c''$, $c'' \neq c'$, is a concept aggregated by $c$, then an "initial" state
$(c'', c', r'')$ is hypothesized, relating $c'$ with $c''$ through the relation $r''$. Continuing
the example, since "Person" has a single aggregated concept, only one initial state
can be hypothesized, i.e. $(c'', c', r'') = (\mathit{PersonFace}, \mathit{PersonBody}, \mathit{isAdjacent})$.
Once both the initial and the final state have been decided, a rule pattern can be
defined to transform the initial into the final state. In the interpretation model
used within BOEMIE, this can be accomplished by a single backward rule, which
can be described with the following pattern:
$$\langle r'' \rangle(Y, Z) \leftarrow \langle c \rangle(X),\ \mathit{has}\langle c'' \rangle(X, Y),\ \langle c'' \rangle(Y),\ \langle r \rangle(X, Z),\ \langle c' \rangle(Z)$$
Applied to our example, this pattern leads to the following rule:
$$\mathit{isAdjacent}(Y, Z) \leftarrow \mathit{Person}(X),\ \mathit{hasPersonFace}(X, Y),\ \mathit{PersonFace}(Y),\ \mathit{hasPersonBody}(X, Z),\ \mathit{PersonBody}(Z)$$
This rule can relate instances of "PersonBody" to instances of "Person" that are already
related to instances of "PersonFace". The same process should be repeated for
all possible initial states that can be found for concept $c$.</p>
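        <p>Both backward-rule patterns can be instantiated mechanically by string substitution. The following Python sketch assumes a plain-text rule syntax with "&lt;-" separating head and body; the function names and parameter names are hypothetical and chosen only to mirror the symbols used in the patterns.</p>
        <preformat><![CDATA[
```python
def new_hlc_rule(c, c_prime):
    """Pattern for defining a new HLC c that aggregates c':
    c'(X) <- c(Y), has<c'>(Y, X)"""
    return f"{c_prime}(X) <- {c}(Y), has{c_prime}(Y, X)"


def add_property_rule(c, c_prime, r, c_second, r_second):
    """Pattern for adding c' to an existing HLC c, given one initial
    state (c'', c', r''):
    r''(Y, Z) <- c(X), has<c''>(X, Y), c''(Y), r(X, Z), c'(Z)"""
    return (f"{r_second}(Y, Z) <- {c}(X), has{c_second}(X, Y), "
            f"{c_second}(Y), {r}(X, Z), {c_prime}(Z)")


def add_property_rules(c, c_prime, r, initial_states):
    """One backward rule per hypothesized initial state (c'', r'')."""
    return [add_property_rule(c, c_prime, r, c_second, r_second)
            for c_second, r_second in initial_states]


print(new_hlc_rule("Person", "PersonName"))
print(add_property_rules("Person", "PersonBody", "hasPersonBody",
                         [("PersonFace", "isAdjacent")])[0])
```
]]></preformat>
        <p>With the example values from the text, the two calls reproduce the "PersonName" rule and the "isAdjacent" rule given above.</p>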
        <p>However, these are not the only rules that should be added to the set $T$. Each
relation $w$ defined in the TBox that can have both concepts $c$ and $c'$ as subject must
be promoted from $c'$ to $c$. This can be accomplished with forward rules that can
be generated by the following pattern:
$$\langle w \rangle(X, Z) \leftarrow \langle c \rangle(X),\ \langle r \rangle(X, Y),\ \langle c' \rangle(Y),\ \langle w \rangle(Y, Z)$$
Please note that in this pattern no type is specified for variable $Z$, allowing $Z$
to take as value instances of any concept that is in the range of the relation $\langle w \rangle$.
Assuming $w = \mathit{isNear}$, this pattern leads to the following rule:
$$\mathit{isNear}(X, Z) \leftarrow \mathit{Person}(X),\ \mathit{hasPersonBody}(X, Y),\ \mathit{PersonBody}(Y),\ \mathit{isNear}(Y, Z)$$
The rule set $T$ must be extended with a single rule of the above form for each
$w$ that can be found in the ontology TBox.</p>
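        <p>The generation of promotion rules and their forward application can be sketched in Python as follows. The ABox encoding, the function names, the "Ball" concept and all instance names are assumptions for illustration. The sketch applies the pattern literally and does not check whether the target instance is itself aggregated by the HLCI.</p>
        <preformat><![CDATA[
```python
def promotion_rules(c, c_prime, r, relations):
    """One forward rule per relation w, following the pattern
    w(X, Z) <- c(X), r(X, Y), c'(Y), w(Y, Z)."""
    return [f"{w}(X, Z) <- {c}(X), {r}(X, Y), {c_prime}(Y), {w}(Y, Z)"
            for w in relations]


def apply_promotion(concepts, roles, c, c_prime, r, w):
    """Forward application: add w(X, Z) for every match of the rule body."""
    derived = set(roles)
    for rel, x, y in roles:
        if rel != r or (x, c) not in concepts or (y, c_prime) not in concepts:
            continue
        for rel2, subject, z in roles:
            if rel2 == w and subject == y:
                derived.add((w, x, z))
    return derived


print(promotion_rules("Person", "PersonBody", "hasPersonBody", ["isNear"])[0])

# Hypothetical ABox; "Ball" and the instance names are illustrative only.
concepts = {("person1", "Person"), ("personBody1", "PersonBody"),
            ("ball1", "Ball")}
roles = {("hasPersonBody", "person1", "personBody1"),
         ("isNear", "personBody1", "ball1")}
promoted = apply_promotion(concepts, roles,
                           "Person", "PersonBody", "hasPersonBody", "isNear")
print(("isNear", "person1", "ball1") in promoted)
```
]]></preformat>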
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>In this paper we have presented a novel approach for exploiting an ontology in an
ontology-based information extraction system, which substitutes part of the
extraction process with reasoning, guided by a set of automatically acquired rules.
Innovative aspects of the presented framework include the use of reasoning in the
construction of an ontology-based information extraction system that can adapt
to changes in the ontology, and the clear distinction between the concepts of the
low-level analysis (MLCs) and those of the semantic interpretation (HLCs). An interesting
future direction is the investigation of how reasoning can be better applied to
modalities involving the dimension of time, such as video. In BOEMIE a
simple approach has been followed regarding the handling of time sequences:
extracted real objects or events were grounded to timestamps, and artificial
relations like "before" and "after" were added. Nevertheless, an enhancement that
maintains the temporal semantics from the perspective of reasoning would be an
interesting addition.</p>
      <sec id="sec-4-1">
        <title>Acknowledgments.</title>
        <p>This work has been partially funded by the BOEMIE Project, FP6-027538, 6th
EU Framework Programme.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Alani</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Millard</surname>
            ,
            <given-names>D.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weal</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hall</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>P.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shadbolt</surname>
            ,
            <given-names>N.R.</given-names>
          </string-name>
          :
          <article-title>Automatic ontology-based knowledge extraction from web documents</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          <volume>18</volume>
          (
          <issue>1</issue>
          ),
          <volume>14</volume>
          –21 (Jan
          <year>2003</year>
          ), http://dx.doi.org/10.1109/MIS.2003.1179189
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Alani</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Millard</surname>
            ,
            <given-names>D.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weal</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>P.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hall</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shadbolt</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Automatic extraction of knowledge from web documents</article-title>
          .
          <source>In: Workshop on Human Language Technology for the Semantic Web and Web Services</source>
          ,
          <article-title>2 nd Int</article-title>
          .
          <source>Semantic Web Conf. Sanibel Island</source>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Buitelaar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frank</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hartung</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Racioppa</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Ontology-based information extraction and integration from heterogeneous data sources</article-title>
          .
          <source>Int. J. Hum.-Comput. Stud</source>
          .
          <volume>66</volume>
          (
          <issue>11</issue>
          ),
          <volume>759</volume>
          –788 (Nov
          <year>2008</year>
          ), http://dx.doi.org/10.1016/j.ijhcs.2008.07.007
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Buitelaar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Magnini</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Ontology Learning from Text: Methods, Evaluation and Applications</article-title>
          ,
          <source>Frontiers in Artificial Intelligence and Applications Series</source>
          , vol.
          <volume>123</volume>
          . IOS Press, Amsterdam (7
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Buitelaar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Siegel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Ontology-based information extraction with soba</article-title>
          .
          <source>In: Proc. of the International Conference on Language Resources and Evaluation (LREC)</source>
          . pp.
          <volume>2321</volume>
          –
          <issue>2324</issue>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Castano</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peraldi</surname>
            ,
            <given-names>I.S.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferrara</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karkaletsis</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaya</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , Moller, R.,
          <string-name>
            <surname>Montanelli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Petasis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wessel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Multimedia Interpretation for Dynamic Ontology Evolution</article-title>
          .
          <source>Journal of Logic and Computation</source>
          <volume>19</volume>
          (
          <issue>5</issue>
          ),
          <volume>859</volume>
          –
          <fpage>897</fpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Handschuh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Staab</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Towards the self-annotating web</article-title>
          .
          <source>In: Proceedings of the 13th international conference on World Wide Web</source>
          . pp.
          <volume>462</volume>
          –
          <fpage>471</fpage>
          . WWW '04,
          <string-name>
            <surname>ACM</surname>
          </string-name>
          , New York, NY, USA (
          <year>2004</year>
          ), http://doi.acm.org/10.1145/988672.988735
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Dung</surname>
            ,
            <given-names>T.Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kameyama</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Ontology-based information extraction and information retrieval in health care domain</article-title>
          .
          <source>In: Proceedings of the 9th international conference on Data Warehousing and Knowledge Discovery</source>
          . pp.
          <volume>323</volume>
          –
          <fpage>333</fpage>
          . DaWaK'07, Springer-Verlag, Berlin, Heidelberg (
          <year>2007</year>
          ), http://dl.acm.org/citation.cfm?id=2391952.2391991
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Espinosa</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaya</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Melzer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Moller, R.:
          <article-title>On ontology based abduction for text interpretation</article-title>
          . In: Gelbukh,
          <string-name>
            <surname>A</surname>
          </string-name>
          . (ed.)
          <source>Proc. of 9th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing-2008)</source>
          . pp.
          <volume>194</volume>
          –
          <fpage>205</fpage>
          . No. 4919 in LNCS, Springer (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Gruber</surname>
            ,
            <given-names>T.R.</given-names>
          </string-name>
          :
          <article-title>Toward principles for the design of ontologies used for knowledge sharing</article-title>
          .
          <source>Int. J. Hum.-Comput. Stud</source>
          .
          <volume>43</volume>
          (
          <issue>5-6</issue>
          ),
          <volume>907</volume>
          –928 (Dec
          <year>1995</year>
          ), http://dx.doi.org/10.1006/ijhc.1995.1081
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Gutierrez</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wimalasuriya</surname>
            ,
            <given-names>D.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dou</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Using information extractors with the neural electromagnetic ontologies</article-title>
          . In: Meersman,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Dillon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.S.</given-names>
            ,
            <surname>Herrero</surname>
          </string-name>
          , P. (eds.)
          <source>OTM Workshops. Lecture Notes in Computer Science</source>
          , vol.
          <volume>7046</volume>
          , pp.
          <volume>31</volume>
          –
          <fpage>32</fpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Hwang</surname>
            ,
            <given-names>C.H.</given-names>
          </string-name>
          :
          <article-title>Incompletely and imprecisely speaking: Using dynamic ontologies for representing and retrieving information</article-title>
          . In: Franconi,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>Kifer</surname>
          </string-name>
          , M. (eds.)
          <source>KRDB. CEUR Workshop Proceedings</source>
          , vol.
          <volume>21</volume>
          , pp.
          <volume>14</volume>
          –
          <fpage>20</fpage>
          .
          <string-name>
            <surname>CEUR-WS.org</surname>
          </string-name>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Iosif</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Petasis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karkaletsis</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Ontology-Based Information Extraction under a Bootstrapping Approach</article-title>
          . In: Pazienza,
          <string-name>
            <given-names>M.T.</given-names>
            ,
            <surname>Stellato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A</given-names>
            . (eds.)
            <surname>Semi-Automatic Ontology</surname>
          </string-name>
          <string-name>
            <surname>Development</surname>
          </string-name>
          : Processes and Resources,
          <source>chap. 1</source>
          , pp.
          <volume>1</volume>
          –
          <fpage>21</fpage>
          . IGI Global, Hershey, PA, USA (April
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Karkaletsis</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fragkou</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Petasis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iosif</surname>
          </string-name>
          , E.:
          <article-title>Ontology based information extraction from text</article-title>
          . In: Paliouras,
          <string-name>
            <given-names>G.</given-names>
            ,
            <surname>Spyropoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.D.</given-names>
            ,
            <surname>Tsatsaronis</surname>
          </string-name>
          ,
          <string-name>
            <surname>G</surname>
          </string-name>
          . (eds.)
          <source>Knowledge-Driven Multimedia Information Extraction and Ontology Evolution. Lecture Notes in Computer Science</source>
          , vol.
          <volume>6050</volume>
          , pp.
          <volume>89</volume>
          –
          <fpage>109</fpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alani</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hall</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Millard</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shadbolt</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weal</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          : Artequakt:
          <article-title>Generating tailored biographies from automatically annotated fragments from the web</article-title>
          .
          <source>In: Workshop on Semantic Authoring, Annotation &amp; Knowledge Markup (SAAKM'02)</source>
          ,
          <source>the 15th European Conference on Artificial Intelligence</source>
          ,
          <source>(ECAI'02)</source>
          . vol. -, pp.
          <volume>1</volume>
          –
          <issue>6</issue>
          (
          <issue>2002</issue>
          ), http://eprints.soton.ac.uk/256913/, event Dates:
          <source>July 21-26</source>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16. Krotzsch,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Maier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            ,
            <surname>Krisnadhi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.A.</given-names>
            ,
            <surname>Hitzler</surname>
          </string-name>
          ,
          <string-name>
            <surname>P.</surname>
          </string-name>
          :
          <article-title>Nominal schemas for integrating rules and description logics</article-title>
          . In: Rosati,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Rudolph</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Zakharyaschev</surname>
          </string-name>
          , M. (eds.)
          <article-title>Description Logics</article-title>
          .
          <source>CEUR Workshop Proceedings</source>
          , vol.
          <volume>745</volume>
          .
          <string-name>
            <surname>CEUR-WS.org</surname>
          </string-name>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Lafferty</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCallum</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pereira</surname>
            ,
            <given-names>F.C.N.</given-names>
          </string-name>
          :
          <article-title>Conditional random fields: Probabilistic models for segmenting and labeling sequence data</article-title>
          .
          <source>In: Proceedings of the Eighteenth International Conference on Machine Learning</source>
          . pp.
          <volume>282</volume>
          –
          <fpage>289</fpage>
          . ICML '
          <fpage>01</fpage>
          , Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bontcheva</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Hierarchical, perceptron-like learning for ontology-based information extraction</article-title>
          .
          In:
          <source>Proceedings of the 16th international conference on World Wide Web</source>
          . pp.
          <fpage>777</fpage>
          –
          <lpage>786</lpage>
          . WWW '07,
          <publisher-name>ACM</publisher-name>
          , New York, NY, USA (
          <year>2007</year>
          ), http://doi.acm.org/10.1145/1242572.1242677
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Maedche</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maedche</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Staab</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>The text-to-onto ontology learning environment</article-title>
          .
          In:
          <source>Software Demonstration at ICCS-2000 - Eighth International Conference on Conceptual Structures</source>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Maedche</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neumann</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Staab</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Bootstrapping an ontology-based information extraction system</article-title>
          . In:
          <source>Intelligent exploration of the web</source>
          , pp.
          <fpage>345</fpage>
          –
          <lpage>359</lpage>
          .
          <publisher-name>Physica-Verlag GmbH</publisher-name>
          , Heidelberg, Germany (
          <year>2003</year>
          ), http://dl.acm.org/citation.cfm?id=941713.941736
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Marsh</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perzanowski</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>MUC-7 evaluation of IE technology: Overview of results</article-title>
          .
          In:
          <source>Proceedings of the Seventh Message Understanding Conference (MUC7)</source>
          . http://www.itl.nist.gov/iaui/894.02/related_projects/muc/index.html (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Petasis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karkaletsis</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paliouras</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Androutsopoulos</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spyropoulos</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          :
          <article-title>Ellogon: A New Text Engineering Platform</article-title>
          .
          In:
          <source>Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC 2002)</source>
          . pp.
          <fpage>72</fpage>
          –
          <lpage>78</lpage>
          .
          <publisher-name>European Language Resources Association</publisher-name>
          , Las Palmas, Canary Islands, Spain (May 29–31
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Petasis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karkaletsis</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paliouras</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krithara</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zavitsanos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Ontology Population and Enrichment: State of the Art</article-title>
          . In:
          <string-name>
            <surname>Paliouras</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spyropoulos</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsatsaronis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (eds.)
          <source>Knowledge-Driven Multimedia Information Extraction and Ontology Evolution, Lecture Notes in Computer Science</source>
          , vol.
          <volume>6050</volume>
          , pp.
          <fpage>134</fpage>
          –
          <lpage>166</lpage>
          . Springer Berlin / Heidelberg (
          <year>2011</year>
          ), http://dx.doi.org/10.1007/978-3-642-20795-2_6
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Petasis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karkaletsis</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paliouras</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krithara</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zavitsanos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Ontology Population and Enrichment: State of the Art</article-title>
          . In:
          <string-name>
            <surname>Paliouras</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spyropoulos</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsatsaronis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (eds.)
          <source>Knowledge-Driven Multimedia Information Extraction and Ontology Evolution, Lecture Notes in Computer Science</source>
          , vol.
          <volume>6050</volume>
          , pp.
          <fpage>134</fpage>
          –
          <lpage>166</lpage>
          . Springer Berlin / Heidelberg (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Popov</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kiryakov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kirilov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ognyanoff</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goranov</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>KIM - semantic annotation platform</article-title>
          . In:
          <string-name>
            <surname>Fensel</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sycara</surname>
            ,
            <given-names>K.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mylopoulos</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (eds.)
          <source>International Semantic Web Conference. Lecture Notes in Computer Science</source>
          , vol.
          <volume>2870</volume>
          , pp.
          <fpage>834</fpage>
          –
          <lpage>849</lpage>
          . Springer (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Popov</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kiryakov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ognyanoff</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kirilov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>KIM - a semantic platform for information extraction and retrieval</article-title>
          .
          <source>Natural Language Engineering</source>
          <volume>10</volume>
          (
          <issue>3-4</issue>
          ),
          <fpage>375</fpage>
          –
          <lpage>392</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Schank</surname>
            ,
            <given-names>R.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abelson</surname>
            ,
            <given-names>R.P.</given-names>
          </string-name>
          :
          <source>Scripts, Plans, Goals and Understanding: an Inquiry into Human Knowledge Structures</source>
          . L. Erlbaum, Hillsdale, NJ (
          <year>1977</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Schank</surname>
            ,
            <given-names>R.C.</given-names>
          </string-name>
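          ,
          <string-name>
            <surname>Kolodner</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>DeJong</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :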
          <article-title>Conceptual information retrieval</article-title>
          .
          In:
          <source>Proceedings of the 3rd annual ACM conference on Research and development in information retrieval (SIGIR '80)</source>
          . pp.
          <fpage>94</fpage>
          –
          <lpage>116</lpage>
          . Cambridge, UK (
          <year>1980</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Studer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Benjamins</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fensel</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Knowledge engineering: Principles and methods</article-title>
          .
          <source>Data &amp; Knowledge Engineering</source>
          <volume>25</volume>
          (
          <issue>1-2</issue>
          ),
          <fpage>161</fpage>
          –
          <lpage>198</lpage>
          (Mar
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Wimalasuriya</surname>
            ,
            <given-names>D.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dou</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Components for information extraction: ontology-based information extractors and generic platforms</article-title>
          .
          In:
          <source>Proceedings of the 19th ACM international conference on Information and knowledge management</source>
          . pp.
          <fpage>9</fpage>
          –
          <lpage>18</lpage>
          . CIKM '10,
          <publisher-name>ACM</publisher-name>
          , New York, NY, USA (
          <year>2010</year>
          ), http://doi.acm.org/10.1145/1871437.1871444
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Wimalasuriya</surname>
            ,
            <given-names>D.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dou</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Ontology-based information extraction: An introduction and a survey of current approaches</article-title>
          .
          <source>J. Inf. Sci.</source>
          <volume>36</volume>
          (
          <issue>3</issue>
          ),
          <fpage>306</fpage>
          –
          <lpage>323</lpage>
          (Jun
          <year>2010</year>
          ), http://dx.doi.org/10.1177/0165551509360123
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weld</surname>
            ,
            <given-names>D.S.</given-names>
          </string-name>
          :
          <article-title>Automatically refining the Wikipedia infobox ontology</article-title>
          .
          In:
          <source>Proceedings of the 17th international conference on World Wide Web</source>
          . pp.
          <fpage>635</fpage>
          –
          <lpage>644</lpage>
          . WWW '08,
          <publisher-name>ACM</publisher-name>
          , New York, NY, USA (
          <year>2008</year>
          ), http://doi.acm.org/10.1145/1367497.1367583
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>