<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Semion: a smart triplification tool</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andrea G. Nuzzolese</string-name>
          <email>nuzzoles@cs.unibo.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aldo Gangemi</string-name>
          <email>aldo.gangemi@cnr.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paolo Ciancarini</string-name>
          <email>cianca@cs.unibo.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valentina Presutti</string-name>
          <email>valentina.presutti@cnr.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. of Computer Science, Università di Bologna</institution>
          ,
          <addr-line>Mura Anteo Zamboni 7, Bologna</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Semantic Technology Lab, ISTC-CNR</institution>
          ,
          <addr-line>Via Nomentana 56, Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The Web of Data is fed by "triplifiers", tools able to transform content (often databases) into linked data. Triplifiers implement different methods and are typically based on bulk recipes that allow little or no customization of the process. Furthermore, consuming or refactoring their output is often difficult because of mismatches between the semantics embedded in the original structures and the RDF or OWL semantics obtained through the recipes. Semion is a method and a tool for customizing data reengineering and refactoring and for making their semantics explicit.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Commonly accepted solutions for transforming non-RDF data
sources into RDF are based on ad hoc semantics-driven
approaches that make implicit assumptions about the domain
semantics of the non-RDF data source (e.g. a relational
database is transformed by mapping a table to an rdfs:Class, a
table column to an rdf:Property and a table record to an
instance of the specific RDF table class). The tool described
here, Semion, implements a method that first makes no
semantic assumption at the domain level and simply transforms
the data source into RDF triples, driven by an OWL
description of the data source structure (a source meta-model)
that can be defined and customized by the user. Second,
the RDF triples can be modelled by aligning them to any
additional RDF or OWL ontology, which acts either as a
meta-level "mediator" to the required semantics (e.g. SKOS [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] or
the OWL metamodel [
        <xref ref-type="bibr" rid="ref1 ref3">3, 1</xref>
        ]), or as a reference domain
ontology (e.g. DOLCE, FOAF, or the Gene Ontology). In
particular, we exemplify the alignment of triplified data with
the Linguistic Meta-Model (LMM) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], an OWL-DL
ontology that formalizes the distinctions of semiotics.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. METHOD</title>
      <p>
        The method implemented in Semion separates the reengineering
process from the modelling process. The reengineering process performs
the semantic lifting simply by extracting RDF triples, driven by the
OWL description of the structure of the data source provided
as input. The modelling process, in turn, introduces
semantics into the extracted dataset by using
a semiotic-cognitive approach based on the Linguistic
Meta-Model (LMM) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The most important feature of LMM is its
ability to support the representation of different knowledge
sources developed according to different underlying semiotic
theories [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Figure 1 shows the key concepts
behind our transformation method. The "Data source"
bubble represents the input, a non-RDF data
source that is reengineered into an RDF dataset according
to its type, to its structure as described by an OWL
meta-model, and to a defined mapping. The RDF dataset is then
refactored (the "Refactoring process" frame in Figure 1) to the
LMM framework according to specific, customizable
alignment rules. Once the RDF dataset is aligned to LMM, it can
be grounded in a formal semantics and, finally, its logic
can be expressed.
      </p>
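      <p>The reengineering step described above can be illustrated with a minimal sketch. It is not Semion's implementation: it assumes an SQLite database, represents triples as plain Python tuples, and uses a hypothetical structural vocabulary. Only dbs:Table appears in this paper; dbs:Column, dbs:Record, dbs:hasColumn, dbs:hasRecord and the data: namespace are illustrative names. The point is that tables, columns and records are described as structure, with no domain-level class or property being asserted.</p>
      <preformat><![CDATA[
```python
import sqlite3

RDF_TYPE = "rdf:type"


def triplify_structure(conn):
    """Return structural triples for every table, column and record.

    No domain semantics is assumed: a table stays a dbs:Table and is
    not promoted to an rdfs:Class at this stage.
    Vocabulary terms other than dbs:Table are hypothetical.
    """
    triples = []
    cur = conn.cursor()
    cur.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    tables = [row[0] for row in cur.fetchall()]
    for table in tables:
        table_node = f"data:{table}"
        triples.append((table_node, RDF_TYPE, "dbs:Table"))

        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        info = cur.execute(f'PRAGMA table_info("{table}")').fetchall()
        columns = [entry[1] for entry in info]
        for column in columns:
            column_node = f"data:{table}.{column}"
            triples.append((column_node, RDF_TYPE, "dbs:Column"))
            triples.append((table_node, "dbs:hasColumn", column_node))

        rows = cur.execute(f'SELECT rowid, * FROM "{table}"').fetchall()
        for rowid, *values in rows:
            record_node = f"data:{table}/{rowid}"
            triples.append((record_node, RDF_TYPE, "dbs:Record"))
            triples.append((table_node, "dbs:hasRecord", record_node))
            for column, value in zip(columns, values):
                if value is not None:
                    # The column node doubles as the predicate of the cell.
                    triples.append(
                        (record_node, f"data:{table}.{column}", f'"{value}"')
                    )
    return triples


# Usage on a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO person (name) VALUES ('Ada')")
for triple in triplify_structure(conn):
    print(triple)
```
]]></preformat>
      <p>Domain semantics would only be attached afterwards, in the refactoring step, by aligning structural terms such as dbs:Table to an ontology.</p>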
    </sec>
    <sec id="sec-3">
      <title>3. TOOL</title>
      <p>The method described in the previous section is
implemented in Semion. The tool is currently a prototype and
has been tested only on relational databases,
but it is designed to support the transformation of any
kind of data source. Figure 2 shows the reengineering
interface of the Semion tool. It helps the user define the
schema of the database, which is described using
the meta-model provided for the database structure.
Because the database may be large and
the user may not know exactly how it was
designed, the tool provides a wizard interface that
automatically extracts the RDF of the database schema. Once the
RDF of the database schema is available, the interface
allows the user to transform the data from the database into
RDF. Before extracting data from the
database, it is also possible to correct issues that derive
from poor design or poor maintenance of the database:
the tool provides functionality for setting, in the RDF of the
database schema, primary and foreign keys and any
related relations. The refactoring interface allows the user to
align the dataset to specific ontologies in order to add semantics
to the data. The alignment ontologies can be chosen following
the method that Semion implements, i.e. the dataset is first
aligned to the LMM framework, then to an ontology that
contains the distinctions of the formal semantics, and finally
to an ontology that contains the logic. Semion performs
ontology alignments through SPARQL CONSTRUCT queries, which
are generated from rules written in a human-readable
syntax (see Figure 3) of the form:</p>
      <preformat>antecedent → consequent</preformat>
      <p>
        Using this syntax, a rule asserting that being an
instance of the class Table in the dataset meta-model implies being a
Concept of DOLCE [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] would be written:
      </p>
      <preformat>dbs:Table(?x) → DUL:Concept(?x)</preformat>
      <p>This rule is interpreted as the SPARQL query:</p>
      <preformat>CONSTRUCT { ?x rdf:type DUL:Concept . }
WHERE { ?x rdf:type dbs:Table . }</preformat>
      <p>The same syntax can be used, through the Semion
tool, to write rules for transforming LMM to the FormalSemantics
vocabulary. The rules could be:</p>
      <preformat>IOLite:FormalExpression(?x) → FormalSemantics:Query(?x)
DUL:Relation(?x) → FormalSemantics:Class(?x)</preformat>
      <p>Rules for aligning the FormalSemantics vocabulary to OWL
can be written as follows:</p>
      <preformat>FormalSemantics:isSubsumedBy(?x, ?y) → rdfs:subClassOf(?x, ?y)
FormalSemantics:Class(?x) → owl:Class(?x)</preformat>
      <p>The Semion tool can be downloaded from the following URL:
http://stlab.istc.cnr.it/software/semion/tool</p>
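      <p>The translation from the rule syntax to SPARQL CONSTRUCT described in this section can be sketched as follows. This is an illustrative reading of the examples given here, not the parser used by Semion. It assumes that each side of a rule is a single atom with one or two variables, that a unary atom stands for an rdf:type triple and a binary atom for a property triple, and that the arrow may be typed either as "->" or as "→". The function names are hypothetical, and PREFIX declarations are deliberately left out.</p>
      <preformat><![CDATA[
```python
import re

# One atom: a prefixed name followed by one or two variables,
# e.g. dbs:Table(?x) or rdfs:subClassOf(?x, ?y).
ATOM = re.compile(
    r"^\s*([\w.-]+:[\w.-]+)\s*\(\s*(\?\w+)\s*(?:,\s*(\?\w+)\s*)?\)\s*$"
)


def atom_to_pattern(atom):
    """Turn a single rule atom into a SPARQL triple pattern."""
    match = ATOM.match(atom)
    if match is None:
        raise ValueError(f"unsupported atom: {atom!r}")
    term, first, second = match.groups()
    if second is None:
        # Unary atom: class membership.
        return f"{first} rdf:type {term} ."
    # Binary atom: property assertion.
    return f"{first} {term} {second} ."


def rule_to_construct(rule):
    """Translate 'antecedent -> consequent' into a CONSTRUCT query."""
    parts = re.split(r"->|→", rule)
    if len(parts) != 2:
        raise ValueError("rule must have the form 'antecedent -> consequent'")
    antecedent, consequent = parts
    return (
        f"CONSTRUCT {{ {atom_to_pattern(consequent)} }}\n"
        f"WHERE {{ {atom_to_pattern(antecedent)} }}"
    )


print(rule_to_construct("dbs:Table(?x) -> DUL:Concept(?x)"))
print(rule_to_construct(
    "FormalSemantics:isSubsumedBy(?x, ?y) -> rdfs:subClassOf(?x, ?y)"
))
```
]]></preformat>
      <p>Rules with conjunctions of atoms are outside the scope of this sketch; supporting them would mean emitting one triple pattern per atom on each side.</p>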
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>B.</given-names>
            <surname>Cuenca-Grau</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Motik</surname>
          </string-name>
          .
          <article-title>OWL 2 Web Ontology Language: Model-Theoretic Semantics</article-title>
          . http://www.w3.org/TR/2008/WD-owl2-semantics-20080411/,
          <year>2008</year>
          . Visited in April 2010.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gangemi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Guarino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Masolo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Oltramari</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Schneider</surname>
          </string-name>
          .
          <article-title>Sweetening Ontologies with DOLCE</article-title>
          .
          In
          <source>Proceedings of the 13th International Conference on Knowledge Engineering and Knowledge Management (EKAW)</source>
          , volume
          <volume>2473</volume>
          of Lecture Notes in Computer Science, page
          <fpage>166</fpage>
          , Sigüenza, Spain, Oct. 1-4,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>F. v.</given-names>
            <surname>Harmelen</surname>
          </string-name>
          and
          <string-name>
            <given-names>D. L.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          .
          <article-title>OWL Web Ontology Language Overview</article-title>
          . W3C recommendation, W3C, Feb.
          <year>2004</year>
          . http://www.w3.org/TR/2004/REC-owl-features-20040210/.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Miles</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Bechhofer</surname>
          </string-name>
          .
          <article-title>SKOS Simple Knowledge Organization System Reference</article-title>
          . W3C working draft, W3C, June
          <year>2008</year>
          . http://www.w3.org/TR/2008/WD-skos-reference-20080609/.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Picca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Gliozzo</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Gangemi</surname>
          </string-name>
          .
          <article-title>LMM: an OWL-DL MetaModel to Represent Heterogeneous Lexical Knowledge</article-title>
          .
          In
          <source>Proceedings of the Sixth International Language Resources and Evaluation (LREC'08)</source>
          , Marrakech, Morocco, May
          <year>2008</year>
          .
          <publisher-name>European Language Resources Association (ELRA)</publisher-name>
          . http://www.lrec-conf.org/proceedings/lrec2008/.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>