=Paper=
{{Paper
|id=None
|storemode=property
|title=Semion: A Smart Triplification Tool
|pdfUrl=https://ceur-ws.org/Vol-674/Paper166.pdf
|volume=Vol-674
|dblpUrl=https://dblp.org/rec/conf/ekaw/NuzzoleseGPC10
}}
==Semion: A Smart Triplification Tool==
Semion: a smart triplification tool
Andrea G. Nuzzolese, Semantic Technology Lab, ISTC-CNR, Via Nomentana 56, Rome, Italy, nuzzoles@cs.unibo.it
Aldo Gangemi, Semantic Technology Lab, ISTC-CNR, Via Nomentana 56, Rome, Italy, aldo.gangemi@cnr.it
Valentina Presutti, Semantic Technology Lab, ISTC-CNR, Via Nomentana 56, Rome, Italy, valentina.presutti@cnr.it
Paolo Ciancarini, Dept. of Computer Science, Università di Bologna, Mura Anteo Zamboni 7, Bologna, Italy, cianca@cs.unibo.it
ABSTRACT
The Web of Data is fed by “triplifiers”, tools able to transform content (often databases) to linked data. Triplifiers implement different methods and are typically based on bulk recipes which allow for no or limited customization of the process. Furthermore, consuming or refactoring their output is often difficult due to mismatches between the semantics embedded in the original structures and the RDF or OWL semantics obtained through the recipes. Semion is a method and a tool for customizing the semantics of data reengineering and refactoring and for making it explicit.

1. INTRODUCTION
Commonly accepted solutions for transforming non-RDF data sources into RDF are based on ad hoc semantics-driven approaches that make implicit assumptions on the domain semantics of the non-RDF data source (e.g. a relational database is transformed by mapping a table into an rdfs:Class, a table column into an rdf:Property and a table record into an instance of the specific RDF table class). The tool described here, Semion, implements a method that first makes no semantic assumption at the domain level, and just transforms the data source into RDF triples driven by an OWL description of the data source structure (a source meta-model), which can be defined and customized by the user. Second, the RDF triples can be modelled by aligning them to any additional RDF or OWL ontology, which acts either as a meta-level “mediator” to the required semantics (e.g. SKOS [4] or the OWL metamodel [3, 1]), or as a reference domain ontology (e.g. DOLCE, FOAF, or the Gene Ontology). In particular, we exemplify the alignment of triplified data with the Linguistic Meta-Model (LMM) [5], an OWL-DL ontology that formalizes the distinctions of semiotics.

2. METHOD
The method implemented in Semion is based on an approach that separates the reengineering process from the modelling one. The reengineering process performs the semantic lifting by simply extracting RDF triples, driven by the OWL description of the structure of the data source provided as input. The modelling process, on the other hand, introduces semantics into the extracted data set by using a semiotic-cognitive approach based on the Linguistic Meta-Model (LMM) [5]. The most important feature of LMM is its ability to support the representation of different knowledge sources developed according to different underlying semiotic theories [5]. Figure 1 shows the key concepts behind our transforming method.

Figure 1: Transforming method: key concepts.

The “Data source” bubble represents the input, a non-RDF data source that is reengineered into an RDF data set according to its type, to its structure described by an OWL meta-model, and to a defined mapping. The RDF dataset is then refactored (“Refactoring process” frame in figure 1) to the LMM framework according to specific customizable alignment rules. Once the RDF dataset is aligned to LMM, it is possible to ground it in a formal semantics and finally to express its logics.
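To make the reengineering step more concrete, the following is a minimal sketch in Python of a structure-driven extraction for a single relational table. It is not Semion's implementation. Only the class dbs:Table appears in this paper; the other meta-model terms (dbs:Column, dbs:Record, dbs:Datum, dbs:hasColumn, dbs:hasRecord, dbs:hasDatum, dbs:inColumn, dbs:hasValue), the ex: namespace and the function name reengineer are hypothetical names chosen for illustration. The point it shows is that every extracted triple talks about the structure of the source (tables, columns, records), and none asserts a domain-level class.

```python
# Sketch only. dbs:Table is taken from the paper; every other dbs: term,
# the ex: namespace and the function name are hypothetical.

RDF_TYPE = "rdf:type"


def reengineer(table_name, columns, rows):
    """Return (subject, predicate, object) triples describing one table
    purely in terms of its structure, with no domain-level classes."""
    triples = []
    table = f"ex:{table_name}"
    triples.append((table, RDF_TYPE, "dbs:Table"))

    # One resource per column, linked to its table.
    for col in columns:
        col_id = f"ex:{table_name}.{col}"
        triples.append((col_id, RDF_TYPE, "dbs:Column"))
        triples.append((table, "dbs:hasColumn", col_id))

    # One resource per record, and one per cell of that record.
    for i, row in enumerate(rows):
        record = f"ex:{table_name}/row{i}"
        triples.append((record, RDF_TYPE, "dbs:Record"))
        triples.append((table, "dbs:hasRecord", record))
        for col, value in zip(columns, row):
            cell = f"{record}/{col}"
            triples.append((cell, RDF_TYPE, "dbs:Datum"))
            triples.append((record, "dbs:hasDatum", cell))
            triples.append((cell, "dbs:inColumn", f"ex:{table_name}.{col}"))
            triples.append((cell, "dbs:hasValue", value))
    return triples


if __name__ == "__main__":
    for triple in reengineer("person", ["id", "name"], [(1, "Ada")]):
        print(triple)
```

Under these assumptions, a table named person is typed only as a dbs:Table. Whether it denotes, say, a class of people is left to the later refactoring step, where alignment rules add that kind of semantics.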
3. TOOL
The method described in the previous section is implemented in Semion. Currently the tool is still a prototype and has been tested only for transforming relational databases, but it was designed to perform the transformation of any kind of data source. Figure 2 shows the reengineering interface of the Semion tool. It helps the user to define the schema of the database structure, which is described by using the meta-model provided for the structure of the database itself. Because the database could be large and the user might not know exactly how it was designed, the tool provides a wizard interface that automatically extracts the RDF of a database’s schema. Once the RDF of the database’s schema is available, the interface allows the user to transform the data from the database to the RDF format. Before performing data extraction from the database it is also possible to correct issues derived from a bad design or a bad maintenance of the database. In fact, the tool provides functionalities to set primary and foreign keys and, where applicable, the related relations in the RDF of the database’s schema.

Figure 2: Semion tool: view of the reengineering interface.

The refactoring interface allows the user to align the dataset to specific ontologies for adding semantics to data. The alignment ontologies can be chosen following the method that Semion implements, i.e. first the dataset is aligned to the LMM framework, then to an ontology that contains the distinctions of the formal semantics, and finally to an ontology that contains the logics. Semion performs ontology alignments through SPARQL CONSTRUCT queries, which are obtained from rules written in a human-readable syntax (see figure 3) of the form:

antecedent → consequent

Figure 3: Semion tool: view of the refactorer interface.

Using this syntax, a rule asserting, for example, that being an instance of class Table in the dataset meta-model implies being a Concept of DOLCE [2] would be written:

dbs:Table(?x) → DUL:Concept(?x)

This rule is interpreted as the SPARQL query:

CONSTRUCT { ?x rdf:type DUL:Concept. }
WHERE { ?x rdf:type dbs:Table. }

The same syntax can be used, through the Semion tool, to write rules for transforming LMM to the FormalSemantics vocabulary. The rules could be:

IOLite:FormalExpression(?x) → FormalSemantics:Query(?x)
DUL:Relation(?x) → FormalSemantics:Class(?x)

Rules for aligning the FormalSemantics vocabulary to OWL can be written as follows:

FormalSemantics:isSubsumedBy(?x, ?y) → rdfs:subClassOf(?x, ?y)
FormalSemantics:Class(?x) → owl:Class(?x)

The Semion tool can be downloaded from the following URL: http://stlab.istc.cnr.it/software/semion/tool

4. REFERENCES
[1] B. Cuenca-Grau and B. Motik. OWL 2 Web Ontology Language: Model-Theoretic Semantics. http://www.w3.org/TR/2008/WD-owl2-semantics-20080411/, 2008. Visited on April 2010.
[2] A. Gangemi, N. Guarino, C. Masolo, A. Oltramari, and L. Schneider. Sweetening Ontologies with DOLCE. In Proceedings of 13th International Conference on Knowledge Engineering and Knowledge Management (EKAW), volume 2473 of Lecture Notes in Computer Science, page 166 ff, Sigüenza, Spain, Oct. 1–4 2002.
[3] F. v. Harmelen and D. L. McGuinness. OWL Web Ontology Language Overview. W3C recommendation, W3C, Feb. 2004. http://www.w3.org/TR/2004/REC-owl-features-20040210/.
[4] A. Miles and S. Bechhofer. SKOS Simple Knowledge Organization System Reference. W3C working draft, W3C, June 2008. http://www.w3.org/TR/2008/WD-skos-reference-20080609/.
[5] D. Picca, A. M. Gliozzo, and A. Gangemi. LMM: an OWL-DL MetaModel to Represent Heterogeneous Lexical Knowledge. In Proceedings of the Sixth International Language Resources and Evaluation (LREC’08), Marrakech, Morocco, May 2008. European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2008/.
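As an illustration of the rule syntax described in Section 3, the translation from a rule of the form antecedent → consequent into a SPARQL CONSTRUCT query can be sketched in Python as follows. This is a simplified sketch, not the translator shipped with Semion. The function names atom_to_triple and rule_to_construct are hypothetical, only single-atom antecedents and consequents are handled, and the reading of a two-argument atom such as rdfs:subClassOf(?x, ?y) as the triple pattern ?x rdfs:subClassOf ?y is an assumption (the paper only shows the translation of a one-argument rule). The generated query contains no PREFIX declarations, which would have to be added before it is sent to a SPARQL engine.

```python
import re

# One atom: a prefixed name followed by one or two variables in parentheses,
# e.g. dbs:Table(?x) or rdfs:subClassOf(?x, ?y).
ATOM = re.compile(
    r"^\s*([\w.-]+:[\w.-]+)\(\s*(\?\w+)\s*(?:,\s*(\?\w+)\s*)?\)\s*$"
)


def atom_to_triple(atom):
    """Turn one rule atom into a SPARQL triple pattern.

    A one-argument atom is read as class membership (rdf:type). A
    two-argument atom is read as a property between the two variables,
    which is an assumption of this sketch.
    """
    match = ATOM.match(atom)
    if match is None:
        raise ValueError(f"unsupported atom: {atom!r}")
    name, first, second = match.groups()
    if second is None:
        return f"{first} rdf:type {name} ."
    return f"{first} {name} {second} ."


def rule_to_construct(rule):
    """Translate 'antecedent → consequent' into a CONSTRUCT query string."""
    parts = re.split(r"→|->", rule)
    if len(parts) != 2:
        raise ValueError(f"expected exactly one arrow in rule: {rule!r}")
    antecedent, consequent = parts
    return (
        "CONSTRUCT { " + atom_to_triple(consequent) + " }\n"
        "WHERE { " + atom_to_triple(antecedent) + " }"
    )


if __name__ == "__main__":
    print(rule_to_construct("dbs:Table(?x) → DUL:Concept(?x)"))
    print(rule_to_construct(
        "FormalSemantics:isSubsumedBy(?x, ?y) → rdfs:subClassOf(?x, ?y)"
    ))
```

For the rule dbs:Table(?x) → DUL:Concept(?x), this sketch yields the same CONSTRUCT and WHERE patterns as the query shown in Section 3, apart from whitespace before the terminating dot.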