=Paper=
{{Paper
|id=Vol-2238/paper4
|storemode=property
|title=Design and Implementation of a Diagrammatic Tool for Creating RDF graphs
|pdfUrl=https://ceur-ws.org/Vol-2238/paper4.pdf
|volume=Vol-2238
|authors=Anca Chiș-Rațiu,Robert Andrei Buchmann
|dblpUrl=https://dblp.org/rec/conf/ifip8-1/Chis-RatiuB18
}}
==Design and Implementation of a Diagrammatic Tool for Creating RDF graphs==
https://ceur-ws.org/Vol-2238/paper4.pdf
Design and Implementation of a Diagrammatic Tool for
Creating RDF graphs
Anca Chiș-Rațiu¹ and Robert Andrei Buchmann²
Business Informatics Research Center, Faculty of Economics and Business Administration,
Babeș-Bolyai University, Cluj-Napoca, Romania
¹ achisratiu@yahoo.com, ² robert.buchmann@econ.ubbcluj.ro
Abstract. Graph databases are often presented as the next generation after relational
databases, as they allow convenient retrieval of complex network structures and employ
graph theory to store and navigate relationships. The resulting data models are
simpler and more expressive than those produced using traditional relational
databases. However, there is a shortage of tools for creating such graphs in an
intuitive visual way. Metamodeling can help in this respect, as it enables the agile
customization of tools for diagrammatic knowledge representation and editing,
under constraints imposed through metamodels.
Our goal is to present a modeling tool that was customized to support the creation
of Resource Description Framework (RDF) data graphs, by integrating notions
of Conceptual Modeling and the Agile Modeling Method Engineering
(AMME) framework, using ADOxx as an instance environment to obtain a usable
prototype - a starting point for the future evolution of an Open Models Laboratory
project. The proposed modeling tool is inspired by the model-driven code
generation paradigm, as an ADOxx script was developed to generate, out of
diagrammatic structures, RDF graphs written in the N-Triples serialization syntax.
The paper highlights benefits of the proposed modeling tool, as it resolves
fundamental user-oriented issues regarding the easy production of
knowledge graphs guided by the Linked Enterprise Data paradigm and supported
by a Conceptual Modeling "look and feel".
Keywords: Resource Description Framework, Conceptual modeling, Enterprise
knowledge graphs
1 Introduction
Web 2.0 established a large information space where people share valuable data and
contribute human-readable knowledge in various fields - see Wikipedia, a free encyclopedia
project, where many individuals are motivated to contribute by adding and
revising content. The next stage, Web 3.0, has encouraged a similar production scale
and scope for machine-readable knowledge to be made available to Linked Enterprise
Data projects [1], where experts share and retrieve large-scale connected RDF graphs
[2] that store flexible relational data enriched with domain-specific rules or schemata.
Linked Data is about information creation coupled with information sharing, where
documentation, construction and distribution processes are equal in terms of importance.
In order to support enterprise knowledge creation, conceptual modeling methods
can be employed - not only for data modeling purposes, but also for capturing
domain-specific abstractions, business entities and their relationships. Modeling methods
are deployed as tools that enable the description of a system based on instantiated
concepts defined on a metamodel level, and can be enhanced by mechanisms that are
relevant for the production or transformation of knowledge graphs.
The goal of this paper is to present a conceptual modeling tool that was customized
with the help of a metamodeling methodology to allow users to easily construct RDF
knowledge graphs by visual means. The model-driven code generation paradigm
inspired the work, in the sense that the tool generates knowledge encoded in the
machine-readable N-Triples syntax [3], ready to be uploaded to and managed in
an RDF database management system. This started as a student dissertation project
that opens the path towards establishing a research project fit to participate in the
Open Models Laboratory (OMiLAB) ecosystem [4][5].
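As a rough illustration of the generation step described above, the following Python sketch turns a list of diagram edges into N-Triples lines. It is not the ADOxx script developed for the tool, which is not reproduced in this section; the edge tuple layout, the function names `escape_literal` and `to_ntriples`, and the `http://example.org/` identifiers used when calling it are all assumptions made for the example.

```python
def escape_literal(text: str) -> str:
    """Escape the characters that must be escaped inside an N-Triples string literal."""
    return (
        text.replace("\\", "\\\\")  # backslash first, so later escapes are not doubled
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(edges) -> str:
    """Serialize diagram edges as N-Triples, one statement per line.

    Each edge is a hypothetical 4-tuple: (subject_uri, predicate_uri, object, object_is_literal).
    URIs are assumed to be valid absolute IRIs; no validation is attempted here.
    """
    lines = []
    for subject, predicate, obj, object_is_literal in edges:
        if object_is_literal:
            object_term = '"' + escape_literal(obj) + '"'
        else:
            object_term = "<" + obj + ">"
        lines.append("<" + subject + "> <" + predicate + "> " + object_term + " .")
    return "\n".join(lines)
```

Calling `to_ntriples` on an edge such as `("http://example.org/Anna", "http://example.org/worksFor", "http://example.org/ACME", False)` yields a single statement line ending in ` .`, which an RDF database management system can load directly.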
Code generation is acknowledged as a key benefit of conceptual modeling, supporting
the development of software systems in the sense that development time becomes
consistently shorter. As examples, Modeliosoft tool products [6] provide
model-driven code generation for development languages like Java (from UML classes),
C++ and SQL (generated from ER diagrams). Process automation is made possible
by generating BPEL or XPDL process descriptions from BPMN diagrams. However,
there is a shortage of tools that generate RDF serializations from visual models of
their corresponding graphs – i.e., RDF visual tools typically provide a visualization of
already created RDF graphs, instead of providing a graphical way to build such views
and then generate the corresponding machine-readable structures. In the first draft of
our prototype, the target syntax of choice is N-Triples (supported by any RDF
management system). In the future this can be extended towards a richer knowledge
editor, to be distributed for open use within the OMiLAB global network.
The motivation for creating a new modeling tool for RDF data graphs is
that enterprise knowledge is increasingly used, but the
means of creating that knowledge with an optimal learning curve for non-technical
users are limited. RDF graphs should be created in an easy way, not more complicated
than filling data in the cells of SQL tables, and not necessarily conforming to a
pre-imposed schema since RDF graphs can be schemaless databases – i.e., instance data
can be created separately from the schema; or, the schema is employed for semantic
annotation purposes, to support reasoning rather than validation. Linked Enterprise
Data is commonly lifted or derived through adapters (e.g., D2RQ [7]) from legacy
non-graph data sources. When data is created from scratch, graph creation is often
blended with ontology engineering in the same tool (form-based or text-based),
following the traditional relational database creation process – first schema, then
instance data.
We aim to minimize the effort of RDF data creation by developing a diagrammatic
tool with conceptual modeling "look and feel", potentially evolving towards a flexible
knowledge base editor (with the possibility to add domain-specific dynamic notation
and other metamodeling-powered features). In the current draft, the tool only allows
simple graph editing, limited annotation of nodes and the generation of
machine-readable serialization. The key novelty is that this is built on an open access
metamodeling platform, ADOxx [8], to ensure the evolution of the tool according to future
academic exploitation goals – e.g., teaching semantic technology with user-friendly
tool support.
The requirements selected for this tool originate in teaching goals – i.e., to provide
an intuitive RDF editing toolkit to novices, focusing on a visual building process as a
replacement for the traditional text-based or form-based building. Additionally, a
meta-requirement imposed the need for a tool that can evolve in an agile manner,
incorporating additional functionality and domain-specific annotations towards the goal
of enabling enterprise knowledge creation – hence the adopted metamodeling approach.
The remainder of the paper is structured as follows: Section 2 introduces
background on the enablers - RDF, AMME and the ADOxx metamodeling platform.
Section 3 provides methodological considerations in relation to Design Science, as a
guide for this effort. Section 4 provides comments on related works and proposed
benefits. Section 5 presents the modeling method conceptualization providing insights
about syntax, semantics, the serialization mechanism and the modeling procedure.
The paper ends with conclusions and an outlook to future developments.
2 Background on Technological Enablers
Enterprises are willing to adopt the Linked Enterprise Data concept [1] due to its
benefits regarding semantic interoperability and connectivity of data originating in legacy
silos. However, this also comes with a need for easily building connected data,
requiring no more effort than filling data tables in traditional databases. Conceptual
modeling can help in this respect, if we see it through the lens of agility – i.e.,
conceptual modeling tools that do not necessarily follow established methods (ER, UML
etc.) but are instead customized for specific requirements. In our case, these
requirements originate in the need to capture RDF semantics with (i) minimal notation, (ii) a
Conceptual Modeling "look and feel", and (iii) mechanisms for generating
machine-readable RDF from diagrammatic graphs.
Agile development accepts change and even expects it; therefore, the proposed tool
also highlights characteristics of Agile Modeling Method Engineering (AMME) as a
key methodology for customizing modeling tools for a diversity of purposes.
According to [9] and [10], AMME provides a conceptualization method that repurposes
agility principles established in software engineering, and ensures that the necessary
semantics are captured in relation to modeling needs. This framework has also been
applied in the community-oriented research environment of OMiLAB, in the creation
of a multitude of tools – see BEE-UP (an educational project for teaching
Model-Driven Software Engineering and Business Process Management topics) [11], or the
ComVantage method (a research-oriented project addressing Knowledge
Management and Enterprise Architecture Management concerns) [12]. Both mentioned tools
allow the lifting of RDF graphs from modeling languages (e.g., UML, the domain-
specific ComVantage language) – however those graphs are limited to the semantics
prescribed by the supported languages. In our case, AMME is employed to tailor a
minimal modeling language directly for the RDF semantics, without any intermediate
abstraction layer.
2.1 Resource Description Framework (RDF)
The standard technology for representing and sharing semantic information is the
Resource Description Framework (RDF), where "resources" can be anything, including
documents, people, objects or abstract concepts. In particular, RDF is used to
publish and interlink data on the Web or to represent knowledge in knowledge
management applications. RDF employs a graph-based data model, which is significantly
different from earlier interoperability standards such as XML (based on DOM and
hierarchical data structures). Data graphs are more flexible than DOM trees because
queries can navigate a graph in any direction, along arbitrary chains of
relationships. RDF databases are considered NoSQL
databases since they are queried by means other than SQL – the standard language
for this is SPARQL [13]. Consequently, RDF graphs are the main data model serving
the Linked Data paradigm. They are also related to the concept of Smart Data, if
reasoning and rule systems are deployed on top of RDF graphs.
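The point about navigating a graph in any direction can be made concrete with a small Python sketch. It is only an in-memory illustration, not SPARQL and not how an RDF database is implemented: the graph is treated as a plain list of (subject, predicate, object) edges, and the class name `TripleGraph`, its methods and the `ex:` identifiers are assumptions introduced for the example.

```python
from collections import defaultdict


class TripleGraph:
    """Minimal in-memory graph indexed so edges can be followed in both directions."""

    def __init__(self, triples):
        self.outgoing = defaultdict(list)  # subject -> [(predicate, object)]
        self.incoming = defaultdict(list)  # object  -> [(predicate, subject)]
        for subject, predicate, obj in triples:
            self.outgoing[subject].append((predicate, obj))
            self.incoming[obj].append((predicate, subject))

    def neighbours(self, node):
        """Nodes one step away from `node`, following edges forwards and backwards."""
        forward = [obj for _, obj in self.outgoing.get(node, [])]
        backward = [subject for _, subject in self.incoming.get(node, [])]
        return forward + backward

    def reachable(self, start):
        """All nodes connected to `start` through a chain of edges of any direction."""
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in self.neighbours(node):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen
```

In a tree-shaped document model, moving from a leaf to a sibling subtree requires climbing through the parent hierarchy; here every edge is indexed from both ends, so a traversal starting at one employee can reach a colleague through their shared employer without any notion of a root.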
The RDF data model is based on small units called statements, describing resources
in the form of triples of resource identifiers (URIs):