<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">RDOTE - Transforming Relational Databases into Semantic Web Data</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Konstantinos</forename><forename type="middle">N</forename><surname>Vavliakis</surname></persName>
							<email>kvavliak@issel.ee.auth.gr</email>
							<affiliation key="aff0">
								<orgName type="department">Electrical and Computer Engineering Department</orgName>
								<orgName type="institution">Aristotle University of Thessaloniki</orgName>
								<address>
									<addrLine>GR541 24</addrLine>
									<settlement>Thessaloniki</settlement>
									<country key="GR">Greece</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Informatics and Telematics Institute</orgName>
								<orgName type="institution">CERTH</orgName>
								<address>
									<addrLine>GR570 01</addrLine>
									<settlement>Thessaloniki</settlement>
									<country key="GR">Greece</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Theofanis</forename><forename type="middle">K</forename><surname>Grollios</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Electrical and Computer Engineering Department</orgName>
								<orgName type="institution">Aristotle University of Thessaloniki</orgName>
								<address>
									<addrLine>GR541 24</addrLine>
									<settlement>Thessaloniki</settlement>
									<country key="GR">Greece</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Pericles</forename><forename type="middle">A</forename><surname>Mitkas</surname></persName>
							<email>mitkas@eng.auth.gr</email>
							<affiliation key="aff0">
								<orgName type="department">Electrical and Computer Engineering Department</orgName>
								<orgName type="institution">Aristotle University of Thessaloniki</orgName>
								<address>
									<addrLine>GR541 24</addrLine>
									<settlement>Thessaloniki</settlement>
									<country key="GR">Greece</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Informatics and Telematics Institute</orgName>
								<orgName type="institution">CERTH</orgName>
								<address>
									<addrLine>GR570 01</addrLine>
									<settlement>Thessaloniki</settlement>
									<country key="GR">Greece</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">RDOTE - Transforming Relational Databases into Semantic Web Data</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">A20ADAE6987EB7CC7E5E2A1AAB698D0F</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T01:43+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Relational Databases to Ontology Transformation</term>
					<term>RDB2RDF</term>
					<term>RDF Dump</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>During the last decade, there has been intense research and development in creating methodologies and tools able to map Relational Databases to the Resource Description Framework. Although some systems have gained wider acceptance in the Semantic Web community, they either require users to learn a declarative language for encoding mappings or have limited expressivity. We therefore present RDOTE, a framework for easily transporting data residing in Relational Databases into the Semantic Web. RDOTE is available under the GNU/GPL license and provides friendly graphical interfaces, as well as enough expressivity for creating custom RDF dumps.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The large volume of data residing in relational databases led to the creation of systems for instantiating ontology schemata using relational information, with some, like D2RQ, gaining wider acceptance in the Semantic Web community. Unfortunately, all these tools require advanced user skills, as they are either built upon complex declarative languages and lack friendly user interfaces, or they provide GUIs with limited expressivity.</p><p>We present RDOTE, a system able to map multiple Relational Databases (RDB) into different ontology schemata and integrate them into a single ontology file. RDOTE is available online<ref type="foot" target="#foot_0">1</ref>,<ref type="foot" target="#foot_1">2</ref> under the GNU/GPL license and provides drag-and-drop operations, drop-down lists and recommendation mechanisms that allow users to define all the necessary mappings between tables/columns and classes/properties, in order to create domain-specific mappings according to a selected ontology schema.</p><p>The main contribution of RDOTE to Semantic Web researchers is twofold: a) it can transform datasets currently residing in one or many Relational Databases into Semantic Web data through a friendly interface, and b) it can quickly instantiate an ontology schema with real data, allowing easy experimentation with large ontology datasets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Background Information - Relevant Work</head><p>The related literature contains numerous methodologies and systems for publishing data residing in Relational Databases into the Semantic Web. D2RQ <ref type="bibr" target="#b0">[1]</ref> is the one most widely embraced by the Semantic Web community. D2RQ offers a powerful declarative language for mapping Relational Databases to ontologies; however, it provides no graphical user interface. Less prevalent systems, like RDB2Onto <ref type="bibr" target="#b3">[4]</ref>, are highly configurable too, but they also lack friendly user interfaces. On the other hand, systems like SquirrelRDF <ref type="bibr" target="#b5">[6]</ref> offer a simplistic approach to publishing RDF data from Relational Databases (still without a GUI), which may not be expressive enough for complex databases or mappings. Dartgrid <ref type="bibr" target="#b1">[2]</ref> and the ODEMapster plugin for the NeOn Toolkit <ref type="bibr" target="#b4">[5]</ref> currently offer a graphical user interface, but they have limited scope and applicability, as well as limited expressivity compared to RDOTE. Finally, Virtuoso RDF View <ref type="bibr" target="#b2">[3]</ref> comes with a graphical user interface, but it is available only for the Virtuoso database and not for other popular relational DBMSs.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">RDOTE Functionality</head><p>An overview of RDOTE's functionality is depicted in Figure <ref type="figure" target="#fig_0">1</ref>. RDOTE belongs to the category of Domain Semantics-driven Mapping Generation tools; all mappings are formulated via graphical user interfaces and stored in text files. RDOTE has a generalized application domain and is currently applied to two test cases: one in the bibliographic domain and one in the art object documentation domain, available online. For the complete transformation of an RDB to an ontology, one has to take the following steps:</p><p>1. Connect to the desired RDBMS and load the respective ontology schema: Users first have to connect to the desired RDBs (MySQL and Oracle are supported) and load an ontology schema containing the TBox of the desired domain. RDOTE can load ontology files in various formats (RDF/XML, N3, N-Triples) and/or persistent ontologies. RDOTE's GUI depicts the TBox in a tree representation, as well as the ABox of the loaded ontology. RDOTE also presents the tables contained in each connected RDB and their respective columns.</p><p>2. Write SQL queries that select the desired tuples to be processed as an RDF graph: Each query represents a result set that is used to populate an ontology class or acts as a dataset of literals for linking instances of a class with a datatype property. For unambiguous creation of instances, the primary key is required in the URI, whereas the user-selected columns are required for populating datatype properties. Thus, along with the user-defined SQL query, RDOTE automatically selects the primary keys that have been defined for the tables participating in the SQL query.</p><p>3. Define renaming options and merging strings in case there are queries containing multiple selected columns: This optional step is responsible for renaming result sets, based on regular expression pattern matching. Each tuple matching the defined regular expression will be renamed when used as a literal. Moreover, this step allows users to define merging strings for SQL queries that select data from multiple columns.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>4. Connect queries defined in Step 2 with ontology classes (Class Mappings): Next, the Class Mappings, which are responsible for creating all the ontology instances, have to be defined. SQL queries are connected to ontology classes so that, for each tuple of an SQL query, an instance of the respective class is created. If one wishes to use the actual data instead of just creating URIs, RDOTE can copy the actual tuple information into any datatype property of the ontology schema.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>5. Define conditional links of Class Mappings with other Class Mappings via object properties or, in the case of datatype properties, with SQL queries: For each Property Mapping, users select a Class Mapping, drag and drop a property from the ontology tree and, in the case of an object property, select a second Class Mapping, whereas in the case of a datatype property they select an SQL query. Next, a join condition between the first Class Mapping and the second Class Mapping/SQL query has to be defined, either by the user or by RDOTE, which can propose possible join conditions as calculated by a greedy graph algorithm traversing the SQL schema. Users can then insert any other conditional restriction that SQL92 supports.</p><p>6. Instantiate the ontology schema and store it either in text file format or in a persistent repository: Finally, users can launch RDOTE's engine, which instantiates the loaded ontology schema and creates an RDF dump of the selected relational data, which is stored either in text file format (RDF/XML, N3, N-Triples) or in a persistent storage format (MySQL and Oracle). RDOTE first creates all the instances and then links them with object properties or adds literals using datatype properties.</p><p>All the steps described above are carried out through RDOTE's friendly interfaces (screenshots are available on RDOTE's homepage). At any step in the mapping definition process, users can save their project, that is, all their mappings and database/ontology connections, and resume later.</p><p>The maximum RDF dump size RDOTE can create depends on the output format and, naturally, on the number of triples. In the case of text format, this size also depends significantly on the Java maximum heap size. In our case, with a maximum Java heap size of 1024MB, RDOTE successfully created 2 million triples in RDF/XML-ABBREV format in less than five minutes. Memory limitations do not exist in the case of persistent storage, but time limitations begin to emerge. In this case, RDOTE managed to create 10 million triples in less than five hours.</p></div>
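<div xmlns="http://www.tei-c.org/ns/1.0"><p>Steps 2 to 4 above can be illustrated with a minimal Python sketch. It is not RDOTE's implementation: the in-memory SQLite database stands in for MySQL or Oracle, the example.org URIs, the table and the function names are hypothetical, and the primary-key column is passed in by hand, whereas RDOTE selects primary keys automatically. Under those assumptions, each tuple of an SQL query becomes one typed instance whose URI contains the primary key, and selected columns are copied into datatype properties, optionally renamed by a regular expression.</p>
<p><code lang="python" xml:space="preserve"><![CDATA[
```python
import re
import sqlite3
from urllib.parse import quote

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(text):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def rename_literal(value, pattern, replacement):
    """Rewrite a literal only when the whole value matches the pattern."""
    if re.fullmatch(pattern, value):
        return re.sub(pattern, replacement, value)
    return value


def class_mapping(conn, query, pk_column, class_uri, base_uri, datatype_props):
    """Yield N-Triples lines: one typed instance per tuple of the query.

    datatype_props maps a selected column name to (property URI, rule),
    where rule is None or a (pattern, replacement) renaming pair.
    """
    for row in conn.execute(query):
        # The primary key makes the instance URI unambiguous.
        subject = base_uri + quote(str(row[pk_column]), safe="")
        yield f"<{subject}> <{RDF_TYPE}> <{class_uri}> ."
        for column, (prop, rule) in datatype_props.items():
            if row[column] is None:
                continue
            value = str(row[column])
            if rule is not None:
                value = rename_literal(value, *rule)
            yield f'<{subject}> <{prop}> "{escape_literal(value)}" .'


# Hypothetical bibliographic table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access to columns by name
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, year TEXT)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [(1, "Dune", "c1965"), (2, 'The "Hobbit"', "1937")],
)

triples = list(class_mapping(
    conn,
    "SELECT id, title, year FROM book ORDER BY id",
    "id",
    "http://example.org/onto#Book",
    "http://example.org/book/",
    {
        "title": ("http://example.org/onto#title", None),
        "year": ("http://example.org/onto#year", (r"c(\d{4})", r"\1")),
    },
))
print("\n".join(triples))
```
]]></code></p>
<p>Emitting one line per triple keeps memory use flat in this sketch, which is one plausible way to sidestep the heap-size limit discussed above when writing plain-text output.</p></div>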
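<div xmlns="http://www.tei-c.org/ns/1.0"><p>The join-condition proposal of Step 5 can be pictured as a search over the foreign-key graph of the SQL schema. The paper does not specify RDOTE's greedy algorithm, so the sketch below swaps in a plain breadth-first search that returns the shortest chain of foreign-key joins between two tables. The schema, the table names and the function propose_join are hypothetical and serve only as an illustration.</p>
<p><code lang="python" xml:space="preserve"><![CDATA[
```python
from collections import deque

# Hypothetical foreign keys: (table, column, referenced table, referenced column)
FOREIGN_KEYS = [
    ("book_author", "book_id", "book", "id"),
    ("book_author", "author_id", "author", "id"),
    ("book", "publisher_id", "publisher", "id"),
]


def propose_join(source, target, foreign_keys):
    """Return the join conditions linking source to target, or None.

    Tables are nodes and foreign keys are undirected edges; breadth-first
    search yields the chain with the fewest joins.
    """
    graph = {}
    for table, column, ref_table, ref_column in foreign_keys:
        condition = f"{table}.{column} = {ref_table}.{ref_column}"
        graph.setdefault(table, []).append((ref_table, condition))
        graph.setdefault(ref_table, []).append((table, condition))

    queue = deque([(source, [])])
    seen = {source}
    while queue:
        table, path = queue.popleft()
        if table == target:
            return path
        for neighbour, condition in graph.get(table, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [condition]))
    return None  # the two tables are not connected by foreign keys


conditions = propose_join("author", "publisher", FOREIGN_KEYS)
print(" AND ".join(conditions))
```
]]></code></p>
<p>The proposed conditions are only a starting point; as described in Step 5, the user can accept them, replace them, or add further SQL92 restrictions.</p></div>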
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Conclusions - Future Work</head><p>RDOTE provides the Semantic Web research community and domain experts with the necessary means for easily enriching ontology schemata with the vast amount of data currently residing in relational databases. It also enables quick instantiation of new ontology schemata for testing and experimentation. By allowing easy transportation of legacy data into semantically aware data structures, RDOTE aspires to bring the Semantic Web vision one step closer.</p><p>RDOTE is constantly updated. In the near future we expect to incorporate an export/import mechanism for D2RQ-compliant mapping files, as well as a graphical query builder together with complementary interfaces and mechanisms that will further facilitate and speed up the mapping creation process. Finally, further evaluation and testing on large datasets are pending.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1.</head><label>1</label><figDesc>Fig. 1. RDOTE Architecture</figDesc><graphic coords="2,152.06,511.23,311.12,130.02" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://sourceforge.net/projects/rdote/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">http://www.youtube.com/watch?v=pk7izhFeuf0</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">D2RQ - Treating non-RDF databases as virtual RDF graphs</title>
		<author>
			<persName><forename type="first">C</forename><surname>Bizer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Seaborne</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Poster presented at the International Semantic Web Conference</title>
				<imprint>
			<date type="published" when="2004-11">November 2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Dartgrid III: A semantic grid toolkit for data integration</title>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">First International Conference on Semantics, Knowledge and Grid</title>
				<imprint>
			<publisher>IEEE Computer Society</publisher>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page">12</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">RDF support in the Virtuoso DBMS</title>
		<author>
			<persName><forename type="first">O</forename><surname>Erling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Mikhailov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Conference on Social Semantic Web. LNI</title>
				<imprint>
			<publisher>GI</publisher>
			<date type="published" when="2007">2007</date>
			<biblScope unit="volume">113</biblScope>
			<biblScope unit="page" from="59" to="68" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">RDB2Onto: Relational database data to ontology individual mapping. In: Tools for acquisition, organisation and presenting of information and knowledge</title>
		<author>
			<persName><forename type="first">M</forename><surname>Laclavík</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
	<note type="report_type">Tech. rep</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Upgrading relational legacy data to the semantic web</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">B</forename><surname>Rodriguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gómez-Pérez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">WWW &apos;06: Proceedings of the 15th international conference on World Wide Web</title>
				<meeting><address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2006">2006</date>
			<biblScope unit="page" from="1069" to="1070" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title/>
		<author>
			<persName><forename type="first">D</forename><surname>Steer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">SquirrelRDF. Tech. rep.</title>
		<imprint>
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
