<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Complex Correspondences for Query Patterns Rewriting</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Pascal</forename><surname>Gillet</surname></persName>
							<email>pascalgillet@ymail.com</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">IRIT</orgName>
								<orgName type="institution" key="instit2">Université de Toulouse 2</orgName>
								<address>
									<settlement>Toulouse</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Cassia</forename><surname>Trojahn</surname></persName>
							<email>cassia.trojahn@irit.fr</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">IRIT</orgName>
								<orgName type="institution" key="instit2">Université de Toulouse 2</orgName>
								<address>
									<settlement>Toulouse</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Ollivier</forename><surname>Haemmerlé</surname></persName>
							<email>ollivier.haemmerle@irit.fr</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">IRIT</orgName>
								<orgName type="institution" key="instit2">Université de Toulouse 2</orgName>
								<address>
									<settlement>Toulouse</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Camille</forename><surname>Pradel</surname></persName>
							<email>camille.pradel@irit.fr</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">IRIT</orgName>
								<orgName type="institution" key="instit2">Université de Toulouse 2</orgName>
								<address>
									<settlement>Toulouse</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Complex Correspondences for Query Patterns Rewriting</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">2A7414669CCC23A071508E91DB365880</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T07:43+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper discusses the use of complex alignments in the task of automatic query patterns rewriting. We apply this approach in SWIP, a system that allows for querying RDF data from natural language-based queries, hiding the complexity of SPARQL. SWIP is based on the use of query patterns that characterise families of queries and that are instantiated with respect to the initial user query expressed in natural language. However, these patterns are specific to the vocabulary used to describe the data source to be queried. For rewriting query patterns, we experiment ontology matching approaches in order to find complex correspondences between two ontologies describing data sources. From the alignments and initial query patterns, we rewrite these patterns in order to be able to query the data described using the target ontology. These experiments have been carried out on an ontology on the music domain and DBpedia ontology.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Despite the fact that SPARQL is the standard de facto language for querying RDF data, its complexity may restrict its use at a large scale, specially for nonexpert RDF users. Translating natural language queries into SPARQL ones is the object of researches in both Natural Language Processing and Semantic Web fields. Within the SWIP system <ref type="bibr" target="#b10">[11]</ref>, users express queries in natural language sentences and pre-written query patterns are instantiated with respect to a syntactic analysis of the initial query. Several possible interpretations of the queries are shown to the user which selects the query he/she is interested in. Unlike other proposals, such as the one in <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b3">4]</ref>, where user queries are limited to keywords, or in <ref type="bibr" target="#b15">[16]</ref>, where users can express their queries using a visual query language, the originality of SWIP is related to the use of query patterns.</p><p>The main principle behind query patterns states that, in real applications, the submitted queries are variations of few typical query families (i.e., the family of queries asking for the actors playing in movies or the queries asking for the members of a musical band, or the albums of a musical artist). On the one hand, the use of patterns avoids exploring the whole ontology to link the semantic entities identified from the keywords since the potential relations are already expressed in the patterns. The process thus benefits from the pre-established families of frequently expressed queries for which real information needs exist.</p><p>A query pattern can also be seen as the projection of a subgraph of the underlying knowledge base, used as a mediator for the translation between the query expressed in natural language and the corresponding SPARQL query. On the other hand, one of the main limitations of the query pattern-based approach is that reusing patterns across different data sources can be done in a very limited extent. For each data source to be queried, the corresponding query patterns have to be (manually) built. Rewriting query patterns based on a vocabulary to query patterns based on another vocabulary is a task that can be carried out with the help of ontology alignments.</p><p>This paper discusses the use of complex correspondences for automatic query patterns rewriting. While the usefulness of simple correspondences has long been recognised, query rewriting requires more expressive links between ontology entities expressing the true relationships between them. At a lower level of abstraction, ontology alignments have been used to support the task of SPARQL query rewriting <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b8">9,</ref><ref type="bibr" target="#b7">8]</ref>. However, query patterns and SPARQL refer to different levels of expressivity in their representations, where query patterns are articulated as a set of subpatterns and rely solely on the T Box of ontologies. Hence, we experiment ontology matching approaches in order to find complex correspondences between two ontologies describing two different data collections. From the complex alignments and source query patterns, we rewrite these patterns in order to be able to query the data collection described using the target ontology. Experiments have been carried out on an ontology on the music domain and DBpedia ontology. As a main outcoming, we have a set of manually validated complex correspondences, from which query patterns can be reused across the data sets described using those ontologies.</p><p>The rest of the paper is organised as follows. First, we introduce complex correspondences and query patterns ( §2). Then, we present the approach for query pattern rewriting that is based on the use of complex correspondences ( §3). Next, the experiments are discussed ( §4). Finally, we discuss related work ( §5) and conclude the paper ( §6).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Foundations</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Complex correspondences</head><p>Matching two ontologies is the process of generating an alignment between them <ref type="bibr" target="#b5">[6]</ref>. An alignment A is directional and refers to a source ontology O and a target ontology O , denoted A O→O : Definition 1 (Alignment). An alignment A O→O between two ontologies O and O is a set of correspondences A O→O = {c 1 , c 2 , ..., c n }, where each c i is a triple e i , e i , r , where:</p><p>whether the correspondence is simple, then it relates one and only one entity (i.e., a class or a property) e i of O to one and only one entity e i of O (1:1);</p><p>or the correspondence is complex, and it involves one or more entities in a logical formulation (1:n,m:1,m:n), where e i refers to a subset of elements ∈ O, and e i refers to a subset of elements ∈ O , and these elements are related using the constructors of a formal language (First-Order Logic or Description Logics); r is a relation, e.g., equivalence (≡), more general( ), more specific ( ),</p><p>holding between e i and e i ; additionally, a value n (typically in [0,1]) is assigned to c i indicating the degree of confidence that the relation r holds between the e and e .</p><p>The correspondence e i , e i , r is unique in A O→O . On the other hand, e i or e i may be present in more than one correspondence c i . The alignment A O→O is said complex if it contains at least one complex correspondence. In the rest of this paper, a correspondence c i , which is a triple e i , e i , r (suffix notation), will be noted e i r e i . For example, F ilm W ork is a simple correspondence and asserts that F ilm in O is more specific than W ork in O . On the other hand, we can have the following two complex correspondences :</p><formula xml:id="formula_0">∀x, Short F ilm(x) ≡ F ilm(x) ∧ duration(x, y) ∧ y ≤ 59 (1) ∀x, Biopic(x) ≡ F ilm(x) ∧ Celebrity(y) ∧ topic(x, y)<label>(2)</label></formula><p>(1) asserts that a Short F ilm in O is equivalent to a F ilm in O whose duration is less than 59 minutes, and (2) asserts that a Biopic (or biographical film) in O is equivalent to a F ilm in O whose the topic is about a famous person. We can either use First-Order Logic (FOL) or the constructors of Description Logics (DL) for expressing the complex correspondences. In FOL, a class corresponds to a unary predicate with one variable, a property to a binary predicate with two variables, and an instance to a constant. In DL, a class corresponds to a (atomic) concept, a property to a role, and an instance to an individual. We can equally transpose logical statements from DL to FOL, and conversely, as long as the DL fragment is always respected. In particular, we take advantage of the expressivity allowed by SHOIN , describing the following DL operators: ¬C (negation of concepts), C C (intersection or conjunction of concepts), C C (union or disjunction of concepts), ∃R.C (existential restriction), ∀R.C (universal restriction), ≤ nR (at most restriction), ≥ nR (at least restriction). For instance, the formula (1) becomes (3) and the formula (2) becomes (4) :</p><formula xml:id="formula_1">Short F ilm ≡ F ilm ∃duration. ≤ 59 (3) Biopic ≡ F ilm ∃topic.Celebrity<label>(4)</label></formula><p>In a (complex) correspondence formula expressed in FOL, a variable may occur in several entities in left and right operands. Intuitively, two classes referring to the same single variable are bound and are first connected by a simple correspondence (with a subsumption relation, otherwise there is no need for an additional complex correspondence to characterise the entities involved). For instance, the formulas (1) and ( <ref type="formula" target="#formula_0">2</ref>) are consistent only if Court metrage F ilm and Biopic F ilm, respectively. The use of DL in (4) requires to explicitly assert the simple correspondence Biopic F ilm first, as we do not know if Biopic is a specialisation or rather a generalisation of F ilm or Celebrity otherwise.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Query patterns</head><p>A pattern p O targetting an ontology O is composed of an RDF graph which is the prototype of a relevant family of queries. A pattern can be composed of several subpatterns sp i , such as p O = {sp 1 , sp 2 , ..., sp n }. Subpatterns are assigned minimal and maximal cardinalities, making these subgraphs optional or repeatable when generating the final SPARQL query. Formally, a query pattern can be defined as follows <ref type="bibr" target="#b11">[12]</ref>:</p><p>Definition 2 (Query pattern). Let G be a graph and v a vertex of the graph, we denote by G\v the graph deprived of the vertex v and all the arcs incident to the vertex. A query pattern p is a triple (G, Q, SP ) such as :</p><p>-G is a RDF connected graph that describes the general structure of the pattern and represents a family of requests; -Q is a subset of elements in G, called qualifier elements; these are typical of the pattern and will be taken into account during the association of the user request to the pattern in question. A qualifier element is either a vertex (class or data type), or an arc (object or data property) in G; -SP is the set of subpatterns sp in p such that ∀sp = (SG, v, card min , card max ) ∈ SP , we have:</p><p>• SG is a subgraph of G and v is a vertex of SG (and thus of G), such as G\v is not connected (v is a joint vertex in G, also called junction vertex) and admits SG\v as a connected component (i.e. all the vertices in this connected component belong to the subpattern subpattern's graph); • At least one vertex or an arc of SG is a qualifier element; • card min , card max ∈ N et 0 ≤ card min ≤ card max ; are respectively the minimum and maximum cardinality of sp that define the optional and repeatable characteristics of sp.</p><p>Figure 2.2 shows an example of a query pattern which deals with the events, and the performers involved, where musical works have been performed (or conversely) with the corresponding artist(s) and release date. The pattern is composed of three subpatterns named live, artist and date. All of them are optional, and only the subpatterns live and artist are repeatable: it is considered that a musical work can have only one release date, but a musical work may be the work of several artists and can be performed many times.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Patterns rewriting approach</head><p>The rewriting approach takes as input a complex alignment A O→O and a set P ={p O 1 ,...,p O n } of query patterns p O i , and outputs a set</p><formula xml:id="formula_2">P ={p O 1 ,...,p O n } of query patterns p O j .</formula><p>The intuition is that every subgraph from the input patterns has potentially a (complex) correspondence associating its entities to entities in the target ontology. We consider that the subpattern is the relevant unit of semantic information constituting the patterns. Each subpattern is ideally replaced with Fig. <ref type="figure">1</ref>. Query pattern asking for events and their performers, where musical works have been performed (or conversely), with the corresponding artist(s) and release date.</p><p>an equivalent subgraph corresponding to a logical statement relating concepts and properties of the target ontology. This statement is the target part of the correspondence, if any in the alignment, in which the source part matches the initial subpattern. But the subpatterns and the correspondences in the alignment may not have the same granularity (correspondences can be either simple or can relate smaller subgraphs). Thus, we define an algorithm that is similar to a Depth-First Search algorithm (DFS) for traversing and searching graph data structures in the input query patterns. It starts at the largest subgraph, i.e. the subpattern, and recursively explores its subgraphs (i.e. subpattern &gt; RDF triples &gt; classes and properties), until a correspondence is found for the considered subgraph (in which case, the target graph is written to the subpattern being outputted) or a class or property is reached. If at the end of this process, there are entities that have not been translated, the whole subpattern will be discarded<ref type="foot" target="#foot_0">1</ref> . The operation is repeated for each subpattern in the input patterns.</p><p>The approach is inherently limited by the use of ontology alignment, which is itself an incomplete process. The subpattern is the indivisible expression of a need for information: it can be rewritten by chunks but if it is not fully rewritten at the end of the process, it is discarded. Thus, the conservation of the semantics of original patterns directly depends on the completeness of the input alignment (coverage of the source ontology, quality of correspondences, etc.). We consider that some loss of (semantic) information is acceptable, and that it can be filled with other techniques (for instance, by interacting with the user). However, it is out of the scope of this paper. Figure <ref type="figure">3</ref> illustrates the rewriting of the pattern depicted in Figure <ref type="figure" target="#fig_0">2</ref>.2 (with the input alignment given later in §4.3, Table <ref type="table">1</ref>). The subpattern named live is rewritten to live', following the complex correspondence #6 in the alignment. The subpattern named artist is rewritten to artist', following the complex correspondence #2 in the alignment (in this case, only the first term of the disjunction artist author creator musicComposer appears in the resulting subpattern, for the sake of simplicity and readability).</p><p>Finally, the subpattern named date could not be directly rewritten since there is no correspondence for this subgraph. Instead, the property release date is rewritten to releaseDate, following a simple correspondence, and the data type xsd:date remains xsd:date. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Preliminary experiments</head><p>In a first series of experiments, we used a set of simple correspondences for rewriting patterns. These correspondences come from a merge of alignments generated by OAEI 2012 matching systems (the reader can refer to <ref type="bibr" target="#b6">[7]</ref> for details). Overall, 67% of the entities in the Music ontology were covered in the alignment. 25 out of 60 entities in the query patterns for this ontology could be replaced by a target entity (coverage of 41%). In terms of subpatterns, only 2 out of the 19 subpatterns could be fully rewritten using the alignment. Although we have found a handmade (reference) alignment between Music and DBpedia ontologies<ref type="foot" target="#foot_4">5</ref> , we could not use it because it is mostly composed by correspondences linking classes only, using subsumption relations, and few equivalences could be inferred from them. Despite the fact that the quality of the alignment from the matchers was not measured, these first experiments highlighted the limitations in replacing individually the entities in the patterns. Nevertheless, we needed to accurately assess this deficiency and to know to what extent we could rewrite query patterns. It turned out that simple correspondences are not sufficient to capture all the meaningful relations between entities of two related ontologies.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Complex correspondences</head><p>Very few systems are able to find complex correspondences. First, we tried the tool described in <ref type="bibr" target="#b13">[14]</ref>, which finds complex correspondences using a set of predefined patterns. Besides the two ontologies to align, this tool takes an alignment as input. We used the tool on the pair Music-DBpedia, with (i) the handmade alignment between Music and DBpedia ontologies, and (ii) the merged alignment from the matchers used in the preliminary experiments ( §4.2). In both cases, few correct correspondences could be identified. We tried then the successor of this tool described in <ref type="bibr" target="#b14">[15]</ref>, which benefits of natural language processing techniques instead of requiring an input alignment, and we obtained similar results.</p><p>Hence, we manually created a set of 28 complex correspondences (along 11 simple ones) for the pair Music-DBpedia, guided by the query patterns for Music <ref type="foot" target="#foot_5">6</ref> . A subset of them is presented in Table <ref type="table">1</ref>. The idea behind using complex correspondences is that every subgraph pattern has potentially a (complex) correspondence in the target ontology. Given that the subpattern is the relevant unit of semantic information constituting the patterns, we isolated them, and for each of them, we tried to find an equivalent logical statement relating concepts and properties of the target ontology. For constructing the complex correspondence set, we used as basis a set of simple correspondences (the left operand refers to an entity of the Music ontology, and the right operand refers to an entity of the DBpedia ontology): MusicalManifestation MusicalWork, MusicalWork ≡ Musical-Work*, MusicArtist ≡ MusicalArtist*, Performance Event, Performer ≡ Mu-sicalArtist, foaf:Group Band, MusicGroup ≡ Band*, SoloMusicArtist Mu-sicalArtist, Track MusicalWork, Track ≡ Song, and Record MusicalWork<ref type="foot" target="#foot_6">7</ref> .</p><p>For each correspondence, we identified the complex correspondence patterns that characterise it, from the patterns proposed in the literature: Class  <ref type="table">1</ref>. 12 out of 28 handmade complex correspondences between Music and DBpedia ontologies.</p><p>by Attribute Type <ref type="bibr" target="#b13">[14]</ref> (CAT), Class by Inverse Attribute Type <ref type="bibr" target="#b13">[14]</ref> (CAT-1), Class by Attribute Value <ref type="bibr" target="#b13">[14]</ref> (CAV), Attribute Value Restriction <ref type="bibr" target="#b17">[18]</ref> (AVR), equivalent to CAV, Property Chain <ref type="bibr" target="#b13">[14]</ref> (PC), Aggregation <ref type="bibr" target="#b17">[18]</ref> (AGR), equivalent to PC, Inverse Property <ref type="bibr" target="#b14">[15]</ref> (IP), Union <ref type="bibr" target="#b17">[18]</ref> (OR), and Intersection <ref type="bibr" target="#b17">[18]</ref> (AND). Although no new pattern was discovered, stating that, for our case, the patterns proposed in the literature cover all types of generated correspondences, several of our correspondences are in fact compositions of them (Table <ref type="table">1</ref>). For instance, the left operand in the correspondence #4 is an assembly of the patterns CAV and CAT. In the scope of this paper, however, we do not define any algebra which would describe how patterns can be composed or associated to represent the structure of complex correspondences (definition of basic properties and laws such as associativity, commutativity and distributivity). We have also manually generated 52 multilingual complex correspondences (along 13 simple correspondences) for the Cinema and DBpedia ontologies. For instance, the correspondence expressing the relation between the artists that are awarded the Cesar Award in Cinema (source ontology) and DBpedia (target ontology) : Artiste ∃estRecompenseA.CesarDuCinema ≡ Artist ∃cesarAward(Award ∃event.F ilmF estival).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4">Rewriting SPARQL queries and query patterns</head><p>From the set of complex correspondences for the pair Music-DBpedia and the query patterns for Music, we applied our approach ( §3) for rewriting Music patterns in terms of the DBpedia vocabulary. Before rewriting patterns, we evaluated the use of these correspondences for rewriting SPARQL queries. For doing so, we defined a set of rules for translating a complex correspondence pattern into RDF graph patterns (Table <ref type="table" target="#tab_1">2</ref>). These rules are intended to be used for guiding the process of SPARQL rewriting. Following these rules, we managed to rewrite the 25 first SPARQL queries from the benchmark training data in QALD 2013 <ref type="foot" target="#foot_7">8</ref> . The training data include 100 natural language questions for MusicBrainz with the corresponding SPARQL queries, as well as the answers these queries retrieve. The queries have been rewritten in order to interrogate DBpedia. As an example, consider the query in Table <ref type="table">3</ref> asking if there are members of the Ramones who are not named Ramone (question #25) over MusicBrainz, and the same query rewritten for DBpedia. The MusicBrainz result answers false, while the DBpedia result asserts the opposite. The DBpedia request is not less correct: if we return the actual query solutions (SELECT) instead of testing whether or not the query pattern has a solution (ASK), we find that Clem Burke in DBpedia was a member of The Ramones under the name "Elvis Ramone", while the MusicBrainz data set directly refers to him with this alias. In fact, both sets of instances do not fully intersect and they are not necessarily/correctly interlinked <ref type="foot" target="#foot_8">9</ref> . From this point of view, 18 of the 25 rewritten queries are correct and consistent with the queries for MusicBrainz: they do not necessarily give the same results, but they do answer the same question. 3 of these 18 results give the same number of solutions with exactly the same literals. 5 out of the 7 remaining results give no solution at all (no instance). And finally, the 2 last results are not fully correct since the complex correspondences ahead are not correct themselves. For instance, M usicalM anif estation ∃release type.live M usicalW ork ∃recordedIn.P opulatedP lace turns out to be erroneous since the albums which have been recorded in a recording studio are equally selectable that are further (semi-automatically) enriched by using linguistic, structural and background knowledge-based strategies. Although different strategies have been proposed, very few matchers for generating complex correspondences are available or use EDOAL, an expressive alignment language <ref type="bibr" target="#b2">[3]</ref>, for representing them. With respect to query patterns rewriting, the problem can be seen, at a lower level of abstraction, as a problem of SPARQL rewriting. Correndo et al. <ref type="bibr" target="#b1">[2]</ref> propose a set of SPARQL rewriting rules exploiting both (complex) ontology alignments and entity co-reference resolution. Zheng et al. <ref type="bibr" target="#b19">[20]</ref> propose to rewrite SPARQL queries from different contexts, where context mappings provide the articulation of the data semantics for the sources and receivers. Makris et al. <ref type="bibr" target="#b8">[9,</ref><ref type="bibr" target="#b7">8]</ref> present the SPARQL-RW rewriting framework that applies a set of predefined (complex) correspondences. They define a set of correspondence types that are used as basis for the rewriting process (i.e., Class Expression, Object Property Expression, Datatype Property, and Individual ). However, the way the set of complex correspondences is established is not described. Our naive approach for rewriting query patterns is close to these SPARQL rewriting proposals in the sense of using complex correspondences. One of the difficulties is the lack of established ways for automatically identifying them. Although m:n complex correspondences are proposed at conceptual level, few concrete examples are available in the literature. Most of our correspondences are n:m. As most proposals, we start from a set of (automatically) discovered simple correspondences. Finally, our method is applied in an applicative context of rewriting patterns for a question answering system over RDF data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ID</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Conclusions and future work</head><p>This paper has discussed the use of complex correspondences for rewriting query patterns, aiming at reusing query families across data sets that overlap. Although we could not fully evaluate the rewriting process mainly due to the fact that SWIP does not treat pattern disjunctions, we were able to validate our approach on a subset of manually validated complex correspondences. This opens several opportunities for future work. First, the structure of query patterns in SWIP could evolve so that they match the structure of the complex correspondences we have established. In particular, SWIP would benefit from the disjunction of subpatterns or specification of instances in patterns. Second, we plan to represent our complex correspondences using EDOAL. Third, we plan to formalise the composition of complex correspondence patterns, which are thereby the building blocks to obtain richer correspondences, following a grammar defining a set of rules for rewriting logical statements in FOL or DL. The grammar must define the properties of pattern precedence, transitivity, associativity, commutativity, and distributivity. Finally, we plan to propose an approach for (multilingual) complex correspondence generation, exploiting specially the ABox of ontologies.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Example of query pattern rewriting.</figDesc><graphic coords="6,153.52,183.14,305.25,213.38" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 .</head><label>2</label><figDesc>∃Q.T) ?x a A → {?x a B . {B R ?y} UNION {B Q ?z}} AND A ≡ B (∃R.T ∃Q.T) ?x a A → {?x a B . ?x R ?y ; Q ?z .} Complex correspondence patterns and SPARQL rewriting rules.</figDesc><table><row><cell></cell><cell>Formal pattern</cell><cell>SPARQL rewriting rule</cell></row><row><cell cols="2">CAT A ≡ ∃R.B</cell><cell>?x a A → { ?x R B }</cell></row><row><cell cols="2">CAT-1 A ≡ B ∃R-.T</cell><cell>?x a A → { ?x a B . ?y R ?x }</cell></row><row><cell cols="2">CAV A ≡ ∃R.{...}</cell><cell>?x a A → { ?x R "..."ˆˆex:dataType }</cell></row><row><cell cols="2">AVR A ≡ B ∃R.{...}</cell><cell>?x a A → { ?x a B . ?x R SomeValue . }</cell></row><row><cell>PC</cell><cell>R ≡ P.(A ∃Q)</cell><cell>?x R ?y → { ?x P A . A Q ?y }</cell></row><row><cell>IP</cell><cell>R-P</cell><cell>?x R ?y → ?y P ?x</cell></row><row><cell>OR</cell><cell>A ≡ B (∃R.T</cell><cell></cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">In this case, the pattern is still connected, i.e. there is a chain connecting each pair of vertices.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">http://musicontology.com/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">http://ontologies.alwaysdata.net/cinema</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">http://wiki.dbpedia.org/Ontology?v=181z</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">http://knoesis.org/projects/BLOOMS/#Resources_for_Download</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_5">The resulting alignment do not cover all possible correspondences between Music-DBpedia, but a subset of them where entities appear in the query patterns.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_6"><ref type="bibr" target="#b6">7</ref> Correspondences with an asterisk were discovered using OAEI matchers.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_7">Open challenge on Multilingual Question Answering over Linked Data: http:// greententacle.techfak.uni-bielefeld.de/ ~cunger/qald/index.php?x=task1&amp;q=3</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_8">http://wiki.dbpedia.org/Interlinking?v=vn</note>
		</body>
		<back>
			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>within the property recordedIn: we wrongly thought it was reserved for live performances only. Thus, these results allowed us to validate our complex correspondences and remove those that prove to be incorrect.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ASK ASK WHERE {</head><p>WHERE { ?band foaf:name 'Ramones' .</p><p>?band foaf:name 'Ramones'@en . ?artist foaf:name ?artistname .</p><p>?artist foaf:name ?artistname . ?artist mo:member of ?band .</p><p>{?band dbo:bandMember ?artist} UNION {?band dbo:formerBandMember ?artist} . FILTER (NOT regex(?artistname,"Ramone")) FILTER (NOT regex(?artistname,"Ramone")) } } Table <ref type="table">3</ref>. DBpedia rewritten query (right) from MusicBrainz query (left). Namespace prefix bindings were ommitted (dbo refers to dbpedia and mo to music).</p><p>Next, we rewrote the Music query patterns in terms of the DBpedia vocabulary. Using the 11 simple correspondences and the 28 complex correspondences, we achieved a rewriting percentage of 90% of the Music patterns: on the 19 subgraphs (subpatterns) identified in the patterns for Music, we were able to transform 17 of them. For the Cinema patterns, we were able to rewrite 45 out of 51 subpatterns from the Cinema IRIT query patterns. Then, the patterns rewritten from Music were injected in the SWIP system along the DBpedia data set, in order to demonstrate the relevance of rewriting query patterns in the whole process. We managed to run five queries from QALD and originally intended to MusicBrainz. The generated SPARQL queries are (semantically) correct as long as (i) the correspondences involved do not apply any disjunction of terms, which is not currently supported in SWIP (in this case, only the most likely term is kept), and (ii) the source and target in the correspondences involved have the same information level (basically, equivalence).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Related work</head><p>Rewriting query patterns using complex correspondences is the novel aspect of this paper. With respect to complex correspondence generation, different approaches have emerged in the literature in the last years. A common approach is based on complex correspondence patterns <ref type="bibr" target="#b17">[18,</ref><ref type="bibr" target="#b16">17,</ref><ref type="bibr" target="#b13">14,</ref><ref type="bibr" target="#b14">15]</ref> ( §4.3). Walshe <ref type="bibr" target="#b18">[19]</ref> proposes refining elementary correspondences by identifying which correspondence pattern best represents a given correspondence. Following a different strategy, Qin et al. <ref type="bibr" target="#b12">[13]</ref> propose an iterative process that combines terminological, structural, and instance-based matching approaches for mining frequent queries, from which complex matching rules (represented in FOL) are generated. Nunes et al.</p><p>[10] present a two-phase instance-based technique for complex datatype property matching, where the first phase identifies simple property matches and the second one uses a genetic programming approach to detect complex matches. Recently, Arnold <ref type="bibr" target="#b0">[1]</ref> uses state-of-the-art matchers for generating initial correspondences</p></div>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Semantic enrichment of ontology mappings: Detecting relation types and complex correspondences</title>
		<author>
			<persName><forename type="first">P</forename><surname>Arnold</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">25th GI-Workshop on Foundations of Databases</title>
				<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">SPARQL Query Rewriting for Implementing Data Integration over Linked Data</title>
		<author>
			<persName><forename type="first">G</forename><surname>Correndo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Salvadores</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Millard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Glaser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shadbolt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">1st International Workshop on Data Semantics</title>
				<meeting><address><addrLine>DataSem</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2010-03">2010. March 2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The Alignment API 4.0</title>
		<author>
			<persName><forename type="first">J</forename><surname>David</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Euzenat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Scharffe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Trojahn</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Semantic Web</title>
				<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="3" to="10" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Keyword search over rdf graphs</title>
		<author>
			<persName><forename type="first">S</forename><surname>Elbassuoni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Blanco</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 20th ACM International Conference on Information and Knowledge Management, CIKM &apos;11</title>
				<meeting>the 20th ACM International Conference on Information and Knowledge Management, CIKM &apos;11</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="237" to="242" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Searching RDF Graphs with SPARQL and Keywords</title>
		<author>
			<persName><forename type="first">S</forename><surname>Elbassuoni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ramanath</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Schenkel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Weikum</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Data Eng. Bull</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="16" to="24" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Ontology Matching</title>
		<author>
			<persName><forename type="first">J</forename><surname>Euzenat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Shvaiko</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2007">2007</date>
			<publisher>Springer-Verlag</publisher>
			<pubPlace>Berlin, Heidelberg</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Réécriture de patrons de requêtes à l&apos;aide d&apos;alignements d&apos;ontologies</title>
		<author>
			<persName><forename type="first">P</forename><surname>Gillet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Trojahn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Haemmerlé</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">Atelier Qualité et Robustesse dans le Web de Données</title>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">SPARQL-RW: transparent query access over mapped RDF data sources</title>
		<author>
			<persName><forename type="first">K</forename><surname>Makris</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Bikakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Gioldasis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Christodoulakis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">15th International Conference on Extending Database Technology</title>
				<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="610" to="613" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Ontology mapping and SPARQL rewriting for querying federated RDF data sources</title>
		<author>
			<persName><forename type="first">K</forename><surname>Makris</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Gioldasis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Bikakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Christodoulakis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2010 Conference on On the Move to Meaningful Internet Systems</title>
				<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="1108" to="1117" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Complex Matching of RDF Datatype Properties</title>
		<author>
			<persName><forename type="first">B</forename><surname>Nunes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Casanova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Breitman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Leme</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">6th Workshop on Ontology Matching</title>
				<imprint>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">A Semantic Web Interface Using Patterns: The SWIP System</title>
		<author>
			<persName><forename type="first">C</forename><surname>Pradel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Haemmerlé</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Hernandez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Graph Structures for Knowledge Representation and Reasoning</title>
				<meeting><address><addrLine>Berlin Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="172" to="187" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Des patrons modulaires de requêtes sparql dans le système swip</title>
		<author>
			<persName><forename type="first">C</forename><surname>Pradel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Haemmerlé</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Hernandez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">Journées d&apos;Ingénierie des Connaissances</title>
		<imprint>
			<biblScope unit="page">23</biblScope>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Discovering executable semantic mappings between ontologies</title>
		<author>
			<persName><forename type="first">H</forename><surname>Qin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lependu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">OTM International Conference</title>
				<imprint>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="832" to="849" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">A pattern-based ontology matching approach for detecting complex correspondences</title>
		<author>
			<persName><forename type="first">D</forename><surname>Ritze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Meilicke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Sváb-Zamazal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Stuckenschmidt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">4th Workshop on Ontology Matching</title>
				<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Linguistic analysis for complex ontology matching</title>
		<author>
			<persName><forename type="first">D</forename><surname>Ritze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Völker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Meilicke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Sváb-Zamazal</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">5th Workshop on Ontology Matching</title>
				<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Nitelight: A graphical editor for sparql queries</title>
		<author>
			<persName><forename type="first">A</forename><surname>Russell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">R</forename><surname>Smart</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Poster and Demo Session at the 7th International Semantic Web Conference</title>
				<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">Correspondence Patterns Representation</title>
		<author>
			<persName><forename type="first">F</forename><surname>Scharffe</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
			<pubPlace>Innsbruck</pubPlace>
		</imprint>
		<respStmt>
			<orgName>University of Innsbruck</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">PhD thesis</note>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Correspondence patterns for ontology alignment</title>
		<author>
			<persName><forename type="first">F</forename><surname>Scharffe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Fensel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Knowledge Engineering: Practice and Patterns</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="83" to="92" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Identifying complex semantic matches</title>
		<author>
			<persName><forename type="first">B</forename><surname>Walshe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">9th International Conference on The Semantic Web: Research and Applications</title>
				<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="849" to="853" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">SPARQL Query Mediation over RDF Data Sources with Disparate Contexts</title>
		<author>
			<persName><forename type="first">X</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">E</forename><surname>Madnick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">WWW Workshop on Linked Data on the Web</title>
				<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
