=Paper=
{{Paper
|id=Vol-2469/ERDemo09
|storemode=property
|title=Virtualized Ontology Query By Example
|pdfUrl=https://ceur-ws.org/Vol-2469/ERDemo09.pdf
|volume=Vol-2469
|authors=Lucas Peres,Ticiana L. Coelho da Silva,Jose Macedo,David Araujo
|dblpUrl=https://dblp.org/rec/conf/er/PeresSMA19
}}
==Virtualized Ontology Query By Example==
Lucas Peres, Ticiana L. Coelho da Silva, Jose Macedo, and David Araujo

Insight Data Science Lab, Fortaleza - CE, BR

{lucasperes,ticianalc,jose.macedo,david}@insightlab.ufc.br

http://www.insightlab.ufc.br

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

===Abstract===
The Web has evolved to include a large variety of data, usually published in RDF, from multiple domains. A recurrent problem in the recent literature is how to search over RDF without writing structured queries in triple-pattern-based languages like SPARQL, in which only expert programmers can precisely specify their information needs. In this paper, we propose Von-QBE, an open-source tool for querying RDF databases without any technical knowledge of RDF or of the structure of the queried ontology. It differs from state-of-the-art tools by being schema-based instead of instance-based. Instance-based approaches can be impractical in big data scenarios, where the RDF data is huge and keeping the knowledge base in memory demands substantial computational resources. Moreover, most of these solutions need the knowledge base materialized into RDF (or triplified), which can be costly for legacy bases. We present various demonstration scenarios using the IMDB movie ontology.

Keywords: RDF schema · SPARQL query · Query by Example

===1 Introduction===
The Web has evolved from a network of linked documents to one where both documents and data are linked, resulting in what is commonly known as the Web of Linked Data, which includes a large variety of data, usually published in RDF, from multiple domains. Like any database model, RDF requires formal query languages to retrieve information. Tools that search over RDF data are becoming increasingly important, since writing structured queries in triple-pattern-based languages like SPARQL [8] can be extremely difficult for non-technical users. Consider an example question such as "Find the title of action movies produced in Northern America and the name of their company".

A possible SPARQL formulation, assuming a user who is familiar with the schema of the underlying knowledge base and knows which entities are present in the data instances, could be the following:

 SELECT DISTINCT ?x ?title ?company_name WHERE {
   ?x a mo:Movie;
      mo:title ?title;
      mo:isProducedBy ?y;
      mo:belongsToGenre [ a mo:Brute_Action ] .
   ?y :companyName ?company_name;
      :hasCompanyLocation [ a mo:Northern_America ] .
 }

This complex query requires familiarity with the knowledge base, which in general no user (technical or not) should be expected to have. Von-QBE (which stands for Virtualized Ontology Query By Example) addresses the following problem: given a natural language question QN and an underlying ontology O, translate QN into a formal SPARQL query QS that captures the information need expressed by QN. Von-QBE focuses on queries that emphasize classes and the relations between them; it does not consider aggregation, disjunctive, or negation queries.

A considerable number of question answering approaches for RDF data have been proposed, for example [6], [1], [7], and [9]. They address the same problem as Von-QBE; however, they are instance-based, which can be infeasible in big data scenarios where the RDF data is huge and keeping the knowledge base in memory demands substantial computational resources. Moreover, most of these solutions need the knowledge base materialized into some RDF format (triplified), which can be a hard task for legacy bases. Von-QBE is described in detail in [5]. The name Von-QBE derives from the term virtual ontology: since it is not instance-based, it can use an ontology virtualized by tools like Ontop [2] instead of materialized RDF stores like Virtuoso<sup>1</sup>.
All in all, Von-QBE is an open-source<sup>2</sup> schema-based approach for querying RDF data without any previous knowledge of the ontology or RDF technical skills. It lets the user pose queries as natural language questions or keyword searches and translates the search into a formal SPARQL query. Furthermore, Von-QBE assists the user in constructing the search query interactively by providing examples. A screencast of Von-QBE is available on YouTube<sup>3</sup>. To the best of the authors' knowledge, this work is the first to address the problem of question answering over RDF using only the schema.

In the next section, we present the main components of Von-QBE. In Section 3, we present the demonstration scenarios. Section 4 draws the final conclusions.

<sup>1</sup> https://virtuoso.openlinksw.com/

<sup>2</sup> http://github.com/insightlab/von-qbe

<sup>3</sup> https://youtu.be/ScXgGzhbx50

===2 Von-QBE===
Von-QBE addresses the problem of question answering over RDF. Beyond that, it also helps the user construct the search interactively by providing examples. Figure 1 shows the Von-QBE architecture, which comprises three main components: Fragment Extractor, Fragment Expansor, and Query Builder. All these components handle the ontology schema as an RDF graph [3], allowing the use of well-known graph algorithms from the literature. Von-QBE is implemented using Scala, Java (the web service), and ReactJS (the web interface). In what follows, we give a brief overview of each of Von-QBE's main components.

Fig. 1: Von-QBE's architecture (boxes shown: Users, Web Interface, Fragment Extractor with Keyword Matcher and Fragment Constructor, Fragment Expansor, Query Builder, Query Executor, Ontology Schema, RDF Data; numbered flows: 1 - Search, 2 - Fragment, 3 - Suggestions, 4 - SPARQL, 5 - SPARQL, 6 - Data, 7 - Data)

Fragment Extraction is responsible for identifying, from a natural language question QN, the ontology subset (which we call a fragment) that corresponds to the classes and properties mentioned in QN.
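To illustrate how classes and properties mentioned in QN might be recognized, the sketch below matches question n-grams against schema labels with a generic string-similarity ratio. It is only an assumption-laden illustration: the metric (Python's difflib), the 0.8 threshold, and the label list are placeholders, since the text does not say which similarity metric or threshold Von-QBE actually uses.

```python
from difflib import SequenceMatcher

# Hypothetical schema labels, loosely modelled on a movie ontology.
SCHEMA_LABELS = ["Movie", "title", "Actor", "birth name",
                 "Production Company", "company name"]


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_keywords(question: str, labels: list, threshold: float = 0.8) -> list:
    """Return the labels whose best-matching question n-gram reaches the threshold."""
    tokens = question.lower().split()
    matched = []
    for label in labels:
        # Compare each label against question n-grams with the same word count.
        n = len(label.split())
        ngrams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        best = max((similarity(g, label) for g in ngrams), default=0.0)
        if best >= threshold:
            matched.append(label)
    return matched


if __name__ == "__main__":
    # "actors" still matches the label "Actor" because the ratio tolerates the plural.
    print(match_keywords("movie title and actors birth name", SCHEMA_LABELS))
    # -> ['Movie', 'title', 'Actor', 'birth name']
```

A fuzzy ratio rather than exact string equality is used here so that small surface variations in the question (plurals, casing) still map to a schema concept; a real matcher would likely need a more careful choice of metric.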
Fragment Extraction comprises two sub-components: 1) the Keyword Matcher, which identifies the ontology concepts mentioned in QN by using similarity metrics [4], and 2) the Fragment Constructor, which discovers how these concepts are related in the ontology schema by using well-known graph algorithms: Dijkstra's shortest path and Prim's minimum spanning tree.

Fragment Expansor. Von-QBE provides examples to the user by expanding QN with the ontology classes and properties that are directly connected to the fragment retrieved by the previous module. With the ontology O represented as an RDF graph and the fragment nodes being ontology concepts, the Fragment Expansor expands the fragment with all edges that come into or go out of the fragment nodes and are not already in the fragment.

Query Builder works as follows: each edge in the fragment (output by the Fragment Constructor, or added from the suggestions output by the Fragment Expansor and accepted by the user) is added as a triple pattern to the query, and its source and target nodes are named as variables. Since the schema might have properties with multiple domains and ranges, the Query Builder also adds a clause stating the instance type (class) of each variable. All the triple patterns are given as input to the Apache Jena library<sup>4</sup>, which generates QS according to the SPARQL syntax.

<sup>4</sup> http://jena.apache.org

Fig. 2: Von-QBE suggests other concepts from the search query: movie title and actors

Fig. 3: Results for the text movie title and actors birth name

===3 Demonstration Scenario: IMDB Movie Ontology===
Von-QBE works with any RDF ontology that provides a schema and a SPARQL endpoint. The current version of Von-QBE supports Virtuoso SPARQL endpoints and Ontop [2] mappings to non-RDF databases. It is worth mentioning that Von-QBE's effectiveness depends on the quality of the RDF schema.
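Because everything in Section 2 operates on the schema alone, the three steps can be tried on a toy schema without any instance data. The following is a minimal sketch under stated assumptions, not Von-QBE's implementation: the schema edges and names are invented, Dijkstra is run with unit edge weights on an undirected view of the schema (the text does not specify weights), the minimum-spanning-tree step for more than two concepts is omitted, and the query is assembled as plain text with an undeclared default prefix instead of being generated through Apache Jena.

```python
import heapq

# Hypothetical schema edges (domain class, property, range class); illustrative
# names only, not taken from the real IMDB Movie Ontology.
SCHEMA_EDGES = [
    ("Movie", "hasActor", "Actor"),
    ("Movie", "isProducedBy", "Company"),
    ("Company", "hasLocation", "Region"),
]


def shortest_path(edges, source, target):
    """Dijkstra with unit weights over the schema viewed as an undirected graph.

    Returns the schema edges on a shortest path, or None if target is unreachable.
    """
    adjacency = {}
    for edge in edges:
        s, _, o = edge
        adjacency.setdefault(s, []).append((o, edge))
        adjacency.setdefault(o, []).append((s, edge))
    dist = {source: 0}
    prev = {}  # node -> (previous node, edge used to reach it)
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, edge in adjacency.get(node, []):
            if d + 1 < dist.get(neighbour, float("inf")):
                dist[neighbour] = d + 1
                prev[neighbour] = (node, edge)
                heapq.heappush(queue, (d + 1, neighbour))
    if target not in dist:
        return None
    path, node = [], target
    while node != source:
        node, edge = prev[node]
        path.append(edge)
    return path[::-1]


def expand(edges, fragment):
    """Suggest schema edges that touch a fragment node but are not in the fragment."""
    nodes = {n for s, _, o in fragment for n in (s, o)}
    return [e for e in edges
            if e not in fragment and (e[0] in nodes or e[2] in nodes)]


def build_query(fragment):
    """One triple pattern per fragment edge plus one type pattern per variable."""
    variables, patterns = {}, []

    def var(cls):
        if cls not in variables:
            variables[cls] = "?" + cls.lower()
            patterns.append(f"{variables[cls]} a :{cls} .")
        return variables[cls]

    for s, p, o in fragment:
        subject, obj = var(s), var(o)
        patterns.append(f"{subject} :{p} {obj} .")
    select_vars = " ".join(variables.values())
    body = "\n".join("  " + p for p in patterns)
    return "SELECT DISTINCT " + select_vars + " WHERE {\n" + body + "\n}"


if __name__ == "__main__":
    fragment = shortest_path(SCHEMA_EDGES, "Movie", "Company")
    print(expand(SCHEMA_EDGES, fragment))  # candidate suggestions for the user
    print(build_query(fragment))
```

The explicit type pattern per variable mirrors the point made about properties with multiple domains and ranges: without it, a variable bound only through a property could match instances of an unintended class.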
In this section, we present some demonstration scenarios using the IMDB Movie Ontology<sup>5</sup>. Figure 2 shows Von-QBE's interface. First, the user writes the keyword search or the query in the text field. Once the user has written a term, Von-QBE suggests other concepts or relations from the ontology schema based on the user-written term(s). Figure 2 shows the suggestions given by Von-QBE for the query "movie title and actors". The property birth name (an attribute of the Actor class) appears as a suggestion. The user can choose it and keep constructing the search query. Once the search query is complete, the user can limit the number of triples returned by setting a limit value and run the query over the database. Von-QBE then retrieves the results and returns them to the user, as in Figure 3. Von-QBE also shows the SPARQL generated from the user's search query, as shown in Figure 4. The user also has the option to copy the results table to the clipboard as .tsv (tab-separated values) and paste it anywhere.

<sup>5</sup> https://sites.google.com/site/ontopiswc13/home/imdb-mo

Fig. 4: SPARQL generated from the user-search query: movie title and actors birth name

===4 Conclusion===
In this paper, we presented Von-QBE to address the problem of querying RDF databases using a natural language question or a keyword search. Moreover, Von-QBE helps the user construct the query and translates the user's search into a SPARQL query. To the authors' knowledge, Von-QBE is the first tool to address this problem using only the ontology schema. As future work, we aim to use natural language processing tools to detect the entities described in the query and find their corresponding concepts in the ontology schema.

===References===
# Arnaout, H., and Elbassuoni, S. Effective searching of RDF knowledge graphs. Journal of Web Semantics 48 (2018), 66–84.
# Calvanese, D., Cogrel, B., Komla-Ebri, S., Kontchakov, R., Lanti, D., Rezk, M., Rodriguez-Muro, M., and Xiao, G. Ontop: Answering SPARQL queries over relational databases. Semantic Web 8, 3 (2017), 471–487.
# Consortium, W. W. W., et al. RDF 1.1 concepts and abstract syntax.
# Gomaa, W. H., and Fahmy, A. A. A survey of text similarity approaches. IJCA 68, 13 (2013), 13–18.
# Peres, L., Silva, T. L. C. d., Macedo, J., and Araujo, D. Ontology based query by example. ER 2019.
# Usbeck, R., Ngomo, A.-C. N., Bühmann, L., and Unger, C. HAWK–hybrid question answering using linked data. In European Semantic Web Conference (2015), Springer, pp. 353–368.
# Xu, K., Zhang, S., Feng, Y., and Zhao, D. Answering natural language questions via phrasal semantic parsing. In Natural Language Processing and Chinese Computing. Springer, 2014, pp. 333–344.
# Yahya, M., Berberich, K., Elbassuoni, S., Ramanath, M., Tresp, V., and Weikum, G. Natural language questions for the web of data. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (2012), Association for Computational Linguistics, pp. 379–390.
# Yih, S. W.-t., Chang, M.-W., He, X., and Gao, J. Semantic parsing via staged query graph generation: Question answering with knowledge base.