=Paper=
{{Paper
|id=Vol-3254/paper341
|storemode=property
|title=ForestQB: An Adaptive Query Builder to Support Wildlife Research
|pdfUrl=https://ceur-ws.org/Vol-3254/paper341.pdf
|volume=Vol-3254
|authors=Omar Mussa,Omer Rana,Benoît Goossens,Pablo Orozco-terWengel,Charith Perera
}}
==ForestQB: An Adaptive Query Builder to Support Wildlife Research==
Omar Mussa<sup>1,3,*</sup>, Omer Rana<sup>1</sup>, Benoît Goossens<sup>2</sup>, Pablo Orozco-terWengel<sup>2</sup> and Charith Perera<sup>1</sup>

<sup>1</sup> School of Computer Science and Informatics, Cardiff University, United Kingdom

<sup>2</sup> School of Biosciences, Cardiff University, United Kingdom

<sup>3</sup> College of Computing and Informatics, Saudi Electronic University, Saudi Arabia

===Abstract===
This paper presents ForestQB, a SPARQL query builder that assists bioscience and wildlife researchers in accessing Linked-Data. Because these researchers are unfamiliar with the Semantic Web and its data ontologies, ForestQB aims to let them extract valuable information from Linked-Data without having to grasp the nature of the data and its underlying technologies. ForestQB integrates a form-based query builder with natural language input to simplify query construction and match user requirements. (A demo is available at https://iotgarage.net/demo/forestQB)

Keywords: Linked-Data, Visual Querying, SPARQL, Query Builders

===1. Introduction===
Publishing data as Linked-Data using Semantic Web technologies is beneficial for machine learning as well as information retrieval [1]. While such data are easily accessible to machines, humans can also access them through a query language such as SPARQL, the recommended query language for RDF triplestores. In bioscience and wildlife conservation, researchers tend to collect data using various sensors that record, for example, temperature, location and speed. As a result, hundreds of gigabytes have been collected over the years, and these data would be extremely valuable if stored as a knowledge graph in an RDF triplestore. However, users are often intimidated by Linked-Data because they are obliged to understand SPARQL and the underlying data structure [2].
To encourage bioscience researchers to adopt Semantic Web technologies in their field, it is essential to offer a toolkit that lets them access the data store freely without worrying about its underlying technology. In this demo, we introduce ForestQB, a tool that facilitates knowledge extraction from RDF triplestores by allowing researchers to construct their queries visually. The tool provides a high level of abstraction by combining a form-based interface with an integrated conversational AI that supports natural query construction.

The 21st International Semantic Web Conference, October 23–27, 2022, Hangzhou, China. (*) Corresponding author. Email: mussao@cardiff.ac.uk (O. Mussa); ranaof@cardiff.ac.uk (O. Rana); goossensbr@cardiff.ac.uk (B. Goossens); orozco-terwengelpa@cardiff.ac.uk (P. Orozco-terWengel); pererac@cardiff.ac.uk (C. Perera). ORCID: 0000-0001-8614-6550 (O. Mussa); 0000-0003-3597-2646 (O. Rana); 0000-0003-2360-4643 (B. Goossens); 0000-0002-7951-4148 (P. Orozco-terWengel); 0000-0002-0190-3346 (C. Perera). © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org).

===2. ForestQB Features===
ForestQB is a web application that runs in the browser to query and explore RDF triplestores exposed as SPARQL endpoints. It was implemented based on requirements collected by reviewing current progress in the field and by interviewing experts in bioscience (the stakeholders). The current features of the tool are as follows:

• General overview: the initial interface provides basic search functionality by letting the user select sensors (observable properties) as a starting point for exploring the data. The interface contains a list of sensors that is loaded from the endpoint when the tool first starts.
The sensors’ data are fetched by sending a SPARQL query that can be adjusted in the settings, so the tool can be customised when querying a different endpoint. The user can then click the search button to load the results for the selected sensor without any additional steps. In addition, the interface includes a map with geospatial filtering, applied by drawing circles directly on the map. It also contains a date picker at the top that restricts the results to a general date range. The upper left side lists predefined examples that populate all required fields, to promote learning by example. The primary goal is a user-friendly interface that helps the user explore the data quickly.

• Detailed querying: when users want more advanced querying, they can click the search customisation button to display all the detailed querying features. The interface then displays all the properties linked to the chosen sensor, each with a sublist that lets the user add filters to that property manually. The available filters vary with the property's XSD data type. For example, a string offers Contain, Match and Regex filters. The same filter can also behave differently depending on the data type: numbers and dates both have a Range filter, but a number shows a numeric text field whereas a date shows a date picker. Furthermore, the user can hide an entity from the results, or mark an entity as optional so that it is ignored if it does not exist.

• Conversational AI: ForestQB includes a component that displays a chatbox in which the user can express a query in natural language, and the corresponding fields are populated automatically. In addition to constructing queries, it can help the user inquire about the underlying data structure. For example, the chatbot can answer simple questions such as "What are the sensors?", "What is Aqeela?" and "Where is Aqeela?"
as shown in Figure 2. The user can hide the form-based interface completely and use the chatbot as a stand-alone query builder to query and retrieve the results.

Figure 2: Overview of the conversational AI (ForestBot). (a) Retrieving information about the “Aqeela” sensor. (b) Constructing a query to find “Aqeela”. (c) The result is then reflected in the Query Builder as if the user had selected the fields, and the search process is triggered.

===3. Our SPARQL endpoint===
The dataset contains sensitive historical data that were collected by bioscientists and modelled as Linked-Data. For this reason, our SPARQL endpoint is private. The dataset populates the Forest Observatory Ontology (FOO, https://naeima.github.io/Forest-Observatory-Ontology/), a novel ontology that describes wildlife data generated by sensors. It was developed by reviewing the state of the art and reusing entities from several mature, self-contained ontologies. FOO arose from the efforts to review Open Data Observatories [3] in order to build the Forest Observatory. The data integrate multiple ontologies to define their underlying structure, including the SOSA ontology [4]. ForestQB initially relies on SOSA to obtain all connected sensors and their observable properties, which means it can potentially work with any other endpoint that uses SOSA.

Figure 1: Overview of a simple query to retrieve data of a particular animal within a specific date and location. The Longitude and Latitude of the animals were hidden due to the sensitivity of the data.

===4. Demonstration===
The demo will illustrate the ForestQB interface and explain its functionality. Figure 1 shows an overview of the tool being used to create a simple query with map filters, demonstrating the basic search with a limited number of filters. The “customisation search” button is where the user can apply a more advanced search.
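
To make the SOSA dependency described in Section 3 more concrete, the following is a minimal sketch of the kind of query a client could send to list sensors and the properties they observe. It is not the query ForestQB actually uses (the paper only states that the query is configurable in the settings); the function name `build_sensor_query`, the `limit` parameter and the optional `rdfs:label` lookup are assumptions made for illustration. Only the `sosa:Sensor` class and the `sosa:observes` property are taken from the SOSA vocabulary.

```python
# Hypothetical helper: builds a SPARQL query string; it does not contact any endpoint.
SOSA_NS = "http://www.w3.org/ns/sosa/"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"


def build_sensor_query(limit: int = 100) -> str:
    """Return a SPARQL query listing SOSA sensors and what they observe."""
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    lines = [
        f"PREFIX sosa: <{SOSA_NS}>",
        f"PREFIX rdfs: <{RDFS_NS}>",
        "SELECT DISTINCT ?sensor ?label ?property WHERE {",
        "  ?sensor a sosa:Sensor ;",
        "          sosa:observes ?property .",
        # Labels are assumed to be optional; not every dataset provides them.
        "  OPTIONAL { ?sensor rdfs:label ?label }",
        "}",
        "ORDER BY ?sensor",
        f"LIMIT {limit}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_sensor_query(10))
```

Because the query depends only on SOSA terms, the same string could in principle be sent to any endpoint whose data follow SOSA, which is the portability argument made in Section 3.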
The customisation panel stays hidden until the user chooses to display it, which keeps the interface uncluttered and hides its complexity. The results are presented in tabular form to reflect the user's query. The conversational AI can work as an assistive tool for the form-based builder or as a stand-alone query builder. The user can type a query in natural language, and it will be reflected in the interface. For example, a question such as “Where is aqeela?” selects the correct sensor from the sensor list and triggers the search process (see Figure 2).

===5. Technical Details===
The ForestQB interface was designed after reviewing the current state of the art and gathering requirements from stakeholders. The design has several layers and components, each of which serves a distinct purpose and has its own technical details. This section briefly discusses some of the technical aspects of ForestQB.

• Web Application: the tool is built as a single-page application using plain JavaScript and the Vue.js framework (vuejs.org). The tool generates a JSON object that mirrors the choices made in the interface. Once the user clicks search, this object is sent to the conversion engine (as an AJAX request), converted to SPARQL, and sent back to the tool, which uses it to query the endpoint. Separating the SPARQL conversion mechanism from the tool allows it to be shared between ForestQB and the conversational AI. Both ForestQB and the conversational AI share a centralised store using Vuex (vuex.vuejs.org).

• Conversational AI model: the conversational AI is split into two pieces, the front-end and the classification model. The front-end is one of the main web application's components. The Natural Language Understanding (NLU) model, however, is built with the RASA framework (rasa.com). Thus, all of the NLU logic resides on a separate web server powered by RASA.
The front-end sends all user messages to the chatbot server, which determines the user's intent. Most of the actions (responses) are implemented within the web front-end. The conversational AI adjusts the centralised store based on the classified intent and entities in order to modify the JSON Query object, and then triggers the search process. The chatbot either answers simple questions about the data schema or retrieves the results when the user asks.

• Map Filters: ForestQB offers map filters to narrow down the results. The map is implemented with the Leaflet library (leafletjs.com), which renders the map and provides its core functionality. Drawing circles on the map allows the user to create "nearby" filters. Once a circle is drawn, its coordinates and parameters are saved in the JSON Query object and then translated into SPARQL. In addition, the user can define the relationship between the circles as an intersection or a union (as shown in Figure 1). As an adaptive feature, the map (GeoSPARQL) filters are not applied to data that do not include Longitude and Latitude; the other types of filter remain valid.

===6. Conclusions===
This paper has briefly introduced ForestQB, a toolkit that aims to assist bioscientists in querying Linked-Data. The tool supports query generation by combining a conversational AI with a form-based query builder. ForestQB is still a work in progress: we plan to improve its conversational AI to handle more complex sentences, and to support more visualisation options based on our future user study.

===References===
[1] S. Malyshev, M. Krötzsch, L. González, J. Gonsior, A. Bielefeldt, Getting the Most Out of Wikidata: Semantic Technology Usage in Wikipedia’s Knowledge Graph, in: 17th International Semantic Web Conference, volume 11137 LNCS, 2018, pp. 376–394.

[2] P. Warren, P. Mulholland, A Comparison of the Cognitive Difficulties Posed by SPARQL Query Constructs, in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 12387 LNAI, Springer International Publishing, 2020, pp. 3–19.

[3] N. Hamed, O. Rana, P. Orozco-terWengel, B. Goossens, C. Perera, Open Data Observatories: A Survey, Technical Report, Cardiff University, 2021. URL: https://orca.cardiff.ac.uk/id/eprint/140048/14/NaeimaHamed_26_MAR_2021_V5.pdf.

[4] K. Janowicz, A. Haller, S. J. D. Cox, D. Le Phuoc, M. Lefrançois, SOSA: A lightweight ontology for sensors, observations, samples, and actuators, Journal of Web Semantics 56 (2019) 1–10. URL: https://www.sciencedirect.com/science/article/pii/S1570826818300295.
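
Illustrative sketch. Sections 2 and 5 describe filters whose form depends on the property's data type and which are stored in a JSON Query object before being translated into SPARQL. The paper does not publish the JSON schema or the conversion engine, so everything below is an assumption made for illustration: the `PropertyFilter` fields, the filter names and the function `to_sparql_filter` are hypothetical. The generated `FILTER` expressions use only standard SPARQL 1.1 functions (`CONTAINS`, `LCASE`, `STR`, `REGEX`) and assume that an `xsd:` prefix is declared in the enclosing query.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class PropertyFilter:
    """Hypothetical shape of one filter entry in the JSON Query object."""
    variable: str                 # SPARQL variable name without the leading '?'
    datatype: str                 # "string", "number" or "date"
    kind: str                     # "contains", "regex" or "range"
    value: Optional[str] = None   # used by string filters
    low: Optional[str] = None     # used by range filters
    high: Optional[str] = None    # used by range filters


def _escape(text: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_sparql_filter(f: PropertyFilter) -> str:
    """Translate one filter into a SPARQL FILTER expression."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", f.variable):
        raise ValueError(f"invalid variable name: {f.variable!r}")
    var = f"?{f.variable}"

    if f.datatype == "string" and f.kind in ("contains", "regex"):
        if f.value is None:
            raise ValueError("string filters need a value")
        if f.kind == "contains":
            # Case-insensitive substring match.
            return f'FILTER(CONTAINS(LCASE(STR({var})), "{_escape(f.value.lower())}"))'
        return f'FILTER(REGEX(STR({var}), "{_escape(f.value)}", "i"))'

    if f.kind == "range":
        if f.low is None or f.high is None:
            raise ValueError("range filters need both low and high")
        if f.datatype == "number":
            # float() rejects anything that is not a number.
            return f"FILTER({var} >= {float(f.low)} && {var} <= {float(f.high)})"
        if f.datatype == "date":
            # Parsing and re-serialising keeps arbitrary text out of the query.
            low = datetime.fromisoformat(f.low).isoformat()
            high = datetime.fromisoformat(f.high).isoformat()
            return (f'FILTER({var} >= "{low}"^^xsd:dateTime && '
                    f'{var} <= "{high}"^^xsd:dateTime)')

    raise ValueError(f"unsupported filter: {f.datatype}/{f.kind}")


if __name__ == "__main__":
    print(to_sparql_filter(PropertyFilter("temp", "number", "range", low="10", high="25")))
    print(to_sparql_filter(PropertyFilter("name", "string", "contains", value="Aqeela")))
```

The same Range filter produces a numeric comparison for numbers and a typed `xsd:dateTime` comparison for dates, which mirrors the way the interface shows a numeric field or a date picker for the same filter name.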