=Paper= {{Paper |id=Vol-2456/paper77 |storemode=property |title=Visual Queries over Scholarly Data and other Linked Data Endpoints |pdfUrl=https://ceur-ws.org/Vol-2456/paper77.pdf |volume=Vol-2456 |authors=Kārlis Čerāns,Lelde Lace,Aiga Romane,Julija Ovcinnikova,Sergejs Kozlovics,Mikus Grasmanis,Jūlija Hodakovska,Arturs Sprogis,Agris Sostaks |dblpUrl=https://dblp.org/rec/conf/semweb/CeransLROKGHSS19 }} ==Visual Queries over Scholarly Data and other Linked Data Endpoints== https://ceur-ws.org/Vol-2456/paper77.pdf
     Visual Queries over Scholarly Data and other Linked
                       Data Endpoints¹

    Kārlis Čerāns*, Lelde Lāce, Aiga Romāne, Jūlija Ovčiņņikova, Sergejs Kozlovičs²,
           Mikus Grasmanis, Jūlija Hodakovska, Artūrs Sproģis, Agris Šostaks

                Institute of Mathematics and Computer Science, University of Latvia
                                *karlis.cerans@lumii.lv



         Abstract. We demonstrate the use of the schema-based visual query tool
         ViziQuer over realistic Linked Data endpoints, with examples over the Semantic
         Web conference-related Scholarly Data. We present the pipeline of enabling
         visual query creation over a SPARQL endpoint, and ready-to-use data schemas over
         existing public Linked Data endpoints, available in the ViziQuer Schema store.


         Keywords: Visual query tool, RDF data, SPARQL, Linked Data


1        Introduction

Visual query composition (cf. [1], [2], [3], [4], [5]) along with facet-based ([6], [7]) and
controlled natural language-based ([8]) approaches offers a promising avenue to enable
end-user involvement in query composition over RDF/SPARQL data (cf. [1]).
   The recent version of ViziQuer notation ([5], [9]), implemented in a web-based tool³
[10], allows for visual presentation of rich instance-level and aggregate queries,
involving data expressions, as well as query nesting, with expressive power approaching
that of the full SPARQL 1.1 [11]. Meanwhile, the currently available examples of the
ViziQuer notation and tool usage are largely related to in-house RDF data stores, not the
publicly available Linked Data sets that would be one of its primary usage targets.
   We shall report in this paper and present in the demonstration:
     • Visual query notation and tool usage examples over a public Linked Data
       endpoint of Scholarly Data⁴ [12];
     • The pipeline of enabling visual query creation over a SPARQL endpoint,
       including a novel easy-to-use data schema extractor implementation;
     • Ready-to-use data schemas over existing public Linked Data endpoints,
       available in the ViziQuer Schema store (within the ViziQuer tool page).


¹ Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License
  Attribution 4.0 International (CC BY 4.0).
² Supported by ERDF project 1.1.1.2/VIAA/1/16/214
³ http://viziquer.lumii.lv/
⁴ http://www.scholarlydata.org/

2      Scholarly Query Examples

The visual queries are defined and constructed in the context of a data schema listing
the available classes and properties and their connectivity, as well as matching the entity
short names to their full IRIs. We present a few examples here based on the data schema
of Scholarly Data [12] (the schema can be downloaded from the ViziQuer Schema
store).
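
   The short-name matching mentioned above can be illustrated by a minimal sketch. It
assumes a simple prefix table; the `PREFIXES` dictionary and the `expand` and `shorten`
helpers are hypothetical names introduced here for illustration and are not part of the
ViziQuer tool, and the `conf` namespace IRI is our assumption about the Scholarly Data
vocabulary.

```python
# Hypothetical prefix table; the "conf" namespace IRI is an assumption.
PREFIXES = {
    "conf": "https://w3id.org/scholarlydata/ontology/conference-ontology.owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(short_name, prefixes=PREFIXES):
    """Expand a 'prefix:local' short name into a full IRI."""
    prefix, sep, local = short_name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {short_name!r}")
    return prefixes[prefix] + local


def shorten(iri, prefixes=PREFIXES):
    """Return the short name for an IRI, or the IRI itself if no prefix matches."""
    best = None
    for prefix, namespace in prefixes.items():
        # Prefer the longest matching namespace.
        if iri.startswith(namespace) and (
            best is None or len(namespace) > len(prefixes[best])
        ):
            best = prefix
    if best is None:
        return iri
    return f"{best}:{iri[len(prefixes[best]):]}"
```

For instance, `expand("rdfs:label")` yields the full `rdfs:label` IRI, and `shorten`
maps it back to the short form shown in a query node.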
   Each query is a connected graph with one main query node (orange rounded rectangle).
The linked nodes correspond either to joined classes or to nested queries (if the link
starts with a black bullet). The ViziQuer notation is explained in detail in [5], [9], and
on the ViziQuer tool page.

                   Table 1. Visual Query examples over Scholarly Data
 1. List all conferences (IRI, acronym, start and end dates), together with optional
    conference series information (IRI, name). Each query node is a pattern for a data
    instance, listing the class name and conditions, as well as the selection items; the
    links show the instance connections.
 2. Find the number of conferences and the number of conferences by the year of their
    start date (two queries). The grouping is automatic by all non-aggregated selection
    fields.
 3. List all conferences together with their Proceedings paper count. A nested query
    construct is used for counting the papers in the context of each conference
    separately.
 4. List titles of all papers from the ISWC 2018 conference, together with their author
    count (sorted in descending order). A query node can also hold the IRI of a specific
    instance.
 5. List titles of all papers together with the papers’ first author name(s). The data
    may contain duplicate relation information, so the distinct person name values are
    selected for each paper and then concatenated to include all name forms in the output.
 6. Find the top 10 cases of a person publishing the most papers at a single conference.
    Nested queries can return result sets with more than one item, projected onto the
    host query.
 7. List the top 20 keywords in proceedings papers of ISWC series conferences, together
    with the counts of papers using them. The keyword presence is mandatory, denoted
    by {+}.
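
   The automatic grouping rule of example 2 can be sketched as follows. This is an
illustration under stated assumptions, not the query generator of the ViziQuer tool:
the `Field` type and the `build_query` function are hypothetical, and the `conf:`
namespace, class and property names are our assumptions about the Scholarly Data
vocabulary. The sketch places every non-aggregated selection field into the GROUP BY
clause whenever at least one aggregate is selected.

```python
from typing import List, NamedTuple

# Assumed namespace and vocabulary terms; not taken from the paper.
PREFIX = "PREFIX conf: <https://w3id.org/scholarlydata/ontology/conference-ontology.owl#>"
PATTERN = "?conf a conf:Conference ; conf:startDate ?start ."


class Field(NamedTuple):
    alias: str               # output variable name, without the leading '?'
    expr: str                # SPARQL expression producing the value
    aggregate: bool = False  # True for COUNT(...), SUM(...), etc.


def build_query(pattern: str, fields: List[Field]) -> str:
    """Build a SELECT query, grouping by all non-aggregated fields."""
    has_aggregate = any(f.aggregate for f in fields)
    select, group = [], []
    for f in fields:
        var = "?" + f.alias
        plain = f.expr == var  # the field is just a variable from the pattern
        if f.aggregate or (not has_aggregate and not plain):
            select.append(f"({f.expr} AS {var})")
        else:
            select.append(var)
            if has_aggregate:
                # Computed grouping keys are bound in the GROUP BY clause itself.
                group.append(var if plain else f"({f.expr} AS {var})")
    lines = [PREFIX, "SELECT " + " ".join(select), "WHERE { " + pattern + " }"]
    if group:
        lines.append("GROUP BY " + " ".join(group))
    return "\n".join(lines)


# The two queries of example 2: the total count, and the count per start year.
total = build_query(PATTERN, [Field("n", "COUNT(?conf)", True)])
by_year = build_query(
    PATTERN, [Field("year", "YEAR(?start)"), Field("n", "COUNT(?conf)", True)]
)
```

The computed key is bound inside GROUP BY because SPARQL 1.1 evaluates grouping before
the SELECT expressions, so an alias introduced only in SELECT could not be used as a
grouping key.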


3         Enabling Visual Queries over Linked Data Endpoints

The visual queries over a data endpoint are created within the context of a ViziQuer
project, which requires a pre-loaded data schema and a specified link to the SPARQL endpoint
itself (if the created queries are to be executed from the tool environment).
   The ViziQuer data schema can be either generated from an OWL ontology or
extracted directly from the SPARQL endpoint (via schema-level SPARQL queries) using
an open-source schema extractor service accompanying the ViziQuer tool. The links
to the service are maintained on the ViziQuer tool home page. The novel schema
extractor is implemented as a Java web service, which allows it to be used
directly from end-user computers.
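
   To give an impression of what schema-level SPARQL queries may look like, we include
a minimal sketch. The queries below are our own illustration and are not taken from the
schema extractor service; the function names and the `http://example.org/` IRIs are
hypothetical. The only protocol detail relied upon is that a SPARQL 1.1 endpoint accepts
a query in the `query` parameter of a GET request.

```python
from urllib.parse import urlencode


def _check_iri(iri: str) -> str:
    """Reject strings that cannot be safely placed between '<' and '>'."""
    if any(ch in iri for ch in '<>"{}|^`\\') or any(ch.isspace() for ch in iri):
        raise ValueError(f"not a safe IRI: {iri!r}")
    return iri


def classes_query(limit=None) -> str:
    """List the classes used in the data, with their instance counts."""
    query = (
        "SELECT ?class (COUNT(?x) AS ?instances)\n"
        "WHERE { ?x a ?class }\n"
        "GROUP BY ?class\n"
        "ORDER BY DESC(?instances)"
    )
    if limit is not None:
        query += f"\nLIMIT {int(limit)}"
    return query


def outgoing_properties_query(class_iri: str) -> str:
    """List the properties used on instances of the given class."""
    iri = _check_iri(class_iri)
    return (
        "SELECT ?property (COUNT(*) AS ?triples)\n"
        f"WHERE {{ ?x a <{iri}> ; ?property ?y }}\n"
        "GROUP BY ?property"
    )


def request_url(endpoint: str, query: str) -> str:
    """Encode a query as a SPARQL Protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})
```

A LIMIT clause, as in `classes_query(limit=100)`, is one simple way to stay within the
query execution limits that public endpoints often impose.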
   Schemas extracted from public data endpoints are collected in the ViziQuer
Schema Store, including Scholarly Data, UNESCO⁵ (SKOS), Social Semantic Web
Thesaurus⁶ and WikiPathways⁷. The schemas of resources like DBpedia and Wikidata
cannot currently be extracted in their full form due to space considerations.
   It can also be envisaged that the SPARQL endpoint holders may store and maintain
the visual query data schemas, or even the visual project examples, along with the
endpoints themselves, so that users can start working with the endpoint faster.
   The visual query environment, after the data schema loading, displays the schema to
the end user in the form of a class tree, arranged by the namespaces of the top-level
classes and by the subclass relation; this form has been found suitable for
moderately-sized data schemas (such as those listed in the Schema Store). A double click on
a class tree node adds the main query node with this class into the diagrammatic query pane;
further query elements can then be added in the context of the created query element.
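
   The class tree arrangement described above can be sketched as follows. This is a
simplified illustration, not the implementation used in the tool: the function names are
hypothetical, and the sketch assumes that each class has at most one direct superclass
and that the subclass relation is acyclic.

```python
from collections import defaultdict


def namespace_of(iri: str) -> str:
    """Return the IRI up to and including its last '#' or '/'."""
    cut = max(iri.rfind("#"), iri.rfind("/"))
    return iri[: cut + 1]


def build_class_tree(classes, subclass_of):
    """Arrange classes by the namespace of the top-level class, then by subclass.

    classes: iterable of class IRIs.
    subclass_of: dict mapping a class IRI to its direct superclass IRI.
    Returns {namespace: {top_level_class: {subclass: {...}}}}.
    """
    children = defaultdict(list)
    for child, parent in subclass_of.items():
        children[parent].append(child)

    def subtree(cls):
        # Recursively collect the subclasses of cls (assumes no cycles).
        return {c: subtree(c) for c in sorted(children.get(cls, []))}

    known = set(classes)
    tree = defaultdict(dict)
    for cls in sorted(known):
        parent = subclass_of.get(cls)
        if parent is None or parent not in known:
            # A class without a known superclass is a top-level class.
            tree[namespace_of(cls)][cls] = subtree(cls)
    return dict(tree)
```

Subclasses are placed under their top-level ancestor even if they belong to a different
namespace, which corresponds to arranging the tree by the namespaces of the top-level
classes only.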
   There are also options for the data schema export in the form of an OWL ontology from
the ViziQuer tool, to enable its further analysis or eventual graphic visualization, e.g.,
in the OWLGrEd ontology editor⁸.

⁵ http://vocabularies.unesco.org/sparql
⁶ http://vocabulary.semantic-web.at/PoolParty/sparql/semweb
⁷ http://sparql.wikipathways.org/
⁸ http://owlgred.lumii.lv/

4      Conclusions

Visual queries over Linked Data endpoints, such as Scholarly Data, can be composed
in the ViziQuer tool, thus enabling a rich visual query experience over Linked
Data.
   The data schema retrieval, as the necessary pre-processing step, can be performed by
an open-source schema extractor that can be used either from a public server or run on
an end-user computer. Further development of the schema extraction service, to make it
able to handle the peculiarities (query execution limits, supported protocols, and SPARQL
subsets) of various SPARQL endpoints, is work in progress.
   Another avenue of further work is creating an interactive graphical visualization of
the data schema within the query tool to ease the end user query creation experience
further.


References
 1. Soylu, A., Giese, M., Jimenez-Ruiz, E., Vega-Gorgojo, G., Horrocks, I.: Experiencing
    OptiqueVQS: A Multi-paradigm and Ontology-based Visual Query System for End Users.
    Universal Access in the Information Society, March 2016, Volume 15, Issue 1, pp. 129–152.
 2. Zviedris, M., Barzdins, G.: ViziQuer: A Tool to Explore and Query SPARQL Endpoints. In:
    The Semantic Web: Research and Applications, LNCS, Volume 6644, pp. 441-445, (2011)
 3. Kapourani, B., Fotopoulou, E., Papaspyros, D., Zafeiropoulos, A., Mouzakitis, S.,
    Koussouris, S.: Propelling SMEs Business Intelligence Through Linked Data Production
    and Consumption. In: OTM 2015 Workshops, pp. 107-116, 2015.
 4. Haag, F., Lohmann, S., Siek, S., Ertl, T.: QueryVOWL: Visual Composition of SPARQL
    Queries. In: The Semantic Web: ESWC 2015 Satellite Events. LNCS, Vol.9341, pp. 62-66.
    Springer, (2015), http://vowl.visualdataweb.org/queryvowl/
 5. Čerāns, K., Bārzdiņš, J., Šostaks, A., Ovčiņņikova, J., Lāce, L., Grasmanis, M. and Sproģis,
    A.: Extended UML Class Diagram Constructs for Visual SPARQL Queries in ViziQuer/web.
    In: Voila!2017, CEUR Workshop Proceedings, Vol.1947, (2017) pp.87-98.
 6. Vega-Gorgojo, G., Giese, M., Heggestoyl, S., Soylu, A., Waaler, A.: PepeSearch: Semantic
    Data for the Masses. In: PLoS ONE 11(3): e0151573. doi: 10.1371/journal.pone.0151573,
    2016. http://dx.doi.org/10.1371/journal.pone.0151573
 7. Khalili, A., Meroño-Peñuela, A.: WYSIWYQ --- What You See Is What You Query. In
    Voila!2017, CEUR, Vol.1947, (2017) pp.123-130. http://ceur-ws.org/Vol-1947/paper11.pdf
 8. Ferré, S.: Sparklis: An expressive query builder for SPARQL endpoints with guidance in
    natural language, Semantic Web, 2017, Vol 8, pp 405-418
 9. Čerāns, K., Šostaks, A., Bojārs, U., Bārzdiņš, J., Ovčiņņikova, J., Lāce, L., Grasmanis, M.
    and Sproģis, A., ViziQuer: A Visual Notation for RDF Data Analysis Queries. In Research
    Conference on Metadata and Semantics Research. Springer CCIS, Vol.846, pp.50-62, 2018
10. Čerāns, K., Šostaks, A., Bojārs, U., Ovčiņņikova, J., Lāce, L., Grasmanis, M., Romāne, A.,
    Sproģis, A., Bārzdiņš, J.: ViziQuer: A Web-Based Tool for Visual Diagrammatic Queries
    Over RDF Data. In: ESWC 2018 Satellite Events. LNCS, vol 11155, pp. 158-163, 2018.
11. SPARQL 1.1 Query Language. W3C Recommendation 21 March 2013,
    http://www.w3.org/TR/2013/REC-sparql11-query-20130321/
12. A. L. Gentile and A. G. Nuzzolese. cLODg - Conference Linked Open Data Generator. In
    ISWC 2015 Posters & Demonstrations Track, CEUR-WS.org, Vol. 1486, 2015.