GeoFedBench: A Benchmark for Federated GeoSPARQL Query Processors

Antonis Troumpoukis1, Stasinos Konstantopoulos1, Giannis Mouchakis1, Nefeli Prokopaki-Kostopoulou1, Claudia Paris2, Lorenzo Bruzzone2, Despina-Athanasia Pantazi3, and Manolis Koubarakis3

1 Institute of Informatics and Telecommunications, NCSR "Demokritos", Greece
  {antru,konstant,gmouchakis,nefelipk}@iit.demokritos.gr
2 Dept. of Information Engineering and Computer Science, University of Trento, Italy
  {claudia.paris,lorenzo.bruzzone}@unitn.it
3 National and Kapodistrian University of Athens, Greece
  {dpantazi,koubarak}@di.uoa.gr

Abstract. Performance benchmarks are invaluable for evaluating and comparing federated query processing systems, but it is hard to design benchmarks that are both realistic and informative about the systems being tested. In this paper we present GeoFedBench, a benchmark that has been obtained from an actual, practical application of geospatial and linked data querying and that uses GeoSPARQL constructs challenging all phases of federated query processing. The benchmark is publicly available as part of the KOBE suite.

Keywords: Benchmarking, GeoSPARQL, Federated querying

Copyright (c) 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction and Motivation

Performance benchmarks are invaluable for evaluating and comparing systems, but designing benchmarks is subject to considerations that are difficult to satisfy simultaneously. One potential tension is between creating a realistic benchmark that accurately reflects how the benchmarked systems will behave in real-world use cases, and designing a benchmark that is informative with respect to system characteristics we know in advance we need to test and measure.

Given the above, we are excited to present a benchmark that has been obtained from an actual, practical application of geospatial and linked data querying. The benchmark federates a database of Earth Observation data about land usage and a database of ground observations about land usage, and searches for pairs between them that simultaneously satisfy geospatial and thematic (land usage) constraints (Section 2).

Besides being extracted from a real workflow in the Earth Observation domain, the benchmark queries use GeoSPARQL constructs that challenge all phases of federated query processing, from source selection to query planning and execution. Besides a detailed analysis of the queries, we also present empirical tests demonstrating that the benchmark is both challenging and feasible (Section 3). Finally, we recap and conclude (Section 4).

2 Use case: Validating land usage data

Detailed land usage data is crucial in many applications, ranging from formulating agricultural policy and monitoring its execution, to conducting research on climate change resilience and future food security. Land usage can be inferred from Earth Observation images or collected through self-declaration, but in either case it needs to be validated against land surveys. The standard approach for this validation is to match each instance in the land survey dataset (GPS points) with the nearest land parcel (a GIS shape) and compare the crops observed in the survey against the crops declared or inferred for the matching parcel.

Although conceptually straightforward, in operational scenarios this rule can be misleading. Ground observations are geo-referenced to a point on the road adjacent to the field, which is often ambiguous in agricultural areas with several adjacent parcels, and the problem is further exacerbated by limited GPS accuracy. However, a more sophisticated (and also more computationally demanding) approach can estimate the error rate of the land usage data: for every survey point there must be at least one parcel with a matching label in reasonable proximity; otherwise at least one nearby parcel is mis-labelled (although we cannot automatically infer which one).
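This check can be stated compactly (the following notation is ours, introduced here only for clarity): a survey point p is consistent with the parcel data if and only if

    there exists a parcel q such that  distance(p, q) < θ  and  (label(p), label(q)) ∈ M,

where θ is a proximity threshold (10 meters in the queries of Section 3) and M is the many-to-many compatibility relation between survey labels and parcel labels. The fraction of survey points failing this check then gives an estimate of the error rate.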
For our benchmark, we use the Invekos dataset, the Austrian administration's Land Parcel Identification System with owners' self-declarations about the crops grown in each parcel, compared against the observations of the 2018 Land use and cover area frame statistical survey (Lucas). Table 1 gives more details about these datasets.

Besides geospatial processing, using these datasets also introduces a data integration aspect to our benchmark. Specifically, Lucas annotations follow the Land Cover Classification System (LCCS), whereas Invekos follows its own codelist of 212 crop types. There is no one-to-one mapping between instance labels (e.g., Invekos grassland can be Lucas E10, B55, or E30, while Lucas B13 can be Invekos spring barley or winter barley).

Table 1. Dataset details

            All triples   Geospatial triples   Thematic triples
Lucas            30,379                4,325             26,054
Invekos      14,036,799            2,005,257         12,031,542

Lucas:   https://esdac.jrc.ec.europa.eu/projects/lucas
Invekos: https://www.data.gv.at/katalog/dataset/e21a731f-9e08-4dd3-b9e5-cd460438a5d9

3 Benchmark queries

In order to estimate the reliability of the Invekos dataset, we use queries that, for each given Lucas instance, check whether: (Q1) the closest Invekos instance is under 10 meters away and their crop labels match; (Q2) the closest Invekos instance is under 10 meters away and their crop labels do not match; or (Q3) there is no Invekos instance within 10 meters.
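As an illustration of the constructs involved, the following is a minimal sketch of Q3. The geo:, geof:, and uom: terms are the standard GeoSPARQL vocabulary, whereas the inv: namespace, the inv:Parcel class, and the grounded point IRI are illustrative placeholders; the actual queries, with the dataset-specific thematic predicates, are distributed with the benchmark.

    PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
    PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
    PREFIX uom:  <http://www.opengis.net/def/uom/OGC/1.0/>
    PREFIX inv:  <http://example.org/invekos#>   # placeholder namespace

    SELECT ?point
    WHERE {
      # the grounded Lucas survey point; the IRI is a placeholder
      VALUES ?point { <http://example.org/lucas/point-1> }
      ?point geo:hasGeometry ?pGeom .
      ?pGeom geo:asWKT ?pWkt .
      # Q3 succeeds when no Invekos parcel lies within the 10-metre threshold
      FILTER NOT EXISTS {
        ?parcel a inv:Parcel .              # placeholder class for Invekos parcels
        ?parcel geo:hasGeometry ?cGeom .
        ?cGeom geo:asWKT ?cWkt .
        FILTER (geof:distance(?pWkt, ?cWkt, uom:metre) < 10)
      }
    }

Q1 and Q2 instead retrieve the closest parcel with a subquery before comparing crop labels, but exercise the same geometry chains and variable-to-variable distance filters discussed below.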
Since geo-linked data vocabularies link instances with a geometry object (which then has as an attribute the actual shape), these queries (and GeoSPARQL queries in general) challenge FILTER optimizers, both because they present them with comparisons between variable groundings (as opposed to constant values), and because these comparisons are non-standard extensions (the geospatial extensions of GeoSPARQL).

In most benchmarks, filters are either not present at all [LUBM, 2] or only contain unary functions or comparisons against constants [FedBench, 5] that can always be pushed into one data source. LargeRDFBench [4] includes multi-variable filters that compare values from different repositories, challenging the optimizer to select the correct strategy: to fetch the left-hand-side values and push the filter into the right-hand-side source, or to fetch both sides and apply the filter locally. Both approaches are valid, but can vary dramatically in efficiency. Our benchmark presents the same challenge in a geospatial context; the federator is tested not only on correctly selecting the best strategy but also on the efficiency of its local implementation of the GeoSPARQL extensions.

Properties of standard vocabularies, which can potentially appear in all sources of a federation, present another challenge for the efficient evaluation of a query. When evaluating a triple pattern that contains a property such as rdf:type or owl:sameAs, the source selector is prone to overestimating the set of relevant sources, thus increasing both network traffic and the overall query processing time. Current benchmarks already contain such commonly used properties, but GeoFedBench stresses source selection further in this direction by exploiting a query characteristic that appears frequently in geospatial data: a resource ?x is linked to its geometry serialization ?wkt through chains of known properties of the form ?x geo:hasGeometry ?g . ?g geo:asWKT ?wkt, where all members of the chain usually appear in the same dataset. The federation engine is tested on distinguishing which geospatial triple patterns refer to which dataset, thus avoiding fetching redundant bindings for the variable in the middle of the chain.

Finally, the complex nature of our queries challenges query planning. Current benchmarks usually contain simple queries consisting only of joins between triple patterns and FILTER operations, or of some additional operators such as UNION, ORDER BY, LIMIT, etc. In GeoFedBench, Q1 and Q2 use a subquery for discovering the closest Invekos instance, and Q2 and Q3 use negation, in the form of the FILTER NOT EXISTS operator, to check that no matching Invekos instance exists. Neither subqueries nor negation are present in any of the currently existing federated SPARQL benchmarks.

To demonstrate that the benchmark is feasible but challenging, we tested it on Semagrow [1], to the best of our knowledge the only SPARQL federation engine that supports geospatial operators. The datasets are served by Strabon geospatial RDF stores [3]. Table 2 gives the query execution time for three runs of each query, where each run grounds the query with a different Lucas point.

Table 2. Query processing time (msec) of the benchmark queries for three different Lucas instances.

Lucas point        Q1        Q2        Q3
1             157,209   146,325   152,999
2             143,751   145,049   139,447
3             156,127   152,502   136,762

4 Conclusions

We presented GeoFedBench, a benchmark for federated geospatial query processing. GeoFedBench is based on openly available datasets, and its queries challenge all phases of federated query processing. The benchmark is distributed as part of the benchmark suite of the KOBE Open Benchmark Engine, available from https://github.com/semagrow/kobe

Acknowledgement

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 825258. Please see http://earthanalytics.eu for more details.

Bibliography

[1] Charalambidis, A., Troumpoukis, A., Konstantopoulos, S.: SemaGrow: Optimizing federated SPARQL queries. In: Proceedings of the 11th International Conference on Semantic Systems (SEMANTiCS 2015), Vienna, Austria, 16-17 September 2015 (2015)
[2] Guo, Y., Pan, Z., Heflin, J.: LUBM: A benchmark for OWL knowledge base systems. Journal of Web Semantics 3(2) (Jul 2005)
[3] Kyzirakos, K., Karpathiotakis, M., Koubarakis, M.: Strabon: A semantic geospatial DBMS. In: Cudré-Mauroux, P., et al. (eds.) Proceedings of ISWC 2012, Boston, MA, USA, 11-15 November 2012. LNCS, vol. 7649. Springer (2012), https://doi.org/10.1007/978-3-642-35176-1_19
[4] Saleem, M., Hasnain, A., Ngonga Ngomo, A.-C.: LargeRDFBench: A billion triples benchmark for SPARQL endpoint federation. Journal of Web Semantics 48, 85-125 (2018), https://doi.org/10.1016/j.websem.2017.12.005
[5] Schmidt, M., Görlitz, O., Haase, P., Ladwig, G., Schwarte, A., Tran, T.: FedBench: A benchmark suite for federated semantic data query processing. In: Proceedings of ISWC 2011, Bonn, Germany, 23-27 October 2011. LNCS, vol. 7031. Springer (2011), https://doi.org/10.1007/978-3-642-25073-6_37