=Paper=
{{Paper
|id=Vol-2977/paper5
|storemode=property
|title=Models for Space Unite! The Need and Opportunities for Domain-transcendent Modelling of Spatial Data (short paper)
|pdfUrl=https://ceur-ws.org/Vol-2977/paper5.pdf
|volume=Vol-2977
|authors=Frans Knibbe
|dblpUrl=https://dblp.org/rec/conf/esws/Knibbe21
}}
==Models for Space Unite! The Need and Opportunities for Domain-transcendent Modelling of Spatial Data (short paper)==
Frans Knibbe [0000-0003-3789-2260], fjknibbe@gmail.com

Abstract. This position paper makes a case for a unified way of representing spatial information. For spatial data it should not matter from which knowledge domain they originate. Whether data come from GIS (Geographic Information Systems), CAD (Computer Aided Design), CG (Computer Graphics), the Web, or any other present or future domain in which data have some kind of spatial characteristic, they share common ground. Space is a fundamental aspect of the real world, and making it so in the models that govern the digital world will yield an invaluable reward: system interoperability through data harmonisation. Globalisation, the increasing need for interdisciplinary cooperation and an increasing demand for transparency are some of the developments that fuel ongoing research and development in the fields of Semantic Web, Linked Data and graph-based data. This kind of progress, the opening up and defragmentation of information, would be well served by a process of integration of disparate models for spatial information. The current work on updating GeoSPARQL could be a fitting starting point.

Keywords: Space · Geometry · Model Unification · Interoperability · GeoSPARQL · Ontology

===1 The Problem, and How It Came To Be===

Space is everywhere. It is an elementary aspect of our perception and it is present in almost all domains of humanity's knowledge and endeavours. People share a common idea of space in the real world. Yet in the digital world, where ideas and observations take the form of data, space has far less of a common understanding. Numerous ways of modelling spatial/geometric data have come into existence and a large number of data structures for encoding and exchanging spatial data have been devised. To this day the diversity in working with spatial data is overwhelming.
Even within the niche of GIS there is a staggering number of data structures and software packages to deal with. Although this is understandable when looking at historic development, it is generally undesirable.

Copyright ©2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

In the later part of the twentieth century, digitalisation and globalisation were drivers for domain communities to develop and establish standards for data storage and exchange. These were primarily focused on how to model and structure data within domains. This has led to industry standards for spatial data in domains like geography, industrial design and manufacturing, and computer graphics. That development came with a good amount of competition between businesses trying to give their own inventions the upper hand, leading to many similar or greatly functionally overlapping solutions for the same task. Understandably, this caused a certain amount of frustration on the users' side of the matter.

Fortunately the surge of ways to work with spatial data was to some extent channelled and made more manageable by the emergence of industry standardisation organisations like the OGC and the W3C, and the involvement of ISO, followed by efforts at data harmonisation by international collaborations like the UN and the EU. However, working with spatial data is still a very domain-dependent affair. With an ever growing need to share data between commercial and non-commercial organisations, national and local governments, and regular people, this domain dependency becomes increasingly burdensome. Although working with data in GIS, CAD, or CG does become doable for domain experts after a few years of practice, a growing number of people, Web developers for instance, would like their data to be readily available and interpretable.

Even within domain communities the need to cross domain boundaries increases.
For example, GIS people want to use computer graphics and Web graphics technology for visualisation of geographic features and are hampered by the lack of true 3D capabilities in GIS. And with an increasing number of man-made structures on the earth's surface, combined with an increasing level of detail in the gathering of geographic data, there is a mounting need to be able to efficiently use systematic geometries like circles, splines or cuboids in GIS, next to the unruly shapes that are preferred by nature. Another example: because geographic data can tell so many interesting stories, their visual and analytical power is much appreciated among non-experts. But then things often go wrong because of incorrect use of earth-based coordinate reference systems (CRS), which are hard to simplify because our planet's surface changes all the time. Yet another example is the tracking and tracing of objects like people, vehicles, or sets of keys. When objects travel from the outside to the inside of buildings, or the other way around, they too travel between different industry domains: GIS and BIM (Building Information Modelling). In short: present-day technological challenges increasingly have cross-domain characteristics.

Domain fixation is not the only thing to blame for discord in models and structures for spatial data. There is also a tendency to resolve problems with methods that do alleviate niche difficulties, but fail to simplify the realm of spatial information as a whole. Although such initiatives could make a number of use cases easier to accomplish in the short term, in the long term they tend to sow more chaos. As a general principle, knowledge representation and information modelling, at least within the international and cross-organisational scope of the World Wide Web, should not be application-driven. Yes, use cases could and should be sourced from applications, and they are a valuable input for information models. But they cannot be expected to fully encompass all present and future requirements, nor can they ensure a faithful and as-simple-as-possible representation of real world phenomena. What is needed for optimal information modelling is a thorough understanding of the basic elements at play.

Non-interoperability of data and semantics is a huge socio-economic problem. Even in areas where lives are at stake, like health care, public safety, traffic management and climate change, combining data from all relevant sources in a timely, meaningful and error-free fashion is notoriously difficult. The basic interoperability problem is recognised by governmental organisations, and on several levels programmes are being set up to combat the phenomenon. For example, the UN has the Data Interoperability Collaborative [1], the EU has the European Interoperability Framework [2] and INSPIRE [3], the UK has the National Data Strategy [4] and the USA has several domain-specific initiatives such as USCDI [5]. But a lack of interoperability is not limited to government and public services. Perhaps not always recognised as such, and rarely publicly admitted, the problem recurrently arises within commercial organisations: within departments, between departments, and in dealings with other organisations. A lack of data and semantic interoperability does not always cause clear system failures. Its more common influence is an ever-present friction in basic procedures that are necessary to achieve more high-level goals, silently eating away time and money, as well as people's work pleasure.

Of course a lack of interoperability is not limited to spatial data. Nevertheless, space does stand out as an area in which much progress can be made.
===2 The Solution, and How It Could Take Form===

At the start of the present century, a simple but potentially very powerful solution to information fragmentation was put forward: the twin concepts of the Semantic Web and Linked Data, offering both semantic harmonisation of data and metadata, and a means to interlink data in different datasets. They form the obvious paradigm to achieve interoperability of spatial data. Even so, that has not really happened yet.

Interestingly enough, there is an example of a similar challenge that has proven to be feasible. The phenomenon of time has much in common with space. It is something ubiquitous in our daily lives, and therefore it is present in a large share of our digital systems. And there is a profound need for exchange of temporal data, without any misunderstandings. Contrary to space, time already has its domain-independent Web ontology: OWL-Time [6]. Supporting temporal relations and arbitrary temporal reference systems, this time ontology can be used to facilitate exchanging and linking temporal data from very diverse fields of knowledge and enterprise.

[1] https://www.data4sdgs.org/initiatives/data-interoperability-collaborative

[2] https://ec.europa.eu/isa2/eif

[3] https://inspire.ec.europa.eu

[4] https://www.gov.uk/government/publications/uk-national-data-strategy

[5] https://www.healthit.gov/isa/united-states-core-data-interoperability-uscdi

[6] https://www.w3.org/TR/owl-time

There are several reasons why a similar domain-independent Web ontology for spatial data has not yet come into existence. One has to do with complexity. While spatial data can encompass one to three dimensions, with additional measures in the case of linear referencing, time is a phenomenon that is basically one-dimensional. And while different temporal reference systems exist, they tend to be less complex than geographical reference systems.
So for spatial data, there is much more ground to cover to arrive at a fully functional cross-domain Web ontology.

Another reason why a universal space ontology has not been developed yet is the aforementioned entrenchment of space in disparate domains. Spatial information, like temporal information, can be encountered in many domains. But there are several, such as GIS, CAD and BIM, Computer Graphics and Web Graphics, in which spatial aspects show a dominance that is not comparable to the weight time has in any domain. Models and data structures cater for domain-specific requirements, and through many years of evolution do serve well enough when application requirements stay within domain bounds.

The fact that space-heavy domains have their own characteristics dictating domain information models and data structures is worthy of attention when contemplating universal Web semantics for spatial data. Existing idiosyncrasies are well-rooted in domain requirements and constraints. Computer graphics are served with capabilities for fast processing of relatively large amounts of 3D data. Hence a preference for uniformly structured data, like meshes of triangles or clusters of cubes. CAD and BIM (in which CAD methodology is used for spatial data) are concerned with flexibility in design while maintaining efficient data structures. Hence a focus on primitive shapes that can be used to make more complex shapes, by sweeps along curves and by Boolean operations between shapes. In LIDAR, data structures are governed by data capture hardware, making point clouds the typical data structure. GIS historically has focused on 2D maps of mainly natural features on the Earth's surface, resulting in relatively simple ways of using vector data and raster data. In Web Graphics other modelling constraints exist: depiction of spatial data always takes place on a Web page, and ideally fits in with existing Web standards like HTML, XML, JSON and CSS.
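The domain preferences described above can be made concrete with a small sketch. It is an illustration only, not taken from any of the standards mentioned: the same unit square is encoded once as a GIS-style boundary ring and once as a graphics-style triangle mesh. The variable and function names (`ring`, `vertices`, `triangles`, `ring_area`, `mesh_area`) are hypothetical. Both encodings yield the same area, because both rest on the same planar geometry.

```python
# Hypothetical illustration: one unit square, two domain-style encodings.

# GIS style: a closed boundary ring of 2D vertices (first vertex repeated).
ring = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]

# Computer graphics style: a shared vertex list plus triangles as index triples.
vertices = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]


def ring_area(ring):
    """Area of a closed ring, using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def mesh_area(vertices, triangles):
    """Area of a 2D triangle mesh: sum of half the cross product per triangle."""
    total = 0.0
    for a, b, c in triangles:
        (ax, ay), (bx, by), (cx, cy) = vertices[a], vertices[b], vertices[c]
        total += abs((bx - ax) * (cy - ay) - (cx - ax) * (by - ay)) / 2.0
    return total


print(ring_area(ring), mesh_area(vertices, triangles))
```

The two functions differ only in how they walk the data structure; the arithmetic underneath is the same.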
Yet, for all the differences between different ways of modelling and encoding spatial data, all models and methods share the same roots. They describe Euclidean space, and use the same mathematical principles in definitions of model elements and operations. Mathematics is the foundation of all spatial models, which makes it the common language that all models speak, and an assured way of allowing any kind of translation between disparate models and data structures. If there is no more direct mutual understanding, or mode of data transformation, one can always turn to mathematics as a lingua franca. This means that a cross-domain ontology for space will have to have a semantic foundation drilling down to universal mathematics.

So, if a universal ontology for spatial data should be both deep (with semantics going to mathematical roots) and wide (with capabilities for diverse domain requirements), will that mean it is bound to become an unwieldy monstrosity? That does not need to be the case, thanks to the principle of modularity. It is a defining feature of the Semantic Web, where ontologies can always be used on a need-to-know basis. Only ontologies that are needed to encode a certain set of data need to be invoked, and large parts of those ontologies can be ignored if they are not required for the job. Cherry picking is the go-to mode of adding meaning to data on the Semantic Web, making this technology very appropriate for providing practical semantics for spatial data.

Modularity can also be put to good use within a complex Web ontology, with well thought out separation of concerns aiding both development and use. In case of a space ontology, a few topics immediately stand out as candidates for modularisation. Take CRS: for some domains, defining a CRS is nothing more than defining an origin for Cartesian coordinates. In GIS they can be complicated and tricky, and often cause unintentional data errors.
To a certain extent that is because CRS definitions, like metadata, are compartmentalised and weakly linked in GIS. Housing CRS definitions in a Web ontology, or a module of a spatial ontology, would make it easier to jump from coordinates to the CRS definition that describes how they should be interpreted. And sharing the same semantic roots would make it easier to combine CRS, a task increasingly required with data being used across domains and across scales: the CRS of a door handle can sit within the CRS of a door, which sits within the CRS of a building, which sits within the CRS of a land plot, which sits within the CRS of a national cartographic grid, which sits within a geographic CRS, which sits within a CRS of the solar system, which sits within a CRS of the galaxy.

Different modules could also exist for popular ways of modelling spatial things. Meshes, point clouds, volume-based and boundary-based models, or cell-based models each have their own specialised capabilities and functionality that usually are sufficient for use. But as modules of one overarching ontology, they would gain the benefit of sharing common principles like geometric primitives, geometric operators and functions, CRS and topology (the latter also probably being worthy of its own semantic module). Whatever further modularisation can be envisaged, it is important to know that helpful semantic data from other modules is always accessible through graph traversal.

Once a basic Web ontology for space is in place, with increased capabilities to share and link spatial (meta)data across domains, it could reduce the amount and complexity of software that is used for storage and processing of spatial data, and increase its interoperability. That would be welcome, because data transformation accounts for a lot of friction in working with spatial information. It eats up resources and often results in data getting lost in translation.
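The chain of nested CRS described earlier (door handle within door, within building, within land plot) can be sketched as a linked structure that is traversed from child to parent. This is a deliberately simplified illustration under stated assumptions: every CRS is Cartesian and differs from its parent only by a translation of its origin, the class `LocalCRS` and the function `to_ancestor` are hypothetical names, and all numbers are made up. Real CRS chains would also involve rotation, scale, and geodetic transformations.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float, float]


@dataclass
class LocalCRS:
    """Hypothetical Cartesian CRS, defined only by its origin in a parent CRS."""
    name: str
    origin_in_parent: Point = (0.0, 0.0, 0.0)
    parent: Optional["LocalCRS"] = None


def to_ancestor(point: Point, crs: LocalCRS, ancestor: LocalCRS) -> Point:
    """Express `point` (given in `crs`) in the coordinates of `ancestor`."""
    x, y, z = point
    current = crs
    # Walk up the parent links, adding each origin offset along the way.
    while current is not ancestor:
        if current.parent is None:
            raise ValueError(f"{ancestor.name} is not an ancestor of {crs.name}")
        ox, oy, oz = current.origin_in_parent
        x, y, z = x + ox, y + oy, z + oz
        current = current.parent
    return (x, y, z)


# Illustrative values in metres; all numbers are invented for this sketch.
plot = LocalCRS("land plot")
building = LocalCRS("building", (12.0, 30.0, 0.0), plot)
door = LocalCRS("door", (4.0, 0.0, 0.0), building)
handle = LocalCRS("door handle", (0.8, 0.0, 1.0), door)

# A point on the handle, expressed in land plot coordinates.
handle_tip_in_plot = to_ancestor((0.0, 0.05, 0.0), handle, plot)
print(handle_tip_in_plot)
```

The parent links play the role that graph traversal would play in an ontology: following them from coordinates to ever wider CRS definitions is what makes the combination possible.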
The currently diverse models for spatial knowledge representation underpin a much larger number of data structures (e.g. file and database formats). Many data structures were designed with specific use in mind, and have optimisations for those specific cases, so there will still be a need for them. But it should be possible to transform data from one data structure to another much more easily if those structures are defined in terms of the same core model. Also, next to making data structures more interoperable, a core ontology for spatial data could make native formats for graph data on the Web more powerful. For instance, with full freedom of expression of Semantic Web data, JSON-LD could play a much bigger role in encoding any type of spatial data. With more versatile structures for spatial data in place, additional capabilities for software that is intended for direct user interaction (e.g. analysis and visualisation) and supporting software libraries (e.g. CGAL, OpenGL, WebGL, the Point Cloud Library, GDAL/OGR and JTS/GEOS) should be attainable.

===3 A Possible Starting Point===

No matter the gains society could reap from a universal spatial ontology, its development and establishment will be a daunting task. Not only is there much knowledge to be encoded in ontology terms, its development will also require close collaboration between domain experts who are not used to working together. One source of relief is, again, modularisation. Developing a Web ontology for spatial data could and should be done step by step, with each step resulting in ontological modules that are useful on their own. And to a certain extent, different ontological modules could be developed in parallel.

Another circumstance mitigating the amount of work to be done is that progress has already been made. GeoSPARQL, OGC's Semantic Web standard, could be a suitable kernel for development of a wider scoped ontology for spatial data.
Its custodian, the OGC, is well aware of increasing demand for interdisciplinary exchange of spatial data and the benefits of combining geography and Linked Data. Its collaboration with the W3C in the Spatial Data on the Web Working Group [7] can attest to that. Furthermore, the OGC has compiled a large body of tried and tested specifications for spatial data that have not found their way into GeoSPARQL yet, but are quite ready for that. In fact, the sparseness of the current version of GeoSPARQL (1.0) is a benefit to its potential to grow into a broader ontology for more than only geographic information. And what's more, new versions of GeoSPARQL are currently in development, in an open manner that allows spatial enthusiasts of any background to pitch in.

[7] https://www.w3.org/2015/spatial/wiki/Main_Page

The name GeoSPARQL, and its title, 'A Geographic Query Language for RDF Data', suggest the specification is intended for geographic data only. But looking past that, the ontology itself is quite fit for non-geographic data. Its root class (SpatialObject) covers all things that are spatial in some way. So in essence all the different ways of modelling space can be included. Its two currently defined subclasses, Feature and Geometry, are not limited to geography either. But the way the Geometry class is effectuated in GeoSPARQL 1.0 does play in favour of geography. It is not really possible to describe a geometric object (a collection of coordinates in a certain coordinate system) purely in RDF at the moment. Description takes place by means of geometry literals, using one of two serialisations that are rooted in GIS: GML and WKT. Nevertheless, there does not seem to be a semantic impediment to extending GeoSPARQL in ways that allow the use of geometric models from other domains. Moreover, should one wish to use other means than numerical coordinates to describe a spatial thing, it is possible to specialise SpatialObject in other directions.
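What a geometry literal looks like in practice can be shown with a minimal sketch. The Turtle fragment uses GeoSPARQL 1.0 terms (`geo:Feature`, `geo:Geometry`, `geo:hasGeometry`, `geo:asWKT`, `geo:wktLiteral`); the `ex:` namespace, the resource names and the coordinates are hypothetical. The helper `split_wkt_literal` is likewise an assumption made for illustration, not part of any library. It separates the optional leading CRS URI from the WKT text, falling back to CRS84, the default that GeoSPARQL 1.0 assumes when no CRS is given.

```python
# A feature with a geometry, as GeoSPARQL 1.0 would express it in Turtle.
# The ex: names are invented; the coordinates live inside an opaque literal.
TURTLE_EXAMPLE = """
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix ex:  <http://example.org/> .

ex:building a geo:Feature ;
    geo:hasGeometry ex:buildingLocation .

ex:buildingLocation a geo:Geometry ;
    geo:asWKT "POINT(4.89 52.37)"^^geo:wktLiteral .
"""

# Default CRS for a wktLiteral that does not name one (GeoSPARQL 1.0).
CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"


def split_wkt_literal(lexical):
    """Split a wktLiteral's lexical form into (CRS URI, WKT text)."""
    text = lexical.strip()
    crs = CRS84
    if text.startswith("<"):
        end = text.find(">")
        if end == -1:
            raise ValueError("unterminated CRS URI")
        crs = text[1:end]
        text = text[end + 1:].strip()
    if not text:
        raise ValueError("no WKT text in literal")
    return crs, text


print(split_wkt_literal("POINT(4.89 52.37)"))
print(split_wkt_literal(
    "<http://www.opengis.net/def/crs/EPSG/0/4326> POINT(52.37 4.89)"))
```

The sketch shows why the text speaks of geometry not being described purely in RDF: both the coordinates and the CRS reference sit inside a string that has to be parsed separately and cannot be reached by graph traversal.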
This is not to say that GeoSPARQL is the only starting point to consider for an all-encompassing spatial Web ontology. X3D (a standard for three-dimensional spatial data on the Web), for example, has a paired X3D Web ontology [8] currently in development. And IFC (Industry Foundation Classes), the main standard in BIM, has an ontology for the Web too: ifcOWL [9]. Given that IFC allows more kinds of spatial data than comparable OGC standards (OGC's Simple Features can be regarded as a subset of IFC), it could be viewed as further on the way towards a universal ontology for spatial data. But ifcOWL currently is a result of a direct and automated translation of the IFC schemas, resulting in an ontology that is not very understandable to those without expert knowledge of IFC. GeoSPARQL, on the other hand, was handcrafted from the ground up, making it friendlier to implementers and users, and more capable of utilising the unique abilities of Semantic Web technology. Furthermore, GeoSPARQL's rather limited semantics for geometry could be viewed as a weakness, but it does provide ample opportunities for further growth in domain-independent functionality.

Irrespective of the starting point for a universal Web ontology for space, developers and maintainers of existing spatial ontologies will ideally have to cooperate, inspire each other and learn from each other. This is also true for the possible evolution of GeoSPARQL from a bare-bones standard for geographic data to a multidisciplinary ontology for spatial data with different origins. The process will need multidisciplinary involvement of people with different origins. All viewpoints and contributions, however narrow of focus or overly broad they may be thought to be, are valuable. Development of next versions of GeoSPARQL takes place at https://github.com/opengeospatial/ogc-geosparql. Participation is open to everyone.

[8] https://www.web3d.org/x3d/content/semantics/semantics.html

[9] https://technical.buildingsmart.org/standards/ifc/ifc-formats/ifcowl