An Ontology-Driven Approach to Support Data Analysts with Thermal Comfort Problems in the Built Environment

Iker Esnaola-Gonzalez¹, Jesús Bermúdez² and Cristina Aceta¹

¹ TEKNIKER, Basque Research and Technology Alliance (BRTA), Iñaki Goenaga 5, 20600 Eibar, Spain
² Freelance

Abstract

Since we spend most of our time within buildings, it is of utmost importance to feel comfortable while staying indoors. However, research shows that HVAC systems ensure occupants’ satisfaction in only 11% of commercial buildings. The advancing spread of the Internet of Things (IoT) and the maturity of Knowledge Discovery in Databases (KDD) may contribute to the development of accurate predictive models that address this challenge. But data analysts in charge of developing these predictive models may feel overwhelmed if they have insufficient domain expertise. In this article, the ontology-driven approach proposed by the EEPSA (Energy Efficiency Prediction Semantic Assistant) is presented, in which data analysts can benefit from previously captured domain knowledge. Specifically, this article proposes the exploitation of Semantic Technologies to support data analysts in the discovery of the variables that should be considered for making accurate predictive models for thermal comfort problems within buildings. Compared with existing tools and methods, the EEPSA is able to suggest variables that may not necessarily be included in the set of data available. This has great potential nowadays, when Linked Open Data and third-party repositories can be exploited to incorporate relevant information.

Keywords

Ontology, Buildings, Data Analysis, Thermal Comfort

1. Introduction

Stanley Hall, an American psychologist, stated that man is largely a creature of habit, and nowadays most of these human habits or daily activities (e.g. sleeping, shopping, working) take place indoors.
This statement was reinforced by a study from the early 2000s, which concluded that we spend 87% of our time indoors [1]. Since we spend most of our time within buildings, it is of utmost importance for humans to feel comfortable while staying indoors. Building users’ comfort comprises different aspects, including acoustic, visual and thermal comfort, and the latter is defined by the ANSI/ASHRAE Standard 55-2017¹ as “that condition of mind that expresses satisfaction with the thermal environment and is assessed by subjective evaluation”.

LDAC 2022: 10th Linked Data in Architecture and Construction Workshop, May 29, 2022, Hersonissos, Greece
Email: iker.esnaola@tekniker.es (I. Esnaola-Gonzalez); jesus.bermudez@ehu.eus (J. Bermúdez); cristina.aceta@tekniker.es (C. Aceta)
ORCID: 0000-0001-6542-2878 (I. Esnaola-Gonzalez); 0000-0002-6937-9579 (J. Bermúdez); 0000-0003-3566-9460 (C. Aceta)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

Consequently, research on the impact of thermal comfort on occupants’ well-being has become an important area of study. In [2], Arif et al. present a state-of-the-art analysis of research on the health and well-being of occupants and their relationship to the Indoor Environmental Quality (IEQ). In [3], Haynes showed that the IEQ has a direct impact on workers’ efficiency and productivity. Furthermore, thermal comfort influences the customer experience in retail and restaurant spaces [4].

However, although HVAC systems should ensure thermal comfort, only 11% of commercial buildings meet the criterion that no more than 20% of building occupants are dissatisfied [5]. The optimal HVAC activation strategy for ensuring occupants’ thermal comfort while making efficient use of energy is still an unsolved problem in most buildings.
Furthermore, certain types of buildings have specific features that may further aggravate this problem. For example, large spaces are prone to have greater thermal inertia [6] and cannot be effectively climatised with rather simple solutions such as thermostat-based reactive systems. Instead, HVAC systems need to be activated in a specific mode and at a specific time to ensure a comfortable thermal condition.

The advancing spread of the Internet of Things (IoT) and the maturity of Knowledge Discovery in Databases (KDD) may contribute to the development of accurate predictive models that address this challenge. However, data analysts in charge of developing these predictive models may feel overwhelmed if they have insufficient domain expertise [7]. Consequently, they may resort to a trial-and-error approach when trying to develop high-performing predictive models. This is an undesirable strategy, and an assistant that supports data analysts through the predictive model development process could be of interest. In this regard, knowledge from domain experts could be captured leveraging Semantic Technologies and made available for data analysts to exploit.

In this article, the ontology-driven assistant proposed by the EEPSA (Energy Efficiency Prediction Semantic Assistant) is presented. This assistant supports data analysts throughout the KDD process and helps them discover the relevant variables to consider when developing a model that accurately predicts thermal comfort within buildings.

The rest of this article is structured as follows. Section 2 introduces the related work. Section 3 describes the ontology that supports the proposed data analyst assistant. In Section 4, an illustrative use case is presented for demonstration purposes. Finally, conclusions of this work are presented in Section 5.

2. Related Work

At an abstract level, the KDD field is concerned with the development of methods and techniques for making sense of data [8], although in this article KDD is understood as a less generic process: a process to develop predictive models that estimate an unknown outcome. The typical KDD process comprises five steps, as shown in Figure 1: Data Selection, Preprocessing of Data, Data Transformation, Data Mining and Interpretation.

¹ https://www.ashrae.org/technical-resources/bookstore/standard-55-thermal-environmental-conditions-for-human-occupancy

Figure 1: An overview of a typical KDD process.

The first phase, which is where the focus of this article is placed, consists in selecting a data set, or a subset of variables or data samples, on which the knowledge discovery is going to be performed. With the expansion of the IoT and the advent of new paradigms such as Linked Data, data analysts may get lost in today’s plethora of data, which can hinder the application of a knowledge extraction process. In order to avoid this problem, data analysts have to understand the data itself: what knowledge it represents and what additional knowledge can be extracted from it. However, this is not a trivial task and, in most cases, domain-specific knowledge is needed to select the adequate sets of data and variables to analyse.

Methods for exploring and visualising data may contribute to understanding the data that data analysts need to deal with [9]. These approaches are aimed at visualising data in a coherent and legible way, thus allowing users to obtain a good understanding of its structure, and therefore implicitly compose queries, identify links between resources and discover new pieces of information. However, such an understanding of the data still requires a deep knowledge of the domain at hand.
Apart from the visualisation approaches, there are more classical methods that may support the KDD’s Data Selection phase. One of them is attribute relevance analysis, which attempts to recognise those attributes with the greatest impact on the target variable, while removing the ones with less relevance from a given set of data [10]. This method aims to reduce the redundancy and uncertainty in the predictive model development process, although the performance of the attribute relevance analysis itself may be affected by the vast amount and heterogeneity of the data that data analysts may face nowadays.

This article proposes the exploitation of Semantic Technologies to support data analysts in the discovery of the variables that should be considered for making accurate predictive models. This is a different approach compared to existing work, which either focuses on visualising data that may not be understood without deep domain knowledge (e.g. data visualisation tools) or cannot suggest new relevant variables that are not present in current data sets (e.g. relevance analysis).

3. EEPSA: An Ontology for Thermal Comfort in Buildings

In order to incorporate Semantic Technologies in an assistant that supports data analysts through the predictive model development process, it is necessary to leverage proper ontologies that codify the required knowledge and enable the adequate annotation of the data. The aforementioned EEPSA, defined in [11], is an assistant based on Semantic Technologies, including ontologies, ontology-driven rules and ontology-driven data access, that guides data analysts through the KDD process in a semi-automatic manner, towards the development of enhanced predictive models for energy efficiency and occupants’ thermal comfort assurance in tertiary buildings. In the context of such an assistant, the EEPSA ontology² presented in [12] is the cornerstone.
The EEPSA ontology’s backbone has been defined as a combination of three Ontology Design Patterns (ODPs). ODPs are minimal ontologies that address recurrent design problems that may arise in ontology conceptualisation, formalisation or implementation activities [13]. The combination of the AffectedBy³, the Execution-Executor-Procedure (EEP⁴) and the Result-Context (RC⁵) ODPs provides appropriate concepts to represent scenarios where executions, including observations, actuations or predictions, play a key role. The careful design of these three ODPs’ property axioms overcomes weaknesses discovered in existing ODP-based ontologies such as the SOSA/SSN ontology [14] or the SEAS FeatureOfInterest ontology [15]. Furthermore, these ODPs try to be minimal in the number of classes and properties offered, but include appropriate ontology axioms that allow proper inferences.

On top of these three ODPs, six ontology modules have been developed, which represent the set of suitable terms, concepts and relationships to support data analysts through the predictive model construction process for the problem at hand. Five of these modules specialise the knowledge in the scope of the stub classes defined in the three ODPs:

• FoI4EEPSA⁶ for representing buildings and building spaces;
• Q4EEPSA⁷ for representing qualities of these spaces;
• EXR4EEPSA⁸ for representing executors such as sensors and actuators;
• P4EEPSA⁹ for representing specific plans or methods;
• EXN4EEPSA¹⁰ for representing executions such as observations and actuations.

The sixth ontology module, EK4EEPSA¹¹, does not specialise any stub class; it is designed to contain expert knowledge representing different types of spaces and the variables affecting their indoor conditions. The EEPSA ontology is depicted in Figure 2.
Summarising, the EEPSA ontology is the combination of the following ontological resources: three ODPs, five ontology modules specialising the stub classes defined by these ODPs, and an ontology module containing expert knowledge.

² https://w3id.org/eepsa
³ https://w3id.org/affectedBy
⁴ https://w3id.org/eep
⁵ https://w3id.org/rc
⁶ https://w3id.org/eepsa/foi4eepsa
⁷ https://w3id.org/eepsa/q4eepsa
⁸ https://w3id.org/eepsa/exr4eepsa
⁹ https://w3id.org/eepsa/p4eepsa
¹⁰ https://w3id.org/eepsa/exn4eepsa
¹¹ https://w3id.org/eepsa/ek4eepsa

Figure 2: The EEPSA ontology.

The EEPSA ontology’s development has followed the NeOn Methodology [16], with a view to being compliant with the FAIR principles [17] by leveraging the FOOPS! (Ontology Pitfall Scanner for FAIR) tool¹² [18]. Existing resources have been reused as much as possible following the Ontological Resource Reuse Process [19], not only to capture and facilitate consensus in communities, but also to reduce redundancies and increase interoperability. Precisely to contribute to the interoperability of the solution, the ODPs and ontology modules that make up the EEPSA ontology are aligned with related domain ontologies and upper-level ontologies, since this practice alleviates integration problems, helps to ensure clarity in modelling and avoids errors that have unintended reasoning implications [20, 21]. Furthermore, all the EEPSA ontology terms contain the metadata proposed in the guidelines defined by Garijo and Poveda-Villalón [22], and the EEPSA ontology’s resources (i.e. the three ODPs and the six ontology modules) are documented with the WIDOCO (a WIzard for DOCumenting Ontologies) tool. Additionally, the ontology has been validated with Themis¹³ [23], which verified that the EEPSA ontology satisfies all the functional requirements initially defined.
Last but not least, the EEPSA ontology is available online under a Creative Commons Attribution (CC BY 4.0) license, and it is published in different catalogues such as LOV¹⁴ (Linked Open Vocabularies) or LOV4IoT¹⁵.

3.1. The EK4EEPSA Ontology Module

The EK4EEPSA (Expert Knowledge ontology module for the EEPSA Ontology) captures the necessary knowledge to provide inferencing capabilities that can be exploited by data analysts in the KDD Data Selection phase. Towards such a goal, a group of thermal and energy domain experts were interviewed to elicit and formalise their knowledge, and to capture it properly in the ontology module. More specifically, the EK4EEPSA addresses requirements including the ones described in Table 1 in the form of CQs (Competency Questions).

¹² https://foops.linkeddata.es
¹³ http://themis.linkeddata.es/
¹⁴ https://lov.linkeddata.es/
¹⁵ https://lov4iot.appspot.com/

Table 1
Requirements addressed by the EK4EEPSA ontology module.

CQ | Answer
Which types of spaces are in a building? | Bad insulated spaces,...
Which are the qualities affecting an adjacent to outdoor space’s temperature? | Solar radiation, wind speed,...
Which are the qualities affecting a bad insulated space’s temperature? | Indoor temperature, outdoor temperature,...
Which are the qualities affecting an underground space’s temperature? | Atmospheric pressure, occupancy,...
Which are the qualities affecting a naturally ventilated space’s relative humidity? | Indoor humidity, outdoor humidity,...

The EK4EEPSA defines a classification of types of spaces within buildings. In the context of the EEPSA ontology, a space is understood as “a part of the physical world or a virtual world whose 3D spatial extent is bounded actually or theoretically, and provides for certain functions within the zone it is contained in”, as defined by the BOT (Building Topology Ontology) [24].
The categorisation of types of spaces is based on the structural features of such spaces, and includes spaces located in underground floors (ek4eepsa:BelowGroundLevelSpace) and spaces with poor insulation (ek4eepsa:BadInsulatedSpace). Other categorisations of spaces were also considered, such as the one proposed by the HBC (Human Comfort in Building) ontology [25], where spaces are defined based on whether they contain certain types of building objects (e.g. hbc:SpaceWithAirTerminal) or do not (e.g. hbc:SpaceWithoutShadingDevice).

Note that in the scenario addressed in this article, it may be convenient to make heavy use of axioms expressing sufficient conditions, so that individuals are recognised as members of the appropriate classes by inference. That is, it may be suitable to use equivalent class axioms with appropriate right-hand class expressions, rather than depending on explicit assertions only. For example, the ek4eepsa:BelowGroundLevelSpace is defined as follows:

    ek4eepsa:BelowGroundLevelSpace ≡ bot:Space ⊓ ∃bot:hasStorey.foi4eepsa:UndergroundStorey

Complementary to this space classification, for each space type, the qualities that affect its indoor temperature are captured and represented in the EK4EEPSA ontology module. This representation relies on other resources of the EEPSA ontology, namely on the qualities represented in the Q4EEPSA ontology module and the axioms defined in the AffectedBy ODP. For example, the temperature of a space located in an underground level may be affected by qualities such as the atmospheric pressure, the humidity of the space itself and the occupancy of the room, as represented in the following axiom:

    ek4eepsa:BelowGroundLevelSpaceIndoorTemperature ⊑ ∃aff:affectedBy.q4eepsa:AtmosphericPressure
        ⊓ ∃aff:affectedBy.q4eepsa:IndoorHumidity
        ⊓ ∃aff:affectedBy.q4eepsa:Occupancy

In the latest version of the EK4EEPSA ontology module available at the time of writing this article (i.e. version 1.1), knowledge regarding qualities that affect the indoor relative humidity of certain types of spaces, such as naturally ventilated spaces (ek4eepsa:NaturallyVentilatedSpace), has been incorporated. Likewise, further knowledge can be captured and represented in a similar way in order to cover future requirements.

This knowledge modelling can be exploited by application programs or other services to support data analysts facing thermal comfort problems in buildings. Once the type of space at hand is known, data analysts learn which qualities are most relevant, which guides the KDD Data Selection phase. This knowledge exploitation is illustrated with a use case in the following section.

4. Use case: Thermal Comfort in Restaurants

Let us consider a scenario in which a restaurant manager needs to ensure the thermal comfort of customers. For that purpose, a model that predicts the restaurant temperature is proposed, which will later be used as the foundation to support optimal HVAC activation strategies. Being a non-expert in the thermal comfort domain, the data analyst in charge would benefit from a service that suggests the most relevant variables or attributes for developing an accurate predictive model, that is, a service that supports data analysts in the KDD Data Selection phase.

In order to make use of the EEPSA’s assistance in the KDD Data Selection phase, the use case first needs to be represented with the adequate EEPSA ontology terms. This includes representing the restaurant itself, its structural elements and the equipment deployed, such as sensors or actuators, with terms from different EEPSA modules as well as from other reused well-known ontologies such as BOT.
This semantic annotation phase can be accomplished by manually editing an RDF model with the help of an adapted GUI (Graphical User Interface) or a data wrangling tool, or else with a properly programmed automatic middleware. An excerpt of the semantic representation of the presented use case is as follows:

    :building35 rdf:type bot:Building ;
        (...)
        bot:hasStorey :level1 .
    :level1 rdf:type bot:Storey ;
        bot:hasSpace :restaurant ;
        bot:hasSpace :kitchen ;
        (...)
        bot:hasSpace :bathroom .
    :restaurant rdf:type bot:Space ; (*)
        bot:hasElement :door01 ;
        bot:hasElement :window01 ; (*)
        bot:hasElement :window02 ;
        bot:hasElement :tempSensor01 ;
        (...)
        bot:hasElement :windowShading01 .
    :window01 rdf:type foi4eepsa:ExternalWindow . (*)
    :window02 rdf:type foi4eepsa:Window .
    (...)
    :tempSensor01 rdf:type exr4eepsa:TemperatureSensor .
    :windowShading01 rdf:type exr4eepsa:BlindActuator .
    :obs_tempSesnsor01_2589 rdf:type exn4eepsa:Observation ;
        eep:madeBy :tempSensor01 ;
        eep:usedProcedure :sensingProcedure01 ;
        eep:onQuality :restaurantTemp .
    :sensingProcedure01 rdf:type p4eepsa:SensingProcedure .
    :restaurantTemp rdf:type q4eepsa:IndoorTemperature .

Once the scenario is semantically annotated, an inference engine needs to be applied so that new information can be deduced from the RDF model. This newly inferred data is essential to support data analysts, and it is derived from the knowledge captured in the EEPSA ontology. Some triple stores have inferencing engines integrated, while in other cases these reasoning capabilities need to be added manually. In the presented use case, the implicit knowledge was inferred by manually applying the HermiT version 1.3.8 OWL 2 DL reasoner.
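The effect of this inference step can be illustrated with a small, library-free Python sketch. It does not reproduce how HermiT works internally; it only mimics, for the asterisk-marked triples of the excerpt, the kind of sufficient-condition (equivalent class) definitions that EK4EEPSA uses, here those of ek4eepsa:AdjacentToOutdoorSpace and ek4eepsa:NaturallyEnlightenedSpace. The SUBCLASS_OF entry stating that an external window is an external building element is an assumption made for illustration, and the names RULES, objects, types_of and infer_space_types are hypothetical.

```python
# Triples copied from the use case excerpt: (subject, predicate, object).
TRIPLES = {
    (":level1", "rdf:type", "bot:Storey"),
    (":restaurant", "rdf:type", "bot:Space"),
    (":restaurant", "bot:hasElement", ":window01"),
    (":restaurant", "bot:hasElement", ":window02"),
    (":window01", "rdf:type", "foi4eepsa:ExternalWindow"),
    (":window02", "rdf:type", "foi4eepsa:Window"),
}

# Assumption for illustration: this subclass link is not stated in the excerpt.
SUBCLASS_OF = {
    "foi4eepsa:ExternalWindow": {"foi4eepsa:ExternalBuildingElement"},
}

# Space type -> element classes; a bot:Space having an element of ANY of
# these classes is recognised as a member of the space type.
RULES = {
    "ek4eepsa:AdjacentToOutdoorSpace": {"foi4eepsa:ExternalBuildingElement"},
    "ek4eepsa:NaturallyEnlightenedSpace": {
        "foi4eepsa:ExternalWindow",
        "foi4eepsa:Skylight",
    },
}


def objects(subject, predicate):
    """Return all objects o such that (subject, predicate, o) is asserted."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def types_of(node):
    """Asserted types of a node plus all their transitive superclasses."""
    found = set()
    pending = list(objects(node, "rdf:type"))
    while pending:
        cls = pending.pop()
        if cls not in found:
            found.add(cls)
            pending.extend(SUBCLASS_OF.get(cls, ()))
    return found


def infer_space_types(space):
    """Apply the sufficient conditions in RULES to one candidate space."""
    if "bot:Space" not in types_of(space):
        return set()
    element_types = set()
    for element in objects(space, "bot:hasElement"):
        element_types |= types_of(element)
    return {cls for cls, sufficient in RULES.items() if sufficient & element_types}


print(sorted(infer_space_types(":restaurant")))
```

In this sketch the restaurant is recognised as a member of both classes solely because of :window01, whereas :window02, typed only as foi4eepsa:Window, triggers neither rule. An OWL 2 DL reasoner derives many more entailments than this; the sketch covers only the two definitions listed in RULES.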
The resulting RDF model was then uploaded to an OpenLink Virtuoso Server version 07.20.3217, where it remained accessible via a SPARQL endpoint.

Once the RDF model is generated, data analysts can ask which are the most relevant variables affecting the restaurant’s temperature. For illustrative purposes, a step-by-step explanation is provided in this section, although in practice it could be simplified to a single query to the triplestore. Ideally, the production of these SPARQL queries should be managed by a graphical interface that isolates data analysts from the underlying SPARQL query language, in which they might not be experts, thus easing their interaction.

In the presented use case, the data analyst would initially ask for the type of the target space with the following SPARQL query:

    SELECT ?spaceType
    WHERE {
        :restaurant rdf:type ?spaceType .
    }

which, evaluated over the given set of triples, retrieves ek4eepsa:AdjacentToOutdoorSpace and ek4eepsa:NaturallyEnlightenedSpace. Therefore, it can be concluded that the restaurant is a space in contact with the exterior and lit by the sun. These results are derived from the use case’s triples (specifically the ones marked with an asterisk) and the knowledge inferred thanks to the axioms

    ek4eepsa:AdjacentToOutdoorSpace ≡ bot:Space ⊓ ∃bot:hasElement.foi4eepsa:ExternalBuildingElement

and

    ek4eepsa:NaturallyEnlightenedSpace ≡ bot:Space ⊓ (∃bot:hasElement.foi4eepsa:ExternalWindow ⊔ ∃bot:hasElement.foi4eepsa:Skylight)

Knowing which type of space the restaurant at hand is, the data analyst would then ask which variables may be more relevant to develop an accurate predictive model. For that purpose, the following SPARQL query would be executed¹⁶:

    PREFIX aff: <https://w3id.org/affectedBy#>
    SELECT ?relevantVariable
    WHERE {
        :restaurant rdf:type ?spaceType .
        ?spaceType aff:influencedBy ?relevantVariable .
    }

This SPARQL query would return the following variables:

• q4eepsa:IndoorHumidity
• q4eepsa:Occupancy
• q4eepsa:SolarRadiation
• q4eepsa:WindSpeed
• q4eepsa:CloudCover
• q4eepsa:SunPositionDirection
• q4eepsa:SunPositionElevation

These variables¹⁷ are inferred thanks to the role chain axiom defined in the AffectedBy ODP:

    aff:influencedBy ∘ aff:affectedBy ⊑ aff:influencedBy

as well as the axioms captured in the EK4EEPSA related to the definition of spaces adjacent to outdoors:

    ek4eepsa:AdjacentToOutdoorSpace ⊑ ∃aff:influencedBy.ek4eepsa:AdjacentToOutdoorSpaceIndoorTemperature

    ek4eepsa:AdjacentToOutdoorSpaceIndoorTemperature ⊑ ∃aff:affectedBy.q4eepsa:IndoorHumidity
        ⊓ ∃aff:affectedBy.q4eepsa:Occupancy
        ⊓ ∃aff:affectedBy.q4eepsa:SolarRadiation
        ⊓ ∃aff:affectedBy.q4eepsa:WindSpeed

and of naturally enlightened spaces:

    ek4eepsa:NaturallyEnlightenedSpace ⊑ ∃aff:influencedBy.ek4eepsa:NaturallyEnlightenedSpaceIndoorTemperature

    ek4eepsa:NaturallyEnlightenedSpaceIndoorTemperature ⊑ ∃aff:affectedBy.q4eepsa:CloudCover
        ⊓ ∃aff:affectedBy.q4eepsa:IndoorHumidity
        ⊓ ∃aff:affectedBy.q4eepsa:Occupancy
        ⊓ ∃aff:affectedBy.q4eepsa:SunPositionDirection
        ⊓ ∃aff:affectedBy.q4eepsa:SunPositionElevation

¹⁶ Note that this SPARQL query would be enough for the data analyst, although the previous one has also been displayed for demonstration purposes.
¹⁷ q4eepsa is the preferred namespace prefix for the Q4EEPSA ontology module.

Therefore, after semantically annotating the use case restaurant, the data analyst discovers which may be the most relevant variables for developing an accurate predictive model for the problem at hand. It is remarkable that, unlike other existing approaches, the EEPSA suggests variables that may not be present in the current set of data that the data analyst has. Nevertheless, knowing which variables may contribute to developing an accurate predictive model is definitely helpful for the data analyst, since missing ones can often be obtained elsewhere.
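The retrieval of relevant variables can be mimicked with a short Python sketch, given only as an illustration under stated assumptions. It transcribes the EK4EEPSA axioms quoted in this section into two dictionaries and follows aff:influencedBy and then aff:affectedBy, which is the effect of the role chain axiom in this use case; an actual deployment would rely on the reasoner and the SPARQL endpoint instead. The function names and the `available` set, standing for variables already present in the analyst's data set, are hypothetical.

```python
# Class-level knowledge transcribed from the axioms quoted in this section.
INFLUENCED_BY = {
    "ek4eepsa:AdjacentToOutdoorSpace": [
        "ek4eepsa:AdjacentToOutdoorSpaceIndoorTemperature",
    ],
    "ek4eepsa:NaturallyEnlightenedSpace": [
        "ek4eepsa:NaturallyEnlightenedSpaceIndoorTemperature",
    ],
}
AFFECTED_BY = {
    "ek4eepsa:AdjacentToOutdoorSpaceIndoorTemperature": [
        "q4eepsa:IndoorHumidity",
        "q4eepsa:Occupancy",
        "q4eepsa:SolarRadiation",
        "q4eepsa:WindSpeed",
    ],
    "ek4eepsa:NaturallyEnlightenedSpaceIndoorTemperature": [
        "q4eepsa:CloudCover",
        "q4eepsa:IndoorHumidity",
        "q4eepsa:Occupancy",
        "q4eepsa:SunPositionDirection",
        "q4eepsa:SunPositionElevation",
    ],
}

# Space types retrieved for :restaurant by the first query of the use case.
RESTAURANT_TYPES = [
    "ek4eepsa:AdjacentToOutdoorSpace",
    "ek4eepsa:NaturallyEnlightenedSpace",
]


def relevant_variables(space_types):
    """Qualities reached via influencedBy followed by one or more affectedBy."""
    found = set()
    for space_type in space_types:
        pending = []
        for quality in INFLUENCED_BY.get(space_type, ()):
            pending.extend(AFFECTED_BY.get(quality, ()))
        # Repeated application of the role chain: keep following affectedBy.
        while pending:
            variable = pending.pop()
            if variable not in found:
                found.add(variable)
                pending.extend(AFFECTED_BY.get(variable, ()))
    return found


def missing_variables(space_types, available):
    """Suggested variables that are absent from the analyst's data set."""
    return sorted(relevant_variables(space_types) - set(available))


# Hypothetical data set that only contains two of the suggested variables.
available = {"q4eepsa:IndoorHumidity", "q4eepsa:Occupancy"}
for name in missing_variables(RESTAURANT_TYPES, available):
    print(name)
```

Comparing the suggested variables with those already available makes explicit which ones would have to be sourced from outside the current data set, which is the distinguishing feature of the approach with respect to attribute relevance analysis.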
For example, cloud coverage and the sun position (direction and elevation) could be retrieved from external Linked Open Data sources. Another example is the occupancy, which could be obtained from the reservation list that the restaurant manager may have. The EEPSA also supports this variable generation task in the KDD’s Transformation phase, although the details of this support are out of the scope of this article.

5. Conclusions

When deep knowledge of the thermal comfort and building domains is required to efficiently develop a predictive model, insufficient expertise could make data analysts feel overwhelmed. The EEPSA tries to address this issue by supporting data analysts in the KDD’s Data Selection phase. For that purpose, it leverages the EEPSA ontology, which codifies the required knowledge and enables the adequate annotation of the data.

In this article, an ontology-driven assistant has been proposed and demonstrated in a restaurant use case. Compared with existing tools and methods, the EEPSA is able to suggest variables that may not necessarily be included in the set of data available. This has great potential nowadays, when Linked Open Data and third-party repositories are valuable sources of knowledge that can be exploited to incorporate relevant information into the set of data available.

The EEPSA is oriented to energy efficiency and thermal comfort problems in tertiary buildings. However, this same approach could be extended to other domains. As a matter of fact, it is expected to pave the way towards the development of ontology-driven approaches that bridge the gap for data analysts with insufficient domain knowledge.

Acknowledgments

This work is supported by the REACT project, which has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 824395.

References

[1] N. E. Klepeis, W. C. Nelson, W. R. Ott, J. P. Robinson, A. M. Tsang, P. Switzer, J. V.
Behar, S. C. Hern, W. H. Engelmann, The national human activity pattern survey (nhaps): a resource for assessing exposure to environmental pollutants, Journal of Exposure Science and Environmental Epidemiology 11 (2001) 231.
[2] M. Arif, M. Katafygiotou, A. Mazroei, A. Kaushik, E. Elsarrag, et al., Impact of indoor environmental quality on occupant well-being and comfort: A review of the literature, International Journal of Sustainable Built Environment 5 (2016) 1–11.
[3] B. P. Haynes, The impact of office comfort on productivity, Journal of Facilities Management 6 (2008) 37–51.
[4] L. Gutierrez, E. Williams, et al., Co-alignment of comfort and energy saving objectives for us office buildings and restaurants, Sustainable cities and society 27 (2016) 32–41.
[5] G. Brager, L. Baker, Occupant satisfaction in mixed-mode buildings, Building Research & Information 37 (2009) 369–380.
[6] S. Verbeke, A. Audenaert, Thermal inertia in buildings: A review of impacts across climate and building use, Renewable and Sustainable Energy Reviews 82 (2018) 2300–2318. doi:10.1016/j.rser.2017.08.083.
[7] A. Bernstein, F. Provost, S. Hill, Toward intelligent assistance for a data mining process: An ontology-based approach for cost-sensitive classification, IEEE Transactions on Knowledge and Data Engineering 17 (2005) 503–518. doi:10.1109/TKDE.2005.67.
[8] U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, From data mining to knowledge discovery in databases, AI magazine 17 (1996) 37. doi:10.1609/aimag.v17i3.1230.
[9] A.-S. Dadzie, M. Rowe, Approaches to visualising linked data: A survey, Semantic Web 2 (2011) 89–124. doi:10.3233/SW-2011-0037.
[10] J. Han, M. Kamber, J. Pei, Chapter 3 - data preprocessing, in: J. Han, M. Kamber, J. Pei (Eds.), Data Mining (Third Edition), The Morgan Kaufmann Series in Data Management Systems, third edition ed., Morgan Kaufmann, Boston, 2012, pp. 39–82. doi:10.1016/B978-0-12-381479-1.00002-2.
[11] I. Esnaola-Gonzalez, J. Bermúdez, I. Fernandez, A.
Arnaiz, Semantic prediction assistant approach applied to energy efficiency in tertiary buildings, Semantic Web 9 (2018) 735–762. doi:10.3233/SW-180296.
[12] I. Esnaola-Gonzalez, J. Bermúdez, I. Fernandez, A. Arnaiz, EEPSA as a core ontology for energy efficiency and thermal comfort in buildings, Applied Ontology 16 (2021) 193–228. doi:10.3233/AO-210245.
[13] A. Gangemi, V. Presutti, Ontology Design Patterns, Springer Berlin Heidelberg, Berlin, Heidelberg, 2009, pp. 221–243. doi:10.1007/978-3-540-92673-3_10.
[14] A. Haller, K. Janowicz, S. Cox, M. Lefrançois, K. Taylor, D. L. Phuoc, J. Lieberman, R. García-Castro, R. Atkinson, C. Stadler, The modular ssn ontology: A joint w3c and ogc standard specifying the semantics of sensors, observations, sampling, and actuation, Semantic Web 10 (2019) 9–32. doi:10.3233/SW-180320.
[15] M. Lefrançois, Planned etsi saref extensions based on the w3c&ogc sosa/ssn-compatible seas ontology patterns, in: Proceedings of Workshop on Semantic Interoperability and Standardization in the IoT, SIS-IoT, 2017, pp. 1–8.
[16] M. C. Suárez-Figueroa, A. Gómez-Pérez, M. Fernández-López, The NeOn Methodology for Ontology Engineering, Springer Berlin Heidelberg, Berlin, Heidelberg, 2012, pp. 9–34. doi:10.1007/978-3-642-24794-1_2.
[17] M. D. Wilkinson, M. Dumontier, I. J. Aalbersberg, G. Appleton, M. Axton, A. Baak, N. Blomberg, J.-W. Boiten, L. B. da Silva Santos, P. E. Bourne, et al., The fair guiding principles for scientific data management and stewardship, Scientific data 3 (2016).
[18] D. Garijo, O. Corcho, M. Poveda-Villalón, Foops!: An ontology pitfall scanner for the fair principles 2980 (2021). URL: http://ceur-ws.org/Vol-2980/paper321.pdf.
[19] M. Fernández-López, M. C. Suárez-Figueroa, A. Gómez-Pérez, Ontology Development by Reuse, Springer Berlin Heidelberg, Berlin, Heidelberg, 2012, pp. 147–170. doi:10.1007/978-3-642-24794-1_7.
[20] N. F.
Noy, Semantic integration: a survey of ontology-based approaches, ACM Sigmod Record 33 (2004) 65–70. doi:10.1145/1041410.1041421.
[21] S. Cox, Ontology for observations and sampling features, with alignments to existing models, Semantic Web 8 (2016) 453–470. doi:10.3233/SW-160214.
[22] D. Garijo, M. Poveda-Villalón, Best Practices for Implementing FAIR Vocabularies and Ontologies on the Web, volume 49 of Studies on the Semantic Web, IOS Press, 2020, pp. 39–54. doi:10.3233/SSW200034.
[23] A. Fernández-Izquierdo, R. García-Castro, Themis: a tool for validating ontologies through requirements, in: SEKE, 2019, pp. 573–753. doi:10.18293/SEKE2019-117.
[24] M. H. Rasmussen, M. Lefrançois, G. Schneider, P. Pauwels, Bot: the building topology ontology of the w3c linked building data group, Semantic Web 12 (2021) 143–161. doi:10.3233/SW-200385.
[25] H. Qiu, G. Schneider, T. Kauppinen, S. Rudolph, S. Steiger, Reasoning on human experiences of indoor environments using semantic web technologies, in: Proceedings of the 35th International Symposium on Automation and Robotics in Construction (ISARC 2018), Berlin, Germany, 2018, pp. 95–102.