<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Linked Data in Architecture and Construction Workshop, May</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>An Ontology-Driven Approach to Support Data Analysts with Thermal Comfort Problems in the Built Environment</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Iker Esnaola-Gonzalez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jesús Bermúdez</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cristina Aceta</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>TEKNIKER, Basque Research and Technology Alliance (BRTA)</institution>
          ,
          <addr-line>Iñaki Goenaga 5, 20600 Eibar</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>29</volume>
      <issue>2022</issue>
      <fpage>0000</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>Since we spend most of our time within buildings, it is of utmost importance to feel comfortable while staying indoors. However, research shows that HVAC systems ensure occupants' satisfaction in only 11% of commercial buildings. The advancing spread of the Internet of Things (IoT) and the maturity of Knowledge Discovery in Databases (KDD) may contribute to the development of accurate predictive models that address this challenge. However, data analysts in charge of developing these predictive models may feel overwhelmed if they have insufficient domain expertise. In this article, the ontology-driven approach proposed by the EEPSA (Energy Efficiency Prediction Semantic Assistant) is presented, in which data analysts can benefit from previously captured domain knowledge. To that end, this article proposes the exploitation of Semantic Technologies to support data analysts in discovering the variables that should be considered when building accurate predictive models for thermal comfort problems within buildings. Compared with existing tools and methods, the EEPSA is able to suggest variables that are not necessarily included in the set of data available. This capability has great potential nowadays, as Linked Open Data and third-party repositories can be exploited to incorporate relevant information.</p>
      </abstract>
      <kwd-group>
        <kwd>Ontology</kwd>
        <kwd>Buildings</kwd>
        <kwd>Data Analysis</kwd>
        <kwd>Thermal Comfort</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Stanley Hall, an American psychologist, stated that man is largely a creature of habit,
and nowadays most of these human habits or daily activities (e.g. sleeping, shopping, working)
take place indoors. This statement was reinforced by a study conducted in the early 2000s, which
concluded that we spend 87% of our time indoors [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>Since we spend most of our time within buildings, it is of utmost importance for humans
to feel comfortable while staying indoors. Building users’ comfort comprises different aspects,
including acoustic, visual and thermal comfort. The latter is defined by the ANSI/ASHRAE Standard
55-2017<sup>1</sup> as “that condition of mind that expresses satisfaction with the thermal environment
and is assessed by subjective evaluation”.</p>
      <p>
        Consequently, research on the impact of thermal comfort on occupants’ well-being has
become an important area of study. In [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], Arif et al. present a state-of-the-art analysis of
research in the domain of occupants’ health and well-being and their relationship to the
Indoor Environmental Quality (IEQ). In [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], Haynes showed that the IEQ has a direct impact on
workers’ efficiency and productivity. Furthermore, thermal comfort influences the customer
experience in retail and restaurant spaces [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. However, although HVAC systems should ensure
thermal comfort, only 11% of commercial buildings meet the criterion that no more than 20%
of building occupants are dissatisfied [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        The optimal HVAC activation strategy for ensuring occupants’ thermal comfort while making
efficient use of energy is still an unsolved problem in most buildings. Furthermore, it is
important to note that certain types of buildings have specific features which may further
complicate this problem. For example, large spaces are prone to having greater thermal
inertia [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and cannot be effectively climatised with rather simple solutions such as
thermostat-based reactive systems. Instead, HVAC systems need to be activated in a specific mode and
at a specific time to ensure comfortable thermal conditions. The advancing spread of the Internet of Things
(IoT) and the maturity of Knowledge Discovery in Databases (KDD) may contribute to the development of
accurate predictive models which address this challenge.
      </p>
      <p>
        However, data analysts in charge of developing these predictive models may feel overwhelmed
if they have insufficient domain expertise [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Consequently, they may resort to a trial-and-error
approach when trying to develop high-performing predictive models. This is an undesirable
strategy, and an assistant that supports data analysts throughout the predictive model development
process could be of interest. In this regard, knowledge from domain experts could be captured
by leveraging Semantic Technologies and made available for data analysts to
exploit.
      </p>
      <p>In this article, the ontology-driven assistant proposed by the EEPSA (Energy Efficiency
Prediction Semantic Assistant) is presented. This assistant supports data analysts
throughout the KDD process and helps them to discover
the relevant variables to consider when developing a model which accurately predicts
thermal comfort within buildings.</p>
      <p>The rest of this article is structured as follows. Section 2 introduces the related work. Section 3
describes the ontology that supports the proposed data analyst assistant. In Section 4 an
illustrative use case is presented for demonstration purposes. Finally, conclusions of this work
are presented in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        At an abstract level, the KDD field is concerned with the development of methods and techniques
for making sense of data [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], although in this article KDD is understood as a less generic
process: one that develops predictive models to estimate unknown outcomes. The
typical KDD process comprises five steps, as shown in Figure 1: Data Selection, Preprocessing
of Data, Data Transformation, Data Mining and Interpretation. The first phase, which is
the focus of this article, consists of selecting a data set, or a subset of variables or data
samples, on which the knowledge discovery is going to be performed.
      </p>
      <p>1 https://www.ashrae.org/technical-resources/bookstore/standard-55-thermal-environmental-conditions-for-human-occupancy</p>
      <p>With the expansion of the IoT and the advent of new paradigms such as Linked Data, data
analysts may get lost in today’s plethora of data, which can hinder the application of a knowledge
extraction process. To avoid this problem, data analysts have to
understand the data itself: what knowledge it represents and what additional
knowledge can be extracted from it. However, this is not a trivial task and, in most cases,
domain-specific knowledge is needed to select the adequate sets of data and variables to analyse.</p>
      <p>
        Methods for exploring and visualising data may contribute to understanding the data which
data analysts need to deal with [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. These approaches aim to visualise data in a coherent
and legible way, thus allowing users to obtain a good understanding of its structure and
thereby implicitly compose queries, identify links between resources and discover new pieces
of information. However, gaining such an understanding of the data still requires deep knowledge of
the domain at hand.
      </p>
      <p>
        Apart from the visualisation approaches, there are more classical methods which may support
the KDD’s Data Selection phase. One of them is the attribute relevance analysis which attempts
to recognise those attributes with the greatest impact on the target variable, while removing the
ones with less relevance from a given set of data [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. This method aims to reduce the redundancy
and uncertainty in the predictive model development process, although the performance of the
attribute relevance analysis itself may be affected by the vast amount and heterogeneity of the
data that data analysts may face nowadays.
      </p>
      <p>This article proposes the exploitation of Semantic Technologies to support data analysts in
the discovery of the variables that should be considered for making accurate predictive models.
This approach differs from existing work, which either focuses on visualising data that
may not be understood without deep domain knowledge (e.g. data visualisation tools) or
cannot suggest new relevant variables that are not present in current data sets (e.g. relevance
analysis).</p>
    </sec>
    <sec id="sec-3">
      <title>3. EEPSA: An Ontology for Thermal Comfort in Buildings</title>
      <p>In order to incorporate Semantic Technologies in an assistant that supports data analysts through
the predictive model development process, it is necessary to leverage proper ontologies that
codify the required knowledge and enable the adequate annotation of the data.</p>
      <p>
        The aforementioned EEPSA, defined in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], is an assistant based on Semantic Technologies,
including ontologies, ontology-driven rules and ontology-driven data access, that guides data
analysts through the KDD process in a semi-automatic manner, towards the development of
enhanced predictive models for energy efficiency and occupants’ thermal comfort assurance in
tertiary buildings. In the context of such an assistant, the EEPSA ontology<sup>2</sup> presented in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] is
the cornerstone.
      </p>
      <p>
        The EEPSA ontology’s backbone has been defined as a combination of three Ontology Design
Patterns (ODPs). ODPs are minimal ontologies that address recurrent design problems that
may arise in ontology conceptualisation, formalisation or implementation activities [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. The
combination of the AffectedBy<sup>3</sup>, the Execution-Executor-Procedure (EEP<sup>4</sup>) and the
Result-Context (RC<sup>5</sup>) ODPs provides appropriate concepts to represent scenarios where executions,
including observations, actuations or predictions, play a key role.
      </p>
      <p>
        The careful design of these three ODPs’ property axioms overcomes weaknesses discovered in
existing ODP-based ontologies such as the SOSA/SSN ontology [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] or the SEAS
FeatureOfInterest ontology [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Furthermore, these ODPs try to be minimal in the number of classes and
properties offered, but include appropriate ontology axioms that allow proper inferences.
      </p>
      <p>On top of these three ODPs, six ontology modules have been developed, which represent the
set of suitable terms, concepts and relationships to support data analysts through the predictive
model construction process for the problem at hand. Five of these ontology modules specialise the
knowledge in the scope of the stub classes defined in the three ODPs:
FoI4EEPSA<sup>6</sup> for representing buildings and building spaces; Q4EEPSA<sup>7</sup> for
representing qualities of these spaces; EXR4EEPSA<sup>8</sup> for representing executors such as sensors
and actuators; P4EEPSA<sup>9</sup> for representing specific plans or methods; and EXN4EEPSA<sup>10</sup> for
representing executions such as observations and actuations. The
sixth ontology module, EK4EEPSA<sup>11</sup>, does not specialise any stub class; it is designed to
contain expert knowledge representing different types of spaces and the variables affecting
their indoor conditions. The EEPSA ontology is depicted in Figure 2.</p>
      <p>Summarising, the EEPSA ontology is the addition of the following ontological resources:
three ODPs, five ontology modules specialising the stub classes defined by these ODPs, and an
ontology module containing expert knowledge.</p>
      <p>2 https://w3id.org/eepsa
3 https://w3id.org/affectedBy
4 https://w3id.org/eep
5 https://w3id.org/rc
6 https://w3id.org/eepsa/foi4eepsa
7 https://w3id.org/eepsa/q4eepsa
8 https://w3id.org/eepsa/exr4eepsa
9 https://w3id.org/eepsa/p4eepsa
10 https://w3id.org/eepsa/exn4eepsa
11 https://w3id.org/eepsa/ek4eepsa</p>
      <p>
        The development of the EEPSA ontology has been based on the NeOn Methodology [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] and has aimed at
compliance with the FAIR principles [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], leveraging the FOOPS! (Ontology Pitfall
Scanner for FAIR) tool<sup>12</sup> [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Existing resources have been reused as much as possible following
the Ontological Resource Reuse Process [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], not only to capture and facilitate consensus
in communities, but also to reduce redundancies and increase interoperability. Precisely to
contribute to the interoperability of the solution, the ODPs and ontology modules that make up
the EEPSA ontology are aligned with related domain ontologies and upper-level ontologies,
since this practice alleviates integration problems, helps to ensure clarity in modelling and
avoids errors that have unintended reasoning implications [
        <xref ref-type="bibr" rid="ref20 ref21">20, 21</xref>
        ]. Furthermore, all the EEPSA
ontology terms contain the metadata proposed by the guidelines defined by Garijo and
Poveda-Villalón [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], and the EEPSA ontology’s resources (i.e. the three ODPs and the six ontology
modules) are documented with the WIDOCO (a WIzard for DOCumenting Ontologies) tool.
Additionally, the ontology has been validated with Themis<sup>13</sup> [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], which verified
that the EEPSA ontology satisfied all the functional requirements initially defined. Last but not
least, the EEPSA ontology is available online under a Creative Commons Attribution (CC BY 4.0)
license, and it is published in different catalogues such as LOV<sup>14</sup> (Linked Open Vocabularies) and
LOV4IoT<sup>15</sup>.
      </p>
      <p>12 https://foops.linkeddata.es
13 http://themis.linkeddata.es/
14 https://lov.linkeddata.es/
15 https://lov4iot.appspot.com/</p>
      <p>3.1. The EK4EEPSA Ontology Module</p>
      <p>
        The EK4EEPSA (Expert Knowledge ontology module for the EEPSA Ontology) captures the
necessary knowledge to provide inferencing capabilities that can be exploited by data analysts
in the KDD Data Selection phase. To that end, a group of thermal and energy domain
experts were interviewed to elicit and formalise their knowledge and capture it appropriately in the ontology
module. More specifically, the EK4EEPSA addresses requirements including
the ones described in Table 1 in the form of CQs (Competency Questions).
      </p>
      <p>
        The EK4EEPSA defines a classification of types of spaces within buildings. In the context
of the EEPSA ontology, a space is understood as “a part of the physical world or a virtual
world whose 3D spatial extent is bounded actually or theoretically, and provides for certain
functions within the zone it is contained in”, as defined by the BOT [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] (Building Topology
Ontology). The categorisation of types of spaces is based on the structural features of such
spaces, including spaces located in underground floors (ek4eepsa:BelowGroundLevelSpace) and
spaces with poor insulation (ek4eepsa:BadInsulatedSpace). Other categorisations of spaces
were also considered, such as the one proposed by the HBC (Human Comfort in Building)
ontology [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], where spaces are defined based on whether they contain certain types of building
objects (e.g. hbc:SpaceWithAirTerminal) or do not (e.g. hbc:SpaceWithoutShadingDevice).
      </p>
      <p>Note that in the scenario addressed in this article, it may be convenient to make heavy use
of axioms expressing sufficient conditions, so that individuals are recognised as members of the appropriate
classes by inference. That is, it may be suitable to use equivalent class axioms with appropriate right-hand
class expressions, rather than depending on explicit assertions only. For example,
ek4eepsa:BelowGroundLevelSpace is defined as follows:
<preformat>ek4eepsa:BelowGroundLevelSpace ≡
    bot:Space ⊓ ∃bot:hasStorey.foi4eepsa:UndergroundStorey</preformat></p>
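      <p>The effect of such an equivalent class axiom can be illustrated with a minimal Python sketch. It is not part of the EEPSA implementation and does not replace an OWL reasoner: it only checks the pattern C ≡ Base ⊓ ∃p.Filler over a handful of triples. The individuals :cellar, :levelMinus1, :office and :level2, as well as the function name classify, are hypothetical and introduced for illustration only.</p>
      <preformat>
```python
# Minimal sketch (an assumption-laden illustration, not the EEPSA code):
# apply a sufficient condition of the form  C ≡ Base ⊓ ∃p.Filler
# to a small set of (subject, predicate, object) triples.

RDF_TYPE = "rdf:type"


def classify(triples, target, base, prop, filler):
    """Return the triples plus (x, rdf:type, target) for every x that is
    typed as `base` and has a `prop` edge to an individual typed as `filler`."""
    typed = {(s, o) for (s, p, o) in triples if p == RDF_TYPE}
    inferred = set(triples)
    for (s, p, o) in triples:
        if p == prop and (s, base) in typed and (o, filler) in typed:
            inferred.add((s, RDF_TYPE, target))
    return inferred


# Hypothetical individuals, used only for illustration.
triples = {
    (":cellar", RDF_TYPE, "bot:Space"),
    (":cellar", "bot:hasStorey", ":levelMinus1"),
    (":levelMinus1", RDF_TYPE, "foi4eepsa:UndergroundStorey"),
    (":office", RDF_TYPE, "bot:Space"),
    (":office", "bot:hasStorey", ":level2"),
    (":level2", RDF_TYPE, "bot:Storey"),
}

result = classify(
    triples,
    target="ek4eepsa:BelowGroundLevelSpace",
    base="bot:Space",
    prop="bot:hasStorey",
    filler="foi4eepsa:UndergroundStorey",
)

# List every individual recognised as a below-ground-level space.
for (s, p, o) in sorted(result):
    if p == RDF_TYPE and o == "ek4eepsa:BelowGroundLevelSpace":
        print(s)
```
      </preformat>
      <p>In this sketch, no triple states explicitly that :cellar is an ek4eepsa:BelowGroundLevelSpace; the membership follows from the sufficient condition alone, which is the behaviour the equivalent class axiom is meant to provide.</p>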
      <p>Complementary to this space classification, for each space type, the qualities that affect its
indoor temperature are captured and represented in the EK4EEPSA ontology module. This
representation relies on other resources of the EEPSA ontology, namely the qualities
represented in the Q4EEPSA ontology module and the axioms defined in the AffectedBy ODP.
For example, the temperature of a space located in an underground level may be affected by
qualities such as the atmospheric pressure, the humidity of the space itself and the occupancy
of the room, as represented in the following axioms:</p>
      <p>In the latest version of the EK4EEPSA ontology module available at the time of writing
(i.e. version 1.1), knowledge regarding the qualities that affect the indoor relative humidity
of certain types of spaces, such as naturally ventilated spaces (ek4eepsa:NaturallyVentilatedSpace),
has been incorporated. Likewise, further knowledge can be captured and represented in a
similar way in order to cover additional future requirements.</p>
      <p>This knowledge modelling can be exploited by application programs or other services to
support data analysts facing thermal comfort problems in buildings. Once the
type of space at hand is known, data analysts learn which qualities are most
relevant, which guides the KDD Data Selection phase. This knowledge exploitation is
illustrated with a use case in the following section.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Use case: Thermal Comfort in Restaurants</title>
      <p>Let us consider a scenario in which a restaurant manager needs to ensure the thermal comfort
of its customers. For that purpose, a predictive model that predicts the restaurant temperature
is proposed, which will later be used as the foundation to support the optimal HVAC activation
strategies.</p>
      <p>Being a non-expert in the thermal comfort domain, the data analyst in charge would definitely
benefit from a service that suggests the most relevant variables or attributes for developing
an accurate predictive model. That is, a service that supports data analysts in the KDD Data
Selection phase.</p>
      <p>In order to make use of the EEPSA’s assistance in the KDD Data Selection phase, the use case first
needs to be represented with the adequate EEPSA ontology terms. This includes
representing the restaurant itself, its structural elements and the equipment deployed, such
as sensors or actuators, with terms from different EEPSA modules as well as from other reused
well-known ontologies such as BOT. This semantic annotation phase can be accomplished by
manually editing an RDF model with the help of an adapted GUI (Graphical User Interface) or a
data wrangling tool, or else with a properly programmed automatic middleware. An excerpt of
the semantic representation of the presented use case is as follows:
<preformat>:building35 rdf:type bot:Building ;
    (...)
    bot:hasStorey :level1 .
:level1 rdf:type bot:Storey ;
    bot:hasSpace :restaurant ;
    bot:hasSpace :kitchen ;
    (...)
    bot:hasSpace :bathroom .
:restaurant rdf:type bot:Space ;                          (*)
    bot:hasElement :door01 ;
    bot:hasElement :window01 ;                            (*)
    bot:hasElement :window02 ;
    bot:hasElement :tempSensor01 ;
    (...)
    bot:hasElement :windowShading01 .
:window01 rdf:type foi4eepsa:ExternalWindow .             (*)
:window02 rdf:type foi4eepsa:Window .
(...)
:tempSensor01 rdf:type exr4eepsa:TemperatureSensor .
:windowShading01 rdf:type exr4eepsa:BlindActuator .
:obs_tempSesnsor01_2589 rdf:type exn4eepsa:Observation ;
    eep:madeBy :tempSensor01 ;
    eep:usedProcedure :sensingProcedure01 ;
    eep:onQuality :restaurantTemp .
:sensingProcedure01 rdf:type p4eepsa:SensingProcedure .
:restaurantTemp rdf:type q4eepsa:IndoorTemperature .</preformat></p>
      <p>Once the scenario is semantically annotated, an inference engine needs to be applied so that
new information can be deduced from the RDF model. This newly inferred data is essential to
support data analysts and is derived from the knowledge captured in the EEPSA ontology.
Some triple stores have integrated inference engines; in other cases, these reasoning
capabilities need to be added manually. In the presented use case, the implicit knowledge was
inferred by manually applying the HermiT OWL 2 DL reasoner, version 1.3.8. The resulting RDF
model was then uploaded to an OpenLink Virtuoso Server, version 07.20.3217, where it remained
accessible via a SPARQL endpoint.</p>
      <p>Once the RDF model is generated, data analysts can ask which are the most relevant variables
affecting the restaurant’s temperature. For illustrative purposes, a
step-by-step explanation is provided in this section, although in practice it could be simplified to a single query
to the triplestore. Ideally, the production of these SPARQL queries should be managed by a
graphical interface that isolates data analysts from the underlying SPARQL query language, in
which they might not be experts, easing their interaction. In the presented use case, the data
analyst would initially ask for the type of the target space with the following SPARQL query:
<preformat>SELECT ?spaceType
WHERE {
    :restaurant rdf:type ?spaceType . }</preformat>
When evaluated over the given set of triples, this query retrieves ek4eepsa:AdjacentToOutdoorSpace and
ek4eepsa:NaturallyEnlightenedSpace. Therefore, it can be concluded that the restaurant is a
space in contact with the exterior and lit by the sun. These results are derived from
the use case’s triples (specifically the ones marked with an asterisk) and the knowledge inferred
thanks to axioms such as
<preformat>ek4eepsa:AdjacentToOutdoorSpace ≡
    bot:Space ⊓ ∃bot:hasElement.foi4eepsa:ExternalBuildingElement</preformat>
and the corresponding definition of ek4eepsa:NaturallyEnlightenedSpace.</p>
      <p>Knowing the type of space of the restaurant at hand, the data analyst would then ask
which variables may be most relevant to develop an accurate predictive model. For
that purpose, the following SPARQL query would be executed<sup>16</sup>:
<preformat>PREFIX aff: &lt;https://w3id.org/affectedBy#&gt;
SELECT ?relevantVariable
WHERE {
    :restaurant rdf:type ?spaceType .
    ?spaceType aff:influencedBy ?relevantVariable . }</preformat>
This SPARQL query would return the following variables:
<list list-type="bullet">
<list-item><p>q4eepsa:IndoorHumidity</p></list-item>
<list-item><p>q4eepsa:Occupancy</p></list-item>
<list-item><p>q4eepsa:SolarRadiation</p></list-item>
<list-item><p>q4eepsa:WindSpeed</p></list-item>
<list-item><p>q4eepsa:CloudCover</p></list-item>
<list-item><p>q4eepsa:SunPositionDirection</p></list-item>
<list-item><p>q4eepsa:SunPositionElevation</p></list-item>
</list></p>
      <p>These variables<sup>17</sup> are inferred thanks to the role chain axiom defined in the AffectedBy ODP:
<preformat>aff:influencedBy ∘ aff:affectedBy ⊑ aff:influencedBy</preformat>
as well as the axioms captured in the EK4EEPSA, related to the definition of spaces
adjacent to outdoors:
<preformat>ek4eepsa:AdjacentToOutdoorSpace ⊑
    ∃aff:influencedBy.ek4eepsa:AdjacentToOutdoorSpaceIndoorTemperature
ek4eepsa:AdjacentToOutdoorSpaceIndoorTemperature ⊑
    ∃aff:affectedBy.q4eepsa:IndoorHumidity
    ⊓ ∃aff:affectedBy.q4eepsa:Occupancy
    ⊓ ∃aff:affectedBy.q4eepsa:SolarRadiation
    ⊓ ∃aff:affectedBy.q4eepsa:WindSpeed</preformat>
and naturally enlightened spaces:
<preformat>ek4eepsa:NaturallyEnlightenedSpaceIndoorTemperature ⊑
    ∃aff:affectedBy.q4eepsa:CloudCover
    ⊓ ∃aff:affectedBy.q4eepsa:IndoorHumidity
    ⊓ ∃aff:affectedBy.q4eepsa:Occupancy
    ⊓ ∃aff:affectedBy.q4eepsa:SunPositionDirection
    ⊓ ∃aff:affectedBy.q4eepsa:SunPositionElevation</preformat></p>
      <p>16 Note that this SPARQL query alone would be enough for the data analyst; the previous one has been
displayed for demonstration purposes.</p>
      <p>17 q4eepsa is the preferred namespace prefix for the Q4EEPSA ontology module.</p>
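      <p>How the role chain combines with the EK4EEPSA axioms can be traced by hand with a small Python sketch. It is a simplification under stated assumptions, not the reasoning performed by HermiT: the existential axioms are flattened into plain dictionaries, and only the qualities reached by following influencedBy and then affectedBy are collected. The link from ek4eepsa:NaturallyEnlightenedSpace to ek4eepsa:NaturallyEnlightenedSpaceIndoorTemperature is assumed by analogy with the axiom given for spaces adjacent to outdoors, and the names INFLUENCED_BY, AFFECTED_BY and relevant_variables are hypothetical.</p>
      <preformat>
```python
# Minimal sketch: follow influencedBy, then affectedBy, mimicking the role
# chain  influencedBy ∘ affectedBy ⊑ influencedBy  on class-level knowledge.

# Space type -> qualities it is influenced by.
# The second entry is an ASSUMPTION made by analogy with the first one.
INFLUENCED_BY = {
    "ek4eepsa:AdjacentToOutdoorSpace": [
        "ek4eepsa:AdjacentToOutdoorSpaceIndoorTemperature",
    ],
    "ek4eepsa:NaturallyEnlightenedSpace": [
        "ek4eepsa:NaturallyEnlightenedSpaceIndoorTemperature",
    ],
}

# Quality -> qualities it is affected by (transcribed from the axioms).
AFFECTED_BY = {
    "ek4eepsa:AdjacentToOutdoorSpaceIndoorTemperature": [
        "q4eepsa:IndoorHumidity",
        "q4eepsa:Occupancy",
        "q4eepsa:SolarRadiation",
        "q4eepsa:WindSpeed",
    ],
    "ek4eepsa:NaturallyEnlightenedSpaceIndoorTemperature": [
        "q4eepsa:CloudCover",
        "q4eepsa:IndoorHumidity",
        "q4eepsa:Occupancy",
        "q4eepsa:SunPositionDirection",
        "q4eepsa:SunPositionElevation",
    ],
}


def relevant_variables(space_types):
    """Collect, without duplicates, every quality reachable through one
    influencedBy step followed by one affectedBy step."""
    variables = set()
    for space_type in space_types:
        for quality in INFLUENCED_BY.get(space_type, []):
            variables.update(AFFECTED_BY.get(quality, []))
    return sorted(variables)


# Space types inferred for the restaurant in the use case.
suggested = relevant_variables([
    "ek4eepsa:AdjacentToOutdoorSpace",
    "ek4eepsa:NaturallyEnlightenedSpace",
])
for name in suggested:
    print(name)
```
      </preformat>
      <p>Because q4eepsa:IndoorHumidity and q4eepsa:Occupancy appear under both space types, collecting the results in a set yields each of them once, which matches the seven variables listed for the restaurant.</p>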
      <p>Therefore, after semantically annotating the use case restaurant, the data analyst discovers
which may be the most relevant variables for developing an accurate predictive model for the
problem at hand. Remarkably, unlike other existing approaches, the EEPSA suggests
variables that may not be present in the data set currently available to the data analyst. Even so,
knowing which variables may contribute to developing an accurate predictive model is
helpful for the data analyst, because such variables can often be obtained elsewhere. For example, cloud coverage and the sun position (direction and elevation)
could be retrieved from external Linked Open Data sources. Another example is the occupancy,
which could be obtained from the reservation list that the restaurant manager may have. The
EEPSA also supports this variable generation task in the KDD’s Transformation phase, although
details of this support are beyond the scope of this article.</p>
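      <p>The distinction between suggested variables that are already in the data set and those that still need to be sourced can be made explicit with a short Python sketch. The content of available is a hypothetical assumption about what the analyst's data set covers, and split_by_availability is an illustrative helper, not an EEPSA function.</p>
      <preformat>
```python
# Minimal sketch: separate the suggested variables into those already
# covered by the analyst's data set and those that must be acquired.

suggested = [
    "q4eepsa:IndoorHumidity",
    "q4eepsa:Occupancy",
    "q4eepsa:SolarRadiation",
    "q4eepsa:WindSpeed",
    "q4eepsa:CloudCover",
    "q4eepsa:SunPositionDirection",
    "q4eepsa:SunPositionElevation",
]

# HYPOTHETICAL: qualities the current data set is assumed to contain.
available = {"q4eepsa:IndoorHumidity"}


def split_by_availability(suggested, available):
    """Return (covered, missing), preserving the order of `suggested`."""
    covered = [v for v in suggested if v in available]
    missing = [v for v in suggested if v not in available]
    return covered, missing


covered, missing = split_by_availability(suggested, available)
print("Already in the data set:", covered)
print("To be obtained from other sources:", missing)
```
      </preformat>
      <p>The variables in the second group are the candidates for enrichment from external sources, such as Linked Open Data repositories for weather-related qualities or the reservation list for occupancy.</p>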
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>When deep thermal comfort and building domain knowledge is required
to efficiently develop a predictive model, insufficient expertise could make data analysts
feel overwhelmed. The EEPSA tries to address this issue by supporting data analysts in the
KDD’s Data Selection phase.</p>
      <p>For that purpose, it leverages the EEPSA ontology, which codifies the required knowledge
and enables the adequate annotation of the data. In this article, this ontology-driven assistant
has been presented and demonstrated with a restaurant use case.</p>
      <p>Compared with existing tools and methods, the EEPSA is able to suggest variables
that are not necessarily included in the set of data available. This capability has great potential
nowadays, as Linked Open Data and third-party repositories are valuable sources
of knowledge which can be exploited to incorporate relevant information into the set of data
available.</p>
      <p>The EEPSA is oriented to energy efficiency and thermal comfort problems in tertiary buildings.
However, this same approach could be extended to other domains. As a matter of fact, it is
expected to pave the way towards the development of ontology-driven approaches that fill the
gap of data analysts with insufficient domain knowledge.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work is supported by the REACT project which has received funding from the European
Union’s Horizon 2020 research and innovation programme under grant agreement no. 824395.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>N. E.</given-names>
            <surname>Klepeis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. C.</given-names>
            <surname>Nelson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. R.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Robinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Tsang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Switzer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. V.</given-names>
            <surname>Behar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Hern</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. H.</given-names>
            <surname>Engelmann</surname>
          </string-name>
          ,
          <article-title>The national human activity pattern survey (NHAPS): a resource for assessing exposure to environmental pollutants</article-title>
          ,
          <source>Journal of Exposure Science and Environmental Epidemiology</source>
          <volume>11</volume>
          (
          <year>2001</year>
          )
          <fpage>231</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Arif</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Katafygiotou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mazroei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kaushik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Elsarrag</surname>
          </string-name>
          , et al.,
          <article-title>Impact of indoor environmental quality on occupant well-being and comfort: A review of the literature</article-title>
          ,
          <source>International Journal of Sustainable Built Environment</source>
          <volume>5</volume>
          (
          <year>2016</year>
          )
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>B. P.</given-names>
            <surname>Haynes</surname>
          </string-name>
          ,
          <article-title>The impact of office comfort on productivity</article-title>
          ,
          <source>Journal of Facilities Management</source>
          <volume>6</volume>
          (
          <year>2008</year>
          )
          <fpage>37</fpage>
          -
          <lpage>51</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>L.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Williams</surname>
          </string-name>
          , et al.,
          <article-title>Co-alignment of comfort and energy saving objectives for US office buildings and restaurants</article-title>
          ,
          <source>Sustainable Cities and Society</source>
          <volume>27</volume>
          (
          <year>2016</year>
          )
          <fpage>32</fpage>
          -
          <lpage>41</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>G.</given-names>
            <surname>Brager</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Baker</surname>
          </string-name>
          ,
          <article-title>Occupant satisfaction in mixed-mode buildings</article-title>
          ,
          <source>Building Research &amp; Information</source>
          <volume>37</volume>
          (
          <year>2009</year>
          )
          <fpage>369</fpage>
          -
          <lpage>380</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Verbeke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Audenaert</surname>
          </string-name>
          ,
          <article-title>Thermal inertia in buildings: A review of impacts across climate and building use</article-title>
          ,
          <source>Renewable and Sustainable Energy Reviews</source>
          <volume>82</volume>
          (
          <year>2018</year>
          )
          <fpage>2300</fpage>
          -
          <lpage>2318</lpage>
          . doi:10.1016/j.rser.2017.08.083.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bernstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Provost</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hill</surname>
          </string-name>
          ,
          <article-title>Toward intelligent assistance for a data mining process: An ontology-based approach for cost-sensitive classification</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>17</volume>
          (
          <year>2005</year>
          )
          <fpage>503</fpage>
          -
          <lpage>518</lpage>
          . doi:10.1109/TKDE.2005.67.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>U.</given-names>
            <surname>Fayyad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Piatetsky-Shapiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Smyth</surname>
          </string-name>
          ,
          <article-title>From data mining to knowledge discovery in databases</article-title>
          ,
          <source>AI Magazine</source>
          <volume>17</volume>
          (
          <year>1996</year>
          )
          <fpage>37</fpage>
          . doi:10.1609/aimag.v17i3.1230.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.-S.</given-names>
            <surname>Dadzie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rowe</surname>
          </string-name>
          ,
          <article-title>Approaches to visualising linked data: A survey</article-title>
          ,
          <source>Semantic Web</source>
          <volume>2</volume>
          (
          <year>2011</year>
          )
          <fpage>89</fpage>
          -
          <lpage>124</lpage>
          . doi:10.3233/SW-2011-0037.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kamber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pei</surname>
          </string-name>
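          ,
          <article-title>Chapter 3 - Data preprocessing</article-title>
          , in:
          <string-name>
            <given-names>J.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kamber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pei</surname>
          </string-name>
          (Eds.),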
          <source>Data Mining (Third Edition)</source>
          , The Morgan Kaufmann Series in Data Management Systems, 3rd ed., Morgan Kaufmann, Boston,
          <year>2012</year>
          , pp.
          <fpage>39</fpage>
          -
          <lpage>82</lpage>
          . doi:10.1016/B978-0-12-381479-1.00002-2.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>I.</given-names>
            <surname>Esnaola-Gonzalez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bermúdez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Arnaiz</surname>
          </string-name>
          ,
          <article-title>Semantic prediction assistant approach applied to energy efficiency in tertiary buildings</article-title>
          ,
          <source>Semantic Web</source>
          <volume>9</volume>
          (
          <year>2018</year>
          )
          <fpage>735</fpage>
          -
          <lpage>762</lpage>
          . doi:10.3233/SW-180296.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>I.</given-names>
            <surname>Esnaola-Gonzalez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bermúdez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Arnaiz</surname>
          </string-name>
          ,
          <article-title>EEPSA as a core ontology for energy efficiency and thermal comfort in buildings</article-title>
          ,
          <source>Applied Ontology</source>
          <volume>16</volume>
          (
          <year>2021</year>
          )
          <fpage>193</fpage>
          -
          <lpage>228</lpage>
          . doi:10.3233/AO-210245.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gangemi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Presutti</surname>
          </string-name>
          ,
          <article-title>Ontology Design Patterns</article-title>
          , Springer Berlin Heidelberg, Berlin, Heidelberg,
          <year>2009</year>
          , pp.
          <fpage>221</fpage>
          -
          <lpage>243</lpage>
          . doi:10.1007/978-3-540-92673-3_10.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Haller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Janowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Cox</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lefrançois</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Taylor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. L.</given-names>
            <surname>Phuoc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lieberman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>García-Castro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Atkinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Stadler</surname>
          </string-name>
          ,
          <article-title>The modular SSN ontology: A joint W3C and OGC standard specifying the semantics of sensors, observations, sampling, and actuation</article-title>
          ,
          <source>Semantic Web</source>
          <volume>10</volume>
          (
          <year>2019</year>
          )
          <fpage>9</fpage>
          -
          <lpage>32</lpage>
          . doi:10.3233/SW-180320.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lefrançois</surname>
          </string-name>
          ,
          <article-title>Planned ETSI SAREF extensions based on the W3C&amp;OGC SOSA/SSN-compatible SEAS ontology patterns</article-title>
          , in:
          <source>Proceedings of Workshop on Semantic Interoperability and Standardization in the IoT</source>
          , SIS-IoT,
          <year>2017</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Suárez-Figueroa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gómez-Pérez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fernández-López</surname>
          </string-name>
          ,
          <article-title>The NeOn Methodology for Ontology Engineering</article-title>
          , Springer Berlin Heidelberg, Berlin, Heidelberg,
          <year>2012</year>
          , pp.
          <fpage>9</fpage>
          -
          <lpage>34</lpage>
          . doi:10.1007/978-3-642-24794-1_2.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Wilkinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumontier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. J.</given-names>
            <surname>Aalbersberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Appleton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Axton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Baak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Blomberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-W.</given-names>
            <surname>Boiten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. B.</given-names>
            <surname>da Silva Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. E.</given-names>
            <surname>Bourne</surname>
          </string-name>
          , et al.,
          <article-title>The FAIR guiding principles for scientific data management and stewardship</article-title>
          ,
          <source>Scientific Data</source>
          <volume>3</volume>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>D.</given-names>
            <surname>Garijo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Corcho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Poveda-Villalón</surname>
          </string-name>
          ,
          <article-title>FOOPS!: An ontology pitfall scanner for the FAIR principles</article-title>
          ,
          <volume>2980</volume>
          (
          <year>2021</year>
          ). URL: http://ceur-ws.org/Vol-2980/paper321.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fernández-López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Suárez-Figueroa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gómez-Pérez</surname>
          </string-name>
          ,
          <article-title>Ontology Development by Reuse</article-title>
          , Springer Berlin Heidelberg, Berlin, Heidelberg,
          <year>2012</year>
          , pp.
          <fpage>147</fpage>
          -
          <lpage>170</lpage>
          . doi:10.1007/978-3-642-24794-1_7.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>N. F.</given-names>
            <surname>Noy</surname>
          </string-name>
          ,
          <article-title>Semantic integration: a survey of ontology-based approaches</article-title>
          ,
          <source>ACM Sigmod Record</source>
          <volume>33</volume>
          (
          <year>2004</year>
          )
          <fpage>65</fpage>
          -
          <lpage>70</lpage>
          . doi:10.1145/1041410.1041421.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>S.</given-names>
            <surname>Cox</surname>
          </string-name>
          ,
          <article-title>Ontology for observations and sampling features, with alignments to existing models</article-title>
          ,
          <source>Semantic Web</source>
          <volume>8</volume>
          (
          <year>2016</year>
          )
          <fpage>453</fpage>
          -
          <lpage>470</lpage>
          . doi:10.3233/SW-160214.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>D.</given-names>
            <surname>Garijo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Poveda-Villalón</surname>
          </string-name>
          ,
          <article-title>Best Practices for Implementing FAIR Vocabularies and Ontologies on the Web</article-title>
          , volume
          <volume>49</volume>
          of
          <series>Studies on the Semantic Web</series>
          , IOS Press,
          <year>2020</year>
          , pp.
          <fpage>39</fpage>
          -
          <lpage>54</lpage>
          . doi:10.3233/SSW200034.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>A.</given-names>
            <surname>Fernández-Izquierdo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>García-Castro</surname>
          </string-name>
          ,
          <article-title>Themis: a tool for validating ontologies through requirements</article-title>
          , in:
          <source>SEKE</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>573</fpage>
          -
          <lpage>753</lpage>
          . doi:10.18293/SEKE2019-117.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Rasmussen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lefrançois</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pauwels</surname>
          </string-name>
          ,
          <article-title>BOT: The Building Topology Ontology of the W3C Linked Building Data group</article-title>
          ,
          <source>Semantic Web</source>
          <volume>12</volume>
          (
          <year>2021</year>
          )
          <fpage>143</fpage>
          -
          <lpage>161</lpage>
          . doi:10.3233/SW-200385.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>H.</given-names>
            <surname>Qiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kauppinen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rudolph</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Steiger</surname>
          </string-name>
          ,
          <article-title>Reasoning on human experiences of indoor environments using semantic web technologies</article-title>
          ,
          in:
          <source>Proceedings of the 35th International Symposium on Automation and Robotics in Construction (ISARC 2018)</source>
          , Berlin, Germany,
          <year>2018</year>
          , pp.
          <fpage>95</fpage>
          -
          <lpage>102</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>