<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Knowledge Discovery Approach to Understand Occupant Experience in Cross-Domain Semantic Digital Twins</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alex Donkers</string-name>
          <email>a.j.a.donkers@tue.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bauke de Vries</string-name>
          <email>b.d.vries@tue.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dujuan Yang</string-name>
          <email>d.yang@tue.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Eindhoven University of Technology</institution>
          ,
          <addr-line>Groene Loper 6, Eindhoven, 5412AZ</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Occupant-centric decision-making in buildings requires integrating occupant information with other building information. Semantic web technologies promise to reduce data interoperability issues. However, methods to discover occupants' experiences and integrate those with linked building data are scarce. This paper aims to show how combining knowledge discovery in databases and semantic web technologies could lead to an improved understanding of occupants' experiences in buildings. This approach is applied to a case study using the Open Family Home. An occupant collected feedback on various comfort indicators using a smartwatch app. Building information, sensor data, weather data, and occupant information and feedback were integrated into a cross-domain semantic digital twin. A Python script collected all the data from the digital twin and ran a data analysis, after which parameters that affected the occupant's experience were collected. The results were transformed into triples and integrated with the linked building data. This combination of knowledge discovery in databases and semantic web technologies results in enriched digital twins that can be used for occupant-centric decision-making. The approach presented in this study was generalized into a four-step method that can be applied to a variety of use cases in the architecture, engineering, and construction domain.</p>
      </abstract>
      <kwd-group>
        <kwd>Linked Data</kwd>
        <kwd>Semantic Web</kwd>
        <kwd>Occupant Experience</kwd>
        <kwd>Semantic Digital Twin</kwd>
        <kwd>Knowledge Discovery in Databases</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        While buildings should be designed to meet the occupants’ expectations, research suggests that
many buildings fail to satisfy their occupants [18]. Occupant-centric building operations require the
understanding of occupants and their relationship with the buildings they use. This relationship is
multidimensional and is influenced by physiological, psychological, and environmental parameters
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. These parameters are often measured by disparate sources and stored in isolated data silos.
Standards to integrate these data and operate buildings are still lacking [18].
      </p>
      <p>
        Over the last decades, various research initiatives applied semantic web technologies to integrate
building information with other (non-building) information. Pauwels et al. [19] and Boje et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]
extensively described how semantic web technologies are applied to integrate heterogeneous building
information into so-called semantic digital twins. Integrating sensor data with building information
enabled researchers to monitor building performance [
        <xref ref-type="bibr" rid="ref11">11,21</xref>
        ], reduce energy consumption [
        <xref ref-type="bibr" rid="ref13">13,17</xref>
        ], and
improve comfort [
        <xref ref-type="bibr" rid="ref5">5,17</xref>
        ]. While semantic web technologies to integrate building information and sensors
are researched extensively, the integration of occupants is relatively new.
      </p>
      <p>
        Earlier research initiatives presented methods to measure occupants’ feedback [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and integrate
this feedback with linked building data [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Other studies used occupant feedback to predict individual
occupant preferences [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. However, an integrated method that combines cross-domain semantic digital
twins and knowledge discovery approaches to understand occupant preferences is lacking.
Understanding the occupants’ behavior and preferences and integrating them with linked building data
would support occupant-centric decision-making processes during the operational phase of buildings
and improve design feedback.
      </p>
      <p>Therefore, this study presents a method to perform knowledge discovery in cross-domain semantic
digital twins. The method was applied in a case study, where heterogeneous data from disparate sources
was integrated using semantic web technologies. These data were then used in a knowledge discovery
procedure, after which the discovered knowledge was integrated with the linked building data.</p>
      <p>Section 2 investigates state-of-the-art research into cross-domain semantic digital twins and
knowledge discovery using those digital twins. The research method is based on this review and is
described in section 3. Section 4 covers the results of this study, followed by a discussion and
conclusion.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Knowledge discovery in cross-domain semantic digital twins</title>
    </sec>
    <sec id="sec-3">
      <title>2.1. Cross-domain semantic digital twins</title>
      <p>
        The development of semantic web technologies for the AEC industry [19] enabled researchers to
expand their digital twins with data from other domains. Boje et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] emphasized the opportunities of
cross-domain data integration. Multiple research initiatives integrated sensor data with linked building
data to reduce energy consumption [
        <xref ref-type="bibr" rid="ref13 ref6">6,13,17</xref>
        ], monitor and improve building performance [
        <xref ref-type="bibr" rid="ref5 ref6">5,6,21</xref>
        ] and
optimize comfort levels [
        <xref ref-type="bibr" rid="ref3">3,17</xref>
        ]. Integrating system information enables controlling systems based on
the data in the digital twin [
        <xref ref-type="bibr" rid="ref2 ref8">2,8</xref>
        ]. Semantic integration of weather data could lead to enhanced monitoring
and control strategies of buildings [
        <xref ref-type="bibr" rid="ref11 ref13 ref5 ref6">5,6,11,13</xref>
        ].
      </p>
      <p>
        The increasing interest in occupant-centric building operations [18] led to the development of
semantic web technologies for occupants and their behavior. Nolich et al. [17] semantically represented
occupants, including health status and comfort preferences. Li et al. [16] developed an ontology that
represents occupant behavior and comfort. Similarly, work by Degha et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] enabled semantic
representations of occupants, their states, activities, properties, and preferences. Some recent research
initiatives created ontologies to represent occupant feedback [
        <xref ref-type="bibr" rid="ref12 ref3">3,12,17,27</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>2.2. Knowledge discovery in semantic digital twins</title>
      <p>
        Knowledge discovery in databases (KDD) is defined as “the nontrivial process of identifying valid,
novel, potentially useful, and ultimately understandable patterns in data” [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. KDD processes are
applied to derive higher-level knowledge from (raw) data. Fayyad et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] described a five-step model
for knowledge discovery, consisting of (1) Selection, (2) Preprocessing, (3) Transformation, (4) Data mining, and
(5) Evaluation and interpretation.
      </p>
      <p>
        Ristoski and Paulheim [25] and Esnaola-Gonzalez [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] described how semantic web technologies
could be applied to enrich multiple steps in the KDD process. First, semantic web technologies could
help the Selection procedure by integrating data from different data silos. Petrova et al. [21],
Esnaola-Gonzalez et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], and Wang et al. [26] applied semantic web technologies to integrate building
information with various sensor data streams for this purpose. Semantic web technologies can then be
applied in the Preprocessing and Transformation phases. Examples mentioned in literature [
        <xref ref-type="bibr" rid="ref10">10,25</xref>
        ]
include methods for outlier detection, handling of missing data, and feature generation and selection.
Esnaola-Gonzalez et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] developed the SemOD framework to help data analysts find outliers using
SPARQL queries and applied this in cases related to building performance.
      </p>
      <p>
        According to Ristoski and Paulheim [25], the Data mining algorithms themselves hardly incorporate
linked data directly. Recent literature presents similar findings and typically performs data mining in a
dedicated software layer. Wang et al. [26] applied graph neural networks on semantic digital twins to
find room type classifications based on the available information in the graph. Esnaola-Gonzalez et al.
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] used a data mining model to predict indoor temperature to reduce energy consumption by HVAC
systems. Petrova et al. [21] applied data mining to find patterns in operational building data.
      </p>
      <p>
        Esnaola-Gonzalez [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] mentions that the Interpretation phase is often carried out by humans
interpreting the KDD results based on their domain expertise. However, semantic representations of the
results might explain KDD results without human intervention. Both Esnaola-Gonzalez [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
and Petrova et al. [20,21] therefore suggested storing the semantically annotated results of the KDD
process in the graph so that the discovered knowledge could be used for design decision support.
      </p>
    </sec>
    <sec id="sec-5">
      <title>3. Method</title>
    </sec>
    <sec id="sec-6">
      <title>3.1. Integrating knowledge discovery results with linked building data</title>
    </sec>
    <sec id="sec-7">
      <title>3.2. Data collection</title>
      <p>
        A BIM model of a semi-detached house – the Open Family Home – was created using Revit 2020
and converted to RDF Turtle following the BOT [24] and BOP [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] ontologies based on a procedure
explained in earlier work [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Comfort-related data in the energy performance certificate was manually
added to the Turtle file.
      </p>
      <p>Various sensors were installed in the OFH:LivingRoom and OFH:Office to measure illuminance
(Eltek LS50), relative humidity and temperature (Eltek RHT10-D), CO2 (Eltek GD47), and indoor air
quality (Eltek GD47AC). A Gen II SRV250 receiver was used to store all sensor data on a local Eltek
Gateway server, after which the data (30,000 measurements per sensor) were written to an InfluxDB
cloud database.</p>
      <p>[Figure 2: Graphical overview of the resulting data structure, with three panels: Weather and sensor data; Topology; Occupant and feedback.]</p>
      <p>
        A smartwatch app – Mintal [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] – was used to obtain occupant feedback. Two subjects reported 396
feedback responses on air quality, thermal comfort, and visual comfort between January 19 and
February 8, 2022. Fitbit’s device API provides information related to the user account. While this
research does not focus on medical data, the API allows further research into age, metabolic rate,
gender, height, weight, and heart rate reports. Meteorological data, including the outdoor temperature,
weather type, wind speed and direction, humidity, and air pressure, were acquired via CustomWeather<sup>2</sup>.
      </p>
    </sec>
    <sec id="sec-8">
      <title>3.3. Data integration</title>
      <p>Based on our findings in section 2.1, semantic web technologies were deployed to integrate all the
data. Figure 2 shows a graphical overview of the resulting data structure. The RDF representation of
the building’s topology forms the core of the linked data. This topology is modeled following the BOT
ontology [24] and consists of a bot:Site, a bot:Building, and multiple bot:Storeys and bot:Spaces. The
BEO<sup>3</sup> ontology extends bot:Element and was used to model walls, floor slabs, windows, doors, and
sensors.</p>
      <p>
        Static properties are directly linked to topological elements using datatype properties, following the
level 1 complexity as mentioned in earlier research [
        <xref ref-type="bibr" rid="ref5">5,23</xref>
        ]. These static properties include simple
geometry, material characteristics, orientation, and identifiers. Dynamic properties include the sensor
measurements and weather data. Following suggestions by previous research [
        <xref ref-type="bibr" rid="ref8">8,21</xref>
        ], these data were
stored in a time-series database (InfluxDB). The graph only contains metadata about those
measurements, including the bop:Sensor, bop:Property, bop:DataPoint, and bop:Database.
      </p>
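      <p>As an illustration of how graph metadata can point to a series in the time-series database, the following minimal Python sketch builds a Flux query from a metadata record. The DataPointMetadata class, its bucket, measurement, and field attributes, and the example values are assumptions made for this sketch; the actual metadata schema and naming used for the Open Family Home may differ.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPointMetadata:
    """Hypothetical metadata record, as it might be read from the graph."""
    bucket: str        # assumed name of the InfluxDB bucket
    measurement: str   # assumed measurement name for the bop:DataPoint
    field: str         # assumed field key that holds the numeric value


def build_latest_value_query(meta: DataPointMetadata, lookback: str = "-1h") -> str:
    """Return a Flux query that fetches the most recent value of one data point."""
    return "\n".join([
        f'from(bucket: "{meta.bucket}")',
        f"  |> range(start: {lookback})",
        f'  |> filter(fn: (r) => r._measurement == "{meta.measurement}")',
        f'  |> filter(fn: (r) => r._field == "{meta.field}")',
        "  |> last()",
    ])


# Illustrative values only; they are not taken from the actual database.
meta = DataPointMetadata(
    bucket="OpenFamilyHomeDB",
    measurement="LRTempDataPoint",
    field="value",
)
query = build_latest_value_query(meta)
print(query)
```
]]></preformat>
      <p>Keeping only such pointers in the graph means the graph stays small while the time-series database handles the high-volume measurements.</p>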
      <p>
        The occupants and their feedback are described using the OFO ontology [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Static properties of an
ofo:Person, such as birthdate and gender, are again described using datatype properties. Dynamic
properties include sensor data generated by the smartwatch and binary comfort states of the occupant
(comfortable/uncomfortable). This research only uses the heart rate sensor and monitors three comfort
parameters, namely thermal comfort, visual comfort, and air quality comfort. The data are stored in
InfluxDB, and the graph contains metadata that helps find the right data point in the time-series
database, similar to the weather and sensor data. All metadata of the occupants and their feedback is
automatically generated by the smartwatch app [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
    </sec>
    <sec id="sec-9">
      <title>3.4. Data analysis</title>
      <p>
        A knowledge discovery method by Lee and Ham [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] was applied to determine the influence of
individual parameters on various comfort indicators. The method compares the value of a single
parameter with a binary feedback state using boxplots. Significance is assumed if the mean value of the
parameter during positive feedback lies outside the first quartile and third quartile range of the
parameter during negative feedback (and vice-versa). Further elements of reasoning, including a
comparison of the central 50%, and the spread and the shift of the boxes, are applied to understand the
results [22].
      </p>
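      <p>The significance rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' script: the function names are hypothetical, the quartiles are computed with the "inclusive" method of the standard library (other quartile conventions give slightly different bounds), and "vice versa" is interpreted here as meaning that the rule holding in either direction is sufficient. The sample values are invented for illustration.</p>
      <preformat><![CDATA[
```python
from statistics import mean, quantiles


def iqr_bounds(values):
    """Return (first quartile, third quartile) of a list of numbers."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3


def mean_outside_iqr(sample, reference):
    """True if the mean of `sample` lies outside the Q1-Q3 range of `reference`."""
    q1, q3 = iqr_bounds(reference)
    return not (q1 <= mean(sample) <= q3)


def is_significant(positive, negative):
    """Apply the rule in both directions (assumed reading of 'vice versa')."""
    return mean_outside_iqr(positive, negative) or mean_outside_iqr(negative, positive)


# Invented illuminance values [lux] during comfortable and uncomfortable feedback.
comfortable = [500, 550, 600, 650, 700]
uncomfortable = [100, 150, 200, 250, 300]
print(is_significant(comfortable, uncomfortable))
```
]]></preformat>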
      <p>A Python script was developed to query the data from the various databases and perform the data
analysis. Static properties were queried directly from GraphDB using SPARQLWrapper<sup>4</sup>. To query the
latest state of dynamic properties, we first queried metadata from GraphDB (using SPARQLWrapper)
and directly used these results to build Flux queries in InfluxDB (using InfluxDB-Python<sup>5</sup>). Missing
sensor readings were replaced by their previous value. After performing the data analysis, significant
values were stored as triples using the Occupant Preference Ontology (OPO) and integrated with the
original data in GraphDB. This procedure is described in section 4.3.</p>
      <p>2 https://www.timeanddate.com/weather/</p>
      <p>3 https://pi.pauwel.be/voc/buildingelement</p>
      <p>4 https://pypi.org/project/SPARQLWrapper/</p>
      <p>5 https://github.com/influxdata/influxdb-python</p>
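      <p>Replacing missing sensor readings by their previous value corresponds to a forward fill. The sketch below shows one possible implementation; the function name and the example CO2 values are hypothetical, and missing readings are assumed to be represented as None.</p>
      <preformat><![CDATA[
```python
def forward_fill(readings):
    """Replace None entries with the most recent earlier value.

    Leading gaps stay None because no previous value exists yet.
    """
    filled = []
    previous = None
    for value in readings:
        if value is None:
            value = previous      # reuse the last known reading
        else:
            previous = value      # remember this reading for later gaps
        filled.append(value)
    return filled


# Invented CO2 readings [ppm] with gaps.
co2_ppm = [612.0, None, None, 640.0, None]
print(forward_fill(co2_ppm))
```
]]></preformat>
      <p>A forward fill keeps every timestamp usable for the comparison with feedback moments, at the cost of repeating stale values during long sensor outages.</p>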
    </sec>
    <sec id="sec-10">
      <title>4. Results</title>
      <p>Section 4 presents the main results of the case study. The aim of this proof-of-concept study is to
test the feasibility of the method introduced in section 3.</p>
    </sec>
    <sec id="sec-11">
      <title>4.1. Dataset overview</title>
      <p>
        Figure 3 shows an overview of the collected sensor data during the test period. The sensor data are
queried using SPARQL and Flux queries as explained in section 3.4. First, these data could be used to
determine a building’s performance based on predefined criteria, as shown in earlier work [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
Secondly, the data helps to understand the occupants’ experience of the indoor environment, as shown
in the next subsections.
      </p>
      <p>Various conclusions could be drawn based on figure 3. First of all, the OFH:LivingRoom seems to
outperform the OFH:Office in both thermal comfort and air quality. The OFH:Office is relatively cold
and humid, while also facing higher CO2 levels and air pollution. This could lead to sick building
symptoms. However, the illuminance in the OFH:Office is significantly better than the illuminance in
the OFH:LivingRoom. This observation is consistent with the fact that office spaces generally achieve
a higher illuminance than living rooms.</p>
      <p>Based on those measurements, negative feedback is expected on thermal comfort, visual comfort,
and air quality. Since the OFH:Office is likely to produce the most complaints, the remainder of this
results section will focus on measurements in the OFH:Office.</p>
      <p>[Figure 3: Overview of the sensor data collected during the test period for OFH:LivingRoom and OFH:Office. Panels: Illuminance [lux], Temperature [°C], Air quality [V], Relative humidity [%], and CO2 [ppm].]</p>
      <sec id="sec-11-2">
        <p>This subsection shows the results of the knowledge discovery method explained in section 3.4. Three
analyses were performed to find significant parameters influencing thermal comfort (figure 4), visual
comfort (figure 5), and air quality (figure 6), respectively.</p>
        <p>
          Figure 4 shows the parameters that are expected to influence OFH:Subject1’s thermal comfort.
Remarkably, the data does not show a significant influence of the temperature and relative humidity on
the thermal comfort feedback, while standards generally include those parameters in thermal comfort
models [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
        </p>
        <p>
          The occupant’s heart rate, the outdoor temperature, and the time of the day were found to
significantly influence the thermal comfort feedback of OFH:Subject1. The low outdoor temperature
might cause radiant temperature asymmetry, while higher heart rates often relate to higher metabolic
rates (increasing thermal comfort at low temperatures) [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. Thermal adaptation and outdoor heat gains
during the day might cause the correlation between time and comfort feedback.
        </p>
        <p>Figure 5 shows parameters that were expected to influence the visual comfort of OFH:Subject1.
Higher illuminance is found to significantly improve the visual comfort. Interestingly, the outdoor
visibility (which is lower during foggy weather) strongly influences the visual comfort, as almost all
positive feedback was given during non-foggy weather.</p>
        <p>The influence of parameters on OFH:Subject1’s feedback on air quality is shown in figure 6. The
mean value for uncomfortable indoor air quality lies above the highest observed IAQ value during
comfortable feedback, indicating that higher IAQ values (implying higher air pollution) negatively
influence OFH:Subject1’s feedback on air quality. The CO2 concentration and relative humidity seem
to have no significant influence on the experience of air quality.</p>
        <p>To be able to use the results of our knowledge discovery method, the Python script that queries all
the data and runs the knowledge discovery algorithms ends with translating the significant parameters
to RDF triples. A SPARQL INSERT query is used to insert those triples in the GraphDB repository to
integrate them with the existing linked building data. The triples are created using string concatenation
in Python. This process is automated as it makes use of the metadata that is available in the GraphDB
repository. This includes the person, the significant properties, and the comfort property.</p>
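        <p>The string-concatenation step could look roughly like the following sketch, which assembles a SPARQL INSERT DATA update for one preference. The prefixes and the properties opo:hasPreference, opo:onProperty, opo:firstQuartile, opo:thirdQuartile, and prov:generatedAtTime are taken from Listing 1; the naming scheme of the preference node, the xsd:double datatype, the function name, and the quartile values are assumptions for illustration. The link to the comfort property is left out because its predicate is not named in the text.</p>
        <preformat><![CDATA[
```python
from datetime import datetime, timezone

PREFIXES = (
    "PREFIX OFH: <https://github.com/AlexDonkers/OpenFamilyHome#>\n"
    "PREFIX opo: <https://alexdonkers.github.io/opo#>\n"
    "PREFIX prov: <http://www.w3.org/ns/prov#>\n"
    "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n"
)


def build_preference_insert(person, prop, q1, q3, generated_at):
    """Build a SPARQL INSERT DATA update that stores one preference."""
    # Hypothetical naming scheme for the preference node.
    preference = f"OFH:{person}_{prop}_Preference"
    # Normalize to UTC so that the trailing "Z" is correct.
    stamp = generated_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    triples = [
        f"OFH:{person} opo:hasPreference {preference} .",
        f"{preference} a opo:Preference .",
        f"{preference} opo:onProperty OFH:{prop} .",
        f'{preference} opo:firstQuartile "{q1}"^^xsd:double .',
        f'{preference} opo:thirdQuartile "{q3}"^^xsd:double .',
        f'{preference} prov:generatedAtTime "{stamp}"^^xsd:dateTime .',
    ]
    body = "\n".join("  " + triple for triple in triples)
    return PREFIXES + "INSERT DATA {\n" + body + "\n}"


# Invented quartile values; the timestamp is only an example.
update = build_preference_insert(
    "Subject1", "IndoorAirQuality", 0.31, 0.47,
    datetime(2022, 2, 8, 12, 0, tzinfo=timezone.utc),
)
print(update)
```
]]></preformat>
        <p>String concatenation is adequate when all identifiers come from the trusted graph metadata, as described above; input from untrusted sources would need validation or escaping before being placed in an update.</p>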
        <p>The resulting set of triples is shown in figure 7. The purple blocks represent the occupant
preferences. Results of the knowledge discovery method are stored as literals and connected to an
instance of opo:Preference using datatype properties. The opo:Preference class represents the latest set
of preferences on a property. A timestamp is added to the opo:Preference class to enable querying the
latest values. The pattern is based on the property state pattern in OPM [23]. The opo:Preference is
linked to an ofo:Property using opo:onProperty. It is also linked to an ofo:Person and to the comfort
property of this person.</p>
        <p>Listing 1 shows how a simple SPARQL query could return the first and third quartile values from
the graph in figure 7. The result could be used to assess if the current state of the building fulfills the
expectations of the user, and automatically trigger HVAC systems if parameters lie outside the
acceptable range.</p>
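        <p>A possible way to use the returned quartiles is sketched below. The sketch assumes that the query result is available in the standard SPARQL JSON results format with the variables firstQuartile and thirdQuartile; the function names, the example response, and the current reading are hypothetical, and the actual coupling to HVAC systems is outside the scope of this sketch.</p>
        <preformat><![CDATA[
```python
def preferred_range(sparql_json):
    """Extract (first quartile, third quartile) from a one-row SPARQL JSON result."""
    binding = sparql_json["results"]["bindings"][0]
    first = float(binding["firstQuartile"]["value"])
    third = float(binding["thirdQuartile"]["value"])
    return first, third


def classify_reading(current, first_quartile, third_quartile):
    """Report where the current reading lies relative to the preferred range."""
    if current < first_quartile:
        return "below"
    if current > third_quartile:
        return "above"
    return "within"


# Invented response in the SPARQL JSON results layout.
example_response = {
    "head": {"vars": ["firstQuartile", "thirdQuartile"]},
    "results": {"bindings": [{
        "firstQuartile": {"type": "literal", "value": "0.31"},
        "thirdQuartile": {"type": "literal", "value": "0.47"},
    }]},
}

q1, q3 = preferred_range(example_response)
print(classify_reading(0.52, q1, q3))
```
]]></preformat>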
        <p>[Figure 7: Resulting set of triples. Recoverable node labels: OFH:Subject1 (ofo:Person) and OFH:IndoorAirQuality (ofo:Property); recoverable edge label: prov:generatedAtTime with an xsd:dateTime literal.]</p>
        <preformat>PREFIX OFH: &lt;https://github.com/AlexDonkers/OpenFamilyHome#&gt;
PREFIX opo: &lt;https://alexdonkers.github.io/opo#&gt;
PREFIX prov: &lt;http://www.w3.org/ns/prov#&gt;

SELECT ?firstQuartile ?thirdQuartile
WHERE {
  OFH:Subject1 opo:hasPreference ?preference .
  ?preference opo:onProperty OFH:IndoorAirQuality .
  ?preference prov:generatedAtTime ?time .
  ?preference opo:firstQuartile ?firstQuartile .
  ?preference opo:thirdQuartile ?thirdQuartile .
}
ORDER BY DESC(?time)
LIMIT 1</preformat>
        <p>Listing 1: SPARQL query that finds OFH:Subject1's preferences on OFH:IndoorAirQuality</p>
      </sec>
    </sec>
    <sec id="sec-12">
      <title>5. Discussion</title>
      <p>This paper aimed to show how combining KDD and semantic web technologies could lead to an
improved understanding of occupants’ experiences in buildings. The combination of KDD and semantic
web technologies enables data scientists to perform cross-domain knowledge discovery procedures that
do justice to the holistic nature of the AEC industry.</p>
      <p>The method presented in this paper can be generalized so that it can be used in a wide variety of
use cases. The stepwise approach in figure 1 enables performing KDD on data in semantic digital twins and
integrating the results into that digital twin. The approach can be executed using a single script for quick
(online) knowledge integration. Step 2 can also consist of a more extensive KDD process. The
flexibility of this approach makes it useful for knowledge discovery in relevant research domains, such
as energy performance, indoor environmental quality monitoring, and system automation.</p>
      <p>The application of KDD in the domain of occupant preferences and behavior is expected to be
a recurring process. Not only will occupant preferences and behavior differ per person, but they are also
expected to change over time. This includes seasonal changes and aging effects. Building improvements
and other building innovations might also influence future occupant preferences. The flexibility of our
stepwise approach allows for a quick update of the occupant preference module so that the building
could adapt systems based on the latest occupant feedback without significant delays.</p>
      <p>This research is a proof-of-concept of the introduced method. To unlock statistically significant
results, more data needs to be generated. More in-depth statistical analyses should be used to research
the influence of buildings’ attributes, environmental parameters, and personal characteristics on
occupants’ preferences.</p>
    </sec>
    <sec id="sec-13">
      <title>6. Conclusion</title>
      <p>While buildings can contribute to occupants’ wellbeing, there is a lack of knowledge to operate
buildings according to occupants’ expectations. Integrating heterogeneous data from various sources is
necessary to enable occupant-centric decision-making. Semantic web technologies proved to
successfully integrate such data into cross-domain semantic digital twins. However, methods to
discover deeper insights into occupants’ preferences and experiences, and integrate those insights into
the graph, are scarce.</p>
      <p>Therefore, this paper presents an approach to combine KDD procedures with semantic digital twins.
A case study was performed using the Open Family Home. A subject collected feedback on various
comfort indicators using the smartwatch app Mintal. Building information, sensor data, weather data,
and occupant information and feedback were integrated using semantic web technologies. These data
were analyzed using boxplots.</p>
      <p>The data analysis presents insights into the influence of individual parameters on the subject’s
experience of comfort. Significant parameters were translated to RDF Turtle format and stored in the
graph. Future work should demonstrate the validity of the KDD results with larger sample sizes of
occupants and buildings.</p>
    </sec>
    <sec id="sec-14">
      <title>7. Acknowledgements</title>
      <p>The authors would like to gratefully acknowledge the support from Eindhoven University of
Technology, KPN (TKI-HTSM 19.0162), and the Netherlands Enterprise Agency, as part of the
‘SmartTWO: Maverick Telecom Technologies as Building Blocks for Value Driven Future Societies’
project (TK|L912P06).</p>
    </sec>
    <sec id="sec-15">
      <title>8. References</title>
      <p>[16] Y. Li, R. García-Castro, N. Mihindukulasooriya, J. O’Donnell, and S. Vega-Sánchez, Enhancing energy management at district and building levels via an EM-KPI ontology, Automation in Construction. 99 (2019) 152-167. doi:10.1016/j.autcon.2018.12.010.</p>
      <p>[17] M. Nolich, D. Spoladore, S. Carciotti, R. Buqi, and M. Sacco, Cabin as a Home: A Novel Comfort Optimization Framework for IoT Equipped Smart Environments and Applications on Cruise Ships, Sensors. (2019). doi:10.3390/s19051060.</p>
      <p>[18] W. O’Brien, A. Wagner, M. Schweiker, A. Mahdavi, J. Day, M.B. Kjaergaard, S. Carlucci, B. Dong, F. Tahmasebi, D. Yan, T. Hong, H.B. Gunay, Z. Nagy, C. Miller, and C. Berger, Introducing IEA EBC annex 79: Key challenges and opportunities in the field of occupant-centric building design and operation, Building and Environment. 178.106738 (2020). doi:10.1016/j.buildenv.2020.106738.</p>
      <p>[19] P. Pauwels, S. Zhang, and Y.C. Lee, Semantic web technologies in AEC industry: A literature overview, Automation in Construction. (2017). doi:10.1016/j.autcon.2016.10.003.</p>
      <p>[20] E.A. Petrova, AI for BIM-Based Sustainable Building Design: Integrating Knowledge Discovery and Semantic Data Modelling for Evidence-Based Design Decision Support, Ph.D. thesis, Aalborg University, Denmark, 2019.</p>
      <p>[21] E. Petrova, P. Pauwels, K. Svidt, and R.L. Jensen, In Search of Sustainable Design Patterns: Combining Data Mining and Semantic Data Modelling on Disparate Building Data, in: Advances in Informatics and Computing in Civil and Construction Engineering, 2019, pp. 19-26. doi:10.1007/978-3-030-00220-6_3.</p>
      <p>[22] M. Pfannkuch, Comparing Box Plot Distributions: A Teacher’s Reasoning, Statistics Education Research Journal. 5 (2006) 27-45.</p>
      <p>[23] M.H. Rasmussen, M. Lefrançois, M. Bonduel, C.A. Hviid, and J. Karlshøj, OPM: An ontology for describing properties that evolve over time, in: Proceedings of the 6th Linked Data in Architecture and Construction Workshop, CEUR Workshop Proceedings, 2018, pp. 24-33.</p>
      <p>[24] M.H. Rasmussen, M. Lefrançois, G.F. Schneider, and P. Pauwels, BOT: The building topology ontology of the W3C linked building data group, Semantic Web. 12.1 (2020) 143-161. doi:10.3233/sw-200385.</p>
      <p>[25] P. Ristoski and H. Paulheim, Semantic Web in data mining and knowledge discovery: A comprehensive survey, Journal of Web Semantics. 36 (2016) 1-22. doi:10.1016/j.websem.2016.01.001.</p>
      <p>[26] Z. Wang, T. Yeung, R. Sacks, and Z. Su, Room Type Classification for Semantic Enrichment of Building Information Modeling Using Graph Neural Networks, Proceedings of the Conference CIB W78 2021, 2021, pp. 773-781.</p>
      <p>[27] Q. Yang, Y. Zhao, and Q. Yang, Development of an ontology-based semantic building post-occupancy evaluation framework, International Journal of Metrology and Quality Engineering. 12 (2021). doi:10.1051/ijmqe/2021019.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Boje</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Guerriero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kubicki</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Rezgui</surname>
          </string-name>
          ,
          <article-title>Towards a semantic Construction Digital Twin: Directions for future research</article-title>
          ,
          <source>Automation in Construction</source>
          .
          <volume>114</volume>
          (
          <year>2020</year>
          ) 103179. doi:10.1016/j.autcon.2020.103179.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>H.E.</given-names>
            <surname>Degha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.Z.</given-names>
            <surname>Laallam</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Said</surname>
          </string-name>
          ,
          <article-title>Intelligent context-awareness system for energy efficiency in smart building based on ontology</article-title>
          ,
          <source>Sustainable Computing: Informatics and Systems</source>
          .
          <volume>21</volume>
          (
          <year>2019</year>
          ). doi:10.1016/j.suscom.2019.01.013.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Donkers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>de Vries</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>Creating Occupant-Centered Digital Twins Using the Occupant Feedback Ontology Implemented in a Smartwatch App</article-title>
          , manuscript submitted for publication.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Donkers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>de Vries</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Baken</surname>
          </string-name>
          , Building Performance Ontology,
          <year>2021</year>
          . URL: https://w3id.org/bop.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Donkers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>de Vries</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Baken</surname>
          </string-name>
          ,
          <article-title>Real-Time Building Performance Monitoring using Semantic Digital Twins</article-title>
          ,
          <source>in: Proceedings of the 9th Linked Data in Architecture and Construction Workshop</source>
          , CEUR Workshop Proceedings, Luxembourg, Luxembourg,
          <year>2021</year>
          , pp.
          <fpage>12</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>I.</given-names>
            <surname>Esnaola-Gonzalez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bermúdez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Arnaiz</surname>
          </string-name>
          ,
          <article-title>Semantic prediction assistant approach applied to energy efficiency in Tertiary buildings</article-title>
          ,
          <source>Semantic Web</source>
          .
          <volume>9</volume>
          .
          <issue>6</issue>
          (
          <year>2018</year>
          )
          <fpage>735</fpage>
          -
          <lpage>762</lpage>
          . doi:10.3233/SW-180296.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>I.</given-names>
            <surname>Esnaola-Gonzalez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bermúdez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Fernández</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fernández</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Arnaiz</surname>
          </string-name>
          ,
          <article-title>Towards a semantic outlier detection framework in wireless sensor networks</article-title>
          ,
          <source>in: Semantics2017: Proceedings of the 13th International Conference on Semantic Systems</source>
          , ACM,
          <year>2017</year>
          , pp.
          <fpage>152</fpage>
          -
          <lpage>159</lpage>
          . doi:10.1145/3132218.3132226.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>I.</given-names>
            <surname>Esnaola-Gonzalez</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F. Javier</given-names>
            <surname>Diez</surname>
          </string-name>
          ,
          <article-title>Integrating building and IoT data in demand response solutions</article-title>
          ,
          <source>in: Proceedings of the 7th Linked Data in Architecture and Construction Workshop (LDAC 2019)</source>
          , CEUR Workshop Proceedings,
          <year>2019</year>
          , pp.
          <fpage>92</fpage>
          -
          <lpage>105</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>U.</given-names>
            <surname>Fayyad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Piatetsky-Shapiro</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Smyth</surname>
          </string-name>
          ,
          <article-title>From data mining to knowledge discovery in databases</article-title>
          ,
          <source>AI Magazine</source>
          .
          <volume>17</volume>
          .
          <issue>3</issue>
          (
          <year>1996</year>
          )
          <fpage>37</fpage>
          . doi:10.1609/aimag.v17i3.1230.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>I.E.</given-names>
            <surname>Gonzalez</surname>
          </string-name>
          ,
          <article-title>Semantic Technologies for supporting KDD Processes</article-title>
          ,
          <source>Ph.D. thesis</source>
          , University of the Basque Country UPV/EHU, Leioa, Biscay, Spain,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>van Gool</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Pauwels</surname>
          </string-name>
          ,
          <article-title>Integrating sensor and building data flows: a case study of the IEQ of an office building in the Netherlands</article-title>
          ,
          <source>in: Proceedings of the 13th European Conference on Product and Process Modeling 2020-2021</source>
          , CRC Press, Moscow, Russia,
          <year>2021</year>
          , pp.
          <fpage>6</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>F.M.</given-names>
            <surname>Gray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Dibowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gall</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Braun</surname>
          </string-name>
          ,
          <article-title>Occupant Feedback and Context Awareness: On the Application of Building Information Modeling and Semantic Technologies for Improved Complaint Management in Commercial Buildings</article-title>
          ,
          <source>in: IEEE International Conference on Emerging Technologies and Factory Automation, ETFA, 1</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>101</fpage>
          -
          <lpage>108</lpage>
          . doi:10.1109/ETFA46521.2020.9212164.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hoare</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pauwels</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>O'Donnell</surname>
          </string-name>
          ,
          <article-title>Building energy performance assessment using linked data and cross-domain semantic reasoning</article-title>
          ,
          <source>Automation in Construction</source>
          .
          <volume>124</volume>
          (
          <year>2021</year>
          ) 103580. doi:10.1016/j.autcon.2021.103580.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P.</given-names>
            <surname>Jayathissa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Quintana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Abdelrahman</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <article-title>Humans-as-a-sensor for buildings - intensive longitudinal indoor comfort models</article-title>
          ,
          <source>Buildings</source>
          .
          <volume>10</volume>
          (
          <year>2020</year>
          ). doi:10.3390/buildings10100174.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ham</surname>
          </string-name>
          ,
          <article-title>Physiological sensing-driven personal thermal comfort modelling in consideration of human activity variations</article-title>
          ,
          <source>Building Research and Information</source>
          .
          <volume>49</volume>
          (
          <year>2021</year>
          ). doi:10.1080/09613218.2020.1840328.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>