<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Tool for Authoring Semantic Metadata for Building Datasets</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sangkeun Lee</string-name>
          <email>lees4@ornl.gov</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Borui Cui</string-name>
          <email>cuib@ornl.gov</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mahabir Bhandari</string-name>
          <email>bhandarims@ornl.gov</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Na Luo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Piljae Im</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ISWC'22: The 21st International Semantic Web Conference</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Lawrence Berkeley National Laboratory</institution>
          ,
          <addr-line>Berkeley CA 94720</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Oak Ridge National Laboratory</institution>
          ,
          <addr-line>Oak Ridge TN 31380</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The Brick ontology is a unified semantic metadata schema that addresses the standardization problem of buildings' physical, logical, and virtual assets and the relationships between them. Creating a Brick model for a building dataset means that the dataset's contents are semantically described using the standard terms defined in the Brick ontology. Doing so brings the benefits of data standardization without having to recollect or reorganize the data, and it opens the possibility of automation that leverages the machine readability of the semantic metadata. The problem is that authoring Brick models for building datasets often requires knowledge of semantic technology (e.g., ontology declarations and Resource Description Framework (RDF) syntax) and leads to repeated manual trial-and-error processes, which can be time-consuming and challenging without an interactive visual representation of the data. We developed VizBrick, a tool with a graphical user interface that assists users in creating Brick models visually and interactively without having to understand RDF syntax. VizBrick provides capabilities such as keyword search, which helps users find Brick concepts and relations relevant to their data columns, and automatic suggestions of concept mappings. In this demonstration, we present a use case of VizBrick to showcase how a Brick model can be created for a real-world building dataset.</p>
      </abstract>
      <kwd-group>
        <kwd>Building</kwd>
        <kwd>Interactive tool</kwd>
        <kwd>Brick ontology</kwd>
        <kwd>Data standardization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Acknowledgments</title>
      <p>Notice: This manuscript has been authored by UT-Battelle, LLC under Contract No.
DE-AC05-00OR22725 with the U.S. Department of Energy. The publisher, by accepting the article
for publication, acknowledges that the U.S. Government retains a non-exclusive, paid-up,
irrevocable, worldwide license to publish or reproduce the published form of the manuscript, or
allow others to do so, for U.S. Government purposes. The DOE will provide public access to
these results in accordance with the DOE Public Access Plan
(http://energy.gov/downloads/doe-public-access-plan).</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Datasets collected from smart buildings are critical assets for research to improve human
interactions in built environments. One of the biggest challenges of managing smart building
data assets is that they are often not standardized, because they are collected from many
sources by various entities (e.g., different research organizations) for different reasons.
This lack of standardization hinders the reuse of data assets and causes data redundancy,
which significantly increases cost in both time and money. The Brick ontology
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is the state-of-the-art open-source unified schema that consists of extensible concepts and
relationships that can semantically describe physical, logical, and virtual building dataset assets
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Creating a Brick model for a building dataset means that the contents of the datasets are
semantically described using the standard terms defined in the Brick ontology. It will enable the
benefits of data standardization, without having to recollect or reorganize the data. Moreover, a
semantic description of a dataset enables machine readability, which allows potential automation
of data processing and utilization.
      </p>
      <p>
        Figure 1 shows an example of a Brick model created for an example synthetic building
dataset. From the figure, we can understand what equipment, locations, and points (entities) are
related to this dataset, and more importantly we can also understand the relationship between
the entities. A general workflow of Brick model creation for a building dataset is as follows.
Reviewing available non-semantic metadata (e.g., spreadsheet) for the dataset is the first step of
the workflow. In this step, we aim to semantically understand the building structure, equipment,
and measurements. Then, we define entities such as Location, Zone, Equipment, and Point.
Then, we create relationships across the created entities, for instance, where entities are located,
what entities are part of other entities, and so on. During the process, proper Brick ontology
concepts and relationships need to be identified and mapped to the instances. Next, validating
the created Brick model both in syntax and semantics is necessary. Steps in this workflow may be
repeated with trial and error before finalizing the created model, and it can be time-consuming
and challenging to do without an interactive visual representation of the data. Another challenge
is that building scientists often have limited knowledge of semantic technology, such as the class
and relationship definitions in the Brick ontology and RDF syntax. Although several visual RDF
editors [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ] have been implemented so far, these tools are generic: they do not specifically
target building-domain datasets, and they do not leverage the predefined Brick ontology
vocabulary.
      </p>
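      <p>The workflow above (define entities, relate them, then validate) can be illustrated with a minimal sketch. It is not VizBrick's implementation: the bldg namespace and every entity name are hypothetical, and the validation step only checks for relationships that reference undefined entities, which is far weaker than validating against the Brick ontology itself.</p>
      <preformat><![CDATA[
```python
# Minimal sketch of the manual Brick modeling workflow using only the
# standard library. All entity names and the bldg: namespace are hypothetical.

BRICK = "https://brickschema.org/schema/Brick#"
BLDG = "urn:example:building#"  # hypothetical namespace for this dataset

# Step 1: define entities (instance name -> Brick class).
entities = {
    "AHU_1": "AHU",
    "Zone_101": "HVAC_Zone",
    "Room_101": "Room",
    "ZAT_101": "Zone_Air_Temperature_Sensor",
}

# Step 2: define relationships between the entities.
relationships = [
    ("AHU_1", "feeds", "Zone_101"),
    ("Zone_101", "hasPart", "Room_101"),
    ("Zone_101", "hasPoint", "ZAT_101"),
    ("ZAT_101", "hasLocation", "Room_101"),
]


def find_dangling(entities, relationships):
    """Step 3 (simplified): return relationships that use undefined entities."""
    return [
        (s, p, o)
        for s, p, o in relationships
        if s not in entities or o not in entities
    ]


def to_turtle(entities, relationships):
    """Serialize the entities and relationships as Turtle text."""
    lines = [
        f"@prefix brick: <{BRICK}> .",
        f"@prefix bldg: <{BLDG}> .",
        "",
    ]
    for name, brick_class in entities.items():
        lines.append(f"bldg:{name} a brick:{brick_class} .")
    for subject, predicate, obj in relationships:
        lines.append(f"bldg:{subject} brick:{predicate} bldg:{obj} .")
    return "\n".join(lines) + "\n"


dangling = find_dangling(entities, relationships)
turtle_text = to_turtle(entities, relationships)
print(turtle_text)
```
]]></preformat>
      <p>Even for four entities, the class names, relationship names, and Turtle syntax all have to be written correctly by hand, which is the burden an interactive tool is meant to remove.</p>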
    </sec>
    <sec id="sec-3">
      <title>2. VizBrick Implementation and Capabilities</title>
      <p>
        VizBrick provides three main capabilities (Figure 2). First, it provides an interface
for reviewing existing non-semantic metadata, in tabular format, for the building dataset for
which the user intends to create a Brick model. VizBrick imports a CSV (comma-separated values)
file that contains the list of data labels (data points) and their descriptions. Users can
review and modify the metadata before starting the entity and relationship creation process.
Second, it provides an interactive entity and relationship creation interface. Users can perform
a keyword search against the Brick ontology and manually select specific Brick classes and
relationships to create entity and relationship instances. VizBrick also provides a suggestion
capability that proposes well-matching Brick ontology classes for each
data point in the original metadata. For keyword search and suggestion, VizBrick uses the
TF-IDF (term frequency-inverse document frequency) technique [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] on the data point and Brick
ontology descriptions.
      </p>
      <p>Recommendations are presented as a ranked list with scores, and users can supply
additional keywords to improve the matching (Figure 3).</p>
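      <p>To make the ranking idea concrete, the following sketch scores a few Brick classes against a keyword query using TF-IDF weights and cosine similarity. It is an illustration under stated assumptions rather than VizBrick's code: the class descriptions are hypothetical paraphrases, the smoothed IDF formula is one common variant, and the use of cosine similarity for scoring is an assumption.</p>
      <preformat><![CDATA[
```python
import math
import re
from collections import Counter

# Hypothetical paraphrases of Brick class descriptions (not the official text).
class_descriptions = {
    "Zone_Air_Temperature_Sensor": "measures the temperature of air in a zone",
    "Outside_Air_Temperature_Sensor": "measures the temperature of outside air",
    "Zone_Air_Humidity_Sensor": "measures the relative humidity of air in a zone",
    "Occupancy_Sensor": "detects occupancy of some space or area",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vector(tokens, idf):
    """Weight each known term by its count times its inverse document frequency."""
    counts = Counter(tokens)
    return {term: n * idf[term] for term, n in counts.items() if term in idf}


def build_index(descriptions):
    """Build IDF weights and one TF-IDF vector per class (name plus description)."""
    docs = {
        name: tokenize(name.replace("_", " ") + " " + text)
        for name, text in descriptions.items()
    }
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))
    # Smoothed IDF; other variants exist and would change the scores.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = {name: tfidf_vector(tokens, idf) for name, tokens in docs.items()}
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest(query, idf, vectors, top_k=3):
    """Return the top_k (class name, score) pairs, best match first."""
    query_vec = tfidf_vector(tokenize(query), idf)
    scored = [(name, cosine(query_vec, vec)) for name, vec in vectors.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


idf, vectors = build_index(class_descriptions)
ranking = suggest("zone temperature", idf, vectors)
for name, score in ranking:
    print(f"{score:.3f}  {name}")
```
]]></preformat>
      <p>Adding a keyword to the query changes the query vector and therefore the ranking, which mirrors how additional user-supplied keywords can refine the suggestions.</p>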
      <p>
        In addition, VizBrick provides rule-based mapping between data labels and Brick classes, as well as
edge creation, using user-defined rules based on string patterns in data labels (e.g., data labels
related to ‘temperature sensor’ start with ‘ ’) (Figure 4). Lastly, users can select all or a subset
of the created Brick entities and relationships to visualize the Brick model being edited and
export it as a standard Resource Description Framework (RDF) data model in the Terse
RDF Triple Language (Turtle) format. Exported RDF files can be used with other standard
Brick-related tools [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. VizBrick has been released as open-source software and registered with
the US Department of Energy (DOE) Office of Scientific and Technical Information (OSTI) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
The tool is publicly available for testing and a demonstration video is available.
For demonstration, we will show how VizBrick can be used to create a Brick model for Ecobee
Donate Your Data (DYD) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which comprises data collected by Ecobee thermostats from 1,000
single-family homes in four US states. Starting with a simple CSV-formatted metadata
file that describes the dataset, we will demonstrate how a semantic metadata file in
Turtle (TTL) format can be generated and used in other standard Brick tools. The demonstration will
take the following steps.
      </p>
      <list list-type="order">
        <list-item><p>Importing, reviewing, and editing the original metadata</p></list-item>
        <list-item><p>Creating Brick class and relationship instances using the VizBrick search and suggestion toolbox</p></list-item>
        <list-item><p>Visually interacting with the work-in-progress Brick model</p></list-item>
        <list-item><p>Exporting and validating the Brick model</p></list-item>
      </list>
      <p>This demo will showcase how the VizBrick tool can significantly reduce the effort required to create
a Brick model for a building data asset, which is crucial for data standardization.</p>
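      <p>The rule-based mapping described in this section can be sketched as follows. This is only an assumed illustration of the idea, not VizBrick's rule engine: the CSV contents, the label prefixes, and the bldg prefix are hypothetical, and only class mapping (not edge creation) is shown.</p>
      <preformat><![CDATA[
```python
import csv
import io

# Hypothetical metadata file: one data label and description per row.
METADATA_CSV = """label,description
TS_living_room,Temperature reading in the living room
TS_bedroom,Temperature reading in the bedroom
HS_living_room,Relative humidity in the living room
runtime_fan,Fan runtime in seconds
"""

# Hypothetical user-defined rules: (label prefix, Brick class).
RULES = [
    ("TS_", "Temperature_Sensor"),
    ("HS_", "Humidity_Sensor"),
]


def apply_rules(labels, rules):
    """Map each label to the class of the first rule whose prefix matches."""
    mapping, unmatched = {}, []
    for label in labels:
        for prefix, brick_class in rules:
            if label.startswith(prefix):
                mapping[label] = brick_class
                break
        else:
            # No rule matched; leave this label for search or manual mapping.
            unmatched.append(label)
    return mapping, unmatched


rows = list(csv.DictReader(io.StringIO(METADATA_CSV)))
labels = [row["label"] for row in rows]
mapping, unmatched = apply_rules(labels, RULES)

# Emit one Turtle-style type statement per mapped label.
statements = [f"bldg:{label} a brick:{cls} ." for label, cls in mapping.items()]
print("\n".join(statements))
print("needs manual mapping:", unmatched)
```
]]></preformat>
      <p>Labels that no rule covers are reported separately so they can be handled with keyword search or suggestions instead.</p>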
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Brick</surname>
          </string-name>
          :
          <article-title>A uniform metadata schema for buildings</article-title>
          ,
          <year>2022</year>
          . URL: https://brickschema.org.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Balaji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          , G. Fierro,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gluck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Johansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Koh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ploennigs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          , et al.,
          <article-title>Brick: Towards a unified metadata schema for buildings</article-title>
          ,
          <source>in: Proceedings of the 3rd ACM International Conference on Systems for Energy-Efficient Built Environments</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>41</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Heyvaert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dimou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-L.</given-names>
            <surname>Herregodts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Verborgh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schuurman</surname>
          </string-name>
          , E. Mannens, R. V. d. Walle,
          <article-title>RMLEditor: a graph-based mapping editor for linked data mappings</article-title>
          ,
          <source>in: European Semantic Web Conference</source>
          , Springer,
          <year>2016</year>
          , pp.
          <fpage>709</fpage>
          -
          <lpage>723</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>N.</given-names>
            <surname>Petersen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Similea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lohmann</surname>
          </string-name>
          ,
          <article-title>TurtleEditor: A web-based RDF editor to support distributed ontology development on repository hosting platforms</article-title>
          ,
          <source>International Journal of Semantic Computing</source>
          <volume>11</volume>
          (
          <year>2017</year>
          )
          <fpage>311</fpage>
          -
          <lpage>323</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Aizawa</surname>
          </string-name>
          ,
          <article-title>An information-theoretic perspective of tf-idf measures</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>39</volume>
          (
          <year>2003</year>
          )
          <fpage>45</fpage>
          -
          <lpage>65</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Brickly</surname>
          </string-name>
          :
          <article-title>A block based editor for SPARQL queries with Brick</article-title>
          ,
          <year>2022</year>
          . URL: https://github.com/ezrichards/brickly.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Im</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhandari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Cui</surname>
          </string-name>
          , U. O. of Energy Efficiency and Renewable Energy, VizBrick,
          <year>2022</year>
          . URL: https://www.osti.gov//servlets/purl/1871804. doi:10.11578/dc.20220609.1.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Ecobee</surname>
          </string-name>
          :
          <article-title>Donate Your Data (DYD) dataset</article-title>
          ,
          <year>2022</year>
          . URL: https://bbd.labworks.org/ds/bbd/ecobee.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>