<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Ontology based data architecture to promote data sharing in electrophysiology*</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Brenda Farrell</string-name>
          <email>bfarrell@bcm.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Bobby R. Alford Department of Otolaryngology - Head &amp; Neck Surgery Baylor College of Medicine Houston</institution>
          ,
          <addr-line>TX</addr-line>
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Kansas State University Libraries Kansas State University Manhattan</institution>
          ,
          <addr-line>KS</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <fpage>7</fpage>
      <lpage>10</lpage>
      <abstract>
        <p>Strategies to improve the preservation, searchability, and discoverability of research data are a priority. To facilitate these efforts in cell electrophysiology and biophysics we propose that ontologies be used to design and annotate data, as they provide a substantive metadata structure, with reasoned definitions arranged in a logical, hierarchical structure where the meaning of the data is unambiguously assigned. We illustrate this by describing our cell electrophysiology data with an ontology. We then make this hierarchical structure with definitions the basis of the data architecture, which is implemented upon transforming the data into the storage format: Hierarchical Data Format version 5 (HDF5).</p>
      </abstract>
      <kwd-group>
        <kwd>big data</kwd>
        <kwd>data management</kwd>
        <kwd>HDF5</kwd>
        <kwd>auditory</kwd>
        <kwd>outer hair cell</kwd>
        <kwd>OHC</kwd>
        <kwd>application ontology</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>Data were inconsistently reported in the past, although data
form the backbone of much of scientific discovery. There was
no motivation for researchers to develop intuitive
human-understandable (and machine-readable) data structures that both
the public and their scientific peers could readily access. In many
scientific disciplines the limits of this approach have exposed
a number of deficiencies, including the inability to
reproduce key findings, coupled with excessive (additional) costs
to the taxpayer [1]. The need to provide data sets that support
key findings of research is particularly relevant in fields where
data collection is slow and requires the sacrifice of mammals.
This is the case in auditory electrophysiology. To facilitate data
preservation and sharing, we describe our effort to transform
electrophysiological data from private to public use.</p>
    </sec>
    <sec id="sec-2">
      <title>II. DESCRIPTION OF DATA</title>
      <p>The data were generated by whole-cell voltage clamping
isolated outer hair cells obtained from the domestic guinea pig
(i.e., Cavia porcellus). The electrical properties (e.g., membrane
capacitance) of the outer hair cells were then determined [2] from
the electrical recordings. The data encompass results from the
assay, as well as the associated properties of the animals and cells;
the experimental conditions employed; and a description of the
devices used.</p>
      <p>This work was funded by NIH NLM and NIDCD R01DC000354-S1.</p>
    </sec>
    <sec id="sec-3">
      <title>Jason Bengtson</title>
    </sec>
    <sec id="sec-4">
      <title>III. USE ONTOLOGY TO DESCRIBE DATA</title>
      <p>To provide for the durability of the data across both time and
space so that others may reasonably expect to make use of them, we
describe the data with an ontology: essentially, a structure that
places defined concepts in a logical relationship with one another
as a way to describe things, events, or ideas that exist in the
objective world. Many ontologies exist, including a large number
that focus on various aspects of the biological sciences. Of
relevance is the Ion Channel Electrophysiology Ontology, ICEPO [3],
which describes concepts associated with the electrical and
temporal characteristics of voltage-gated ion channels. Although
some of the concepts (e.g., gating current, ICEPO_0000049) have
been used to describe the electrical characteristics of outer hair
cells, the membrane protein that forms part of the voltage-sensing
component in the lateral membrane of outer hair cells is not an
ion channel. We generalize, where appropriate, the concepts
introduced in ICEPO to align them with voltage-dependent
behavior of membrane assemblies. Another, more extensive
ontology is the Ontology of Physics for Biology (OPB) [4], which
was developed to annotate computational models of biological
systems. It uses physical (e.g., thermodynamic) dependencies to
describe biological processes, and although it also covers some
relevant concepts, it does not describe the design, collection and
analysis of data. Having examined other ontologies, we decided
that the Ontology for Biomedical Investigations, OBI [5], was the
most robust and logical match for the data. However, it does not
address all the concepts required for this data set. Approximately
100 concepts are not found within OBI. We approached such cases by
importing terms from other ontologies and by creating new
classes (~20). These imported and new terms are grafted at
suitable junctions onto a variant of OBI we have produced.
Although these terms are being considered for acceptance into the
OBI ontology, we stress that our goal is not to expand the
ontology for its own sake, but to expand and use this application
ontology to describe the data.</p>
      <p>The six main classes used to define the data in OBI are shown
in Fig. 1. The data are described with one electrophysiology
assay: whole cell patch-clamp voltage clamp assay, with the
devices (e.g., patch-clamp device) needed to perform the assay in
a separate class. The assay was performed with outer hair cells
isolated from the cochlea of guinea pigs; hence cell, organism, and
anatomical entity were the three other material entities. To
extract key parameters from the electrical measurements we
performed analysis, and hence data transformation is the sixth class.</p>
      <p>These six main classes are the major arms of the data. In Fig. 2
we show the class structure for the organism arm with the
associated data values that are normally recorded during an
experiment. For example: the maturity and sex of the animal;
whether the guinea pig exhibited a normal or albino phenotype;
the mass of the animal; the age since birth; and, for adult females,
the estrous cycle phase. Similar class relationships are
formulated for the other five arms (data not shown).</p>
    </sec>
    <sec id="sec-5">
      <title>IV. DESIGN DATA ARCHITECTURE BASED UPON ONTOLOGY</title>
      <p>The original data were stored in MATLAB (MathWorks, MA) as
a struct. MATLAB is proprietary software and not suitable for
expansive data sharing, as this software must be licensed at a
significant cost and is not accessible to everyone. We
transformed these data to Hierarchical Data Format version 5
(HDF5) [6], which was developed for storage of large and/or
complex sets of data. We chose it because it is an open source
format, with available open source viewers, it has conversion and
editing application programming interfaces (APIs) for a variety
of languages (including MATLAB), it supports complex and
large data structures, and it already enjoys significant scientific
usage.</p>
      <p>The six main classes become the main branches of the data
design (Fig. 2, right panel). All data associated with an individual
outer hair cell were arranged in a tree configuration and saved to
a file. The tree associated with the organism Group (terminology
used with HDF5) is shown where the classes of the ontology now
readily become the names of subgroups, datasets (terminology
also used by HDF5), and data values. Similar mapping is
performed for the other arms (Fig. 1) that describe the data.</p>
      <p>ACKNOWLEDGMENT: We thank Anita Bandrowski, Owen Ellard, Ronna Hertzano,
Barbara Jones, Elena Pormal, Joseph Santos-Sacchi, Jeffrey
Teeters and Randy Vita for their contributions to this work.</p>
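      <p>As a hedged illustration of the data design described in this section, the
sketch below shows how ontology classes might map onto HDF5-style group and
dataset paths for one outer hair cell. It is not the authors' implementation:
the identifiers (MAIN_CLASSES, record, iter_groups, iter_datasets), the key
names, and every value are hypothetical placeholders, and only the six class
names and the kinds of organism properties come from the text. Nested
dictionaries stand in for HDF5 Groups and leaf values stand in for datasets,
so the sketch runs with the Python standard library alone.</p>
      <preformat preformat-type="code" xml:space="preserve">
```python
# Six main ontology classes named in the text; each becomes a top-level branch.
MAIN_CLASSES = (
    "assay",
    "device",
    "cell",
    "organism",
    "anatomical_entity",
    "data_transformation",
)

# One tree per outer hair cell. Nested dicts play the role of HDF5 Groups.
record = {name: {} for name in MAIN_CLASSES}

# Hypothetical organism arm: key names and values are placeholders only.
record["organism"] = {
    "sex": "female",
    "maturity": "adult",
    "phenotype": "albino",
    "mass_g": 500.0,
    "age_since_birth_days": 60,
    "estrous_cycle_phase": "estrus",
}


def iter_groups(tree, prefix=""):
    """Yield the path of every group (nested dict) in the tree."""
    for name, node in tree.items():
        if isinstance(node, dict):
            path = prefix + "/" + name
            yield path
            yield from iter_groups(node, path)


def iter_datasets(tree, prefix=""):
    """Yield (path, value) for every leaf value, treated as a dataset."""
    for name, node in tree.items():
        path = prefix + "/" + name
        if isinstance(node, dict):
            yield from iter_datasets(node, path)
        else:
            yield path, node


groups = list(iter_groups(record))
datasets = dict(iter_datasets(record))

for path in groups:
    print("group  ", path)
for path, value in datasets.items():
    print("dataset", path, "=", value)
```
      </preformat>
      <p>To write such a tree to an actual HDF5 file, one option (an assumption,
since the text names only MATLAB among the supported languages) would be the
h5py package, passing each group path to create_group and each dataset path
and value to create_dataset. Keeping the ontology class names as path
components is what lets a reader of the file trace each value back to its
definition.</p>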
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <source>PLoS Biol</source>
          ,
          <year>2015</year>
          .
          <volume>13</volume>
          (
          <issue>6</issue>
          ): p.
          <fpage>e1002165</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Corbitt</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , et al.,
          <article-title>Tonotopic relationships reveal the charge density varies along the lateral wall of outer hair cells</article-title>
          .
          <source>Biophys J</source>
          ,
          <year>2012</year>
          .
          <volume>102</volume>
          (
          <issue>12</issue>
          ): p.
          <fpage>2715</fpage>
          -
          <lpage>24</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Hinard</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          , et al.,
          <article-title>ICEPO: the ion channel electrophysiology ontology</article-title>
          .
          <source>Database (Oxford)</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Cook</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          , et al.,
          <article-title>Ontology of physics for biology: representing physical dependencies as a basis for biological processes</article-title>
          .
          <source>J Biomed Semantics</source>
          ,
          <year>2013</year>
          .
          <volume>4</volume>
          (
          <issue>1</issue>
          ): p.
          <fpage>41</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Bandrowski</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , et al.,
          <article-title>The Ontology for Biomedical Investigations</article-title>
          .
          <source>PLoS One</source>
          ,
          <year>2016</year>
          .
          <volume>11</volume>
          (
          <issue>4</issue>
          ): p.
          <fpage>e0154556</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <collab>The HDF Group</collab>.
          <source>Hierarchical Data Format, version 5</source>
          ,
          <year>1997-2018</year>
          .
          <ext-link ext-link-type="uri" xlink:href="https://www.hdfgroup.org/HDF5/">https://www.hdfgroup.org/HDF5/</ext-link>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>