<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <article-id pub-id-type="doi">10.1038/s41562-016-0021</article-id>
      <title-group>
        <article-title>The DISK Hypothesis Ontology: Capturing Hypothesis Evolution for Automated Discovery</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Daniel Garijo</string-name>
          <email>dgarijo@isi.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yolanda Gil</string-name>
          <email>gil@isi.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Varun Ratnakar</string-name>
          <email>varunr@isi.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Hypothesis nanopublications</institution>
          ,
          <addr-line>ontologies</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Information Sciences Institute, University of Southern California</institution>
          ,
          <addr-line>Marina del Rey, CA</addr-line>
          ,
          <country country="US">U.S.A</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <issue>0021</issue>
      <fpage>1</fpage>
      <lpage>2</lpage>
      <abstract>
        <p>Automated discovery systems can formulate and revise hypotheses by gathering and analyzing data. In order to generate new hypotheses and provide explanations of their new findings, these systems need a language to represent hypotheses, their revisions, and their provenance. This paper describes the DISK hypothesis ontology, which fulfills these requirements. The paper then presents a survey of existing models for representing hypotheses along with their features and tradeoffs. We compare these hypothesis models in the context of automated discovery and hypothesis evolution.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CCS CONCEPTS</title>
      <p>• Information systems → Artificial intelligence; Knowledge
representation and reasoning</p>
      <p>Keywords: hypothesis representation, hypothesis evolution, automated discovery, micropublications</p>
    </sec>
    <sec id="sec-2">
      <title>1 INTRODUCTION</title>
      <p>Formal representations of scientific hypotheses would be useful in
many contexts. For instance, in order to keep up with the latest
updates on a research area, scientists need to quickly understand
the contributions of an article and how it was derived from others.
However, the vast amount of new scientific publications makes
this task increasingly complex. If scientists represented
hypotheses formally in publications, related literature could be
easily searched for hypotheses of interest. Alternatively, machine
reading systems could also extract hypotheses from text in
articles, and generate these formal representations.</p>
      <p>Formal representations of hypotheses may also be used to
improve reproducibility. Community initiatives on reproducibility
promote registering hypotheses and methods before conducting
the research [Munafo et al 2017]. Hypotheses are stated in textual
form, which can express arbitrarily complex statements about
hypotheses. However, text can be imprecise and ambiguous.</p>
      <p>Creating machine-readable representations of research hypotheses
would facilitate the organization and management of the
literature. To date, there is no standard way of capturing the
contents and context of a hypothesis to understand its evolution.</p>
      <p>Another important use of formal hypothesis representations is
to enable automated discovery systems to do hypothesis testing
and revision. Such systems generate hypotheses
autonomously based on analysis of relevant data [Pankratius et al
2016; King 2017; Gil et al 2017].</p>
      <p>In this paper, we focus on hypothesis representations to
capture hypothesis evolution in automated discovery systems. We
discuss the requirements that we have identified throughout our work on
the DISK discovery system [Gil et al 2017]. We propose an
ontology for hypothesis representation, and compare it to existing
models for representing hypotheses.</p>
      <p>The rest of the paper is organized as follows. Section 2
describes the DISK automated discovery system, and introduces
its hypothesis ontology. Section 3 introduces an evaluation
framework for existing models and overviews them. Section 4
discusses the different alternatives for hypothesis representation,
and Section 5 concludes the paper.</p>
    </sec>
    <sec id="sec-4">
      <title>2 REPRESENTING HYPOTHESES IN THE DISK AUTOMATED DISCOVERY SYSTEM</title>
      <p>Our goal is to allow automated discovery systems to test
hypotheses provided by users, and revise them based on the
results of running computational experiments autonomously.</p>
      <p>In prior work, we introduced an approach that captures
scientists’ strategies for pursuing hypotheses as lines of inquiry
that specify the data to be retrieved, the experimental workflows
to run, and how to combine the results to generate a revised
confidence level and in some cases a revised hypothesis [Gil et al
2016]. This approach was implemented in the DISK framework
(Automated DIscovery of Scientific Knowledge) and
demonstrated for cancer multi-omics [Gil et al 2017]. DISK is
given a hypothesis statement, such as whether a protein is
associated with a type of cancer, and returns either a confidence
level on that hypothesis or a revised hypothesis that refers to a
mutation of the protein or a more specific type of cancer. As new
data becomes available, DISK re-runs the analysis and
continuously revises the original hypothesis. DISK tracks the
provenance of revised hypotheses in terms of the original
hypotheses and the data analyses that were carried out.</p>
      <p>DISK uses a representation of hypotheses that makes it possible to
track their evolution. In DISK, a hypothesis consists of:</p>
      <p>1. A hypothesis statement, which is a set of structured
assertions about entities in the domain. For example, that the
protein EGFR is associated with colon cancer.</p>
      <p>2. A hypothesis qualifier, which represents the veracity of the
hypothesis based on the data and the analyses done so far. A
typical qualifier is a numeric confidence level. For example, for
the hypothesis statement above we could have a confidence level
given by a p-value of 0.07.</p>
      <p>3. Hypothesis evidence, which is a record of the analyses that
were carried out to test a hypothesis statement. For example, the
evidence of a given hypothesis may include an analysis of mass
spectrometry data for 25 patients with colon cancer and 25 healthy
controls followed by clustering, cluster metrics and binary
hypothesis testing.</p>
      <p>4. A hypothesis history, which points to prior hypotheses that
were revised to generate the current one. In our example, a
hypothesis such as the association of protein EGFR with colon
cancer SubType A would link back to the original hypothesis
statement that protein EGFR is associated with colon cancer.</p>
      <p>DISK represents hypothesis statements as a graph, where the
nodes are the entities in the hypotheses and the links are their
relationships. In our work, a hypothesis statement is represented
in RDF as a simple triple, and the triple is linked to its qualifier,
evidence, and history. All those assertions are also made in RDF.
The hypothesis evidence and hypothesis history both represent
different aspects of provenance for the hypothesis. This is
captured using the PROV provenance standard [Lebo et al 2013].</p>
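      <p>As a rough illustration of the four components listed above, the following TriG sketch shows one way the running EGFR example could be written down. It is not taken from the DISK ontology: the ex: terms (ex:Hypothesis, ex:hasStatement, ex:hasQualifier, ex:hasConfidenceLevel, ex:hasEvidence, ex:analysisRun2), the graph names and the confidence value are hypothetical placeholders. Only the prov: terms come from the PROV standard.</p>
      <preformat>@prefix ex:   &lt;http://example.org#&gt; .
@prefix prov: &lt;http://www.w3.org/ns/prov#&gt; .
@prefix xsd:  &lt;http://www.w3.org/2001/XMLSchema#&gt; .

# Hypothesis statements, each kept in its own named graph (hypothetical names)
ex:HS1 { ex:EGFR ex:associatedWith ex:ColonCancer . }
ex:HS2 { ex:EGFR ex:associatedWith ex:ColonCancerSubTypeA . }

# Default graph: each hypothesis is linked to its statement graph,
# qualifier, evidence and history (all ex: properties are assumptions)
ex:HG1 a ex:Hypothesis ;
    ex:hasStatement ex:HS1 ;
    ex:hasEvidence  ex:HE1 .

ex:HG2 a ex:Hypothesis ;
    ex:hasStatement ex:HS2 ;
    ex:hasQualifier ex:HQ2 ;
    ex:hasEvidence  ex:HE2 ;
    prov:wasRevisionOf ex:HG1 .    # history: HG2 revises HG1

# Qualifier with an illustrative confidence level
ex:HQ2 ex:hasConfidenceLevel "0.07"^^xsd:decimal .

# Evidence points to the analysis that produced it
ex:HE2 prov:wasGeneratedBy ex:analysisRun2 .</preformat>
      <p>Keeping each statement in a named graph is one possible way to attach a qualifier, evidence and history to a triple without resorting to reification; other encodings are equally conceivable.</p>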
      <p>Figure 1 illustrates this representation using the running
example with protein EGFR. The original hypothesis HG1 had its
own statement HS1 and evidence HE1. The revised hypothesis
HG2 includes its statement HS2, its confidence level L1 (part of
the qualifier HQ2), its evidence HE2, and a link to the original
hypothesis HG1. A feature of this representation is the ability to
model different confidence levels associated with a hypothesis
statement. This often happens when evidence is obtained from
analyzing different types of data and it is unclear how to combine
the resulting confidence levels. Figure 2 shows an example. HS3
is qualified with two confidence reports (C2 and C3), which have
different supporting evidence (HE3 and HE4) each resulting from
a different data source.</p>
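      <p>The situation with two confidence reports can be sketched in Turtle as follows. This is only an assumed encoding: the ex: properties, the confidence values and the two data source names are invented for illustration and do not come from the DISK ontology or from Figure 2.</p>
      <preformat>@prefix ex:   &lt;http://example.org#&gt; .
@prefix prov: &lt;http://www.w3.org/ns/prov#&gt; .
@prefix xsd:  &lt;http://www.w3.org/2001/XMLSchema#&gt; .

# One hypothesis statement qualified by two independent confidence reports
ex:HS3 ex:hasConfidenceReport ex:C2, ex:C3 .

ex:C2 a ex:ConfidenceReport ;
    ex:hasConfidenceLevel "0.6"^^xsd:decimal ;    # illustrative value
    ex:hasEvidence ex:HE3 .

ex:C3 a ex:ConfidenceReport ;
    ex:hasConfidenceLevel "0.8"^^xsd:decimal ;    # illustrative value
    ex:hasEvidence ex:HE4 .

# Each body of evidence is derived from a different (hypothetical) data source
ex:HE3 prov:wasDerivedFrom ex:dataSourceA .
ex:HE4 prov:wasDerivedFrom ex:dataSourceB .</preformat>
      <p>Because the two levels are kept in separate reports, nothing forces them to be merged into a single number before a combination strategy is known.</p>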
      <p>The DISK hypothesis ontology is available in OWL and
documented in [Garijo et al 2017]. A major focus of the DISK
hypothesis ontology is capturing hypothesis evolution. The rest of
this paper focuses on comparing this ontology to other
representations of scientific hypotheses in the literature.</p>
    </sec>
    <sec id="sec-6">
      <title>3 A SURVEY OF HYPOTHESIS REPRESENTATIONS</title>
      <p>In this section we present a survey of existing models of scientific
hypotheses and assess their features to support automated
discovery.</p>
    </sec>
    <sec id="sec-7">
      <title>3.1 Comparing hypothesis models</title>
      <p>In our analysis, we consider the following key aspects, based on
the representation presented in Section 2:</p>
      <p>1. Statement: Does the model have a representation for
statements in a hypothesis?</p>
      <p>2. Qualifier: Does the model have a means to qualify a
hypothesis with a confidence level?</p>
      <p>3. Evidence: Does the model describe the supporting evidence
for a hypothesis?</p>
      <p>4. History: Does the model represent the relationship between
hypothesis revisions?</p>
      <p>In addition, the following aspects are desirable for flexibility and
extensibility:</p>
      <p>5. Classification: Does the vocabulary support a taxonomy of
hypothesis statements?</p>
      <p>6. Standards: Is the model defined using standards or does it
use proprietary or idiosyncratic formats?</p>
    </sec>
    <sec id="sec-8">
      <title>3.2 Models for representing hypotheses</title>
      <p>This section introduces different approaches to represent
hypotheses at different levels of granularity. We group them
according to the level of detail at which they describe hypotheses:
coarse-grained and fine-grained representations.</p>
      <sec id="sec-8-1">
        <title>3.2.1 Coarse-grained hypothesis models</title>
        <p>We group under this section those vocabularies that include main
concepts to identify hypotheses, but do not include the means to
qualify them or describe them at a statement level. For example,
popular vocabularies like the Semantic Web for Earth and
Environmental Terminology Ontology<sup>1</sup> (SWEET) [Raskin and
Pan 2005] contain modules for defining hypotheses as
“Experimental Activities”. Likewise, the Ontology for
Biomedical Investigations (OBI)<sup>2</sup> [Bandrowski et al 2016] and
the Ontology for Clinical Research (OCRe)<sup>3</sup> [Sim et al 2014]
have concepts to refer to a hypothesis in the context of a
biological experiment.</p>
        <p>Other vocabularies include terms to further describe
hypotheses. The EXPO Ontology aims to define a model for
representing scientific experiments, "including generic knowledge
about scientific experimental design, methodology and results
representation" [Soldatova and King 2006]. The EXPO Ontology
extends common upper level ontologies in order to bridge the gap
between domain specific experiment formalization and upper
level ontologies. EXPO aims at describing scientific papers, and
has a specific part designed for the description of hypotheses. The
focus of EXPO is on how the hypothesis is defined in a research
paper (the "part of" relationship between the scientific experiment
and the hypothesis), rather than identifying the statements
contained by the hypothesis itself. However, different classes of
hypothesis are identified in the ontology (i.e., null hypothesis,
research hypothesis and scientific hypothesis).</p>
        <p>1 http://sweet.jpl.nasa.gov/2.3/reprSciModel.owl</p>
        <p>2 http://purl.obolibrary.org/obo/OBI_0001908</p>
        <p>3 http://purl.org/net/OCRe/OCRe.owl#OCRE400032</p>
        <p>Finally, the Linked Science Vocabulary<sup>4</sup> proposes a
lightweight model to express the support that research provides for a
hypothesis. A hypothesis is modeled as making predictions about
facts, but it is not described at a statement level.</p>
      </sec>
      <sec id="sec-8-5">
        <title>3.2.2 Fine-grained hypothesis models</title>
        <p>We group in this section those approaches that provide the means
to represent in detail the statements belonging to a hypothesis,
along with their metadata.</p>
        <p>LABORS [Soldatova and Rzhetsky 2011] is designed to
support investigations run by an automated system for the area of
Systems Biology and Functional Genomics. LABORS uses EXPO
as an upper level ontology, and splits the representation of
hypotheses into textual and logical representations, using concepts
from OBI and other upper level ontologies. It also allows
aggregating hypotheses with multiple statements in hypothesis
sets, using a Datalog representation for each hypothesis statement.</p>
        <p>The nanopublication model<sup>5</sup> [Groth et al 2010] aims to
represent “the smallest unit of publishable information”, i.e.,
every assertion that is part of a hypothesis graph.
Nanopublications are composed of three main graphs: an
assertion graph containing the assertion or assertions
that are part of the nanopublication; a provenance graph with
the statements that describe the provenance of the assertion graph
(e.g., that the assertion graph came from a publication or a scientific
experiment); and a publication info graph, which
contains the metadata about the nanopublication itself (e.g., who
created the nanopublication and when it was
created). Each of the graphs is represented using a named
graph,<sup>6</sup> so that it can be described with metadata from
any of the other graphs. An example can be seen in the snippet
below, where a hypothesis H1 as in Figure 1 is represented with
its provenance (sub:provenance), assertion
(sub:hypothesisAssertion) and publication (sub:pubInfo) graphs.</p>
        <preformat>@prefix sub: &lt;http://example.org/hypothesis#&gt; .
@prefix np: &lt;http://www.nanopub.org/nschema#&gt; .
@prefix prov: &lt;http://www.w3.org/ns/prov#&gt; .
@prefix xsd: &lt;http://www.w3.org/2001/XMLSchema#&gt; .
@prefix ex: &lt;http://example.org#&gt; .

sub:defaultGraph {
    sub:n1 np:hasAssertion sub:hypothesisAssertion ;
        np:hasProvenance sub:provenance ;
        np:hasPublicationInfo sub:pubInfo ;
        a np:Nanopublication, ex:Hypothesis .
}

# Statements contained in the hypothesis graph
sub:hypothesisAssertion {
    ex:EGFR ex:associatedWith ex:ColonCancer .
}

# Provenance of the assertion graph
sub:provenance {
    sub:hypothesisAssertion prov:generatedAtTime "2012-02-03T14:38:00Z"^^xsd:dateTime ;
        ex:hasConfidenceReport ex:conf1 ;
        prov:wasAttributedTo ex:experimentScientist .
    ex:conf1 a ex:ConfidenceReport ;
        ex:hasConfidenceLevel "0.6" ;
        prov:wasGeneratedBy ex:execution1 .
}

# Publication information of the nanopublication (subject assumed to be sub:n1)
sub:pubInfo {
    sub:n1 prov:generatedAtTime "2016-03-26T12:45:00Z"^^xsd:dateTime ;
        prov:wasAttributedTo ex:user1 .
}</preformat>
        <p>4 http://linkedscience.org/lsc/ns/</p>
        <p>5 http://www.nanopub.org/nschema#</p>
        <p>6 https://www.w3.org/TR/rdf11-concepts/</p>
        <p>The ovopublication model proposes a simple approach
designed to capture the provenance of assertions [Callahan and
Dumontier 2013]. When contrasted with nanopublications, "the
ovopub is simpler as it consists of only a single named graph with
key provenance information directly contained in and associated
with the ovopub graph" [Callahan and Dumontier 2013].
Ovopublications mix the notion of named graphs with reification
to refer to the different components and relationships of the
ovopublication itself. The Ovopub model is integrated as part of the
Semanticscience Integrated Ontology (SIO),<sup>7</sup> which also provides
the means to describe hypotheses as literals.</p>
        <p>The Semantic Web Applications in Neuromedicine
(SWAN) ontology<sup>8</sup> [Ciccarese et al 2008] aims to represent the
scientific discourse of biomedicine papers in general and
neuromedicine papers in particular. The model is composed of several
modules for representing discourse elements and their
relationships, different types of agents, the roles, provenance and
versioning of a given statement, and bibliographic references.
SWAN was designed to describe statements in papers (along with
the evidence supporting them). If we consider a hypothesis as a
text statement, the following example illustrates the SWAN
model:</p>
        <preformat>@prefix swande: &lt;http://purl.org/swan/1.2/discourseelements/&gt; .
@prefix swanco: &lt;http://purl.org/swan/1.2/swan-commons/&gt; .
@prefix swanqs: &lt;http://purl.org/swan/1.2/qualifiers/&gt; .
@prefix swandr: &lt;http://purl.org/swan/1.2/discourserelationships/&gt; .
@prefix swanpav: &lt;http://purl.org/swan/1.2/pav/&gt; .
@prefix swanci: &lt;http://purl.org/swan/1.2/citations/&gt; .
# ex: and xsd: are declared as in the nanopublication snippet
@prefix ex: &lt;http://example.org#&gt; .
@prefix xsd: &lt;http://www.w3.org/2001/XMLSchema#&gt; .

ex:hypothesis a swande:ResearchStatement ;
    swande:title "EGFR is associated with colon cancer subtype A"@en ;
    swanco:researchStatementQualifiedAs
        &lt;http://swan.mindinformatics.org/ontologies/1.2/rsqualifiers/hypothesis&gt; ;
    swanci:derivedFrom ex:execution1 ;
    ex:hasConfidenceReport ex:c1 ;
    swanpav:authoredBy ex:experimentScientist ;
    swanpav:createdOn "2012-02-03T14:38:00Z"^^xsd:dateTime .</preformat>
        <p>In the example, a hypothesis is extracted from a research
article. The hypothesis is represented as a statement, which can be
further described with SWAN. The provenance of the hypothesis
is captured as well, by recording the agents who created the
hypothesis statement.</p>
        <p>7 http://semanticscience.org/ontology/sio.owl</p>
        <p>8 https://www.w3.org/TR/hcls-swan/</p>
        <p>Finally, micropublications<sup>9</sup> [Clark et al 2014] are derived
from the SWAN model and can be considered a refinement of the
nanopublication model. Micropublications propose a semantic
model of scientific argumentation and evidence that supports
natural language statements, data and materials specifications,
discussion, etc. Figure 3 shows an illustrative example, where a
micropublication uses a mechanism similar to an assertion graph
to represent the claim of a protein being associated with a subtype
of colon cancer, along with its supporting evidence. The
micropublication model uses the Web Annotation Ontology<sup>10</sup> to
associate a micropublication and its contents with text from
articles.</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>4 DISCUSSION</title>
      <p>Table 1 summarizes the different candidate models for hypothesis
representation in automated discovery systems, according to the
features described in Section 3.1. Most models lack support for
qualifying a given hypothesis with confidence levels. To
overcome this issue, we may follow an approach similar to Figure
1: extend the target model with a class (ConfidenceReport) and
two properties (hasConfidenceReport and hasConfidenceLevel)
linking them together. A reason why the confidence level may not
be directly linked to a hypothesis is that the same hypothesis may
be evaluated at different points in time, resulting in multiple
confidence levels, each with different provenance information and each
included in a separate confidence report.</p>
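      <p>A minimal sketch of such an extension, written in Turtle, could look as follows. The class and property names follow the text, but the ex: namespace, the chosen domain and range axioms, the hypothesis and report identifiers, and the values and timestamps are assumptions made only for illustration.</p>
      <preformat>@prefix ex:   &lt;http://example.org#&gt; .
@prefix owl:  &lt;http://www.w3.org/2002/07/owl#&gt; .
@prefix rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt; .
@prefix prov: &lt;http://www.w3.org/ns/prov#&gt; .
@prefix xsd:  &lt;http://www.w3.org/2001/XMLSchema#&gt; .

# Extension of the target model: one class and two properties
ex:ConfidenceReport a owl:Class .

ex:hasConfidenceReport a owl:ObjectProperty ;
    rdfs:range ex:ConfidenceReport .

ex:hasConfidenceLevel a owl:DatatypeProperty ;
    rdfs:domain ex:ConfidenceReport ;
    rdfs:range xsd:decimal .          # assumed datatype

# The same (hypothetical) hypothesis evaluated at two points in time
ex:hypothesis1 ex:hasConfidenceReport ex:report1, ex:report2 .

ex:report1 a ex:ConfidenceReport ;
    ex:hasConfidenceLevel "0.6"^^xsd:decimal ;
    prov:generatedAtTime "2016-03-26T12:45:00Z"^^xsd:dateTime .

ex:report2 a ex:ConfidenceReport ;
    ex:hasConfidenceLevel "0.7"^^xsd:decimal ;
    prov:generatedAtTime "2017-03-26T12:45:00Z"^^xsd:dateTime .</preformat>
      <p>Leaving the domain of hasConfidenceReport open is a deliberate choice in this sketch, so that the property can be attached to whatever the target model uses to identify a hypothesis.</p>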
      <p>The upper half of Table 1 corresponds to the models for
coarse-grained hypothesis representation. These models include a
main concept to refer to a hypothesis, but lack the means to
describe hypothesis statements. Therefore, they do not meet
most of the requirements that DISK has for representing
hypothesis statements, qualifiers, history and evidence. However,
the LinkedScience, OBI and EXPO vocabularies define different
types of hypotheses, and may be potential candidates for reuse if
we need to define a hypothesis taxonomy.</p>
      <p>9 http://purl.org/mp</p>
      <p>10 https://www.w3.org/ns/oa</p>
      <p>The lower half of Table 1 corresponds to fine-grained models
to describe hypotheses, either defining classes and properties to
qualify hypothesis statements with provenance metadata or
relating their different parts together. Among these, the
nanopublication and micropublication models are the most
flexible approaches, compliant with most of the requirements of
the DISK model (in the last row). LABORS uses a Datalog
representation for describing hypothesis statements and is domain
specific. The ovopublication model is a simplification of the
nanopublication model to include provenance of assertions or
collections of assertions. Although it could be used for hypothesis
representation, we consider that the model would need to be
thoroughly extended. Similarly, the SWAN model is extended in
the micropublication approach to represent argumentation of facts
in publications. Therefore, the nanopublication and
micropublication models provide a richer initial framework.</p>
      <p>A major difference between micropublications and
nanopublications is the scope of the domain. For instance,
micropublications were explicitly designed to model facts and
argumentation of text statements. If an automated discovery
system aims to represent single assertions of hypotheses and their
evolution, then an argumentation framework such as the one
proposed in the micropublication model is not necessary. In
contrast, if the provenance trace includes all evidence to support a
particular claim made in a hypothesis, then micropublications are
an appropriate model to use.</p>
      <p>Another aspect to consider is the support from the
communities that are using these models. The nanopublication
model has been discussed for some time, and has available
tooling, documentation and examples.<sup>11</sup> The micropublication
model has been documented in detail with examples [Clark et al
2014], but it has not yet reached the level of adoption and tooling
that nanopublications have.</p>
      <p>11 http://nanopub.org/</p>
      <p>Finally, both the nanopublication and micropublication
models present an important limitation for representing
hypotheses: they have been designed to describe simple facts, i.e.,
single statements or a single collection of statements as part of
their claim. In the nanopublication model this is reflected by
having a unique assertion graph per nanopublication, containing
one or more statements. If we wanted to describe a hypothesis
composed of multiple statements, each with confidence levels
assigned independently by different experiments, we would have
to extend the nanopublication model. One possibility is to
create a new class (a hypothesis composition concept such as
the “hypotheses-set” in LABORS) that aggregates each of its
statements as an individual nanopublication. Likewise, each
micropublication contains a main claim graph and its support. A
mechanism for extending and aggregating micropublications
would also be needed to represent hypotheses with multiple
statements. Note that the extension would only be necessary in
both models if we wanted to keep the provenance for each
statement of the hypothesis. Otherwise, the statements can be included in the
assertion graph in the case of nanopublications or the claim graph
in the case of micropublications.</p>
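      <p>The aggregation idea for nanopublications can be sketched in TriG as shown below. This is an assumed design, not an existing extension: ex:HypothesisSet and ex:hasStatementNanopub are hypothetical names, and the second statement uses placeholder entities (ex:proteinX, ex:diseaseY) rather than a real finding. Only the np: and prov: terms come from the nanopublication schema and PROV.</p>
      <preformat>@prefix ex:   &lt;http://example.org#&gt; .
@prefix np:   &lt;http://www.nanopub.org/nschema#&gt; .
@prefix prov: &lt;http://www.w3.org/ns/prov#&gt; .

# Hypothetical aggregation class and property (not part of the np: schema)
ex:hypothesisSet1 a ex:HypothesisSet ;
    ex:hasStatementNanopub ex:n1, ex:n2 .

# First statement, published as its own nanopublication
ex:head1 {
    ex:n1 a np:Nanopublication ;
        np:hasAssertion ex:assertion1 ;
        np:hasProvenance ex:provenance1 ;
        np:hasPublicationInfo ex:pubInfo1 .
}
ex:assertion1 { ex:EGFR ex:associatedWith ex:ColonCancer . }
ex:provenance1 {
    ex:assertion1 ex:hasConfidenceReport ex:conf1 ;
        prov:wasGeneratedBy ex:execution1 .
}
ex:pubInfo1 { ex:n1 prov:wasAttributedTo ex:user1 . }

# Second statement (placeholder entities), tested by a different experiment
ex:head2 {
    ex:n2 a np:Nanopublication ;
        np:hasAssertion ex:assertion2 ;
        np:hasProvenance ex:provenance2 ;
        np:hasPublicationInfo ex:pubInfo2 .
}
ex:assertion2 { ex:proteinX ex:associatedWith ex:diseaseY . }
ex:provenance2 {
    ex:assertion2 ex:hasConfidenceReport ex:conf2 ;
        prov:wasGeneratedBy ex:execution2 .
}
ex:pubInfo2 { ex:n2 prov:wasAttributedTo ex:user1 . }</preformat>
      <p>Each statement keeps its own provenance graph in this layout, which is the reason for splitting the hypothesis across several nanopublications in the first place.</p>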
    </sec>
    <sec id="sec-10">
      <title>5 CONCLUSIONS AND FUTURE WORK</title>
      <p>In this paper we introduced the DISK hypothesis ontology for
representing hypothesis evolution, which was developed for the
DISK automated discovery system. We also presented a survey
of existing vocabularies to represent hypotheses, and assessed
their suitability in the context of automated knowledge discovery.
Future work includes extending the DISK ontology to align with
these models.</p>
      <p>Table 1. Column headers recovered from the source: Hypothesis evidence, Hypothesis history, Hypothesis classification, Use of standards. Row labels and the row assignment of individual cells are not recoverable.</p>
      <p>Hypothesis evidence, history and classification, cell values in source order: No, No, No, Partly, No, No, No, Yes, Yes, No, Yes, Yes, Yes, No, No, No, No, No, Yes, Yes, Yes, No, Yes, Yes, Yes, No, No, Yes, No, No, No, No, No.</p>
      <p>Use of standards, cell values in source order: Yes (OWL); Yes (OWL); Yes (OWL); Yes (OWL); Yes (OWL); Yes (OWL); Yes (OWL), named graphs; Yes (OWL), named graphs; Yes (OWL); Yes (OWL), named graphs; Yes (OWL), named graphs.</p>
    </sec>
    <sec id="sec-11">
      <title>ACKNOWLEDGMENTS</title>
      <p>We gratefully acknowledge support from the Defense Advanced
Research Projects Agency through the SIMPLEX program with
award W911NF-15-1-0555, and from the National Institutes of
Health under award 1R01GM117097. We also thank our
collaborators in the DISK project, especially Parag Mallick,
Ravali Adusumilli, and Hunter Boyce for their useful feedback on
this work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[Callahan and Dumontier 2013] Alison Callahan and Michel Dumontier. <article-title>Ovopub: Modular data publication with minimal provenance</article-title>. <source>arXiv preprint arXiv:1305.6800</source>, <year>2013</year>.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[Bandrowski et al 2016] Bandrowski A, Brinkman R, Brochhausen M, Brush MH, Bug B, et al. <article-title>The Ontology for Biomedical Investigations</article-title>. <source>PLOS ONE</source> <volume>11</volume>(<issue>4</issue>): e0154556, <year>2016</year>. https://doi.org/10.1371/journal.pone.0154556</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[Clark et al 2014] Tim Clark, Paolo N. Ciccarese and Carole A. Goble. <article-title>Micropublications: a semantic model for claims, evidence, arguments and annotations in biomedical communications</article-title>. <source>Journal of Biomedical Semantics</source> <volume>5</volume>:<fpage>28</fpage>, <year>2014</year>.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[Ciccarese et al 2008] Ciccarese P, Wu E, Kinoshita J, et al. <article-title>The SWAN Scientific Discourse Ontology</article-title>. <source>Journal of Biomedical Informatics</source> <volume>41</volume>(<issue>5</issue>):<fpage>739</fpage>-<lpage>751</lpage>, <year>2008</year>. doi:10.1016/j.jbi.2008.04.010.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[Garijo et al 2017] <article-title>The DISK Hypothesis Ontology</article-title>. Version 1.0.0. Available from http://disk-project.org/ontology/disk#</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[Gil et al 2016] Gil, Y.; Garijo, D.; Ratnakar, V.; Mayani, R.; Adusumilli, R.; and Boyce, H. <article-title>Automated Hypothesis Testing with Large Scientific Data Repositories</article-title>. <source>In Proceedings of the Fourth Annual Conference on Advances in Cognitive Systems (ACS)</source>, pages <fpage>1</fpage>-<lpage>6</lpage>, <year>2016</year>.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[Raskin and Pan 2005] Robert G. Raskin and Michael J. Pan. <article-title>Knowledge representation in the semantic web for Earth and environmental terminology (SWEET)</article-title>. <source>Computers &amp; Geosciences</source> <volume>31</volume>(<issue>9</issue>):<fpage>1119</fpage>-<lpage>1125</lpage>, November <year>2005</year>. doi:10.1016/j.cageo.2004.12.004.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[Sim et al 2014] Sim I, Tu SW, Carini S, et al. <article-title>The Ontology of Clinical Research (OCRe): An Informatics Foundation for the Science of Clinical Research</article-title>. <source>Journal of Biomedical Informatics</source> <volume>52</volume>:<fpage>78</fpage>-<lpage>91</lpage>, <year>2014</year>. doi:10.1016/j.jbi.2013.11.002.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[Soldatova and King 2006] Soldatova, LN and King, RD. <article-title>An Ontology of Scientific Experiments</article-title>. <source>Journal of the Royal Society Interface</source> <volume>3</volume>(<issue>11</issue>):<fpage>795</fpage>-<lpage>803</lpage>, <year>2006</year>. doi:10.1098/rsif.2006.0134.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[Soldatova and Rzhetsky 2011] Soldatova, LN and Rzhetsky, A. <article-title>Representation of research hypotheses</article-title>. <source>Journal of Biomedical Semantics</source> <volume>2</volume>(Suppl 2):<fpage>S9</fpage>, <year>2011</year>. https://doi.org/10.1186/2041-1480-2-S2-S9</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>