<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>BioOnto: Towards an Integration of Biological and Biogeographic Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marcos ZARATE</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Agustina BUCCELLA</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pablo FILLOTTRANI</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centro para el Estudio de Sistemas Marinos, Centro Nacional Patagónico, Consejo Nacional de Investigaciones Científicas y Técnicas</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Comisión de Investigaciones Científicas de la provincia de Buenos Aires</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Computer Science and Engineering Department, Universidad Nacional del Sur</institution>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>GIISCO Research Group, Computer Science Department, Universidad Nacional del Comahue</institution>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>LINVI, Faculty of Engineering, Universidad Nacional de la Patagonia San Juan Bosco</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this work we present the preliminary design of BioOnto, an ontology-based system designed to integrate two important databases for Argentine science: the National System of Biological Data (SNDB) and the Ocean Biogeographic Information System (OBIS). BioOnto uses the Web Ontology Language (OWL) to make assertions about classes in the underlying model and to define object properties that link instances therein. To accomplish the integration, we follow the Single Ontology Approach. We illustrate the usefulness of BioOnto by presenting fragments of the ontology that support inference about (1) the relationship between prey and predators and (2) which species coexist in a marine region (the Argentine region being of particular interest to us).</p>
      </abstract>
      <kwd-group>
        <kwd>Database Integration</kwd>
        <kwd>Ontology</kwd>
        <kwd>Biogeography</kwd>
        <kwd>Biodiversity</kwd>
        <kwd>Darwin Core</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Biogeography [1] is the scientific discipline that studies the distribution of living beings on
Earth, as well as the processes that have originated that distribution, that modify it, and that can
make it disappear. It is an interdisciplinary science, a branch of both geography and biology, that
draws its foundations from specialties such as botany, zoology, ecology, and evolutionary biology,
and from other sciences such as geology. There is currently a steadily growing wealth of
biogeographic and biological data from a wide range of disciplines that is available from on-line
information systems [2,3,4] around the world. In particular, as the number of species and
geographic regions under study, and the number of participating institutions and research groups,
continue to grow, addressing elaborate research questions about the complex interrelationships
among these data becomes increasingly difficult. When the information sources cannot be properly
coordinated, data access and information discovery can become a daunting task. This drives the need
to further facilitate information integration across data sources, taking into account the
different querying interests of a diversity of fields and disciplines.</p>
      <p>SNDB, supported by the Ministry of Science, Technology and Productive
Innovation of Argentina, and OBIS, supported by the Intergovernmental Oceanographic
Commission of UNESCO, are currently two of the main reference databases for ocean
biologists, ecologists, and other researchers in Argentina. Both use the Darwin Core Standard
(DwC) [5], a general-purpose vocabulary designed to facilitate the transfer and
integration of biodiversity data. Although DwC is defined in an RDF [6] document,
the integration of biodiversity data in the Semantic Web (SW) [7] is in its early stages. One
of the major challenges for DwC in the SW context is the lack of a well-defined
ontology. Without rigorous relationships between concepts and the properties that define
them, connections between biodiversity data and related semantically rich information,
such as literature and genomes, are difficult to traverse [8,9,10,11].</p>
      <p>To facilitate the integration of these two databases, we present the design of BioOnto,
an ontology that enables the integration by following the Single Ontology approach, in which
one global ontology provides a shared vocabulary for the specification of the
semantics and all information sources are related to that global ontology. We chose this
approach because it has been in use for many years and, as demonstrated in [12], its
implementation is successful in cases where the nature of the data is similar. In addition,
BioOnto is designed to link data with other RDF-compliant systems, following
the principles established by the Linked Open Data (LOD) initiative.</p>
      <p>The paper is organized as follows: Section 2 reviews the state of the art and the
integration problems that exist today. Section 3 describes the architecture, design
considerations, and features of BioOnto. Section 4 presents a use case that demonstrates how
the information from both data sets is integrated to better understand the behavior of a
particular species. Finally, Section 5 presents the conclusions and future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>In this section we review work related to the integration of data in the biodiversity and
biogeography domains. The first work considered is [13], developed by OBIS-ENV-DATA,
which proposes an extension of DwC to expand OBIS with environmental data in order to
manage combined datasets effectively. Although this brings advantages for incorporating
environmental data, the integration problem inherent in the DwC standard itself persists.
In MarineTLO [14] the authors define a core ontology for publishing
marine data for the European iMarine project, which is suitable for setting up
warehouses that can serve complex queries. The main objective is to assemble
information from different datasets to give more details about a particular marine species. However,
the DwC standard is not used to build the underlying ontology, which limits the possibility
of integrating biodiversity data that conforms to DwC. BiSciCol [15] describes
an architecture to convert biodiversity data in standard tabular formats, such as Darwin
Core Archives [16], into RDF representations. In this case an ontology is described
using the DwC terms, but there are no examples of inferences or queries that could
help researchers interested in integrating different databases; moreover,
the project page is currently not accessible. We plan to use the DwC standard to capture
complex aspects of the biodiversity domain. In [17] the authors describe RDF-based data
structures to be employed in the creation of a centralized repository of metadata
from the heterogeneous data sources in the RITMARE research network<sup>7</sup>. A different
approach is taken in [18], where the authors propose the creation of micro linked open
data clouds formed by oceanographic LOD-compliant datasets. A set of ontology
design patterns called the GeoLink Modular Oceanography Ontology is designed in [19]
for the oceanography domain. The resulting patterns are sufficiently modular, and thus
arguably easier to extend than foundational top-level ontologies. Currently, the GeoLink
project is in the middle of populating the patterns with actual data, and a very preliminary
evaluation demonstrated that the patterns together can serve as an integrating layer for
heterogeneous oceanographic data repositories.</p>
      <p>Current research is trying to address the existing gap in integrating biological,
oceanographic, and biogeographic data. However, a critical look at the available literature
indicates that there are limitations related to: (1) the absence of robust, standardized, and
widely accepted vocabularies and ontologies for linkable biodiversity and biogeography
data; and (2) the disagreements presently governing the use of identifiers in biodiversity and
biogeography data, which are a major impediment to integrating these data.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Proposed Architecture and Ontology Details</title>
      <p>The proposed architecture, shown in Figure 1, is divided into three layers. (1)
Input Data represents the OBIS and SNDB databases, from which the data can be
downloaded in CSV format. This information is then converted to RDF using the
OpenRefine tool [20], where the data inside these files are cleaned and converted to
standardized data types (dates, numerical values, etc.). The data are converted to RDF triples
using RDF Refine<sup>8</sup>, an extension of OpenRefine. The columns of the CSV
file are mapped as instances of DwC classes. Every resource must have a URI that
can be used to link that resource to other resources, both within this dataset and in
others anywhere on the web. The base URI common to the main classes is
bio-onto:http://www.cenpat-conicet.gob.ar/ontology/. The CSV file has
a column with a unique identifier, “ID”, whose values are used to build identity URIs. We
use GREL (General Refine Expression Language<sup>9</sup>) to generate each new URI. The
expression we defined is "occurrence/"+value.urlify(), which concatenates the string
"occurrence/" and the value of the ID to the base URI, so that a generated URI
looks like bio-onto:occurrence/valueID. The
resulting RDF describing an occurrence is:</p>
      <preformat>SUBJECT    bio-onto:occurrence/f6bbf85d-85ea-4605
PREDICATE  rdf:type
OBJECT     dwc:Occurrence</preformat>
      <p>Here dwc: is an abbreviation for the namespace http://rs.tdwg.org/dwc/terms/.
Table 1 describes the mapping performed, together with the columns of the CSV file used to
generate the main URIs. The complete mapping is available online<sup>10</sup>. (2) Global
Ontology and Storage, in which the RDF triples are stored in GraphDB<sup>11</sup>, a highly
efficient and robust graph database with OWL inference and SPARQL [21] support. Users
can then access the data resources easily using the different visualizations provided
by GraphDB<sup>12</sup>. (3) Information Retrieval, which provides a transparent interface to
non-expert users, so that they can perform queries without needing to know the details of the
SPARQL query language.</p>
      <p><sup>7</sup> http://www.ritmare.it/en/
<sup>8</sup> http://refine.deri.ie/
<sup>9</sup> https://github.com/OpenRefine/OpenRefine/wiki/GREL-Functions</p>
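      <p>The URI construction described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the GREL expression itself: percent-encoding with urllib.parse.quote stands in for value.urlify(), whose exact normalisation rules are not reproduced here, and the function names occurrence_uri and occurrence_triple are hypothetical.</p>
      <preformat><![CDATA[
```python
from typing import Tuple
from urllib.parse import quote

# Base namespace given in the text for the bio-onto: prefix.
BASE_URI = "http://www.cenpat-conicet.gob.ar/ontology/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DWC_OCCURRENCE = "http://rs.tdwg.org/dwc/terms/Occurrence"


def occurrence_uri(record_id: str) -> str:
    """Build an occurrence URI from the value of the CSV "ID" column.

    Assumption: percent-encoding approximates GREL's value.urlify().
    """
    return BASE_URI + "occurrence/" + quote(record_id.strip(), safe="")


def occurrence_triple(record_id: str) -> Tuple[str, str, str]:
    """Return the (subject, predicate, object) typing triple for one CSV row."""
    return (occurrence_uri(record_id), RDF_TYPE, DWC_OCCURRENCE)
```
]]></preformat>
      <p>Calling occurrence_triple with the identifier f6bbf85d-85ea-4605 yields the fully expanded form of the triple listed above.</p>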
      <sec id="sec-3-1">
        <title>3.1. A Multidisciplinary Approach</title>
        <p>The development of the ontology was initiated within CENPAT<sup>13</sup>. The research
lines at CENPAT include marine biology, aquatic resource management, oceanography,
paleontology, and biological diversity, among others. A working group at CENPAT
focuses on issues related to the management and conservation of marine
resources; this research group, called CESIMAR (Center for the Study of
Marine Systems), has a long tradition of studying the biology and behavior of marine
mammals. Most of this research generates a large volume of data which,
depending on its nature, is published in SNDB (biological data) or OBIS
(biogeographic data). To date, no tool has been developed that
allows both databases to be integrated. For example, it is of great interest to know which
species coexist in a certain marine region, or to establish trophic relations between marine
mammals.</p>
        <p><sup>10</sup> https://github.com/cenpat-gilia/BioOnto/blob/master/scripts/mapping.json
<sup>11</sup> http://www.ontotext.com/products/ontotext-graphdb/
<sup>12</sup> http://web.cenpat-conicet.gob.ar:7200/sparql
<sup>13</sup> Patagonian National Research Centre (CENPAT), http://www.cenpat-conicet.gob.ar/</p>
        <p>After several interviews with the domain experts, the key targets of the ontology
model were specified as follows: (1) establish marine regions, determined by maximum and
minimum latitudes; (2) establish which species of marine mammals are prey or predator; and
(3) determine which species coexist in the economic areas pertaining to Argentina, in
particular to see whether commercial fishing alters the ecological balance of
marine species.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Ontology Development Life-cycle</title>
        <p>Regarding the methodological strategy of our approach, we keep in line with the
tradition that considers an ontology an engineering artifact that is useful for modeling some
aspects of the world. We therefore adopted the methodology defined in [22]. The main
stages are: (1) Analysis: We planned to start the ontology from the DwC classes. We
decided to first represent these objects as concepts, then see whether some
concepts were lacking or inadequate, and eventually adjust the structure. (2) Building the
ontology: The building process is iterative. It can be broken down into finding
conditions to constrain the concepts, introducing the properties and/or concepts needed
to build the conditions, and building the subsumption hierarchies of concepts and
properties. (3) Evaluation: We take into account two important aspects
highlighted in [23]. The first is the consistency of the ontology, since an inconsistent ontology would
yield questionable results. This task is performed using Ontology Debugger<sup>14</sup>, a plugin for
Protégé, and the Pellet reasoner<sup>15</sup>; we test consistency after each set of changes,
even if the changes are supposed to be simple. The second is
complexity, which can be evaluated by asking the reasoner to classify the ontology. If this test
takes too much time, it is likely that the ontology will not be usable in real conditions,
and corrections must be made. Since the ontology size usually cannot be
reduced, the general idea is to write simpler restrictions on properties, that is, to use
a less complicated logic where possible. For instance, using existential restrictions instead
of qualified cardinality restrictions helps keep the complexity lower for the reasoner.
(4) Maintenance: Once the complexity test shows adequate performance, we
check the ontology’s performance in real use. This is done by testing the applications
that exploit the ontology and evaluating their performance, both in terms of execution speed
and result quality. The analysis of the results helps us fine-tune the ontology to our exact
needs.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. The Ontology Structure</title>
        <p>At this point one might think that the integration of both databases is solved: since both
are converted to RDF, their integration should be simple, given that a URI can indicate
what kind of entity it identifies. But this is not enough. It is necessary to impose a layer
of computationally tractable meaning, in which the relationships that hold between entities
can be accurately interpreted and used in an automated way, for example by establishing
relationships between instances of classes through an inverse property, defining
cardinality restrictions, or establishing subsumption relationships. Description Logics (DL) [24] are
an adequate means of representing ontologies. Furthermore, OWL is based on DL, so
we decided to describe our ontology using DL and to implement it in OWL-DL [25].
Both are well supported by existing reasoners and are particularly
suitable for the type of reasoning we intend to perform at this stage of the work. BioOnto is
based primarily on the DwC classes and terms, but also includes classes and properties defined by
domain experts.</p>
        <p>As mentioned earlier, DwC [5] is a body of standards for biodiversity
informatics. It provides stable terms and vocabularies for sharing biodiversity data. DwC is
maintained by TDWG<sup>16</sup> (Biodiversity Information Standards, formerly The International
Working Group on Taxonomic Databases). These terms are organized into nine
categories (often referred to as classes), six of which cover broad aspects of the biodiversity
domain. Occurrence refers to the existence of an organism at a particular place at a particular
time; Location is the place where the organism was observed (normally a geographical
region or place); the Event class is the relationship between Occurrence and Location, and
registers sampling protocols and methods, dates, times, and field notes; Taxon refers
to the scientific names, vernacular names, etc. of the organism observed. The remaining
categories cover relationships to other resources, measurements, and generic information
about records, which are not used in our case. Specifically for the record level, DwC
recommends the use of a number of terms from Dublin Core<sup>17</sup>.</p>
        <p><sup>14</sup> https://git-ainf.aau.at/interactive-KB-debugging/debugger/wikis/onto-debugger
<sup>15</sup> https://github.com/stardog-union/pellet
<sup>16</sup> http://www.tdwg.org/</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.3.1. Classes and Properties</title>
        <p>The main classes of the ontology are taken from the vocabulary specified in DwC, as
are some of its other classes and properties. The BioOnto structure and the properties
assigned to its classes support a series of useful reasoning tasks that can be expressed as
SPARQL queries (see Section 4 for examples). The general model of BioOnto can be
seen in Figure 2. The class hierarchy of the OWL ontology consists of seven key classes.
Occurrence represents instances where the presence of an organism at a Location is
observed; it functions primarily as a node that connects Taxon to Events. Location is
a spatial region or named place; in DwC, it is a set of terms describing a place, whether
named or not. Event functions primarily as a node that connects one or more Occurrences
to an Event, and one or more Events to a Location. Taxon represents a unit of biodiversity,
e.g. species, genus, family, etc. Agent is the foaf:Agent class, imported directly into BioOnto;
an agent may be, e.g., a person, group, software, or physical artifact. Region is an area of the
ocean delimited by latitude and longitude. Dataset identifies the data set, which
belongs to either OBIS or SNDB.</p>
        <p>BioOnto defines a number of properties used to link classes (see Table 2), and
also reuses properties from commonly used vocabularies such as the Basic Geo Vocabulary
(WGS84 lat/long<sup>18</sup>) and FOAF<sup>19</sup>. In this way BioOnto accrues most of the scientific
and technical vocabulary required to achieve a semantic understanding of the most
commonly used terms in biogeography and biodiversity. The ontology in OWL is
available online<sup>20</sup>.</p>
        <p><sup>17</sup> http://dublincore.org/documents/dcmi-terms/
<sup>18</sup> https://www.w3.org/2003/01/geo/
<sup>19</sup> http://xmlns.com/foaf/spec/
<sup>20</sup> https://github.com/cenpat-gilia/BioOnto/blob/master/ontology/BioOnto.owl</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Case study: Behavior of Marine Mammals</title>
      <p>This section develops use cases that are relevant to researchers dedicated to the
conservation of marine species and the study of animal behavior. One of the most common
problems faced by scientists is to determine which species coexist in the Argentinean
Exclusive Economic Zone (AEEZ<sup>21</sup>). One species of particular interest to
scientists on the Argentine sea littoral is Mirounga leonina (Southern Elephant Seal),
because of the relevant information it provides about the sea. Individuals of
Mirounga leonina are used as oceanographic sampling platforms [26]: they are ideal
carriers of electronic devices that register parameters such as salinity, position, depth, and
temperature, providing huge amounts of information associated with key habitats.</p>
      <p>The data collected from different individuals during their migration are captured by
these devices and are available in OBIS. In some cases, new biodiversity-related
knowledge may be discovered by scientists using SNDB (e.g. determining which fish species
live in the same region as the elephant seals). This information may not exist in OBIS
records, and it is therefore highly desirable to be able to integrate it from
SNDB in a transparent way. The following subsections show examples of the axioms
that were defined in the ontology to answer these questions.</p>
      <sec id="sec-4-1">
        <title>4.1. Inferring Predator/Prey Relationship</title>
        <p>Scientists recently discovered that lanternfish make up a large percentage of the
Southern Elephant Seal's diet [27]. It is therefore possible to define OWL axioms that establish the
predator/prey relationship, knowledge that was only implicit in the ontology. Using
rolification and property chain axioms, this inference can be expressed in OWL
with the following three axioms, written in Manchester OWL syntax [28]:</p>
        <preformat>(1) R_Elephant some Self
(2) R_Lanternfish some Self
(3) R_Elephant o owl:topObjectProperty o R_Lanternfish SubPropertyOf is_predator_of</preformat>
        <p><sup>21</sup> http://www.marineregions.org/gazetteer.php?p=details&amp;id=8466</p>
        <p>Here R_Elephant and R_Lanternfish are object properties. It is also
necessary to add atoms of the form U(t,u), where U is the universal property (i.e.,
owl:topObjectProperty), which connects any two individuals. In this way, after the reasoner runs, all instances of
the class ElephantSeal are related through the property is_predator_of to all
instances of the class LanternFish. This approach is taken from [29]: instead
of using SWRL to define rules, we use OWL axioms.</p>
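        <p>The effect of the three axioms can be illustrated outside OWL with a short Python sketch. It only mimics the result a reasoner is expected to derive, namely one is_predator_of pair for every combination of an ElephantSeal instance and a LanternFish instance; the function name and the individuals seal_1, seal_2, and fish_1 are hypothetical and do not come from the datasets.</p>
        <preformat><![CDATA[
```python
from itertools import product


def infer_is_predator_of(elephant_seals, lanternfish):
    """Pair every ElephantSeal instance with every LanternFish instance.

    The chain R_Elephant o owl:topObjectProperty o R_Lanternfish holds
    between x and y exactly when x is an ElephantSeal and y is a
    LanternFish, so the inferred relation is the product of both sets.
    """
    return set(product(elephant_seals, lanternfish))


# Hypothetical individuals, for illustration only.
seals = {"seal_1", "seal_2"}
fish = {"fish_1"}
inferred = infer_is_predator_of(seals, fish)
```
]]></preformat>
        <p>If either class has no instances, the product is empty and no is_predator_of assertion is inferred.</p>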
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Inferring Species into Argentinean Exclusive Economic Zone</title>
        <p>To determine the species that inhabit the AEEZ, we can use OWL axioms (since the data are
georeferenced by latitude and longitude) to infer that a certain species is present
in the AEEZ. For this we use the object property into_eez and the following axiom,
expressed in Manchester OWL syntax:</p>
        <preformat>(4) Location and (Lat some xsd:double[&gt;= "-58.4046"^^xsd:double, &lt;= "-32.4483"^^xsd:double])
    and (long some xsd:double[&gt;= "-69.6095"^^xsd:double, &lt;= "-52.631"^^xsd:double])
    SubClassOf into_eez value AEEZ</preformat>
        <p>Any instance of the Location class whose latitude and longitude fall within these
minimum and maximum values is related by the property into_eez to AEEZ, an instance
of the ArgentineanEEZ class that represents the exclusive economic zone of
Argentina.</p>
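        <p>The membership test encoded by axiom (4) amounts to a bounding-box check, sketched below in Python. The bounds are copied from the axiom; the function name into_eez simply echoes the property name, and treating the zone as a rectangle follows the axiom rather than the actual shape of the zone.</p>
        <preformat><![CDATA[
```python
# Bounds copied from axiom (4); the AEEZ is approximated by a bounding box.
LAT_MIN, LAT_MAX = -58.4046, -32.4483
LONG_MIN, LONG_MAX = -69.6095, -52.631


def into_eez(lat: float, long: float) -> bool:
    """Return True when the coordinates fall inside the box (bounds inclusive)."""
    return LAT_MIN <= lat <= LAT_MAX and LONG_MIN <= long <= LONG_MAX
```
]]></preformat>
        <p>Both bounds are inclusive, matching the &gt;= and &lt;= facets used in the axiom.</p>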
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Querying and Reasoning Facilitated by BioOnto</title>
        <p>This section shows an example of a simple SPARQL query that explores the
data of both datasets; in particular, we use the information associated with the
species as well as their location. The following query retrieves the location
of all mammals contained in the two datasets. This query is important because it
enables future work on interfaces that visualize georeferenced
data. The query can be run through the SPARQL interface of GraphDB<sup>22</sup>.</p>
        <preformat>PREFIX geo-pos: &lt;http://www.w3.org/2003/01/geo/wgs84_pos#&gt;
PREFIX bio-onto: &lt;http://www.cenpat-conicet.gob.ar/ontology/&gt;
PREFIX dwc: &lt;http://rs.tdwg.org/dwc/terms/&gt;
SELECT ?s_name ?lat ?long
WHERE {
  ?s a dwc:Occurrence .
  ?s bio-onto:has_taxon ?taxon .
  ?taxon dwc:scientificName ?s_name .
  ?taxon dwc:class ?class .
  ?s bio-onto:has_event ?event .
  ?event bio-onto:has_location ?loc .
  ?loc geo-pos:lat ?lat .
  ?loc geo-pos:long ?long .
  FILTER regex(STR(?class), "Mammalia")
}</preformat>
        <p><sup>22</sup> http://web.cenpat-conicet.gob.ar:7200/sparql?savedQueryName=bio-Loc-of-mammals</p>
        <p>In this section we showed examples that answer questions raised by researchers
studying animal behavior. The predator/prey case is interesting because it
allows us to determine, by reasoning, the species that are part of the food chain, drawing on
the species of both OBIS and SNDB. In addition, we can determine when
a species is considered a top predator, namely when it is not prey to any other species. Regarding
marine regions, we offer the user the possibility of defining a specific marine
region on which to perform some type of analysis. Reasoning then makes it possible to identify the species that
were observed there and to determine, for example, whether the presence of a species in that region
is due to the fact that it feeds on another one.</p>
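        <p>The top-predator criterion mentioned above (a species that is not prey to any other species) can be sketched in Python over a list of predator/prey pairs. The sketch assumes the list is complete, a closed-world assumption that an OWL reasoner does not make by itself; the function name and the second pair in the sample data are hypothetical.</p>
        <preformat><![CDATA[
```python
def top_predators(is_predator_of):
    """Return the species that prey on others but are prey to none.

    `is_predator_of` is an iterable of (predator, prey) pairs that is
    assumed to be complete (closed world).
    """
    pairs = list(is_predator_of)
    predators = {predator for predator, _ in pairs}
    prey = {victim for _, victim in pairs}
    return predators - prey


# Sample pairs: the first follows the text, the second is hypothetical.
pairs = [
    ("Southern Elephant Seal", "Lanternfish"),
    ("Lanternfish", "hypothetical_prey"),
]
tops = top_predators(pairs)
```
]]></preformat>
        <p>With the sample pairs, only the Southern Elephant Seal remains in the result, because Lanternfish also appears as prey.</p>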
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions and Future Work</title>
      <p>We have presented the initial steps in the design of an ontology called BioOnto,
developed with researchers from CENPAT. The proposed model enables interoperability
and a common knowledge representation between the SNDB and OBIS databases, allowing
the retrieval of information that cannot be gathered from either of the individual information
sources alone. Specifically, we presented preliminary results of an ontology that (1)
allows predator/prey relationships to be established between marine species, and (2) defines marine
geographic regions and determines which species live there.</p>
      <p>We are currently working on the integration of other databases (e.g. FishBase<sup>23</sup>)
that are required in research tasks within CENPAT to answer questions about the
balance of species and their relationship with commercial fishing.</p>
      <p>We know that this databases could be integrated into the input data layer since the
proposed architecture is flexible to integrate data in structured or semi-structured format.
Furthermore in future works we will consider to link our ontology with relevant marine
ontologies24.
23http://www.fishbase.org/
24http://mmisw.org/
[6] Ora Lassila and Ralph R Swick. Resource description framework (rdf) model and syntax specification.</p>
      <p>
        1999.
[7] Tim Berners-Lee, James Hendler, Ora Lassila, et al. The semantic web. Scientific american, 284(
        <xref ref-type="bibr" rid="ref5">5</xref>
        ):28–
37, 2001.
[8] Steven J Baskauf, John Wieczorek, John Deck, and Campbell O Webb. Lessons Learned from Adapting
the Darwin Core Vocabulary Standard for Use in RDF.
[9] Benjamin M Good and Mark D Wilkinson. The life sciences semantic web is full of creeps! Briefings
in bioinformatics, 7(
        <xref ref-type="bibr" rid="ref3">3</xref>
        ):275–286, 2006.
[10] O James Reichman, Matthew B Jones, and Mark P Schildhauer. Challenges and opportunities of open
data in ecology. Science, 331(6018):703–705, 2011.
[11] Anne E Thessen and David J Patterson. Data issues in the life sciences. 2011.
[12] Yigal Arens, Chun-Nan Hsu, and Craig A Knoblock. Query processing in the sims information mediator.
      </p>
      <p>Advanced Planning Technology, 32:78–93, 1996.
[13] Daphnis De Pooter, Ward Appeltans, Nicolas Bailly, Sky Bristol, Klaas Deneudt, Menashe` Eliezer,
Ei Fujioka, Alessandra Giorgetti, Philip Goldstein, Mirtha Lewis, et al. Toward a new data standard
for combined marine biological and environmental datasets-expanding obis beyond species occurrences.</p>
      <p>
        Biodiversity Data Journal, (
        <xref ref-type="bibr" rid="ref5">5</xref>
        ), 2017.
[14] Yannis Tzitzikas, Carlo Allocca, Chryssoula Bekiari, Yannis Marketakis, Pavlos Fafalios, Martin Doerr,
Nikos Minadakis, Theodore Patkos, and Leonardo Candela. Integrating Heterogeneous and
Distributed Information about Marine Species through a Top Level Ontology. In Metadata and Semantics
Research: 7th Research Conference, MTSR 2013, Thessaloniki, Greece, November 19-22, 2013.
Proceedings, pages 289–301. Springer International Publishing, 2013.
[15] Brian J Stucky, John Deck, Tom Conlin, Lukasz Ziemba, Nico Cellinese, and Robert Guralnick. The
biscicol triplifier: bringing biodiversity data to the semantic web. BMC bioinformatics, 15(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ):257, 2014.
[16] K D o¨ring M Robertson T Remsen D, Braak. Darwin Core Archive How-To Guide. 2011.
[17] Cristiano Fugazza, Anna Basoni, Stefano Menegon, Alessandro Oggioni, Fabio Pavesi, Monica Pepe,
Alessandro Sarretta, and Paola Carrara. RITMARE: Semantics- Aware harmonisation of data in Italian
marine research. In Procedia Computer Science, 2014.
[18] Adam Leadbetter, Robert Arko, Cynthia Chandler, Adam Shepherd, and Roy Lowry. Linked Data An
      </p>
      <p>
        Oceanographic Perspective. The Journal of ocean Technology, 8(
        <xref ref-type="bibr" rid="ref3">3</xref>
        ), 2013.
[19] Adila Krisnadhi, Yingjie Hu, Krzysztof Janowicz, Pascal Hitzler, Robert Arko, Suzanne Carbotte,
Cynthia Chandler, Michelle Cheatham, Douglas Fils, Timothy Finin, Peng Ji, Matthew Jones, Nazifa
Karima, Kerstin Lehnert, Audrey Mickle, Thomas Narock, Margaret OBrien, Lisa Raymond, Adam
Shepherd, Mark Schildhauer, and Peter Wiebe. The GeoLink modular oceanography ontology. In
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture
Notes in Bioinformatics), 2015.
[20] Ruben Verborgh and Max De Wilde. Using OpenRefine. Packt Publishing Ltd, 2013.
[21] Eric Prud’hommeaux and Andy Seaborne. SPARQL query language for RDF – W3C recommendation.
      </p>
      <p>
        Technical report, W3C, 2008.
[22] Natalya F Noy, Deborah L McGuinness, et al. Ontology development 101: A guide to creating your first
ontology, 2001.
[23] Steffen Staab and Rudi Studer. Handbook on ontologies. Springer Science &amp; Business Media, 2010.
[24] Franz Baader. The description logic handbook: Theory, implementation and applications. Cambridge
University Press, 2003.
[25] Ian Horrocks, Peter F Patel-Schneider, and Frank Van Harmelen. From SHIQ and RDF to OWL: The making
of a web ontology language. Web Semantics: Science, Services and Agents on the World Wide Web,
1(1):7–26, 2003.
[26] Fabien Roquet, Carl Wunsch, Gael Forget, Patrick Heimbach, Christophe Guinet, Gilles Reverdin,
Jean Benoit Charrassin, Frederic Bailleul, Daniel P. Costa, Luis A. Huckstadt, Kimberly T. Goetz, Kit M.
Kovacs, Christian Lydersen, Martin Biuw, Ole A. Nøst, Horst Bornemann, Joachim Ploetz, Marthan N.
Bester, Trevor McIntyre, Monica C. Muelbert, Mark A. Hindell, Clive R. McMahon, Guy Williams,
Robert Harcourt, Iain C. Field, Leon Chafik, Keith W. Nicholls, Lars Boehme, and Mike A. Fedak.
Estimates of the Southern Ocean general circulation improved by animal-borne instruments. Geophysical
Research Letters, 2013.
[27] Jade Vacquié-Garcia, Christophe Guinet, Cécile Laurent, and Frédéric Bailleul. Delineation of the
southern elephant seal's main foraging environments defined by temperature and light conditions. Deep Sea
Research Part II: Topical Studies in Oceanography, 113:145–153, 2015.
[28] Matthew Horridge, Nick Drummond, John Goodwin, Alan L Rector, Robert Stevens, and Hai Wang.
      </p>
      <p>The Manchester OWL syntax. In OWLED, volume 216, 2006.
[29] Md Kamruzzaman Sarker, David Carral, Adila Alfa Krisnadhi, and Pascal Hitzler. Modeling OWL with
rules: The ROWL Protégé plugin. In International Semantic Web Conference (Posters &amp; Demos), 2016.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Mark V</given-names>
            <surname>Lomolino</surname>
          </string-name>
          and
          <string-name>
            <given-names>James H</given-names>
            <surname>Brown</surname>
          </string-name>
          .
          <source>Biogeography</source>
          . Number QH84 L65.
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>PN</given-names>
            <surname>Halpin</surname>
          </string-name>
          , Andrew J Read, BD Best, KD Hyrenbach, E Fujioka, MS Coyne, Larry B Crowder,
          <string-name>
            <given-names>SA</given-names>
            <surname>Freeman</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C</given-names>
            <surname>Spoerri</surname>
          </string-name>
          .
          <article-title>OBIS-SEAMAP: developing a biogeographic research data commons for the ecological studies of marine mammals, seabirds, and sea turtles</article-title>
          .
          <source>Marine Ecology Progress Series</source>
          ,
          <volume>316</volume>
          :
          <fpage>239</fpage>
          -
          <lpage>246</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Vishwas</given-names>
            <surname>Chavan</surname>
          </string-name>
          and
          <string-name>
            <given-names>Lyubomir</given-names>
            <surname>Penev</surname>
          </string-name>
          .
          <article-title>The data paper: a mechanism to incentivize data publishing in biodiversity science</article-title>
          .
          <source>BMC bioinformatics</source>
          ,
          <volume>12</volume>
          (
          <issue>15</issue>
          ):
          <fpage>S2</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Vishwas S</given-names>
            <surname>Chavan</surname>
          </string-name>
          and
          <string-name>
            <given-names>Peter</given-names>
            <surname>Ingwersen</surname>
          </string-name>
          .
          <article-title>Towards a data publishing framework for primary biodiversity data: challenges and potentials for the biodiversity informatics community</article-title>
          .
          <source>BMC bioinformatics</source>
          ,
          <volume>10</volume>
          (
          <issue>14</issue>
          ):
          <fpage>S2</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>John</given-names>
            <surname>Wieczorek</surname>
          </string-name>
          , David Bloom,
          <string-name>
            <given-names>Robert</given-names>
            <surname>Guralnick</surname>
          </string-name>
          , Stan Blum, Markus Döring, Renato Giovanni, Tim Robertson, and David Vieglais.
          <article-title>Darwin core: An evolving community-developed biodiversity data standard</article-title>
          .
          <source>PLoS ONE</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>