<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Design of a Framework to Support Reuse of Open Data about Agriculture</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alec Gordon</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mohammad Sadnan Al Manir</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Brandon Smith</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amir Rezaie</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christopher J.O. Baker</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of New Brunswick</institution>
          ,
          <addr-line>Saint John</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Online Datasets in Open Data Portals typically have minimal metadata and users wishing to consider their reuse in extended analyses are poorly served. One approach is to find and re-annotate the metadata according to subject-specific, community adopted vocabularies. In support of this we explore a multi-tiered framework combining the capabilities of a crawler, a tagger and a recommendation engine, as well as tools for the provisioning of data as discoverable services. We provide details of prototype scale implementations of these components and a cursory evaluation of the tagger for subject-specific metadata enrichment using the Global Agricultural Concept Scheme (GACS).</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>Introduction</title>
      <sec id="sec-2-1">
        <title>Open Data about Agriculture</title>
        <p>
          A rudimentary search for Open Data tagged with the term agriculture identified
datasets in a variety of country/geography specific portals [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] including USA<sup>2</sup>,
UK<sup>3</sup>, France<sup>4</sup>, Australia<sup>5</sup>, Canada<sup>6</sup>, the Netherlands<sup>7</sup>, and the continent of Africa<sup>8</sup>.
These datasets are published in a range of data formats, and the permitted modes
of access also vary. The following formats were found: CSV, XML, HTML,
GML, FGDB/GDB, PDF, DOC, ArcGIS, KML, ODT, ZIP, API, ArcGIS Map
Service, XLSX, JSON and RDF/OWL. Given that Open Data Portals (ODPs) often use a limited
number of tags, a more granular breakdown of the specific subtopics is necessary
for the domain of agriculture. Recently, the Agriculture Open Data Package
(AgPack<sup>9</sup>) has introduced 14 key data categories on agriculture policy and food
security perspectives that can be applied to datasets, although such tags are not
yet in common use in ODPs.
        </p>
        <p>
          Earlier approaches to publishing structured Open Data have leveraged
community adopted controlled vocabulary terms and dataset definitions expressed
in Resource Description Framework (RDF) serialization formats, known as
Linked Open Data [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. This approach affords users the option to query over
linked data using the SPARQL<sup>10</sup> query language. One such deployment of this
approach is the Agronomic Linked Data project (AgroLD) [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] which provides
access to data resources about plants in the form of an RDF graph for domain
experts, such as bioinformaticians. The extent to which target data is readily
discoverable and queryable depends on the skills of the end users who need to
be proficient with SPARQL and related tools.
        </p>
        <p>
          In recent years, the Global Open Data for Agriculture and Nutrition
(GODAN<sup>11</sup>) project has advocated for the publication of open data and the creation
of ecosystems where agricultural data is Findable, Accessible, Interoperable, and
Reusable (FAIR) [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Target Functionality and Design Challenges</title>
        <p>The current state of the ODPs containing agricultural data provides good
motivation for the creation of a dedicated infrastructure that supports
comprehensive Open Data exploration for potential reuse. Primarily, users want to
i) search for and query across globally distributed agricultural datasets based
on multiple keywords and defined relations, and ii) retrieve integrated data in
a unified standard format so that they are compatible and readily usable with
third party tools.</p>
        <p>2 https://catalog.data.gov/dataset
3 https://data.gov.uk/
4 https://www.data.gouv.fr/en/datasets/
5 https://data.gov.au/dataset
6 https://open.canada.ca/data/en/dataset?portal_type=dataset
7 https://data.overheid.nl/data/dataset
8 https://africaopendata.org/dataset
9 https://opendatacharter.net/agriculture-open-data-package/
10 https://www.w3.org/TR/sparql11-overview/
11 https://www.godan.info/</p>
        <p>In order for an infrastructure to support these capabilities it needs to address
the following tasks: (i) regular crawling of the Web for sites related to
agriculture, (ii) screening of Open Data files and indexing them, (iii) downloading and
scanning the files for key agriculture vocabulary terms, (iv) generating subject
specific metadata for the data files, (v) recommending relevant datasets based
on curated metadata, (vi) change management and revision of metadata, (vii)
provision of data resources as discoverable Web services, and (viii) publishing
data according to interoperability standards.</p>
        <p>In this paper we propose a multi-tier framework (Section 2) for the harvesting
of Open Data files, subject specific enrichment of metadata, and the
provisioning of Open Data as services. Using the target use case of Open Data about
agriculture and leveraging the Global Agricultural Concept Scheme (GACS),
we provide details of prototype scale implementations (Section 3) and a cursory
evaluation of the tagger (Section 4). In Section 5 we briefly discuss the
framework in the context of the target functionality and list future work. Section 6
contains concluding remarks.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Framework</title>
      <p>The multi-tier framework presented in Figure 1 provides a solution to support
better discovery and reuse of Open Agricultural Data.</p>
      <p>
        As shown in Figure 1, the Data Sources column displays two sources of data:
i) Open Agricultural Datasets, which are generated and collected during
typical agricultural activities, and ii) a Controlled Vocabulary of Agriculture and
Nutrition, such as the Global Agricultural Concept Scheme (GACS), which consists
of agricultural concepts mapped from three
well known sources: the AGROVOC multilingual agricultural thesaurus by the
Food and Agriculture Organization (FAO) of the United Nations, the CAB
Thesaurus by the Centre for Agriculture and Bioscience International (CABI),
and the NAL Thesaurus by the US National Agricultural Library [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>In Phase 1, the country-specific ODPs hosting the Open Agricultural Datasets
are crawled and indexed for further processing. The crawler uses seed URLs of
the ODPs as inputs, fetches contents such as text, data, and hyperlinks from
recursively-linked pages, parses and stores them as segments, from which an
index is then created. Off-the-shelf crawlers<sup>12</sup> and indexers<sup>13</sup> can also be used for
this purpose.</p>
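      <p>The Phase 1 loop (seed URLs in, pages fetched, links followed recursively, an index out) can be sketched as follows. This is a minimal illustration under stated assumptions, not the prototype's code: the names <monospace>crawl</monospace>, <monospace>PageParser</monospace> and <monospace>max_pages</monospace> are hypothetical, the <monospace>fetch</monospace> callable is injected so the sketch runs without network access, the crawl is restricted to the hosts of the seed URLs, and the intermediate segment storage is collapsed into a simple URL-to-text dictionary.</p>
      <preformat><![CDATA[
```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PageParser(HTMLParser):
    """Collect hyperlinks and visible text from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


def crawl(seeds, fetch, max_pages=100):
    """Breadth-first crawl from the seed URLs, staying on the seed hosts.

    `fetch` is a hypothetical callable (url -> HTML string or None);
    in a real deployment it would wrap an HTTP client.
    """
    allowed_hosts = {urlparse(seed).netloc for seed in seeds}
    queue = deque(seeds)
    seen = set(seeds)
    index = {}  # url -> extracted text (stands in for segments + index)
    while queue and len(index) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        parser.close()  # flush any buffered trailing text
        index[url] = " ".join(parser.text)
        for href in parser.links:
            target = urljoin(url, href)
            if urlparse(target).netloc in allowed_hosts and target not in seen:
                seen.add(target)
                queue.append(target)
    return index
```
]]></preformat>
      <p>Passing a dictionary lookup such as <monospace>pages.get</monospace> as <monospace>fetch</monospace> is enough to exercise the traversal on a handful of in-memory pages before pointing it at a live portal.</p>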
      <p>In Phase 2, the index is enriched and updated using a tagger. Individual data
files are downloaded and parsed, and relevant tags are added based on a custom
scoring algorithm that ranks words matching the controlled vocabulary.</p>
      <p>In Phase 3, a Semantic Recommendation System is used to suggest relevant
datasets to end users, which can then be further curated in preparation for
integration with other datasets. Further enrichment of metadata using mapping
to external ontologies can be incorporated also.</p>
      <p>12 http://nutch.apache.org/
13 http://lucene.apache.org/solr/</p>
      <p>
        In Phase 4, access to data as services is provided using SADI Semantic Web
services [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Services are generated by Valet SADI [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] over fully enriched
semantic metadata descriptions mapped to data schemes. Services are deployed in a
service registry and can be discovered, invoked, orchestrated into workflows and
executed automatically using a SADI-specific semantic query client.
      </p>
      <p>
        The development of the framework is ongoing and the implementation is at a
preliminary stage, although a lightweight crawler, tagger, and recommendation
engine have been developed and are undergoing testing. Here we provide an
outline of these components with particular emphasis on the performance of the
tagger, which plays an essential role for the subsequent phases to be successful.
      </p>
      <sec id="sec-3-1">
        <title>Crawler</title>
        <p>The crawler in Phase 1 recursively scans through the ODP pages and sub-pages
describing each dataset and their URLs. The crawler saves this information
locally in segments which are parsed and structured into fields by an indexer.
An index of the datasets containing descriptions and metadata is created. The
file formats currently supported by the crawler are Zip (.zip), Microsoft Excel
(.xls, .xlsx), Portable Document Format (.pdf), Comma-separated values (.csv)
and Text (.txt). Similar functionality is provided by the recently introduced
Dataset Search<sup>14</sup> by Google™.</p>
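        <p>A screening step for the supported formats could look like the sketch below. The extension set is taken from the list above; deciding by URL file extension, and the names <monospace>is_supported</monospace> and <monospace>screen</monospace>, are assumptions made for illustration, since the text does not say how the prototype detects file types.</p>
        <preformat><![CDATA[
```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions listed in the text as currently supported by the crawler.
SUPPORTED_EXTENSIONS = {".zip", ".xls", ".xlsx", ".pdf", ".csv", ".txt"}


def is_supported(url):
    """Return True when the URL path ends in a supported file extension."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return suffix in SUPPORTED_EXTENSIONS


def screen(urls):
    """Keep only the links that point at files in a supported format."""
    return [url for url in urls if is_supported(url)]
```
]]></preformat>
        <p>Query strings are ignored because only the path component of the URL is inspected, so a link such as <monospace>prices.CSV?download=1</monospace> is still accepted.</p>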
      </sec>
      <sec id="sec-3-2">
        <title>Tagger</title>
        <p>The tagger in Phase 2 is used to enrich the descriptions of the datasets by adding
metadata from expert-authored controlled vocabularies. The core features of
the tagger are the use of (i) an in-memory vocabulary graph generated from a
controlled vocabulary file and (ii) a custom scoring algorithm based on lexical
matching of terms in data files to the terms in the vocabularies.</p>
        <p>The current implementation of the tagger uses the vocabularies from GACS
to create a graph where the nodes in the graph are terms or concepts. Before a
node is created in the vocabulary graph, stemming is applied so that each term is
reduced to its root form. The concept hierarchies of the vocabulary contain both
broader concepts (as in superclass in ontologies), which identify parent nodes, and
narrower concepts (as in subclass in ontologies), which identify child nodes.</p>
        <p><bold>Scoring of Annotations</bold> The tagger reads each word from the input data file
and applies stemming. It then searches for both an exact match and a stem
match in the vocabulary graph. If a lexical match, with or without stemming,
to a concept is detected, a score is added to the term and to each of its broader
concept terms in the graph based on their depth in the hierarchy. The narrower
concepts (more specific and lower down the hierarchy) are assigned lower scores
to avoid the selection of concepts that do not provide significant information.
Once scoring is complete, the upper 3rd percentile of concepts are selected as
annotations for the document. This provides a barrier excluding tags that are
unrelated to the content of a document but are still contained in it, such as
terms from sources and references. The current deployment of the tagger excludes
matches to geographical locations because of their widespread use and marginal
relevance in the current study.</p>
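        <p>The scoring procedure described above can be sketched as follows. Several details are assumptions made for illustration, because the text does not specify them: the suffix-stripping <monospace>stem</monospace> function is a naive stand-in for a real stemmer, the weight 1 / (1 + depth) is one possible way to give narrower concepts lower scores, the <monospace>top_fraction</monospace> parameter is one reading of the "upper 3rd percentile" cut-off, and only single-word labels are matched. The class and function names are hypothetical.</p>
        <preformat><![CDATA[
```python
import math
import re
from collections import defaultdict


def stem(word):
    """Naive plural stripper; a stand-in for a real stemming algorithm."""
    word = word.lower()
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


class Vocabulary:
    """In-memory concept graph keyed by stemmed label (assumed acyclic)."""

    def __init__(self):
        self.concepts = set()
        self.broader = {}  # concept -> set of parent concepts

    def add(self, label, broader=()):
        node = stem(label)
        self.concepts.add(node)
        for parent in broader:
            parent_node = stem(parent)
            self.concepts.add(parent_node)
            self.broader.setdefault(node, set()).add(parent_node)

    def ancestors(self, node):
        """All broader concepts reachable from node."""
        found, stack = set(), list(self.broader.get(node, ()))
        while stack:
            parent = stack.pop()
            if parent not in found:
                found.add(parent)
                stack.extend(self.broader.get(parent, ()))
        return found

    def depth(self, node):
        """Longest broader-chain from node up to a top concept."""
        parents = self.broader.get(node)
        if not parents:
            return 0
        return 1 + max(self.depth(parent) for parent in parents)


def score_document(text, vocab):
    """Score each matched concept and all of its broader concepts."""
    scores = defaultdict(float)
    for word in re.findall(r"[A-Za-z]+", text):
        node = stem(word)
        if node not in vocab.concepts:
            continue
        for concept in {node} | vocab.ancestors(node):
            # Assumed weighting: deeper (narrower) concepts score lower.
            scores[concept] += 1.0 / (1 + vocab.depth(concept))
    return dict(scores)


def select_tags(scores, top_fraction=0.03):
    """Keep the highest-scoring fraction of concepts (at least one)."""
    if not scores:
        return []
    keep = max(1, math.ceil(top_fraction * len(scores)))
    ranked = sorted(scores, key=lambda concept: (-scores[concept], concept))
    return ranked[:keep]
```
]]></preformat>
        <p>With a toy vocabulary in which wheat and barley are both narrower than cereals, a file mentioning only wheat and barley still accumulates a score for cereals. Under this particular weighting the most general concepts always score highest, so a practical implementation would probably need to temper the contribution of top-level concepts.</p>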
      </sec>
      <sec id="sec-3-3">
        <title>Augmented Tagging with Broader Concepts</title>
        <p>To illustrate how the scoring provides additional tagging to the datasets,
a simple example is shown. The tagger was run on the dataset titled
Wheat/Barley and their Products<sup>15</sup> hosted at the Open Government Portal<sup>16</sup>
maintained by the Government of Canada. This file contains mentions of Wheat
and Barley but not Cereals.</p>
        <p>14 https://toolbox.google.com/datasetsearch</p>
        <p>Table 1 shows the tags annotated to the Open Data file with and without the
scoring technique. Without the scoring technique
(tagging of lexical and stem-based matches to GACS only), the tagger can identify
only terms directly mentioned in the files. With the scoring technique,
the term Cereals, the parent term for Wheat and Barley in GACS, is also retrieved.</p>
        <table-wrap>
          <label>Table 1</label>
          <table>
            <thead>
              <tr>
                <th>Tags with lexical/stem matching</th>
                <th>Tags with scoring</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>import, export, wheat, barley, permits</td>
                <td>cereals</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
        <p>The GACS hierarchy<sup>17</sup>, shown below, for the preferred term wheat illustrates
how the broader concept cereals is related to the narrower. Moreover, the scoring
can be extended to retrieve multiple parent terms in the hierarchy including cases
where multiple inheritance may occur.</p>
        <p>... &gt; crops &gt; field crops &gt; grain crops &gt; cereals &gt; wheat</p>
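        <p>Retrieving every broader-concept chain for a term, including the multiple inheritance case, can be sketched with a small recursive walk. The <monospace>BROADER</monospace> mapping below encodes only the chain quoted above (the real hierarchy continues above crops); the function name <monospace>broader_paths</monospace> is hypothetical, and any second parent passed to it is an invented input for illustration rather than a statement about GACS.</p>
        <preformat><![CDATA[
```python
# Broader links for the chain quoted in the text; keys are narrower terms.
BROADER = {
    "wheat": ["cereals"],
    "cereals": ["grain crops"],
    "grain crops": ["field crops"],
    "field crops": ["crops"],
}


def broader_paths(term, broader):
    """Return every chain from a top concept down to `term`.

    A term with several parents yields one chain per parent, which is
    how multiple inheritance shows up in the result.
    """
    parents = broader.get(term, [])
    if not parents:
        return [[term]]
    paths = []
    for parent in parents:
        for path in broader_paths(parent, broader):
            paths.append(path + [term])
    return paths


def format_path(path):
    """Render a chain in the 'a > b > c' notation used in the text."""
    return " > ".join(path)
```
]]></preformat>
        <p>Calling <monospace>broader_paths("wheat", BROADER)</monospace> yields the single chain from crops down to wheat; adding a second parent to any term in the mapping produces one additional chain per extra parent.</p>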
        <p>Metadata and tags provided when the file was submitted to an ODP can be
enriched in a systematic way by using the tagger, namely with lexically matched
terms found in GACS. The scoring algorithm additionally provides subject
specific tags that are broader in scope. In the subsequent phase of the framework (Phase 3),
only the enriched datasets, including the lexically matched tags and the broader
augmented tags, are used by the recommendation engine to filter and categorize
data according to users’ interests.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Semantic Recommendation Engine</title>
        <p>The recommendation engine in Phase 3 currently uses a content-based filtering
method, where extensive tagging of data files and custom scoring of matched
tags are employed to determine the level of similarity between files. The engine
uses the initial preferences of a user, which can be obtained from tagging an
online publication specified by the user.</p>
        <p>Upon request for recommendation, all datasets are scored according to their
relevance to the tags within the user’s profile. Scoring is done by multiplying
the normalized weight of each tag by the normalized weight of a matching tag
within the user’s profile. The cumulative scores of the documents are then
compared pairwise and the highest scoring documents are returned to the user as a
recommendation. Additionally, a history of the suggested files is stored within
the user’s profile to avoid repeat runs offering the same recommendations. The
engine was tested for both programmatic functionality and the quality of the
recommended datasets. Preliminary test results show that the greater the number
of annotations, the better the relevance of the recommended datasets.</p>
        <p>Extension of the recommendation engine will include the use of additional community
developed ontologies and inferencing based on subsumption and transitivity.</p>
        <p>15 https://open.canada.ca/data/en/dataset/3a4e7f9b-64d2-432f-8394-15f6814aad62
16 https://open.canada.ca/en
17 http://browser.agrisemantics.org/gacs/en/page/C212</p>
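        <p>The content-based scoring used by the recommendation engine can be sketched as below. This is an illustration under assumptions rather than the engine's code: weights are normalized so that each tag set sums to one (the text does not state the normalization used), datasets with no matching tag are dropped, and the names <monospace>normalize</monospace>, <monospace>recommend</monospace>, <monospace>history</monospace> and <monospace>top_n</monospace> are hypothetical.</p>
        <preformat><![CDATA[
```python
def normalize(weights):
    """Scale tag weights so they sum to 1 (assumed normalization)."""
    total = sum(weights.values())
    if total == 0:
        return {tag: 0.0 for tag in weights}
    return {tag: weight / total for tag, weight in weights.items()}


def recommend(profile_tags, datasets, history=(), top_n=3):
    """Rank datasets by the summed products of matching tag weights.

    profile_tags: tag -> weight for the user profile
    datasets:     dataset name -> (tag -> weight)
    history:      names already suggested, which are skipped
    """
    user = normalize(profile_tags)
    already_suggested = set(history)
    scored = []
    for name, tags in datasets.items():
        if name in already_suggested:
            continue
        doc = normalize(tags)
        # Cumulative score: one product per tag shared with the profile.
        score = sum(doc[tag] * user[tag] for tag in doc.keys() & user.keys())
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]
```
]]></preformat>
        <p>Passing the names returned by an earlier call as <monospace>history</monospace> reproduces the behaviour described above, in which repeat runs do not offer the same recommendations.</p>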
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Preliminary Results of the Tagger</title>
      <p>The tagger was run on a machine running Ubuntu 17.10 server with a 4-core 3
GHz processor and 8 GB memory. During the experiment, the tagger tried to
match data from 212 CSV datasets hosted on FAOStat<sup>18</sup> and Data.gov<sup>19</sup> to the
beta version of the GACS controlled vocabulary. The outcome of the initial
experiments showed that the scoring worked surprisingly well for most of the datasets.
As is to be expected, the tagger worked best when data files contained
meaningful agriculture related terms and performed worst when data files contained
mostly names, identifiers and numeric values.</p>
      <p>Table 2 shows an analysis of results derived after running the tagger on 5
random datasets. The Topics column indicates what type of information the
data files contain, the Tags column indicates if the value of a score crossed the
threshold to select any tags or not, and the Outcome indicates whether the
matching performance of the tagger is best case, moderate case, worst case or
resulted in a false positive. For some data files the selected tags were found to
be false-positive as well as false-negative. Due to space constraints a rigorous
analysis of the tagger is beyond the scope of this paper. However, in testing it
was found that Open Datasets are very broad in scope and their composition
is complex as they often are published as spreadsheets, invoices, and statistical
reports. Often, the rows and columns can only be explained by an expert in
the subject area or by the data provider. It is also difficult to interpret when
numerical values with units are present.</p>
      <p>Thus, although automatic tagging may work for some datasets, for many
other datasets it is prone to errors. Therefore, it is recommended that the tags
added automatically be verified manually by experts before approving the files
for use in the recommender system in the subsequent phase.</p>
    </sec>
    <sec id="sec-5">
      <title>Discussion</title>
      <p>18 http://www.fao.org/faostat/en/#data
19 https://www.data.gov/</p>
      <table-wrap>
        <label>Table 2</label>
        <table>
          <thead>
            <tr>
              <th>Title of the dataset</th>
              <th>Topics</th>
              <th>Tags</th>
              <th>Outcome</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Incidental catch at BC marine finfish aquaculture sites</td>
              <td>Time, location, facility, common and scientific name of the fish</td>
              <td>321 words were tried, 266 matches, and 9 top scoring tags</td>
              <td>Best case</td>
            </tr>
            <tr>
              <td>Adult Salmon Health (Snorkel Surveys) Cape Breton Highlands</td>
              <td>waterbody, species, age and quality</td>
              <td>58 words were tried, 51 matches, and 2 top scoring tags</td>
              <td>Moderate case</td>
            </tr>
            <tr>
              <td>USDA FSA Farm Payment Name/Address File for 2008</td>
              <td>Names and addresses</td>
              <td>None matched</td>
              <td>Worst case</td>
            </tr>
            <tr>
              <td>USDA FSA Farm Payment File for 2010</td>
              <td>Identifiers and numerical values</td>
              <td>None matched</td>
              <td>Worst case</td>
            </tr>
            <tr>
              <td>Pineapple - Average retail price per pound and per cup equivalent, 2013</td>
              <td>Packages and market price</td>
              <td>fertilizers</td>
              <td>False positive</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p>
        We have outlined a framework designed to address the challenges described
in Section 1.3. In addition, we have been able to corroborate the general
feasibility of our approach in so far as harvesting, tagging and recommending
files to users. At the current time the tools implemented in this framework
are yet to mature and more experiments are required to assess and improve
their performance. The idea of harvesting files in ODPs likely motivated the
development of Dataset Search by Google™ where users are provided with an
overview of the metadata assigned by the original publisher of the datasets. In
our pilot studies, we were able to further enrich the metadata for individual files
providing agriculture specific tags from GACS that extend beyond the metadata
provided by the dataset publisher. Compared to the techniques described in
related work [
        <xref ref-type="bibr" rid="ref1 ref10">10,1</xref>
        ], the tagging approach implemented in our framework finds tags
by traversing each word of the data file and by applying lexical and semantic
matching to an expert-curated, subject specific controlled vocabulary instead of
reusing the existing tag libraries shared between ODPs. These portals tend to
use tags that are generally broad in scope as opposed to subject specific. Our
methodology additionally has the benefit of being domain agnostic: vocabularies
other than GACS could be substituted, e.g. for Open Data files
about health topics.
      </p>
      <p>
        With end users in mind, the recommendation system we implemented was
designed to support users who are looking for recently published candidate data
files and wish to consider them for reuse. In addition, it can support users wishing to
participate in crowdsourcing and provisioning of data as services. Indeed, the
greater goal for the framework includes the provision of Open Data as services
over which ad hoc queries can be run. This is possible if the data can be
sufficiently well structured and annotated with metadata to support meaningful
queries across datasets. Given that our system is still in development and since
we have not processed large volumes of Open Data files, we have yet to
determine the extent to which Open Data files can be readily made available as
services. We have proposed to leverage SADI Semantic Web services given that
registries of SADI services, along with associated query tools, can support the
target functionality where complex workflows of combined data retrieval and
data analytics services can be run. Moreover, we can point to recent work where
researchers [
        <xref ref-type="bibr" rid="ref11 ref12">11,12</xref>
        ] report the use of the SADI Semantic Web services in
agriculture for surveillance tasks in precision irrigation and precision dairy farming use
cases. More recently we conducted pilot studies in the creation of services for a
decision support system in agricultural operations management. SADI services
were created to fetch target trait data for eggplant varieties and compute costs,
revenue and profits for individual eggplant varieties. User provided values for
market prices and estimated crop yields were required as inputs [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Whereas
these services were built manually, more recent reports show the utility of Valet
SADI for the automated generation of services in the domain of malaria
analytics [
        <xref ref-type="bibr" rid="ref14 ref15">14,15</xref>
        ], where a registry of services specific to malaria insecticide resistance
surveillance queries was built.
      </p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>We have presented a prototype to annotate Open Data files with subject specific
tags on agriculture. The target objective is to make Open Data in ODPs more
discoverable and intelligible for potential data reuse purposes. We have proposed
to do this using a multi-phase approach involving crawling and indexing of Open
Datasets, a custom tagging approach leveraging lexical term matching and a
scoring algorithm. Files enriched with tags in this way are then made available
to a recommendation engine to support alerting of end users. Subsequent to this
we proposed the provisioning of data as services with semantic descriptions to
support ad hoc federation of data in response to complex user queries.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Alan</given-names>
            <surname>Tygel</surname>
          </string-name>
          , Sören Auer, Jeremy Debattista, Fabrizio Orlandi, and Maria Luiza Machado Campos.
          <article-title>Towards cleaning-up open data portals: A metadata reconciliation approach</article-title>
          .
          <source>In ICSC</source>
          , pages
          <fpage>71</fpage>
          -
          <lpage>78</lpage>
          . IEEE Computer Society,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Wei</surname>
            <given-names>Wei</given-names>
          </string-name>
          , Zhanglong Ji, Yupeng He, Kai Zhang, Yuanchi Ha,
          <string-name>
            <given-names>Qi</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <surname>Lucila</surname>
          </string-name>
          Ohno-Machado.
          <article-title>Finding relevant biomedical datasets: the UC San Diego solution for the bioCADDIE Retrieval Challenge</article-title>
          . Database,
          <year>2018</year>
          (1):bay017,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>David</given-names>
            <surname>Corsar</surname>
          </string-name>
          and
          <string-name>
            <given-names>Peter</given-names>
            <surname>Edwards</surname>
          </string-name>
          .
          <article-title>Challenges of open data quality: More than just license, format, and customer support</article-title>
          .
          <source>J. Data and Information Quality</source>
          ,
          <volume>9</volume>
          (
          <issue>1</issue>
          ):
          <fpage>3:1</fpage>
          -
          <lpage>3:4</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Christian</given-names>
            <surname>Bizer</surname>
          </string-name>
          , Tom Heath,
          <string-name>
            <given-names>Kingsley</given-names>
            <surname>Idehen</surname>
          </string-name>
          , and
          <string-name>
            <surname>Tim</surname>
          </string-name>
          Berners-Lee.
          <article-title>Linked data on the web (ldow2008)</article-title>
          .
          <source>In Proceedings of the 17th international conference on World Wide Web, WWW '08</source>
          , pages
          <fpage>1265</fpage>
          -
          <lpage>1266</lpage>
          , New York, NY, USA,
          <year>2008</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Stella</given-names>
            <surname>Zevio</surname>
          </string-name>
          , Nordine El Hassouni, Manuel Ruiz, and
          <string-name>
            <given-names>Pierre</given-names>
            <surname>Larmande</surname>
          </string-name>
          .
          <article-title>Agrold indexing tools with ontological annotations</article-title>
          .
          <source>In Proceedings of the 9th International Conference Semantic Web Applications and Tools for Life Sciences, Amsterdam, The Netherlands, December 5-8</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Mark D Wilkinson</surname>
          </string-name>
          , Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg,
          <string-name>
            <surname>Jan-Willem</surname>
            <given-names>Boiten</given-names>
          </string-name>
          ,
          Luiz Bonino da Silva Santos
          ,
          <string-name>
            <surname>Philip E Bourne</surname>
          </string-name>
          , et al.
          <article-title>The FAIR Guiding Principles for scientific data management and stewardship</article-title>
          .
          <source>Scientific data, 3</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Baker</surname>
          </string-name>
          , Caterina Caracciolo, Anton Doroszenko, and
          <string-name>
            <given-names>Osma</given-names>
            <surname>Suominen</surname>
          </string-name>
          .
          <article-title>GACS core: Creation of a global agricultural concept scheme</article-title>
          .
          <source>In Metadata and Semantics Research - 10th International Conference, MTSR</source>
          <year>2016</year>
          , Göttingen, Germany, November 22-25, 2016
          , Proceedings, pages
          <fpage>311</fpage>
          -
          <lpage>316</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Mark</given-names>
            <surname>Wilkinson</surname>
          </string-name>
          , Benjamin Vandervalk, and
          <string-name>
            <surname>Luke McCarthy</surname>
          </string-name>
          .
          <article-title>The Semantic Automated Discovery and Integration (SADI) Web service Design-Pattern, API and Reference Implementation</article-title>
          .
          <source>Journal of Biomedical Semantics</source>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          ):
          <fpage>8</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. Mohammad Sadnan Al Manir,
          <string-name>
            <given-names>Alexandre</given-names>
            <surname>Riazanov</surname>
          </string-name>
          , Harold Boley, Artjom Klein, and
          <string-name>
            <given-names>Christopher J. O.</given-names>
            <surname>Baker</surname>
          </string-name>
          .
          <article-title>Valet SADI: provisioning SADI web services for semantic querying of relational databases</article-title>
          .
          <source>In IDEAS</source>
          , pages
          <fpage>248</fpage>
          -
          <lpage>255</lpage>
          . ACM,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Alexandre</given-names>
            <surname>Passant</surname>
          </string-name>
          .
          <article-title>LODr - A Linking Open Data Tagging System</article-title>
          .
          <source>In Proceedings of the First Social Data on the Web Workshop (SDoW2008)</source>
          , Karlsruhe, Germany, October 27
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11. Wilfried Wöber, Klemens Gregor Schulmeister, Christian Aschauer, et al.
          <article-title>agriOpenLink: Adaptive Agricultural Processes via Open Interfaces and Linked Services</article-title>
          . In M. Clasen,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hamer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lehnert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Petersen</surname>
          </string-name>
          , and B. Theuvsen, editors,
          <source>GIL Jahrestagung</source>
          , volume
          <volume>226</volume>
          of
          <source>LNI</source>
          , pages
          <fpage>157</fpage>
          -
          <lpage>160</lpage>
          . GI,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>Slobodanka Dana Kathrin</given-names>
            <surname>Tomic</surname>
          </string-name>
          , Wilfried Wöber, and Sandra Hörmann et al.
          <article-title>Enabling Semantic Web for Precision Agriculture: a showcase of agriOpenLink Project</article-title>
          . In A. Filipowska,
          <string-name>
            <given-names>R.</given-names>
            <surname>Verborgh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          , editors,
          <source>SEMANTiCS (Posters Demos)</source>
          , volume
          <volume>1481</volume>
          of
          <source>CEUR Workshop Proceedings</source>
          , pages
          <fpage>26</fpage>
          -
          <lpage>29</lpage>
          . CEUR-WS.org,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13. Mohammad Sadnan Al Manir,
          <string-name>
            <given-names>Bruce</given-names>
            <surname>Spencer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Christopher J. O.</given-names>
            <surname>Baker</surname>
          </string-name>
          .
          <article-title>Decision Support for Agricultural Consultants With Semantic Data Federation</article-title>
          .
          <source>IJAEIS</source>
          ,
          <volume>9</volume>
          (
          <issue>3</issue>
          ):
          <fpage>87</fpage>
          -
          <lpage>99</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14. Jon Haël Brenas, Mohammad Sadnan Al Manir,
          <string-name>
            <given-names>Christopher J. O.</given-names>
            <surname>Baker</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Arash</given-names>
            <surname>Shaban-Nejad</surname>
          </string-name>
          .
          <article-title>A malaria analytics framework to support evolution and interoperability of global health surveillance systems</article-title>
          .
          <source>IEEE Access</source>
          ,
          <volume>5</volume>
          :
          <fpage>21605</fpage>
          -
          <lpage>21619</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15. Jon Haël Brenas, Mohammad Sadnan Al Manir,
          <string-name>
            <given-names>Kate</given-names>
            <surname>Zinszer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Christopher J. O.</given-names>
            <surname>Baker</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Arash</given-names>
            <surname>Shaban-Nejad</surname>
          </string-name>
          .
          <article-title>Exploring semantic data federation to enable malaria surveillance queries</article-title>
          .
          <source>In Building Continents of Knowledge in Oceans of Data: The Future of Co-Created eHealth - Proceedings of MIE 2018, Medical Informatics Europe</source>
          , Gothenburg, Sweden, April 24-26, 2018, pages
          <fpage>6</fpage>
          -
          <lpage>10</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>