=Paper= {{Paper |id=Vol-1747/D202_ICBO2016 |storemode=property |title=Reusing the NCBO BioPortal Technology for Agronomy to Build AgroPortal |pdfUrl=https://ceur-ws.org/Vol-1747/D202_ICBO2016.pdf |volume=Vol-1747 |authors=Clément Jonquet,Anne Toulet,Elizabeth Arnaud,Sophie Aubin,Esther Dzalé Yeumo,Vincent Emonet,John Graybeal,Mark A. Musen,Cyril Pommier,Pierre Larmande |dblpUrl=https://dblp.org/rec/conf/icbo/JonquetTAAKEGMP16 }} ==Reusing the NCBO BioPortal Technology for Agronomy to Build AgroPortal == https://ceur-ws.org/Vol-1747/D202_ICBO2016.pdf
Reusing the NCBO BioPortal technology for agronomy to build AgroPortal

Clément Jonquet (1,2,6), Anne Toulet (1,2), Elizabeth Arnaud (3), Sophie Aubin (4), Esther Dzalé Yeumo (4), Vincent Emonet (1), John Graybeal (6), Mark A. Musen (6), Cyril Pommier (4), Pierre Larmande (2,5)

(1) Laboratory of Informatics, Robotics and Microelectronics of Montpellier (LIRMM), University of Montpellier & CNRS, France
(2) Computational Biology Institute (IBC) of Montpellier, France
(3) Bioversity International, Montpellier, France
(4) INRA Versailles, France
(5) UMR DIADE, IRD Montpellier, France
(6) Center for BioMedical Informatics Research (BMIR), Stanford, USA
jonquet@lirmm.fr

Abstract: Many vocabularies and ontologies are produced to represent and annotate agronomic data. By reusing the NCBO BioPortal technology, we have already designed and implemented an advanced prototype ontology repository for the agronomy domain. We plan to turn that prototype into a real service to the community. The AgroPortal project aims at reusing the scientific outcomes and experience of the biomedical domain in the context of plant, agronomic, food, environment (perhaps animal) sciences. We offer an ontology portal which features ontology hosting, search, versioning, visualization, comment and recommendation, enables semantic annotation, and stores and exploits ontology alignments, all within a fully semantic web compliant infrastructure. AgroPortal pays specific attention to the requirements of the agronomic community in terms of ontology formats (e.g., SKOS, trait dictionaries) and supported features. In this paper, we present our prototype as well as preliminary outputs of four driving agronomic use cases. With the experience acquired in the biomedical domain and by building atop an already existing technology, we think that AgroPortal offers a robust and stable reference repository that will become highly valuable for the agronomic domain.

Keywords: ontology repository, ontology mapping, semantic annotation, agronomic sciences.

I. INTRODUCTION

As in biomedicine, communities engaged in agronomic research need to access specific sets of ontologies for data annotation and integration. For instance, it has been established that the scientific challenges in plant breeding have switched from genetics to phenotyping and that standard trait/phenotype vocabularies are necessary to facilitate the integration and comparison of breeders’ data. In parallel with very specific crop dictionaries [1], important organizations have produced large reference vocabularies such as AGROVOC (Food and Agriculture Organization), the NAL Thesaurus (National Agricultural Library) or the CAB Thesaurus (Centre for Agricultural Bioscience International) and are currently working on integrating them [2]. The more ontologies are produced in the domain, the greater the need to create, store and retrieve alignments between those ontologies. In fact, there is a need for a one-stop shop for agronomic, environmental and plant sciences ontologies that makes it possible to identify and select an ontology for a specific task and that offers generic services to exploit ontologies in search, annotation or other scientific data management processes. Therefore, our goal is to enable straightforward use of agronomy-related ontologies, sparing data managers and researchers the burden of dealing with complex knowledge engineering issues.

In the biomedical domain, the NCBO BioPortal (http://bioportal.bioontology.org) [3] is a well-known open repository for biomedical ontologies that were originally spread out over the web and in different formats. The NCBO BioPortal functionalities have been progressively extended over the last 10 years, and the platform is fully semantic web compliant (ontologies, mappings and annotations are stored in an RDF triple store). However, BioPortal is specific to health and biomedical ontologies and, even if an overlap exists, the portal does not extend to the agronomic, environment or animal domains. An important aspect is that NCBO technology is domain-independent and open source. A BioPortal virtual appliance¹ is available as a server machine embedding the complete code and deployment environment, allowing anyone to set up a local ontology repository and customize it.

¹ www.bioontology.org/wiki/index.php/Category:NCBO_Virtual_Appliance

In this paper we present an advanced prototype ontology repository that addresses these challenges in agronomy and plant sciences. The portal is built atop the NCBO BioPortal technology. The main objective of the AgroPortal project is to develop and support a reference ontology repository for the agronomic domain.

II. RELATED WORK

In the biomedical or agronomic domains there exist several listings of “knowledge organization systems” such as BioSharing (biosharing.org) or the VEST Registry (aims.fao.org/vest-registry). They usually register ontologies and provide some metadata about them. However, because they are registries for different kinds of resources, they do not
support the level of features that an ontology repository offers. More specifically to the plant domain, the Crop Ontology web application (www.cropontology.org) [4] publishes online sets of ontologies and dictionaries required for describing crop germplasm, traits and evaluation trials. It contains 18 species-specific ontologies in addition to ontologies related to the crop germplasm domain. The current web application facilitates the complete ontology-engineering life cycle, from collaborative construction to publishing, use and modification. However, it requires significant improvements to versioning, curation, multilingual aspects and the user interface, as well as to data annotation and mapping features. The Planteome portal (www.planteome.org) [5] reuses the Gene Ontology project's AmiGO technology to build a database of searchable and browsable annotations for plant traits, phenotypes, diseases, genomes and gene expression data across a wide range of plant species. Although the portal hosts the reference ontologies in plant biology (e.g., PO, TO, EO), its focus is on data (not ontologies) and its scope is not as large as the one we envision for AgroPortal.

III. A PORTAL FOR AGRONOMIC RELATED ONTOLOGIES

We identified the NCBO technology as the one that implements most of the features (ontology and mapping repository, annotator, recommender, community support, etc.) that the community would be interested in, while being aware of the technical challenges of developing such varied and complex software. In addition, our vision is to adopt, as the NCBO did, an open and generic approach in which users can themselves easily participate in the platform and upload and comment on content (ontologies, mappings, projects). Moreover, there are two major motivations for AgroPortal to reuse the outcomes of biomedicine: (i) to avoid re-developing technologies that have already been designed and extensively used; (ii) to offer the same tools, services and formats to both communities in order to facilitate the interface and interaction between the domains, e.g., to enable a user to query BioPortal or AgroPortal without changing a line of code.

We have developed and deployed an advanced prototype platform (v1.0 beta released in January 2016), available at http://agroportal.lirmm.fr, that currently hosts 49 ontologies, including 28 not originally present in BioPortal², and we are working on 37 candidate ontologies. The platform already counts 38 registered users. The features offered by the portal include, for example: (i) searching across all the ontologies, (ii) annotating a piece of text with all the ontologies, (iii) storing and serving mappings between ontologies within the portal and with the NCBO BioPortal. All other BioPortal features are generically available in AgroPortal: ontology versioning, UI widgets, ontology metrics, the ontology recommender service, project listing, community feedback (comments, subscription to ontology changes) and user management (with public or private access to ontologies). In addition, two endpoints allow automatic querying of the content of the portal: (i) a REST web service API (http://data.agroportal.lirmm.fr) and (ii) a SPARQL endpoint (http://sparql.agroportal.lirmm.fr). While ensuring the day-to-day maintenance and monitoring of the portal and keeping it up to date with the NCBO technology, we have started to work on customizations and specific services for the agronomic/plant community, for instance: organizing the content of the portal, working on multilingual support, interconnecting BioPortal and AgroPortal, scoring annotations, supporting different formats and adding new metadata.

² As of now, for technical reasons, we had to duplicate a few ontologies, but the long-term vision is an interconnected network of bioportals that will enable anyone to easily access an ontology independently of where it is actually hosted.

Fig. 1. Screenshots from the AgroPortal user interface

IV. DRIVING AGRONOMIC USE CASES

A. Agronomic Linked Data (AgroLD) within IBC

The Computational Biology Institute of Montpellier (IBC, http://www.ibc-montpellier.fr) develops methods for data integration and knowledge management within agronomic sciences to improve information accessibility and interoperability. The project is interested in identifying genes controlling root and panicle branching as well as gene orthology relationships for rice gene families. Using 8 ontologies for annotation, the project has built the AgroLD RDF knowledge base (http://agrold.org), which integrates data from a variety of plant resources (e.g., Gramene, SouthGreen, UniProtKB, OryGeneDB) and provides a portal for bioinformaticians to exploit the homogenized data models to efficiently build research hypotheses [6].

B. RDA Wheat Data Interoperability (WDI) working group

The WDI working group is part of the Research Data Alliance (RDA, https://rd-alliance.org). Its goal is to provide a common framework for describing, representing, linking and publishing wheat data with respect to open standards. One of the needs identified by the group is to offer a repository of
linked vocabularies and ontologies that are relevant for wheat. NCBO technology has been identified as a suitable tool to address this need, allowing one to search for terms across multiple vocabularies and ontologies, browse mappings between terms, receive recommendations on which vocabularies and ontologies are most relevant for a corpus and annotate text with terms. The WDI maintains a list of vocabularies and ontologies within an AgroPortal-specific slice (http://wheat.agroportal.lirmm.fr), which is referenced in the WDI’s set of guidelines for wheat data description (http://datastandards.wheatis.org). More recently, two other RDA working groups (Rice Data Interoperability and AgriSemantics) have expressed interest in using AgroPortal as a backbone for data integration and/or standardization.

C. INRA Linked Open Vocabularies (LovInra)

LovInra is an effort to publish vocabularies produced or co-produced by INRA scientists and to foster their reuse beyond the original researchers. Many such resources, developed within specific focus projects, remain unknown to the research community despite their value. To achieve this goal, there is a clear need to publish the vocabularies with respect to open standards and link them to existing resources. Here again, NCBO technology has been identified as a suitable repository for this third use case. A specific group of ontologies has been set up in AgroPortal for ontologies produced or used by INRA, and we are helping ontology editors to follow semantic web standards when making their ontologies sharable and available.

D. The Crop Ontology project

The Crop Ontology project (www.cropontology.org) of the Consultative Group on International Agricultural Research (CGIAR) is AgroPortal’s fourth use case. The main goals of this project are to publish online fully documented lists of breeding traits used for producing standard field books, and to support data analysis and integration of genetic and phenotypic data through harmonized annotation of breeders’ data. The project also offers a forum for scientists to discuss their variables, methods and scales of measurement, and field books. We are working on leveraging the AgroPortal web service API in the backend of the cropontology.org web application, while keeping the current web application as the primary point of access. We currently offer new functionalities to the Crop Ontology community, such as versioning, a SPARQL endpoint, notes and the annotation tool, without breaking existing uses of the current application. In addition, we are working on supporting the alignment (or mapping) of terms within and across different plant-related ontologies: both within the crop ontologies themselves (in different crop branches) and with other reference ontologies commonly used in plant biology.

V. CONCLUSION

In this paper we have briefly introduced AgroPortal, an open ontology repository for the agronomy domain. We have discussed four use cases that are already using the portal to support their work on data interoperability. The thematic boundaries of the portal are not precisely defined yet (e.g., agriculture also includes animals), and it will be up to the users to express what they expect to find in such a repository. Although multiple questions need to be addressed, we do believe that the NCBO technology is a good candidate for this project, and we see here an opportunity to capitalize on the technology and scientific outcomes of the last ten years.

Considering the position of the current NCBO BioPortal and the importance of having an equivalent repository of ontologies for the agronomic, environment and plant sciences, we expect a broad adoption of AgroPortal in the community. The involvement of associated partners (IBC, IRD, CIRAD, INRA, Bioversity International) illustrates the impact and interest first in France, but also internationally (e.g., Planteome, Elixir, BioSharing, EBI, FAO). Making such a portal available allows researchers to focus on the development of new ontologies and mappings between ontologies with the prospect of leveraging them in their research, without fear of producing yet another piece of the big data cake. Exporting NCBO research results and technology contributes to the long-term support of that technology while reinforcing the connections with the biomedical domain.

In the future we will identify more potential users for the portal and support new research scenarios. For each ontology available in the portal, we will produce an extensive description of its metadata so that the portal facilitates the comprehension of the landscape of agronomic ontologies.

ACKNOWLEDGMENT

This work is partly achieved within the Semantic Indexing of French biomedical Resources (SIFR, www.lirmm.fr/sifr) project funded by the French National Research Agency, grant ANR-12-JS02-01001, the NUMEV Labex (www.lirmm.fr/numev), grant ANR-10-LABX-20, the Computational Biology Institute of Montpellier (www.ibc-montpellier.fr), grant ANR-11-BINF-0002, as well as by the University of Montpellier and the CNRS. We also thank the National Center for Biomedical Ontology for help and time spent with us in deploying the AgroPortal.

REFERENCES

[1] R. Shrestha, E. Arnaud, R. Mauleon, M. Senger, G. F. Davenport, D. Hancock, N. Morrison, R. Bruskiewich, and G. McLaren, “Multifunctional crop trait ontology for breeders’ data: field book, annotation, data discovery and semantic enrichment of the literature,” AoB Plants, vol. 2010, May 2010.
[2] T. Baker and O. Suominen, “Global Agricultural Concept Scheme (GACS): A multilingual thesaurus hub for Linked Data.” Report, 2014.
[3] N. F. Noy, N. H. Shah, P. L. Whetzel, B. Dai, M. Dorf, N. B. Griffith, C. Jonquet, D. L. Rubin, M.-A. Storey, C. G. Chute, and M. A. Musen, “BioPortal: ontologies and integrated data resources at the click of a mouse,” Nucleic Acids Research, vol. 37, pp. 170–173, May 2009.
[4] L. Matteis, P. Chibon, H. Espinosa, M. Skofic, R. Finkers, R. Bruskiewich, and E. Arnaud, “Crop ontology: vocabulary for crop-related concepts,” in 1st International Workshop on Semantics for Biodiversity (P. Larmande, E. Arnaud, I. Mougenot, C. Jonquet, T. Libourel, and M. Ruiz, eds.), CEUR Workshop Proceedings, (Montpellier, France), pp. 37–46, May 2013.
[5] P. Jaiswal, L. Cooper, J. L. Elser, A. Meier, M.-A. Laporte, C. Mungall, B. Smith, E. K. Johnson, M. Seymour, J. Preece, X. Xu, R. S. Kitchen, B. Qu, E. Zhang, E. Arnaud, S. Carbon, S. Todorovic, and D. W. Stevenson, “Planteome: A resource for Common Reference Ontologies and Applications for Plant Biology,” in 24th Plant and Animal Genome Conference, PAG’16, (San Diego, USA), January 2016.
[6] A. Venkatesan, P. Larmande, C. Jonquet, M. Ruiz, and P. Valduriez, “Facilitating efficient knowledge management and discovery in the Agronomic Sciences,” in 4th Plenary Meeting of the Research Data Alliance, (Amsterdam, The Netherlands), September 2014.
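
As an illustration of the two programmatic endpoints mentioned in Section III, and of the claim that a client can target BioPortal or AgroPortal without changing a line of code, the following Python sketch only builds request URLs (it performs no network call). The AgroPortal base URLs are the ones quoted in the paper. Everything else is an assumption made for illustration: the `/search` and `/annotator` paths and the `q`, `text` and `apikey` parameter names are modelled on the public BioPortal REST API, the exact SPARQL endpoint path may differ, and the API key is a hypothetical placeholder.

```python
from urllib.parse import urlencode

# Base URLs: the AgroPortal ones are quoted in the paper; the BioPortal REST
# base is the public NCBO one. Paths and parameter names below are assumptions
# modelled on the BioPortal REST API, not something the paper specifies.
BIOPORTAL_REST = "http://data.bioontology.org"
AGROPORTAL_REST = "http://data.agroportal.lirmm.fr"
AGROPORTAL_SPARQL = "http://sparql.agroportal.lirmm.fr"


def search_url(base, term, apikey):
    """URL for a term search across the ontologies hosted by a portal."""
    return f"{base}/search?" + urlencode({"q": term, "apikey": apikey})


def annotator_url(base, text, apikey):
    """URL asking a portal's annotator to tag free text with ontology terms."""
    return f"{base}/annotator?" + urlencode({"text": text, "apikey": apikey})


def sparql_url(endpoint, query):
    """URL for a GET request using the SPARQL protocol's `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    key = "YOUR-API-KEY"  # hypothetical placeholder, not a real key
    # Same client code for both portals: only the base URL differs.
    for base in (BIOPORTAL_REST, AGROPORTAL_REST):
        print(search_url(base, "plant height", key))
    print(annotator_url(AGROPORTAL_REST, "Wheat grain yield under drought", key))
    print(sparql_url(AGROPORTAL_SPARQL, "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5"))
```

Keeping the portal base URL as a parameter is the design choice that matters here: if both portals expose the same service paths, as the shared NCBO technology suggests, switching from one to the other only means passing a different base URL.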