=Paper=
{{Paper
|id=Vol-1546/poster_55
|storemode=property
|title=Exposing French Agronomic Resources as Linked Open Data
|pdfUrl=https://ceur-ws.org/Vol-1546/poster_55.pdf
|volume=Vol-1546
|authors=Aravind Venkatesan,Nordine El Hassouni,Florian Philippe,Cyril Pommier,Hadi Quesneville,Manuel Ruiz,Pierre Larmande
|dblpUrl=https://dblp.org/rec/conf/swat4ls/VenkatesanHPPQR15
}}
==Exposing French Agronomic Resources as Linked Open Data==
Exposing French agronomic resources as Linked Open Data Aravind Venkatesan 1 , Nordine El Hassouni2 , Florian Phillipe3 , Cyril Pommier3 , Hadi Quesneville3 , Manuel Ruiz12 , Pierre Larmande145 1 Institut de biologie Computationelle, M ontpellier, France Aravind.Venkatesan@lirmm.fr 2 UM R AGAP, CIRAD, M ontpellier, France {nordine.el_hassouni,manuel.ruiz}@cirad.fr 3 URGI, INRA, Versailles, France {fphilippe,Cyril.pommier,hadi.quesneville}@versailles.inra.fr 4 UM R DIADE, IRD, M ontpellier, France 5 Equipe Zenith, INRIA et LIRM M , M ontpellier, France pierre.larmande@ird.fr Abstract. The advancements in empirical technologies has generated vast amounts of heterogeneous data. This situation has created a need to integrate the data to understand the system of interest in its entirety. Therefore, infor- mation systems play a crucial role in managing these data, enabling the biolo- gists in the extraction of new knowledge. The plant bioinformatics node of the Institut Français de Bioinformatique (IFB) maintains public information sys- tems that houses domain specific data. Currently, efforts are being taken to ex- pose the IFB plant bioinformatics resources as Linked Open Data, utilising do- main specific ontologies and metadata. Here, we present the overview and the initial results of the project. Keywords: Data integration, Data interoperability, Knowledge management, Linked Data, RDF, Bioinformatics application, Agronomic research 1 Introduction Agronomy is an overarching field that encompasses various research areas such as genetics, plant molecular b iology, and agro-ecology. The last several decades has seen the successful imp lementation of high-throughput technologies that have revolu- tionised research in agronomy. These technological advancements have resulted in a number of init iatives been taken to systematically store and share information over the web. A mong others, the Institut Français de Bioinformatique (IFB) is a French- ELIXIR node (http://www.elixir-europe.org/about/elixir-france) that is focused on providing integrated services for the life science co mmunity. The plant research wing of the IFB p latform prov ides access to a number of databases and tools distributed across six regional bioin formatics portals. Data in these portals ranges fro m ‘o mics’ to genetics (genetic markers, maps and phenotypes) for various crop species. The objec- tive of the current effort is to develop RDF knowledge bases that integrates existing domain specific ontologies and data fro m the respective regional portals, pro moting data interoperability between the resources. To this end, we have developed the Ag- ronomic Linked Data knowledge base (www.agrold.org) that is representative of the data housed in the southern region portal o f France, the South Green Bioinfo rmatics platform (SG) (http://www.southgreen.fr/). 2 AgroLD for SG resources Currently, SG consists of 12 databases covering various plant species such as Banana, Cocoa, Maize and Rice. AgroLD is being developed in phases to expose all of these databases as Linked Data. Currently, Phase I of AgroLD includes data from: 1. Trop GeneDB (Hamelin et al. 2013), a database that hosts genetic, mo lecular and phenotypic information on tropical crop species. 2. Ory GenesDB (Droc et al. 2006), a database that serves as a repository on function- al genomics for rice. 3. Ory za Tag Line (Larmande et al. 2008), a database that contains sequence infor- mat ion (Flan king Sequence Tags) that are based on molecu lar categorisation of mutagen insertion sites for rice. 4. GreenPhylDB (Conte et al. 2008), provides sequence homology information for the members of kingdom plantae. Additionally, do main specific ontologies, ontology annotations, proteomics and genomics informat ion fro m a variety of publically availab le data sources have been integrated, this includes Gene Ontology, Plant Ontology, UniprotKB, Gramene (Gene, ontology annotation, gene, Quantitative Trait Loci (QTL) and Metabolic Pathway informat ion) (Monaco et al. 2014). The objective of this is to provide the critical mass required to imp lement real world use cases. Currently, Agro LD includes data pertaining to selected species namely, Oryza species (O.sativa, O.barthii, O.brachyantha, O. glaberimma and O.meridionalis), Arabidopsis thaliana, Sorghum bicolor, Zea mays and Triticum species (T.aestivum and T. uraruta). In the subsequent phases informat ion pertaining to other species and SG databases will be considered. The AgroLD effort will be further extended set-up RDF knowledge bases to host data from other regional portals. References Barrell, D. et al., 2009. The GOA database in 2009 - An integrated Gene Ontology Annotation resource. Nucleic Acids Research, 37(SUPPL. 1). Conte, M.G. et al., 2008. GreenPhylDB: A database for plant comparative genomics. Nucleic Acids Research, 36(SUPPL. 1). Droc, G. et al., 2006. Ory GenesDB: a database for rice reverse genetics. Nucleic acids research, 34(Database issue), pp.D736–D740. Hamelin, C. et al., 2013. Trop GeneDB, the mu lti-tropical crop informat ion system updated and extended. Nucleic Acids Research, 41(D1). Larmande, P. et al., 2008. Ory za Tag Line, a phenotypic mutant database for the Génoplante rice insertion line library. Nucleic Acids Research, 36(SUPPL. 1). Monaco, M.K. et al., 2014. Gramene 2013: Co mparative plant genomics resources. Nucleic Acids Research, 42(D1).