=Paper=
{{Paper
|id=Vol-2137/paper_32.pdf
|storemode=property
|title=FoodOn: A Semantic Ontology Approach for Mapping Foodborne Disease Metadata
|pdfUrl=https://ceur-ws.org/Vol-2137/paper_32.pdf
|volume=Vol-2137
|authors=Dalia A. Alghamdi,Damion M. Dooley,Gurinder Gosal,Emma J. Griffiths,Fiona S.L. Brinkman,William W.L. Hsiao
|dblpUrl=https://dblp.org/rec/conf/icbo/AlghamdiDGGBH17
}}
==FoodOn: A Semantic Ontology Approach for Mapping Foodborne Disease Metadata==
Dalia A. Alghamdi, Damion M. Dooley, Gurinder Gosal, Emma J. Griffiths, Fiona S.L.
Brinkman and William W.L. Hsiao
1 BC Center for Disease Control, 655 W 12th Ave, Vancouver, BC V5Z 4R4, Canada
2 University of British Columbia, 2329 West Mall, Vancouver, BC V6T 1Z4, Canada
3 Simon Fraser University, 8888 University, Burnaby, BC V5A 1S6, Canada
* To whom correspondence should be addressed: Dalia.alghamdi@bccdc.ca
ABSTRACT
The FoodOn Food Ontology contains standardized terms and a facet-based classification scheme for describing food products, processing and environments. Mapping of foodborne pathogen isolate source information (descriptors of the contaminated materials and locations) to the FoodOn standard can facilitate data sharing and integration between multi-jurisdictional health and regulatory agencies utilizing disparate software platforms and data dictionaries. Faster and more efficient sharing of information is critical for tracking and controlling outbreaks of foodborne disease at local, national and international levels. This work describes mapping procedures which can be utilized by organizations and software developers to better enable interoperability between foodborne pathogen surveillance and outbreak management systems.
INTRODUCTION
Globalization of food networks increases opportunities for the spread of foodborne pathogens beyond borders and jurisdictions, with major impacts on global health and economies (Altekruse & Swerdlow, 1996; World Health Organization, 2008). Whole genome sequencing (WGS) provides the highest resolution evidence for identifying, typing and matching foodborne pathogen isolates from different sources. WGS results must be combined with source information to be meaningfully interpreted for regulatory and health interventions, outbreak investigation, and risk assessment. Isolate metadata (source of a pathogen) is critical for determining mode of disease transmission, sources of exposure and risk, susceptible populations, geographical distribution and more. Public health and regulatory agencies not only use different analytical platforms to track and resolve outbreaks, but implement different data dictionaries and free text descriptions for describing isolates and exposures. The most important factor in reducing the number of preventable cases of disease is timeliness of investigations and responses, which is negatively impacted by the time-consuming re-coding and manual curation required for translating non-standardized information between systems and agencies. To address the interoperability problem, it is important to relate similar concepts or relations from one agency, information management system or jurisdiction, to another. Mapping terms using an ontology represents a very powerful solution for standardizing and integrating heterogeneous data.
FoodOn (http://foodon.org) is an ontology resource that aims to model the food domain, which includes knowledge about food and food-related human activities, such as agriculture, medicine, food safety inspection, shopping patterns and sustainable development (Griffiths et al., 2016). Mapping of foodborne pathogen isolate metadata to FoodOn can provide a means for standardizing, translating, and communicating this critical contextual information between health agencies and platforms in a timely fashion. Here we describe a semi-automated method derived from mapping metadata from the widely used online microbial MLST typing platform Enterobase, which can be broadly applied to other use cases.
FOODON DESIGN PRINCIPLES
Although there are several existing indexing systems directly or indirectly related to food and food-borne illness, including those maintained by Health Canada, the US Department of Agriculture, and the UN’s Food and Agriculture Organization, they have been built for different purposes and so differences in their architecture hinder interoperability. To provide a more comprehensive view of food safety, data from these various sources must be integrated. In a concerted effort to solve this semantic interoperability problem, the OBOFoundry.org family of ontologies was established in 2007 in order to provide a comprehensive set of vocabularies in the biomedical domain. FoodOn, built largely on a longstanding American and European facet-based food indexing system called LanguaL (http://langual.org), provides a list of over 2,000 plant and animal food ingredient terms, as well as a supplemental list of over 9,000 indexed food products. Facets include fields for describing food processing, cooking and preservation, as well as source ingredient anatomy, taxonomy, geography and cultural heritage. The aim of FoodOn is to develop an international standard for describing properties of food related to agriculture, animal husbandry, collection, distribution, preservation, culinary use, consumption and food safety. FoodOn was accepted into the OBOFoundry in 2017.
FOODON MAPPING AND DATA HARMONIZATION
Microbial Multilocus Sequence Typing (MLST) is a technique used to classify and identify pathogenic strains for outbreak investigation and surveillance of contamination.
Enterobase is a widely used online platform enabling MLST analysis of enteric pathogens such as E. coli, Salmonella, Shigella, Yersinia and Moraxella. Enterobase contains more than 100,000 genomes, along with their source metadata, encompassing food, anatomical and clinical, as well as environmental domains.
Based on curation and mapping of Enterobase isolate metadata, we have developed a semi-automated ontology mapping system that will enable mapping of food safety metadata to FoodOn food products and processing environments according to the following steps: (1) Syntactic analysis, where each categorical term in a single free text entry is separated. (2) Semantic mapping of each categorical term according to one of the following rules: (a) mapping to a similar concept; (b) mapping to similar ancestors; (c) mapping to similar relations; (d) combining several matching techniques. (3) Structural mapping, where the items are mapped to a corresponding subclass in the reference hierarchy.
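The three steps above can be sketched in Python as follows. This is a minimal illustration and not the authors' system: the toy vocabulary, the placeholder identifiers (EX:...), the function names and the 0.8 similarity cutoff are all assumptions made for the example, and plain string similarity from the standard difflib module stands in for rule (a) only.

```python
import difflib
import re

# Hypothetical toy vocabulary: label -> (placeholder id, parent label).
# These are NOT real FoodOn identifiers or classes.
TOY_ONTOLOGY = {
    "food product": ("EX:0000", None),
    "poultry": ("EX:0001", "food product"),
    "chicken": ("EX:0002", "poultry"),
    "dairy product": ("EX:0003", "food product"),
    "cheese": ("EX:0004", "dairy product"),
}


def split_terms(entry):
    """Step 1 (syntactic analysis): split a free-text entry into terms."""
    parts = re.split(r"[;,/|]+", entry.lower())
    return [p.strip() for p in parts if p.strip()]


def match_term(term, cutoff=0.8):
    """Step 2 (semantic mapping): exact label match, else the closest label."""
    if term in TOY_ONTOLOGY:
        return term
    close = difflib.get_close_matches(term, list(TOY_ONTOLOGY), n=1, cutoff=cutoff)
    return close[0] if close else None


def ancestors(label):
    """Step 3 (structural mapping): path from the matched class up to the root."""
    path = []
    while label is not None:
        path.append(label)
        label = TOY_ONTOLOGY[label][1]
    return path


def map_entry(entry):
    """Map every term of an entry; unmatched terms map to None for manual review."""
    result = {}
    for term in split_terms(entry):
        label = match_term(term)
        result[term] = ancestors(label) if label else None
    return result
```

For example, `map_entry("Chicken; chese, soil")` resolves "chicken" and the misspelled "chese" to classes in the toy hierarchy and leaves "soil" unmapped (`None`), which flags it for manual review.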
Non-interactive ontology matching tools can be evaluated using recall and precision (Euzenat and Shvaiko, 2007). Both are measured by comparing the expected results with the results of the evaluated system. Precision measures the ratio of correctly matched terms to the total number of matched terms, whereas recall measures the ratio of correctly matched terms to the total number of terms expected to be matched. Precision thus evaluates the correctness of the evaluated system and recall its completeness (Euzenat, 2007). Here, we will evaluate our mapping accuracy by randomly sub-sampling a set of 500 records from Enterobase and evaluating the suitability of the assigned terms using recall and precision. We will further manually review all terms that cannot be automatically mapped to an ontological term and add them to the appropriate ontologies. The goal of our exercise is to minimize manual intervention when annotating food-related sample sources using FoodOn.
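The evaluation described above can be illustrated with a short Python sketch. The function names, the dictionary representation of mappings and the example terms are assumptions made for illustration; the paper does not specify how the comparison is implemented.

```python
import random


def precision_recall(predicted, expected):
    """Compare predicted mappings with a manually curated reference.

    Both arguments map a source term to an ontology label. A term the
    system could not map is simply absent from `predicted`.
    """
    correct = sum(1 for term, label in predicted.items() if expected.get(term) == label)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(expected) if expected else 0.0
    return precision, recall


def sample_records(records, k=500, seed=0):
    """Draw a reproducible random sub-sample of at most k records."""
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


# Hypothetical curated reference and system output for four source terms.
expected = {"chicken": "chicken", "chese": "cheese", "soil": "soil", "raw milk": "milk"}
predicted = {"chicken": "chicken", "chese": "cheese", "raw milk": "cheese"}
precision, recall = precision_recall(predicted, expected)
```

In this made-up case two of the three proposed mappings are correct, so precision is 2/3, and two of the four expected mappings were found, so recall is 0.5.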
Integrating genomic profiling of foodborne pathogens with a
food descriptor framework will help reduce barriers for
knowledge exchange among research communities, gov-
ernment risk managers and health providers.
ACKNOWLEDGEMENTS
We thank Mark Achtman, Nabil-Fareed Alikhan, Mark Pallen, Martin Sergeant and Zhemin Zhou for sharing Enterobase.
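REFERENCES
Altekruse, S. and Swerdlow, D. (1996). The changing epidemiology of foodborne disease. Am J Med Sci, 311: 23-29.
World Health Organization (2008). Foodborne disease outbreaks: guidelines for investigation and control. WHO Press, Geneva.
Griffiths, E., Dooley, D., Buttigieg, P.L., Hoehndorf, R., Brinkman, F., and Hsiao, W. (2016). FoodOn: A global farm-to-fork food ontology. ICBO Conference. Corvallis, OR, USA.
Euzenat, J. and Shvaiko, P. (2007). Ontology Matching. Springer.
Euzenat, J. (2007). Semantic precision and recall for ontology alignment evaluation. In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI'07), Rajeev Sangal, Harish Mehta, and R. K. Bagga (Eds.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 348-353.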