=Paper= {{Paper |id=Vol-1747/IP30_ICBO2016 |storemode=property |title=Supporting Database Annotations and Beyond with the Evidence & Conclusion Ontology (ECO) |pdfUrl=https://ceur-ws.org/Vol-1747/IP30_ICBO2016.pdf |volume=Vol-1747 |authors=Marcus Chibucos,Suvarna Nadendla,James Munro,Elvira Mitraka,Dustin Olley,Nicole Vasilevsky,Matthew Brush,Michelle Giglio |dblpUrl=https://dblp.org/rec/conf/icbo/ChibucosNMMOVBG16 }} ==Supporting Database Annotations and Beyond with the Evidence & Conclusion Ontology (ECO) == https://ceur-ws.org/Vol-1747/IP30_ICBO2016.pdf
Supporting database annotations and beyond with the Evidence & Conclusion Ontology (ECO)

Marcus C. Chibucos1*, Suvarna Nadendla1, James B. Munro1, Elvira Mitraka1, Dustin Olley1, Nicole A. Vasilevsky2, Matthew H. Brush2, Michelle Giglio1

1 Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, United States of America
2 Ontology Development Group, Library, Oregon Health & Science University, Portland, OR, United States of America

*Corresponding author: mchibucos@som.umaryland.edu; (410) 705-0885; 801 W. Baltimore St., Baltimore, MD, 21201



Abstract: The Evidence & Conclusion Ontology (ECO) is a community standard for summarizing evidence in scientific research in a controlled, structured way. Annotations at the world's most frequented biological databases (e.g. model organism databases, UniProt, Gene Ontology) are supported using ECO terms. ECO describes evidence derived from experimental and computational methods, author statements curated from the literature, inferences drawn by curators, and other types of evidence. Here, we describe recent ECO developments and collaborations, most notably: (i) a new ECO website containing user documentation, up-to-date news, and visualization tools; (ii) improvements to the ontology structure; (iii) implementing logic via an ongoing collaboration with the Ontology for Biomedical Investigations (OBI); (iv) addition of numerous experimental evidence types; and (v) addition of new evidence classes describing computationally derived evidence. Due to its utility, popularity, and simplicity, ECO is now expanding into realms beyond the protein annotation community, for example the biodiversity and phenotype communities. As ECO continues to grow as a resource, we are seeking new users and new use cases, with the hope that ECO will continue to be a broadly used and easy-to-implement community standard for representing evidence in diverse biological applications. Feel free to visit two ECO-sponsored workshops at ICBO 2016 to learn more: 1. “An introduction to the Evidence and Conclusion Ontology and representing evidence in scientific research” and 2. “OBI-ECO Interactions & Evidence”.

Keywords: annotation; biodiversity; biomedical investigation; conclusion; confidence; curation; evidence; experimental evidence; inference; provenance; sequence similarity.

I. INTRODUCTION

The Evidence & Conclusion Ontology (ECO) [1] summarizes types of scientific evidence associated with biological research. Evidence can arise from laboratory experiments, computational methods, manual literature curation, or other means. Researchers, biocurators, and database managers use this evidence to justify their conclusions and support resulting assertions, for example stating that a given protein has a particular function.

Summarizing evidence with ECO allows projects such as the UniProt-Gene Ontology Annotation (UniProt-GOA) project [2] to manage large volumes of annotations in a convenient fashion, as both data management and query applications are supported by systematically describing evidence. Because ECO terms are ontology terms, they contain standard definitions and are networked using defined relationships. Thus, associating research data with descriptions of evidence using ECO can allow, for example, faceted queries of large datasets and implementations of customized quality control mechanisms.

II. ESSENTIALS OF ECO

A. Basic ECO structure

As depicted in Fig. 1, ECO comprises two high-level classes, ‘evidence’ (ECO:0000000) and ‘assertion method’ (ECO:0000217). The definition of ‘evidence’ is “a type of information that is used to support an assertion” and ‘assertion method’ is defined as “a means by which a statement is made about an entity” [1]. Together ‘evidence’ and ‘assertion method’ can be combined to describe both the support for an assertion and whether the assertion was generated by manual or automatic means. ECO terms descend mainly from the ‘evidence’ hierarchy. However, ‘evidence’ leaf terms are related to the ‘assertion method’ terms by the ‘used_in’ relationship. Thus, one can assert not only what evidence is used to support a particular assertion, but also whether the assertion was made by a human being or a computer (Fig. 1).

B. Traditional uses of ECO

Some traditional example applications of ECO are found in uses by the Gene Ontology [3]: (a) hierarchical ECO classes are used to support structured data queries; (b) when a protein is annotated based on sequence similarity to another annotated protein, the identity of that protein must be recorded in the annotation file along with the evidence from ECO; (c) quality control can be enforced by allowing annotations to certain terms from a given ontology to be supported only by particular evidence types, with non-conforming annotations flagged for review; and (d) circular annotations based on computational predictions alone can be detected, and thus avoided. In the ways described above, ECO has been used by many databases (e.g. UniProt, model organism databases, Gene Ontology, et cetera) to support protein annotations. However, ECO has additional uses.

C. Recent ECO term development

A growing number of resources/applications use ECO (more than 40 of which we are aware). ECO has recently expanded its

        This work is supported by funding from the National Science Foundation
    Division of Biological Infrastructure under award number 1458400.
Fig. 1. ECO root classes and combinatorial terms. Leaf terms depicted are logically defined as the ‘evidence’ parent class (‘match
to InterPro member signature evidence’) related to the ‘assertion method’ class via the ‘used_in’ relationship (gray boxes).
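
The combinatorial structure in Fig. 1 can be illustrated with a minimal Python sketch. It is not ECO's actual data model or any official API. The two root identifiers are those quoted in the text; every identifier beginning with "EX:" is a hypothetical placeholder, the hierarchy is heavily simplified (intermediate classes are omitted), and the annotation records are invented for illustration only.

```python
from dataclasses import dataclass
from typing import Iterator, Optional

# Root IDs as quoted in the paper. All "EX:" IDs are hypothetical placeholders.
EVIDENCE = "ECO:0000000"
ASSERTION_METHOD = "ECO:0000217"


@dataclass(frozen=True)
class Term:
    label: str
    parent: Optional[str] = None   # is_a parent; None for a root class
    used_in: Optional[str] = None  # assertion method; set on leaf terms only


# Simplified hierarchy (assumption): real ECO has many intermediate classes.
TERMS = {
    EVIDENCE: Term("evidence"),
    ASSERTION_METHOD: Term("assertion method"),
    "EX:0001": Term("manual assertion", parent=ASSERTION_METHOD),
    "EX:0002": Term("automatic assertion", parent=ASSERTION_METHOD),
    "EX:0003": Term("match to InterPro member signature evidence",
                    parent=EVIDENCE),
    "EX:0004": Term("match to InterPro member signature evidence "
                    "used in manual assertion",
                    parent="EX:0003", used_in="EX:0001"),
    "EX:0005": Term("match to InterPro member signature evidence "
                    "used in automatic assertion",
                    parent="EX:0003", used_in="EX:0002"),
    "EX:0006": Term("author statement", parent=EVIDENCE),
}

# Invented annotation records, each pointing at one evidence term.
ANNOTATIONS = [
    {"protein": "protein_A", "evidence": "EX:0004"},
    {"protein": "protein_B", "evidence": "EX:0005"},
    {"protein": "protein_C", "evidence": "EX:0006"},
]


def ancestors(term_id: str) -> Iterator[str]:
    """Yield term_id itself, then each is_a ancestor up to the root."""
    current: Optional[str] = term_id
    while current is not None:
        yield current
        current = TERMS[current].parent


def is_a(term_id: str, ancestor_id: str) -> bool:
    """True if term_id equals ancestor_id or descends from it."""
    return ancestor_id in ancestors(term_id)


def assertion_method(term_id: str) -> Optional[str]:
    """Label of the assertion method linked via used_in, if any."""
    method_id = TERMS[term_id].used_in
    return TERMS[method_id].label if method_id is not None else None


def filter_by_evidence(annotations, ancestor_id: str):
    """Keep annotations whose evidence term falls under ancestor_id."""
    return [a for a in annotations if is_a(a["evidence"], ancestor_id)]
```

Under these assumptions, filter_by_evidence(ANNOTATIONS, "EX:0003") selects every annotation supported by an InterPro match regardless of how the assertion was made, while assertion_method("EX:0005") reports whether a given leaf term records a manual or an automatic assertion.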

evidence representation through collaborations with many groups, for example: IntAct [4] (biological system reconstruction), CollecTF [5] (motif prediction), Ontology of Microbial Phenotypes [6] (microbial assays), Planteome (http://planteome.org; genotype-phenotype associations), Gene Ontology [3] (logical inference & synapse research techniques), SwissProt [7] (diverse experimental assays), and UniProt [2,7] (detection techniques).

III. THE FUTURE OF ECO

A. Increasing the logic within ECO

In May 2016, 14 people met in person at the Institute for Genome Sciences in Baltimore, MD, while approximately seven others joined remotely, to discuss modeling scientific research evidence [8]. An objective of the meeting, titled “OBI-ECO Baltimore 2016: Evidence,” was to devise strategies for cross-ontology coordination between ECO and the Ontology for Biomedical Investigations (OBI) [9]. One outcome agreed at the meeting was to logically define ECO ‘experimental evidence’ classes using OBI classes. This work is under way, and issues and areas for development in both ontologies are being cataloged. Follow-up discussions and a review of this ongoing work will take place at ICBO 2016 at workshop W08 titled “OBI-ECO Interactions & Evidence” and participation by any interested users is welcome.

B. Beyond protein annotation

Although ECO was originally created circa 2000 to support gene product annotation by the Gene Ontology, today ECO is used by many groups concerned with evidence, and even provenance, in scientific research. While numerous experimental and computational evidence types have been added to ECO on behalf of a number of resources (see above and www.evidenceontology.org), the ECO user base and diversity of applications continue to increase.

Some examples of new/potential ECO users include WikiData (https://www.wikidata.org), the deep sea community (https://github.com/geneontology/deep_sea), the biodiversity and phenotype communities, and the Disease Ontology [10]. Specific examples of these will be addressed at the ICBO 2016 workshop titled “An introduction to the Evidence and Conclusion Ontology and representing evidence in scientific research” (workshop W11) and new users and adopters are especially encouraged to attend to learn more.

ACKNOWLEDGMENT

The authors acknowledge the Ontology for Biomedical Investigations (OBI) Consortium and, in particular, Bjoern Peters for ongoing collaboration with ECO. We thank Christian J. Stoeckert, Jr. and Jie Zheng for co-organizing the ICBO 2016 workshop W08 titled “OBI-ECO Interactions & Evidence.”

REFERENCES

[1] M.C. Chibucos, C.J. Mungall, R. Balakrishnan, K.R. Christie, R.P. Huntley, O. White, J.A. Blake, S.E. Lewis, and M. Giglio, “Standardized description of scientific evidence using the Evidence Ontology (ECO),” Database (Oxford), v.2014:bau075, 2014.
[2] E.C. Dimmer, R.P. Huntley, Y. Alam-Faruque, T. Sawford, C. O'Donovan, M.J. Martin, et al., “The UniProt-GO Annotation database in 2011,” Nucleic Acids Res., 40, D565–D570, 2012.
[3] The Gene Ontology Consortium, “Gene Ontology Consortium: going forward,” Nucleic Acids Res., 43(Database issue):D1049-1056, 2015.
[4] B.H.M. Meldal, O. Forner-Martinez, M.C. Costanzo, J. Dana, J. Demeter, M. Dumousseau, et al., “The complex portal – an encyclopaedia of macromolecular complexes,” Nucleic Acids Res., nar.gku975, 2014.
[5] S. Kılıç, D.M. Sagitova, S. Wolfish, B. Bely, M. Courtot, S. Ciufo, et al., “From data repositories to submission portals: rethinking the role of domain-specific databases in CollecTF,” Database, v.2016:baw055, 2016.
[6] M.C. Chibucos, A.E. Zweifel, J. Herrera, W. Meza, S. Eslamfam, P. Uetz, et al., “An ontology for microbial phenotypes,” BMC Microbiology, 14(1):294, 2014.
[7] The UniProt Consortium, “UniProt: a hub for protein information,” Nucleic Acids Res., 43(Database issue):D204-212, 2015.
[8] The OBI Consortium, et al., “Cross-community ontological modeling of scientific evidence,” unpublished.
[9] A. Bandrowski, R. Brinkman, M. Brochhausen, M.H. Brush, B. Bug, M.C. Chibucos, et al., “The Ontology for Biomedical Investigations,” PLoS One, 11(4):e0154556, 2016.
[10] W.A. Kibbe, C. Arze, V. Felix, E. Mitraka, E. Bolton, G. Fu, et al., “Disease Ontology 2015 update: an expanded and updated database of human diseases for linking biomedical knowledge through disease data,” Nucleic Acids Res., Oct 27. pii: gku1011, 2014.