The ImmPort Antibody Ontology

William Duncan1, Travis Allen1,2, Jonathan Bona3, Olivia Helfer1, Barry Smith1,2,3, Alan Ruttenberg4, Alexander D. Diehl1,3,5
1 NYS Center of Excellence in Bioinformatics and Life Sciences, 2 Department of Philosophy, 3 Department of Biomedical Informatics, 4 Oral Diagnostics Sciences, 5 Department of Neurology
University at Buffalo, Buffalo, NY, USA
addiehl@buffalo.edu

Supported by NIGMS 2R01GM080646 (Protein Ontology), NIAID HHSN272201200028C (ImmPort), NIAID HHSN272201200028C (HIPC).

I. INTRODUCTION

Monoclonal antibodies are essential biomedical research and clinical reagents that are produced by companies and research laboratories. The NIAID Immunology Database and Analysis Portal (ImmPort) is a sustainable data warehouse for data generated by NIAID-, DAIT-, and DMID-funded studies, designed to allow long-term archiving and re-use of immunological data [1]. A variety of immunological data in ImmPort is generated using techniques that rely upon monoclonal antibody reagents, including flow cytometry, immunofluorescence, and ELISA. To facilitate querying, integration, and reuse of these data, standardized terminology for describing monoclonal antibody reagents and their targets needs to be used for annotating data submitted to ImmPort.

A major problem with monoclonal antibody-associated data is that data producers typically report antibody clones or markers using non-standardized terminology:

• CD3 vs. CD3e (protein names)
• HIT3e vs. UCHT1 (antibody clones for CD3e)
• 550367 vs. 300401 (catalog numbers for anti-CD3e antibody reagents)

To address this problem, we have created the ImmPort Antibody Ontology (AntiO) to provide a source of standardized names for monoclonal antibodies and their protein targets for use by ImmPort investigators and the scientific community in general, and to provide robust querying for monoclonal antibody reagents via a variety of criteria.

II. METHODS

We curated monoclonal antibody-protein target relationships by identifying names and information about monoclonal antibodies based on published papers, data submissions to ImmPort, and commercial monoclonal products for immunology research such as the BD Lyoplate products. We selected standardized monoclonal antibody names (clone names) and curated information about the protein targets of the antibodies using Protein Ontology and UniProt identifiers [2]. For both the monoclonal antibody clone names and the protein targets of the monoclonal antibodies, we have included many additional synonyms to facilitate querying.

The resulting antibody registry was transformed into the AntiO ontology using the Reagent Ontology (ReO) as a paradigm for the representation of monoclonal antibody reagents [3]. Monoclonal antibodies are classified via isotype and species of origin and are formally related to their protein targets via the recognizes relation. For example, monoclonal antibody clone HI100 recognizes some ‘receptor-type tyrosine-protein phosphatase C isoform CD45RA’. We supplemented the information in AntiO by creating classes for entries in the NIF Antibody Registry [4] that represent products that contain particular monoclonal antibody clones. These classes are types of ‘monoclonal antibody offering’ in our ontology and are linked to clone name classes via has_part relations. We have also mined and standardized additional information from the NIF Antibody Registry that is associated with particular monoclonal antibody offering classes, including information about product vendors, catalog numbers, conjugations (fluorochromes, biotin, etc.) of antibody products, antibody target species specificity, and experimental usage.

AntiO is built in an automated fashion using scripts that combine information about monoclonal antibodies and their targets found in curated spreadsheets with information text-mined from relevant NIF Antibody Registry entries to create a base set of OWL2 modular ontologies. These are imported into the AntiO ontology (see Figure 1) along with import files for ReO and Protein Ontology terms. Additional terms from the Ontology for Biomedical Investigations [5], the BioAssay Ontology [6], the Molecular Interactions Ontology [7], and the NCBI Taxonomy [8] are included as MIREOT’ed terms as well [9]. The resulting combined ontology is viewable and queryable in Protégé 5 [10], and is loaded into a publicly available RDF triple store for SPARQL queries.

Fig. 1. AntiO Ontology ImmPort Schema

III. RESULTS

AntiO contains 941 monoclonal antibodies in common use in immunology experiments, and represents about 30,000 monoclonal antibody products from 80 vendors based on information derived from the NIF Antibody Registry. We have included the NIF ‘AB_XXXXXX’ identifiers as part of our monoclonal antibody offering labels.

The AntiO triple store is based on OWLIM [11], is pre-reasoned, and contains over a million RDF triples. A variety of queries using AntiO are possible. One can, for instance, search for all monoclonal antibodies that have a particular protein target or, similarly, for all monoclonal antibody offerings (products) from a given vendor that have a particular target (Figure 2). More indirect querying is possible; for instance, one can search for the protein targets of monoclonal antibodies using only the catalog number of the products used. Searches can also be limited to antibodies that work only in particular types of experiments.

We have created a Bitbucket repository and wiki to provide information about the ontology, as well as example SPARQL queries (see Table I for URLs).

TABLE I
Important URLs

AntiO Triple Store    http://protein.ctde.net:8080/openrdf-workbench/repositories/antio/query
AntiO Wiki            https://bitbucket.org/wdduncan/antio/wiki/Home

IV. DISCUSSION

Through careful curation and data extraction using computer programs, we have developed an ontology of monoclonal antibodies used in immunological research with a focus on ImmPort clinical studies and other recently published papers in immunology. Our effort developing AntiO is complementary to existing antibody registries. While such resources let researchers find useful antibodies and the companies that produce them, they do not provide standardized terms for clone names, targets of the antibodies, conjugations, etc., and so are difficult to use computationally. In collaboration with the NIH-funded NIF Antibody Registry, we have developed a framework that will allow researchers to more easily query for monoclonal antibodies, the vendors that sell them, and their protein targets and experimental usage, and that provides standardized terminology for all these data types and more. Our long-term goal is to develop web interfaces that will enable submitters of data not only to query for monoclonal antibodies and their targets, but also to find experimental results, such as clinical studies within the ImmPort system, in which particular monoclonal antibodies were used.

Of further note is our reuse within AntiO of the compiled NIF Antibody Registry data on antibody products, which is part of the Research Resource Identification Project [4]. By associating the monoclonal antibody offerings in AntiO with the RRIDs provided by the NIF Antibody Registry, we ensure that AntiO contributes to the goals of the Research Resource Identification Project: linking to this common resource enables better reuse and integration of scientific data, while our careful curation and standardization steps add value to the NIF Antibody Registry data.

ACKNOWLEDGMENT

We thank Sanchita Bhattacharya, Patrick Dunn, Atul Butte, Matthew Brush, Melissa Haendel, and Anita Bandrowski for helpful comments and support.

REFERENCES

[1] Bhattacharya S, et al., “ImmPort: disseminating data to the public for the future of immunology,” Immunol Res. 2014, 58:234-9.
[2] Natale DA, et al., “Protein Ontology: a controlled structured network of protein entities,” Nucleic Acids Res. 2014, 42:D415-21.
[3] Brush MH, et al., “Developing a Reagent Application Ontology within the OBO Foundry,” 2011, http://ceur-ws.org/Vol-833/paper32.pdf.
[4] Bandrowski A, et al., “The Resource Identification Initiative: A cultural shift in publishing,” F1000Res. 2015, 4:134.
[5] Bandrowski A, et al., “The Ontology for Biomedical Investigations,” PLoS One. 2016, 11:e0154556.
[6] Visser U, et al., “BioAssay Ontology (BAO): a semantic description of bioassays and high-throughput screening results,” BMC Bioinformatics. 2011, 12:257.
[7] Orchard S, Kerrien S, “Molecular interactions and data standardisation,” Methods Mol Biol. 2010, 604:309-18.
[8] Sayers EW, et al., “Database resources of the National Center for Biotechnology Information,” Nucleic Acids Res. 2009, 37:D5-15.
[9] Courtot M, et al., “MIREOT: The minimum information to reference an external ontology term,” Applied Ontology. 2011, 6:23-33.
[10] http://protege.stanford.edu
[11] Kiryakov A, Ognyanov D, Manov D, “OWLIM–a pragmatic semantic repository for OWL,” 2005, in Web Information Systems Engineering–WISE 2005 Workshops, Springer Berlin Heidelberg.

    SELECT distinct ?offering ?vendor ?clone ?target WHERE {
      ?offeringt rdfs:subClassOf offering: .
      ?vendori rdf:type vendor: .
      ?clonet rdfs:subClassOf mAB: .
      ?targett rdfs:subClassOf protein: .
      ?r1 owl:onProperty has_part: .
      ?r1 owl:someValuesFrom ?clonet .
      ?offeringt rdfs:subClassOf ?r1 .
      ?r2 owl:onProperty is_sold_by: .
      ?r2 owl:hasValue ?vendori .
      ?offeringt rdfs:subClassOf ?r2 .
      ?r3 owl:onProperty recognizes: .
      ?r3 owl:someValuesFrom ?targett .
      ?clonet rdfs:subClassOf ?r3 .
      ?offeringt rdfs:label ?offering .
      ?clonet rdfs:label ?clone .
      ?vendori rdfs:label ?vendor .
      ?targett rdfs:label ?target .
      filter(?clonet != mAB:)
      filter(?vendor = "Abcam")
      filter(?target = "E-selectin")
    }

Fig. 2. Example SPARQL query and Results (see wiki for complete query listing).
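
The query in Fig. 2 hard-codes one vendor ("Abcam") and one target ("E-selectin"). As a minimal sketch of how the same pattern might be reused for other vendor and target labels, the Python below assembles the query text from parameters. It is not part of the AntiO tooling: the names FIG2_PATTERN, sparql_literal, and build_offering_query are hypothetical, and because the figure does not show the PREFIX declarations (the complete listing is on the wiki), the sketch takes them as an argument rather than guessing them.

```python
# Graph pattern copied from Fig. 2. The prefixed names (offering:, mAB:, ...)
# depend on PREFIX declarations that the figure does not show.
FIG2_PATTERN = """\
SELECT distinct ?offering ?vendor ?clone ?target WHERE {
  ?offeringt rdfs:subClassOf offering: .
  ?vendori rdf:type vendor: .
  ?clonet rdfs:subClassOf mAB: .
  ?targett rdfs:subClassOf protein: .
  ?r1 owl:onProperty has_part: .
  ?r1 owl:someValuesFrom ?clonet .
  ?offeringt rdfs:subClassOf ?r1 .
  ?r2 owl:onProperty is_sold_by: .
  ?r2 owl:hasValue ?vendori .
  ?offeringt rdfs:subClassOf ?r2 .
  ?r3 owl:onProperty recognizes: .
  ?r3 owl:someValuesFrom ?targett .
  ?clonet rdfs:subClassOf ?r3 .
  ?offeringt rdfs:label ?offering .
  ?clonet rdfs:label ?clone .
  ?vendori rdfs:label ?vendor .
  ?targett rdfs:label ?target .
  filter(?clonet != mAB:)
"""


def sparql_literal(text: str) -> str:
    """Quote text as a SPARQL string literal, escaping backslashes, quotes, and line breaks."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def build_offering_query(prefixes: str, vendor: str, target: str) -> str:
    """Return the Fig. 2 query restricted to one vendor label and one target label.

    `prefixes` must hold the PREFIX declarations from the complete query
    listing; they are an input here because the figure omits them.
    """
    filters = (
        f"  filter(?vendor = {sparql_literal(vendor)})\n"
        f"  filter(?target = {sparql_literal(target)})\n"
        "}\n"
    )
    return prefixes.strip() + "\n" + FIG2_PATTERN + filters


if __name__ == "__main__":
    # Placeholder only: a SPARQL comment standing in for the real PREFIX lines.
    placeholder_prefixes = "# PREFIX declarations from the complete query listing go here"
    print(build_offering_query(placeholder_prefixes, "Abcam", "E-selectin"))
```

Escaping the labels before they are spliced into the filter clauses keeps a vendor or target name that contains a quotation mark from breaking the query. The resulting text could then be pasted into the triple store's query form listed in Table I once the real PREFIX lines are supplied.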