<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Formalizing the Representation of Immune Exposures for Human Immunology Studies</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Randi</forename><surname>Vita</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Division for Vaccine Discovery</orgName>
								<orgName type="institution">La Jolla Institute for Allergy and Immunology</orgName>
								<address>
									<settlement>La Jolla</settlement>
									<region>California</region>
									<country key="US">U.S.A</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">James</forename><forename type="middle">A</forename><surname>Overton</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Division for Vaccine Discovery</orgName>
								<orgName type="institution">La Jolla Institute for Allergy and Immunology</orgName>
								<address>
									<settlement>La Jolla</settlement>
									<region>California</region>
									<country key="US">U.S.A</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Kei-Hoi</forename><surname>Cheung</surname></persName>
							<affiliation key="aff1">
								<orgName type="department" key="dep1">Department of Emergency Medicine</orgName>
								<orgName type="department" key="dep2">Yale Center for Medical Informatics</orgName>
								<orgName type="institution">Yale School of Medicine</orgName>
								<address>
									<settlement>New Haven</settlement>
									<region>CT</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Steven</forename><forename type="middle">H</forename><surname>Kleinstein</surname></persName>
							<affiliation key="aff2">
								<orgName type="department">Department of Pathology</orgName>
								<orgName type="institution">Yale School of Medicine</orgName>
								<address>
									<settlement>New Haven</settlement>
									<region>Connecticut</region>
									<country key="US">U.S.A</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Bjoern</forename><surname>Peters</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Division for Vaccine Discovery</orgName>
								<orgName type="institution">La Jolla Institute for Allergy and Immunology</orgName>
								<address>
									<settlement>La Jolla</settlement>
									<region>California</region>
									<country key="US">U.S.A</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff3">
								<orgName type="department">Interdepartmental Program in Computational Biology and Bioinformatics</orgName>
								<orgName type="institution">Yale University</orgName>
								<address>
									<settlement>New Haven</settlement>
									<region>Connecticut</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Formalizing the Representation of Immune Exposures for Human Immunology Studies</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">D108F50D186C0455F3DBFBB699379233</idno>
					<idno type="DOI">10.6084/m9.figshare.6741791.v1</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T05:05+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>immune exposure</term>
					<term>modeling</term>
					<term>HIPC</term>
					<term>ontology</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Human immunology studies typically examine how immune exposures associated with vaccinations, infectious, allergic or autoimmune diseases, or transplantations perturb the immune system, with the goal of developing diagnostic tools and therapeutic interventions. While there are established approaches to formally represent the experimental data generated in such studies, which often comprise gene expression data, flow cytometry data, or serology data, the description of the immune exposures themselves is not well standardized. Here we present a formal approach to representing immune exposures at a high level of granularity. We capture the exposure process (e.g. 'vaccination' or 'occurrence of allergic disease'), exposure material (e.g. 'Tdap vaccine' or 'House dust mite'), and the associated disease name and stage (e.g. 'allergic rhinitis' and 'chronic'). This representation scheme has been used successfully in the IEDB, and an extended version has been adopted by HIPC to capture studies in ImmPort. We report here on this scheme, our ongoing attempts to map the terms used to existing ontologies, and the challenges encountered.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>The Immunology Database and Analysis Portal (ImmPort) <ref type="bibr" target="#b0">[1]</ref> is the primary resource to capture human immunology studies funded by the National Institutes of Health, Division of Allergy, Immunology and Transplantation. ImmPort provides structured data fields to capture a variety of experimental data and free-text fields to store metadata on the cohorts from which subjects were recruited. This free-text cohort description typically contains a description of immune exposures that are expected to perturb the immune system. While free text allows for a detailed account of how a given study is conducted and how a cohort is defined, without standardization such descriptions are difficult to query and compare across many studies in a large database such as ImmPort.</p><p>In particular, ImmPort is the designated repository for data from studies performed by the Human Immunology Project Consortium (HIPC) <ref type="bibr">[2]</ref>, a collaboration between a number of centers aimed at performing large-scale human immunology studies with a focus on profiling the human immune response to natural infection and vaccination. A key goal of HIPC is to cross-compare results from different centers. To facilitate this, we set out to develop a standardized representation of immune exposures for HIPC studies that can be stored in ImmPort to represent their central elements in a structured format.</p><p>The need to represent immune exposures extends beyond the HIPC program. Most human immunology studies examine how the immune system responds to perturbations. Subjects are compared across cohorts and/or at defined time points that are intended to isolate the effect of immune exposures. 
The Immune Epitope Database (IEDB) <ref type="bibr" target="#b1">[3]</ref> implemented a structured representation of immune exposures that has been applied to model over one million experiments in which human samples were tested for T cell or B cell reactivity to specific epitopes. The IEDB representation of exposures is decoupled from the epitope mapping experiments, so we decided to test if it could be utilized as a basis to describe immune exposures for the HIPC program. By adapting the IEDB model for HIPC, we have developed an even more general representation of immune exposures that can be used by the wider scientific community.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. APPROACH</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Semi-formal Immune Exposure Representation</head><p>All HIPC centers funded by the middle of 2017 were asked to supply textual descriptions of the study designs that they planned to submit to ImmPort. We then examined the immune exposures that were part of these study designs and how they would be entered into the IEDB format. As a result of this process, we found that the broader scope of HIPC compared to the IEDB required an extension of the IEDB structured representation. In the following, we present the resulting expanded schema to represent immune exposures for HIPC, of which the IEDB immune exposures are a subset. This schema has been implemented by adding columns to the 'Human Subject Template' spreadsheet that is used to submit information to ImmPort.</p><p>We consider four elements critical to the description of an immune exposure, listed as the column headers in Table <ref type="table" target="#tab_1">I</ref>. The 'Exposure process' identifies the type of process through which a host was exposed and the type of evidence that the exposure happened, which are tightly intertwined. This is the only element of the four that was deemed mandatory. Based on the choice made for 'Exposure process', other elements are required or not applicable, as listed in Table <ref type="table" target="#tab_1">I</ref>. The 'Exposure material' describes what substance(s) the host was exposed to and/or developed immune reactions to as part of the exposure process. The 'Disease name' indicates the specific disease of the host associated with the exposure being described, and lastly, the 'Disease stage' provides a broad classification of how the disease had progressed at the time of the study. 
To illustrate how this representation was used in practice, Table <ref type="table" target="#tab_3">II</ref> shows three examples of studies by actual HIPC centers that involved immune exposures, described in free text (leftmost column), and how these were modeled using the four elements of the exposure scheme (columns to the right). These examples illustrate the three main types of exposure processes, namely 'administration', 'disease', and 'exposure without disease'. Thus, "Adults receiving a Varicella-zoster shot" would be modeled as a vaccination 'Exposure process' that delivered the Varicella-zoster virus vaccine as the 'Exposure material'. No disease resulted from this immune exposure.</p></div>
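<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a rough illustration of the four-element scheme described above, the following Python sketch models a single exposure record. The class name 'ImmuneExposure', its field names, and the 'validate' helper are hypothetical and are not part of the ImmPort template or the IEDB. Because the process-specific rules of Table I are not reproduced here, the sketch enforces only the one rule stated in the text, namely that 'Exposure process' is mandatory.</p>
<p><code lang="python"><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImmuneExposure:
    """One immune exposure described by the four elements of the scheme.

    Field names are illustrative assumptions, not official column headers.
    """
    exposure_process: str                    # mandatory, e.g. "vaccination"
    exposure_material: Optional[str] = None  # e.g. "Varicella-zoster virus vaccine"
    disease_name: Optional[str] = None       # e.g. "allergic rhinitis"
    disease_stage: Optional[str] = None      # e.g. "chronic"


def validate(record: ImmuneExposure) -> list:
    """Return a list of problems; an empty list means the record passes.

    Only the rule stated in the text is checked: 'Exposure process' is the
    single mandatory element. The process-specific requirements of Table I
    are deliberately left out of this sketch.
    """
    problems = []
    if not record.exposure_process.strip():
        problems.append("Exposure process is mandatory")
    return problems


# "Adults receiving a Varicella-zoster shot": a vaccination, no resulting disease.
shot = ImmuneExposure(
    exposure_process="vaccination",
    exposure_material="Varicella-zoster virus vaccine",
)
print(validate(shot))  # prints []
```
]]></code></p>
<p>Leaving the disease fields empty mirrors the Varicella-zoster example, in which no disease resulted from the exposure. A fuller implementation would presumably add the per-process requirements from Table I to 'validate'.</p></div>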
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Ontology Mapping</head><p>Our intent is to map each of the four data elements described above to ontology terms with textual and logical definitions, ideally derived from established ontologies covering the various domains. For 'Exposure process', all allowed values are listed in the first column of Table <ref type="table" target="#tab_1">I</ref>. This collection of options has been assembled by the IEDB team over the past 13 years and has proven to be robust and stable, with minimal modifications occurring in the last 5 years. Each of the options comes with a definition and rules for when it should be applied. These terms will be mapped to formal external ontology terms, as initiated in Supplementary Table <ref type="table">S1</ref> (https://doi.org/10.6084/m9.figshare.6741791.v1). The main challenge in this process is that terms for e.g. 'vaccination', 'infectious disease' and 'transplantation' come from different external ontologies, and presenting their definitions to users side by side is not helpful. We are planning to engage representatives of the different ontology communities and harmonize their definitions. Until this is done, we have proceeded with implementing temporary terms for this immune exposure model in ONTIE <ref type="bibr" target="#b3">[5]</ref>, which we intend to replace or merge with new or edited terms in the appropriate external ontologies.</p><p>In addition to the three main categories of immune exposure (administration, disease, exposure without disease) and their subtypes, there are two options (no exposure and unknown) which are not actual types of exposures but rather values that signify two different reasons why it is not possible or meaningful to fill out the exposure type for a given study subject. The value 'no exposure' is intended to be used for subjects that are enrolled as negative controls, and indicates specifically that these subjects have *not* been exposed to something. 
The value 'unknown' is used when samples are from subjects for which no relevant exposure information is available. This is applicable when, for example, a study utilizes samples from anonymous blood bank donors in order to establish a 'normal range'.</p><p>For 'Exposure material', the vast majority of HIPC studies submitted to us required specifying an organism that was either the causative agent of an infection, the agent of an exposure without infection, or used in a vaccine to protect against future infection. Organisms can be specified by the broadly utilized NCBI Taxonomy <ref type="bibr" target="#b4">[6]</ref>, which has the key advantage of linking organism specifications to sequence information in NCBI. All taxa from the NCBI Taxonomy are valid entries for 'Exposure material', and can be looked up at https://www.ncbi.nlm.nih.gov/taxonomy. One potential concern with this choice is that NCBI does not assign a new taxon to every organism isolate identified, although such resolution is desirable in some cases, such as drug-resistant M. tuberculosis isolates, where it is of interest to relate even single nucleotide differences to the efficacy of drug treatments. We expect that, going forward, a community consensus will develop on how to handle this, along the lines of grouping different isolates based on their NCBI GenBank ID under their closest parent taxon. Not all 'Exposure materials' in HIPC studies submitted to us were whole organisms. In the case of vaccinations, specific antigens are often used instead of whole organisms, as in subunit vaccines. Also, in multivalent vaccines, multiple organisms or antigens of organisms are combined into one vaccine. We plan to specify vaccines through the Vaccine Ontology (http://www.violinet.org/vaccineontology/) <ref type="bibr" target="#b5">[7]</ref>. 
It may be necessary to add new entries to the Vaccine Ontology to capture new experimental vaccines, but as vaccines administered to humans have to go through a stringent approval process, we do not expect this to overwhelm the Vaccine Ontology development team.</p><p>To specify the 'Disease name', the IEDB utilizes values from the Disease Ontology (DO) (http://disease-ontology.org/) <ref type="bibr" target="#b6">[8]</ref>, which has the advantage of providing mappings to most of the other vocabularies that could be considered, such as ICD-10, SNOMED CT, MeSH and UMLS. The IEDB has been successful in mapping the disease terms encountered in the literature to DO terms. In addition, the Disease Ontology is part of the OBO Foundry <ref type="bibr" target="#b7">[9]</ref> and thus more compatible with other basic research ontologies, providing explicit definitions and links to basic research domains, such as clarifying which infectious agent is causative for a given disease. Thus, our immune exposure model will continue to use DO, which was incorporated into ImmPort submission templates by requiring submitters to enter DO terms to describe the diseases of the study subjects.</p><p>In terms of 'Disease stage', the IEDB has defined three values that, in combination with the disease name, clarify some typical major distinctions in how a disease manifests in different study subjects: (1) 'acute/recent onset' is utilized for subjects that currently have symptomatic disease and may or may not clear it. (2) 'chronic' is utilized for subjects that persistently have a disease and it is not considered highly likely that they will soon clear the disease without (3) 'post' is utilized for subjects that have cleared a disease which they had in the past. So far, these broad categories have proven sufficient to also describe HIPC needs, although a more detailed description of disease-specific stages could be desirable in the future and we are open to further discussion.</p></div>
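<div xmlns="http://www.tei-c.org/ns/1.0"><p>The controlled values named in this section can be sketched in Python as follows. The names 'DiseaseStage', 'parse_stage' and 'is_doid' are hypothetical. The only facts taken from the text are the three 'Disease stage' values and the use of Disease Ontology terms for 'Disease name'. The pattern of 'DOID:' followed by digits reflects the usual compact form of DO identifiers, and the identifier used below is a placeholder that is not meant to refer to a real term.</p>
<p><code lang="python"><![CDATA[
```python
import re
from enum import Enum


class DiseaseStage(Enum):
    """The three 'Disease stage' values defined by the IEDB."""
    ACUTE_RECENT_ONSET = "acute/recent onset"
    CHRONIC = "chronic"
    POST = "post"


# Compact identifier shape assumed for Disease Ontology terms: "DOID:" + digits.
_DOID_PATTERN = re.compile(r"DOID:\d+")


def parse_stage(text: str) -> DiseaseStage:
    """Map a spreadsheet cell to a stage; raises ValueError for other values."""
    return DiseaseStage(text.strip().lower())


def is_doid(identifier: str) -> bool:
    """Check only the shape of the identifier, not whether the term exists."""
    return _DOID_PATTERN.fullmatch(identifier) is not None


print(parse_stage(" Chronic "))      # prints DiseaseStage.CHRONIC
print(is_doid("DOID:0000000"))       # placeholder identifier, shape check only
print(is_doid("allergic rhinitis"))  # a label, not an identifier
```
]]></code></p>
<p>A shape check of this kind cannot confirm that a term exists in DO. An interactive form would presumably look the term up in the ontology itself.</p></div>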
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. CHALLENGES AND CONCLUSIONS</head><p>The ability to formalize what otherwise would be free text is a significant step toward improving the integration of data across HIPC studies. More importantly, because HIPC adopted this model by adding columns to the Human Subject data submission template, all studies submitted to ImmPort can now include the same fields to describe immune exposures, and the HIPC studies will be better connected to other studies in ImmPort. To ease data entry for these fields and others into ImmPort spreadsheet templates, work is ongoing through the CEDAR <ref type="bibr" target="#b8">[10]</ref> effort and others to create interactive forms that will ensure that only valid terms are entered. Because newly entered data will be formalized, improved querying and comparison will be possible through standardized terminology. We fully expect that as more data are submitted to ImmPort using this scheme for HIPC, questions will continue to arise, and based on our experience with the IEDB, we expect to handle them by consulting domain experts for the disease of interest. Controversial cases will be presented to the Clinical Subcommittee to ensure that decisions are made uniformly across the HIPC program. Overall, it has to be stressed that the structured representation of immune exposures is not intended to fully represent every nuance of each study, but rather to enable a computable, high-level comparison of immune exposures across studies. A reassessment of how well this model meets the needs of the community and how it improves the quality of the data after several months of use would be beneficial.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>TABLE I.</head><label>I</label><figDesc>Four structured elements to describe immune exposures.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>TABLE II.</head><label>II</label><figDesc>Three examples of immune exposures modeled in this schema.</figDesc><table /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0">Proceedings of the 9th International Conference on Biological Ontology (ICBO 2018), Corvallis, Oregon, USA</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_1">August 7-10, 2018  </note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ACKNOWLEDGEMENTS</head><p>This work was supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under Award Numbers NIH U19 AI118610 and U19AI089992. It would not have been possible without strong support from the ImmPort team, and Patrick Dunn in particular.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">ImmPort: disseminating data to the public for the future of immunology</title>
		<author>
			<persName><forename type="first">S</forename><surname>Bhattacharya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Andorf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Gomes</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Immunol Res</title>
		<imprint>
			<biblScope unit="volume">58</biblScope>
			<biblScope unit="issue">2-3</biblScope>
			<biblScope unit="page" from="234" to="239" />
			<date type="published" when="2014-05">May 2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">The immune epitope database (IEDB) 3.0</title>
		<author>
			<persName><forename type="first">R</forename><surname>Vita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Overton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Greenbaum</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nucleic Acids Res</title>
		<imprint>
			<biblScope unit="volume">43</biblScope>
			<biblScope unit="page" from="405" to="412" />
			<date type="published" when="2014-10">October 2014</date>
		</imprint>
	</monogr>
	<note>Database issue</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The Ontology for Biomedical Investigations</title>
		<author>
			<persName><forename type="first">A</forename><surname>Bandrowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Brinkman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Brochhausen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PLoS One</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="issue">4</biblScope>
			<date type="published" when="2016-04">Apr 2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">ONTology of Immune Epitopes (ONTIE) Representing the Immune Epitope Database in OWL</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Greenbaum</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zarebski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The 12th Annual Bio-Ontologies Meeting, ISMB</title>
				<imprint>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="45" to="48" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Database resources of the National Center for Biotechnology Information</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">W</forename><surname>Sayers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Barrett</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Benson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nucleic Acids Res</title>
		<imprint>
			<biblScope unit="volume">37</biblScope>
			<biblScope unit="page" from="D5" to="D15" />
			<date type="published" when="2009-05">May 2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">VO: Vaccine Ontology</title>
		<author>
			<persName><forename type="first">Y</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Cowell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">D</forename><surname>Diehl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The 1st International Conference on Biomedical Ontology (ICBO 2009)</title>
				<meeting><address><addrLine>Buffalo, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Nature Precedings</publisher>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Disease Ontology 2015 update: an expanded and updated database of human diseases for linking biomedical knowledge through disease data</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">A</forename><surname>Kibbe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Arze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Felix</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nucleic Acids Res</title>
		<imprint>
			<biblScope unit="volume">43</biblScope>
			<biblScope unit="page" from="1071" to="1078" />
			<date type="published" when="2015-01">January 2015</date>
		</imprint>
	</monogr>
	<note>Database issue</note>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration</title>
		<author>
			<persName><forename type="first">B</forename><surname>Smith</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ashburner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Rosse</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nat Biotechnol</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="issue">11</biblScope>
			<biblScope unit="page" from="1251" to="1255" />
			<date type="published" when="2007-11">November 2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">The center for expanded data annotation and retrieval</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Musen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">A</forename><surname>Bean</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">H</forename><surname>Cheung</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J Am Med Inform Assoc</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="1148" to="1152" />
			<date type="published" when="2015-11">November 2015</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
