<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Using Ontologies to Drive the Creation of High-Quality Metadata in CEDAR</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Rafael</forename><forename type="middle">S</forename><surname>Gonçalves</surname></persName>
							<email>rafael.goncalves@stanford.edu</email>
							<affiliation key="aff0">
								<orgName type="department">Stanford Center for Biomedical Informatics Research</orgName>
								<orgName type="institution">Stanford University</orgName>
								<address>
									<settlement>Stanford</settlement>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Csongor</forename><forename type="middle">I</forename><surname>Nyulas</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Stanford Center for Biomedical Informatics Research</orgName>
								<orgName type="institution">Stanford University</orgName>
								<address>
									<settlement>Stanford</settlement>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Marcos</forename><surname>Martínez-Romero</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Stanford Center for Biomedical Informatics Research</orgName>
								<orgName type="institution">Stanford University</orgName>
								<address>
									<settlement>Stanford</settlement>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Martin</forename><forename type="middle">J</forename><surname>O'Connor</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Stanford Center for Biomedical Informatics Research</orgName>
								<orgName type="institution">Stanford University</orgName>
								<address>
									<settlement>Stanford</settlement>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">John</forename><surname>Graybeal</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Stanford Center for Biomedical Informatics Research</orgName>
								<orgName type="institution">Stanford University</orgName>
								<address>
									<settlement>Stanford</settlement>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Mark</forename><forename type="middle">A</forename><surname>Musen</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Stanford Center for Biomedical Informatics Research</orgName>
								<orgName type="institution">Stanford University</orgName>
								<address>
									<settlement>Stanford</settlement>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Using Ontologies to Drive the Creation of High-Quality Metadata in CEDAR</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">23A9C2626554836C1B1640C8FC6A78D0</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T08:11+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Metadata</term>
					<term>metadata authoring</term>
					<term>metadata repository</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Center for Expanded Data Annotation and Retrieval (CEDAR) developed a suite of tools, the CEDAR Workbench, that allows users to build metadata templates using ontologies to annotate template fields and to constrain the options available to metadata authors for specific fields; to fill in those templates with metadata; to upload data and their metadata to online repositories; and to perform searches over the metadata stored in CEDAR's metadata repository. The CEDAR Workbench is released under a BSD 2-Clause open-source license, and it is freely available at https://metadatacenter.org.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>We present the CEDAR <ref type="bibr" target="#b0">[1]</ref> software for producing high-quality, structured, standards-based metadata. The software we have developed, the CEDAR Workbench <ref type="bibr" target="#b1">[2]</ref>, is a suite of Web-based tools and APIs that lets users build highly modular metadata acquisition forms (templates) that can be annotated with ontology terms, and whose fields can be constrained using terms or branches of terms from ontologies. Rather than requiring a single monolithic template, CEDAR allows users to recursively construct templates from existing, more granular templates. CEDAR template designers can share templates with individuals or groups: the metadata authors, who fill in the metadata templates, validate field entries, and submit the metadata to online repositories. The metadata produced using CEDAR templates adhere, by design, to the FAIR data principles <ref type="bibr" target="#b2">[3]</ref>. Our ultimate goal is to provide scientists with a robust, end-to-end software solution to author and to manage high-quality FAIR metadata about scientific experiments.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">The CEDAR Workbench</head><p>The CEDAR Workbench is an open-source Web-based platform for the acquisition, storage, search, and reuse of metadata templates and metadata instances. At the core of the CEDAR technology lies a lightweight, standards-based model <ref type="bibr" target="#b3">[4]</ref> designed to provide a common format for describing templates and metadata. All CEDAR resources are represented as JSON-LD documents that conform to our model, which is specified by a JSON Schema. These resources can be viewed and retrieved as RDF documents. Fig. <ref type="figure" target="#fig_0">1</ref> shows an overview of CEDAR. The following are the main components of the CEDAR Workbench software. Resource Manager. Template authors and scientists who use the CEDAR Workbench are initially presented with the Resource Manager tool. The Resource Manager allows users to create and store resources in the CEDAR Metadata Repository; to organize templates and metadata into folders; and to search for these resources. From the Resource Manager, users can define groups composed of their team members for purposes of collaboration. CEDAR users can share resources (with read or write permissions) among users, among groups, or with the general community.</p><p>Template Designer. Template authors can build metadata templates using the Template Designer. In the Template Designer, users piece together fields of various types (e.g., text, checkbox, and multiple choice) to form templates. Possible field values can be constrained to terms from ontologies using an interactive look-up service linked to NCBO's BioPortal <ref type="bibr" target="#b4">[5]</ref>. With the BioPortal lookup service (Fig. <ref type="figure" target="#fig_1">2</ref>), users can interactively create new ontology terms (which can be mapped to terms in other ontologies) and value sets at template design-time for their annotation purposes. 
The metadata templates and their fields can be annotated using properties from ontologies.</p><p>Metadata Editor. Scientists generate metadata instances by filling in metadata templates using the Metadata Editor. This tool builds a metadata-acquisition form interface from template specifications built in the Template Designer. We implemented a computer-assisted value recommender <ref type="bibr" target="#b5">[6]</ref> in the Metadata Editor that provides context-sensitive suggestions for field values during metadata submission. The value recommender learns associations between field values in previous metadata entries using rule mining, and ranks candidate values by their applicability to specific fields. The goal of the value recommender is to ease the burden of authoring high-quality metadata. Metadata generated through CEDAR templates can be submitted to external repositories, such as the NCBI BioSample <ref type="bibr" target="#b6">[7]</ref> and SRA <ref type="bibr" target="#b7">[8]</ref> repositories, or the ImmPort repository for immunology-related datasets <ref type="bibr" target="#b8">[9]</ref>. The CEDAR Workbench can be used through the Web-based components described above, or through the CEDAR API, a collection of REST-based services that provide comprehensive access to the CEDAR ecosystem. The API allows creating, reading, updating, and deleting CEDAR resources programmatically. With this API, users can also export templates or metadata to other repositories or applications. All our software is distributed and versioned on GitHub, at https://github.com/metadatacenter.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1.</head><label>1</label><figDesc>Figure 1. Overview of CEDAR. 
Users can manage and search for resources; design metadata templates using ontologies from BioPortal; create metadata with support from intelligent authoring features; validate metadata; and upload metadata to external repositories.</figDesc><graphic coords="2,204.24,218.83,78.40,162.81" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2.</head><label>2</label><figDesc>Figure 2. The BioPortal lookup service allows users to search for ontology terms in BioPortal, browse ontologies, and select terms or branches of terms to constrain the possible options available when filling in specific fields in CEDAR metadata templates. It also allows users to create new terms, and to create value sets made up of existing terms in BioPortal.</figDesc><graphic coords="3,127.93,243.61,340.08,320.09" type="bitmap" /></figure>
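<div xmlns="http://www.tei-c.org/ns/1.0"><p>The value recommender described in Section 2 learns associations between field values in earlier metadata entries and ranks candidate values for a field. The Python sketch below illustrates that idea only; it is not CEDAR's implementation. It assumes a simplified form of rule mining in which each rule has a single antecedent (one filled-in field and its value) and is scored by its confidence. The function names mine_rules and recommend, the field names, and the sample records are all hypothetical.</p>

```python
from collections import Counter, defaultdict


def mine_rules(records, target_field):
    """Count how often each (field, value) pair co-occurs with each target value.

    Hypothetical helper: a stand-in for rule mining restricted to
    single-antecedent rules of the form (field = value) implies (target = t).
    """
    pair_counts = defaultdict(Counter)
    for record in records:
        if target_field not in record:
            continue
        target_value = record[target_field]
        for field, value in record.items():
            if field != target_field:
                pair_counts[(field, value)][target_value] += 1
    return pair_counts


def recommend(pair_counts, context, top_k=3):
    """Rank candidate target values by summed rule confidence over the context."""
    scores = Counter()
    for field, value in context.items():
        counts = pair_counts.get((field, value))
        if not counts:
            continue  # no earlier entry used this field value
        total = sum(counts.values())
        for target_value, count in counts.items():
            # Confidence of the rule: share of matching entries with this target.
            scores[target_value] += count / total
    return [value for value, _ in scores.most_common(top_k)]


# Made-up earlier metadata entries (assumption, for illustration only).
records = [
    {"organism": "Homo sapiens", "tissue": "liver", "disease": "hepatitis"},
    {"organism": "Homo sapiens", "tissue": "liver", "disease": "hepatitis"},
    {"organism": "Homo sapiens", "tissue": "lung", "disease": "asthma"},
    {"organism": "Mus musculus", "tissue": "liver", "disease": "hepatitis"},
]

rules = mine_rules(records, "disease")
suggestions = recommend(rules, {"organism": "Homo sapiens", "tissue": "lung"})
print(suggestions)
```

<p>In this sketch, an author who has already entered an organism and a tissue receives disease suggestions ordered by how strongly earlier entries associate those values with each disease. A production recommender would presumably also apply support thresholds and handle multi-field antecedents, which this sketch omits.</p></div>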
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgements</head><p>CEDAR is supported by NIAID grant U54 AI117925 through funds provided by the trans-NIH Big Data to Knowledge (BD2K) initiative. The NCBO BioPortal has been supported by the NIH Common Fund under grant U54HG004028.</p></div>
			</div>

			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Summary</head><p>The CEDAR Workbench provides a comprehensive solution for authoring, validating, searching, and (re)using metadata. The goal behind CEDAR is to significantly improve the way scientists work with metadata, and the quality and interoperability of the metadata that they create. We meet this goal by equipping the community with a collaborative platform to build standards-based metadata templates that use ontologies as sources for standard terms, and to author and submit high-quality metadata to online repositories. CEDAR's metadata repository gives scientists a means to search for and to use metadata templates developed by the community, and to build new ones from scratch or based on existing templates. CEDAR allows its users to submit their metadata to external repositories, such as NCBI databases. We are working to allow our users to submit metadata to an increasing number of external repositories.</p></div>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">The center for expanded data annotation and retrieval</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Musen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J. Am. Med. Informatics Assoc</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="1148" to="1152" />
			<date type="published" when="2015-06">Jun. 2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata that Describe Scientific Experiments</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">S</forename><surname>Gonçalves</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of International Semantic Web Conference (ISWC)</title>
				<meeting>of International Semantic Web Conference (ISWC)</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The FAIR Guiding Principles for scientific data management and stewardship</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">D</forename><surname>Wilkinson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Sci. data</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page">160018</biblScope>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">An Open Repository Model for Acquiring Knowledge About Scientific Experiments</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>O'Connor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Martínez-Romero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">L</forename><surname>Egyedi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Willrett</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Graybeal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Musen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of International Conference on Knowledge Engineering and Knowledge Management (EKAW)</title>
				<meeting>of International Conference on Knowledge Engineering and Knowledge Management (EKAW)</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">BioPortal: ontologies and integrated data resources at the click of a mouse</title>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">F</forename><surname>Noy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nucleic Acids Res</title>
		<imprint>
			<biblScope unit="volume">37</biblScope>
			<biblScope unit="page" from="W170" to="W173" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Fast and Accurate Metadata Authoring Using Ontology-Based Recommendations</title>
		<author>
			<persName><forename type="first">M</forename><surname>Martínez-Romero</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of AMIA Annual Symposium</title>
				<meeting>of AMIA Annual Symposium</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">BioProject and BioSample databases at NCBI: facilitating capture and organization of metadata</title>
		<author>
			<persName><forename type="first">T</forename><surname>Barrett</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nucleic Acids Res</title>
		<imprint>
			<biblScope unit="volume">40</biblScope>
			<biblScope unit="page" from="D57" to="D63" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">The Sequence Read Archive</title>
		<author>
			<persName><forename type="first">R</forename><surname>Leinonen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Sugawara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Shumway</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nucleic Acids Res</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="page" from="D19" to="D21" />
			<date type="published" when="2011-01">Jan. 2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">ImmPort: disseminating data to the public for the future of immunology</title>
		<author>
			<persName><forename type="first">S</forename><surname>Bhattacharya</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Immunol. Res</title>
		<imprint>
			<biblScope unit="volume">58</biblScope>
			<biblScope unit="issue">2-3</biblScope>
			<biblScope unit="page" from="234" to="239" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
