<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Biomedical Data Categorization and Integration using Human-in-the-loop Approach</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Priya</forename><surname>Deshpande</surname></persName>
							<email>pdeshpa1@depaul.edu</email>
							<affiliation key="aff0">
								<orgName type="institution">DePaul University</orgName>
								<address>
									<settlement>Chicago</settlement>
									<region>IL</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Biomedical Data Categorization and Integration using Human-in-the-loop Approach</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">F2D90E7CBCB75FBEEAD8E738BEFD5D40</idno>
					<idno type="DOI">10.14778/xxxxxxx.xxxxxxx</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T08:34+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>A digitized world demands data integration systems that combine data repositories from multiple data sources. Vast amounts of existing clinical and biomedical research data are considered a primary force enabling data-driven research toward advancing health research and introducing efficiencies in healthcare delivery. Data-driven research may have many goals, including but not limited to improved diagnostic processes, novel biomedical discoveries, epidemiology, and education. However, finding and gaining access to relevant data remains an elusive goal. We identified different data integration challenges and developed an Integrated Radiology Image Search (IRIS) framework that could be a step toward aiding data-driven research. We propose building a biomedical data categorization and integration framework using a human-in-the-loop approach and developing data bridges to support search and retrieval of relevant documents from the integrated repository.</p><p>My research focuses on biomedical data integration, indexing systems, and providing relevance-ranked document retrieval from an integrated repository. Although we currently focus on integrating biomedical data sources (for medical professionals), we believe that our proposed framework and methodologies can be used in other domains as well.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">INTRODUCTION</head><p>The growing amount of available biomedical data poses new challenges in data management. Data re-usability is a highly desirable goal, both for advancing science and for replicating or validating the results of previous studies. Recognizing this need, publishers and funding bodies may require researchers to submit data generated in their work and make it available to the research community. For example, the National Institutes of Health (NIH) is encouraging funded investigators to use cloud computing to conduct research and make their work accessible to larger audiences 1. However, in the healthcare domain, datasets are often not shared because of security concerns, lack of integration, or limitations of retrieval engines. A data integration framework should make data available and accessible, and should support fine-grained access control for different users <ref type="bibr" target="#b5">[6]</ref>. It would also greatly reduce the need for manual curation of data sources and data repositories. Data integration alone is insufficient without associated information retrieval mechanisms that rank retrieved results by relevance. According to our discussions with University of Chicago (UofC) radiologists, even the internal UofC commercial system lacks some Natural Language Processing (NLP) features (e.g., detecting synonyms and negation) and multimodal (text and image) search capabilities. We studied the publicly available radiology data sources MyPacs.net<ref type="foot" target="#foot_0">2</ref>, EURORAD<ref type="foot" target="#foot_1">3</ref>, and RSNA Medical Imaging Resource Community (MIRC)<ref type="foot" target="#foot_2">4</ref>, which provide collections of clinical reports and associated images known as teaching files. Teaching files contain information such as patient history, findings, diagnosis, differential diagnosis, or discussion notes. 
While all of these public data sources are available, most of them provide only basic search capabilities, without NLP support or ranked retrieval mechanisms. Several studies highlighted the need to integrate clinical reports and images into databases with advanced search capabilities. Gutmark et al. <ref type="bibr" target="#b4">[5]</ref> argued for building a system that reduces errors in radiological image interpretation using teaching file databases. Talanow et al. <ref type="bibr" target="#b11">[12]</ref> described the use of reference radiological images for diagnosis, teaching, and research, and the resulting need for an advanced reference search engine.</p><p>An integrated repository of teaching files can return thousands of results for a text search. A search thus becomes effectively useless unless it shows the most relevant results first. Publicly available radiology teaching file search engines do not provide text relevance ranking or combined text-and-image search. The lack of such systems motivated us to build Integrated Radiology Image Search (IRIS) and develop the ranking algorithm presented here. We presented IRIS at the annual Society for Imaging Informatics in Medicine (SIIM 2018) meeting (two posters: one focusing on search and another on data integration) and received feedback from doctors indicating that this work would be useful for practitioners in the medical domain.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">BACKGROUND AND RELATED WORK</head><p>In this section we discuss papers that addressed the need for data integration and retrieval systems, along with an overview of existing medical data retrieval systems. Several studies have highlighted the need for integration of healthcare data <ref type="bibr" target="#b9">[10]</ref>. Holzinger et al. <ref type="bibr" target="#b6">[7]</ref> discussed knowledge discovery and interactive data mining techniques in bioinformatics, the challenges of integrating biomedical data, and open research directions. Li et al. <ref type="bibr" target="#b7">[8]</ref> proposed a hybrid human-machine data integration approach that integrates records from databases with similar data types (e.g., iPhone user data). However, healthcare domain data integration needs to combine heterogeneous data sources with different categories of data types. Simpson et al. <ref type="bibr" target="#b10">[11]</ref> proposed a multimodal image retrieval system that retrieves biomedical articles and is used in Open-i 5. Ling et al. <ref type="bibr" target="#b8">[9]</ref> designed GEMINI, an integrative healthcare analytics system, and studied problems related to healthcare data heterogeneity and data integration in that context. From this literature survey, we concluded that healthcare needs are not met by current search engines. The limitations of existing systems motivated us to design and develop a radiology multimodal search engine. IRIS integrates two well-known public data sources, MIRC and MyPacs, and two medical ontologies, RadLex and The Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) 7. RSNA MIRC: A publicly available large repository with more than 2,500 teaching files and more than 12,000 images. MyPacs.net: A publicly available teaching file resource with more than 35,000 cases and 200,000 images. 
RadLex: An ontological system that provides a comprehensive lexicon for radiologists. SNOMED CT: An ontology that provides a standardized, multilingual vocabulary of clinical terminology used by physicians and other healthcare providers for the electronic exchange of clinical health information.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">METHODOLOGY AND RESEARCH STEPS</head><p>In this section, we discuss major biomedical data sources and significant goals that we identified as a part of my PhD proposal.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Datasets</head><p>We currently focus on three types of data: (a) electronic health records; (b) radiology teaching files used by doctors and radiologists; and (c) research datasets. Electronic Health Records (EHRs): An electronic health record is a digital version of a patient's record. EHRs are maintained at hospitals and provide patient information such as patient history, medical test results, allergies, immunization details, radiology images, and clinical reports. Medical Teaching Files: A radiology teaching file system is a collection of important cases for teaching and clinical follow-up. Teaching files share a similar overall structure, but significant variations exist even within the same data source. They can include information such as patient history, findings, diagnosis, discussion, comments, references, and images related to clinical reports. Research datasets: From our survey of datasets from different research institutes, we observed that most of the data in the healthcare domain are images (e.g., CT, X-ray, MRI). Those images are typically stored in formats such as JPEG, DICOM, or PNG and include associated text data describing patient and case information.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Data integration and rank retrieval</head><p>We have organized this project into three phases (I have finished the first two phases and am working on the last phase of my PhD work).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>For each phase we have identified a research question. Publications related to this work are briefly summarized in Table <ref type="table" target="#tab_0">1</ref>.</p></div>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table"><head>Table 1 (continued):</head><figDesc>Research work summary</figDesc><table><row><cell>ID</cell><cell>Summary</cell></row><row><cell>3</cell><cell>Data integration as an iterative process, showing how each integration step improved IRIS results <ref type="bibr" target="#b1">[2]</ref>.</cell></row><row><cell>4</cell><cell>Cluster analysis and coverage analysis for both ontologies and radiology data sources. Unsupervised machine learning to identify data source properties, in order to identify the best data sources and ontologies for integration (journal paper, under review).</cell></row><row><cell>5</cell><cell>IRIS 1.2: Multimodal ranked retrieval for integrated radiology data sources using the context of the search term by considering weighted ontology and category terms (conference paper, under review).</cell></row><row><cell>6</cell><cell>Toward using FAIR Principles for Fine-Grained Access to aid Biomedical Data Driven Research <ref type="bibr" target="#b3">[4]</ref>.</cell></row></table></figure>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1">Design an integrated smart database with heterogeneous data sources</head><p>Research question #1: How to determine which data sources and ontologies need to be integrated?</p><p>Most hospitals maintain a collection of teaching files, but many public teaching file collections are also available through curated online sources (e.g., RSNA MIRC, MyPacs, and EURORAD). We developed the IRIS engine as a pilot for a data integration system for the healthcare domain <ref type="bibr" target="#b0">[1]</ref>. In IRIS, we captured heterogeneous data from the MIRC and MyPacs data sources and loaded it into an integrated data repository. Using medical ontologies, we built our own dictionary, which maps terms to their synonyms from the datasets and medical ontologies <ref type="bibr" target="#b2">[3]</ref>. We designed an unsupervised machine learning technique that performs coverage analysis of data sources and medical ontologies to learn properties of the data (e.g., topic coverage). By learning the contents of data repositories, one can decide which data sources need to be integrated or what repository content is lacking. Thus, this coverage analysis algorithm benefits the data integration process by extracting knowledge about the repositories (addressing research question #1). Our analysis also confirmed that data integration is a continuous, iterative process <ref type="bibr" target="#b1">[2]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2">Ranked retrieval search engine with multimodal text and image-based search capabilities</head><p>Research question #2: How to find relevant documents given a keyword query or hybrid (text+image) query? Figure <ref type="figure" target="#fig_1">1</ref> shows the architecture of the IRIS engine. When a user enters a text query, IRIS performs query expansion using relevant ontologies and retrieves results relevant to the query term. Our database also stores accuracy feedback from users, which is then used to evaluate and iteratively improve IRIS results.</p><p>An integrated search may result in thousands of matches; thus, we are designing a search algorithm that ranks results by incorpo- </p></div>
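<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of the query expansion step described in Section 3.2.2, the following minimal Python sketch expands a query term with synonyms taken from a dictionary. The names SYNONYMS and expand_query and the two dictionary entries are hypothetical and are not taken from the IRIS implementation; in IRIS the synonym dictionary is built from the datasets and medical ontologies.</p>

```python
from typing import Dict, List, Set

# Hypothetical synonym dictionary for illustration only; real entries
# would be derived from the datasets and medical ontologies.
SYNONYMS: Dict[str, Set[str]] = {
    "cardiomegaly": {"enlarged heart"},
    "acl tear": {"anterior cruciate ligament tear"},
}


def expand_query(query: str, synonyms: Dict[str, Set[str]] = SYNONYMS) -> List[str]:
    """Return the normalized query term followed by its known synonyms."""
    # Normalize case and collapse repeated whitespace before the lookup.
    term = " ".join(query.lower().split())
    # Sort the synonyms so that the expansion order is deterministic.
    return [term] + sorted(synonyms.get(term, set()))
```

<p>A term without dictionary entries is returned unchanged, so the search can still proceed with the original query.</p></div>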
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3">Data bridges and indexing mechanism to integrate biomedical data sources</head><p>Research question #3: How can data integration performance (time) and scalability (adding a variety of data sources) be improved using data bridges? In order to make our integration solution applicable to other biomedical data sources (e.g., EHRs, clinical reports), we plan to create data adapters that will serve as a bridge between data providers and data integration systems (this work was a part of my internship at NIH). Data providers can share their data in any file format and bridges will interpret that data in a uniform manner. As shown in Figure <ref type="figure">2</ref>, our data clustering indexing approach starts with collecting different biomedical data sources. From our literature survey we observed that data preparation accounts for 80% of a data scientist's work. Data preparation includes finding relevant data sources, extracting data from those data sources, data cleaning, and data integration. Our proposed data integration system would help data scientists and researchers optimize and streamline data preparation. We have collected different biomedical data sources and are working on defining a standard data cleaning technique that would be applicable to most of the similar data sources proposed in this work. Our data categorization module categorizes data items into different sets based on the usage of those data elements in the search operation. We need support from a human to check the accuracy of data categorization, to set similarity thresholds between different data items, and to apply additional domain knowledge to categorize these data items based on relevance between data objects. Our data categorization algorithm will differentiate data items based on diagnostic relevance. 
For example, teaching cases with title, findings, and diagnosis would be treated as one sub-category in teaching cases (that would also integrate clinical reports), while another sub-category could integrate fields that are medically less relevant, e.g., discussion, history, or comments. Based on data categorization, we will design a database schema and evaluate it using standard database schema benchmark techniques. Data write bridges would be responsible for extracting data from different data categories and loading data into the respective database schema. This data categorization work is ongoing and we do not have any experimental results yet. We will address research question #3 by implementing this module.</p></div>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Data Categorization with human-in-the-loop</figDesc></figure>
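<div xmlns="http://www.tei-c.org/ns/1.0"><p>The sub-category example from Section 3.2.3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the categorization algorithm itself, which is still ongoing work: the names PRIMARY_FIELDS, SECONDARY_FIELDS, and categorize are hypothetical, and routing unrecognized fields to a human reviewer is our assumption about how the human-in-the-loop step could be attached.</p>

```python
from typing import Dict, Tuple

# Field groups taken from the example in the text; the constant names
# themselves are hypothetical.
PRIMARY_FIELDS = {"title", "findings", "diagnosis"}
SECONDARY_FIELDS = {"discussion", "history", "comments"}

FieldGroup = Dict[str, str]


def categorize(record: Dict[str, str]) -> Tuple[FieldGroup, FieldGroup, FieldGroup]:
    """Split a record into primary, secondary, and needs-review field groups."""
    primary: FieldGroup = {}
    secondary: FieldGroup = {}
    review: FieldGroup = {}
    for name, value in record.items():
        key = name.strip().lower()
        if key in PRIMARY_FIELDS:
            primary[key] = value
        elif key in SECONDARY_FIELDS:
            secondary[key] = value
        else:
            # Assumption: unknown fields are left for a human to assign.
            review[key] = value
    return primary, secondary, review
```

<p>In such a design, a data write bridge could load each returned group into its own part of the database schema, while the review group is queued for a domain expert.</p></div>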
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">EXPERIMENTAL RESULTS</head><p>In this section we briefly discuss the current results from the proposed system.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.0.1">Text-based results</head><p>We evaluated IRIS search ranking using a combination of queries received from radiologists at a well-known hospital and other queries chosen from an extensive literature survey. We initially tested a total of 28 text queries, out of which we picked a subset of 10 queries (Q1: Cardiomegaly, Q2: ACL Tear, Q3: Annular Pancreas, Q4: Pseudocoxalgia, Q5: Varicocele, Q6: Angiosarcoma, Q7: Tracheal dilation, Q8: Appendicitis, Q9: Bronchus intermedius, Q10: Cystitis glandularis) to perform an in-depth evaluation. Due to space constraints we only briefly discuss text-based results. We evaluated text-based results on a scale from 0 ("not relevant") to 2 ("very relevant"). We defined five categories to score text search results: "not relevant" = 0 (when the term and its synonyms do not appear anywhere in the results), "relevant" = 0.5 (if the term or its synonyms appear in any category of the teaching file), "more relevant" = 1 (if the term or its synonyms appear in the discussion category), "most relevant" = 1.5 (if the term or its synonyms appear in the history or ddx category), and "very relevant" = 2 (if the term or its synonyms appear in the title, findings, or diagnosis categories).</p><p>Comparison of IRIS and MIRC relevance rank algorithm using same datasets: We compared the IRIS relevance rank algorithm with MIRC using the same dataset. We considered the top four teaching file results from IRIS, MIRC, and Google site search. We calculated the relevance score by scoring the top four teaching files from each engine, using the weighted ontology ranking algorithm. Figure <ref type="figure">3</ref> shows an overall analysis of results from these three search engines. The score for each search engine shows that the IRIS relevance rank algorithm performs better than the other two engines.</p></div>
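<div xmlns="http://www.tei-c.org/ns/1.0"><p>The five-level scoring rubric from Section 4.0.1 can be written down as a short Python sketch. The category weights come from the rubric; everything else is an assumption made for illustration: the function names relevance_score and mean_top_k are hypothetical, matching is done by case-insensitive substring search, a file matching in several categories receives its highest category score, and the per-engine score is taken as the mean over the top four results.</p>

```python
from typing import Dict, Iterable, List

# Category weights from the rubric in Section 4.0.1.
CATEGORY_SCORES = {
    "title": 2.0, "findings": 2.0, "diagnosis": 2.0,  # "very relevant"
    "history": 1.5, "ddx": 1.5,                       # "most relevant"
    "discussion": 1.0,                                # "more relevant"
}
OTHER_CATEGORY_SCORE = 0.5                            # "relevant"


def relevance_score(teaching_file: Dict[str, str], terms: Iterable[str]) -> float:
    """Score one teaching file given a query term and its synonyms."""
    needles = [t.lower() for t in terms]
    best = 0.0  # "not relevant" when no term appears anywhere
    for category, text in teaching_file.items():
        haystack = text.lower()
        if any(n in haystack for n in needles):
            # Assumption: keep the highest-scoring matching category.
            score = CATEGORY_SCORES.get(category.lower(), OTHER_CATEGORY_SCORE)
            best = max(best, score)
    return best


def mean_top_k(files: List[Dict[str, str]], terms: Iterable[str], k: int = 4) -> float:
    """Average rubric score over the first k results (assumed aggregation)."""
    term_list = list(terms)
    top = files[:k]
    if not top:
        return 0.0
    return sum(relevance_score(f, term_list) for f in top) / len(top)
```

<p>For example, a result whose title contains the query term scores 2, while a result that mentions the term only in its discussion scores 1.</p></div>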
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Ranking evaluation of other medical search engines:</head><p>We also considered how other public medical radiology teaching file search engines rank their search results. We used the same query set and performed a search using the MIRC, MyPacs, EURORAD, and Open-i search engines. We discuss only two queries (Q1: "cardiomegaly" and Q8: "appendicitis") in detail and report scores for the top 10 search results. Figure <ref type="figure">4</ref> shows a comparative analysis of ranked results from these four engines using the relevance scores based on our metric described above. Open-i can rank search results based on different categories (e.g., based on diagnosis or on teaching file date); we used a diagnosis-based search in Open-i. MIRC ranks results based on the date of modification, with no other option available. Our analysis shows that none of the search engines returns the most relevant results first. Interestingly, top results are often less relevant than the subsequent search results. For example, for "cardiomegaly" the fourth MyPacs result is more relevant than the top three results. EURORAD does not retrieve any results for "cardiomegaly", but we checked its "appendicitis" results, and those were also not ranked based on relevance to the search term.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.0.2">Hybrid Text and Image based results</head><p>The IRIS hybrid algorithm augments the text search with image search and re-ranks results based on their relevance to the query. Due to space constraints we only briefly discuss hybrid search results. IRIS text-based and hybrid search results scored 0.83 out of 1. Image search scored only about 0.53 out of 1, validating our use of image search as an enhancement to text search (rather than as a standalone search). Hybrid search scored 0.84 out of 1 because text results were augmented with image-based results. For hybrid search, some of the results were noticeably better than those of text-based search. By combining text search with image results, we are striving to get a text-based match that also includes a similar image.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">CONCLUSIONS</head><p>The ranking approach presented in this paper is significant because it enables IRIS to present the user with the most relevant reference cases first. By integrating term frequency and adding more weight to ontology terms, we show that teaching files can be better ranked in order of their relevance to a search query. Currently, I am working on data write bridges and a categorization algorithm to improve the biomedical data integration process.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>1</head><label></label><figDesc>https://commonfund.nih.gov/strides/ This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/. For any use beyond those covered by this license, obtain permission by emailing info@vldb.org. Publication rights licensed to the VLDB Endowment. Proceedings of the VLDB 2019 PhD Workshop, August 26th, 2019. Los Angeles, California. Copyright (C) 2019 for this paper by its authors. Copying permitted for private and academic purposes Proceedings of the VLDB Endowment, Vol. 12, No. xxx ISSN 2150-8097. DOI: https://doi.org/10.14778/xxxxxxx.xxxxxxx</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: IRIS Architecture</figDesc><graphic coords="3,65.18,67.22,226.73,123.98" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :Figure 4 :</head><label>34</label><figDesc>Figure 3: IRIS relevance rank results comparison with MIRC</figDesc><graphic coords="4,70.01,67.22,217.09,133.61" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Research work summary</figDesc><table><row><cell>ID</cell><cell>Summary</cell></row><row><cell>1</cell><cell>IRIS 1.0: Teaching file text pre-processing and indexing. Smart search through substitution of synonyms and interpreting negation. Query expansion using RadLex through an exact term match. [1]</cell></row><row><cell>2</cell><cell>IRIS 1.1: Query synonym expansion. SNOMED CT ontology integration, showing improved results compared with other search engines [3].</cell></row></table><note>5 https://openi.nlm.nih.gov/ 6 http://www.radlex.org/ 7 https://www.nlm.nih.gov/healthit/snomedct/</note></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0">https://www.mypacs.net/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_1">https://www.myesr.org/eurorad</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_2">http://mirc.rsna.org/query</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_3">https://en.wikipedia.org/wiki/Discounted_cumulative_gain</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">ACKNOWLEDGMENTS</head><p>This research was supported in part by the Intramural Research Program of the National Institutes of Health (NIH), National Library of Medicine (NLM), and Lister Hill National Center for Biomedical Communications (LHNCBC).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">An integrated database and smart search tool for medical knowledge extraction from radiology teaching files</title>
		<author>
			<persName><forename type="first">P</forename><surname>Deshpande</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rasin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Furst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Raicu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Montner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Armato</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Medical Informatics and Healthcare</title>
		<imprint>
			<biblScope unit="page" from="10" to="18" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Big data integration case study for radiology data sources</title>
		<author>
			<persName><forename type="first">P</forename><surname>Deshpande</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rasin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Furst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">S</forename><surname>Raicu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Montner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">G</forename><surname>Armato</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Life Sciences Conference (LSC)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="195" to="198" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Augmenting medical decision making with text-based search of teaching file repositories and medical ontologies: Text-based search of radiology teaching files</title>
		<author>
			<persName><forename type="first">P</forename><surname>Deshpande</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rasin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">T</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Furst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Montner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">G</forename><surname>Armato</surname><genName>III</genName></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">S</forename><surname>Raicu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Knowledge Discovery in Bioinformatics (IJKDB)</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="18" to="43" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">DIIS: A biomedical data access framework for aiding data driven research supporting FAIR principles</title>
		<author>
			<persName><forename type="first">P</forename><surname>Deshpande</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rasin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Furst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Raicu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Antani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Data</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page">54</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Use of computer databases to reduce radiograph reading errors</title>
		<author>
			<persName><forename type="first">R</forename><surname>Gutmark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Halsted</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Perry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Gold</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American College of Radiology</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="65" to="68" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Practice facilitator strategies for addressing electronic health record data challenges for quality improvement: EvidenceNOW</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Hemler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Hall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">A</forename><surname>Cholan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">F</forename><surname>Crabtree</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">J</forename><surname>Damschroder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">I</forename><surname>Solberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">S</forename><surname>Ono</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">J</forename><surname>Cohen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">The Journal of the American Board of Family Medicine</title>
		<imprint>
			<biblScope unit="volume">31</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="398" to="409" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Knowledge discovery and interactive data mining in bioinformatics - state-of-the-art, future challenges and research directions</title>
		<author>
			<persName><forename type="first">A</forename><surname>Holzinger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dehmer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Jurisica</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">BMC Bioinformatics</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page">I1</biblScope>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Human-in-the-loop data integration</title>
		<author>
			<persName><forename type="first">G</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Proceedings of the VLDB Endowment</title>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="2006" to="2017" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">GEMINI: an integrative healthcare analytics system</title>
		<author>
			<persName><forename type="first">Z</forename><forename type="middle">J</forename><surname>Ling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">T</forename><surname>Tran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Fan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">C</forename><surname>Koh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Nguyen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">S</forename><surname>Tan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">W</forename><surname>Yip</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zhang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Proceedings of the VLDB Endowment</title>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="1766" to="1771" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Managing, analysing, and integrating big data in medical bioinformatics: open problems and future perspectives</title>
		<author>
			<persName><forename type="first">I</forename><surname>Merelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Pérez-Sánchez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gesing</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>D'Agostino</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">BioMed Research International</title>
		<imprint>
			<biblScope unit="volume">2014</biblScope>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Multimodal biomedical image indexing and retrieval using descriptive text and global feature mapping</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Simpson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Demner-Fushman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Antani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">R</forename><surname>Thoma</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Retrieval</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="229" to="264" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Radiology Teacher: a free, Internet-based radiology teaching file server</title>
		<author>
			<persName><forename type="first">R</forename><surname>Talanow</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American College of Radiology</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">12</biblScope>
			<biblScope unit="page" from="871" to="875" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
