<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">SKET: an Unsupervised Knowledge Extraction Tool to Empower Digital Pathology Applications</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Giorgio</forename><forename type="middle">Maria</forename><surname>Di Nunzio</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Information Engineering</orgName>
								<orgName type="institution">University of Padua</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Nicola</forename><surname>Ferro</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Information Engineering</orgName>
								<orgName type="institution">University of Padua</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Fabio</forename><surname>Giachelle</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Information Engineering</orgName>
								<orgName type="institution">University of Padua</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Ornella</forename><surname>Irrera</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Information Engineering</orgName>
								<orgName type="institution">University of Padua</orgName>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Stefano</forename><surname>Marchesin</surname></persName>
							<email>stefano.marchesin@unipd.it</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Information Engineering</orgName>
								<orgName type="institution">University of Padua</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Gianmaria</forename><surname>Silvello</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Information Engineering</orgName>
								<orgName type="institution">University of Padua</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<orgName type="department">IRCDL (The Conference on Information and Research science Connecting to Digital and Library science)</orgName>
								<orgName type="laboratory">19th</orgName>
								<address>
									<addrLine>February 23-24</addrLine>
									<postCode>2023</postCode>
									<settlement>Bari</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">SKET: an Unsupervised Knowledge Extraction Tool to Empower Digital Pathology Applications</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">00AA7BA99557F0962B89D805B2139AB8</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-04-29T06:31+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Knowledge Extraction</term>
					<term>Machine Learning</term>
					<term>Expert Systems</term>
					<term>Digital Pathology</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Large volumes of medical data have been produced for decades. These data include diagnoses, which are often reported as free text, thus encoding medical knowledge that is still largely unexploited. To decode the medical knowledge present within reports, we propose the Semantic Knowledge Extractor Tool (SKET), an unsupervised knowledge extraction system combining a rule-based expert system with pretrained Machine Learning (ML) models. This work demonstrates the viability of unsupervised Natural Language Processing (NLP) techniques to extract critical information from cancer reports, opening opportunities such as data mining for knowledge extraction purposes, precision medicine applications, structured report creation, and multimodal learning.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Hundreds of thousands of medical reports have been used to communicate diagnoses, encoding a vast amount of medical knowledge. In this context, free-text reporting is the de facto standard for communicating diagnoses, guiding patients' treatment, and conducting therapies. The crucial knowledge contained in high volumes of free-text reports is usually extracted manually. However, since reports vary widely between institutions, contain noise, and lack a standard structure, this is an extremely time-consuming process. To overcome this limitation, Natural Language Processing (NLP) methods become essential <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b4">5,</ref><ref type="bibr" target="#b5">6,</ref><ref type="bibr" target="#b6">7,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b8">9]</ref> as they empower the efficient automatic processing of thousands of reports and the extraction of relevant information for several (downstream) tasks, such as clinical note mining <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b10">11]</ref> and structuring <ref type="bibr" target="#b11">[12]</ref>, risk prediction <ref type="bibr" target="#b12">[13]</ref>, clinical decision support <ref type="bibr" target="#b13">[14]</ref>, and precision medicine retrieval <ref type="bibr" target="#b14">[15]</ref>.</p><p>In the context of digital pathology, a field that involves the analysis of histopathology images known as Whole Slide Images (WSIs), this work aims to demonstrate the viability of unsupervised NLP techniques to automatically extract critical information from pathology reports and use it for different applications, such as automatic report annotation and visualization <ref type="bibr" target="#b15">[16]</ref>, as well as WSI classification <ref type="bibr" target="#b16">[17]</ref>. To this end, we present the Semantic Knowledge Extractor Tool (SKET), an unsupervised hybrid knowledge extraction system that combines rule-based techniques with pre-trained Machine Learning (ML) models to extract knowledge from pathology reports. In recent years, NLP has shifted from using rules to ML approaches <ref type="bibr" target="#b17">[18,</ref><ref type="bibr" target="#b8">9]</ref>, which have the advantage of learning regularities from data and of generalizing to previously unseen patterns. Moreover, the advent of efficient Neural Language Models (NLMs) <ref type="bibr" target="#b18">[19,</ref><ref type="bibr" target="#b19">20,</ref><ref type="bibr" target="#b20">21,</ref><ref type="bibr" target="#b21">22]</ref> paved the way for the pre-training era, where large NLMs trained in a self-supervised fashion on huge datasets are used to develop NLP models for a number of downstream tasks. Nevertheless, similarly to <ref type="bibr" target="#b9">[10]</ref>, we argue that rule-based techniques capture critical information that should be used together with ML, rather than replaced by it, to improve performance.</p><p>We evaluate the effectiveness of SKET on entity linking and text classification, considering three use-cases: colon, cervix, and lung cancer. We rely on diagnostic reports coming from two medical centers based in Italy and the Netherlands. Then, we compare SKET with unsupervised ML approaches to understand the impact that combining rule-based techniques and pre-trained ML models has on the extraction of knowledge from diagnostic reports. The results highlight the effectiveness of ML methods for information extraction in the pathology domain but, at the same time, they also stress the role of expert knowledge in reaching the high levels of accuracy required to semi-automate clinical practice. As further proof, SKET has already been used as the core system in automatic report annotation and visualization <ref type="bibr" target="#b15">[16]</ref>, as well as for weak supervision in WSI classification <ref type="bibr" target="#b16">[17]</ref>. The SKET source code is publicly available at https://github.com/ExaNLP/sket.</p><p>The rest of this paper is organized as follows: Section 2 presents SKET. Section 3 describes the experimental evaluation. Finally, Section 4 concludes the paper.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">The Semantic Knowledge Extractor Tool</head><p>SKET combines pre-trained Named Entity Recognition (NER) models with unsupervised Entity Linking (EL) methods to extract relevant entities from diagnostic reports and link them to concepts stored in a reference ontology<ref type="foot" target="#foot_0">1</ref>. By relying on pre-trained NER models and unsupervised EL methods, SKET can serve as an automated annotator in weak supervision tasks. For instance, the concepts extracted by SKET can be used as weak labels when training ML models for image classification <ref type="bibr" target="#b22">[23,</ref><ref type="bibr" target="#b23">24]</ref> and relation extraction <ref type="bibr" target="#b24">[25]</ref>, or as nodes to build knowledge graphs that can be used for retrieval tasks <ref type="bibr" target="#b25">[26]</ref>.</p><p>SKET consists of four main components: (1) Named Entity Recognition, (2) Entity Linking, (3) Data Labeling, and (4) Graph Creation. Components (1) and (2) are sequential, whereas (3) and (4) can be applied in parallel. We briefly describe each component below.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Named Entity Recognition</head><p>NER can be defined as the task of identifying and categorizing relevant information within text. A named entity can be any word or phrase (i.e., a mention) that consistently refers to a concept or object of the world. Once identified, mentions are classified into predefined categories, such as disease, gene/protein, symptom, etc.</p><p>To perform NER, SKET combines pre-trained neural models with rule-based techniques. As the neural component, SKET exploits ScispaCy models <ref type="bibr" target="#b26">[27]</ref>, which provide full NER pipelines for biomedical data, consisting of large medical vocabularies, as well as Word2Vec <ref type="bibr" target="#b18">[19]</ref> word vectors trained on the PubMed Central Open Access Subset <ref type="bibr" target="#b27">[28]</ref>. To integrate expert rules, SKET extends the ScispaCy pipeline with two more components: Entity Fusion and Negation Detection. For Entity Fusion, SKET exploits expert rules to identify and merge specific mentions that ScispaCy would otherwise regard as separate. For example, "high-grade" and "dysplasia" are considered separate mentions, whereas we are interested in "high-grade dysplasia" as a single mention. Hence, we developed regular expressions capable of identifying trigger terms that indicate a set of mentions that should potentially be combined into one. These expert rules have been developed on a holdout dataset, which is available in the SKET GitHub repository<ref type="foot" target="#foot_1">2</ref>. The dataset consists of 50 diagnostic reports for each use-case and medical center, for a total of 250 diagnostic reports. For Negation Detection, SKET relies on NegEx <ref type="bibr" target="#b28">[29]</ref>, a negation detection algorithm that evaluates whether extracted entities are negated within text. NegEx uses regular expressions to identify the scope of trigger terms that are indicative of negation. Then, the entities extracted within the scope of a trigger term are marked as negated and removed.</p></div>
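<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two rule-based extensions described above can be illustrated with a minimal Python sketch. It is not the SKET implementation: the two regular expressions, the helper names (locate, fuse_mentions, drop_negated), and the simplified negation scope (a trigger that precedes the mention within the same sentence) are assumptions made only for illustration, and the list of surface forms stands in for the output of the ScispaCy NER model.</p><p>
```python
import re

# Hypothetical fusion rule: a grade modifier directly followed by "dysplasia".
FUSION_RULE = re.compile(r"\b(high|low)-grade\s+dysplasia\b", re.IGNORECASE)
# Hypothetical negation triggers, a tiny subset in the spirit of NegEx.
NEGATION_RULE = re.compile(r"\b(no|without|negative for)\b", re.IGNORECASE)


def locate(text, surfaces):
    """Build (start, end, surface) mentions; stands in for the NER output."""
    mentions = []
    for surface in surfaces:
        start = text.find(surface)
        mentions.append((start, start + len(surface), surface))
    return mentions


def fuse_mentions(text, mentions):
    """Replace mentions covered by a fusion rule match with one fused mention."""
    spans = [match.span() for match in FUSION_RULE.finditer(text)]
    fused = [(s, e, text[s:e]) for s, e in spans]
    for start, end, surface in mentions:
        covered = any(start >= s and e >= end for s, e in spans)
        if not covered:
            fused.append((start, end, surface))
    return sorted(fused)


def drop_negated(text, mentions):
    """Remove mentions preceded by a negation trigger in the same sentence."""
    kept = []
    for start, end, surface in mentions:
        # Simplified scope: from the previous full stop up to the mention.
        sentence_start = text.rfind(".", 0, start) + 1
        if NEGATION_RULE.search(text[sentence_start:start]) is None:
            kept.append((start, end, surface))
    return kept


report = "Tubular adenoma with high-grade dysplasia. No evidence of carcinoma."
raw = locate(report, ["adenoma", "high-grade", "dysplasia", "carcinoma"])
final = drop_negated(report, fuse_mentions(report, raw))
print([surface for _, _, surface in final])
```
</p><p>On this toy report, "high-grade" and "dysplasia" are merged into a single mention, while "carcinoma" is discarded because it follows the trigger "No" in the same sentence. The actual NegEx algorithm uses a richer trigger lexicon and scope termination rules than this sketch.</p></div>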
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Entity Linking</head><p>EL can be defined as the task of assigning unique meanings to entities mentioned within text. In a nutshell, EL aims to determine whether a target named entity refers to a specific concept or object stored within a reference ontology.</p><p>To perform EL, SKET adopts ad-hoc and similarity-based matching. Given an extracted entity, SKET performs a two-stage matching approach. First, the system tries to link the entity using ad-hoc matching. Then, if ad-hoc matching fails, it falls back to similarity-based matching. For Ad-Hoc Matching, SKET employs regular expressions to find trigger terms indicative of a specific concept in the ontology. Once a trigger is found, the system matches the entity containing the trigger term with the closest ontology concept. For example, if an extracted entity contains the (trigger) term "carcinoma", then SKET links the entity to the "colon adenocarcinoma" concept. Ad-hoc matching rules have also been developed on the holdout dataset and are available on GitHub. Regarding Similarity Matching, SKET combines string and semantic matching techniques. For string matching, SKET adopts the Gestalt Pattern Matching (GPM) algorithm <ref type="bibr" target="#b29">[30]</ref>. For semantic matching, SKET exploits the word vectors provided by ScispaCy models <ref type="bibr" target="#b26">[27]</ref>. Specifically, it computes the cosine distance between the vector representations of extracted entities and ontology concepts.</p></div>
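<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two-stage matching described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the SKET code: the single ad-hoc rule, the way string and semantic scores are combined (string matching first, then semantic matching), the 0.8 threshold, and the two-dimensional toy vectors are all hypothetical. String matching uses difflib.SequenceMatcher from the Python standard library, whose ratio is based on a gestalt-style matching procedure, and semantic matching is expressed as cosine similarity (one minus the cosine distance).</p><p>
```python
import math
import re
from difflib import SequenceMatcher

# Hypothetical ad-hoc rule: trigger pattern mapped to an ontology concept label.
AD_HOC_RULES = [
    (re.compile(r"\bcarcinoma\b", re.IGNORECASE), "colon adenocarcinoma"),
]


def cosine_similarity(u, v):
    """Cosine similarity between two equally sized vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def string_similarity(a, b):
    """Gestalt-style similarity in [0, 1] computed by difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_entity(entity, concepts, vectors, threshold=0.8):
    """Return (concept, stage) for an entity, or (None, 'unlinked')."""
    # Stage 1: ad-hoc matching through trigger terms.
    for pattern, concept in AD_HOC_RULES:
        if pattern.search(entity):
            return concept, "ad-hoc"
    # Stage 2a: string matching against every concept label.
    best = max(concepts, key=lambda c: string_similarity(entity, c))
    if string_similarity(entity, best) >= threshold:
        return best, "string"
    # Stage 2b: semantic matching on (toy) vector representations.
    if entity in vectors:
        scored = [
            (cosine_similarity(vectors[entity], vectors[c]), c)
            for c in concepts
            if c in vectors
        ]
        if scored:
            score, concept = max(scored)
            if score >= threshold:
                return concept, "semantic"
    return None, "unlinked"


concepts = ["colon adenocarcinoma", "hyperplastic polyp", "tubular adenoma"]
# Toy vectors standing in for ScispaCy embeddings (values are made up).
vectors = {
    "colon adenocarcinoma": [0.9, 0.1],
    "hyperplastic polyp": [0.2, 0.8],
    "tubular adenoma": [0.7, 0.3],
    "benign polyp": [0.1, 0.9],
}
for mention in ["invasive carcinoma", "tubular adenomas", "benign polyp"]:
    print(mention, link_entity(mention, concepts, vectors))
```
</p><p>In this toy setting, "invasive carcinoma" is linked by the ad-hoc rule, "tubular adenomas" by string matching, and "benign polyp" by semantic matching, while an entity that passes none of the stages is left unlinked.</p></div>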
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Data Labeling</head><p>Given the set of concepts extracted from each diagnostic report, SKET maps a clinically relevant subset of such concepts to a set of annotation classes defined by pathologists.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.">Graph Creation</head><p>SKET builds report-level knowledge graphs using the extracted concepts as nodes and the semantic relations of the reference ontology as edges. The use of ontology concepts and relations to describe diagnostic reports increases the semantic understanding of the underlying data <ref type="bibr" target="#b30">[31]</ref>. Once created, report-level knowledge graphs are encoded in a machine-readable format through RDF.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Experimental Evaluation</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Setup</head><p>Tasks: We evaluate SKET on Entity Linking (Task 1) and Text Classification (Task 2). Both tasks are addressed as multi-label classification problems. Note that the number of possible labels for entity linking is much higher than for text classification, making the former an extreme multi-label classification problem <ref type="bibr" target="#b31">[32,</ref><ref type="bibr" target="#b32">33]</ref>.</p><p>Datasets: For Task 1, we use 1,250 annotated reports coming from both medical centers and covering all three use-cases. For Task 2, we rely on 9,798 annotated reports, divided among medical centers and use-cases. We refer the reader to the original publication <ref type="bibr" target="#b0">[1]</ref> for a comprehensive description of the available data.</p><p>Baselines: For both tasks, we compare SKET with two unsupervised approaches based on Bio FastText <ref type="bibr" target="#b19">[20,</ref><ref type="bibr" target="#b33">34]</ref> and BioClinical BERT <ref type="bibr" target="#b21">[22,</ref><ref type="bibr" target="#b34">35]</ref>. For a fair comparison, both approaches adopt the same ScispaCy NER pipeline used by SKET, but without the extensions introduced with it. Then, they perform EL by computing the cosine distance between the vector representations of the extracted entities and the ontology concepts. Both baselines are straightforward approaches to perform entity linking and text classification without annotated data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Results</head><p>Table <ref type="table" target="#tab_0">1</ref> reports the results obtained by SKET and the considered baselines on Entity Linking (left) and Text Classification (right).</p><p>For entity linking (Task 1), we observe that SKET achieves high performance for both micro- and weighted-average F1 in each considered use-case. Regarding accuracy, its performance varies depending on the use-case, with the lowest score (0.6280) obtained for colon cancer. As for the comparison of SKET with the considered baselines, we see that it outperforms them in each use-case for all measures. This result shows the effectiveness of combining ad-hoc, expert rules with ML models, making SKET both precise and sensitive. Specifically, ad-hoc matching makes SKET precise, while semantic matching makes it sensitive. To support this intuition, we observe that the unsupervised baselines, which only rely on ML models and semantic matching, have low accuracy values. Since we tackle the entity linking task as a multi-label classification problem, we rely on subset accuracy, where the set of concepts predicted for a report must exactly match the corresponding set of ground-truth concepts. Therefore, accuracy values are prone to decrease rapidly, and less precise models are naturally affected by this.</p><p>For text classification (Task 2), we see that SKET performs well on the colon and lung cancer use-cases, whereas it shows lower accuracy values on cervix cancer. This result suggests that the cervix use-case is harder than the others, as subset accuracy drops fast when a model fails to predict all labels correctly. The higher values for micro- and weighted-average F1, which do not require an exact match between predicted and ground-truth labels, further support this intuition. Compared to the baselines, SKET outperforms them in the colon and cervix use-cases. On the other hand, the BERT-based approach proves more effective in lung cancer. Despite this, the robustness of SKET across different use-cases makes it a viable solution in real scenarios, where annotated data are hard and expensive to obtain.</p></div>
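<div xmlns="http://www.tei-c.org/ns/1.0"><p>The gap between subset accuracy and micro-average F1 discussed above can be reproduced on a toy example. The sketch below is only illustrative: the label sets are invented, and the two functions follow the standard definitions of the measures rather than the evaluation scripts used in the experiments.</p><p>
```python
def subset_accuracy(gold, pred):
    """Fraction of reports whose predicted label set exactly matches the gold set."""
    exact = sum(1 for g, p in zip(gold, pred) if set(g) == set(p))
    return exact / len(gold)


def micro_f1(gold, pred):
    """Micro-average F1: counts are pooled over all reports and labels."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        g, p = set(g), set(p)
        tp += len(g.intersection(p))  # labels predicted and correct
        fp += len(p - g)              # labels predicted but wrong
        fn += len(g - p)              # labels missed
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Invented label sets for three reports.
gold = [{"adenocarcinoma", "colon"}, {"polyp"}, {"dysplasia", "adenoma"}]
pred = [{"adenocarcinoma"}, {"polyp"}, {"dysplasia", "adenoma", "polyp"}]

print(subset_accuracy(gold, pred))
print(micro_f1(gold, pred))
```
</p><p>Only one of the three reports is predicted exactly, so subset accuracy is 1/3, whereas micro-average F1 is 0.8 because four of the five gold labels are recovered with a single spurious prediction. A single missing or extra label per report is thus enough to lower subset accuracy sharply while leaving F1 comparatively high.</p></div>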
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusion</head><p>In this work, we presented SKET, an unsupervised hybrid knowledge extraction system that combines rule-based techniques with pre-trained ML models to extract relevant concepts from diagnostic reports. The experimental evaluation demonstrated the effectiveness of SKET, making it a viable solution to reduce pathologists' workload. Besides, the experimental results highlighted the importance of expert knowledge in developing unsupervised systems for specialized medicine. As a result, the extracted concepts can serve different digital pathology applications, such as automatic report annotation, visualization, and retrieval, as well as image classification.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Entity linking (left) and text classification (right) results on colon, cervix, and lung cancer pathology reports. Bold values represent the highest scores achieved for each measure.</figDesc><table><row><cell cols="4">Entity Linking</cell><cell cols="4">Text Classification</cell></row><row><cell cols="4">Colon</cell><cell cols="4">Colon</cell></row><row><cell>Model</cell><cell>Accuracy</cell><cell>Micro F1</cell><cell>Weighted F1</cell><cell>Model</cell><cell>Accuracy</cell><cell>Micro F1</cell><cell>Weighted F1</cell></row><row><cell>SKET</cell><cell>0.6280</cell><cell>0.8861</cell><cell>0.8694</cell><cell>SKET</cell><cell>0.7525</cell><cell>0.8386</cell><cell>0.8373</cell></row><row><cell>FastText</cell><cell>0.0660</cell><cell>0.5000</cell><cell>0.6146</cell><cell>FastText</cell><cell>0.4146</cell><cell>0.5298</cell><cell>0.5514</cell></row><row><cell>BERT</cell><cell>0.1840</cell><cell>0.3905</cell><cell>0.4527</cell><cell>BERT</cell><cell>0.5167</cell><cell>0.5697</cell><cell>0.6587</cell></row><row><cell cols="4">Cervix</cell><cell cols="4">Cervix</cell></row><row><cell>Model</cell><cell>Accuracy</cell><cell>Micro F1</cell><cell>Weighted F1</cell><cell>Model</cell><cell>Accuracy</cell><cell>Micro F1</cell><cell>Weighted F1</cell></row><row><cell>SKET</cell><cell>0.7020</cell><cell>0.8322</cell><cell>0.8368</cell><cell>SKET</cell><cell>0.5281</cell><cell>0.7791</cell><cell>0.7611</cell></row><row><cell>FastText</cell><cell>0.0900</cell><cell>0.2802</cell><cell>0.3439</cell><cell>FastText</cell><cell>0.2533</cell><cell>0.4882</cell><cell>0.4445</cell></row><row><cell>BERT</cell><cell>0.0720</cell><cell>0.2715</cell><cell>0.2940</cell><cell>BERT</cell><cell>0.3066</cell><cell>0.3962</cell><cell>0.4867</cell></row><row><cell cols="4">Lung</cell><cell cols="4">Lung</cell></row><row><cell>Model</cell><cell>Accuracy</cell><cell>Micro F1</cell><cell>Weighted F1</cell><cell>Model</cell><cell>Accuracy</cell><cell>Micro F1</cell><cell>Weighted F1</cell></row><row><cell>SKET</cell><cell>0.8624</cell><cell>0.9375</cell><cell>0.9262</cell><cell>SKET</cell><cell>0.8137</cell><cell>0.8387</cell><cell>0.8262</cell></row><row><cell>FastText</cell><cell>0.2510</cell><cell>0.5610</cell><cell>0.6506</cell><cell>FastText</cell><cell>0.5221</cell><cell>0.7296</cell><cell>0.6853</cell></row><row><cell>BERT</cell><cell>0.3806</cell><cell>0.6804</cell><cell>0.8395</cell><cell>BERT</cell><cell>0.8523</cell><cell>0.8630</cell><cell>0.8526</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://w3id.org/examode/ontology/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://github.com/ExaNLP/sket/tree/main/sket/nerd/rules/</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>The work was supported by the ExaMode project, as part of the EU H2020 program under Grant Agreement no. 825292.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Empowering digital pathology applications through explainable knowledge extraction tools</title>
		<author>
			<persName><forename type="first">S</forename><surname>Marchesin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Giachelle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Marini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Atzori</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Boytcheva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Buttafuoco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ciompi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">M</forename><surname>Di Nunzio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Fraggetta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Irrera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Müller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Primov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Vatrano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Silvello</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.jpi.2022.100139</idno>
		<ptr target="https://doi.org/10.1016/j.jpi.2022.100139" />
	</analytic>
	<monogr>
		<title level="j">Journal of Pathology Informatics</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="page">100139</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">The Potential for Artificial Intelligence in Healthcare</title>
		<author>
			<persName><forename type="first">T</forename><surname>Davenport</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kalakota</surname></persName>
		</author>
		<idno type="DOI">10.7861/futurehosp.6-2-94</idno>
		<ptr target="https://doi.org/10.7861/futurehosp.6-2-94" />
	</analytic>
	<monogr>
		<title level="j">Future Healthc J</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="page" from="94" to="98" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The Feasibility of Using Natural Language Processing to Extract Clinical Information from Breast Pathology Reports</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Buckley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">B</forename><surname>Coopey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sharko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Polubriaginof</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Drohan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Belli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">M</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">E</forename><surname>Garber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">L</forename><surname>Smith</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Gadd</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">C</forename><surname>Specht</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">A</forename><surname>Roche</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">M</forename><surname>Gudewicz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">S</forename><surname>Hughes</surname></persName>
		</author>
		<idno type="DOI">10.4103/2153-3539.97788</idno>
		<ptr target="https://doi.org/10.4103/2153-3539.97788" />
	</analytic>
	<monogr>
		<title level="j">J. Pathol Inform</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page">23</biblScope>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Information Extraction from Multi-Institutional Radiology Reports</title>
		<author>
			<persName><forename type="first">S</forename><surname>Hassanpour</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">P</forename><surname>Langlotz</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.artmed.2015.09.007</idno>
		<ptr target="https://doi.org/10.1016/j.artmed.2015.09.007" />
	</analytic>
	<monogr>
		<title level="j">Artif. Intell. Medicine</title>
		<imprint>
			<biblScope unit="volume">66</biblScope>
			<biblScope unit="page" from="29" to="39" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Natural Language Processing in Pathology: a Scoping Review</title>
		<author>
			<persName><forename type="first">G</forename><surname>Burger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Abu-Hanna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>De Keizer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Cornet</surname></persName>
		</author>
		<idno type="DOI">10.1136/jclinpath-2016-203872</idno>
		<ptr target="https://doi.org/10.1136/jclinpath-2016-203872" />
	</analytic>
	<monogr>
		<title level="j">Journal of Clinical Pathology</title>
		<imprint>
			<biblScope unit="volume">69</biblScope>
			<biblScope unit="page" from="949" to="955" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Mining Fall-Related Information in Clinical Notes: Comparison of Rule-Based and Novel Word Embedding-Based Machine Learning Approaches</title>
		<author>
			<persName><forename type="first">M</forename><surname>Topaz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Murga</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">M</forename><surname>Gaddis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">V</forename><surname>McDonald</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Bar-Bachar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Goldberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">H</forename><surname>Bowles</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.jbi.2019.103103</idno>
		<ptr target="https://doi.org/10.1016/j.jbi.2019.103103" />
	</analytic>
	<monogr>
		<title level="j">J. Biomed. Informatics</title>
		<imprint>
			<biblScope unit="volume">90</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Obtaining Knowledge in Pathology Reports Through a Natural Language Processing Approach With Classification, Named-Entity Recognition, and Relation-Extraction Heuristics</title>
		<author>
			<persName><forename type="first">T</forename><surname>Oliwa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">B</forename><surname>Maron</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">M</forename><surname>Chase</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Lomnicki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">V T</forename><surname>Catenacci</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Furner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">L</forename><surname>Volchenboum</surname></persName>
		</author>
		<idno type="DOI">10.1200/CCI.19.00008</idno>
		<ptr target="https://doi.org/10.1200/CCI.19.00008" />
	</analytic>
	<monogr>
		<title level="j">JCO Clinical Cancer Informatics</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1" to="8" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Natural Language Processing Systems for Capturing and Standardizing Unstructured Clinical Information: A Systematic Review</title>
		<author>
			<persName><forename type="first">K</forename><surname>Kreimeyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Foster</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Pandey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Arya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Halford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">F</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Forshee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Walderhaug</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Botsis</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.jbi.2017.07.012</idno>
		<ptr target="https://doi.org/10.1016/j.jbi.2017.07.012" />
	</analytic>
	<monogr>
		<title level="j">J. Biomed. Informatics</title>
		<imprint>
			<biblScope unit="volume">73</biblScope>
			<biblScope unit="page" from="14" to="29" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Clinical Information Extraction Applications: A Literature Review</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Rastegar-Mojarad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Moon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Afzal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zeng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mehrabi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Sohn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Liu</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.jbi.2017.11.011</idno>
		<ptr target="https://doi.org/10.1016/j.jbi.2017.11.011" />
	</analytic>
	<monogr>
		<title level="j">J. Biomed. Informatics</title>
		<imprint>
			<biblScope unit="volume">77</biblScope>
			<biblScope unit="page" from="34" to="49" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Exploiting Rules to Enhance Machine Learning in Extracting Information From Multi-Institutional Prostate Pathology Reports</title>
		<author>
			<persName><forename type="first">E</forename><surname>Santus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Schuster</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Tahmasebi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Yala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">R</forename><surname>Lanahan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Prinsen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">F</forename><surname>Thompson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Coons</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Mynderse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Barzilay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Hughes</surname></persName>
		</author>
		<idno type="DOI">10.1200/CCI.20.00028</idno>
		<ptr target="https://doi.org/10.1200/CCI.20.00028" />
	</analytic>
	<monogr>
		<title level="j">JCO Clinical Cancer Informatics</title>
		<imprint>
			<biblScope unit="page" from="865" to="874" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Validation of Deep Learning Natural Language Processing Algorithm for Keyword Extraction from Pathology Reports in Electronic Health Records</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Choi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Seok</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">J</forename><surname>Joo</surname></persName>
		</author>
		<idno type="DOI">10.1038/s41598-020-77258-w</idno>
		<ptr target="https://doi.org/10.1038/s41598-020-77258-w" />
	</analytic>
	<monogr>
		<title level="j">Sci Rep</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1" to="9" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Artificial Intelligence-Driven Structurization of Diagnostic Information in Free-Text Pathology Reports</title>
		<author>
			<persName><forename type="first">P</forename><surname>Giannaris</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Al-Taie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kovalenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Thanintorn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Kholod</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Innokenteva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Coberly</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Frazier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Laziuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Popescu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">R</forename><surname>Shyu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Hammer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Shin</surname></persName>
		</author>
		<idno type="DOI">10.4103/jpi.jpi_30_19</idno>
		<ptr target="https://doi.org/10.4103/jpi.jpi_30_19" />
	</analytic>
	<monogr>
		<title level="j">Journal of Pathology Informatics</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page">10</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Automating the Determination of Prostate Cancer Risk Strata From Electronic Medical Records</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Gregg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">L</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Resnick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Jain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Warner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Barocas</surname></persName>
		</author>
		<idno type="DOI">10.1200/CCI.16.00045</idno>
		<ptr target="https://doi.org/10.1200/CCI.16.00045" />
	</analytic>
	<monogr>
		<title level="j">JCO Clinical Cancer Informatics</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1" to="8" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Automated Extraction of Grade, Stage, and Quality Information From Transurethral Resection of Bladder Tumor Pathology Reports Using Natural Language Processing</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">P</forename><surname>Glaser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">J</forename><surname>Jordan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Cohen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Desai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Silberman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">J</forename><surname>Meeks</surname></persName>
		</author>
		<idno type="DOI">10.1200/CCI.17.00128</idno>
		<ptr target="https://doi.org/10.1200/CCI.17.00128" />
	</analytic>
	<monogr>
		<title level="j">JCO Clinical Cancer Informatics</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1" to="8" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Benchmarking Information Retrieval for Precision Oncology: the TREC Precision Medicine Track</title>
		<author>
			<persName><forename type="first">K</forename><surname>Roberts</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Demner-Fushman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">M</forename><surname>Voorhees</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">R</forename><surname>Hersh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bedrick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">J</forename><surname>Lazar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Pant</surname></persName>
		</author>
		<ptr target="https://knowledge.amia.org/67852-amia-1.4259402/t006-1.4263223/t006-1.4263224/2976780-1.4263306/2970178-1.4263303" />
	</analytic>
	<monogr>
		<title level="m">AMIA 2018, American Medical Informatics Association Annual Symposium</title>
				<meeting><address><addrLine>San Francisco, CA</addrLine></address></meeting>
		<imprint>
			<publisher>AMIA</publisher>
			<date type="published" when="2018">November 3-7, 2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">MedTAG: a portable and customizable annotation tool for biomedical documents</title>
		<author>
			<persName><forename type="first">F</forename><surname>Giachelle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Irrera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Silvello</surname></persName>
		</author>
		<idno type="DOI">10.1186/s12911-021-01706-4</idno>
		<ptr target="https://doi.org/10.1186/s12911-021-01706-4" />
	</analytic>
	<monogr>
		<title level="j">BMC Medical Informatics Decis. Mak</title>
		<imprint>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="page">352</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Unleashing the potential of digital pathology data by training computer-aided diagnosis models without human annotations</title>
		<author>
			<persName><forename type="first">N</forename><surname>Marini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Marchesin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Otálora</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wodzinski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Caputo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Van Rijthoven</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Aswolinskiy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Bokhorst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Podareanu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Petters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Boytcheva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Buttafuoco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Vatrano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Fraggetta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>van der Laak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Agosti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ciompi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Silvello</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Müller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Atzori</surname></persName>
		</author>
		<idno type="DOI">10.1038/s41746-022-00635-4</idno>
		<ptr target="http://dx.doi.org/10.1038/s41746-022-00635-4" />
	</analytic>
	<monogr>
		<title level="j">npj Digital Medicine</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Rule-Based Information Extraction is Dead! Long Live Rule-Based Information Extraction Systems!</title>
		<author>
			<persName><forename type="first">L</forename><surname>Chiticariu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">R</forename><surname>Reiss</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/D13-1079/" />
	</analytic>
	<monogr>
		<title level="m">Proc. of the 2013 Conference on Empirical Methods in Natural Language Processing, EMNLP 2013</title>
				<meeting>2013 Conference on Empirical Methods in Natural Language Processing, EMNLP 2013<address><addrLine>Grand Hyatt Seattle, Seattle, Washington, USA, ACL</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-10-21">18-21 October 2013</date>
			<biblScope unit="page" from="827" to="832" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Distributed Representations of Words and Phrases and their Compositionality</title>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">S</forename><surname>Corrado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dean</surname></persName>
		</author>
		<ptr target="https://proceedings.neurips.cc/paper/2013/hash/9aa42b31882ec039965f3c4923ce901b-Abstract.html" />
	</analytic>
	<monogr>
		<title level="m">Proc. of the 27th Annual Conference on Neural Information Processing Systems 2013</title>
				<meeting>27th Annual Conference on Neural Information Processing Systems 2013<address><addrLine>NIPS, Lake Tahoe, Nevada, United States</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013">December 5-8, 2013</date>
			<biblScope unit="page" from="3111" to="3119" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Enriching Word Vectors with Subword Information</title>
		<author>
			<persName><forename type="first">P</forename><surname>Bojanowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Grave</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Joulin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
		<idno type="DOI">10.1162/tacl_a_00051</idno>
		<ptr target="https://doi.org/10.1162/tacl_a_00051" />
	</analytic>
	<monogr>
		<title level="j">Trans. Assoc. Comput. Linguistics</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page" from="135" to="146" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Deep Contextualized Word Representations</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">E</forename><surname>Peters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Neumann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Iyyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gardner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/n18-1202</idno>
		<ptr target="https://doi.org/10.18653/v1/n18-1202" />
	</analytic>
	<monogr>
		<title level="m">Proc. of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018</title>
				<meeting>2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018<address><addrLine>New Orleans, Louisiana, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">June 1-6, 2018</date>
			<biblScope unit="page" from="2227" to="2237" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<title level="m" type="main">BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno>CoRR abs/1810.04805</idno>
		<ptr target="http://arxiv.org/abs/1810.04805" />
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Clinical-Grade Computational Pathology using Weakly Supervised Deep Learning on Whole Slide Images</title>
		<author>
			<persName><forename type="first">G</forename><surname>Campanella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">G</forename><surname>Hanna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Geneslaw</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Miraflor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">W K</forename><surname>Silva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">J</forename><surname>Busam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Brogi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">E</forename><surname>Reuter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">S</forename><surname>Klimstra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">J</forename><surname>Fuchs</surname></persName>
		</author>
		<idno type="DOI">10.1038/s41591-019-0508-1</idno>
		<ptr target="https://doi.org/10.1038/s41591-019-0508-1" />
	</analytic>
	<monogr>
		<title level="j">Nat Med</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="page" from="1301" to="1309" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Multiple Instance Learning: A Survey of Problem Characteristics and Applications</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Carbonneau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Cheplygina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Granger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Gagnon</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.patcog.2017.10.009</idno>
		<ptr target="https://doi.org/10.1016/j.patcog.2017.10.009" />
	</analytic>
	<monogr>
		<title level="j">Pattern Recognit</title>
		<imprint>
			<biblScope unit="volume">77</biblScope>
			<biblScope unit="page" from="329" to="353" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">TBGA: a large-scale gene-disease association dataset for biomedical relation extraction</title>
		<author>
			<persName><forename type="first">S</forename><surname>Marchesin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Silvello</surname></persName>
		</author>
		<idno type="DOI">10.1186/s12859-022-04646-6</idno>
		<ptr target="https://doi.org/10.1186/s12859-022-04646-6" />
	</analytic>
	<monogr>
		<title level="j">BMC Bioinform</title>
		<imprint>
			<biblScope unit="volume">23</biblScope>
			<biblScope unit="page">111</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Case-Based Retrieval Using Document-Level Semantic Networks</title>
		<author>
			<persName><forename type="first">S</forename><surname>Marchesin</surname></persName>
		</author>
		<idno type="DOI">10.1145/3209978.3210221</idno>
		<ptr target="https://doi.org/10.1145/3209978.3210221" />
	</analytic>
	<monogr>
		<title level="m">Proc. of the 41st International ACM SIGIR Conference on Research &amp; Development in Information Retrieval, SIGIR 2018</title>
				<meeting>41st International ACM SIGIR Conference on Research &amp; Development in Information Retrieval, SIGIR 2018<address><addrLine>Ann Arbor, MI, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2018">July 08-12, 2018</date>
			<biblScope unit="page">1451</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing</title>
		<author>
			<persName><forename type="first">M</forename><surname>Neumann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>King</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Beltagy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Ammar</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/w19-5034</idno>
		<ptr target="https://doi.org/10.18653/v1/w19-5034" />
	</analytic>
	<monogr>
		<title level="m">Proc. of the 18th BioNLP Workshop and Shared Task, BioNLP@ACL 2019</title>
				<meeting>18th BioNLP Workshop and Shared Task, BioNLP@ACL 2019<address><addrLine>Florence, Italy</addrLine></address></meeting>
		<imprint>
			<publisher>ACL</publisher>
			<date type="published" when="2019-08-01">August 1, 2019</date>
			<biblScope unit="page" from="319" to="327" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Distributional Semantics Resources for Biomedical Text Processing</title>
		<author>
			<persName><forename type="first">S</forename><surname>Pyysalo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ginter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Moen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Salakoski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ananiadou</surname></persName>
		</author>
		<ptr target="https://bio.nlplab.org/pdf/pyysalo13literature.pdf" />
	</analytic>
	<monogr>
		<title level="m">Proc. of LBM</title>
				<meeting>LBM</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="39" to="44" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">A Simple Algorithm for Identifying Negated Findings and Diseases in Discharge Summaries</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">W</forename><surname>Chapman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Bridewell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Hanbury</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">F</forename><surname>Cooper</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">G</forename><surname>Buchanan</surname></persName>
		</author>
		<idno type="DOI">10.1006/jbin.2001.1029</idno>
		<ptr target="https://doi.org/10.1006/jbin.2001.1029" />
	</analytic>
	<monogr>
		<title level="j">J. Biomed. Informatics</title>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="page" from="301" to="310" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Pattern Matching: the Gestalt Approach</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">W</forename><surname>Ratcliff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">E</forename><surname>Metzener</surname></persName>
		</author>
		<ptr target="https://www.drdobbs.com/database/pattern-matching-the-gestalt-approach/184407970" />
	</analytic>
	<monogr>
		<title level="j">Dr. Dobb&apos;s Journal</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="page">46</biblScope>
			<date type="published" when="1988">1988</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Learning Unsupervised Knowledge-Enhanced Representations to Reduce the Semantic Gap in Information Retrieval</title>
		<author>
			<persName><forename type="first">M</forename><surname>Agosti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Marchesin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Silvello</surname></persName>
		</author>
		<idno type="DOI">10.1145/3417996</idno>
		<ptr target="https://doi.org/10.1145/3417996" />
	</analytic>
	<monogr>
		<title level="j">ACM Trans. Inf. Syst.</title>
		<imprint>
			<biblScope unit="volume">38</biblScope>
			<biblScope unit="page">48</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Taming pretrained transformers for extreme multi-label text classification</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">C</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">F</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Zhong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">S</forename><surname>Dhillon</surname></persName>
		</author>
		<idno type="DOI">10.1145/3394486.3403368</idno>
		<ptr target="https://doi.org/10.1145/3394486.3403368" />
	</analytic>
	<monogr>
		<title level="m">KDD &apos;20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event</title>
		<meeting><address><addrLine>CA, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2020">August 23-27, 2020</date>
			<biblScope unit="page" from="3163" to="3171" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">LASIGE-BioTM at MESINESP2: entity linking with semantic similarity and extreme multi-label classification on Spanish biomedical documents</title>
		<author>
			<persName><forename type="first">P</forename><surname>Ruas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">D T</forename><surname>Andrade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">M</forename><surname>Couto</surname></persName>
		</author>
		<ptr target="http://ceur-ws.org/Vol-2936/paper-24.pdf" />
	</analytic>
	<monogr>
		<title level="m">Proc. of the Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>CLEF 2021 - Conference and Labs of the Evaluation Forum<address><addrLine>Bucharest, Romania</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021">September 21st to 24th, 2021</date>
			<biblScope unit="volume">2936</biblScope>
			<biblScope unit="page" from="324" to="334" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">BioWordVec, Improving Biomedical Word Embeddings with Subword Information and MeSH</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Lu</surname></persName>
		</author>
		<idno type="DOI">10.1038/s41597-019-0055-0</idno>
		<ptr target="https://doi.org/10.1038/s41597-019-0055-0" />
	</analytic>
	<monogr>
		<title level="j">Scientific Data</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="page" from="1" to="9" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<monogr>
		<author>
			<persName><forename type="first">E</forename><surname>Alsentzer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Murphy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Boag</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">H</forename><surname>Weng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Naumann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">B A</forename><surname>McDermott</surname></persName>
		</author>
		<idno>CoRR abs/1904.03323</idno>
		<ptr target="http://arxiv.org/abs/1904.03323" />
		<title level="m">Publicly Available Clinical BERT Embeddings</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
