<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Identification of Disease Symptoms in Multilingual Sentences: an Ontology-Driven Approach</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Angelo</forename><surname>Ferrando</surname></persName>
							<email>angelo.ferrando@dibris.unige.it</email>
							<affiliation key="aff0">
								<orgName type="department">DIBRIS</orgName>
								<orgName type="institution">Università degli Studi di Genova</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Silvio</forename><surname>Beux</surname></persName>
							<email>silviobeux@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="department">DIBRIS</orgName>
								<orgName type="institution">Università degli Studi di Genova</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Viviana</forename><surname>Mascardi</surname></persName>
							<email>viviana.mascardi@unige.it</email>
							<affiliation key="aff0">
								<orgName type="department">DIBRIS</orgName>
								<orgName type="institution">Università degli Studi di Genova</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Paolo</forename><surname>Rosso</surname></persName>
							<email>prosso@dsic.upv.es</email>
							<affiliation key="aff1">
								<orgName type="department">PRHLT</orgName>
								<orgName type="institution">Universitat Politècnica de València</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Identification of Disease Symptoms in Multilingual Sentences: an Ontology-Driven Approach</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">460BB83EE1EC6E035DBD4413D0E0DE6F</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T03:36+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Multilingual Natural Language Processing</term>
					<term>Ontology-Driven Text Classification</term>
					<term>BabelNet</term>
					<term>Symptom Disease Identification</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this paper we present a Multilingual Ontology-Driven framework for Text Classification (MOoD-TC). This framework is highly modular and can be customized to create applications based on Multilingual Natural Language Processing for classifying domain-dependent contents. In order to show the potential of MOoD-TC, we present a case study in the e-Health domain.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The large amount of digital data made available in recent years from a wide variety of sources raises the need for automatic methods to extract meaningful information from it. The extracted information is valuable for many purposes, and especially for commercial ones. Jackson and Moulinier <ref type="bibr" target="#b11">[12]</ref> observe that "there is no question concerning the commercial value of being able to classify documents automatically by content. There are myriad potential applications of such a capability for corporate Intranets, government departments, and Internet publishers".</p><p>The problem of classifying multilingual pieces of text has been addressed since the end of the last millennium <ref type="bibr" target="#b16">[17]</ref>, but it is still a significant problem because each language has its own peculiar features, making the automatic management of multilingualism an open issue.</p><p>The use of ontologies to classify multilingual texts <ref type="bibr" target="#b4">[5]</ref> is a good alternative to standard machine learning approaches in all those situations where a training set of documents is not available or is too small to properly train the classifier. Ontology-driven text classification does not depend on the existence of a training set, as it relies solely on the entities, their relationships, and the taxonomy of categories represented in an ontology, which becomes the driver of the classification. Another advantage of ontology-driven classification is that ontology concepts are organized into hierarchies, and this makes it possible to identify the category (or the categories) that best classify the document's content by traversing the hierarchical structure.</p><note place="foot">The first author of this paper is a PhD student in Computer Science at the University of Genova, Italy. The work of the last author was in the framework of the SomEMBED MINECO TIN2015-71147-C2-1-P research project.</note><p>In this paper we present MOoD-TC (Multilingual Ontology Driven Text Classifier <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b12">13]</ref>), a highly modular system which has been conceived, designed and implemented to be customized by the system developer for obtaining different domain-dependent behaviors, always centered around the multilingual text classification process. The original contribution of this paper is the exploitation of the core "multilingual word identification" functionalities of MOoD-TC for a challenging scenario in the e-Health domain, where classification is a by-product of disease symptoms identification in multilingual pieces of text, driven by a standard symptoms ontology. A customization of MOoD-TC with an ad-hoc module equipped with pre- and post-processing facilities suitable for the scenarios that motivate our work is also described.</p><p>The paper is organized as follows: Section 2 introduces three motivating scenarios where an ontology-driven multilingual text classification may prove useful, Section 3 analyzes the state of the art, Section 4 describes MOoD-TC, Section 5 provides examples and experimental results, and Section 6 concludes.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Motivating scenarios</head><p>Alice is enjoying her holidays in Stockholm. Suddenly, she feels a painful spasm in her stomach and within a few minutes a strong feeling of nausea appears. Spasms go on for half an hour, and she starts to feel worried. She does not think it is necessary to go to the hospital, but she would at least like to ask for advice over the phone. However, she cannot speak Swedish and, in the stressful situation she is experiencing, she cannot recall how to express her health problems in English. She could speak in her native language, Italian, but it is unlikely that the doctor speaks Italian as well.</p><p>Bob is taking a walk in his town. He notices a young man bending over his knees, with a scared expression on his face. He runs to help him, and he understands that the problem is with the young man's chest. The young man speaks French only and Bob cannot understand him: he calls the first aid emergency number and explains what he is seeing and what he supposes to be taking place. If he could understand what the young man says, he would definitely be more helpful.</p><p>Carol is a volunteer in Honduras. She is neither a physician nor a nurse. She has a very basic knowledge of first aid procedures and a first aid kit with medicines that she knows how to administer, given a clear diagnosis. A woman runs towards her asking for her assistance. The woman's small boy has a problem with his head and he has a high fever but, without understanding the other symptoms that the woman is trying to explain in Spanish, Carol cannot recognize and classify the problem. In the remote place where she is, she cannot contact the doctor. Carol would need to understand the other symptoms besides fever and headache in order to select the correct medicine.</p><p>The three scenarios above are all characterized by the impossibility for the doctor to visit the patient on-the-fly and the need for the patient to be understood despite language barriers, in order to get advice for minor problems or to speed up the assistance procedure for major ones. The patient's need could be suitably addressed by identifying the symptoms and translating them from her language into the assistant's or the doctor's language. If automatic tools for facing this issue were available, for example as an app installed on the mobile phone, the three situations could evolve in the following way:</p><p>-Scenario 1: through the use of an app, the person needing care communicates with the "health emergency" software application in her own language. The application performs a speech-to-text translation, identifies the symptoms in the text based on a standard ontological representation of symptoms, and sends the list of symptoms expressed in the doctor's language to a center where they are managed either by intelligent software agents or by human experts.</p><p>-Scenario 2: the "health emergency" software application is not directly used by the person needing care, but by the one who assists her. As before, the assisted person can "tell" her problems to the application, which performs a speech-to-text translation and identifies the symptoms represented in a domain ontology which appear in the text. The symptoms, translated into the language of the person who is giving the first assistance, may be read on the screen. That person can call the national first aid number, telling what is happening, what she sees, and the symptoms which have been understood, classified, and translated by the app.</p><p>-Scenario 3: also in this case, besides a speech-to-text translation, the symptoms expressed in the language of the patient are identified w.r.t. a symptoms ontology and translated into the target language. The way this information is used can require a further automatic processing stage, if the doctor cannot be involved in the loop and the person providing aid needs automatic support for making a diagnosis and identifying the right therapy to administer.</p><p>In all three situations above, a standard machine translation application and a symptoms classifier based on machine learning might not be powerful enough: the pre- and post-processing stages require a machine-readable explicit representation of symptoms, in some vocabulary agreed upon by all the application components and by the humans involved in the loop, in order to share them among the application components (both at the client and at the server side) and to reason about them if needed. A multilingual ontology-driven text classification approach seems the right way to face these challenging scenarios.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">State of the art</head><p>According to <ref type="bibr" target="#b7">[8]</ref>, in 1996 more than 80% of Internet users were native English speakers. This percentage dropped to 55% in 2000 and to 27.3% in 2010. However, about 80% of the digital resources available today on the Web (including deep Web and digital libraries) are in English <ref type="bibr" target="#b9">[10]</ref>. This creates an urgent need for multilingual information systems and Cross-Language Information Retrieval (CLIR) facilities. How to manipulate the large volume of multilingual data has now become a major research question.</p><p>In this paper we are interested in Natural Language Processing (NLP) techniques for solving multilingual term identification and text classification problems in the e-Health domain, where extracting information from clinical notes has been the focus of a growing body of research in recent years <ref type="bibr" target="#b13">[14]</ref>. Common characteristics of narrative text used by physicians in electronic health records make the automatic extraction of meaningful information hard. NLP techniques are needed to convert data from unstructured text to a structured form readily processable by computers <ref type="bibr" target="#b14">[15]</ref>. This structured representation can be used to extract meaning and enable Clinical Decision Support systems that assist healthcare professionals and improve health outcomes <ref type="bibr" target="#b5">[6]</ref>.</p><p>Signs and symptoms have seldom been studied for themselves in the field of biomedical information extraction. Indeed, they are often included in more general categories such as "clinical concepts" <ref type="bibr" target="#b21">[22]</ref>, "medical problems" <ref type="bibr" target="#b20">[21]</ref> or "phenotypic information" <ref type="bibr" target="#b18">[19]</ref>. Moreover, most of the available studies are based on clinical reports or narrative corpora. In <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b17">18]</ref> the aim is symptom extraction from clinical records, and in <ref type="bibr" target="#b19">[20]</ref> the authors identify the risk factors for heart disease based on the automated analysis of narrative clinical records of diabetic patients.</p><p>Another recent project in the e-Health NLP context is IBM Watson for Oncology<ref type="foot" target="#foot_0">1</ref>. It has an advanced ability to analyze the meaning and context of structured and unstructured data in clinical notes and reports, easily assimilating key patient information written in plain English that may be critical to select a treatment pathway. These works are different from ours because they do not address multilingual aspects and, furthermore, because they have to manage the differences between the "signs", which are identified by clinicians, and the "symptoms", which can be described directly by the sick person.</p><p>In our work we do not have to manage clinical records but directly the information provided by the person who feels sick. This difference is crucial in works using an ontology-driven approach, because clinical reports contain many more technical words<ref type="foot" target="#foot_1">2</ref> compared to a text written (or a sentence spoken) by a layperson describing how she feels. This allows us to use simpler ontologies. Especially from the multilingual viewpoint, having an ontology containing simple concepts, omitting useless technicalities, allows achieving better results with less effort, considering that a technical word could be less well supported by the tools we use in our text classification pipeline.</p><p>The assumption upon which MOoD-TC relies is the availability of ontologies in the domain of interest. Even if the application developer might design and implement her own domain ontology from scratch, integrating well-founded and widely used ontologies into MOoD-TC would be the most modular, reusable and scientifically acceptable approach. Luckily, many domain ontologies exist, in particular in the biomedical domain. Panacea <ref type="bibr" target="#b6">[7]</ref>, the Ontology for General Medical Science<ref type="foot" target="#foot_2">3</ref>, and the Gene Ontology<ref type="foot" target="#foot_3">4</ref> are just a few recent examples, besides the "symptoms ontology" used for our experiments and discussed in Section 5.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">MOoD-TC</head><p>MOoD-TC has been developed as part of Silvio Beux's Master's thesis <ref type="bibr" target="#b2">[3]</ref>, starting from <ref type="bibr" target="#b12">[13]</ref>. Its aim is to classify multilingual textual documents according to classes described in a domain ontology. MOoD-TC consists of the Text Classifier (TC) and the Application Domain Module (ADM). It provides a set of core modules offering functionalities which are common to any text classification problem (text pre-processing, tagging, classification) plus a customizable structure for those modules which can be implemented by the developer in order to offer application-specific functionalities. It returns a classification of the text w.r.t. the ontology taken as input. The classification is performed by TC, which is implemented in Java and exploits the Language Detector Library<ref type="foot" target="#foot_4">5</ref>, BabelNet <ref type="bibr" target="#b15">[16]</ref>, and TreeTagger<ref type="foot" target="#foot_5">6</ref>.</p><p>The Language Detector Library detects 53 languages with a precision greater than 99%, making use of naive Bayesian filters. It is used to recognize the language L_o of the ontology o and the language L_d of the textual document d. The TreeTagger tool performs the tagging of d in order to obtain, for each word w ∈ d that is not a stop word, its lemma (the canonical form of the word) and its part of speech (POS). This information is used by BabelNet to translate w into the ontology language. Finally, the translated word w' is searched inside the ontology and contributes to the classification of d in the category modeled by the ontology concept c having the same semantics as w'. The ClassifierObject is the object that stores a correctly classified word (and additional information) of the document d with respect to o. TC returns a list of such objects. ADM specializes the text classifier task by implementing functionalities for pre- and post-processing a multilingual textual document. If an ADM is used, the entire system specializes its behaviour in the domain represented by that particular ADM (e.g., from text classifier to disease recognizer). In our system TC can work alone, but an ADM is meant to work in close connection with the core system. The core modules are implemented to work for the European languages (which share some common features like, for example, the relationship between noun and adjective), but they could be extended to cope with the peculiar features of other languages; in fact, thanks to the modularity of the system, it is possible to integrate different algorithms created specifically to handle those peculiarities, without modifying the entire system. The ADM processes the TC input and output in order to obtain a new domain-oriented tool. An ADM is composed of two sub-components: pre-processing and post-processing. The pre-processing component takes as input a digital object (for example a spoken sentence, in the scenarios discussed in Section 2) and returns a new processed text, while the post-processing component takes as input the TC output and returns a domain-dependent result. Figure <ref type="figure" target="#fig_0">1</ref> shows the entire pipeline of the integration process between the TC and the ADM.</p></div>
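<div xmlns="http://www.tei-c.org/ns/1.0"><p>The TC steps described above (stop-word removal, lemmatization, translation into the ontology language, ontology lookup) can be illustrated with a minimal Python sketch. It is not the authors' Java implementation: the dictionaries below are hypothetical toy stand-ins for TreeTagger and BabelNet, the class name ClassifiedWord is an assumed simplification of the ClassifierObject, and the document language is passed in explicitly instead of being detected.</p>
<p><![CDATA[
```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical toy resources. The real system relies on the Language Detector
# Library, TreeTagger and BabelNet; none of them is called in this sketch.
STOP_WORDS = {"it": {"ho", "la", "e", "di"}}
LEMMAS = {"it": {"brividi": "brivido"}}  # surface form to lemma
TRANSLATIONS = {("it", "en"): {"febbre": "fever", "brivido": "chill", "tosse": "cough"}}
ONTOLOGY_EN = {"fever", "chill", "cough", "nausea"}  # concept labels in language L_o


@dataclass(frozen=True)
class ClassifiedWord:
    """Assumed, simplified stand-in for the ClassifierObject."""
    concept: str
    occurrences: int


def classify(words, doc_lang, onto_lang="en"):
    """Map each non-stop word to an ontology concept and count the matches."""
    counts = Counter()
    for word in words:
        w = word.lower()
        if w in STOP_WORDS.get(doc_lang, set()):
            continue  # stop words are ignored
        lemma = LEMMAS.get(doc_lang, {}).get(w, w)
        translated = TRANSLATIONS.get((doc_lang, onto_lang), {}).get(lemma)
        if translated in ONTOLOGY_EN:
            counts[translated] += 1
    return [ClassifiedWord(c, n) for c, n in sorted(counts.items())]


result = classify("Ho la febbre e i brividi".split(), doc_lang="it")
print(result)
```
]]></p>
<p>In this sketch an ADM would wrap the call to classify: a pre-processing step would produce the word list (for example from speech), and a post-processing step would consume the returned list of concept counts.</p></div>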
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Exploiting MOoD-TC for Symptom Identification</head><p>As illustrated in Section 2, the scenarios we aim to address require that disease symptoms appearing in a text are correctly identified w.r.t. a domain ontology. The pre-processing stage consists of moving from a spoken sentence to a text, and the post-processing stage consists of translating the identified symptoms into a target language and, depending on the scenario, moving back from text to speech and/or reasoning over them. In what follows we discuss the experiments related to our main task, namely that of symptoms identification.</p><p>The domain ontology used for describing symptoms is a standard ontology named the symptoms ontology<ref type="foot" target="#foot_6">7</ref>, partially shown in Figure <ref type="figure" target="#fig_1">2</ref>. It is an ontology of disease symptoms, with symptoms encompassing perceived changes in function, sensations or appearance reported by a patient and indicative of a disease. We stress that our experiments in exploiting MOoD-TC for symptom identification did not require building any new ontology. Rather, consistently with the good principle of reusing existing software whenever available and, in particular, reusing existing ontologies, we just passed the symptoms ontology as input to the TC, obtaining the results discussed below. We discuss our initial experiments with phrases in five different languages (English, French, German, Italian, Spanish), where symptoms are identified by the TC module. The classification of two sample sentences is shown below, where the TC GUI screenshot associated with each sentence shows the ontology concepts which appear in the text along with the number of their occurrences in the text. The experiments have been carried out on 32 sentences for each of the 5 languages, for a total of 160 sentences. Each sentence describes symptoms related to one of the following sixteen diseases: tinnitus, food allergy, cervical, dehydration, hyperthyroidism, flu, appendicitis, food poisoning, labyrinthitis, narcolepsy, pneumonia, diabetes type 1, hyperglycemia, hypoglycemia, bronchitis, jet lag (two sentences for each disease). To cover the widest range of cases we considered the diseases with the most varied symptoms. The description of symptoms associated with each disease has been retrieved from <ref type="bibr" target="#b8">[9]</ref> and each sentence contains between 2 and 9 symptom words. The sentences were manually created by the authors.</p><p>Since the final purpose of this work is to provide an automatic diagnostic system with as many symptoms as possible, in order to devise the correct diagnosis, we were mainly interested in symptoms which appear in the text but which are not identified by our classifier (false negatives). We also looked for false positives, but their number is so low as to be irrelevant for our experiments. Also, false positives are due to an under-classification rather than an actually wrong classification: if the text contains the "abdominal cramp" symptom, for example, and it is classified with the more general "abdominal symptom" concept, we consider this result a false positive, as a more specific concept could have been returned. Figure <ref type="figure">3</ref> shows, for each disease, the average number of symptoms that should have been identified compared with the average number of correctly identified symptoms in the five considered languages. Figure <ref type="figure">4</ref> shows the number of false negatives (y axis) per disease (x axis). Figure <ref type="figure">3</ref> demonstrates that the results vary greatly with the disease. For example, symptoms related to tinnitus are rarely identified, but this can be easily explained by observing the ontology we used, where problems related to ears are not modeled at all. By carefully analyzing the obtained results, we also realized that sometimes the performance of the classifier is degraded by the presence of a symptom in the text which has a different grammatical role than the symptom in the ontology (usually a noun), making their matching impossible although the word root and the meaning are the same. For example, the ontology contains the noun "irritability", but if the text contains the adjective "irritable" (in any language), the identification fails. This problem is due to the way the root of a word is computed, and to the way words are managed in BabelNet.</p><p>What emerges from Figure <ref type="figure">4</ref> is that false negatives behave very similarly regardless of the language of the sentence. This is again due to the two reasons discussed above. Despite these problems, which have a clearly understood motivation and which can be addressed by extending the ontology and by refining the management of word root extraction, MOoD-TC has proved to be a flexible and ready-to-use approach for multilingual symptoms identification driven by a standard ontology we retrieved on the web.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. For each disease, the leftmost column (in black) measures the average number of symptoms that should have been identified; the next five columns show the average number of correctly identified symptoms in Italian, French, German, Spanish and English sentences respectively.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head>Fig. 4 .</head><label>4</label><figDesc>Fig. 4. Trend of errors per disease in the five languages (False Negatives). On the x axis the diseases are reported (labels are omitted) and on the y axis the number of false negatives per disease is reported: each line in the graphic is associated with one language.</figDesc></figure>
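<div xmlns="http://www.tei-c.org/ns/1.0"><p>The error categories used above can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the parent map is a hypothetical fragment built around the "abdominal cramp" example, the function names are invented here, and the choice not to count an expected symptom as a false negative when one of its ancestors was returned is our assumption, since the text does not specify this detail.</p>
<p><![CDATA[
```python
# Hypothetical fragment of a symptom hierarchy (child to parent).
PARENT = {
    "abdominal cramp": "abdominal symptom",
    "abdominal symptom": "symptom",
    "fever": "symptom",
}


def ancestors(concept):
    """Return every concept above the given one in the hierarchy."""
    found = []
    while concept in PARENT:
        concept = PARENT[concept]
        found.append(concept)
    return found


def evaluate(expected, predicted):
    """Split results into exact matches, under-classifications and misses."""
    expected, predicted = set(expected), set(predicted)
    correct = expected & predicted
    remaining = expected - correct
    # A predicted concept that is an ancestor of a still-unmatched expected
    # symptom is an under-classification (counted as a false positive above).
    under = {p for p in predicted - expected
             if any(p in ancestors(e) for e in remaining)}
    # Assumption: a symptom covered by such an ancestor is not a false negative.
    covered = {e for e in remaining if set(ancestors(e)) & under}
    return {
        "correct": sorted(correct),
        "under_classified": sorted(under),
        "false_negatives": sorted(remaining - covered),
    }


report = evaluate(
    expected=["abdominal cramp", "fever", "nausea"],
    predicted=["abdominal symptom", "fever"],
)
print(report)
```
]]></p>
<p>Here "nausea" is reported as a false negative because neither the concept itself nor any of its ancestors was returned, which mirrors the tinnitus case where the relevant concepts are missing from the ontology.</p></div>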
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Conclusions and Future Work</head><p>In this paper we presented the MOoD-TC architecture, showing its possible use in the symptoms identification problem. The speech-to-text pre-processing stage can be handled using existing tools, and the post-processing stage with a translation of the identified symptoms into the doctor's language can be addressed using BabelNet, in the same way we exploit BabelNet for bridging the text, whatever its language, and the ontology. The more challenging post-processing stage of supporting the user in providing a diagnosis given a set of identified symptoms could be addressed by means of sophisticated expert systems such as the old and well known MYCIN <ref type="bibr" target="#b3">[4]</ref> and more recent projects (http://www.easydiagnosis.com/, https://www.diagnose-me.com/, <ref type="bibr" target="#b1">[2]</ref>), some of which are ontology-driven <ref type="bibr" target="#b0">[1]</ref>.</p><p>Our framework does not face many well known open problems in multilingual text classification and information extraction such as negation <ref type="bibr" target="#b22">[23]</ref> and named entities, but rather it provides a flexible and modular approach ready for integrating, with limited effort, the results and algorithms addressing the above problems coming from the research community.</p><p>In the short term, our work will be devoted to overcoming the problems that limit the performance of MOoD-TC in the considered scenario: we will make the word identification more flexible and we will extend the symptoms ontology with those symptoms which have not been modeled so far.</p><p>In the future, it would be interesting to run an experimental comparison between our approach and a machine learning one. In case of a limited number of labeled examples, in fact, it would be feasible to apply semi-supervised learning methods. Depending on the comparison results, we will also consider combining both approaches, using a domain ontology to improve the results of a traditional machine learning approach.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. Integration pipeline of TC and ADM.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Symptoms ontology (the three branches are children of "Symptom").</figDesc><graphic coords="6,197.01,512.69,221.33,83.51" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Phrase 1</head><label>1</label><figDesc>Phrase 1 (Italian language): "Credo di avere la febbre, continuo a sudare e ho i brividi. Non la smetto di tossire e fatico a mangiare a causa del male alla gola, come un forte bruciore. Mi sento stanchissimo e ho dolore a tutti i muscoli." Phrase 3 (Spanish language): "Me siento fatal. Tengo temperatura, vòmito y diarrea. Hace dos dìas que no consigo comer nada. Tengo nausea y mareos."</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="8,134.77,147.17,353.03,141.73" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://www.ibm.com/smarterplanet/us/en/ibmwatson/watson-oncology.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">A clinical report is written by a doctor.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://bioportal.bioontology.org/ontologies/OGMS</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">http://geneontology.org/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">https://code.google.com/p/language-detection/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_5">http://code.google.com/p/tt4j/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_6">http://purl.obolibrary.org/obo/symp.owl</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">CardioOWL: An ontology-driven expert system for diagnosing coronary artery diseases</title>
		<author>
			<persName><forename type="first">B</forename><surname>Al-Hamadani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Open Systems (ICOS)</title>
				<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="128" to="132" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Medical expert systems for diabetes diagnosis: A survey</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">P</forename><surname>Ambilwade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">R</forename><surname>Manza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">P</forename><surname>Gaikwad</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Int. J. of ARCSSE</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="issue">11</biblScope>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">MOoD-TC: A general purpose multilingual ontology driven text classifier</title>
		<author>
			<persName><forename type="first">S</forename><surname>Beux</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2015">2015</date>
			<pubPlace>Italy</pubPlace>
		</imprint>
		<respStmt>
			<orgName>University of Genova</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Master&apos;s Degree Thesis in Computer Science</note>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Rule Based Expert Systems: The Mycin Experiments of the Stanford Heuristic Programming Project</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">G</forename><surname>Buchanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">H</forename><surname>Shortliffe</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1984">1984</date>
			<publisher>Addison-Wesley</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Multilingual text classification using ontologies</title>
		<author>
			<persName><forename type="first">G</forename><surname>De Melo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Siersdorfer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ECIR Conference, Proceedings</title>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2007">2007</date>
			<biblScope unit="volume">4425</biblScope>
			<biblScope unit="page" from="541" to="548" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">What can natural language processing do for clinical decision support?</title>
		<author>
			<persName><forename type="first">D</forename><surname>Demner-Fushman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Chapman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>McDonald</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Biomedical Informatics</title>
		<imprint>
			<biblScope unit="volume">42</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="760" to="772" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Panacea, a semantic-enabled drug recommendations discovery framework</title>
		<author>
			<persName><forename type="first">C</forename><surname>Doulaverakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Nikolaidis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kleontas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Kompatsiaris</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J. Biomedical Semantics</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page">13</biblScope>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m">Global internet statistics (by language)</title>
		<imprint>
			<publisher>Global Reach</publisher>
			<date type="published" when="2005-06">June 2005</date>
		</imprint>
	</monogr>
	<note type="report_type">Technical report</note>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Complete guide to symptoms, illness &amp; surgery for people over 50</title>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">W</forename><surname>Griffith</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1992">1992</date>
			<publisher>Body Press/Perigee</publisher>
			<pubPlace>New York, NY</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Cross-language information access to multilingual collections on the Internet</title>
		<author>
			<persName><forename type="first">B</forename><surname>Guo-Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Hsin-Hsi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American Society for Information Science</title>
		<imprint>
			<biblScope unit="volume">51</biblScope>
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Information extraction from clinical records</title>
		<author>
			<persName><forename type="first">H</forename><surname>Harkema</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Roberts</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Gaizauskas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hepple</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 4th UK e-Science All Hands Meeting</title>
		<editor>
			<persName><forename type="first">S</forename><surname>Cox</surname></persName>
		</editor>
		<meeting>the 4th UK e-Science All Hands Meeting<address><addrLine>Nottingham, UK</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Natural Language Processing for Online Applications: Text Retrieval, Extraction &amp; Categorization</title>
		<author>
			<persName><forename type="first">P</forename><surname>Jackson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Moulinier</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2002">2002</date>
			<publisher>John Benjamins</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">My MOoD, a multimedia and multilingual ontology driven MAS: design and first experiments in the sentiment analysis domain</title>
		<author>
			<persName><forename type="first">M</forename><surname>Leotta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Beux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Mascardi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Briola</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ESSEM Workshop, Proceedings</title>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="51" to="66" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Extracting information from textual documents in the electronic health record: a review of recent research</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Meystre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">K</forename><surname>Savova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">C</forename><surname>Kipper-Schuler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">F</forename><surname>Hurdle</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Yearbook of medical informatics</title>
		<imprint>
			<biblScope unit="page" from="128" to="144" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Natural language processing: An introduction</title>
		<author>
			<persName><forename type="first">P</forename><surname>Nadkarni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Ohno-Machado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Chapman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American Medical Informatics Association</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="544" to="551" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Babelnet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network</title>
		<author>
			<persName><forename type="first">R</forename><surname>Navigli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">P</forename><surname>Ponzetto</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artif. Intell.</title>
		<imprint>
			<biblScope unit="volume">193</biblScope>
			<biblScope unit="page" from="217" to="250" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">A survey of multilingual text retrieval</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">W</forename><surname>Oard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">J</forename><surname>Dorr</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1996">1996</date>
			<pubPlace>College Park, MD, USA</pubPlace>
		</imprint>
	</monogr>
	<note type="report_type">Technical report</note>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Using regular expressions to extract information on pacemaker implantation procedures from clinical reports</title>
		<author>
			<persName><forename type="first">A</forename><surname>Rosier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Burgun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Mabo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AMIA Annual Symposium</title>
		<meeting>the AMIA Annual Symposium<address><addrLine>Washington DC, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Developing a manually annotated clinical document corpus to identify phenotypic information for inflammatory bowel disease</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">R</forename><surname>South</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Garvin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">H</forename><surname>Samore</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">W</forename><surname>Chapman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">V</forename><surname>Gundlapalli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">BMC Bioinformatics</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">9</biblScope>
			<biblScope unit="page">12</biblScope>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Identifying risk factors for heart disease over time: Overview of 2014 i2b2/UTHealth shared task Track 2</title>
		<author>
			<persName><forename type="first">A</forename><surname>Stubbs</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Kotfila</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Özlem</forename><surname>Uzuner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Biomedical Informatics</title>
		<imprint>
			<biblScope unit="volume">58</biblScope>
			<biblScope unit="page" from="S67" to="S77" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text</title>
		<author>
			<persName><forename type="first">O</forename><surname>Uzuner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">R</forename><surname>South</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">L</forename><surname>DuVall</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American Medical Informatics Association</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="552" to="556" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Pooling annotated corpora for clinical concept extraction</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">B</forename><surname>Wagholikar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Torii</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Jonnalagadda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J. Biomedical Semantics</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page">3</biblScope>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">A survey on the role of negation in sentiment analysis</title>
		<author>
			<persName><forename type="first">M</forename><surname>Wiegand</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Balahur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Roth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Klakow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Montoyo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Workshop on Negation and Speculation in Natural Language Processing, NeSp-NLP &apos;10</title>
		<meeting>the Workshop on Negation and Speculation in Natural Language Processing, NeSp-NLP &apos;10<address><addrLine>Stroudsburg, PA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="60" to="68" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
