=Paper=
{{Paper
|id=Vol-2470/p31
|storemode=property
|title=Automatic detection of contraindications of medicines in package leaflet
|pdfUrl=https://ceur-ws.org/Vol-2470/p31.pdf
|volume=Vol-2470
|authors=Jonas Žalinkevičius,Rita Butkienė
|dblpUrl=https://dblp.org/rec/conf/ivus/ZalinkeviciusB19
}}
==Automatic detection of contraindications of medicines in package leaflet==
https://ceur-ws.org/Vol-2470/p31.pdf
Automatic Detection of Contraindications of Medicines in Package Leaflet
Jonas Žalinkevičius, Faculty of Informatics, Kaunas University of Technology, Kaunas, Lithuania, jonas.zalinkevicius@hotmail.com
Rita Butkienė, Faculty of Informatics, Kaunas University of Technology, Kaunas, Lithuania, rita.butkiene@ktu.lt
Abstract: Before physicians prescribe medicines, they must take into consideration the patient's diseases and the medicines the patient already uses, in order to avoid complications. All information about possible contraindications is written in the medicine package leaflet. This paper presents a system that automatically detects contraindication mentions in the Lithuanian text of a leaflet by applying natural language parsing. The system can shorten the time needed for prescription decision making. The results of the experiment showed that the created system successfully detected 56 per cent of contraindications.
Keywords: medicine contraindications, drug–drug interactions, shallow parsing, morphological analysis, noun phrase detection
I. INTRODUCTION
When a patient is diagnosed with a new disease, the physician additionally asks the patient about allergies, previous health problems, chronic diseases, and the medications and food supplements the patient is using. After taking the gathered information into consideration and evaluating possible contraindications with the prescribed medication, the physician assigns treatment and, if needed, changes previous assignments. Almost all information about contraindications can be found in the medicine package leaflet. According to Lithuania's medicines registration procedure [1], every package must have a leaflet written in Lithuanian. Information in the leaflet must be divided into six sections [2], although the text within a section may be unstructured. So, if a physician needs to find possible contraindications, the physician must read all text in the second section (Table 1) or search for information on the Internet. Health care information usually consists of unstructured data, which leads to inaccurate search results containing hundreds of links to irrelevant documents, and the user must read through the results to find the relevant information.
Automatic information extraction tools can extract biomedical data, save it in a structured way, and reduce the information search problem. However, automatic text analysis and information extraction from unstructured text in the medical domain is a challenging task [3]. The aim of this paper is to present a system that gives physicians a faster and more accurate way of finding contraindications through automated contraindication detection in the medicine package leaflet.
A system that automates the extraction of contraindications from leaflet text is described in Section 3. Using this system, all leaflets of medicines registered in Lithuania were analyzed. The results of this analysis (the extracted contraindications) are used in a commercial medication information system that Lithuanian physicians use for prescribing medications. The evaluation of the obtained results is presented in Section 4.
II. RELATED WORK
In Lithuania, each registered medicine must contain a package leaflet describing therapeutic indications, possible contraindications, safety precautions, and usage information in the Lithuanian language. To be sure that the patient does not suffer from a possible contraindication, the physician should read through the whole leaflet text before prescribing the medicine. Analysis of leaflets is usually time-consuming, so physicians tend to skip it and rely on the knowledge and experience they have gained.
Many systems have been developed for analysis and information extraction from biomedical text in English. But there is no solution for the detection of contraindication mentions (i.e. contraindication with a disease or contraindication with a pharmacological group) in text written in Lithuanian. We have analyzed articles that describe similar problems in the analysis of biomedical text. For example, the tool Semantator [4] was created for converting biomedical text to linked data. It used ontology-based information extraction using biomedical ontology terms hosted in BioPortal and the ontology editor Protégé for text preprocessing. The semantic annotation and inference platform SENTIENT-MD [3] creates a dependency graph as the first step of dependency parsing, which is one of the tasks of semantic annotation of medical knowledge in natural language text. Markus Bundschus [5] used probabilistic graphical models (Conditional Random Fields) to identify semantic relations.
Although all these authors work on texts written in English, we found that common rules and approaches could be applied to Lithuanian texts as well. In order to extract information from text, preprocessing using natural language processing is needed: text segmentation and morphological analysis should be performed, and then a syntactic parse tree or a dependency graph [6], [7] should be formed. For semantic relation detection, existing ontologies or knowledge bases should be used.
© 2019 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0)
III. SYSTEM DESCRIPTION
In this section, a system for the detection of contraindication mentions in medicine leaflet text written in Lithuanian is presented. The system implements a text analysis pipeline of four stages: extraction of the contraindication text block, morphological analysis, noun phrase detection, and annotation.
Additionally, every annotated phrase is checked against a database of noun phrases to be ignored. This database is filled manually and helps to obtain more precise results. The overall pipeline for the detection of contraindication mentions is shown in Fig. 1.
Below, each stage of text analysis is discussed in more detail.
A. Extraction of contraindication text blocks
In Lithuania, when describing a medicine, a producer must follow a certain template of the package leaflet [2]. This template splits the leaflet description into 6 sections, listed in Table 1.
B. Morphological analysis
Morphological analysis forms the basis for extracting information about contraindications. In this stage, a given text is split into lexical units (e.g. sentences, lexemes) and analyzed morphologically. For this task, a web service provided by the system “http://semantika.lt” [8] is used. The web service returns morphological features for each given lexeme: part of speech, gender, number and so on.
C. Noun phrase detection
Phrases that express a specific contraindication are usually noun phrases, for example, heart attack, type one diabetes, pancreatitis, and so on. Therefore, we chose a phrase structure grammar method, because it is better suited to noun phrase detection than dependency grammar, as suggested by Axel Halvoet in his monograph [9]. Phrase structure rules are used to split a sentence written in natural language into its constituent parts: lexical and phrasal categories [9], [10], [11]. For noun phrase detection in the medicine leaflet, three phrase structure rules were specified (see Table 2).
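As an illustration of the noun phrase detection stage, the first rule of Table II (a noun in the genitive case joins a noun phrase when it follows another genitive noun, an adjective, a numeral or a participle) could be sketched roughly as below. This is only a minimal sketch under assumptions: the `Lexeme` class, the tag names ("noun", "genitive", and so on) and the function names are hypothetical and do not reflect the actual output format of the morphological web service or the authors' implementation, and the other two rules are not covered.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Lexeme:
    """Hypothetical container for one morphologically analyzed lexeme."""
    text: str
    pos: str                    # assumed tag set: "noun", "adjective", "numeral", ...
    case: Optional[str] = None  # e.g. "genitive"; None when case does not apply


# Parts of speech that, under rule 1, may precede a genitive noun in a phrase.
RULE1_PREDECESSORS = {"adjective", "numeral", "participle"}


def rule1_applies(prev: Lexeme, cur: Lexeme) -> bool:
    """Return True if `cur` joins the phrase that contains `prev` under rule 1."""
    if cur.pos != "noun" or cur.case != "genitive":
        return False
    prev_is_genitive_noun = prev.pos == "noun" and prev.case == "genitive"
    return prev_is_genitive_noun or prev.pos in RULE1_PREDECESSORS


def chunk_by_rule1(lexemes: List[Lexeme]) -> List[List[Lexeme]]:
    """Group adjacent lexemes linked by rule 1; unlinked lexemes stay alone."""
    chunks: List[List[Lexeme]] = []
    for lex in lexemes:
        if chunks and rule1_applies(chunks[-1][-1], lex):
            chunks[-1].append(lex)
        else:
            chunks.append([lex])
    return chunks


# Illustrative input (hand-tagged, not service output):
# "nuo sunkaus širdies nepakankamumo" ~ "from severe heart failure"
sample = [
    Lexeme("nuo", "preposition"),
    Lexeme("sunkaus", "adjective", "genitive"),
    Lexeme("širdies", "noun", "genitive"),
    Lexeme("nepakankamumo", "noun", "genitive"),
]
phrases = [" ".join(l.text for l in c) for c in chunk_by_rule1(sample)]
```

In this sketch a chunk grows only while rule 1 links the current lexeme to the previous one, so the preposition stays on its own and the three genitive forms end up in a single candidate phrase. A fuller version would combine all three rules before passing candidates to the annotation stage.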
TABLE I. MEDICINE PACKAGE LEAFLET SECTIONS
No Section
1 What X is and what it is used for
TABLE II. NOUN PHRASE STRUCTURE RULES
No Rule
1 A lexeme is a part of a noun phrase if it is a noun in the genitive case and follows another noun in the genitive case or adjective or numeral or participle.
A lexeme is a part of a noun phrase if it is an attributive adjective
2 What you need to know before you