Automatic extraction of semantic relations between medical entities:
                     Application to the treatment relation

                  Asma Ben Abacha                               Pierre Zweigenbaum
                     LIMSI-CNRS                                     LIMSI-CNRS
             BP 133 - F-91403 Orsay Cedex                   BP 133 - F-91403 Orsay Cedex
            asma.benabacha@limsi.fr                                pz@limsi.fr


Abstract

Information extraction is a complex task which is necessary to develop
high-precision information retrieval tools. In this paper, we present MeTAE,
a platform to extract medical entities and the medical relations linking
them. The proposed approach relies on linguistic patterns and domain
knowledge and consists of two steps: (i) recognition of medical entities and
(ii) identification of the correct semantic relation between each pair of
entities. The first step is achieved by an enhanced use of MetaMap which
improves the precision obtained by MetaMap by 19.59 percentage points in our
evaluation. The second step relies on linguistic patterns which are built
semi-automatically from a corpus selected according to semantic criteria. We
evaluate our system’s ability to identify medical entities of 16 types. We
also evaluate the extraction of treatment relations between a treatment
(e.g., medication) and a problem (e.g., disease): we obtain 75.72% precision
and 60.46% recall. These results are encouraging compared with similar
research work in the literature.

1 Introduction

Medical knowledge is growing significantly every year. According to some
studies, the volume of this knowledge doubles every five years (Engelbrecht,
1997), or even every two years (Hotvedt, 1996). With large-scale
digitisation, several medical search engines have become available, such as
PubMed¹ for searching biomedical literature, CISMeF², a catalog and index of
French medical Web sites, or Health On the Net³, a public medical search
engine.

Although these search engines contribute greatly to making large volumes of
medical knowledge accessible, their users often have to deal with the burden
of browsing and filtering the numerous results of their queries in order to
find the precise information they were looking for. This point is more
crucial for practitioners, who may need an immediate answer to their queries
during their work.

In this context, we need systems able to respond to users’ queries with
precise answers. Such tools need deep analysis of biomedical documents in
order to extract relevant information. At the first level of this
information come the medical entities (e.g. diseases, drugs, symptoms). At
the second, more complicated level comes the extraction of semantic
relationships between these entities.

In this paper, we present our method to extract semantic relations between
medical entities, with an empirical study on the “treatment” relation. We
first propose an enhanced use of MetaMap (Aronson, 2001) to extract medical
entities and compare it with the simple application of MetaMap on the same
test corpora. To extract occurrences of the target relations, we then design
linguistic patterns based on selected sentences from PubMed Central
articles. We present a method to obtain such sentences by leveraging UMLS
Metathesaurus knowledge and MeSH indexing of PubMed Central. We evaluate
entity and relation extraction on a distinct corpus of 580 sentences and
obtain promising results. We also present MeTAE, a platform for automatic
semantic annotation and exploration of medical texts which incorporates
these information extraction components and lets a user query the obtained
information. We finally discuss our results and conclude on further work.

¹ http://www.pubmed.com
² http://www.chu-rouen.fr/cismef
³ http://www.healthonnet.org


2 Background

The reference tool for medical entity recognition is MetaMap (Aronson,
2001), a system which maps medical text to UMLS concepts. Using MetaMap
therefore provides a strong baseline to start with. MetaMap is able to
identify most concepts in the titles of articles from MEDLINE (Pratt and
Yetisgen-Yildiz, 2003). Meystre and Haug (2005) obtained good precision and
recall (0.753 and 0.892, respectively) with an approach based on MetaMap for
extracting “medical problems”.

However, the use of MetaMap leaves some residual problems at two levels: (i)
in the segmentation and the extraction of medical entities: MetaMap
considers some general words and some verbs as medical entities (e.g. best,
normal, take, reduce) and (ii) in the categorization of medical entities:
MetaMap may propose several concepts for the same term as well as several
semantic types for the same concept. We address these two issues in our
system by performing independent segmentation of the text given to MetaMap,
then imposing constraints on the semantic types of the concepts it detects.

Domain-independent relation extraction has been studied through a wide range
of approaches which can be classified in four categories: statistical
approaches based on term frequency and co-occurrence of specific terms
(Hindle, 1990), machine learning techniques (Zhu et al., 2009), linguistic
approaches (Hearst, 1992) (e.g. using manually written extraction rules) and
hybrid approaches which combine two or more of the preceding methods
(Suchanek et al., 2006).

In the medical domain, the same strategies can be found but the
specificities of the domain led to specialised methods. Cimino and Barnett
(1993) used linguistic patterns to extract relations from titles of Medline
articles. The authors used MeSH headings and co-occurrence of target terms
in the title field of a given article to construct relation extraction
rules. Khoo et al. (2000) focused on extracting causal relations from
abstracts of biomedical articles by aligning manually-constructed graph
patterns with syntactic dependency trees. Lee et al. (2003) used UMLS to
identify semantic relations between medical entities. Their first method
could extract 68% of the semantic relations in their test corpus, but when
several relations were possible between the relation arguments no
disambiguation was performed. Their second method (Lee et al., 2004)
targeted the precise extraction of “treatment” relations between drugs and
diseases. Manually written linguistic patterns were constructed from medical
abstracts about cancer. Their system reached 84% recall but an overall
48.14% precision. Embarek and Ferret (2008) proposed an approach to extract
four kinds of relations (Detect, Treat, Sign and Cure) between five kinds of
medical entities. The patterns used were constructed automatically using an
alignment algorithm which maps sentence parts using an edit distance
(defined between two sentences) and different word-level clues.

SemRep (Rindflesch et al., 2000), a natural language processing application,
targeted the extraction of semantic relationships in biomedical text through
a rule-based approach. SemRep (Fiszman et al., 2007) obtained 53% recall and
67% precision in identifying risk factors and biomarkers for diseases
asserted in MEDLINE citations. An enhanced version of SemRep (Ahlers et al.,
2007) was proposed to identify core assertions on pharmacogenomics and
obtained an overall 55% recall and 73% precision.

Domain-independent relation extraction methods are not directly applicable
to the medical domain, both because of the lack of domain-independent
markers that may help to recognise medical entities (e.g. capital letters,
regular grammatical structure) and because of the variety in the expression
of domain concepts (e.g. Amoxicillin = amoxycillin = AMOX). To bypass these
problems, medical relation extraction approaches often rely on domain
knowledge such as the UMLS Metathesaurus and Semantic Network. However, the
intended use of the extracted relations is not always taken into account in
the extraction procedure. For instance, if the extracted relations are to be
used in keyword querying systems, we should either give priority to recall
or give the same priority to recall and precision, while, if the final
application is a question answering system for practitioners, priority
should be given to the precision of extraction. Some medical relation
extraction approaches also do not extract the arguments of a relation (e.g.
(Lee et al., 2004)), or evaluate their approaches by counting relations
extracted with only one argument as correct (e.g. (Pustejovsky et al.,
2002)), considering that recall is the most important measure. In our
context we are interested in medical question answering systems as back-end
and give priority to precision, considering the correct extraction of
arguments as mandatory to validate the identified relations.

Most relation extraction methods rely on a corpus where example occurrences
of the target relations can be found. For instance, given pairs of seed
terms which are known to stand in the target relation, semi-supervised
methods such as that introduced in (Hearst, 1992) collect occurrences of
these term pairs in the corpus and use them to build relation patterns.

The selection of a relevant corpus is a key point here: for such a method to
work, the corpus must contain mentions of the target relationship between
these pairs of terms. We propose a method to increase the chances that such
mentions are actually found in the selected texts.

3 Annotation Method

Our method is twofold. In a first step, we extract medical entities from
sentences and determine their categories. In a second step, we extract
semantic relations between the extracted entities.

3.1 Medical Entity Recognition

By “medical entity”, we refer to an instance of a medical concept such as
Disease or Drug. Medical entity recognition consists of: (i) identifying
medical entities in the text and (ii) determining their categories. For
instance, in the following sentence “ACE inhibitors reduce major
cardiovascular disease outcomes in patients with diabetes.”, the medical
entity ACE inhibitors should be identified as a treatment and the medical
entity cardiovascular disease outcomes should be identified as a problem.

One of the most important obstacles to identifying medical entities is the
high terminological variation in the medical domain (e.g. Swine influenza =
swine flu = pig flu). MetaMap (Aronson, 2001) deals with this variation by
using morphological knowledge found in the UMLS Specialist Lexicon and term
variants present in the UMLS Metathesaurus. However, as mentioned in the
Background section, some issues must still be addressed. According to
empirical observations, the sentence and noun phrase segmentations provided
by MetaMap are less accurate than the segmentation provided by other,
non-specialized Natural Language Processing tools. Besides, a disambiguation
step is required on the obtained concepts.

To solve these problems, we propose a three-step approach:

  1. Split the biomedical texts into sentences and extract noun phrases with
     non-specialized tools. We use LingPipe⁴ and treetagger-chunker, which
     offer a better segmentation according to empirical observations.

  2. Determine medical entities as well as UMLS concepts and semantic types
     with MetaMap.

  3. Filter the obtained medical entities with (i) a list of the most
     frequent/noticeable errors and (ii) a restriction on the semantic types
     used by MetaMap in order to keep only semantic types which are sources
     or targets for the targeted relations (cf. Table 1).

⁴ http://alias-i.com/lingpipe/

  Category     Example Semantic Types
  Problem      Anatomical Abnormality, Injury or Poisoning,
               Disease or Syndrome
  Treatment    Pharmacologic Substance, Therapeutic or Preventive Procedure
  Test         Diagnostic Procedure, Laboratory Procedure

Table 1: Examples of categories and corresponding UMLS semantic types

3.2 Relation Extraction

Our approach is based on the use of linguistic patterns. For every pair of
medical entities, we collect the possible relations between their semantic
types in the UMLS Semantic Network (e.g. between the semantic types
Therapeutic or Preventive Procedure and Disease or Syndrome there are five
relations: treats, prevents, complicates, etc.). We construct patterns for
each relation type (cf. Section 3.3) and match them with the sentences in
order to identify the correct relation. The relation extraction process
relies on two criteria: (i) a degree of specialization associated with each
pattern and (ii) an empirically-fixed order associated with each relation
type, which determines the order in which the patterns are matched. We
target six relation types, described in Figure 1.
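
The filtering step of Section 3.1 (step 3 of the list) can be pictured with
a short sketch. It assumes that the MetaMap output has already been reduced
to (text, semantic type) pairs. The `Candidate` class and the
`filter_entities` function are illustrative names, not part of MetaMap or of
the system described here; the stoplist only contains the four words cited
in Section 2, the category map only covers the semantic types of Table 1,
and the semantic types given to the sample candidates are made up for the
example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Category map copied from Table 1 (example semantic types only).
CATEGORY_OF_TYPE = {
    "Anatomical Abnormality": "Problem",
    "Injury or Poisoning": "Problem",
    "Disease or Syndrome": "Problem",
    "Pharmacologic Substance": "Treatment",
    "Therapeutic or Preventive Procedure": "Treatment",
    "Diagnostic Procedure": "Test",
    "Laboratory Procedure": "Test",
}

# Frequent false positives; these four words are the examples of Section 2.
STOPLIST = {"best", "normal", "take", "reduce"}


@dataclass(frozen=True)
class Candidate:
    text: str           # term proposed as a medical entity
    semantic_type: str  # UMLS semantic type assigned to it


def categorize(candidate: Candidate) -> Optional[str]:
    """Return the category of a candidate, or None if it is filtered out."""
    if candidate.text.lower() in STOPLIST:
        return None  # filter (i): known error
    # filter (ii): semantic types outside the map yield None
    return CATEGORY_OF_TYPE.get(candidate.semantic_type)


def filter_entities(candidates: List[Candidate]) -> List[Tuple[str, str]]:
    """Keep (text, category) pairs for candidates that pass both filters."""
    kept = []
    for candidate in candidates:
        category = categorize(candidate)
        if category is not None:
            kept.append((candidate.text, category))
    return kept


# Hypothetical candidates for the example sentence of Section 3.1.
candidates = [
    Candidate("ACE inhibitors", "Pharmacologic Substance"),
    Candidate("reduce", "Therapeutic or Preventive Procedure"),
    Candidate("diabetes", "Disease or Syndrome"),
    Candidate("patients", "Patient or Disabled Group"),
]
print(filter_entities(candidates))
```

In this toy run the verb is removed by the stoplist and the patient mention
by the semantic type restriction, so only a treatment and a problem remain.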


  Relation     Number of patterns   Simplified example
  causes       28                   ... E1 may trigger E2 ...
  diagnoses    12                   E1 is the best test for (the diagnoses of)? E2
  treats       46                   ... E1 was found to reduce E2 ...
  prevents     13                   ... E1 for prophylaxis against E2 ...

Table 2: Examples of relation patterns
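
To make the matching procedure of Section 3.2 more concrete, the sketch
below applies patterns in the style of Table 2 to a sentence in which the
two entities have been replaced by the placeholders E1 and E2. It is only
one plausible reading of the two ordering criteria: the relation order, the
specialization scores, the extra generic "reduce" pattern and all function
names are assumptions made for the illustration, not values or code from the
system.

```python
import re
from typing import NamedTuple, Optional, Set


class Pattern(NamedTuple):
    relation: str
    regex: re.Pattern
    specialization: int  # higher means more specific (assumed scale)


# Hypothetical order in which relation types are tried.
RELATION_ORDER = ["prevents", "treats", "causes"]

PATTERNS = [
    Pattern("causes", re.compile(r"\bE1 may trigger E2\b"), 2),
    Pattern("treats", re.compile(r"\bE1 was found to reduce E2\b"), 3),
    # Looser pattern added for the example; it is not taken from Table 2.
    Pattern("treats", re.compile(r"\bE1\b.*\breduces?\b.*\bE2\b"), 1),
    Pattern("prevents", re.compile(r"\bE1 for prophylaxis against E2\b"), 3),
]


def abstract_entities(sentence: str, source: str, target: str) -> str:
    """Replace the two entity mentions by the placeholders E1 and E2."""
    return sentence.replace(source, "E1").replace(target, "E2")


def extract_relation(sentence: str, source: str, target: str,
                     candidates: Set[str]) -> Optional[str]:
    """Return the first relation whose pattern matches, or None.

    `candidates` stands for the relations allowed between the semantic
    types of the two entities. Patterns are tried by relation order first,
    then from the most to the least specialized.
    """
    text = abstract_entities(sentence, source, target)
    ordered = sorted(
        (p for p in PATTERNS if p.relation in candidates),
        key=lambda p: (RELATION_ORDER.index(p.relation), -p.specialization),
    )
    for pattern in ordered:
        if pattern.regex.search(text):
            return pattern.relation
    return None


sentence = ("ACE inhibitors reduce major cardiovascular disease outcomes "
            "in patients with diabetes.")
print(extract_relation(sentence, "ACE inhibitors",
                       "cardiovascular disease outcomes",
                       {"treats", "prevents"}))
```

With these assumed patterns the example sentence of Section 3.1 is labelled
treats, and a pair for which no allowed pattern matches yields no relation,
which favours precision over recall.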


Figure 1: Excerpt of the Relations Ontology

3.3 Pattern Construction

Semantic relations are not always expressed with explicit words such as
treat or prevent. They are also frequently expressed with combined and
complex expressions. Therefore, it is difficult to build patterns which can
cover all relevant expressions. However, the use of patterns is one of the
most effective methods for automatic information extraction from textual
corpora if they are efficiently designed (Cimino and Barnett, 1993; Lee et
al., 2004; Embarek and Ferret, 2008).

To build patterns for a target relation R, we used a corpus-based strategy
akin to that of (Hearst, 1992) and followers. We illustrate it with the
treats relation. To apply this strategy we first need seed terms
corresponding to pairs of concepts known to stand in the target relation R.
To obtain such pairs, we extracted from the UMLS Metathesaurus all the pairs
of concepts connected by the relation R. For instance, for the treats
Semantic Network relation, the Metathesaurus contains 45,145
treatment-problem pairs linked with the “may treat” Metathesaurus relation
(e.g. Diazoxide may treat Hypoglycemia).

We then need a corpus of texts where occurrences of both terms of each seed
pair will be looked for. We build this corpus by querying the PubMed Central
database⁵ (PMC) of biomedical articles with focused queries. These queries
try to identify articles that have high chances of containing the target
relation between the two seed concepts. We aimed to optimize precision,
therefore we applied the following principles.

⁵ http://www.ncbi.nlm.nih.gov/pmc/

  • Since PMC, like PubMed, is indexed with MeSH headings, we restrict our
    set of seed concepts to those which can be expressed by a MeSH term.

  • We impose a MeSH-based search mode on PMC by adding the /MH qualifier to
    the concepts.

  • We also want these concepts to play an important role in the article.
    One way to specify this is to ask for them to be ‘major topics’ of the
    paper they index ([MAJR] field in PubMed or PMC; note that this implies
    /MH).

  • Finally, the target relation should be present between the two
    concepts. MeSH and PMC provide a way to approximate a relation: some of
    the MeSH subheadings (e.g., therapy or prevention and control) can be
    taken as representing underspecified relations, where only one of the
    concepts is provided. For instance, Rhinitis, Vasomotor/TH can be seen
    as describing a treats relation (/TH) between some unspecified
    treatment and a rhinitis. Unfortunately, MeSH indexing does not allow
    the expression of full binary relations (i.e., linking two concepts),
    so we had to keep this approximation.

Queries are thus designed according to the following model:
<problem>/TH[MAJR] and <treatment>/MH.

They are submitted to PMC to obtain full-text articles on the required
topics. This method should increase the chances of obtaining sentences where
one of the reference relations occurs, and provides a large variety of
expressions of the target relation.

The resulting corpus contains a set of medical articles in XML format. From
each article we construct a text file by extracting relevant fields such as
the title, the summary and the body (if they are available). Then, we split
every text into sentences using the segmentation model of the LingPipe
project. We apply MetaMap on each sentence and keep the sentences which
contain at least one pair of concepts (c1, c2) connected by the target
relation R according to the Metathesaurus.

This semantic pre-analysis reduces the manual effort required for subsequent
pattern construction, which allows us to enrich the patterns and to increase
their number. The patterns constructed from these sentences consist of
regular expressions taking into account the occurrence of medical entities
at precise positions. Table 2 presents the number of patterns constructed
for each relation type and some simplified examples of regular expressions.
A similar process was performed to extract a different set of articles for
our evaluation.

4 Evaluation

In this section, we present our evaluation method and the obtained results
for medical entity recognition and the extraction of treatment relations.

4.1 Evaluation Method

To build an evaluation corpus, we queried PubMed Central with MeSH queries
(e.g. Rhinitis, Vasomotor/th[MAJR] AND (Phenylephrine OR Scopolamine OR
tetrahydrozoline OR Ipratropium Bromide)). Then we chose a subset of 20
varied articles (e.g. reviews, comparative studies). We verified that no
article of the evaluation corpus is used in the pattern construction
process. The last stage of preparation was the manual annotation of medical
entities and treatment relations in these 20 articles (total = 580
sentences). Figure 2 shows an example of an annotated sentence.

We use the standard measures of recall, precision and F-measure. The
precision of named entity recognition depends both on the textual boundaries
of the extracted entity and on the correctness of its associated category
(semantic type). In our evaluation, boundary-only errors cost half a point
and the precision is calculated according to the following formula:

    \[ \mathrm{Precision} = \frac{C + 0.5 \times B + 0 \times T}{N} \qquad (1) \]

  • C: number of correct entities.

  • B: number of entities with correct semantic type but incorrect
    boundaries.

  • T: number of entities with wrong semantic types.

  • N: total number of retrieved entities (C + B + T = N).

The recall of named entity recognition was not measured due to the
difficulty of manually annotating all the medical entities in our corpus.
For the relation extraction evaluation, recall is the number of correct
treatment relations found divided by the total number of treatment
relations. Precision is the number of correct treatment relations found
divided by the number of treatment relations found.

4.2 Results

Table 3 shows the precision of medical entity recognition obtained by our
entity extraction approach, LTS+MetaMap (text-to-sentence segmentation with
LingPipe, sentence-to-noun-phrase segmentation with treetagger-chunker, and
stoplist filtering), compared to the simple use of MetaMap. Entity type
errors are denoted by T, boundary-only errors are denoted by B and precision
is denoted by P.

The LTS+MetaMap method led to a significant increase in the precision of
medical entities recognized by MetaMap. In particular, LingPipe outperformed
MetaMap in sentence segmentation on our test corpus. LingPipe found 580
correct sentences where MetaMap found 743 sentences containing boundary
errors, and some sentences were even cut in the middle of medical entities
(most often due to abbreviations). A qualitative study

  <relation>
      <name>treat</name>
      <sentence>A subsequent study of patients with
           cSSSI also found that daptomycin resulted
           in faster clinical improvement</sentence>
      <status>established-known</status>
      <source>daptomycin</source>
      <target>cSSSI</target>
  </relation>

Figure 2: Example of manual annotations
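
Annotations in the format of Figure 2 are plain XML, so they can be loaded
with a standard parser. The sketch below uses Python's
`xml.etree.ElementTree` and assumes that each `<relation>` element carries
exactly the five child elements shown in the figure; the
`RelationAnnotation` class and the `parse_relation` function are
illustrative names, not part of the annotation tooling.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# The annotation of Figure 2, reproduced as a string.
ANNOTATION = """
<relation>
    <name>treat</name>
    <sentence>A subsequent study of patients with
         cSSSI also found that daptomycin resulted
         in faster clinical improvement</sentence>
    <status>established-known</status>
    <source>daptomycin</source>
    <target>cSSSI</target>
</relation>
"""

FIELDS = ("name", "sentence", "status", "source", "target")


@dataclass
class RelationAnnotation:
    name: str
    sentence: str
    status: str
    source: str
    target: str


def parse_relation(xml_text: str) -> RelationAnnotation:
    """Read one <relation> element into a RelationAnnotation."""
    root = ET.fromstring(xml_text.strip())

    def field(tag: str) -> str:
        # Collapse line breaks and indentation inside the element text.
        return " ".join(root.findtext(tag, default="").split())

    return RelationAnnotation(*(field(tag) for tag in FIELDS))


annotation = parse_relation(ANNOTATION)
print((annotation.source, annotation.name, annotation.target))
```

The printed tuple is the <source,relation,target> view of the annotation,
with the sentence kept alongside as supporting evidence.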


                                     LTS + MetaMap              MetaMap
  Semantic type                     Tr      Br      P        Tr      Br      P
  Disease or Syndrome               9.81   26.48   76.94     9.09   52.27   64.77
  Injury or Poisoning              26.19   35.71   55.95    33.33   34.84   49.24
  Neoplastic Process               37.50   12.50   56.25    29.03    6.45   67.74
  Anatomical Abnormality           40.00    0.00   60.00    85.71    0.00   14.28
  Cell or Molecular Dysfunction    44.44   44.44   27.79    66.66   25.00   20.83
  Total                            12.23   27.10   74.21    30.08   30.52   54.62

Table 3: Medical entity extraction according to semantic types. Tr = T/N,
type error rate; Br = B/N, boundary error rate; P = precision. All results
are percentages.
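
The precision of Equation (1) can be computed either from the raw counts or
from the two error rates reported in Table 3: since C + B + T = N, we have
C/N = 1 - Tr - Br and therefore P = 1 - Tr - 0.5 × Br. The sketch below
shows both forms; the function names and the counts in the usage lines are
made up for the illustration.

```python
def entity_precision(correct: int, boundary_only: int, wrong_type: int) -> float:
    """Equation (1): boundary-only errors score 0.5, type errors score 0."""
    total = correct + boundary_only + wrong_type  # N = C + B + T
    if total == 0:
        raise ValueError("no retrieved entities")
    return (correct + 0.5 * boundary_only + 0.0 * wrong_type) / total


def precision_from_rates(type_error_rate: float, boundary_error_rate: float) -> float:
    """Same measure from Tr and Br given as percentages: P = 100 - Tr - 0.5 * Br."""
    return 100.0 - type_error_rate - 0.5 * boundary_error_rate


# Hypothetical counts: 6 correct, 2 boundary-only errors, 2 type errors.
print(entity_precision(6, 2, 2))
# Rates of the Neoplastic Process row, LTS + MetaMap column.
print(precision_from_rates(37.5, 12.5))
```

The second call reproduces the 56.25 reported for Neoplastic Process under
LTS + MetaMap.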


of the noun phrases extracted by MetaMap and
Treetagger-chunker also shows that the latter pro-
duces less boundary errors.
   For the extraction of treatment relations, we
obtained 60.46% recall, 75.72% precision and
67.23% F-measure. Other relevant approaches to
our work like (Lee et al., 2004) obtained 84%
recall, 48.14% precision and 61.20% F-measure
for the extraction of treatment relations. Semrep                Figure 3: MetAE - Annotation Interface
(Ahlers et al., 2007) obtained 54% recall, 84%
precision and 68.21% F-measure on a set of pred-
ications including the treatment relationship (i.e.
administrated to, manifestation of, treats). How-
ever, given the differences in corpora and in the na-
ture of relations, these comparisons must be con-
sidered with caution.


5 Annotation and exploration platform:
  MeTAE

We implemented our approach in the MeTAE6                        Figure 4: MeTAE - Exploration Interface
platform which allows to annotate medical texts
or files and writes the annotations of medical en-           6 Discussion
tities and relationships in RDF format in exter-
nal supports (cf. Figure 3). MeTAE allows also               Several semantic relation extraction approaches
to explore sematically the available annotations             only address relation detection (e.g. find that
through a form-based interface. User queries are             a sentence contains the searched relation (Lee
reformulated in SPARQL language according to a               et al., 2004)).     In the context of medical
domain ontology which defines the semantic types             question-answering systems, we are not only in-
associated to the medical entities and the seman-            terested in relation detection but also in the
tic relationships with their possible domains and            linked medical entities. We focus on search-
ranges. Answers consist in sentences whose anno-             ing <source,relation,target> triples such that the
tations conform to the user query and their corre-           source and the target have known categories (se-
sponding documents (cf. Figure 4).                           mantic types) and such that the relation is valid
w.r.t. domain knowledge and w.r.t. linguistic considerations (i.e. the sentence
really says that the source treats the target). In this context, the same
sentence may contain several triples <source,relation,target>.

   A first analysis of the false positives shows that the main causes of error
are: (i) errors in the extraction of medical entities, (ii) patterns of the
treatment relation that also cover expressions of other relations and
(iii) sentences that contain possible source and target entities that are not
connected by the treatment relation.

   We obtained good results in precision and F-measure compared to other
semantic relation extraction approaches. This meets our initial objective,
which is to achieve high precision in relation extraction in order to build
efficient question-answering systems.

7 Conclusion

In this paper, we presented a knowledge-based and linguistic approach for the
extraction of medical entities and the semantic relations linking them. This
approach is based on two main steps: (i) the recognition of medical entities
with an enhanced use of MetaMap and (ii) the exploitation of linguistic
patterns that take into account the semantic types of medical entities. The
results obtained on a real test corpus show the effectiveness of our approach
and its advantages for question-answering systems.

   In the short term, we intend to study the false negatives in order to
improve our patterns. We also intend to design a method that automatically
extracts contextual information such as the status of the relation (e.g.
hypothetical, established-known) and information about patients (e.g. gender,
age).

   6
     An enhanced version of the platform MeTAE will be available online very
shortly at http://www.limsi.fr/Individu/abacha/metae.html



