<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Marker words for negation and speculation in health records and consumer reviews</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maria Skeppstedt</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carita Paradis</string-name>
          <email>carita.paradis@englund.lu.se</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andreas Kerren</string-name>
          <email>andreas.kerren@lnu.se</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff2">
          <institution>Gavagai AB</institution>
          ,
          <addr-line>Stockholm</addr-line>
          ,
          <country country="SE">Sweden</country>
          <email>maria@gavagai.se</email>
        </aff>
        <aff id="aff0">
          <label>0</label>
          <institution>Centre for Languages and Literature, Lund University</institution>
          ,
          <addr-line>Lund</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Computer Science Department, Linnaeus University</institution>
          ,
          <addr-line>Växjö</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Conditional random fields were trained to detect marker words for negation and speculation in two corpora belonging to two very different domains: clinical text and consumer review text. For the corpus of clinical text, marker words for speculation and negation were detected with results in line with previously reported inter-annotator agreement scores. This was also the case for speculation markers in the consumer review corpus, while detection of negation markers was unsuccessful in this genre. A setup in which models were trained on markers in consumer reviews and applied to the clinical text genre also yielded low results. This shows that neither the trained models, nor the choice of appropriate machine learning algorithms and features, were transferable across the two text genres.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        When health professionals document patient
status, they often record common symptoms that the
patient is not showing, or reason about possible
diagnoses. Clinical texts, therefore, contain a large
amount of negation and speculation
        <xref ref-type="bibr" rid="ref19">(Velupillai et
al., 2011)</xref>
        .
      </p>
      <p>
        Negations and speculations are also expressed
in consumer review texts, e.g., when the reviewed
artefact lacks an expected feature, or when
reviewers are uncertain of their opinion. Previous
research shows that the proportion of sentences
containing negation and speculation is even larger in
consumer review texts than in clinical texts
        <xref ref-type="bibr" rid="ref10 ref21">(Vincze
et al., 2008; Konstantinova et al., 2012)</xref>
        .
      </p>
      <p>
        The BioScope corpus was one of the first
clinical corpora annotated for negation and
speculation
        <xref ref-type="bibr" rid="ref21">(Vincze et al., 2008)</xref>
        . The guidelines used
for the BioScope corpus have later, with only a
few modifications, been used for annotating
consumer review texts. A qualitative analysis of the
difference between the medical genres of the
BioScope corpus and consumer review texts has
previously been carried out in order to adapt the
guidelines to the genre of review texts
        <xref ref-type="bibr" rid="ref9">(Konstantinova
and de Sousa, 2011)</xref>
        . To the best of our
knowledge, there are, however, no previous studies in
which the same machine learning algorithm is
applied to both corpora and the results are compared.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2 Background</title>
      <p>
        There are other medical corpora annotated with
the same guidelines as the BioScope corpus
        <xref ref-type="bibr" rid="ref21">(Vincze et al., 2008)</xref>
        , e.g., a drug-drug
interaction corpus
        <xref ref-type="bibr" rid="ref5">(Bokharaeian et al., 2014)</xref>
        . There
are also medical corpora annotated according
to other guidelines, e.g., guidelines that include
more fine-grained categories, such as weaker
or stronger speculation/uncertainty
        <xref ref-type="bibr" rid="ref20">(Velupillai,
2012)</xref>
        , or whether a clinical finding is
conditionally or hypothetically present in the patient
        <xref ref-type="bibr" rid="ref18">(Uzuner et al., 2011)</xref>
        . Large annotated corpora are
often constructed on English medical text, e.g., the
i2b2/VA challenge on concepts, assertions, and
relations corpus, but negation and speculation have
also been annotated in corpora with clinical text
written in, e.g., Swedish
        <xref ref-type="bibr" rid="ref20">(Velupillai, 2012)</xref>
        and
Japanese
        <xref ref-type="bibr" rid="ref3">(Aramaki et al., 2014)</xref>
        .
      </p>
      <p>
        Examples of non-medical corpora are the
previously mentioned corpus of consumer reviews
        <xref ref-type="bibr" rid="ref9">(Konstantinova and de Sousa, 2011)</xref>
        , and literary
texts annotated for negation in the *SEM shared
task
        <xref ref-type="bibr" rid="ref11">(Morante and Blanco, 2012)</xref>
        .
      </p>
      <p>Negations and speculations are often annotated
in two steps. First, marker words (often also
referred to as cue words or keywords) for
negation/speculation are annotated, and then either the
scope of text that the marker words affect is
annotated, or whether specific focus words occurring
in the text are affected by the marker words. Focus
words could, for instance, be clinical findings that
are mentioned in the same sentence as the marker
words. Automatic detection of negation and
speculation is typically divided into two subtasks
corresponding to the two annotation steps. That is,
first the marker words are detected and, thereafter,
the task of determining the scope or classifying the
focus words is carried out.</p>
      <p>
        In this study, the first of the two subtasks of
negation/speculation detection is addressed, i.e.,
the detection of marker words for negation and
speculation. This task is typically addressed
using one of two main approaches: either a vocabulary of
negation/speculation markers is compiled and
tokens in the text are compared to this vocabulary in
order to determine whether they are marker words
        <xref ref-type="bibr" rid="ref2 ref6">(Chapman et al., 2001; Ahltorp et al., 2014)</xref>
        , or
alternatively a machine learning model is trained.
      </p>
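      <p>The first, vocabulary-based approach can be illustrated with a minimal sketch; the marker expressions and the example sentence below are invented for illustration and are not the vocabularies of the cited systems:</p>
      <preformat>
```python
# Minimal sketch of the vocabulary-based approach to marker detection:
# tokens in the text are compared to a compiled list of negation and
# speculation marker expressions. The vocabularies below are short
# illustrative examples, not the actual lists of the cited systems.

NEGATION_MARKERS = {("no",), ("not",), ("without",), ("ruled", "out")}
SPECULATION_MARKERS = {("may",), ("possible",), ("suggestive", "of")}

def find_markers(tokens, vocabulary):
    """Return (start, end) token spans that match a marker expression."""
    spans = []
    for i in range(len(tokens)):
        for expression in vocabulary:
            n = len(expression)
            if tuple(t.lower() for t in tokens[i:i + n]) == expression:
                spans.append((i, i + n))
    return spans

tokens = "Pneumonia was ruled out and no infiltrate was seen".split()
print(find_markers(tokens, NEGATION_MARKERS))  # [(2, 4), (5, 6)]
```
      </preformat>
      <p>Matching multi-token expressions such as "ruled out" in this way handles only fixed phrases; discontinuous or inflected markers would require a richer matching strategy.</p>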
    </sec>
    <sec id="sec-3">
      <title>3 Materials</title>
      <p>
        Two English corpora were used in the
experiments, the BioScope corpus
        <xref ref-type="bibr" rid="ref21">(Vincze et al., 2008)</xref>
        and the SFU Review corpus annotated for
negation and speculation
        <xref ref-type="bibr" rid="ref10">(Konstantinova et al., 2012)</xref>
        .
      </p>
      <p>As previously mentioned, the annotation
guidelines for the SFU Review corpus were an
adaptation of the guidelines for the BioScope corpus,
and they were, therefore, very similar. In both
corpora, marker words expressing negation and
speculation were annotated, as well as their scope. The
general principle for the length of text to
annotate as marker words was to annotate the minimal
unit of text that still expresses negation or
speculation. The definition of negation used for the
task was “[...] the implication of the non-existence
of something”, while speculation was defined as
“[...] the possible existence of a thing, i.e. neither
its existence nor its non-existence is unequivocally
stated [...]”. Marker words could either be
individual words that express negation or speculation on
their own, e.g., “This {may} {indicate}...”, or
complex expressions containing several words that do
not convey negation or speculation on their own,
e.g., “This {raises the question of}...”.</p>
      <p>The BioScope corpus consists of three
subcorpora, containing clinical text, biological full
papers and biological scientific abstracts. For
this study, the subcorpus containing clinical text
was used, which consists of 6,400 sentences of
which 14% contain negation and 13% contain
speculation. The pairwise agreement rates for
the three annotators involved in the project were
91/95/96 for annotating marker words for negation
and 84/90/92 for marker words for speculation.</p>
      <p>
        The corpus of consumer reviews was a
previously compiled corpus, the SFU Review corpus,
to which annotations of negation and speculation
were added. The corpus contains consumer
generated reviews of books, movies, music, cars,
computers, cookware and hotels
        <xref ref-type="bibr" rid="ref16 ref17">(Taboada and Grieve,
2004; Taboada et al., 2006)</xref>
        . The corpus
consists of 17,000 sentences, of which 18% was
annotated as containing negation and 22% as
containing speculation. 10% of the corpus was doubly
annotated to measure inter-annotator agreement,
resulting in an F-score and Kappa score of 92 for
negation markers and 89 for speculation markers.
      </p>
      <p>
        There are previous studies on the detection of
speculation and negation markers in these two
corpora. A perfect precision and a recall of 0.98
were obtained when training an IGTree classifier
to detect negation markers on the full paper
subcorpus of the BioScope corpus and evaluating it on
the clinical sub-corpus
        <xref ref-type="bibr" rid="ref13">(Morante and Daelemans,
2009b)</xref>
        . Similar results for detecting negation
markers in the clinical sub-corpus were achieved
by a vocabulary matching system. When using
the same set-up for detecting speculation markers,
i.e., training on the paper sub-corpus and
evaluating on the clinical, a precision of 0.88 and a recall
of 0.27 were achieved
        <xref ref-type="bibr" rid="ref12">(Morante and Daelemans,
2009a)</xref>
        . For these experiments, the token to be
classified, as well as its immediate neighbouring
tokens were used as features. When instead
training as well as evaluating on the clinical sub-corpus
(a conditional random fields model with tokens as
features), a precision of 0.99 and a recall of 0.87
were achieved for detecting speculation, while a
rule-based vocabulary matching system achieved a
precision of 0.95 and a recall of 0.96 on this task
        <xref ref-type="bibr" rid="ref1">(Agarwal and Yu, 2010)</xref>
        . Examples of other
results reported are a precision/recall of 0.97/0.98
for negation markers and 0.96/0.93 for
speculation markers
        <xref ref-type="bibr" rid="ref8">(Cruz Díaz et al., 2012)</xref>
        , using a C4.5
classifier and a support vector machine.
      </p>
      <p>
        There is also previous research on the
detection of which tokens constitute negation and
speculation markers in the SFU Review corpus
        <xref ref-type="bibr" rid="ref7">(Cruz et al., 2015)</xref>
        . Experiments were conducted
in which 10-fold cross-validation was applied on
the entire corpus, and a feature set that included
the token and its closest neighbours was used. For
the most successful machine learning algorithm
(a cost-sensitive support vector machine), a
precision of 0.80 and a recall of 0.98 were obtained
for negation and a precision of 0.91 and a recall
of 0.94 were obtained for speculation. For the
two other evaluated algorithms (Naive Bayes and
a support vector machine with a radial basis
function kernel), much lower and slightly lower
results, respectively, were obtained. Both of these
two lower-performing models had problems
handling multi-word markers for negation that
included n’t or not, and results for these two
models were improved by a simple rule-based
postprocessing algorithm specifically designed to
handle these cases.
      </p>
    </sec>
    <sec id="sec-4">
      <title>4 Experiments</title>
      <p>Experiments consisted of training machine
learning models to recognise markers for negation and
speculation and, thereafter, evaluating these
models. Three setups were used: i) models trained
on a subset of the BioScope corpus and evaluated
on another subset of the same corpus, ii) models
trained on a subset of the SFU Review corpus and
evaluated on another subset of this corpus, and
finally iii) models trained on the SFU Review
corpus and evaluated on the BioScope corpus. The
rationale for performing the last experiment was
the difficulty that is often associated with getting
access to large amounts of clinical text, due to the
sensitive content of text belonging to this genre.
If a model trained on non-clinical text could be
successfully applied to the clinical text genre, this
might offer a solution in cases where the amount
of available clinical data is scarce.</p>
      <p>
        The text segments annotated as negation- and
speculation markers were coded according to the
BIO-format, i.e., a token could be the beginning
of, inside or outside of a marker segment. The
approach of structured prediction was taken, and the
PyStruct package was used
        <xref ref-type="bibr" rid="ref14">(Müller and Behnke,
2014)</xref>
        to train a linear conditional random fields
model, using the OneSlackSSVM class. Default
parameters were used (which included a
regularisation parameter of 1) and a maximum of 100
passes over the dataset to find constraints. To limit
the feature set, as the models were to be trained on
a limited amount of data, features were restricted
to the token that was to be classified, and, in
addition, a minimum of two occurrences of a token in
the training data was required for it to be included.
As linear conditional random fields were used, the
classification of a token was dependent on the
classification of the two neighbouring tokens
        <xref ref-type="bibr" rid="ref15">(Sutton
and McCallum, 2006)</xref>
        , making it possible to detect
multi-word markers.
      </p>
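      <p>The BIO coding and the frequency-restricted token features described above can be sketched as follows; the sentence, the marker span, and the helper names are invented for illustration, and the actual CRF training in the study was done with PyStruct:</p>
      <preformat>
```python
# Sketch of the BIO coding described above: each token is labelled as
# Beginning of, Inside of, or Outside of a marker segment, which makes
# it possible to represent multi-word markers. A small helper also
# illustrates restricting token features to those occurring at least
# twice in the training data. The example data is invented.
from collections import Counter

def to_bio(tokens, marker_spans):
    """Encode (start, end) marker spans over `tokens` as B/I/O labels."""
    labels = ["O"] * len(tokens)
    for start, end in marker_spans:
        labels[start] = "B"
        for i in range(start + 1, end):
            labels[i] = "I"
    return labels

def feature_vocabulary(training_sentences, min_count=2):
    """Tokens occurring at least `min_count` times in the training data."""
    counts = Counter(t.lower() for sent in training_sentences for t in sent)
    return {t for t, c in counts.items() if c >= min_count}

tokens = "This raises the question of reliability".split()
print(to_bio(tokens, [(1, 5)]))  # ['O', 'B', 'I', 'I', 'I', 'O']
```
      </preformat>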
      <p>
        For all setups, the models were trained with
an increasingly larger size of training data, from
600 training instances to 3,000. In each
iteration, 200 new training instances were randomly
selected for inclusion in the training data. The same
experiment was repeated four times, each time
with a new, randomly selected, subset of
heldout data to use for evaluation in setups i) and ii),
and (for all experiments) new random selections
of training instances. Precision, recall and F-score
for recognising segments that were classified as
negation- or speculation markers were measured
with NLTK’s ChunkScore class
        <xref ref-type="bibr" rid="ref4">(Bird, 2002)</xref>
        .
      </p>
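      <p>The segment-level scoring can be sketched as follows, in the spirit of NLTK's ChunkScore: a predicted marker segment counts as correct only when it exactly matches an annotated segment. The spans below are invented for illustration:</p>
      <preformat>
```python
# Precision, recall and F-score over marker segments: a predicted
# (start, end) span is a true positive only if it exactly matches a
# gold span. The gold and predicted spans below are invented examples.

def precision_recall_f(gold_spans, predicted_spans):
    """Exact-match segment precision, recall and F-score."""
    gold, predicted = set(gold_spans), set(predicted_spans)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return precision, recall, f_score

gold = [(2, 4), (5, 6), (9, 10)]
predicted = [(2, 4), (5, 7)]  # (5, 7) overlaps (5, 6) but does not match
p, r, f = precision_recall_f(gold, predicted)
print(round(p, 2), round(r, 2), round(f, 2))  # 0.5 0.33 0.4
```
      </preformat>
      <p>Exact-match scoring penalises partially detected multi-word markers, which is relevant for the multi-token negation markers discussed in the results.</p>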
    </sec>
    <sec id="sec-5">
      <title>5 Results and discussion</title>
      <p>For detecting speculation markers in the SFU
Review corpus, and for detecting both speculation
and negation markers in the BioScope corpus
when trained on text of the same genre, the method
was relatively successful (Figure 1), achieving
results in line with the inter-annotator agreement
(previous machine learning results have typically
been achieved with larger training sets, and a
comparison was therefore made to the agreement
figures instead of to previous results). For
detecting negation, the increase in training
data size did not affect these results, while the
general trend for speculation was an improvement in
results with more training samples, although
results remained slightly unstable.</p>
      <p>For detecting negation in the SFU Review
corpus, on the other hand, results were much lower
than the measured agreement figures. Results
were consistently low for all four folds (F-scores of
0.70/0.75/0.76/0.74 for 3,000 training instances),
and the F-score decreased with a larger training
data set, due to a decrease in precision and a recall
that remained low. It could be ruled out that the
low results were due to the relatively small size of
the training data, since an additional model, trained on
8,000 samples, gave even lower results (an F-score
of 0.62). Multi-token negation markers including
n’t or not were, however, very common among
false negatives and positives, and it is therefore
likely that the low results for this category were
due to the inability of the trained model to detect
multi-token negations, i.e., the same problem that
arose for two of the models trained by Cruz et
al. (2015). This might, for instance, be an effect
of not including the neighbouring words as
features. The models were, however, in general able
to detect multi-word marker words, e.g., the
following complex speculation markers: I-’d-suggest,
would-think, can-either, might-expect, would-feel.
There were also a number of complex
expressions among the false positives for speculation
that might be considered to belong to this class,
despite not being annotated as such. Examples are
can-hope, can-either, to-think.</p>
      <p>
        The setting of training the model on the
SFU Review corpus and evaluating it on the
BioScope corpus also gave low results for negation as well
as for speculation. It can, however, be observed
that for speculation markers, this strategy was
more successful than the previously explored
strategy of training a model on biomedical article texts
and applying it on the clinical text genre
        <xref ref-type="bibr" rid="ref12">(Morante
and Daelemans, 2009a)</xref>
        . There might thus be a
larger similarity between how speculation is
expressed in consumer reviews and in clinical texts,
than between clinical and biomedical texts.
Examining incorrectly classified segments showed that
false negatives were not limited to marker words
that might be more typical of the reasoning style
of the clinical genre, e.g., evaluate, suggest,
indicate, compatible, consistent and question, but also
included general expressions such as possible and
probable.
      </p>
      <p>Results also show that not even lessons learnt
for the choice of appropriate machine learning
algorithms and features are transferable across
genres, as the techniques for detecting negation that
were shown to be successful for the BioScope corpus
produced low results on the SFU Review corpus.
Future work includes research on whether these
findings also hold for the scope of the markers.
</p>
    </sec>
    <sec id="sec-6">
      <title>6 Conclusion</title>
      <p>In the BioScope corpus, speculation and negation
markers were detected with results close to
previously reported annotator agreement scores. This
was also the case for speculation markers in the
SFU Review corpus, while detection of negation
markers was unsuccessful in this genre. Training
the model on consumer reviews and applying it to
clinical text also yielded low results, showing that
neither the trained models, nor the choice of
appropriate algorithms and features, were
transferable across the two text genres.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>This work was funded by the StaViCTA project,
framework grant “the Digitized Society – Past,
Present, and Future” with No. 2012-5659 from the
Swedish Research Council (Vetenskapsrådet).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Shashank</given-names>
            <surname>Agarwal</surname>
          </string-name>
          and
          <string-name>
            <given-names>Hong</given-names>
            <surname>Yu</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Detecting hedge cues and their scope in biomedical text with conditional random fields</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          ,
          <volume>43</volume>
          (
          <issue>6</issue>
          ):
          <fpage>953</fpage>
          -
          <lpage>961</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Magnus</given-names>
            <surname>Ahltorp</surname>
          </string-name>
          , Hideyuki Tanushi, Shiho Kitajima, Maria Skeppstedt, Rafal Rzepka, and
          <string-name>
            <given-names>Kenji</given-names>
            <surname>Araki</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>HokuMed in NTCIR-11 MedNLP2:Automatic extraction of medical complaints from Japanese health records using machine learning and rule-based methods</article-title>
          .
          <source>In Proceedings of NTCIR-11</source>
          , pages
          <fpage>158</fpage>
          -
          <lpage>162</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Eiji</given-names>
            <surname>Aramaki</surname>
          </string-name>
          , Mizuki Morita, Yoshinobu Kano, and
          <string-name>
            <given-names>Tomoko</given-names>
            <surname>Ohkuma</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Overview of the NTCIR11 MedNLP-2 Task</article-title>
          .
          <source>In Proceedings of NTCIR-11</source>
          , pages
          <fpage>147</fpage>
          -
          <lpage>154</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Steven</given-names>
            <surname>Bird</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Nltk: The natural language toolkit</article-title>
          .
          <source>In Proceedings of the ACL Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics</source>
          , Stroudsburg, PA, USA. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Behrouz</given-names>
            <surname>Bokharaeian</surname>
          </string-name>
          , Alberto Diaz, Mariana Neves, and Virginia Francisco.
          <year>2014</year>
          .
          <article-title>Exploring negation annotations in the drugddi corpus</article-title>
          .
          <source>In Fourth Workshop on Building and Evaluating Resources for Health and Biomedical Text Processing (BIOTxtM 2014)</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Wendy W.</given-names>
            <surname>Chapman</surname>
          </string-name>
          , Will Bridewell, Paul Hanbury,
          <string-name>
            <given-names>Gregory F.</given-names>
            <surname>Cooper</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Bruce G.</given-names>
            <surname>Buchanan</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>A simple algorithm for identifying negated findings and diseases in discharge summaries</article-title>
          .
          <source>J Biomed Inform</source>
          ,
          <volume>34</volume>
          (
          <issue>5</issue>
          ):
          <fpage>301</fpage>
          -
          <lpage>310</lpage>
          , Oct.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Noa P.</given-names>
            <surname>Cruz</surname>
          </string-name>
          , Maite Taboada, and
          <string-name>
            <given-names>Ruslan</given-names>
            <surname>Mitkov</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>A machine-learning approach to negation and speculation detection for sentiment analysis</article-title>
          .
          <source>Journal of the Association for Information Science and Technology</source>
          , pages
          <fpage>526</fpage>
          -
          <lpage>558</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Noa P.</given-names>
            <surname>Cruz Díaz</surname>
          </string-name>
          , Manuel J. Maña López, Jacinto Mata Vázquez, and Victoria Pachón Álvarez
          .
          <year>2012</year>
          .
          <article-title>A machine-learning approach to negation and speculation detection in clinical texts</article-title>
          .
          <source>Journal of the American society for information science and technology</source>
          ,
          <volume>63</volume>
          (
          <issue>7</issue>
          ):
          <fpage>1398</fpage>
          -
          <lpage>1410</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Natalia</given-names>
            <surname>Konstantinova</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sheila C. M.</given-names>
            <surname>de Sousa</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Annotating negation and speculation: the case of the review domain</article-title>
          .
          <source>In Proceedings of the Student Research Workshop associated with The 8th International Conference on Recent Advances in Natural Language Processing, RANLP</source>
          <year>2011</year>
          ,
          <volume>13</volume>
          September,
          <year>2011</year>
          , Hissar, Bulgaria, pages
          <fpage>139</fpage>
          -
          <lpage>144</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Natalia</given-names>
            <surname>Konstantinova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sheila C. M.</given-names>
            <surname>de Sousa</surname>
          </string-name>
          , Noa P. Cruz, Manuel J. Maña,
          <string-name>
            <given-names>Maite</given-names>
            <surname>Taboada</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Ruslan</given-names>
            <surname>Mitkov</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>A review corpus annotated for negation, speculation and their scope</article-title>
          .
          <source>In Nicoletta Calzolari</source>
          , Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Jan Odijk, and Stelios Piperidis, editors,
          <source>Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC)</source>
          , pages
          <fpage>3190</fpage>
          -
          <lpage>3195</lpage>
          , Istanbul, Turkey.
          <source>European Language Resources Association (ELRA).</source>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Roser</given-names>
            <surname>Morante</surname>
          </string-name>
          and
          <string-name>
            <given-names>Eduardo</given-names>
            <surname>Blanco</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>*SEM 2012 shared task: resolving the scope and focus of negation</article-title>
          .
          <source>In Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation, SemEval '12</source>
          , pages
          <fpage>265</fpage>
          -
          <lpage>274</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Roser</given-names>
            <surname>Morante</surname>
          </string-name>
          and
          <string-name>
            <given-names>Walter</given-names>
            <surname>Daelemans</surname>
          </string-name>
          .
          <year>2009a</year>
          .
          <article-title>Learning the scope of hedge cues in biomedical texts</article-title>
          .
          <source>In Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing</source>
          ,
          <source>BioNLP '09</source>
          , pages
          <fpage>28</fpage>
          -
          <lpage>36</lpage>
          , Stroudsburg, PA, USA. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Roser</given-names>
            <surname>Morante</surname>
          </string-name>
          and
          <string-name>
            <given-names>Walter</given-names>
            <surname>Daelemans</surname>
          </string-name>
          .
          <year>2009b</year>
          .
          <article-title>A metalearning approach to processing the scope of negation</article-title>
          .
          <source>In CoNLL '09: Proceedings of the Thirteenth Conference on Computational Natural Language Learning</source>
          , pages
          <fpage>21</fpage>
          -
          <lpage>29</lpage>
          , Morristown, NJ, USA. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Andreas C.</given-names>
            <surname>Müller</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sven</given-names>
            <surname>Behnke</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>PyStruct - learning structured prediction in Python</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          ,
          <volume>15</volume>
          :
          <fpage>2055</fpage>
          -
          <lpage>2060</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Charles</given-names>
            <surname>Sutton</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andrew</given-names>
            <surname>McCallum</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>An introduction to conditional random fields for relational learning</article-title>
          .
          In Lise Getoor and Ben Taskar, editors,
          <source>Introduction to Statistical Relational Learning</source>
          . MIT Press.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Maite</given-names>
            <surname>Taboada</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jack</given-names>
            <surname>Grieve</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>Analyzing appraisal automatically</article-title>
          .
          <source>In Proceedings of the AAAI Spring Symposium on Exploring Attitude and Affect in Text: Theories and Applications</source>
          , pages
          <fpage>158</fpage>
          -
          <lpage>161</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Maite</given-names>
            <surname>Taboada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Caroline</given-names>
            <surname>Anthony</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Kimberly</given-names>
            <surname>Voll</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Methods for creating semantic orientation dictionaries</article-title>
          .
          <source>In Proceedings of 5th International Conference on Language Resources and Evaluation (LREC)</source>
          , pages
          <fpage>427</fpage>
          -
          <lpage>432</lpage>
          , Genoa, Italy. European Language Resources Association (ELRA).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Özlem</given-names>
            <surname>Uzuner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Brett R.</given-names>
            <surname>South</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Shuying</given-names>
            <surname>Shen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Scott L.</given-names>
            <surname>DuVall</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text</article-title>
          .
          <source>J Am Med Inform Assoc</source>
          ,
          <volume>18</volume>
          (
          <issue>5</issue>
          ):
          <fpage>552</fpage>
          -
          <lpage>556</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>Sumithra</given-names>
            <surname>Velupillai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Hercules</given-names>
            <surname>Dalianis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Maria</given-names>
            <surname>Kvist</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Factuality Levels of Diagnoses in Swedish Clinical Text</article-title>
          . In A. Moen,
          <string-name>
            <given-names>S. K.</given-names>
            <surname>Andersen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Aarts</surname>
          </string-name>
          , and P. Hurlen, editors,
          <source>Proc. XXIII International Conference of the European Federation for Medical Informatics (User Centred Networked Health Care)</source>
          , pages
          <fpage>559</fpage>
          -
          <lpage>563</lpage>
          , Oslo, August. IOS Press.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>Sumithra</given-names>
            <surname>Velupillai</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Shades of Certainty - Annotation and Classification of Swedish Medical Records</article-title>
          .
          <source>Doctoral thesis</source>
          , Department of Computer and Systems Sciences, Stockholm University, Stockholm, Sweden, April.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>Veronika</given-names>
            <surname>Vincze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>György</given-names>
            <surname>Szarvas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Richárd</given-names>
            <surname>Farkas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>György</given-names>
            <surname>Móra</surname>
          </string-name>
          , and
          <string-name>
            <given-names>János</given-names>
            <surname>Csirik</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>The BioScope Corpus: Biomedical texts annotated for uncertainty, negation and their scopes</article-title>
          .
          <source>BMC Bioinformatics</source>
          ,
          <volume>9</volume>
          (
          <issue>Suppl 11</issue>
          ):
          <fpage>S9</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>