<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Vocabulary in Discharge Documents: The Patient's Perspective</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Veronika Laippala</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Riitta Danielsson-Ojala</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Heljä Lundgrén-Laine</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sanna Salanterä</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tapio Salakoski</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of French Studies</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Information Technology 20014 University of Turku</institution>
          ,
          <country country="FI">Finland</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Department of Nursing Science</institution>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Southwest Hospital District</institution>
          ,
          <addr-line>Turku</addr-line>
          ,
          <country country="FI">Finland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Medical discharge documents are summaries written by a physician about the patient's condition; they aim at transferring information to other health care personnel but also to the patient. According to the legislation, the patient should be able to understand the document. In practice, however, this has been shown to be problematic. This paper studies discharge documents from the patients' perspective and examines how they fulfil the legislation's demands on understandability. Concentrating on the vocabulary of the texts, we analyse the frequency of domain-adapted terms, abbreviations and foreign words. The material consists of 23 528 heart patients' discharge documents (5 747 126 words). The analysis is performed with the morphological analyser FinTWOL (http://www2.lingsoft.fi/cgi-bin/fintwol). Altogether, FinTWOL analyses 24% of the corpus as unknown or foreign words, abbreviations or medical terms. The most common category, unknown words, includes misspellings and medical terms, such as l.dex. Of these, the 100 most common account for 43% of the total. These terms thus seem to be relatively fixed. Of the words analysed as abbreviations, some are also common in standard language, but others are very domain-specific, such as I.V. (intravenous). The abbreviations used are also very fixed: the 100 most common ones account for 94% of the total. This, however, does not help the patient, who probably reads only one document. Similarly, even though misspellings are globally infrequent, they still occur more than once per document. In order to place the obtained results in context, we performed a similar analysis on general Finnish university newspaper text from the Turku Dependency Treebank. Whereas 24% of the words in the discharge documents were given a special tag, the figure for the newspaper text (10 687 words in total) was 8.6%. The results show that terms and abbreviations are considerably more frequent in discharge documents than in general newspaper text. 
It is clear that a text with such a vocabulary is domain-specific and distinct from the language that the patient is used to. In addition, the varying use of upper and lower case letters (e.g. dg and DG for diagnosis) emphasizes the particularity of the language. In standard language texts such writing would not be acceptable. Standard writing would, however, help the patients to better understand the texts.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>Introduction</title>
      <p>Medical discharge documents are care summaries written by a physician about the
patient’s condition, medication, progress of the illness and continuation of care. They
are part of the electronic patient record and aim at transferring information to other
physicians and health care personnel but also to the patient. According to the legislation
(www.finlex.fi), the patient should be able to read and understand the document;
the information should be explicit and intelligible, and only generally known and
accepted abbreviations should be used.</p>
      <p>However, previous studies have shown problems in the functioning of discharge
documents: they do not necessarily reach their users on time, or their quality may be
poor.<sup>1,2</sup> Like many texts written in a medical context, discharge documents seem to be
dense and telegraphic and to contain frequent medical terms, abbreviations and
misspellings.<sup>3-5</sup> Even though these features probably do not present problems for physicians,
they are difficult for patients, who are not familiar with the domain.</p>
      <p>This paper studies discharge documents from the patients’ perspective and
examines how they fulfil the legislation’s demands on understandability. Concentrating on
the vocabulary of the texts, we analyse the frequency of domain-adapted terms,
abbreviations and foreign words in the documents in order to discover the proportion of
medical language used.</p>
    </sec>
    <sec id="sec-3">
      <title>Materials and Methods</title>
      <p>The material consists of 23 528 heart patients’ medical discharge documents from the
years 2005-2009 from a Finnish hospital, covering 5 747 126 words, punctuation
excluded. In order to study the vocabulary, the material was analysed with the
morphological analyser FinTWOL
(http://www2.lingsoft.fi/cgi-bin/fintwol). In addition to the morphological reading(s), the tool identifies
abbreviations and (some) foreign words. This is particularly useful in the present study,
as these words are potentially problematic for the patient.</p>
    </sec>
    <sec id="sec-4">
      <title>Results</title>
      <p>First of all, FinTWOL analyses 10% of the discharge documents’ words as unknown.
These include misspellings and (often abbreviated) medical terms, such as l.dex, l.sin
and Dgn. In fact, the 100 most common of these account for 43% of the total. It thus seems
that these terms are relatively fixed. On the other hand, only 6% of the unknown
words occur only once; these are the most likely misspellings.</p>
      <p>Further, FinTWOL analyses 7% of the words as abbreviations. Some of these are
also common in standard language, such as mm. or l., but others are very
domain-specific, such as I.V. (intravenous) and MCC (Morbus Cordis Coronarius). As with
unknown words, the abbreviations used are very fixed: the 100 most common
ones account for 94% of the total. This, however, does not help the patient, who
probably reads only one document. Similarly, even though misspellings are globally
infrequent, they still occur more than once per discharge document.</p>
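      <p>The coverage figures quoted above (the 100 most common forms accounting for 43% of unknown words and 94% of abbreviations) could be computed from a frequency list roughly as in the following Python sketch. The function name and the toy token list are assumptions made for illustration; the actual counts come from the FinTWOL-analysed corpus and are not reproduced here.</p>
      <preformat>
```python
from collections import Counter


def top_n_coverage(tokens, n):
    """Share of all occurrences accounted for by the n most frequent forms."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(freq for _, freq in counts.most_common(n))
    return covered / total


# Toy data (hypothetical), reusing abbreviations mentioned in the text.
toy_abbreviations = ["mm.", "mm.", "mm.", "I.V.", "I.V.", "MCC", "l.", "Dgn"]
print(top_n_coverage(toy_abbreviations, 2))
```
</preformat>
      <p>In this toy list the two most frequent forms account for five of the eight occurrences. A high value of this kind indicates a small, fixed inventory of forms, which is the property the text refers to as relatively fixed.</p>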
      <p>In addition, 5% of the words are tagged as proper names. These include doctors’
names and drug names, such as Furesis, but some Latin terms are also given this tag, even
though FinTWOL also has a special tag for foreign words. This FORGN tag is given
to another 2% of the words, which in our case are Latin terms used to describe, for
instance, the status or the diagnosis of the patient: diabetes mellitus II (adult-onset
diabetes).</p>
      <p>The analysis shows that abbreviations, terms and Latin words are frequent in
discharge documents; they cover on average 24% of the texts. In order to place these
numbers in context, we performed a similar analysis on general Finnish university
newspaper text from the Turku Dependency Treebank.<sup>6</sup> Of the (modest) total of
10 687 words, 0.2% were analysed as foreign, 0.4% as abbreviations, 3% were unknown
and 5% were tagged as proper names.</p>
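      <p>As a rough check on the arithmetic behind these figures, the following Python sketch sums the per-category shares reported above for both corpora. The dictionary names and the helper function are illustrative assumptions and not part of the original analysis; the percentages are the rounded values given in the text.</p>
      <preformat>
```python
# Rounded per-category shares (percent of all words) as reported in the text.
# The variable and function names are hypothetical, chosen for illustration.
DISCHARGE_SHARES = {"unknown": 10.0, "abbreviation": 7.0, "proper name": 5.0, "foreign": 2.0}
NEWSPAPER_SHARES = {"unknown": 3.0, "abbreviation": 0.4, "proper name": 5.0, "foreign": 0.2}


def special_share(shares):
    """Return the summed share (percent) of specially tagged words."""
    return round(sum(shares.values()), 1)


discharge_total = special_share(DISCHARGE_SHARES)
newspaper_total = special_share(NEWSPAPER_SHARES)
print(discharge_total, newspaper_total)
# Ratio of the two totals, rounded to one decimal place.
print(round(discharge_total / newspaper_total, 1))
```
</preformat>
      <p>Under these rounded inputs the totals are 24% for the discharge documents and 8.6% for the newspaper text, so specially tagged words are roughly 2.8 times as frequent in the discharge documents.</p>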
    </sec>
    <sec id="sec-5">
      <title>Discussion</title>
      <p>The results show that terms, abbreviations and Latin words are considerably more
frequent in discharge documents than in general newspaper text. Even if some of these
words do not present comprehension problems for the patient, it is clear that a text
with such a vocabulary is domain-specific and distinct from the language that the
patient is used to. In addition, the varying use of upper and lower case letters in
abbreviations and spelling variants, such as the use of several abbreviations for one term
(Dg. and Dgn for diagnosis), emphasize the particularity of the language. In standard
language texts, where normative requirements must be followed, such writing would
not be acceptable. In fact, this variation associates discharge documents with an
informal register that may to some extent be understandable to the patient but is not
necessarily appropriate given the context of the text.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>The results stress the fact that discharge documents are written quickly and mostly aimed
at other physicians, not at patients. Normative, standard language writing would help
patients to understand the texts and would therefore improve their ability to participate
actively in their own care. In addition, it would prevent possible misunderstandings
between professionals in different units.</p>
      <p>This study has focused only on the vocabulary of the texts. The results
are therefore limited, as comprehension also involves other aspects, such as the
structure and syntax of the text.<sup>7</sup> In order to study this, we are developing a
domain-adapted parser similar to the one previously developed by Haverinen et al.<sup>8</sup> for
intensive care patient reports.</p>
      <p>Finally, another obvious direction for future research would be the application or
development of language technology tools to assist in the communication between the
physician and the patient. For instance, term search and abbreviation expansion would
be very helpful, as our analysis shows that the terms and abbreviations used are
relatively fixed.</p>
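      <p>To make the idea of abbreviation expansion concrete, the following Python sketch appends a plain-language expansion after each known abbreviation. It is only an illustration under simplifying assumptions: the lookup table, the function name and the whitespace tokenisation are hypothetical, and the table contains only the expansions mentioned in this paper. It is not a tool developed in this study.</p>
      <preformat>
```python
# Hypothetical lookup table; the expansions are those given in the text.
EXPANSIONS = {
    "I.V.": "intravenous",
    "MCC": "Morbus Cordis Coronarius",
    "Dg.": "diagnosis",
    "Dgn": "diagnosis",
}


def expand_abbreviations(text, expansions=EXPANSIONS):
    """Append the expansion in parentheses after each known abbreviation."""
    out = []
    # Simplifying assumption: tokens are separated by whitespace.
    for token in text.split():
        if token in expansions:
            out.append(token + " (" + expansions[token] + ")")
        else:
            out.append(token)
    return " ".join(out)


print(expand_abbreviations("Dgn MCC , I.V. antibiotic"))
```
</preformat>
      <p>Because the inventory of abbreviations is relatively fixed, even a small table of this kind could in principle cover a large share of occurrences; ambiguous abbreviations and spelling variants would need additional handling.</p>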
    </sec>
  </body>
  <back>
    <ack>
      <title>Acknowledgements</title>
      <p>We are grateful to Juho Heimonen and Antti Airola for technical assistance and to
Lingsoft Ltd. for making FinTWOL available to us. This work has been supported by
the Academy of Finland.</p>
    </ack>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Tallgren</surname>
            <given-names>M</given-names>
          </string-name>
          :
          <article-title>Epikriisin sijaan potilaalle lyhyt ja ytimekäs hoitoyhteenveto?</article-title>
          <source>Suomen Lääkärilehti</source>
          <volume>41</volume>
          (
          <year>2007</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Kripalani</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>LeFevre</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Phillips</surname>
            <given-names>CO</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            <given-names>MV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baker</surname>
            <given-names>DW</given-names>
          </string-name>
          :
          <article-title>Deficits in communication and information transfer between hospital-based and primary care physicians: implications for patient safety and continuity of care</article-title>
          .
          <source>JAMA</source>
          <year>2007</year>
          (
          <volume>297</volume>
          ):
          <fpage>831</fpage>
          -
          <lpage>841</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Laippala</surname>
            <given-names>V</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ginter</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pyysalo</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Salakoski</surname>
            <given-names>T</given-names>
          </string-name>
          :
          <article-title>Towards Automated Processing of Clinical Finnish: Sublanguage Analysis and a Rule-Based Parser</article-title>
          .
          <source>International Journal of Medical Informatics</source>
          <volume>78</volume>
          (
          <issue>12</issue>
          ):
          <year>2009</year>
          :
          <fpage>e7</fpage>
          -
          <lpage>e12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Hobbs</surname>
            <given-names>P</given-names>
          </string-name>
          :
          <article-title>The role of progress notes in the professional socialization of medical residents</article-title>
          .
          <source>Journal of Pragmatics</source>
          <year>2004</year>
          (
          <volume>36</volume>
          ):
          <fpage>1579</fpage>
          -
          <lpage>1607</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Tiililä</surname>
            <given-names>U</given-names>
          </string-name>
          :
          <article-title>Auttajista lausuntoautomaateiksi? Lääkäreillä keskeinen rooli etuuksista päätettäessä</article-title>
          .
          <source>Duodecim</source>
          <volume>124</volume>
          (
          <year>2008</year>
          ):
          <fpage>896</fpage>
          -
          <lpage>901</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Haverinen</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ginter</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Laippala</surname>
            <given-names>V</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kohonen</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Viljanen</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nyblom</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Salakoski</surname>
            <given-names>T</given-names>
          </string-name>
          :
          <article-title>A Dependency-based Analysis of Treebank Annotation Errors</article-title>
          .
          <source>Proceedings of International Conference on Dependency Linguistics (Depling'11)</source>
          , Barcelona, Spain,
          <year>2011</year>
          :
          <fpage>115</fpage>
          -
          <lpage>124</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Herring</surname>
            <given-names>SC</given-names>
          </string-name>
          :
          <article-title>Grammar and electronic communication</article-title>
          . In CA Chapelle (ed.): Encyclopedia of Applied Linguistics. Hoboken, NJ: Wiley-Blackwell,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Haverinen</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ginter</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Laippala</surname>
            <given-names>V</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Salakoski</surname>
            <given-names>T</given-names>
          </string-name>
          :
          <article-title>Parsing Clinical Finnish: Experiments with Rule-Based and Statistical Dependency Parsers</article-title>
          .
          <source>Proceedings of NODALIDA'09</source>
          , Odense, Denmark,
          <year>2009</year>
          :
          <fpage>65</fpage>
          -
          <lpage>72</lpage>
          . http://www.finlex.fi/fi/laki/ajantasa/2009/20090298
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>