<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Calculating Prevalence of Comorbidity and Comorbidity Combinations with Diabetes in Hospital Care in Sweden Using a Health Care Record Database</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hideyuki Tanushi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hercules Dalianis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gunnar H Nilsson</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer and Systems Sciences (DSV), Stockholm University</institution>
          ,
          <addr-line>Forum 100, SE-164 40 Kista</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Neurobiology, Care Sciences and Society, Karolinska Institutet</institution>
          ,
          <addr-line>SE-141 83 Huddinge</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
      </contrib-group>
      <fpage>59</fpage>
      <lpage>65</lpage>
      <abstract>
        <p>We have studied the prevalence of comorbidity in the Stockholm EPR corpus, which contains almost 600,000 patients from 900 clinics, using the ICD-10 codes assigned to each patient record. The proportion of patients with a valid ICD-10 code was 83.0%, and 41.5% of all patients had at least one comorbidity. The most frequent comorbidity combination with type 2 diabetes was essential hypertension (43.1%). Our approach seems feasible for large-scale analysis of diagnostic codes in EPR databases.</p>
      </abstract>
      <kwd-group>
        <kwd>comorbidity</kwd>
        <kwd>chronic disease</kwd>
        <kwd>ICD-10</kwd>
        <kwd>medical records systems</kwd>
        <kwd>computerized medical record</kwd>
        <kwd>Sweden</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Large numbers of electronic patient records (EPRs) are produced today, but
they are rarely reused. EPR systems are also increasingly centralized,
covering several hospitals and clinics. In the Scandinavian countries every citizen has a
unique social security number that follows the person from birth to death,
as well as from clinic to clinic. Diagnoses are often coded using ICD [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Among
clinical researchers there is a need to study the comorbidity of diseases among
patients [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. One question that arises is whether the
comorbidity observed in our large database agrees with clinical researchers’ findings. The coding
and classification of diseases in individual medical records have received little
attention in research. Health care is increasingly required
to manage individuals with multiple coexisting diseases,
that is, comorbidity [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Comorbidity is the presence of two or more distinct conditions in an
individual [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. One attempt to establish a classification of comorbidity is
a taxonomy for classifying comorbid ailments in diabetic patients, together with
an assessment of the prognostic value of that classification [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Another attempt was made by Charlson et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], who
developed a prognostic taxonomy for comorbid conditions that can predict
the risk of short-term mortality for patients enrolled in longitudinal studies.
      </p>
      <p>
        Starfield et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] describe a categorization of morbidity in order to present a
system to measure and compare the burden of illness of patients over time in
different ambulatory care facilities, and to show how the system can predict
utilization and charges, both concurrently and prospectively.
      </p>
      <p>
        As the population of elderly people is increasing in many countries, the
prevalence of chronic diseases is expected to rise. Schellevis et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and van Weel [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
conducted comorbidity analyses for chronic diseases by calculating combinations
of these diseases in general practice in the Netherlands. In van Weel’s
study [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], comorbidity for the ten most common chronic diseases was calculated
from approximately 12,000 patients. Davila and Hlaing [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] examined patients
admitted with a primary diagnosis of essential hypertension and analyzed the
frequency of their secondary diagnoses.
      </p>
      <p>
        Type 2 diabetes is defined as a chronic disease characterized by reduced
insulin sensitivity in target tissues [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The number of patients with diabetes
worldwide is estimated at more than 220 million, and type 2 diabetes accounts for
90% of them [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Several comorbid conditions to type 2 diabetes have been
identified in previous studies, such as obesity, which reduces insulin sensitivity,
and dyslipidemia and hypertension, which alter vascular and cardiac structure
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>Detecting comorbidity in a large population is of clinical interest because
it may reveal new information about the causes of diseases and
inform new treatment strategies. The aim of this study is to analyze comorbidity
in a clinical hospital setting in Sweden using an EPR database, and to investigate
comorbidity combinations with diabetes.</p>
    </sec>
    <sec id="sec-2">
      <title>Material and Methods</title>
      <p>The tools used in this work are the SQL query language and the Java programming
language. Fig. 1 gives a simple illustration of the procedure for the comorbidity analysis.</p>
      <p>
        The Stockholm EPR (SEPR) corpus that we have access to covers almost
900 clinics and almost 600,000 patients. Each patient is registered with social
security number, gender, age, admission and discharge dates, and
the ICD code of the diagnosis. The data span the years 2006, 2007 and the
first half of 2008 [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. The unstructured information consists of free text under
different headings. Diseases in the SEPR corpus are mainly coded by ICD-10.
According to Dalianis et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], 34% of the patients did not have any ICD-10
code.
      </p>
      <p>The main task in this work is to determine which comorbidity combinations
each individual patient has in their medical history. To this end, the patient
id, birth year, and diagnosis code are extracted from the SEPR database with
an SQL query. The extracted data contain 2,756,082 diagnosis records from
584,600 patients (=N).</p>
      <p>The diagnosis codes in the extracted data are, however, not written in a
uniform way. For instance, letters appear either in upper case (e.g. A00)
or in lower case (e.g. a00), and different separator marks are used (e.g. A00-0, A00.0, A00,0,
etc.). Hence, the extracted codes need to be normalized. An upper-case code
expression without any marks (e.g. A000) is used in the analyses. In addition, the
extracted diagnosis data contain missing codes and syntactically wrong codes that
cannot be corrected (i.e. invalid codes). Examples of invalid codes are AF063,
KV9, KVÅ, NCU49 and NFJ09. There are 1,772,013 valid ICD-10 codes (13,450
different full level codes and 1,956 different level 3 codes) and 805,568 invalid
ICD-10 codes in the extracted data.</p>
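      <p>The normalization and validity check described above could be sketched in Java roughly as follows. This is a minimal illustration, not the code used in the study: the class and method names are hypothetical, and the validity rule is an assumption that only checks the shape of a code (one letter, two digits, then optionally one digit and one further letter or digit). A real check would look each code up in the ICD-10 code list.</p>
      <preformat preformat-type="code"><![CDATA[
```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper: class and method names are illustrative, not from the study.
final class IcdCodeNormalizer {

    // ASSUMPTION: validity is approximated by shape alone (one letter, two digits,
    // then optionally a digit and one more letter or digit). A real check would
    // look the code up in the ICD-10 code list.
    private static final Pattern ICD10_SHAPE =
            Pattern.compile("[A-Z][0-9]{2}([0-9][0-9A-Z]?)?");

    private IcdCodeNormalizer() {}

    // Upper-cases the code and drops everything that is not a letter or a digit,
    // so "a00-0", "A00.0" and "A00,0" all become "A000".
    static String normalize(String raw) {
        if (raw == null) {
            return "";
        }
        return raw.toUpperCase(Locale.ROOT).replaceAll("[^\\p{L}0-9]", "");
    }

    // Returns the normalized code when it has a plausible ICD-10 shape,
    // otherwise an empty Optional (missing or syntactically wrong code).
    static Optional<String> normalizeIfValid(String raw) {
        String code = normalize(raw);
        return ICD10_SHAPE.matcher(code).matches() ? Optional.of(code) : Optional.empty();
    }

    public static void main(String[] args) {
        String[] samples = {"a00", "A00-0", "A00.0", "A00,0", "AF063", "KV9", "NCU49"};
        for (String raw : samples) {
            System.out.println(raw + " -> " + normalizeIfValid(raw).orElse("(invalid)"));
        }
    }
}
```
]]></preformat>
      <p>Under this assumed rule the variants of A00.0 collapse to A000, while the invalid examples listed above are rejected because their second character is not a digit.</p>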
      <p>Some patients have duplicate ICD-10 codes in their medical records (178,501
duplicate cases). This is presumably because these patients have received medical
care in more than one institution. ICD-10 codes are wrong or missing in many
cases. In our case the percentage of invalid ICD-10 codes in the extracted data
is approximately 31%.</p>
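      <p>As a consistency check, the reported counts add up to the number of extracted records if one assumes that the 178,501 duplicate cases were set aside before codes were classified as valid or invalid (the text does not state this explicitly). Under that reading, the share of invalid codes is taken over the remaining records:</p>
      <disp-formula>
        <tex-math><![CDATA[
\begin{aligned}
1{,}772{,}013 + 805{,}568 + 178{,}501 &= 2{,}756{,}082, \\
\frac{805{,}568}{1{,}772{,}013 + 805{,}568} &= \frac{805{,}568}{2{,}577{,}581} \approx 0.313 .
\end{aligned}
]]></tex-math>
      </disp-formula>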
      <p>Only valid ICD-10 codes are used, and they are converted to level 3 codes (i.e.
truncated to their first three characters); invalid codes are neither converted nor interpreted
with the help of free-text diagnoses. After the valid full level ICD-10 codes are converted into
level 3 codes, a patient can have the same ICD-10 code more than once (e.g.
before converting A00, A001 and A009 ⇒ after converting A00, A00 and A00).
Repeated ICD codes for a patient are therefore counted only once. Also, ICD-10
codes in chapters 19–21 are excluded from the frequency analysis because these
codes include causes of diseases and other factors related to health care, and
are therefore not considered diseases.</p>
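      <p>A minimal Java sketch of the three steps in this paragraph (truncation to level 3, counting each code once per patient, and excluding chapters 19–21) is given below. The class name, method names and the record layout (a pair of patient id and normalized valid code) are hypothetical. The chapter test relies on the ICD-10 chapter structure, in which chapters 19–21 cover the code ranges S00–T98, V01–Y98 and Z00–Z99.</p>
      <preformat preformat-type="code"><![CDATA[
```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: names and record layout are illustrative, not from the study.
final class Level3Codes {

    private Level3Codes() {}

    // Keeps only the first three characters, e.g. A001 -> A00.
    // Expects a normalized, valid code (at least three characters long).
    static String toLevel3(String normalizedCode) {
        return normalizedCode.substring(0, 3);
    }

    // True for ICD-10 chapters 19-21 (S00-T98, V01-Y98, Z00-Z99),
    // which are left out of the frequency analysis.
    static boolean isExcludedChapter(String level3Code) {
        return "STVWXYZ".indexOf(level3Code.charAt(0)) >= 0;
    }

    // Maps each patient id to the set of distinct level 3 codes.
    // Each record is {patientId, normalizedValidCode} (assumed layout).
    // Patients whose codes are all excluded do not appear in the result.
    static Map<String, Set<String>> codesByPatient(List<String[]> records) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        for (String[] record : records) {
            String level3 = toLevel3(record[1]);
            if (isExcludedChapter(level3)) {
                continue;
            }
            // A Set stores each level 3 code only once per patient.
            result.computeIfAbsent(record[0], id -> new LinkedHashSet<>()).add(level3);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> records = Arrays.asList(
                new String[] {"p1", "A00"},
                new String[] {"p1", "A001"},
                new String[] {"p1", "A009"},
                new String[] {"p1", "Z000"},
                new String[] {"p2", "E119"},
                new String[] {"p2", "I109"});
        System.out.println(codesByPatient(records));
    }
}
```
]]></preformat>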
      <p>
        Creating comorbidity combinations for type 2 diabetes makes it possible to
focus the analysis on this diagnosis. In addition, the similar comorbidity analyses
in previous research mentioned above can serve as a methodological
reference for this work [
        <xref ref-type="bibr" rid="ref6 ref7 ref8">6–8</xref>
        ]. For these reasons, comorbidity
combinations are used as the method of comorbidity analysis in this work. For
the analysis in the following section, the frequency of comorbidity
combinations for type 2 diabetes is calculated by
counting how many type 2 diabetes patients have additional coded diseases, i.e.
comorbid diagnoses.
      </p>
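      <p>The frequency calculation described above can be sketched in Java as follows. This is an illustration under stated assumptions rather than the study's implementation: the class and method names are hypothetical, the patient data are made up, and the index code E11 (the ICD-10 level 3 code for type 2 diabetes) is assumed, since the text does not name the code it used. I10 and I50 are the ICD-10 codes for essential hypertension and heart failure.</p>
      <preformat preformat-type="code"><![CDATA[
```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper: names and sample data are illustrative, not from the study.
final class ComorbidityFrequency {

    // ASSUMPTION: E11 is used as the level 3 code for type 2 diabetes.
    static final String TYPE2_DIABETES = "E11";

    private ComorbidityFrequency() {}

    // Number of patients whose code set contains the index code (n in the text).
    static long patientsWith(Map<String, Set<String>> codesByPatient, String indexCode) {
        return codesByPatient.values().stream()
                .filter(codes -> codes.contains(indexCode))
                .count();
    }

    // For every other level 3 code, counts how many patients with the
    // index code also have that code.
    static Map<String, Integer> comorbidityCounts(
            Map<String, Set<String>> codesByPatient, String indexCode) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> codes : codesByPatient.values()) {
            if (!codes.contains(indexCode)) {
                continue;
            }
            for (String code : codes) {
                if (!code.equals(indexCode)) {
                    counts.merge(code, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up example: three diabetes patients, one patient without diabetes.
        Map<String, Set<String>> codesByPatient = new HashMap<>();
        codesByPatient.put("p1", new HashSet<>(Arrays.asList("E11", "I10")));
        codesByPatient.put("p2", new HashSet<>(Arrays.asList("E11", "I10", "I50")));
        codesByPatient.put("p3", new HashSet<>(Arrays.asList("E11")));
        codesByPatient.put("p4", new HashSet<>(Arrays.asList("I10")));

        long n = patientsWith(codesByPatient, TYPE2_DIABETES);
        comorbidityCounts(codesByPatient, TYPE2_DIABETES).forEach((code, count) ->
                System.out.printf("%s: %d of %d (%.1f%%)%n", code, count, n, 100.0 * count / n));
    }
}
```
]]></preformat>
      <p>Dividing each count by the number of patients with the index code gives the kind of percentage reported for each comorbidity combination.</p>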
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>Out of the 584,600 (=N) patients the number of patients with a valid ICD-10
code was 485,271 (83.0%), and 242,435 (41.5%) had at least one comorbidity
(see Table 1).</p>
      <p>The total number of patients with type 2 diabetes was 14,162 (=n), of
whom 13,487 (95.2%) had at least one comorbidity. The most frequent
comorbidity combinations with type 2 diabetes were essential (primary) hypertension
(43.1%) and heart failure (17.0%) (see Table 2).</p>
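      <p>The percentages in this section can be reproduced from the reported counts. Note that the share of patients with at least one comorbidity is taken over all N patients, not only over those with a valid code:</p>
      <disp-formula>
        <tex-math><![CDATA[
\frac{485{,}271}{584{,}600} \approx 83.0\%, \qquad
\frac{242{,}435}{584{,}600} \approx 41.5\%, \qquad
\frac{13{,}487}{14{,}162} \approx 95.2\%.
]]></tex-math>
      </disp-formula>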
    </sec>
    <sec id="sec-4">
      <title>Discussion</title>
      <p>
        In our study the proportion of patients with missing valid ICD-10 codes (17.0%)
was somewhat lower than expected. A study [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] on the same material showed
about 34% missing ICD-10 codes; however, another study [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] on other material showed 1.2%
missing main diagnoses. Another issue is the quality of the assigned ICD
codes. A study [13] found about 20% wrongly assigned ICD codes, and Roque et al.
[14] found 15.9% wrongly assigned ICD-10 codes.
      </p>
      <p>The comorbidity figure in our study (41.5%) was higher than, for example,
that of Westert et al. [15], who found that about one fifth of patients had more than one
chronic condition. However, their figures include all health care settings and are
limited to chronic diagnoses.</p>
      <p>The most frequent comorbidity combinations with type 2 diabetes in our
study were essential hypertension (43.1%) and heart failure (17.0%). Our figures
are in line with Caughey et al. [16] reporting about 51–53% comorbidity with
hypertension, and 12–42% with cardiovascular disease. However, their study
includes only eight chronic diagnoses. Similar comorbidity figures are also found
in Finland in a population study by Reunanen et al. [17].</p>
      <p>Our method for extracting and presenting information needs to become more
streamlined, since some steps still have to be carried out manually. The
strength of our method is that it uses continuously growing everyday clinical
data, which enables more detailed analysis. The main weakness is the
large proportion of patients without a valid ICD-10 code in EPR databases.
One way to address this problem is to match the free text in the patient record
with the textual descriptions of the ICD-10 codes in order to populate the record with ICD-10
codes (see [14]).</p>
      <p>We believe that in the future hospital management and clinical research will
monitor and analyze ICD-10 codes, and also SNOMED-CT [18], to assess
health care and to predict future needs.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>In Swedish hospital care the proportion of patients with a valid ICD-10 code
seemed to be fairly high (83.0%), and the prevalence of comorbidity was about 41.5%. The most
frequent comorbidity combinations with type 2 diabetes were essential
hypertension and heart failure. Our study raises questions about the quality and analysis
of diagnosis coding in hospital settings. However, our approach seems feasible
for large-scale analysis of diagnostic codes in EPR databases. Our results on
high rates of diabetes comorbidity may have implications for both health care
planning and delivery.</p>
      <p>In the future we plan to populate our data with more ICD-10 codes extracted
from diagnosis expressions in the free text, similar to the approach
described by Roque et al. [14].</p>
      <p>Acknowledgments. We would like to thank Maria Skeppstedt for her
invaluable help in posing smart SQL queries to extract data from the Stockholm EPR
corpus.</p>
      <p>13. Socialstyrelsen (The National Board of Health and Welfare, in Swedish):
Diagnosgranskningar utförda i Sverige 1997-2005 samt råd inför granskning. Artikelnummer:
2006-131-30 (2006)</p>
      <p>14. Roque, F.S., Jensen, P.B., Schmock, H., Andreatta, M., Hansen, T., Søeby, K.,
Braedkjaer, S., Juhl, A., Werge, T., Jensen, L.J., Brunak, S.: Using electronic
patient records to discover disease correlations and stratify patient cohorts, submitted
manuscript (2011)</p>
      <p>15. Westert, G.P., Satariano, W.A., Schellevis, F.G., van den Bos, G.A.M.: Patterns
of comorbidity and the use of health services in the Dutch population. Eur J Public
Health. 11(4), 365–72 (2001)</p>
      <p>16. Caughey, G.E., Vitry, A.I., Gilbert, A.L., Roughead, E.E.: Prevalence of
comorbidity of chronic diseases in Australia. BMC Public Health. 8, 221 (2008)</p>
      <p>17. Reunanen, A., Kangas, T., Martikainen, J., Klaukka, T.: Nationwide survey of
comorbidity, use, and costs of all medications in Finnish diabetic individuals. Diabetes
Care. 23(9), 1265–71 (2000)</p>
      <p>18. SNOMED Clinical Terms User Guide, http://www.ihtsdo.org</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. International Classification of Diseases (ICD), http://www.who.int/classifications/icd/en</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Valderas</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Starfield</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sibbald</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Salisbury</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roland</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Defining Comorbidity: Implications for Understanding Health and Health Services</article-title>
          .
          <source>Ann Fam Med</source>
          .
          <volume>7</volume>
          (
          <issue>4</issue>
          ),
          <fpage>357</fpage>
          -
          <lpage>363</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Kaplan</surname>
            ,
            <given-names>M.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Feinstein</surname>
            ,
            <given-names>A.R.:</given-names>
          </string-name>
          <article-title>The importance of classifying initial co-morbidity in evaluating the outcome of diabetes mellitus</article-title>
          .
          <source>J Chronic Dis</source>
          .
          <volume>27</volume>
          (
          <issue>7-8</issue>
          ),
          <fpage>387</fpage>
          -
          <lpage>404</lpage>
          (
          <year>1974</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Charlson</surname>
            ,
            <given-names>M.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pompei</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ales</surname>
            ,
            <given-names>K.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>MacKenzie</surname>
            ,
            <given-names>R.C.</given-names>
          </string-name>
          :
          <article-title>A new method of classifying prognostic comorbidity in longitudinal studies: development and validation</article-title>
          .
          <source>J Chronic Dis</source>
          .
          <volume>40</volume>
          (
          <issue>5</issue>
          ),
          <fpage>373</fpage>
          -
          <lpage>83</lpage>
          (
          <year>1987</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Starfield</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weiner</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mumford</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steinwachs</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Ambulatory care groups: a categorization of diagnoses for research and management</article-title>
          .
          <source>Health Serv Res</source>
          .
          <volume>26</volume>
          (
          <issue>1</issue>
          ),
          <fpage>53</fpage>
          -
          <lpage>74</lpage>
          (
          <year>1991</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
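          <string-name>
            <surname>Schellevis</surname>
            ,
            <given-names>F.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van der Velden</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van de Lisdonk</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Eijk</surname>
            ,
            <given-names>J.T.M.</given-names>
          </string-name>
          ,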
          <string-name>
            <surname>van Weel</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Comorbidity of chronic diseases in general practice</article-title>
          .
          <source>J Clin Epidemiol</source>
          .
          <volume>46</volume>
          (
          <issue>5</issue>
          ),
          <fpage>469</fpage>
          -
          <lpage>73</lpage>
          (
          <year>1993</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>van Weel</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Chronic diseases in general practice: the longitudinal dimension</article-title>
          .
          <source>European Journal of General Practice</source>
          .
          <volume>2</volume>
          ,
          <fpage>17</fpage>
          -
          <lpage>21</lpage>
          (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Davila</surname>
            ,
            <given-names>E.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hlaing</surname>
            ,
            <given-names>W.M.:</given-names>
          </string-name>
          <article-title>Comorbidities of Patients with Hypertension Admitted to Emergency Departments in Florida Hospitals</article-title>
          .
          <source>Florida Public Health Review</source>
          .
          <volume>5</volume>
          ,
          <fpage>84</fpage>
          -
          <lpage>92</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Hollander</surname>
            ,
            <given-names>P.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kushner</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Type 2 Diabetes Comorbidities and Treatment Challenges: Rationale for DPP-4 Inhibitors</article-title>
          .
          <source>Postgraduate Medicine</source>
          .
          <volume>122</volume>
          (
          <issue>3</issue>
          ),
          <fpage>71</fpage>
          -
          <lpage>80</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>10. Fact sheet on diabetes, http://www.who.int/mediacentre/factsheets/fs312/en/index.html</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Dalianis</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hassel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Velupillai</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>The Stockholm EPR Corpus Characteristics and Some Initial Findings</article-title>
          .
          <source>In: Proceedings of ISHIMR</source>
          <year>2009</year>
          ,
          <source>14th International Symposium for Health Information Management Research</source>
          , Kalmar (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Socialstyrelsen</surname>
          </string-name>
          (
          <article-title>The National Board of Health and Welfare</article-title>
          , In Swedish):
          <source>Kodningskvalitet i patientregistret - Slutenvård 2007. Artikelnummer: 2009-125-1</source>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>