=Paper= {{Paper |id=Vol-1515/regular6 |storemode=property |title=Formal representation of disorder associations in SNOMED CT |pdfUrl=https://ceur-ws.org/Vol-1515/regular6.pdf |volume=Vol-1515 |dblpUrl=https://dblp.org/rec/conf/icbo/CheethamGGHS15 }} ==Formal representation of disorder associations in SNOMED CT== https://ceur-ws.org/Vol-1515/regular6.pdf
Formal representation of disorder associations in SNOMED CT
Edward Cheetham1, Yongsheng Gao2, Bruce Goldberg3, Robert Hausam4, and Stefan Schulz5,*
1 Health and Social Care Information Centre, UK
2 International Health Terminology Standards Development Organisation, Copenhagen, Denmark
3 Kaiser Permanente, USA
4 Hausam Consulting LLC, Midvale, UT, USA
5 Institute of Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria



* To whom correspondence should be addressed: stefan.schulz@medunigraz.at
Copyright © 2015 for this paper by its authors. Copying permitted for private and academic purposes.

ABSTRACT
Medical terminologies like SNOMED CT often provide codes for frequently co-occurring associations of findings and disorders, such as syndromes or diseases with sequelae. The current release of SNOMED CT still lacks a principled solution for representing these concepts, which was the reason for the IHTSDO project group "Event, Condition, Episode" to elaborate a well-founded approach based on criteria of formal ontology. The group analysed complex SNOMED CT terms and proposes a simple solution, which draws on the interpretation of findings, disorders, and diseases as clinical life phases. Co-occurrence, temporal relatedness and causal relatedness are represented by distinct modelling patterns in OWL DL.

1 INTRODUCTION
A main purpose of clinical terminologies is to support semantic annotation of the content of medical records. Consequently, in many terminology systems such as ICD-9 and ICD-10, in the draft of the upcoming ICD-11 (WHO, 2015), as well as in SNOMED CT (IHTSDO, 2015), numerous codes denote clinical phenomena that frequently co-occur or are temporally related, so that complex disorders like Pericarditis with pericardial effusion, or Vitamin B12 deficiency anaemia due to malabsorption can be encoded in one step. Extreme cases are codes for highly specific clinical scenarios like Extradural haemorrhage following injury without open intracranial wound and with prolonged loss of consciousness (more than 24 hours) without return to pre-existing conscious level.
A review of all English terms in the UMLS Metathesaurus (NLM, 2015), which constitutes the biggest collection of biomedical terminology, reveals the importance of coordinating expressions or particles as parts of domain-specific terms (Table 1). Most of these terms denote findings, events, disorders, and procedures. In SNOMED CT, the picture is similar, as shown in Table 2. Many of these terms had been incorporated into SNOMED CT because of efforts to align ICD-9 and ICD-10 with SNOMED CT.

Table 1. English UMLS terms containing coordinating and temporal connectors, related to all English UMLS terms.
Substring          Count      Rate
' after '          3,899      0.1%
' and '          337,706      4.7%
' caused by '      3,605      0.0%
' due to '        29,223      0.4%
' with '         231,128      3.2%
' without '       21,131      0.3%
ALL              626,692      8.7%

Table 2. Distribution of coordinating and temporal connectors in English SNOMED CT fully specified names (percentages, rightmost column absolute count). Total number of concepts approx. 300,000.
Hierarchy       ' after '   ' and '   ' caused by '   ' due to '   ' with '   ' without '      Σ       Σabs
Body Struct.       .0         2.5         .0             .0           .6         .0           3.1        951
Clin. Finding      .1         3.0         .1            2.7          4.7        1.3          11.9     11,974
Event              .6         8.1       12.0            1.9          9.6        3.1          44.3      1,627
Obs. Entity        .3         2.3         .0             .0           .5         .0           3.1        257
Product            .0         1.5         .0             .0           .0         .0           1.5        259
Phys. Object       .0         2.3         .0             .0          5.4        1.2           9.0        408
Procedure          .1         5.8         .0             .0          5.8         .3          12.1      6,497
Qual. Value        .2         1.1         .0             .0           .4         .0           1.8        162
Situation          .3         1.8         .0             .2          3.9         .4           6.6        243
Substance          .0         1.1         .0             .0           .4         .0           1.5        353
Others             .0         1.2         .0             .0           .1         .0           1.3        584
ALL                .0         2.9         .2            1.0          3.0         .5           7.7     23,039

While SNOMED CT is increasingly incorporating principles of applied ontology and provides a description logics (DL) (Baader et al., 2007) based version implementing OWL EL (Motik et al., 2012), the current representation of coordinating expressions in SNOMED CT does not follow clearly defined patterns. For instance, the definition of the SNOMED CT concept Diabetic retinopathy (disorder) uses the relation 'associated with' for linking with Diabetes mellitus, whereas Paraneoplastic neuropathy (disorder) is connected to Neoplastic disease using 'due to'. Another example is given by the concepts Dermatomycosis associated with AIDS (disorder) and AIDS with dermatomycosis (disorder), which appear to be duplicates. Whereas the former uses the relation 'associated with' for establishing a connection with the concept AIDS, the latter is represented as a subclass of AIDS. This motivated the project group Event, Condition, Episode Model (ECE) of IHTSDO (footnote 1), the organization that maintains SNOMED CT, to conduct a thorough investigation of this phenomenon and to suggest a solution that is in line with current principles of ontology development in SNOMED CT.

1 http://www.ihtsdo.org/participate/project-groups

2 METHODS
The ECE group decided to limit the scope of the investigation to the SNOMED CT hierarchy Clinical Finding / Disorder, following IHTSDO's current strategic directions in the content development process (IHTSDO, 2010). SNOMED CT statements that implicitly include negation were also not considered because they are not expressible in OWL EL. All group members selected SNOMED CT term samples that represented coordination phenomena, in order to propose recurring modelling patterns. Having done this, the group discussed the underlying meaning, in particular the ontological commitment of the sample Finding / Disorder concepts and the underlying semantics with regard to time and causality. As an ontological reference, BioTopLite2 (BTL2) (Schulz & Boeker, 2013), an upper-level ontology based on OWL DL and tailored for the biomedical domain, was used. BTL2 provides a small set of upper-level classes, mappable to BFO (2015). All BTL2 classes exhibit a set of constraining axioms using a set of canonical relations, partly derived from the OBO Relation Ontology (Smith et al., 2005). BTL2 heavily constrains the freedom of the ontology engineer, which warrants a higher predictability of the ontologies produced.

Table 3. Four patterns found for "X with Y"
Pattern   Definition                                                                                                     Example
1         Both X and Y are co-occurrent, but with no causality or manifestational relationship between X and Y          Hay fever with asthma
2         X is due to Y, but X and Y are not necessarily co-occurrent                                                    Disorder of optic chiasm due to non-pituitary neoplasm
3         X temporally follows Y. This does not specify that X is due to Y, although causality is frequently implied     Postvaricella encephalitis
4         X is due to Y, and both X and Y are co-occurrent                                                               Hernia, with intestinal obstruction

3 RESULTS

3.1 Typology and ontological analysis
Our analysis yielded four patterns of coordinative expressions in the SNOMED CT Finding / Disorder hierarchy, as shown in Table 3.
Further analysis focused on the following questions:
• Which exactly are the entities that are denoted by the concepts under scrutiny?
• Which temporal relationships have to be distinguished?
• What does causality mean and how is it linked to temporality?
According to Schulz et al. (2012b), we interpret all finding / disorder codes as Clinical Situations or Clinical Life Phases (we will use the latter term and indicate it by the suffix "CLP"), i.e. a patient's life phase during which a clinically relevant condition is present. For instance, the SNOMED CT concept EncephalitisCLP denotes the class of processual entities of the type Life phase, in which some encephalitis process is present at every temporal instant covered by this life phase. Accordingly, HerniaCLP denotes the class of processual entities of the type Life phase, in which the material disorder Hernia is fully present. The advantage of this interpretation is that we do not have to deal with hierarchies of entity types of different ontological categories under the same umbrella. To this end, BTL2 provides the defined class Condition (the disjunction of Material object, Disposition and Process; Schulz et al., 2011a) and the class Situation as a life phase during which some condition holds: an XCLP is a Life phase during which some condition X is fully present. If John has constant headache today from 6am to 11pm, this period of his life is of the type HeadacheCLP. If he is seen by a doctor between 3pm and 3.10pm, this ten-minute lifespan is a new instance of the same type. If he also suffers from diabetes mellitus, then these life phases also instantiate 'Diabetes mellitus'CLP. We formalize this in OWL DL in the following way, using OWL Manchester Syntax:

XCLP equivalentTo 'Clinical life phase' and
        'has condition' some X

The relation 'has condition' in BTL2 holds between a life phase and an entity that is constantly present during this life phase. Independently, we have to look at temporality, where we need to clarify what "following" and "co-occurring" exactly mean. In BFO2 we find the relation 'is preceded by', which is defined as relating two processes, one of which ends before the second one begins.
A commonly accepted framework for describing temporal relations is Allen's (1983) interval calculus (Fig. 1). Compared to this, a relational statement based on BTL2 "x is preceded by y" corresponds to either the Allen-based statement "y takes place before x" or "y meets x".
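The correspondence between 'is preceded by' and Allen's relations stated in Section 3.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: intervals are encoded as hypothetical (start, end) number pairs, and the function names `allen_relation` and `is_preceded_by` are ours, not part of BTL2 or SNOMED CT.

```python
from typing import Tuple

# Hypothetical encoding: an interval is a (start, end) pair with start < end.
Interval = Tuple[float, float]


def allen_relation(x: Interval, y: Interval) -> str:
    """Return the Allen relation (base relation or converse) holding from x to y."""
    xs, xe = x
    ys, ye = y
    if xs >= xe or ys >= ye:
        raise ValueError("intervals must have positive duration")
    if xe < ys:
        return "before"
    if xe == ys:
        return "meets"
    if ye < xs:
        return "after"            # converse of "before"
    if ye == xs:
        return "met-by"           # converse of "meets"
    if xs == ys and xe == ye:
        return "equal"
    if xs == ys:
        return "starts" if xe < ye else "started-by"
    if xe == ye:
        return "finishes" if xs > ys else "finished-by"
    if ys < xs and xe < ye:
        return "during"
    if xs < ys and ye < xe:
        return "contains"         # converse of "during"
    # Only proper partial overlap remains at this point.
    return "overlaps" if xs < ys else "overlapped-by"


def is_preceded_by(x: Interval, y: Interval) -> bool:
    """'x is preceded by y' read as: y takes place before x, or y meets x."""
    return allen_relation(y, x) in {"before", "meets"}


# Illustrative values only: a varicella phase and two candidate encephalitis phases.
varicella = (0, 10)
print(is_preceded_by((10, 20), varicella))  # encephalitis starts when varicella ends
print(is_preceded_by((8, 20), varicella))   # encephalitis starts while varicella persists
```

The second call shows the case that 'is preceded by' rules out: the two phases overlap, so neither "before" nor "meets" holds.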






[Fig. 1. Base relations of Allen's interval calculus: X takes place before Y, X meets Y, X overlaps with Y, X starts Y, X during Y, X finishes Y, X is equal to Y. The converse relations are not depicted.]

Looking at the examples where we had asserted co-occurrence, we agreed to interpret "x co-occurs with y" as the disjunction of "x starts y", "x during y", "x finishes y", and "x is equal to y". Let us take the example Hay fever with asthma (Fig. 2). We have three Clinical life phase entities: 'Hay fever with asthma'CLP, 'Hay fever'CLP, and AsthmaCLP. The possible temporal patterns result from any combination of the left-hand side of Fig. 2 with its right-hand side. All temporal instants of 'Hay fever with asthma'CLP temporally coincide with some instant of 'Hay fever'CLP and some instant of AsthmaCLP.

[Fig. 2. Co-occurrence of combined situations. The left-hand panels show the temporal position of Hay fever with Asthma relative to Hay fever; the right-hand panels show its position relative to Asthma.]

Finally, we will have a look at causality. Notwithstanding past and on-going philosophical debates about its nature, we consider the notion of causality as a primitive predicate, which is essential for medical reasoning and decision-making. Whether y follows x accidentally or because it is caused by x is seen as fundamentally different. There are important temporal implications of causality. It is a truism that an effect cannot precede its cause, or conversely, an effect has to follow its cause. Referring to the Allen calculus, "x causes y" would then be only compatible with "x takes place before y", "x meets y", "x overlaps y" as well as with (switching the arguments x and y) "y during x" and "y finishes x". All these relations have in common that the starting point of x precedes the starting point of y.

3.2 Proposal of modelling patterns
In the following, we will propose modelling solutions for 'X with Y' concepts in SNOMED CT for the frequently encountered clinical patterns in Table 3. The simplest modelling approach for representing X with Y concepts in SNOMED CT is:

1. Both X and Y are co-occurrent, but with no causality between X and Y.
XCLP with YCLP, both simply asserted as co-occurring, and no known causal/manifestational relationship implied:

XwithYCLP equivalentTo XCLP and YCLP

XwithYCLP denotes the class of life phases that are characterised by the full presence of both the conditions X and Y:

XwithYCLP equivalentTo 'Clinical life phase' and
        'has condition' some X and
        'has condition' some Y

It can be shown that both definitions are equivalent, by logical transformation or by a reasoner like HermiT (2015). An important result (as represented by the second definition) is that for each time interval [t1; t2] any single human is considered to have one single life phase, which is characterised by the conditions that are wholly present in this interval. As discussed in Schulz et al. (2011b), the subsumption of complex disorder classes by their constituent disorder classes is a characteristic phenomenon in many disease / disorder terminologies, and the life phase interpretation puts it on solid ground.
It can easily be shown that the same applies for complex clinical life phase types with more than two conjoints, e.g. in the case of the Tetralogy of Fallot (Schulz et al., 2011b), a combined heart defect as an emblematic example.

2. X is due to Y but X and Y are not necessarily co-occurrent
Here, the correct way would be to assert causality between the conditions X and Y.

XcausedByY equivalentTo X and 'is caused by' some Y

However, according to our interpretation, SNOMED CT disorder concepts are clinical life phases and the underlying conditions are not or are only indirectly available (footnote 2). We therefore axiomatically extend the notion of causation and allow that a clinical life phase is causally related to another clinical life phase. To express this, we use the SNOMED CT relation 'due to'.

XcausedByYCLP equivalentTo XCLP and
        'due to' some YCLP

This is a simplification of the correct representation, which should be (using the BTL2 relation 'is caused by'):

*XcausedByYCLP equivalentTo 'Clinical life phase' and
        'has condition' some (X and 'is caused by' some Y)

Due to the problem of referring to clinical conditions in SNOMED CT, in the modelling pattern we propose, 'due to' connects two CLPs that are related by the fact that the first one has a condition that is caused by a condition that defines the second one. E.g., all instances of 'Disorder of optic chiasm due to non-pituitary neoplasm'CLP imply an instance of 'Non-pituitary neoplasm'CLP (which implies an instance of 'Non-pituitary neoplasm'). We consider this approximation as sufficient for the reasoning services required.

3. X temporally follows Y. This does not specify that X is due to Y, although causality is frequently implied

XfollowsYCLP equivalentTo XCLP and
        follows some YCLP

As mentioned before, the BTL2 relation 'is preceded by' excludes the Allen relation overlaps. We argue that the relation follows should include overlaps, as this is common in medicine. E.g., Postvaricella encephalitis might include cases in which the varicella infection has not ended at the inception of the complication, viz. Encephalitis.

4. X is due to Y and both X and Y are co-occurrent
Here we propose a combination of patterns 1 and 2:

XdueToCooccurringYCLP equivalentTo
        XCLP and YCLP and 'due to' some YCLP

It looks uncommon that the same class YCLP appears both as a superclass and as a class related via the relation 'due to'. Taking the example 'Hernia with intestinal obstruction'CLP: All life phases of this type are both Hernia life phases and Intestinal obstruction life phases, and they are related, additionally, to a second Hernia life phase (which is a different one but is assumed to refer to the same hernia object). This second life phase is, actually, one that precedes the inception of the complication, in this case the intestinal obstruction.

3.3 Special cases
There are other cases that we have excluded from our typology, but which, nevertheless, deserve consideration:
• Terms of the type 'Abscess of urethral gland due to Neisseria gonorrhoeae'. Here, the right hand side of the particle "due to" denotes a material agent, not a process. In BTL2 this would be expressed by the relation 'has agent' and could be modelled in the following way (note that in this case, the relation 'has condition' is a paraphrase of the role group relation in SNOMED CT):

    XwithAgentACLP equivalentTo XCLP and
            'has condition' some ('has agent' some A)

• Associativity. It can be shown that the following rule always holds and can be easily reduced to pattern one:

    XwithYCLP and ZCLP equivalentTo
    XCLP and YwithZCLP equivalentTo
    XCLP and YCLP and ZCLP

  E.g. 'Diabetes mellitus with hyperosmolar coma'CLP superficially appears to be an X with Y pattern, but is more appropriately an example of X with Y with Z (Diabetes mellitus, Hyperosmolar state, Coma). There is no nesting.
• On this basis, more complex chained sequences according to pattern four are possible using the 'due to' relationship. This might be represented as:

    'Diabetes mellitus with hyperosmolar coma'CLP
    equivalentTo
            'Diabetes mellitus'CLP and
            ('Hyperosmolar state'CLP and
              'due to' some 'Diabetes mellitus'CLP) and
            (ComaCLP and
              'due to' some 'Hyperosmolar state'CLP)
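The subsumption and associativity claims made for pattern 1 can be illustrated with a small set-based sketch in Python. It is a toy finite model, not an OWL reasoner: the `LifePhase` record, the helper `clp` and all sample data are hypothetical, and a check on one finite model illustrates the claims rather than proving them.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class LifePhase:
    """Hypothetical life phase: a patient, a time span, and the conditions fully present in it."""
    patient: str
    start: int
    end: int
    conditions: FrozenSet[str]


def clp(condition: str, phases: Set[LifePhase]) -> Set[LifePhase]:
    """Extension of XCLP in the toy model: life phases that 'have condition' X."""
    return {p for p in phases if condition in p.conditions}


# Invented sample data for illustration only.
phases = {
    LifePhase("john", 0, 5, frozenset({"HayFever"})),
    LifePhase("john", 5, 9, frozenset({"HayFever", "Asthma"})),
    LifePhase("mary", 2, 7, frozenset({"Asthma", "Diabetes"})),
    LifePhase("mary", 7, 8, frozenset({"HayFever", "Asthma", "Diabetes"})),
}

hay = clp("HayFever", phases)
asthma = clp("Asthma", phases)
diabetes = clp("Diabetes", phases)

# First definition: XwithYCLP equivalentTo XCLP and YCLP (class intersection).
hay_with_asthma = hay & asthma

# Second definition: a life phase that has both conditions.
hay_with_asthma_direct = {p for p in phases if {"HayFever", "Asthma"} <= p.conditions}

# Both definitions select the same life phases in this model.
same_extension = hay_with_asthma == hay_with_asthma_direct

# The complex class is subsumed by each constituent class.
subsumed = hay_with_asthma <= hay and hay_with_asthma <= asthma

# Associativity: (X with Y) and Z equals X and (Y with Z).
associative = (hay & asthma) & diabetes == hay & (asthma & diabetes)

print(same_extension, subsumed, associative)
```

Reading OWL conjunction as set intersection is what makes the complex class a subclass of each constituent; the causal and temporal patterns would need relations between life phases, which this sketch leaves out.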
2 In the case of fully defined disorder concepts, the condition would correspond to the combination of location with morphology, inside the role groups.
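The temporal readings given in Sections 3.1 and 3.2 for co-occurrence, 'is preceded by', follows and causation can be checked mechanically against Allen's relations. The Python sketch below does this by brute force over small integer intervals. The endpoint conditions in `PROPOSED` are our reading of the prose (co-occurrence as "Y lies within X", 'is preceded by' as "X ends no later than Y starts", follows and the temporal part of 'due to' as "X starts before Y starts"); they are assumptions made for illustration, not definitions taken from BTL2 or SNOMED CT.

```python
from itertools import product

# Allen relations between X = (xs, xe) and Y = (ys, ye), named as in the paper.
ALLEN_ROWS = {
    "X takes place before Y": lambda xs, xe, ys, ye: xe < ys,
    "X meets Y":              lambda xs, xe, ys, ye: xe == ys,
    "X overlaps with Y":      lambda xs, xe, ys, ye: xs < ys < xe < ye,
    "Y starts X":             lambda xs, xe, ys, ye: xs == ys and ye < xe,
    "Y during X":             lambda xs, xe, ys, ye: xs < ys and ye < xe,
    "Y finishes X":           lambda xs, xe, ys, ye: xs < ys and ye == xe,
    "X is equal to Y":        lambda xs, xe, ys, ye: xs == ys and xe == ye,
}

# Assumed endpoint readings of the proposed relations (see lead-in).
PROPOSED = {
    "Y co-occurs with X":             lambda xs, xe, ys, ye: xs <= ys and ye <= xe,
    "Y is preceded by X":             lambda xs, xe, ys, ye: xe <= ys,
    "Y follows X":                    lambda xs, xe, ys, ye: xs < ys,
    "Y is due to X (temporal part)":  lambda xs, xe, ys, ye: xs < ys,
}


def compatible_rows(proposed, bound=5):
    """List the Allen relations whose every sampled instance satisfies `proposed`."""
    points = range(bound)
    pairs = [
        (xs, xe, ys, ye)
        for xs, xe, ys, ye in product(points, repeat=4)
        if xs < xe and ys < ye  # keep only proper intervals
    ]
    result = []
    for name, holds in ALLEN_ROWS.items():
        instances = [p for p in pairs if holds(*p)]
        if instances and all(proposed(*p) for p in instances):
            result.append(name)
    return result


for relation, condition in PROPOSED.items():
    print(relation, "->", compatible_rows(condition))
```

Under these assumed readings, the enumeration can be compared row by row with Table 4 in the Discussion. Causation is treated here only through its temporal precondition; the causal link itself stays a primitive, as in the text.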







4 DISCUSSION
Table 4 gives an overview of the relations used in the proposed models of our approach and their mapping to the Allen relations. As the BTL2 relation 'is preceded by' does not allow overlap, it seems too strict. We prefer the relation follows, which makes the minimal assumption that the beginning of Y is later than the beginning of X. This is also the assumption of the causality relation, which appears as a subrelation of follows. The proposal to include an overlap pattern for temporally following again blurs the distinction between pattern two and pattern three. This might be acceptable if we do not want to distinguish sequelae from other types of complications. When investigating definitions of sequelae under the concepts Sequela (finding) or Sequela of disorders (disorder), we found chronic or residual conditions that are complications of acute conditions and that occur after the acute disease or injury phase. Sequelae can also be the result of the treatment of the primary condition. There is no time limit on when a late effect can occur; the residual condition may come directly after the disease or condition, or years later. This leaves open whether the inciting condition is still present when the complication commences. In case there is a requirement to represent sequelae (late effects) as distinct from e.g. immediate complications, it might be worthwhile to define sequelae as not overlapping with their cause, and for this case to indeed use the BTL2 relation 'is preceded by'.

Table 4. Allen relations compatible with the relations used in our models (√ = compatible, - = not compatible).
Allen relation            Y co-occurs with X   Y is preceded by X   Y follows X   Y is due to X
X takes place before Y    -                    √                    √             √
X meets Y                 -                    √                    √             √
X overlaps with Y         -                    -                    √             √
Y starts X                √                    -                    -             -
Y during X                √                    -                    √             √
Y finishes X              √                    -                    √             √
X is equal to Y           √                    -                    -             -

5 CONCLUSION AND FURTHER WORK
The proposed patterns have been prototypically implemented in SNOMED CT and have achieved better semantic clarity and consistency in terminology creation and maintenance. The formal analysis of temporal and causative relationships has proved useful for determining the patterns.
Limitations identified and resulting tasks will be addressed by the ECE working group in the future:
• To evaluate if the proposed patterns are generic and can be applied throughout SNOMED CT, especially to concepts in the Event and Procedure hierarchies.
• To prove theoretically and empirically that the proposed patterns do not produce unexpected classification results, especially as a consequence of the simplification by asserting 'due to' between situations and not between the underlying conditions.
• To check the impact of the new models on classification time.
• To extend the approach to negated conditions ("without") by scenarios that extend DL expressiveness or that represent negations as primitives. The impact on reasoning behaviour will also be investigated.
• To propose adjustments to the SNOMED CT naming conventions in the light of the new model.

REFERENCES
Allen, J. F. (1983). Maintaining knowledge about temporal intervals. Communications of the ACM 26(11), 832–843.
Baader, F. et al. (2007). The Description Logic Handbook. Cambridge: Cambridge University Press.
BFO (2015). Basic Formal Ontology. http://ifomis.uni-saarland.de/bfo/
HermiT (2015). HermiT OWL reasoner. http://www.hermit-reasoner.com/
IHTSDO (2010). Strategic Directions to 2015. http://www.ihtsdo.org/resource/resource/1
IHTSDO (2015). International Health Terminology Standards Development Organisation. SNOMED CT.
Motik, B. et al. (2012). OWL 2 Web Ontology Language Profiles (Second Edition). W3C Recommendation. http://www.w3.org/TR/owl2-profiles/
NLM (2015). U.S. National Library of Medicine. Unified Medical Language System (UMLS). http://www.nlm.nih.gov/research/umls
Schulz, S. et al. (2011a). Scalable representations of diseases in biomedical ontologies. Journal of Biomedical Semantics. May 17; 2 Suppl 2: S6.
Schulz, S. & Boeker, M. (2013). An Upper Level Ontology for the Life Sciences. Evolution, Design and Application. In: Furbach, U. & Staab, S. (eds.). Informatik 2013. IOS Press.
Schulz, S. et al. (2011b). Consolidating SNOMED CT's ontological commitment. Applied Ontology 6: 1-11.
Schulz, S. et al. (2012b). Competing interpretations of disorder codes in SNOMED CT and ICD. AMIA Annu Symp Proc. 819-827.
Smith, B. et al. (2005). Relations in biomedical ontologies. Genome Biology 6(5): R46.
WHO (2015). International Classification of Diseases. http://www.who.int/classifications/icd/en/



