=Paper=
{{Paper
|id=Vol-2518/paper-ODLS5
|storemode=property
|title=Addressing the Negation Gap in SNOMED CT by Reified Negated Concepts
|pdfUrl=https://ceur-ws.org/Vol-2518/paper-ODLS5.pdf
|volume=Vol-2518
|authors=Catalina Martínez-Costa,Jose Antonio Miñarro-Giménez,Robert Hausam,Stefan Schulz
|dblpUrl=https://dblp.org/rec/conf/jowo/Martinez-CostaM19
}}
==Addressing the Negation Gap in SNOMED CT by Reified Negated Concepts==
<pdf width="1500px">https://ceur-ws.org/Vol-2518/paper-ODLS5.pdf</pdf>
<pre>
 Addressing the negation gap in SNOMED
     CT by reified negated concepts
        Catalina MARTÍNEZ-COSTA a,1, Jose Antonio MIÑARRO-GIMÉNEZ a,
                           Robert HAUSAM b, Stefan SCHULZ a
            a Institute of Medical Informatics, Statistics and Documentation,
                            Medical University of Graz, Austria
                      b Hausam Consulting LLC, Columbia, MO, USA


            Abstract. Despite increasing performance of computer hardware, reasoning with
            large OWL ontologies still poses some nearly insurmountable challenges. The
            complexity of sound and complete reasoning makes OWL DL intractable for large
            ontologies. OWL-EL appears a good compromise and OWL models following this
            profile have demonstrated good scalability, using specialised reasoners like ELK
            and Snorocket. SNOMED CT is moving more towards description logics and has
            chosen OWL-EL as the representational profile for the reasons mentioned above. A
            major drawback of this is the lack of support of logical negation (NOT) for this
            profile. Many SNOMED CT concepts suggest negation (e.g. by expressions like “A
            without B”, “Absence of X” etc). Based on such lexical patterns, we have identified
            the currently underlying OWL modelling patterns, classified them into distinct
            categories and for each category manually inspected some of their concepts in order
            to assess if they were correctly classified. Finally, we discuss several OWL
            remodelling approaches able to express negation in a tractable way (EL profile) and
            avoiding wrong inferences.

            Keywords. SNOMED CT, OWL DL, negation, ontology design patterns


1. Introduction

SNOMED CT (SCT) is a semantically rich, ontology-based, large clinical healthcare
terminology, which provides a standardized way to represent clinical phrases captured
by clinicians and enables their automatic interpretation [1]. SCT concepts are defined by
following the SNOMED CT concept model, which provides nine main top-level
categories (also known as semantic types), among which we find Clinical finding,
Procedure, Body structure, etc.

    Among other formats, SCT is provided as an OWL ontology, enabling the use of
description logic (DL) reasoning which supports advanced classification and querying.
Given its large size and the decrease of performance of reasoners over large and
expressive OWL ontologies, it implements the EL profile (EL++) [2]. This profile
provides polynomial time algorithms for all the standard reasoning tasks of description
logic but lacks the DL features universal restriction, negation and disjunction.
We found circa 6,000 SNOMED CT concepts out of approximately 311,000 in total that
include negation. They are modelled in different ways, many of which lead to unintended
inferences such as Absence of finger implying Absence of hand, or No Back Pain entailing
No Pain.

      1
        Corresponding Author: Catalina Martínez-Costa, Institute for Medical Informatics, Statistics and
Documentation, Medical University of Graz, 8036 Graz, Austria, catalina.martinez@medunigraz.at. Copyright
© 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0
International (CC BY 4.0).

     In this work we identified OWL modelling patterns corresponding to lexical patterns
that are typical for SNOMED CT concepts that bear some implicit negation. We
classified them into several categories and inspected sample concepts of each category
in order to assess if they were correctly classified. Finally, different alternatives and steps
toward remodelling are proposed.


2. Material and Methods

Within a PostgreSQL database we checked the SNOMED CT January 2019 release for
the following lexical negation patterns and identified the number of concepts involved
(see Table 1). In total, there are 5,823 SNOMED CT concepts with labels that lexically
include negation cues. Most of them were descendants of Clinical Finding (4,122),
followed by Situation (713). The substrings “not” and “without” were the two most
frequent lexical cues (3,536).
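
The lexical screening described above can be illustrated with a minimal Python sketch.
It is not the PostgreSQL query used for the study: the cue list is taken from Table 1,
whereas the word-boundary matching, the function names cues_in and count_cues, and the
sample labels are assumptions made for illustration only.

```python
import re
from collections import Counter

# Lexical cues as listed in Table 1. Matching on word boundaries is an
# assumption; the paper does not state how substrings were delimited.
CUES = ["absence", "absent", "no", "lack of", "not", "without",
        "non", "never", "negative", "excluded"]

CUE_PATTERNS = {
    cue: re.compile(r"\b" + re.escape(cue) + r"\b", re.IGNORECASE)
    for cue in CUES
}


def cues_in(label):
    """Return the negation cues that occur in one concept label."""
    return [cue for cue, pattern in CUE_PATTERNS.items() if pattern.search(label)]


def count_cues(labels):
    """Count, per cue, how many labels contain it."""
    counts = Counter()
    for label in labels:
        for cue in cues_in(label):
            counts[cue] += 1
    return counts


# Hypothetical sample; a real run would read all labels of a release.
sample_labels = [
    "Does not jump (finding)",
    "Joint not swollen (situation)",
    "Open wound without complication (disorder)",
    "No history of migraine (situation)",
    "Back pain (finding)",
]
counts = count_cues(sample_labels)
```

A label can contain several cues, so per-cue counts obtained this way need not add up to
the number of distinct concepts.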

Table 1. Lexical negation patterns: number of concepts per pattern and semantic type

                                       Semantic type
 Lexical pattern   Concept count   Clinical Finding   Situation   Procedure

 “Absence”               354              300             18           2
 “Absent”                446              359             70           2
 “No”                    956              573            236*         14
 “Lack of”               110              105              0           5
 “Not”                  1743             1191*           323*         96*
 “Without”              1793             1331*            15         357*
 “Non”                    41               24              1           3
 “Never”                  22                0             17           5
 “Negative”              350              236             28          23
 “Excluded”                8                3              5           0
 Total                  5823             4122            713         507

 * The two most frequent patterns per semantic type (set in bold in the original layout).


     For each semantic type we have selected the two most frequent patterns (highlighted
in bold in Table 1) and inspected their hierarchies in order to find concepts wrongly
classified. For concepts where inappropriate reasoning results were found, alternative
modelling patterns are proposed and discussed.


3. Results

A frequent lexical pattern is the substring “not” in the label. Of the 1,191 such concepts
under Clinical Finding, 818 were fully defined, i.e. defined by equivalentTo axioms in
OWL. More than half of them (633) were negated verb phrases containing the
substring “does not”, most of which denote fully defined concepts that follow the pattern
shown below. To define a new concept following this pattern, the variable parts (“X”) are
substituted by words that represent some activity. Note the property ‘Role group (attribute)’,
a unique SCT construct used to group attribute-value pairs for classification purposes [3].

 'Does not X (finding)' equivalentTo
   'Finding related to ability to X (finding)'
     and 'Role group (attribute)' some (
           'Has interpretation (attribute)' some 'Does not (qualifier value)' and
           'Interprets (attribute)' some 'Ability to X (observable entity)')


     An example of a wrongly classified concept is Does not jump (finding). Its inferred
definition (subclassOf part) classifies it as a subclass of Does not move (finding).
This is factually wrong: from the fact that a patient does not jump one cannot
derive that he or she does not move. A closer look at the definition (equivalentTo part)
shows that it “interprets” Ability to jump (observable entity), which is subsumed by
Ability to move (observable entity). This subsumption is correct in itself, but it
causes the wrong classification of Does not jump (finding).

‘Does not jump (finding)’
subclassOf
           ‘Does not move (finding)’
           ‘Finding related to ability to jump (finding)’
equivalentTo
           'Finding related to ability to jump (finding)'
                       and 'Role group (attribute)' some (
                                   'Has interpretation (attribute)' some 'Does not (qualifier value)' and
                                   'Interprets (attribute)' some 'Ability to jump (observable entity)')
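
The mechanism behind this unintended classification can be reproduced in a toy model.
The following Python sketch is a simplified structural subsumption check, not an EL
reasoner such as ELK or Snorocket. Role groups are flattened, semantic tags are omitted,
the definition of 'Does not move' is assumed to follow the pattern above with X = move,
and the subsumption between the two 'Finding related to ability to ...' concepts is
likewise an assumption.

```python
# Toy is-a hierarchy (child -> set of parents).
# The second entry is an assumption made for this illustration.
PARENTS = {
    "Ability to jump": {"Ability to move"},
    "Finding related to ability to jump": {"Finding related to ability to move"},
}


def is_a(sub, sup):
    """True if sub equals sup or reaches it through PARENTS."""
    if sub == sup:
        return True
    return any(is_a(parent, sup) for parent in PARENTS.get(sub, ()))


# Fully defined concepts: (genus, {attribute: value}).
DEFINITIONS = {
    "Does not jump": (
        "Finding related to ability to jump",
        {"Has interpretation": "Does not", "Interprets": "Ability to jump"},
    ),
    "Does not move": (
        "Finding related to ability to move",
        {"Has interpretation": "Does not", "Interprets": "Ability to move"},
    ),
}


def subsumed_by(c, d):
    """c is subsumed by d if c's genus and every attribute value of d are matched
    by an equal or more specific counterpart in c (existential restrictions only)."""
    genus_c, roles_c = DEFINITIONS[c]
    genus_d, roles_d = DEFINITIONS[d]
    return is_a(genus_c, genus_d) and all(
        attr in roles_c and is_a(roles_c[attr], value)
        for attr, value in roles_d.items()
    )


jump_under_move = subsumed_by("Does not jump", "Does not move")
move_under_jump = subsumed_by("Does not move", "Does not jump")
```

Because 'Does not' is treated as an ordinary qualifier value, the check behaves exactly
as it would for a positive finding: the more specific ability yields the more specific
concept, which is the opposite of what negation requires.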


     Another 323 concepts from the Situation hierarchy (238 are non-primitive) have
“not” in their label, more specifically concepts under ‘Clinical finding absent (situation)’
and ‘Procedure with explicit context (situation)’, following (roughly) the corresponding
pattern:


 ‘Not Clinical finding X (situation)’ equivalentTo
   ‘Clinical finding absent (situation)’
      and 'Role group (attribute)' some (
         'Associated finding (attribute)' some 'Clinical finding X (finding)' and
         'Finding context (attribute)' some 'Known absent (qualifier value)' and
         'Temporal context (attribute)' some 'Current or specified time (qualifier value)' and
         'Subject relationship context (attribute)' some 'Subject of record (person)')


     An example is ‘Joint not swollen (situation)’. The inferred definition wrongly entails
‘Swelling absent (situation)’. Again, a patient might not have swollen joints, but he or she
might have swollen feet. Here the wrong classification occurs due to the axiom “Associated
finding” some ‘Joint swelling (finding)’, the latter being a subclass of ‘Swelling of body
structure (finding)’.

‘Joint not swollen (situation)’
subclassOf
            ‘Swelling absent (situation)’
equivalentTo
            'Clinical finding absent (situation)'
                         and 'Role group (attribute)' some (
                            'Associated finding (attribute)' some 'Joint swelling (finding)' and
                            'Finding context (attribute)' some 'Known absent (qualifier value)' and
                            'Temporal context (attribute)' some 'Current or specified time (qualifier value)' and
                            'Subject relationship context (attribute)' some 'Subject of record (person)')


     Under ‘Procedure with explicit context (situation)’, we found concepts that
represent some contextual information specific to procedures, which syntactically
correspond to a verb (V) as past participle (V-ed) or nominalization (V-ion) preceded by
“not”, as in “not done”, “not wanted”, “not indicated”, “not suspected”, “not needed”,
etc., related to some anatomy term (A).

 'A Not V-ed (situation)' equivalentTo
  'Procedure not done (situation)'
      and 'Role group (attribute)' some (
         'Associated procedure (attribute)' some 'V-ion of A (procedure)' and
         'Procedure context (attribute)' some 'Not V-ed (qualifier value)' and
         'Temporal context (attribute)' some 'Current or specified time (qualifier value)' and
         'Subject relationship context (attribute)' some 'Subject of record (person)')


     In this pattern, the negative meaning resides in the value of the object property
‘Procedure context (attribute)’, which is restricted to ‘Qualifier value (qualifier value)’
concepts, namely in descendants of ‘Context values for action (qualifier value)’ such as
‘Not done (qualifier value)’. An example is ‘Retinae not examined (situation)’, with the
factually wrong entailment ‘Patient not examined (situation)’. The latter concept
is associated with the procedure ‘Physical examination procedure (procedure)’, of which
‘Examination of retina (procedure)’ is a subclass.

‘Retinae not examined (situation)’
subclassOf
           ‘Patient not examined (situation)’
equivalentTo
           'Procedure not done (situation)'
                      and 'Role group (attribute)' some (
                         'Associated procedure (attribute)' some 'Examination of retina (procedure)' and
                         'Procedure context (attribute)' some 'Not done (qualifier value)' and
                         'Temporal context (attribute)' some 'Current or specified time (qualifier value)' and
                         'Subject relationship context (attribute)' some 'Subject of record (person)')


     Of the 96 ‘Procedure (procedure)’ concepts with “not” in their label, only 4 are fully
defined, and here the only hint of a negative meaning is the label, without any element with
negative polarity in the OWL definition. This is why no negation-specific OWL
pattern can be provided here. We analysed the concept ‘Revision of total prosthetic
replacement of shoulder joint not using cement (procedure)’ and did not find any wrong
inference.

‘Revision of total prosthetic replacement of shoulder joint not using cement (procedure)’
subclassOf
      'Prosthetic uncemented total shoulder replacement (procedure)'
      ‘Revision of total prosthetic replacement of shoulder joint (procedure)’
equivalentTo
      'Uncemented total replacement of joint (procedure)'
         and 'Role group (attribute)' some (
            'Revision status (attribute)' some 'Revision - value (qualifier value)' and
            'Method (attribute)' some 'Surgical insertion - action (qualifier value)' and
            'Direct device (attribute)' some 'Total shoulder replacement prosthesis (physical object)' and
            'Procedure site - Indirect (attribute)' some 'Entire glenohumeral joint (body structure)')


    ‘Prosthetic uncemented total shoulder replacement (procedure)’ is a subclass of the
not fully defined concept ‘Uncemented total replacement of joint (procedure)’ which
lacks any formal representation of the property “uncemented”.

     Another frequent lexical cue is “without”. Most of the concepts with “without” in
their label are from Clinical Finding, more specifically among those with the semantic
tag Disorder (1223). 951 are not fully defined, which is unsurprising because the
formalisation of the meaning of “without” would require the negation operator. Among
these concepts we inspected those corresponding to the lexical patterns “without
complication”, “without infection”, “without disease/finding”. We did not find any
wrong classification result regarding the use of “without” in Clinical Finding, besides
the fact that negation is not explicitly modelled. Therefore, no specific ontology pattern
can be given. See the following example:

'Open wound without complication (disorder)'
subclassOf
     'Open wound (disorder)'
           and 'Role group (attribute)' some
               ('Associated morphology (attribute)' some 'Open wound (morphologic abnormality)')


    Many of the Procedure concepts with “without” in their label, and most of the fully
defined ones (103 out of 136), follow the pattern “without contrast”. As with
concepts under Clinical Finding, the meaning of “without” is not explicitly modelled.
We did not find any wrong classification result regarding the use of “without” in
Procedure, apart from the fact that it is not explicitly modelled, e.g. in:

'Imaging procedure without contrast (procedure)' subclassOf
     'Procedure (procedure)'
           and 'Role group (attribute)' some ('Method (attribute)' some 'Imaging - action (qualifier value)')


     Together with “not”, “no” is the cue used by most of the negated Situation concepts,
most commonly under the ‘Clinical finding absent (situation)’ hierarchy. For example, the
concept ‘No history of migraine (situation)’ is wrongly classified under ‘No history of
cardiovascular system disease (situation)’ and under ‘No pain (situation)’. The ontology
pattern is the same as the “Absent-Pattern” introduced above for ‘Clinical finding absent
(situation)’.

'No history of migraine (situation)'
subclassOf
            ‘No history of cardiovascular system disease (situation)’
            'No pain (situation)'
equivalentTo
            'Finding with explicit context (situation)'
                        and 'Role group (attribute)' some (
                                    'Associated finding (attribute)' some 'Migraine (disorder)' and
                                    'Finding context (attribute)' some 'Known absent (qualifier value)' and
                                    'Temporal context (attribute)' some 'All times past (qualifier value)' and
                                    'Subject relationship context (attribute)' some 'Subject of record (person)')


Table 2 provides an overview of lexical negation patterns and their distribution.

Table 2. Lexical negation cues and modelling patterns: number of concepts (non-primitive
concepts in parentheses) per semantic type

 Lexical     Semantic            Nº total concepts   Modelling pattern                  Nº total concepts
 cue         type                (non-primitive)     name                               (non-primitive)

 “Not”       Clinical Finding    1191 (576)          “Does not”                         633 (558)
             Situation           323 (238)           Clinical finding absent            74 (70)
                                                     Procedure with explicit context    228 (168)
             Procedure           96 (4)              Only modelled textually            96 (4)
 “Without”   Clinical Finding    1331 (591)          “Without complication”             125 (59)
                                                     “Without infection”                209 (198)
             Procedure           357 (136)           “Without contrast”                 116 (103)
 “No”        Situation           236 (216)           Clinical finding absent            201 (194)
                                                     No family history                  9 (6)
                                                     No history of                      24 (23)


4. Discussion

Our analysis demonstrated obvious shortcomings of the current modelling of SNOMED
CT content with a negative polarity. Expressing this content in OWL-EL, i.e. a language
that does not provide negation operators, is problematic. Workarounds that use OWL
syntax, but ignore its semantics are inappropriate and lead to improper reasoning results,
as e.g. criticised in the case of the NCI Thesaurus [4].
     There are several possibilities to partly or fully tackle the problem. However, each
of these solutions would require some major re-modelling. In the following, we sketch and
discuss some of the ones that we foresee:

    1.   OWL-EL is extended to OWL-DL, i.e. the negation operator is allowed as a
         logical constructor in the SNOMED CT description logics framework. The
         drawback is that, even if negation were added to only a small number of
         axioms, a dramatic drop in reasoning performance would have to be expected,
         given the theoretical complexity of reasoning beyond OWL-EL and the
         practical experience with it. To mitigate this, OWL reasoners could be
         combined with ontology modularization. In [5], an EL reasoner and a more
         expressive reasoner are combined for ontology classification, based on the
         identification of the minimal non-EL subontology. In [6], a similar approach
         is followed by the MORe reasoner, which combines a fully-fledged DL
         reasoner with a less expressive one.
    2.   The model of meaning of the SNOMED CT Situation hierarchy is revised so
         that it is treated as what it actually is, viz. an information model
         inside SNOMED CT. As such, it should be seen as a frame-like model,
         like openEHR [7] and HL7 FHIR [8], without any formal semantics.
         It could then remain structurally unchanged and would merely be excluded
         from any transformation into OWL. However, this is only a partial solution,
         given that SNOMED CT Situation concepts account for only 12% of all negations.
    3.   Reifying negations. This consists in representing negation within the concept
         label and avoiding the OWL construct ‘not’. This approach already partly
         exists, as shown by the example Imaging procedure without contrast
         (procedure). Such primitive concepts, located high in the hierarchy,
         could then just be co-ordinated with more specific ones like Computed
         tomography of facial bones without contrast (procedure). On the downside, this
         does not, within OWL-EL, allow any reference to the concept Contrast media
         (substance) unless a new relation is introduced. And whereas the contrast media
         are arranged in taxonomic order, hierarchies of reified negated concepts
         would have to be constructed in parallel, cf. Fig. 1. As long as the size of such
         hierarchies is small, this may still be acceptable. However, if large subhierarchies
         are expected in the scope of negations, it would lead to a kind of upside-down
         duplication of large parts of SNOMED CT. Such “ghost hierarchies” would
         entail significant additional maintenance load, unless they are constructed
         automatically.


         Figure 1. Inverted hierarchies for reified SNOMED CT concepts with negative polarity
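
How such inverted hierarchies could be constructed automatically can be sketched in a
few lines of Python. The function name invert_for_negation, the label prefix and the
example axioms (in particular “low back pain”) are hypothetical; the sketch only reverses
the direction of each subclass axiom and is not part of any SNOMED CT tooling.

```python
def invert_for_negation(subclass_pairs, prefix="Absence of "):
    """For each positive axiom (child subclassOf parent), emit the reified
    negated axiom (prefix + parent subclassOf prefix + child)."""
    return [(prefix + parent, prefix + child) for child, parent in subclass_pairs]


# Hypothetical positive taxonomy, given as (child, parent) pairs.
positive = [
    ("back pain", "pain"),
    ("low back pain", "back pain"),
]

negated = invert_for_negation(positive)

# Render the generated axioms in the notation used in this paper.
axioms = ["'%s' subclassOf '%s'" % pair for pair in negated]
```

Each positive axiom yields exactly one negated axiom, so the generated “ghost hierarchy”
has as many subclass links as the positive hierarchy within the scope of the negation.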


5. Conclusion and further work

   SNOMED CT content with more or less explicit negative meaning cannot be neglected.
   Using ten lexical patterns, 5,823 concepts with negative meaning were identified, mostly
   in the subhierarchies Clinical Finding, Procedure and Situation. In the best case,
   negative polarity is implicitly contained in primitive (i.e. not fully defined) concepts; in
   the worst case a negative connotation is ignored in definitional axioms and leads to
   wrong inferences. A remediation of this situation should start with a systematic scrutiny
   of primitive concepts in the qualifier value hierarchy that incorporate a negative polarity,
   like “does not”, “not done”, “absent”. This is necessary because using them in definitions
   leads to paradoxical results: their implicit negation is not paralleled by any
   formal negation. “Absent pain” is not a subtype of pain like “severe pain”, despite the
   same syntactic structure. “Absent pain” implies, e.g., “absent back pain” (among
   thousands of other kinds of pain), whereas “back pain” implies “pain”. There is hardly an
   alternative to using logical negation to express the absence of something, using a pattern
   like ‘Clinical condition’ and not (includes some ‘Pain’). Using more expressive
   reasoning only on small fragments of the ontology that contain negation would be an
   alternative to explore. Another option would be to add such expressions to SNOMED
   CT even though they would be ignored by an OWL-EL reasoner; at least the wrong inference
   (that no back pain entails no pain) would then be avoided. The expected inferences could
   be added by subclass statements of the type ‘Absence of pain’ subclassOf ‘Absence of back pain’.
   Such axioms could be created by batch processes when generating the OWL version,
   however, at the price of large additional “upside-down” subclass hierarchies. Future
   work will concern experimental evaluation of the approaches discussed above.
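
Why true negation reverses the direction of subsumption can be checked extensionally on
a toy population. The patients and findings in the following Python sketch are invented
for illustration, and plain set complement stands in for the OWL ‘not’ constructor; it is
a sanity check of the argument above, not an evaluation of any of the proposed approaches.

```python
# Hypothetical patients and the findings recorded for each of them.
patients = {
    "p1": {"back pain"},
    "p2": {"headache"},
    "p3": set(),
}

# Toy assumption: these two findings are the only kinds of pain.
KINDS_OF_PAIN = {"back pain", "headache"}


def having(finding_kinds):
    """Patients with at least one finding of the given kinds."""
    return {p for p, findings in patients.items() if findings & finding_kinds}


everyone = set(patients)
has_pain = having(KINDS_OF_PAIN)
has_back_pain = having({"back pain"})

# Negation as set complement relative to all patients.
no_pain = everyone - has_pain
no_back_pain = everyone - has_back_pain

positive_direction = has_back_pain <= has_pain   # back pain implies pain
negated_direction = no_pain <= no_back_pain      # no pain implies no back pain
wrong_direction = no_back_pain <= no_pain        # no back pain implies no pain?
```

In this population patient p2 has a headache but no back pain, which is exactly the kind
of counterexample that makes the inference from no back pain to no pain invalid.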

   Acknowledgements. This work has been partly funded by the Precise4Q project within
   the EU H2020 Framework Program, Call: H2020-SC1-2017-CNECT-2,
   agreement 777107 https://precise4q.eu.


References

   [1] SNOMED CT Starter Guide.
       https://confluence.ihtsdotools.org/display/DOCSTART/SNOMED+CT+Starter+Guide (Last accessed:
       7/19)
   [2] Motik B, Grau BC, Horrocks I, Wu Z, Fokoue A, Lutz C. OWL 2 web ontology language: Profiles.
       Recommendation, World Wide Web Consortium (W3C). 2009. https://www.w3.org/TR/owl2-profiles/
   [3] Cornet R, Schulz S. Relationship Groups in SNOMED CT (2009). In Proceedings of the Medical
       Informatics in a United and Healthy Europe, 2009. 223-227
   [4] Schulz S, Schober D, Tudose I, Stenzhorn H. The Pitfalls of Thesaurus Ontologization - the Case of the
       NCI Thesaurus. AMIA Annu Symp Proc. 2010 Nov 13;2010:727-31.
   [5] Wang C, Feng Z, Zhang X, Wang X, Rao G, Fu D. ComR: a combined OWL reasoner
       for ontology classification. Frontiers of Computer Science. 2019;13(1):139-156.
   [6] Armas Romero A, Cuenca Grau B, Horrocks I. MORe: Modular Combination of OWL Reasoners for
       Ontology Classification. In Proceedings of the 11th International Semantic Web Conference (ISWC
       2012). Springer. 2012.
   [7] OpenEHR Reference Model. https://specifications.openehr.org/releases/RM/latest/index (Last accessed:
       7/19)
   [8] HL7 FHIR Resource list. https://www.hl7.org/fhir/resourcelist.html (Last accessed: 7/19)

</pre>