<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Scrutinizing the axiomatic basis of SNOMED CT: How confused is it by the ambiguous terminology paradigm?</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Jean-Marie</forename><surname>Rodrigues</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">INSERM LIMICS UPMC UP 13</orgName>
								<address>
									<settlement>Paris</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Department of Public Health and Medical Informatics</orgName>
								<orgName type="institution" key="instit1">University of Saint Etienne</orgName>
								<orgName type="institution" key="instit2">CHU</orgName>
							</affiliation>
							<affiliation key="aff2">
								<orgName type="institution">Saint Etienne</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Stefan</forename><surname>Schulz</surname></persName>
							<affiliation key="aff3">
								<orgName type="department" key="dep1">Institute for Medical Informatics</orgName>
								<orgName type="department" key="dep2">Statistics and Documentation</orgName>
								<orgName type="institution">Medical University of Graz</orgName>
								<address>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Alan</forename><surname>Rector</surname></persName>
							<affiliation key="aff4">
								<orgName type="institution">University of Manchester</orgName>
								<address>
									<country key="GB">UK</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Scrutinizing the axiomatic basis of SNOMED CT: How confused is it by the ambiguous terminology paradigm?</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">F29011511B8C3EFBB4E80FAE9CF585E7</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T02:14+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>SNOMED CT, the world's largest clinical terminology, introduces itself as "a terminological resource which consists of codes representing meanings expressed as terms, with interrelationships between the codes to provide enhanced representation of the meanings." On the one hand, concepts are linked to lexical entities (terms), including Fully Specified Names, Preferred Terms, and Synonyms. On the other hand, SNOMED CT concepts are described and defined by expressions following a formalism called Compositional Grammar (CG), according to which SNOMED CT might be considered a formal ontology. We investigate whether the ambiguity in the terms, which are formulated according to lexical and linguistic principles, hampers the quality of the formal concept model based on DL semantics, and we propose a more autonomous development process for formal concept definitions.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">INTRODUCTION</head><p>SNOMED CT <ref type="bibr" target="#b0">[1]</ref>, a clinical terminology standard with about 300,000 representational units, is presented as a terminological resource linked to description logics expressions <ref type="bibr" target="#b0">[1]</ref>. We can therefore consider SNOMED CT as both (i) a terminology, constituted by concepts (entities of lexical meaning) and related terms of different types (Fully Specified Names, Preferred Terms, and Synonyms, obeying several naming conventions), and (ii) a formal ontology, constituted by classes, individuals and formal relations expressed as axioms in "Compositional Grammar", equivalent to EL++/OWL-EL, which SNOMED calls the "concept model". As such, the consistency of the SNOMED CT concept model can be checked by description logics reasoners. It is critical that the concepts referred to by linguistic expressions used in electronic health records are accurately aligned with the underlying axiomatic representation of those concepts. Recent work on the harmonization between a subset of SNOMED CT and a pre-final version of ICD-11 has highlighted significant modelling issues. In more than one third of cases, the SNOMED CT axiomatic expressions did not align well with the intuitive meaning derived from their Fully Specified Names or synonyms, when lexically mapped to ICD-11 classes <ref type="bibr" target="#b1">[2]</ref>. This paper investigates the hypothesis that, in the process of building and maintaining SNOMED CT, the correctness of the axiomatic expressions is affected when SNOMED CT curators are led preferentially by language.<note place="foot">To whom correspondence should be addressed: rodrigue@univ-st-etienne.fr</note> We first analyse the external inconsistencies between the axiomatic descriptions and definitions of SNOMED CT concepts on the one hand and the ICD-11 classes on the other.
Thereafter, we investigate inconsistencies within SNOMED CT and their relation to ambiguities in typical clinical interface terms. As a conclusion, we recommend that the axiomatic underpinning of SNOMED CT be developed autonomously from the lexical entities (terms), and that the linkage of terms to the axiomatic descriptions of concepts be done after the axiomatic model of those concepts is consolidated.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">MATERIAL AND METHODS</head><p>SNOMED CT's representational units, called concepts, are linked to clinical terms (so-called "descriptions") in several languages. Terms are of several types, including Fully Specified Names (FSNs), Preferred Terms (PTs), and Synonyms. SNOMED CT concepts are also formally described by expressions following a language called Compositional Grammar (CG) <ref type="bibr" target="#b2">[3]</ref>, which can be interpreted according to description logic (DL) semantics. In the following example, Fracture of tibia is fully defined as being equivalent to Injury of tibia and Fracture of lower leg, with Associated morphology Fracture and Finding site Bone structure of tibia. Its rendering in CG and in the Description Logics Manchester Syntax is shown below (in the original layout, class symbols are set in italics and relation symbols in bold):</p><p>31978002 |Fracture of tibia (disorder)| === 428881005 |Injury of tibia (disorder)| + 414292006 |Fracture of lower leg (disorder)| : { 363698007 |Finding site (attribute)| = 12611008 |Bone structure of tibia (body structure)|, 116676008 |Associated morphology (attribute)| = 72704001 |Fracture (morphologic abnormality)| }</p><p>'Fracture of tibia' equivalentTo 'Injury of tibia (disorder)' and 'Fracture of lower leg (disorder)' and RoleGroup some (('Finding site (attribute)' some 'Bone structure of tibia (body structure)') and ('Associated morphology (attribute)' some 'Fracture (morphologic abnormality)'))</p><p>Table <ref type="table">1</ref>. SNOMED CT definitions in Compositional Grammar (above) and OWL Manchester Syntax (below)</p><p>CG supports logic-based compositional expressions in order to maximise the coverage of utterances in clinical records, without requiring the terminology to meet users' demands by the continuous creation of new concepts. The latter is known as pre-coordination. An example of a pre-coordinated concept is "right hand", which has the code 78791008 |Structure of right hand (body structure)|. In contrast, there is no code for "right thumb", but its meaning is expressible by post-coordination, viz.
by the CG expression 76505004 |Thumb structure (body structure)|: 272741003 |Laterality (attribute)| = 24028007 |Right (qualifier value)|, corresponding to the OWL expression: 'Thumb structure (body structure)' and 'Laterality (attribute)' some 'Right (qualifier value)'.</p><p>ICD, the International Classification of Diseases and Related Health Problems, is promoted by WHO as "the standard diagnostic tool for causes of death, epidemiology, health management and clinical purposes". However, it is particularly focused on the analysis of the health of population groups, and is used to monitor the incidence and prevalence of diseases and other health problems. The ongoing 11th revision (ICD-11), named ICD-11-MMS (Mortality and Morbidity Statistics), is planned to be finalized in 2018. ICD has recently been characterized as an "aggregation terminology" <ref type="bibr" target="#b1">[2]</ref>. This terminology genre typically contains rules that enforce the principle of single hierarchies and disjoint classes. Partitioning ICD-11 into non-overlapping chapters requires exclusion rules at all hierarchical levels. E.g., the chapter "circulatory system" excludes infections, neoplasms, endocrine diseases and congenital diseases (called "developmental"), which have their own chapters. Making ICD exhaustive requires residual classes ("other specified", "other unspecified"), indicated by codes ending in "Y" or "Z". These are named residuals and have no meaning outside the ICD hierarchy. The current study is limited to 428 classes from ICD-11, as displayed by the WHO browser <ref type="bibr" target="#b4">[5]</ref>, covering the circulatory system, and 522 classes covering the digestive system. We exclude ICD-11 residuals because they are meaningless outside ICD. The resulting totals are 206 in the circulatory chapter and 250 in the digestive chapter (see Table <ref type="table" target="#tab_3">4</ref>).
In a first step, we compared the Compositional Grammar (CG) expressions of lexically mapped ICD-11 classes and SNOMED CT concepts using the WHO and IHTSDO/SNOMED browsers <ref type="bibr" target="#b3">[4]</ref> <ref type="bibr" target="#b4">[5]</ref>. As explained in <ref type="bibr" target="#b5">[6]</ref>, the lexical map is based on ICD-11 class names and SNOMED CT FSNs or synonyms. In a second step, we checked whether the CG expressions of SNOMED CT concepts lexically mapped to a single ICD-11 class constituted a fully equivalent representation of that ICD-11 class. The details are developed below and summarized in Figure <ref type="figure" target="#fig_1">1</ref> and Table <ref type="table" target="#tab_0">2</ref>. We introduce the following symbols for the mapping types: M (refined by M1 and M2), A (refined by A1 and A2), P and Z. We consider the mapping of a SNOMED CT concept SCi, described by terms STi{1…n}, to an ICD class ICi, described by a name ITi.</p></div>
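<div xmlns="http://www.tei-c.org/ns/1.0"><p>The correspondence between a post-coordinated CG expression and its OWL rendering, as in the "right thumb" example above, can be sketched in a few lines of Python. This is a minimal illustration only: the helper names strip_code and cg_to_manchester are hypothetical, and the sketch assumes a single focus concept refined by ungrouped attribute-value pairs. It is not a parser for the full Compositional Grammar.</p>
<p>
```python
def strip_code(term):
    """Keep only the text between the pipes of '76505004 |Thumb structure (body structure)|'."""
    return term.split("|")[1].strip()


def cg_to_manchester(expression):
    """Render 'focus : attribute = value, ...' as a Manchester-style class expression.

    Hypothetical helper: one focus concept, ungrouped refinements only.
    """
    focus, _, refinements = expression.partition(":")
    parts = ["'" + strip_code(focus) + "'"]
    for pair in refinements.split(","):
        if not pair.strip():
            continue  # no refinement present
        attribute, value = pair.split("=")
        parts.append("'" + strip_code(attribute) + "' some '" + strip_code(value) + "'")
    return " and ".join(parts)


right_thumb = (
    "76505004 |Thumb structure (body structure)| : "
    "272741003 |Laterality (attribute)| = 24028007 |Right (qualifier value)|"
)
print(cg_to_manchester(right_thumb))
```
</p>
<p>Applied to the CG expression for "right thumb", the sketch yields the OWL expression given in the text.</p></div>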
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Lexical map</head><p>The following rules apply for the lexical maps:</p><p>If there is a full lexical map between the ICD-11 class name ITi and one SNOMED CT description STi{1…n}, considered as pre-coordinated in SNOMED CT, it is classified as M (for lexical Map) type.</p><p>If there is no lexical map between any ITi and STik, but a mapping can be achieved to the post-coordination of two or more descriptions STi{1…n} of SCk, it is classified as A (for Addition map) type.</p><p>If only a part of ITi of ICi can be lexically mapped to any STik, it is classified as P (for Partial) type.</p><p>Finally, if not even a partial lexical mapping between any ITi of ICi and STik is possible, it is classified as Z (for Zero) type.</p></div>
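<div xmlns="http://www.tei-c.org/ns/1.0"><p>The four lexical map types can be illustrated by a small Python sketch. It is a deliberately naive approximation and not the procedure used in the study, where the maps were established by human raters with the WHO and SNOMED browsers. The function names, the word-set normalisation and the example terms are assumptions made for illustration.</p>
<p>
```python
from itertools import combinations


def tokens(term):
    """Lower-case word set of a term (a deliberately naive normalisation)."""
    return frozenset(term.lower().replace("-", " ").split())


def lexical_map_type(icd_name, snomed_terms):
    """Return 'M', 'A', 'P' or 'Z' for one ICD-11 class name against candidate SNOMED CT terms."""
    target = tokens(icd_name)
    candidates = [tokens(t) for t in snomed_terms]
    if target in candidates:
        return "M"  # full lexical map to one pre-coordinated description
    for size in range(2, len(candidates) + 1):
        for combo in combinations(candidates, size):
            if frozenset().union(*combo) == target:
                return "A"  # full map by post-coordinating several descriptions
    if any(c and c.issubset(target) for c in candidates):
        return "P"  # only part of the class name is covered
    return "Z"  # not even a partial lexical map


print(lexical_map_type("Perforation of esophagus", ["Perforation of esophagus"]))
print(lexical_map_type("Right thumb", ["Thumb", "Right"]))
print(lexical_map_type("Essential hypertension", ["Hypertension"]))
print(lexical_map_type("Essential hypertension", ["Fracture"]))
```
</p>
<p>A realistic implementation would need synonym handling, stop-word removal and morphological normalisation, all of which this sketch omits.</p></div>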
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Match of meaning</head><p>Subsequently, the defining and constraining axioms of the CG expressions of one or more SCi were analysed to check whether they correspond to the totality of the textual definition and to the hierarchical inheritance of ICi. The following cases are distinguished:</p><p>M (lexical map) type:</p><p>1. This expression fully represents the meaning of ICi; a complete match of meaning is assumed: the classification is refined to M1. 2. This expression does not fully represent the meaning of ICi; a new expression is produced according to CG: the classification is refined to M2.</p><p>A (addition map) type:</p><p>1. These expressions fully represent the meaning of ICi; a complete match of meaning is assumed: the classification is refined to A1. 2. These expressions do not fully represent the meaning of ICi; a new expression is produced according to CG: the classification is refined to A2.</p><p>P type:</p><p>For this ICi it is necessary to create a logical representation based on one existing CG expression plus an extended de novo CG expression.</p><p>Z type:</p><p>For this ICi it is necessary to create a logical expression in accordance with SNOMED CT CG.</p><p>In the following, only M and A types will be analysed.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Create a logical CG expression</head><p>A new logical CG expression</p><p>We did not consider the current pre-final version of ICD-11 as a gold standard. Therefore, the total or partial omission of a SNOMED CT concept that seemed necessary to ICD-11 was not considered an issue, and these cases were omitted. Neither did we assess the clinical consistency of ICD-11's textual definitions. We assessed only how well the existing CG expression(s) represented the textual definitions of the ICD-11 classes when the ICD-11 class names had been lexically mapped to SNOMED CT terms or to minimally adapted SNOMED CT concept terms. We conformed to the assumptions, rules, and standards of the SNOMED CT concept model whenever we had to extend the representation (types M2 and A2). Two knowledge engineering master's students did the work, one each for the circulatory and digestive chapters. The same senior ICD-11 and SNOMED CT expert supervised both.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">RESULTS</head><p>Table <ref type="table" target="#tab_1">3</ref> provides an overview of the results. Full lexical maps, i.e. type M (M1 plus M2), a full lexical map with a pre-coordinated SNOMED CT concept, and type A (A1 plus A2), a full lexical map with more than one post-coordinated SNOMED CT concept, account for 88% of the classes in the circulatory chapter and 89% in the digestive chapter. The most frequent type is M1 for both. The least frequent type is Z, for which no lexical map is possible: 1% for the circulatory chapter and 2% for the digestive chapter. The differences between the chapters can be explained by inter-rater differences (the work was done by two different knowledge engineering master's students supervised by the same senior terminology expert) or by quality differences between these two chapters in WHO ICD-11, in SNOMED CT, or in both.</p><p>To address the quality of the formal descriptions of SNOMED CT, it is interesting to compare the rate of primitive SNOMED CT concepts in the different map and meaning match types, as shown in Table <ref type="table" target="#tab_3">4</ref>. The types with full map and meaning match (M1 and A1) have a lower rate of SNOMED CT primitive concepts (from 21% to 47%) and the types with no full match (M2 and A2) have a higher rate of SNOMED CT primitive concepts (from 52% to 91%). Nevertheless, the rate of primitive concepts in the full map and meaning match types (M1 and A1) is high, considering that the lexical map was complete between the ICD-11 class name and the SNOMED CT FSN or synonym. On the contrary, when the match is incomplete we should have expected a rate nearer to 100%, which is nearly true for M2 but less so for A2. It is necessary to go further by taking some examples of mismatches regarding primitive and fully defined SNOMED CT concepts.</p><p>As an example for the type M1, the ICD-11 class DA40.4 Perforation of esophagus is defined by: "Perforation of esophagus is a penetration or hole of the wall of the esophagus, resulting in luminal contents in esophagus flowing into the and/or thoracic cavity". The full lexical map is with the fully defined SNOMED CT concept 23387001 Perforation of esophagus (disorder), which is equivalent to the following pre-coordinated SNOMED CT inferred expression:</p><p>RoleGroup some (('Finding site (attribute)' some 'Esophageal structure (body structure)') and ('Associated morphology (attribute)' some 'Perforation (morphologic abnormality)'))</p><p>As an example for the type M2, the ICD-11 class BB67.3 Macro re-entrant atrial tachycardia is defined as "An atrial arrhythmia in which there is intra-atrial re-entry or circus movement around a fixed or functional central obstacle. The central obstacle may consist normal (e.g. valves) or abnormal (e.g., scar) structures. Conduction to the ventricles is not necessary for the tachycardia to continue. All that is required is an organised atrial rhythm with a rate typically between 250 and 350 bpm, including tachycardia using a variety of re-entry circuits that often occupy large areas of the atrium (''macro-re-entrant''). Here the arrhythmia involves the cavo-tricuspid isthmus".
The full lexical map is with the SNOMED CT concept 233893007 Re-entrant atrial tachycardia (disorder), a primitive concept with the following pre-coordinated SNOMED CT inferred expression:</p><p>RoleGroup some (('Finding site (attribute)' some 'Cardiac conducting system structure (body structure)') and ('Clinical course (attribute)' some 'Sudden onset AND/OR short duration (qualifier value)') and ('Has definitional manifestation (attribute)' some 'Tachycardia (finding)'))</p><p>This representation lacks the localization of the arrhythmia in the atrium, although the formalism allows it to be represented as follows. The modification to the original expression, underlined in the original layout, is the value of the finding site.</p><p>RoleGroup some (('Finding site (attribute)' some 'Preferential interatrial pathway (body structure)') and ('Clinical course (attribute)' some 'Sudden onset AND/OR short duration (qualifier value)') and ('Has definitional manifestation (attribute)' some 'Tachycardia (finding)'))</p><p>An example for the type A1 is BA04.3 Secondary hypertension associated with renal tubular disorders. This ICD-11 class has no definition in the most recent version (Jan 2017).
A full lexical map can be made with the SNOMED CT concept 31992008 Secondary hypertension (disorder), a primitive concept, together with 95568003 Renal tubular disorder (disorder), a fully defined one, using the following post-coordinated SNOMED CT inferred expression, which introduces the aetiology using the relation Due to:</p><p>'Has definitional manifestation (attribute)' some 'Finding of increased blood pressure (finding)' and RoleGroup some ('Finding site (attribute)' some 'Systemic circulatory system structure (body structure)') and 'Due to (attribute)' some 'Renal tubular disorder (disorder)'</p><p>As an example for the type A2, let us analyse the ICD-11 class DB02.31 Ig-E mediated allergic enteritis of small intestine, defined as "Immediate type (IgE-mediated) enteric hypersensitivity due to exposure to an allergen in individuals previously sensitized. The symptoms are acute abdominal pain and diarrhoea and can be combined to other symptoms in cases of anaphylaxis". A full lexical map is possible with the fully defined SNOMED CT concepts 22231002 Allergic enteritis (disorder) and 422076005 Immunoglobulin E-mediated allergic disorder (disorder), constructing the following expression (the addition is underlined in the original layout):</p><p>'Pathological process (attribute)' equivalentTo 'Allergic process (qualifier value)' and RoleGroup some (('Associated morphology (attribute)' some 'Inflammation (morphologic abnormality)') and ('Finding site (attribute)' some 'Intestinal structure (body structure)')) and 'Due to (attribute)' some 'Type 1 hypersensitivity response (disorder)' and 'Causative agent (attribute)' some 'Immunoglobulin E (substance)'</p></div>
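<div xmlns="http://www.tei-c.org/ns/1.0"><p>The share of full lexical maps can be recomputed from the counts in Table 3. The following Python sketch is only an arithmetic check; the dictionary layout and the function name share are illustrative assumptions, while the counts are copied from the table.</p>
<p>
```python
# Counts per map and meaning match type, copied from Table 3.
counts = {
    "circulatory": {"M1": 209, "M2": 123, "A1": 17, "A2": 15, "P": 44, "Z": 4},
    "digestive": {"M1": 251, "M2": 125, "A1": 23, "A2": 25, "P": 45, "Z": 9},
}

FULL_LEXICAL_MAP = ["M1", "M2", "A1", "A2"]


def share(chapter, types):
    """Percentage of the non-residual classes of a chapter that fall into the given types."""
    total = sum(counts[chapter].values())
    return 100 * sum(counts[chapter][t] for t in types) / total


for chapter in counts:
    total = sum(counts[chapter].values())
    print(chapter, total, round(share(chapter, FULL_LEXICAL_MAP)))
```
</p>
<p>With these counts the totals are 412 and 478 classes, and the M plus A share rounds to 88% for the circulatory chapter and 89% for the digestive chapter.</p></div>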
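<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">DISCUSSION</head><p>The study attempts to propose semantically precise mappings between two independent representation artefacts (ICD-11 and SNOMED CT), based on OWL-DL, using the axioms in the SNOMED Compositional Grammar "concept model" (and its OWL-EL equivalent), which are intended to define what is universally true in a domain <ref type="bibr" target="#b6">[7]</ref><ref type="bibr" target="#b7">[8]</ref>.</p><p>The findings are summarised in Table <ref type="table" target="#tab_1">3</ref>: 138 (123 M2 plus 15 A2) out of 364 SNOMED CT concepts (38%) in the circulatory chapter and 150 (125 M2 plus 25 A2) out of 424 SNOMED CT concepts (35%) in the digestive chapter from the Clinical finding hierarchy that were lexically mapped to ICD-11 classes show modelling issues resulting in misalignments between the meaning of the ICD-11 MMS classes (as given by their name, hierarchic context and text definition) and the formal axioms that characterise SNOMED CT concepts. We equally found misalignments within SNOMED CT, i.e. between Fully Specified Names and formal axioms. As shown in Table <ref type="table" target="#tab_3">4</ref>, in most of the cases this is related to the high number of primitives, i.e. not fully defined SNOMED CT concepts, but it also occurs with some fully defined concepts.</p></div>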
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Misalignment between SNOMED CT concept FSN and primitive representation</head><p>The rates of primitive concepts were higher in the lexical and meaning match type M2 than in M1, viz. 91% vs 21% in the circulatory chapter and 84% vs 23% in the digestive chapter, and higher in A2 than in A1, viz. 53% vs 35% in the circulatory chapter and 52% vs 47% in the digestive chapter.</p><p>What is challenging is that the OWL axioms would allow a fully defined representation. For example, Essential hypertension (ICD-11 class BA00), lexically matched to the SNOMED CT concept 59621000 Essential hypertension (disorder), is the most frequent arterial disease. SNOMED CT does not represent the lack of a secondary cause, which is the meaning of "essential" or "idiopathic". SNOMED CT CG provides the possibility to represent the lack of a secondary cause by adding the following expression:</p><p>'Pathological process (attribute)' some 'spontaneous (qualifier value)'</p><p>If the clinical vocabulary (interface terminology) and the logic-based descriptions were defined independently, this would reduce the problem. However, there would still be issues where the full meaning of the natural language expression would not be captured in the formal logical expression.</p><p>The difference between flexible human language and machine-required logic is apparent in the SNOMED CT Editorial Guide <ref type="bibr" target="#b0">[1]</ref>. What is an inappropriate synonym when a synonym is defined by SNOMED as "a term other than the FSN that is an acceptable way to express the meaning of a SNOMED CT concept in a particular language"? This synonym is anchored to an FSN, which shall be aligned with the FSN concept model instance. An inappropriate synonym must therefore be "an acceptable (or unacceptable) way to express the meaning of a SNOMED CT concept" and aligned or not aligned with the FSN concept model instance.</p><p>The scale of this issue is indicated by the 24,782 terms shared between pairs of active concepts, either within one hierarchy or across hierarchies.
In the Clinical findings disorder hierarchy there are 1394 instances of duplicate terms (around 3%). Across hierarchies, most duplicate terms are between Product and Substance, e.g. 53009005 Analgesic (product) and 373265006 Analgesic (substance). Such definitions (a drug name replaced by the name of the active ingredient) are acceptable for interface terminologies but inappropriate for ontological standards. This suggests a principled reworking of the relations between FSNs, concept model instances and synonyms.</p><p>Another example is related to negation, as in Non traumatic tear of meniscus. The formal SNOMED CT expression is based on the Compositional Grammar (equivalent to OWL-EL and EL++ without disjointness), which does not support any form of negation. Here the question arises whether the negative expression should be restricted to a common interface term feature or represented in CG. Such an interface term, in our example, could point to a fully specified name Degenerative tear of meniscus. On a logical basis, however, since there are developmental, inflammatory, or other non-traumatic non-degenerative tears, it does not appear correct to equate non-traumatic and degenerative cartilage tears. The issue is that even if negation is understandable at the level of the clinical interface terminology, it cannot be represented with the SNOMED formalism. The logical alternative is to point the negated concept at the alternative concepts: developmental, degenerative, etc. This is the basis of the solution we recommend for representing the clinical names of such concepts or classes. For example, it is possible to represent the closely related notion "tears of meniscus excluding traumatic tears" as a query on the representations (codes) for "tears of meniscus", which is an axiomatized expression, minus the representations (codes) for "traumatic tears of meniscus", as recommended in <ref type="bibr" target="#b7">[8]</ref>.</p></div>
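<div xmlns="http://www.tei-c.org/ns/1.0"><p>The query-based treatment of negation described above amounts to a set difference between the results of two queries. The following Python sketch illustrates the idea. The concept labels are hypothetical placeholders rather than SNOMED CT identifiers, and in practice both sets would be obtained from a reasoner or terminology server rather than written by hand.</p>
<p>
```python
# Hypothetical result of a query for all subtypes of "tear of meniscus".
tears_of_meniscus = {
    "traumatic tear of meniscus",
    "degenerative tear of meniscus",
    "developmental tear of meniscus",
    "inflammatory tear of meniscus",
}

# Hypothetical result of a query for all subtypes of "traumatic tear of meniscus".
traumatic_tears = {"traumatic tear of meniscus"}


def excluding(all_codes, excluded_codes):
    """Codes that satisfy the first query but not the second (set difference)."""
    return set(all_codes) - set(excluded_codes)


non_traumatic_tears = excluding(tears_of_meniscus, traumatic_tears)
print(sorted(non_traumatic_tears))
```
</p>
<p>The exclusion is evaluated at query time over axiomatized concepts, so no negated class has to be asserted in the EL++ concept model itself.</p></div>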
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">CONCLUSION</head><p>To answer the main question of this paper, viz. whether the logic-based expressions in SNOMED CT are blurred by a primarily language-driven modelling approach, we can state the following points as a route to an answer: SNOMED CT currently integrates two aspects, a reference clinical terminology and a formal ontology.</p><p>It is necessary to distinguish clearly the part of the SNOMED CT natural language definition to be used as the basis of a formal representation in the Compositional Grammar/Description Logic from the part used for the management of the clinical interface vocabularies used by clinicians in electronic health records. Clinical language is characterised by lexical ambiguities due to brevity and assumed context. The words used by clinicians often hide widely understood conventions that, if taken literally, give rise to incorrect formal representations.</p><p>Given the conflict between clinical usage and formal representation, errors in the axiomatized formal content arise easily. External validation of the axiomatic content in SNOMED CT is critical to reach validated DL-based (or any other logic-based) models of medical knowledge and concept descriptions. The harmonization of SNOMED CT with ICD-11 provides one example of such an external validation.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>31978002 |Fracture of tibia(disorder)| === 428881005 |Injury of tibia (disorder)| + 414292006 |Fracture of lower leg (disorder)| : { 363698007 |Finding site (attribute)| = 12611008 |Bone structure of tibia (body structure)|, 116676008 |Associated morphology (attribute)| = 72704001 |Fracture (morphologic abnormality)| } 'Fracture of tibia' equivalentTo 'Injury of tibia (disorder)' and 'Fracture of lower leg (disorder)' and RoleGroup some (('Finding site (attribute)' some</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. ICD-11 SNOMED CT semantic alignment principle</figDesc><graphic coords="3,52.44,103.80,236.16,130.56" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 2 .</head><label>2</label><figDesc>The lexical maps types and meaning matches between the ICD-11 MMS classes and SNOMED CT formal expressions</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 3 .</head><label>3</label><figDesc>Numbers of codes in the Circulatory chapter and Digestive chapter, from ICD 11 MMS 2017 to SNOMED CT 31 January 2017 release by map and meaning match types</figDesc><table><row><cell>Map and meaning match types</cell><cell>ICD11 Circ. count</cell><cell>Rate (%)</cell><cell>ICD11 Digestive count</cell><cell>Rate (%)</cell></row><row><cell>M 1</cell><cell>209</cell><cell>51</cell><cell>251</cell><cell>53</cell></row><row><cell>M 2</cell><cell>123</cell><cell>30</cell><cell>125</cell><cell>26</cell></row><row><cell>A 1</cell><cell>17</cell><cell>4</cell><cell>23</cell><cell>5</cell></row><row><cell>A 2</cell><cell>15</cell><cell>3</cell><cell>25</cell><cell>5</cell></row><row><cell>P</cell><cell>44</cell><cell>11</cell><cell>45</cell><cell>9</cell></row><row><cell>Z</cell><cell>4</cell><cell>1</cell><cell>9</cell><cell>2</cell></row><row><cell>Total (M + A + P + Z) "complete chapter"</cell><cell>412</cell><cell>68</cell><cell>478</cell><cell>66</cell></row><row><cell>Other and unspecified number of codes</cell><cell>197</cell><cell>32</cell><cell>250</cell><cell>34</cell></row><row><cell>Total number of codes</cell><cell>609</cell><cell>100</cell><cell>728</cell><cell>100</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4 .</head><label>4</label><figDesc>Primitive SNOMED CT concepts by map and meaning match typesTo address the quality of the formal descriptions of SNOMED CT, it is interesting to compare the rate of primitive SNOMED CT concepts in the different Map and Meaning match types as shown in</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 4</head><label>4</label><figDesc>Perforation</figDesc><table><row><cell>. The types with full</cell></row><row><cell>map and meaning match (M1 and A1) have a lower rate of</cell></row><row><cell>SNOMED CT primitive concepts (from 21 % to 47%) and</cell></row><row><cell>the types with no full match (M2 and A2) have a higher rate</cell></row><row><cell>of SNOMED CT primitive concepts (from 52% to 91%).</cell></row><row><cell>Nevertheless the primitive concepts rate of full Map and</cell></row><row><cell>Meaning match types (M1 and A1) is high when it is con-</cell></row><row><cell>sidered that the lexical map was complete between the ICD-</cell></row><row><cell>11 class name and the SNOMED CT FSN or synonym. On</cell></row><row><cell>the contrary, when the lexical map is incomplete we should</cell></row><row><cell>have expected a rate nearer from 100 % which is nearly</cell></row><row><cell>true for M2 but less for A2.</cell></row><row><cell>It is necessary to go further by taking some examples of</cell></row><row><cell>mismatches regarding primitive and fully defined SNOMED</cell></row><row><cell>CT concepts.</cell></row><row><cell>As an example for the type M1, the ICD ICD-11 class DA</cell></row><row><cell>40.4</cell></row></table><note>of esophagus is defined by: "Perforation of esophagus is a penetration or hole of the wall of the esophagus, resulting in luminal contents in esophagus flowing into the and/or thoracic cavity". The full lexical map is with the fully defined SNOMED CT concept 23387001, Perforation of esophagus (disorder), which is equivalent to the following (inferred) pre-coordinated SNOMED CT inferred expression:</note></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 3</head><label>3</label><figDesc></figDesc><table><row><cell>: 138 (123 M2 plus</cell></row><row><cell>15 A2 )out of 364 SNOMED CT concepts (38%) in the</cell></row><row><cell>circulatory chapter and 150 (125 M2 plus 25 A2) out of 424</cell></row><row><cell>SNOMED CT concepts (35%) in the digestive chapter from</cell></row><row><cell>the Clinical finding hierarchy that were lexically mapped to</cell></row><row><cell>ICD-11 classes show modelling issues resulting in misa-</cell></row><row><cell>lignments between the meaning of the ICD-11 MMS classes</cell></row><row><cell>(as given by their name, hierarchic context and text defini-</cell></row><row><cell>tion) and formal axioms that characterise SNOMED CT</cell></row><row><cell>concepts. We equally found misalignments within</cell></row><row><cell>SNOMED CT, i.e. between Fully Specified Names and</cell></row><row><cell>formal axioms. As shown in Table</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0">'Pathological process (attribute)' some 'spontaneous (qualifier value)'</note>
		</body>
		<back>
			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Apart from some other cases of SNOMED CT concepts with the wording "of unknown etiology" there are numerous cases of "real" qualifying adjectives that are not reflected in the definition, such as 85598007, Constrictive pericarditis (disorder) with no representation of "constrictive", 373945007 Pericardial effusion (disorder) with no representation of "effusion" and 706882009 Hypertensive crisis (disorder) with no representation of "crisis".</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Misalignment between SNOMED CT concept FSN and full definitions</head><p>The ICD-11 class DA52.51 Allergic gastritis due to IgE-mediated hypersensitivity can be fully represented by the SNOMED CT concepts 1824008 Allergic gastritis (disorder) and 422076005 Immunoglobulin E-mediated allergic disorder (disorder), both of which are fully defined. The role of Immunoglobulin E is not represented in the present version.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Inconsistencies across SNOMED CT concept definitions</head><p>To understand why there are so many issues, let us take the example of hypertension. In clinical settings, most healthcare professionals who use "hypertension" in their daily patient monitoring practice mean exclusively systemic arterial hypertension, which is a frequent disease. However, the SNOMED CT concept 59621000 Essential hypertension (disorder) is described by the expression:</p><p>'Has definitional manifestation (attribute)' some 'Finding of increased blood pressure (finding)' and RoleGroup some ('Finding site (attribute)' some 'Systemic circulatory system structure (body structure)')</p><p>On the other hand, the SNOMED CT concept 11399002 Pulmonary hypertensive arterial disease (disorder) is described by:</p><p>RoleGroup some ('Finding site (attribute)' some 'Pulmonary artery structure (body structure)')</p><p>Both are primitive concepts. Since 24184005 Finding of increased blood pressure (finding) is clinically understood as a finding that applies only to systemic arterial hypertension, it cannot be applied to Pulmonary hypertensive arterial disease.</p><p>The CG formalism, however, would allow the following representations:</p><p>'Pulmonary hypertensive arterial disease (disorder)' subclassOf RoleGroup some ('Finding site (attribute)' some 'Pulmonary artery structure (body structure)') and 'Has interpretation (attribute)' some 'Abnormally high (qualifier value)' and 'Interprets (attribute)' some 'Blood pressure (observable entity)'</p><p>'Essential hypertension (disorder)' subclassOf RoleGroup some ('Finding site (attribute)' some 'Systemic circulatory system structure (body structure)') and 'Has interpretation (attribute)' some 'Abnormally high (qualifier value)' and 'Interprets (attribute)' some 'Blood pressure (observable entity)' and 'Pathological process (attribute)' some 'Spontaneous (origin) (qualifier value)'</p></div>			</div>
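<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two representations proposed above share the same blood pressure pattern and differ in finding site and pathological process. The following Python sketch makes this explicit by treating each definition as a flat set of attribute-value pairs. This is a deliberate simplification that ignores role grouping and performs no description logic reasoning; the names HIGH_BLOOD_PRESSURE and contrast are introduced here for illustration only.</p>
<p>
```python
# Attribute-value pairs shared by both proposed definitions
# (semantic tags omitted for brevity).
HIGH_BLOOD_PRESSURE = frozenset({
    ("Has interpretation", "Abnormally high"),
    ("Interprets", "Blood pressure"),
})

pulmonary = HIGH_BLOOD_PRESSURE.union({
    ("Finding site", "Pulmonary artery structure"),
})

essential = HIGH_BLOOD_PRESSURE.union({
    ("Finding site", "Systemic circulatory system structure"),
    ("Pathological process", "Spontaneous (origin)"),
})


def contrast(a, b):
    """Split two definitions into shared pairs and pairs unique to each."""
    return a.intersection(b), a.difference(b), b.difference(a)


shared, only_pulmonary, only_essential = contrast(pulmonary, essential)
print(sorted(shared))
print(sorted(only_pulmonary))
print(sorted(only_essential))
```
</p>
<p>Under these assumptions the shared part is exactly the abnormally high blood pressure pattern, while the finding site (and, for essential hypertension, the pathological process) distinguishes the two disorders.</p></div>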
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">SNOMED CT® Editorial Guide</title>
		<ptr target="http://snomed.org/eg" />
		<imprint>
			<date type="published" when="2017-01">January 2017</date>
		</imprint>
	</monogr>
	<note>International Release (US English), chapter 2.1. Last access 15 May 2017</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Semantic Alignment between ICD-11 and SNOMED CT</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Rodrigues</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Robinson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Della Mea</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Campbell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rector</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Schulz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Brear</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Üstün</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Spackman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">G</forename><surname>Chute</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Millar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Solbrig</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Brand Persson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Studies in health technology and informatics</title>
		<imprint>
			<biblScope unit="volume">216</biblScope>
			<biblScope unit="page" from="790" to="794" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">SNOMED CT Compositional Grammar</title>
		<ptr target="http://snomed.org/scg" />
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page">1</biblScope>
		</imprint>
	</monogr>
	<note>last access 15 May 2017</note>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<ptr target="http://browser.ihtsdotools.org/" />
		<title level="m">IHTSDO Browser</title>
		<imprint/>
	</monogr>
	<note>last access 30 May 2017</note>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<ptr target="http://who.int/classifications/icd11/browse/f/en" />
		<title level="m">WHO Browser</title>
				<imprint>
			<date type="published" when="2017-05-08">08 may 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Representing ICD-11 JLMMS Using IHTSDO Representation Formalisms</title>
		<author>
			<persName><forename type="first">M</forename><surname>Mamou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">L</forename><surname>Rector</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Schulz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Campbell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Solbrig</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Rodrigues</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Studies in Health Technology and Informatics</title>
		<imprint>
			<biblScope unit="volume">228</biblScope>
			<biblScope unit="page" from="431" to="435" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">On closed world data bases</title>
		<author>
			<persName><forename type="first">R</forename><surname>Reiter</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Logic and Data Bases</title>
				<editor>
			<persName><forename type="first">H</forename><surname>Gallaire</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Minker</surname></persName>
		</editor>
		<imprint>
			<pubPlace>New York</pubPlace>
			<publisher>Plenum</publisher>
			<date type="published" when="1978">1978</date>
			<biblScope unit="page" from="55" to="76" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Interface Terminologies, Reference Terminologies and Aggregation Terminologies: A Strategy for Better Integration</title>
		<author>
			<persName><forename type="first">S</forename><surname>Schulz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Rodrigues</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">L</forename><surname>Rector</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">G</forename><surname>Chute</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Studies in Health Technology and Informatics</title>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note>accepted for publication</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Chemical Entities of Biological Interest: an update</title>
		<author>
			<persName><forename type="first">P</forename><surname>De Matos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Alcántara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Dekker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ennis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hastings</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Haug</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Spiteri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Turner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Steinbeck</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nucl. Acids Res</title>
		<imprint>
			<biblScope unit="volume">38</biblScope>
			<biblScope unit="page" from="D249" to="D254" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
