<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Stop-word based contextual auditing to identify inconsistencies in SNOMED</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rashmi Burse</string-name>
          <email>rashmi.burse@ucdconnect.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gavin McArdle</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michela Bertolotto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University College Dublin</institution>
          ,
          <addr-line>Belfield, Dublin 4</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>SNOMED is one of the most widely adopted Clinical Terminology systems. However, incomplete representations and modelling inconsistencies in SNOMED are preventing healthcare applications from exploiting its full potential. This paper presents a novel stop-word based contextual auditing method to identify potential inconsistencies in the modelling of SNOMED concepts. The results of a pilot study show promising potential for this method. The percentage of identified missing attribute relationships using this method is as high as 69.56% and for identified missing hierarchical relationships it is 28.26%. The auditing method proposed in this paper can act as a supplementary Quality Assurance check in the International Health Terminology Standards Development Organization's effort to improve the quality of SNOMED.</p>
      </abstract>
      <kwd-group>
        <kwd>SNOMED</kwd>
        <kwd>Quality Assurance</kwd>
        <kwd>Lexical Auditing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Incomplete, inconsistent and erroneous representations of Clinical
Terminology (CT) systems limit their expressiveness and have a variety of repercussions
including retrieval of incomplete or incorrect result sets. Missing relationships
result in partially defined concepts, which obstruct the
extraction of rich inferential knowledge. For example, in the March 2020 International
Edition of SNOMED, the concept Insomnia with sleep apnea (disorder)
has only one parent, Insomnia (disorder). The hierarchical link to Sleep apnea
(disorder) is absent. Sleep apnea (disorder) has a role group containing three
attribute relationships which are missing from the concept Insomnia with sleep
apnea (disorder), thus preventing it from capturing all the information needed to define
this condition. If someone executes a query to retrieve all patients suffering from
Sleep apnea (disorder), the patients suffering from Insomnia with sleep apnea
(disorder) would not be retrieved due to the missing hierarchical relationship
between Sleep apnea (disorder) and Insomnia with sleep apnea (disorder). The query
therefore yields incomplete results. Given the critical nature of medical data,
effective Quality Assurance (QA) of CT systems is imperative.</p>
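      <p>The effect of the missing link on retrieval can be reproduced on a toy hierarchy. The Python sketch below is illustrative only: the edge list is a hand-written excerpt rather than the actual release, and the patient records and function names are assumptions made for the example.</p>
      <preformat>
```python
from collections import defaultdict


def descendants(concept, is_a_edges):
    """Return the concept together with everything below it in the IS-A hierarchy."""
    children = defaultdict(set)
    for child, parent in is_a_edges:
        children[parent].add(child)
    found, stack = {concept}, [concept]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def patients_with(concept, is_a_edges, records):
    """Return the patients whose recorded concept is subsumed by the queried concept."""
    subsumed = descendants(concept, is_a_edges)
    return {patient for patient, recorded in records.items() if recorded in subsumed}


# IS-A edges (child, parent) as described in the text for this concept.
edges = [("Insomnia with sleep apnea (disorder)", "Insomnia (disorder)")]

# Hypothetical patient records: patient id -> recorded concept.
records = {
    "patient-1": "Sleep apnea (disorder)",
    "patient-2": "Insomnia with sleep apnea (disorder)",
}

before = patients_with("Sleep apnea (disorder)", edges, records)

# Adding the missing hierarchical relationship makes patient-2 retrievable.
repaired = edges + [("Insomnia with sleep apnea (disorder)", "Sleep apnea (disorder)")]
after = patients_with("Sleep apnea (disorder)", repaired, records)

print(sorted(before))
print(sorted(after))
```
      </preformat>
      <p>With the original edges the query returns only the patient recorded directly against Sleep apnea (disorder); once the missing parent is added, both patients are returned.</p>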
      <p>However, the development of effective auditing methods for the QA of CT
systems is a major challenge and an ongoing process in the health-informatics
domain. In spite of continual research efforts, the healthcare community is still
striving to hone its auditing techniques for two major reasons: (a) the huge size
of CT systems makes it impractical to audit each and every concept manually; and
(b) the diverse nature of clinical data has led to a variety of conflicting
modelling styles, making it impossible to develop a "one size fits all" solution that
can be applied to all CT systems. Taking into consideration these constraints,
the best way forward is to develop efficient auditing techniques that highlight
regions of a CT system where errors are concentrated. Such areas can then be
presented to authors and curators of a CT system for manual inspection. The main
objective of such techniques is to direct the limited available resources to the areas
where errors are most concentrated and to identify the maximum number of inconsistencies
with minimal effort.</p>
      <p>With this objective, we present a novel method based on lexical analysis
of concept names containing stop-words. It is our hypothesis that stop-words,
which have been disregarded by other lexical auditing methods, can prove to
be rich sources of information to identify problematic areas. The pilot version
of this method is restricted to the stop-words "and" and "with" due to their
conjunctive nature. However, we plan to expand our analysis to other
stop-words in the future. The proposed method identifies two types of inconsistencies:
missing hierarchical relationships (i.e., if a SNOMED concept exists, which is
lexically equivalent or a lexical variant of any of the subjects appearing before
or after the stop-word and is not assigned as a parent of the concept) and missing
attribute relationships (i.e., in the case of a missing hierarchical relationship, if
the attribute relationship(s) of the identified lexically ideal parent is/are not
included as a role group in the modelling of the concept). The proposed method
promotes semantic completeness by identifying missing attribute relationships
to refine a concept and ensures consistency in structural modelling by identifying
missing hierarchical relationships. An additional advantage of our method over
other auditing methods is that it not only identifies inconsistencies but also
provides a list of suggested corrections for each identified inconsistency.
The aim of our method is to highlight areas with a high concentration of errors
in order to save the time and effort that experts and curators spend on manual auditing.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Related Work</title>
      <p>
        Bodenreider et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] developed a method to identify missing elements in
SNOMED by targeting concepts containing binary antonymous adjectives such
as (acute, chronic), (unilateral, bilateral), (primary, secondary), and (acquired,
congenital). The proposed method extracted adjectival modifiers from the
targeted concepts ([MOD][CONTEXT]) and created new terms by experimenting
with various combinations of modifiers and contexts. Bodenreider et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]
exploited the lexical features of concepts to identify missing hyponymic
relationships. The method selected concepts conforming to a modifier+noun form
([MOD][NOUN]), where the modifier was usually an adjectival modifier further
describing the noun. They intuitively assumed that modifier+noun should be a
hyponym of the noun, e.g. acute appendicitis should be a child of appendicitis,
and identified missing hyponymic relationships. Pacheco et al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] assumed that
non-attributed concepts were underspecified and employed a semantic indexing
method to suggest attribute relationships for such concepts. The method
derived sub-words from a non-attributed concept's Fully Specified Name (FSN)
with the help of MorphoSaurus [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The derived sub-words were compared with
the concept's parent(s). Common sub-words appearing in both child and
parent concepts were eliminated. The concepts containing the remaining sub-words
were then searched and chosen as eligible candidates to refine the non-attributed
concept.
      </p>
      <p>
        Agrawal and Elhanan [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] examined five types of inconsistencies among
concepts whose FSNs were lexically similar, i.e., differed by only one word. The
method created similarity sets consisting of concepts that differed from a base
descriptor by one word. E.g. for the base descriptor "upper limb stretching",
Prophylactic upper limb stretching (procedure), Therapeutic upper limb
stretching (procedure), and Prophylactic lower limb stretching (procedure) constituted a
similarity set. The method was applied to the Procedure sub-hierarchy of SNOMED.
Five samples, each consisting of 50 similarity sets, were created and each sample was
examined for hierarchical, attribute assignment, attribute target value, group,
and definitional inconsistencies. Bodenreider [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] claimed that the root cause
of all inconsistencies in CT systems was concepts modelled with faulty logical
definitions. On this basis, logical definitions were recreated from the lexical
features of a concept name and hierarchical relationships were inferred among these
newly defined concepts. The newly obtained hierarchy was then compared with
the original SNOMED hierarchy to detect differences. Schulz et al. [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] detected
ambiguities in hierarchy tags, attribute relationships, and IS-A relationships
based on the lexical features of SNOMED concepts and made some valuable
suggestions for the curators of SNOMED. Rector and Iannone [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] focused on
finding concepts from the findings and diseases sub-hierarchies of SNOMED that
should be classified as chronic or acute according to the CORE problem list but
currently are not, and studied the effect of this misclassification on post-coordination
queries. Ceusters et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] scrutinized concepts containing negation words like
absence, negation, and not, and the misclassification caused by these words. They
introduced four categories into which negative relationships can be classified,
suggested that SNOMED should be aligned with an Upper Level Ontology (ULO)
like Basic Formal Ontology (BFO), and introduced a new "lacks" relationship
to correctly classify such negative concepts.
      </p>
      <p>
        Agrawal et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] reported the results of a study that statistically concluded
that the complexity, and thereby the chance of identifying errors, increases with
the length (number of words) of a concept name and the number of parents of a
concept. Agrawal [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] proposed an auditing method based on the hypothesis that
if two concepts are lexically similar then their structural and logical modelling
should also be similar. E.g. the concepts Acute injury of anterior cruciate
ligament (disorder) and Acute injury of posterior cruciate ligament (disorder) are
lexically similar as they differ by only one word and hence have similar structural
and logical modelling. Both concepts have the same number of hierarchical
relationships and the same number and type of attribute relationships, differing only in the
target values (anterior and posterior). Many variations of this method, including
simple similarity sets [
        <xref ref-type="bibr" rid="ref12 ref6">6, 12</xref>
        ], positional similarity sets [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ], and employing
machine learning tools to create similarity sets [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ] were developed and applied
to different versions and sub-hierarchies of SNOMED. Cui et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] proposed a
hybrid method combining the structural and lexical aspects of a CT system and
identified four lexical patterns in non-lattice subgraphs that suggested potential
missing hierarchical relationships and potential missing concepts.
      </p>
      <p>To summarize, all the lexical auditing methods applied so far work on one
of the following principles: (a) counting the length of a concept name to estimate
its complexity and thereby calculate the probability of potential inconsistencies
harbored by it; (b) performing lexico-syntactic and morphosyntactic analysis
on the concept names to identify missing concepts/relationships; (c) applying
normalization techniques and LVG algorithms to deal with variation in concept
names; (d) looking for lexical similarity among concept names to check for
inconsistencies in their structural and logical modelling.</p>
      <p>
        The intent and focus of all the aforementioned methods is on medical jargon
and its lexical variants. As a result, these methods scrutinized fixed parts of
speech like adjectival modifiers, nouns, and verbs and found repeatedly occurring
stop-words like "and", "or", "with" etc. to be a hindrance. To improve the
efficiency of their algorithms, these methods ignored a list of such
stop-words [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. These stop-words that are disregarded and eliminated by all the
aforementioned studies can prove to be rich sources of information to identify
problematic areas. They can serve as effective indicators to identify concepts
harboring potential inconsistencies. The stop-word list [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] eliminated by these
studies serves as a major motivation for our approach. In this work we present a
unique method that targets concepts containing the stop-words "and" and "with"
to identify two types of inconsistencies: missing hierarchical relationships and
missing attribute relationships. The pilot version of this method is restricted
to the stop-words "and" and "with" due to their conjunctive nature. However,
we plan to expand our analysis to other stop-words [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] in the future. To the
best of our knowledge, there is no lexical method developed so far that targets
stop-words to audit CT systems.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3 Materials and Method</title>
      <sec id="sec-3-1">
        <title>3.1 Materials</title>
        <p>In this pilot study, the proposed method is applied to the Disorder
sub-hierarchy of SNOMED's March 2020 International Edition. However, the
proposed method is quite generic and can be applied to other hierarchies of
SNOMED as well as other CT systems. We chose this sub-hierarchy
because a preliminary inspection revealed many concepts in the
Disorder sub-hierarchy containing the stop-words "and" and "with" that were
either missing hierarchical relationships or were assigned inconsistent hierarchical
relationships that varied in granularity, and that were missing attribute relationships.
There are almost 7000 eligible concepts, containing "and" or "with", that need
to be systematically assessed and it is our hypothesis that the proposed method
will highlight erroneous concepts that require manual auditing.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2 Method</title>
        <p>The proposed method is based on four assumptions and identifies two types
of inconsistencies. Lexical variants in this work are considered to be concept
FSNs conforming to the lexical structure "subject + syndrome", and terms
appearing before and after "and" or "with" will hereafter be referred to as subjects.
Inconsistencies are defined as follows:
Missing hierarchical relationship: If a SNOMED concept exists, which is lexically
equivalent or a lexical variant of any of the subjects and is not assigned as a
parent of the concept.</p>
        <p>Missing attribute relationship (role group): In case of a missing hierarchical
relationship, if the attribute relationship(s) of the identified lexically ideal parent
is/are not included as a role group in the modelling of the concept.</p>
        <p>
          The assumptions made in this study are based on the observation that
concepts containing "and" and "with" are expected to have at least two
parents and at least two role groups. The first assumption is also supported by
a semantic rule proposed during the early formative years of SNOMED [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ].
Mendonca et al. [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] conducted a thorough analysis of SNOMED concepts
containing conjunctions like "and", "and/or", "either/or", "neither/nor" and came
to the conclusion that if a SNOMED concept contains the word "and", it should
be treated as a "logical and" and the properties of the subjects appearing
before and after the conjunction must be present in the concept. All other cases
that entertain the idea of exclusivity allowing the presence of either one or both
subjects should be represented using the more lenient "and/or" conjunction.
        </p>
        <p>
          Fig. 1 illustrates the example of a concept Pneumonia and influenza
(disorder) which has two parents, Influenza (disorder) and Pneumonia (disorder). The
names of the parents are lexically equivalent to the subjects. It has two role
groups, one belonging to each of the parent disorders: role group 1,
containing three attribute relationships (pathological process - infectious process,
causative agent - influenza virus, finding site - structure of respiratory system)
belonging to Influenza (disorder), and role group 2, containing two attribute
relationships (associated morphology - inflammation and consolidation, finding site
- lung structure) belonging to Pneumonia (disorder). Fig. 2 illustrates the
individual disorder concepts Pneumonia (disorder) and Influenza (disorder) along
with their role groups. The diagrammatic representations of concepts are
downloaded from IHTSDO's SNOMED browser [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Based on this observation and
the semantic rules mentioned in [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], we present Assumptions 1 and 2.
Assumption 1 Concepts containing the stop-word "and" should have at least
two parents and the parents must either be lexically equivalent or must be lexical
variants of the subjects appearing before and after "and".
        </p>
        <p>Assumption 2 Concepts containing the stop-word "and" should have at least
two role groups, and the role groups should be equivalent to the role groups
of each individual concept corresponding to the subjects appearing before and
after "and".</p>
        <p>Fig. 3 illustrates the example of a concept Ornithosis with pneumonia
(disorder) which has four parents including Ornithosis (disorder) and Pneumonia
(disorder) and two role groups, one for each individual disorder parent
corresponding to the subject. Fig. 4 illustrates the individual concept Ornithosis
(disorder) along with its role group. The other parent, Pneumonia (disorder),
along with its role group is already illustrated in Fig. 2 (b). Based on this
observation, we present Assumptions 3 and 4.
Assumption 3 Concepts containing the stop-word "with" should have at least
two parents and the parents must either be lexically equivalent or must be lexical
variants of the subjects appearing before and after "with".</p>
        <p>Assumption 4 Concepts containing the stop-word "with" should have at least
two role groups, and the role groups should be equivalent to the role groups of
each individual concept corresponding to the subjects appearing before and after
"with".</p>
        <p>We formulated a set of rules based on the aforementioned assumptions which
form the backbone of our algorithm. The developed algorithm identifies missing
hierarchical relationships and missing attribute relationships, and also makes
corrective suggestions by listing lexically ideal concepts using the four assumptions.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4 Results and Discussion</title>
      <p>Table 1 displays the number of eligible concepts containing the keywords
"and" and "with" which were found in the Disorder sub-hierarchy of SNOMED's
International Edition March 2020 release. The pilot study is limited to concepts
containing a maximum of three words (excluding the hierarchy tag, (disorder))
in their Fully Specified Names (FSNs). From Table 1, we can see that out of 6989
concepts containing the stop-words "and" or "with", 92 concepts have a maximum
of three words in their FSN.</p>
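      <p>The eligibility rule described above (a stop-word present and at most three words once the hierarchy tag is removed) can be expressed as a small filter. The Python sketch below is an illustration only; the function names and the regular expression used to strip the tag are assumptions, and the example FSNs are taken from the text.</p>
      <preformat>
```python
import re

STOP_WORDS = {"and", "with"}
MAX_WORDS = 3  # pilot-study limit, hierarchy tag excluded


def name_words(fsn):
    """Words of the FSN after dropping a trailing hierarchy tag such as '(disorder)'."""
    name = re.sub(r"\s*\([^()]*\)\s*$", "", fsn)
    return name.split()


def is_eligible(fsn):
    """True if the FSN contains a stop-word and has at most MAX_WORDS words."""
    words = name_words(fsn)
    has_stop_word = any(word.lower() in STOP_WORDS for word in words)
    return has_stop_word and MAX_WORDS >= len(words)


examples = [
    "Pneumonia and influenza (disorder)",         # 3 words
    "Myopathy and diabetes mellitus (disorder)",  # 4 words
    "Hepatitis A and Hepatitis B (disorder)",     # 5 words
    "Insomnia (disorder)",                        # no stop-word
]
for fsn in examples:
    print(fsn, is_eligible(fsn))
```
      </preformat>
      <p>The stop-word itself counts towards the word limit, which is why only names of the form "subject stop-word subject" with single-word subjects pass the filter.</p>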
      <p>
        Out of the 92 concepts, 26 concepts (28.26%) were identified as missing
one or more parent(s) according to the lexical rules stated in Assumptions 1-4.
For 3 of these 26 concepts, all suggested parents belonged to the
Finding sub-hierarchy. Currently, these concepts are dropped from the analysis
due to lack of medical expertise to check conformance with the guidelines [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
but will be covered in future work after developing appropriate rules for such
cases.
      </p>
      <p>Out of the remaining 23 concepts, 16 concepts (69.56%) were found to be missing
attribute relationships. Table 2 reports the statistics of the results related to
missing hierarchical relationships and Table 3 reports the statistics of the results
related to missing attribute relationships that were obtained by our method. In
Tables 2 and 3, the second column (#) displays the number of concepts
belonging to the category described by the first column (Description), the third
column (Percentage) displays the count in terms of percentage and the fourth
and fifth columns display the "and" and "with" concept distribution of the count
respectively. Table 4 lists the top three missing parents and missing attribute
relationships identified by our method. In Table 4, the first column represents
the identified concept containing the stop-word "and" or "with", the second column
displays the suggested missing hierarchical relationship, i.e. missing parent, and
the third column represents its corresponding attribute relationship that should
ideally be present but is missing in the identified concept.</p>
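      <p>The reported percentages follow directly from the counts given above (26 of 92, and 16 of the 23 concepts left after 3 are set aside). The short Python check below is a sketch with an assumed helper name; it also shows that 16/23 is approximately 69.565%, so the figure 69.56% corresponds to truncation to two decimals, whereas rounding half up would give 69.57%.</p>
      <preformat>
```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def percentage(part, whole, rounding):
    """Return part/whole as a percentage with two decimals under the given rounding mode."""
    value = Decimal(part) * 100 / Decimal(whole)
    return value.quantize(Decimal("0.01"), rounding=rounding)


eligible = 92           # concepts with at most three words in the FSN
missing_parent = 26     # concepts missing one or more parents
set_aside = 3           # all suggested parents in the Finding sub-hierarchy
analysed = missing_parent - set_aside
missing_attributes = 16

print(percentage(missing_parent, eligible, ROUND_HALF_UP))
print(percentage(missing_attributes, analysed, ROUND_DOWN))
print(percentage(missing_attributes, analysed, ROUND_HALF_UP))
```
      </preformat>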
      <p>The results of this preliminary experiment show the potential of our
approach. The percentage of identified missing hierarchical relationships using our
method is 28.26% and that of identified missing attribute relationships is as high
as 69.56%. Fig. 5 illustrates a diagrammatic example of Scleritis and episcleritis
(disorder), one of the identified concepts with missing hierarchical and attribute
relationships. According to Assumption 1, Scleritis and episcleritis (disorder)
is missing the parents Scleritis (disorder) and Episcleritis (disorder). As a result,
it is also missing the attribute relationships Associated morphology -
Inflammatory morphology (morphologic abnormality) and Finding site - Scleral
structure (body structure), associated with Scleritis (disorder). Fig. 6 illustrates
a diagrammatic example of the suggested parent Scleritis (disorder) and
highlights the suggested missing attribute relationships that need to be added as
an additional role group to complete the modelling of Scleritis and episcleritis
(disorder).</p>
      <p>
        Since the pilot implementation of this method has a limited scope, the
following limitations are noted. Currently the method only processes FSNs
containing a maximum of three words (excluding the hierarchy tag); therefore concepts
containing composite-word disorder names like Myopathy and diabetes mellitus
(disorder) (4 words) and Hepatitis A and Hepatitis B (disorder) (5 words) are not
considered in spite of being suitable candidates. Currently, the approach does not
consider concepts containing "and/or" due to their complexity [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
Lexical variants are generated based only on the pattern "subject + syndrome",
e.g. osteochondrodysplasia with osteopetrosis (disorder) is suggested the parent
osteochondrodysplasia syndrome (disorder). As a result, other variants are neither
identified as existing parents nor included in the suggested parent list. Due to
lack of medical expertise on the team to verify the guidelines [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], we have for now
disregarded cases where the suggested parent for a disorder belongs to the
Finding sub-hierarchy. E.g. the suggestion of isoimmunization (finding) as a missing
parent for the concept pregnancy with isoimmunization (disorder) has not been
considered for further analysis. However, in spite of these limitations the method
has shown promising potential and we hope to improve the accuracy of results
further by working on the aforementioned limitations.
      </p>
    </sec>
    <sec id="sec-5">
      <title>5 Conclusion and Future Work</title>
      <p>Incomplete and inconsistent representations of CT systems cause retrieval
of incorrect or partially correct result sets. Given the critical nature of medical
data, the repercussions of such inaccurate results could be serious, ranging from
incorrect decision making in Clinical Decision Support Systems to predicting
misleading trends in Population Health Management and Predictive Analytics.
Thus, it is very important to implement effective QA measures for CT systems
to identify any inconsistencies right at the source. In this paper, we presented a
unique lexical stop-word based contextual auditing method to identify two types
of inconsistencies: missing hierarchical relationships and missing attribute
relationships. Employing a pilot version of this method has given promising results.
The percentage of identified missing attribute relationships using our method is
as high as 69.56% and that of identified missing hierarchical relationships is
28.26%. Our method has an additional advantage over other QA methods in that it not
only identifies inconsistencies but also provides a list of potential suggestions for
each identified inconsistency. Our method contributes to the improvement of a
CT system in the following ways:
1. Help to produce a complete CT system by adding the suggested relationships
to the CT system.
2. Ensure better extraction of inferential knowledge which is otherwise not
divulged due to incomplete relationships and partially defined concepts.
3. Ensure retrieval of complete information in result sets which will facilitate
informed decision making.</p>
      <p>As future work we propose to improve our algorithm to identify composite
disorder names such as Diabetes Mellitus. This will allow the algorithm to be
applied to any FSN irrespective of its length. We plan to work on all the identified
limitations. We also plan to widen the range of stop-words used in our analysis
to include "of", "due to", "to", etc. Finally, we will expand the technique to
process FSNs containing multiple stop-words instead of a single stop-word, e.g.
Disorder due to and following burn of wrist (disorder).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. IHTSDO, SNOMED Clinical Finding/Disorder, https://confluence.ihtsdotools.org/pages/viewpage.action?pageId=71172245, last accessed
          <year>2020</year>
          /07/29
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. IHTSDO, SNOMED CT Browser, https://browser.ihtsdotools.org/?, last accessed
          <year>2020</year>
          /07/30
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>PubMed</given-names>
            <surname>Help</surname>
          </string-name>
          [Internet].
          <article-title>Bethesda (MD): National Center for Biotechnology Information</article-title>
          (US);
          <fpage>2005</fpage>
          -. [Table, Stopwords], https://www.ncbi.nlm.nih.gov/books/NBK3827/table/pubmedhelp.T.stopwords/,
          <source>last accessed</source>
          <year>2020</year>
          /07/28
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Evaluating lexical similarity and modeling discrepancies in the procedure hierarchy of SNOMED CT</article-title>
          .
          <source>BMC Medical Informatics and Decision Making</source>
          <volume>18</volume>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elhanan</surname>
          </string-name>
          , G.:
          <article-title>Contrasting lexical similarity and formal definitions in SNOMED CT: Consistency and implications</article-title>
          .
          <source>Journal of biomedical informatics 47</source>
          ,
          <issue>192</issue>
          -8 (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elhanan</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Halper</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>Dissimilarities in the logical modeling of apparently similar concepts in SNOMED CT</article-title>
          .
          <source>AMIA ... Annual Symposium proceedings. AMIA Symposium</source>
          <year>2010</year>
          ,
          <volume>212</volume>
          -6 (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perl</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elhanan</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , Liu,
          <string-name>
            <surname>M.</surname>
          </string-name>
          :
          <article-title>Identifying inconsistencies in SNOMED CT problem lists using structural indicators</article-title>
          .
          <source>AMIA ... Annual Symposium proceedings. AMIA Symposium</source>
          <year>2013</year>
          ,
          <volume>17</volume>
          -
          <fpage>26</fpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perl</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elhanan</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Identifying problematic concepts in SNOMED CT using a lexical approach</article-title>
          .
          <source>Studies in health technology and informatics</source>
          <volume>192</volume>
          ,
          <fpage>773</fpage>
          –
          <lpage>777</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perl</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ochs</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elhanan</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>A contextual auditing method for SNOMED CT concepts</article-title>
          .
          <source>Int. J. Data Min. Bioinform.</source>
          <volume>15</volume>
          ,
          <fpage>372</fpage>
          –
          <lpage>391</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qazi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>A machine learning approach for quality assurance of SNOMED CT</article-title>
          .
          <source>2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)</source>
          , pp.
          <fpage>792</fpage>
          –
          <lpage>798</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qazi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Detecting modeling inconsistencies in SNOMED CT using a machine learning technique</article-title>
          .
          <source>Methods</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Revelo</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Analysis of the consistency in the structural modeling of SNOMED CT and CORE problem list concepts</article-title>
          .
          <source>2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)</source>
          , pp.
          <fpage>292</fpage>
          –
          <lpage>296</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Bodenreider</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Identifying missing hierarchical relations in SNOMED CT from logical definitions based on the lexical features of concept names</article-title>
          . In: ICBO/BioCreative (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Bodenreider</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burgun</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rindflesch</surname>
            ,
            <given-names>T.C.</given-names>
          </string-name>
          :
          <article-title>Lexically-suggested hyponymic relations among medical terms and their representation in the UMLS</article-title>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Bodenreider</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burgun-Parenthoine</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rindflesch</surname>
            ,
            <given-names>T.C.</given-names>
          </string-name>
          :
          <article-title>Assessing the consistency of a biomedical terminology through lexical knowledge</article-title>
          .
          <source>International journal of medical informatics</source>
          <volume>67</volume>
          (
          <issue>1-3</issue>
          ),
          <fpage>85</fpage>
          –
          <lpage>95</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Ceusters</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elkin</surname>
            ,
            <given-names>P.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Negative findings in electronic health records and biomedical ontologies: A realist approach</article-title>
          .
          <source>International journal of medical informatics</source>
          <volume>76</volume>
          (
          <issue>Suppl 3</issue>
          ),
          <fpage>S326</fpage>
          –
          <lpage>S333</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Cui</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tao</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Case</surname>
            ,
            <given-names>J.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bodenreider</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>G.Q.</given-names>
          </string-name>
          :
          <article-title>Mining non-lattice subgraphs for detecting missing hierarchical relations and concepts in SNOMED CT</article-title>
          .
          <source>Journal of the American Medical Informatics Association : JAMIA</source>
          <volume>24</volume>
          ,
          <fpage>788</fpage>
          –
          <lpage>798</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Marko</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schulz</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hahn</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Morphosaurus – design and evaluation of an interlingua-based, cross-language document retrieval engine for the medical domain</article-title>
          .
          <source>Methods of information in medicine</source>
          <volume>44</volume>
          (
          <issue>4</issue>
          ),
          <fpage>537</fpage>
          –
          <lpage>545</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Mendonca</surname>
            ,
            <given-names>E.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cimino</surname>
            ,
            <given-names>J.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Campbell</surname>
            ,
            <given-names>K.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spackman</surname>
            ,
            <given-names>K.A.</given-names>
          </string-name>
          :
          <article-title>Reproducibility of interpreting "and" and "or" in terminology systems</article-title>
          .
          <source>Proceedings. AMIA Symposium</source>
          pp.
          <fpage>790</fpage>
          –
          <lpage>794</lpage>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Pacheco</surname>
            ,
            <given-names>E.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stenzhorn</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nohama</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paetzold</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schulz</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Detecting underspecification in SNOMED CT concept definitions through natural language processing</article-title>
          .
          <source>AMIA ... Annual Symposium proceedings. AMIA Symposium</source>
          <volume>2009</volume>
          ,
          <fpage>492</fpage>
          –
          <lpage>496</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Rector</surname>
            ,
            <given-names>A.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iannone</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Lexically suggest, logically define: Quality assurance of the use of qualifiers and expected results of post-coordination in SNOMED CT</article-title>
          .
          <source>Journal of biomedical informatics</source>
          <volume>45</volume>
          (
          <issue>2</issue>
          ),
          <fpage>199</fpage>
          –
          <lpage>209</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Schulz</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martínez-Costa</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miñarro-Giménez</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          :
          <article-title>Lexical ambiguity in SNOMED CT</article-title>
          .
          In:
          <source>JOWO</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>