=Paper= {{Paper |id=Vol-3603/Paper13 |storemode=property |title=Reuse and Enrichment for Building an Ontology for Obsessive-Compulsive Disorder |pdfUrl=https://ceur-ws.org/Vol-3603/Paper13.pdf |volume=Vol-3603 |authors=Areej Muhajab,Alia Abdelmoty,Athanasios Hassoulas |dblpUrl=https://dblp.org/rec/conf/icbo/MuhajabAH23 }} ==Reuse and Enrichment for Building an Ontology for Obsessive-Compulsive Disorder== https://ceur-ws.org/Vol-3603/Paper13.pdf
                                Reuse and Enrichment for Building an Ontology for
                                Obsessive-Compulsive Disorder
                                Areej Muhajab1,∗ , Alia I. Abdelmoty1 and Athanasios Hassoulas2
                                1
                                    School of Computer Science & Informatics, Cardiff University, Wales, United Kingdom
                                2
                                    School of Medicine, Cardiff University, Wales, United Kingdom


                                                                         Abstract
                                                                         Building ontologies for mental diseases and disorders facilitates effective communication and knowledge
                                                                         sharing between healthcare providers, researchers, and patients. General medical and specialized ontolo-
                                                                         gies, such as the Mental Disease Ontology, are large repositories of concepts that require much effort to
                                                                         create and maintain. This paper proposes ontology reuse and automatic enrichment as means for
                                                                         designing and building ontologies, and demonstrates the methods by building an ontology for
                                                                         Obsessive-Compulsive Disorder (OCD). Ontology reuse is proposed through ontology alignment
                                                                         design patterns to allow for full, partial or nominal reuse. Enrichment is proposed through deep learning
                                                                         with a language representation model pre-trained on large-scale corpora of clinical notes and discharge
                                                                         summaries, as well as a text corpus from an OCD discussion forum. An ontology design pattern is
                                                                         proposed to encode the discovered related terms and their degree of similarity to the ontological concepts.
                                                                         The proposed approach allows for the seamless extension of the ontology by linking to other ontological
                                                                         resources or other learned vocabularies in the future. The OCD ontology is available online on Bioportal.

                                                                         Keywords
                                                                         Ontology, OCD, Mental health, Conceptual modeling, Ontology enrichment, Ontology reuse




                                1. Introduction
                                Obsessive-Compulsive Disorder (OCD) is a frequently debilitating and often severe mental health
                                disorder that affects approximately 2% of the population1. The Royal College of Psychiatrists
                                (RCPSYCH2) reports that approximately 1 in every 50 people suffers from OCD at some point in
                                their lives, amounting to about 1 million people in the UK, affecting men and women equally. It
                                is also noted [1] that people could spend a long time struggling with the disorder, often hiding
                                their symptoms, before they get appropriate help, possibly because of the shame or stigma
                                associated with having disturbing thoughts (e.g. ego-dystonic sexual or violent thoughts) and compulsive
                                behaviours. Coding information in electronic health records (EHR) using standard medical
                                terminologies, such as the Systematized Nomenclature of Medicine Clinical Terms (SNOMED
                                CT) [2] can facilitate the efficient recording and integration of patient notes, ultimately leading

                                Proceedings of the International Conference on Biomedical Ontologies 2023, August 28th-September 1st, 2023, Brasilia,
                                Brazil
                                Envelope-Open muhajaba@cardiff.ac.uk (A. Muhajab); AbdelmotyAI@cardiff.ac.uk (A. I. Abdelmoty);
                                HassoulasA2@cardiff.ac.uk (A. Hassoulas)
                                Orcid 0009-0007-9943-9169 (A. Muhajab); 0000-0003-2031-4413 (A. I. Abdelmoty)
                                                                       © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
                                CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073




                                1
                                  https://bestpractice.bmj.com/info/
                                2
                                  https://www.rcpsych.ac.uk/mental-health/problems-disorders/obsessive-compulsive-disorder




to more effective healthcare management [3]. However, existing clinical terminologies, such as
SNOMED CT, and ontologies, such as the Mental Disease Ontology (MDO)3 are limited with
respect to the following dimensions.
    1. Semantic Richness: Existing ontologies are evolving. No specific ontology (or
       sub-ontology) exists that delineates OCD, its types and its diagnostic symptoms in enough
       detail, for example, to distinguish it from other related disorders, such as illness anxiety
       disorder or hoarding disorder4. The creation and evaluation of such resources is a costly
       exercise that requires both domain and technology experts.
    2. Semantic Heterogeneity: The meaning and provenance of the terms or concepts used
       in these resources are not usually included or described in sufficient detail to explain
       the basis of their association with a particular classification hierarchy. For example, in
       the Disease Ontology (DOID), OCD is described as a type of Anxiety Disorder, whereas this
       classification was updated in the DSM-5 revision in 2013, where OCD is now classified as
       a type of Obsessive-Compulsive and Related Disorders (OCRDs). Also, different classification
       hierarchies for the same concept are used in different ontologies. For example, Obsession
       in the MDO is a type of Pathological Mental Process, whereas it is a type of Behavioral
       Symptom in the Medical Subject Headings ontology (MeSH)5, and a type of Content
       of Thought in SNOMED CT. Understanding and establishing the similarity of concepts
       across ontologies is a well-known research challenge.
    3. Structural Richness: Most clinical terminological resources and ontologies are described
       primarily with subsumption (IS_A) relationships and presented as large class hierarchies
       of concepts. The uncontrolled use of the IS_A relation to signify different types of relations
       (such as PART_OF, IS_INSTANCE_OF, IS_ASSOCIATED_WITH, etc.) has been noted, e.g.
       in SNOMED CT [4], leading to structural overload and possibly incorrect subsumption
       relationships. Also, some modelling constructs, such as the use of multiple inheritance,
       where a concept can have multiple parent types, can lead to complexity in reasoning with
       the information.
In this paper, we propose the development of an OCD ontology to address some of the issues
noted above. The methodology for development follows established proposals in the literature.
In particular, structural richness is addressed by making use of rich ontological modelling
in the logical definition of concepts, whilst semantic heterogeneity is addressed by the reuse
of existing resources directly in the ontology as well as creating explicit reference to related
concepts in other resources. Semantic richness is addressed by complementing the definition of
concepts in the ontology by the automatic discovery of related concepts from relevant resources
using deep learning. Ontology design patterns are proposed to encapsulate the links to other
ontologies and the discovered related concepts. The resulting ontology consists of 97 classes
(expanded to 1047 classes from other ontologies), 17 object properties (relationships) and 5
data properties. Bio_ClinicalBERT6 and a text corpus from an OCD discussion forum are used
in the ontology enrichment task. This work contributes to the efforts of building biomedical
3
  http://purl.obolibrary.org/obo/MFOMD.owl
4
  https://www.rcpsych.ac.uk/mental-health/problems-disorders/obsessive-compulsive-disorder
5
  https://www.nlm.nih.gov/mesh/meshhome.html
6
  https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT




ontologies, by providing modelling solutions that allow for the integration, reuse and enrichment
of the ontological resources. The developed OCD ontology is available on BioPortal7. The
remainder of the paper is organized as follows. Some related works on biomedical ontology
development, reuse and enrichment are reviewed in section 2. Section 3 describes the proposed
OCD ontology and outlines details of its development process. Ontology design patterns for
reuse and enrichment are presented in section 4, followed by conclusions in section 5.


2. Related Work on Ontology Reuse and Enrichment
A brief overview is given here of efforts in the field of biomedical ontologies reuse and enrich-
ment.
   An overview of prominent biomedical ontologies is given in [5]. El-Sappagh et al. [6]
review the limitations and complexity of large clinical terminology resources such as SNOMED
CT and propose its transformation into an ontology by aligning and reusing concepts from the
Ontology of General Medical Science (OGMS). Top-level concepts are first manually mapped
to OGMS, and then used to compose more refined (pre-coordinated and post-coordinated)
concepts. More recently, the integration of ontological resources has become the focus of
attention. For example, the Mondo Disease Ontology aims to harmonize disease definitions
across the world8 by integrating the classifications and relationships of commonly used disease
ontologies into a single semantically coherent resource. It employs a Bayesian approach to
ontology structure inference, combining deductive and probabilistic inference, and aims to
provide equivalence mappings with precise annotations. Another example of creating a large
ontology, one that illustrates the complexity and scale of the required effort, is the
development of CIDO, the community-based Coronavirus Infectious Disease Ontology [7].
The methodology for development of the ontology follows the OBO Foundry
ontology development principles and the eXtensible Ontology Development (XOD)
strategy, which prescribes ontology term reuse, semantic alignment, use of ontology design
patterns for new term generation, and community effort. A largely manual effort is ongoing
on the development, visual analysis and evaluation of the resulting ontology. The process
of encoding logical definitions manually when developing ontologies is a challenging task [8].
Ontology Design Patterns (ODPs) are defined as reusable modeling solutions to frequently
occurring ontology design problems and are proposed as a useful tool to address this challenge.
The complexity of ontologies and the need for checking their consistency were investigated
in [9]. Using the Foundational Model of Anatomy ontology, the authors analyzed the musculoskeletal
content and showed inconsistencies in the use of relations, a lack of definitions of relations,
and an incomplete definition of the hierarchy. They suggested the definition and use of ODPs
to address these issues. Recent approaches propose the integration of ontologies by
transforming them into a unified knowledge graph that can be homogeneously queried through
SPARQL endpoints [10]. Representing an ontology as an RDF resource including all its
entailments (all consequences of its logical definition) can help in the process of checking the
similarity and consistency of the integrated resources.

7
    https://bioportal.bioontology.org/ontologies/OCD
8
    https://mondo.monarchinitiative.org




   Ontology enrichment is a term that has been used in the literature from two points of view:
a) enriching the ontology itself; that is extending the ontology by supplementing its existing
structure with related terms and metadata, and b) enrichment by ontology; where the ontology is
used as a source for discovering related concepts in a particular domain for a particular purpose.
A common example of the latter task is the Gene Ontology (GO) term enrichment; a technique
for interpreting sets of genes by making use of the GO system of classification, in which genes
are assigned to a set of predefined bins depending on their functional characteristics [11].
   In this paper, we are concerned with the first viewpoint: enriching the ontology itself. Few
research works have addressed this problem by utilizing existing resources, or other generic
resources. For example, [12] used UMLS as the source for discovering synonyms for concepts
in the ontology. In [13], deep learning with a large corpus of PubMed review articles and
veterinary clinical notes was used to discover terms related to some pre-defined terms for
medical conditions. The results were then used to populate the UMLS Semantic Types and
Groups ontology9, whilst relying on specialized ontologies to represent the relatedness (e.g.
lexical and provenance) relationships and properties. Utilization of standardised methods of
linking ontologies to lexicographic resources10 is an important aspect of this work. The research
area of using ontologies and machine learning is still novel. An overview of how semantic
similarity measures and ontology embeddings may be incorporated with ML methods is given in
[14]. Further work needs to be done on exploiting the ontology structure, possibly by ontology
embedding, in the task of ontology enrichment.
   Several methodologies have been proposed to guide the process of developing ontologies.
The NeOn [15] methodology places emphasis on collaboration and distributed development. It
encourages the modularization of the ontology building process, where domain experts contribute to
different modules, while ensuring the overall consistency of the ontology. Systematic Approach
for Building Ontologies (SABiO) [16] is a related approach that suggests a more guided workflow
for the ontology development process, where the design of reference domain and operational
ontologies is suggested and the reuse of foundational ontologies and pattern-oriented reuse are
encouraged. The Open Biological and Biomedical Ontologies (OBO) Foundry proposed a set of
principles and guidelines [17] for the development of ontologies to promote interoperability
and standardization in the life sciences domains. OBO ontologies need to adhere to common
design patterns and share a foundational set of relations, thereby fostering seamless integration
and facilitating collaboration across diverse biomedical domains. The eXtensible ontology
development (XOD) methodology [18] is designed to be flexible and adaptable, allowing ontology
developers to extend and customize it according to the unique characteristics of the target
domain. By embracing a modular approach, XOD promotes the reuse of existing ontological
components, which minimizes duplication efforts and ensures consistency across the ontology.
In this paper, we build upon SABiO and XOD to demonstrate the reuse and enrichment of
the OCD ontology. We incorporate elements from foundational ontologies, domain-specific
ontologies, and external sources to enhance the OCD ontology, thus showcasing the potential
of leveraging existing resources to enrich and expand knowledge representation effectively.


9
    https://lhncbc.nlm.nih.gov/ii/tools/MetaMap/documentation/SemanticTypesAndGroups.html
10
     https://www.w3.org/2019/09/lexicog/




3. The OCD Ontology Development Process
An overview of the ontology development stages is shown in Figure 1. In this paper, we present
an overview of the ontology building process. The ontology evaluation tasks include: a) evaluating
the logical consistency of the ontology, checked using the DL reasoners in Protégé; b) evaluating
the capability of the ontology to answer a set of competency questions defined in the knowledge
acquisition phase, checked by formulating the questions in the SPARQL query language;
and c) expert evaluation to judge the quality and coverage of the ontology. A detailed account of
these tasks is left to future publications.




Figure 1: Ontology Development and Evaluation


   Knowledge Acquisition: Two primary resources were used to acquire basic knowledge
about the nature of the disorder and its diagnosis. These are the Diagnostic and Statistical
Manual of Mental Disorders, Fifth Edition (DSM-5) [19] and OCD assessment tools, namely, the
Yale-Brown Obsessive Compulsive Scale (Y-BOCS)11 and the Obsessive-Compulsive Inventory
(OCI)12. Statements describing OCD and its related concepts were manually extracted, giving 28
statements. Examples include “OCD is characterized by obsession, compulsion,
or both” and “the definition of obsession as an intrusive thought, image, or urge” (both examples
are from DSM-5). Y-BOCS, and other diagnostic tools, were particularly useful for identifying
types of OCD and providing specific values of defined concepts. Eight types of Obsession and
six types of Compulsion were identified. To further refine the definition of concepts that are not
described in the primary resources, further relevant resources were employed. For example, the
11
     https://pcl.psychiatry.uw.edu/wp-content/uploads/2021/12/YBOCS.pdf
12
     https://www.div12.org/wp-content/uploads/2015/07/OCI.pdf




cognitive theory of OCD [20] emphasises that an intrusive thought transforms into an obsession
when an additional meaning is attributed to it. A total of 35 statements were extracted (a full
list can be found in the GitHub repository “OCD-ontology”). In this phase, we also compiled a
set of competency questions (n=23) from diagnostic and other relevant resources to support
the development and evaluation processes. Examples of the competency questions include
“What patterns of activities does a person with aggressive thoughts have?” and “What is the type of
obsession where avoidance behavior is a frequent occurrence?”
   The Analysis Phase: Statements in the previous phase were used to identify relevant
concepts and relationships. Statements for defining specific concepts were grouped and used
to formulate a natural language definition that answers three questions: “what is the concept
defined as?”, “what are its types?”, “what are its symptoms?”. During this process, necessary
and sufficient criteria for defining the concepts were identified. These refer to the properties
(attributes and relationships) that must be present for an instance to be considered a member of
the defined class. A natural language statement defining these properties is then formulated
to allow for its transformation into logical statements in Description Logic (DL) and further
definition in the ontology. An example of this process for the definition of the concept of
Obsession is shown in Table 1. This approach allowed for a clear definition of the required
concepts and for avoiding ambiguity and inconsistency in the representation by utilising the
ontology reasoning tools over the defined ontology model. The analysis phase resulted in the
definition of a set of 97 concepts and 22 types of relationships. This includes 17 object properties
and 5 data properties.

Table 1
Example of the definition of the concept of Obsession in the analysis phase.
     Statements relevant to the concept of Obsession:
       (1) Obsessions are recurrent and persistent thoughts, urges, or images that are experienced
           as intrusive and unwanted (DSM-5).
       (2) Individual with obsession attempts to ignore or suppress such thoughts, urges, or images,
           or to neutralize them with some other thoughts or actions (i.e., by performing a
           compulsion) (DSM-5).
       (3) Individual with OCD may experience over-importance of thoughts [20].
       (4) There are 8 types of obsession (Y-BOCS).
     Natural Language Definition:
       In OCD, an Obsession can be any of: Intrusive Thought, Intrusive Image or Intrusive Impulse
       or Urge, that causes distress due to the added importance that the individual places on them.
       In OCD, Obsession is often accompanied by some Compulsions.
     Description Logic Expression:
       Obsession ≡ (Intrusive thought ⊔ Intrusive image ⊔ Intrusive urge)
                   ⊓ (∃ hasAssociatedAppraisal.ThoughtAppraisal)
     Related Identified Concepts:
       thought, intrusive thought, persistent thought, mental image, urge and thought appraisal.
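
The necessary and sufficient condition in Table 1 can be read as a membership test. The sketch below is only an illustration of that reading, not the ontology's OWL encoding: the `Individual` class and the CamelCase class names are assumptions introduced here, and the check is closed-world, whereas a DL reasoner works under the open-world assumption.

```python
from dataclasses import dataclass, field

# Class names follow the Description Logic expression in Table 1; the
# in-memory model of individuals is a hypothetical simplification.
INTRUSIVE_CLASSES = {"IntrusiveThought", "IntrusiveImage", "IntrusiveUrge"}


@dataclass
class Individual:
    types: set
    # property name -> list of filler Individuals
    relations: dict = field(default_factory=dict)


def is_obsession(ind):
    """Obsession ≡ (IntrusiveThought ⊔ IntrusiveImage ⊔ IntrusiveUrge)
                   ⊓ ∃ hasAssociatedAppraisal.ThoughtAppraisal"""
    # Union: the individual belongs to at least one of the three classes.
    in_union = bool(ind.types & INTRUSIVE_CLASSES)
    # Existential restriction: at least one filler is a ThoughtAppraisal.
    fillers = ind.relations.get("hasAssociatedAppraisal", [])
    has_appraisal = any("ThoughtAppraisal" in f.types for f in fillers)
    return in_union and has_appraisal


appraisal = Individual(types={"ThoughtAppraisal"})
with_appraisal = Individual(
    types={"IntrusiveThought"},
    relations={"hasAssociatedAppraisal": [appraisal]},
)
without_appraisal = Individual(types={"IntrusiveThought"})
print(is_obsession(with_appraisal), is_obsession(without_appraisal))
```

An intrusive thought with no associated appraisal fails the test, which mirrors statement (3) in the table: the added importance placed on the thought is part of what makes it an obsession.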



4. Ontology Design Patterns for Reuse and Enrichment
Reusing Existing Ontologies: Related ontologies were identified by searching the NCBO
BioPortal13, a comprehensive repository of biomedical ontologies. For every concept defined
13
     https://bioportal.bioontology.org/




in the analysis phase, corresponding concepts were identified in the existing ontologies. The
following heuristics were employed in the decision to reuse concepts.
     1. The external concept is considered to be fully equivalent to the required OCD concept
        if there is a complete overlap between the logical definitions of the two. In this case,
        the concept is imported directly into the ontology. When a concept is imported, all its
        related concepts, including its inheritance tree hierarchy, are also imported. For example,
        the Activity and Symptom classes are root classes in the Activity of Daily Living (ADL)
        ontology [21] and the Symptom Ontology (SYMP)14, respectively. Importing both classes into
        the OCD ontology implies the use of their complete ontologies as well.
     2. The external concept is considered to be partially equivalent to the required OCD concept
        if its logical definition can be considered part of the definition of the required concept. For
        example, “OCD” is defined in the DOID ontology as a subclass of “Anxiety Disorder”, and no
        further definition is given there. This definition is only partially sufficient for
        our ontology and needs further refinement. Hence, instead of importing the class and
        redefining it, we align our definition with the external ontology using
        owl:equivalentClass; an example is ocd:OCD ≡ DOID:OCD (where “ocd” and “DOID” are prefixes for
        the OCD ontology and the Disease Ontology, respectively). This ontology alignment design pattern
        allows flexibility of ontology specification, whilst also reusing existing resources. There
        are 13 OCD concepts aligned as equivalent to external concepts. Figure 2 illustrates the
        refinement of the definition of the class “Obsession” in the OCD ontology. “Obsession”
        is defined in MDO as MDO:MFOMD_0000109 ⊑ MDO:Pathological mental process, where
        MDO:Pathological mental process ⊑ OGMS:Pathological bodily process ⊓ (∃
        manifestationOf.Mental disorder). This definition is reused in our ontology as follows:
        ocd:Obsession ≡ MDO:MFOMD_0000109 (the class Obsession from MDO). The refinement of
        ocd:Obsession is then: ocd:Obsession ≡ (Intrusive thought ⊔ Intrusive
        image ⊔ Intrusive urge) ⊓ (∃ hasAssociatedAppraisal.ThoughtAppraisal); ocd:Obsession
        has 8 sub-classes.
     3. The external concept is considered to be nominally similar to the required OCD concept,
        if there is some overlap between the logical definitions. In this case, we are unable to
        reuse the external class, but we maintain a link to it using the Reuse Ontology Design
        Pattern, as shown in figure 3.
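
The three heuristics above can be sketched as a small decision function. This is a minimal illustration under stated assumptions rather than the procedure used to build the ontology: each logical definition is modelled as a plain set of axiom strings, "part of the definition" is read as a proper subset, the `"none"` outcome is added here for concepts with no overlap, and the IRIs in the usage lines are hypothetical placeholders.

```python
def reuse_decision(required, external):
    """Classify an external concept against a required OCD concept.

    Both arguments are sets of axiom strings, an illustrative
    simplification of a logical definition.
    """
    if not required & external:
        return "none"      # no overlap at all: nothing to reuse or link
    if external == required:
        return "full"      # heuristic 1: import the external class directly
    if external < required:
        return "partial"   # heuristic 2: align with owl:equivalentClass, refine locally
    return "nominal"       # heuristic 3: keep a link via the Reuse ODP


def equivalent_class_axiom(local_iri, external_iri):
    """Emit the alignment triple of heuristic 2 in Turtle syntax."""
    return f"<{local_iri}> owl:equivalentClass <{external_iri}> ."


required = {"subClassOf MentalDisorder", "hasSymptom Obsession"}
external = {"subClassOf MentalDisorder"}
print(reuse_decision(required, external))
print(equivalent_class_axiom("http://example.org/ocd#OCD",
                             "http://example.org/external#OCD"))
```

In practice the comparison of logical definitions is a modelling judgement rather than a set operation, so the function is best read as a summary of the decision table.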
  The set of ontologies that were reused is as follows: Mental Disease Ontology (MDO)15,
Mental Functioning Ontology (MFO)16, ADL, SYMP, Gene Ontology17, SNOMED CT, Experimental
Factor Ontology (EFO), Gender, Sex, and Sexual Orientation (GSSO) ontology, Emotion
Ontology18, and the Basic Formal Ontology (BFO)19.
  In the OCD ontology, classes that were not present in existing ontologies were created and
mapped to classes in the Basic Formal Ontology (BFO) using the rdfs:subClassOf relationship.
The mapping process took into account the characteristics of each class in the BFO and
14
   https://obofoundry.org/ontology/symp.html
15
   http://purl.obolibrary.org/obo/MFOMD.owl
16
   http://purl.obolibrary.org/obo/MF.owl
17
   https://bioportal.bioontology.org/ontologies/GO
18
   http://purl.obolibrary.org/obo/MFOEM.owl
19
   http://purl.obolibrary.org/obo/bfo.owl




Figure 2: The representation of the class “Obsession” defined in Protégé.




Figure 3: The Reuse Ontology Design Pattern.


determined the most suitable parent class for the OCD classes. A recent study by Emeruem
et al. [22] proposed an automatic tool for mapping classes to the BFO. This
tool was used to guide the mapping of three classes in the OCD ontology: ocd:Severity Level,
ocd:Assessment Criteria and ocd:Functional Impairment. Figure 4 illustrates the mapping of
the class “Functional Impairment” as ⊑ BFO:Quality.
   Ontology Enrichment with Deep Learning
   The objective here is to demonstrate how to automate the process of extracting related terms
using deep learning from a given corpus for integration into the ontology. We first explore the




Figure 4: The representation of class “ocd:Functional impairment” as ⊑ BFO:Quality based on the
“Questions History”.


efficacy of the Bio_ClinicalBERT language model in identifying terms that are semantically
similar to target terms in the ontology. Bio_ClinicalBERT leverages contextual embeddings
to capture the nuanced meaning of biomedical and clinical terminology. It is trained on a
large corpus (2 million) of clinical notes. We then employ a word2vec model trained on an
OCD-specific corpus to assess term similarity (cosine similarity). The word2vec model learns
the distributional representation of words based on their co-occurrence patterns within the
OCD corpus.
   Candidate term extraction using Bio_ClinicalBERT: Target terms are mapped to
their corresponding token IDs in the model and their hidden states are retrieved, representing
the contextualized embeddings of the tokens. Similarity scores are then computed between
the target term’s contextualized embedding and the embeddings of other terms in the model.
The top (n) terms with the highest similarity scores are then selected. A sample of candidate
terms and their similarity scores to target terms is listed in Table 2. Notably, certain target
terms, such as “compulsion”, “intrusive” and “impairment”, were not represented
in the Bio_ClinicalBERT model.
   Candidate terms from the word2vec model: Here we used
the word2vec model with the Continuous Bag of Words (CBOW) architecture to obtain a list
of relevant terms for the same target terms in the ontology. Data for this study was collected
from an OCD forum20. The Python Selenium library was employed to gather a dataset
comprising 54,410 posts from the forum, spanning the period from October 2000 to December
2021. The hyperparameter configuration of the experiment was as follows. The window
size, which determines the range of neighbouring context words considered for each target word,
was set to 5, so that words within a distance of 5 from the target word were taken
into account. Additionally, a minimum count of 3 was defined, ensuring that only words appearing
at least three times in the dataset were included during training. Cosine similarity was then
20
     Online platform dedicated to OCD-related discussions and information exchange
     https://www.mentalhealthforum.net/forum/forums/obsessive-compulsive-disorder-ocd-forum.46/




computed between the target and candidate terms and the top (n) terms were selected. A sample
of results is shown in Table 3.
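
The word2vec set-up described above (minimum count of 3, window of 5, cosine similarity, top-n selection) can be illustrated in plain Python. This is a hedged sketch of the mechanics only: it does not train embeddings, the function names are introduced here for illustration, and the toy vectors in the usage lines are made-up values, not results from the forum corpus.

```python
import math
from collections import Counter


def build_vocab(sentences, min_count=3):
    """Keep only tokens that occur at least `min_count` times in the corpus."""
    counts = Counter(tok for sent in sentences for tok in sent)
    return {tok for tok, c in counts.items() if c >= min_count}


def cbow_examples(sentence, vocab, window=5):
    """Yield (context_words, target_word) pairs as a CBOW model would see them."""
    tokens = [t for t in sentence if t in vocab]  # rare words are dropped first
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            yield context, target


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_n(target, embeddings, n=6):
    """Rank all other terms by cosine similarity to `target`, highest first."""
    scores = [(w, cosine(embeddings[target], vec))
              for w, vec in embeddings.items() if w != target]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:n]


# Toy vectors for illustration only.
toy = {"obsession": [1.0, 0.2], "theme": [0.9, 0.3], "weather": [-0.1, 1.0]}
print(top_n("obsession", toy, n=2))
```

A library implementation would normally replace the first two functions; they are spelled out here only to show what the window and minimum-count settings control.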

Table 2
Top similar terms to target terms from the Bio_ClinicalBERT model.
    Target term     Bio_ClinicalBERT (top 6 terms)
    obsession       ('obsessed', 0.651), ('fascination', 0.636), ('urges', 0.588), ('irrational', 0.573),
                    ('insistence', 0.570), ('preoccupied', 0.569)
    urge            ('urging', 0.573), ('urgency', 0.535), ('encourages', 0.519), ('invite', 0.496),
                    ('desire', 0.49), ('obsession', 0.489)



Table 3
Top similar terms to target terms from the Word2Vec model.
    Target term     Word2Vec (top 6 terms)
    obsession       ('theme', 0.984), ('behaviour', 0.983), ('compulsive', 0.983), ('fear', 0.979),
                    ('trigger', 0.974), ('habit', 0.968)
    compulsion      ('rumination', 0.987), ('pattern', 0.986), ('behaviour', 0.984), ('ritual', 0.979),
                    ('activity', 0.974), ('action', 0.974)
    urge            ('reflex', 0.988), ('mindfully', 0.984), ('opposite', 0.979), ('reaction', 0.976),
                    ('deliberately', 0.976), ('act', 0.975)
    intrusive       ('harm', 0.9782), ('triggered', 0.967), ('unwanted', 0.965), ('invasive', 0.965),
                    ('horrific', 0.965), ('disturbing', 0.962)
    impairment      ('indicative', 0.962), ('distress', 0.954), ('jealousy', 0.944), ('inexplicable', 0.940),
                    ('deliberate', 0.938), ('negativity', 0.917)

   An Enrichment ODP is proposed here, as shown in Figure 5, to record the results. Classes in
the ontology can be associated with many alternative labels, and the properties of each label,
including its similarity score, the method used and the date, are also recorded. The pattern is
a simple, generic and flexible way of associating multiple terms with the ontology. A more
sophisticated lexical representation of the associated terms can also be envisaged, e.g. by
employing the OntoLex ontology; this is the subject of ongoing work.
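
As a rough illustration of the information the pattern records, the sketch below models a class with alternative labels as plain Python data structures. It is not the OWL encoding of the pattern: the names `AlternativeLabel`, `OntologyClass`, `recorded_on` and `labels_from` are hypothetical, and the date is a placeholder. The terms and scores are taken from Tables 2 and 3.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass(frozen=True)
class AlternativeLabel:
    """A term associated with a class, with the provenance of the association."""
    text: str
    similarity_score: float
    method: str        # e.g. the language model that produced the term
    recorded_on: date  # when the association was recorded


@dataclass
class OntologyClass:
    """An ontology class that can carry many alternative labels."""
    name: str
    alternative_labels: List[AlternativeLabel] = field(default_factory=list)

    def add_label(self, label: AlternativeLabel) -> None:
        self.alternative_labels.append(label)

    def labels_from(self, method: str) -> List[AlternativeLabel]:
        """Labels produced by one method, highest similarity first."""
        matches = [l for l in self.alternative_labels if l.method == method]
        return sorted(matches, key=lambda l: l.similarity_score, reverse=True)


placeholder_date = date(2023, 1, 1)  # placeholder, not a date from the paper
obsession = OntologyClass("obsession")
obsession.add_label(AlternativeLabel("preoccupied", 0.569, "BioClinical_BERT", placeholder_date))
obsession.add_label(AlternativeLabel("obsessed", 0.651, "BioClinical_BERT", placeholder_date))
obsession.add_label(AlternativeLabel("fear", 0.979, "Word2Vec", placeholder_date))

for label in obsession.labels_from("BioClinical_BERT"):
    print(label.text, label.similarity_score)
```

Keeping the method and score on each label, rather than on the class, lets terms from different models coexist on the same class and be filtered or re-ranked later.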


Figure 5: The Enrichment Ontology Design Pattern.


5. Conclusion
Defining an ontology is costly in terms of time and effort. Devising means of automating the
process to complement the traditional approach will benefit all stakeholders. This paper
presents an approach for building an ontology for a specific mental disorder. The aim is to
demonstrate how the traditional ontology building process can be complemented by reusing
existing ontology resources and by enrichment from rich textual resources. Although the paper
uses a specific mental disorder as its example, the proposed approach is generalisable to other
use cases. A uniform approach, based on ontology design patterns, is presented for enriching
the ontology with ontological concepts and lexical terms. The degree of similarity of concepts
in the ontologies guides the modelling of the related concepts. Machine learning is used to
discover terms similar to concepts in the ontology, using language models trained on large text
corpora. The degree of similarity with the lexical terms is explicitly encoded. Results from
applying the methods to two different corpora are presented. The paper outlines the approach
and sets the way for further work on several fronts, including refactoring the patterns to allow
for richer modelling of lexical similarity and further refining the logical definition of the
ontology based on expert evaluation. The detailed processes of building the ontology and its
evaluation are the subject of future publications.

 Acknowledgment
We would like to acknowledge the scholarship and support provided by Taif University.


References
 [1] D. Veale, A. Roberts, Obsessive-compulsive disorder, Bmj 348 (2014) g2183.
 [2] K. Donnelly, et al., Snomed-ct: The advanced terminology and coding system for ehealth,
     Studies in health technology and informatics 121 (2006) 279.
 [3] M. Adnan, J. Warren, M. Orr, Ontology based semantic recommendations for discharge
     summary medication information for patients, in: 2010 IEEE 23rd International Symposium
     on Computer-Based Medical Systems (CBMS), IEEE, 2010, pp. 456–461.
 [4] O. Bodenreider, R. Cornet, D. J. Vreeman, Recent developments in clinical terminolo-
     gies—snomed ct, loinc, and rxnorm, Yearbook of medical informatics 27 (2018) 129–139.
 [5] J. D. Ferreira, D. C. Teixeira, C. Pesquita, Biomedical ontologies: Coverage, access and use,
     Systems Medicine (2020).
 [6] S. El-Sappagh, F. Franda, F. Ali, K.-S. Kwak, Snomed ct standard ontology based on the
     ontology for general medical science, BMC medical informatics and decision making 18
     (2018) 1–19.
 [7] Y. He, H. Yu, A. Huffman, A. Y. Lin, D. A. Natale, J. Beverley, L. Zheng, Y. Perl, Z. Wang,




                                                                                                        152
     Y. Liu, et al., A comprehensive update on cido: the community-based coronavirus infectious
     disease ontology, Journal of Biomedical Semantics 13 (2022) 1–21.
 [8] M. El Ghosh, F. Ghazouani, B. Birene, E. Akan, J. Charlet, F. Dhombres, Modeling logical
     definitions in biomedical ontologies by reusing ontology design patterns, in: ICBO’21:
     International Conference on Biomedical Ontologies, 2021, p. 20.
 [9] M. D. Clarkson, L. T. Detwiler, K. M. Platt, S. Roggenkamp, Assessing the consistency of
     modeling in complex ontologies: A study of the musculoskeletal system of the foundational
     model of anatomy, Proceedings http://ceur-ws. org ISSN 1613 (2021) 0073.
[10] J. P. Balhoff, U. Bayindir, A. R. Caron, N. Matentzoglu, D. Osumi-Sutherland, C. J. Mungall,
     Ubergraph: integrating obo ontologies into a unified semantic graph, Proceedings
     http://ceur-ws. org ISSN 1613 (2022) 0073.
[11] A. Tomczak, J. M. Mortensen, R. Winnenburg, C. Liu, D. T. Alessi, V. Swamy, F. Vallania,
     S. Lofgren, W. Haynes, N. H. Shah, et al., Interpretation of biological experiments changes
     with evolution of the gene ontology and its annotations, Scientific reports 8 (2018) 5115.
[12] A. M. Rajput, H. Gurulingappa, Semi-automatic approach for ontology enrichment using
     umls, Procedia Computer Science 23 (2013) 78–83.
[13] M. Arguello-Casteleiro, R. Stevens, J. Des-Diz, C. Wroe, M. J. Fernandez-Prieto, N. Maroto,
     D. Maseda-Fernandez, G. Demetriou, S. Peters, P.-J. M. Noble, et al., Exploring semantic
     deep learning for building reliable and reusable one health knowledge from pubmed
     systematic reviews and veterinary clinical notes, Journal of biomedical semantics 10 (2019)
     1–28.
[14] M. Kulmanov, F. Z. Smaili, X. Gao, R. Hoehndorf, Semantic similarity and machine learning
     with ontologies, Briefings in bioinformatics 22 (2021) bbaa199.
[15] M. C. Suárez-Figueroa, A. Gómez-Pérez, M. Fernández-López, The neon methodology for
     ontology engineering, in: Ontology engineering in a networked world, Springer, 2012, pp.
     9–34.
[16] R. de Almeida Falbo, Sabio: Systematic approach for building ontologies., Onto.
     Com/odise@ Fois 1301 (2014).
[17] B. Smith, M. Ashburner, C. Rosse, J. Bard, W. Bug, W. Ceusters, L. J. Goldberg, K. Eilbeck,
     A. Ireland, C. J. Mungall, et al., The obo foundry: coordinated evolution of ontologies to
     support biomedical data integration, Nature biotechnology 25 (2007) 1251–1255.
[18] Y. He, Z. Xiang, J. Zheng, Y. Lin, J. A. Overton, E. Ong, The extensible ontology development
     (xod) principles and tool implementation to support ontology interoperability, Journal of
     biomedical semantics 9 (2018) 1–10.
[19] R. N. Kocsis, Book review: Diagnostic and statistical manual of mental disorders: (dsm-5),
     2013.
[20] D. Julien, K. P. O’Connor, F. Aardema, Intrusive thoughts, obsessions, and appraisals in
     obsessive–compulsive disorder: A critical review, Clinical Psychology Review 27 (2007)
     366–383.
[21] P. R. Woznowski, E. L. Tonkin, P. A. Flach, Activities of daily living ontology for ubiquitous
     systems: Development and evaluation, Sensors 18 (2018) 2361.
[22] C. Emeruem, C. M. Keet, Z. C. Khan, S. Wang, Bfo classifier: aligning domain ontologies
     to bfo, CEUR Workshop Proceedings (2022).




                                                                                                      153