Protégé-TS: An OWL Ontology Term Selection Tool

Ian Hyland and Renate A. Schmidt

The University of Manchester, UK

Abstract. This paper introduces Protégé-Term Selection (Protégé-TS), a software tool developed to support and partially automate the process of constructing OWL ontology signatures. Protégé-TS works from an OWL 2 based source ontology to assist the user in creating a signature, which is composed of a list of concepts and roles. Such signature lists are often needed when working with ontologies. For example, the signature can be fed to a forgetting or modularisation tool in order to compute an output ontology which retains all the relevant logical entailments of the source ontology. The Protégé-TS tool implements a range of operations which allow the user to create and edit signatures of concept and role names in ontologies. This paper describes the functionality and architecture of the tool, which has been implemented as a Java plugin to Protégé, and an evaluation of the tool when applied to several large-scale ontologies, with an example focus on the medical ontology SNOMED-CT.

Keywords: OWL Ontology · Term Selection Approaches · Term Selection Tool · Protégé Plugin

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction

As a formal mechanism for knowledge representation, a wide range of ontologies have been constructed covering several application domains. Many of these ontologies have grown to such a size and complexity that their utility is undermined from the human perspective, for example, in understanding, performing maintenance and knowledge sharing, and for the ability of software to compute formal reasoning tasks. An area of much recent research is forgetting, whereby a large ontology is reduced in size and the retained portion of the ontology is focused on supporting a given application domain use case. Several forgetting algorithms and software tools (e.g., FAME [36] and LETHE [22]) have already been developed, and can process a source ontology to generate a smaller output ontology which still retains all the relevant logical entailments of the source. These tools are fed with a signature of terms, i.e., concepts (OWL classes) and roles (OWL object properties), which specifies the parts of the source ontology that are to be forgotten or kept.

The essence of this paper is the development of operations and strategies for creating term lists that constitute the signature, and the implementation of a software tool to support and partially automate generation of the signature for input into a forgetting tool or for other purposes. Building upon the industry-standard Protégé OWL ontology editor [30] with reasoning tool support, the term selection tool is implemented as a Protégé Java plugin. The usability, performance, scalability and functionality of the tool are demonstrated to provide valuable support for execution, storing and replaying of term selection operations, and the tool is specifically tested and evaluated against several large-scale ontologies, with a focus on SNOMED-CT [19].

2 Term Selection

Many different approaches to term selection are mooted in the literature.

Modularisation [5, 8, 9] requires the user to specify a seed signature as input to compute a module of an ontology. All entities related in some respect to the chosen entities will be included in the module.
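To illustrate seed-signature-based modularisation, the sketch below uses the OWL API's syntactic locality module extractor (the "OWL API Module Extractor" referred to later in this paper) to compute a STAR module for a small seed signature. This is a minimal sketch only: the ontology file path, entity IRIs and class name are hypothetical placeholders, not part of Protégé-TS.

```java
import java.util.HashSet;
import java.util.Set;

import org.semanticweb.owlapi.apibinding.OWLManager;
import org.semanticweb.owlapi.model.IRI;
import org.semanticweb.owlapi.model.OWLAxiom;
import org.semanticweb.owlapi.model.OWLDataFactory;
import org.semanticweb.owlapi.model.OWLEntity;
import org.semanticweb.owlapi.model.OWLOntology;
import org.semanticweb.owlapi.model.OWLOntologyManager;

import uk.ac.manchester.cs.owlapi.modularity.ModuleType;
import uk.ac.manchester.cs.owlapi.modularity.SyntacticLocalityModuleExtractor;

public class SeedSignatureModule {
    public static void main(String[] args) throws Exception {
        OWLOntologyManager man = OWLManager.createOWLOntologyManager();
        // Hypothetical local copy of the source ontology.
        OWLOntology ont = man.loadOntologyFromOntologyDocument(IRI.create("file:ontology.owl"));
        OWLDataFactory df = man.getOWLDataFactory();

        // Seed signature: the concepts and roles the module must cover (illustrative IRIs).
        Set<OWLEntity> seed = new HashSet<>();
        seed.add(df.getOWLClass(IRI.create("http://example.org/onto#SunburnOfFirstDegree")));
        seed.add(df.getOWLObjectProperty(IRI.create("http://example.org/onto#findingSite")));

        // STAR locality module: the axioms deemed relevant to the seed signature.
        SyntacticLocalityModuleExtractor extractor =
                new SyntacticLocalityModuleExtractor(man, ont, ModuleType.STAR);
        Set<OWLAxiom> module = extractor.extract(seed);
        System.out.println("Module contains " + module.size() + " axioms");
    }
}
```

The quality of the extracted module is thus driven directly by the quality of the seed signature, which is exactly the list that a term selection tool is intended to help construct.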
With information removal, parts of the ontology are selected for removal, resulting in a module without all the detail of the original ontology. Abstraction by breadth hides unnecessary knowledge: it occurs when some relational properties of entities are removed to provide a simpler view, hence the breadth of the ontology is reduced. Abstraction by depth also hides unnecessary knowledge and occurs when high-level concepts from the source ontology are kept and lower-level concepts are removed; hence the depth of the ontology is reduced.

FASTR [20] describes a unification-based system that efficiently identifies technical terms and demonstrates the complexity of the data that motivated its fundamental design decisions. [26] presents a statistical approach to terminology extraction which is applicable to all languages, although it includes some language-specific parameters. The method is used for the automatic identification of terminology; it is theoretically and computationally simple and disregards resources such as linguistic or ontological knowledge. [23] describes an evaluation of compound term extraction from a corpus in the domain of paediatrics. Bigrams and trigrams were automatically extracted from a corpus composed of 283 texts using three different extraction methods. [28] describes a terminology-driven framework that integrates several components: automatic term recognition; term variation handling; acronym acquisition; and automatic discovery of term similarities and clusters. [15] presents a natural language processing (NLP) based automatic extraction protocol for specialised corpus analysis, using NLP tools to define semantic hierarchies of verbs. By combining semantic and syntactic analysis of the results in the verb macro-structure, it illustrates the evolution of meaning from more general to more specific verbs.

Fig. 1. Nodes identified by graph traversal heuristics. Source: López-García et al. (2012)

Utilising frequency-based filtering, a study was conducted by [24] based on SNOMED-CT, whereby filtering of terms against MEDLINE [25] was used to reduce SNOMED-CT module sizes without discarding relevant concepts. Signature subsets were first extracted using four graph-traversal heuristics and one logic-based technique. These were subsequently filtered with frequency information from MEDLINE. Fig. 1 illustrates the four heuristics, which require identification of concept UpSets, DownSets and concept-role relationships. [24] summarises that graph-traversal strategies and frequency data drawn from an authoritative source can prune large ontologies and produce modules that exhibit acceptable coverage. However, it is also noted that evaluating the performance and optimality of modules is extremely difficult, and [5, 7] conclude that "there is no universal way to modularize an ontology".

With regard to domain relevance, [34] notes that for effective search and management of large amounts of medical image and patient data, it is relevant to know the kind of information that clinicians and radiologists seek. They describe statistical clinical query pattern derivation, an approach to obtaining this information semi-automatically which is based on predicting clinical query patterns given medical ontologies, domain corpora and statistical analysis.
Their aim was to discover radiologists' and clinicians' information needs by using semi-automatic text analysis methods that are independent of expert interviews. Additionally, [35] define a multi-perspective approach to term selection, which is based on three core assumptions. The goal is to reduce module size by using a more strictly selected and therefore more domain-specific set of ontology concepts.

The SNOMED-CT Expression Constraint Language (ECL) [18] is a formal syntax that enables the definition of a subset of SNOMED-CT concepts represented as an expression constraint. The expressions are computable rules used to define bounded sets of concepts gathered using constraint operations such as descendantOf, ancestorOf or memberOf and set operations such as AND, OR and MINUS. The constraints can be used to restrict the selected concepts, for example, as a machine-processable query, or as the range of an attribute defined in the concept model. Two equivalent syntaxes are defined: a brief version used for machine-to-machine communication, and a long form which aids human readability. Several ECL browsers are publicly available, including [4, 31, 32], plus language parser tools such as SNOMED-CT Parser and ECL Parser.

Signature adjustment is used in an iterative combined modularisation and forgetting approach to ontology extraction [3]. This involves extension and, if required, partitioning of the given signature, and evaluation by a domain expert. [6] suggests an eight-stage framework to perform modularization of an ontology. The framework seems equally applicable to forgetting-based ontology processing.

The forgetting tools FAME [36] and LETHE [22] take as their input either a forgetting signature, i.e., a list of terms (concepts and roles) to be forgotten, or a keep signature, i.e., a list of terms to be kept. Logically, for a given ontology, the union of the forgetting and keep signatures must comprise all concepts and roles in the ontology. Both tools expect the user to provide the list of forget terms or keep terms. In the very basic GUI versions (of FAME and LETHE) the user can compose the relevant signature by selecting individual concepts/roles, or groups of concepts/roles, i.e., by using the shift and control keys in two windows listing all concept and role symbols occurring in the loaded ontology. This approach is entirely usable for very small ontologies, where a single-shot selection of the forgetting signature is adequate. Large ontologies like SNOMED-CT, however, contain hundreds of thousands of terms, and scrolling through these to select individual terms is impractical. As an essentially single-shot approach, there is no support for interactive and incremental signature construction. The formulae of the ontology are displayed, but the ontology structure (concept and role hierarchies), the axiom definitions associated with a given concept, any role domain and range restrictions, general class axioms, and annotations are not accessible. Each term is either selected or it is not. There is no capability to process terms more intelligently based on their place in the hierarchy, e.g., UpSet, DownSet and Equivalence, or on other properties like General Class Axioms (GCAs) and role relationships between concepts. There is no search capability, e.g., to search for a given concept, role or annotation.
There is no access to external reasoning capability, using tools like ELK [11] for classification, or the ability to apply queries along the lines of DL Query in Protégé. There is no capability to save and load a forgetting or keep signature, e.g., for archiving, sharing and off-line analysis. There is no help assistance, i.e., display of a help screen or pop-up help text. Clearly, the GUIs only support FAME or LETHE, and not, for instance, other ontology extraction tools like the OWL API Module Extractor.

Because there are so many applications of term selection, and Protégé is a widely used ontology editor, we have developed a term selection plugin to give Protégé users the capability to conveniently and flexibly create term lists for use as input to forgetting or other tools, or for other purposes. The functional requirements of our Protégé-TS plugin were driven by the issues with the term selection approaches discussed above.

3 Term Selection Tool Requirements

At the highest level the Protégé-TS tool supports the concurrent, interactive and incremental development of two lists of concepts and roles, termed the keep and forget signatures. High-level user interaction is illustrated in Fig. 2.

The tool introduces a new Term Selection tab into the Protégé main menu. Upon clicking on any of the tab submenus a welcome screen is displayed, including pointers to help text and licence conditions. The welcome screen is displayed only once. A Help Screen is included to display more detailed instructions. Hovering the mouse pointer over a tab will show pop-up help text. Various error messages are defined, for example, covering situations which require an ontology to have already been loaded (Error: First Load an Ontology) or an operation requiring that a concept or role is first selected (Error: Concept or Role Not Selected).

The Metrics tab shows basic metrics of the ontology and signatures, e.g., total numbers of concepts, roles and axioms, and the concepts and roles that are assigned to the keep/forget signatures. Note that both keep and forget signatures are initially empty. The Display tab shows the current contents of the keep/forget signatures. A message will be displayed if the signature is too large to display on screen. The Results tab allows the display of the results of each operation performed to be toggled on/off.

The All–Forget/Keep tab will place all concepts and roles in the ontology into either the forget or the keep signature. When the command completes, a summary of the ontology metrics and signatures is displayed. The main action of All is to update the concept and role annotation section. For example, running All–Forget (with reference to Fig. 3), the annotation of concept A1 is updated to include ]-Term-Selection-Concept-Forget and that of role r111 to include ]-Term-Selection-Role-Forget, indicating that both are in the forget signature. The All–Clear tab empties both the keep and forget signatures, i.e., it deletes the ]-Term-Selection-xxx annotations.

The Entity tab adds a single entity (concept or role) to the keep/forget signatures. With this and the following actions, the individual concept/role annotation is updated with the appropriate text, e.g., ]-Term-Selection-Concept-Keep (a sketch of how such an annotation update can be made via the OWL API is given below). The Equivalent tab adds all equivalent concepts or roles to the keep/forget signatures.
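The following is a minimal sketch, in terms of the OWL API, of how signature membership can be recorded as an entity annotation. The annotation property IRI, label text and class name are illustrative assumptions, not necessarily those used internally by Protégé-TS.

```java
import org.semanticweb.owlapi.model.IRI;
import org.semanticweb.owlapi.model.OWLAnnotation;
import org.semanticweb.owlapi.model.OWLAnnotationAssertionAxiom;
import org.semanticweb.owlapi.model.OWLAnnotationProperty;
import org.semanticweb.owlapi.model.OWLDataFactory;
import org.semanticweb.owlapi.model.OWLEntity;
import org.semanticweb.owlapi.model.OWLOntology;
import org.semanticweb.owlapi.model.OWLOntologyManager;

public final class SignatureTagger {

    // Hypothetical annotation property; the IRI used by Protégé-TS may differ.
    private static final IRI TAG_IRI = IRI.create("http://example.org/term-selection#tag");

    /** Record that an entity belongs to a signature by attaching an annotation
     *  assertion carrying a label such as "Term-Selection-Concept-Forget". */
    public static void tag(OWLOntologyManager man, OWLOntology ont,
                           OWLEntity entity, String label) {
        OWLDataFactory df = man.getOWLDataFactory();
        OWLAnnotationProperty prop = df.getOWLAnnotationProperty(TAG_IRI);
        OWLAnnotation ann = df.getOWLAnnotation(prop, df.getOWLLiteral(label));
        OWLAnnotationAssertionAxiom ax =
                df.getOWLAnnotationAssertionAxiom(entity.getIRI(), ann);
        man.addAxiom(ont, ax); // the new annotation then appears in Protégé's entity views
    }
}
```

Because the assertion is added through the ontology manager, the change is visible alongside the entity's other annotations and is saved together with the ontology.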
The DownSet tab adds the down set (i.e., all subconcepts or subroles) to the keep/forget signatures, and the UpSet tab adds the upset (i.e., all super concepts or super roles) to the keep/forget signatures.

Fig. 2. Protégé Term Selection High-level User Interaction

The General Class Axioms (GCA) tab adds to the forget/keep signatures all concepts and roles appearing in the GCA axiom definition. Note, the processing of these four operations utilises the reasoning capabilities as detailed below.

The Roles tab is applied to process Concept-Role Chains. Consider the example test ontology expressed in Manchester OWL Syntax (MOS)

T SubClassOf r1 some U
U SubClassOf r2 some V
V SubClassOf r3 some W
W SubClassOf r4 some X
U EquivalentTo EQ1
W EquivalentTo EQ2

which is shown diagrammatically in Fig. 4 (concepts in circles, roles as arrows). The user is prompted to enter the number of hops, which is an integer in the range 1 to 8. The hop number controls how many of the concepts/roles are included in the processing to update the keep/forget signatures (a simplified traversal of this kind is sketched below).

Fig. 3. Protégé Term Selection Tab and Concept/Role Annotations

Fig. 4. Concept-Role Chains

For example, selecting concept T and performing operation Roles–1 includes concepts T, U, and role r1 in the signature, whereas performing Roles–4 includes the concepts T, U, V, W, X, the equivalent concepts EQ1, EQ2 and the roles r1, r2, r3, r4. Thus, concepts and roles encountered within four hops along the chain are included in the list.

Any given concept can be either fully defined or primitive. A fully defined concept is complete, i.e., it contains relationships that represent the full set of necessary and sufficient conditions [1]. In contrast, a primitive concept is incomplete, i.e., the set of conditions is insufficient to fully define the concept, and it therefore does not have any specified equivalent classes. By default, when processing the Roles function, both defined and primitive concepts are included in the signature. The Equivalent concepts–Primitive only tab turns off the inclusion of defined concepts, i.e., it will exclude defined equivalent concepts and include only primitive concepts. Executing the Roles–4 operation now excludes the equivalent classes EQ1, EQ2 from the signatures.

When processing a concept, by default Protégé-TS will include all of the axiom definitions associated with the concept. For example, if a concept definition contains the axiom r1 some Z then both the concept Z and the role r1 are added to the signature. Protégé-TS includes the capability to toggle on/off the inclusion of axiom processing.

Any new concept or role added to an ontology will not automatically be part of the keep/forget signatures. The Undefined tab is used to set these undefined concepts and roles into either the keep or the forget signature.
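To make the hop-bounded Roles-n processing more concrete, here is a minimal sketch of one way such a traversal over existential restrictions could be written with the OWL API. It is an illustrative assumption rather than the actual Protégé-TS implementation, and the handling of equivalent concepts (EQ1, EQ2 above) is omitted for brevity.

```java
import java.util.Set;

import org.semanticweb.owlapi.model.OWLClass;
import org.semanticweb.owlapi.model.OWLClassExpression;
import org.semanticweb.owlapi.model.OWLObjectProperty;
import org.semanticweb.owlapi.model.OWLObjectSomeValuesFrom;
import org.semanticweb.owlapi.model.OWLOntology;
import org.semanticweb.owlapi.model.OWLSubClassOfAxiom;

public final class RoleChainCollector {

    /** Follow existential restrictions (r some D) from a start concept for a bounded
     *  number of hops, collecting the concepts and roles encountered along the way. */
    public static void collect(OWLOntology ont, OWLClass start, int hops,
                               Set<OWLClass> concepts, Set<OWLObjectProperty> roles) {
        concepts.add(start);
        if (hops == 0) {
            return;
        }
        for (OWLSubClassOfAxiom ax : ont.getSubClassAxiomsForSubClass(start)) {
            OWLClassExpression sup = ax.getSuperClass();
            if (sup instanceof OWLObjectSomeValuesFrom) {
                OWLObjectSomeValuesFrom some = (OWLObjectSomeValuesFrom) sup;
                // Record the role of the restriction and recurse into its filler.
                roles.addAll(some.getProperty().getObjectPropertiesInSignature());
                for (OWLClass next : some.getFiller().getClassesInSignature()) {
                    if (!concepts.contains(next)) {
                        collect(ont, next, hops - 1, concepts, roles);
                    }
                }
            }
        }
    }
}
```

Applied to the example ontology with start concept T and a hop count of 4, a traversal of this kind reaches U, V, W and X and collects the roles r1 to r4, mirroring the behaviour described above.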
The Delete tab will delete from the ontology all concepts, roles and associated axiom definitions that have been assigned to one of the keep or forget signatures. The delete capability was intended to be utilised in experiments with the OWL API Modularisation tool.

The Signature Save/Load tabs allow the keep and forget signatures to be saved to disk files or loaded from disk files. Note: only the filename needs to be entered, the .txt extension is added automatically, and error messages are displayed as appropriate. Protégé-TS will automatically keep an internal list of the commands as they are executed, with the ability to save the log (Command–Save) and replay it as commands (Command–Load). Operations that are applicable to the commands feature are All, Delete, Undefined, Entity, DownSet, UpSet, Equivalent, GCA, Roles, Results, Axioms and Equivalent Concepts. The command log can also be cleared, using Command–Clear.

Considering user deployment constraints, Protégé-TS is intended to be utilised by research users without recourse to specialist hardware. The supported user run-time environment for Protégé-TS was therefore targeted to be a mid-range Microsoft Windows (version 10) laptop, or better; in performance and capacity terms, mid-range is defined as at least an Intel i5 CPU with 4 cores, 8 GB of memory and 132 GB of disk.

From the perspective of performance and scalability, the requirements were intended to support incremental and interactive development of the forget and keep signatures. To maintain a responsive user interface, most of the functions detailed above should typically execute in a few seconds. The most "heavyweight" functions, such as deleting large parts of the ontology, i.e., deleting concepts, roles and axioms identified by the keep or forget signature, were allowed to execute in less than a minute. Protégé-TS should support the larger available ontologies such as SNOMED-CT, which can contain hundreds of thousands of concepts and axioms. Ontology expressivity should include support for all of OWL DL.

Concerning licencing requirements, Protégé-TS is available under the GNU General Public Licence version 3 or any later version. The welcome screen contains the standard text recommended by GNU [13]. Copies of the relevant Protégé-TS source files, e.g., Java code, build and configuration files, test cases and the user guide, are publicly available on GitHub [16].

Fig. 5. Protégé-TS Software Architecture

4 Term Selection Tool Software Architecture

The software architecture of Protégé-TS is illustrated in Fig. 5 and has the following major components.

The Eclipse Integrated Development Environment [10] is used for editing the Java [21] source and .xml files. The Maven build system specifies the build dependencies as detailed in the pom.xml file. The file specifies dependencies, i.e., groupId, artifactId and version number, on the Protégé editor and utilises version 5.1.11 of the OWL API [29]. Additionally, it includes build instructions relating to the maven-compiler-plugin and maven-eclipse-plugin. A successful build generates the Protégé Java .jar plugin file.
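Each menu entry in the resulting plugin is ultimately backed by a Java action class. The following is a rough, hypothetical sketch of such a class, assuming the standard ProtegeOWLAction base class of the Protégé editor API; the class name and dialog content are illustrative and not taken from the actual Protégé-TS source.

```java
import java.awt.event.ActionEvent;

import javax.swing.JOptionPane;

import org.protege.editor.owl.ui.action.ProtegeOWLAction;
import org.semanticweb.owlapi.model.OWLOntology;

/** Hypothetical, stripped-down menu action; not the actual Protégé-TS source. */
public class ShowMetricsAction extends ProtegeOWLAction {

    @Override
    public void initialise() {
        // Called once when Protégé instantiates the action declared in plugin.xml.
    }

    @Override
    public void dispose() {
        // Release any resources held by the action.
    }

    @Override
    public void actionPerformed(ActionEvent event) {
        // Report simple metrics for the ontology currently loaded in Protégé.
        OWLOntology ontology = getOWLModelManager().getActiveOntology();
        int concepts = ontology.getClassesInSignature().size();
        int roles = ontology.getObjectPropertiesInSignature().size();
        JOptionPane.showMessageDialog(getOWLWorkspace(),
                "Concepts: " + concepts + ", Roles: " + roles);
    }
}
```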
The plugin.xml file specifies the layout of the new Protégé-TS tab and the one-to-one association between each individual tab and a single Java Action Event Class. The Java Action Event Classes are the code invoked when a given tab is clicked; these classes control all GUI actions, and each class has a one-to-one relationship with an entry in the plugin.xml file. The Java Action Event Classes utilise the Java Swing library to manage GUI interactions, for example, displaying messages, reading number input and load/save filenames. There are 34 of these classes in Protégé-TS. The Java Worker Classes provide support functions to the Java Action Event Classes, for example, updating concept and role annotations, traversing the ontology to process DownSet, UpSet, Equivalent concepts and GCA, constructing message content for display, saving/loading signature files and calculating ontology metrics. There are 22 of these classes in Protégé-TS.

The capabilities of the OWL API are used extensively by both the Java Action Event Classes and the Java Worker Classes. This includes the inbuilt structural reasoner, supplemented by the ELK [11] reasoner, which is optimised for processing ontologies with EL-level expressivity.

Following execution of the Maven build, the plugin .jar file is generated and copied into the Protégé plugins directory. When Protégé is restarted, the plugin is loaded and the new Term Selection tab appears in the main menu. Protégé-TS can then be utilised with the source ontology, command log and signature files to generate revised signature files and a results ontology containing the inserted term selection annotations, ready to be processed by forgetting tools such as FAME and LETHE.

5 Evaluation of the Term Selection Tool

Functionality tests are performed against a test ontology containing a range of OWL expression types constructed to exercise the full set of functionalities described in Section 3. The test plan and results are available on GitHub [16]. Additional OWL expressions were defined that are based on the expression forms found in SNOMED-CT, see Fig. 6.

Fig. 6. Test Ontology OWL Expression Types: concept and role hierarchies, equivalent concepts and roles, SubClassOf, general class axioms, concept-role chains, SNOMED-CT-like expressions, and complex roles.

Detailed performance and scalability tests were executed against the SNOMED-CT ontology and are also available on GitHub [16]. All tests passed, with one exception: when deleting large parts of the ontology, the test exceeded the target allocated run-time limit. To test the general applicability of Protégé-TS, a small number of tests were executed against other sample ontologies, which encompass different expressivity and scale. Examples included FMA [12], GO [14], Uberon [33], ICD [17] and NCIt [27].

Usability testing was performed using the System Usability Scale (SUS), a simple 10-item questionnaire designed by [2] in the mid-1980s to assess a user's overall satisfaction with a product. Three academic staff and research users, who are all highly experienced in OWL, Protégé and the SNOMED-CT ontology, performed the test, and the 88% SUS score achieved indicates a high level of perceived usability.

Several use case questions were defined to exercise the available features of Protégé-TS and evaluate its real-world utility.
Example questions were to find "cause for severe sunburn damaged skin", "available blood pressure measurement techniques", and "list of bones in the hand". Taking the first question and searching on "sunburn skin disorder" returned three results covering first, second and third-degree sunburn. For example, expressed in MOS:

'Sunburn of first degree (disorder)' EquivalentTo
  'Acute effect of ultraviolet radiation on normal skin (disorder)'
  and ('Role group (attribute)' some
    (('Associated morphology (attribute)' some 'First degree burn injury (morphologic abnormality)')
     and ('Causative agent (attribute)' some 'Ultraviolet radiation (physical force)')
     and ('Finding site (attribute)' some 'Skin structure (body structure)')))

Expressed in natural language: "severe sunburn is a first-degree burn injury of normal skin caused by ultraviolet radiation". In Protégé, selecting the concept 'Sunburn of first degree (disorder)' and utilising the features of Protégé-TS allows the keep/forget signatures to be generated and subsequently fed into FAME and LETHE. The features employed include Entity, DownSet, UpSet, GCA and Roles-1 through to Roles-8. The number of concepts and roles added to the signatures as a result of each operation is shown in Table 1. For instance, Entity-Keep without axiom inclusion will add just the concept to the keep signature, whereas with axioms included it will add the concept itself plus the 4 concepts and 4 roles that form the axiom definition. Inclusion of axiom processing increases run-time significantly, from 2 to 9 seconds. UpSet w/ Axioms generates a signature with 39 concepts and 7 roles, but the extensive OWL API computation involved leads to slightly excessive run-time.

Table 1. Test Results 'Sunburn of first degree (disorder)'

Action                          Run-Time (secs)  Concepts Added  Roles Added  Test Comment
Restart Protégé                 17                                            Pass
Load SNOMED-CT                  52                                            Pass
All-Forget                      5                350711          120          Pass
Search 'sunburn skin analysis'  1                                             Pass
Entity-Keep w/o Axioms          2                1               0            Pass
Entity-Keep w/ Axioms           9                5               4            Pass
DownSet w/ Axioms               21               5               4            Pass
UpSet w/ Axioms                 49               39              7            Partial Pass
Equivalent-Keep w/ Axioms       17               5               4            Pass
GCA-Keep w/ Axioms              2                1               0            Pass
Role-Keep 1                     7                5               4            Pass
Role-Keep 2                     7                13              4            Pass
Role-Keep 3                     7                24              4            Pass
Role-Keep 4                     6                31              4            Pass
Role-Keep 5                     6                33              4            Pass
Role-Keep 6                     8                34              4            Pass
Role-Keep 7                     6                34              4            Pass
Role-Keep 8                     6                34              4            Pass

Perhaps most closely related to Protégé-TS is SNOMED-CT's Expression Constraint Language (ECL) [1, 18]. A high-level feature comparison of Protégé-TS versus ECL shows that ECL supports UpSet and DownSet not including self, combined constraints, relationships and cardinality, whereas Protégé-TS supports processing of equivalent concept and role definitions, general class axiom definitions, axiom definitions associated with a given concept, and concept-role chains. Protégé-TS also supports incremental interactive signature creation, editing and load/save. Lastly, ECL is of course dedicated to SNOMED-CT, whereas, being available as a Protégé plugin, Protégé-TS can be applied to any OWL 2 based ontology and benefits from the various functionality available in Protégé.

6 Conclusion

Several term selection approaches have been identified and a tool has been implemented to support the core functions required by some of these approaches.
Different term selection operations are available in the Protégé-TS plugin, from the simple case of selecting individual terms, through the more complex case where the ontology structure (UpSet, DownSet, GCA, axiom definitions, etc.) is automatically computed to specify selected terms. Several even more complex approaches could form the basis for further work. Various quantitative and qualitative metrics were specified to test and evaluate the tool. The experimental results demonstrate the functionality, performance, scalability and general applicability of Protégé-TS, i.e., the test cases pass (with minor exceptions of excessive run-time) and the tool is determined to have met its stated requirements. Validation of the tool encompassed performing a System Usability Scale questionnaire, which provided evidence of a high level of usability and ideas for future developments.

References

1. Bhattacharyya, S.B.: Introduction to SNOMED-CT. Springer (2016)
2. Brooke, J.: SUS: A "quick and dirty" usability scale. In: Jordan, P., Thomas, B., Weerdmeester, B. (eds.) Usability Evaluation in Industry, pp. 189–194. Taylor & Francis (1996)
3. Chen, J., Alghamdi, G., Schmidt, R.A., Walther, D., Gao, Y.: Ontology extraction for large ontologies via modularity and forgetting. In: Kejriwal, M., Szekely, P.A., Troncy, R. (eds.) Proceedings of the 10th International Conference on Knowledge Capture (K-CAP'19), pp. 45–52. ACM (2019)
4. CSIRO: https://apg.ihtsdotools.org/. Accessed 4 November 2019
5. Cuenca Grau, B., Horrocks, I., Kazakov, Y., Sattler, U.: Modular reuse of ontologies: Theory and practice. Journal of Artificial Intelligence Research 31, 273–318 (2008)
6. d'Aquin, M.: Modularizing ontologies. In: Suárez-Figueroa, M.C., Gómez-Pérez, A., Motta, E., Gangemi, A. (eds.) Ontology Engineering in a Networked World, pp. 213–33. Springer (2012)
7. d'Aquin, M., Schlicht, A., Stuckenschmidt, H., Sabou, M.: Criteria and evaluation for ontology modularization techniques. In: Modular Ontologies: Concepts, Theories and Techniques for Knowledge Modularization, Lecture Notes in Computer Science, vol. 5445, pp. 67–89. Springer (2009)
8. Del Vescovo, C.: The modular structure of an ontology: Atomic decomposition towards applications. In: Proceedings of the 24th International Workshop on Description Logics (DL'11). CEUR Workshop Proceedings, vol. 745. CEUR-WS.org (2011)
9. Del Vescovo, C., Gessler, D., Klinov, P., Parsia, B., Sattler, U., Schneider, T., Winget, A.: Decomposition and modular structure of BioPortal ontologies. In: The Semantic Web – ISWC 2011 – 10th International Semantic Web Conference, Lecture Notes in Computer Science, vol. 7031, pp. 130–145. Springer (2011)
10. Eclipse: https://www.eclipse.org/downloads/packages/release/kepler/sr2/eclipse-ide-java-ee-developers. Downloaded 28 July 2019
11. ELK: https://protegewiki.stanford.edu/wiki/ELK. Version ELK 0.4.3. Accessed 3 October 2019
12. FMA: https://bioportal.bioontology.org/ontologies/FMA. Accessed 3 November 2019
13. GNU: https://www.gnu.org/licenses/gpl-3.0.en.html. Accessed 12 November 2019
14. GOC: The Gene Ontology Consortium: Gene Ontology annotations and resources. Nucleic Acids Research 41(D1), D530–D535 (2013)
15. Goncharova, Y., Sánchez Cárdenas, B.: Specialized corpora processing with automatic extraction tool. Procedia: Social and Behavioral Sciences 95, 293–297 (2013)
16. Hyland, I.: Protégé-TS (2019), https://github.com/ianhyland/Protege-TS.git. Accessed 23 May 2020
17. ICD: https://www.who.int/classifications/icd/en/. Accessed 4 November 2019
18. IHTSDO: Expression constraint language: Specification and guide, https://confluence.ihtsdotools.org/display/DOCECL/Expression+Constraint+Language+-+Specification+and+Guide. Accessed 3 November 2019
19. IHTSDO: SNOMED-CT, https://www.snomed.org/. Accessed 17 July 2019
20. Jacquemin, C.: Spotting and Discovering Terms through Natural Language Processing. MIT Press (2001)
21. Java: Java Development Toolkit, https://www.oracle.com/technetwork/java/javase/downloads/index.html. Version 10.0.1. Downloaded 12 July 2019
22. Koopmann, P.: Practical Uniform Interpolation for Expressive Description Logics. Ph.D. thesis, The University of Manchester, UK (2015)
23. Lopes, L., Vieira, R., José Finatto, M., Martins, D.: Extracting compound terms from domain corpora. Journal of the Brazilian Computer Society 16, 247–259 (2010)
24. López García, P., Boeker, M., Illarramendi, A., Schulz, S.: Usability-driven pruning of large ontologies: The case of SNOMED-CT. Journal of the American Medical Informatics Association 19(e1), e102–e109 (2012)
25. MEDLINE: https://www.nlm.nih.gov/bsd/pmresources.html. Accessed 3 November 2019
26. Nazar, R.: A statistical approach to term extraction. International Journal of Engineering and Science 11(2), 159–182 (2011)
27. NCIT: https://ncit.nci.nih.gov/ncitbrowser/. Accessed 3 November 2019
28. Nenadic, G., Spasic, I., Ananiadou, S.: Terminology-driven mining of biomedical literature. Bioinformatics 19(8), 939–943 (2003)
29. OWL-API: https://github.com/owlcs/owlapi. Version 5.1.11. Downloaded 12 July 2019
30. Protégé: http://protege.stanford.edu/. Accessed 6 July 2019
31. SNOW: https://mq.b2i.sg/snow-owl/#. Accessed 2 November 2019
32. SNQuery: https://snquery.veratech.es/. Accessed 2 November 2019
33. Uberon: https://bioportal.bioontology.org/ontologies/UBERON. Accessed 4 November 2019
34. Wennerberg, P., Buitelaar, P., Zillner, S.: Deriving clinical query patterns from medical corpora using domain ontologies. In: Workshop Biomedical Information Extraction 2009, pp. 50–56. Association for Computational Linguistics, USA (2009)
35. Wennerberg, P., Schulz, K., Buitelaar, P.: Ontology modularization to improve semantic medical image annotation. Journal of Biomedical Informatics 44, 155–162 (2011)
36. Zhao, Y.: Automated Semantic Forgetting for Expressive Description Logics. Ph.D. thesis, The University of Manchester, UK (2018)