<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Protege-TS: An OWL Ontology Term Selection Tool</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ian Hyland</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Renate A. Schmidt</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>The University of Manchester</institution>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper introduces Protege-Term Selection (Protege-TS), a software tool developed to support and partially automate the process of constructing OWL ontology signatures. Protege-TS works from an OWL 2 based source ontology to assist the user in creating a signature which is composed of a list of concepts and roles. Such signature lists are often needed when working with ontologies. For example, the signature can be fed to a forgetting or modularisation tool in order to compute an output ontology which retains all the relevant logical entailments of the source ontology. The Protege-TS tool implements a range of operations which allow the user to create and edit signatures of concept and role names in ontologies. This paper describes the functionality and architecture of the tool which has been implemented as a Java plugin to Protege, and an evaluation of the tool when applied to several large scale ontologies, with an example focus on the medical ontology SNOMED-CT.</p>
      </abstract>
      <kwd-group>
        <kwd>OWL Ontology</kwd>
        <kwd>Term Selection Approaches</kwd>
        <kwd>Term Selection Tool</kwd>
        <kwd>Protege Plugin</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>As a formal mechanism for knowledge representation, a wide range of
ontologies has been constructed covering several application domains. Many of these
ontologies have grown to such a size and complexity that their utility is
undermined, both from the human perspective (for example, in understanding,
maintaining and sharing knowledge) and in the ability of software to perform
formal reasoning tasks.</p>
      <p>
        An area of much recent research is forgetting, whereby a
large ontology is reduced in size and the retained portion of the ontology is
focused on supporting a given application domain use case. Several forgetting
algorithms and software tools (e.g., FAME [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] and LETHE [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]) have already
been developed, and can process a source ontology to generate a smaller output
ontology which still retains all the relevant logical entailments of the source.
These tools are fed with a signature of terms, i.e., concepts (OWL classes) and
roles (OWL object properties), which specify the parts of the source ontology
that are to be forgotten or kept.
      </p>
      <p>Copyright © 2020 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        The essence of this paper is the development of operations and strategies for
creating term lists that constitute the signature, and the implementation of a
software tool to support and partially automate generation of the signature for
input into a forgetting tool or other purposes. Building upon the industry
standard Protege OWL ontology editor [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] with reasoning tool support, the term
selection tool is implemented as a Protege Java plug-in. The usability,
performance, scalability and functionality of the tool are demonstrated, showing that it provides
valuable support for executing, storing and replaying term selection operations.
Specifically, the tool is tested and evaluated against several large-scale ontologies,
with a focus on SNOMED-CT [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>Term Selection</title>
      <p>Many different approaches to term selection are mooted in the literature.</p>
      <p>
        Modularisation [
        <xref ref-type="bibr" rid="ref5 ref8 ref9">5, 8, 9</xref>
        ] requires that the user specifies a seed signature as
input to compute a module of an ontology. All entities related in some
respect to the chosen entity will be included in the module. With information
removal, parts of the ontology are selected for removal, resulting in a module
without all the detail of the original ontology. Abstraction by breadth
hides unnecessary knowledge: some relational properties of entities
are removed to provide a simpler view, hence the breadth of the ontology is
reduced. Abstraction by depth also hides unnecessary knowledge: high-level
concepts from the source ontology are kept and lower-level
concepts are removed, hence the depth of the ontology is reduced.
      </p>
      <p>
        FASTR [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] describes a unification-based system that efficiently identifies
technical terms and demonstrates the complexity of the data that motivated
the fundamental design decisions. [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] presents statistical term extraction,
a statistical approach to terminology extraction which is general to all
languages, although it includes some language-specific parameters. The method is
used for the automatic identification of terminology; it is theoretically and
computationally simple and disregards resources such as linguistic or ontological
knowledge. [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] describes compound terms, an evaluation of compound term
extraction from a corpus in the domain of Paediatrics. Bigrams and trigrams
were automatically extracted from a corpus composed of 283 texts using three
different extraction methods.
      </p>
      <p>
        [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ] describes a terminology-driven framework that integrates several
components: automatic term recognition; term variation handling; acronym
acquisition; and automatic term discovery of similarities and clusters. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] presents
a natural language processing (NLP) based automatic extraction
protocol for specialised corpus analysis, using NLP tools to define semantic hierarchies
of verbs. By combining semantic and syntactic analysis of the results in the verb
macro structure, it illustrates the evolution of meaning from more general to
more specific verbs.
(Figure source: López-García et al., 2012.)
      </p>
      <p>
        Utilising frequency-based filtering, a study was conducted by [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] based
on SNOMED-CT, whereby filtering of terms in MEDLINE [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] was used to
reduce SNOMED-CT module sizes without discarding relevant concepts. Signature
subsets were first extracted using four graph-traversal heuristics and one
logic-based technique. These were subsequently filtered with frequency information
from MEDLINE. Fig. 1 illustrates the four heuristics, which require identification of
concept UpSets, DownSets and concept-role relationships.
      </p>
      <p>
        Based on graph-traversal and frequency data, [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] summarises that
graph-traversal strategies and frequency data drawn from an authoritative source
can prune large ontologies and produce modules that exhibit acceptable
coverage. However, it is also noted that evaluating the performance and optimality
of modules is extremely difficult, and [
        <xref ref-type="bibr" rid="ref5 ref7">5, 7</xref>
        ] concludes that “there is no universal
way to modularize an ontology”.
      </p>
      <p>
        With domain relevance concepts, [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] notes that for effective search and
management of large amounts of medical image and patient data, it is
relevant to know the kind of information that clinicians and radiologists seek. They
describe statistical clinical query pattern derivation, an approach to
obtaining this information semi-automatically which is based on predicting clinical
query patterns given medical ontologies, domain corpora and statistical
analysis. Their aim was to discover radiologists' and clinicians' information needs
by using semi-automatic text analysis methods that are independent of expert
interviews. Additionally, [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ] define a multi-perspective approach to term
selection, which is based on three core assumptions. The goal is to reduce module
size by using a more strictly selected and therefore more domain-specific set of
ontology concepts.
      </p>
      <p>
        The SNOMED-CT Expression Constraint Language (ECL) [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] is a
formal syntax that enables the definition of a subset of SNOMED-CT concepts
represented as an expression constraint. The expressions are computable rules
used to define bounded sets of concepts gathered using constraint operations
such as descendantOf, ancestorOf or memberOf and set operations such as AND,
OR and MINUS. The constraints can be used to restrict the selected concepts for
a given concept, for example, as a machine-processable query, or a range of an
attribute defined in the concept model. Two equivalent syntaxes are defined: a
brief version used for machine-to-machine communication, and a long form which
aids human readability. Several ECL browsers are publicly available including [
        <xref ref-type="bibr" rid="ref31 ref32 ref4">4, 31, 32</xref>
        ], plus language parser tools such as SNOMED-CT Parser and ECL Parser.
      </p>
      <p>
        Signature adjustment is used in an iterative combined modularisation
and forgetting approach to ontology extraction [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This involves extension
and, if required, partitioning of the given signature and evaluation by a domain
expert. Based on a framework for modularization, [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] suggests an
eight-stage framework to perform modularization of an ontology. The framework seems
equally applicable to forgetting-based ontology processing.
      </p>
      <p>
        The forgetting tools FAME [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] and LETHE [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] take as their input either
a forgetting signature, i.e., a list of terms (concepts and roles) to be forgotten, or
a keep signature, i.e., a list of terms to be kept. Logically, for a given ontology the
union of the forgetting and keep signatures must comprise all concepts and roles
in the ontology. Both tools expect the user to provide the list of forget terms or
keep terms. In the very basic GUI versions of FAME and LETHE the user can
compose the relevant signature by selecting individual concepts/roles, or groups
of concepts/roles (using the shift and control keys), in two windows listing
all concept and role symbols occurring in the loaded ontology.
      </p>
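      <p>The relationship between the two signatures can be sketched as a set complement. The following is a minimal illustration only, assuming that terms are represented as plain strings; the class name, method name and term names are hypothetical and are not taken from FAME, LETHE or Protege-TS.</p>
      <preformat><![CDATA[
```java
import java.util.Set;
import java.util.TreeSet;

public class SignatureComplement {

    /** Returns every ontology term that is not in the given signature. */
    static Set<String> complement(Set<String> allTerms, Set<String> signature) {
        Set<String> rest = new TreeSet<>(allTerms);
        rest.removeAll(signature);
        return rest;
    }

    public static void main(String[] args) {
        // Hypothetical term names, used for illustration only.
        Set<String> allTerms = Set.of("T", "U", "V", "r1", "r2");
        Set<String> forget = Set.of("V", "r2");
        // The keep signature is whatever remains once the forget terms are removed.
        System.out.println("keep = " + complement(allTerms, forget));
    }
}
```
]]></preformat>
      <p>Under this assumption, supplying either signature is sufficient, since the other one can always be derived from the full list of concepts and roles.</p>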
      <p>
        This approach is entirely usable for very small ontologies, where a single-shot
selection of the forgetting signature is adequate. However, large ontologies like
SNOMED-CT contain hundreds of thousands of terms, and scrolling
through these to select individual terms is impractical. As an essentially single-shot
approach there is no support for an interactive and incremental approach to
signature construction. The formula of the ontology is displayed, but the
ontology structure (concept and role hierarchies), the axiom definitions
associated with a given concept, any role domain and range restrictions, general
class axioms, and annotations are not accessible. Each term is either selected,
or it is not. There is no capability to process terms more intelligently based
on their place in the hierarchy, e.g., UpSet, DownSet and Equivalence, or on other
properties like General Class Axioms (GCAs) and role relationships between
concepts. There is no search capability, e.g., to search for a given concept, role
or annotation. There is no access to external reasoning capability, using tools
like ELK [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] for classification, or the ability to apply queries along the lines of
DL Query in Protege. There is no capability to save and load a forgetting or
keep signature, i.e., for archiving, sharing and off-line analysis. There is no help
assistance, i.e., display of a help screen or pop-up help text. Finally, the GUIs
only support FAME or LETHE, and not, for instance, other ontology extraction
tools like the OWL API Module Extractor.
      </p>
      <p>Because there are so many applications of term selection, and Protege is a
widely used ontology editor, we have developed a term selection plugin to give
Protege users the capability to conveniently and flexibly create term lists for
use as input to forgetting or other tools, or for other purposes. The functional
requirements of our Protege-TS plugin were driven by the issues identified with
the term selection approaches above.</p>
    </sec>
    <sec id="sec-3">
      <title>Term Selection Tool Requirements</title>
      <p>At the highest level the Protege-TS tool supports the concurrent, interactive
and incremental development of two lists of concepts and roles, termed the keep
and forget signatures. High-level user interaction is illustrated in Fig. 2. The
tool introduces a new Term Selection tab into the Protege main menu. Upon
clicking on any of the tab submenus a welcome screen is displayed, including
pointers to help text and licence conditions. The welcome screen is displayed
only once. A Help Screen is included to display more detailed instructions.
Hovering the mouse pointer over a tab will show pop-up help text.</p>
      <p>Various error messages are defined, for example, covering situations which
require an ontology to have already been loaded (“Error: First Load an Ontology”)
or an operation requiring that a concept or role is first selected (“Error: Concept
or Role Not Selected”). The Metrics tab shows basic metrics of the ontology and
signatures, e.g., total numbers of concepts, roles and axioms, and the concepts and
roles that are assigned to the keep/forget signatures. Note that both keep and
forget signatures are initially empty. The Display tab shows the current contents
of the keep/forget signatures. A message will be displayed if the signature is too
large to display on screen. The Results tab enables the display of the results
of each operation performed to be toggled on/off.</p>
      <p>The All–Forget/Keep tab will place all concepts and roles in the ontology
into either the forget or the keep signature. When the command completes, a
summary of the ontology metrics and signatures is displayed. The main action
of All is to update the concept and role annotation section. For example, after running
All–Forget (with reference to Fig. 3) the annotation for concept A1 is updated to
include ]-Term-Selection-Concept-Forget and the annotation for role r111 is updated
to include ]-Term-Selection-Role-Forget, indicating that they are both in
the forget signature. The All–Clear tab will empty both the keep and
forget signatures, i.e., it will delete the ]-Term-Selection-xxx annotations.</p>
      <p>The Entity tab adds a single entity (concept or role) to the keep/forget
signatures. With this and the following actions, the individual concept/role annotation
is updated with the appropriate text, e.g., ]-Term-Selection-Concept-Keep.
The Equivalent tab adds all equivalent concepts or roles to the keep/forget
signatures. The DownSet tab adds the down set (i.e., all subconcepts or
subroles) to the keep/forget signatures, and the UpSet tab adds the up set (i.e.,
all superconcepts or superroles) to the keep/forget signatures. The General
Class Axioms (GCA) tab adds to the forget/keep signatures all concepts and
roles appearing in the GCA axiom definition. Note that the processing of these four
operations utilises the reasoning capabilities as detailed below.</p>
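      <p>The UpSet and DownSet operations amount to a transitive traversal of the hierarchy. The following sketch illustrates the idea under simplifying assumptions: the hierarchy is given as a plain map from each term to its direct superterms, whereas Protege-TS itself relies on the OWL API and a reasoner. The class, its methods and the example hierarchy are hypothetical, and the selected term itself is not added to the result in this sketch.</p>
      <preformat><![CDATA[
```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class HierarchySets {

    /** Collects every term reachable from start by repeatedly following edges. */
    static Set<String> closure(String start, Map<String, List<String>> edges) {
        Set<String> seen = new TreeSet<>();
        Deque<String> todo = new ArrayDeque<>(edges.getOrDefault(start, List.of()));
        while (!todo.isEmpty()) {
            String term = todo.pop();
            if (seen.add(term)) {
                todo.addAll(edges.getOrDefault(term, List.of()));
            }
        }
        return seen;
    }

    /** Turns a child-to-parents map into a parent-to-children map. */
    static Map<String, List<String>> invert(Map<String, List<String>> parents) {
        Map<String, List<String>> children = new HashMap<>();
        parents.forEach((child, supers) -> supers.forEach(
                p -> children.computeIfAbsent(p, k -> new ArrayList<>()).add(child)));
        return children;
    }

    /** All direct and indirect superterms of the given term. */
    static Set<String> upSet(String term, Map<String, List<String>> parents) {
        return closure(term, parents);
    }

    /** All direct and indirect subterms of the given term. */
    static Set<String> downSet(String term, Map<String, List<String>> parents) {
        return closure(term, invert(parents));
    }

    public static void main(String[] args) {
        // Hypothetical hierarchy: C and D are subconcepts of B, and B is a subconcept of A.
        Map<String, List<String>> parents = Map.of(
                "B", List.of("A"), "C", List.of("B"), "D", List.of("B"));
        System.out.println("UpSet(C)   = " + upSet("C", parents));
        System.out.println("DownSet(A) = " + downSet("A", parents));
    }
}
```
]]></preformat>
      <p>The same traversal applies to role hierarchies if the map holds direct superroles instead of direct superconcepts.</p>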
      <p>The Roles tab is applied to process Concept-Role Chains. Consider the
example test ontology expressed in Manchester OWL Syntax (MOS):</p>
      <sec id="sec-3-1">
        <title>T SubClassOf r1 some U; U SubClassOf r2 some V; U EquivalentTo EQ1</title>
      </sec>
      <sec id="sec-3-2">
        <title>V SubClassOf r3 some W; W SubClassOf r4 some X; W EquivalentTo EQ2</title>
        <p>[Figure annotations: the new Term Selection tab introduced into the Protégé main menu. The Entity tab is used to select both individual concepts and roles: selecting the role “r111” or the concept “A1” and then applying Entity - Forget updates the annotation of that entity to show that it is now part of the forget signature.]</p>
        <p>The example ontology above is shown diagrammatically in Fig. 4 (concepts in circles, roles as arrows).
The user is prompted to enter the number of hops, which is an integer in the
range 1 to 8. The hop number controls how many of the concepts/roles are
included in the processing to update the keep/forget signatures. For example,
selecting concept T and performing operation Roles–1 includes concepts T, U,
and role r1 in the signature, whereas performing Roles–4 includes the concepts
T, U, V, W, X, equivalent concepts EQ1, EQ2 and roles r1, r2, r3, r4. Thus, concepts
and roles encountered horizontally in four hops are included in the list.</p>
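        <p>The hop-limited traversal can be sketched as a breadth-first walk over the concept-role chain. The code below is an illustration only and not the Protege-TS implementation: it hard-codes the example ontology, and it assumes that the equivalents of a concept are added when that concept is expanded, which is inferred from the Roles–1 and Roles–4 results quoted above. All class, field and method names are hypothetical.</p>
        <preformat><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class RoleChainHops {

    /** One axiom of the form "C SubClassOf role some filler", stored under its subject C. */
    static final class Edge {
        final String role;
        final String filler;
        Edge(String role, String filler) {
            this.role = role;
            this.filler = filler;
        }
    }

    // The example test ontology from the text.
    static final Map<String, List<Edge>> EDGES = Map.of(
            "T", List.of(new Edge("r1", "U")),
            "U", List.of(new Edge("r2", "V")),
            "V", List.of(new Edge("r3", "W")),
            "W", List.of(new Edge("r4", "X")));
    static final Map<String, List<String>> EQUIVALENTS = Map.of(
            "U", List.of("EQ1"),
            "W", List.of("EQ2"));

    /** Collects the concepts and roles met within the given number of hops from start. */
    static Set<String> select(String start, int hops) {
        Set<String> signature = new TreeSet<>();
        signature.add(start);
        List<String> frontier = List.of(start);
        for (int hop = 0; hop < hops && !frontier.isEmpty(); hop++) {
            List<String> next = new ArrayList<>();
            for (String concept : frontier) {
                // Assumption: equivalents are added when a concept is expanded.
                signature.addAll(EQUIVALENTS.getOrDefault(concept, List.of()));
                for (Edge e : EDGES.getOrDefault(concept, List.of())) {
                    signature.add(e.role);
                    if (signature.add(e.filler)) {
                        next.add(e.filler);
                    }
                }
            }
            frontier = next;
        }
        return signature;
    }

    public static void main(String[] args) {
        System.out.println("Roles-1 from T: " + select("T", 1));
        System.out.println("Roles-4 from T: " + select("T", 4));
    }
}
```
]]></preformat>
        <p>With these assumptions, one hop from T yields T, U and r1, and four hops yield the eleven terms listed above.</p>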
        <p>
          Any given concept can be either fully defined or primitive. A fully defined
concept is complete, i.e., it contains relationships that represent the full set of
necessary and sufficient conditions [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. In contrast, a primitive concept is
incomplete, i.e., the set of conditions is insufficient to fully define the concept and
it therefore does not have any specified equivalent classes. By default, when
processing the Roles function, both defined and primitive concepts are included in the
signature. The Equivalent concepts–Primitive only tab turns off inclusion
of defined concepts, i.e., it will exclude defined equivalent concepts and include
only primitive concepts. Executing the Roles–4 operation now excludes the
equivalent classes EQ1, EQ2 from the signatures.
        </p>
        <p>When processing a concept, by default Protege-TS will include all of the
axiom definitions associated with the concept. For example, if a concept definition
contains the axiom r1 some Z then both the Z concept and r1 role are added to
the signature. Protege-TS includes the capability to toggle the inclusion
of axiom processing on/off.</p>
        <p>Any new concept or role added to an ontology will not automatically be part
of the keep/forget signatures. The Undefined tab is used to place these undefined
concepts and roles into either the keep or the forget signature. The Delete tab
will delete from the ontology all concepts and roles that have been assigned to
one of the keep or forget signatures, together with their associated axiom
definitions. The delete capability was intended to be utilised in experiments
with the OWL API Modularisation tool.</p>
        <p>The Signature Save/Load tabs allow the keep and forget signatures to be
saved to disk files or loaded from disk files. Note that only the filename needs to be
entered; the .txt extension is added automatically, and error messages are
displayed as appropriate. Protege-TS automatically keeps an internal log of the
commands as they are executed, with the ability to save the log (Command–Save) and
to load and replay it as commands (Command–Load). Operations that are
recorded by the commands feature are All, Delete, Undefined, Entity, DownSet,
UpSet, Equivalent, GCA, Roles, Results, Axioms and Equivalent
Concepts. The command log can also be cleared, using Command–Clear.</p>
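        <p>Saving and loading a signature can be sketched as follows. The on-disk format used by Protege-TS is not described here, so the sketch assumes a hypothetical format with one term name per line; only the automatic addition of the .txt extension is taken from the description above. The class and method names are illustrative.</p>
        <preformat><![CDATA[
```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;

public class SignatureFiles {

    /** Appends the .txt extension so that the user only supplies the bare filename. */
    static Path withTxt(Path dir, String name) {
        return dir.resolve(name + ".txt");
    }

    /** Writes the signature to disk, one term per line (assumed format), in sorted order. */
    static Path save(Path dir, String name, Set<String> signature) throws IOException {
        return Files.write(withTxt(dir, name), new TreeSet<>(signature));
    }

    /** Reads a signature back from a file written by save. */
    static Set<String> load(Path dir, String name) throws IOException {
        return new TreeSet<>(Files.readAllLines(withTxt(dir, name)));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("signatures");
        // Hypothetical term names, used for illustration only.
        Path file = save(dir, "keep", Set.of("T", "U", "r1"));
        System.out.println("saved to " + file.getFileName());
        System.out.println("loaded   " + load(dir, "keep"));
    }
}
```
]]></preformat>
        <p>A plain-text format of this kind would make signatures easy to archive, share and inspect outside the tool.</p>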
        <p>Considering user deployment constraints, Protege-TS is intended to be
utilised by research users without recourse to specialist hardware. The supported
user run-time environment for Protege-TS was therefore targeted to be a
mid-range Microsoft Windows (version 10) laptop, or better. In performance and
capacity terms, mid-range is defined as at least an Intel i5 CPU with 4 cores,
8 GB memory and 132 GB disk.</p>
        <p>From the perspective of performance and scalability, the requirements
were intended to support incremental and interactive development of the forget
and keep signatures. To maintain a responsive user interface, most of the functions
detailed above should typically execute in a few seconds. The most “heavyweight”
functions, such as deleting large parts of the ontology, i.e., deleting concepts, roles
and axioms identified by the keep or forget signature, were allowed to execute
in less than a minute. Protege-TS should support the larger available ontologies
such as SNOMED-CT that can contain hundreds of thousands of concepts and
axioms. Ontology expressivity should include support for all of OWL DL.</p>
        <p>
          Concerning licencing requirements, Protege-TS is available under the
GNU General Public Licence version 3 or any later version. The welcome screen
contains the standard text recommended in GNU [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. Copies of the relevant
Protege-TS source files, e.g., Java code, build and configuration files, Test Cases,
User Guide etc. are publicly available on GitHub [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ].
        </p>
        <p>[Architecture diagram labels: Eclipse IDE with Term Selection Plugin; Maven pom.xml; plugin.xml; Java Action Event Classes; Java Worker Classes; OWL API; Source Ontology; Command Log; Signature Files; Results Ontology; Forgetting Tools, e.g. FAME or LETHE. Deployment note: copy file ‘protege.plugin.examples2.0.0-SNAPSHOT’ into the Protégé plugin directory ‘/plugins’.]</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Term Selection Tool Software Architecture</title>
      <p>The software architecture of Protege-TS is illustrated in Fig. 5 and has the
following major components.</p>
      <p>
        The Eclipse Integrated Development Environment [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is used for
editing of the Java [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] source and .xml files. The Maven build system specifies
the build dependencies as detailed in the pom.xml file. The file specifies
dependencies, i.e., groupId, artifactId and version number, on the Protege editor
and utilises Version 5.1.11 of the OWL API [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ]. Additionally, it includes build
instructions relating to the maven-compiler-plugin and maven-eclipse-plugin. A
successful build generates the Protege Java .jar plugin file. The plugin.xml file
specifies the layout of the new Protege-TS tab, and the one-to-one
association between each individual tab and a single Java Action Event Class.
      </p>
      <p>The Java Action Event Classes are the code that is invoked when the
mouse clicks on a given tab; these classes control all GUI actions. Each class
has a one-to-one relationship with the entries in the plugin.xml file. The Java
Action Event Classes utilise the Java Swing library to manage GUI actions, for
example, displaying messages, number input and load/save filename. There are
34 of these classes in Protege-TS. The Java Worker Classes provide support
functions to the Java Action Event Classes, for example, updating concept and
role annotations, traversing the ontology to process DownSet, UpSet, Equivalent
concepts and GCA, construction of message content for display, save/load of
signature files and calculating ontology metrics. There are 22 of these classes in
Protege-TS.</p>
      <p>Fig. 6. Test Ontology OWL Expression Types: Concept Hierarchy; Role Hierarchy; Equivalent Concept; Equivalent Role; Concept SubClassOf; Concept – Role Chain; Complex Roles; General Class Axiom; SNOMED-CT Like Expressions.</p>
      <p>
        The capabilities of the OWL API are used extensively by both the Java
Action Event Classes and the Java Worker Classes. This includes the inbuilt
structural reasoner, supplemented by the ELK [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] reasoner, which is optimised
for processing ontologies with EL-level expressivity.
      </p>
      <p>Following execution of the Maven build, the plugin .jar file is generated and
copied into the Protege plugins directory. When Protege is restarted, the plugin
is loaded and the new Term Selection tab appears in the main menu.
Protege-TS can then be utilised with the source ontology, command log and signature
files to generate revised signature files and a results ontology containing the
inserted term selection annotations, ready to be processed by forgetting tools
such as FAME and LETHE.</p>
    </sec>
    <sec id="sec-5">
      <title>Evaluation of the Term Selection Tool</title>
      <p>
        Functionality tests are performed against a test ontology containing a range
of OWL expression types constructed to exercise the full set of functionalities
described in Section 3. The test plan and results are available on GitHub [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
Additional OWL expressions were defined that are based on the expression forms
found in SNOMED-CT, see Fig. 6.
      </p>
      <p>
        Detailed performance and scalability tests were executed against the
SNOMED-CT ontology and are also available on GitHub [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. All tests passed, with
one exception: when deleting large parts of the ontology, the test exceeded the
target run-time limit. To test the general applicability of Protege-TS,
a small number of tests were executed against other sample ontologies, which
encompass different expressivity and scale. Examples included FMA [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], GO [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ],
Uberon [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ], ICD [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] and NCIt [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ].
      </p>
      <p>
        Usability testing was performed using the System Usability Scale (SUS),
a simple 10-item questionnaire designed by [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] in the mid-1980s to
assess a user's overall satisfaction with a product. Three academic staff and
research users, who are all highly experienced in OWL, Protege and the
SNOMED-CT ontology, performed the test, and the 88% SUS score achieved indicates a
high level of perceived usability.
      </p>
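      <p>For reference, the standard SUS scoring rule can be sketched as follows: each of the ten items is answered on a scale from 1 to 5, odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5 to give a score between 0 and 100. The class name and the example responses below are illustrative only and are not the responses collected in this evaluation.</p>
      <preformat><![CDATA[
```java
public class SusScore {

    /** Computes the SUS score (0 to 100) from ten responses, each on a 1 to 5 scale. */
    static double score(int[] responses) {
        if (responses.length != 10) {
            throw new IllegalArgumentException("SUS has exactly 10 items");
        }
        int sum = 0;
        for (int i = 0; i < responses.length; i++) {
            int r = responses[i];
            if (r < 1 || r > 5) {
                throw new IllegalArgumentException("responses must be between 1 and 5");
            }
            // Items 1, 3, 5, 7, 9 (even index) are positively worded: contribute r - 1.
            // Items 2, 4, 6, 8, 10 (odd index) are negatively worded: contribute 5 - r.
            sum += (i % 2 == 0) ? r - 1 : 5 - r;
        }
        return sum * 2.5;
    }

    public static void main(String[] args) {
        // Hypothetical responses from a single participant.
        int[] responses = {5, 2, 4, 1, 5, 2, 4, 1, 5, 2};
        System.out.println("SUS score = " + score(responses));
    }
}
```
]]></preformat>
      <p>The overall score for a study is usually reported as the mean of the per-participant scores.</p>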
      <p>Several use case questions were defined to exercise the available features of
Protege-TS and evaluate its real-world utility. Example questions were to find
“cause for severe sunburn damaged skin”, “available blood pressure
measurement techniques”, and “list of bones in the hand”. Taking the first question and
searching on “sunburn skin disorder” returned three results covering first, second
and third-degree sunburn. For example, expressed in MOS:
<preformat>'Sunburn of first degree (disorder)' EquivalentTo
'Acute effect of ultraviolet radiation on normal skin (disorder)'
and ('Role group (attribute)' some (('Associated morphology (attribute)' some
'First degree burn injury (morphologic abnormality)')
and ('Causative agent (attribute)' some 'Ultraviolet radiation (physical force)')
and ('Finding site (attribute)' some 'Skin structure (body structure)')))</preformat>
Expressed in natural language: “severe sunburn is a first-degree burn injury of
normal skin caused by ultraviolet radiation”. After selecting the concept
'Sunburn of first degree (disorder)' in Protege, the features of
Protege-TS allow the keep/forget signatures to be generated and subsequently fed into
FAME and LETHE. The features employed include Entity, DownSet, UpSet,
GCA and Roles-1 through to Roles-8. The number of concepts and roles
added to the signatures as a result of each operation is shown in Table 1. For
instance, Entity - Keep without axiom inclusion adds just the concept to
the keep signature, whereas with axioms included it adds the concept itself
plus the 4 concepts and 4 roles that form the axiom definition. Inclusion of
axiom processing increases run-time significantly, from 2 to 9 seconds. UpSet
with axioms generates a signature with 39 concepts and 7 roles, but the extensive
OWL API computation involved leads to slightly excessive run-time.</p>
      <p>
        Perhaps most closely related to Protege-TS is SNOMED-CT's Expression
Constraint Language (ECL) [
        <xref ref-type="bibr" rid="ref1 ref18">1,18</xref>
        ]. A high-level feature comparison of
Protege-TS versus ECL shows that ECL supports UpSet and DownSet not including
self, combined constraints, relationships and cardinality, whereas Protege-TS
supports processing of equivalent concept and role definitions, general class
axiom definitions, axiom definitions associated with a given concept, and
concept-role chains. Protege-TS also supports incremental interactive signature creation,
editing, and load/save. Lastly, ECL is dedicated to SNOMED-CT, whereas
Protege-TS, being available as a Protege plugin, can be applied to any OWL 2
based ontology and benefits from the various functionality available in Protege.
      </p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>Several term selection approaches have been identified and a tool has been
implemented to support the core functions required by some of these approaches.
Different term selection operations are available in the Protege-TS plugin, from
the simple case of selecting individual terms, through to the more complex case
where the ontology structure (UpSet, DownSet, GCA and axiom definitions etc.)
is automatically computed to specify selected terms. Several even more
complex approaches could form the basis for further work. Various quantitative and
qualitative metrics were specified to test and evaluate the tool. The
experimental results demonstrate the functionality, performance, scalability and general
applicability of Protege-TS, i.e., the test cases pass (with minor exceptions of
excessive run-time) and the tool is determined to have met its stated
requirements. Validation of the tool encompassed performing a System Usability Scale
Questionnaire, which provided evidence of a high level of usability and ideas for
future developments.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bhattacharyya</surname>
            ,
            <given-names>S.B.</given-names>
          </string-name>
          :
          <article-title>Introduction to SNOMED-CT</article-title>
          . Springer (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Brooke</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>SUS: A “quick and dirty” usability scale</article-title>
          . In: Jordan,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Thomas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Weerdmeester</surname>
          </string-name>
          ,
          <string-name>
            <surname>B</surname>
          </string-name>
          . (eds.) Usability Evaluation in Industry, pp.
          <volume>189</volume>
          –
          <fpage>194</fpage>
          .
          <string-name>
            <surname>Taylor</surname>
          </string-name>
          &amp;
          <string-name>
            <surname>Francis</surname>
          </string-name>
          (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alghamdi</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmidt</surname>
            ,
            <given-names>R.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walther</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Ontology extraction for large ontologies via modularity and forgetting</article-title>
          . In: Kejriwal,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Szekely</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.A.</given-names>
            ,
            <surname>Troncy</surname>
          </string-name>
          ,
          <string-name>
            <surname>R</surname>
          </string-name>
          . (eds.)
          <source>Proceedings of the 10th International Conference on Knowledge Capture (K-CAP'19)</source>
          . pp.
          <volume>45</volume>
          –
          <fpage>52</fpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4. CSIRO: https://apg.ihtsdotools.org/.
          <source>Accessed 4 November</source>
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Cuenca</given-names>
            <surname>Grau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Horrocks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            ,
            <surname>Kazakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Sattler</surname>
          </string-name>
          ,
          <string-name>
            <surname>U.</surname>
          </string-name>
          :
          <article-title>Modular reuse of ontologies: Theory and practice</article-title>
          .
          <source>Journal of Artificial Intelligence Research</source>
          <volume>31</volume>
          ,
          <volume>273</volume>
          –
          <fpage>318</fpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>d'Aquin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Modularizing ontologies</article-title>
          . In: Suárez-Figueroa,
          <string-name>
            <given-names>M.C.</given-names>
            ,
            <surname>Gómez-Pérez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Motta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>Gangemi</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . (eds.) Ontology Engineering in a Networked World, pp.
          <volume>213</volume>
          –
          <fpage>33</fpage>
          . Springer (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>d'Aquin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schlicht</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stuckenschmidt</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sabou</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Criteria and evaluation for ontology modularization techniques</article-title>
          .
          <source>In: Modular Ontologies: Concepts</source>
          ,
          <source>Theories and Techniques for Knowledge Modularization, Lecture Notes in Computer Science</source>
          , vol.
          <volume>5445</volume>
          , pp.
          <volume>67</volume>
          –
          <fpage>89</fpage>
          . Springer (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Del</given-names>
            <surname>Vescovo</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.</surname>
          </string-name>
          :
          <article-title>The modular structure of an ontology: Atomic decomposition towards applications</article-title>
          .
          <source>In: Proceedings of the 24th International Workshop on Description Logics (DL'11)</source>
          .
          <source>CEUR Workshop Proceedings</source>
          , vol.
          <volume>745</volume>
          .
          <string-name>
            <surname>CEUR-WS.org</surname>
          </string-name>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Del</given-names>
            <surname>Vescovo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Gessler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Klinov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Parsia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Sattler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            ,
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Winget</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          :
          <article-title>Decomposition and modular structure of bioportal ontologies</article-title>
          .
          <source>In: The Semantic Web - ISWC 2011 - 10th International Semantic Web Conference, Lecture Notes in Computer Science</source>
          , vol.
          <volume>7031</volume>
          , pp.
          <volume>130</volume>
          –
          <fpage>145</fpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. Eclipse: https://www.eclipse.org/downloads/packages/release/kepler/sr2/eclipse-ide-java-ee-developers.
          <source>Downloaded 28 July 2019</source>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11. ELK: https://protegewiki.stanford.edu/wiki/ELK. Version ELK 0.4.3.
          <source>Accessed 3 October 2019</source>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12. FMA: https://bioportal.bioontology.org/ontologies/FMA. Accessed 3 November
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13. GNU: https://www.gnu.org/licenses/gpl-3.0.en.html.
          <source>Accessed 12 November 2019</source>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14. GOC:
          <article-title>The Gene Ontology Consortium. Gene Ontology Annotations and Resources</article-title>
          .
          <source>Nucleic Acids Research</source>
          ,
          <volume>41</volume>
          (
          <issue>D1</issue>
          ):
          <source>D530–D535</source>
          ,
          <year>2013</year>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Goncharova</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanchez Cardenas</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Specialized corpora processing with automatic extraction tool</article-title>
          .
          <source>Procedia: Social and Behavioral Sciences</source>
          <volume>95</volume>
          ,
          <volume>293</volume>
          –
          <fpage>297</fpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Hyland</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <string-name>
            <surname>Protege-TS</surname>
          </string-name>
          (
          <year>2019</year>
          ), https://github.com/ianhyland/Protege-TS.git. Accessed 23 May
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17. ICD: https://www.who.int/classifications/icd/en/.
          <source>Accessed 4 November 2019</source>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18. IHTSDO:
          <article-title>Expression constraint language: Specification and guide</article-title>
          , https://confluence.ihtsdotools.org/display/DOCECL/Expression+Constraint+Language+-+Specification+and+Guide.
          <source>Accessed 3 November 2019</source>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19. IHTSDO:
          <string-name>
            <surname>SNOMED-CT</surname>
          </string-name>
          , https://www.snomed.org/.
          <source>Accessed 17 July 2019</source>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Jacquemin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Spotting and Discovering Terms through Natural Language Processing</article-title>
          . MIT Press (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21. Java: Java Development Toolkit, https://www.oracle.com/technetwork/java/javase/downloads/index.html,
          <source>Version-10.0.1. Downloaded 12 July 2019</source>
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Koopmann</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Practical Uniform Interpolation for Expressive Description Logics</article-title>
          .
          <source>Ph.D. thesis</source>
          , The University of Manchester, UK (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Lopes</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vieira</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , Jose Finatto,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Martins</surname>
          </string-name>
          ,
          <string-name>
            <surname>D.</surname>
          </string-name>
          :
          <article-title>Extracting compound terms from domain corpora</article-title>
          .
          <source>Journal of the Brazilian Computer Society</source>
          <volume>16</volume>
          , 247–
          <fpage>259</fpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24. López García, P.,
          <string-name>
            <surname>Boeker</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Illarramendi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schulz</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Usability-driven pruning of large ontologies: The case of SNOMED-CT</article-title>
          .
          <source>J. Am. Med. Inform. Assoc.</source>
          <volume>19</volume>
          (
          <issue>e1</issue>
          ),
          <year>e102</year>
          –
          <fpage>e109</fpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25. MEDLINE: https://www.nlm.nih.gov/bsd/pmresources.html.
          <source>Accessed 3 November 2019</source>
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Nazar</surname>
          </string-name>
          , R.:
          <article-title>A statistical approach to term extraction</article-title>
          .
          <source>International Journal of Engineering and Science</source>
          <volume>11</volume>
          (
          <issue>2</issue>
          ),
          <volume>159</volume>
          –
          <fpage>182</fpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27. NCIT: https://ncit.nci.nih.gov/ncitbrowser/.
          <source>Accessed 3 November 2019</source>
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Nenadic</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spasic</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ananiadou</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Terminology-driven mining of biomedical literature</article-title>
          .
          <source>Bioinformatics</source>
          <volume>19</volume>
          (
          <issue>8</issue>
          ),
          <volume>939</volume>
          –
          <fpage>943</fpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29. OWL-API: https://github.com/owlcs/owlapi.
          <source>Version 5.1.11. Downloaded 12 July 2019</source>
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30. Protege: http://protege.stanford.edu/.
          <source>Accessed 6 July 2019</source>
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31. SNOW: https://mq.b2i.sg/snow-owl/#. Accessed 2 November
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32. SNQuery: https://snquery.veratech.es/.
          <source>Accessed 2 November</source>
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>33. Uberon: https://bioportal.bioontology.org/ontologies/UBERON. Accessed 4 November 2019</mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          34.
          <string-name>
            <surname>Wennerberg</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buitelaar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zillner</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Deriving clinical query patterns from medical corpora using domain ontologies</article-title>
          .
          <source>In: Workshop Biomedical Information Extraction</source>
          <year>2009</year>
          . pp.
          <volume>50</volume>
          –
          <fpage>56</fpage>
          .
          <article-title>Association for Computational Linguistics</article-title>
          , USA (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          35.
          <string-name>
            <surname>Wennerberg</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schulz</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buitelaar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Ontology modularization to improve semantic medical image annotation</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          <volume>44</volume>
          ,
          <issue>155</issue>
          –
          <fpage>162</fpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          36.
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <source>Automated Semantic Forgetting for Expressive Description Logics. Ph.D. thesis</source>
          , The University of Manchester, UK (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>