=Paper= {{Paper |id=Vol-2029/paper11 |storemode=property |title=Web-based Ontology Alignment with the GeneTegra Alignment Tool |pdfUrl=https://ceur-ws.org/Vol-2029/paper11.pdf |volume=Vol-2029 |authors=Nemanja Stojanovic,Ray M. Bradley,Sean Wilkinson,Mansur Kabuka,E. Patrick Shironoshita |dblpUrl=https://dblp.org/rec/conf/simbig/StojanovicBWKS17 }} ==Web-based Ontology Alignment with the GeneTegra Alignment Tool== https://ceur-ws.org/Vol-2029/paper11.pdf
    Web-based Ontology Alignment with the GeneTegra Alignment Tool

           Nemanja Stojanovic, Ray M. Bradley, Sean Wilkinson, Mansur Kabuka,
                                and E. Patrick Shironoshita
                                   INFOTECH Soft, Inc.
                              1201 Brickell Avenue, Suite 220
                                Miami, Florida 33131, USA
                 [nemanja,rbradley,sean,kabuka,patrick]@infotechsoft.com

Abstract

Ontologies are increasingly gaining practical usage for semantic data in various
ways and across multiple domains. From this growing applicability arises an
ever-greater need to manage large datasets, reduce analytical complexity, and
integrate heterogeneous ontologies efficiently and accurately into or within
existing systems, all while minimizing data corruption and maintaining existing
semantics. In this paper, we present the GeneTegra Alignment Tool (GT-Align), a
practical implementation of the ASMOV ontology alignment algorithm within a
Web-based interface, focusing on biomedical data and using the Unified Medical
Language System (UMLS) as background knowledge. GT-Align allows iterative
alignment of multiple ontologies as well as active user involvement throughout
the process.

1    Introduction

Ontologies have been increasingly acknowledged as appropriate abstraction
instruments for representing entities and their relationships within various
domains (Euzenat and Shvaiko, 2007). Due to this abstract expressiveness,
ontologies have proven applicable across a wide and extensible range of uses,
allowing a greater variety of systems to incorporate them in their modeling
(Kalfoglou and Schorlemmer, 2003; Noy, 2004). Because of this increasing
development, the need for flexible tools enabling semantic matching of
heterogeneous ontologies is becoming much more apparent (Shvaiko and Euzenat,
2013).
   In this paper, we present a demonstration of an ontology alignment Web
interface called GT-Align, consisting of a server implementation that wraps an
ontology alignment algorithm and exposes REST API endpoints which the
client-side user interface employs to enable iterative alignment of biomedical
ontologies.
   Our solution to the problem of ontology alignment is twofold. First, we use
an ontology alignment algorithm to identify shared relationships between
heterogeneous entities and generate a set of suggested mappings. Second, we
allow the user to engage with the results on each iteration by accepting,
rejecting or clearing (reverting an acceptance or a rejection) mappings or
mapping groups. The interface also allows the user to upload a set of
equivalence mappings as an input partial alignment to bootstrap a new alignment
process. All mappings must be positively accepted; in other words, no mappings
are deemed accepted until positively indicated as such by the user. Rejection of
a mapping is an indication that such a mapping should never happen. Clearing of
a mapping, on the other hand, indicates that it is not accepted but still
possible. The GT-Align Web Interface supports manual evaluation of results
through visual inspection, including inspection of parents and children of
elements as well as inspection of labels and other textual information. The tool
also provides information on the confidence of a mapping as calculated by the
underlying ASMOV algorithm (Jean-Mary et al., 2009; Jean-Mary and Kabuka, 2014),
and where applicable it also provides references to the codes in the Unified
Medical Language System (UMLS) with which concepts are tagged. For algorithm
details and an explanation of UMLS usage, see the Algorithm section.
   The main goal of GT-Align is to enable easier ontology alignment of
biomedical data and allow domain experts to validate the results and thus ensure
high-quality alignments. Put succinctly, GT-Align enables the production of an
alignment between any two biomedical ontologies, allowing users to review and
revise mappings interactively.


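The review semantics described in the Introduction (nothing counts as accepted
until the user says so, a rejection rules a mapping out, and clearing returns it
to an undecided state) can be sketched as a small state model. This is an
illustrative sketch only: the names MappingState, Mapping and accepted_mappings,
the 0-100 confidence scale on this record, and the example concept identifiers
are assumptions made for the example and are not taken from the GT-Align
implementation or its REST API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class MappingState(Enum):
    UNDEFINED = "undefined"  # suggested or cleared: not accepted, but still possible
    ACCEPTED = "accepted"    # positively confirmed by the user
    REJECTED = "rejected"    # the user indicated this mapping should never happen


@dataclass
class Mapping:
    source: str        # concept identifier in the source ontology (hypothetical)
    target: str        # concept identifier in the target ontology (hypothetical)
    confidence: float  # algorithm confidence, assumed here to be on a 0-100 scale
    state: MappingState = MappingState.UNDEFINED

    def accept(self) -> None:
        self.state = MappingState.ACCEPTED

    def reject(self) -> None:
        self.state = MappingState.REJECTED

    def clear(self) -> None:
        # Reverts an earlier acceptance or rejection.
        self.state = MappingState.UNDEFINED


def accepted_mappings(mappings: List[Mapping]) -> List[Mapping]:
    """Only mappings the user has explicitly accepted count as accepted."""
    return [m for m in mappings if m.state is MappingState.ACCEPTED]


# Hypothetical suggestions produced by one alignment iteration.
suggestions = [
    Mapping("src:Heart", "tgt:Heart", 97.0),
    Mapping("src:Vessel", "tgt:Artery", 61.5),
]
suggestions[0].accept()
suggestions[1].reject()
suggestions[1].clear()  # back to undecided after the rejection is reverted
```

Keeping "undefined" as the default state mirrors the rule that a suggested
mapping is never treated as accepted without an explicit user action.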

2   Algorithm

The underlying alignment algorithm for GT-Align is called ASMOV, which was
developed for use in the integration of data and ontologies in the biomedical
and life sciences domain within the GeneTegra Information System
(www.genetegra.com). The algorithm makes use of an iterative approach with
similarity calculations along multiple dimensions coupled with a process of
semantic verification that seeks to remove mapping inconsistencies. ASMOV uses a
combination of string-, constraint-, formal-resource-, graph-, model-, and
instance-based matching mechanisms. ASMOV has participated in several rounds of
the evaluations performed by the Ontology Alignment Evaluation Initiative
(OAEI), placing as one of the top three performers in the benchmark tests of the
contests in which it has participated (Jean-Mary and Kabuka, 2007; Jean-Mary and
Kabuka, 2008; Jean-Mary et al., 2009; Jean-Mary et al., 2010). The ASMOV
algorithm uses UMLS as an underlying vocabulary aimed at improving lexical
matching between source and target entities. The interface enables the user to
turn off this feature, in which case lexical matching is based on Levenshtein
edit distance. Prior evaluations of the algorithm showed that the use of an
underlying vocabulary significantly improves the quality of mapping, while also
reducing the time needed for completion of the alignment process (Jean-Mary and
Kabuka, 2007).

3   User Interface

   The visualization types shown to be most effective at enabling user
involvement in an alignment process are tree and graph structures, each having
specific benefits to the user (Fu et al., 2013). Furthermore, Granitzer et al.
(2010) have shown that an intelligent combination of both structures is present
in many advanced alignment visualization tools. By combining list, tree and
graph visualizations to present alignment data to the user at distinct levels of
abstraction, GT-Align can yield a more productive alignment through user
feedback. The user can explore detailed information on individual concepts as
well as parameters of mapping candidates such as estimated confidence and
status. Furthermore, the user can filter this data by ontological subregions or
by individual mapping features. Additionally, the GT-Align user interface is
built on modern web technologies including JavaScript/HTML/CSS as well as SVGs
for data visualizations, enabling GT-Align to stay on the cutting edge of UI
tools (Li et al., 2015). In addition to wide platform support (including
mobile), web technologies maintain consistently high-quality UI capabilities via
frequent improvements. This enables GT-Align to be easily deployed into any
environment as well as quickly updated with the latest technological advances at
minimal expense to the user.
   The following sections focus on the individual visualizations and
capabilities of the different views present in the GT-Align Web Interface.

3.1   Ontology Import View

This view allows the user to upload ontologies into the system, which they can
further inspect in the Hierarchical Tree View. The ontology import process uses
an extensible set of rules to normalize lexical labels used within it, marking
one as the preferred label and others as alternative labels. These labels are
then annotated to concepts within the UMLS Metathesaurus. Having normalized
labels provides a consistent visual identification scheme that is more easily
recognized and thus friendlier to the user.

3.2   Hierarchical Tree View

This part of the system displays ontologies as a hierarchical tree of concepts.
The ontology tree visualization serves as the fundamental visualization in
GT-Align. The view is shown in Figure 1.

        Figure 1: Hierarchical Tree View

When concepts are asserted as children of multiple parents, they are displayed
within each parent, as is standard practice. Metadata about the ontology,
including its ID and any annotations such as textual descriptions, are displayed
on an information pane. The system also provides an autocomplete search for
individual concepts within the ontology. The indented tree visualization
presents information in a commonly used abstraction, allowing users to explore
ontologies without specialized knowledge of the visualization itself. This
enables anybody familiar with the concept of an ontology to get started with the
software very quickly. The usefulness of a tree visualization in displaying
hierarchical relationships has been demonstrated by its long-term usage in many
areas. From visualizing file systems or HTML structures to visualizing
ontologies in tools such as WebProtégé (Tudorache et al., 2013), an indented
tree visualization is familiar to most, enabling quicker onboarding into the
GT-Align system.

3.3   Alignment Execution View

Through this view, the user can execute an alignment with specific parameters.
The user starts by selecting two ontologies that will be aligned. Each of the
ontologies can either be selected from the set of ontologies that were
previously added or be uploaded to the system. Additionally, the user can upload
a partial alignment as an input parameter to the alignment process.

3.4   Alignment Selection View

This view provides a tabular summary of all the alignments executed in the
system. It contains a historical overview of the alignment processes run by the
user along with details about each process. An example of this view is shown in
Figure 2.

        Figure 2: Alignment Selection View

Clicking on a specific column heading will sort the table according to the
corresponding parameter. The displayed parameters include links to the source
and target ontologies used in the alignment, the custom name supplied by the
user, the date and time when the alignment was created or last modified, and how
long it took to execute. The view also displays the current execution status,
which is updated as an alignment progresses through different stages. Alignments
can execute asynchronously within the GT-Align platform, allowing the user to
perform tasks in parallel. The extensive computational work is offloaded to the
server and does not hinder the user experience. Once an alignment is completed,
the user receives a notification of its final status. Selection of an alignment
transitions the user to the Alignment Overview.

3.5   Alignment Overview

After an alignment is obtained, the mappings are presented in an overview pane
with a circular graph layout. Due to the structure of ontologies, a graph-based
visualization is a natural fit for displaying their alignment. Unlike indented
trees, graphs can display multiple inheritance without any visual redundancy.
This spares the user additional effort in understanding the data at hand and
avoids confusion caused by concept repetition. Tree visualization is
particularly inadequate when displaying large ontologies because the expansion
of nodes to greater depths can quickly become overwhelming. Large trees also
make it difficult to assess the overall structure of an ontology. Using a graph
visualization allows us to handle large amounts of data in a way that is more
customizable and flexible. The graph visualization is shown in Figure 3.

        Figure 3: Alignment Overview

   Concepts from both ontologies are distinguished in the graph by color and
positioning. They are separated based on their originating ontology: the
concepts from the source ontology are placed on one side and the concepts from
the target ontology on the other. The user can rotate the graph as they please.
Further clustering of concepts is performed based on their hierarchical position
in the ontology. The closer a concept is to the root, the closer it is to the
center of the graph. Conversely, the outer section of the graph represents the
leaf nodes. This design allows the user to easily assess the structure of both
ontologies while performing minimal work.
   Two concepts connected with a line represent a single mapping. The thickness
of this line corresponds to the confidence value, i.e. the level of confidence
in the mapping being correct, according to the underlying algorithm. The thicker
the line, the higher the generated mapping confidence. This lets the user
evaluate a section of the alignment by the number of high- or low-confidence
mappings it contains.
   The user can click on a specific mapping or group of mappings under a common
parent. Selection of a group of mappings transitions to the Mapping Group View,
while selection of an individual mapping transitions to the Mapping View. At the
top is a toolbar that provides controls to filter the mappings by the confidence
value. Additionally, the mappings can be filtered by the mapping origin
(suggested by algorithm, found by algorithm, provided by partial alignment),
mapping state (accepted by user, rejected by user, undefined), and a branch in
the ontology. This filtering is automatically reflected in the graph, allowing
users to quickly see the alignment overview at different scales. Besides
filtering, the toolbar allows for bulk editing of mappings, enabling the user to
accept, reject or clear mappings for large sections of the alignment. The
toolbar additionally allows the user to export the mappings in the Alignment RDF
format and the EDOAL format, as well as a merged OWL ontology.

3.6   Mapping Group View

This view, shown in Figure 4, displays the suggested mappings for a concept and
its children. The main purpose of this view is to allow the user to examine a
group of mappings separately from the alignment as a whole. The mappings are
shown as a vertical list of concept pairs, giving the user an alternative
presentation to the graph that is familiar and straightforward.

        Figure 4: Mapping Group View

   Each concept pair contains a rectangular visualization of its mapping
confidence on a scale of 0-100. The mapping state is shown above each of the
confidence visualizations. Control buttons are provided allowing the user to
alter the mapping state of the whole group as well as of individual mappings
within it. Selection of an individual mapping transitions to the Mapping View.

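The circular layout and line thickness described for the Alignment Overview
(Section 3.5) can be illustrated with a short geometric sketch. It is not the
GT-Align rendering code: the function names, the choice of left and right half
circles for source and target, the use of tree depth as a stand-in for "distance
from the root", the 0-100 confidence scale, and the width range are all
assumptions made for the example.

```python
import math
from typing import Tuple


def radial_position(depth: int, max_depth: int, index: int, count: int,
                    side: str, outer_radius: float = 1.0) -> Tuple[float, float]:
    """Return (x, y) for a concept: depth sets the radius, index sets the angle."""
    if count <= 0:
        raise ValueError("count must be positive")
    # The root (depth 0) sits at the center; the deepest level sits on the outer ring.
    radius = outer_radius * depth / max_depth if max_depth > 0 else 0.0
    # Spread the concepts of one ontology evenly over a half circle.
    fraction = (index + 0.5) / count
    # Assumption: source concepts take the left half, target concepts the right half.
    start = math.pi / 2 if side == "source" else -math.pi / 2
    angle = start + fraction * math.pi
    return (radius * math.cos(angle), radius * math.sin(angle))


def line_width(confidence: float, min_width: float = 0.5,
               max_width: float = 6.0) -> float:
    """Map a confidence on an assumed 0-100 scale linearly to a stroke width."""
    clamped = min(max(confidence, 0.0), 100.0)
    return min_width + (clamped / 100.0) * (max_width - min_width)
```

A linear width mapping is the simplest choice that keeps "thicker means more
confident" readable; a real renderer might prefer a nonlinear scale so that
low-confidence lines remain visible in dense regions of the graph.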

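The toolbar filters and bulk edits described for the Alignment Overview, and the
group-wide state changes of the Mapping Group View, could be modelled roughly as
below. This is a minimal sketch assuming a flat list of mapping records; the
record fields, the origin and state labels, and the helpers filter_mappings and
bulk_set_state are hypothetical and do not reflect GT-Align's actual data model.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Hypothetical labels for the three mapping states named in the text.
STATES = {"accepted", "rejected", "undefined"}


@dataclass
class MappingRecord:
    source: str
    target: str
    confidence: float         # assumed 0-100 scale, as in the Mapping Group View
    origin: str               # hypothetical labels: "suggested", "found", "partial"
    state: str = "undefined"
    branch: str = ""          # hypothetical: name of the ontology branch it belongs to


def filter_mappings(mappings: Iterable[MappingRecord],
                    min_confidence: float = 0.0,
                    origin: Optional[str] = None,
                    state: Optional[str] = None,
                    branch: Optional[str] = None) -> List[MappingRecord]:
    """Keep the mappings that satisfy every filter that was actually set."""
    selected = []
    for m in mappings:
        if m.confidence < min_confidence:
            continue
        if origin is not None and m.origin != origin:
            continue
        if state is not None and m.state != state:
            continue
        if branch is not None and m.branch != branch:
            continue
        selected.append(m)
    return selected


def bulk_set_state(mappings: Iterable[MappingRecord], new_state: str) -> int:
    """Accept, reject or clear a whole selection; returns how many were changed."""
    if new_state not in STATES:
        raise ValueError(f"unknown mapping state: {new_state}")
    count = 0
    for m in mappings:
        m.state = new_state
        count += 1
    return count


# Hypothetical data: accept every algorithm-suggested mapping with confidence >= 90.
records = [
    MappingRecord("src:Heart", "tgt:Heart", 97.0, "suggested", branch="Anatomy"),
    MappingRecord("src:Vessel", "tgt:Artery", 61.5, "suggested", branch="Anatomy"),
    MappingRecord("src:Aspirin", "tgt:Aspirin", 99.0, "partial", branch="Drug"),
]
changed = bulk_set_state(
    filter_mappings(records, min_confidence=90.0, origin="suggested"), "accepted")
```

Treating an unset filter as "match everything" lets the same function serve both
a single-criterion filter and a combination of confidence, origin, state and
branch.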

3.7   Mapping View

This multi-pane view, shown in Figure 5, provides an in-depth visualization of a
single mapping, allowing the user to review or modify the mapping.

        Figure 5: Mapping View

   Top left pane: This pane contains information about the currently focused
source concept selected by the user, including a preferred label, alternative
labels, and annotated UMLS codes if available. All other panes display
information in relation to this currently focused concept.
   Top right pane: This pane contains a selected target concept for mapping to
the currently focused concept. In the center between the top right and left
panes are controls allowing the user to accept or reject the mapping between the
focused and selected concepts, or create a new one if none yet exists. The
controls also include an option for the user to swap the focused and selected
concepts, causing other panes to adjust accordingly. Three central panes under
the controls show one or more mappings in different states.
   Bottom center panes: The top pane shows the accepted mapping for the focused
source concept, if one exists. The middle central pane shows a list of possible
mappings according to ASMOV. The top concept in this list is the mapping
suggested by ASMOV; the other mappings are alternative possibilities. The bottom
central pane shows a list of rejected mappings, if any exist. In all three
panes, mappings show their preferred label and mapping confidence value.
Clicking on a concept in any of these panes places it on the selected mapping
pane, making it the new focused concept.
   Bottom side panes: The two bottom panes, one on each side, contain the
hierarchical tree views of both ontologies (see Hierarchical Tree View). On the
bottom left is the ontology tree corresponding to the focused source concept,
and on the bottom right is the ontology tree corresponding to the selected
target concept. Each ontology tree highlights the selection of the corresponding
concept. All ancestors of both concepts are shown in the respective tree view
along with all siblings of each ancestor; children of ancestor siblings are
initially hidden, although the user can explore them if desired. Clicking on a
concept within the left ontology tree will set it as the currently focused
concept, updating all other panes accordingly. Clicking on any concept from the
right ontology tree places it on the selected concept pane, choosing it as the
mapping for the source concept.

4   Future Work

The previous sections highlight the current status and main features of the
GT-Align Web Interface. Several features are planned for future releases.
Subsequent releases will allow users to specify subsumption relationships in
addition to equivalence relationships. GT-Align will include the capability of
performing an automated evaluation of precision and recall against a reference
alignment, and of displaying the results of such an evaluation. Additionally, it
will incorporate functionality to dynamically change the set of fixed weights
for the various similarity values calculated by ASMOV. It will also present the
separate confidence scores for the different measures of similarity generated by
ASMOV. GT-Align will support pivot formats such as EDOAL to allow importing of
alignments from other systems. Finally, GT-Align will support interactive
collaboration by multiple users.

5   Conclusion

In this paper, we presented GT-Align, a versatile implementation of an ontology
alignment Web interface, backed by an efficient, robust and field-tested
algorithm called ASMOV. Besides the algorithmic alignment, the interface enables
iterative user involvement, allowing domain experts to improve and validate the
results, thus contributing to the quality of the alignment. We showed features
to evaluate relationships between entities in different biomedical ontologies as
well as to explore ontologies on their own through hierarchical trees. The
interface enables users to analyze aligned mappings from a high-level
perspective through groups refined by optional user filters and combined with
algorithm results. It further provides capabilities for fine-grained inspection
of individual concepts through their mappings as well as relations to other
concepts. An assortment of visualizations provided by the user interface enables
multiple perspectives on the data itself along with the alignment results.
Additionally, we presented our plans for future development. In summary,
GT-Align is a robust and easy-to-use solution for ontology alignment of
biomedical data.

Acknowledgements

This work is supported by grant R44GM097851 from the National Institute of
General Medical Sciences (NIGMS), part of the U.S. National Institutes of Health
(NIH).

References

Euzenat, Jérôme, and Pavel Shvaiko. Ontology matching. Vol. 18. Heidelberg:
  Springer, 2007.

Fu, Bo, Natalya F. Noy, and Margaret-Anne Storey. "Indented tree or graph? A
  usability study of ontology visualization techniques in the context of class
  mapping evaluation." In International Semantic Web Conference, pp. 117-134.
  Springer, Berlin, Heidelberg, 2013.

Granitzer, Michael, Vedran Sabol, Kow Weng Onn, Dickson Lukose, and Klaus
  Tochtermann. "Ontology alignment - a survey with focus on visually supported
  semi-automatic techniques." Future Internet 2, no. 3 (2010): 238-258.

Jean-Mary, Yves R., and Mansur R. Kabuka. "ASMOV results for OAEI 2007." In
  Proceedings of the 2nd International Conference on Ontology Matching-Volume
  304, pp. 150-159. CEUR-WS.org, 2007.

Jean-Mary, Yves R., and Mansur R. Kabuka. "ASMOV: results for OAEI 2008." In
  Proceedings of the 3rd International Workshop on Ontology Matching-Volume 431,
  pp. 132-139. CEUR-WS.org, 2008.

Jean-Mary, Yves R., E. Patrick Shironoshita, and Mansur R. Kabuka. "ASMOV:
  Results for OAEI 2009." In Proceedings of the 4th International Workshop on
  Ontology Matching-Volume 551, pp. 152-159. CEUR-WS.org, 2009.

Jean-Mary, Yves R., E. Patrick Shironoshita, and Mansur R. Kabuka. "ASMOV:
  Results for OAEI 2010." In Proceedings of the 5th International Workshop on
  Ontology Matching-Volume 689, pp. 126-133. CEUR-WS.org, 2010.

Jean-Mary, Yves R., E. Patrick Shironoshita, and Mansur R. Kabuka. "Ontology
  matching with semantic verification." Web Semantics: Science, Services and
  Agents on the World Wide Web 7, no. 3 (2009): 235-251.

Jean-Mary, Yves R., and Kabuka, Mansur. "Ontology alignment with semantic
  validation." U.S. Patent 8,738,636, issued May 27, 2014.

Kalfoglou, Yannis, and Marco Schorlemmer. "Ontology mapping: the state of the
  art." The knowledge engineering review 18, no. 01 (2003): 1-31.

Li, Yiting, Cosmin Stroe, and Isabel F. Cruz. "Interactive Visualization of
  Large Ontology Matching Results." In VOILA@ISWC, p. 37. 2015.

Noy, Natalya F. "Semantic integration: a survey of ontology-based approaches."
  ACM Sigmod Record 33, no. 4 (2004): 65-70.

Shvaiko, Pavel, and Jérôme Euzenat. "Ontology matching: state of the art and
  future challenges." IEEE Transactions on knowledge and data engineering 25,
  no. 1 (2013): 158-176.

Tudorache, Tania, Csongor Nyulas, Natalya F. Noy, and Mark A. Musen.
  "WebProtégé: A collaborative ontology editor and knowledge acquisition tool
  for the web." Semantic web 4, no. 1 (2013): 89-99.



