=Paper=
{{Paper
|id=Vol-2029/paper11
|storemode=property
|title=Web-based Ontology Alignment with the GeneTegra Alignment Tool
|pdfUrl=https://ceur-ws.org/Vol-2029/paper11.pdf
|volume=Vol-2029
|authors=Nemanja Stojanovic,Ray M. Bradley,Sean Wilkinson,Mansur Kabuka,E. Patrick Shironoshita
|dblpUrl=https://dblp.org/rec/conf/simbig/StojanovicBWKS17
}}
==Web-based Ontology Alignment with the GeneTegra Alignment Tool==
Nemanja Stojanovic, Ray M. Bradley, Sean Wilkinson, Mansur Kabuka, and E. Patrick Shironoshita

INFOTECH Soft, Inc., 1201 Brickell Avenue, Suite 220, Miami, Florida 33131, USA

[nemanja,rbradley,sean,kabuka,patrick]@infotechsoft.com

===Abstract===
Ontologies are increasingly gaining practical usage for semantic data in various ways and across multiple domains. From this growing applicability arises an ever-greater need to manage large datasets, reduce analytical complexity, and integrate heterogeneous ontologies into or within existing systems both efficiently and accurately, all while minimizing data corruption and maintaining existing semantics. In this paper, we present the GeneTegra Alignment Tool (GT-Align), a practical implementation of the ASMOV ontology alignment algorithm within a Web-based interface, focusing on biomedical data and using the Unified Medical Language System (UMLS) for background knowledge. GT-Align allows iterative alignment of multiple ontologies as well as active user involvement throughout the process.

===1 Introduction===
Ontologies have been increasingly acknowledged as appropriate abstraction instruments for representing entities and their relationships within various domains (Euzenat and Shvaiko, 2007). Due to this abstract expressiveness, ontologies have proven to have a highly extensible applicability spectrum, allowing a greater variety of systems to incorporate them in their modeling (Kalfoglou and Schorlemmer, 2003; Noy, 2004). Because of this increasing development, the need for flexible tools enabling semantic matching of heterogeneous ontologies is becoming much more apparent (Shvaiko and Euzenat, 2013).

In this paper, we present a demonstration of an ontology alignment Web interface called GT-Align, consisting of a server implementation that wraps an ontology alignment algorithm and exposes REST API endpoints, which the client-side user interface employs to enable iterative alignment of biomedical ontologies.

Our solution to the problem of ontology alignment is twofold. First, we use an ontology alignment algorithm to identify shared relationships between heterogeneous entities and generate a set of suggested mappings. Second, we allow the user to engage with the results on each iteration by accepting, rejecting or clearing (reverting an acceptance or a rejection) mappings or mapping groups. The interface also allows the user to upload a set of equivalence mappings as an input partial alignment to bootstrap a new alignment process. All mappings must be positively accepted; in other words, no mappings are deemed accepted until positively indicated as such by the user. Rejection of a mapping is an indication that such a mapping should never happen. Clearing of a mapping, on the other hand, indicates that it is not accepted but still possible. The GT-Align Web Interface supports manual evaluation of results through visual inspection, including inspection of parents and children of elements as well as inspection of labels and other textual information. The tool also provides information on the confidence of a mapping as calculated by the underlying ASMOV algorithm (Jean-Mary et al., 2009; Jean-Mary and Kabuka, 2014), and where applicable it also provides references to the codes in the Unified Medical Language System (UMLS) with which concepts are tagged. For algorithm details and an explanation of UMLS usage, see the Algorithm section.

The main goal of GT-Align is to enable easier ontology alignment of biomedical data and allow domain experts to validate the results and thus ensure high-quality alignments. Put succinctly, GT-Align enables the production of an alignment between any two biomedical ontologies, allowing users to review and revise mappings interactively.

===2 Algorithm===
The underlying alignment algorithm for GT-Align is called ASMOV, which was developed for use in the integration of data and ontologies in the biomedical and life sciences domain within the GeneTegra Information System (www.genetegra.com). The algorithm makes use of an iterative approach with similarity calculations along multiple dimensions, coupled with a process of semantic verification that seeks to remove mapping inconsistencies. ASMOV uses a combination of string-, constraint-, formal-resource-, graph-, model-, and instance-based matching mechanisms. ASMOV has participated in several rounds of the evaluations performed by the Ontology Alignment Evaluation Initiative (OAEI), placing as one of the top three performers in the benchmark tests of the contests in which it has participated (Jean-Mary and Kabuka, 2007; Jean-Mary and Kabuka, 2008; Jean-Mary et al., 2009; Jean-Mary et al., 2010). The ASMOV algorithm uses UMLS as an underlying vocabulary aimed at improving lexical matching between source and target entities. The interface enables the user to turn off this feature, in which case lexical matching is based on Levenshtein edit distance. Prior evaluations of the algorithm showed that the use of an underlying vocabulary significantly improves the quality of mapping, while also reducing the time needed for completion of the alignment process (Jean-Mary and Kabuka, 2007).

===3 User Interface===
The visualization types that have been shown to be most effective at enabling user involvement in an alignment process are tree and graph structures, each with specific benefits to the user (Fu et al., 2013). Furthermore, Granitzer et al. (2010) have shown that an intelligent combination of both structures is present in many advanced alignment visualization tools. By combining list, tree and graph visualizations to present alignment data to the user at distinct levels of abstraction, GT-Align can yield a more productive alignment through user feedback. The user can explore detailed information on individual concepts as well as parameters of mapping candidates such as estimated confidence and status. Furthermore, the user can filter this data by ontological subregions or by individual mapping features.

Additionally, the GT-Align user interface is built on modern web technologies including JavaScript/HTML/CSS as well as SVGs for data visualizations, enabling GT-Align to stay on the cutting edge of UI tools (Li et al., 2015). In addition to wide platform support (including mobile), web technologies maintain consistently high quality of UI capabilities via frequent improvements. This enables GT-Align to be easily deployed into any environment as well as quickly updated with the latest technological advances at minimal expense to the user.

The following sections focus on the individual visualizations and capabilities of the different views present in the GT-Align Web Interface.

====3.1 Ontology Import View====
This view allows the user to upload ontologies into the system, which they can further inspect in the Hierarchical Tree View. The ontology import process uses an extensible set of rules to normalize the lexical labels used within an ontology, marking one as the preferred label and others as alternative labels. These labels are then annotated to concepts within the UMLS Metathesaurus. Having normalized labels provides a consistent visual identification scheme that is more easily recognized and thus friendlier to the user.

====3.2 Hierarchical Tree View====
This part of the system displays ontologies as a hierarchical tree of concepts. The ontology tree visualization serves as the fundamental visualization in GT-Align. The view is shown in Figure 1.

Figure 1: Hierarchical Tree View

When concepts are asserted as children of multiple parents, they are displayed within each parent, as is standard practice. Metadata about the ontology, including its ID and any annotations such as textual descriptions, are displayed on an information pane. The system also provides an autocomplete search for individual concepts within the ontology. The indented tree visualization presents information in a commonly used abstraction, allowing users to explore ontologies without specialized knowledge of the visualization itself. This enables anybody familiar with the concept of an ontology to get started with the software very quickly. The usefulness of a tree visualization in displaying hierarchical relationships has been demonstrated by its long-term usage in many areas. From visualizing file systems or HTML structures to visualizing ontologies in tools such as WebProtégé (Tudorache et al., 2013), an indented tree visualization is familiar to most, enabling quicker onboarding into the GT-Align system.

====3.3 Alignment Execution View====
Through this view, the user can execute an alignment with specific parameters. The user starts by selecting the two ontologies that will be aligned. Each of the ontologies can either be selected from the set of ontologies that were previously added or be uploaded to the system. Additionally, the user can upload a partial alignment as an input parameter to the alignment process.

====3.4 Alignment Selection View====
This view provides a tabular summary of all the alignments executed in the system. It contains a historical overview of the alignment processes run by the user along with details about each process. An example of this view is shown in Figure 2.

Figure 2: Alignment Selection View

Clicking on a specific column heading will sort the table according to the corresponding parameter. The displayed parameters include links to the source and target ontologies used in the alignment, the custom name supplied by the user, the date and time when the alignment was created or last modified, and how long it took to execute. The view also displays the current execution status, which is updated as an alignment progresses through different stages. Alignments can execute asynchronously within the GT-Align platform, allowing the user to perform tasks in parallel. The extensive computational work is offloaded to the server and does not hinder the user experience. Once an alignment is completed, the user receives a notification of its final status. Selection of an alignment transitions the user to the Alignment Overview.

====3.5 Alignment Overview====
After an alignment is obtained, the mappings are presented in an overview pane with a circular graph layout. Due to the structure of ontologies, a graph-based visualization is a natural fit for displaying their alignment. Unlike indented trees, graphs can display multiple inheritance without any visual redundancy. This spares the user the additional effort of understanding repeated data and the confusion that concept repetition can cause. Tree visualization is particularly inadequate when displaying large ontologies because the expansion of nodes to greater depths can quickly become overwhelming. Large trees also make it difficult to assess the overall structure of an ontology. Using a graph visualization allows us to handle large amounts of data in a way that is more customizable and flexible. The graph visualization is shown in Figure 3.

Figure 3: Alignment Overview

Concepts from both ontologies are distinguished in the graph by color and positioning. They are separated based on their originating ontology: the concepts from the source ontology are placed on one side and the concepts from the target ontology on the other. The user can rotate the graph as they please. Further clustering of concepts is performed based on their hierarchical position in the ontology. The closer a concept is to the root, the closer it is to the center of the graph. Conversely, the outer section of the graph represents the leaf nodes. This design allows the user to easily assess the structure of both ontologies with minimal work.

Two concepts connected with a line represent a single mapping. The thickness of this line corresponds to the confidence value, i.e. the level of confidence in the mapping being correct, according to the underlying algorithm. The thicker the line, the higher the generated mapping confidence. This lets the user evaluate a section of the alignment by the number of high- or low-confidence mappings it contains.

The user can click on a specific mapping or on a group of mappings under a common parent. Selection of a group of mappings transitions to the Mapping Group View, while selection of an individual mapping transitions to the Mapping View.

At the top is a toolbar that provides controls to filter the mappings by confidence value. Additionally, the mappings can be filtered by mapping origin (suggested by algorithm, found by algorithm, provided by partial alignment), mapping state (accepted by user, rejected by user, undefined), and branch in the ontology. This filtering is automatically reflected in the graph, allowing users to quickly see the alignment overview at different scales. Besides filtering, the toolbar allows for bulk editing of mappings, enabling the user to accept, reject or clear mappings for large sections of the alignment. The toolbar additionally allows the user to export the mappings in the Alignment RDF format and the EDOAL format, as well as a merged OWL ontology.

====3.6 Mapping Group View====
This view, shown in Figure 4, displays the suggested mappings for a concept and its children. The main purpose of this view is to allow the user to examine a group of mappings separately from the alignment as a whole. The mappings are shown as a vertical list of concept pairs, giving the user an alternative presentation to the graph that is familiar and straightforward.

Figure 4: Mapping Group View

Each concept pair contains a rectangular visualization of its mapping confidence on a scale of 0-100. The mapping state is shown above each of the confidence visualizations. Control buttons are provided allowing the user to alter the mapping state of the whole group as well as of individual mappings within it. Selection of an individual mapping transitions to the Mapping View.

====3.7 Mapping View====
Figure 5: Mapping View

This multi-pane view, shown in Figure 5, provides an in-depth visualization of a single mapping, allowing the user to review or modify the mapping.

Top left pane: This pane contains information about the currently focused source concept selected by the user, including a preferred label, alternative labels, and annotated UMLS codes if available. All other panes display information in relation to this currently focused concept.

Top right pane: This pane contains a selected target concept for mapping to the currently focused concept. In the center between the top right and left panes are controls allowing the user to accept or reject the mapping between the focused and selected concepts, or create a new one if none yet exists. The controls also include an option for the user to swap the focused and selected concepts, causing other panes to adjust accordingly. Three central panes under the controls show one or more mappings in different states.

Bottom center panes: The top pane shows the accepted mapping for the focused source concept, if one exists. The middle central pane shows a list of possible mappings according to ASMOV. The top concept in this list is the mapping suggested by ASMOV; the other mappings are alternative possibilities. The bottom central pane shows a list of rejected mappings, if any exist. In all three panes, mappings show their preferred label and mapping confidence value. Clicking on a concept in any of these panes places it on the selected mapping pane, making it the new focused concept.

Bottom side panes: The two bottom panes, one on each side, contain the hierarchical tree views of both ontologies (see Hierarchical Tree View). On the bottom left is the ontology tree corresponding to the focused source concept, and on the bottom right is the ontology tree corresponding to the selected target concept. Each ontology tree highlights the selection of the corresponding concept. All ancestors of both concepts are shown in the respective tree view along with all siblings of each ancestor; children of ancestor siblings are initially hidden, although the user can explore them if desired. Clicking on a concept within the left ontology tree will set it as the currently focused concept, updating all other panes accordingly. Clicking on any concept in the right ontology tree places it on the selected concept pane, choosing it as the mapping for the source concept.

===4 Future Work===
The previous sections highlight the current status and main features of the GT-Align Web Interface. Several further features are planned. Subsequent releases will allow users to specify subsumption relationships in addition to equivalence relationships. GT-Align will include the capability of performing an automated evaluation of precision and recall against a reference alignment and of displaying the results of such an evaluation. Additionally, it will incorporate functionality to dynamically change the set of fixed weights for the various similarity values calculated by ASMOV. It will also present the separate confidence scores for the different measures of similarity generated by ASMOV. GT-Align will support pivot systems such as EDOAL to allow importing of alignments from other systems. Finally, GT-Align will support interactive collaboration by multiple users.

===5 Conclusion===
In this paper, we presented GT-Align, a versatile implementation of an ontology alignment Web interface, backed by an efficient, robust and field-tested algorithm called ASMOV. Besides the algorithmic alignment, the interface enables iterative user involvement, allowing domain experts to improve and validate the results, thus contributing to the quality of the alignment. We showed features to evaluate relationships between entities in different biomedical ontologies as well as to explore ontologies on their own through hierarchical trees. The interface enables users to analyze aligned mappings from a high-level perspective through groups refined by optional user filters and combined with algorithm results. It further provides capabilities for fine-grained inspection of individual concepts through their mappings as well as their relations to other concepts. An assortment of visualizations provided by the user interface enables multiple perspectives on the data itself along with the alignment results. Additionally, we presented our plans for future development. In summary, GT-Align is a robust and easy-to-use solution for ontology alignment of biomedical data.

===Acknowledgements===
This work is supported by grant R44GM097851 from the National Institute of General Medical Sciences (NIGMS), part of the U.S. National Institutes of Health (NIH).

===References===
Euzenat, Jérôme, and Pavel Shvaiko. Ontology matching. Vol. 18. Heidelberg: Springer, 2007.

Fu, Bo, Natalya F. Noy, and Margaret-Anne Storey. "Indented tree or graph? A usability study of ontology visualization techniques in the context of class mapping evaluation." In International Semantic Web Conference, pp. 117-134. Springer, Berlin, Heidelberg, 2013.

Granitzer, Michael, Vedran Sabol, Kow Weng Onn, Dickson Lukose, and Klaus Tochtermann. "Ontology alignment - a survey with focus on visually supported semi-automatic techniques." Future Internet 2, no. 3 (2010): 238-258.

Jean-Mary, Yves R., and Mansur R. Kabuka. "ASMOV results for OAEI 2007." In Proceedings of the 2nd International Conference on Ontology Matching-Volume 304, pp. 150-159. CEUR-WS.org, 2007.

Jean-Mary, Yves R., and Mansur R. Kabuka. "ASMOV: results for OAEI 2008." In Proceedings of the 3rd International Workshop on Ontology Matching-Volume 431, pp. 132-139. CEUR-WS.org, 2008.

Jean-Mary, Yves R., E. Patrick Shironoshita, and Mansur R. Kabuka. "ASMOV: Results for OAEI 2009." In Proceedings of the 4th International Workshop on Ontology Matching-Volume 551, pp. 152-159. CEUR-WS.org, 2009.

Jean-Mary, Yves R., E. Patrick Shironoshita, and Mansur R. Kabuka. "ASMOV: Results for OAEI 2010." In Proceedings of the 5th International Workshop on Ontology Matching-Volume 689, pp. 126-133. CEUR-WS.org, 2010.

Jean-Mary, Yves R., E. Patrick Shironoshita, and Mansur R. Kabuka. "Ontology matching with semantic verification." Web Semantics: Science, Services and Agents on the World Wide Web 7, no. 3 (2009): 235-251.

Jean-Mary, Yves R., and Mansur Kabuka. "Ontology alignment with semantic validation." U.S. Patent 8,738,636, issued May 27, 2014.

Kalfoglou, Yannis, and Marco Schorlemmer. "Ontology mapping: the state of the art." The Knowledge Engineering Review 18, no. 01 (2003): 1-31.

Li, Yiting, Cosmin Stroe, and Isabel F. Cruz. "Interactive Visualization of Large Ontology Matching Results." In VOILA@ISWC, p. 37. 2015.

Noy, Natalya F. "Semantic integration: a survey of ontology-based approaches." ACM SIGMOD Record 33, no. 4 (2004): 65-70.

Shvaiko, Pavel, and Jérôme Euzenat. "Ontology matching: state of the art and future challenges." IEEE Transactions on Knowledge and Data Engineering 25, no. 1 (2013): 158-176.

Tudorache, Tania, Csongor Nyulas, Natalya F. Noy, and Mark A. Musen. "WebProtégé: A collaborative ontology editor and knowledge acquisition tool for the web." Semantic Web 4, no. 1 (2013): 89-99.
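
As an illustration of the review workflow described in the Introduction and in the Alignment Overview toolbar (accept, reject or clear a mapping; filter by confidence and state; bulk edit), the following is a minimal sketch in Python. It is not GT-Align's code: the names `Mapping`, `MappingState`, `filter_mappings` and `bulk_set` are hypothetical, the concept identifiers are invented, and the only details taken from the paper are the three mapping states and the 0-100 confidence scale.

```python
from dataclasses import dataclass
from enum import Enum


class MappingState(Enum):
    UNDEFINED = "undefined"  # not accepted, but still possible
    ACCEPTED = "accepted"    # positively confirmed by the user
    REJECTED = "rejected"    # should never be proposed


@dataclass
class Mapping:
    source: str              # hypothetical source concept identifier
    target: str              # hypothetical target concept identifier
    confidence: float        # 0-100, as shown in the Mapping Group View
    state: MappingState = MappingState.UNDEFINED

    def accept(self) -> None:
        self.state = MappingState.ACCEPTED

    def reject(self) -> None:
        self.state = MappingState.REJECTED

    def clear(self) -> None:
        # Clearing reverts an earlier acceptance or rejection.
        self.state = MappingState.UNDEFINED


def filter_mappings(mappings, min_confidence=0.0, state=None):
    """Return mappings at or above min_confidence, optionally in one state."""
    return [
        m for m in mappings
        if m.confidence >= min_confidence and (state is None or m.state == state)
    ]


_ACTIONS = {"accept", "reject", "clear"}


def bulk_set(mappings, action):
    """Apply 'accept', 'reject' or 'clear' to every mapping in the list."""
    if action not in _ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    for m in mappings:
        getattr(m, action)()


# Example: accept every suggestion with confidence of at least 90,
# then explicitly reject one low-confidence suggestion.
alignment = [
    Mapping("src:Heart", "tgt:Heart", 95.0),
    Mapping("src:Limb", "tgt:Arm", 60.0),
    Mapping("src:Cell", "tgt:Chamber", 20.0),
]
bulk_set(filter_mappings(alignment, min_confidence=90.0), "accept")
alignment[2].reject()
```

Keeping `UNDEFINED` as the default state mirrors the paper's rule that no mapping counts as accepted until the user says so; a real implementation would presumably persist these states on the server behind the REST API rather than in memory.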