
Towards Dialogue-Based Interactive Semantic Mediation in the Medical Domain

German Research Center for Artificial Intelligence, 66123 Saarbrücken, Germany

We think of ontology matching as a dialogue-based interactive mediation process, for which we propose a three-stage model. A preliminary evaluation shows how we applied this method of eliciting input for ontology matching in the medical domain. In particular, we address the challenge of how to use dialogue-based interaction with the user to rate partial alignments between two ontologies.

Introduction

efficiency and ease of use. For example, in the medical domain, radiologists in hospitals can pose clinically relevant questions such as “Is there a different diagnosis for a lesion in our knowledge base or any other clinical resource available?” The challenge is, of course, to find the relevant instance ontologies and to align them. For the interactive alignment process, we use the same dialogue shell as for the task-based dialogue, with state-of-the-art natural language processing methods combined in a Semantic Web framework [2, 3].

In this work, we rely on dynamic models of the ontology integration process and try to model the meta dialogue for interactive ontology matching. Here, meta dialogue is to be understood as clarifications and questions that elicit user feedback to support the ontology matching process. For example:³

1. U: “Show me the internal organs: lungs, liver, then spleen and colon.”
2. S: Shows corresponding patient image data according to referral record.
3. U: “This lymph node here (+ pointing gesture) is enlarged; so lymphadenopathy (finding RadLex Term A). Are there any comparative cases in the hospital?”
4. S: “Sorry, there are no comparative cases, shall I continue to search for similar cases?”
5. U: “Ah okay.”
6. S: “Unfortunately, external databases could not be matched appropriately. Is lymphadenopathy equivalent to lymph node enlargement (RadLex Term B) or adenopathy (NCI Term C) or a kind of hematopoietic system finding (NCI superconcept)?”
7. U: “Yes, that could be said,”
8. U: “a pathophysiologic process.” (Medico now includes RadLex superconcepts for search.)
9. S: “I found comparative cases in an external knowledge base according to your feedback.” Medico switches to the comparative records to help the radiologist in the differential diagnosis of the suspicious case, before the next organ (liver) is examined.
10. U: “Find similar liver lesions with the characteristics: hyper-intense and/or coarse texture ...”
11. S: Medico again displays the search results ranked by the similarity and matching of the medical terms that constrain the semantic search.
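To make the meta-level question in utterance (6) more concrete, the following minimal sketch shows how such a question could be assembled from a list of candidate correspondences. The function name, the tuple layout, and the relation labels "equivalent" and "subclass" are assumptions made for illustration only; they are not part of the Medico system as described here.

```python
def clarification_question(term, candidates):
    """Build one system question from candidate correspondences.

    `candidates` is a list of (label, vocabulary, relation) tuples, where
    relation is "equivalent" or "subclass" (hypothetical labels).
    """
    if not candidates:
        raise ValueError("at least one candidate correspondence is needed")
    equivalents = [f"{label} ({vocab})"
                   for label, vocab, rel in candidates if rel == "equivalent"]
    supertypes = [f"{label} ({vocab})"
                  for label, vocab, rel in candidates if rel == "subclass"]
    parts = []
    if equivalents:
        parts.append("equivalent to " + " or ".join(equivalents))
    if supertypes:
        parts.append("a kind of " + " or ".join(supertypes))
    return f"Is {term} " + " or ".join(parts) + "?"


# Candidates taken from the wording of the example dialogue.
candidates = [
    ("lymph node enlargement", "RadLex Term B", "equivalent"),
    ("adenopathy", "NCI Term C", "equivalent"),
    ("hematopoietic system finding", "NCI superconcept", "subclass"),
]
question = clarification_question("lymphadenopathy", candidates)
print(question)
```

A yes/no answer to such a question can only confirm or reject the whole set of candidates; a real dialogue component would presumably need one question per candidate or a follow-up turn, as in utterances (7) and (8).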

A useful and cooperative question answering dialogue in natural language would combine different topics, heterogeneous information sources, and user feedback on the matching process as meta dialogue. The example dialogue illustrates such a lifelike question answering dialogue; in this respect, utterance (6) is the meta-level system question, and utterance (7) is the user’s interactive mapping feedback. Note that the system utterance (6) calls for a classification model that judges the accuracy of an ad hoc mapping;⁴ the potential of the user feedback (7) is of course not limited to a single correspondence, which can be demonstrated by fixpoint alignment computation in similarity flooding; (8) shows user-initiated mapping information for possible supertypes.

³ The potential application scenario (provided by Siemens AG in the context of the THESEUS-Medico project) involves a radiologist who treats a lymphoma patient; the patient visits the doctor after chemotherapy for a follow-up CT examination. One of the radiologist’s goals is to estimate the effectiveness of the administered medicine. In order to finish the reading/pathology, additional cases have to be taken into account for comparison, which we try to find by matching ontologies of different patient case databases.

⁴ To the best of our knowledge, such a classification model has not yet been proposed in the literature. We had good initial experience with a string-based model on the concept signs for complete mappings, where we computed the ratio of alignments with confidence value t > 0.9. However, this strategy is not robust in the case of partial mappings.
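The ratio mentioned in footnote 4 can be illustrated with a short sketch. The string measure used here (Python's `difflib.SequenceMatcher`) is a stand-in chosen for the example, since the text does not say which string-based measure was used; the label pairs are hypothetical and not taken from RadLex or NCI.

```python
from difflib import SequenceMatcher


def label_similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def high_confidence_ratio(correspondences, threshold=0.9):
    """Fraction of correspondences whose confidence exceeds the threshold."""
    if not correspondences:
        return 0.0
    confident = sum(1 for _, _, conf in correspondences if conf > threshold)
    return confident / len(correspondences)


# Hypothetical label pairs of a complete mapping.
pairs = [
    ("lymph node", "Lymph Node"),
    ("liver", "liver"),
    ("lymphadenopathy", "adenopathy"),
    ("spleen", "colon"),
]
mapping = [(a, b, label_similarity(a, b)) for a, b in pairs]
ratio = high_confidence_ratio(mapping)
print(ratio)  # 0.5: two of the four pairs exceed the 0.9 threshold
```

With a partial mapping, the denominator covers only the correspondences that happen to be present, which is one plausible reading of why the footnote calls the strategy not robust in that case.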

Dialogue-Based Interactive Matching Approach

The ontology matching problem can be addressed by several techniques (cf. [4], for example). Recent work in incremental interactive schema matching stressed that users are often annoyed by false positives [5]; advanced incremental visualisations have been developed (e.g., see [6]) to do better than calculating the set of correspondences in a single shot; cognitive support frameworks for ontology mapping actively involve users [7]. In addition, a dialogue-based approach could make more use of partial mappings to increase the usability in dialogue scenarios where the primary task is different from the schema matching task itself.

Our basic idea is as follows: consider the methods that are required for interactive ontology mapping and evaluate the impact of dialogue-based user feedback in this process. While dialogue systems make it possible to obtain user feedback on semantic mediation questions (e.g., questions regarding new semantic mediation rules), incrementally working matching systems can use the feedback as further input for alignment improvement. In order to compute and post-process the alignments, we use the PhaseLibs library.⁵ In the following, we focus on interactive ontology matching and dialogue-based interaction. Rather than focussing on the effectiveness of the interactive matching approach, we describe a suitable dialogue-level integration of the matching process by example. Our interactive ontology matching approach envisions the following three stages:

1. Compute a rudimentary partial mapping by a simple string-based method;
2. Ask the user to disambiguate some of the proposed mappings;
3. Use the resulting alignments as input for more complex algorithms.
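The three stages can be sketched as a small pipeline. This is an illustration under stated assumptions, not the PhaseLibs API: all function names, the thresholds `floor` and `sure`, the `difflib`-based string measure, and the example labels are hypothetical. Stage 3 is represented by a placeholder callable where a more complex matcher such as similarity flooding would be plugged in.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in string measure on concept labels, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def stage1_string_match(labels_a, labels_b, floor=0.6):
    """Stage 1: keep the best string match per source label if it reaches `floor`."""
    candidates = []
    for a in labels_a:
        best = max(labels_b, key=lambda b: similarity(a, b))
        score = similarity(a, best)
        if score >= floor:
            candidates.append((a, best, score))
    return candidates


def stage2_user_feedback(candidates, confirm, sure=0.95):
    """Stage 2: accept near-exact matches, ask the user only about the rest."""
    accepted = []
    for a, b, score in candidates:
        if score >= sure or confirm(a, b):
            accepted.append((a, b))
    return accepted


def stage3_refine(seed, matcher):
    """Stage 3: pass the confirmed pairs to a more complex matcher as input."""
    return matcher(seed)


# Hypothetical labels and a simulated user who confirms every question.
labels_a = ["Lymph node", "Lymphadenopathy", "Liver"]
labels_b = ["lymph node", "adenopathy", "spleen"]
asked = []


def simulated_user(a, b):
    asked.append((a, b))
    return True


candidates = stage1_string_match(labels_a, labels_b)
seed = stage2_user_feedback(candidates, simulated_user)
result = stage3_refine(seed, matcher=lambda pairs: list(pairs))  # placeholder matcher
print(asked)  # only the uncertain pair is put to the user
```

In this run, the exact match is accepted silently, the pair below the `sure` threshold is the only one that triggers a question, and "Liver" finds no candidate above `floor`, so the user is not asked about it at all.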

Concerning the first point, we hypothesise that the rudimentary mapping based on the concept and relation signs can be easily computed and obtained in dialogical reaction time (less than 3 seconds even for large ontologies). Concerning the second point, user interactivity is provided by improving the automatically found correspondences through filtering the alignment. Concerning the third point, we employ similarity flooding, since it allows for input alignments and fixpoint computation in the PhaseLibs implementation following [8]. The interactive semantic mediation approach is depicted in figure 1. In order not to annoy the user, only the difficult cases are presented to her for disambiguation feedback; thus we use the application dialogue shell basically for confirming or rejecting pre-considered alignments. The resulting alignments are then serialised as instances of an RDFS alignment format. Assuming that subsequent similarity computations successfully use the partial alignment inputs (to produce query-relevant partial alignment output), the proposed mediator can be said to be a light-weight but powerful approach to support incremental matching.

⁵ See http://phaselibs.opendfki.de. First, this platform supports custom combinations of algorithms; second, it is entirely written in Java, which allows us to directly integrate the API into the dialogue shell; third, the API supports individual modules and libraries for ontology adapters, similarity measures (e.g., string-based, instance-based, or graph-based), and alignment generators.

We performed a series of preliminary experiments. Our datasets consisted of ontologies and alignment examples (manually annotated alignments for RadLex and NCI). For the first test in the medical domain, we annotated 50 alignments, 30 perfect positives and 20 perfect negatives.⁶ This allows us to compute a confusion matrix of the outcomes. In particular, in this domain the precision was 92% and the recall 50% for simple string-based methods. (Corresponding concept names may differ substantially in their syntactic form.) Subsequently, the three best matches were taken as alignment input for similarity flooding after manually confirming their validity (which simulates positive user feedback). In subsequent tests, we compared the performance of similarity flooding (stage 3) with and without the initial alignments. Our experiments showed that, on average, the first stage of the matching execution (string-based matching) takes less than 5 percent of the end-to-end ontology matching execution time when similarity flooding is involved. In addition, the input alignments (confirmed by the simulated dialogue) make it possible to compute a complete mapping almost 10 times faster within a 30-second time frame;⁷ when comparing partial mapping results with and without initial alignments, a positive effect could not yet be shown in terms of precision/accuracy.
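To illustrate how confirmed input alignments can enter a fixpoint computation, the following is a strongly simplified sketch in the spirit of similarity flooding [8]. It is not the PhaseLibs implementation: the function names, the uniform propagation weights, the default start value, and the toy graphs are assumptions. The update rule adds the initial similarities back in every round, which is one of several possible fixpoint formulas; which one PhaseLibs uses is not stated here.

```python
from collections import defaultdict
from math import sqrt


def build_propagation_graph(edges_a, edges_b):
    """Pair up equally labelled edges of both graphs, in both directions."""
    graph = defaultdict(list)
    for a1, label_a, a2 in edges_a:
        for b1, label_b, b2 in edges_b:
            if label_a == label_b:
                graph[((a1, b1), ("out", label_a))].append((a2, b2))
                graph[((a2, b2), ("in", label_a))].append((a1, b1))
    return graph


def flood(edges_a, edges_b, initial, default=0.1, max_iter=200, eps=1e-6):
    """Iterate similarity propagation until the values stop changing."""
    graph = build_propagation_graph(edges_a, edges_b)
    pairs = {src for src, _ in graph}
    if not pairs:
        return {}
    # Confirmed alignments enter as high initial values (assumption).
    sigma0 = {p: initial.get(p, default) for p in pairs}
    sigma = dict(sigma0)
    for _ in range(max_iter):
        base = {p: sigma0[p] + sigma[p] for p in pairs}
        new = dict(base)
        for (src, _), neighbours in graph.items():
            share = base[src] / len(neighbours)  # uniform weights per label
            for dst in neighbours:
                new[dst] += share
        top = max(new.values())
        if top == 0:
            break
        new = {p: value / top for p, value in new.items()}  # normalise to [0, 1]
        delta = sqrt(sum((new[p] - sigma[p]) ** 2 for p in pairs))
        sigma = new
        if delta < eps:
            break
    return sigma


# Two toy graphs with identical structure (hypothetical node names).
edges_a = [("a_root", "part", "a_x"), ("a_root", "part", "a_y")]
edges_b = [("b_root", "part", "b_x"), ("b_root", "part", "b_y")]

unseeded = flood(edges_a, edges_b, initial={})
seeded = flood(edges_a, edges_b, initial={("a_x", "b_x"): 1.0})
print(seeded[("a_x", "b_x")] > seeded[("a_x", "b_y")])  # True
```

Without a seed, the structurally identical pairs (a_x, b_x) and (a_x, b_y) cannot be told apart; a single confirmed correspondence breaks the tie. The sketch says nothing about runtime, so it does not reproduce the speed-up reported above.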

The evaluation showed that for our test cases, interactive semantic mediation can be implemented by a simple string-based method (stage 1) to fulfill the requirements pertinent in the medical domain; the user dialogue was simulated by validating three matching inputs (stage 2). Since instance ontologies are hard to find for specific domains like medicine, non-instance-based methods as described are welcome alternatives (stage 3). In future work, we will try to provide evaluation methods to estimate the contribution of partial alignment input when the retrieval stage is more complex than simple name comparison, as is the case for most of our medical query patterns; user-confirmed perfect mappings can be used in simple name matching retrieval contexts with perfect precision, but this does not reflect the nature of real-world industrial requirements (in particular, where the user cannot be expected to deliver a reliable judgement). Further, we are investigating techniques to better translate formal mapping uncertainties into appropriate dialogue-level questions for the radiologist and to address the general difficulty that users might not be able to provide helpful feedback in the course of a dialogue.

⁶ The radiologist’s domain consists of many perfect matches owing to an almost identical conceptual anatomy and disease model behind it. Unfortunately, this only concerns local concept structures; in addition, only few radiology experts can provide reliable alignments.

⁷ Note that dataset-specific factors may heavily affect the total execution time as well as the percentage contribution to execution time when comparing the two different similarity flooding stages.

Acknowledgements. This research has been supported in part by the THESEUS Programme in the Core Technology Cluster WP4, which is funded by the German Federal Ministry of Economics and Technology (01MQ07016). The responsibility for this publication lies with the author.

1. Oberle, D., Ankolekar, A., Hitzler, P., Cimiano, P., Sintek, M., Kiesel, M., Mougouie, B., Baumann, S., Vembu, S., Romanelli, M., Buitelaar, P., Engel, R., Sonntag, D., Reithinger, N., Loos, B., Zorn, H.P., Micelli, V., Porzel, R., Schmidt, C., Weiten, M., Burkhardt, F., Zhou, J.: DOLCE ergo SUMO: On foundational and domain models in the SmartWeb Integrated Ontology (SWIntO). Web Semant. 5(3) (2007) 156–174

2. Reithinger, N., Sonntag, D.: An integration framework for a mobile multimodal dialogue system accessing the Semantic Web. In: Proceedings of Interspeech'05. (2005)

3. Sonntag, D., Engel, R., Herzog, G., Pfalzgraf, A., Pfleger, N., Romanelli, M., Reithinger, N.: SmartWeb Handheld-Multimodal Interaction with Ontological Knowledge Bases and Semantic Web Services. In Huang, T.S., Nijholt, A., Pantic, M., Pentland, A., eds.: Artifical Intelligence for Human Computing. Volume 4451 of Lecture Notes in Computer Science, Springer (2007) 272–295

4. Euzenat, J., Shvaiko, P.: Ontology matching. Springer-Verlag, Heidelberg (2007)

5. Bernstein, P.A., Melnik, S., Churchill, J.E.: Incremental schema matching. In: VLDB '06: Proceedings of the 32nd international conference on Very large data bases, VLDB Endowment (2006) 1167–1170

6. Robertson, G.G., Czerwinski, M.P., Churchill, J.E.: Visualization of mappings between schemas. In: CHI '05: Proceedings of the SIGCHI conference on Human factors in computing systems, New York, NY, USA, ACM (2005) 431–439

7. Falconer, S.M., Noy, N., Storey, M.A.D.: Towards understanding the needs of cognitive support for ontology mapping. In Shvaiko, P., Euzenat, J., Noy, N.F., Stuckenschmidt, H., Benjamins, V.R., Uschold, M., eds.: Ontology Matching. Volume 225 of CEUR Workshop Proceedings, CEUR-WS.org (2006)

8. Melnik, S., Garcia-Molina, H., Rahm, E.: Similarity flooding: A versatile graph matching algorithm and its application to schema matching. In: ICDE '02: Proceedings of the 18th International Conference on Data Engineering, Washington, DC, USA, IEEE Computer Society (2002) 117–128