Verbally Assisted Virtual-Environment Tactile Maps: A Prototype System

Kris Lohmann, Matthias Kerzel, and Christopher Habel

Universität Hamburg, Department of Informatics,
Vogt-Kölln-Straße 30, 22527 Hamburg, Germany
{lohmann,kerzel,habel}@informatik.uni-hamburg.de
http://cinacs.informatik.uni-hamburg.de

Abstract. Tactile maps are non-visual substitutes for visual maps for blind and visually impaired people. In [3, 11] we suggested increasing the effectiveness of tactile maps by developing a system that generates assisting utterances to facilitate knowledge acquisition. The utterances are inspired by a corpus of assisting utterances given to tactile-map explorers by human assistants. The tactile maps are realized as virtual tactile maps presented with a haptic device. The effectiveness of such a system was tested before implementation in controlled experiments reported in [10, 7]. We discuss a prototype system that was developed according to our earlier suggestions. In a user study, we positively evaluated the behavior of the prototype.

Keywords: Multi-Modal Maps, Virtual-Environment Maps, Natural Language Generation, Accessible Maps, Audio-Tactile Maps, Verbally Assisted Maps

1 Introduction

Maps are important external representations of space. For example, humans use urban-area maps in everyday scenarios such as planning a route to the next bakery or physician in environments novel to them. Since access to visual maps is limited for blind and visually impaired people, tactile maps are used as substitutes. Acquiring knowledge from maps haptically differs from acquiring it visually: on the one hand, the information provided by tactile maps is sparser than that provided by visual maps; on the other hand, this information has to be integrated over time. To reduce some of the difficulties in the haptic comprehension of maps, it is useful to provide additional information via the auditory channel.

In [3, 11] we suggested that approaches towards audio-tactile maps can be extended by developing a system implementing Verbally Assisting Virtual-Environment Tactile Maps (VAVETaM). The suggested system generates situation-dependent assisting utterances, which not only inform the explorer of the tactile map about the proper names of the map objects, but also include further information that reduces drawbacks stemming from the sequentiality of tactile-map reading. The generated utterances are based on a corpus study (see [9] for a discussion of the corpus and the set of assisting utterances). For example, the assisting utterances include information about spatial relations between map objects.

[Fig. 1. The Sensable Phantom Omni and a Virtual Tactile Map]

The Sensable Phantom Omni device allows the map user to perceive virtual tactile maps (see Fig. 1). The device can be thought of as a reverse robotic arm that generates force depending on the position of its pen-like handle. Streets and landmarks such as buildings are marked as indentations that can be felt with the device.

In this paper, we present a prototype implementation of the VAVETaM system. The goal of the development of this prototype is to show the technical feasibility of the suggested system; that is, that the prototype works. The feedback gained from the user study presented in this paper will be used for further development. In the remainder, we discuss relevant literature (Section 2), briefly introduce the prototype (Section 3), and present a user study with the prototype (Section 4) before concluding the paper.
2 State of the Art

Audio-tactile maps as improvements over uni-modal tactile maps have successfully been developed and tested for usability. Some approaches use physical ('printout') overlay maps on touch screens (e.g., [13, 14]). The Phantom device used in our prototype has been successfully used to present virtual tactile maps with audio feedback [12, 1, 4]. Existing approaches link sounds or brief verbal information (mostly, names) to map objects. In a corpus study, we found that humans asked to assist map explorers include more information in their assisting utterances, which can be described as brief descriptions of the local surroundings [9]. The VAVETaM approach extends earlier systems by generating assisting utterances that are more similar to human performance. That non-visual knowledge acquisition by direct perception of indoor environments can be facilitated by situated verbal descriptions was shown by [2].

Prior to implementation, we positively evaluated the multi-modal interface described here in Wizard-of-Oz-like experiments (i.e., the assisting utterances were controlled by the experimenter) with both blindfolded sighted and visually impaired people [10, 7]. Participants who received assisting utterances as suggested were compared with participants who were only informed about the names of map objects. Verbal assistance facilitated the acquisition of spatial knowledge from virtual tactile maps.

3 The Prototype System

[Fig. 2. From Hand Movements to Assisting Utterances (pipeline: Haptic Device, Haptic Event Detection, Generation of Verbal Assistance)]

To generate assisting utterances, two important tasks have to be solved. First, the stream of position information from the haptic device is analyzed in order to detect semantic—that is, meaningful—exploration events. Haptic event detection detects, for example, when a map user explores a street with the haptic device. This information is the input to the generation component. Second, based on this input, appropriate information is selected from a knowledge base and prepared for verbalization; that is, natural language is generated. For each of these tasks, components were developed. Haptic event detection is described in more detail in [6, 5]. The generation component and the interface between event detection and generation are discussed in [9, 8]. We connected the two components to a fully working prototype system.

The utterances produced by the prototype are similar to those evaluated in previous experiments (see Section 2). For example, when the movements the user performs with the device indicate that he or she is interested in a street called 'Amselweg', the system produces an output which can be translated as follows: 'This is Amselweg. This street is parallel to Blumenstraße. It forms a corner with Dorfstraße at the top end and towards the bottom it is restricted by the map frame. Furthermore, it crosses Hochstraße.' See [9] for a thorough discussion of the content of the utterances.

The generation of the utterances is not based on canned text. Both the verbalization history and the current map exploration are taken into account: when information has already been verbalized, repetitions leading to unnecessary redundancy are avoided. When the map user starts to explore another map object—for example, a landmark called 'Rathaus' (town hall)—the verbalization is stopped after the ongoing sentence is finished, and the map user receives the currently relevant utterances.
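To make this division of labor concrete, the following strongly simplified Java sketch shows how a detected exploration event could drive content selection against a verbalization history. All names in it (ExplorationEvent, KnowledgeBase, SpeechOutput, AssistanceGenerator) are hypothetical illustrations, not the actual VAVETaM code; the real components are described in [6, 5] and [9, 8].

// Hypothetical sketch (not the actual VAVETaM implementation): a semantic
// exploration event from haptic event detection drives the generation
// component, which uses a verbalization history to avoid redundancy.
import java.util.HashSet;
import java.util.Set;

// A semantic exploration event, e.g., "the user is exploring Amselweg".
class ExplorationEvent {
    final String mapObjectId;
    ExplorationEvent(String mapObjectId) { this.mapObjectId = mapObjectId; }
}

// Knowledge base holding map facts, e.g., spatial relations between objects.
interface KnowledgeBase {
    Iterable<String> factsAbout(String mapObjectId);
    String verbalize(String fact);   // turn a selected fact into a sentence
}

// Speech output that can be interrupted only at sentence boundaries.
interface SpeechOutput {
    void say(String sentence);
    void finishCurrentSentenceThenStop();
}

class AssistanceGenerator {
    private final Set<String> verbalized = new HashSet<>();  // history

    // Called by haptic event detection when a new map object is explored.
    void onEvent(ExplorationEvent e, KnowledgeBase kb, SpeechOutput tts) {
        tts.finishCurrentSentenceThenStop();   // switch to the new object
        for (String fact : kb.factsAbout(e.mapObjectId)) {
            if (verbalized.add(fact)) {        // skip already-uttered facts
                tts.say(kb.verbalize(fact));
            }
        }
    }
}

The sketch mirrors the behavior described above: facts that have already been verbalized are skipped, and an ongoing utterance is abandoned only at a sentence boundary when the user moves to another map object.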
The generation system and the haptic event detection are implemented in Java. We use the Chai 3D toolkit (http://www.chai3d.org) for haptic rendering of the virtual tactile map. The Chai 3D toolkit and the haptic event detection are interfaced using the Java Native Interface. Speech synthesis is realized using Mary TTS (http://mary.dfki.de).
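The paper does not show how Mary TTS is invoked. As a rough illustration only, the following sketch synthesizes and plays one assisting sentence via the Java API of MaryTTS 5.x; this API postdates the 2012 prototype, so the actual integration may well differ, and the class SpeakAssistance and the chosen sentence are our own examples.

// Illustration only: speaking one assisting utterance with the
// MaryTTS 5.x Java API (the prototype's actual integration may differ).
import javax.sound.sampled.AudioInputStream;
import marytts.LocalMaryInterface;
import marytts.MaryInterface;
import marytts.util.data.audio.AudioPlayer;

public class SpeakAssistance {
    public static void main(String[] args) throws Exception {
        // Run the synthesizer in-process with its default voice.
        MaryInterface mary = new LocalMaryInterface();
        // One sentence from the Amselweg example in Section 3.
        AudioInputStream audio = mary.generateAudio(
                "This is Amselweg. This street is parallel to Blumenstraße.");
        // MaryTTS ships a simple player thread for the audio stream.
        new AudioPlayer(audio).start();
    }
}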
4 User Study

To evaluate the prototype system, we asked 13 participants (university students, compensated with course credit or money; mean age: 24.7 years, SD = 7.2 years, 9 male) to use the prototype and to give us feedback by indicating agreement with statements about it and by taking part in semi-structured interviews. After initial training with the haptic device, participants were blindfolded and interacted with the VAVETaM prototype. Once they reported that they understood the behavior of the prototype, they were instructed to learn the map so that they would be able to sketch it afterwards. This instruction was included to give the participants a clear goal for the exploration of the map. Participants could take as long as they wanted to explore and memorize the map (overall interaction time: M = 14:16 min, SD = 4:46 min; interaction after instruction: M = 9:30 min, SD = 3:42 min).

After they sketched the map, participants were asked to indicate agreement with a set of statements about the prototype. Selected statements are shown in Table 1; due to length restrictions, we do not discuss all of them, but none of the other statements about the prototype produced a mean agreement worse than neutral. (The full list of statements and mean responses can be retrieved from http://www.informatik.uni-hamburg.de/WSV/VAVETaM/.) Participants were given the list of statements in written form. Answers were given on a 1–5 Likert-type scale (1 corresponds to 'I agree completely' and 5 to 'I do not agree at all'). The list of statements was completed together with the experimenter in order to enable immediate discussion of important points (these discussions were audio-recorded).

Table 1. System Evaluation: Translation of the Statements (from the German Originals) and Mean Responses

     Translation of Statement                                         Mean    SD
 A   The haptic map is understandable.                                1.70   .68
 B   Giving verbal assisting utterances for such maps is helpful.     1.10   .32
 C   The utterances were understandable.                              2.50   .97
 D   I always knew exactly what was meant by the verbal utterances.   1.80   .92
 E   The utterances were helpful.                                     1.70   .68
 F   It is easy to follow the streets in the maps.                    1.50   .53
 G   It is easy to locate landmarks (e.g., buildings).                2.90  1.37
 H   It was confusing that utterances went on when I was already
     at other map objects.                                            1.80  1.03
 I   I usually knew to which map object the assisting utterances
     referred.                                                        1.70   .48
 J   The system behaves comprehensibly.                               1.50   .53

The mean responses indicate that participants considered it generally helpful to support tactile maps with natural language (B) and that they considered the behavior of the prototype comprehensible (J). Furthermore, the virtual tactile maps were considered understandable (A, F). The responses also indicate that it is possible to locate landmarks, although with more difficulty than following streets (G). A possible explanation is that the connected network of streets was easy to follow, while landmarks were not connected with each other and had to be explored one by one.

Participants considered the timing of the assisting utterances appropriate for establishing reference to the map objects to which the assistance belonged (I). They considered the utterances helpful, and they indicated that they knew what was meant (D, E). Agreement with the statement that the utterances were understandable (C) was comparatively weak, which seems somewhat contradictory to the high agreement with statement D. In the discussions and interviews, participants indicated that problems with understandability were, in general, limited to proper names.

Currently, sentences cannot be stopped or changed once they have been started. From our own experience with the system, we expected this to be potentially confusing. In fact, most participants considered the inability of the prototype to change ongoing utterances confusing (H).

In the subsequent interviews, the points discussed above were elaborated and extended. Participants in general considered the set of assisting utterances appropriate; they neither missed important information nor indicated that superfluous information was given. When asked for suggestions on how to improve the system, participants mentioned options for customization (changing the haptic presentation of the map, switching certain kinds of assisting utterances on and off) and the quality of the speech synthesis as potential improvements. Some participants indicated that they would have changed the content and frequency of certain utterances—however, this feedback was not consistent.

5 Conclusion

We presented the Verbally Assisting Virtual-Environment Tactile Maps (VAVETaM) prototype, which solves the task of generating assisting utterances for tactile-map explorations. The prototype is based on the interaction of two components: haptic event detection and assistance generation. The effectiveness of the system was previously evaluated positively in controlled experiments [10, 7]. We presented a user study with the prototype in order to show that it works. Participants considered the behavior of the prototype comprehensible and the utterances and the haptic presentation understandable. Potential improvements pointed out by the users are (1) the quality of the speech synthesis with respect to proper names, (2) the possibility to customize the behavior of the system, and (3) the implementation of a possibility to change ongoing utterances when the user moves to another part of the map. Aside from these potential improvements, participants considered the prototype system, both with respect to the assisting utterances and to their timing, usable, comprehensible, and helpful.

Acknowledgments. The research reported in this paper has been partially supported by the DFG (German Research Foundation) in IRTG 1247 'Cross-modal Interaction in Natural and Artificial Cognitive Systems' (CINACS). We thank Martin Christof Kindsmüller for discussion. We thank the anonymous reviewers for their highly useful commentaries.

References

1. De Felice, F., Renna, F., Attolico, G., Distante, A.: A haptic/acoustic application to allow blind the access to spatial information. In: World Haptics Conference, pp. 310–315. Tsukuba (2007)
2. Giudice, N.A., Bakdash, J.Z., Legge, G.E.: Wayfinding with words: spatial learning and navigation using dynamically updated verbal descriptions. Psychological Research 71(3), 347–358 (2007)
3. Habel, C., Kerzel, M., Lohmann, K.: Verbal assistance in tactile-map explorations: A case for visual representations and reasoning. In: Proceedings of the AAAI Workshop on Visual Representations and Reasoning 2010. Menlo Park, CA (2010)
4. Kaklanis, N., Votis, K., Moschonas, P., Tzovaras, D.: HapticRiaMaps: towards interactive exploration of web world maps for the visually impaired. In: Proceedings of the International Cross-Disciplinary Conference on Web Accessibility. Hyderabad (2011)
5. Kerzel, M., Habel, C.: Monitoring and describing events for virtual-environment tactile-map exploration. In: Galton, A., Worboys, M.F., Duckham, M. (eds.) Proceedings of the Workshop on Identifying Objects, Processes and Events, pp. 13–18. Belfast, ME (2011)
6. Kerzel, M., Habel, C.: Ereigniserkennung während der Exploration audio-taktiler Karten [Event detection during the exploration of audio-tactile maps]. In: Mensch & Computer 2012 (in press)
7. Lohmann, K.: Verbally Assisting Virtual Tactile Maps: An Interface for Visually Impaired People. Doctoral dissertation, Universität Hamburg (in preparation)
8. Lohmann, K., Eichhorn, O., Baumann, T.: Generating situated assisting utterances to facilitate tactile-map exploration: A prototype system. In: Proceedings of the NAACL Workshop on Speech and Language Processing for Assistive Technologies 2012. Montreal, QC (2012)
9. Lohmann, K., Eschenbach, C., Habel, C.: Linking spatial haptic perception to linguistic representations: assisting utterances for tactile-map explorations. In: Egenhofer, M., Giudice, N., Moratz, R., Worboys, M. (eds.) Spatial Information Theory, Lecture Notes in Computer Science, pp. 328–349. Springer, Berlin, Heidelberg (2011)
10. Lohmann, K., Habel, C.: Extended verbal assistance facilitates knowledge acquisition of virtual tactile maps. In: Proceedings of Spatial Cognition 2012 (in press)
11. Lohmann, K., Kerzel, M., Habel, C.: Generating verbal assistance for tactile-map explorations. In: van der Sluis, I., Bergmann, K., van Hooijdonk, C., Theune, M. (eds.) Proceedings of the 3rd Workshop on Multimodal Output Generation 2010. Dublin (2010)
12. Moustakas, K., Nikolakis, G., Kostopoulos, K., Tzovaras, D., Strintzis, M.G.: Haptic rendering of visual data for the visually impaired. IEEE Multimedia 14(1), 62–72 (2007)
13. Parkes, D.: "NOMAD": an audio-tactile tool for the acquisition, use and management of spatially distributed information by partially sighted and blind people. In: Proceedings of the 2nd International Conference on Maps and Graphics for Visually Disabled People. Nottingham (1988)
14. Wang, Z., Li, B., Hedgpeth, T., Haven, T.: Instant tactile-audio map: enabling access to digital maps for people with visual impairment. In: Proceedings of the 11th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 43–50. ACM, Pittsburgh, PA (2009)