ET4S 2014

High-Level Gaze Metrics From Map Viewing: Charting Ambient/Focal Visual Attention

Krzysztof Krejtz (1), Andrew T. Duchowski (2), and Arzu Çöltekin (3)

(1) National Information Processing Institute and University of Social Sciences and Humanities, Warsaw, Poland
(2) School of Computing, Clemson University, Clemson, SC, USA
(3) Department of Geography, University of Zurich, Zurich, Switzerland

Abstract. Distinguishing ambient and focal attention, we demonstrate the use of the K coefficient, which can serve as a cue for recommender systems in deciding when to offer information to the user, e.g., when focally attending during search.

Keywords: maps, eye tracking, visual attention

1 Introduction & Background

Gaze-based recommender systems are designed to respond with information contingent on the viewer’s gaze, e.g., in geographic contexts, when directed to a particular location in physical or virtual space (such as on a map). These geographic gaze-based recommender systems have been referred to as location-aware (mobile) eye tracking systems [7]. For the system to provide an appropriate response, analysis of the viewer’s gaze is required to infer the viewer’s desire for information. Generally, this is accomplished via computation of an interest metric [11,10,4]. Recent approaches characterize interest or boredom via Support Vector Machines [6] or Area Of Interest (AOI) revisitation [5].

In this paper we use the K coefficient to distinguish ambient and focal attention. Ambient attention is characterized by relatively short fixations followed by high amplitude saccades. Conversely, focal attention is described by long fixations followed by saccades of low amplitude [12]. The K coefficient captures the temporal relation between standardized fixation duration and subsequent saccade amplitude. K > 0 indicates focal viewing while K < 0 suggests ambient viewing [9].
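As a minimal sketch of the definition above, the following Python fragment computes a per-fixation value K_i as the standardized fixation duration minus the standardized amplitude of the saccade that follows it, and then averages K_i over a set of fixations. The text here does not spell out the formula, so this formulation, the choice to standardize against a pooled reference sample using the population standard deviation, and all function and parameter names are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def z_scores(values, pool):
    """Standardize `values` against the mean and SD of a reference `pool`."""
    mu, sd = mean(pool), pstdev(pool)  # population SD is an assumption
    if sd == 0:
        raise ValueError("reference pool has zero variance")
    return [(v - mu) / sd for v in values]


def k_per_fixation(durations, amplitudes, pool_durations, pool_amplitudes):
    """Return K_i = z(duration_i) - z(amplitude_i) for each fixation.

    durations[i]  : duration of fixation i
    amplitudes[i] : amplitude of the saccade that follows fixation i
    pool_*        : reference samples used for standardization (hypothetical
                    names; e.g., all fixations and saccades in the data set)
    """
    if len(durations) != len(amplitudes):
        raise ValueError("need one following-saccade amplitude per fixation")
    zd = z_scores(durations, pool_durations)
    za = z_scores(amplitudes, pool_amplitudes)
    return [d - a for d, a in zip(zd, za)]


def mean_k(durations, amplitudes, pool_durations, pool_amplitudes):
    """Average K_i over a set of fixations: > 0 focal, < 0 ambient."""
    return mean(k_per_fixation(durations, amplitudes,
                               pool_durations, pool_amplitudes))
```

Standardizing against a pooled sample rather than against the subset itself matters in this sketch: if a subset were standardized against its own mean and standard deviation, its mean K would be zero by construction and could not separate focal from ambient viewing.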
Attention becoming more focal over time, or oscillating between focal and ambient modes, could indicate changing cognitive load corresponding to stimulus complexity. The K coefficient can thus potentially act as a contextual cue which could be exploited by recommender systems, e.g., do not interrupt the user when in ambient search mode, or oscillating between ambient and focal search.

We demonstrate the utility of K with an experiment comparing visual search behavior over different geographic representations of two cities: Amsterdam and Barcelona.

ET4S 2014, September 23, 2014, Vienna, Austria. Copyright © 2014 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.

[Figure 1: (a) Apparatus and setting. (b) Satellite rendering. (c) Map rendering.]
Fig. 1. Apparatus and stimuli (Barcelona) using Google’s hybrid and roadmap rendering.

2 Methodology

The experiment consisted of localizing map objects (features), arguably the most fundamental map-reading task [1]. Localization, or visual search, i.e., finding an object of interest among others, is a fundamental map use task, because regardless of the map type or the final goal of the map reader, an object or area of interest must be found before it can be studied further. Geographic task taxonomies widely acknowledge this task among the most basic and common [8,2].

The localization tasks were carried out when viewing a city representation in either (cartographic) map or satellite view (Google’s roadmap or hybrid rendering; see Fig. 1 and text below for technical details). Unlike previous work on route learning [3], we hypothesized there would be an effect of map type on visual search, with localization (search) taking longer in satellite view than in map view.

Participants.
Sixty-three (63) university students took part in the study, with 7 excluded due to technical and procedural problems (e.g., poor calibration). The final sample included 56 participants (19 M, 37 F, ages M = 24.62, SD = 4.01).

Apparatus. All stimuli were presented on a computer monitor (22″ LCD, 1680×1050 resolution, 60 Hz refresh rate) connected to a standard PC. Eye movements were recorded at 250 Hz with an SMI RED 250 eye tracking system.

Stimulus. Map images were created using the Google Maps™ JavaScript API v3. Maps were rendered (see Fig. 1) by disabling visual user interface controls for navigation, scale, rotate, pan, and zoom, and limiting the number of Points Of Interest (POIs) to two. Specifically, the Barcelona maps (using Google’s latitude/longitude coordinates: 41.375384, 2.141004) displayed only the “park” and “sports complex” POIs at zoom level 17 with the transit layer turned on. The Amsterdam maps (lat./lon.: 52.3614777, 4.8837351) were also displayed at zoom level 17 with two POIs (“attraction” and “business”) and the transit layer turned on. All maps were rendered to 1280×1024 resolution, then screen-captured and cropped to the same dimensions.

Procedure. Participants were randomly assigned to one of two experimental conditions, map (N = 29) or satellite view (N = 27). Prior to viewing the map, the eye tracking system was calibrated to each individual. Following calibration, participants carried out the localization task (after having located a start point). For the city of Barcelona (see Fig. 1), participants were asked to find the intersection of Carrer de Vilardell and Carrer d’Hostafrancs de Sió, indicating that they had located it by dwelling on it for 3 seconds. For Amsterdam, they were told to find the Rijksmuseum.

3 Results & Discussion

Participants’ Familiarity with Presented Areas and Google Maps.
Questionnaire answers showed that participants were not very familiar with the areas of Barcelona or Amsterdam presented to them during the experiment. The percentage of participants who had visited either city at least once in their lives was 39% for Amsterdam and 25% for Barcelona. Most of these had visited the city only once (32.1% for Amsterdam and 19.6% for Barcelona). Google Maps was fairly popular among participants, with only 7.1% claiming they had never used the service. Table 1 gives the detailed distribution of answers to the question “How often do you use Google Maps?”

The performance (time to completion) and process (visual attention) measures we report from the experiment therefore speak to more or less typical use of Google Maps. The importance of our findings lies in their potential indication of online map use when encountering a new location (e.g., as one would do prior to travel to the destination) by users already familiar with the service.

Localization Task Performance. All subjects successfully completed the tasks. To gauge localization performance, we first considered basic eye movement metrics, using a linear mixed model (LMM) analysis with task duration as the dependent variable, experimental condition (map vs. satellite) as the between-subjects fixed factor, and city (Amsterdam vs. Barcelona) as the within-subject fixed factor. As expected, analysis revealed a significant main effect of experimental condition, F(2, 54) = 26.26, p < 0.01, indicating that task completion time was shorter for the map view (M = 37.33 s, SE = 41.63) than for the satellite view (M = 47.36 s, SE = 55.27). Unexpectedly, however, the city appeared to be a moderating factor, resulting in a significant interaction term, F(1, 54) = 9.15, p < 0.01. Investigation of contrasts related to this interaction effect revealed that the difference between map and satellite views was significant only for Barcelona.
For Amsterdam, no significant difference between completion times for the two view types was found. Analysis also revealed a main effect of city, F(1, 54) = 31.68, p < 0.01, meaning that, overall, time to localization on either of the Barcelona maps (map or satellite views) took significantly longer (M = 61.7 s, SE = 1.9) than on the Amsterdam maps (M = 25.44 s, SE = 1.5). The studied sections of the two cities show strong differences in spatial organization, that is, the way the buildings and other features are organized in relation to each other. In the section from the Barcelona map there is more regularity (more rows of buildings similar to each other), making the visual search potentially harder. This is possibly one of the reasons why we found a strong difference between the cities, and one that is also supported by the literature [13].

Answer               Percent
Everyday               7.1%
2-3 times a week      32.1%
Once a week           12.5%
2-3 times a month     19.6%
Once a month          21.4%
Never                  7.1%

Table 1. Answers to the questionnaire question: “How often do you use Google Maps?” (N = 56)

[Figure 2: (a) Amsterdam. (b) Barcelona. Each panel plots the Ambient-Focal K Coefficient against Time sequence (1 to 5) for the Roadmap and Satellite conditions.]
Fig. 2. Dynamics of K coefficient over time between viewing conditions (whiskers ±1SE).

Temporal Dynamics of Visual Attention. To delve deeper into the visual search process of viewers over each of the city maps, we performed a similar LMM analysis but with K as the dependent variable, with the same independent variables of map view and city as before. Because we believe plotting K over time depicts the dynamic dilation and constriction of the viewer’s field of view during visual search, we used time as a fixed within-subject factor, discretized into 5 equal periods (see Fig. 2).
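The discretization of time described above can be sketched as follows. This fragment assumes that "5 equal periods" means five equal-duration slices of a single trial, measured from the first to the last fixation onset; the text does not say whether periods were equal in duration or in number of fixations, so that reading, the function name `k_by_period`, and its parameters are hypothetical.

```python
from statistics import mean


def k_by_period(onsets, k_values, n_periods=5):
    """Average per-fixation K_i within equal-duration slices of one trial.

    onsets[i]   : onset time of fixation i (any consistent unit, e.g., ms)
    k_values[i] : K_i for fixation i
    Returns a list of n_periods means; a slice with no fixations yields None.
    """
    if len(onsets) != len(k_values):
        raise ValueError("need one K value per fixation onset")
    if not onsets:
        return [None] * n_periods

    start, span = min(onsets), max(onsets) - min(onsets)
    bins = [[] for _ in range(n_periods)]
    for t, k in zip(onsets, k_values):
        if span == 0:
            idx = 0  # degenerate trial: all fixations share one onset
        else:
            # Scale elapsed time to [0, n_periods]; clamp the final onset,
            # which would otherwise fall just past the last slice.
            idx = min(int((t - start) * n_periods / span), n_periods - 1)
        bins[idx].append(k)
    return [mean(b) if b else None for b in bins]
```

Under these assumptions, a sequence of per-period means that rises from negative to positive values would correspond to attention becoming more focal over the course of the search.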
Considering the city as a moderating factor, analysis of K showed a significant interaction among city, map type, and time, F(4, 9910) = 2.44, p < 0.05. Temporal ambient/focal attention patterns are influenced by map type (satellite vs. cartographic) and city.

When working with the Barcelona maps, attention generally becomes increasingly focal over time, especially over the satellite view, as expected (see Fig. 2(b)). Over the cartographic map view, the pattern is more sinusoidal, eventually becoming increasingly focal over the last two time segments. When viewing the Amsterdam maps, however, the search pattern is less predictable (see Fig. 2(a)).

4 Conclusion

Analysis of K strongly suggests different cognitive strategies involved in visual search depending on the stimulus and/or task difficulty. Faster search over a less complex stimulus may afford ambient search sooner, longer, or with greater ambient/focal oscillation, possibly indicating boredom. A more complex stimulus may require increasingly focal attention later on in the search, possibly indicating greater interest.

References

1. Boér, A., Çöltekin, A., Clarke, K.C.: An Evaluation of Web-based Geovisualizations for Different Levels of Abstraction and Realism – What do users predict? In: International Cartographic Conference. pp. 209–220. Dresden, Germany (2013)
2. Carter, J.R.: The many dimensions of map use. In: Proceedings of the International Cartographic Conference (2005), http://www.cartesia.org/geodoc/icc2005/pdf/oral/TEMA12/Session3/JAMESCARTER.pdf
3. Francelet, R.: Realism and Individual Differences in Route-Learning. Master’s thesis, University of Zurich (2014)
4. Hammer, J.H., Maurus, M., Beyerer, J.: Real-time 3d gaze analysis in mobile applications. In: Proceedings of the 2013 Conference on Eye Tracking South Africa. pp. 75–78. ETSA ’13, ACM, New York, NY (2013), http://doi.acm.org/10.1145/2509315.2509333
5.
Kiefer, P., Giannopoulos, I., Kremer, D., Schlieder, C., Raubal, M.: Starting to get bored: An outdoor eye tracking study of tourists exploring a city panorama. In: Proceedings of the Symposium on Eye Tracking Research and Applications. pp. 315–318. ETRA ’14, ACM, New York, NY (2014), http://doi.acm.org/10.1145/2578153.2578216
6. Kiefer, P., Giannopoulos, I., Raubal, M.: Using eye movements to recognize activities on cartographic maps. In: Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. pp. 488–491. SIGSPATIAL’13, ACM, New York, NY, USA (2013), http://doi.acm.org/10.1145/2525314.2525467
7. Kiefer, P., Straub, F., Raubal, M.: Towards location-aware mobile eye tracking. In: Proceedings of the Symposium on Eye Tracking Research and Applications. pp. 313–316. ETRA ’12, ACM, New York, NY (2012), http://doi.acm.org/10.1145/2168556.2168624
8. Knapp, L.: A Task Analysis Approach to the Visualization of Geographic Data. In: Nyerges, T.L., Mark, D.M., Laurini, R., Egenhofer, M.J. (eds.) Cognitive Aspects of Human Computer Interaction for Geographic Information Systems, chap. 7, pp. 355–371. Springer (1995)
9. Krejtz, I., Szarkowska, A., Krejtz, K., Walczak, K., Duchowski, A.T.: Audio Description as an Aural Guide of Children’s Visual Attention: Evidence from an Eye-Tracking Study. In: Proceedings of the 2012 Symposium on Eye-Tracking Research and Applications. ETRA ’12, ACM, New York, NY (March 28-30 2012)
10. Qvarfordt, P., Zhai, S.: Conversing with the user based on eye-gaze patterns. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. pp. 221–230. CHI ’05, ACM, New York, NY, USA (2005), http://doi.acm.org/10.1145/1054972.1055004
11. Starker, I., Bolt, R.A.: A gaze-responsive self-disclosing display. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. pp. 3–10. CHI ’90, ACM, New York, NY, USA (1990), http://doi.acm.org/10.1145/97243.97245
12.
Velichkovsky, B.M., Joos, M., Helmert, J.R., Pannasch, S.: Two Visual Systems and Their Eye Movements: Evidence from Static and Dynamic Scene Perception. In: CogSci 2005: Proceedings of the XXVII Conference of the Cognitive Science Society. pp. 2283–2288. Stresa, Italy (July 2005)
13. Ware, C.: Chapter 2: What We Can Easily See. In: Visual Thinking for Design, pp. 23–42. Morgan Kaufmann (2008)