Visualization of Gaze Tracking Data for UX Testing on the Web

Róbert Móro, Jakub Daráž, Mária Bieliková
Slovak University of Technology in Bratislava
Faculty of Informatics and Information Technologies
Ilkovičova 2, 842 16 Bratislava, Slovakia
robert.moro@stuba.sk, xdarazj@stuba.sk, maria.bielikova@stuba.sk

ABSTRACT
Visualizations on the Web can help users to understand complex concepts, such as when too many objects of possible interest are present. For the purpose of evaluating their usability, gaze tracking data represent a valuable source of information. These data are themselves complex, time-varying and large in quantity, thus posing challenges for their manipulation and visualization. We propose an infrastructure for the collection and visualization of gaze tracking data from dynamic Web applications. Its main purpose is to support researchers in UX (user experience) testing of their proposed interfaces (and visualizations). In the paper, we provide a user study on the usability of the infrastructure and compare it to existing solutions.

Categories and Subject Descriptors
H.1.2 [Models and Principles]: User/Machine Systems—human factors; H.5.2 [Information Interfaces and Presentation]: User Interfaces—evaluation/methodology, user-centered design

General Terms
Design, Experimentation, Human Factors

Keywords
gaze tracking, infrastructure, visualization, UX testing, areas of interest, web

1. INTRODUCTION
For a picture (a visualization) to be worth a thousand words, it has to have a clear message that is easily understandable by the users (receivers of the message).
However, visualizations nowadays are usually not just static pictures; they often require complex interaction with the interface elements, such as filtering values, selecting ranges (e.g. time, price, etc.), zooming or navigation. In addition, this interaction is in many cases carried out in the Web environment with dynamically generated, or even streamed, content. Evaluating a proposed visualization, its usability and the overall user experience (UX) is, therefore, not an easy task.

There are many questions that can be of importance during evaluation, such as: How much time do the users spend looking at the visualization and how much time interacting with the interface? In what order do they receive the information? Do they read the accompanying text? Does the pattern change when we change a particular element (its position, design, etc.)? In order to answer these questions, it is not enough to rely on indirect or implicit forms of feedback, such as the position of a mouse cursor, clicks or scrolling. We need to evaluate what the users are actually looking at.

For this purpose, we can utilize gaze tracking technology, which is becoming more affordable for researchers and ordinary users alike. Existing solutions have, however, often only limited support for Web-based dynamic applications. In this paper, we propose an infrastructure for gaze tracking data collection and visualization focusing on the Web environment. We developed a prototype that can transparently work with gaze tracking devices from various manufacturers and supports multiple browsers. We provide an empirical evaluation of the proposed infrastructure and its visualization capabilities for UX testing and compare it to some of the existing solutions.

2. RELATED WORK
Eye tracking has been applied in many user studies in recent years. With the lowering price and increasing availability of low-end models, it is becoming possible to have eye-trackers not only in UX laboratories, but also in end-users' notebooks. This opens up new possibilities for types of interaction and adaptation, i.e. personalization of applications to their users, as noted in [1]. The authors verified that adapting the displayed ads on a website based on gaze data resulted in a significant increase of users' attention.

Adaptation of visualization based on gaze data was proposed in [5]. The authors compared two types of visualization, namely bar chart and radar graph, on fourteen tasks of differing type and complexity. In addition, the participants' personal traits (cognitive abilities), such as perceptual speed or visual working memory, have been tested. They were able to correctly classify the task's type and complexity and the users' cognitive ability based on the gaze data and selected areas of interest, thus showing that there are distinct differences in patterns and interaction styles worth adapting to.

Individual differences in gaze patterns and behaviours were observed in [2] as well. Eye tracking has also been utilized in a user study on the visualization of a faceted interface [3]. The authors were interested in finding out whether the users use facets only because they are shown to them. Therefore, they automatically hid (collapsed) them. Using the eye-tracker they verified that the faceted interface was used heavily in both cases (when visible as well as when hidden) with no significant difference in gaze patterns.

In order to effectively evaluate areas of interest, we need to be able to track them throughout dynamically changing content. An algorithm for this purpose was proposed in [4], focusing on tracking in video content. However, tracking areas of interest on the Web usually requires a different approach, as the content can change completely although it is still the same element (area of interest). To our knowledge, this is still largely unsupported by existing eye-tracking software.

The Eye Tribe (https://theeyetribe.com/), which promises a cheap tracker, comes with no software, only with an API for developers. On the other hand, Tobii Technologies (http://www.tobii.com/) offers Tobii Studio, which comes with full support for planning user studies, tracking, visualization and evaluation. However, it works only with Internet Explorer, and areas of interest can be added only as static rectangles or polygons, which is unusable with dynamically changing Web content. The Nyan 2.0 solution (http://www.eyegaze.com/eyegaze-analysis-software/) by Eye Gaze, LC Technologies seems to have the best support for Web 2.0. It can recognize different overlays and also visualize Web navigation paths. Areas of interest are, however, still defined as polygons. In addition, most of the existing solutions try to roll out the Web pages to account for scrolling. This is, however, not enough for many modern applications, which can have different elements with their own scrollbars (e.g. Facebook with its chat, activity stream, etc.).

Another problem with existing solutions is support for only one tracking device, i.e. multiple users cannot be tracked at the same time, with the exception of the Eyeworks software by Eyetracking (http://www.eyetracking.com/) when combined with their Quad server solution. Even so, the existing solutions for gaze data collection and visualization are developed by the eye-trackers' manufacturers and are therefore tied to one particular eye tracker brand and cannot be extended to work with devices from other manufacturers.

3. INFRASTRUCTURE FOR GAZE TRACKING
In order to address the problems of existing solutions discussed in the previous section, we propose an infrastructure for gaze tracking focusing on dynamic Web applications. Its conceptual design can be seen in Figure 1. It consists of three main components, namely the Gaze Monitor, the Web Browser AOI Logger and the Gaze Presenter, which in turn comprises the Gaze Admin, the Gaze Visualizer and the Gaze API.

[Figure 1: Conceptual design of the proposed architecture. The eye-tracker supplies gaze data to the participant's Gaze Monitor, which stores gaze tracking data; the Web Browser AOI Logger enriches the data with AOIs and lets the researcher define areas of interest; the Gaze Presenter stores AOIs (Gaze Admin), analyses gaze tracking data (Gaze Visualizer) and lets 3rd-party applications retrieve gaze tracking data for personalization (Gaze API).]

Researchers define the areas of interest (AOI) using the Web Browser AOI Logger; the AOIs are then stored on the server. Researchers can set up the whole experiment using the Gaze Admin, which is a part of the Gaze Presenter component.

Then, the participants can connect using the Gaze Monitor, which communicates in the background with the eye-tracker, collects the gaze tracking data and sends them to the Web Browser AOI Logger for enrichment. The data are enriched with the XPath (http://www.w3.org/TR/xpath/) of the element the user (i.e. participant) is looking at, based on the coordinates supplied by the eye-tracker. The URL of the current website is added as well. The enriched data are sent by the Gaze Monitor at specified time intervals to the Gaze Presenter for persistent storage.
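To make the enrichment step concrete, the following TypeScript sketch shows how a normalized gaze sample could be mapped to the DOM element under it and annotated with that element's XPath and the page URL. It is a minimal illustration under our own assumptions: the record fields, the coordinate mapping and the helper names are not the prototype's actual code, which also has to account for the browser chrome offset and scrolling.

```typescript
// Illustrative types for a raw gaze sample and the enriched record
// (field names are assumptions, not the prototype's actual schema).
interface GazeSample {
  timestamp: number; // ms since session start
  x: number;         // normalized coordinates in [0, 1]
  y: number;
}

interface EnrichedGazeRecord extends GazeSample {
  url: string;   // URL of the currently displayed page
  xpath: string; // XPath of the element under the gaze point
}

// Builds an absolute XPath such as /html/body/div[2]/p[1] for an element.
function xpathOf(el: Element): string {
  if (el === document.documentElement) return "/html";
  const sameTagSiblings = Array.from(el.parentElement!.children)
    .filter(sibling => sibling.tagName === el.tagName);
  const index = sameTagSiblings.indexOf(el) + 1; // XPath indices are 1-based
  return `${xpathOf(el.parentElement!)}/${el.tagName.toLowerCase()}[${index}]`;
}

// Maps a normalized sample to viewport coordinates, finds the element
// under the gaze point and attaches its XPath and the current URL.
function enrich(sample: GazeSample): EnrichedGazeRecord | null {
  const cx = sample.x * window.innerWidth;
  const cy = sample.y * window.innerHeight;
  const el = document.elementFromPoint(cx, cy);
  if (el === null) return null; // gaze point is outside the page
  return { ...sample, url: location.href, xpath: xpathOf(el) };
}
```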
The stored data can then be viewed and analysed by the researchers using the Gaze Visualizer component. The data can also be retrieved using the provided Gaze API and then manipulated by third-party applications.

The individual components are further described in the following sections.

3.1 Gaze Monitor
The Gaze Monitor connects to an eye-tracking device to receive gaze data from it. In order to transparently support devices from various manufacturers, we have implemented our own library that serves as a façade to the actual eye-tracker's API. Currently, we support devices from two manufacturers, namely Tobii Technologies and The Eye Tribe. In addition, we provide our own gaze data simulator that enables developers and researchers to develop applications for the eye-tracker without having one; gaze is simulated by the position and movement of the mouse cursor. Because it uses our provided library, the applications developed and tested with the help of the simulator can consume the simulated gaze tracking data as if they came from an actual eye-tracking device (i.e. using the same API calls).

The Gaze Monitor stores gaze data from the tracker in a queue. It communicates with our browser extension, the Web Browser AOI Logger: it sends it the queued data and receives the enriched data back. These are sent to the server at specified time intervals.
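Such a façade can be as small as a common tracker interface that each vendor-specific adapter and the simulator implement, so that consumers subscribe to samples without knowing which device produces them. The TypeScript sketch below, with the mouse-based simulator as the example implementation, is our illustration only; the interface and class names are assumptions, not the library's actual API.

```typescript
// Raw gaze sample, as in the previous sketch.
interface GazeSample { timestamp: number; x: number; y: number; }

// Vendor-agnostic façade: a Tobii or The Eye Tribe adapter would
// implement the same interface on top of the vendor's own API.
interface GazeTracker {
  start(): void;
  stop(): void;
  onSample(handler: (sample: GazeSample) => void): void;
}

// Simulator that derives "gaze" from the mouse position, so applications
// can be developed and tested without a physical eye-tracker.
class MouseSimulatorTracker implements GazeTracker {
  private handlers: Array<(sample: GazeSample) => void> = [];

  private readonly listener = (e: MouseEvent) => {
    const sample: GazeSample = {
      timestamp: performance.now(),
      x: e.clientX / window.innerWidth,  // normalize to [0, 1]
      y: e.clientY / window.innerHeight,
    };
    this.handlers.forEach(handler => handler(sample));
  };

  start(): void { window.addEventListener("mousemove", this.listener); }
  stop(): void { window.removeEventListener("mousemove", this.listener); }
  onSample(handler: (sample: GazeSample) => void): void {
    this.handlers.push(handler);
  }
}
```

Because consuming code only sees the common interface, swapping the simulator for a physical device requires no changes on the application side.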
3.2 Web Browser AOI Logger
The Web Browser AOI (Area of Interest) Logger is realized as an extension to the web browser. Its main functionality is to enrich the data from the Gaze Monitor. Currently, we support both the Google Chrome and the Mozilla Firefox browser. The gaze tracker data contain normalized coordinates, which are recalculated in order to identify the specific HTML element of the displayed Web page. The element is identified by its unique XPath. The extension is also used to define areas of interest on the Web page, which is described in more detail in Section 4.1.

3.3 Gaze Presenter
The data sent from the Gaze Monitor are collected by the provided server application, i.e. the Gaze Presenter. It enables data collection from multiple connected users at once. We use two databases for storing the data: a SQL database for storing the information about experiments (projects, sessions, users, areas of interest) and the NoSQL document-based database RavenDB for storing the enriched gaze tracking data in JSON format. One of the considerations when choosing the data storage was the velocity of the incoming data; the eye-tracker's frequency is (depending on the actual model) at least 30 Hz, meaning approximately 100,000 new data records per each hour's worth of tracking (30 samples per second over 3,600 seconds yields 108,000 records).

The collected data can be accessed and visualized by the users using the provided Web interface. In addition, we provide an API for third-party applications that can consume the collected data (i.e. which users are looking at which elements at what time) and, e.g., adapt (personalize) the visualized information based on the users' gaze, i.e. what they are (not) looking at. Thus, gaze tracking can be used not only for the purpose of evaluating the interface (visualization), but can also be considered a form of implicit user feedback. This way, it can help to model the interests of the users more precisely.
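As an example of such third-party use, a client could periodically ask the Gaze API for the latest fixations of a session and visually mark areas the user has not looked at. The endpoint, query parameters and response shape in the sketch below are purely hypothetical; it only illustrates the intended style of consumption.

```typescript
// Hypothetical shape of a fixation record returned by the Gaze API.
interface Fixation {
  userId: string;
  xpath: string;      // area of interest the fixation falls into
  start: number;      // ms since session start
  durationMs: number;
}

// Fetches recent fixations; the route and the parameter are assumptions.
async function fetchRecentFixations(
  baseUrl: string,
  sessionId: string
): Promise<Fixation[]> {
  const res = await fetch(`${baseUrl}/sessions/${sessionId}/fixations?lastMs=5000`);
  if (!res.ok) throw new Error(`Gaze API returned HTTP ${res.status}`);
  return res.json();
}

// Example adaptation: mark areas of interest the user has not looked at
// recently. AOI elements are assumed to carry their registered XPath in
// a data attribute (data-aoi-xpath).
async function markUnseenAois(baseUrl: string, sessionId: string): Promise<void> {
  const fixations = await fetchRecentFixations(baseUrl, sessionId);
  const seen = new Set(fixations.map(f => f.xpath));
  document.querySelectorAll<HTMLElement>("[data-aoi-xpath]").forEach(el => {
    el.classList.toggle("unseen", !seen.has(el.dataset.aoiXpath!));
  });
}
```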
4. VISUALIZATION OF GAZE TRACKING DATA
Visualization of gaze tracking data is crucial for its understanding and for its usage in the evaluation of user interfaces. The complexity lies in the data's velocity, multidimensionality and time variability. We can significantly reduce the computational requirements when we include only data for specific areas of the tested Web page that are of interest to us (so-called areas of interest). Thus, instead of computing fixations for all the elements, we can do it for a handful defined by the user.

4.1 Definition of Areas of Interest
We enable users to define areas of interest (AOI) using our browser extension. After activation, the elements in the Web page are highlighted upon mouse hover (see Figure 2). After the highlighted element is clicked on, a pop-up appears in which it is possible to customize the selected area (name it, describe it) or to change the XPath (see Figure 3) to suit the user's specific needs.

[Figure 2: HTML elements highlighted during definition of areas of interest. Green ones have already been added (note that every snippet is a part of an area of interest definition); the orange one is highlighted upon mouse hover and can be selected by a mouse click.]

[Figure 3: XPath string. It can be customized by deselection of the specific path's elements (in gray).]

It is, thus, possible to choose not only the actual clicked element, but e.g. every paragraph with the same parent, every element with the same class, etc.; for example, relaxing /html/body/div[2]/p[3] to /html/body/div[2]/p selects every paragraph with the same parent. This can be used to advantage for dynamically generated Web pages, where we do not know the exact element's path, but can identify it by its relative position within the HTML DOM structure or by its other attributes. It also enables users to include in an area of interest elements that are generated on the fly and are therefore not present at the time of the area of interest definition, but share the same attribute value.

4.2 Visualization of Metrics
The eye-tracker tracks the position and movements of each eye separately; however, we are interested in what a user is looking at (which is rarely two things at once). Therefore, we calculate the gaze position as an average of the two eyes' coordinates. In addition, the tracker is not always precise and the gaze can seem to oscillate around a specific point when the user is actually looking at the same point the whole time. We use several smoothing techniques to account for this, especially a moving average technique that averages N consecutive gaze coordinates from a moving window. The users (researchers) can also specify a minimal time threshold for a fixation, i.e. for how long (e.g. 500 ms, 1 s, etc.) the user has to look at the area for it to count as a fixation. This way, we can filter out events where the user moved their gaze through the element without actually fixating on it.
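A minimal sketch of these two cleaning steps, assuming equally spaced samples: a moving average over a window of N consecutive coordinates, followed by collapsing consecutive samples that hit the same element into a fixation only if the dwell time reaches the configured threshold. Types and names are illustrative, not the prototype's actual code.

```typescript
// A gaze point after averaging the two eyes' coordinates; t is in ms.
interface GazePoint { t: number; x: number; y: number; }

// Moving average over up to n consecutive samples ending at each point.
function smooth(points: GazePoint[], n: number): GazePoint[] {
  return points.map((p, i) => {
    const win = points.slice(Math.max(0, i - n + 1), i + 1);
    return {
      t: p.t,
      x: win.reduce((sum, q) => sum + q.x, 0) / win.length,
      y: win.reduce((sum, q) => sum + q.y, 0) / win.length,
    };
  });
}

// A sample already mapped to an element, and the resulting fixation.
interface LabeledPoint extends GazePoint { xpath: string; }
interface Fixation { xpath: string; start: number; durationMs: number; }

// Collapses runs of consecutive samples on the same element into a
// fixation, keeping only runs at least minMs long (e.g. 500 ms);
// shorter runs are treated as the gaze merely passing through.
function detectFixations(points: LabeledPoint[], minMs: number): Fixation[] {
  const result: Fixation[] = [];
  let runStart = 0;
  for (let i = 1; i <= points.length; i++) {
    if (i === points.length || points[i].xpath !== points[runStart].xpath) {
      const durationMs = points[i - 1].t - points[runStart].t;
      if (durationMs >= minMs) {
        result.push({
          xpath: points[runStart].xpath,
          start: points[runStart].t,
          durationMs,
        });
      }
      runStart = i;
    }
  }
  return result;
}
```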
The cleaned data can be accessed and visualized by the users using the provided Gaze Visualizer component. Currently, we support the following metrics:

• Number of fixations - the number of times the users looked at the specified areas of interest during the whole session

• Dwell time - similar to the first metric, but instead of the number of times the users' gaze entered the areas of interest, it aggregates the time spent (how long the users looked at the areas of interest during the whole session)

• Fixations in time - how the fixation count changed over the time of the experimental session (see Figure 4)

[Figure 4: Visualization of fixations in time for the selected user and area of interest.]

The users can aggregate and compare the data from multiple sessions and users and for multiple areas of interest using the provided filtering options. The data are shown in a tabular view as well as visualized in the form of charts using the D3.js (http://d3js.org/) library. The charts can be exported and saved to disk.
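The first two metrics reduce to simple aggregations over the detected fixations, and the third to bucketing fixation counts into time bins. A hedged sketch, reusing the Fixation type from the previous example, could look as follows:

```typescript
interface Fixation { xpath: string; start: number; durationMs: number; }

// Number of fixations and dwell time per area of interest.
function perAoiMetrics(
  fixations: Fixation[]
): Map<string, { count: number; dwellMs: number }> {
  const metrics = new Map<string, { count: number; dwellMs: number }>();
  for (const f of fixations) {
    const m = metrics.get(f.xpath) ?? { count: 0, dwellMs: 0 };
    m.count += 1;
    m.dwellMs += f.durationMs;
    metrics.set(f.xpath, m);
  }
  return metrics;
}

// Fixations in time: counts bucketed into fixed-size bins (binMs wide),
// the series a chart like the one in Figure 4 can be drawn from.
function fixationsInTime(fixations: Fixation[], binMs: number): number[] {
  const bins: number[] = [];
  for (const f of fixations) {
    const bin = Math.floor(f.start / binMs);
    bins[bin] = (bins[bin] ?? 0) + 1;
  }
  return Array.from(bins, count => count ?? 0); // fill empty bins with 0
}
```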
5. EXPERIENCE WITH THE PROPOSED GAZE TRACKING INFRASTRUCTURE AND ITS USABILITY
In order to evaluate our proposed infrastructure, we carried out a user study with four participants. We chose participants who had previous experience with eye tracking in Tobii Studio, so that they could compare the functionality of both systems.

The participants' task was to set up an experiment using our infrastructure, then collect the gaze data and lastly to visualize and evaluate it. At the end, we asked them to fill in a questionnaire evaluating the different features.

The participants rated the provided functionality of defining the areas of interest highly. It was also rated as intuitive and easy to understand (4.25 on average on a five-point Likert scale). However, we observed problems with editing the XPath string, namely that the participants did not intuitively find out that it is customizable. After an explanation of how it works, they appreciated the flexibility. One of the participants suggested that he would be interested in defining not only a single area of interest as a combination of different elements (e.g. each result on a search engine's results page), but also in exploring the differences in gaze patterns of the individual elements within this area of interest group.

The participants found the experiment easy to set up, although in some cases they had problems understanding the difference between a project and its sessions. As for the visualization, it was again rated very positively (4.25 on average), even though we currently provide visualization of only the three metrics. On the other hand, these metrics are among the most often used, as we also verified in the reviewed literature (they were used in practically all of the related works reported in this paper). The participants most missed the possibility of creating heat maps and fixation sequences (how the gaze moves from element to element).

Compared to Tobii Studio, the participants appreciated the flexibility of defining the areas of interest, the support of multiple browsers and multiple concurrent users, as well as the possibility to manually set the preferred minimal length (threshold) of fixations. On the other hand, they lacked audio and video recording and support of data inputs other than gaze, such as mouse clicks (left and right button), scroll events, etc. They would also appreciate the possibility to export the data or to clean it within our application.

Lastly, two participants would use our solution alone and two in combination with others, such as Tobii Studio, mainly because of the lack of audiovisual recording. Overall, we find the feedback positive and encouraging for future development.

6. CONCLUSIONS
In the paper, we proposed an infrastructure for the collection and visualization of gaze data focusing on dynamic Web applications. Our main contributions are:

• visual definition and support of dynamic areas of interest, the content of which, as well as size and position, can change over time

• support of multiple browsers and eye-trackers from different manufacturers by providing a unified and easily extensible API

• collection and automatic evaluation of the gaze data from multiple concurrent devices and users

We realized a prototype of the infrastructure and carried out a user study in order to gain feedback on its functionality and usability. Based on the collected user feedback described in the previous section, we plan to provide heat maps as well as fixation sequence visualization in the future. More importantly, we would like to enhance the data manipulation techniques, such as cleaning the data, selecting time ranges, zooming in and out, etc.

Currently, it is possible to automatically annotate the gaze data based on the fixations within the areas of interest defined by the users. However, the users may wish to add other annotations of different types, either manually or automatically based on a set of predefined rules. These can be in the form of events, e.g. someone entered the room during the study, the participant looked away, the user study moderator provided guidance, etc. These events represent useful metadata that can further explain the collected gaze data and provide new insights. In addition, it would be interesting to segment the data based on these events or to compare the changes in gaze patterns or behaviour (e.g. before and after giving guidance).

In order to support this kind of annotation, we have to solve several visualization issues, namely the visualization of the gaze data stream in real time and the adding of annotations to a single point or a range in the data. An easy-to-understand and intuitive visualization of the associated annotations in the data during evaluation is also an open problem.

In addition, it is very likely that eye-trackers will be a part of end-user devices in the near future. This will allow the usage of gaze data as one of the implicit feedback factors of users' interest. When we combine our provided Gaze API with the events in the form of annotations, it can support new ways of personalized interactions on the Web.

7. ACKNOWLEDGMENTS
This work was partially supported by grants No. APVV 0208-10 and VG1/0971/11 and it was created with the support of the Research and Development Operational Programme for the project "University Science Park of STU Bratislava", ITMS 26240220084, co-funded by the European Regional Development Fund.

We would like to thank our colleagues who participated in the development of the presented prototype, namely Dominika Červeňová, Lukáš Gregorovič, Michal Mészáros, Róbert Kocian, Martin Janík and Kristína Mišíková. We also thank Tobii Technology for kindly providing us with the eye tracker as well as Tobii Studio for evaluation purposes.

8. REFERENCES
[1] F. Alt, A. S. Shirazi, A. Schmidt, and J. Mennenöh. Increasing the user's attention on the web. In Proc. of the 7th Nordic Conf. on Human-Computer Interaction: Making Sense Through Design - NordiCHI '12, pp. 544–553, NY, USA, 2012. ACM Press.
[2] S. T. Dumais, G. Buscher, and E. Cutrell. Individual differences in gaze patterns for web search. In Proc. of the 3rd Symposium on Information Interaction in Context - IIiX '10, pp. 185–194, NY, USA, 2010. ACM Press.
[3] M. Kemman, M. Kleppe, and J. Maarseveen. Eye tracking the use of a collapsible facets panel in a search interface. In Proc. of the 17th Int. Conf. on Theory and Practice of Digital Libraries - TPDL '13, pp. 405–408, Berlin, Heidelberg, 2013. Springer.
[4] F. Papenmeier and M. Huff. DynAOI: a tool for matching eye-movement data with dynamic areas of interest in animations and movies. Behavior Research Methods, 42(1):179–187, Mar. 2010.
[5] B. Steichen, G. Carenini, and C. Conati. User-adaptive information visualization. In Proc. of the 2013 Int. Conf. on Intelligent User Interfaces - IUI '13, pp. 317–328, NY, USA, 2013. ACM Press.