=Paper= {{Paper |id=Vol-1621/paper5 |storemode=property |title= Exploring the Potential Contribution of Mobile Eye-tracking Technology in Enhancing the Museum Visit Experience |pdfUrl=https://ceur-ws.org/Vol-1621/paper5.pdf |volume=Vol-1621 |authors=Moayad Mokatren,Tsvi Kuflik |dblpUrl=https://dblp.org/rec/conf/avi/MokatrenK16 }} == Exploring the Potential Contribution of Mobile Eye-tracking Technology in Enhancing the Museum Visit Experience == https://ceur-ws.org/Vol-1621/paper5.pdf
Exploring the potential contribution of mobile eye-tracking technology in enhancing the museum visit experience

Moayad Mokatren, The University of Haifa, Mount Carmel, Haifa, 31905, +97248288511, moayad.mokatren@gmail.com
Tsvi Kuflik, The University of Haifa, Mount Carmel, Haifa, 31905, +97248288511, tsvikak@is.haifa.ac.il
ABSTRACT
An intelligent mobile museum visitors' guide is a canonical case of a context-aware mobile system. Museum visitors move in the museum, looking for interesting exhibits, and wish to acquire information to deepen their knowledge and satisfy their interests. A smart context-aware mobile guide may provide the visitor with personalized, relevant information from the vast amount of content available at the museum, adapted to his or her personal needs. Earlier studies relied on using sensors for location-awareness and interest detection. This work explores the potential of mobile eye-tracking and vision technology in enhancing the museum visit experience. Our hypothesis is that the use of eye-tracking technology in museums' mobile guides can enhance the visit experience by enabling more intuitive interaction. We report here on satisfactory preliminary results from examining the performance of a mobile eye tracker in a realistic setting – the technology has reached a degree of maturity reliable enough for developing a system based on it.

Author Keywords
Mobile guide; Mobile eye tracking; Personalized information; Smart environment; Context aware service.

ACM Classification Keywords
H.5.2. Input devices and strategies (e.g., mouse, touchscreen)

Copyright © 2016 for this paper by its authors. Copying permitted for private and academic purposes.

1. INTRODUCTION
The museum visit experience has changed over the last two decades. With the progress of technology and the spread of handheld devices, many systems have been developed to support the museum visitor and enhance the museum visit experience. The purpose of such systems was to encourage visitors to use devices that provide multimedia content rather than guidebooks and, as a consequence, to focus on the exhibits instead of flipping through pages in a guidebook, as surveyed in [Ardissono et al. 2012].

Understanding the museum visitors' motivations plays a crucial role in the development and design of systems that support their needs and could enhance their visit experience. Falk and Dierking [2000] and Falk [2009] tried to answer the question of what visitors remember from their visit and what factors seemed to contribute most to visitors' forming of long-term memories: "when people are asked to recall their museum experiences, whether a day or two later or after twenty or thirty years, the most frequently recalled and persistent aspects relate to the physical context: memories of what they saw, what they did, and how they felt about these experiences." Stock et al. [2009] and Dim and Kuflik [2014] explored the potential of novel mobile technology in identifying visitor behavior types in order to consider what/how/when to provide them with relevant services.

A key challenge in using mobile technology for supporting museum visitors is figuring out what they are interested in. This may be achieved by tracking where the visitors are and the time they spend there [Yalowitz and Bronnenkant, 2009]. A more challenging aspect is finding out what exactly they are looking at [Falk and Dierking, 2000]. Given today's mobile devices, we should be able to gain access to information of interest seamlessly, without the need to take pictures or to submit queries and look for results, which are the prevailing interaction methods with our mobile devices. As we move towards "cognition-aware computing" [Bulling and Zander 2014], it becomes clearer that eye-gaze based interaction may play a major role in human-computer interaction before, or until, brain-computer interaction methods become a reality [Bulling et al. 2012]. The study of eye movements started almost 100 years ago. Jacob and Karn [2003] presented a brief history of techniques that were used to detect eye movements; the major works dealt with usability research. One of the important efforts was started in 1947 by Fitts and his colleagues [Fitts et al. 1950], who began using motion picture cameras to study the movements of pilots' eyes as they used cockpit controls and instruments to land an airplane. "It is clear that the concept of using eye tracking to shed light on usability issues has been around since before computer interfaces, as we know them" [Jacob and Karn 2003]. Mobile eye-tracking devices that can detect what someone is looking at and store the data for later use and analysis have been developed and can be found on the market nowadays [Hendrickson et al. 2014]. In recent years, eye tracking and image-based object recognition technology have reached a degree of maturity reliable enough for developing a system based on them, precisely identifying what the user is looking at [Kassner et al. 2014]. We shall refer to this field by reviewing techniques for image matching and extend them for
location-awareness use, and we will follow the approach of "What you look at is what you get" [Jacob 1991].

With the advent of mobile and ubiquitous computing, it is time to explore the potential of this technology for natural, intelligent interaction of users with their smart environment, not only in specific tasks and uses, but for the more ambitious goal of integrating eye tracking into the process of inferring mobile users' interests and preferences, in order to provide them with relevant services and enhance their user models, an area that has received little attention so far. This work aims at exploring the potential of mobile eye-tracking technology in enhancing the museum visit experience by integrating and extending these technologies into a mobile museum visitors' guide system, so as to enable using machine vision for identifying visitors' position and their object of interest in that place, as a trigger for personalized information delivery.

2. BACKGROUND

2.1 Museum visitors and their visit experience
Understanding who visits the museum, their behaviors and the goal of the visit can play an important role in the design of museums' mobile guides (and other technologies) that enhance the visit experience: "the visitors' social context has an impact on their museum visit experience. Knowing the social context may allow a system to provide socially aware services to the visitors." [Bitgood 2002; Falk 2009; Falk and Dierking 2000; Leinhardt and Knutson 2004; Packer and Ballantyne 2005]. Falk [2009] argued that many studies have been done on who visits museums, what visitors do in the museum and what visitors learn from the museum, and tried to understand the whole visitor and the whole visit experience, as well as after the visit. Furthermore, he proposed the idea of visitors' "identity" and identified five distinct, identity-related categories:

• Explorers: Visitors who are curiosity-driven with a generic interest in the content of the museum. They expect to find something that will grab their attention and fuel their learning.
• Facilitators: Visitors who are socially motivated. Their visit is focused primarily on enabling the experience and learning of others in their accompanying social group.
• Professionals/Hobbyists: Visitors who feel a close tie between the museum content and their professional or hobbyist passions. Their visits are typically motivated by a desire to satisfy a specific content-related objective.
• Experience Seekers: Visitors who are motivated to visit because they perceive the museum as an important destination. Their satisfaction primarily derives from the mere fact of having 'been there and done that'.
• Rechargers: Visitors who are primarily seeking to have a contemplative, spiritual and/or restorative experience. They see the museum as a refuge from the work-a-day world or as a confirmation of their religious beliefs.

In addition, he argued that the actual museum visit experience is strongly shaped by the needs of the visitor's identity-related visit motivations, and that the individual's entering motivations create a basic trajectory for the visit, though the specifics of what the visitor actually sees and does are strongly influenced by the factors described by the Contextual Model of Learning:

• Personal Context: The visitor's prior knowledge, experience, and interest.
• Physical Context: The specifics of the exhibitions, programs, objects, and labels they encounter.
• Socio-cultural Context: The within- and between-group interactions that occur while in the museum and the visitor's cultural experiences and values.

Nevertheless, the visitor perceives his or her visit experience to be satisfying if this marriage of perceived identity-related needs and museum affordances proves to be well-matched. Hence, considering the use of technology for supporting visitors and enhancing the museum visit experience, it seems that these aspects need to be addressed by identifying visitors' identity and providing them relevant support.

2.2 Object recognition and image matching
Modern eye trackers usually record video of the scene with a front camera for further analysis [Kassner et al. 2014]. Object recognition is a task within computer vision of finding and identifying objects in an image or video sequence. Humans recognize a multitude of objects in images with little effort, despite the fact that the image of the objects may vary somewhat across viewpoints, in many different sizes and scales, or even when they are translated or rotated. Objects can even be recognized when they are partially obstructed from view. This task is still a challenge for computer vision systems [Pinto et al. 2008]. Many approaches to the task have been implemented over multiple decades. For example, diffusing models for image-to-image matching [Thirion 1998], the parametric correspondence technique [Barrow 1977] and the Adaptive Least Squares Correlation [Gruen 1985] were presented as techniques for image matching. Techniques from [Naphade et al. 1999], [Hampapur et al. 2001] and [Kim et al. 2005] were presented for image sequence matching (video streams). A related field is visual saliency, or saliency detection: "it is the distinct subjective perceptual quality which makes some items in the world stand out from their neighbors and immediately grab our attention." [Laurent 2007]. Goferman et al. [2012] proposed a new type of saliency which aims at detecting the image regions that represent the scene. In our case, we can exploit the use of eye tracking to detect salience in an efficient way, since we have fixation points representing points of interest in a scene.

3. RELATED WORK
As mentioned above, many studies were conducted on detecting eye movements before considering their integration with computer interfaces as we know them today. The studies have centered on HCI and usability, and techniques were presented that can be extended for further eye-tracking studies, not just in the field of HCI. Jacob [1991] presented techniques for local calibration of an eye tracker, which is a procedure of producing a mapping of the eye movements'
measures and wandering-in-the-scene measures. In addition, he presented techniques for fixation recognition with respect to extracting data from a noisy, jittery, error-filled stream, and for addressing the problem of the "Midas touch", where people look at an item without the look "meaning" something. Jacob and Karn [2003] presented a list of promising eye-tracking metrics for data analysis:

• Gaze duration – cumulative duration and average spatial location of a series of consecutive fixations within an area of interest.
• Gaze rate – number of gazes per minute on each area of interest.
• Number of fixations on each area of interest.
• Number of fixations, overall.
• Scan path – sequence of fixations.
• Number of involuntary and number of voluntary fixations (short fixations and long fixations should be well defined in terms of millisecond units).

Using handheld devices as multimedia guidebooks in museums has led to improvement in the museum visit experience. Research has confirmed the hypothesis that a portable computer with an interactive multimedia application has the potential to enhance interpretation and to become a new tool for interpreting museum collections [Evans et al. 2005, Evans et al. 1999, Hsi 2003]. Studies on the integration of multimedia guidebooks with eye tracking have already been made in the context of museums and cultural heritage sites. Museum Guide 2.0 [Toyama et al. 2012] was presented as a framework for delivering multimedia content to museum visitors; it runs on a handheld device and uses the SMI viewX eye tracker and object recognition techniques. The visitor can hear audio information when an exhibit is detected. A user study was conducted in a laboratory setting, but not in a real museum. We plan to extend this work by integrating an eye tracker into a real museum visitors' guide system and experimenting with it in a realistic setting.

Brône et al. [2011] have implemented effective new methods for analyzing gaze data collected with an eye-tracking device and for integrating it with object recognition algorithms. They presented a series of arguments why an object-based approach may provide a significant surplus in terms of analytical precision. Specifically, they discussed solutions for reducing the substantial cost of manual video annotation of gaze behavior, and developed a series of proof-of-concept case studies in different real-world situations, each with its own challenges and requirements. We plan to use their lessons in our study. Pfeiffer et al. [2014] presented "EyeSee3D", where they combined geometric modelling with inexpensive 3D marker tracking to align virtual proxies with the real-world objects. This allowed classifying fixations on objects of interest automatically while supporting free movement of the participant. During the analysis of the accuracy of the pose estimation, they found that the marker detection may fail for several reasons: First, sometimes the participant looked sideways and there simply was no marker within view. More often, however, swift head movements or extreme position changes were causing these issues. Ohm et al. [2014] tried to find where people look when navigating in a large-scale indoor environment, and what objects can assist them in finding their way. They conducted a user study and assessed the visual attraction of objects with an eye tracker. Their findings show that functional landmarks like doors and stairs are most likely to be looked at. In our case we can use these landmarks as reliable points of interest that can be used for finding the location of the visitor in the museum. Beugher et al. [2014] presented a novel method for the automatic analysis of mobile eye-tracking data in natural environments for object recognition. The obtained results were satisfactory for most of the objects; however, a large scale variance results in a lower detection rate (for objects which were looked at both from very far away and from close by).

Schrammel et al. [2011, 2014] studied the attentional behavior of users on the move. They discussed the unique potential and challenges of using eye tracking in mobile settings and demonstrated the ability to use it to study attention to advertising media in two different situations: a digital display in public transport and logos in a pedestrian shopping street, as well as ideas about a general attention model based on eye gaze. Kiefer et al. [2014] also explored the possibility of identifying users' attention by eye tracking in the setting of tourism, namely when a tourist gets bored looking at a city panorama. This scenario may be of specific interest for us, as locations or objects that attracted more or less interest may be used to model the user's interest and trigger further services/information later on. Nakano and Ishii [2010] studied the use of eye gaze as an indicator of user engagement, trying also to adapt it to individual users. Engagement may be used as an indicator of interest, and the ability to adapt engagement detection to individual users may enable us also to infer interest and build/adapt a user model using this information. Furthermore, Ma et al. [2015] demonstrated an initial ability to extract user models based on the eye gaze of users viewing videos. Xu et al. [2008] also used eye gaze to infer user preferences in the content of documents and videos from the users' attention as inferred from gaze analysis (number of fixations on a word/image).

As we have seen, there is a large body of work on monitoring and analyzing users' eye gaze in general, and also in cultural heritage settings. Moreover, the appearance of mobile eye trackers opens up new opportunities for research in mobile scenarios. It has also been demonstrated on several occasions that eye gaze may be useful in enhancing a user model, as it may enable identifying users' attention (and interests). Considering mobile scenarios, when users also carry smartphones equipped with various sensors, implicit user modeling can take place by integrating signals from various sensors, including the new sensor of eye gaze, for better modeling the user and offering better personalized services. So far, sensors like GPS, compass, accelerometers and voice detectors were used in modeling users' context and interests
(see for instance [Dim & Kuflik 2014]). When we mention mobile scenarios, we refer to a large variety of different scenarios – a pedestrian scenario differs from a jogging, shopping or cultural heritage scenario. The tasks are different and users' attention is split differently. The cultural heritage domain is an example where users have long-term interests that can be modeled, and the model can be used and updated during a museum visit with information collected implicitly from various sensors, including eye gaze. In this sense, the proposed research extends and aims at generalizing the work of Kardan and Conati [2013]. Still, even though a lot of research effort was invested in monitoring, analyzing and using eye gaze for inferring user interests, so far little research attention was paid to users' gazing behavior "on the go". This scenario poses major challenges, as it involves splitting attention between several tasks at the same time – avoiding obstacles, gathering information and paying attention to whatever seems relevant, for many reasons.

4. RESEARCH GOAL AND QUESTIONS
Our goal is to examine the potential of integrating eye-tracking technology with a mobile guide for a museum visit, and to try to answer the question: How can the use of a mobile eye tracker enhance the museum visit experience? Our focus will be on developing a technique for location awareness based on eye-gaze detection and image matching, and on integrating it with a mobile museum visitor's guide that provides multimedia content to the visitor. For that, we will design and develop a system that runs on a handheld device and uses the Pupil Dev [Kassner et al. 2014] eye tracker for identifying objects of interest and delivering multimedia content to the visitor in the museum. Then we will evaluate the system in a user study in a real museum, to find out how the use of an eye tracker integrated with a multimedia guide can enhance the museum visit experience. In our study, we have to consider different factors and constraints that may affect the performance of the system, such as the real environment's lighting conditions, which are different from laboratory conditions and can greatly affect the process of object recognition. Another aspect may be the position of the exhibits relative to the eye-tracker wearer, since the eye-tracker device is head-mounted and this is constrained by the museum layout. While having many potential benefits, a mobile guide can also have some disadvantages [Lanir et al. 2013]. It may focus the visitor's attention on the mobile device rather than on the museum artifacts [Grinter et al. 2002]. We will also examine this behavior and try to review whether the use of an eye tracker in a mobile guide can increase the looking time at the exhibits. In addition, we will try to build a system that runs in various real environments with different factors and has the same constraints, such as the light and position constraints.

5. TOOLS AND METHODS
A commercial mobile eye tracker will be integrated into a mobile museum visitors' guide system as a tool for location awareness, interest detection and focus of attention, by using computer vision techniques. Our hypothesis is that the use of the eye tracker in mobile guides can enhance the visit experience. The system will be evaluated in user studies; the participants will be students from the University of Haifa. The study will be conducted in the Hecht museum (http://mushecht.haifa.ac.il/), which is a small museum, located at the University of Haifa, that has both archeological and art collections. The study will include an orientation about using the eye tracker and the mobile guide, then taking a tour with the eye tracker and the handheld device; multimedia content will be delivered by showing information on the screen or by listening to audio through earphones. Data will be collected as follows: the students will be interviewed and asked about their visit experience, and will be asked to fill in questionnaires with general questions, such as whether it is the first time that they have visited the museum, their gender and age, and more. Visit logs will be collected and analyzed for later use; from them we can come to conclusions about the exhibits' importance and where the visitors tend to look, the positioning of the exhibits, and the time of the visits or explorations. The study will compare the visit experience when using two different system versions – a conventional one and one with an integrated eye tracker. As a comparison system for examining the user experience we will choose the work of [Kuflik et al. 2012], which was conducted in the Hecht museum and which uses "light-weight" proximity-based indoor positioning sensors for location awareness.

6. PRELIMINARY RESULTS
It was important to examine the accuracy of eye-gaze detection when using the Pupil Dev mobile eye-tracker device. For that, we have conducted several small-scale user studies onsite.

6.1 The Pupil eye tracker
The Pupil eye tracker [Kassner et al. 2014] is an open source platform for pervasive eye tracking and gaze-based interaction. It comprises a light-weight eye-tracking headset that includes high-resolution scene and eye cameras, an open source software framework for mobile eye tracking, as well as a graphical user interface to play back and visualize video and gaze data. The software and GUI are platform-independent and include algorithms for real-time pupil detection and tracking, calibration, and accurate gaze estimation. Results of a performance evaluation show that Pupil can provide an average gaze estimation accuracy of 0.6 degrees of visual angle (0.08 degree precision) with a processing pipeline latency of only 0.045 seconds.

Figure 1. Pupil eye-tracker (http://pupil-labs.com/pupil)
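The quoted angular accuracy can be translated into an on-target error at a given viewing distance by simple trigonometry. The sketch below is illustrative only; the function name is ours, and the 2 m distance is the one used in user study 1:

```python
import math

def gaze_error_cm(accuracy_deg: float, distance_cm: float) -> float:
    """On-target error (cm) implied by an angular accuracy at a viewing distance."""
    return distance_cm * math.tan(math.radians(accuracy_deg))

# Pupil's reported 0.6 degree average accuracy, at a 2 m viewing distance:
print(round(gaze_error_cm(0.6, 200), 1))  # → 2.1
```

At 2 m, 0.6 degrees corresponds to roughly 2 cm on the wall, so the larger errors observed in practice (see user study 1) stem from factors beyond the tracker's nominal accuracy, such as headset fit and calibration drift.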
6.2 User study 1: Look at a grid cells
Five students from the University of Haifa, without any visual
disabilities participated in this study. They were asked to look
at wall-mounted grid from a distance of 2 meters and track a
finger (see figure 2). On every cell that the finger pointed at,
they were asked to look at for approximately 3 seconds. Data
was collected for determining the practical measurement
accuracy. The results were as follows: on average, fixation
detection rate was ~80% (most missed fixations were in the
edges/corners – see table 1 for details about misses). In                            Figure 4. Gallery exhibition
addition, average fixation point error rate, in terms of distance
from the center of grids, was approximately 5 cm (exact error       6.3 User study 2: Look at an exhibit
rate can be calculated using simple image processing                In this study we examined the accuracy of the eye tracker in a
techniques for detecting the green circle and applying mapping      realistic setting. One participant (1.79m tall) was asked to look
transform to the real word).                                        at exhibits at the Hecht museum. Several exhibits where
                                                                    chosen with different factors and constraints (see figure 4, 5,
                                                                    and 6). The main constraint in this case is the distance from
                                                                    the exhibit since the visual range gets larger when the distance
                                                                    grows, and mainly we have to cover all the objects that we are
                                                                    interested in. Table 2 presents the objects height from the floor
                                                                    and the distance of the participant from the object. The next
                                                                    step was to examine fixations accuracy after making sure that
                                                                    the participant is standing in a proper distance. The participant
                                                                    was asked to look at different points in the exhibit/scene. In
 Figure 2. User study 1. The finger points at the grid where        the gallery exhibits, the scan path has been set to be the four
 the participant were asked to look at. The green circle is a       corners of the picture and finally the center of it. Regarding
 fixation point given from the eye tracker. The size of each        the vitrine exhibits, for each jug one point at the center has
                       cell is 20x20 cm.                            been defined

 Cell #       6         18         19            23      24
 Missed       5         5          3             5       5
                  Table 1. Experiment details.

During the study we ran into several practical problems. The Pupil Dev eye tracker that we are using does not fit every person. The device consists of two cameras: the first delivers the scene and the second is directed at the right eye for detecting fixations. In some cases, when the device was not fitted correctly, the vision range became smaller and parts of the pupil fell outside the capture frame (see figure 3 for an example). As a consequence, no fixations were detected. Another limitation was that tall participants had to step back from the object, which negatively affected the accuracy.

Figure 3. Screen capture from eye camera.

Figure 5. Mounted backlighted images exhibition.

It is important to note that the heights/distances relation refers to the visual range (having the objects in the frame of the camera) and not to fixation detection, since missed fixations could result from a set of constraints and not from the distance to the object – something we have not examined yet.
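The heights/distances relation for the visual range can be approximated with simple projection geometry: an exhibit fits the scene camera's frame once its angular extent, including its vertical offset from the wearer's eye height, lies inside the camera's field of view. The sketch below illustrates this; the 60-degree vertical field of view, the eye height, and the exhibit measurements are hypothetical assumptions, not the actual specifications of the Pupil device.

```python
import math

def min_stand_distance(obj_top_cm, obj_bottom_cm, eye_height_cm,
                       vertical_fov_deg):
    """Smallest distance (cm) at which an object spanning
    [obj_bottom_cm, obj_top_cm] above the floor fits fully inside the
    scene camera's vertical field of view, assuming the camera sits at
    eye height and looks straight ahead."""
    half_fov = math.radians(vertical_fov_deg / 2)
    # Vertical offsets of the object's top and bottom edges
    # relative to the camera axis.
    up = obj_top_cm - eye_height_cm
    down = eye_height_cm - obj_bottom_cm
    # The edge farthest from the axis dictates the required distance.
    worst = max(abs(up), abs(down))
    return worst / math.tan(half_fov)

# A low shelf (bottom 40 cm, top 55 cm above the floor) seen from a
# hypothetical 160 cm eye height with a 60-degree vertical FOV:
d = min_stand_distance(obj_top_cm=55, obj_bottom_cm=40,
                       eye_height_cm=160, vertical_fov_deg=60)
print(round(d))  # low exhibits force the visitor to stand far back
```

The model also reflects the problem observed with tall participants: a larger eye height increases the offset to a low exhibit, and hence the distance at which they must stand.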

                                                                              Figure 6. Vitrine backlighted exhibition.
Exhibit type     Width (cm)   Height (cm)   Height from floor (cm)   Stand distance (cm)
Vitrine shelf        80           25                 150                     150
                     80           15                 120                     230
                     80           20                  90                     310
                     80           15                  40                     390
Gallery              60           67                 150                     200

Table 2. Experiment details – we considered the three leftmost shelves in the vitrine exhibit shown in figure 6.

7. SYSTEM DESIGN
A smart context-aware mobile museum visitors' guide may provide the visitor with personalized, relevant information from the vast amount of content available at the museum, adapted to his or her personal needs. Furthermore, the system may provide recommendations and location-relevant information. However, the potential benefit may also have a cost: notifications may interrupt the user's current task and be annoying in the wrong context. Beja et al. [2015] examined the effect of notifications in a special leisure scenario – a museum visit. Following Beja et al. [2015], we will consider three different scenarios:

I. The visitor is looking at an exhibit. The region of interest will be defined as the region of the scene around the gaze fixation point. Then the object matching procedure will be applied (see section 8). It will enable us to determine both the visitor's position and the object of interest.

II. The visitor is looking at the tablet. This can occur in two ways: 1) the visitor is watching multimedia information, in which case there is nothing to deliver; 2) the visitor may need a service or a recommendation from the system, so it is the right time to deliver one.

III. The visitor is wandering in the museum. According to Beja et al. [2015], this is the best time for sending notifications.

As a basic system we will use the PIL museum visitor's guide system [Kuflik et al. 2012; Lanir et al. 2013]. The system is a context-aware, mobile museum visitors' guide system. Its positioning mechanism is based on proximity-based RF technology that identifies the visitor's position when the visitor is near a point of interest. As vision is the main sense for gathering information, we plan to replace the system's positioning component with an eye-tracker-based positioning and object-of-interest identification component. Hence we will enhance the positioning system by giving the system the ability to pinpoint the object of interest. The rest of the system will remain unchanged. Having these two versions of the system will enable us to compare and evaluate the benefits of the eye tracker as a positioning and pointing device in the museum.

8. OBJECT MATCHING PROCEDURE
8.1 Data-set preparation
A set of images of the exhibits will be taken; each image may contain one or more objects. Each image will be given a distinct label value and a size of the region around the object (in terms of width and height – a rectangular shape).

8.2 Object matching
The matching procedure will be done in three steps:

1. An eye-tracker scene camera frame is taken (figure 7) and image-to-image matching is applied. The result is an image with labeled regions in the current scene's frame (figure 8).

2. Mapping transformation – we need to transform the fixation point in the eye-tracker scene camera frame to a matching point in the image obtained in step one (the data-set image with labeled regions), since the viewpoint of the objects can differ from that in the data set. For example, one image may be rotated relative to the other, or one may be zoomed in/out as a result of standing at a different distance from the object than when the data-set image was taken.

3. Finding the object – this step is simple, since we have a mapped fixation point and labeled regions. What remains is determining which object the point relates to (or that it relates to nothing).

Figure 7. Example of eye-tracker scene camera. The green point is the fixation point.

9. DISCUSSION
We conducted these small-scale user studies in order to gain initial first-hand experience with the eye tracker in a realistic setting. Furthermore, we tried to clarify which exhibits are appropriate to be included in our future study and, given the limitations of the device, what portion of the museum exhibits may be included in general. Not surprisingly, we got a 100% accuracy rate when we examined the device in the art wing, since all the pictures are placed at an ideal height. The archeological wing is a considerably more challenging environment, since objects are placed at different heights and have unequal sizes. As a result, the visitor may have to stand far away from the objects in order to get them into the eye-tracker front camera frame, a fact that can negatively affect the visit experience. In the case of the archeological wing, we estimate that about 60% of the exhibits may be detectable with the current device. Regarding the low-height exhibits, we do not know yet whether they can be considered or not. More challenging exhibits are those that are placed in harsh light conditions or at a low height (see figure 9 for an example) and/or those that are too large to fit in one frame (see figure 10 for an example).
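The three-step matching procedure of section 8.2 can be sketched as follows. The sketch assumes step 1's image-to-image matching has already produced a transformation (here a simple 3x3 homography given as nested lists) between the scene frame and a data-set image; the labeled regions, the transformation, and the fixation coordinates below are all hypothetical illustrations, not output of our system.

```python
# Step 2: apply a 3x3 homography to an (x, y) point to map a fixation
# from the scene frame into the data-set image's coordinates.
def map_point(H, point):
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

# Step 3: return the label of the rectangular region containing the
# mapped point, or None if the fixation relates to no labeled object.
def find_object(point, regions):
    x, y = point
    for label, (rx, ry, rw, rh) in regions.items():
        if rx <= x <= rx + rw and ry <= y <= ry + rh:
            return label
    return None

# Labeled regions of one data-set image: label -> (x, y, width, height).
regions = {"R1": (10, 10, 80, 60), "R2": (120, 10, 80, 60),
           "R3": (230, 10, 80, 60)}

# A pure-translation homography standing in for the step-1 result.
H = [[1, 0, 30], [0, 1, 5], [0, 0, 1]]

fixation = (220, 30)               # fixation in the scene frame
mapped = map_point(H, fixation)    # (250.0, 35.0) in data-set coordinates
print(find_object(mapped, regions))  # -> R3
```

In practice the homography would come from a feature-based matcher rather than be hand-written, but the division of labor between the mapping step and the region-lookup step is the same.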
Figure 8. Image-to-image matching. The yellow rectangles are the regions around each object. The green point is the fixation point after performing the proposed mapping transformation. The corresponding region would be R3.

Figure 9. A challenging exhibition: harsh light conditions and low height.

Figure 10. A challenging exhibit: too big to fit in one frame.

10. CONCLUSIONS AND FUTURE WORK
This paper presents a work-in-progress that aims at exploring the potential contribution of mobile eye-tracking technology to enhancing the museum visit experience. To that end, we conducted small-scale experiments in order to gain an understanding of the performance of the system in a realistic setting. We got satisfactory results from these studies and an understanding of the limitations of the equipment. The next step in the study is to design and build a mobile museum guide that extends the use of mobile eye tracking as a tool for identifying the visitor's position and points of interest. We will use the eye-tracker scene camera captures and the collected gaze data to develop a technique for location awareness. The system will run on a tablet, and the multimedia content will be delivered to the participants as an audio guide heard via earphones or as slides to watch. Furthermore, knowing exactly where in the scene the visitor looks (at which specific object) will allow us to deliver personalized information. Our research will supplement today's mobile museum guides, which use location-awareness technologies and techniques that enhance the visit experience. The system can also be extended and used in other venues, such as outdoor cultural heritage sites as well as shopping centers/markets, after further validation.

REFERENCES
[1] Ardissono, L., Kuflik, T., & Petrelli, D. (2012). Personalization in cultural heritage: the road travelled and the one ahead. User Modeling and User-Adapted Interaction, 22(1-2), 73-99.
[2] Barrow, H. G., Tenenbaum, J. M., Bolles, R. C., & Wolf, H. C. (1977). Parametric correspondence and chamfer matching: Two new techniques for image matching (No. TN-153). SRI International, Menlo Park, CA, Artificial Intelligence Center.
[3] Beja, I., Lanir, J., & Kuflik, T. (2015). Examining Factors Influencing the Disruptiveness of Notifications in a Mobile Museum Context. Human–Computer Interaction, just-accepted.
[4] Bitgood, S. (2002). Environmental psychology in museums, zoos, and other exhibition centers. Handbook of Environmental Psychology, 461-480.
[5] Brône, G., Oben, B., Van Beeck, K., & Goedemé, T. (2011). Towards a more effective method for analyzing mobile eye-tracking data: integrating gaze data with object recognition algorithms. In UbiComp '11, Sep 17-21, 2011, Beijing, China.
[6] Bulling, A., Dachselt, R., Duchowski, A., Jacob, R., Stellmach, S., & Sundstedt, V. (2012). Gaze interaction in the post-WIMP world. In CHI'12 Extended Abstracts on Human Factors in Computing Systems, 1221-1224.
[7] Bulling, A., & Zander, T. O. (2014). Cognition-aware computing. Pervasive Computing, IEEE, 13(3), 80-83.
[8] De Beugher, S., Brône, G., & Goedemé, T. (2014). Automatic analysis of in-the-wild mobile eye-tracking experiments using object, face and person detection. In Proceedings of VISIGRAPP 2014, 1, 625-633.
[9] Dim, E., & Kuflik, T. (2014). Automatic detection of social behavior of museum visitor pairs. ACM Transactions on Interactive Intelligent Systems, 4(4), 17.
[10] Evans, J., & Sterry, P. (1999). Portable computers & interactive media: A new paradigm for interpreting museum collections. In Bearman, D., & Trant, J. (eds.), Cultural Heritage Informatics 1999: Selected papers from ICHIM 1999, 93-101. Kluwer Academic Publishers, Dordrecht.
[11] Falk, J. H., & Dierking, L. D. (2000). Learning from museums: Visitor experiences and the making of meaning. AltaMira Press.
[12] Fitts, P. M., Jones, R. E., & Milton, J. L. (1950). Eye movements of aircraft pilots during instrument-landing approaches. Aeronautical Engineering Review, 9(2), 24-29.
[13] Goferman, S., Zelnik-Manor, L., & Tal, A. (2012). Context-aware saliency detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(10), 1915-1926.
[14] Grinter, R. E., Aoki, P. M., Szymanski, M. H., Thornton, J. D., Woodruff, A., & Hurst, A. (2002). Revisiting the visit: understanding how technology can shape the museum visit. In Proceedings of CSCW 2002, 146-155.
[15] Gruen, A. (1985). Adaptive least squares correlation: a powerful image matching technique. South African Journal of Photogrammetry, Remote Sensing and Cartography, 14(3), 175-187.
[16] Hampapur, A., Hyun, K., & Bolle, R. M. (2001). Comparison of sequence matching techniques for video copy detection. In Electronic Imaging 2002, 194-201. International Society for Optics and Photonics.
[17] Hendrickson, K., & Ailawadi, K. L. (2014). Six lessons for in-store marketing from six years of mobile eye-tracking research. In Shopper Marketing and the Role of In-Store Marketing, Review of Marketing Research, 11, 57-74.
[18] Kardan, S., & Conati, C. (2013). Comparing and Combining Eye Gaze and Interface Actions for Determining User Learning with an Interactive Simulation. In Proceedings of UMAP 2013, 215-227.
[19] Kassner, M., Patera, W., & Bulling, A. (2014). Pupil: an open source platform for pervasive eye tracking and mobile gaze-based interaction. In Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct Publication, 1151-1160. ACM.
[20] Micha, K., & Economou, D. (2005). Using Personal Digital Assistants (PDAs) to Enhance the Museum Visit Experience. In Proceedings of PCI 2005, Volos, Greece, November 11-13, 2005, 188-198.
[21] Kiefer, P., Giannopoulos, I., Kremer, D., Schlieder, C., & Raubal, M. (2014). Starting to get bored: An outdoor eye tracking study of tourists exploring a city panorama. In Proceedings of the Symposium on Eye Tracking Research and Applications, 315-318. ACM.
[22] Kim, C., & Vasudev, B. (2005). Spatiotemporal sequence matching for efficient video copy detection. IEEE Transactions on Circuits and Systems for Video Technology, 15(1), 127-132.
[23] Kuflik, T., Lanir, J., Dim, E., Wecker, A., Corra, M., Zancanaro, M., & Stock, O. (2012). Indoor positioning in cultural heritage: Challenges and a solution. In 2012 IEEE 27th Convention of Electrical & Electronics Engineers in Israel (IEEEI), 1-5. IEEE.
[24] Lanir, J., Kuflik, T., Dim, E., Wecker, A. J., & Stock, O. (2013). The influence of a location-aware mobile guide on museum visitors' behavior. Interacting with Computers, 25(6), 443-460.
[25] Itti, L. (2007). Visual salience. Scholarpedia, 2(9), 3327.
[26] Leinhardt, G., & Knutson, K. (2004). Listening in on museum conversations. Rowman Altamira.
[27] Ma, K. T., Xu, Q., Li, L., Sim, T., Kankanhalli, M., & Lim, R. (2015). Eye-2-I: Eye-tracking for just-in-time implicit user profiling. arXiv preprint arXiv:1507.04441.
[28] Nakano, Y. I., & Ishii, R. (2010). Estimating user's engagement from eye-gaze behaviors in human-agent conversations. In Proceedings of IUI 2010, 139-148.
[29] Naphade, M. R., Yeung, M. M., & Yeo, B. L. (1999). Novel scheme for fast and efficient video sequence matching using compact signatures. In Electronic Imaging, 564-572. International Society for Optics and Photonics.
[30] Stock, O., Zancanaro, M., Pianesi, F., Tomasini, D., & Rocchi, C. (2009). Formative evaluation of a tabletop display meant to orient casual conversation. Knowledge, Technology & Policy, 22(1), 17-23.
[31] Ohm, C., Müller, M., Ludwig, B., & Bienk, S. (2014). Where is the Landmark? Eye Tracking Studies in Large-Scale Indoor Environments.
[32] Packer, J., & Ballantyne, R. (2005). Solitary vs. shared: Exploring the social dimension of museum learning. Curator: The Museum Journal, 48(2), 177-192.
[33] Pinto, N., Cox, D. D., & DiCarlo, J. J. (2008). Why is real-world visual object recognition hard? PLoS Computational Biology, 4(1), e27.
[34] Pfeiffer, T., & Renner, P. (2014). EyeSee3D: A low-cost approach for analyzing mobile 3d eye tracking data using computer vision and augmented reality technology. In Proceedings of the Symposium on Eye Tracking Research and Applications, 369-376. ACM.
[35] Jacob, R. J. K. (1991). The Use of Eye Movements in Human-Computer Interaction Techniques: What You Look At is What You Get. ACM Transactions on Information Systems, 9(3), 152-169.
[36] Jacob, R. J. K., & Karn, K. S. (2003). Eye Tracking in Human–Computer Interaction and Usability Research: Ready to Deliver the Promises. Elsevier Science BV.
[37] Schrammel, J., Mattheiss, E., Döbelt, S., Paletta, L., Almer, A., & Tscheligi, M. (2011). Attentional behavior of users on the move towards pervasive advertising media. In Pervasive Advertising, 287-307.
[38] Schrammel, J., Regal, G., & Tscheligi, M. (2014). Attention approximation of mobile users towards their environment. In CHI'14 Extended Abstracts on Human Factors in Computing Systems, 1723-1728.
[39] Hsi, S. (2003). A study of user experiences mediated by nomadic web content in a museum. The Exploratorium, San Francisco, CA.
[40] Toyama, T., Kieninger, T., Shafait, F., & Dengel, A. (2012). Gaze guided object recognition using a head-mounted eye tracker. In ETRA '12: Proceedings of the Symposium on Eye Tracking Research and Applications, 91-98. ACM.
[41] Thirion, J. P. (1998). Image matching as a diffusion process: an analogy with Maxwell's demons. Medical Image Analysis, 2(3), 243-260.
[42] Xu, S., Jiang, H., & Lau, F. (2008). Personalized online document, image and video recommendation via commodity eye-tracking. In Proceedings of RecSys 2008.
[43] Yalowitz, S. S., & Bronnenkant, K. (2009). Timing and tracking: unlocking visitor behavior. Visitor Studies, 12, 47-64.
[44] Zhang, Z., Deriche, R., Faugeras, O., & Luong, Q. T. (1995). A robust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry. Artificial Intelligence, 78(1), 87-119.