Modelling and Predicting Movements of Museum Visitors: A Simulation
Framework for Assessing the Impact of Sensor Noise on Model Performance

Fabian Bohnert, Ingrid Zukerman and David W. Albrecht
Faculty of Information Technology, Monash University, Australia
firstname.lastname@monash.edu

Timothy Baldwin
Dept. of Comp. Sci. and Soft. Eng., The University of Melbourne, Australia
tb@ldwin.net

                          Abstract                                 the information stream is reliable, as opposed to information
                                                                   obtained from visitors’ direct participation.
     We present a simulation framework to examine the
     impact of sensor noise on the performance of user                In order to personalise content and generate recommen-
     models in the museum domain. Our contributions                dations on the basis of information provided by unobtrusive
     are: (1) models to simulate noisy visit trajectories          sensors (rather than from user participation), questions of in-
     as time-stamped sequences of (x, y) positional co-            terest include: (1) how to infer a visitor’s viewed exhibits
     ordinates which reflect walking and hovering be-              solely from sensor readings; and (2) how to predict the next
     haviour; (2) a discriminative inference model that            exhibit(s) a visitor is likely to view. In this paper, we present
     distinguishes between hovering and walking on the             a realistic simulation model which offers some insights to an-
     basis of (simulated) noisy sensor observations; (3) a         swer these questions, and may be employed to make deci-
     model that infers viewed exhibits from hovering co-           sions regarding the instrumentation of a space.
     ordinates; and (4) a model that predicts the next ex-            In previous research, we offered a simulation framework
     hibit on the basis of inferred (rather than known)            for investigating the impact of different sensing technologies
     viewed exhibits. Our staged evaluation assesses the           on the predictive performance of user models [Schmidt et al.,
     effect of these models (in combination with sensor            2009]. The aim was to provide a practical solution to the
     noise) on inferential and predictive performance,             problem of assessing the accuracy of the user models that can
     thus shedding light on the reliability attributed to          be derived from a sensor-based system prior to actually de-
     inferences drawn from sensor observations.                    ploying a particular technology. However, that work made
                                                                   strong simplifying assumptions that affected the realism of
                                                                   the framework, and hence the significance and usefulness of
1   Introduction                                                   its results, viz: (1) sensors can detect, with some error, a
The construction of models of visitors to public spaces, in        single square (in a grid representation of the museum floor)
particular museums, has been of interest to the user modelling     where a visitor is statically positioned while viewing an ex-
and cultural tourism communities for some time [Cheverst           hibit ik ; and (2) the previously viewed exhibits i1 , . . . , ik−1
et al., 2002; Hatala and Wakkary, 2005; Stock et al., 2007].       are known (not just the previous coordinates of a visitor)
These models are used to predict visitors’ interests in order      when predicting the next exhibit ik+1 . In reality, people tend
to personalise the content of presentations, or make recom-        not to remain stationary at an exhibit, and they certainly do
mendations of locations (e. g., exhibits) to be visited. In most   not ‘teleport’ between squares on the floor. Rather, they walk
systems developed to date, these user models are acquired          between exhibits, and often hover around an exhibit to view
through the active participation of the visitors, e. g., by pro-   it from different angles or distances. Thus, when sensing a
viding feedback through a device. This requirement imposes         visitor’s movements in a museum, the best we can hope for is
a burden on the visitors, which in turn may reduce the reli-       a time-stamped trajectory of (x, y) coordinates (sampled at a
ability of the obtained information, e. g., if visitors provide    particular rate), where the observed coordinates diverge from
feedback only occasionally.                                        the true positions of the visitor by some sensor error. As a
   Recent advances in mobile computing and sensing tech-           result, the sequence of previously viewed exhibits cannot be
nologies have enabled the instrumentation of physical public       known with certainty — at best a likely sequence of exhibits
spaces, which in turn has enabled the automatic tracking of        can be inferred from the sensor observations.
visitors’ movements [Hazas et al., 2004; Lassabe et al., 2009;        In this paper, we propose a simulation framework that
Philipose et al., 2004]. Information regarding visitors’ where-    eschews the above assumptions, significantly extending our
abouts and the time spent at different locations supports the      previous work and the insights obtained from it. Specifi-
automatic inference of visitors’ interests and the prediction of   cally, our contributions are: (1) models to simulate noisy visit
their trajectories [Bohnert and Zukerman, 2009]. Clearly, in-      trajectories as time-stamped sequences of (x, y) positional
ferences from positional and timing information are more in-       coordinates which reflect walking and hovering behaviour;
direct and uncertain than visitors’ direct feedback. However,      (2) a discriminative inference model that distinguishes be-
tween hovering and walking on the basis of noisy sensor ob-        PEACH project developed technology which adapts its user
servations; (3) a model that infers likely viewed exhibits from    model on the basis of both explicit user feedback and im-
time-stamped sequences of hovering coordinates (instead of         plicit observations of a user’s interactions with a mobile de-
a single static grid square per exhibit as done in our previous    vice [Stock et al., 2007]. This user model was used to gener-
work); and (4) a model that predicts the next exhibit on the ba-   ate personalised multimedia presentations for museum visi-
sis of these inferred (rather than known) viewed exhibits. At      tors. The PEACH project also explored simple localisation
present, we assume that the sensors can only track a visitor’s     technology, but did not derive user modelling information
position. However, our models may be extended to incor-            from sensor readings. The augmented audio reality system
porate orientation information and occasional user feedback        for museums ec(h)o adapted its user model on the basis of a
to improve the accuracy of inferences obtained from sensor         visitor’s movements through the exhibition space and his/her
readings, and hence the predictions of subsequent exhibits.        interactions with the system [Hatala and Wakkary, 2005]. The
   The research in this paper builds on the framework de-          collected user modelling data were used to deliver person-
scribed by Schmidt et al. [2009], which comprises a predic-        alised information associated with exhibits via audio display.
tive user model of exhibits to be viewed, and a spatial viewing    However, the project did not investigate the effect of locali-
model of positions from which each exhibit can be seen. Like       sation accuracy on the quality of the resultant user modelling
Schmidt et al., we evaluate our framework in the context of        information.
the Marine Life Exhibition at Melbourne Museum. In this pa-           In contrast to the above research, this paper investigates
per, we augment the evaluations done by Schmidt et al., pre-       the impact of using sensing technology as a means for gath-
senting the results of a staged evaluation which examines the      ering information about a user, i. e., to learn a user model. To
effect of different information-based models, in combination       this effect, we offer a simulation framework which generates
with sensor noise, on inferential and predictive performance.      noisy visit trajectories that reflect walking and hovering be-
   This paper is organised as follows. Section 2 discusses re-     haviour, and investigate the relationship between sensor noise
lated research. Section 3 briefly summarises the key compo-        and inferential and predictive user model performance.
nents of our previous simulation framework. Our approach
for simulating detailed coordinate-based visit trajectories is     3   Prerequisites
presented in Section 4, and our inference and prediction mod-      This section briefly summarises four key components of the
els are described in Section 5. The results of our evaluation      simulation framework introduced by Schmidt et al. [2009],
are presented in Section 6, followed by concluding remarks         which is extended in this paper: (1) frequency-based Transi-
in Section 7.                                                      tion Model; (2) Spatial Exhibit Viewing Model; (3) generation
                                                                   of exhibit tours; and (4) generation of exhibit squares.
2   Related Research
The research community has initiated a wealth of projects          Frequency-based Transition Model. We use a frequency-
that investigate user modelling and personalisation technol-       based Transition Model to represent visitors’ movements be-
ogy in the context of physical spaces. For example, in the mu-     tween museum exhibits [Bohnert et al., 2008; Schmidt et
seum domain, HyperAudio dynamically adapted hyperlinks             al., 2009]. This model, which is implemented as a 1-stage
and presented content to stereotypical assumptions about a         Markov model, estimates the transition probabilities Pi,j be-
visitor, and to what the visitor has already accessed through      tween exhibits i and j from frequency counts of exhibit tran-
a mobile device and seems interested in [Petrelli and Not,         sitions that are derived from observed visit trajectories. When
2005]. The CHIP project harnessed Semantic Web tech-               estimating the transition probabilities, additive smoothing is
niques to provide personalised access to digital museum col-       applied in light of our small dataset of 44 observed trajecto-
lections both online and in the physical museum [Wang et al.,      ries (Section 6.1):
2009]. This was done by using explicitly initialised user mod-
els. The Kubadji project investigated user and language mod-       $\hat{P}_{i,j} = \dfrac{n_{i,j} + \alpha_i}{N_i + M \alpha_i}$ for $i, j = 1, \ldots, M$
elling techniques that rely on mobile technology deployed in
museums [Bohnert and Zukerman, 2009]. While the focus              where $n_{i,j}$ counts the transitions from exhibit $i$ to exhibit $j$,
was on modelling visitors based on non-intrusive observa-          $\alpha_i$ is a smoothing constant, $N_i = \sum_{k=1}^{M} n_{i,k}$ is the total
tions that can be derived from sensor readings, the project did    number of times exhibit $i$ was viewed, and $M$ is the number
not evaluate its models with real-world sensing technology.        of exhibits.
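
The smoothed estimate of the Transition Model can be illustrated with a minimal sketch. The function name `estimate_transitions`, the toy tours, the 0-based exhibit indices and the use of one smoothing constant `alpha` for all exhibits (rather than a per-exhibit $\alpha_i$) are assumptions made for illustration only.

```python
from collections import Counter


def estimate_transitions(tours, num_exhibits, alpha=1.0):
    """Return P_hat[i][j] = (n_ij + alpha) / (N_i + M * alpha)."""
    # n_ij: number of observed transitions from exhibit i to exhibit j.
    counts = Counter()
    for tour in tours:
        for i, j in zip(tour, tour[1:]):
            counts[(i, j)] += 1

    matrix = []
    for i in range(num_exhibits):
        # N_i: sum of n_ik over all exhibits k.
        row_total = sum(counts[(i, k)] for k in range(num_exhibits))
        denom = row_total + num_exhibits * alpha
        matrix.append([(counts[(i, j)] + alpha) / denom
                       for j in range(num_exhibits)])
    return matrix


# Hypothetical toy data: three tours over M = 3 exhibits.
toy_tours = [[0, 1, 2], [0, 2], [1, 2, 0]]
P_hat = estimate_transitions(toy_tours, num_exhibits=3, alpha=0.5)
```

With additive smoothing, every row of `P_hat` sums to one and no transition receives zero probability, which matters when only a few dozen trajectories are available.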
   In contrast to these projects, which did not employ real-
world sensing technology, other research projects incorpo-         Spatial Exhibit Viewing Model. Our modelling frame-
rated wireless technology or sensor networks. The GUIDE            work employs a probabilistic model of the viewing areas for
project developed a handheld tourist guide for visitors to the     each exhibit in the museum space, which divides the space
city of Lancaster, UK [Cheverst et al., 2002]. It employed         into a grid of squares (for the Marine Life Exhibition, the grid
user models obtained from explicit user input to generate dy-      size is 47 × 61 = 2,867 squares, where a square is approxi-
namic and user-adapted city tours, where the order of the vis-     mately 30 cm × 30 cm; Figure 1). The model specifies a dis-
ited locations could be varied. The project used wireless ac-      crete probability distribution which represents P(i | x, y), the
cess points to stream content data to a user’s device, but did     probability of a visitor viewing each exhibit i from a square
not employ the wireless network to localise the user. The          at position (x, y).
Figure 1: Two representations of part of a simulated visitor pathway: (a) smooth representation (ground truth); (b) noisy representation (ν = 2 metres)


Generation of exhibit tours. We generate tours of viewed              ing is represented by a red/grey line, hovering is represented
exhibits as follows. Each tour begins at a fictitious start ex-       by a blue/dark-grey line on pink/shaded squares, and wall
hibit $i_0$ and ends at a fictitious end exhibit $i_{\mathrm{end}}$. For each   squares are coloured in blue/grey), and Figure 1(b) is the rep-
exhibit $i_{k-1}$ already in the tour ($k = 1, 2, \ldots$), the next ex-   resentation obtained by applying Gaussian sensor noise at a
hibit $i_k$ is generated by sampling from a categorical distri-       level of ν = 2 metres.
bution specified by the transition probabilities $P_{i_{k-1}, i_k}$. This
step is repeated for each added exhibit $i_k$ until the end ex-       4.1    Generating Walking Squares
hibit $i_{\mathrm{end}}$ is reached.                                  In Section 3, we generated one viewing square for each ex-
   In addition to this sequence of exhibits, our walking/hov-         hibit in a visitor’s tour. However, visitors do not simply tele-
ering model (Section 4) requires the time that a visitor spends       port between squares. To produce a more realistic continuous
at each viewed exhibit. We generate a viewing time Ti at              visit trajectory, we must build a path that links these squares.
exhibit i by randomly drawing from an exponential distribu-           At first glance, it seems that a shortest-path algorithm may be
tion, i. e., Ti ∼ Exp(λi ), where the average viewing time λi         used for this task. However, trajectories generated in this way
at each exhibit i is estimated by maximum likelihood from             exhibit an unnatural level of repetition and purposefulness,
the 44 observed tours in the Marine Life Exhibition dataset.          tending to run directly along exhibition walls. In practice,
                                                                      visitors tend to move more erratically. To simulate these be-
Generation of exhibit squares. Once a tour of exhibits has            haviours, we incorporate stochastic effects into the shortest-
been simulated, Schmidt et al. [2009] generate a single view-         path procedure. Specifically, we model the probability of
ing square at position (x, y) for each viewed exhibit i in the        moving into a square as being proportional to the probability
tour. This is done by sampling from the categorical distribu-         of viewing the destination exhibit from this square, moder-
tion P(x, y | i) over all exhibit squares, where P(x, y | i) is de-   ated by the visitor’s propensity to avoid walls and to meander.
rived by applying Bayes’ theorem to the viewing probabilities         Our approach uses parameters that control two behavioural
P(i | x, y) obtained from the Spatial Exhibit Viewing Model.          aspects of visitors: (1) how erratic or purposeful their move-
   In this work, we use Schmidt et al.’s model to generate the        ment is; and (2) their propensity to avoid walls.1 These con-
first hovering square for each viewed exhibit (Section 4.2).          siderations are implemented as follows.
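
The three generation steps of Section 3 (exhibit tour, viewing time, first viewing square) can be sketched as follows. All names, the toy transition table and the toy viewing probabilities are hypothetical. The sketch also assumes a uniform prior over squares when applying Bayes' theorem, so that P(x, y | i) is proportional to P(i | x, y); the prior actually used is not stated here.

```python
import random


def sample_tour(P, start, end, rng):
    """Sample exhibits from a first-order Markov chain until `end` is drawn."""
    tour, current = [], start
    while True:
        states = list(P[current].keys())
        weights = [P[current][s] for s in states]
        current = rng.choices(states, weights=weights, k=1)[0]
        if current == end:
            return tour
        tour.append(current)


def sample_viewing_time(mean_seconds, rng):
    """Draw an exponential viewing time with the given mean."""
    # random.expovariate expects a rate, i.e. 1 / mean.
    return rng.expovariate(1.0 / mean_seconds)


def sample_square(view_prob, exhibit, rng):
    """Sample (x, y) from P(x, y | i), assuming a uniform prior over squares."""
    squares = list(view_prob.keys())
    weights = [view_prob[sq].get(exhibit, 0.0) for sq in squares]
    return rng.choices(squares, weights=weights, k=1)[0]


# Hypothetical toy inputs.
P = {
    "START": {"A": 0.6, "B": 0.4},
    "A": {"B": 0.5, "END": 0.5},
    "B": {"A": 0.3, "END": 0.7},
}
view_prob = {
    (0, 0): {"A": 0.9, "B": 0.1},
    (0, 1): {"A": 0.5, "B": 0.5},
    (1, 1): {"B": 1.0},
}

rng = random.Random(0)
tour = sample_tour(P, "START", "END", rng)
times = [sample_viewing_time(60.0, rng) for _ in tour]
squares = [sample_square(view_prob, exhibit, rng) for exhibit in tour]
```

Each exhibit in `tour` thereby receives one viewing time and one starting square, which is the input the walking and hovering simulation requires.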
                                                                         Assume we want to generate a sequence of walking squares
                                                                      to connect two exhibits i and j in a tour. Let $(x_s, y_s)$ de-
4   Simulation of Coordinate-based Visitor                            note the end square of exhibit i (i. e., the source square),
    Pathways                                                          and $(x_d, y_d)$ the starting square of exhibit j (i. e., the des-
The previous section outlined our method for generating ex-           tination square). Also, treating diagonal squares as adja-
hibit tours with a single static grid square per exhibit. In this     cent, let the candidate squares of a square (x, y) be the
section, we simulate (smooth and noisy) coordinate-based              eight squares surrounding this square. We start by employing
visit trajectories which reflect two types of behaviour: walk-        Dijkstra’s algorithm [Dijkstra, 1959] to generate a distance
ing between exhibits, and hovering at exhibits. Our ap-               matrix D whose elements $D_{x,y}$ correspond to the shortest-
proach comprises the following four steps, which are de-              path distances from each square (x, y) to the destination
scribed below: (1) generation of connected paths of walk-             square $(x_d, y_d)$. Then, we generate a sequence of walking
ing squares between exhibits; (2) generation of connected             squares as follows. For each square $(x_n, y_n)$ (starting from
paths of hovering squares to simulate viewing behaviour at            the source square $(x_s, y_s)$), the next square $(x_{n+1}, y_{n+1})$ that
exhibits; (3) smoothing of the obtained square trajectory; and        a visitor moves into while walking is sampled from among
(4) simulation of noisy sensor observations from this smooth          1 In our evaluation, we use fixed parameter values. Alternatively,
pathway representation.                                               one could sample the values for each trajectory simulation. Also,
   Figure 1 depicts two representations of part of a simulated        certain parameter values in combination with different transition
visit trajectory (we show the part for the Tool Time exhibit          models may yield different types of museum visitors, e. g., the ant,
in the Mealtime section of the Marine Life Exhibition). Fig-          fish, butterfly and grasshopper types [Véron and Levasseur, 1983;
ure 1(a) shows the trajectory obtained after simulation (walk-        Zancanaro et al., 2007].
$(x_n, y_n)$’s eight candidate squares, provided that the move         obtained by distorting the true coordinates (x, y) through
does not take the visitor farther away from $(x_d, y_d)$ (the dis-     additive Gaussian noise and sampling at regular time in-
tance information is obtained from D). In this procedure, the          tervals (for our experiments, we use a constant sampling
sampling is performed from a categorical distribution over             interval of one second). Specifically, the measured coordinates
the eight candidate squares, whose probabilities are propor-           are found by sampling from a bivariate normal distribu-
tional to the probabilities of viewing the destination exhibit         tion $\mathcal{N}((x, y), \sigma^2 I)$ with mean (x, y) and covariance $\sigma^2 I$,
from each square, moderated by the visitor’s propensity to             where σ is a constant which reflects the expected accuracy
avoid walls and to meander (the probabilities are zero for the         of the sensing infrastructure, and I is the identity matrix.
squares that take the visitor farther away from $(x_d, y_d)$). The     For example, if the infrastructure is able to deliver posi-
visitor moves in this fashion until $(x_d, y_d)$ is reached. At that   tions within an accuracy level of ν metres 95% of the time,
point, the trajectory between $(x_s, y_s)$ and $(x_d, y_d)$ is com-    then σ = ν/2 would be a suitable value, as this places ap-
plete, and timestamps are iteratively added to the trajectory          proximately 95% of the probability mass of each coordinate
assuming a constant walking speed $v_w$ for the visitor.               within ν of its true value (and about 86% of the joint mass
                                                                       within the circle defined by
                                                                       $(x' - x)^2 + (y' - y)^2 = \nu^2$). Figure 1(b) depicts
                                                                       part of a noisy visit trajectory which was sampled by follow-
4.2   Generating Hovering Squares                                      ing this procedure for the pathway shown in Figure 1(a) at a
Once at an exhibit, visitors usually observe the exhibit for           sampling rate of one second with ν = 2 metres.
some time before moving on to the next one. Additionally,
visitors typically do not remain static, but move around to ex-        5     Inference and Prediction of Viewed Exhibits
amine the exhibit from different angles and distances. This
so-called hovering behaviour is included in our simulation                   from Positional Coordinates
framework by varying the movement model described in Sec-              When information on a visitor’s movements is automatically
tion 4.1, so that a visitor is more likely to move towards a           gathered through sensors, all that is available is a sequence
square from which the exhibit is more likely to be viewed,             of (typically noisy) time-stamped (x, y) coordinates (Sec-
but may not move at all.                                               tion 4.4).2 Assuming that we have a method for detecting
   Timestamps are added to the generated hovering squares              whether a visitor is hovering (and hence viewing an exhibit),
assuming a hovering speed of vh < vw (as for the walking               we can decompose the complete (x, y) sequence into sub-
case, we assume a constant hovering speed). The hovering               sequences of (x, y) coordinates that pertain to hovering be-
behaviour continues until the sampled viewing time Ti for              haviour (Section 5.1). From these, we can infer which exhibit
the current exhibit i is exceeded (viewing time sampling is            the visitor is viewing (Section 5.2), and employ a model to
described in Section 3).                                               predict which exhibit the visitor is likely to view next on the
                                                                       basis of this information (Section 5.3).
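
The noise model of Section 4.4 can be sketched as follows. The function name `add_sensor_noise` and the all-zero ground-truth trajectory are illustrative assumptions. Each coordinate is perturbed independently with standard deviation σ = ν/2, which is equivalent to sampling from a bivariate normal distribution with covariance σ²I. The last lines estimate how often a noisy reading stays within ν of the truth, per coordinate and radially.

```python
import math
import random


def add_sensor_noise(trajectory, nu, rng):
    """Distort (t, x, y) samples with isotropic Gaussian noise, sigma = nu / 2."""
    sigma = nu / 2.0
    return [(t, rng.gauss(x, sigma), rng.gauss(y, sigma))
            for t, x, y in trajectory]


rng = random.Random(42)
# Hypothetical ground truth: a visitor standing at the origin, one sample per second.
truth = [(t, 0.0, 0.0) for t in range(20000)]
noisy = add_sensor_noise(truth, nu=2.0, rng=rng)

# Fraction of readings whose x error is at most nu (two-sigma rule, about 0.95).
per_axis = sum(abs(x) <= 2.0 for _, x, _ in noisy) / len(noisy)
# Fraction of readings inside the circle of radius nu (Rayleigh: 1 - exp(-2), about 0.86).
within = sum(math.hypot(x, y) <= 2.0 for _, x, y in noisy) / len(noisy)
```

The radial error of an isotropic bivariate normal follows a Rayleigh distribution, so the fraction inside the circle of radius 2σ is 1 − e^{−2}, slightly lower than the per-coordinate figure.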
4.3   Smoothing the Square Trajectory
To obtain a smooth positional tour representation from a               5.1    Classification-based Inference of Walking and
time-stamped trajectory of squares, i. e., $(\langle t_n, x_n, y_n \rangle;\ n = 1, 2, \ldots)$,          Hovering
we fit piecewise cubic splines to the coordinate-                      To infer walking and hovering behaviour from positional
individual trajectories $\langle t_n, x_n \rangle$ and $\langle t_n, y_n \rangle$ (one piecewise   (x, y) coordinates, we employ a window-based approach. We
cubic spline each). We do this by applying the splinefit               first derive indicative features from a window comprising the
package from the Matlab Central File Exchange [Lundgren,               previous ω sensor observations, and then provide these fea-
2007]. This approach uses the method of least squares to fit           tures to a purpose-trained classifier for inference. The output
splines with reduced degrees of freedom (we reduce the num-            of this binary classifier is a label which indicates whether a
ber of spline pieces by 70% compared to direct interpolation),         visitor’s activity is walking or hovering.
and generates a smooth representation of the trajectory in the            Prior to deriving the features, we smooth the noisy sen-
sense that (x, y), (ẋ, ẏ) and (ẍ, ÿ) are all continuous in time.   sor observations $\langle t, x, y \rangle$ by fitting piecewise cubic splines to
   The resultant representation may be interpreted as a contin-        the $\langle t, x \rangle$ and $\langle t, y \rangle$ trajectories [Lundgren, 2007], and eval-
uous positional representation of the visit trajectory, enabling       uating these splines at the original timestamps (similarly to
us to obtain a visitor’s position at any point in time. Fig-           Section 4.3). Using the resultant smoothed sensor observa-
ure 1(a) depicts part of one such smooth visit trajectory.             tions, we compute the following feature set of size 2ω + 7
                                                                       that pertains to (non-directional) velocity and acceleration:
4.4   Simulating Sensor Noise
                                                                           • ω − 1 velocities (each of them calculated as the length of
The visit trajectories obtained so far are smooth and continu-               one of the ω − 1 velocity vectors, which in turn are de-
ous. However, in practice, any trajectory-based input to a user              rived from the ω smoothed positional coordinates from
modelling system would be acquired through sensors that de-                  within the window)
liver only a visitor’s approximate position (due to measure-
ment error) at a certain sampling rate.                                    • Minimum and maximum of the ω − 1 velocities
   In this paper, we explore sensor noise that may be at-                  • Mean and median of the ω − 1 velocities
tributed to range-based positioning technology, e. g., WiFi
                                                                           • Standard deviation of the ω − 1 velocities
and ultra-wide band (UWB) [Zhao and Guibas, 2004]. We
follow a widely accepted model for sensor noise in this set-           2 For simplicity of notation, we use (x, y) instead of $(x', y')$ in
ting, and assume that the measured coordinates $(x', y')$ are          the remainder of the paper to denote the measured noisy coordinates.
   • ω−2 accelerations (each of them calculated as the length              In this calculation, the transition probabilities Pj,i are de-
      of one of the ω − 2 acceleration vectors, which in turn           rived from the information provided by the Transition Model
      are derived from the ω − 1 velocity vectors)                      in Section 3 by setting to zero the columns of the transition
   • Minimum and maximum of the ω − 2 accelerations                     matrix that pertain to the already viewed exhibits, and renor-
                                                                        malising each row of the matrix to 1 (see footnote 4).
   • Mean and median of the ω − 2 accelerations
   • Standard deviation of the ω − 2 accelerations                      6     Evaluation
   In our experiments (Section 6), we use support vector ma-
                                                                        This section presents our data collection method and datasets,
chines (SVM) to train the classifiers. We employ C-SVC
                                                                        and describes our experiments and results.
SVMs with an RBF kernel from LIBSVM [Chang and Lin,
2001], using features derived from the previous five (x, y) ob-         6.1    Data Collection and Datasets
servations (ω = 5).
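
A minimal sketch of the window-based feature extraction of Section 5.1 is given below. It assumes that velocity and acceleration vectors are obtained by finite differences between consecutive smoothed positions, and that the population standard deviation is used; neither detail is specified above, and the function name `window_features` is hypothetical.

```python
import math
import statistics


def window_features(points, dt=1.0):
    """Return the 2*omega + 7 features for a window of omega (x, y) points."""
    # omega - 1 velocity vectors from consecutive positions (finite differences).
    vel = [((x2 - x1) / dt, (y2 - y1) / dt)
           for (x1, y1), (x2, y2) in zip(points, points[1:])]
    # omega - 2 acceleration vectors from consecutive velocity vectors.
    acc = [((vx2 - vx1) / dt, (vy2 - vy1) / dt)
           for (vx1, vy1), (vx2, vy2) in zip(vel, vel[1:])]

    features = []
    for vectors in (vel, acc):
        # Non-directional magnitudes followed by their five summary statistics.
        mags = [math.hypot(vx, vy) for vx, vy in vectors]
        features += mags
        features += [min(mags), max(mags), statistics.mean(mags),
                     statistics.median(mags), statistics.pstdev(mags)]
    return features


# Hypothetical window (omega = 5): steady motion of 1 m/s along the x axis.
window = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
features = window_features(window)
```

For ω = 5 this yields 4 + 5 velocity features and 3 + 5 acceleration features, i.e. 17 = 2ω + 7 values per window, which would then be passed to the binary classifier.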
                                                                        Our dataset of real-world exhibit tours was obtained at the
5.2    Probability-based Inference of Exhibits                          Marine Life Exhibition at Melbourne Museum. It consists
In this section, we describe how we infer the exhibits most             of a (manually collected) record of the exhibits viewed by
likely viewed by the visitor while hovering.                            44 visitors, and the viewing times at the exhibits. On aver-
   After inferring a visitor’s activity (i. e., walking or hover-       age, each visitor viewed 7.2 of the M = 22 exhibits. The
ing) for each sensor observation $\langle t, x, y \rangle$, we extract from the   data for the viewing model described in Section 3 were ob-
complete (x, y) sequence the sub-sequences of (x, y) coordi-            tained separately, by manually annotating a grid-based map
nates that correspond to hovering behaviour. For each sub-              to record the positions of visitors to the exhibition.
sequence of hovering-labelled (x, y) coordinates, we then                  These data were used together with the method from
calculate the following exhibit scores:                                 Section 4 to generate 1000 simulated visits, where each
                                                                        visit comprises time-stamped sequences of (typically noisy)
$\mathrm{score}(i) = \prod_{(x,y)} P(i \mid x, y)$ for all exhibits $i$    (1)     (x, y) coordinates at different noise levels, each element
                                                                        consisting of $\langle t, x, y \rangle$. These 1000 simulated visits are the
where P(i | x, y) is the probability of a visitor viewing ex-           basis for our evaluation. When generating the visits, we as-
hibit i while hovering at position (x, y) (Section 3). To               sumed a constant walking speed of vw = 3 km/h and a hover-
smooth out possible errors introduced in the classification             ing speed of vh = 1 km/h. Also, we used a sampling rate of
step (Section 5.1), we delete walking labels that separate two          one observation per second.
consecutive sub-sequences of hovering labels for which the                 Current range-based positioning systems are often based
same exhibit has the highest score. We also remove hovering-            on processing radio signals, e. g., WiFi and ultra-wide band
labelled sub-sequences of length 1 (the exhibit scores of any           (UWB). WiFi-based technology typically achieves accuracy
affected sub-sequences of hovering labels are recomputed).              levels of 2 to 3.5 metres [Bahl and Padmanabhan, 2000;
Finally, all scores are normalised to obtain probabilities.             Lassabe et al., 2009], while future UWB-based systems
   For each sub-sequence of hovering labels, this procedure             are expected to achieve accuracy levels of up to 0.15 me-
yields a probability distribution which specifies how likely a          tres [Hazas et al., 2004]. We therefore considered accuracy
visitor is to view each exhibit.                                        levels of ν = 0 to 4.5 metres when generating the visits.
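
The score computation of Equation (1) and its normalisation can be sketched as follows. The function name, the toy viewing probabilities, the mapping from metric coordinates to 30 cm grid squares and the uniform fallback when every score is zero are assumptions made for illustration.

```python
def exhibit_posterior(hover_points, view_prob, exhibits, cell=0.3):
    """Normalise score(i) = prod P(i | x, y) over one hovering sub-sequence."""
    scores = {i: 1.0 for i in exhibits}
    for x, y in hover_points:
        # Map a metric coordinate to its grid square (assumed 0.3 m cells).
        square = (int(x // cell), int(y // cell))
        probs = view_prob.get(square, {})
        for i in exhibits:
            scores[i] *= probs.get(i, 0.0)

    total = sum(scores.values())
    if total == 0.0:
        # Assumption: fall back to a uniform distribution if all scores vanish.
        return {i: 1.0 / len(exhibits) for i in exhibits}
    return {i: s / total for i, s in scores.items()}


# Hypothetical viewing probabilities P(i | x, y) for two grid squares.
view_prob = {
    (0, 0): {"A": 0.8, "B": 0.2},
    (1, 0): {"A": 0.6, "B": 0.4},
}
hover = [(0.1, 0.1), (0.2, 0.25), (0.4, 0.1)]
posterior = exhibit_posterior(hover, view_prob, ["A", "B"])
```

In this toy case the scores are 0.8 × 0.8 × 0.6 = 0.384 for A and 0.2 × 0.2 × 0.4 = 0.016 for B, giving normalised probabilities of 0.96 and 0.04. For long sub-sequences, summing log-probabilities instead of multiplying would avoid numerical underflow.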
5.3    Model-based Prediction of Exhibits
Once the viewed exhibits are inferred, we can use this information to predict a visitor’s next
exhibit for each (x, y) position at which the visitor is hovering.3 However, as seen in the
previous section, there is some uncertainty regarding which exhibit the visitor is actually
viewing. We therefore use the Weighted approach described by Schmidt et al. [2009] for
predicting the next exhibit from positional information. For each possible next exhibit i, the
Weighted approach estimates P_next(i | x, y) as the weighted average of the transition
probabilities P_{j,i} from each possible current exhibit j to exhibit i. The weights are the
probabilities P(j | x, y) of viewing exhibit j when standing within the square at position (x, y)
(Section 3).

   \[ \hat{P}_{\text{next}}(i \mid x, y) = \sum_{j=1}^{M} P(j \mid x, y) \times P_{j,i} \]

   3 Predictions of a visitor’s next exhibits can be combined with predictions of the personally
interesting exhibits to generate recommendations of exhibits that may be overlooked if the
predicted next exhibits are actually visited.

6.2    Experiments and Results
To evaluate our models, we applied bootstrapping [Mooney and Duval, 1993] as follows. The 1000
generated visits were split into a training set of 100 visits and a test set of 900 visits. 200
bootstrap samples were then generated from the test set, with each bootstrap sample being
constructed by sampling from the 900 visits with replacement (200 is the recommended upper bound
on the number of samples for bootstrapping [Mooney and Duval, 1993]). The training set remained
the same for all samples. Our results are averaged over the bootstrap samples.5
   We conducted three experiments with these training and test sets: (1) walking/hovering
classification; (2) inferring exhibits from positional hovering coordinates; and (3) predicting
the next exhibit. All performance differences between models were found to be statistically
significant with

   4 Our observations indicate that visitors rarely return to previously viewed exhibits. Hence,
we focus on unseen exhibits.
   5 We employed bootstrapping, because only the test data varies for this technique, compared to
cross validation which conflates the variation in the training and test data.
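
As a minimal sketch of the resampling scheme of Section 6.2 (not our implementation), the following Python fragment draws 200 bootstrap samples of 900 test-visit indices with replacement, averages a per-visit score over each sample, and computes a paired t statistic between two models on the same samples. The per-visit scores, the function names and the fixed seeds are hypothetical, and the conversion of the statistic to a p-value (which requires the t distribution) is left out.

```python
import math
import random
from statistics import mean, stdev

def bootstrap_index_sets(n_visits, n_samples=200, seed=0):
    """Draw n_samples index lists, each of size n_visits, with replacement."""
    rng = random.Random(seed)
    return [rng.choices(range(n_visits), k=n_visits) for _ in range(n_samples)]

def sample_means(per_visit_scores, index_sets):
    """Average a per-visit score over each bootstrap sample."""
    return [sum(per_visit_scores[i] for i in idx) / len(idx) for idx in index_sets]

def paired_t_statistic(a, b):
    """t statistic of the paired differences a[k] - b[k]."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Hypothetical per-visit scores of two models on the 900 test visits.
rng = random.Random(1)
scores_a = [rng.random() for _ in range(900)]
scores_b = [s + 0.1 * rng.random() for s in scores_a]

# The same bootstrap samples are used for both models, so results are paired.
index_sets = bootstrap_index_sets(len(scores_a))
means_a = sample_means(scores_a, index_sets)
means_b = sample_means(scores_b, index_sets)
print(mean(means_a), mean(means_b), paired_t_statistic(means_b, means_a))
```

Only the test data is resampled here; the training set (and hence the trained model that would produce the per-visit scores) stays fixed across samples.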
Table 1: Inference models and their experimental conditions

Models             Time & (x, y)              Walk/Hover    Exhibits: Previous    Exhibits: Current
TLall              sequence of ⟨t, x, y⟩      Inferred      Inferred              Inferred
TLAall             sequence of ⟨t, x, y⟩      Given         Inferred              Inferred
Exhprev TLAcurr    sequence of ⟨t, x, y⟩      Given         Given                 Inferred
Schmidt et al.     one ⟨x, y⟩ per exhibit     N/A           Given                 Inferred
Exhall             sequence of ⟨t, x, y⟩      Given         Given                 Given

(TLall and TLAall have a grey background; Schmidt et al. is typeset in italics.)

Figure 2: Average walking/hovering classification accuracy against sensor error
[Plot: x-axis sensor error (ν), 0 to 4.5; y-axis average classification accuracy, 0.5 to 1; curves: TLall, MCL.]

Figure 3: Average log loss of actually viewed exhibits against sensor error
[Plot: x-axis sensor error (ν), 0 to 4.5; y-axis average log loss, 1.8 to 2.8; curves: TLAall, TLall.]


p < 0.001 (evaluated using two-tailed paired t-tests on the bootstrap samples).
   Table 1 summarises the models used in our experiments, indicating the inferred versus given
information (only the first two models, i.e., those with grey background, are used in our first
two experiments). The top model TLall (Time-Location for all observations) is the most realistic,
as its information is akin to that obtained from sensor readings (i.e., a sequence of
time-stamped (x, y) coordinates). The models then become progressively less realistic, starting
with TLAall (Time-Location-Action for all observations), where the walking/hovering labels are
considered given, up to Exhall, where the walking/hovering labels, previous exhibits and current
exhibit are given. To contextualise our work, Table 1 also shows Schmidt et al.’s model [Schmidt
et al., 2009] (typeset in italics), but its results are excluded from our evaluation, as it does
not model trajectories or temporal information.

Walking/hovering classification. To evaluate the performance of our walking/hovering
classification method (Section 5.1), we gave as input sequences of times and positions
(⟨t_n, x_n, y_n⟩; n = 1, 2, ...). For each walking/hovering classification, we considered the
five positional observations made within the last four seconds (ω = 5). As visitors hover
slightly less than 69% of the time, and walk between exhibits for the rest of the time, we
under-sampled the hovering portion of the training data to balance the classes.6
   Figure 2 depicts classification accuracy as a function of sensor error, where the majority
class baseline (MCL) assumes that a person is always hovering (the results are averaged over the
22 exhibits of the Marine Life Exhibition). Our results show that for no sensor error, our SVM
classifier (TLall) is able to infer whether a visitor is walking or hovering with approximately
97% accuracy. Classification accuracy decreases to about 88% as the sensor error increases to
2.75 metres (the middle of the range for WiFi technology).

Inferring exhibits from positional hovering coordinates. To evaluate the performance of our
mechanism for inferring the sequence of visited exhibits, we gave as input sequences of times and
positions (⟨t_n, x_n, y_n⟩; n = 1, 2, ...) and walking/hovering labels (one label for each
element in a sequence). The probabilities of viewed exhibits were calculated once for given
(known) walking/hovering labels, and once for labels inferred using the SVM classifier
(Section 5.1). The inferences were made as described in Section 5.2, and resulted in a
probability distribution of the exhibit being viewed by a visitor for each sub-sequence of
hovering labels.
   Figure 3 depicts the average log loss (negative log of the probability of the actually viewed
exhibit), averaged over the 22 exhibits, as a function of sensor error. The figure compares the
performance obtained when the walking/hovering labels are inferred (TLall) with that obtained
when the labels are given (TLAall). It is worth noting that the comparison was done for the
timestamps where the inferred and given hovering labels overlap, but the exhibit probabilities
used for the

   6 We under-sampled the larger class, rather than over-sampling the smaller class, in order to
retain the variance of the latter class. We also experimented with unbalanced data, but the
performance was inferior to that obtained with the balanced data.
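
The windowing and class balancing used for the walking/hovering classifier can be illustrated roughly as follows. The sketch assumes one observation per second, so that ω = 5 consecutive observations span the last four seconds, and it assumes that each window takes the label of its last observation. The features computed from a window and the SVM itself are not shown; the names `windows` and `undersample` and the toy track are hypothetical.

```python
import random

def windows(track, omega=5):
    """For each time step with enough history, return the last `omega` observations."""
    return [track[i - omega + 1 : i + 1] for i in range(omega - 1, len(track))]

def undersample(examples, labels, seed=0):
    """Randomly under-sample every class down to the size of the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for example, label in zip(examples, labels):
        by_class.setdefault(label, []).append(example)
    n_min = min(len(items) for items in by_class.values())
    balanced = [(example, label)
                for label, items in by_class.items()
                for example in rng.sample(items, n_min)]
    rng.shuffle(balanced)
    return balanced

# Toy track of (t, x, y) observations, one per second; labels are made up.
track = [(t, 0.1 * t, 0.0) for t in range(20)]
labels = ["hover" if t < 14 else "walk" for t in range(20)]

wins = windows(track)          # 16 windows of 5 observations each
win_labels = labels[4:]        # assumption: label of the last observation in each window
balanced = undersample(wins, win_labels)
counts = {lab: sum(1 for _, l in balanced if l == lab) for lab in set(win_labels)}
print(len(wins), counts)
```

Under-sampling discards examples from the larger (hovering) class only, so every example of the smaller class is retained.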
Figure 4: Predictive performance of the four models against sensor error. (a) Average top-3 accuracy; (b) Average log loss.
[Plots: x-axis in both panels sensor error (ν), 0 to 4.5; y-axes average top-3 predictive accuracy (0.1 to 0.6) and average log loss (2.3 to 3.05); curves in both panels: Exhall, ExhprevTLAcurr, TLAall, TLall.]
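
The quantities plotted in Figure 4 can be sketched in a few lines of Python. The fragment below implements the Weighted estimate of Section 5.3 as a weighted average of transition probabilities, together with log loss and top-3 accuracy for a single prediction. It is an illustration only: the 4-exhibit transition matrix and the distribution P(j | x, y) are invented, exhibits are indexed from 0, the natural logarithm is an assumption, and the zeroing of previously visited exhibits is omitted.

```python
import math

def weighted_next(p_current, transition):
    """P_next(i | x, y) = sum_j P(j | x, y) * P[j][i] for every exhibit i."""
    n = len(transition)
    return [sum(p_current[j] * transition[j][i] for j in range(n)) for i in range(n)]

def log_loss(pred, actual):
    """Negative log of the probability assigned to the exhibit actually viewed next."""
    return -math.log(pred[actual])

def in_top_k(pred, actual, k=3):
    """True if the actual next exhibit is among the k most probable predictions."""
    ranked = sorted(range(len(pred)), key=lambda i: pred[i], reverse=True)
    return actual in ranked[:k]

# Hypothetical transition matrix P[j][i] (each row sums to 1).
transition = [
    [0.00, 0.60, 0.30, 0.10],
    [0.20, 0.00, 0.50, 0.30],
    [0.25, 0.25, 0.00, 0.50],
    [0.40, 0.40, 0.20, 0.00],
]
# Hypothetical P(j | x, y): the visitor is equally likely to be viewing exhibit 0 or 1.
p_current = [0.5, 0.5, 0.0, 0.0]

pred = weighted_next(p_current, transition)
print(pred, log_loss(pred, 2), in_top_k(pred, 3), in_top_k(pred, 0))
```

Averaging `log_loss` and `in_top_k` over all predictions in a test sample gives the two measures reported per level of sensor error.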


comparison were calculated for all the inferred or given hovering labels in each continuous
sub-sequence of hovering labels. This explains the (expected) slight drop in performance for
inferred hovering labels, since, as seen in the first experiment, the inferred labels are
sometimes wrong. Also, as expected, performance deteriorates as sensor error increases.

Predicting the next exhibit. This experiment determines the effect of different assumptions
regarding available information on predictive accuracy. We consider our four models from Table 1,
whose information ranges from time-stamped positional sensor logs (TLall) to sequences of viewed
exhibits (Exhall). In line with Schmidt et al. [2009], for all four models, the next exhibit was
predicted using the transition matrix learned from the 44 tours observed at the Marine Life
Exhibition (Section 3). For Exhall, we used the transition matrix directly (the transition
probabilities for previously visited exhibits were set to zero), while for the other models, we
used the Weighted approach described in Section 5.3.
   Figures 4(a) and 4(b) show, respectively, the average top-3 accuracy and average log loss for
various levels of sensor error for the four models described in Table 1 (the results are averaged
over the 22 exhibits). For this experiment, log loss is defined as the negative log of the
probability with which the exhibit actually viewed next is predicted, and top-3 accuracy measures
how often the exhibit actually viewed next is one of the three exhibits predicted with the
highest probability. We employ top-3 rather than top-1 accuracy because the top probabilities are
often quite similar due to the physical layout of the exhibition. As seen in the figures, the
higher the uncertainty about a visitor’s behaviour and the higher the sensor error, the lower the
accuracy and the higher the log loss (statistically significant). Note that Exhall is invariant
to sensor noise, as all the information is assumed given (Table 1). Interestingly, the
differences in performance between the three lower-information models (TLall, TLAall and
Exhprev TLAcurr) are relatively small, and their performance profiles are quite flat up to
ν = 1.5 metres, diverging slightly from there on. The creditable performance up to ν = 1.5 metres
means that one can expect acceptable predictive performance from sensor-based systems.

7    Conclusions
This paper offered a realistic model of sensor-based information, significantly extending the
work of Schmidt et al. [2009]. Our framework enables us to study the impact of different
assumptions regarding sensor noise and available sensor information on inferential performance
regarding viewed exhibits. The accuracy of these inferences in turn affects the performance of
user models, viz models of visitors’ interests and of exhibits they are likely to visit. As
expected, predictive performance deteriorates for every experimental parameter that is inferred
(rather than given), and also as sensor error increases. However, interestingly, performance
remains quite stable for sensor noise up to 1.5 metres, which is an encouraging result for
real-world systems.
   Our inferential and predictive models in combination support the generation of recommendations
of exhibits that may be of interest but are likely to be missed. Our models may also be used to
influence the strength of recommendations as a function of the reliability of the information on
which the recommendations are based. An additional application of our results is in guiding the
layout of sensing devices in a museum, e.g., it may be advantageous to place more devices in
locations where the inferences are more uncertain.

Acknowledgements
This research was supported in part by grant DP0770931 from the Australian Research Council. The
authors thank Daniel F. Schmidt for his involvement at early stages of this research, Liz
Sonenberg and Carolyn Meehan for fruitful discussions and their support, and David Abramson, Jeff
Tan and Blair Bethwaite for their assistance with the computer cluster.

References
[Bahl and Padmanabhan, 2000] Paramvir Bahl and Venkata N. Padmanabhan. RADAR: An in-building
   RF-based user location and tracking system. In Proceedings of the 19th Annual Joint IEEE
   Conference on Computer Communications (INFOCOM-00), pages 775–784, 2000.
[Bohnert and Zukerman, 2009] Fabian Bohnert and Ingrid Zukerman. Non-intrusive personalisation of
   the museum
   experience. In Proceedings of the 17th International Conference on User Modeling, Adaptation,
   and Personalization (UMAP-09), pages 197–209, 2009.
[Bohnert et al., 2008] Fabian Bohnert, Ingrid Zukerman, Shlomo Berkovsky, Timothy Baldwin, and
   Liz Sonenberg. Using interest and transition models to predict visitor locations in museums.
   AI Communications, 21(2-3):195–202, 2008.
[Chang and Lin, 2001] Chih-Chung Chang and Chih-Jen Lin. LIBSVM: A library for support vector
   machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[Cheverst et al., 2002] Keith Cheverst, Keith Mitchell, and Nigel Davies. The role of adaptive
   hypermedia in a context-aware tourist GUIDE. Communications of the ACM, 45(5):47–51, 2002.
[Dijkstra, 1959] Edsger W. Dijkstra. A note on two problems in connexion with graphs. Numerische
   Mathematik, 1:269–271, 1959.
[Hatala and Wakkary, 2005] Marek Hatala and Ron Wakkary. Ontology-based user modeling in an
   augmented audio reality system for museums. User Modeling and User-Adapted Interaction,
   15(3-4):339–380, 2005.
[Hazas et al., 2004] Mike Hazas, James Scott, and John Krumm. Location-aware computing comes of
   age. IEEE Computer, 37(2):95–97, 2004.
[Lassabe et al., 2009] Frederic Lassabe, Philippe Canalda, Pascal Chatonnay, and François Spies.
   Indoor Wi-Fi positioning: Techniques and systems. Annals of Telecommunications, 64:651–664,
   2009.
[Lundgren, 2007] Jonas Lundgren. SPLINEFIT, 2007. Software available at
   http://www.mathworks.com/matlabcentral/fileexchange/13812-fit-a-spline-to-noisy-data.
[Mooney and Duval, 1993] Christopher Z. Mooney and Robert D. Duval. Bootstrapping: A
   Nonparametric Approach to Statistical Inference. Sage Publications, Newbury Park, CA, USA,
   1993.
[Petrelli and Not, 2005] Daniela Petrelli and Elena Not. User-centred design of flexible
   hypermedia for a mobile guide: Reflections on the HyperAudio experience. User Modeling and
   User-Adapted Interaction, 15(3-4):303–338, 2005.
[Philipose et al., 2004] Matthai Philipose, Kenneth P. Fishkin, Mike Perkowitz, Donald J.
   Patterson, Dieter Fox, Henry Kautz, and Dirk Hahnel. Inferring activities from interactions
   with objects. IEEE Pervasive Computing, 3(4):50–57, 2004.
[Schmidt et al., 2009] Daniel F. Schmidt, Ingrid Zukerman, and David W. Albrecht. Assessing the
   impact of measurement uncertainty on user models in spatial domains. In Proceedings of the
   17th International Conference on User Modeling, Adaptation, and Personalization (UMAP-09),
   pages 210–222, 2009.
[Stock et al., 2007] Oliviero Stock, Massimo Zancanaro, Paolo Busetta, Charles Callaway, Antonio
   Krüger, Michael Kruppa, Tsvika Kuflik, Elena Not, and Cesare Rocchi. Adaptive, intelligent
   presentation of information for the museum visitor in PEACH. User Modeling and User-Adapted
   Interaction, 18(3):257–304, 2007.
[Véron and Levasseur, 1983] Eliséo Véron and Martine Levasseur. Ethnographie de l’Exposition.
   Bibliothèque Publique d’Information, Centre Georges Pompidou, Paris, France, 1983.
[Wang et al., 2009] Yiwen Wang, Lora Aroyo, Natalia Stash, Rody Sambeek, Yuri Schuurmans, Guus
   Schreiber, and Peter Gorgels. Cultivating personalized museum tours online and on-site.
   Interdisciplinary Science Reviews, 34(2):141–156, 2009.
[Zancanaro et al., 2007] Massimo Zancanaro, Tsvika Kuflik, Zvi Boger, Dina Goren-Bar, and Dan
   Goldwasser. Analyzing museum visitors’ behavior patterns. In Proceedings of the 11th
   International Conference on User Modeling (UM-07), pages 238–246, 2007.
[Zhao and Guibas, 2004] Feng Zhao and Leonidas Guibas. Wireless Sensor Networks: An Information
   Processing Approach. Morgan Kaufmann, 2004.