<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Modelling and Predicting Movements of Museum Visitors: A Simulation Framework for Assessing the Impact of Sensor Noise on Model Performance</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Fabian Bohnert</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ingrid Zukerman</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David W. Albrecht</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. of Comp. Sci. and Soft. Eng., The University of Melbourne</institution>
          ,
          <country country="AU">Australia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Faculty of Information Technology, Monash University</institution>
          ,
          <country country="AU">Australia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We present a simulation framework to examine the impact of sensor noise on the performance of user models in the museum domain. Our contributions are: (1) models to simulate noisy visit trajectories as time-stamped sequences of (x, y) positional coordinates which reflect walking and hovering behaviour; (2) a discriminative inference model that distinguishes between hovering and walking on the basis of (simulated) noisy sensor observations; (3) a model that infers viewed exhibits from hovering coordinates; and (4) a model that predicts the next exhibit on the basis of inferred (rather than known) viewed exhibits. Our staged evaluation assesses the effect of these models (in combination with sensor noise) on inferential and predictive performance, thus shedding light on the reliability attributed to inferences drawn from sensor observations.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>The construction of models of visitors to public spaces, in
particular museums, has been of interest to the user modelling
and cultural tourism communities for some time [Cheverst
et al., 2002; Hatala and Wakkary, 2005; Stock et al., 2007].
These models are used to predict visitors’ interests in order
to personalise the content of presentations, or make
recommendations of locations (e.g., exhibits) to be visited. In most
systems developed to date, these user models are acquired
through the active participation of the visitors, e.g., by
providing feedback through a device. This requirement imposes
a burden on the visitors, which in turn may reduce the
reliability of the obtained information, e.g., if visitors provide
feedback only occasionally.</p>
      <p>Recent advances in mobile computing and sensing
technologies have enabled the instrumentation of physical public
spaces, which in turn has enabled the automatic tracking of
visitors’ movements [Hazas et al., 2004; Lassabe et al., 2009;
Philipose et al., 2004]. Information regarding visitors’
whereabouts and the time spent at different locations supports the
automatic inference of visitors’ interests and the prediction of
their trajectories [Bohnert and Zukerman, 2009]. Clearly,
inferences from positional and timing information are more
indirect and uncertain than visitors’ direct feedback. However,
the information stream is reliable, as opposed to information
obtained from visitors’ direct participation.</p>
      <p>In order to personalise content and generate
recommendations on the basis of information provided by unobtrusive
sensors (rather than from user participation), questions of
interest include: (1) how to infer a visitor’s viewed exhibits
solely from sensor readings; and (2) how to predict the next
exhibit(s) a visitor is likely to view. In this paper, we present
a realistic simulation model which offers some insights to
answer these questions, and may be employed to make
decisions regarding the instrumentation of a space.</p>
      <p>In previous research, we offered a simulation framework
for investigating the impact of different sensing technologies
on the predictive performance of user models [Schmidt et al.,
2009]. The aim was to provide a practical solution to the
problem of assessing the accuracy of the user models that can
be derived from a sensor-based system prior to actually
deploying a particular technology. However, that work made
strong simplifying assumptions that affected the realism of
the framework, and hence the significance and usefulness of
its results, viz: (1) sensors can detect, with some error, a
single square (in a grid representation of the museum floor)
where a visitor is statically positioned while viewing an
exhibit i<sub>k</sub>; and (2) the previously viewed exhibits i<sub>1</sub>, …, i<sub>k−1</sub>
are known (not just the previous coordinates of a visitor)
when predicting the next exhibit i<sub>k+1</sub>. In reality, people tend
not to remain stationary at an exhibit, and they certainly do
not ‘teleport’ between squares on the floor. Rather, they walk
between exhibits, and often hover around an exhibit to view
it from different angles or distances. Thus, when sensing a
visitor’s movements in a museum, the best we can hope for is
a time-stamped trajectory of (x, y) coordinates (sampled at a
particular rate), where the observed coordinates diverge from
the true positions of the visitor by some sensor error. As a
result, the sequence of previously viewed exhibits cannot be
known with certainty; at best, a likely sequence of exhibits
can be inferred from the sensor observations.</p>
      <p>In this paper, we propose a simulation framework that
eschews the above assumptions, significantly extending our
previous work and the insights obtained from it.
Specifically, our contributions are: (1) models to simulate noisy visit
trajectories as time-stamped sequences of (x, y) positional
coordinates which reflect walking and hovering behaviour;
(2) a discriminative inference model that distinguishes
between hovering and walking on the basis of noisy sensor
observations; (3) a model that infers likely viewed exhibits from
time-stamped sequences of hovering coordinates (instead of
a single static grid square per exhibit as done in our previous
work); and (4) a model that predicts the next exhibit on the
basis of these inferred (rather than known) viewed exhibits. At
present, we assume that the sensors can only track a visitor’s
position. However, our models may be extended to
incorporate orientation information and occasional user feedback
to improve the accuracy of inferences obtained from sensor
readings, and hence the predictions of subsequent exhibits.</p>
      <p>The research in this paper builds on the framework
described by Schmidt et al. [2009], which comprises a
predictive user model of exhibits to be viewed, and a spatial viewing
model of positions from which each exhibit can be seen. Like
Schmidt et al., we evaluate our framework in the context of
the Marine Life Exhibition at Melbourne Museum. In this
paper, we augment the evaluations done by Schmidt et al.,
presenting the results of a staged evaluation which examines the
effect of different information-based models, in combination
with sensor noise, on inferential and predictive performance.</p>
      <p>This paper is organised as follows. Section 2 discusses
related research. Section 3 briefly summarises the key
components of our previous simulation framework. Our approach
for simulating detailed coordinate-based visit trajectories is
presented in Section 4, and our inference and prediction
models are described in Section 5. The results of our evaluation
are presented in Section 6, followed by concluding remarks
in Section 7.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Related Research</title>
      <p>The research community has initiated a wealth of projects
that investigate user modelling and personalisation
technology in the context of physical spaces. For example, in the
museum domain, HyperAudio dynamically adapted hyperlinks
and presented content to stereotypical assumptions about a
visitor, and to what the visitor has already accessed through
a mobile device and seems interested in [Petrelli and Not,
2005]. The CHIP project harnessed Semantic Web
techniques to provide personalised access to digital museum
collections both online and in the physical museum [Wang et al.,
2009]. This was done by using explicitly initialised user
models. The Kubadji project investigated user and language
modelling techniques that rely on mobile technology deployed in
museums [Bohnert and Zukerman, 2009]. While the focus
was on modelling visitors based on non-intrusive
observations that can be derived from sensor readings, the project did
not evaluate its models with real-world sensing technology.</p>
      <p>In contrast to these projects, which did not employ
real-world sensing technology, other research projects
incorporated wireless technology or sensor networks. The GUIDE
project developed a handheld tourist guide for visitors to the
city of Lancaster, UK [Cheverst et al., 2002]. It employed
user models obtained from explicit user input to generate
dynamic and user-adapted city tours, where the order of the
visited locations could be varied. The project used wireless
access points to stream content data to a user’s device, but did
not employ the wireless network to localise the user. The
PEACH project developed technology which adapts its user
model on the basis of both explicit user feedback and
implicit observations of a user’s interactions with a mobile
device [Stock et al., 2007]. This user model was used to
generate personalised multimedia presentations for museum
visitors. The PEACH project also explored simple localisation
technology, but did not derive user modelling information
from sensor readings. The augmented audio reality system
for museums ec(h)o adapted its user model on the basis of a
visitor’s movements through the exhibition space and his/her
interactions with the system [Hatala and Wakkary, 2005]. The
collected user modelling data were used to deliver
personalised information associated with exhibits via audio display.
However, the project did not investigate the effect of
localisation accuracy on the quality of the resultant user modelling
information.</p>
      <p>In contrast to the above research, this paper investigates
the impact of using sensing technology as a means for
gathering information about a user, i.e., to learn a user model. To
this effect, we offer a simulation framework which generates
noisy visit trajectories that reflect walking and hovering
behaviour, and investigate the relationship between sensor noise
and inferential and predictive user model performance.</p>
    </sec>
    <sec id="sec-3">
      <title>3 Prerequisites</title>
      <p>This section briefly summarises four key components of the
simulation framework introduced by Schmidt et al. [2009],
which is extended in this paper: (1) frequency-based
Transition Model; (2) Spatial Exhibit Viewing Model; (3) generation
of exhibit tours; and (4) generation of exhibit squares.</p>
      <sec id="sec-3-1">
        <title>Frequency-based Transition Model</title>
        <p>We use a frequency-based Transition Model to represent visitors’ movements
between museum exhibits [Bohnert et al., 2008; Schmidt et
al., 2009]. This model, which is implemented as a 1-stage
Markov model, estimates the transition probabilities P<sub>i,j</sub>
between exhibits i and j from frequency counts of exhibit
transitions that are derived from observed visit trajectories. When
estimating the transition probabilities, additive smoothing is
applied in light of our small dataset of 44 observed
trajectories (Section 6.1):</p>
        <p><disp-formula><tex-math>\hat{P}_{i,j} = \frac{n_{i,j} + \alpha_i}{N_i + M \alpha_i} \quad \text{for } i, j = 1, \ldots, M</tex-math></disp-formula>
where n<sub>i,j</sub> counts the transitions from exhibit i to exhibit j,
α<sub>i</sub> is a smoothing constant, N<sub>i</sub> = Σ<sub>k=1,…,M</sub> n<sub>i,k</sub> is the total
number of times exhibit i was viewed, and M is the number
of exhibits.</p>
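        <p>As an illustration only, the smoothed estimate above can be sketched in Python as follows. The function name <monospace>transition_matrix</monospace>, the 0-based exhibit indices, the toy tours and the use of a single smoothing constant <monospace>alpha</monospace> for all exhibits are assumptions made for this sketch, not details taken from the framework.</p>
        <preformat><![CDATA[
```python
def transition_matrix(tours, num_exhibits, alpha=1.0):
    """Additively smoothed first-order transition probabilities.

    tours: list of exhibit-index sequences (0-based), one per observed visit.
    alpha: smoothing constant (assumed identical for every exhibit).
    """
    counts = [[0] * num_exhibits for _ in range(num_exhibits)]
    for tour in tours:
        for i, j in zip(tour, tour[1:]):
            counts[i][j] += 1  # n_{i,j}
    matrix = []
    for row in counts:
        total = sum(row)  # N_i
        denom = total + num_exhibits * alpha
        matrix.append([(n + alpha) / denom for n in row])
    return matrix


# Three made-up tours over three exhibits.
tours = [[0, 1, 2], [0, 2], [1, 2, 0]]
P = transition_matrix(tours, num_exhibits=3, alpha=0.5)
```
]]></preformat>
        <p>Each row of the resulting matrix sums to one, and transitions that were never observed still receive a small non-zero probability, which is the purpose of the smoothing.</p>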
        <p>Spatial Exhibit Viewing Model. Our modelling
framework employs a probabilistic model of the viewing areas for
each exhibit in the museum space, which divides the space
into a grid of squares (for the Marine Life Exhibition, the grid
size is 47 × 61 = 2,867 squares, where a square is
approximately 30 cm × 30 cm; Figure 1). The model specifies a
discrete probability distribution which represents P(i | x, y), the
probability of a visitor viewing each exhibit i from a square
at position (x, y).</p>
        <p>Figure 1: (a) Smooth representation (ground truth); (b) Noisy representation (ν = 2 metres).</p>
        <p>Generation of exhibit tours. We generate tours of viewed
exhibits as follows. Each tour begins at a fictitious start
exhibit i<sub>0</sub> and ends at a fictitious end exhibit i<sub>end</sub>. For each
exhibit i<sub>k−1</sub> already in the tour (k = 1, 2, …), the next
exhibit i<sub>k</sub> is generated by sampling from a categorical
distribution specified by the transition probabilities <inline-formula><tex-math>P_{i_{k-1}, i_k}</tex-math></inline-formula>. This
step is repeated for each added exhibit i<sub>k</sub> until the end
exhibit i<sub>end</sub> is reached.</p>
        <p>In addition to this sequence of exhibits, our
walking/hovering model (Section 4) requires the time that a visitor spends
at each viewed exhibit. We generate a viewing time T<sub>i</sub> at
exhibit i by randomly drawing from an exponential
distribution, i.e., T<sub>i</sub> ∼ Exp(μ<sub>i</sub>), where the average viewing time μ<sub>i</sub>
at each exhibit i is estimated by maximum likelihood from
the 44 observed tours in the Marine Life Exhibition dataset.</p>
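        <p>The tour and viewing-time generation described above could be sketched as follows. This is a minimal sketch under stated assumptions: the four-state toy chain, the mean viewing times and the helper names <monospace>sample_tour</monospace> and <monospace>sample_viewing_time</monospace> are hypothetical and serve only to show the sampling steps.</p>
        <preformat><![CDATA[
```python
import random


def sample_tour(P, start, end, rng):
    """Sample a tour from the Markov chain P until the end state is drawn."""
    tour, current = [], start
    while True:
        nxt = rng.choices(range(len(P)), weights=P[current])[0]
        if nxt == end:
            return tour
        tour.append(nxt)
        current = nxt


def sample_viewing_time(mean_time, rng):
    """Draw an exponential viewing time with the given mean (seconds)."""
    return rng.expovariate(1.0 / mean_time)  # expovariate expects the rate


# Toy chain: state 0 = fictitious start, 3 = fictitious end, 1 and 2 = exhibits.
P = [
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.6, 0.4],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]
mean_times = {1: 40.0, 2: 25.0}  # hypothetical mean viewing times in seconds
rng = random.Random(1)
tour = sample_tour(P, start=0, end=3, rng=rng)
visit = [(exhibit, sample_viewing_time(mean_times[exhibit], rng)) for exhibit in tour]
```
]]></preformat>
        <p>For an exponential distribution, the maximum-likelihood estimate of the mean is the sample mean, so in this sketch the mean times would simply be the average observed viewing time per exhibit.</p>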
      </sec>
      <sec id="sec-3-2">
        <title>Generation of exhibit squares</title>
        <p>Once a tour of exhibits has been simulated, Schmidt et al. [2009] generate a single
viewing square at position (x, y) for each viewed exhibit i in the
tour. This is done by sampling from the categorical
distribution P(x, y | i) over all exhibit squares, where P(x, y | i) is
derived by applying Bayes’ theorem to the viewing probabilities
P(i | x, y) obtained from the Spatial Exhibit Viewing Model.</p>
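        <p>A minimal sketch of this inversion is given below. The prior over squares needed by Bayes’ theorem is not stated here, so the sketch assumes a uniform prior unless one is supplied; the dictionary layout of <monospace>view_prob</monospace> and both function names are likewise assumptions.</p>
        <preformat><![CDATA[
```python
import random


def square_distribution(view_prob, exhibit, prior=None):
    """P(x, y | i) from P(i | x, y) via Bayes' theorem.

    view_prob: dict mapping a square (x, y) to a list of P(i | x, y).
    prior: dict mapping squares to P(x, y); uniform if omitted (assumption).
    """
    squares = list(view_prob)
    if prior is None:
        prior = {sq: 1.0 / len(squares) for sq in squares}
    unnormalised = {sq: view_prob[sq][exhibit] * prior[sq] for sq in squares}
    total = sum(unnormalised.values())
    return {sq: w / total for sq, w in unnormalised.items()}


def sample_square(view_prob, exhibit, rng):
    """Draw one viewing square for the given exhibit."""
    dist = square_distribution(view_prob, exhibit)
    squares = list(dist)
    weights = [dist[sq] for sq in squares]
    return rng.choices(squares, weights=weights)[0]


# Toy viewing model with three squares and two exhibits.
view_prob = {(0, 0): [0.9, 0.1], (0, 1): [0.5, 0.5], (1, 0): [0.1, 0.9]}
dist0 = square_distribution(view_prob, exhibit=0)
square = sample_square(view_prob, exhibit=0, rng=random.Random(3))
```
]]></preformat>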
        <p>In this work, we use Schmidt et al.’s model to generate the
first hovering square for each viewed exhibit (Section 4.2).</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4 Simulation of Coordinate-based Visitor Pathways</title>
      <p>The previous section outlined our method for generating
exhibit tours with a single static grid square per exhibit. In this
section, we simulate (smooth and noisy) coordinate-based
visit trajectories which reflect two types of behaviour:
walking between exhibits, and hovering at exhibits. Our
approach comprises the following four steps, which are
described below: (1) generation of connected paths of
walking squares between exhibits; (2) generation of connected
paths of hovering squares to simulate viewing behaviour at
exhibits; (3) smoothing of the obtained square trajectory; and
(4) simulation of noisy sensor observations from this smooth
pathway representation.</p>
      <p>Figure 1 depicts two representations of part of a simulated
visit trajectory (we show the part for the Tool Time exhibit
in the Mealtime section of the Marine Life Exhibition).
Figure 1(a) shows the trajectory obtained after simulation
(walking is represented by a red/grey line, hovering is represented
by a blue/dark-grey line on pink/shaded squares, and wall
squares are coloured in blue/grey), and Figure 1(b) is the
representation obtained by applying Gaussian sensor noise at a
level of ν = 2 metres.</p>
      <sec id="sec-5-1">
        <title>4.1 Generating Walking Squares</title>
        <p>In Section 3, we generated one viewing square for each
exhibit in a visitor’s tour. However, visitors do not simply
teleport between squares. To produce a more realistic continuous
visit trajectory, we must build a path that links these squares.
At first glance, it seems that a shortest-path algorithm may be
used for this task. However, trajectories generated in this way
exhibit an unnatural level of repetition and purposefulness,
tending to run directly along exhibition walls. In practice,
visitors tend to move more erratically. To simulate these
behaviours, we incorporate stochastic effects into the
shortest-path procedure. Specifically, we model the probability of
moving into a square as being proportional to the probability
of viewing the destination exhibit from this square,
moderated by the visitor’s propensity to avoid walls and to meander.
Our approach uses parameters that control two behavioural
aspects of visitors: (1) how erratic or purposeful their
movement is; and (2) their propensity to avoid walls.<sup>1</sup> These
considerations are implemented as follows.</p>
        <p><sup>1</sup>In our evaluation, we use fixed parameter values. Alternatively,
one could sample the values for each trajectory simulation. Also,
certain parameter values in combination with different transition
models may yield different types of museum visitors, e.g., the ant,
fish, butterfly and grasshopper types [Véron and Levasseur, 1983;
Zancanaro et al., 2007].</p>
        <p>Assume we want to generate a sequence of walking squares
to connect two exhibits i and j in a tour. Let (x<sub>s</sub>, y<sub>s</sub>)
denote the end square of exhibit i (i.e., the source square),
and (x<sub>d</sub>, y<sub>d</sub>) the starting square of exhibit j (i.e., the
destination square). Also, treating diagonal squares as
adjacent, let the candidate squares of a square (x, y) be the
eight squares surrounding this square. We start by employing
Dijkstra’s algorithm [Dijkstra, 1959] to generate a distance
matrix D whose elements D<sub>x,y</sub> correspond to the
shortest-path distances from each square (x, y) to the destination
square (x<sub>d</sub>, y<sub>d</sub>). Then, we generate a sequence of walking
squares as follows. For each square (x<sub>n</sub>, y<sub>n</sub>) (starting from
the source square (x<sub>s</sub>, y<sub>s</sub>)), the next square (x<sub>n+1</sub>, y<sub>n+1</sub>) that
a visitor moves into while walking is sampled from among
(x<sub>n</sub>, y<sub>n</sub>)’s eight candidate squares, provided that the move
does not take the visitor farther away from (x<sub>d</sub>, y<sub>d</sub>) (the
distance information is obtained from D). In this procedure, the
sampling is performed from a categorical distribution over
the eight candidate squares, whose probabilities are
proportional to the probabilities of viewing the destination exhibit
from each square, moderated by the visitor’s propensity to
avoid walls and to meander (the probabilities are zero for the
squares that take the visitor farther away from (x<sub>d</sub>, y<sub>d</sub>)). The
visitor moves in this fashion until (x<sub>d</sub>, y<sub>d</sub>) is reached. At that
point, the trajectory between (x<sub>s</sub>, y<sub>s</sub>) and (x<sub>d</sub>, y<sub>d</sub>) is
complete, and timestamps are iteratively added to the trajectory
assuming a constant walking speed v<sub>w</sub> for the visitor.</p>
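        <p>The walking-square procedure can be sketched roughly as below. Several details are assumptions of this sketch rather than of the framework: diagonal steps cost √2, the toy grid and wall layout are invented, and the <monospace>weight</monospace> function is only a stand-in for the viewing probability of the destination exhibit (the moderation by wall avoidance and meandering is not specified in enough detail here to reproduce, so it is omitted).</p>
        <preformat><![CDATA[
```python
import heapq
import math
import random

# The eight candidate moves around a square (diagonals count as adjacent).
MOVES = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]


def distance_map(width, height, walls, dest):
    """Dijkstra from the destination square; returns {square: distance}."""
    dist = {dest: 0.0}
    heap = [(0.0, dest)]
    while heap:
        d, (x, y) = heapq.heappop(heap)
        if d > dist[(x, y)]:
            continue  # stale queue entry
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            inside = 0 <= nxt[0] < width and 0 <= nxt[1] < height
            if not inside or nxt in walls:
                continue
            nd = d + math.hypot(dx, dy)  # assumed step cost: 1 or sqrt(2)
            if nd < dist.get(nxt, math.inf):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return dist


def sample_walk(source, dest, dist, weight, rng):
    """Sample squares from source to dest, never moving farther from dest."""
    path = [source]
    while path[-1] != dest:
        x, y = path[-1]
        limit = dist[(x, y)] + 1e-9
        options = [(x + dx, y + dy) for dx, dy in MOVES]
        options = [sq for sq in options if sq in dist and dist[sq] <= limit]
        weights = [weight(sq) for sq in options]
        path.append(rng.choices(options, weights=weights)[0])
    return path


def add_timestamps(path, speed, cell_size, start=0.0):
    """Attach times assuming constant speed (metres/second) on a metric grid."""
    times = [start]
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        times.append(times[-1] + cell_size * math.hypot(x1 - x0, y1 - y0) / speed)
    return list(zip(times, path))


# Toy 6 x 5 grid with a short wall; the weight stands in for P(j | x, y).
walls = {(3, 1), (3, 2), (3, 3)}
dest = (5, 2)
dist = distance_map(6, 5, walls, dest)
weight = lambda sq: 1.0 / (1.0 + dist[sq])
path = sample_walk((0, 2), dest, dist, weight, random.Random(0))
stamped = add_timestamps(path, speed=3 / 3.6, cell_size=0.3)
```
]]></preformat>
        <p>Because moves that keep the distance unchanged are allowed, the sampled path can meander sideways, but a strictly closer neighbour always exists, so the walk reaches the destination with probability one.</p>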
      </sec>
      <sec id="sec-5-2">
        <title>4.2 Generating Hovering Squares</title>
        <p>Once at an exhibit, visitors usually observe the exhibit for
some time before moving on to the next one. Additionally,
visitors typically do not remain static, but move around to
examine the exhibit from different angles and distances. This
so-called hovering behaviour is included in our simulation
framework by varying the movement model described in
Section 4.1, so that a visitor is more likely to move towards a
square from which the exhibit is more likely to be viewed,
but may not move at all.</p>
        <p>Timestamps are added to the generated hovering squares
assuming a hovering speed of v<sub>h</sub> &lt; v<sub>w</sub> (as for the walking
case, we assume a constant hovering speed). The hovering
behaviour continues until the sampled viewing time T<sub>i</sub> for
the current exhibit i is exceeded (viewing time sampling is
described in Section 3).</p>
      </sec>
      <sec id="sec-5-3">
        <title>4.3 Smoothing the Square Trajectory</title>
        <p>To obtain a smooth positional tour representation from a
time-stamped trajectory of squares, i.e., (⟨t<sub>n</sub>, x<sub>n</sub>, y<sub>n</sub>⟩, n =
1, 2, …), we fit piecewise cubic splines to the
coordinate-individual trajectories ⟨t<sub>n</sub>, x<sub>n</sub>⟩ and ⟨t<sub>n</sub>, y<sub>n</sub>⟩ (one piecewise
cubic spline each). We do this by applying the splinefit
package from the Matlab Central File Exchange [Lundgren,
2007]. This approach uses the method of least squares to fit
splines with reduced degrees of freedom (we reduce the
number of spline pieces by 70% compared to direct interpolation),
and generates a smooth representation of the trajectory in the
sense that (x, y), (ẋ, ẏ) and (ẍ, ÿ) are all continuous in time.</p>
        <p>The resultant representation may be interpreted as a
continuous positional representation of the visit trajectory, enabling
us to obtain a visitor’s position at any point in time.
Figure 1(a) depicts part of one such smooth visit trajectory.</p>
      </sec>
      <sec id="sec-5-4">
        <title>4.4 Simulating Sensor Noise</title>
        <p>The visit trajectories obtained so far are smooth and
continuous. However, in practice, any trajectory-based input to a user
modelling system would be acquired through sensors that
deliver only a visitor’s approximate position (due to
measurement error) at a certain sampling rate.</p>
        <p>In this paper, we explore sensor noise that may be
attributed to range-based positioning technology, e.g., WiFi
and ultra-wide band (UWB) [Zhao and Guibas, 2004]. We
follow a widely accepted model for sensor noise in this
setting, and assume that the measured coordinates (x′, y′) are
obtained by distorting the true coordinates (x, y) through
additive Gaussian noise and sampling at regular time
intervals (for our experiments, we use a constant sampling
rate of one observation per second). Specifically, the measured coordinates
are found by sampling from a bivariate normal
distribution N((x, y), σ²I) with mean (x, y) and covariance σ²I,
where σ is a constant which reflects the expected accuracy
of the sensing infrastructure, and I is the identity matrix.
For example, if the infrastructure is able to deliver
positions within an accuracy level of ν metres 95% of the time,
then σ = ν/2 would be a suitable value, as this places
approximately 95% of the probability mass within the circle
defined by (x′ − x)² + (y′ − y)² = ν². Figure 1(b) depicts
part of a noisy visit trajectory which was sampled by
following this procedure for the pathway shown in Figure 1(a) at a
sampling rate of one observation per second with ν = 2 metres.</p>
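        <p>As a rough illustration of this noise model, the following sketch perturbs positions that have already been sampled once per second from the smooth trajectory. The function name <monospace>add_sensor_noise</monospace> and the straight-line toy trajectory are assumptions; the only modelling choice taken from the text is independent Gaussian noise on each coordinate with σ = ν/2.</p>
        <preformat><![CDATA[
```python
import random


def add_sensor_noise(positions, nu, rng):
    """Distort true (x, y) positions with isotropic Gaussian noise, sigma = nu / 2."""
    sigma = nu / 2.0
    return [(rng.gauss(x, sigma), rng.gauss(y, sigma)) for x, y in positions]


# True positions sampled once per second along a hypothetical straight walk.
truth = [(0.8 * t, 2.0) for t in range(10)]
noisy = add_sensor_noise(truth, nu=2.0, rng=random.Random(0))
clean = add_sensor_noise(truth, nu=0.0, rng=random.Random(0))
```
]]></preformat>
        <p>With ν = 0 the observations coincide with the true positions, which corresponds to the zero-noise end of the accuracy range.</p>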
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5 Inference and Prediction of Viewed Exhibits from Positional Coordinates</title>
      <p>When information on a visitor’s movements is automatically
gathered through sensors, all that is available is a sequence
of (typically noisy) time-stamped (x, y) coordinates
(Section 4.4).<sup>2</sup> Assuming that we have a method for detecting
whether a visitor is hovering (and hence viewing an exhibit),
we can decompose the complete (x, y) sequence into
sub-sequences of (x, y) coordinates that pertain to hovering
behaviour (Section 5.1). From these, we can infer which exhibit
the visitor is viewing (Section 5.2), and employ a model to
predict which exhibit the visitor is likely to view next on the
basis of this information (Section 5.3).</p>
      <sec id="sec-6-2">
        <title>5.1 Classification-based Inference of Walking and Hovering</title>
        <p>To infer walking and hovering behaviour from positional
(x, y) coordinates, we employ a window-based approach. We
first derive indicative features from a window comprising the
previous ω sensor observations, and then provide these
features to a purpose-trained classifier for inference. The output
of this binary classifier is a label which indicates whether a
visitor’s activity is walking or hovering.</p>
        <p>Prior to deriving the features, we smooth the noisy
sensor observations ⟨t, x, y⟩ by fitting piecewise cubic splines to
the ⟨t, x⟩ and ⟨t, y⟩ trajectories [Lundgren, 2007], and
evaluating these splines at the original timestamps (similarly to
Section 4.3). Using the resultant smoothed sensor
observations, we compute the following feature set of size 2ω + 7
that pertains to (non-directional) velocity and acceleration:</p>
        <list list-type="bullet">
          <list-item><p>ω − 1 velocities (each of them calculated as the length of
one of the ω − 1 velocity vectors, which in turn are
derived from the ω smoothed positional coordinates from
within the window)</p></list-item>
          <list-item><p>Minimum and maximum of the ω − 1 velocities</p></list-item>
          <list-item><p>Mean and median of the ω − 1 velocities</p></list-item>
          <list-item><p>Standard deviation of the ω − 1 velocities</p></list-item>
          <list-item><p>ω − 2 accelerations (each of them calculated as the length
of one of the ω − 2 acceleration vectors, which in turn
are derived from the ω − 1 velocity vectors)</p></list-item>
        </list>
        <p><sup>2</sup>For simplicity of notation, we use (x, y) instead of (x′, y′) in
the remainder of the paper to denote the measured noisy coordinates.</p>
        <p>In our experiments (Section 6), we use support vector
machines (SVM) to train the classifiers. We employ C-SVC
SVMs with an RBF kernel from LIBSVM [Chang and Lin,
2001], using features derived from the previous five (x, y)
observations (ω = 5).</p>
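        <p>The window features could be computed along the following lines. This sketch rests on two assumptions that are not confirmed by the text: the same five summary statistics (minimum, maximum, mean, median, standard deviation) are also taken over the accelerations, which is one way to arrive at the stated total of 2ω + 7 features, and the population standard deviation is used. The function name <monospace>window_features</monospace> is hypothetical, and the input is assumed to be already smoothed.</p>
        <preformat><![CDATA[
```python
import math
import statistics


def window_features(points, dt=1.0):
    """Speed and acceleration features for a window of smoothed (x, y) points.

    points: the omega most recent positions, sampled dt seconds apart.
    Returns 2 * omega + 7 values under the assumptions stated above.
    """
    velocities = [((x1 - x0) / dt, (y1 - y0) / dt)
                  for (x0, y0), (x1, y1) in zip(points, points[1:])]
    accelerations = [((vx1 - vx0) / dt, (vy1 - vy0) / dt)
                     for (vx0, vy0), (vx1, vy1) in zip(velocities, velocities[1:])]
    speeds = [math.hypot(vx, vy) for vx, vy in velocities]      # omega - 1 values
    accels = [math.hypot(ax, ay) for ax, ay in accelerations]   # omega - 2 values

    def summary(values):
        return [min(values), max(values), statistics.mean(values),
                statistics.median(values), statistics.pstdev(values)]

    # The summary over accelerations is an assumption of this sketch.
    return speeds + summary(speeds) + accels + summary(accels)


# A visitor moving one unit per second along the x axis (omega = 5).
features = window_features([(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)])
```
]]></preformat>
        <p>The resulting vectors, one per observation, would then be passed to the binary classifier together with the walking/hovering labels of the training trajectories.</p>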
      </sec>
      <sec id="sec-6-3">
        <title>5.2 Probability-based Inference of Exhibits</title>
        <p>In this section, we describe how we infer the exhibits most
likely viewed by the visitor while hovering.</p>
        <p>After inferring a visitor’s activity (i.e., walking or
hovering) for each sensor observation ⟨t, x, y⟩, we extract from the
complete (x, y) sequence the sub-sequences of (x, y)
coordinates that correspond to hovering behaviour. For each
sub-sequence of hovering-labelled (x, y) coordinates, we then
calculate the following exhibit scores:
<disp-formula><label>(1)</label><tex-math>\mathrm{score}(i) = \prod_{(x,y)} P(i \mid x, y) \quad \text{for all exhibits } i</tex-math></disp-formula>
where P(i | x, y) is the probability of a visitor viewing
exhibit i while hovering at position (x, y) (Section 3). To
smooth out possible errors introduced in the classification
step (Section 5.1), we delete walking labels that separate two
consecutive sub-sequences of hovering labels for which the
same exhibit has the highest score. We also remove
hovering-labelled sub-sequences of length 1 (the exhibit scores of any
affected sub-sequences of hovering labels are recomputed).
Finally, all scores are normalised to obtain probabilities.</p>
        <p>For each sub-sequence of hovering labels, this procedure
yields a probability distribution which specifies how likely a
visitor is to view each exhibit.</p>
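        <p>A small sketch of the score computation and normalisation for one hovering sub-sequence follows. It assumes the coordinates have already been mapped to grid squares, works in log space to avoid numerical underflow on long sub-sequences, and floors zero probabilities at a tiny value; these choices, the <monospace>view_prob</monospace> layout and the function name are assumptions of the sketch.</p>
        <preformat><![CDATA[
```python
import math


def exhibit_posterior(hover_squares, view_prob, num_exhibits, floor=1e-12):
    """Normalised product of P(i | x, y) over the squares of one sub-sequence."""
    log_scores = [0.0] * num_exhibits
    for sq in hover_squares:
        for i in range(num_exhibits):
            # The floor avoids log(0); it is an assumption of this sketch.
            log_scores[i] += math.log(max(view_prob[sq][i], floor))
    peak = max(log_scores)
    weights = [math.exp(s - peak) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]


# Toy viewing model: two squares, two exhibits.
view_prob = {(0, 0): [0.8, 0.2], (0, 1): [0.6, 0.4]}
posterior = exhibit_posterior([(0, 0), (0, 1)], view_prob, num_exhibits=2)
```
]]></preformat>
        <p>In this toy case the unnormalised scores are 0.8 × 0.6 = 0.48 and 0.2 × 0.4 = 0.08, so the first exhibit receives 0.48 / 0.56 of the probability mass.</p>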
      </sec>
      <sec id="sec-6-4">
        <title>5.3 Model-based Prediction of Exhibits</title>
        <p>Once the viewed exhibits are inferred, we can use this
information to predict a visitor’s next exhibit for each (x, y)
position at which the visitor is hovering.<sup>3</sup> However, as seen
in the previous section, there is some uncertainty regarding
which exhibit the visitor is actually viewing. We therefore
use the Weighted approach described by Schmidt et al. [2009]
for predicting the next exhibit from positional information.
For each possible next exhibit i, the Weighted approach
estimates P<sub>next</sub>(i | x, y) as the weighted average of the
transition probabilities P<sub>j,i</sub> from each possible current exhibit j
to exhibit i. The weights are the probabilities P(j | x, y) of
viewing exhibit j when standing within the square at
position (x, y) (Section 3).</p>
        <p><disp-formula><tex-math>\hat{P}_{\mathrm{next}}(i \mid x, y) = \sum_{j=1}^{M} \left\{ P(j \mid x, y) \cdot P_{j,i} \right\}</tex-math></disp-formula></p>
        <p><sup>3</sup>Predictions of a visitor’s next exhibits can be combined with
predictions of the personally interesting exhibits to generate
recommendations of exhibits that may be overlooked if the predicted next
exhibits are actually visited.</p>
        <p>In this calculation, the transition probabilities P<sub>j,i</sub> are
derived from the information provided by the Transition Model
in Section 3 by setting to zero the columns of the transition
matrix that pertain to the already viewed exhibits, and
renormalising each row of the matrix to 1.<sup>4</sup></p>
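        <p>The Weighted prediction, including the column masking and row renormalisation, might be sketched as follows. The function names, the three-exhibit toy matrix and the handling of rows whose remaining mass is zero (left as all zeros) are assumptions of this sketch.</p>
        <preformat><![CDATA[
```python
def restrict_transitions(P, viewed):
    """Zero the columns of viewed exhibits and renormalise each row to 1."""
    restricted = []
    for row in P:
        kept = [0.0 if j in viewed else p for j, p in enumerate(row)]
        total = sum(kept)
        # Rows with no remaining mass are left as zeros (assumption).
        restricted.append([p / total for p in kept] if total > 0 else kept)
    return restricted


def predict_next(view_probs_here, P, viewed=()):
    """Weighted approach: sum over j of P(j | x, y) * P[j][i]."""
    Q = restrict_transitions(P, set(viewed))
    m = len(P)
    return [sum(view_probs_here[j] * Q[j][i] for j in range(m)) for i in range(m)]


# Toy example: exhibit 0 has already been viewed.
P = [[0.2, 0.5, 0.3], [0.4, 0.2, 0.4], [0.3, 0.3, 0.4]]
view_probs_here = [0.7, 0.3, 0.0]  # P(j | x, y) at the current square
prediction = predict_next(view_probs_here, P, viewed=[0])
```
]]></preformat>
        <p>Here the prediction assigns no probability to the already viewed exhibit 0, and the remaining mass is split between exhibits 1 and 2 (0.5375 and 0.4625).</p>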
      </sec>
    </sec>
    <sec id="sec-7">
      <title>6 Evaluation</title>
      <p>This section presents our data collection method and datasets,
and describes our experiments and results.</p>
      <sec id="sec-7-1">
        <title>6.1 Data Collection and Datasets</title>
        <p>Our dataset of real-world exhibit tours was obtained at the
Marine Life Exhibition at Melbourne Museum. It consists
of a (manually collected) record of the exhibits viewed by
44 visitors, and the viewing times at the exhibits. On
average, each visitor viewed 7.2 of the M = 22 exhibits. The
data for the viewing model described in Section 3 were
obtained separately, by manually annotating a grid-based map
to record the positions of visitors to the exhibition.</p>
        <p>These data were used together with the method from
Section 4 to generate 1000 simulated visits, where each
visit comprises time-stamped sequences of (typically noisy)
(x, y) coordinates at different noise levels, each element
consisting of ⟨t, x, y⟩. These 1000 simulated visits are the
basis for our evaluation. When generating the visits, we
assumed a constant walking speed of v<sub>w</sub> = 3 km/h and a
hovering speed of v<sub>h</sub> = 1 km/h. Also, we used a sampling rate of
one observation per second.</p>
        <p>Current range-based positioning systems are often based
on processing radio signals, e.g., WiFi and ultra-wide band
(UWB). WiFi-based technology typically achieves accuracy
levels of 2 to 3.5 metres [Bahl and Padmanabhan, 2000;
Lassabe et al., 2009], while future UWB-based systems
are expected to achieve accuracy levels of up to 0.15
metres [Hazas et al., 2004]. We therefore considered accuracy
levels of ν = 0 to 4.5 metres when generating the visits.</p>
      </sec>
      <sec id="sec-7-2">
        <title>6.2 Experiments and Results</title>
        <p>
          To evaluate our models, we applied bootstrapping [Mooney
and Duval, 1993] as follows. The 1000 generated visits were
split into a training set of 100 visits and a test set of 900
visits. 200 bootstrap samples were then generated from the test
set, with each bootstrap sample being constructed by
sampling from the 900 visits with replacement
          <xref ref-type="bibr" rid="ref11">(200 is the
recommended upper bound on the number of samples for
bootstrapping [Mooney and Duval, 1993])</xref>
          . The training set remained
the same for all samples. Our results are averaged over the
bootstrap samples.<sup>5</sup>
        </p>
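        <p>The resampling step can be illustrated with the sketch below. The per-visit scores are random placeholders rather than results, and the function name <monospace>bootstrap_means</monospace> is hypothetical; the sketch only shows sampling the 900 test visits with replacement 200 times and averaging over the samples.</p>
        <preformat><![CDATA[
```python
import random
import statistics


def bootstrap_means(scores, num_samples, rng):
    """Mean score of each bootstrap sample drawn with replacement."""
    n = len(scores)
    return [statistics.mean(rng.choices(scores, k=n)) for _ in range(num_samples)]


rng = random.Random(42)
test_scores = [rng.random() for _ in range(900)]  # placeholder per-visit scores
means = bootstrap_means(test_scores, num_samples=200, rng=rng)
overall = statistics.mean(means)
```
]]></preformat>
        <p>Since the training set is fixed, the spread of the 200 sample means reflects variation in the test data only.</p>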
        <p>We conducted three experiments with these training and
test sets: (1) walking/hovering classification; (2) inferring
exhibits from positional hovering coordinates; and (3)
predicting the next exhibit. All performance differences
between models were found to be statistically significant at
the 0.001 level (evaluated using two-tailed paired t-tests on the
bootstrap samples).</p>
        <p><sup>4</sup>Our observations indicate that visitors rarely return to
previously viewed exhibits. Hence, we focus on unseen exhibits.</p>
        <p><sup>5</sup>We employed bootstrapping, because only the test data varies
for this technique, compared to cross validation which conflates the
variation in the training and test data.</p>
        <p>[Plots omitted; horizontal axis: sensor error (ν), 0.5 to 4.5.]</p>
        <p>Table 1 summarises the models used in our experiments,
indicating the inferred versus given information (only the
first two models, i.e., those with grey background, are used
in our first two experiments). The top model TLall
(Time-Location for all observations) is the most realistic, as its
information is akin to that obtained from sensor readings (i.e.,
a sequence of time-stamped (x, y) coordinates). The models
then become progressively less realistic, starting with TLAall
(Time-Location-Action for all observations), where the
walking/hovering labels are considered given, up to Exhall, where
the walking/hovering labels, previous exhibits and current
exhibit are given. To contextualise our work, Table 1 also shows
Schmidt et al.’s model [Schmidt et al., 2009] (typeset in
italics), but its results are excluded from our evaluation, as it does
not model trajectories or temporal information.</p>
        <p>Walking/hovering classification. To evaluate the
performance of our walking/hovering classification method
(Section 5.1), we gave as input sequences of times and
positions (⟨t<sub>n</sub>, x<sub>n</sub>, y<sub>n</sub>⟩, n = 1, 2, …). For each walking/hovering
classification, we considered the five positional observations
made within the last four seconds (ω = 5). As visitors hover
slightly less than 69% of the time, and walk between exhibits
for the rest of the time, we under-sampled the hovering
portion of the training data to balance the classes.<sup>6</sup></p>
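        <p>The windowing and class balancing described above might look like the sketch below. It is not the authors' implementation: the helper names window_features and undersample, the one-observation-per-second sampling rate, and the toy track and labels are all assumptions made for illustration; the SVM itself is omitted.</p>
        <preformat><![CDATA[
```python
import random


def window_features(track, omega=5):
    """For each index n >= omega - 1, return the last `omega` (t, x, y) observations."""
    return [track[n - omega + 1 : n + 1] for n in range(omega - 1, len(track))]


def undersample(examples, labels, seed=0):
    """Randomly drop examples of the larger class until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for ex, lab in zip(examples, labels):
        by_class.setdefault(lab, []).append(ex)
    n_min = min(len(exs) for exs in by_class.values())
    balanced = []
    for lab, exs in by_class.items():
        # Sampling without replacement keeps every retained example unique.
        balanced.extend((ex, lab) for ex in rng.sample(exs, n_min))
    rng.shuffle(balanced)
    return balanced


# Toy track: one (t, x, y) observation per second (assumed sampling rate).
track = [(t, 0.1 * t, 0.0) for t in range(10)]
windows = window_features(track)            # 6 windows of 5 observations each

# Hypothetical labels, one per window; hovering is the larger class.
labels = ["hover"] * 4 + ["walk"] * 2
balanced = undersample(windows, labels)     # 2 hovering and 2 walking examples
```
]]></preformat>
        <p>With five observations spanning four seconds, each window covers exactly ω = 5 consecutive readings under the assumed sampling rate.</p>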
        <p><sup>6</sup> We under-sampled the larger class, rather than over-sampling
the smaller class, in order to retain the variance of the latter class.
We also experimented with unbalanced data, but the performance
was inferior to that obtained with the balanced data.</p>
        <p>Figure 2 depicts classification accuracy as a function of
sensor error, where the majority class baseline (MCL)
assumes that a person is always hovering (the results are
averaged over the 22 exhibits of the Marine Life Exhibition).
Our results show that, with no sensor error, our SVM
classifier (TLall) infers whether a visitor is walking or
hovering with approximately 97% accuracy. Classification
accuracy decreases to about 88% as the sensor error increases
to 2.75 metres (the middle of the range for WiFi technology).</p>
      </sec>
      <sec id="sec-7-3">
        <title>Inferring exhibits from positional hovering coordinates.</title>
        <p>To evaluate the performance of our mechanism for inferring
the sequence of visited exhibits, we gave as input sequences
of times and positions (⟨t<sub>n</sub>, x<sub>n</sub>, y<sub>n</sub>⟩, n = 1, 2, …) and
walking/hovering labels (one label for each element in a
sequence). The probabilities of viewed exhibits were calculated
once for given (known) walking/hovering labels, and once for
labels inferred using the SVM classifier (Section 5.1). The
inferences were made as described in Section 5.2, and resulted
in a probability distribution of the exhibit being viewed by a
visitor for each sub-sequence of hovering labels.</p>
        <p>Figure 3 depicts the average log loss (negative log of the
probability of the actually viewed exhibit), averaged over the
22 exhibits, as a function of sensor error. The figure compares
the performance obtained when the walking/hovering labels
are inferred (TLall) with that obtained when the labels are
given (TLAall). It is worth noting that the comparison was
done for the timestamps where the inferred and given
hovering labels overlap, but the exhibit probabilities used for the
comparison were calculated for all the inferred or given
hovering labels in each continuous sub-sequence of hovering
labels. This explains the (expected) slight drop in performance
for inferred hovering labels, since, as seen in the first
experiment, the inferred labels are sometimes wrong. Also, as
expected, performance deteriorates as sensor error increases.</p>
        <p>[Figure 4: (a) Average top-3 accuracy; (b) Average log loss. Horizontal axis: sensor error (ν).]</p>
        <p>
Predicting the next exhibit. This experiment determines
the effect of different assumptions regarding available
information on predictive accuracy. We consider our four models
from Table 1, whose information ranges from time-stamped
positional sensor logs (TLall) to sequences of viewed
exhibits (Exhall). In line with Schmidt et al. [2009], for all
four models, the next exhibit was predicted using the
transition matrix learned from the 44 tours observed at the Marine
Life Exhibition (Section 3). For Exhall, we used the
transition matrix directly (the transition probabilities for previously
visited exhibits were set to zero), while for the other models,
we used the Weighted approach described in Section 5.3.</p>
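        <p>For the case where the viewed exhibits are given, the prediction step can be illustrated with the sketch below. It is a simplified reading of the text, not the authors' code: the function name predict_next, the renormalisation after zeroing, the treatment of the current exhibit as already visited, and the 3-exhibit toy matrix are assumptions. The Weighted approach of Section 5.3 is not shown.</p>
        <preformat><![CDATA[
```python
def predict_next(transition, current, visited):
    """Distribution over the next exhibit, given row `current` of a transition matrix.

    Probabilities of previously visited exhibits are set to zero; the remaining
    mass is renormalised (the renormalisation is an assumption of this sketch).
    """
    row = list(transition[current])
    for j in visited:
        row[j] = 0.0
    total = sum(row)
    if total == 0.0:
        return row  # no unseen exhibit is reachable from here
    return [p / total for p in row]


# Toy transition matrix over three exhibits (illustrative values only).
T = [
    [0.0, 0.6, 0.4],
    [0.5, 0.0, 0.5],
    [0.3, 0.7, 0.0],
]

# The visitor saw exhibit 0, is now at exhibit 1, so only exhibit 2 remains.
dist = predict_next(T, current=1, visited={0, 1})
```
]]></preformat>
        <p>In this toy case all remaining probability mass moves to exhibit 2, the only unseen exhibit.</p>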
        <p>Figures 4(a) and 4(b) show, respectively, the average
top-3 accuracy and average log loss for various levels of
sensor error for the four models described in Table 1 (the
results are averaged over the 22 exhibits). For this
experiment, log loss is defined as the negative log of the
probability with which the exhibit actually viewed next is predicted,
and top-3 accuracy measures how often the exhibit actually
viewed next is one of the three exhibits predicted with the
highest probability. We employ top-3 rather than top-1
accuracy because the top probabilities are often quite similar
due to the physical layout of the exhibition. As seen in the
figures, the higher the uncertainty about a visitor’s behaviour
and the higher the sensor error, the lower the accuracy and the
higher the log loss (statistically significant). Note that Exhall
is invariant to sensor noise, as all the information is assumed
given (Table 1). Interestingly, the differences in performance
between the three lower-information models (TLall, TLAall
and ExhprevTLAcurr) are relatively small, and their
performance profiles are quite flat up to ν = 1.5 metres,
diverging slightly from there on. The creditable performance up to
ν = 1.5 metres suggests that one can expect acceptable
predictive performance from sensor-based systems.</p>
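        <p>The two measures defined above can be computed per prediction as in the following sketch. The function names log_loss and in_top_k, the use of the natural logarithm, and the example distribution are assumptions made here; the text does not state the log base.</p>
        <preformat><![CDATA[
```python
import math


def log_loss(dist, actual):
    """Negative log of the probability assigned to the exhibit actually viewed next.

    Natural log is assumed; dist[actual] must be greater than zero.
    """
    return -math.log(dist[actual])


def in_top_k(dist, actual, k=3):
    """True if `actual` is among the k exhibits predicted with the highest probability."""
    ranked = sorted(range(len(dist)), key=lambda j: dist[j], reverse=True)
    return actual in ranked[:k]


# Hypothetical predicted distribution over five exhibits.
dist = [0.10, 0.40, 0.30, 0.15, 0.05]

loss = log_loss(dist, actual=2)      # -ln(0.30)
hit = in_top_k(dist, actual=2)       # exhibit 2 is ranked second
miss = in_top_k(dist, actual=0)      # exhibit 0 is ranked fourth
```
]]></preformat>
        <p>Averaging loss and the hit indicator over all predictions gives the average log loss and top-3 accuracy, respectively.</p>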
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Conclusions</title>
      <p>This paper offered a realistic model of sensor-based
information, significantly extending the work of Schmidt et al.
[2009]. Our framework enables us to study the impact of
different assumptions regarding sensor noise and available
sensor information on inferential performance regarding viewed
exhibits. The accuracy of these inferences in turn affects the
performance of user models, viz., models of visitors’ interests
and of exhibits they are likely to visit. As expected,
predictive performance deteriorates for every experimental
parameter that is inferred (rather than given), and also as sensor error
increases. However, interestingly, performance remains quite
stable for sensor noise up to 1.5 metres, which is an
encouraging result for real-world systems.</p>
      <p>Our inferential and predictive models in combination
support the generation of recommendations of exhibits that may
be of interest but are likely to be missed. Our models may
also be used to influence the strength of recommendations as
a function of the reliability of the information on which the
recommendations are based. An additional application of our
results is in guiding the layout of sensing devices in a
museum; e.g., it may be advantageous to place more devices in
locations where the inferences are more uncertain.</p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgements</title>
      <p>This research was supported in part by grant DP0770931
from the Australian Research Council. The authors thank
Daniel F. Schmidt for his involvement at early stages of this
research, Liz Sonenberg and Carolyn Meehan for fruitful
discussions and their support, and David Abramson, Jeff Tan and
Blair Bethwaite for their assistance with the computer cluster.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [Bahl and Padmanabhan, 2000]
          <string-name>
            <given-names>Paramvir</given-names>
            <surname>Bahl</surname>
          </string-name>
          and
          <string-name>
            <given-names>Venkata N.</given-names>
            <surname>Padmanabhan</surname>
          </string-name>
          .
          <article-title>RADAR: An in-building RF-based user location and tracking system</article-title>
          .
          <source>In Proceedings of the 19th Annual Joint IEEE Conference on Computer Communications (INFOCOM-00)</source>
          , pages
          <fpage>775</fpage>
          -
          <lpage>784</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Bohnert and Zukerman, 2009]
          <string-name>
            <given-names>Fabian</given-names>
            <surname>Bohnert</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ingrid</given-names>
            <surname>Zukerman</surname>
          </string-name>
          .
          <article-title>Non-intrusive personalisation of the museum experience</article-title>
          .
          <source>In Proceedings of the 17th International Conference on User Modeling, Adaptation, and Personalization (UMAP-09)</source>
          , pages
          <fpage>197</fpage>
          -
          <lpage>209</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [Bohnert et al.,
          <year>2008</year>
          ]
          <string-name>
            <given-names>Fabian</given-names>
            <surname>Bohnert</surname>
          </string-name>
          , Ingrid Zukerman, Shlomo Berkovsky, Timothy Baldwin, and
          <string-name>
            <given-names>Liz</given-names>
            <surname>Sonenberg</surname>
          </string-name>
          .
          <article-title>Using interest and transition models to predict visitor locations in museums</article-title>
          .
          <source>AI Communications</source>
          ,
          <volume>21</volume>
          (
          <issue>2-3</issue>
          ):
          <fpage>195</fpage>
          -
          <lpage>202</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Chang and Lin, 2001]
          <string-name>
            <given-names>Chih-Chung</given-names>
            <surname>Chang</surname>
          </string-name>
          and
          <string-name>
            <given-names>Chih-Jen</given-names>
            <surname>Lin</surname>
          </string-name>
          .
          <article-title>LIBSVM: A library for support vector machines</article-title>
          ,
          <year>2001</year>
          . Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [Cheverst et al.,
          <year>2002</year>
          ]
          <string-name>
            <given-names>Keith</given-names>
            <surname>Cheverst</surname>
          </string-name>
          , Keith Mitchell, and
          <string-name>
            <given-names>Nigel</given-names>
            <surname>Davies</surname>
          </string-name>
          .
          <article-title>The role of adaptive hypermedia in a context-aware tourist GUIDE</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>45</volume>
          (
          <issue>5</issue>
          ):
          <fpage>47</fpage>
          -
          <lpage>51</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [Dijkstra, 1959]
          <string-name>
            <given-names>Edsger W.</given-names>
            <surname>Dijkstra</surname>
          </string-name>
          .
          <article-title>A note on two problems in connexion with graphs</article-title>
          .
          <source>Numerische Mathematik</source>
          ,
          <volume>1</volume>
          :
          <fpage>269</fpage>
          -
          <lpage>271</lpage>
          ,
          <year>1959</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [Hatala and Wakkary, 2005]
          <string-name>
            <given-names>Marek</given-names>
            <surname>Hatala</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ron</given-names>
            <surname>Wakkary</surname>
          </string-name>
          .
          <article-title>Ontology-based user modeling in an augmented audio reality system for museums</article-title>
          .
          <source>User Modeling and User-Adapted Interaction</source>
          ,
          <volume>15</volume>
          (
          <issue>3-4</issue>
          ):
          <fpage>339</fpage>
          -
          <lpage>380</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [Hazas et al.,
          <year>2004</year>
          ]
          <string-name>
            <given-names>Mike</given-names>
            <surname>Hazas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>James</given-names>
            <surname>Scott</surname>
          </string-name>
          , and John Krumm.
          <article-title>Location-aware computing comes of age</article-title>
          .
          <source>IEEE Computer</source>
          ,
          <volume>37</volume>
          (
          <issue>2</issue>
          ):
          <fpage>95</fpage>
          -
          <lpage>97</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [Lassabe et al.,
          <year>2009</year>
          ]
          <string-name>
            <given-names>Frederic</given-names>
            <surname>Lassabe</surname>
          </string-name>
          , Philippe Canalda, Pascal Chatonnay, and François Spies.
          <article-title>Indoor Wi-Fi positioning: Techniques and systems</article-title>
          .
          <source>Annals of Telecommunications</source>
          ,
          <volume>64</volume>
          :
          <fpage>651</fpage>
          -
          <lpage>664</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [Lundgren, 2007]
          <string-name>
            <given-names>Jonas</given-names>
            <surname>Lundgren</surname>
          </string-name>
          .
          <article-title>SPLINEFIT</article-title>
          ,
          <year>2007</year>
          . Software available at http://www.mathworks.com/matlabcentral/fileexchange/13812-fit-a-spline-to-noisy-data.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [Mooney and Duval, 1993]
          <string-name>
            <given-names>Christopher Z.</given-names>
            <surname>Mooney</surname>
          </string-name>
          and
          <string-name>
            <given-names>Robert D.</given-names>
            <surname>Duval</surname>
          </string-name>
          .
          <source>Bootstrapping: A Nonparametric Approach to Statistical Inference</source>
          .
          <publisher-name>Sage Publications</publisher-name>
          ,
          <publisher-loc>Newbury Park, CA, USA</publisher-loc>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [Petrelli and Not, 2005]
          <string-name>
            <given-names>Daniela</given-names>
            <surname>Petrelli</surname>
          </string-name>
          and
          <string-name>
            <given-names>Elena</given-names>
            <surname>Not</surname>
          </string-name>
          .
          <article-title>User-centred design of flexible hypermedia for a mobile guide: Reflections on the HyperAudio experience</article-title>
          .
          <source>User Modeling and User-Adapted Interaction</source>
          ,
          <volume>15</volume>
          (
          <issue>3-4</issue>
          ):
          <fpage>303</fpage>
          -
          <lpage>338</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [Philipose et al.,
          <year>2004</year>
          ]
          <string-name>
            <given-names>Matthai</given-names>
            <surname>Philipose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kenneth P.</given-names>
            <surname>Fishkin</surname>
          </string-name>
          , Mike Perkowitz,
          <string-name>
            <given-names>Donald J.</given-names>
            <surname>Patterson</surname>
          </string-name>
          , Dieter Fox,
          <string-name>
            <given-names>Henry</given-names>
            <surname>Kautz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Dirk</given-names>
            <surname>Hahnel</surname>
          </string-name>
          .
          <article-title>Inferring activities from interactions with objects</article-title>
          .
          <source>IEEE Pervasive Computing</source>
          ,
          <volume>3</volume>
          (
          <issue>4</issue>
          ):
          <fpage>50</fpage>
          -
          <lpage>57</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [Schmidt et al.,
          <year>2009</year>
          ] Daniel F. Schmidt, Ingrid Zukerman, and
          <string-name>
            <given-names>David W.</given-names>
            <surname>Albrecht</surname>
          </string-name>
          .
          <article-title>Assessing the impact of measurement uncertainty on user models in spatial domains</article-title>
          .
          <source>In Proceedings of the 17th International Conference on User Modeling, Adaptation, and Personalization (UMAP-09)</source>
          , pages
          <fpage>210</fpage>
          -
          <lpage>222</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [Stock et al.,
          <year>2007</year>
          ]
          <string-name>
            <given-names>Oliviero</given-names>
            <surname>Stock</surname>
          </string-name>
          , Massimo Zancanaro, Paolo Busetta, Charles Callaway, Antonio Krüger, Michael Kruppa, Tsvika Kuflik, Elena Not, and
          <string-name>
            <given-names>Cesare</given-names>
            <surname>Rocchi</surname>
          </string-name>
          .
          <article-title>Adaptive, intelligent presentation of information for the museum visitor in PEACH</article-title>
          .
          <source>User Modeling and User-Adapted Interaction</source>
          ,
          <volume>18</volume>
          (
          <issue>3</issue>
          ):
          <fpage>257</fpage>
          -
          <lpage>304</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
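          [Véron and Levasseur, 1983]
          <string-name>
            <given-names>Eliséo</given-names>
            <surname>Véron</surname>
          </string-name>
          and
          <string-name>
            <given-names>Martine</given-names>
            <surname>Levasseur</surname>
          </string-name>
          .
          <source>Ethnographie de l'Exposition</source>
          . Bibliothèque Publique d'Information, Centre Georges Pompidou, Paris, France,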
          <year>1983</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [Wang et al.,
          <year>2009</year>
          ]
          <string-name>
            <given-names>Yiwen</given-names>
            <surname>Wang</surname>
          </string-name>
          , Lora Aroyo, Natalia Stash, Rody Sambeek, Yuri Schuurmans, Guus Schreiber, and
          <string-name>
            <given-names>Peter</given-names>
            <surname>Gorgels</surname>
          </string-name>
          .
          <article-title>Cultivating personalized museum tours online and on-site</article-title>
          .
          <source>Interdisciplinary Science Reviews</source>
          ,
          <volume>34</volume>
          (
          <issue>2</issue>
          ):
          <fpage>141</fpage>
          -
          <lpage>156</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [Zancanaro et al.,
          <year>2007</year>
          ]
          <string-name>
            <given-names>Massimo</given-names>
            <surname>Zancanaro</surname>
          </string-name>
          , Tsvika Kuflik, Zvi Boger, Dina Goren-Bar, and
          <string-name>
            <given-names>Dan</given-names>
            <surname>Goldwasser</surname>
          </string-name>
          .
          <article-title>Analyzing museum visitors' behavior patterns</article-title>
          .
          <source>In Proceedings of the 11th International Conference on User Modeling (UM07)</source>
          , pages
          <fpage>238</fpage>
          -
          <lpage>246</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [Zhao and Guibas, 2004]
          <string-name>
            <given-names>Feng</given-names>
            <surname>Zhao</surname>
          </string-name>
          and
          <string-name>
            <given-names>Leonidas</given-names>
            <surname>Guibas</surname>
          </string-name>
          .
          <source>Wireless Sensor Networks: An Information Processing Approach</source>
          . Morgan Kaufmann,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>