<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Modelling and Predicting Movements of Museum Visitors: A Simulation Framework for Assessing the Impact of Sensor Noise on Model Performance</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Fabian</forename><surname>Bohnert</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Information Technology</orgName>
								<orgName type="institution">Monash University</orgName>
								<address>
									<country key="AU">Australia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Ingrid</forename><surname>Zukerman</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Information Technology</orgName>
								<orgName type="institution">Monash University</orgName>
								<address>
									<country key="AU">Australia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">David</forename><forename type="middle">W</forename><surname>Albrecht</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Information Technology</orgName>
								<orgName type="institution">Monash University</orgName>
								<address>
									<country key="AU">Australia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Timothy</forename><surname>Baldwin</surname></persName>
							<affiliation key="aff1">
								<orgName type="department">Dept. of Comp. Sci. and Soft. Eng</orgName>
								<orgName type="institution">The University of Melbourne</orgName>
								<address>
									<country key="AU">Australia</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Modelling and Predicting Movements of Museum Visitors: A Simulation Framework for Assessing the Impact of Sensor Noise on Model Performance</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">00D45EDE204491F7A55EF6071BC7871C</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T13:13+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We present a simulation framework to examine the impact of sensor noise on the performance of user models in the museum domain. Our contributions are: (1) models to simulate noisy visit trajectories as time-stamped sequences of (x, y) positional coordinates which reflect walking and hovering behaviour; (2) a discriminative inference model that distinguishes between hovering and walking on the basis of (simulated) noisy sensor observations; (3) a model that infers viewed exhibits from hovering coordinates; and (4) a model that predicts the next exhibit on the basis of inferred (rather than known) viewed exhibits. Our staged evaluation assesses the effect of these models (in combination with sensor noise) on inferential and predictive performance, thus shedding light on the reliability attributed to inferences drawn from sensor observations.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The construction of models of visitors to public spaces, in particular museums, has been of interest to the user modelling and cultural tourism communities for some time <ref type="bibr" target="#b0">[Cheverst et al., 2002;</ref><ref type="bibr" target="#b1">Hatala and Wakkary, 2005;</ref><ref type="bibr" target="#b3">Stock et al., 2007]</ref>. These models are used to predict visitors' interests in order to personalise the content of presentations, or make recommendations of locations (e. g., exhibits) to be visited. In most systems developed to date, these user models are acquired through the active participation of the visitors, e. g., by providing feedback through a device. This requirement imposes a burden on the visitors, which in turn may reduce the reliability of the obtained information, e. g., if visitors provide feedback only occasionally.</p><p>Recent advances in mobile computing and sensing technologies have enabled the instrumentation of physical public spaces, which in turn has enabled the automatic tracking of visitors' movements <ref type="bibr" target="#b1">[Hazas et al., 2004;</ref><ref type="bibr" target="#b1">Lassabe et al., 2009;</ref><ref type="bibr" target="#b2">Philipose et al., 2004]</ref>. Information regarding visitors' whereabouts and the time spent at different locations supports the automatic inference of visitors' interests and the prediction of their trajectories <ref type="bibr" target="#b0">[Bohnert and Zukerman, 2009]</ref>. Clearly, inferences from positional and timing information are more indirect and uncertain than visitors' direct feedback. 
However, the information stream is reliable, as opposed to information obtained from visitors' direct participation.</p><p>In order to personalise content and generate recommendations on the basis of information provided by unobtrusive sensors (rather than from user participation), questions of interest include: (1) how to infer a visitor's viewed exhibits solely from sensor readings; and (2) how to predict the next exhibit(s) a visitor is likely to view. In this paper, we present a realistic simulation model which offers some insights to answer these questions, and may be employed to make decisions regarding the instrumentation of a space.</p><p>In previous research, we offered a simulation framework for investigating the impact of different sensing technologies on the predictive performance of user models <ref type="bibr" target="#b3">[Schmidt et al., 2009]</ref>. The aim was to provide a practical solution to the problem of assessing the accuracy of the user models that can be derived from a sensor-based system prior to actually deploying a particular technology. However, that work made strong simplifying assumptions that affected the realism of the framework, and hence the significance and usefulness of its results, viz: (1) sensors can detect, with some error, a single square (in a grid representation of the museum floor) where a visitor is statically positioned while viewing an exhibit i_k; and (2) the previously viewed exhibits i_1, ..., i_{k−1} are known (not just the previous coordinates of a visitor) when predicting the next exhibit i_{k+1}. In reality, people tend not to remain stationary at an exhibit, and they certainly do not 'teleport' between squares on the floor. Rather, they walk between exhibits, and often hover around an exhibit to view it from different angles or distances. 
Thus, when sensing a visitor's movements in a museum, the best we can hope for is a time-stamped trajectory of (x, y) coordinates (sampled at a particular rate), where the observed coordinates diverge from the true positions of the visitor by some sensor error. As a result, the sequence of previously viewed exhibits cannot be known with certainty; at best, a likely sequence of exhibits can be inferred from the sensor observations.</p><p>In this paper, we propose a simulation framework that eschews the above assumptions, significantly extending our previous work and the insights obtained from it. Specifically, our contributions are: (1) models to simulate noisy visit trajectories as time-stamped sequences of (x, y) positional coordinates which reflect walking and hovering behaviour; (2) a discriminative inference model that distinguishes between hovering and walking on the basis of noisy sensor observations; (3) a model that infers likely viewed exhibits from time-stamped sequences of hovering coordinates (instead of a single static grid square per exhibit as done in our previous work); and (4) a model that predicts the next exhibit on the basis of these inferred (rather than known) viewed exhibits. At present, we assume that the sensors can only track a visitor's position. However, our models may be extended to incorporate orientation information and occasional user feedback to improve the accuracy of inferences obtained from sensor readings, and hence the predictions of subsequent exhibits.</p><p>The research in this paper builds on the framework described by <ref type="bibr" target="#b3">Schmidt et al. [2009]</ref>, which comprises a predictive user model of exhibits to be viewed, and a spatial viewing model of positions from which each exhibit can be seen. Like Schmidt et al., we evaluate our framework in the context of the Marine Life Exhibition at Melbourne Museum. 
In this paper, we augment the evaluations done by Schmidt et al., presenting the results of a staged evaluation which examines the effect of different information-based models, in combination with sensor noise, on inferential and predictive performance.</p><p>This paper is organised as follows. Section 2 discusses related research. Section 3 briefly summarises the key components of our previous simulation framework. Our approach for simulating detailed coordinate-based visit trajectories is presented in Section 4, and our inference and prediction models are described in Section 5. The results of our evaluation are presented in Section 6, followed by concluding remarks in Section 7.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Research</head><p>The research community has initiated a wealth of projects that investigate user modelling and personalisation technology in the context of physical spaces. For example, in the museum domain, HyperAudio dynamically adapted hyperlinks and presented content to stereotypical assumptions about a visitor, and to what the visitor has already accessed through a mobile device and seems interested in <ref type="bibr" target="#b2">[Petrelli and Not, 2005]</ref>. The CHIP project harnessed Semantic Web techniques to provide personalised access to digital museum collections both online and in the physical museum <ref type="bibr" target="#b4">[Wang et al., 2009]</ref>. This was done by using explicitly initialised user models. The Kubadji project investigated user and language modelling techniques that rely on mobile technology deployed in museums <ref type="bibr" target="#b0">[Bohnert and Zukerman, 2009]</ref>. While the focus was on modelling visitors based on non-intrusive observations that can be derived from sensor readings, the project did not evaluate its models with real-world sensing technology.</p><p>In contrast to these projects, which did not employ real-world sensing technology, other research projects incorporated wireless technology or sensor networks. The GUIDE project developed a handheld tourist guide for visitors to the city of Lancaster, UK <ref type="bibr" target="#b0">[Cheverst et al., 2002]</ref>. It employed user models obtained from explicit user input to generate dynamic and user-adapted city tours, where the order of the visited locations could be varied. The project used wireless access points to stream content data to a user's device, but did not employ the wireless network to localise the user. 
The PEACH project developed technology which adapts its user model on the basis of both explicit user feedback and implicit observations of a user's interactions with a mobile device <ref type="bibr" target="#b3">[Stock et al., 2007]</ref>. This user model was used to generate personalised multimedia presentations for museum visitors. The PEACH project also explored simple localisation technology, but did not derive user modelling information from sensor readings. The augmented audio reality system for museums ec(h)o adapted its user model on the basis of a visitor's movements through the exhibition space and his/her interactions with the system <ref type="bibr" target="#b1">[Hatala and Wakkary, 2005]</ref>. The collected user modelling data were used to deliver personalised information associated with exhibits via audio display. However, the project did not investigate the effect of localisation accuracy on the quality of the resultant user modelling information.</p><p>In contrast to the above research, this paper investigates the impact of using sensing technology as a means for gathering information about a user, i. e., to learn a user model. To this effect, we offer a simulation framework which generates noisy visit trajectories that reflect walking and hovering behaviour, and investigate the relationship between sensor noise and inferential and predictive user model performance.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Prerequisites</head><p>This section briefly summarises four key components of the simulation framework introduced by <ref type="bibr" target="#b3">Schmidt et al. [2009]</ref>, which is extended in this paper: (1) frequency-based Transition Model; (2) Spatial Exhibit Viewing Model; (3) generation of exhibit tours; and (4) generation of exhibit squares.</p><p>Frequency-based Transition Model. We use a frequency-based Transition Model to represent visitors' movements between museum exhibits <ref type="bibr" target="#b0">[Bohnert et al., 2008;</ref><ref type="bibr" target="#b3">Schmidt et al., 2009]</ref>. This model, which is implemented as a 1-stage Markov model, estimates the transition probabilities P_{i,j} between exhibits i and j from frequency counts of exhibit transitions that are derived from observed visit trajectories. When estimating the transition probabilities, additive smoothing is applied in light of our small dataset of 44 observed trajectories (Section 6.1):</p><formula xml:id="formula_0">P_{i,j} = \frac{n_{i,j} + \alpha_i}{N_i + M \alpha_i} \quad \text{for } i, j = 1, \ldots, M</formula><p>where n_{i,j} counts the transitions from exhibit i to exhibit j, α_i is a smoothing constant, N_i = Σ_{k=1,...,M} n_{i,k} is the total number of times exhibit i was viewed, and M is the number of exhibits.</p><p>Spatial Exhibit Viewing Model. Our modelling framework employs a probabilistic model of the viewing areas for each exhibit in the museum space, which divides the space into a grid of squares (for the Marine Life Exhibition, the grid size is 47 × 61 = 2,867 squares, where a square is approximately 30 cm × 30 cm; Figure <ref type="figure" target="#fig_0">1</ref>). The model specifies a discrete probability distribution which represents P(i | x, y), the probability of a visitor viewing each exhibit i from a square at position (x, y).</p><p>Generation of exhibit tours. We generate tours of viewed exhibits as follows. 
Each tour begins at a fictitious start exhibit i_0 and ends at a fictitious end exhibit i_end. For each exhibit i_{k−1} already in the tour (k = 1, 2, ...), the next exhibit i_k is generated by sampling from a categorical distribution specified by the transition probabilities P_{i_{k−1}, i_k}. This step is repeated for each added exhibit i_k until the end exhibit i_end is reached.</p><p>In addition to this sequence of exhibits, our walking/hovering model (Section 4) requires the time that a visitor spends at each viewed exhibit. We generate a viewing time T_i at exhibit i by randomly drawing from an exponential distribution, i. e., T_i ∼ Exp(λ_i), where the average viewing time λ_i at each exhibit i is estimated by maximum likelihood from the 44 observed tours in the Marine Life Exhibition dataset.</p><p>Generation of exhibit squares. Once a tour of exhibits has been simulated, <ref type="bibr" target="#b3">Schmidt et al. [2009]</ref> generate a single viewing square at position (x, y) for each viewed exhibit i in the tour. This is done by sampling from the categorical distribution P(x, y | i) over all exhibit squares, where P(x, y | i) is derived by applying Bayes' theorem to the viewing probabilities P(i | x, y) obtained from the Spatial Exhibit Viewing Model.</p><p>In this work, we use Schmidt et al.'s model to generate the first hovering square for each viewed exhibit (Section 4.2).</p></div>
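<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Transition Model and the tour generation step summarised in Section 3 can be illustrated with a minimal Python sketch. It is not the implementation used in the paper. The function names, the use of one shared smoothing constant alpha for every exhibit, and the step cap max_len are assumptions made for illustration, and the sketch does not forbid transitions back into the fictitious start exhibit.</p><p><code lang="python"><![CDATA[
```python
import random


def transition_matrix(tours, num_exhibits, alpha=1.0):
    """Additively smoothed transition probabilities of a 1-stage Markov model.

    tours: list of exhibit-index sequences (0-based), one per observed visit.
    alpha: one smoothing constant shared by all rows (an assumption; the
           text allows a separate alpha_i per exhibit).
    """
    m = num_exhibits
    counts = [[0] * m for _ in range(m)]
    for tour in tours:
        for i, j in zip(tour, tour[1:]):
            counts[i][j] += 1
    probs = []
    for i in range(m):
        n_i = sum(counts[i])  # N_i: number of transitions out of exhibit i
        probs.append([(counts[i][j] + alpha) / (n_i + m * alpha)
                      for j in range(m)])
    return probs


def sample_tour(probs, start, end, rng, max_len=1000):
    """Sample exhibits from the categorical rows of probs until `end` is hit.

    max_len is a hypothetical safety cap, not part of the described method.
    """
    tour = [start]
    while tour[-1] != end and len(tour) < max_len:
        nxt = rng.choices(range(len(probs)), weights=probs[tour[-1]])[0]
        tour.append(nxt)
    return tour


def sample_viewing_time(mean_time, rng):
    """Draw a viewing time from an exponential distribution with this mean."""
    return rng.expovariate(1.0 / mean_time)
```
]]></code></p><p>Because every row of the smoothed matrix sums to one, each row can be passed directly as the weights of a categorical draw.</p></div>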
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Simulation of Coordinate-based Visitor Pathways</head><p>The previous section outlined our method for generating exhibit tours with a single static grid square per exhibit. In this section, we simulate (smooth and noisy) coordinate-based visit trajectories which reflect two types of behaviour: walking between exhibits, and hovering at exhibits. Our approach comprises the following four steps, which are described below: (1) generation of connected paths of walking squares between exhibits;</p><p>(2) generation of connected paths of hovering squares to simulate viewing behaviour at exhibits; (3) smoothing of the obtained square trajectory; and (4) simulation of noisy sensor observations from this smooth pathway representation. Figure <ref type="figure" target="#fig_0">1</ref> depicts two representations of part of a simulated visit trajectory (we show the part for the Tool Time exhibit in the Mealtime section of the Marine Life Exhibition). </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Generating Walking Squares</head><p>In Section 3, we generated one viewing square for each exhibit in a visitor's tour. However, visitors do not simply teleport between squares. To produce a more realistic continuous visit trajectory, we must build a path that links these squares. At first glance, it seems that a shortest-path algorithm may be used for this task. However, trajectories generated in this way exhibit an unnatural level of repetition and purposefulness, tending to run directly along exhibition walls. In practice, visitors tend to move more erratically. To simulate these behaviours, we incorporate stochastic effects into the shortest-path procedure. Specifically, we model the probability of moving into a square as being proportional to the probability of viewing the destination exhibit from this square, moderated by the visitor's propensity to avoid walls and to meander. Our approach uses parameters that control two behavioural aspects of visitors: (1) how erratic or purposeful their movement is; and (2) their propensity to avoid walls.<ref type="foot" target="#foot_0">1</ref> These considerations are implemented as follows.</p><p>Assume we want to generate a sequence of walking squares to connect two exhibits i and j in a tour. Let (x_s, y_s) denote the end square of exhibit i (i. e., the source square), and (x_d, y_d) the starting square of exhibit j (i. e., the destination square). Also, treating diagonal squares as adjacent, let the candidate squares of a square (x, y) be the eight squares surrounding this square. We start by employing Dijkstra's algorithm <ref type="bibr" target="#b1">[Dijkstra, 1959]</ref> to generate a distance matrix D whose elements D_{x,y} correspond to the shortest-path distances from each square (x, y) to the destination square (x_d, y_d). Then, we generate a sequence of walking squares as follows. 
For each square (x_n, y_n) (starting from the source square (x_s, y_s)), the next square (x_{n+1}, y_{n+1}) that a visitor moves into while walking is sampled from among (x_n, y_n)'s eight candidate squares, provided that the move does not take the visitor farther away from (x_d, y_d) (the distance information is obtained from D). In this procedure, the sampling is performed from a categorical distribution over the eight candidate squares, whose probabilities are proportional to the probabilities of viewing the destination exhibit from each square, moderated by the visitor's propensity to avoid walls and to meander (the probabilities are zero for the squares that take the visitor farther away from (x_d, y_d)). The visitor moves in this fashion until (x_d, y_d) is reached. At that point, the trajectory between (x_s, y_s) and (x_d, y_d) is complete, and timestamps are iteratively added to the trajectory assuming a constant walking speed v_w for the visitor.</p></div>
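<div xmlns="http://www.tei-c.org/ns/1.0"><p>The walking procedure of Section 4.1 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation used in the paper. Diagonal moves are assumed to cost √2 and orthogonal moves 1. The wall-avoidance and meandering parameters are omitted, because their functional form is not given in the text. A small weight floor and a step cap (both hypothetical) keep the sketch well defined, and the source square is assumed to be connected to the destination.</p><p><code lang="python"><![CDATA[
```python
import heapq
import math
import random

# Offsets of the eight candidate squares around a square.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def distance_matrix(free, dest):
    """Dijkstra distances from every walkable square to the destination.

    free[x][y] is True for walkable squares; unreachable squares keep inf.
    """
    w, h = len(free), len(free[0])
    dist = [[math.inf] * h for _ in range(w)]
    dist[dest[0]][dest[1]] = 0.0
    heap = [(0.0, dest)]
    while heap:
        d, (x, y) = heapq.heappop(heap)
        if d > dist[x][y]:
            continue  # stale heap entry
        for dx, dy in NEIGHBOURS:
            nx, ny = x + dx, y + dy
            if 0 <= nx < w and 0 <= ny < h and free[nx][ny]:
                nd = d + math.hypot(dx, dy)
                if nd < dist[nx][ny]:
                    dist[nx][ny] = nd
                    heapq.heappush(heap, (nd, (nx, ny)))
    return dist


def walk(free, view_prob, source, dest, rng, floor=1e-6, max_steps=10000):
    """Sample a connected path of squares from source to dest.

    view_prob[x][y]: probability of viewing the destination exhibit from
    square (x, y). Candidates that lie farther from dest get weight zero
    by being excluded; the rest are weighted by view_prob plus a floor.
    """
    dist = distance_matrix(free, dest)
    path = [source]
    while path[-1] != dest and len(path) < max_steps:
        x, y = path[-1]
        cands, weights = [], []
        for dx, dy in NEIGHBOURS:
            nx, ny = x + dx, y + dy
            if (0 <= nx < len(free) and 0 <= ny < len(free[0])
                    and free[nx][ny] and dist[nx][ny] <= dist[x][y]):
                cands.append((nx, ny))
                weights.append(view_prob[nx][ny] + floor)
        path.append(rng.choices(cands, weights=weights)[0])
    return path
```
]]></code></p><p>Timestamps could then be attached by dividing the length of each step by the constant walking speed.</p></div>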
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Generating Hovering Squares</head><p>Once at an exhibit, visitors usually observe the exhibit for some time before moving on to the next one. Additionally, visitors typically do not remain static, but move around to examine the exhibit from different angles and distances. This so-called hovering behaviour is included in our simulation framework by varying the movement model described in Section 4.1, so that a visitor is more likely to move towards a square from which the exhibit is more likely to be viewed, but may not move at all. Timestamps are added to the generated hovering squares assuming a hovering speed of v_h &lt; v_w (as for the walking case, we assume a constant hovering speed). The hovering behaviour continues until the sampled viewing time T_i for the current exhibit i is exceeded (viewing time sampling is described in Section 3).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Smoothing the Square Trajectory</head><p>To obtain a smooth positional tour representation from a time-stamped trajectory of squares, i. e., (⟨t_n, x_n, y_n⟩; n = 1, 2, ...), we fit piecewise cubic splines to the coordinate-individual trajectories ⟨t_n, x_n⟩ and ⟨t_n, y_n⟩ (one piecewise cubic spline each). We do this by applying the splinefit package from the Matlab Central File Exchange <ref type="bibr" target="#b2">[Lundgren, 2007]</ref>. This approach uses the method of least squares to fit splines with reduced degrees of freedom (we reduce the number of spline pieces by 70% compared to direct interpolation), and generates a smooth representation of the trajectory in the sense that (x, y), (ẋ, ẏ) and (ẍ, ÿ) are all continuous in time.</p><p>The resultant representation may be interpreted as a continuous positional representation of the visit trajectory, enabling us to obtain a visitor's position at any point in time.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4">Simulating Sensor Noise</head><p>The visit trajectories obtained so far are smooth and continuous. However, in practice, any trajectory-based input to a user modelling system would be acquired through sensors that deliver only a visitor's approximate position (due to measurement error) at a certain sampling rate.</p><p>In this paper, we explore sensor noise that may be attributed to range-based positioning technology, e. g., WiFi and ultra-wide band (UWB) <ref type="bibr" target="#b4">[Zhao and Guibas, 2004]</ref>. We follow a widely accepted model for sensor noise in this setting, and assume that the measured coordinates (x′, y′) are obtained by distorting the true coordinates (x, y) through additive Gaussian noise and sampling at regular time intervals (for our experiments, we use a constant sampling rate of one observation per second). Specifically, the measured coordinates are found by sampling from a bivariate normal distribution N((x, y), σ²I) with mean (x, y) and covariance σ²I, where σ is a constant which reflects the expected accuracy of the sensing infrastructure, and I is the identity matrix. For example, if the infrastructure is able to deliver positions within an accuracy level of ν metres 95% of the time, then σ = ν/2 would be a suitable value, as this places approximately 95% of the probability mass within the circle defined by (x′ − x)² + (y′ − y)² = ν². Figure <ref type="figure" target="#fig_0">1</ref>(b) depicts part of a noisy visit trajectory which was sampled by following this procedure for the pathway shown in Figure <ref type="figure" target="#fig_0">1</ref>(a) at a sampling rate of one observation per second with ν = 2 metres.</p></div>
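<div xmlns="http://www.tei-c.org/ns/1.0"><p>The noise model of Section 4.4 amounts to adding independent Gaussian noise to each coordinate at every sampling instant, which is equivalent to a bivariate normal with covariance σ²I. The following Python sketch illustrates this under the stated choice σ = ν/2. The function name and the representation of the smooth trajectory as a callable from time to position are assumptions made for illustration.</p><p><code lang="python"><![CDATA[
```python
import random


def simulate_observations(position, duration, nu, rng, period=1.0):
    """Sample noisy (t, x', y') observations of a continuous trajectory.

    position: callable mapping a time t to the true position (x, y).
    nu:       accuracy level in metres; sigma = nu / 2 as in the text.
    period:   time between observations (one second in the experiments).
    """
    sigma = nu / 2.0
    steps = int(duration // period) + 1
    observations = []
    for k in range(steps):
        t = k * period
        x, y = position(t)
        # Independent noise per axis corresponds to covariance sigma^2 * I.
        observations.append((t, rng.gauss(x, sigma), rng.gauss(y, sigma)))
    return observations
```
]]></code></p><p>Setting nu to zero reproduces the true positions, which corresponds to the noise-free end of the accuracy range considered in the evaluation.</p></div>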
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Inference and Prediction of Viewed Exhibits from Positional Coordinates</head><p>When information on a visitor's movements is automatically gathered through sensors, all that is available is a sequence of (typically noisy) time-stamped (x, y) coordinates (Section 4.4).<ref type="foot" target="#foot_2">2</ref> Assuming that we have a method for detecting whether a visitor is hovering (and hence viewing an exhibit), we can decompose the complete (x, y) sequence into subsequences of (x, y) coordinates that pertain to hovering behaviour (Section 5.1). From these, we can infer which exhibit the visitor is viewing (Section 5.2), and employ a model to predict which exhibit the visitor is likely to view next on the basis of this information (Section 5.3).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">Classification-based Inference of Walking and Hovering</head><p>To infer walking and hovering behaviour from positional (x, y) coordinates, we employ a window-based approach. We first derive indicative features from a window comprising the previous ω sensor observations, and then provide these features to a purpose-trained classifier for inference. The output of this binary classifier is a label which indicates whether a visitor's activity is walking or hovering.</p><p>Prior to deriving the features, we smooth the noisy sensor observations ⟨t, x, y⟩ by fitting piecewise cubic splines to the ⟨t, x⟩ and ⟨t, y⟩ trajectories <ref type="bibr" target="#b2">[Lundgren, 2007]</ref>, and evaluating these splines at the original timestamps (similarly to Section 4.3). Using the resultant smoothed sensor observations, we compute the following feature set of size 2ω + 7 that pertains to (non-directional) velocity and acceleration:</p><p>• ω − 1 velocities (each of them calculated as the length of one of the ω − 1 velocity vectors, which in turn are derived from the ω smoothed positional coordinates from within the window)</p><p>• Minimum and maximum of the ω − 1 velocities</p><p>In our experiments (Section 6), we use support vector machines (SVM) to train the classifiers. We employ C-SVC SVMs with an RBF kernel from LIBSVM <ref type="bibr" target="#b0">[Chang and Lin, 2001]</ref>, using features derived from the previous five (x, y) observations (ω = 5).</p></div>
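<div xmlns="http://www.tei-c.org/ns/1.0"><p>The velocity features listed in Section 5.1 can be computed from a window of smoothed observations as in the Python sketch below. Only the ω − 1 velocities and their minimum and maximum are shown. The remaining features of the full set of size 2ω + 7 are not reproduced here, and the function name and ordering of the returned values are assumptions made for illustration.</p><p><code lang="python"><![CDATA[
```python
import math


def speed_features(window):
    """Velocity features for one window of smoothed observations.

    window: list of (t, x, y) tuples, oldest first, with increasing t.
    Returns the len(window) - 1 speeds followed by their minimum and maximum.
    """
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(window, window[1:]):
        # Length of the velocity vector between two consecutive positions.
        speeds.append(math.hypot(x1 - x0, y1 - y0) / (t1 - t0))
    return speeds + [min(speeds), max(speeds)]
```
]]></code></p><p>For ω = 5 this yields four velocities plus two summary values, which would be passed to the classifier together with the remaining features.</p></div>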
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">Probability-based Inference of Exhibits</head><p>In this section, we describe how we infer the exhibits most likely viewed by the visitor while hovering.</p><p>After inferring a visitor's activity (i. e., walking or hovering) for each sensor observation ⟨t, x, y⟩, we extract from the complete (x, y) sequence the sub-sequences of (x, y) coordinates that correspond to hovering behaviour. For each sub-sequence of hovering-labelled (x, y) coordinates, we then calculate the following exhibit scores:</p><formula xml:id="formula_1">\mathrm{score}(i) = \sum_{(x,y)} P(i \mid x, y) \quad \text{for all exhibits } i \qquad (1)</formula><p>where P(i | x, y) is the probability of a visitor viewing exhibit i while hovering at position (x, y) (Section 3). To smooth out possible errors introduced in the classification step (Section 5.1), we delete walking labels that separate two consecutive sub-sequences of hovering labels for which the same exhibit has the highest score. We also remove hovering-labelled sub-sequences of length 1 (the exhibit scores of any affected sub-sequences of hovering labels are recomputed).</p><p>Finally, all scores are normalised to obtain probabilities.</p><p>For each sub-sequence of hovering labels, this procedure yields a probability distribution which specifies how likely a visitor is to view each exhibit.</p></div>
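<div xmlns="http://www.tei-c.org/ns/1.0"><p>The scoring and normalisation steps of Section 5.2 can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the implementation used in the paper. The viewing probabilities are held in a hypothetical dictionary keyed by grid square, hovering-labelled runs of length 1 are dropped, and the merging of runs separated by walking labels is omitted for brevity. Each run is assumed to have a non-zero total score.</p><p><code lang="python"><![CDATA[
```python
from itertools import groupby


def hovering_runs(labels):
    """Return (start, end) index pairs, end exclusive, of hovering runs.

    labels: one boolean per observation, True meaning hovering.
    """
    runs, idx = [], 0
    for is_hover, group in groupby(labels):
        n = len(list(group))
        if is_hover:
            runs.append((idx, idx + n))
        idx += n
    return runs


def exhibit_distribution(cells, view_prob, num_exhibits):
    """Normalised exhibit scores for the squares of one hovering run.

    view_prob[(x, y)][i] holds P(i | x, y); score(i) sums it over the run.
    """
    scores = [sum(view_prob[c][i] for c in cells)
              for i in range(num_exhibits)]
    total = sum(scores)
    return [s / total for s in scores]


def infer_exhibits(labels, cells, view_prob, num_exhibits):
    """One exhibit distribution per hovering run longer than one sample."""
    result = []
    for start, end in hovering_runs(labels):
        if end - start > 1:
            result.append(exhibit_distribution(cells[start:end],
                                               view_prob, num_exhibits))
    return result
```
]]></code></p><p>The index of the largest entry in each returned distribution identifies the exhibit most likely viewed during that run.</p></div>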
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3">Model-based Prediction of Exhibits</head><p>Once the viewed exhibits are inferred, we can use this information to predict a visitor's next exhibit for each (x, y) position at which the visitor is hovering.<ref type="foot" target="#foot_3">3</ref> However, as seen in the previous section, there is some uncertainty regarding which exhibit the visitor is actually viewing. We therefore use the Weighted approach described by <ref type="bibr" target="#b3">Schmidt et al. [2009]</ref> for predicting the next exhibit from positional information. For each possible next exhibit i, the Weighted approach estimates P_next(i | x, y) as the weighted average of the transition probabilities P_{j,i} from each possible current exhibit j to exhibit i. The weights are the probabilities P(j | x, y) of viewing exhibit j when standing within the square at position (x, y) (Section 3).</p><formula xml:id="formula_2">P_{\mathrm{next}}(i \mid x, y) = \sum_{j=1}^{M} P(j \mid x, y) \times P_{j,i}</formula><p>In this calculation, the transition probabilities P_{j,i} are derived from the information provided by the Transition Model in Section 3 by setting to zero the columns of the transition matrix that pertain to the already viewed exhibits, and renormalising each row of the matrix to 1.<ref type="foot" target="#foot_4">4</ref> </p></div>
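<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Weighted prediction of Section 5.3 can be illustrated with the Python sketch below, which first masks the columns of already viewed exhibits and renormalises each row, and then forms the weighted average of the transition probabilities. The function names are assumptions made for illustration, and rows whose remaining mass is zero are left unnormalised, which is a choice of this sketch rather than something stated in the text.</p><p><code lang="python"><![CDATA[
```python
def mask_viewed(transitions, viewed):
    """Zero the columns of viewed exhibits and renormalise each row to 1."""
    masked = []
    for row in transitions:
        kept = [0.0 if j in viewed else p for j, p in enumerate(row)]
        total = sum(kept)
        masked.append([p / total for p in kept] if total > 0 else kept)
    return masked


def predict_next(current_probs, transitions):
    """Weighted average of transition rows.

    current_probs[j]: P(j | x, y), the probability that exhibit j is the
    one currently viewed. Returns P_next(i | x, y) for every exhibit i.
    """
    m = len(transitions)
    return [sum(current_probs[j] * transitions[j][i] for j in range(m))
            for i in range(m)]
```
]]></code></p><p>When the weights sum to one and every masked row sums to one, the returned values again form a probability distribution over the next exhibit.</p></div>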
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Evaluation</head><p>This section presents our data collection method and datasets, and describes our experiments and results.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1">Data Collection and Datasets</head><p>Our dataset of real-world exhibit tours was obtained at the Marine Life Exhibition at Melbourne Museum. It consists of a (manually collected) record of the exhibits viewed by 44 visitors, and the viewing times at the exhibits. On average, each visitor viewed 7.2 of the M = 22 exhibits. The data for the viewing model described in Section 3 were obtained separately, by manually annotating a grid-based map to record the positions of visitors to the exhibition.</p><p>These data were used together with the method from Section 4 to generate 1000 simulated visits, where each visit comprises time-stamped sequences of (typically noisy) (x, y) coordinates at different noise levels, each element consisting of ⟨t, x, y⟩. These 1000 simulated visits are the basis for our evaluation. When generating the visits, we assumed a constant walking speed of v_w = 3 km/h and a hovering speed of v_h = 1 km/h. Also, we used a sampling rate of one observation per second.</p><p>Current range-based positioning systems are often based on processing radio signals, e. g., WiFi and ultra-wide band (UWB). WiFi-based technology typically achieves accuracy levels of 2 to 3.5 metres <ref type="bibr" target="#b0">[Bahl and Padmanabhan, 2000;</ref><ref type="bibr" target="#b1">Lassabe et al., 2009]</ref>, while future UWB-based systems are expected to achieve accuracy levels of up to 0.15 metres <ref type="bibr" target="#b1">[Hazas et al., 2004]</ref>. We therefore considered accuracy levels of ν = 0 to 4.5 metres when generating the visits.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2">Experiments and Results</head><p>To evaluate our models, we applied bootstrapping <ref type="bibr" target="#b2">[Mooney and Duval, 1993]</ref> as follows. The 1000 generated visits were split into a training set of 100 visits and a test set of 900 visits. 200 bootstrap samples were then generated from the test set, with each bootstrap sample being constructed by sampling from the 900 visits with replacement (200 is the recommended upper bound on the number of samples for bootstrapping <ref type="bibr" target="#b2">[Mooney and Duval, 1993]</ref>). The training set remained the same for all samples. Our results are averaged over the bootstrap samples.<ref type="foot" target="#foot_5">5</ref> We conducted three experiments with these training and test sets: (1) walking/hovering classification; (2) inferring exhibits from positional hovering coordinates; and (3) predicting the next exhibit. All performance differences between models were found to be statistically significant with</p><p>Table <ref type="table" target="#tab_0">1</ref> summarises the models used in our experiments, indicating the inferred versus given information (only the first two models, i. e., those with grey background, are used in our first two experiments). The top model TL_all (Time-Location for all observations) is the most realistic, as its information is akin to that obtained from sensor readings (i. e., a sequence of time-stamped (x, y) coordinates). The models then become progressively less realistic, starting with TLA_all (Time-Location-Action for all observations), where the walking/hovering labels are considered given, up to Exh_all, where the walking/hovering labels, previous exhibits and current exhibit are given. 
To contextualise our work, Table <ref type="table" target="#tab_0">1</ref> also shows <ref type="bibr">Schmidt et al.'s model [Schmidt et al., 2009]</ref> (typeset in italics), but its results are excluded from our evaluation, as it does not model trajectories or temporal information.</p><p>Walking/hovering classification. To evaluate the performance of our walking/hovering classification method (Section 5.1), we gave as input sequences of times and positions (⟨t_n, x_n, y_n⟩; n = 1, 2, …). For each walking/hovering classification, we considered the five positional observations made within the last four seconds (ω = 5). As visitors hover slightly less than 69% of the time, and walk between exhibits for the rest of the time, we under-sampled the hovering portion of the training data to balance the classes. <ref type="foot" target="#foot_6">6</ref></p><p>Figure <ref type="figure">2</ref> depicts classification accuracy as a function of sensor error, where the majority class baseline (MCL) assumes that a person is always hovering (the results are averaged over the 22 exhibits of the Marine Life Exhibition). With no sensor error, our SVM classifier (TL_all) infers whether a visitor is walking or hovering with approximately 97% accuracy. Classification accuracy decreases to about 88% as the sensor error increases to 2.75 metres (the middle of the range for WiFi technology).</p><p>Inferring exhibits from positional hovering coordinates. To evaluate the performance of our mechanism for inferring the sequence of visited exhibits, we gave as input sequences of times and positions (⟨t_n, x_n, y_n⟩; n = 1, 2, …) and walking/hovering labels (one label for each element in a sequence). The probabilities of viewed exhibits were calculated once for given (known) walking/hovering labels, and once for labels inferred using the SVM classifier (Section 5.1). 
The inferences were made as described in Section 5.2, and resulted in a probability distribution of the exhibit being viewed by a visitor for each sub-sequence of hovering labels.</p><p>Figure <ref type="figure" target="#fig_4">3</ref> depicts the average log loss (negative log of the probability of the actually viewed exhibit), averaged over the 22 exhibits, as a function of sensor error. The figure compares the performance obtained when the walking/hovering labels are inferred (TL_all) with that obtained when the labels are given (TLA_all). It is worth noting that the comparison was done for the timestamps where the inferred and given hovering labels overlap, but the exhibit probabilities used for the comparison were calculated for all the inferred or given hovering labels in each continuous sub-sequence of hovering labels. This explains the (expected) slight drop in performance for inferred hovering labels, since, as seen in the first experiment, the inferred labels are sometimes wrong. Also, as expected, performance deteriorates as sensor error increases.</p><p>Predicting the next exhibit. This experiment determines the effect of different assumptions regarding available information on predictive accuracy. We consider our four models from Table <ref type="table" target="#tab_0">1</ref>, whose information ranges from time-stamped positional sensor logs (TL_all) to sequences of viewed exhibits (Exh_all). In line with <ref type="bibr" target="#b3">Schmidt et al. [2009]</ref>, for all four models, the next exhibit was predicted using the transition matrix learned from the 44 tours observed at the Marine Life Exhibition (Section 3). 
For Exh_all, we used the transition matrix directly (the transition probabilities for previously visited exhibits were set to zero), while for the other models, we used the Weighted approach described in Section 5.3.</p><p>Figures <ref type="figure" target="#fig_5">4(a)</ref> and 4(b) show, respectively, the average top-3 accuracy and average log loss for various levels of sensor error for the four models described in Table <ref type="table" target="#tab_0">1</ref> (the results are averaged over the 22 exhibits). For this experiment, log loss is defined as the negative log of the probability with which the exhibit actually viewed next is predicted, and top-3 accuracy measures how often the exhibit actually viewed next is one of the three exhibits predicted with the highest probability. We employ top-3 rather than top-1 accuracy because the top probabilities are often quite similar due to the physical layout of the exhibition. As seen in the figures, the higher the uncertainty about a visitor's behaviour and the higher the sensor error, the lower the accuracy and the higher the log loss (these differences are statistically significant). Note that Exh_all is invariant to sensor noise, as all the information is assumed given (Table <ref type="table" target="#tab_0">1</ref>). Interestingly, the differences in performance between the three lower-information models (TL_all, TLA_all and Exh_prev TLA_curr) are relatively small, and their performance profiles are quite flat up to ν = 1.5 metres, diverging slightly from there on. The creditable performance up to ν = 1.5 metres suggests that one can expect acceptable predictive performance from sensor-based systems.</p></div>
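<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two measures defined above, and the averaging over bootstrap samples, can be sketched as follows. This is an illustrative sketch rather than the evaluation code used for the experiments: the function names are hypothetical, a prediction is assumed to be a dictionary mapping exhibit identifiers to probabilities, and the bootstrap is assumed to resample one score per test visit.</p>
<p><![CDATA[
```python
import math
import random


def log_loss(probs, actual):
    """Negative log of the probability assigned to the exhibit actually viewed."""
    return -math.log(probs[actual])


def in_top_k(probs, actual, k=3):
    """True if `actual` is among the k exhibits with the highest predicted probability."""
    ranked = sorted(probs, key=probs.get, reverse=True)
    return actual in ranked[:k]


def bootstrap_mean(per_visit_scores, n_samples=200, seed=0):
    """Average a per-visit score over bootstrap resamples of the test set.

    Each resample draws len(per_visit_scores) visits with replacement;
    the training set (not shown here) stays fixed across resamples.
    """
    rng = random.Random(seed)
    n = len(per_visit_scores)
    sample_means = []
    for _ in range(n_samples):
        resample = [per_visit_scores[rng.randrange(n)] for _ in range(n)]
        sample_means.append(sum(resample) / n)
    return sum(sample_means) / n_samples


# Hypothetical prediction over four exhibits, for illustration only.
prediction = {"A": 0.5, "B": 0.25, "C": 0.15, "D": 0.1}
loss_if_b_viewed = log_loss(prediction, "B")  # equals log(4)
hit_if_c_viewed = in_top_k(prediction, "C")   # C is ranked third, so this is True
```
]]></p>
<p>With 900 test visits, per_visit_scores would hold 900 values (for example, 1.0 or 0.0 for a top-3 hit or miss), and bootstrap_mean would return the figure averaged over 200 resamples.</p></div>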
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">Conclusions</head><p>This paper offered a realistic model of sensor-based information, significantly extending the work of <ref type="bibr" target="#b3">Schmidt et al. [2009]</ref>. Our framework enables us to study the impact of different assumptions regarding sensor noise and available sensor information on inferential performance regarding viewed exhibits. The accuracy of these inferences in turn affects the performance of user models, viz., models of visitors' interests and of exhibits they are likely to visit. As expected, predictive performance deteriorates for every experimental parameter that is inferred (rather than given), and also as sensor error increases. However, interestingly, performance remains quite stable for sensor noise up to 1.5 metres, which is an encouraging result for real-world systems.</p><p>Our inferential and predictive models in combination support the generation of recommendations of exhibits that may be of interest but are likely to be missed. Our models may also be used to influence the strength of recommendations as a function of the reliability of the information on which the recommendations are based. An additional application of our results is in guiding the layout of sensing devices in a museum; e.g., it may be advantageous to place more devices in locations where the inferences are more uncertain.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Two representations of part of a simulated visitor pathway</figDesc><graphic coords="3,54.00,58.98,241.92,92.79" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head></head><label></label><figDesc>Figure 1(a) shows the trajectory obtained after simulation (walking is represented by a red/grey line, hovering is represented by a blue/dark-grey line on pink/shaded squares, and wall squares are coloured in blue/grey), and Figure 1(b) is the representation obtained by applying Gaussian sensor noise at a level of ν = 2 metres.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head></head><label></label><figDesc>Figure 1(a) depicts part of one such smooth visit trajectory.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head></head><label></label><figDesc>• Mean and median of the ω − 1 velocities • Standard deviation of the ω − 1 velocities • ω − 2 accelerations (each of them calculated as the length of one of the ω − 2 acceleration vectors, which in turn are derived from the ω − 1 velocity vectors) • Minimum and maximum of the ω − 2 accelerations • Mean and median of the ω − 2 accelerations • Standard deviation of the ω − 2 accelerations</figDesc></figure>
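<div xmlns="http://www.tei-c.org/ns/1.0"><p>The velocity and acceleration features listed above can be sketched as follows for a window of ω = 5 observations. This is a sketch under stated assumptions, not the feature extraction used for the SVM classifier: the function names are hypothetical, velocities are taken to be the lengths of finite-difference velocity vectors, the population standard deviation is used, and the same summary statistics are computed for velocities and accelerations even though the list above is only partly recoverable for velocities.</p>
<p><![CDATA[
```python
import math
import statistics


def differences(times, vectors):
    """Finite differences of consecutive 2-D vectors, stamped at interval midpoints."""
    mid_times, diffs = [], []
    for i in range(len(vectors) - 1):
        dt = times[i + 1] - times[i]
        mid_times.append((times[i] + times[i + 1]) / 2.0)
        diffs.append(((vectors[i + 1][0] - vectors[i][0]) / dt,
                      (vectors[i + 1][1] - vectors[i][1]) / dt))
    return mid_times, diffs


def summarise(prefix, values):
    """Summary statistics of a list of scalar values (population std is an assumption)."""
    return {
        prefix + "_values": list(values),
        prefix + "_min": min(values),
        prefix + "_max": max(values),
        prefix + "_mean": statistics.mean(values),
        prefix + "_median": statistics.median(values),
        prefix + "_std": statistics.pstdev(values),
    }


def window_features(window):
    """Features for one window of (t, x, y) observations, oldest first."""
    times = [t for t, _, _ in window]
    positions = [(x, y) for _, x, y in window]
    v_times, velocities = differences(times, positions)   # omega - 1 velocity vectors
    _, accelerations = differences(v_times, velocities)   # omega - 2 acceleration vectors
    speeds = [math.hypot(vx, vy) for vx, vy in velocities]
    accels = [math.hypot(ax, ay) for ax, ay in accelerations]
    features = summarise("speed", speeds)
    features.update(summarise("accel", accels))
    return features


# A visitor moving at a constant 1 m/s along the x axis, sampled once per second.
steady = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 2.0, 0.0), (3.0, 3.0, 0.0), (4.0, 4.0, 0.0)]
steady_features = window_features(steady)
```
]]></p>
<p>For the steady window, all four speeds are equal and all three accelerations are zero, so the spread statistics vanish; under sensor noise, both speeds and accelerations would vary within the window, which is the kind of signal these features are meant to expose to a classifier.</p></div>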
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Average log loss of actually viewed exhibits against sensor error</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Predictive performance of the four models against sensor error</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Inference models and their experimental conditions</figDesc><table><row><cell>Models</cell><cell>Time &amp; (x, y)</cell><cell>Walk/Hover</cell><cell>Previous exhibits</cell><cell>Current exhibit</cell></row>
<row><cell>TL_all</cell><cell>sequence of ⟨t, x, y⟩</cell><cell>Inferred</cell><cell>Inferred</cell><cell>Inferred</cell></row>
<row><cell>TLA_all</cell><cell>sequence of ⟨t, x, y⟩</cell><cell>Given</cell><cell>Inferred</cell><cell>Inferred</cell></row>
<row><cell>Exh_prev TLA_curr</cell><cell>sequence of ⟨t, x, y⟩</cell><cell>Given</cell><cell>Given</cell><cell>Inferred</cell></row>
<row><cell>Schmidt et al.</cell><cell>one ⟨x, y⟩ per exhibit</cell><cell>N/A</cell><cell>Given</cell><cell>Inferred</cell></row>
<row><cell>Exh_all</cell><cell>sequence of ⟨t, x, y⟩</cell><cell>Given</cell><cell>Given</cell><cell>Given</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Average walking/hovering classification accuracy against sensor error (y-axis: average classification accuracy; x-axis: sensor error (ν); curves: TL_all and MCL)</figDesc></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">In our evaluation, we use fixed parameter values. Alternatively, one could sample the values for each trajectory simulation. Also, certain parameter values in combination with different transition models may yield different types of museum visitors, e.g., the ant, fish, butterfly and grasshopper types [Véron and Levasseur, 1983; Zancanaro et al., 2007].</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_2">For simplicity of notation, we use (x, y) instead of (x , y ) in the remainder of the paper to denote the measured noisy coordinates.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_3">Predictions of a visitor's next exhibits can be combined with predictions of the personally interesting exhibits to generate recommendations of exhibits that may be overlooked if the predicted next exhibits are actually visited.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_4">Our observations indicate that visitors rarely return to previously viewed exhibits. Hence, we focus on unseen exhibits.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_5">We employed bootstrapping because only the test data vary under this technique, whereas cross-validation conflates the variation in the training and test data.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_6">We under-sampled the larger class, rather than over-sampling the smaller class, in order to retain the variance of the latter class. We also experimented with unbalanced data, but the performance was inferior to that obtained with the balanced data.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgements</head><p>This research was supported in part by grant DP0770931 from the Australian Research Council. The authors thank Daniel F. Schmidt for his involvement at early stages of this research, Liz Sonenberg and Carolyn Meehan for fruitful discussions and their support, and David Abramson, Jeff Tan and Blair Bethwaite for their assistance with the computer cluster.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Using interest and transition models to predict visitor locations in museums</title>
		<author>
			<persName><forename type="first">Paramvir</forename><surname>Bahl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Venkata</forename><forename type="middle">N</forename><surname>Padmanabhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Fabian</forename><surname>Bohnert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ingrid</forename><surname>Zukerman</surname></persName>
		</author>
		<ptr target="http://www.csie.ntu.edu.tw/~cjlin/libsvm" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 17th International Conference on User Modeling, Adaptation, and Personalization (UMAP-09)</title>
				<meeting>the 17th International Conference on User Modeling, Adaptation, and Personalization (UMAP-09)</meeting>
		<imprint>
			<date type="published" when="2000">2000. 2000. 2009. 2009. 2008. 2008. 2001. 2001. 2002. 2002</date>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="page" from="47" to="51" />
		</imprint>
	</monogr>
	<note>AI Communications</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Canalda, Pascal Chatonnay, and François Spies. Indoor Wi-Fi positioning: Techniques and systems</title>
		<author>
			<persName><forename type="first">Edsger</forename><forename type="middle">W</forename><surname>Dijkstra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marek</forename><surname>Hatala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ron</forename><surname>Wakkary</surname></persName>
		</author>
		<author>
			<persName><surname>Hazas</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">User Modeling and User-Adapted Interaction</title>
				<imprint>
			<date type="published" when="1959">1959. 1959. 2005. 2005. 2004. 2004. 2009. 2009</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="651" to="664" />
		</imprint>
	</monogr>
	<note>Numerische Mathematik</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">User-centred design of flexible hypermedia for a mobile guide: Reflections on the HyperAudio experience</title>
		<author>
			<persName><forename type="first">Jonas</forename><surname>Lundgren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christopher</forename><forename type="middle">Z</forename><surname>Mooney</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Robert</forename><forename type="middle">D</forename><surname>Duval</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Daniela</forename><surname>Petrelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Elena</forename><surname>Not</surname></persName>
		</author>
		<author>
			<persName><surname>Philipose</surname></persName>
		</author>
		<ptr target="http://www.mathworks.com/matlabcentral/fileexchange/13812-fit-a-spline-to-noisy-data" />
	</analytic>
	<monogr>
		<title level="j">User Modeling and User-Adapted Interaction</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="issue">3-4</biblScope>
			<biblScope unit="page" from="50" to="57" />
			<date type="published" when="1993">2007. 2007. 1993. 1993. 2005. 2005. 2004. 2004</date>
			<publisher>Sage Publications</publisher>
		</imprint>
	</monogr>
	<note>IEEE Pervasive Computing</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Assessing the impact of measurement uncertainty on user models in spatial domains</title>
		<author>
			<persName><surname>Schmidt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 17th International Conference on User Modeling, Adaptation, and Personalization (UMAP-09)</title>
				<meeting>the 17th International Conference on User Modeling, Adaptation, and Personalization (UMAP-09)<address><addrLine>Paris, France</addrLine></address></meeting>
		<imprint>
			<publisher>Véron and Levasseur</publisher>
			<date type="published" when="1983">2009. 2009. 2007. 2007. 1983. 1983</date>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="page" from="257" to="304" />
		</imprint>
		<respStmt>
			<orgName>Bibliothèque Publique d&apos;Information, Centre Georges Pompidou</orgName>
		</respStmt>
	</monogr>
	<note>Eliséo Véron and Martine Levasseur. Ethnographie de l&apos;Exposition</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Cultivating personalized museum tours online and on-site</title>
		<author>
			<persName><forename type="first">Wang</forename></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 11th International Conference on User Modeling (UM-07)</title>
				<editor>
			<persName><forename type="first">Guibas</forename><surname>Zhao</surname></persName>
		</editor>
		<meeting>the 11th International Conference on User Modeling (UM-07)</meeting>
		<imprint>
			<publisher>Morgan Kaufmann</publisher>
			<date type="published" when="2004">2009. 2009. 2007. 2007. 2004. 2004</date>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="page" from="238" to="246" />
		</imprint>
	</monogr>
	<note>Wireless Sensor Networks: An Information Processing Approach</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
