Analysing visitor flow using a Bluetooth positioning system

Pieter van den Ham¹, Bert Bredeweg¹ and Maartje Raijmakers²

¹ Faculty of Science, University of Amsterdam, Amsterdam, The Netherlands
² Educational Studies, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
{p.e.vandenham@gmail.com, b.bredeweg@uva.nl, m.e.j.raijmakers@vu.nl}

    This contribution proposes a Bluetooth fingerprinting-based system that
can be used to analyse participant movement within a public space [2]. Several
classification algorithms, such as Naive Bayes, k-Nearest Neighbors and SVM,
are compared to determine which is the best fit for the system. The data
collected by the system yields metrics such as time spent in a location and
movement patterns. We conducted two experiments in a science museum, with and
without regular visitors, to analyse the system's performance. Finally, several
suggestions are provided on how the system may be improved.
    Until Bluetooth 4.0 was introduced in 2010, the de facto standard for indoor
localization was a technique known as “Wi-Fi fingerprinting”, which uses the
received signals and their strengths to map the user to a known, pre-recorded
location. Bluetooth 4.0 specifies a subsystem known as Bluetooth Low Energy, a
protocol built specifically for Internet of Things applications, with lower
energy usage and shorter scan times. Similarly to Wi-Fi fingerprinting,
Bluetooth fingerprinting works by observing incoming signals and matching the
resulting signal vector (the fingerprint) against known fingerprints recorded at
reference locations. Under ideal conditions (low interference, 1 beacon per
30 m²), a positioning error below 2.5 meters can be achieved 90% of the time,
which is a significant improvement over Wi-Fi fingerprinting (8.5 meters 95% of
the time) [1]. Received Signal Strength Indicator (RSSI) fingerprinting seems to
be the most promising technique because of its high theoretical accuracy [1].
RSSI fingerprinting localization systems differ mainly in the classifiers they
use to classify RSSI vectors.
    The approach we followed uses a machine learning algorithm such that
classify(u) yields the location at which the RSSI feature vector u was most
likely recorded, ideally alongside a probability. However, every supervised
machine learning algorithm must first be trained on manually labelled RSSI
vectors before it can make accurate estimations. Fingerprinting-based systems
are therefore split into two distinct phases: a training phase, during which a
database of labelled training data is constructed and fed to a machine learning
algorithm (the classifier), and a “live” phase, during which the trained
classifier provides probability estimations for a given RSSI vector.
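    The two phases can be illustrated with a minimal sketch. It assumes
scikit-learn's KNeighborsClassifier as the classifier; the RSSI values, the
section names and the helper name classify are made up for illustration and
are not taken from the experiments.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Training phase: manually labelled RSSI vectors (dBm), one column per beacon.
# All values and section names are hypothetical.
TRAIN_VECTORS = np.array([
    [-55.0, -80.0, -90.0],
    [-58.0, -82.0, -88.0],
    [-60.0, -79.0, -91.0],
    [-85.0, -57.0, -75.0],
    [-83.0, -60.0, -78.0],
    [-86.0, -55.0, -74.0],
])
TRAIN_LABELS = np.array(
    ["entrance", "entrance", "entrance", "hall", "hall", "hall"]
)

classifier = KNeighborsClassifier(n_neighbors=3)
classifier.fit(TRAIN_VECTORS, TRAIN_LABELS)


def classify(u):
    """Live phase: return (location, probability) for one RSSI vector u."""
    row = np.asarray(u, dtype=float).reshape(1, -1)
    probabilities = classifier.predict_proba(row)[0]
    best = int(np.argmax(probabilities))
    return str(classifier.classes_[best]), float(probabilities[best])


location, probability = classify([-57.0, -81.0, -89.0])
```

For k-Nearest Neighbors, the returned probability is simply the fraction of
the k nearest training vectors that carry the predicted label, so it is a
coarse estimate when k is small.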
    Estimote’s Proximity Beacons were used as the system’s Bluetooth beacons.
The packets emitted by these beacons are received by a mobile device running
Android. A central server hosts the database to which mobile devices upload
the packets they receive for analysis.

    Copyright © 2019 for this paper by its authors. Use permitted under Creative
    Commons License Attribution 4.0 International (CC BY 4.0).

    The final and most important concern is building a classifier that can
accurately translate RSSI vectors into locations. scikit-learn (a machine
learning library for Python; https://www.scikit-learn.org/) allows us to build
this functionality using pipelines. A pipeline, in a scikit-learn context,
comprises a series of preprocessing and classification operations, combined into
one entity. A pipeline has to be fitted to training data before it can transform
unlabelled test data into locations. A diagram representing the pipeline used
for RSSI classification can be found in Fig. 1. On a small dataset of 393
samples, the system reached 95% accuracy, estimated with k-fold cross-validation
(k = 5) to limit bias.
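
    A pipeline of this shape might be assembled as in the sketch below. The
step order follows Fig. 1; everything else is an assumption made for
illustration: the k-Nearest Neighbors classifier, the constant fill value, the
features="all" setting and the synthetic data (which does not reproduce the
393-sample dataset or the reported accuracy).

```python
import numpy as np
from sklearn.impute import MissingIndicator, SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data: 3 sections, 4 beacons, 40 samples per section.
rng = np.random.default_rng(0)
centers = {
    "A": [-55.0, -80.0, -90.0, -85.0],
    "B": [-85.0, -55.0, -80.0, -90.0],
    "C": [-90.0, -85.0, -55.0, -80.0],
}
X = np.vstack([rng.normal(c, 3.0, size=(40, 4)) for c in centers.values()])
y = np.repeat(list(centers.keys()), 40)
# A beacon that was not heard during a scan shows up as a missing value.
X[rng.random(X.shape) < 0.1] = np.nan

pipeline = Pipeline([
    # MinMaxScaler ignores NaN when fitting and keeps it when transforming.
    ("scale", MinMaxScaler()),
    ("features", FeatureUnion([
        # Assumption: treat an unheard beacon as the weakest scaled signal.
        ("impute", SimpleImputer(strategy="constant", fill_value=0.0)),
        # One indicator column per beacon, so the width is fixed across folds.
        ("missing", MissingIndicator(features="all")),
    ])),
    ("classify", KNeighborsClassifier(n_neighbors=5)),
])

# 5-fold cross-validation; each fold refits the whole pipeline.
scores = cross_val_score(pipeline, X, y, cv=5)
mean_accuracy = scores.mean()

# Fit on all data, then map an unlabelled vector to a section.
pipeline.fit(X, y)
predicted = pipeline.predict([[-56.0, np.nan, -88.0, -84.0]])[0]
```

Cross-validating the pipeline as a whole, rather than only the classifier,
keeps the scaling and imputation steps from seeing the held-out fold.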


[Figure: Pipeline. Feature vector -> Min-Max Scaling -> Feature Union of
(Impute missing values, Missing Indicator) -> Classifier -> Location]

                       Fig. 1. The RSSI vector classification pipeline.


    The experiments confirmed that small, closely spaced sections were more
difficult to classify than large, relatively isolated sections, mainly due to
Bluetooth’s susceptibility to noise [3]. Furthermore, the system proved to be
accurate enough for tracking and analysis purposes and will be used to analyse
visitor flow in a science museum.
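
    Metrics such as time spent in a location and movement patterns could be
derived from the classifier output roughly as sketched below. The function name
visitor_metrics, the track format (time-ordered pairs of a timestamp in seconds
and a predicted section) and the sample values are hypothetical.

```python
from collections import Counter


def visitor_metrics(track):
    """Compute dwell time per section and transition counts.

    track: time-ordered list of (timestamp_seconds, section) predictions.
    """
    dwell = Counter()
    transitions = Counter()
    for (t0, loc0), (t1, loc1) in zip(track, track[1:]):
        # The time until the next observation is credited to the current section.
        dwell[loc0] += t1 - t0
        if loc1 != loc0:
            transitions[(loc0, loc1)] += 1
    return dict(dwell), dict(transitions)


track = [
    (0, "entrance"),
    (10, "entrance"),
    (20, "hall"),
    (50, "hall"),
    (60, "entrance"),
]
dwell, transitions = visitor_metrics(track)
```

The last observation contributes no dwell time because no later observation
bounds it; a real analysis would also need to smooth out isolated
misclassifications before counting transitions.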


References
1. Faragher, R., Harle, R.: Location fingerprinting with Bluetooth Low Energy
   beacons. IEEE Journal on Selected Areas in Communications 33(11), 2418–2428
   (2015). https://doi.org/10.1109/JSAC.2015.2430281
2. van den Ham, P., Bredeweg, B., Raijmakers, M.: Analysing visitor flow using a
   Bluetooth positioning system (2019),
   http://scriptiesonline.uba.uva.nl/scriptie/692291
3. Kouyoumdjieva, S.T., Karlsson, G.: Experimental Evaluation of Precision of a
   Proximity-based Indoor Positioning System. Tech. rep.,
   https://www.arubanetworks.com