RADIO MAP ESTIMATION WITH NEURAL
 NETWORKS AND ACTIVE LEARNING FOR INDOOR
              LOCALIZATION

     Sıla Güler1[0000−0001−7866−6433] , F. Serhan Daniş2,3[0000−0002−8813−9220] , and Ali Taylan
                                    Cemgil2[0000−0003−4463−8455]
             1
                 Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands
                                             s.guler@tue.nl
                             2
                               Boğaziçi University, 34342 Istanbul, Turkey
                           3
                             Galatasaray University, 34349 Istanbul, Turkey


      Abstract. Accurate estimation of the radio frequency maps is key in practical indoor local-
      ization, but this requires dense sampling of the electromagnetic field, which is also known as
      fingerprinting. To decrease the time-consuming fingerprinting process, we both aim to esti-
      mate probabilistic radio frequency maps (RM) using artificial neural networks (ANN), and
      automatically pick the most informative fingerprint positions following an active learning
      strategy based on uncertainty sampling, aided by Gaussian Processes (GP). Once we have
      an estimate of the RM of the environment, the problem can be formulated as a tracking
      problem with a Hidden Markov Model (HMM), and the RM can be used as the observa-
      tion densities of the HMM. The trajectories of sequential positions are approximated with
      a sequential Monte Carlo filter. The results indicate that fingerprint measurements can be
      reduced by 30% in return of 8.1% decrease in the localization performance.

      Keywords: Indoor localization, neural networks, fingerprint reduction, radio map estima-
      tion, uncertainty sampling


1   INTRODUCTION

Indoor localization has become a widespread research area recently due to the use cases of mar-
keting, healthcare [3] and asset tracking. Indoor localization approaches are divided into three
categories: geometry-based, propagation-based, and fingerprint-based. Geometry-based approaches
[5,10] uses triangulation or trilateration principles to calculate the position of the object. However,
they are not capable of modeling the multi-path and shadowing effects. The propagation-based
approaches model the attenuation of the signal power, but as environments vary, these approaches
require to model the power of the signals with different parameters for each environment. Recently,
fingerprint-based localization approaches has become common in this research area.
    Fingerprint-based approaches have two stages. The first stage is the fingerprint collection at
specified positions in the environment. A fingerprint is a statistic of the signal strength at a co-
ordinate. It can be in the units of Received Signal Strength Indication (RSSI) or Signal-to-Noise
Ratio (SNR). The second stage is the position estimation based on the data driven fingerprints.
The number of collected fingerprints and the position estimation methodology are two important
factors that affect the localization accuracy. Accordingly, we analyze the studies using different
position estimation and fingerprint reduction methodologies.
    One of the earliest fingerprint-based approach is RADAR [2]. They estimated the positions
using kNN using a previously created RM of WiFi of 2.94-meter and 1-meter resolutions. In order
to reduce the fingerprint measurement cost, they modeled the signal propagation with the Wall
Attenuation Model and calculated the fingerprints at the other positions in their test environment.
By using the calculated fingerprints with the nearest neighbor technique, they achieved 4.3-meter
localization accuracy.
    In [1] and [16], they used RSSI values and Channel Impulse Response as inputs for the neural
network training for estimating the positions. Both achieved accuracies under 0.6 meters. Similarly,
in [17], they used Auto Encoders consisting of two deep neural networks functioning as encoder and
decoder and achieved 1.82-meter mean accuracy. In [13], they computed propagated signal power
2       S. Guler et al.

distributions for each position and stored them in a probabilistic RM. They used both maximum
likelihood estimation (MLE) and neural networks in localization stage and achieved 0.27-meter
and 1-meter positioning error respectively. In [14], they generated a magnetic field map with the
Gaussian Processes. They used Sequential Monte Carlo Method for the localization and achieved
a 4.87-meter median position accuracy.
    In [7], they focus on fingerprint reduction by extracting the hidden structure and redundancy
of fingerprint matrices by using Singular Value Decomposition (SVD). The missing values were
imputed via the kNN algorithm. They recovered the original fingerprints from only 5% of the
collected data with an error rate less than 14%. In [9], they suggested an artificial fingerprint
generation method based on signal propagation model. They generated two different RMs with the
measured and the artificially generated fingerprints. They used the kNN approach to evaluate the
localization performance based on these maps. They reduced the positioning error by 63% with
the artificially generated fingerprints.
    In this work, we hypothetize that both neighboring fingerprints and their coordinates are re-
lated to each other. Modeling these relations with Gaussian Processes (GP) and Artificial Neural
Networks (ANN), we propose a novel reduction strategy to be applied to any fingerprint-based
localization problem. Our reduction strategy combines two methods: ANNs are used to estimate
the RM of fingerprints in the positions where no fingerprint data is originally available, and GPs
are used to model the uncertainty of fingerprints. For evaluation, we compare the localization er-
ror with and without these methods. We approach our problem as a tracking problem based on
HMM and use SMC to approximate the original trajectory. We calculate the localization error by
comparing the estimated trajectory with the ground truth.


2     METHODOLOGY

2.1   Radio Frequency Map Estimation

The relationship between an RSSI distribution at a position and its 2D position coordinates is a
complex relationship, as it bases on the degenerative effects on signals, such as path loss, shadowing,
or even the architecture of the environment. ANNs are good at modeling complex relationships and
multi-class classification problems where the output layer represents a probability distribution.
    Five different sets of input features, Si , are constructed with different combinations of the
following features: the distances between a BLE transmitter (i.e. beacon) and a bluetooth receiver
(i.e., dongle), fingerprints in the form of RSSI histograms on the neighboring positions and their
distances to the beacon. In Table 1, P T stands for the distance of the position to the beacon, P Fi
for the distance of the position to the ith closest fingerprint, T Fi for the distance between the ith
closest fingerprint and the beacon and Hi for the histogram on the ith closest fingerprint, with
indices for the rank of the proximity.


                                       Table 1. Input Features

                          Feature Sets                Features
                              S1                          PT
                              S2                    P F1 , T F1 , H1
                              S3                  P F1,2 T F1,2 , H1,2
                              S4              P F1,2,3 , T F1,2,3 , H1,2,3
                              S5         P T, P F1,2,3,4 , T F1,2,3,4 , H1,2,3,4


    We train a Single Layer Neural Network (SNN) using these five feature sets. For every set we
select different number of units in the hidden layer ranging from 20 to 520. As an SNN may not be
successful enough to approximate the complex function between our input features and output, we
also train a Deep Neural Network (DNN) with two hidden layers. We feed the features to the input
                                  RADIO MAP ESTIMATION WITH NEURAL NETWORKS                         3

layers and we estimate the discrete probability distribution of RSSI values ranging from −100 to
−20.
    We compare two distributions using a cross-bin comparison metric that is also called the squared
Earth Mover’s Distance (EMD2 ) [12]. It corresponds to the square sum of element-wise difference
between the cumulative distribution functions (CDF) of two distributions. In an ordered multi-class
classification problem it becomes equivalent to Mallow’s distance [8]:

                                   EMD2 (p, q) ∝ ||FP (p) − FQ (q)||22                             (1)

 where p and q are original and predicted distributions respectively, and FP (p) and FQ (q) are their
cumulative distribution functions.
   For intermediate and top layers of ANNs, we select the sigmoid function as the activation
function to model complex relations of features. The weights and biases are initialized with the
Glorot Initializer and as zero respectively. Weights and bias values are optimized via the RMSProp
algorithm [6].


2.2   Active Learning with Gaussian Processes

We use GP to model the mapping functions between the positions and the fingerprints together
with the uncertainty in these positions. GP is composed of random variables, f , any finite set of
which have a joint Gaussian distribution. The positions and the RSSI values are the observations,
and the functions are the latent variables (see Fig. 1). When the functions for the given observations
are known, the function value, f∗ , for a new position, x∗ , can be calculated.


                                      r1        r2       r...         r∗


                                      f1        f2       f...         f∗


                                      x1        x2       x...         x∗


                     Fig. 1. Graphical Model of Gaussian Process Regression [11]


   We assume that the distances between the positions makes a difference in RSSI distributions.
We use a squared exponential function as the covariance function with the kernel k(xi , xj ). The
hyper-parameters of the kernel, σf , σn , and l are optimized with gradient ascent by maximizing
the log marginal likelihood as in [11]. Each function for each input value is treated as a random
variable and they have a joint Gaussian distribution.The predictive distribution of RSSI values is
computed as follows:

                                      f∗ |x∗ , x, r, ∼ N (µ∗ , Σ∗ )
                         µ∗ = m(x) + K(x∗ , x)[K(x, x) + σn2 I]−1 (r − m(x))                       (2)
                        Σ∗ = K(x∗ , x∗ ) − K(x∗ , x)[K(x, x) + σn2 I]−1 K(x, x∗ )

 where r and x are the RSSI observations and positions respectively. The function value can be
predicted only by calculating the correlation between existing positions and a new coordinate.
The output value r∗ differs from f∗ only in the sense that it has an additional noise term in its
covariance, Σ∗ + σn2 I.
   Predictive variance, Σ∗ , is used for iterative active fingerprint selection. In the initial step,
a random fingerprint is selected for training. At each step, the fingerprint that has the greatest
variance is added to the training set. Retraining with the newly added fingerprint, predictive
variance for each of the remaining positions is recalculated and this procedure continues until we
reach a target number of fingerprints.
4      S. Guler et al.

2.3   Tracking with Radio Frequency Map
We model the tracking problem as HMM, as in Fig. 2. Measurements, rt , are the RSSI values and
the latent variables, xt , are the positions. We are to estimate the current position, xt , given the
observations so far. Filtering, p(xt , r1:t ), is a recursive process:


                     x0            x1              ...               xt            ...           xT


                                   r1                                rt                          rT


                             Fig. 2. Graphical Model for Tracking Problem


                                                       t
                                                       X
                          p(xt , r1:t ) = p(rt |xt )         p(xt−i+1 |xt−i )p(xt−i , r1:t−i )        (3)
                                                       i=1


    For the transition density, p(xt−i+1 |xt−i ), a diffusion-based motion model is used as explained
in [4]. For the observation densities, p(rt |xt ), we use the RM estimations, matrices that store the
conditional probabilities, p(r|x). When the latent variable space is too big and/or continuous, the
exact method will be intractable and/or impossible to apply. We employ an SMC method, the
particle (bootstrap) filter, to approximate the conditional marginal distribution, p(xt |r1:t ). We
evaluate the final results by measuring the filtered trajectory of positions with the ground truth
trajectories (see [4] for details).

3     EXPERIMENTS AND RESULTS
We conducted our experiments on two different datasets, collected from two different areas, the
first of which, A1 , is a living room (5.28 × 6.35 m2 ) containing six BLE beacons, logged at 50
positions each for 24 hours [4] (see Fig. 3a). The second area, A2 , is a home simulation environment
(10.40 × 5.95 m2 ) [15], with six beacons, logged at 50 positions each for 20 minutes. (see Fig. 3b).


                    (a) A1                                                                 (b) A2

                 Fig. 3. Fingerprints are represented by black dots on two test areas.


   We used 70% of these fingerprints for training, and the remaining for validation and test
purposes. The training data were used for optimizing the weights and biases, and the validation
                                  RADIO MAP ESTIMATION WITH NEURAL NETWORKS                         5

data for selecting the best model. The test data were used only for evaluating the localization
performance. All computations were performed on an Intel Core i7 based computer and 8GB of
RAM. The models were constructed on Keras 2.1.5 with Tensorflow as backend.
    For the test area of A1 , we trained the SNN with five feature sets (see Table 1) and 25 different
hidden unit sizes. We repeated the training process ten times, constructing ten different randomly
selected fingerprint configurations which are then used to compare the results of these feature sets.
We evaluate the performance by measuring the average validation loss over all beacons. Fig. 3
shows that S1 yields much greater error than the remaining feature sets, meaning the estimations
that depend solely on the distance to the beacons perform very poorly. S4 has the best performance
with minimum EMD2 loss with 60 units in the hidden layer. It shows that the input feature with
the four nearest fingerprint data, their distances to the beacons, and their histograms performs
better than the other feature sets. We trained the DNN with hidden unit sizes ranging from 10 to
100. Finally, the configuration of S4 and 40 units is the best architecture for each layer, as seen in
Fig. 3. Repeating the same experiments for the fingerprints collected in the A2 , we found that S3
and 40 hidden units in the hidden layers performs better for both SNN and DNN (see Fig. 3).


                                                (a) A1


                                                (b) A2

                 Fig. 4. Errors with respect to hidden unit sizes for two environments


     To evaluate our strategy, we trained ANNs using (i) all collected fingerprints, (ii) randomly
selected 70% of the fingerprints and (iii) actively selected 70% of the fingerprints. With the trained
networks, we made estimations of dense RMs with 0.1-meter intervals for ten times.
     We used the constructed maps as the observation distribution in the particle filter to estimate
the trajectory points. We applied the SMC with 2000 particles and at 400 trajectory points in A1
and A2 . In each run, the first 32 estimated points were omitted as burn-in period. We run the
filtering algorithm ten times for each radio frequency map and each time particles were initialized
randomly. This prevented us from getting poor performance because of the random initialization.
We reported the performance of our methodology as median tracking error instead of the mean
absolute error, because errors form skewed distributions, which are clearly non-Gaussian. Tracking
accuracies in A1 and A2 are given in Fig. 5a.
     We have three observations. The first one is that deep neural networks cause higher median
error in both areas. Second one is when the training size is decreased from 100% to 70% of data,
6       S. Guler et al.

median error increases. Finally, the median error for the actively selected data in each test area is
lower than the one for the randomly selected data regardless of the neural network type.
    As seen in Table 2, we can reduce the number of training data by 30% while increasing the
tracking accuracy by at most 8.1% in return. For A1 , the resulting increase are 1.9% and 2.3% for
SNN and DNN, respectively. For A2 , the resulting increase are 8.1% and 3.96%.


Fig. 5. Comparisons of neural network performances in both areas with different selection strategies. “All”
refers to the case when all collected fingerprints are used.


Table 2. Accuracy results of NNs in both areas. 50% corresponds to the second quartile (median) error
of positioning error distribution.

                                A1                                    A2
                  NN type Fingerprints 50%              NN type Fingerprints 50%
                                   All       2.10                        All       2.83
                      SNN       Random       2.28           SNN       Random       3.21
                                 Active      2.14                      Active      3.06
                                   All       2.20                        All       3.03
                      DNN       Random       2.29           DNN       Random       3.22
                                 Active      2.25                      Active      3.15


4    CONCLUSION

We propose a novel approach to reduce the training size required to estimate probabilistic radio
maps by picking training positions adaptively. We have two conclusions showing the success of our
approach. First, we pick the fingerprint positions in an adaptive manner, we estimate fingerprints
at remaining positions better than we do with the randomly selected fingerprints in both test areas.
Second, we reduced the training size by 30% with an increase in the median error by a maximum
of 8.1%. Since fingerprinting is a time-consuming procedure in indoor localization field, this study
would be used to reduce the fingerprinting process of any other study in the indoor localization
research area.
                                    RADIO MAP ESTIMATION WITH NEURAL NETWORKS                             7

References
 1. Altini, M., Brunelli, D., Farella, E., Benini, L.: Bluetooth indoor localization with multiple neural
    networks. In: IEEE 5th International Symposium on Wireless Pervasive Computing 2010. pp. 295–300
    (May 2010)
 2. Bahl, P., Padmanabhan, V.N.: Radar: an in-building rf-based user location and tracking system. In:
    Proceedings IEEE INFOCOM 2000. Conference on Computer Communications. Nineteenth Annual
    Joint Conference of the IEEE Computer and Communications Societies (Cat. No.00CH37064). vol. 2,
    pp. 775–784 vol.2 (2000)
 3. Calderoni, L., Ferrara, M., Franco, A., Maio, D.: Indoor localization in a hospital environment using
    random forest classifiers. Expert Systems with Applications 42(1), 125 – 134 (2015)
 4. Daniş, F.S., Cemgil, A.T.: Model-based localization and tracking using bluetooth low-energy beacons.
    Sensors 17(11), 2484 (2017)
 5. Fang, B.T.: Simple solutions for hyperbolic and related position fixes. IEEE Transactions on Aerospace
    and Electronic Systems 26(5), 748–753 (Sept 1990)
 6. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press (2016)
 7. Gu, Z., Chen, Z., Zhang, Y., Zhu, Y., Lu, M., Chen, A.: Reducing fingerprint collection for indoor
    localization. Computer Communications 83, 56 – 63 (2016)
 8. Hou, L., Yu, C.P., Samaras, D.: Squared earth mover’s distance-based loss for training deep neural
    networks. CoRR abs/1611.05916 (2016)
 9. Li, Y., Shi, G., Zhou, X., Qu, W., Li, K.: Reducing the site survey using fingerprint refinement for
    cost-efficient indoor location. Wireless Networks 25(3), 1201–1213 (Apr 2019)
10. Peterson, B.B., Kmiecik, C., Hartnett, R., Thompson, P.M., Mendoza, J., Nguyen, H.: Spread spectrum
    indoor geolocation. Navigation 45(2), 97–102
11. Rasmussen, C.E.: Gaussian processes for machine learning. MIT Press (2006)
12. Rubner, Y., Tomasi, C., Guibas, L.J.: The earth mover’s distance as a metric for image retrieval.
    International Journal of Computer Vision 40(2), 99–121 (Nov 2000)
13. Saleem, F., Wyne, S.: Wlan–based indoor localization using neural networks. Journal of Electrical
    Engineering 67(4), 299 – 306 (2016)
14. Solin, A., Särkkä, S., Kannala, J., Rahtu, E.: Terrain navigation in the magnetic landscape: Particle
    filtering for indoor positioning. In: 2016 European Navigation Conference (ENC). pp. 1–9 (May 2016)
15. Tunca, C., Alemdar, H., Ertan, H., Incel, O.D., Ersoy, C.: Multimodal Wireless Sensor Network-Based
    Ambient Assisted Living in Real Homes with Multiple Residents. Sensors 14(6), 9692–9719 (2014)
16. Yu, L., Laaraiedh, M., Avrillon, S., Uguen, B.: Fingerprinting localization based on neural networks and
    ultra-wideband signals. In: 2011 IEEE International Symposium on Signal Processing and Information
    Technology (ISSPIT). pp. 184–189 (Dec 2011)
17. Zhou, C., Gu, Y.: Joint positioning and radio map generation based on stochastic variational bayesian
    inference for fwips. In: 2017 International Conference on Indoor Positioning and Indoor Navigation
    (IPIN). pp. 1–10 (Sept 2017)