=Paper=
{{Paper
|id=Vol-3097/paper37
|storemode=property
|title=Environment Classification Using RSS for Indoor Localization
|pdfUrl=https://ceur-ws.org/Vol-3097/paper37.pdf
|volume=Vol-3097
|authors=Maximilien Dufau,Elizabeth Colin,Vighnesh Gharat
|dblpUrl=https://dblp.org/rec/conf/ipin/DufauCG21
}}
==Environment Classification Using RSS for Indoor Localization==
<pdf width="1500px">https://ceur-ws.org/Vol-3097/paper37.pdf</pdf>
<pre>
Environment Classification Using RSS for Indoor Localization
Maximilien Dufau (1), Elizabeth Colin (1) and Vighnesh Gharat (2)
(1) Efrei Paris, 30-32 avenue de la République, Villejuif, 94800, France
(2) Ela Innovation, 297, rue Maurice Béjart, Montpellier, 34080, France

                 Abstract
                 Indoor localization using radio frequency is challenging because multipath propagation and
                 other complex phenomena make standard positioning techniques, such as multilateration,
                 inaccurate most of the time. However, knowing information about the environment can help
                 overcome these issues. In this paper, we use Machine Learning to classify the environment
                 as either a Hall or a Corridor in order to use a more precise propagation model for Received
                 Signal Strength (RSS) based localization. The classified samples are series of RSS values
                 taken from RFID tags. Our method achieves 98.57% accuracy. The novelty of this approach is
                 that it uses only a small series of RSS values and is thus lighter than the existing techniques.
                 To achieve this result, we use quadruplets of tags. We also compare the best quadruplets for
                 classification that we found with the best quadruplets for localization.
                 Keywords
                 Environment classification, RSS, UHF-RFID, hall localization, corridor localization, tag
                 positioning

1. Introduction
   Nowadays, indoor positioning is achieved through many techniques that all face different specific
issues [1]. Techniques based on Machine Learning, especially Fingerprinting, need a tedious
pre-deployment phase and are not resilient to even slight modifications of the environment [2]. The
complex behavior of electromagnetic waves in indoor environments makes Received Signal Strength (RSS)
based techniques less accurate than competing ones based on time or angles. However, as this behavior
depends on the type of environment, knowing information about the current environment can improve
the accuracy of the positioning. Moreover, this information can be used to support context-aware
applications.
   Environment classification in an indoor localization scenario is traditionally achieved using the
Radio Frequency (RF) signature [3-5]. The Channel Transfer Function (CTF) is used to extract the
different multipaths that compose the signal. Other quantities, such as the Frequency Coherence
Function, which is the complex autocorrelation of the CTF, are also often used as features for Machine
Learning. Different algorithms such as k-nearest neighbors or convolutional neural networks are then
used to classify the environment. These systems achieve classification accuracies of over 99%. However,
they are more complex and require more data than methods using only RSS measures. In this form, they
are thus not applicable in many localization scenarios.
   In this paper, we present a new classification method that uses only a small series of RSS measures
from RFID tags to classify the environment as either a hall or a corridor. Previous papers have
shown that using only RSS gives poor classification results [5]. However, we introduce the use of four
tags, also called quadruplets, to significantly improve the classification accuracy. Three different
classification algorithms are considered, k-nearest neighbors, classification tree and bagged tree, so that
we can compare their performance for different model parameters and analyze the tags’ quadruplets
behavior for each of them.


IPIN 2021 WiP Proceedings, November 29 -- December 2, 2021, Lloret de Mar, Spain
EMAIL: maximilien.dufau@efrei.net (A. 1); elizabeth.colin@efrei.fr (A. 2); vighnesh.gharat@elainnovation.com (A. 3)
ORCID: 0000-0001-5796-7974 (A. 1); 0000-0002-5992-8124 (A. 2)
              ©️ 2020 Copyright for this paper by its authors.
              Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
              CEUR Workshop Proceedings (CEUR-WS.org)
    In the end, our solution is easier to apply than the previous ones in a practical scenario. It can be
 added to existing RSS based location systems with little effort to improve their accuracy, as it allows
 using a specific propagation model for each environment.
    We also discuss the best positioning of tags for classification purposes and compare it with the best
 positioning for localization that was found in [6-8].
    This paper is organized as follows. Section 2 presents the classified environments and the measures.
 The classification methods with a single tag and with quadruplets are presented in section 3. In
 section 4 we discuss the characteristics of the best tags' quadruplets and compare them with the best
 quadruplets for localization. We conclude in section 5 with a discussion of the presented results and
 future work.

 2. Measures & context
 2.1. Environments
     Experiments are performed in a hall of approximately 205 square meters, and in a corridor of
 dimensions 30 m x 2 m, which can be seen in Fig. 1. Tags are placed alternately at heights of 1.3 meters
 and 2.1 meters. Both environments are meshed with sixty active RFID tags.
     We needed to verify that the measurements taken in the narrowest part of the hall were sufficiently
 different from the ones in the corridor to be labeled as hall measurements, since the dimensions of
 these parts are similar. We used k-means and hierarchical clustering to do so, with the conclusion that
 the natural labelling of the data was indeed the best one.

 2.2.    RFID equipment
     For this work we used active UHF-RFID tags operating in the 433 MHz band. The tags can be detected
 from as far as 20 meters in an indoor environment. A bidirectional traffic detection RFID reader with
 two ¼-wave antennas separated by 10.0 cm is used for the experiment. An RS232 connector links the
 reader to the computer. The tag model "Coin ID" and the reader model "UTP Diff 2" used for this work
 are from the RFID manufacturer Ela Innovation [9]. The reader provides an 8-bit RSSI (RSS
 Information) corresponding to the received power as shown in (1). The higher the RSSI, the lower
 the received power.

                                   $P_{dBm} = 30.84 - 0.632 \cdot RSSI$            (1)

     where $P_{dBm}$ is the received power in dBm and RSSI is the 8-bit value provided by the reader,
 expressed in decimal.
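
     Equation (1) can be applied directly in software. The following Python sketch is only an
 illustration: the function name is hypothetical and is not part of the reader's software.

```python
def rssi_to_dbm(rssi: int) -> float:
    """Convert the reader's 8-bit RSSI value to received power in dBm using (1)."""
    if not 0 <= rssi <= 255:
        raise ValueError("RSSI is an 8-bit value and must lie in [0, 255]")
    return 30.84 - 0.632 * rssi


# A larger RSSI value maps to a lower received power.
for value in (50, 100, 150):
    print(value, round(rssi_to_dbm(value), 2))
```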

 2.3.    Measurements
    The measurements considered here are similar to the ones presented in [6-8]. The RFID reader
 reads the RSS coming from sixty active tags. At a given position, 2x60 RSS acquisitions are made,
 one per tag for each of the two antennas. Multiple series of measurements are taken, separated
 by 30 cm each, and nobody passed through either environment during the collection of the
 measures. The measures were taken at the reference positions shown in Fig. 1.

 Figure 1: a) Tags' placement in the Hall, b) Tags' placement in the Corridor

2.4.    Scenario and data sample
    Our baseline scenario is the one of a moving agent only able to record Signal Strength at regular
intervals, e.g. a robot or a pedestrian carrying a smartphone and walking in an indoor environment. The
relative distance between successive RSS measures needs to be the same for each sample, in order for
them to always describe the same type of information.
    In a practical use scenario this can be achieved using dead reckoning or with the help of an
accelerometer for example. However, these kinds of techniques are only suitable for a few steps and
quickly become inaccurate due to sensor drift [10]. Thus, we must consider that we cannot guarantee
an accurate relative position for measurements taken over large distances.
    Moreover, reducing the amount of data in each sample implies less calculation and lower energy
consumption. Since our solution aims to be used for real-time localization on embedded systems,
these parameters are important to consider.
    Therefore, for optimization and accuracy reasons, our classification method needs to use as little
data as possible. Moreover, all the measures in a sample relate to a single tag.
    The samples we use for Machine Learning are therefore series of 2x6 RSS measures taken along a
line, as shown in Fig. 2. The distance between antennas is 10 cm. Six measures separated by
approximately 30.0 cm correspond to at least two measures per wavelength at 433 MHz, which gives a
good compromise between a reduced data set and the amount of information. We used a total of 1200
samples, distributed as follows: 70% for the training set and 30% for the test set. We used
oversampling to balance the number of samples between the hall and the corridor.
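
    As an illustration of the sample layout, the Python sketch below cuts the readings of one tag
along a line into samples of six consecutive positions and checks the number of 30 cm steps per
wavelength at 433 MHz. The function name, the use of overlapping windows and the RSS values are
assumptions made for the example; the paper does not state how the windows were cut.

```python
from typing import List, Tuple

Reading = Tuple[float, float]  # (RSS at antenna 1, RSS at antenna 2) at one position


def make_samples(readings: List[Reading], window: int = 6) -> List[List[Reading]]:
    """Cut one tag's readings along a line into samples of `window` consecutive positions.

    Overlapping windows are an assumption of this sketch.
    """
    return [readings[i:i + window] for i in range(len(readings) - window + 1)]


# Wavelength at 433 MHz and number of 30 cm steps per wavelength.
SPEED_OF_LIGHT = 299_792_458.0  # m/s
wavelength = SPEED_OF_LIGHT / 433e6  # about 0.69 m
steps_per_wavelength = wavelength / 0.30  # slightly more than two

line = [(float(-60 - i), float(-62 - i)) for i in range(8)]  # made-up RSS values
samples = make_samples(line)
print(len(samples), len(samples[0]), round(wavelength, 3), round(steps_per_wavelength, 2))
```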


Figure 2: A typical data sample used for Machine Learning

3. Environment Classification
3.1. Features and results of single tag classification
    To classify the two considered environments (hall and corridor), we established a set of features that
describe our samples and discriminate them efficiently. We extracted the mean, max, min, root mean
square and other classical statistical functions of our set of measures. We also extracted two subsets
of data: the difference between the two antennas at a given position and the difference between two
pairs of adjacent measures. The same statistical functions are applied to these two subsets. This allows
the algorithms to take into account the sudden RSS variations that are characteristic of the multipath
and fading in this kind of environment. A total of 14 features were extracted for each data sample.
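
    The feature extraction described above can be sketched as follows in Python. This is only an
illustration under stated assumptions: the exact list of 14 features is not given here, so the sketch
computes four statistics on each of the three series (12 values), and it assumes that the adjacent
differences are taken on the per-position average of the two antennas. All names are hypothetical.

```python
import math
from statistics import mean
from typing import Dict, List, Tuple

Reading = Tuple[float, float]  # RSS at antenna 1 and antenna 2 for one position


def summarize(values: List[float], prefix: str) -> Dict[str, float]:
    """Basic statistics of one series: mean, max, min and root mean square."""
    return {
        f"{prefix}_mean": mean(values),
        f"{prefix}_max": max(values),
        f"{prefix}_min": min(values),
        f"{prefix}_rms": math.sqrt(mean(v * v for v in values)),
    }


def extract_features(sample: List[Reading]) -> Dict[str, float]:
    """Statistics of the raw measures, the antenna differences and the adjacent differences."""
    raw = [rss for reading in sample for rss in reading]
    antenna_diff = [a - b for a, b in sample]
    # Assumption: adjacent differences use the average of both antennas at each position.
    averages = [(a + b) / 2 for a, b in sample]
    adjacent_diff = [y - x for x, y in zip(averages, averages[1:])]
    features = summarize(raw, "raw")
    features.update(summarize(antenna_diff, "antenna_diff"))
    features.update(summarize(adjacent_diff, "adjacent_diff"))
    return features


# Made-up 2x6 sample (values in dBm).
sample = [(-60.0, -62.0), (-63.0, -65.0), (-61.0, -63.0),
          (-70.0, -72.0), (-66.0, -68.0), (-64.0, -66.0)]
features = extract_features(sample)
print(len(features))
```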
   The results of three algorithms are compared in Table 1. The best classification accuracy is achieved
with the Classification Tree algorithm (82% prediction accuracy). However, this system only considers
the data coming from one tag. We improve it in subsection 3.2 by considering multiple tags for
classification.
   One important point is that we do not use the tags' ID information to perform the classification,
even though it could contain useful clues about the environment. As stated in section 2.2, our tags
have a range of 20 meters, which means a single tag's signal can be received in many different
rooms. Moreover, using ID information would require an additional pre-deployment step. Thus, we
decided not to take this information into account in this first iteration of our solution. However, when
deployed as a practical solution, ID information could indeed be useful to perform a sanity check on
the model output.
Table 1
Prediction accuracy for three methods
        Methods                   k-NN                             Classification Tree   Bagged Tree
        Accuracy                75.32 %                                 82.28 %            79.75 %

3.2.    Using quadruplets of tags to improve performance
   As our classification method will be used in a localization scenario using multilateration with RSS,
we can assume that at least four tags will always be available for this purpose. Thus, we built a
classification method that considers the output of our previous algorithms for quadruplets of tags. We
choose the new classification output by majority voting. With 60 tags there are 487635 possible
quadruplets, and the accuracy of our method mainly depends on the choice of good quadruplets.
Therefore, different selection criteria were considered to assess the efficiency of a quadruplet.
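
   The quadruplet count and the voting step can be sketched as follows in Python. The function name
is hypothetical, and the handling of a 2-2 split is an assumption of this sketch, since the tie-breaking
rule is not specified here.

```python
from collections import Counter
from math import comb
from typing import Sequence

# Number of distinct quadruplets among 60 tags: C(60, 4).
print(comb(60, 4))  # 487635


def majority_vote(predictions: Sequence[str], tie_break: str = "hall") -> str:
    """Return the most frequent label; a tie falls back to `tie_break` (an assumption)."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_break
    return counts[0][0]


# One single-tag prediction per tag of the quadruplet.
print(majority_vote(["corridor", "corridor", "hall", "corridor"]))  # corridor
```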
   As shown in Table 2, choosing randomly from the available tags does not significantly improve the
classification accuracy; in some cases it even degrades it compared to single tag classification (82% for
the best single tag classification). Choosing the four tags that have the lowest mean RSS gives good
results (higher than 85%), whereas selecting the tags with the highest mean considerably weakens the
model (less than 70%). Indeed, a high RSS value from a tag tends to indicate that the tag is close and
therefore that the dominant path is generally the line-of-sight path, which is the same in all
environments, contrary to the multipaths that depend on the environment geometry, building materials,
etc. Such a sample therefore does not contain much environment-related information, which weakens
the classification accuracy. On the other hand, signals from tags that are too far away are strongly
attenuated and their strength is around the noise floor level, which makes them unsuitable for
classification. Selecting the tags whose mean RSS is closest to the 35th percentile offers a good
trade-off between the strength of the signal and a sufficient distance between the sensor and the tags
of the quadruplet. It gives very satisfying results, from 91.43% with the Bagged Tree algorithm to
98.57% with the k-NN algorithm.
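
   A possible reading of the 35th-percentile selection is sketched below in Python. It assumes that
the percentile is computed over the mean RSS (in dBm) of the tags currently received, with linear
interpolation between sorted values; the function names and the example values are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def percentile(values: List[float], q: float) -> float:
    """q-th percentile (0-100) with linear interpolation between sorted values."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def select_quadruplet(rss_by_tag: Dict[str, List[float]], q: float = 35.0) -> List[str]:
    """Pick the four tags whose mean RSS is closest to the q-th percentile of all tag means."""
    means = {tag: mean(series) for tag, series in rss_by_tag.items()}
    target = percentile(list(means.values()), q)
    return sorted(means, key=lambda tag: abs(means[tag] - target))[:4]


# Ten made-up tags with mean RSS from -90 dBm to -45 dBm in 5 dB steps.
rss_by_tag = {f"t{i}": [-90.0 + 5 * i, -90.0 + 5 * i] for i in range(10)}
chosen = select_quadruplet(rss_by_tag)
print(chosen)
```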

Table 2
Prediction accuracy for different quadruplets selection methods
                                   k-NN               Classification Tree                Bagged Tree
        Random                    73.25 %                  77.33 %                         83.22 %
   Maximum mean                   35.71 %                  45.71 %                         68.57 %
    Minimum mean                  92.86 %                  88.57 %                         85.71 %
 35th-percentile mean             98.57 %                  94.29 %                         91.43 %


                      Bagged Tree (rows: true class, columns: predicted class)
                                   Predicted corridor     Predicted hall
         True corridor                  83.3%                 16.7%
         True hall                      13.0%                 87.0%

  Figure 3: Confusion matrix with minimum mean selected quadruplets

4. Describing best classification quadruplets
   In this paper we meshed both environments with sixty tags. However, this high density is not
desirable in a practical usage scenario due to the cost of the tags as well as interference issues that
reduce the tags’ signal range. Therefore, we need to identify the optimal placement of tags, i.e. the tags’
quadruplets that give the best classification results, to reduce their number. This is the subject of this
section.

4.1.    Best quadruplets for classification
    We use two criteria to rank quadruplets. The first is their accuracy (acc), i.e. the ratio of correct
classifications. The second is their visibility (vis), i.e. the number of samples they are able to classify
within the test set. The notion of visibility is important because a high accuracy on very few samples
is not statistically significant. For example, a quadruplet that has an accuracy of 100% on only two of
the test set samples will be ranked lower than a quadruplet that has an accuracy of 95% on thirty test
set samples. We thus define the following score to rank quadruplets:

                        $rankScore = (acc - C) \cdot \log(vis) \cdot e^{acc - 1}$               (2)

   Where acc is the accuracy, vis the visibility and C a threshold used to set below 0 the rankScore of
quadruplets that have a very low accuracy. It was set to 0.4 for this article’s analysis so that the scores
of quadruplets that achieve less than 40% classification accuracy are negative.
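
   The score in (2) can be sketched as follows in Python, reading the exponential factor as
e^(acc - 1). The base of the logarithm is not stated here; base 10 is an assumption of this sketch,
and another base would only rescale the scores without changing the ranking. The function name is
hypothetical.

```python
import math


def rank_score(acc: float, vis: int, c: float = 0.4) -> float:
    """rankScore = (acc - C) * log(vis) * e^(acc - 1), following (2).

    Assumption: log is taken in base 10.
    """
    if vis < 1:
        raise ValueError("visibility must be at least one sample")
    return (acc - c) * math.log10(vis) * math.exp(acc - 1.0)


# The example from the text: 100% on two samples ranks below 95% on thirty samples.
print(rank_score(1.00, 2) < rank_score(0.95, 30))  # True
# Below the threshold C = 0.4 the score is negative.
print(rank_score(0.30, 30) < 0)  # True
```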

4.1.1. Best quadruplets in the corridor
   The three best quadruplets for classification in the corridor environment for each mentioned
algorithm are shown in Fig. 4. They are very similar. They all share:

    •   A triangle shape in which the angles adjacent to the side with three tags are always both
        acute.
    •   Two very close tags.

   One noticeable point is that the corridor is symmetric whereas the best quadruplet layouts are not.
This could be explained by an electromagnetic disturbance that is not caused by the corridor geometry,
e.g. a particularly large amount of ferromagnetic material in a wall.


   Figure 4: Best quadruplets in the corridor environment


4.1.2. Best quadruplets in the hall
   The best quadruplets in the hall environment are shown in Fig. 5. They are not strictly identical
for each algorithm; however, they share common characteristics, which are:
    •   Omniscience, i.e. at least one tag has a direct line-of-sight to the whole hall.
    •   Alignment, i.e. three tags always approximately share a common straight line. Just like in the
        corridor, hall quadruplets have a triangular shape and at least two tags are close to each other.


  Figure 5: Best quadruplets in the hall environment

  Moreover, there is no significant correlation between the length of the quadruplets' maximal diagonal
and their rankScore. However, the best quadruplets, i.e. the top 20% of rankScore values, tend to be
clustered closely around a span of about 14.56 m, as shown in Fig. 6.
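
  For illustration, the span of a quadruplet can be computed as below in Python, assuming that the
span is the maximal diagonal, i.e. the largest distance between any two of its tags. The function
name and the coordinates are hypothetical.

```python
from itertools import combinations
from math import dist
from typing import Sequence, Tuple

Point = Tuple[float, float]


def quadruplet_span(tags: Sequence[Point]) -> float:
    """Largest distance between any two tags of the quadruplet, in meters."""
    return max(dist(a, b) for a, b in combinations(tags, 2))


# Made-up tag coordinates in meters, not taken from the experiments.
span = quadruplet_span([(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (1.0, 1.0)])
print(span)  # 5.0
```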

4.2.    Best quadruplets for classification vs best quadruplets for localization
      In [6-8] the best tags' quadruplets for localization in both environments were analyzed. We
  compared their characteristics with the ones we identified above. A priori, good positioning
  quadruplets must differ from good classification quadruplets because they need opposite signal
  characteristics. The former require as few multipaths as possible, whereas the latter need these
  multipaths to identify the environment.
      According to [8], quadruplets positioned at the extremity of the corridor achieve better
  positioning accuracy. We observe the opposite behavior with quadruplets for classification in the
  corridor: they are centered with respect to the corridor. On the other hand, [8] points out the
  importance of having two close tags among the four tags of the quadruplet, which is similar to our
  results. According to [7], the closer the quadruplet's tags are to the sample, the better the
  positioning accuracy is. However, we found that for classification the tags must not be too close to
  the sample position.
      The best quadruplets for positioning in the hall described in [6] share the alignment
  characteristic we identified for classification quadruplets. However, the average span of the
  positioning quadruplets is much greater than that of the classification ones.

Figure 6: Quadruplets Span against rankScore. Left panel: quadruplets span (m) against the accuracy
and visibility score. Right panel: quadruplets span (m) for quadruplets grouped by accuracy and
visibility scores, with the group sizes in parentheses: 80-100 % (30202), 60-80 % (10349),
40-60 % (3124), 20-40 % (1260), Top 20% (60).

5. Conclusion and future work
   We built an indoor environment classification method that can recognize a hall and a corridor with
98.57 % accuracy. As a comparison, the method presented in [2] classifies four indoor environments
with an accuracy of 97.21 % and a training set of similar size. Our method thus achieves a
classification accuracy equivalent to that of the previous methods, but it requires fewer measures and
is thus simpler to use in a practical real-time localization scenario. However, additional data from
other similar environments are needed to test and improve the scalability of our solution to other halls
and corridors. Other environments will also need to be added to make the system complete.
   We identified the characteristics of the best classification quadruplets for both environments. Good
tags' quadruplets for classification correspond to an extent to the good tags for localization, which
confirms that our solution can be used in tandem with a traditional localization method, such as
RSS-based multilateration, with little additional deployment effort.
   Our future work will be to expand the classification method to other environments and to test its
performance with other types of signals such as WiFi, which is more widespread than RFID and thus
requires less deployment effort. It will also be necessary to test our method in a localization scenario
to confirm its value.

6. References
[1]  F. Zafari, A. Gkelias and K. K. Leung, "A Survey of Indoor Localization Systems and
     Technologies," in IEEE Communications Surveys & Tutorials, vol. 21, no. 3, pp. 2568-2599,
     thirdquarter 2019, doi: 10.1109/COMST.2019.2911558.
[2] Antoni Pérez-Navarro, Joaquín Torres-Sospedra, Raul Montoliu, Jordi Conesa, Rafael Berkvens,
     Giuseppe Caso, Constantinos Costa, Nicola Dorigatti, Noelia Hernández, Stefan Knauth, Elena
     Simona Lohan, Juraj Machaj, Adriano Moreira, Pawel Wilk, "Challenges of Fingerprinting in
     Indoor Positioning and Navigation," 2019, Intelligent Data-Centric Systems, Geographical and
     Fingerprinting Data to Create Systems for Indoor Positioning and Indoor/Outdoor Navigation,
     Academic Press, ISBN 9780128131893, doi: 10.1016/B978-0-12-813189-3.00001-0.
[3] G. Zhu, F. Dong and N. Pang, "Classification of Indoor Environments Based on Mixed Graph
     Similarity using UWB Signals," 2020 IEEE Symposium Series on Computational Intelligence
     (SSCI), 2020, pp. 197-201, doi: 10.1109/SSCI47803.2020.9308294.
[4] Z. Chen, M. I. AlHajri, M. Wu, N. T. Ali and R. M. Shubair, "A Novel Real-Time Deep Learning
     Approach for Indoor Localization Based on RF Environment Identification," in IEEE Sensors
     Letters, vol. 4, no. 6, pp. 1-4, June 2020, Art no. 7002504, doi: 10.1109/LSENS.2020.2991145.
[5] M. I. AlHajri, N. T. Ali and R. M. Shubair, "Classification of Indoor Environments for IoT
     Applications: A Machine Learning Approach," in IEEE Antennas and Wireless Propagation
     Letters, vol. 17, no. 12, pp. 2164-2168, Dec. 2018, doi: 10.1109/LAWP.2018.2869548.
[6] V. Gharat, E. Colin, "Active RFID Tags Placement Configuration for Accurate Positioning in a
     Hall Environment," in Proceedings of IPIN 2015, 2015.
[7] E. Colin, A. Moretto and M. Hayoz, "Improving indoor localization within corridors by UHF active
     tags placement analysis," 2014 IEEE RFID Technology and Applications Conference (RFID-TA),
     2014, pp. 181-186, doi: 10.1109/RFID-TA.2014.6934224.
[8] V. Oruganti, V. Gharat, E. Colin and A. Moretto, "Location performance law according to the
     dimensions of the corridor using trilateration," 2014 International Conference on Indoor Positioning
     and Indoor Navigation (IPIN), 2014, pp. 511-517, doi: 10.1109/IPIN.2014.7275523.
[9] Ela Innovation active RFID tag and reader manufacturer. https://www.elainnovation.com.
     Accessed May 2021.
[10] L. Ojeda and J. Borenstein, "Personal dead-reckoning system for GPS-denied environments," in
     Safety, Security and Rescue Robotics, 2007. SSRR 2007. IEEE International Workshop on. IEEE,
     2007, pp. 1-6.

</pre>