<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>WiP, June</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>LoRaWAN Fingerprinting with K-Means: the Relevance of Clusters Visual Inspection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Joaquín Torres-Sospedra</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rafael Berkvens</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ALGORITMI Research Centre, University of Minho</institution>
          ,
          <addr-line>4800-058 Guimarães</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>IDLab - Faculty of Applied Engineering, University of Antwerp - imec</institution>
          ,
          <addr-line>Antwerp</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Michiel Aernouts</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>0</volume>
      <fpage>7</fpage>
      <lpage>09</lpage>
      <abstract>
        <p>LoRaWAN-based positioning is emerging as an alternative positioning solution for battery-constrained IoT devices or GNSS-denied areas in urban environments. The data collected at the LoRaWAN Base Stations, such as the RSSI of received messages, can be merged to generate an RF fingerprint. Unsupervised crowdsourcing can be leveraged to build a large radio map covering an urban area, at the expense of introducing noise of around tens of meters when labelling the reference data. As fingerprinting may have a low efficiency in such a dense radio map, we propose to use k-Means clustering to make the position estimation faster. During our study, we found that clustering can also be used to detect large outliers in the radio map that are candidates for removal. The rationale is to identify those samples within a cluster that are far from the geometric centroid of the cluster. This paper analyses k-Means clustering with outlier detection and the benefits it might bring. Although removing outliers has not brought an outstanding increase in the positioning accuracy, the analysis has enabled a new metric that is moderately correlated with the positioning error. This correlation may be useful to detect unreliable position estimates and discard them. The results presented in this work, based on two LoRaWAN datasets, show that the average and median positioning error can be improved by 5 % to 10 % by discarding 4 % to 6 % of operational samples.</p>
      </abstract>
      <kwd-group>
        <kwd>Fingerprinting</kwd>
        <kwd>Clustering</kwd>
        <kwd>Scalability</kwd>
        <kwd>LoRaWAN</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        An important constraint on IoT communication and localization technologies is that they
must be as energy-efficient as possible, because IoT devices generally operate for multiple
years using small batteries. This, and the fact that GNSS can normally only be used in outdoor
environments, has motivated researchers to omit power-hungry GNSS receivers and instead
leverage the existing LPWAN link and sensor data for localization purposes. For example,
metadata such as the Received Signal Strength Indicator (RSSI), phase or timing information
from multiple LPWAN receivers can be translated to distance estimations between each receiver
and a transmitting IoT device. However, these methods strongly depend on the LPWAN network
deployment and generally lead to high location estimation errors. A previous analysis on the
choice between GNSS and LPWAN localization shows that the latter should only be favored
over GNSS when a large location error is justifiable and when the energy budget of an IoT
device is extremely limited [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In practice, this means that implementing GNSS receivers on
low-power IoT devices is often feasible. That being said, LPWAN localization can certainly still
prove its use, because not all applications require location data with GNSS-like accuracy. For
example, a construction company might only want to know at which of its building sites its
assets are located, which implies that an error of hundreds of meters can be accepted. LPWAN
localization can also play an important role in multimodal localization, for example as a fallback
solution when a tracking device is moving into GNSS-denied areas such as tunnels or indoor
environments [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Moreover, it may act as a verification mechanism to detect GNSS spoofing.
      </p>
      <p>
        In 2019, Aernouts et al. published an extended version of the LoRaWAN dataset described
in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Over the course of three months, 20 postal service cars carried LoRaWAN devices
that periodically transmitted their latest GNSS location. As a result, the collected dataset
contains 130430 entries with a ground truth location, the LoRa Spreading Factor (SF) used
by the transmitter, timing data and Received Signal Strength (RSS) data for each receiving
LoRaWAN gateway. It should be noted that the ground truth information was collected from
GNSS receivers and, therefore, with potential errors of tens of meters. First, urban canyoning
can decrease the GNSS accuracy, since the dataset is collected in a dense urban area. Second,
the received GNSS coordinates of the transmitting device could differ from the actual device
coordinates at receiving time because the total transmission time of a LoRa signal can take up
to a few seconds, depending on the payload size and the SF. This effect becomes even more
prominent when the transmitter travels at higher speeds.
      </p>
      <p>RSS data enables positioning with trilateration and fingerprinting. While the former requires
knowing the location of the LoRaWAN Base Stations (BSs), the propagation model and the
environment obstructions; the latter only requires a set of reference data at known positions, also
known as the radio map. In this paper, we focus on passive fingerprinting, where a fingerprint
is the set of RSSI measurements of a particular LoRaWAN message transmitted by a device and
measured in the available LoRaWAN BSs in the operational area.</p>
      <p>
        This technique requires two phases: the offline phase focuses on geo-referenced RSSI data
collection (see radio map collection in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]), whereas the online phase estimates the position of
new fingerprints at unknown positions with, for instance, a k-Nearest Neighbour (k-NN)-based
algorithm and the radio map.
      </p>
      <p>
        However, fingerprinting is computationally demanding if the dataset contains thousands of
samples, e.g. LoRaWAN datasets in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In those datasets, every single operational fingerprint
has to be compared with all the reference samples in the radio map, even if they significantly
differ, to obtain the most similar ones and compute the final position estimate. Thus, clustering
techniques have been applied to split the radio map into several smaller versions [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13 ref14 ref4 ref5 ref6 ref7 ref8 ref9">4, 5, 6, 7, 8, 9,
10, 11, 12, 13, 14</xref>
        ]. In the operational stage, the identification of the most relevant cluster is done
first (coarse search). Then, the position is estimated using the corresponding reduced radio map
(fine-grained search). This two-step procedure is significantly faster than regular fingerprinting,
especially in large datasets [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
      <p>In this paper we propose a version of k-Means clustering with outlier detection where noisy
fingerprints are removed. We hypothesise that the clusters generated with k-Means over the
RSSI feature space can be de-noised by removing the reference samples which are significantly
far from the cluster geometric centroid. It is worth noting that the proposed outlier removal is
performed after generating the clusters with k-Means. k-Means clustering is an unsupervised
model that groups similar data without, in this case, the location information (i.e., the labels).
Thus, we consider that the basic principles of k-Means cannot be significantly re-formulated to
make it more robust. The main contributions of this work include:</p>
      <list list-type="bullet">
        <list-item><p>Modification of k-Means to remove outliers from clusters according to the geometric
information;</p></list-item>
        <list-item><p>Comprehensive comparison between applying k-Means without and with outlier
detection;</p></list-item>
        <list-item><p>A new metric which is correlated with the positioning error in some cases;</p></list-item>
        <list-item><p>A procedure to discard unreliable position estimations.</p></list-item>
      </list>
      <p>The remainder of this work is organised as follows. Section 2 introduces the related work on
LoRaWAN, fingerprinting and clustering. Section 3 describes the materials and methods used
in this work. Section 4 details the experimental setup and shows the empirical results. Section 5
provides the final discussion and conclusions about this work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <sec id="sec-2-1">
        <title>2.1. LoRaWAN and fingerprinting</title>
        <p>
          LoRaWAN’s relatively wide bandwidth of 125 kHz to 250 kHz makes it a suitable candidate for
both RSS-based and time-based localization. Thanks to the widespread availability of LoRaWAN
networks and datasets [
          <xref ref-type="bibr" rid="ref16 ref17 ref18 ref19 ref3">3, 16, 17, 18, 19</xref>
          ], many researchers have evaluated the performance
of various localization methods. For instance, Pospisil et al. evaluated the performance of
five Time Difference of Arrival (TDoA) algorithms through simulation and validated two of
them with field measurements. They achieved a mean location error of 543 m in a test area of
4.58 km<sup>2</sup> [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ].
        </p>
        <p>
          The aforementioned LoRaWAN dataset by Aernouts et al. enabled many researchers to
evaluate fingerprinting and machine learning approaches for localization. Pandangan et al. generated
a hybrid dataset containing RSS and TDoA information based on the LoRaWAN dataset. Their
hybrid dataset was then used to evaluate k-NN and Random Forest algorithms, which resulted in
a median error of 333 m and 194 m respectively [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. This is a slight improvement compared to
related research on Neural Network localization with the LoRaWAN dataset [
          <xref ref-type="bibr" rid="ref22 ref23">22, 23</xref>
          ]. Purohit et
al. also used this dataset for their research on Neural Network localization. In their investigation
of three different learning models, the Long Short-Term Memory (LSTM) model with 64 neurons
came out on top with a mean error of 191 m [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]. Janssen et al. compared the location accuracy,
R<sup>2</sup> score and evaluation time of ten Machine Learning algorithms using the LoRaWAN dataset.
Their experiments show that the weighted k-NN and Random Forest algorithms result in the
best accuracy and R<sup>2</sup> score, but Random Forest has a significantly faster computation time [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ].
In a subsequent study, the authors extended their comparison with range-based localization
using eight different path loss models and six weight functions. Their best combination of path
loss model and weight function yielded an estimation error of 700 m, which is significantly higher
than the 340 m obtained with fingerprinting. Furthermore, this work provides a comprehensive
overview of the trade-offs that must be made between range-based and fingerprinting-based
localization, including accuracy, complexity, cost, etc. [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Clustering in fingerprinting</title>
        <p>
          Clustering has been widely applied in Wi-Fi and BLE fingerprinting to reduce the
computational cost while keeping a similar accuracy. The most popular methods are k-Means [
          <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
          ] and its variants, including k-Medoids [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ] and
Fuzzy c-Means (FCM) [
          <xref ref-type="bibr" rid="ref10 ref8 ref9">8, 9, 10</xref>
          ]. Other approaches, such as Affinity
Propagation Clustering (APC) [
          <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
          ] or Density-based spatial clustering of applications with
noise (DBSCAN) [
          <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
          ], have also been explored, but their feasibility may depend on the
dataset according to some preliminary experiments we performed.
        </p>
        <p>
          Therefore, this work focuses on k-Means clustering and takes advantage of the
position information of the reference data to remove those reference samples that may poison
the radio map. To enhance the performance of k-Means, we have used the Manhattan distance
for distance computations in the feature (RSSI) space and the centroid initialization proposed
in [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Materials and Methods</title>
      <sec id="sec-3-1">
        <title>3.1. k-Means in fingerprinting</title>
        <p>The core of the passive fingerprinting technique requires two phases: the off-line and on-line
phases, as explained before. In the off-line phase, reference fingerprints are generated from
a set of LoRaWAN messages (that include their position from GPS) received by the available
LoRaWAN BSs, thus generating a radio map. In the on-line phase, the operational fingerprints
(from unknown positions) are compared to the fingerprints stored in the radio map. Their
position is estimated using the locations of the most similar fingerprints in the radio map,
usually by computing their centroid.</p>
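        <p>As a minimal sketch of the on-line phase described above, the following Python code estimates a position as the centroid of the positions of the k most similar reference fingerprints. The function name knn_estimate, the array layout and the toy RSSI values are assumptions made for illustration; they are not taken from the datasets or from the original implementation.</p>
        <preformat>
```python
import numpy as np

def knn_estimate(radio_map, positions, fingerprint, k=1):
    """Return the centroid of the positions of the k nearest reference fingerprints.

    radio_map:   (n_samples, n_bs) RSSI values, one column per base station
    positions:   (n_samples, 2) latitude/longitude of each reference fingerprint
    fingerprint: (n_bs,) operational fingerprint at an unknown position
    """
    # Euclidean distance in the RSSI feature space
    dist = np.linalg.norm(radio_map - fingerprint, axis=1)
    nearest = np.argsort(dist)[:k]
    return positions[nearest].mean(axis=0)

# Toy radio map with three base stations (hypothetical values, in dBm)
radio_map = np.array([[-100.0, -110.0, -120.0],
                      [-101.0, -111.0, -119.0],
                      [-120.0, -100.0, -105.0]])
positions = np.array([[51.20, 4.40], [51.22, 4.42], [51.30, 4.50]])
estimate = knn_estimate(radio_map, positions,
                        np.array([-100.0, -110.0, -119.0]), k=2)
print(estimate)
```
        </preformat>
        <p>Averaging latitude and longitude directly is only a reasonable approximation over city-scale areas; it is used here to keep the sketch short.</p>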
        <p>After generating the radio map, similar fingerprints in the RSSI feature space are grouped
by the k-Means clustering algorithm. It is expected that fingerprints within a cluster are
also close in the geometric space. The output of k-Means provides the cluster centroids
and the reduced radio map of every cluster. The
centroids and reference fingerprints are both vectors in the RSSI feature space, thus
having as many values as there are LoRaWAN BSs.</p>
        <p>As an illustrative example, a few clusters over the LoRaWAN 2017/18 dataset are shown
in Fig. 2. The gray dots represent the reference fingerprints in the radio map, whereas the
coloured ones represent the samples in the cluster. The number of reference fingerprints and
their dispersion in the geometric space depend on the cluster.</p>
        <p>In the operational phase, the search for the most similar reference fingerprints is a two-step
process. First, the operational fingerprint is compared to all the cluster centroids (RSSI space)
to retrieve the one reporting the lowest Euclidean distance. Second, the search for the most similar
reference fingerprints is done over the reduced radio map of the selected cluster.</p>
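        <p>The two-step search can be sketched as follows. This is an illustrative example under stated assumptions: the function name two_step_search, the list-of-arrays layout for the reduced radio maps and the toy values are hypothetical, and a 1-NN fine-grained search is used for brevity.</p>
        <preformat>
```python
import numpy as np

def two_step_search(centroids, reduced_maps, reduced_positions, fingerprint):
    """Coarse search over cluster centroids, then 1-NN inside the chosen cluster."""
    # Coarse search: cluster whose RSSI centroid is closest (Euclidean distance)
    cluster = int(np.argmin(np.linalg.norm(centroids - fingerprint, axis=1)))
    # Fine-grained search: nearest reference fingerprint in the reduced radio map
    dist = np.linalg.norm(reduced_maps[cluster] - fingerprint, axis=1)
    best = int(np.argmin(dist))
    return cluster, reduced_positions[cluster][best]

# Two hypothetical clusters over two base stations (RSSI in dBm)
centroids = np.array([[-100.0, -120.0], [-120.0, -100.0]])
reduced_maps = [np.array([[-99.0, -121.0], [-102.0, -118.0]]),
                np.array([[-121.0, -99.0], [-118.0, -103.0]])]
reduced_positions = [np.array([[51.20, 4.40], [51.21, 4.41]]),
                     np.array([[51.30, 4.50], [51.31, 4.51]])]
cluster, position = two_step_search(centroids, reduced_maps, reduced_positions,
                                    np.array([-119.0, -102.0]))
print(cluster, position)
```
        </preformat>
        <p>Only the fingerprints of the selected cluster are compared in the second step, which is where the reduction in operational time comes from.</p>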
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Analysis of clustering with k-Means</title>
        <p>
          Previous results in the literature show that k-Means in fingerprinting reduces the computational
cost at the expense of a slightly higher positioning error. This reduction in time is especially
relevant in large datasets [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ].
        </p>
        <p>
          In this paper, the same clustering model has been applied to both LoRaWAN datasets, with
the number of clusters set to the square root of the number of samples in the radio map, as suggested in [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]. The results, which
are shown in Section 4.2, are in line with the results reported in the literature.
        </p>
        <p>However, to avoid adopting a black-box approach while using k-Means, an additional
overall analysis of the clusters was performed. In particular, the location (longitude and latitude
in WGS84 format) of the reference samples in the reduced radio map was visually inspected for
each cluster, which revealed a relevant pattern in many clusters.</p>
        <p>Fig. 2 shows six illustrative examples of the clusters generated with k-Means. Although their
size and dispersion depend on the cluster, most of them contain fingerprints that are
very far (reddish points in the figure) from their own geometric centroid and close to other
geometric centroids. Those outliers share similar RSSI values with the other reference
fingerprints in the cluster, but they are geographically far from them. Among other factors, this
effect may be caused by the positioning errors introduced by the GNSS receivers.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Removing noisy samples from clusters</title>
        <p>The idea to remove noisy samples from the radio map is simple. Given the samples (fingerprints)
of a cluster, their geometric centroid (in the WGS84 space) is calculated. All samples whose
distance to the geometric centroid is higher than twice the median distance are removed. This is
only applied to those clusters where the maximum distance is higher than 5 times the median
distance, i.e., it is only applied to those clusters having significant outliers. The proposed model
is described in Algorithm 1, which has 3 stages: cluster generation (line 2), cluster cleaning
(ln. 3–12) and position estimation (ln. 15–22). The first and second stages can be performed once per
dataset, so their running time can be neglected when reporting the computational cost of providing
a position estimate in the online phase.</p>
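        <p>The cleaning rule can be illustrated with a short Python sketch. It assumes that the geometric centroid is the mean of the sample coordinates and that coordinates are planar (in metres), which keeps the example short; real WGS84 data would need a geodesic distance or a map projection. The function name clean_cluster and its parameters are hypothetical.</p>
        <preformat>
```python
import numpy as np

def clean_cluster(positions, max_factor=5.0, cut_factor=2.0):
    """Return a boolean mask of the samples kept after the outlier rule.

    positions: (n_samples, 2) coordinates of the samples of one cluster.
    """
    # Assumption: the geometric centroid is the mean of the coordinates
    centroid = positions.mean(axis=0)
    # Assumption: planar distance; WGS84 data would need e.g. haversine
    dist = np.linalg.norm(positions - centroid, axis=1)
    median = np.median(dist)
    if dist.max() > max_factor * median:
        # Significant outliers: keep samples within twice the median distance
        return cut_factor * median >= dist
    # No cleaning for this cluster
    return np.ones(len(positions), dtype=bool)

# Hypothetical cluster in metres: nine nearby samples and one far away
near = np.array([[x, y] for x in (0.0, 10.0, 20.0) for y in (0.0, 10.0, 20.0)])
cluster_xy = np.vstack([near, [[5000.0, 5000.0]]])
mask = clean_cluster(cluster_xy)
print(mask)
```
        </preformat>
        <p>Because the centroid is itself pulled towards the outlier, the rule relies on the median distance, which stays representative of the bulk of the cluster.</p>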
        <p>
          The inputs of Algorithm 1 include the radio map, the set with the test/evaluation samples and the number
of nearest neighbours k for k-NN. A sample (fingerprint) is represented with s and has one element for
each LoRaWAN BS, whereas its position (longitude and latitude
in WGS84) is represented with pos. For k-Means, the number of clusters is the square root of the
number of samples in the radio map, as suggested
in [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. The clustering provides the RSSI centroids and the geometric lat/lon
centroids of the clusters, and the cleaning provides a clean reduced radio map for each cluster.
        </p>
        <p>The other parameters for the outlier detection were set based on the researchers' experience,
e.g., the threshold used to remove the noisy samples, 2 times the median distance, has been
selected as the distances to the geometric centroid usually increase gradually.</p>
        <preformat>Algorithm 1: k-NN for positioning with k-Means and outlier detection
 1: input: radio map, test set, number of neighbours k, number of clusters
 2: RSSI centroids, reduced radio maps ← Apply k-Means to the radio map
 3: for each cluster do
 4:   geometric centroid ← Compute geometric centroid of samples in the cluster
 5:   d ← Compute distances from the samples in the cluster to the geometric centroid
 6:   if max(d) &gt; (5 · median(d)) then
 7:     // Remove samples far from geometric centroid
 8:     clean reduced radio map ← {s in the cluster : d(s) ≤ (2 · median(d))}
 9:   else
10:     clean reduced radio map ← reduced radio map // No cleaning for this cluster
11:   end if
12: end for
13: for each sample s in the test set do
14:   Identify most relevant cluster
15:   Set the clean reduced radio map of that cluster
16:   for each sample s' in the clean reduced radio map do
17:     Compute distance between s and s'
18:   end for
19:   Sort distances in RSS space
20:   Select the k closest candidates
21:   Estimate position lat/lon
22: end for
23: Return: Estimated positions for all samples in the test set</preformat>
        <p>4. Experiments and Results</p>
        <p>4.1. Experimental Setup</p>
        <p>In order to assess the proposed clustering model with outlier detection, we have compared
the results of the plain k-NN, the optimization rule proposed by Moreira [29], k-Means
without outlier detection and k-Means with outlier detection. To estimate the final position,
we have used the simple 1-NN algorithm with the Euclidean distance. The models have been
run 10 times to mitigate the effect of the random initialization of k-Means.</p>
        <p>
          For the experiments, two datasets collected in the city of Antwerp between the end of 2017 and
the beginning of 2019 [
          <xref ref-type="bibr" rid="ref3">3, 30</xref>
          ] have been used, namely LoRaWAN 2017/18 and LoRaWAN 2018/19.
Both datasets were collected to evaluate fingerprint localization algorithms in large outdoor
environments and, according to the database authors, the positions attached to the LoRaWAN messages may
carry an additional GPS error. This feature makes them appropriate for assessing the proposed
algorithm to remove noise from clusters. For both datasets, the samples have been sorted by
timestamp and then split for training and testing: the first ≈ 80% of samples have been used
for training and the last ≈ 20% of samples have been used for evaluation. This division has
been performed to avoid having data from the same device and day in both subsets.
        </p>
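        <p>A time-ordered split of this kind can be sketched as follows. The function name time_ordered_split and the toy timestamps are assumptions for illustration; the sketch only shows the ordering and the 80/20 cut, not the loading of the datasets.</p>
        <preformat>
```python
import numpy as np

def time_ordered_split(timestamps, train_fraction=0.8):
    """Return index arrays (train, test) after sorting the samples by timestamp."""
    order = np.argsort(timestamps, kind="stable")
    n_train = int(round(train_fraction * len(order)))
    # Oldest samples go to training, most recent ones to evaluation
    return order[:n_train], order[n_train:]

# Hypothetical timestamps (seconds) for ten messages
timestamps = np.array([50, 10, 90, 30, 70, 20, 100, 40, 80, 60])
train_idx, test_idx = time_ordered_split(timestamps)
print(train_idx, test_idx)
```
        </preformat>
        <p>Every evaluation sample is more recent than every training sample, so messages from the same device and day are unlikely to appear in both subsets.</p>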
        <p>The evaluation metrics include the Averaged Positioning Error (APE), the Median
Positioning Error (MPE) and the Averaged Operational Time (AOT), and consider all the
10 execution runs. The APE and MPE are included in the ISO/IEC 18305 standard, whereas the
AOT refers to the average time required to process an operational fingerprint and provide the
position estimate. In contrast to the plain k-NN algorithm, where all fingerprints require a similar
operational time, the operational time may vary significantly depending on the cluster, i.e.,
k-Means clustering does not guarantee that all clusters have the same size, so the time
required to perform the fine-grained search depends on the selected cluster. Therefore, the
standard deviation is also reported for the operational time.</p>
      </sec>
      <sec id="sec-3-4">
        <title>4.2. Results</title>
        <p>This subsection is devoted to showing the empirical results. First, a comparison with traditional
fingerprinting models is introduced. Then, a comprehensive analysis of the consequences of
removing noise from the radio map is performed. Finally, the possible benefits of the proposed
model are described.</p>
        <sec id="sec-3-4-1">
          <title>4.2.1. Comparative analysis</title>
          <p>Table 1 introduces the main results for the comparative analysis. It includes the plain k-NN
algorithm (k = 1), the optimization rule based on the common strongest anchor proposed in
Moreira et al. [29], and k-Means clustering without and with outlier detection (OD). Fig. 3
introduces the Empirical Cumulative Distribution Function (ECDF) plots of the positioning
error and operational time of the four methods for both datasets.</p>
          <p>In general, the four models provide similar results in terms of positioning error, their
main difference being their computational cost. The two solutions based on k-Means report the
lowest computational cost, with an averaged execution time below 20 ms and 30 ms respectively.</p>
          <p>Removing the outliers made k-Means not only slightly more accurate but also slightly more
efficient in the operational stage, as the proposed approach removed 8.6% and 9.5% of reference
fingerprints on each dataset respectively. However, the improvements may be marginal.</p>
          <p>[Panel titles: LoRaWAN 2017; LoRaWAN 2019]</p>
        </sec>
        <sec id="sec-3-4-2">
          <title>4.2.2. A comprehensive analysis of removing noise</title>
          <p>Although k-Means without and with outlier detection have similar performance according to
the previous results, positioning can now be based on two sources, namely a noisy radio map and a
clean radio map. This makes it possible to exploit some additional information at the operational stage,
as there are samples where the two approaches based on k-Means (without and with outlier
detection) do not agree in estimating the position. This happens in 12% and 32% of samples on
each dataset, respectively.</p>
          <p>Thus, the evaluation set can be split into two subsets, one where both estimators agree and
provide the same position estimation (“same” in table and figure), and the other where they
disagree (“diff”). Table 2 and Fig. 4 show the corresponding results and ECDFs.</p>
          <p>According to Table 2 and the ECDF plots from Fig. 4, the subset “same” is generally better
than the subset “diff” in both metrics, positioning error and execution time, especially in the first
dataset (LoRaWAN 2017/18), i.e., when the two estimators (without and with outlier detection)
agree, the positioning results are better than when they disagree. If both estimators disagree,
the positioning error provided by k-Means with outlier detection is lower.</p>
          <p>We hypothesise that a divergence between the position estimates of the two k-Means
models may indicate the quality of the position estimate provided by the proposed model.
In particular, we explore the correlation between the distance between the position estimates
when they disagree and the positioning error using k-Means with outlier detection. First, that
relation is shown as a scatter plot in Fig. 5 (top) and as a density heat map (color representing
the number of samples for a particular range in both dimensions) in Fig. 5 (bottom). In addition,
we computed the Pearson correlation coefficient, which was 0.64 and 0.52
for the two datasets respectively. Thus, it seems that when both models based on k-Means
disagree, the distance between the two estimates may indicate the positioning error.</p>
          <p>The scatter plots are dense as the number of test samples is large and the experiments have
been run 10 times. Fig. 6 shows the boxplot of the positioning errors for different distances
between estimates. The correlation trend between the distance between estimates and the
positioning error using k-Means with the proposed outlier detection can be seen more clearly
in the figure. However, it can also be seen that the metric seems to
be less reliable for distances above around 2000 m, as the error and its variability are both high. The lowest variability
is provided in the range [0, 500) m, and it increases as the distance between estimates
increases while the number of cases remains significant. For the ranges including the largest distances
between estimates, there are only a few cases in both datasets.</p>
        </sec>
        <sec id="sec-3-4-3">
          <title>4.2.3. Possible benefits of combining noisy and cleaned data sets</title>
          <p>Positioning using k-Means has been shown to be very efficient in terms of computational time.
Computing the position estimate with fingerprints from both the original reduced radio maps and
the cleaned (without outliers) reduced radio maps is therefore feasible. For any operational fingerprint,
if the two position estimates differ, their distance could be used as an indicator of reliability
(see Figs. 5-6). For instance, if this distance is higher than a predefined threshold, the position
estimate could be discarded.</p>
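          <p>The thresholding idea can be sketched as follows. The function name split_by_reliability, the dictionary keys and the toy values are hypothetical; the sketch only shows how a threshold on the distance between the two estimates separates reliable from unreliable samples and how the errors of the reliable ones can be summarised.</p>
          <preformat>
```python
import numpy as np

def split_by_reliability(gap_m, error_m, threshold_m):
    """Summarise the errors of the estimates kept by a distance threshold.

    gap_m:   distance between the two position estimates of each sample [m]
    error_m: positioning error of each sample [m]
    """
    # A sample is reliable when the two estimates are at most threshold_m apart
    reliable = threshold_m >= gap_m
    return {
        "reliable_share": float(reliable.mean()),
        "mean_error_reliable": float(error_m[reliable].mean()),
        "median_error_reliable": float(np.median(error_m[reliable])),
        "mean_error_all": float(error_m.mean()),
    }

# Hypothetical distances between estimates and positioning errors, in metres
gap_m = np.array([0.0, 0.0, 100.0, 600.0, 2500.0])
error_m = np.array([200.0, 300.0, 250.0, 900.0, 3000.0])
summary = split_by_reliability(gap_m, error_m, threshold_m=500.0)
print(summary)
```
          </preformat>
          <p>Samples where both estimators agree have a distance of zero and are therefore always kept.</p>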
          <p>We consider that positioning can benefit from discarding unreliable samples. In general,
these operational fingerprints may have a large positioning error attached. Therefore, the
positioning error of the remaining fingerprints (the ones that are reliable) should be lower. The
only requirement is to set a threshold on the distance between the two position estimates. Table 3
and Fig. 7 show the results using k-Means with outlier detection and different thresholds,
with separate results for the reliable samples.</p>
          <p>[Table 3 column headers: Threshold; APE [m]; MPE [m]; AOT [ms]; reliable samples [%]; APE [m]; MPE [m]]</p>
          <p>The ECDF is shown for both sets, reliable samples (solid) and unreliable samples (dashed). In
general, as the threshold decreases, more samples are considered unreliable and the
positioning error of the reliable samples gets lower. However, the presence of low positioning errors in
the set of unreliable samples also increases, i.e., the lower the threshold (e.g., 125 m), the better the
results of the reliable samples, but also the higher the probability of discarding a good position
estimate.</p>
          <p>According to the results presented in Table 3 and Fig. 7, the appropriate threshold depends on the dataset.
For the two LoRaWAN datasets we have used, the selected thresholds are 500 m (LoRaWAN 2017/18) and
125 m (LoRaWAN 2018/19), as they provide good results in terms of positioning error of the
reliable samples in their respective datasets. On the other hand, the lower the threshold, the
more samples (including good estimates) are removed.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Discussion and Conclusions</title>
      <p>k-Means is often applied to fingerprinting as a black box to obtain a similar average positioning
error with a significantly lower computational cost. In this paper, we have applied it to two
large datasets, getting results in line with what has been reported in state-of-the-art works about
Wi-Fi and BLE fingerprinting.</p>
      <p>Visual inspection of the generated clusters has shown that they might contain noisy
fingerprints which are close to the cluster centroid in the RSSI space but at different locations.
Thus, as the reference data provides the fingerprints (RSSI vectors) and their locations, we have
proposed a simple rule to remove the noisy samples from clusters.</p>
      <p>Although the results are not outstanding, having two ways to estimate the position has
enabled a new metric based on the distance between the two position estimates. For samples
where both estimators diverge, this metric has been shown to be moderately correlated with the
positioning error provided by the proposed k-Means clustering with outlier detection.</p>
      <p>Being able to detect unreliable position estimates at the operational stage is an important step,
as a better accuracy can be ensured for the reliable ones. In this case, the average and median
positioning error can be improved by 5 % to 10 % by discarding 4 % to 6 % of operational
samples.</p>
      <p>In this paper, we propose a model to clean the clusters. It is of utmost importance not to
blindly trust Machine Learning models when they are used as black boxes. Visual inspection
made it possible to detect noisy samples and obtain a new metric correlated with the positioning error. Further
efforts will be devoted to improving noise removal with different strategies.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>A. Moreira gratefully acknowledges funding from FCT – Fundação para a Ciência e Tecnologia
within the R&amp;D Units Project Scope: UIDB/00319/2020.</p>
      <p>J. Huerta, New cluster selection and fine-grained search for k-means clustering and wi-fi
fingerprinting, in: 2020 Int. Conf. on Localization and GNSS (ICL-GNSS), 2020.</p>
      <p>[29] A. Moreira, M. J. Nicolau, F. Meneses, A. Costa, Wi-fi fingerprinting in the real world
- RTLS@UM at the EvAAL competition, in: 2015 International Conference on Indoor
Positioning and Indoor Navigation (IPIN), IEEE, ????</p>
      <p>[30] M. Aernouts, R. Berkvens, K. Van Vlaenderen, M. Weyn, Sigfox and LoRaWAN Datasets
for Fingerprint Localization in Large Urban and Rural Areas, 2019. doi:10.5281/zenodo.3904158,
https://doi.org/10.5281/zenodo.3904158.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Aernouts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Janssen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Berkvens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Weyn</surname>
          </string-name>
          ,
          <article-title>Lora localization: With gnss or without?</article-title>
          ,
          <source>IEEE IoT Magazine (Submitted)</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Aernouts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Lemic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Moons</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Famaey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hoebeke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Weyn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Berkvens</surname>
          </string-name>
          ,
          <article-title>A Multimodal Localization Framework Design for IoT Applications</article-title>
          ,
          <source>Sensors</source>
          <volume>20</volume>
          (
          <year>2020</year>
          )
          4622. doi:10.3390/s20164622.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Aernouts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Berkvens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Van Vlaenderen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Weyn</surname>
          </string-name>
          ,
          <article-title>Sigfox and lorawan datasets for fingerprint localization in large urban and rural areas</article-title>
          ,
          <source>Data</source>
          <volume>3</volume>
          (
          <year>2018</year>
          ). URL: https://www.mdpi.com/2306-5729/3/2/13. doi:10.3390/data3020013.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Anuwatkun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sangthong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sang-Ngern</surname>
          </string-name>
          ,
          <article-title>A diff-based indoor positioning system using fingerprinting technique and k-means clustering algorithm</article-title>
          ,
          <source>in: 16th International Joint Conference on Computer Science and Software Engineering</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>148</fpage>
          -
          <lpage>151</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S. G.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Developing an improved fingerprint positioning radio map using the k-means clustering algorithm</article-title>
          ,
          <source>in: Int. Conf. on Information Networking</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>761</fpage>
          -
          <lpage>765</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cheng</surname>
          </string-name>
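          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,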
          <article-title>A new three-dimensional indoor positioning mechanism based on wireless lan</article-title>
          ,
          <source>Mathematical Problems in Engineering</source>
          <volume>2014</volume>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>H.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>An optimized fingerprint positioning algorithm for underground garage environment</article-title>
          ,
          <source>in: Int. Conf. on Information Networking</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>291</fpage>
          -
          <lpage>296</lpage>
          . URL: https://doi.ieeecomputersociety.org/10.1109/ICOIN.2016.7427079. doi:10.1109/ICOIN.2016.7427079.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Van</surname>
          </string-name>
          ,
          <article-title>Indoor fingerprint localization based on fuzzy c-means clustering</article-title>
          ,
          <year>2014</year>
          , pp.
          <fpage>337</fpage>
          -
          <lpage>340</lpage>
          . doi:10.1109/ICMTMA.2014.83.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Suroso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cherntanomwong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Sooraksa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Takada</surname>
          </string-name>
          ,
          <article-title>Fingerprint-based technique for indoor localization in wireless sensor networks using fuzzy c-means clustering algorithm</article-title>
          ,
          <source>in: International Symposium on Intelligent Signal Processing and Communications Systems</source>
          ,
          <year>2011</year>
          . doi:10.1109/ISPACS.2011.6146167.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Qin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>Received signal strength-based indoor localization using hierarchical classification</article-title>
          ,
          <source>Sensors</source>
          <volume>20</volume>
          (
          <year>2020</year>
          ). doi:10.3390/s20041067.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Karegar</surname>
          </string-name>
          ,
          <article-title>Wireless fingerprinting indoor positioning using affinity propagation clustering methods</article-title>
          ,
          <source>Wireless Networks</source>
          <volume>24</volume>
          (
          <year>2018</year>
          )
          <fpage>2825</fpage>
          -
          <lpage>2833</lpage>
          . URL: https://doi.org/10.1007/s11276-017-1507-0. doi:10.1007/s11276-017-1507-0.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>G.</given-names>
            <surname>Caso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>De Nardis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-G.</given-names>
            <surname>Di Benedetto</surname>
          </string-name>
          ,
          <article-title>A mixed approach to similarity metric selection in affinity propagation-based wifi fingerprinting indoor positioning</article-title>
          ,
          <source>Sensors</source>
          <volume>15</volume>
          (
          <year>2015</year>
          ). doi:10.3390/s151127692.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Achieving cost-efficient indoor fingerprint localization on wlan platform: A hypothetical test approach</article-title>
          ,
          <source>IEEE Access</source>
          <volume>5</volume>
          (
          <year>2017</year>
          )
          <fpage>15865</fpage>
          -
          <lpage>15874</lpage>
          . doi:10.1109/ACCESS.2017.2737651.
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>B.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Gan</surname>
          </string-name>
          ,
          <article-title>An Improved WiFi Positioning Method Based on Fingerprint Clustering and Signal Weighted Euclidean Distance</article-title>
          ,
          <source>Sensors</source>
          <volume>19</volume>
          (
          <year>2019</year>
          ). URL: https://pubmed.ncbi.nlm.nih.gov/31109054, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6567165/. doi:10.3390/s19102300.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Torres-Sospedra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Richter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moreira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Mendoza-Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. S.</given-names>
            <surname>Lohan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Trilles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Matey-Sanz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Huerta</surname>
          </string-name>
          ,
          <article-title>A comprehensive and reproducible comparison of clustering and optimization rules in wi-fi fingerprinting</article-title>
          ,
          <source>IEEE Transactions on Mobile Computing</source>
          <volume>21</volume>
          (
          <year>2022</year>
          )
          <fpage>769</fpage>
          -
          <lpage>782</lpage>
          . doi:10.1109/TMC.2020.3017176.
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>P.</given-names>
            <surname>Masek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stusek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Svertoka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pospisil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Burget</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. S.</given-names>
            <surname>Lohan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Marghescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hosek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ometov</surname>
          </string-name>
          ,
          <article-title>Measurements of LoRaWAN Technology in Urban Scenarios: A Data Descriptor</article-title>
          ,
          <source>Data</source>
          <volume>6</volume>
          (
          <year>2021</year>
          ). URL: https://www.mdpi.com/2306-5729/6/6/62. doi:10.3390/data6060062.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>K.</given-names>
            <surname>Mikhaylov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stusek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Masek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fujdiak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mozny</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Andreev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hosek</surname>
          </string-name>
          ,
          <article-title>On the performance of multi-gateway lorawan deployments: An experimental study</article-title>
          ,
          <source>in: 2020 IEEE Wireless Communications and Networking Conference (WCNC)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . doi:10.1109/WCNC45663.2020.9120655.
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>L.</given-names>
            <surname>Bhatia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Breza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Marfievici</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>McCann</surname>
          </string-name>
          ,
          <article-title>Loed: The lorawan at the edge dataset: Dataset</article-title>
          ,
          <source>in: Proceedings of the Third Workshop on Data: Acquisition To Analysis, DATA '20</source>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2020</year>
          , pp.
          <fpage>7</fpage>
          -
          <lpage>8</lpage>
          . URL: https://doi.org/10.1145/3419016.3431491. doi:10.1145/3419016.3431491.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>R.</given-names>
            <surname>Cardell-Oliver</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hübner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Leopold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Beringer</surname>
          </string-name>
          ,
          <article-title>Dataset: Lora underground farm sensor network</article-title>
          ,
          <source>in: Proceedings of the 2nd Workshop on Data Acquisition To Analysis, DATA'19</source>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2019</year>
          , pp.
          <fpage>26</fpage>
          -
          <lpage>28</lpage>
          . URL: https://doi.org/10.1145/3359427.3361912. doi:10.1145/3359427.3361912.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pospisil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fujdiak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Mikhaylov</surname>
          </string-name>
          ,
          <article-title>Investigation of the performance of tdoa-based localization over lorawan in theory and practice</article-title>
          ,
          <source>Sensors (Switzerland)</source>
          <volume>20</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          . doi:10.3390/s20195464.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Z. A.</given-names>
            <surname>Pandangan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C. R.</given-names>
            <surname>Talampas</surname>
          </string-name>
          ,
          <article-title>Hybrid LoRaWAN Localization using Ensemble Learning</article-title>
          ,
          <source>in: 2020 Global Internet of Things Summit (GIoTS)</source>
          , IEEE,
          <year>2020</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . doi:10.1109/GIOTS49054.2020.9119520.
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>G. G.</given-names>
            <surname>Anagnostopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kalousis</surname>
          </string-name>
          ,
          <article-title>A Reproducible Comparison of RSSI Fingerprinting Localization Methods Using LoRaWAN</article-title>
          ,
          <source>in: 16th Workshop on Positioning, Navigation and Communications</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>I.</given-names>
            <surname>Daramouskas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kapoulas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Paraskevas</surname>
          </string-name>
          ,
          <article-title>Using Neural Networks for RSSI Location Estimation in LoRa Networks</article-title>
          ,
          <source>in: 2019 10th International Conference on Information, Intelligence, Systems and Applications (IISA)</source>
          , IEEE,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          . doi:10.1109/IISA.2019.8900742.
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>J.</given-names>
            <surname>Purohit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>Fingerprinting-based Indoor and Outdoor Localization with LoRa and Deep Learning</article-title>
          ,
          <source>in: GLOBECOM 2020 - 2020 IEEE Global Communications Conference</source>
          , IEEE,
          <year>2020</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . doi:10.1109/GLOBECOM42002.2020.9322261.
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>T.</given-names>
            <surname>Janssen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Berkvens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Weyn</surname>
          </string-name>
          ,
          <article-title>Comparing Machine Learning Algorithms for RSS-Based Localization in LPWAN</article-title>
          ,
          <source>in: Lecture Notes in Networks and Systems</source>
          , volume
          <volume>96</volume>
          ,
          <year>2020</year>
          , pp.
          <fpage>726</fpage>
          -
          <lpage>735</lpage>
          . doi:10.1007/978-3-030-33509-0_68.
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>T.</given-names>
            <surname>Janssen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Berkvens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Weyn</surname>
          </string-name>
          ,
          <article-title>Benchmarking RSS-based localization algorithms with LoRaWAN</article-title>
          ,
          <source>Internet of Things</source>
          <volume>11</volume>
          (
          <year>2020</year>
          )
          100235. doi:10.1016/j.iot.2020.100235.
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>D.</given-names>
            <surname>Arthur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vassilvitskii</surname>
          </string-name>
          ,
          <article-title>K-means++: The advantages of careful seeding</article-title>
          ,
          <source>in: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms</source>
          ,
          <year>2007</year>
          , pp.
          <fpage>1027</fpage>
          -
          <lpage>1035</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>J.</given-names>
            <surname>Torres-Sospedra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Quezada-Gaibor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Mendoza-Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Nurmi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Koucheryavy</surname>
          </string-name>
          ,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>