<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Efficient Indoor Localization Model Construction by Sequential Recommendation of Data Gathering Position based on Bayesian Optimization</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yoshiki Omori</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Masato Sugasaki</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Masamichi Shimosaka</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Tokyo Institute of Technology</institution>
          ,
          <addr-line>Tokyo</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In recent years, with the spread of smartphones and IoT devices, the demand for indoor localization has been increasing. Because GPS is not suitable for indoor localization, the received signal strength indicator (RSSI) of Wi-Fi or BLE is frequently used instead. However, indoor localization based on RSSI requires data gathering in advance, which is quite costly. We argue that data gathering costs should be reduced in terms of the number of data needed, the walking distance, and the time required during data gathering. To the best of our knowledge, no previous work reduces the cost in all of these aspects simultaneously. To reduce the cost in all three aspects, we propose Effective Sequential Recommendation of Neighborhood Data Gathering Position, which is based on Bayesian Optimization and streamlined data gathering. Experiments show that our method reduces not only the number of data gathered but also the walking distance and the time required during data gathering, compared with other methods.</p>
      </abstract>
      <kwd-group>
        <kwd>indoor localization</kwd>
        <kwd>data gathering</kwd>
        <kwd>Bayesian Optimization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>proposed; however, this framework cannot obtain sufficient localization accuracy because of the
effects of radio signal reflection and diffraction in the indoor environment. Indoor localization
using fingerprints can be defined as estimating, from a fingerprint, the target mesh among the meshes set on the target
area.</p>
      <p>The model that estimates the position at which a fingerprint was acquired is called the localization
model. The parameters of this model must be optimized for all sections of the
localization target environment. Therefore, fingerprint data must be acquired evenly over
the entire localization target environment prior to localization. The high cost of gathering the data
required to construct an indoor localization model has been regarded as a problem.</p>
      <p>
        Therefore, many studies have worked on reducing data gathering costs. For
example, methods have been proposed that use a small amount of labeled data together with a large amount of unlabeled
data through semi-supervised learning [
        <xref ref-type="bibr" rid="ref2 ref3 ref4 ref5">2, 3, 4, 5</xref>
        ] and that distribute the effort of gathering data across
people by crowdsourcing [
        <xref ref-type="bibr" rid="ref10 ref6 ref7 ref8 ref9">6, 7, 8, 9, 10</xref>
        ]. In addition, methods that reduce costs
by gathering data while walking have also been proposed [
        <xref ref-type="bibr" rid="ref11 ref12 ref13 ref14">11, 12, 13, 14</xref>
        ]. These methods aim to
significantly reduce the data gathering time by combining data gathering and movement. All of these
methods gather a large amount of data efficiently, but they do not reduce the amount of data itself.
      </p>
      <p>
        On the other hand, methods that reduce the amount of data gathered, by constructing an
indoor localization model from a small amount of data with the same accuracy as when learning
from a large amount of data, have also been studied. This approach fundamentally reduces the data
gathering cost. As a typical example, a method that uses Bayesian optimization to maximize the improvement in localization
accuracy obtained from each gathered datum has been proposed [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. This
method succeeds in constructing a highly accurate localization model with a small amount
of data by gathering data, at each scan, at the position that maximizes the possible accuracy improvement of
the localization model. However,
since this method maximizes the immediate reward and focuses only on the accuracy improvement
of the localization model, the distance moved between consecutive data gathering positions
tends to become long, and most of that movement is redundant.
      </p>
      <p>
        Considering the burden on the data collector when gathering data in an actual situation,
we should reduce the number of data collected, the walking distance, and the time required
during data gathering. Methods that gather a large amount of data do not sufficiently
reduce the number of data. The method of Shimosaka et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] aims to reduce only the number of
data gathered, and its walking distance during data gathering remains costly. Because walking distance
was not taken into consideration, the ratio of redundant walking was larger than that of the
conventional methods.
      </p>
      <p>Therefore, we propose a method named Effective Sequential Recommendation of
Neighborhood Data Gathering Position. Our method achieves the target localization accuracy while
reducing redundant data gathering and movement, so that it simultaneously reduces the
number of data collected, the walking distance during data gathering, and the data gathering
time. Specifically, our algorithm recommends data gathering positions sequentially using
Bayesian Optimization, reducing not only the number of data but also the walking distance
needed to construct a sufficiently accurate indoor localization model.</p>
      <p>We also conducted an experiment to confirm the performance of the proposed method in an
office environment of 450 m². As a result, our method outperforms the baseline and comparison methods
in the number of data, the walking distance, and the required time.</p>
      <p>The contributions of this research are as follows:</p>
      <list list-type="bullet">
        <list-item>
          <p>We propose a new data gathering framework for constructing an indoor localization
model that reduces both the amount of data required to achieve the target localization accuracy,
by using Bayesian optimization, and the walking distance required throughout the data
gathering. Specifically, we improve how the acquisition function is used: instead of
recommending the data gathering position at which it takes the maximum value, our method
recommends the data gathering position based on the estimation of the current
position and the acquisition function after data gathering.</p>
        </list-item>
        <list-item>
          <p>We confirm the performance of our method in terms of the achieved
accuracy relative to the target accuracy, the required amount of data, the walking path length, and the required
time through an experiment using actual Wi-Fi fingerprint data. We show that our method
collects the data necessary for achieving the target localization accuracy with a significantly
shorter walking distance.</p>
        </list-item>
      </list>
      <p>The structure of this paper is as follows. Section 1 summarizes the position of this research
based on related work, and Section 2 describes the problem setting of indoor localization.
As the contribution of this research, Section 3 describes a sequential recommendation
algorithm for data gathering positions that considers walking distance based on Bayesian
optimization. Section 4 presents an evaluation experiment of the proposed method, and the
conclusions are summarized in Section 5.</p>
      <sec id="sec-1-1">
        <title>1.1. Related Work</title>
        <p>
          Indoor Localization with Wi-Fi RSSI. Bahl et al. [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] proposed indoor localization based
on the received signal strength indicator (RSSI). In particular, methods using the RSSI of Wi-Fi have attracted
attention because of the high penetration rate of Wi-Fi and their high localization accuracy, and they have
been actively researched [17, 18, 19, 20]. In addition, many localization methods based on
deep learning using Wi-Fi RSSI have been proposed [21, 22]. Beyond Wi-Fi RSSI,
channel state information [23, 24, 25, 26, 27], the phase difference for each antenna [28, 29], and the
AP-to-user signal propagation time [30, 31, 32] have also been used for indoor localization.
While these methods have improved the localization accuracy, they require a high cost for data
gathering.
        </p>
        <sec id="sec-1-1-1">
          <title>1.1.1. Methods for collecting a large amount of data efficiently</title>
          <p>
            Semi-supervised learning. Methods based on semi-supervised learning have been proposed that reduce the number of labeled
data collected by exploiting a large amount of unlabeled data [
            <xref ref-type="bibr" rid="ref2 ref3 ref4 ref5">2, 3, 4, 5</xref>
            ]. The aim is to make use of unlabeled data because collecting
labeled data is generally costly. However, it is hard to say that the data gathering cost is reduced from
the viewpoint of the amount of data collected, because at least one labeled datum is still
required for each localization target point and a large amount of unlabeled data is required as well.
          </p>
          <p>
            Crowdsourcing. Methods that reduce the burden of data collection per person by using
crowdsourcing have been proposed [
            <xref ref-type="bibr" rid="ref10 ref6 ref7 ref8 ref9">6, 7, 8, 9, 10</xref>
            ]. These methods disperse the data gathering cost by
asking the public to collect data, but data collected by the public is unreliable, and the total
number of gathered data is not reduced.
          </p>
          <p>
            Data gathering while walking. Methods that reduce the data collection cost by collecting
data while walking have been proposed [
            <xref ref-type="bibr" rid="ref11 ref12 ref13 ref14">11, 12, 13, 14</xref>
            ]. Although collecting data while walking can significantly
reduce the data collection time, it has been pointed out that
localization accuracy deteriorates because of inaccurate labeling. A
method that improves the labeling accuracy by making the walking route known in advance has also been
proposed. However, the walking route must be specified
before collecting data, and the efficiency of data collection is not taken into consideration when
specifying the route.
          </p>
        </sec>
        <sec id="sec-1-1-2">
          <title>1.1.2. Methods for collecting a small amount of data with high collection efficiency</title>
          <p>
            Bayesian Optimization. While the methods in the previous section collect a large amount of data efficiently,
methods that fundamentally reduce the data collection cost by
reducing the amount of data collected itself have also been proposed. Bayesian optimization [33, 34, 35]
is an efficient sampling algorithm for collecting training data. One example of its use
is the tuning of control parameters in the field of robot control [36, 37, 38, 39].
Bayesian optimization has also been applied to reduce the number of data collections
for indoor localization [
            <xref ref-type="bibr" rid="ref15">15</xref>
            ]. This method aims to
achieve the maximum localization accuracy with the minimum amount of data by sequentially
recommending, through Bayesian optimization, the data collection points that are most effective for improving the localization accuracy.
However, unlike sampling in the context of robot control parameter
tuning, data collection in the context of indoor localization model construction requires
movement to the data collection position for each scan. This movement is one of the main
factors in the data collection cost, but the method of Shimosaka et al. tends to increase
the walking distance required during data collection, which is a problem in practice. In
this study, we build on the reduction of the number of data collected using Bayesian optimization and
improve the use of the acquisition function. We propose a
sequential recommendation algorithm for data gathering positions that reduces both the number of data collected and the
walking distance needed to achieve the target accuracy.
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Formulation of Indoor Localization</title>
      <sec id="sec-2-1">
        <title>2.1. Information source for Indoor Localization</title>
        <p>Wi-Fi RSSI is used as the information source for indoor localization. One RSSI value is obtained for
each access point (AP). Here, the RSSI corresponding to AP a is expressed as x_a. The APs used
for localization are specified in advance, and their total number is A. The RSSI values from the A APs
obtained by one scan are expressed as a vector x ∈ ℝ^A, which is the fingerprint at the data
collection point. The data collection points are set by dividing the localization target
environment into a mesh and placing one point in each section. Each section r is also the unit
of localization. A label is assigned to each section and defined as the position label r ∈ ℛ, where
ℛ is the set of localization target sections. Indoor localization using Wi-Fi signal
strength thus becomes a multi-class classification problem that estimates the collection position r
from the collected RSSI vector x of the fingerprint. Each RSSI x_a takes a real value from −100
to 0. The unit is dBm, and the larger the value, the stronger the signal. The RSSI x_a
of an unobserved AP is filled in as −100.</p>
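        <p>As a minimal sketch of how one scan could be turned into a fixed-length fingerprint, the following fills unobserved APs with −100 dBm as stated above. The helper name make_fingerprint, the AP identifiers, and the clipping to the stated range are illustrative assumptions, not part of the original system.</p>
        <preformat><![CDATA[
```python
# Hypothetical helper: build a fixed-length RSSI fingerprint from one scan.
UNOBSERVED_RSSI = -100.0  # value used for APs that were not observed (dBm)


def make_fingerprint(scan, ap_ids):
    """Return the RSSI vector for `ap_ids`, filling unobserved APs with -100."""
    vector = []
    for ap in ap_ids:
        rssi = scan.get(ap, UNOBSERVED_RSSI)
        # Assumption: clip to the [-100, 0] dBm range stated in the text.
        vector.append(max(-100.0, min(0.0, float(rssi))))
    return vector


ap_ids = ["ap-a", "ap-b", "ap-c"]      # APs specified in advance (made-up names)
scan = {"ap-a": -48.0, "ap-c": -71.5}  # "ap-b" was not observed in this scan
fingerprint = make_fingerprint(scan, ap_ids)
```
]]></preformat>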
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Indoor Localization based on Fingerprint</title>
        <p>Fingerprint x is featurized and used for training and localization. Each section r has a corresponding
parameter θ_r of the same dimension as the featurized fingerprint φ(x). These parameters are learned
from training data 𝒟 = {(x_i, r_i)} for i ∈ (1, …, N), where N = |𝒟|, so as to
minimize the localization error. Let r be the ground-truth position at which a datum was gathered and r̂ be the
estimated position; the localization error e is defined by e = d(r, r̂), where d(·, ·)
is the Euclidean distance between sections.</p>
        <p>When estimating the collection position of a certain fingerprint, the estimated position
r̂ is calculated by (1), using Θ = {θ_1, …, θ_|ℛ|}.
          <disp-formula><label>(1)</label><tex-math><![CDATA[\hat{r} = \mathop{\mathrm{argmax}}\limits_{r \in \mathcal{R}} \theta_r^{\mathsf{T}} \phi(x)]]></tex-math></disp-formula>
        </p>
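        <p>The argmax rule and the distance-based localization error described in this section can be sketched as follows. The names estimate_section and localization_error, the toy parameter vectors, and the section coordinates are assumptions made for illustration only.</p>
        <preformat><![CDATA[
```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def estimate_section(theta, features):
    """Return the section label whose parameter vector scores highest."""
    return max(theta, key=lambda r: dot(theta[r], features))


def localization_error(coords, r_true, r_est):
    """Euclidean distance between the representative points of two sections."""
    (x1, y1), (x2, y2) = coords[r_true], coords[r_est]
    return ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5


# Toy values: three sections with two-dimensional featurized fingerprints.
theta = {"r1": [1.0, 0.0], "r2": [0.0, 1.0], "r3": [0.5, 0.5]}
coords = {"r1": (0.0, 0.0), "r2": (3.0, 4.0), "r3": (1.0, 0.0)}

r_hat = estimate_section(theta, [0.2, 0.9])
error = localization_error(coords, "r1", r_hat)
```
]]></preformat>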
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Learning method of Indoor Localization model with a small amount of data</title>
        <p>When constructing a localization model as a multi-class classification problem, it is necessary to
learn the parameter θ_r corresponding to every section. The simplest solution is to
collect data in all sections contained in ℛ, but this requires data gathering at least once in
each section, |ℛ| times in total. Therefore, we introduce multitask regularization, a
parameter learning method that considers the proximity between classes
and enables learning from fewer than |ℛ| data. In the context of indoor
localization, classes correspond to sections, so the proximity between classes is defined by the
Euclidean distance between the coordinates of the representative points of the sections. In other
words, the closer two sections are on the floor map, the higher the proximity of the corresponding
classes.</p>
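        <p>One possible form of such a regularizer is sketched below: an L2 term on every parameter vector plus a squared difference between the parameters of adjacent sections, which pulls nearby sections toward similar parameters. The function name multitask_penalty, the weights lam_l2 and lam_adj, and the use of a fixed adjacency list are assumptions; the exact form used in the paper is not given in this section.</p>
        <preformat><![CDATA[
```python
def multitask_penalty(theta, adjacent_pairs, lam_l2, lam_adj):
    """L2 penalty plus squared parameter differences between adjacent sections."""
    l2 = sum(w * w for vec in theta.values() for w in vec)
    adj = sum(
        (a - b) ** 2
        for r, s in adjacent_pairs
        for a, b in zip(theta[r], theta[s])
    )
    return lam_l2 * l2 + lam_adj * adj


# Toy values: two adjacent sections with two-dimensional parameters.
theta = {"r1": [1.0, 0.0], "r2": [0.0, 1.0]}
penalty = multitask_penalty(theta, [("r1", "r2")], lam_l2=0.5, lam_adj=2.0)
```
]]></preformat>
        <p>With identical parameters in adjacent sections the second term vanishes, so a section without data can inherit information from its neighbors.</p>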
        <sec id="sec-2-3-1">
          <title>2.3.1. Streamlining data gathering to reduce the number of data</title>
          <p>In this study, we adopt an experimental design method using Bayesian optimization. The
function that models the amount of accuracy improvement of the target model is called the acquisition
function. Let the acquisition function be a and the coordinate space of the data be 𝒵;
the coordinates ẑ of the next observed data are then determined by (2).
            <disp-formula><label>(2)</label><tex-math><![CDATA[\hat{z} = \mathop{\mathrm{argmax}}\limits_{z \in \mathcal{Z}} a(z)]]></tex-math></disp-formula>
          </p>
          <p>In the context of indoor localization, we model the localization error that decreases by observing
the fingerprint in each section and recommend the most efficient data collection position.</p>
          <p>In the context of indoor localization, the coordinate space of the data is ℛ, and the amount of accuracy
improvement of the target model can be obtained from the localization error e_r in each
section r ∈ ℛ. Since the acquisition function is modeled depending on the set of collected data
𝒟, it can be expressed as a(r; 𝒟), which is obtained by the following procedure.</p>
          <p>First, we construct a localization error data set ℒ using 𝒟. The localization error of
a localization model is normally calculated, as an evaluation metric, on data different from the data used for
training. At the data collection stage, however, no data other than the data currently being collected
is available, so the same data is used both for training and for estimating the
localization error. The parameters {θ_r} corresponding to the sections r ∈ ℛ
are trained using the gathered data 𝒟. The training method is not limited to a specific
one; any method may be used as long as it takes a fingerprint as an input vector
and performs multi-class classification.</p>
          <p>The estimated localization error data set ℒ is initialized as an empty set and then constructed
as follows. For each datum (x, r) contained in 𝒟, the estimated collection
position r̂ is found according to (1). The Euclidean distance d(r, r̂) between the ground-truth
collection position r and the estimated collection position r̂ is regarded as the estimated
localization error, and the tuple (d(r, r̂), r) is inserted into ℒ.</p>
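          <p>The construction of the estimated localization error data set can be sketched as below. The function build_error_dataset, the threshold-based toy classifier, and the section coordinates are hypothetical; any multi-class classifier trained on the gathered data could take the place of predict.</p>
          <preformat><![CDATA[
```python
def build_error_dataset(data, predict, coords):
    """Return a list of (estimated localization error, true section) tuples."""
    errors = []
    for fingerprint, r_true in data:
        r_est = predict(fingerprint)
        (x1, y1), (x2, y2) = coords[r_true], coords[r_est]
        dist = ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
        errors.append((dist, r_true))
    return errors


def toy_predict(fingerprint):
    """Stand-in classifier: decides the section from the first RSSI value."""
    return "r1" if fingerprint[0] > -60.0 else "r2"


coords = {"r1": (0.0, 0.0), "r2": (0.0, 2.0)}
data = [([-50.0], "r1"), ([-55.0], "r2"), ([-80.0], "r2")]
error_set = build_error_dataset(data, toy_predict, coords)
```
]]></preformat>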
          <p>The amount of accuracy improvement of the localization model must be estimated for all sections of
ℛ, but the data contained in ℒ is insufficient for this. Therefore, the predictive distribution of the localization
error is obtained using Gaussian process regression. The predictive distribution estimated by
Gaussian process regression is Gaussian, so the predictive distribution of the
estimated localization error in section r can be obtained in the form 𝒩(μ_r, σ_r²). The accuracy
improvement could be modeled using
only the mean of the predictive distribution of the estimated localization error, but that alone
is not sufficient. Since the variance of the predictive distribution obtained from Gaussian process
regression increases as the number of observed data decreases, the variance information is also
important for recommending data collection positions for indoor localization
models. Therefore, we use GP-UCB (Gaussian Process Upper Confidence Bound) [40],
a metric that considers both the mean and the variance of the predictive distribution. The GP-UCB
score u_r at section r is obtained by (3) using the mean μ_r and variance σ_r² of the predictive distribution,
where β is a parameter that adjusts the relative influence of μ_r and σ_r on u_r.
            <disp-formula><label>(3)</label><tex-math><![CDATA[u_r = \mu_r + \beta \sigma_r]]></tex-math></disp-formula>
Using this u_r, we get the acquisition function as a(r; 𝒟) = u_r.</p>
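          <p>Given a predictive mean and variance for each section, the GP-UCB score and the resulting maximum-score recommendation can be sketched as follows. The Gaussian process regression itself is omitted, the numbers are made up, and the weighting of the standard deviation by a single coefficient beta is an assumption about the exact form of the score.</p>
          <preformat><![CDATA[
```python
import math


def gp_ucb_scores(mean, variance, beta):
    """Upper-confidence-bound score per section: mean + beta * std."""
    return {r: mean[r] + beta * math.sqrt(variance[r]) for r in mean}


# Made-up predictive means and variances of the localization error [m].
mean = {"r1": 1.0, "r2": 2.0, "r3": 1.5}
variance = {"r1": 4.0, "r2": 0.25, "r3": 1.0}

scores = gp_ucb_scores(mean, variance, beta=2.0)
# Recommendation that takes the maximum of the acquisition function.
next_section = max(scores, key=scores.get)
```
]]></preformat>
          <p>In this toy case the section with the lowest mean error is still recommended because its variance is large, which is the exploration effect that the variance term is meant to provide.</p>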
          <p>Bayesian optimization thus reduces the number of data collected, N = |𝒟|.
In the context of indoor localization, however, data collection requires the user to move. In particular,
in the method of Shimosaka et al., the next data collection position is obtained according to (2),
and only the number of data collections is considered. As a result, although the number of data collections
is reduced, the data collection route becomes rather redundant, and the burden
on the user increases from the viewpoint of the walking route length and the required
time.</p>
          <p>[Figure: comparison of previous work and the proposed method on an example floor with a shelf and desks. Previous work: maximum improvement in localization accuracy for each scan. Proposed method: finally achieves the target localization accuracy with minimum cost by reducing redundant walking.]</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Effective Sequential Recommendation of Neighborhood Data Gathering Position</title>
      <sec id="sec-4-1">
        <title>3.1. Sequential Recommendation of data gathering position considering both localization accuracy and walking cost</title>
        <p>
          This study aims not only to reduce the number of data collections by using Bayesian optimization
but also to further reduce the data collection cost required to build an indoor localization model
by reducing the distance traveled during data collection. Shimosaka et al. [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] set the section
with the highest GP-UCB value of the localization error, estimated from the acquired data, as the
next data collection position, but this caused redundant walking. Therefore, we propose a new
method, named Effective Sequential Recommendation of Neighborhood Data Gathering Position,
that sequentially recommends the next data collection position by considering both the GP-UCB value
of the estimated localization error and the relationship with the current location.
        </p>
      </sec>
      <sec id="sec-4-3">
        <title>3.2. Effective Sequential Recommendation of Neighborhood Data Gathering Position</title>
        <p>
          Like the method of Shimosaka et al. [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], Effective Sequential Recommendation of Neighborhood Data Gathering Position uses the
collected data to calculate the estimated localization error, extends the estimate to
the entire indoor environment, and calculates the GP-UCB value of the localization
error before recommending the next data collection position. These procedures are
the same as those described in Section 2.3.1; only the method of recommending the next data collection position
after calculating the GP-UCB value of the estimated localization error for the entire
ℛ is different. The overall flow of the algorithm is shown in Algorithm 1, and the selection of the
next destination is illustrated in Fig. 2. The detailed flow of the algorithm is as follows:
        </p>
        <p>Let r_current be the initial position of the user who collects data, and let ℛ be the set of all localization
target sections. Define scan(r) as a function that collects a fingerprint in section r and returns
its value; scan(r_current) is added to the data set 𝒟. Let ℛ1 be the set of sections
r ∈ ℛ whose GP-UCB value of the estimated localization error exceeds the threshold τ. Among the sections
r contained in ℛ1, those that satisfy GPUCBScoreOnRouteLower(𝒟, r_current, r)
constitute ℛ2. The procedure for GPUCBScoreOnRouteLower(𝒟, r_current, r)
is described later in Algorithm 2. ℛ2 still contains many sections, so it does not narrow
down the candidates for the next data collection position sufficiently. We therefore focus on the
fact that ℛ2 tends to include adjacent sections: ℛ2 is divided into groups of adjacent sections,
and one section is selected from each group as an element of ℛ3. From each group,
the section with the shortest required travel distance d′(r_current, r)
from the current location r_current is selected. The next data collection position is then selected from ℛ3.
If ℛ3 contains multiple sections, the section with
the shortest required travel distance d′(r_current, r) from the current location is selected. The user
moves to the next destination selected in this way.</p>
        <p>The above procedure is repeated, and the algorithm stops when the GP-UCB value in the entire
localization target environment falls below the threshold τ, that is, when ℛ1 becomes
an empty set.</p>
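        <p>One step of the recommendation procedure described above can be sketched as follows. Sections are represented as grid cells, adjacency is taken to mean sharing an edge, and the required travel distance is approximated by the Manhattan distance; all three are assumptions. The predicate route_ok stands in for GPUCBScoreOnRouteLower, whose details are not covered here, and the names next_position and adjacent_groups are hypothetical.</p>
        <preformat><![CDATA[
```python
from collections import deque


def manhattan(a, b):
    """Assumed travel distance between two grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def adjacent_groups(sections):
    """Split grid cells into groups of edge-sharing (4-connected) cells."""
    remaining, groups = set(sections), []
    while remaining:
        start = remaining.pop()
        group, queue = [start], deque([start])
        while queue:
            x, y = queue.popleft()
            for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    group.append(nb)
                    queue.append(nb)
        groups.append(group)
    return groups


def next_position(scores, current, threshold, route_ok, travel=manhattan):
    """Return the next section to scan, or None when data gathering should stop."""
    r1 = [r for r, s in scores.items() if s > threshold]
    if not r1:
        return None  # stopping condition: no score exceeds the threshold
    r2 = [r for r in r1 if route_ok(current, r)]
    # The nearest member of each group of adjacent sections forms R3.
    r3 = [min(g, key=lambda r: travel(current, r)) for g in adjacent_groups(r2)]
    if not r3:
        return None  # assumption: the text does not cover an empty R2
    return min(r3, key=lambda r: travel(current, r))


# Made-up GP-UCB scores on a small grid; every route is accepted here.
scores = {(0, 0): 0.1, (1, 0): 0.2, (5, 5): 3.0, (5, 6): 2.5, (9, 0): 4.0}
chosen = next_position(scores, (0, 0), 1.0, lambda cur, dest: True)
```
]]></preformat>
        <p>In this toy example the two adjacent high-score cells collapse into one candidate, and the nearer isolated cell is chosen even though its neighbors are not the global nearest cells, which illustrates how the procedure trades score against walking distance.</p>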
      </sec>
      <sec id="sec-4-4">
        <title>3.3. Next data gathering position selection based on predicted localization error on route</title>
        <p>Here, the calculation of GPUCBScoreOnRouteLower(𝒟, r_current, r_destination) is
described. The flow of the algorithm is summarized in Algorithm 2.</p>
        <p>First, let V be the vertex set composed of the elements of ℛ and E be the edge set composed of undirected</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Experiments</title>
      <sec id="sec-5-1">
        <title>4.1. Purpose</title>
        <p>The purpose of this experiment is to show that the proposed method can construct a highly
accurate localization model while keeping the amount of data collected small and the walking
distance during data collection short in comparison with other methods.</p>
      </sec>
      <sec id="sec-5-2">
        <title>4.2. Experimental Environment</title>
        <p>Fingerprint data was collected in an office environment of about 450 square meters. Each
section, which is the unit of localization, was a 1 m square, and 202 sections were used as the
localization target area. To evaluate robustness, we also conducted experiments in the same environment excluding
Lab2 and Lab3. We refer to the full environment as Env1 and the reduced one as Env2. The number of APs used for
localization was 32.</p>
      </sec>
      <sec id="sec-5-3">
        <title>4.3. Evaluation Metrics</title>
        <p>As evaluation metrics, we use the accuracy of the indoor localization model learned from the collected
data, the number of collected fingerprints, and the walking distance during data collection.
To assess how efficiently the indoor localization model is built, we also compare the accuracy
of the localization models constructed at each walking distance
and collection time. The walking distance
during data collection is calculated from the actual walking route using the Manhattan distance.</p>
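        <p>A minimal sketch of this walking-distance metric, assuming the route is recorded as a sequence of section coordinates in meters, is shown below. The function name walking_distance and the example route are illustrative.</p>
        <preformat><![CDATA[
```python
def walking_distance(route):
    """Sum of Manhattan distances between consecutive points of a route."""
    return sum(
        abs(x2 - x1) + abs(y2 - y1)
        for (x1, y1), (x2, y2) in zip(route, route[1:])
    )


# Made-up route through four sections on a 1 m grid.
route = [(0, 0), (3, 0), (3, 4), (1, 4)]
total = walking_distance(route)
```
]]></preformat>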
      </sec>
      <sec id="sec-5-4">
        <title>4.4. Proposed and Comparison Methods</title>
        <sec id="sec-5-4-1">
          <title>4.4.1. NearestNearbyCandidate</title>
          <p>NearestNearbyCandidate implements Effective Sequential Recommendation of
Neighborhood Data Gathering Position. Data collection terminates when the method
determines that
sufficient data has been obtained for learning a localization model with the target accuracy.</p>
        </sec>
        <sec id="sec-5-4-2">
          <title>4.4.2. Gather-All</title>
          <p>Gather-All is a method that scans a fingerprint once in every section of the localization target
environment. The data collection order was set manually, collecting data in order from one end of the environment, so that redundant walking would
be avoided as much as possible. Data collection
terminates when the fingerprint scan has been completed in all sections.</p>
        </sec>
        <sec id="sec-5-4-3">
          <title>4.4.3. BayesianOptimization</title>
          <p>
            We compare against the method of Shimosaka et al. [
            <xref ref-type="bibr" rid="ref15">15</xref>
            ], referred to as Bayesian Optimization. This method
focuses only on the number of data collected and minimizes it. Specifically,
the data collection position recommendation model is learned from the
fingerprint data obtained so far, the localization error is estimated using Gaussian
process regression, and the fingerprint scan is repeated in the
section where the GP-UCB value of the estimated localization error takes its maximum. The termination condition of data collection is that the model determines that sufficient
data is available for learning a localization model with the target accuracy.
          </p>
        </sec>
      </sec>
      <sec id="sec-5-5">
        <title>4.5. Experimental Settings</title>
        <p>
          A fingerprint is featurized by Gauss features, following the method of
Shimosaka et al. [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. For the RSSI of each AP, the Gauss features obtained using μ =
[−80.0, −70.0, −60.0, −50.0, −40.0, −30.0] and σ = 2.0 are used as a vector, and the feature vector
is obtained by concatenating these vectors over all APs. As the loss function, a cost-sensitive hinge loss is
used, following Shimosaka et al. This makes it possible to perform learning that considers not
only the correctness of labels but also the distance between labels. In addition, when learning
the parameters, the L2 norm and a quadratic norm of parameters between adjacent sections, which works
as multitask learning, are used as regularization terms. FOBOS is used as the optimization method
for the above loss function and regularization terms. Data collection was performed using an
interactive data collection system, and the terminal used was an Apple MacBook Pro.
        </p>
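        <p>The featurization step can be sketched as follows, using the six centers and the width of 2.0 given above. The exact functional form of the Gauss feature is not stated here, so the unnormalized Gaussian exp(−(x − μ)² / (2σ²)) and the function name gauss_features are assumptions.</p>
        <preformat><![CDATA[
```python
import math

CENTERS = [-80.0, -70.0, -60.0, -50.0, -40.0, -30.0]  # centers in dBm
SIGMA = 2.0                                            # width in dBm


def gauss_features(rssi_vector, centers=CENTERS, sigma=SIGMA):
    """Concatenate, over all APs, one Gaussian response per center."""
    features = []
    for rssi in rssi_vector:
        for mu in centers:
            # Assumed form: unnormalized Gaussian bump around each center.
            features.append(math.exp(-((rssi - mu) ** 2) / (2.0 * sigma ** 2)))
    return features


# Two APs: one observed at -50 dBm, one unobserved (filled with -100 dBm).
phi = gauss_features([-50.0, -100.0])
```
]]></preformat>
        <p>Under this assumed form, an unobserved AP filled with −100 dBm lies far from every center and contributes features that are numerically close to zero.</p>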
      </sec>
      <sec id="sec-5-6">
        <title>4.6. Experimental Result</title>
        <sec id="sec-5-6-1">
          <title>4.6.1. Comparison of data gathering cost</title>
          <p>[Figures 4 and 5: LocalizationError vs WalkedDistance and LocalizationError vs ScanDuration for NearestNearbyCandidate, BayesianOptimization, and Gather-All. (a) Averaged localization error for walking distance while data gathering (horizontal axis: Walked Distance [m]). (b) Averaged localization error for required time while data gathering (horizontal axis: Scan Duration [s]).]</p>
          <p>Fig. 4a and Fig. 5a compare the walking distance each method requires to reach a given
localization accuracy, and Fig. 4b and Fig. 5b compare the required collection time. Table 1 and
Table 2 summarize, for the point at which a localization error of 3.5 m or less is achieved, the
walking distance at the end of data gathering, the required collection time, and the actual
localization error. The required collection time is calculated from the record of the actual
data collection by weighting each data collection as 4.0 s and each 1 m of walking as 1.0 s.
When all the collected data are compared, the required walking distance and required collection
time of Bayesian Optimization are far larger than those of the other methods, so for readability
the graphs show only the results for the first 40 points.</p>
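          <p>The time weighting described above can be illustrated with a minimal Python sketch. It assumes only the two weights stated in the text (4.0 s per data collection and 1.0 s per 1 m walked). The names Step, required_time and cost_at_target, and every value in the example log, are hypothetical and are not taken from the experiments.</p>
          <preformat>
```python
from dataclasses import dataclass

SCAN_SECONDS_PER_SAMPLE = 4.0  # weight stated in the text: 4.0 s per data collection
WALK_SECONDS_PER_METER = 1.0   # weight stated in the text: 1.0 s per 1 m walked


@dataclass
class Step:
    """One data-collection step (hypothetical record layout)."""
    walked_m: float  # distance walked to reach this collection point [m]
    error_m: float   # average localization error after adding this sample [m]


def required_time(n_samples, walked_m):
    """Weighted collection time in seconds for n_samples scans and walked_m meters."""
    return SCAN_SECONDS_PER_SAMPLE * n_samples + WALK_SECONDS_PER_METER * walked_m


def cost_at_target(steps, target_error_m=3.5):
    """Return (samples, distance, time) at the first step meeting the target, else None."""
    total_m = 0.0
    for n, step in enumerate(steps, start=1):
        total_m += step.walked_m
        if target_error_m >= step.error_m:
            return n, total_m, required_time(n, total_m)
    return None


# Hypothetical collection log, for illustration only.
log = [Step(5.0, 6.2), Step(3.0, 4.1), Step(4.0, 3.4), Step(2.0, 3.1)]
print(cost_at_target(log))  # prints (3, 12.0, 24.0)
```
</preformat>
          <p>In this hypothetical log the 3.5 m target is first met at the third collection, after 12 m of walking, which gives 4.0 × 3 + 1.0 × 12 = 24 s.</p>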
          <p>Fig. 4a shows how the average localization error changes with walking distance. Unlike
Bayesian Optimization, the accuracy of Gather-All and NearestNearbyCandidate converges after
about 200 m of walking, so these two methods can construct a highly accurate localization model
with a short walking distance. Table 1 confirms this.</p>
          <p>Fig. 4b shows how the average localization error changes with the required time. Bayesian
Optimization and NearestNearbyCandidate show a similar tendency: the localization error
decreases for both, and NearestNearbyCandidate converges first. Gather-All, on the other hand,
takes longer to converge than the other two methods.</p>
          <p>Fig. 5a and Fig. 5b show the same tendency as Fig. 4a and Fig. 4b, which indicates that
NearestNearbyCandidate (NNC) can reduce the data collection cost in terms of the number of data
collected, walking distance, and required time even when the environment changes.</p>
          <p>Table 1 and Table 2 show that, when an average localization error of 3.5 m is achieved,
NearestNearbyCandidate needs only 33 % of the number of data collected by Gather-All.
NearestNearbyCandidate achieved the target accuracy with this number of collections, and the
increase in the number of collections compared with Bayesian Optimization was suppressed by
about 50 %.</p>
          <p>These results confirm that NearestNearbyCandidate can construct a highly accurate
localization model with a walking distance comparable to that of Gather-All, and with a required
time and number of data collected comparable to those of Bayesian Optimization. The method
therefore reduces cost across all three targets: the number of data collections, walking
distance, and required time.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Conclusion</title>
      <p>In this research, we argued that the cost of data collection for indoor localization needs
to be reduced in terms of the number of data collected, walking distance, and required time. We
proposed Effective Sequential Recommendation of Neighborhood Data Gathering Position, which is
effective for cost reduction from all three viewpoints. Experiments confirmed that the proposed
method can reduce these costs simultaneously, unlike the existing methods. Future work includes
constructing a localization model without the floor map information that was assumed to be known
in this study, and selecting the data collection order with a proof of optimality.
      </p>
      <p>and Communications Societies. Proceedings. IEEE, volume 2, IEEE, 2000, pp. 775–784.</p>
      <p>[17] M. Youssef, A. Agrawala, The horus wlan location determination system, in: Proceedings of the 3rd international conference on Mobile systems, applications, and services, 2005, pp. 205–218.</p>
      <p>[18] A. H. Salamah, M. Tamazin, M. A. Sharkas, M. Khedr, An enhanced wifi indoor localization system based on machine learning, in: 2016 International Conference on Indoor Positioning and Indoor Navigation (IPIN), IEEE, 2016, pp. 1–8.</p>
      <p>[19] Y. Sun, M. Liu, M. Q.-H. Meng, Wifi signal strength-based robot indoor localization, in: 2014 IEEE International Conference on Information and Automation (ICIA), IEEE, 2014, pp. 250–256.</p>
      <p>[20] C. Chen, Y. Chen, Y. Han, H.-Q. Lai, K. R. Liu, Achieving centimeter-accuracy indoor localization on wifi platforms: A frequency hopping approach, IEEE Internet of Things Journal 4 (2016) 111–121.</p>
      <p>[21] X. Wang, Z. Yu, S. Mao, Deepml: Deep lstm for indoor localization with smartphone magnetic and light sensors, in: 2018 IEEE International Conference on Communications (ICC), IEEE, 2018, pp. 1–6.</p>
      <p>[22] M. Abbas, M. Elhamshary, H. Rizk, M. Torki, M. Youssef, Wideep: Wifi-based accurate and robust indoor localization system using deep learning, in: 2019 IEEE International Conference on Pervasive Computing and Communications (PerCom), IEEE, 2019, pp. 1–10.</p>
      <p>[23] X. Wang, L. Gao, S. Mao, S. Pandey, Csi-based fingerprinting for indoor localization: A deep learning approach, IEEE Transactions on Vehicular Technology 66 (2016) 763–776.</p>
      <p>[24] K. Wu, J. Xiao, Y. Yi, D. Chen, X. Luo, L. M. Ni, Csi-based indoor localization, IEEE Transactions on Parallel and Distributed Systems 24 (2012) 1300–1309.</p>
      <p>[25] Z. Yang, Z. Zhou, Y. Liu, From rssi to csi: Indoor localization via channel response, ACM Computing Surveys (CSUR) 46 (2013) 1–32.</p>
      <p>[26] Y. Chapre, A. Ignjatovic, A. Seneviratne, S. Jha, Csi-mimo: Indoor wi-fi fingerprinting system, in: 39th annual IEEE conference on local computer networks, IEEE, 2014, pp. 202–209.</p>
      <p>[27] Z. Wu, Q. Xu, J. Li, C. Fu, Q. Xuan, Y. Xiang, Passive indoor localization based on csi and naive bayes classification, IEEE Transactions on Systems, Man, and Cybernetics: Systems 48 (2017) 1566–1577.</p>
      <p>[28] S. Wielandt, L. D. Strycker, Indoor multipath assisted angle of arrival localization, Sensors 17 (2017) 2522.</p>
      <p>[29] S. Yang, E. Jeong, S. Han, Indoor positioning based on received optical power difference by angle of arrival, Electronics Letters 50 (2014) 49–51.</p>
      <p>[30] O. Hashem, M. Youssef, K. A. Harras, Winar: Rtt-based sub-meter indoor localization using commercial devices, in: 2020 IEEE International Conference on Pervasive Computing and Communications (PerCom), IEEE, 2020, pp. 1–10.</p>
      <p>[31] C. Gentner, M. Ulmschneider, I. Kuehner, A. Dammann, Wifi-rtt indoor positioning, in: 2020 IEEE/ION Position, Location and Navigation Symposium (PLANS), IEEE, 2020, pp. 1029–1035.</p>
      <p>[32] G. Guo, R. Chen, F. Ye, X. Peng, Z. Liu, Y. Pan, Indoor smartphone localization: A hybrid wifi rtt-rss ranging approach, IEEE Access 7 (2019) 176767–176781.</p>
      <p>[33] J. Snoek, H. Larochelle, R. P. Adams, Practical bayesian optimization of machine learning algorithms, arXiv preprint arXiv:1206.2944 (2012).</p>
      <p>[34] B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, N. De Freitas, Taking the human out of the loop: A review of bayesian optimization, Proceedings of the IEEE 104 (2015) 148–175.</p>
      <p>[35] K. Swersky, J. Snoek, R. P. Adams, Multi-task bayesian optimization (2013).</p>
      <p>[36] R. Calandra, A. Seyfarth, J. Peters, M. P. Deisenroth, An experimental comparison of bayesian optimization for bipedal locomotion, in: 2014 IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2014, pp. 1951–1958.</p>
      <p>[37] T. Seyde, J. Carius, R. Grandia, F. Farshidian, M. Hutter, Locomotion planning through a hybrid bayesian trajectory optimization, in: 2019 International Conference on Robotics and Automation (ICRA), IEEE, 2019, pp. 5544–5550.</p>
      <p>[38] H. Cheng, H. Chen, Online parameter optimization in robotic force controlled assembly processes, in: 2014 IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2014, pp. 3465–3470.</p>
      <p>[39] F. Berkenkamp, A. P. Schoellig, A. Krause, Safe controller optimization for quadrotors with gaussian processes, in: 2016 IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2016, pp. 491–496.</p>
      <p>[40] N. Srinivas, A. Krause, S. M. Kakade, M. W. Seeger, Information-theoretic regret bounds for gaussian process optimization in the bandit setting, IEEE Transactions on Information Theory 58 (2012) 3250–3265.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Yassin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nasser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Awad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Al-Dubai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Yuen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Raulefs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Aboutanios</surname>
          </string-name>
          ,
          <article-title>Recent advances in indoor localization: A survey on theoretical approaches and applications</article-title>
          ,
          <source>IEEE Communications Surveys &amp; Tutorials</source>
          <volume>19</volume>
          (
          <year>2016</year>
          )
          <fpage>1327</fpage>
          -
          <lpage>1346</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>H.</given-names>
            <surname>Kashima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Suzuki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hido</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tsuboi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Takahashi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ide</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Takahashi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tajima</surname>
          </string-name>
          ,
          <article-title>A semi-supervised approach to indoor location estimation</article-title>
          (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zou</surname>
          </string-name>
          ,
          <article-title>A low-cost and accurate indoor localization algorithm using label propagation based semi-supervised learning</article-title>
          ,
          <source>in: Mobile Ad-hoc and Sensor Networks, 2009. MSN'09. 5th International Conference on</source>
          ,
          <year>2009</year>
          , pp.
          <fpage>108</fpage>
          -
          <lpage>111</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Geng</surname>
          </string-name>
          ,
          <article-title>Semi-supervised learning for indoor hybrid fingerprint database calibration with low effort</article-title>
          ,
          <source>IEEE Access</source>
          <volume>5</volume>
          (
          <year>2017</year>
          )
          <fpage>4388</fpage>
          -
          <lpage>4400</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Pulkkinen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Roos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Myllymäki</surname>
          </string-name>
          ,
          <article-title>Semi-supervised learning for wlan positioning</article-title>
          ,
          <source>in: International Conference on Artificial Neural Networks</source>
          ,
          <year>2011</year>
          , pp.
          <fpage>355</fpage>
          -
          <lpage>362</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>E.</given-names>
            <surname>Arias-de Reyna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dardari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Closas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Djurić</surname>
          </string-name>
          ,
          <article-title>Enhanced indoor localization through crowd sensing</article-title>
          ,
          <source>in: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</source>
          , IEEE,
          <year>2017</year>
          , pp.
          <fpage>2487</fpage>
          -
          <lpage>2491</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Azizyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Constandache</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Roy Choudhury</surname>
          </string-name>
          ,
          <article-title>Surroundsense: mobile phone localization via ambience fingerprinting</article-title>
          ,
          <source>in: Proceedings of the 15th annual international conference on Mobile computing and networking</source>
          ,
          <year>2009</year>
          , pp.
          <fpage>261</fpage>
          -
          <lpage>272</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.-H.</given-names>
            <surname>Jung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <article-title>Automated construction and maintenance of wi-fi radio maps for crowdsourcing-based indoor positioning systems</article-title>
          ,
          <source>IEEE Access</source>
          <volume>6</volume>
          (
          <year>2018</year>
          )
          <fpage>1764</fpage>
          -
          <lpage>1777</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Kawajiri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shimosaka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Kashima</surname>
          </string-name>
          ,
          <article-title>Steered crowdsensing: Incentive design towards quality-oriented place-centric crowdsensing</article-title>
          ,
          <source>in: Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>691</fpage>
          -
          <lpage>701</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>B.</given-names>
            <surname>Lashkari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rezazadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Farahbakhsh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Sandrasegaran</surname>
          </string-name>
          ,
          <article-title>Crowdsourcing and sensing for indoor localization in iot: A review</article-title>
          ,
          <source>IEEE Sensors Journal</source>
          <volume>19</volume>
          (
          <year>2018</year>
          )
          <fpage>2408</fpage>
          -
          <lpage>2434</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>R.</given-names>
            <surname>Kawajiri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shimosaka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fukui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Sato</surname>
          </string-name>
          ,
          <article-title>Frustratingly simplified deployment in wlan localization by learning from route annotation</article-title>
          ,
          <source>in: Asian Conference on Machine Learning</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>191</fpage>
          -
          <lpage>204</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Rai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. K.</given-names>
            <surname>Chintalapudi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. N.</given-names>
            <surname>Padmanabhan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sen</surname>
          </string-name>
          ,
          <article-title>Zee: Zero-effort crowdsourcing for indoor localization</article-title>
          ,
          <source>in: Proceedings of the 18th annual international conference on Mobile computing and networking</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>293</fpage>
          -
          <lpage>304</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>X.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Accurate and efficient indoor location by dynamic warping in sequence-type radio-map</article-title>
          ,
          <source>Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies</source>
          <volume>2</volume>
          (
          <year>2018</year>
          )
          <fpage>50</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Gong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <article-title>Turf: Fast data collection for fingerprint-based indoor localization</article-title>
          ,
          <source>in: 2017 International Conference on Indoor Positioning and Indoor Navigation (IPIN)</source>
          , IEEE,
          <year>2017</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Shimosaka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Saisho</surname>
          </string-name>
          ,
          <article-title>Efficient calibration for rssi-based indoor localization by bayesian experimental design on multi-task classification</article-title>
          ,
          <source>in: Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>244</fpage>
          -
          <lpage>249</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>P.</given-names>
            <surname>Bahl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. N.</given-names>
            <surname>Padmanabhan</surname>
          </string-name>
          ,
          <article-title>Radar: An in-building rf-based user location and tracking system</article-title>
          ,
          <source>in: INFOCOM</source>
          <year>2000</year>
          . Nineteenth Annual Joint Conference of the IEEE Computer
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>