<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Indoor localization based on analysis of environmental ultrasound</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yudai Nagama</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Takeshi Umezawa</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Noritaka Osawa</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Graduate School of Engineering, Chiba University</institution>
          ,
          <addr-line>Chiba</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Graduate School of Science and Engineering, Chiba University</institution>
          ,
          <addr-line>Chiba</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This study proposes and evaluates a method that estimates indoor position from environmental ultrasound, which is latent information already present in the target environment. The method converts environmental ultrasound into spectrograms and uses a convolutional neural network to build a nonlinear regression model on the spectrograms that estimates indoor position and direction. We evaluated the method's accuracy in estimating position and direction within a room. In the position estimation experiment, ultrasound was recorded at 20 measuring points in the target room, and position was estimated with a regression model built from the training data. The model achieved a root mean squared error (RMSE) of 1.01 m on the test data, which indicates that the proposed method is effective for indoor position estimation. In the direction estimation experiment, the estimation error was 78.81° on the test data, which shows that it is difficult to estimate the direction of the ultrasonic microphone used in the experiment.</p>
      </abstract>
      <kwd-group>
        <kwd>Environmental Ultrasound</kwd>
        <kwd>Spectrogram</kwd>
        <kwd>Convolutional Neural Network</kwd>
        <kwd>Deep Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Indoor position estimation techniques are being actively studied, but many of them require the installation of devices such as beacons. We think that information already present in an indoor environment makes low-cost indoor position estimation possible. In this research, we focus on environmental ultrasound as latent information in an indoor environment. Environmental ultrasound refers to a mixture of ultrasounds emitted from indoor devices such as air conditioning equipment and switching power supplies.</p>
      <p>
        Tsuchiya et al. have shown that it is possible to identify multiple rooms by using
environmental ultrasound [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. They converted the ultrasound into a spectrogram and analyzed it using a convolutional neural network (CNN). We extend their area classification method to a regression method within an area and propose a new method that estimates position and direction from environmental ultrasound. Experiments evaluating the proposed method indicate that it is effective for position estimation but not for direction estimation.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>Methods that use radio waves often rely on triangulation when the position of the transmitting source is known. Triangulation mainly uses the received signal strength or the time of flight between a transmitting source and a receiving terminal. When the position of the transmitting source is unknown, methods based on position fingerprinting are commonly used. Estimation with environmental ultrasound is likewise based on position fingerprinting.</p>
      <p>
        There are many methods that use Wi-Fi, which is already installed in many indoor
environments [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Methods using Bluetooth [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and using a
special wireless signal [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] require the installation of beacons and terminals, which can be costly.
      </p>
      <p>
        Tsuchiya et al. showed that environmental ultrasounds in rooms have unique
characteristics and demonstrated that analysis of ultrasounds could distinguish multiple
rooms with an accuracy of 97.5% [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Their work also reported that ultrasounds at
different locations in the same room had different characteristics. Therefore, we think
it is possible to perform indoor position estimation in one room by exploiting
environmental ultrasounds.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Proposed Method</title>
      <p>We propose a method to estimate the position and direction within a room using environmental ultrasound. The method is partially based on the work of Tsuchiya et al. but differs in that we estimate the coordinates and azimuth of the recording ultrasonic microphone using non-linear regression models. The regression models are built by a CNN from spectrograms of ultrasound. Since our method is a type of fingerprinting, we need to collect ultrasounds at various positions and directions to build the non-linear regression models for coordinates and azimuth.</p>
      <sec id="sec-3-1">
        <title>Spectrogram</title>
        <p>An example of a spectrogram is shown in Fig. 1. A spectrogram is a heat map representing the relationships among time, frequency, and intensity. In the spectrogram, the horizontal axis represents time, the vertical axis represents frequency, and the gray scale represents intensity. Short-term changes, such as beats in the fan noise of an air conditioner, can be important features for ultrasound fingerprinting. A spectrogram can capture such important features, which cannot be expressed by simple acoustic features such as Fourier coefficients computed over a single time period.</p>
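        <p>As a rough illustration of how a recording becomes a spectrogram restricted to the ultrasonic band, the following dependency-free sketch applies a Hann window and a naive discrete Fourier transform to overlapping frames, then keeps only the bins between 20 kHz and 125 kHz. The frame length, the hop size, and the function names are assumptions made for this example and are not taken from the experiments; a practical pipeline would more likely use an FFT library.</p>
        <preformat>
```python
import cmath
import math


def hann(n):
    """Return an n-point Hann window."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def spectrogram(samples, frame_len, hop):
    """Magnitude spectrogram via a naive DFT (illustration only, slow).

    Returns a list of frames; each frame holds frame_len // 2 + 1 magnitudes,
    one per frequency bin from 0 Hz up to the Nyquist frequency.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    # Precompute the DFT twiddle factors once.
    twiddle = [cmath.exp(-2j * math.pi * k / frame_len) for k in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = [samples[start + k] * window[k] for k in range(frame_len)]
        frames.append([
            abs(sum(chunk[k] * twiddle[(b * k) % frame_len]
                    for k in range(frame_len)))
            for b in range(n_bins)
        ])
    return frames


def band_rows(frames, sample_rate, frame_len, f_lo, f_hi):
    """Keep the bins whose centre frequency lies between f_lo and f_hi.

    Returns the index of the first kept bin and the cropped frames.
    """
    bin_hz = sample_rate / frame_len
    lo = math.ceil(f_lo / bin_hz)
    hi = math.floor(f_hi / bin_hz)
    return lo, [frame[lo:hi + 1] for frame in frames]


# Synthetic 40 kHz tone sampled at 250 kHz (the microphone's maximum rate).
SAMPLE_RATE = 250_000
FRAME_LEN = 100   # hypothetical frame size: 2.5 kHz per bin
HOP = 50          # hypothetical hop size
tone = [math.sin(2 * math.pi * 40_000 * t / SAMPLE_RATE) for t in range(1000)]

spec = spectrogram(tone, FRAME_LEN, HOP)
first_bin, ultra = band_rows(spec, SAMPLE_RATE, FRAME_LEN, 20_000, 125_000)

# Locate the strongest bin of the first frame and convert it back to Hz.
peak_bin = max(range(len(ultra[0])), key=lambda b: ultra[0][b])
peak_hz = (first_bin + peak_bin) * SAMPLE_RATE / FRAME_LEN
print(len(spec), len(ultra[0]), peak_hz)
```
        </preformat>
        <p>With these assumed settings the synthetic tone yields 19 frames of 43 ultrasonic bins each, and the strongest bin of the first frame sits at 40 kHz. Stacking the cropped frames side by side gives the time-frequency image that a CNN could take as input.</p>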
      </sec>
      <sec id="sec-3-2">
        <title>Convolutional Neural Network (CNN)</title>
        <p>A CNN is a deep learning algorithm that achieves high accuracy in image recognition. A CNN extracts features from an image using convolution layers and pooling layers, and the pooling layers make the extracted features tolerant of small misalignments.</p>
        <p>We conducted an experiment to evaluate the effectiveness of position and direction estimation by the proposed method. This section describes the experiment and its results.</p>
        <p>We collected ultrasounds in one room. Each recorded environmental ultrasound was converted into a spectrogram, and pairs of a spectrogram and its recording position were accumulated into a dataset. We divided the dataset into training data and test data. Non-linear regression models of the positions were built with a CNN, which learned the features of the spectrogram at each position in the training data. The recording positions of the test data were then estimated using the models, and the estimation error was evaluated.</p>
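        <p>To illustrate how a convolution layer followed by a pooling layer tolerates small misalignments, the sketch below applies one hand-written kernel and one max-pooling step to two tiny images that differ only by a one-pixel shift. It is not the network used in this study: the kernel, the image size, and the pooling size are hypothetical values chosen to keep the arithmetic visible.</p>
        <preformat>
```python
def conv2d_valid(image, kernel):
    """2-D cross-correlation without padding, the operation a CNN layer applies."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature, size):
    """Non-overlapping max pooling over size-by-size cells."""
    rows = len(feature) // size
    cols = len(feature[0]) // size
    return [
        [
            max(feature[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def image_with(n, pixels):
    """Return an n-by-n image of zeros with the given (row, col) pixels set to 1."""
    img = [[0] * n for _ in range(n)]
    for r, c in pixels:
        img[r][c] = 1
    return img


# Hypothetical 2x2 kernel that responds to a short diagonal stroke.
KERNEL = [[1, 0],
          [0, 1]]

original = image_with(7, [(0, 0), (1, 1)])
shifted = image_with(7, [(0, 1), (1, 2)])   # same stroke, one pixel to the right

conv_original = conv2d_valid(original, KERNEL)
conv_shifted = conv2d_valid(shifted, KERNEL)

pooled_original = max_pool(conv_original, 3)
pooled_shifted = max_pool(conv_shifted, 3)
print(pooled_original == pooled_shifted)
```
        </preformat>
        <p>The two convolution outputs differ, but the pooled maps are identical because the shifted response stays inside the same pooling cell. A larger shift that crosses a cell boundary would change the pooled map, so the tolerance is limited to small displacements.</p>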
      </sec>
      <sec id="sec-3-3">
        <title>Experimental Environment</title>
        <p>Ultrasounds were collected in a room measuring 5.93 m long × 4.73 m wide × 2.60 m high. The room has 0.3 m × 0.4 m pillars at its four corners, which are indicated by black rectangles in Fig. 2(a).</p>
        <p>A Dodotronic Ultramic 250k ultrasonic microphone was used as the recording
device. Since the maximum sampling rate of the microphone is 250 kHz, ultrasounds
can be recorded up to 125 kHz. The microphone was connected to a notebook-type
personal computer (PC) which captured and saved ultrasounds. The PC was fixed at the
center of the room to reduce the influence of any ultrasound generated by the PC.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Experimental Conditions</title>
        <p>We set 20 measuring points in the room, located at the intersections of a 1.2 m × 1.2 m grid and indicated by dots in Fig. 2(a), and recorded ultrasound at each of them. To evaluate the influence of recording direction, recordings in four directions were collected at each measuring point, as shown in Fig. 2(a). The dataset that contains the data for all four directions is called dataset R-all.</p>
        <p>Each recording was two seconds long, and we recorded ultrasounds repeatedly at one-second intervals. In this experiment, we used only the sound data from 20 kHz to 125 kHz because the spectrum in the audible range is affected by sounds from daily human activities (e.g., human voices).</p>
        <p>To investigate the influence of recording direction on position estimation, we compared the estimation errors of the dataset for each direction with those of the dataset for all directions. We refer to the set of all data recorded with direction 0° as dataset R0, which contains 10,000 recordings. Similarly, the sets of data recorded with directions 90°, 180°, and 270° are referred to as datasets R1, R2, and R3, respectively. For dataset R-all, 200 recordings were collected in each direction at each point, giving 800 recordings per position and 16,000 recordings in total. We used 80% of the data in each direction at each point as training data and the remaining 20% as test data.</p>
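        <p>The 80%/20% division within each direction at each point can be sketched as a stratified split. The record layout, the function name, the shuffling step, and the fixed seed below are assumptions for illustration; how individual recordings were assigned to the training or test data is not specified here.</p>
        <preformat>
```python
import random


def stratified_split(records, train_fraction, seed):
    """Split records into train and test sets within each (point, direction) group."""
    groups = {}
    for rec in records:
        groups.setdefault((rec["point"], rec["direction"]), []).append(rec)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for key in sorted(groups):
        items = list(groups[key])
        rng.shuffle(items)  # assumption: random assignment within each group
        cut = round(len(items) * train_fraction)
        train.extend(items[:cut])
        test.extend(items[cut:])
    return train, test


# Placeholder records shaped like dataset R-all:
# 20 measuring points x 4 directions x 200 recordings.
records = [
    {"point": p, "direction": d, "index": k}
    for p in range(20)
    for d in (0, 90, 180, 270)
    for k in range(200)
]

train, test = stratified_split(records, train_fraction=0.8, seed=0)
print(len(records), len(train), len(test))
```
        </preformat>
        <p>For the placeholder records this gives 12,800 training items and 3,200 test items out of 16,000, with 160 and 40 items respectively in every combination of point and direction.</p>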
        <p>[Fig. 2(a): Position Estimation. Floor plan of the room with both axes in meters, showing the measuring points (dots), the PC, the window, and the door.]</p>
        <p>Root mean squared error (RMSE) is used as a quantitative measure of estimation error. RMSE summarizes the distance between the ground-truth coordinates and the estimated coordinates. Distances are given in meters in this research.</p>
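        <p>In this experiment, we built two non-linear regression models, one estimating the X coordinate and one estimating the Y coordinate. We define the RMSE of the X and Y coordinates independently as RMSE<sub>x</sub> and RMSE<sub>y</sub> by (1) and (2), where the ground-truth and estimated coordinates of data item <italic>i</italic> are <inline-formula><tex-math>(x_i, y_i)</tex-math></inline-formula> and <inline-formula><tex-math>(\hat{x}_i, \hat{y}_i)</tex-math></inline-formula>, respectively, and the number of data items is <italic>n</italic>. The RMSE between the ground-truth positions and the estimated positions is defined by (3).</p>
        <p>
          <disp-formula><label>(1)</label><tex-math>\mathrm{RMSE}_x = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{x}_i - x_i\right)^2}</tex-math></disp-formula>
          <disp-formula><label>(2)</label><tex-math>\mathrm{RMSE}_y = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}</tex-math></disp-formula>
          <disp-formula><label>(3)</label><tex-math>\mathrm{RMSE} = \sqrt{\mathrm{RMSE}_x^2 + \mathrm{RMSE}_y^2}</tex-math></disp-formula>
        </p>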
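        <p>The definitions in (1) to (3) translate directly into a few lines of code. The sketch below is a minimal illustration with made-up coordinates; the function names are ours and the numbers are not experimental data.</p>
        <preformat>
```python
import math


def rmse_1d(truth, estimate):
    """Root mean squared error along one axis, as in (1) and (2)."""
    n = len(truth)
    return math.sqrt(sum((e - t) ** 2 for t, e in zip(truth, estimate)) / n)


def rmse_2d(truth_xy, estimate_xy):
    """Return (RMSEx, RMSEy, RMSE), combining the two axes as in (3)."""
    rmse_x = rmse_1d([p[0] for p in truth_xy], [p[0] for p in estimate_xy])
    rmse_y = rmse_1d([p[1] for p in truth_xy], [p[1] for p in estimate_xy])
    return rmse_x, rmse_y, math.sqrt(rmse_x ** 2 + rmse_y ** 2)


# Made-up coordinates in meters, chosen so the arithmetic is easy to follow:
# every estimate is off by 3 m in X and 4 m in Y.
truth = [(0, 0), (1, 0)]
estimate = [(3, 4), (4, 4)]
rmse_x, rmse_y, rmse = rmse_2d(truth, estimate)
print(rmse_x, rmse_y, rmse)
```
        </preformat>
        <p>Here RMSEx is 3 m, RMSEy is 4 m, and the combined RMSE is 5 m. Because (3) adds the two mean squared errors under one square root, the combined value equals the root of the mean squared Euclidean distance between the ground-truth and estimated positions.</p>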
      </sec>
      <sec id="sec-3-5">
        <title>Result</title>
        <p>Fig. 3(a) shows the transition of RMSE for dataset R-all. The horizontal and vertical axes of the graph represent the number of learning epochs and the RMSE, respectively. The solid and dotted lines in Fig. 3(a) indicate the RMSE for the training data and the test data, respectively. At epoch 100, the RMSE of the training data was 0.15 m, and that of the test data was 1.01 m.</p>
        <p>Fig. 3(b) shows the RMSE for each dataset. The RMSE values of datasets R0, R1, R2, and R3, each of which includes data for one direction, are lower than the RMSE value of dataset R-all, which includes the data for all four directions.</p>
        <p>[Fig. 3: (a) RMSE (m) against epoch for the training and test data; (b) test RMSE (m) for datasets R0, R1, R2, R3, and R-all.]</p>
        <p>In the experiment using dataset R-all, which mixes the recordings from all four directions, the RMSE for the training data was about 0.15 m. This result suggests that the number of learning epochs was sufficient and that the environmental ultrasound had different characteristics at each position.</p>
        <p>However, the RMSE for the test data fluctuated between 1 m and 1.5 m, as shown in Fig. 3(a). Moreover, the estimation performance for the training data was higher than that for the test data. These observations suggest that the non-linear regression model overfit the training data.</p>
        <p>The RMSE values of R0, R1, R2, and R3, which use only unidirectional data, are lower than that of R-all. This result suggests that the recording direction affects position estimation accuracy. If orientation information were available, for example from a magnetic field sensor in a smartphone, the accuracy of the proposed method and its robustness to microphone direction could be improved.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Direction Estimation</title>
      <p>The results of position estimation indicate that the recording direction of the microphone affects position estimation accuracy. In this section, we describe a direction estimation experiment using the proposed method.</p>
      <sec id="sec-4-1">
        <title>Experimental procedure</title>
        <p>We followed the same procedure as in the experiment described in Section 4. The same microphone and room as in the position estimation experiment were used for the recording. In this experiment, we estimated the indoor direction using a non-linear regression model.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Experimental Conditions</title>
        <p>Ultrasounds were recorded in 16 directions from the center of the room, as shown in Fig. 2(b). An ultrasonic microphone was placed on a desk at the center of the room. The recording angle was varied from 0° to 337.5° in increments of 22.5°.</p>
        <p>The dataset contains 1,000 items of data in each direction, for a total of 16,000 items. We used 80% of the data in each direction as training data and the remaining 20% as test data.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Result</title>
        <p>Fig. 4 shows the transition of the estimation error. The horizontal axis of the graph represents the number of learning epochs, and the vertical axis represents the estimation error in degrees. The solid and dotted lines in the graph indicate the estimation error for the training data and the test data, respectively. At epoch 100, the estimation error for the test data was 78.81°.</p>
        <p>[Fig. 4: Estimation error (degree) against epoch for the training and test data.]</p>
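        <p>Because directions are angles, an error measure has to decide how to treat the wraparound at 360°. The sketch below shows one common choice, the smallest absolute angle between two headings. This is an assumption for illustration: the error definition behind the reported 78.81° is not stated, and the function names are ours.</p>
        <preformat>
```python
def circular_error_deg(truth, estimate):
    """Smallest absolute angle between two headings, in degrees (0 to 180)."""
    diff = abs(truth - estimate) % 360.0
    return min(diff, 360.0 - diff)


def mean_circular_error_deg(truths, estimates):
    """Mean of the per-sample circular errors."""
    errors = [circular_error_deg(t, e) for t, e in zip(truths, estimates)]
    return sum(errors) / len(errors)


# The 16 recording directions: 0 to 337.5 degrees in 22.5-degree steps.
DIRECTIONS = [22.5 * k for k in range(16)]

# Reference point: an estimator that always answers 0 degrees,
# evaluated once against each of the 16 true directions.
constant_guess_error = mean_circular_error_deg(DIRECTIONS, [0.0] * len(DIRECTIONS))

print(circular_error_deg(0.0, 337.5), constant_guess_error)
```
        </preformat>
        <p>Under this definition, headings of 0° and 337.5° are 22.5° apart rather than 337.5°, and an estimator that ignores its input and always answers 0° has a mean error of 90° over the 16 directions. That figure is only an indicative reference for the measured error, since the two may not use the same definition.</p>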
      </sec>
      <sec id="sec-4-4">
        <title>Discussion and Conclusion</title>
        <p>The direction estimation error for the test data was about 80°. In the position estimation experiment, Fig. 3(b) shows that environmental ultrasound recorded at 90° intervals has different features. If the estimated direction were used to specify the recording direction of the microphone for position estimation, an error of about 90° would not be accurate enough to improve on the R-all result. Further investigation is needed to clarify the cause of this error.</p>
        <p>This paper proposed and evaluated a method to estimate indoor position and direction within a room based on environmental ultrasound. For position estimation, the results suggest that environmental ultrasound can be used to estimate a position within one room and that estimation accuracy can be improved by also using orientation information. For direction estimation, it was difficult to estimate the direction of the ultrasonic microphone used in the experiment.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>Tatsuro Tsuchiya</string-name>,
          <string-name>Takeshi Umezawa</string-name>, and
          <string-name>Noritaka Osawa</string-name>, “<article-title>An Indoor Area Estimation Method Analyzing Spectrograms of Environmental Ultrasounds by Convolutional Neural Network</article-title>”,
          <source>2018 Ubiquitous Positioning, Indoor Navigation and Location-Based Services (UPINLBS)</source>,
          DOI: <pub-id pub-id-type="doi">10.1109/UPINLBS.2018.8559701</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>Duc V. Le</string-name> and
          <string-name>Paul J.M. Havinga</string-name>, “<article-title>SoLoc: Self-organizing indoor localization for unstructured and dynamic environments</article-title>”,
          <source>2017 International Conference on Indoor Positioning and Indoor Navigation (IPIN)</source>,
          DOI: <pub-id pub-id-type="doi">10.1109/IPIN.2017.8115900</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>Viet-Cuong Ta</string-name>,
          <string-name>Trung-Kien Dao</string-name>,
          <string-name>Dominique Vaufreydaz</string-name>, and
          <string-name>Eric Castelli</string-name>, “<article-title>Smartphone-Based User Positioning in a Multiple-User Context with Wi-Fi and Bluetooth</article-title>”,
          <source>2018 International Conference on Indoor Positioning and Indoor Navigation (IPIN)</source>,
          DOI: <pub-id pub-id-type="doi">10.1109/IPIN.2018.8533809</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>Brieuc Berruet</string-name>,
          <string-name>Oumaya Baala</string-name>,
          <string-name>Alexandre Caminada</string-name>, and
          <string-name>Valery Guillet</string-name>, “<article-title>A Deep Learning Based CSI Fingerprinting Indoor Localization in IoT Context</article-title>”,
          <source>2018 International Conference on Indoor Positioning and Indoor Navigation (IPIN)</source>,
          DOI: <pub-id pub-id-type="doi">10.1109/IPIN.2018.8533777</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>Xingli Gan</string-name>,
          <string-name>Baoguo Yu</string-name>,
          <string-name>Lu Huang</string-name>, and
          <string-name>Yaning Li</string-name>, “<article-title>Deep learning for weights training and indoor positioning using multi-sensor fingerprint</article-title>”,
          <source>2017 International Conference on Indoor Positioning and Indoor Navigation (IPIN)</source>,
          DOI: <pub-id pub-id-type="doi">10.1109/IPIN.2017.8115923</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>Xiaoyun Zhou</string-name>,
          <string-name>Jie Wei</string-name>,
          <string-name>Fang Zhao</string-name>,
          <string-name>Haiyong Luo</string-name>, and
          <string-name>Langlang Ye</string-name>, “<article-title>A shop-level location algorithm based on CNN for crowdsourcing fingerprint</article-title>”,
          <source>2018 Ubiquitous Positioning, Indoor Navigation and Location-Based Services (UPINLBS)</source>,
          DOI: <pub-id pub-id-type="doi">10.1109/UPINLBS.2018.8559873</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>Maani Ghaffari Jadidi</string-name>,
          <string-name>Mitesh Patel</string-name>,
          <string-name>Jaime Valls Miro</string-name>,
          <string-name>Gamini Dissanayake</string-name>,
          <string-name>Jacob Biehl</string-name>, and
          <string-name>Andreas Girgensohn</string-name>, “<article-title>A Radio-Inertial Localization and Tracking System with BLE Beacons Prior Maps</article-title>”,
          <source>2018 International Conference on Indoor Positioning and Indoor Navigation (IPIN)</source>,
          DOI: <pub-id pub-id-type="doi">10.1109/IPIN.2018.8533827</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>Tomofumi Takayama</string-name>,
          <string-name>Takeshi Umezawa</string-name>,
          <string-name>Nobuyoshi Komuro</string-name>, and
          <string-name>Noritaka Osawa</string-name>, “<article-title>An Indoor Positioning Method Based on Regression Models with Compound Location Fingerprints</article-title>”,
          <source>2018 Ubiquitous Positioning, Indoor Navigation and Location-Based Services (UPINLBS)</source>,
          DOI: <pub-id pub-id-type="doi">10.1109/UPINLBS.2018.8559728</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>A. De-La-Llana-Calvo</string-name>,
          <string-name>J. L. Lázaro-Galilea</string-name>,
          <string-name>A. Gardel-Vicente</string-name>,
          <string-name>D. Rodríguez-Navarro</string-name>, and
          <string-name>I. Bravo-Munoz</string-name>, “<article-title>Characterization of Multipath Effects in Indoor Positioning Systems Based on Infrared Signals</article-title>”,
          <source>2018 International Conference on Indoor Positioning and Indoor Navigation (IPIN)</source>,
          DOI: <pub-id pub-id-type="doi">10.1109/IPIN.2018.8533816</pub-id>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>