<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Self-Synchronizing Acoustic Positioning System Based on TDoA</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Shuai Cao</string-name>
          <email>caoshuai@ustc.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ya P. Liu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jian Xu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sheng B. Pei</string-name>
          <email>shengbingpei@ahu.edu.cn</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institutes of Physical Science and Information Technology, Anhui University</institution>
          ,
          <addr-line>Hefei 230039</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Computer Science and Technology, Anhui University</institution>
          ,
          <addr-line>Hefei 230039</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <fpage>5</fpage>
      <lpage>7</lpage>
      <abstract>
        <p>This paper proposes a self-synchronizing acoustic positioning system built from low-cost acoustic devices. The system comprises a master base station and multiple slave base stations. During a positioning cycle, the master base station first transmits an audio signal that serves both synchronization and positioning. After detecting this signal, each slave base station waits for a preset delay and then transmits its own positioning audio signal. Finally, the target estimates its own position by detecting the arrival times of the audio signals transmitted by the master base station and each slave base station. The proposed system achieves self-synchronization from known information, namely the distances between base stations and the delay of each slave base station. After synchronization, the arrival times of the audio from each base station, as measured by the target, are used for position estimation. Compared with previous synchronization methods, the adopted self-synchronization method avoids complicated wiring and radio interference and greatly reduces the cost of synchronization. The simulation results show that, under the test conditions, the localization accuracy of the proposed system is better than 1 m when the detection noise level of the target is less than 2.5 ms.</p>
      </abstract>
      <kwd-group>
        <kwd>acoustic positioning</kwd>
        <kwd>self-synchronizing</kwd>
        <kwd>master base station</kwd>
        <kwd>slave base station</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Location-aware technology has important applications in smart cities, the Internet of Things,
medical monitoring, etc. Based on the type of signal used for positioning, location-aware technology
is mainly divided into radio, motion-signal, geomagnetic, image, and audio approaches. In radio
positioning, Bluetooth and WiFi methods based on received signal strength indication (RSSI) usually
rely on signal-attenuation-model ranging or fingerprinting. These methods offer low (meter-level)
positioning accuracy [1]-[2], require a large workload for offline fingerprint collection and
updating, and use a signal attenuation model that is highly susceptible to environmental interference.
Angle of arrival (AoA) based on antenna arrays [3], WiFi-based round-trip time (RTT) [4]-[6], channel
state information (CSI) [7], and ranging-based UWB technology [8] can provide centimeter-level to
submeter-level positioning, but wireless base stations and smart terminals must support the related
protocols, the deployment complexity is high, and wide-area coverage is extremely costly. Pedestrian
dead reckoning (PDR) based on motion signals [9] requires no additional infrastructure and is
compatible with mobile phones, but its errors accumulate, so accurate positioning cannot be
maintained for a long time. Geomagnetic-based positioning [10] can use the mobile phone's magnetic
sensor to achieve positioning without additional infrastructure but requires offline collection of
the indoor geomagnetic field distribution. Image-based localization [11] can achieve high-precision
localization of the target, but it requires an image feature library to be established in advance.
In addition, the hardware cost is high, the calculation is complex, and the performance is easily
affected by environmental textures and shooting conditions.</p>
      <p>2022 Copyright for this paper by its authors.</p>
      <p>Audio-based positioning technology [12] uses acoustic devices such as microphones and
speakers to achieve positioning. To support high concurrency, acoustic positioning systems (APS)
usually adopt a passive architecture, in which the base stations only transmit audio and the targets
receive and process it. This architecture has four characteristics. 1) Low system-building
complexity: since audio propagates much more slowly than radio, high-precision acoustic positioning
places only modest requirements on clock synchronization accuracy. 2) Low system cost: on the one
hand, commercial audio components are cheap, so although acoustic positioning requires the
deployment of acoustic base stations, location service providers can be expected to offer
high-precision location services with a low-cost infrastructure investment. Furthermore, if smart
terminals are positioned with the help of the broadcast speakers already installed in large shopping
malls, stations, and airports, no additional base stations need to be deployed. On the other hand,
microphones and speakers are standard on hand-held mobile terminals, so users incur no additional
cost. 3) Good privacy: the targets position themselves only by receiving signals, without exchanging
information with the acoustic base stations. Without the user's permission, the user's location
cannot be obtained by others, which suits occasions with high privacy requirements. 4) Suitability
for environments with severe electromagnetic interference and metal objects: compared with radio
signals, audio signals are neither disturbed by electromagnetic waves nor affected by metal
objects.</p>
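      <p>A rough worked example, not taken from the original analysis, illustrates the first characteristic. It assumes a sound speed of 340 m/s (the value used later in the simulations), a radio propagation speed of about 3 × 10^8 m/s, and a hypothetical clock error of 1 ms. The ranging error caused by a timing error is the propagation speed multiplied by that error:</p>
      <disp-formula><tex-math>\Delta d_{\mathrm{audio}} = 340\ \mathrm{m/s} \times 10^{-3}\ \mathrm{s} = 0.34\ \mathrm{m}, \qquad \Delta d_{\mathrm{radio}} \approx 3 \times 10^{8}\ \mathrm{m/s} \times 10^{-3}\ \mathrm{s} = 3 \times 10^{5}\ \mathrm{m}</tex-math></disp-formula>
      <p>Under these assumptions, the same timing error costs an acoustic system about a third of a meter, whereas for a radio system it corresponds to hundreds of kilometers, which is why acoustic positioning tolerates far coarser clock synchronization.</p>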
      <p>Because synchronization between base stations and targets is not required, time difference of
arrival (TDoA) is a common measurement type used in high-precision acoustic positioning systems.
Accurate acquisition of TDoA is the key to ensuring the performance of the acoustic positioning
system, which not only requires the target to accurately detect the audio arrival time but also requires
accurate synchronization between acoustic base stations. The acoustic positioning systems involved in
previous studies mainly use wireless or wired methods to achieve synchronization [13]-[16]. The
wireless method uses radio to achieve synchronization between the base stations, which may cause
radio interference. The wired method uses connection lines to achieve synchronization between the
base stations, which is difficult to deploy. Both synchronization methods are costly.</p>
      <p>In this paper, a novel self-synchronizing acoustic positioning system is proposed, which consists of
a master base station and multiple slave base stations. In a positioning cycle, the master base station
first transmits an audio signal with synchronization and positioning functions. Then, each slave base
station delays the set time after detecting the audio signal transmitted by the master base station and
starts to transmit the positioning audio signal. Finally, the target realizes its real-time position
estimation by detecting the arrival time of the audio signals transmitted by the master base station and
each slave base station. The proposed acoustic positioning system utilizes the acoustic devices
themselves to realize self-synchronization. Compared with the previous synchronization methods,
complicated wiring and radio interference are avoided, and the synchronization cost is greatly reduced.</p>
      <p>The rest of the paper is organized as follows. The basic principle of the proposed self-synchronizing
acoustic positioning system is presented in Section 2. The numerical simulations are described in
Section 3, and a conclusion is drawn in Section 4.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Self-synchronizing Acoustic Positioning</title>
      <p>The basic principle described below takes two-dimensional positioning as an example and is also
applicable to three-dimensional situations. As shown in Figure 1, it is assumed that there are four
acoustic base stations in the positioning area, which are represented by A, B, C, and D respectively.
The coordinates of the acoustic base stations are known, and each acoustic base station contains a
speaker and a microphone. To ensure high concurrency, the target only uses its microphone to
passively receive audio signals and realizes position calculation by detecting the arrival time of the
audio transmitted by each acoustic base station.</p>
      <p>The proposed self-synchronizing acoustic positioning system consists of a master base station and
multiple slave base stations. In a positioning cycle, the master base station first transmits an audio
signal that is received by the target and by each slave base station. Then, after detecting the arrival
of the audio signal transmitted by the master base station, each slave base station waits for its
preset delay and then emits its positioning audio signal. Finally, the target detects the times of
arrival of the audio transmitted by all base stations, calculates the TDoA, and estimates its position
using a TDoA-based localization algorithm. With reference to the timing diagram in Figure 2, the
positioning process of the self-synchronizing acoustic positioning system is as follows:</p>
      <p>1. In a positioning period, such as 1 second, the master base station A transmits an audio signal
at time T<sub>A</sub>. To suppress background noise, the audio signal is modulated, for example as a
chirp signal or an orthogonal-code-modulated signal.</p>
      <p>2. Base stations B, C, and D receive the audio signal transmitted by base station A and detect its
arrival times, which are denoted by R<sub>B</sub>, R<sub>C</sub>, and R<sub>D</sub>, respectively.
The target also detects the arrival time of the audio signal transmitted by base station A, which is
denoted by R<sub>AT</sub>. In Figure 2, t<sub>AB</sub>, t<sub>AC</sub>, and t<sub>AD</sub> denote the
time taken for the audio transmitted by base station A to reach base stations B, C, and D,
respectively, so that:</p>
      <disp-formula><label>(1)</label><tex-math>\begin{cases} t_{AB} = R_{B} - T_{A} \\ t_{AC} = R_{C} - T_{A} \\ t_{AD} = R_{D} - T_{A} \end{cases}</tex-math></disp-formula>
      <p>3. After a delay of t<sub>delayB</sub> from moment R<sub>B</sub>, namely at moment T<sub>B</sub>,
base station B transmits its positioning audio signal, which is then received by the target at moment
R<sub>BT</sub>. t<sub>BT</sub> is the flight time for the positioning audio signal transmitted by base
station B to reach the target. Similarly, after a delay of t<sub>delayC</sub> from moment
R<sub>C</sub>, namely at moment T<sub>C</sub>, base station C transmits its positioning audio signal,
which is then received by the target at moment R<sub>CT</sub>. t<sub>CT</sub> is the flight time for
the positioning audio signal transmitted by base station C to reach the target. Base station D is
handled in the same way, and the corresponding time parameters are t<sub>delayD</sub>,
T<sub>D</sub>, R<sub>DT</sub>, and t<sub>DT</sub>.</p>
      <p>4. According to Figure 2, formula (2) can be obtained, where t<sub>AT</sub> is the flight time
for the audio signal transmitted by base station A to reach the target:</p>
      <disp-formula><label>(2)</label><tex-math>\begin{cases} R_{BT} - R_{AT} = t_{AB} + t_{\mathrm{delay}B} + t_{BT} - t_{AT} \\ R_{CT} - R_{AT} = t_{AC} + t_{\mathrm{delay}C} + t_{CT} - t_{AT} \\ R_{DT} - R_{AT} = t_{AD} + t_{\mathrm{delay}D} + t_{DT} - t_{AT} \end{cases}</tex-math></disp-formula>
      <p>Formula (3) is derived from formula (2):</p>
      <disp-formula><label>(3)</label><tex-math>\begin{cases} t_{BT} - t_{AT} = (R_{BT} - R_{AT}) - (t_{AB} + t_{\mathrm{delay}B}) \\ t_{CT} - t_{AT} = (R_{CT} - R_{AT}) - (t_{AC} + t_{\mathrm{delay}C}) \\ t_{DT} - t_{AT} = (R_{DT} - R_{AT}) - (t_{AD} + t_{\mathrm{delay}D}) \end{cases}</tex-math></disp-formula>
      <p>Multiplying both sides of formula (3) by the speed of sound v (unit: m/s), and noting that each
flight time multiplied by v is the corresponding distance, gives formula (4):</p>
      <disp-formula><label>(4)</label><tex-math>\begin{cases} d_{BT} - d_{AT} = v \cdot (R_{BT} - R_{AT}) - (d_{AB} + v \cdot t_{\mathrm{delay}B}) \\ d_{CT} - d_{AT} = v \cdot (R_{CT} - R_{AT}) - (d_{AC} + v \cdot t_{\mathrm{delay}C}) \\ d_{DT} - d_{AT} = v \cdot (R_{DT} - R_{AT}) - (d_{AD} + v \cdot t_{\mathrm{delay}D}) \end{cases}</tex-math></disp-formula>
      <p>where d<sub>AT</sub>, d<sub>BT</sub>, d<sub>CT</sub>, and d<sub>DT</sub> are the distances between
the target and base stations A, B, C, and D, respectively, and d<sub>AB</sub>, d<sub>AC</sub>, and
d<sub>AD</sub> are the distances between base station A and base stations B, C, and D, respectively.
R<sub>AT</sub>, R<sub>BT</sub>, R<sub>CT</sub>, and R<sub>DT</sub> are the arrival times extracted by
the target from the received audio signal using the audio detection algorithm [17]. On the right side
of formula (4), d<sub>AB</sub>, d<sub>AC</sub>, d<sub>AD</sub>, v, t<sub>delayB</sub>,
t<sub>delayC</sub>, and t<sub>delayD</sub> are known. The terms (R<sub>BT</sub> - R<sub>AT</sub>),
(R<sub>CT</sub> - R<sub>AT</sub>), and (R<sub>DT</sub> - R<sub>AT</sub>) are time intervals measured
by the target, obtained from the number of sampling points between detections and the sampling
period. Therefore, the distance differences on the left side of formula (4) are determined. That is,
with base station A as the reference point, the distance difference (d<sub>BT</sub> - d<sub>AT</sub>)
between the target and base stations B and A, the distance difference
(d<sub>CT</sub> - d<sub>AT</sub>) between the target and base stations C and A, and the distance
difference (d<sub>DT</sub> - d<sub>AT</sub>) between the target and base stations D and A are
obtained.</p>
      <p>5. Substitute the distance differences from formula (4) into the TDoA-based positioning
algorithm to estimate the target's position.</p>
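      <p>The bookkeeping in steps 2 to 4 can be sketched in Python. This is a minimal illustration, not the authors' implementation: the function name distance_differences, the station layout, the target position, and the delay values are all hypothetical, and the arrival times are generated without noise so that the output of formula (4) can be compared with the true geometry.</p>
      <preformatted><![CDATA[
```python
import math

V_SOUND = 340.0  # speed of sound in m/s, the value used in the paper's simulation


def distance_differences(r_at, r_slaves, d_master, delays, v=V_SOUND):
    """Apply formula (4): return d_iT - d_AT for each slave station i.

    r_at      -- arrival time of the master's signal at the target (s)
    r_slaves  -- {name: arrival time of that slave's signal at the target} (s)
    d_master  -- {name: known distance from master A to that slave} (m)
    delays    -- {name: preset transmit delay of that slave} (s)
    """
    return {
        name: v * (r_it - r_at) - (d_master[name] + v * delays[name])
        for name, r_it in r_slaves.items()
    }


# Hypothetical noise-free scenario, used only to exercise the function.
stations = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (10.0, 10.0), "D": (0.0, 10.0)}
target = (3.0, 4.0)
delays = {"B": 0.1, "C": 0.2, "D": 0.3}  # assumed preset delays (s)
t_a = 0.0  # master transmit time T_A

d_master = {k: math.dist(stations["A"], stations[k]) for k in delays}
r_at = t_a + math.dist(stations["A"], target) / V_SOUND
# Each slave hears A, waits its delay, then transmits; the target hears it later.
r_slaves = {
    k: t_a + d_master[k] / V_SOUND + delays[k] + math.dist(stations[k], target) / V_SOUND
    for k in delays
}

diffs = distance_differences(r_at, r_slaves, d_master, delays)
for name, value in diffs.items():
    print(name, round(value, 6))
```
]]></preformatted>
      <p>Note that the target never needs T<sub>A</sub> or the slave clocks: only the measured intervals, the known inter-station distances, and the preset delays enter the computation.</p>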
      <p>In each positioning cycle, the acoustic positioning system repeats the above steps to position
the target. Throughout the process, the target only receives the audio signals transmitted by the base
stations and does not interact with them, so the system supports high concurrency, i.e., the number
of users is not limited. The system achieves self-synchronization through the transmitted audio
signals themselves. There are no connection lines between the base stations, which simplifies
installation and layout, and the radio interference problem of traditional radio synchronization is
avoided. Furthermore, the proposed synchronization method is low cost because acoustic components
such as commercial loudspeakers and microphones are cheap.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Numerical Simulations</title>
      <p>To analyze the positioning performance of the proposed system, simulations were performed to
investigate the influence of the detection accuracy of the target on the position estimation accuracy.</p>
      <p>The factors affecting the positioning accuracy of the proposed system include the following.
(1) The errors in the arrival times of the master base station's audio signal as detected by the
slave base stations, that is, the errors of R<sub>B</sub>, R<sub>C</sub>, and R<sub>D</sub> in
Figure 2, denoted here by n<sub>B</sub>, n<sub>C</sub>, and n<sub>D</sub>, respectively. Since the
coordinates of the base stations are fixed, the paths between the master and slave base stations are
in the line-of-sight (LOS) state, and the acoustic channel states are stable, this noise will be at a
low level if the system is calibrated. (2) The errors of the delays t<sub>delayB</sub>,
t<sub>delayC</sub>, and t<sub>delayD</sub> caused by the system clock deviations of the slave base
stations. These errors are relatively small, so this simulation ignores them. (3) The detection
errors of the audio detection algorithm used by the target, corresponding to the four base stations
A, B, C, and D, denoted here by n<sub>AT</sub>, n<sub>BT</sub>, n<sub>CT</sub>, and n<sub>DT</sub>,
respectively. Due to the influence of strong multipath and non-line-of-sight (NLOS) propagation,
these errors are a major factor in degrading system performance, and this simulation mainly examines
them. According to Figure 2, the noisy audio arrival times R<sub>AT</sub>, R<sub>BT</sub>,
R<sub>CT</sub>, and R<sub>DT</sub> are obtained by formula (5).</p>
      <disp-formula><label>(5)</label><tex-math>\begin{cases} R_{AT} = T_{A} + \dfrac{d_{AT}}{v} + n_{AT} \\ R_{BT} = T_{A} + \dfrac{d_{AB}}{v} + n_{B} + t_{\mathrm{delay}B} + \dfrac{d_{BT}}{v} + n_{BT} \\ R_{CT} = T_{A} + \dfrac{d_{AC}}{v} + n_{C} + t_{\mathrm{delay}C} + \dfrac{d_{CT}}{v} + n_{CT} \\ R_{DT} = T_{A} + \dfrac{d_{AD}}{v} + n_{D} + t_{\mathrm{delay}D} + \dfrac{d_{DT}}{v} + n_{DT} \end{cases}</tex-math></disp-formula>
      <p>where d<sub>AT</sub>, d<sub>BT</sub>, d<sub>CT</sub>, and d<sub>DT</sub> are the distances between
the base stations and the target, and v is the speed of sound, which is set to 340 m/s.</p>
      <p>In this simulation, formula (5) is substituted into formula (4) to calculate the noisy distance
differences, which are then substituted into the TDoA-based combined weighted (COM-W) positioning
algorithm [18] to obtain the estimated position. The influence of the detection accuracy of the
target on the system positioning accuracy is evaluated through the localization error (LE) between
the true position P<sub>r</sub>(x<sub>r</sub>, y<sub>r</sub>) and the estimated position
P<sub>e</sub>(x<sub>e</sub>, y<sub>e</sub>). LE is the Euclidean distance, which is calculated using
(6).</p>
      <disp-formula><label>(6)</label><tex-math>LE = \sqrt{(x_{e} - x_{r})^{2} + (y_{e} - y_{r})^{2}}</tex-math></disp-formula>
      <p>To avoid the influence of a single abnormal noise sample, noise is usually added multiple times
to the measurement at a specific test point. Each time noise is added, the COM-W algorithm estimates
a position whose LE is obtained using (6), so repeated noise additions yield multiple LEs. The mean
positioning error (MPE) is calculated using (7).</p>
      <disp-formula><label>(7)</label><tex-math>MPE = \sqrt{\frac{\sum_{i=1}^{L} LE_{i}^{2}}{L}}</tex-math></disp-formula>
      <p>where L represents the total number of noise additions, and LE<sub>i</sub> is the LE of the
position estimated by the positioning algorithm under the i-th noise addition. In this simulation, L
is set to 1000.</p>
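      <p>The two error metrics can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes that formula (7) is the square root of the mean of the squared LEs, and the function names and the three estimated positions are made up for the example.</p>
      <preformatted><![CDATA[
```python
import math


def localization_error(true_pos, est_pos):
    """Formula (6): Euclidean distance between true and estimated positions."""
    return math.hypot(est_pos[0] - true_pos[0], est_pos[1] - true_pos[1])


def mean_positioning_error(errors):
    """Formula (7) as assumed here: square root of the mean of the squared LEs."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


true_pos = (1.0, 1.0)
estimates = [(1.3, 1.4), (1.0, 1.0), (0.4, 1.8)]  # made-up estimates, in meters
errors = [localization_error(true_pos, p) for p in estimates]
mpe = mean_positioning_error(errors)
print(errors, mpe)
```
]]></preformatted>
      <p>In the simulation described here, the list of errors would hold L = 1000 values per test point and noise level rather than three.</p>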
      <p>As shown in Figure 3, four base stations enclose a positioning area, and their coordinates are
A(0 m, 0 m), B(10 m, 0 m), C(10 m, 10 m), and D(0 m, 10 m), respectively. Three representative test
points were selected, namely TP1 (1.0 m, 1.0 m) near a base station, TP2 (1.0 m, 5.0 m) near the edge
of the positioning area, and TP3 (5.1 m, 5.1 m) near the central area. In formula (5), the detection
errors of the slave base stations (n<sub>B</sub>, n<sub>C</sub>, and n<sub>D</sub>) are assumed to be
Gaussian white noise with a mean of zero and a standard deviation of 0.1 millisecond (ms), and the
detection errors of the target (n<sub>AT</sub>, n<sub>BT</sub>, n<sub>CT</sub>, and n<sub>DT</sub>)
are assumed to be Gaussian white noise with a mean of zero and a standard deviation of σ (ms).</p>
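      <p>A Monte Carlo experiment of this kind can be sketched as follows. The sketch rests on several assumptions that are not in the source: the base stations are taken to form a 10 m square with B at (10 m, 0 m), the preset delays are arbitrary, a simple damped Gauss-Newton least-squares solver stands in for the COM-W algorithm [18], and only 200 noise realizations are used instead of 1000 to keep the run short. Its numbers therefore should not be expected to match Figure 4.</p>
      <preformatted><![CDATA[
```python
import math
import random

V = 340.0  # speed of sound (m/s), as in the paper
STATIONS = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (10.0, 10.0), "D": (0.0, 10.0)}
DELAYS = {"B": 0.1, "C": 0.2, "D": 0.3}  # assumed preset delays (s)
SIGMA_SLAVE = 0.1e-3  # slave detection noise std (s), from the paper


def measure(target, sigma, rng):
    """Formula (5) followed by formula (4): noisy differences d_iT - d_AT."""
    a = STATIONS["A"]
    r_at = math.dist(a, target) / V + rng.gauss(0.0, sigma)  # T_A taken as 0
    diffs = {}
    for k, delay in DELAYS.items():
        d_ak = math.dist(a, STATIONS[k])
        r_kt = (d_ak / V + rng.gauss(0.0, SIGMA_SLAVE) + delay
                + math.dist(STATIONS[k], target) / V + rng.gauss(0.0, sigma))
        diffs[k] = V * (r_kt - r_at) - (d_ak + V * delay)
    return diffs


def cost(p, diffs):
    """Sum of squared TDoA residuals at candidate position p."""
    a = STATIONS["A"]
    return sum((math.dist(p, STATIONS[k]) - math.dist(p, a) - m) ** 2
               for k, m in diffs.items())


def solve(diffs, start=(5.0, 5.0), iters=30):
    """Damped Gauss-Newton least squares on the residuals (stand-in, not COM-W)."""
    ax, ay = STATIONS["A"]
    x, y = start
    lam = 1e-3
    for _ in range(iters):
        hxx = hxy = hyy = gx = gy = 0.0
        da = max(math.hypot(x - ax, y - ay), 1e-9)
        for k, m in diffs.items():
            sx, sy = STATIONS[k]
            dk = max(math.hypot(x - sx, y - sy), 1e-9)
            res = dk - da - m
            jx = (x - sx) / dk - (x - ax) / da
            jy = (y - sy) / dk - (y - ay) / da
            hxx += jx * jx
            hxy += jx * jy
            hyy += jy * jy
            gx += jx * res
            gy += jy * res
        # Solve the damped 2x2 normal equations (H + lam*I) * step = -g.
        det = (hxx + lam) * (hyy + lam) - hxy * hxy
        dx = -((hyy + lam) * gx - hxy * gy) / det
        dy = -((hxx + lam) * gy - hxy * gx) / det
        if cost((x + dx, y + dy), diffs) < cost((x, y), diffs):
            x, y, lam = x + dx, y + dy, lam * 0.3  # accept step, relax damping
        else:
            lam *= 10.0  # reject step, increase damping
        if math.hypot(dx, dy) < 1e-9:
            break
    return x, y


def mpe(target, sigma, trials, rng):
    """Root of the mean squared localization error over repeated noise draws."""
    total = 0.0
    for _ in range(trials):
        est = solve(measure(target, sigma, rng))
        total += math.dist(est, target) ** 2
    return math.sqrt(total / trials)


rng = random.Random(0)
test_points = {"TP1": (1.0, 1.0), "TP2": (1.0, 5.0), "TP3": (5.1, 5.1)}
for name, tp in test_points.items():
    print(name, [round(mpe(tp, s * 1e-3, 200, rng), 3) for s in (0.5, 1.5, 2.5)])
```
]]></preformatted>
      <p>Each printed row lists the MPE in meters for target detection noise of 0.5, 1.5, and 2.5 ms at one test point. Swapping the solver for a different TDoA algorithm only requires replacing the solve function.</p>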
      <p>Figure 4 shows the relationship between the MPE at the 3 test points and the standard deviation of
the detection noise of the target. It can be seen that as the detection noise of the target increases, the
MPE also increases, i.e., the accuracy of the position estimation of the target decreases. At low noise
levels, the MPEs at the three test points are relatively close, but at high noise levels they differ
considerably, which is caused by the difference in geometric conditions [18]. From Figure 4, it can
also be seen that, under the current simulation conditions, the detection error of the target should
be less than 2.5 ms for the positioning accuracy of the self-synchronizing acoustic positioning
system to be better than 1 m.</p>
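      <p>As a rough sanity check on this threshold (an illustration, not part of the original simulation), a timing error can be converted into an equivalent range error using the sound speed of 340 m/s adopted in the simulation:</p>
      <disp-formula><tex-math>v \cdot \sigma = 340\ \mathrm{m/s} \times 2.5 \times 10^{-3}\ \mathrm{s} = 0.85\ \mathrm{m}</tex-math></disp-formula>
      <p>A 2.5 ms detection error thus corresponds to a range error of 0.85 m per measurement, which is of the same order as the 1 m accuracy bound. The resulting position error also depends on the TDoA differencing and on the station geometry.</p>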
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>This paper proposes a self-synchronizing acoustic positioning system. During a positioning period,
a master base station transmits an audio signal for synchronization and positioning; multiple slave
base stations receive this signal and then send positioning audio signals after their preset delays.
Finally, the target receives the audio signals transmitted by the master base station and all slave
base stations, detects their arrival times, and estimates its position. The proposed system does not
need to be synchronized by wire or radio, which reduces the cost of system layout and avoids radio
interference. The simulation results show that the proposed system can achieve positioning accuracy
better than 1 m if the detection error of the target is less than 2.5 ms.</p>
    </sec>
    <sec id="sec-5">
      <title>5. References</title>
      <p>[1] C. Wu, Z. Yang, Z. Zhou, Y. Liu, and M. Liu, “Mitigating Large Errors in WiFi-Based Indoor Localization for Smartphones,” IEEE Trans. Veh. Technol., vol. 66, no. 7, pp. 6246-6257, Jul. 2017.</p>
      <p>[2] J. Luo, Z. Zhang, C. Wang, C. Liu, and D. Xiao, “Indoor Multifloor Localization Method Based on WiFi Fingerprints and LDA,” IEEE Trans. Ind. Informat., vol. 15, no. 9, pp. 5225-5234, Sept. 2019.</p>
      <p>[3] X. Qiu, B. Wang, J. Wang, and Y. Shen, “AOA-Based BLE Localization with Carrier Frequency Offset Mitigation,” in Proc. IEEE Int. Conf. Commun. Workshops (ICC Workshops), 2020, pp. 1-5.</p>
      <p>[4] C. Gentner, M. Ulmschneider, I. Kuehner, and A. Dammann, “WiFi-RTT Indoor Positioning,” in Proc. IEEE/ION Position, Locat. Navig. Symp. (PLANS), 2020, pp. 1029-1035.</p>
      <p>[5] O. Hashem, M. Youssef, and K. A. Harras, “WiNar: RTT-based Sub-meter Indoor Localization using Commercial Devices,” in Proc. IEEE Int. Conf. Pervasive Comput. Commun. (PerCom), 2020, pp. 1-10.</p>
      <p>[6] H. Cao, Y. Wang, and J. Bi, “Smartphones: 3D Indoor Localization Using Wi-Fi RTT,” IEEE Commun. Lett., vol. 25, no. 4, pp. 1201-1205, Apr. 2021.</p>
      <p>[7] X. Wang, L. Gao, S. Mao, and S. Pandey, “CSI-Based Fingerprinting for Indoor Localization: A Deep Learning Approach,” IEEE Trans. Veh. Technol., vol. 66, no. 1, pp. 763-776, Jan. 2017.</p>
      <p>[8] F. Mazhar, M. G. Khan, and B. Sallberg, “Precise Indoor Positioning Using UWB: A Review of Methods, Algorithms and Implementations,” Wireless Pers. Commun., vol. 97, no. 3, pp. 4467-4491, Dec. 2017.</p>
      <p>[9] L. Ciabattoni, G. Foresi, A. Monteriu, L. Pepa, D. P. Pagnotta, L. Spalazzi, and F. Verdini, “Real time indoor localization integrating a model based pedestrian dead reckoning on smartphone and BLE beacons,” J. Ambient Intell. Humanized Comput., vol. 10, no. 1, pp. 1-12, Jan. 2019.</p>
      <p>[10] S.-C. Yeh, W.-H. Hsu, W.-Y. Lin, and Y.-F. Wu, “Study on an Indoor Positioning System Using Earth’s Magnetic Field,” IEEE Trans. Instrum. Meas., vol. 69, no. 3, pp. 865-872, Mar. 2020.</p>
      <p>[11] M. Zhao, M. Yan, and T. Li, “Vision-Based Positioning: Related Technologies, Applications, and Research Challenges,” in Proc. IEEE 9th Int. Conf. Software Eng. Serv. Sci. (ICSESS), 2018, pp. 531-535.</p>
      <p>[12] M. N. Liu, L. S. Cheng, K. Qian, J. L. Wang, J. Wang, and Y. H. Liu, “Indoor acoustic localization: a survey,” Hum.-Centric Comput. Inf. Sci., vol. 10, no. 1, Jan. 2020.</p>
      <p>[13] S. I. Lopes, J. M. N. Vieira, J. Reis, D. Albuquerque, and N. B. Carvalho, “Accurate smartphone indoor positioning using a WSN infrastructure and non-invasive audio for TDoA estimation,” Pervas. Mobile Comput., vol. 20, pp. 29-46, Jul. 2015.</p>
      <p>[14] J. Urena, A. Hernandez, J. J. Garcia, J. M. Villadangos, M. C. Perez, D. Gualda, F. J. Alvarez, and T. Aguilera, “Acoustic Local Positioning with Encoded Emission Beacons,” Proc. IEEE, vol. 106, no. 6, pp. 1042-1062, Jun. 2018.</p>
      <p>[15] P. Pajuelo, M. C. Perez, J. M. Villadangos, E. Garcia, D. Gualda, J. Urena, and A. Hernandez, “Implementation of indoor positioning algorithms using Android smartphones,” in Proc. IEEE 20th Conf. Emerging Technol. Factory Autom. (ETFA), 2015, pp. 1-4.</p>
      <p>[16] L. Zhang, M. L. Chen, X. H. Wang, and Z. Wang, “TOA Estimation of Chirp Signal in Dense Multipath Environment for Low-Cost Acoustic Ranging,” IEEE Trans. Instrum. Meas., vol. 68, no. 2, pp. 355-367, Feb. 2019.</p>
      <p>[17] S. Cao, X. Chen, X. Zhang, and X. Chen, “Effective Audio Signal Arrival Time Detection Algorithm for Realization of Robust Acoustic Indoor Positioning,” IEEE Trans. Instrum. Meas., vol. 69, no. 10, pp. 7341-7352, Oct. 2020.</p>
      <p>[18] S. Cao, X. Chen, X. Zhang, and X. Chen, “Combined Weighted Method for TDOA-Based Localization,” IEEE Trans. Instrum. Meas., vol. 69, no. 5, pp. 1962-1971, May 2020.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>