DoA and ToA Estimation Method of OFDM Signal Based on Cascaded Deep Neural Network

Chaofan Zheng

Shaoshuai Fan

fanss@bupt.edu.cn 1

Hui Tian

tianhui@bupt.edu.cn 1

Bin Ren

Ren Da

Zhenyu Zhang

zhangzhenyu1@datangmobile.cn 0

Shaohui

0 School of Electronic and Information Engineering, Beihang University, Beijing, China

1 State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China

2 State Key Laboratory of Wireless Mobile Communications, China Academy of Telecommunications Technology (CATT), Beijing, China

Accurate estimation of the direction of arrival (DoA) and time of arrival (ToA) is very important in many scenarios such as accurate positioning. However, it is challenging in environments with multipath propagation and noise. This paper proposes a DoA and ToA estimation method for OFDM signals based on a cascaded deep neural network (DNN) with a uniform grid array (UGA). In the proposed method, we use the channel state information (CSI) matrix as the network input rather than the correlation matrix. Simulation results show that the trained deep neural network achieves better estimation accuracy under multipath propagation and noisy interference than conventional DoA and ToA estimation methods.

Keywords: Direction of arrival, time of arrival, deep learning, convolutional neural network

1. Introduction

Direction of arrival (DoA) and time of arrival (ToA) of wireless signals are widely used in commercial and military fields, such as indoor positioning, underwater and aerial target tracking and monitoring, and intelligent robots. In these applications, it is often necessary to obtain both DoA and ToA. Estimation of DoA and ToA is relatively straightforward under high signal-to-noise ratio (SNR) conditions. However, in complex wireless environments where the transmitted signal is subject to fading and interference, the SNR is low and the received signal contains few effective components, so the estimation of ToA and DoA is extremely challenging.

In the past few years, many physically driven methods have been proposed to estimate DoA and ToA with high accuracy, including matrix pencil (MP), multiple signal classification (MUSIC), estimation of signal parameters via rotational invariance techniques (ESPRIT), the manifold separation technique, etc. In [1], the array manifold matrix was constructed by using the spatial characteristics of the uniform circular array (UCA) and the time diversity of OFDM subcarriers; a virtual spatial smoothing method was then designed to enhance the covariance matrix of the signal, and the MUSIC algorithm was used to estimate the DoA and ToA of the multipath signal. A 3-D matrix pencil method was proposed in [2], which decomposes the covariance matrix of the LTE signal by singular value decomposition and extracts DoA and ToA information from the obtained poles. In [3], an efficient approximate maximum likelihood algorithm was proposed, which alternately updates the DoA and time domain parameters.

2020 Copyright for this paper by its authors.

In recent years, with the continuous progress of artificial intelligence technology, the deep neural network (DNN) [4] has been widely used in image processing, speech recognition, pattern recognition and other fields [5]. In addition, research on DNNs has spread to communication areas such as signal processing and channel estimation [6]. DNNs have several advantages: a DNN extracts features layer by layer and combines lower-layer features to form higher-layer features, allowing for a distributed representation of data [7]; and the multiple hidden layers of a DNN have strong non-linear fitting capabilities, allowing for effective mapping of the relationship between inputs and outputs. Although training a DNN may take some time, the trained DNN computes quickly and produces output results rapidly. Therefore, the use of DNNs for DoA and ToA estimation is an attractive option.

DNN-based DoA and ToA estimation has been studied by many scholars. In [8], a fully connected neural network is used for DoA estimation to verify the robustness of DNNs under different signal-to-noise ratio (SNR) conditions. To improve the estimation accuracy, [9,10,11,12] regard the DNN as a high-performance filter; the filtering function is realized by learning the mapping between the clean covariance matrix and the noisy covariance matrix of low-angle-of-arrival radar signals. In [13], DoA estimation is modeled as an angle classification problem, and a recurrent neural network (RNN) is used to learn the mapping between the sampling covariance matrix and the angle. To increase the accuracy of neural-network DoA estimation under different SNRs, a cascaded neural network structure was proposed in [14]. The SNR was used as the input of the network, and the DoA network was selected according to the strength of the SNR. [15] proposed a deep learning-based framework for preamble detection and ToA estimation with high accuracy under multipath and noise interference. [16] presented a learning-based algorithm that estimates the ToA of radio frequency (RF) signals from channel frequency response (CFR) measurements for wireless localization applications. [17] proposed a convolutional neural network (CNN)-based method which can overcome the negative effect of false peaks in the block interleaved frequency division multiplexing (B-IFDM) structure.

The above overview shows the effectiveness of DNNs in the estimation of DoA and ToA, and their superiority over traditional methods in some conditions. Although much research has been done, the joint estimation of ToA and DoA based on neural networks has received little attention. Much work has used the covariance matrix of the received signal as input to the neural network, but this only estimates the DoA and is not very sensitive to changes in ToA. The CSI matrix, on the other hand, is rich in information, from which a neural network can learn to extract features. In this paper, a cascaded neural network structure is proposed to estimate the DoA and ToA of OFDM signals. The cascaded neural network consists of a filtering neural network and an estimation neural network. The filtering neural network performs signal enhancement on low-SNR CSI matrices to reduce noise. The estimation neural network provides high-accuracy estimation of DoA and ToA. The proposed cascaded neural network has higher accuracy than some other physically driven and data-driven methods.

The paper is organized as follows. Section 2 discusses the signal model for the uniform grid array (UGA) and the DNN structure. Section 3 introduces the structure and training strategy of the cascaded neural network. Simulation parameters and results are given in Section 4, and Section 5 concludes the paper.

2. System model

2.1. Signal model

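As shown in Fig. 1, a UGA with $H$ antennas is used to receive OFDM signals, and the distance between adjacent antennas is $0.5\lambda$, where $\lambda$ is the wavelength. We assume that the OFDM signal transmitted by the source impinges on the antenna array with different directions of arrival through the line-of-sight (LOS) and non-line-of-sight (NLOS) paths. The spatial response of the $h$-th antenna can be expressed as:

$$a_h(\theta, \varphi) = e^{j 2\pi f r_h \cos(\theta - \psi_h) \sin\varphi / c} \tag{1}$$

where $f$ is the carrier frequency of the transmitted signal and $c$ is the speed of light. $r_h$ and $\psi_h$ are the distance and angle between the $h$-th antenna and the origin of the coordinates, respectively. $\theta$ and $\varphi$ are the azimuth of arrival (AoA) and zenith of arrival (ZoA) of the incident signal, respectively.

[Figure 1: UGA receiving the OFDM signal via LOS and NLOS paths; antenna spacing d.]

Similar to [1], we can obtain the channel state information and then construct a CSI matrix in which the $t$-th column is the $t$-th CSI snapshot:

$$\mathbf{csi}(t) = [\mathrm{csi}_{11}(t), \mathrm{csi}_{21}(t), \ldots, \mathrm{csi}_{H1}(t), \mathrm{csi}_{12}(t), \ldots, \mathrm{csi}_{HK}(t)] \tag{2}$$

where $\mathrm{csi}_{hk}$ is the CSI of the $k$-th subcarrier on the $h$-th antenna. We can construct a design matrix that contains both DoA and ToA parameters by using this matrix. The channel impulse response at the center of the array is given by [3]:

$$h(t) = \sum_{l=1}^{L} \alpha_l \, \delta(t - \tau_l) \tag{3}$$

where $\tau_l$ is the time delay of the $l$-th path to the center of the array, and $\alpha_l$ is the gain of the $l$-th path. The discrete Fourier transform of $h(t)$ is the channel frequency response, and the $k$-th subcarrier of the CSI can be written as:

$$\mathrm{csi}_k = \sum_{l=1}^{L} \alpha_l e^{-j 2\pi f_k \tau_l} = \sum_{l=1}^{L} \alpha_l e^{-j 2\pi f_1 \tau_l} \cdot e^{-j 2\pi (k-1) \Delta f \tau_l} \tag{4}$$

where $f_k = f_1 + (k-1)\Delta f$ is the frequency of the $k$-th subcarrier and $\Delta f$ is the subcarrier spacing. According to the spatial response constructed in (1), the CSI of the $h$-th antenna can be written as:

$$\mathrm{csi}_{hk} = \sum_{l=1}^{L} \alpha_l e^{-j 2\pi f_1 \tau_l} \cdot a_h(\theta_l, \varphi_l) \cdot e^{-j 2\pi (k-1) \Delta f \tau_l} \tag{5}$$

In the $t$-th snapshot, the $t$-th column of the CSI matrix of the $L$-path signal can be rearranged as the following matrix:

$$\mathbf{CSI}(t) = \begin{bmatrix} \mathrm{csi}_{11} & \cdots & \mathrm{csi}_{1K} \\ \vdots & \ddots & \vdots \\ \mathrm{csi}_{H1} & \cdots & \mathrm{csi}_{HK} \end{bmatrix} \tag{6}$$

$\mathbf{CSI}(t)$ is an $H \times K$ matrix, where $H$ and $K$ are the number of antennas and subcarriers, respectively. $\mathbf{CSI}(t)$ is a complex matrix, but neural networks cannot handle complex numbers directly. In order not to lose the information in the matrix, we reconstruct the matrix as follows:

$$\widetilde{\mathbf{CSI}}(t) = \begin{bmatrix} \mathrm{Re}(\mathrm{csi}_{11}) & \cdots & \mathrm{Re}(\mathrm{csi}_{1K}) \\ \vdots & \ddots & \vdots \\ \mathrm{Re}(\mathrm{csi}_{H1}) & \cdots & \mathrm{Re}(\mathrm{csi}_{HK}) \\ \mathrm{Im}(\mathrm{csi}_{11}) & \cdots & \mathrm{Im}(\mathrm{csi}_{1K}) \\ \vdots & \ddots & \vdots \\ \mathrm{Im}(\mathrm{csi}_{H1}) & \cdots & \mathrm{Im}(\mathrm{csi}_{HK}) \end{bmatrix} \tag{7}$$

where $\mathrm{Re}(\cdot)$ and $\mathrm{Im}(\cdot)$ denote the real and imaginary parts of a complex-valued entity, respectively.

Finally, the input matrix is obtained by averaging $\widetilde{\mathbf{CSI}}(t)$ over all snapshots.

For the CSI given in (6), the covariance matrix can be expressed as:

$$R_{ij}(t) = \mathrm{cov}\left(\mathbf{c}_i(t), \mathbf{c}_j(t)^{*}\right) \tag{8}$$

where $\mathbf{c}_i(t)$ and $\mathbf{c}_j(t)$ are the $i$-th and $j$-th rows of the CSI matrix, respectively, $(\cdot)^{*}$ represents the conjugate and $\mathrm{cov}(\cdot)$ means the covariance of two vectors. The average covariance matrix of all snapshots is given in the following equation:

$$\mathbf{R} = \frac{1}{T} \sum_{t=1}^{T} \mathbf{R}(t) \tag{9}$$

where $T$ is the number of snapshots.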
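The construction of the network input in this subsection can be sketched numerically. The snippet below is a minimal illustration, not the authors' implementation: it assumes the spatial response has the form exp(j2πf·r·cos(θ−ψ)·sin(φ)/c), and the function names, the 2 × 2 array, the path gains and the delays are all hypothetical values chosen for the example.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def steering(fc, r_h, psi_h, theta, phi):
    """Spatial response of one antenna (assumed form, see lead-in)."""
    phase = 2 * math.pi * fc * r_h * math.cos(theta - psi_h) * math.sin(phi) / C
    return cmath.exp(1j * phase)


def csi_matrix(antennas, paths, fc, f1, delta_f, num_subcarriers):
    """Return an H x K complex CSI matrix as a list of rows.

    antennas: list of (r_h, psi_h) polar coordinates of each antenna
    paths:    list of (alpha, tau, theta, phi), one tuple per path
    """
    rows = []
    for r_h, psi_h in antennas:
        row = []
        for k in range(num_subcarriers):  # k = 0 is the first subcarrier
            value = 0j
            for alpha, tau, theta, phi in paths:
                value += (alpha
                          * cmath.exp(-2j * math.pi * f1 * tau)
                          * steering(fc, r_h, psi_h, theta, phi)
                          * cmath.exp(-2j * math.pi * k * delta_f * tau))
            row.append(value)
        rows.append(row)
    return rows


def stack_real_imag(csi):
    """Stack real parts over imaginary parts: H x K complex -> 2H x K real."""
    real = [[v.real for v in row] for row in csi]
    imag = [[v.imag for v in row] for row in csi]
    return real + imag


# Hypothetical 2 x 2 array with half-wavelength spacing at 2 GHz.
d = C / 2e9 / 2
antennas = [(math.hypot(x, y), math.atan2(y, x))
            for x in (-d / 2, d / 2) for y in (-d / 2, d / 2)]
# One direct and one weaker, later reflected path (illustrative values).
paths = [(1.0, 20e-9, math.radians(10), math.radians(90)),
         (0.3, 50e-9, math.radians(30), math.radians(90))]

csi = csi_matrix(antennas, paths, fc=2e9, f1=2e9, delta_f=30e3, num_subcarriers=4)
x_in = stack_real_imag(csi)  # 8 x 4 real-valued matrix
los_only = csi_matrix(antennas, paths[:1], fc=2e9, f1=2e9, delta_f=30e3,
                      num_subcarriers=4)
```

With a single unit-gain path every CSI entry has unit magnitude, which is a quick sanity check on the phase terms; averaging `x_in` over several noisy snapshots would give the network input described above.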

2.2. Deep neural network structure

A convolutional neural network (CNN) is a type of DNN and has many advantages compared to traditional techniques, e.g., good fault tolerance, parallel processing and self-learning capability. It can handle problems with complex environmental information, unclear background knowledge and unclear inference rules, tolerates samples with large deficiencies and distortions, runs fast, and offers good adaptive performance and high resolution. It fuses feature extraction into a multi-layer perceptron through structural reorganization and weight reduction, omitting the complex feature extraction process prior to recognition.

A CNN consists of four main components: convolutional layer, pooling layer, fully connected layer and an activation function for each layer.

[...] without losing too much useful information. A fully connected layer means that the layer-by-layer connection is fully connected, i.e., each neuron in one layer is connected to all neurons in the next layer. Such a structure introduces arbitrary linear combinations of the inputs and can have powerful approximation behavior. We can express these three processes as follows:

$$y = f\left[g(x, w) + b\right] \tag{10}$$

where $x$, $y$, $w$ and $b$ are referred to as the input, output, weight and bias, respectively. $g(\cdot)$ refers to convolution, pooling or matrix multiplication, and $f[\cdot]$ denotes the activation function of this layer.
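The generic layer form (input combined with weights, plus bias, followed by an activation) can be illustrated for the fully connected case. This is a minimal sketch in which the inner operation is a matrix-vector product; the function names, the choice of a rectifier as activation and the numbers are assumptions made for the example.

```python
def relu(values):
    """Element-wise rectifier used here as the example activation."""
    return [max(0.0, v) for v in values]


def dense(x, w, b, activation=relu):
    """One fully connected layer: activation(w @ x + b)."""
    pre_activation = [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(w, b)
    ]
    return activation(pre_activation)


x = [1.0, -2.0]                  # input
w = [[0.5, 0.25], [-1.0, 1.0]]   # one weight row per output neuron
b = [0.1, 0.0]                   # bias
y = dense(x, w, b)
```

Replacing the matrix-vector product with a convolution or a pooling operation gives the other two layer types covered by the same expression.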

3. DoA estimation with DNN

This paper presents a detailed study of a CNN-based DoA and ToA estimation method. In this work, a cascaded convolutional neural network is used to solve the DoA and ToA estimation problem, with the aim of learning the mapping from the observed antenna array signal to the DoA and ToA of the incident wave. However, the generalization capability of a neural network is limited, and its performance degrades substantially when the SNR gap is large. To overcome this problem, a noise filtering network is introduced to perform noise filtering at low SNR. The network structure consists of two steps: a) the noise filtering step and b) the estimation step. We describe our work in detail in the following sections.

[Figure: flowchart of the cascaded network. CSI matrix, data preprocessing, SNR < 0 dB? (Y: noise filter network), estimation network, DoA and ToA output.]

3.1. Noise filtering neural network

We first need to classify the SNR of the received signal. Referring to [14], the distinction of SNR is modelled as a binary classification problem. Eigenvalue decomposition is performed on (10), from which signals with high SNR and low SNR can be distinguished.

The noise filtering neural network is used to filter CSI matrices at low SNR, enhancing the effective signal components in the CSI matrix through noise filtering operations. In this paper, a convolutional neural network is used for noise filtering, and signal enhancement is accomplished by learning the mapping between the CSI matrix under low SNR conditions and the CSI matrix under noiseless conditions. The filtering neural network has a five-layer structure, containing two convolutional layers and three fully connected layers.

After data preprocessing, we obtain a 2H × K matrix, which is then fed into the neural network. Next, the input matrix goes through two convolutional layers and two max-pooling layers alternately. To avoid losing features and to obtain a larger receptive field, we use zero padding and a convolutional kernel size of 5 × 5 for feature extraction on the input matrix. The numbers of filters used for the first and second convolutional layers are 32 and 64, respectively. For both max-pooling layers, we use the same pooling size of 2 and stride of 2. We then obtain three-dimensional features of size 64 × 0.5H × 0.25K. The extracted features are flattened and fed into two fully connected layers with 1024 neurons and 2H × K neurons. The final output is reshaped as a 2H × K matrix, which is the output after noise filtering.

3.2. Estimation network of DoA and ToA

[Figure: (a) noise filtering neural network: Convolution1, Maxpooling1, Convolution2, Maxpooling2, three fully connected layers. (b) estimation neural network: Convolution1, Maxpooling1, Convolution2, Maxpooling2, two parallel fully connected layers, each followed by Softmax, with AoA and ToA outputs.]

We can model the DoA and ToA estimation problem as a classification problem, where the DoA and ToA obtained for each classification result are in the sets $\Theta = \{\theta_1, \cdots, \theta_P\}$ and $\mathcal{T} = \{\tau_1, \cdots, \tau_Q\}$, respectively.
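The feature size 64 × 0.5H × 0.25K quoted for the two convolution and pooling stages can be traced with a short shape calculation. This is a sketch under stated assumptions: zero-padded convolutions with stride 1 (so the spatial size is unchanged), pooling without padding, and K = 64 subcarriers picked purely for illustration; the helper names are hypothetical.

```python
def conv_same(shape, num_filters):
    """Zero-padded, stride-1 convolution: only the channel count changes."""
    _, height, width = shape
    return (num_filters, height, width)


def max_pool(shape, size=2, stride=2):
    """Max pooling without padding."""
    channels, height, width = shape
    return (channels,
            (height - size) // stride + 1,
            (width - size) // stride + 1)


def feature_shape(num_antennas, num_subcarriers):
    """Shape after Convolution1, Maxpooling1, Convolution2, Maxpooling2."""
    shape = (1, 2 * num_antennas, num_subcarriers)  # real/imag stacked input
    shape = max_pool(conv_same(shape, 32))
    shape = max_pool(conv_same(shape, 64))
    return shape


shape = feature_shape(num_antennas=16, num_subcarriers=64)
flattened = shape[0] * shape[1] * shape[2]
```

For H = 16 and the assumed K = 64 this gives 64 × 8 × 16, i.e. 8192 values after flattening, which is the size of the input to the fully connected part.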

The configuration of the convolutional layers is similar to that of the noise filtering network. The extracted features are fed into two parallel fully connected networks. Each parallel network contains only one input layer and one output layer, with the same number of neurons in both input layers, 64 × 0.5H × 0.25K, and the number of neurons in the output layer being related to the angle and time resolution, respectively. For example, if the DoA is distributed in $[\theta_{\min}, \theta_{\max}]$, the number of DoA output neurons is $(\theta_{\max} - \theta_{\min})/\Delta\theta + 1$, where $\Delta\theta$ is the angular resolution. Similarly, if the ToA is distributed in $[\tau_{\min}, \tau_{\max}]$, the number of ToA output neurons is $(\tau_{\max} - \tau_{\min})/\Delta\tau + 1$. We then pass the output through the Softmax function, and the neuron with the highest output probability is used as the final output. The Softmax function is given by:

$$S(z_j) = \frac{e^{z_j}}{\sum_{i=1}^{J} e^{z_i}} \tag{11}$$

where $z_j$ is the output value of the $j$-th neuron and $J$ is the total number of output neurons. Through the Softmax function, the output values of a multi-class classifier are transformed into a probability distribution in the range [0, 1] that sums to 1.

Throughout the cascaded network, the activation function used is the ReLU function:

$$\mathrm{ReLU}(x) = \max(0, x) \tag{12}$$

ReLU is a non-saturating unit that speeds up network training, reduces computational complexity, is more robust to various disturbances and avoids the vanishing gradient problem to some extent compared to the Tanh and Sigmoid functions.

3.3. Training and testing strategy

The cascaded neural network consists of two neural networks connected together, which are trained separately. The trained networks are then cascaded to filter out noise and estimate DoA and ToA. During training, the data is fed in batches to reduce the training burden. Each neural network was trained separately for 100,000 iterations; the noise filtering neural network was trained by back-propagation to minimize the mean square error (MSE), and the estimation neural network was trained by back-propagation to minimize the cross-entropy loss. The MSE and cross-entropy are calculated as follows:

$$\mathrm{MSE} = \frac{1}{N} \sum_{n=1}^{N} (y_n - \hat{y}_n)^2 \tag{13}$$

$$\mathrm{CE}(\mathbf{y}, \hat{\mathbf{y}}) = -\sum_{j=1}^{J} \hat{y}_j \log y_j \tag{14}$$

where $N$ is the number of output neurons of the noise filtering neural network, and $y_n$ and $\hat{y}_n$ are the output value and the true value, respectively. In (14), $\mathbf{y}$ and $\hat{\mathbf{y}}$ are the output vector and the truth vector, respectively.
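The Softmax output of the estimation network and the two training losses described above can be written out in a few lines. This is a minimal sketch with standard textbook definitions, assumed here because the exact expressions are not legible in the text; the logits and targets are made-up numbers.

```python
import math


def softmax(logits):
    """Map raw outputs to probabilities; subtracting the max avoids overflow."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, one_hot, eps=1e-12):
    """Cross-entropy between predicted probabilities and a one-hot label."""
    return -sum(t * math.log(p + eps) for p, t in zip(probs, one_hot))


def mse(output, target):
    """Mean squared error over all output neurons."""
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(output)


logits = [2.0, 0.5, -1.0]                # hypothetical output-layer values
probs = softmax(logits)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
estimation_loss = cross_entropy(probs, [1, 0, 0])
filter_loss = mse([1.0, 2.0], [1.0, 4.0])
```

The index `predicted_class` plays the role of the neuron with the highest probability; mapping it back to an angle or delay only requires the class grid of the corresponding output layer.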

Both neural networks use the Adam optimizer for gradient descent to update the weights. Dropout was used after every layer to prevent overfitting and improve the stability and robustness of the neural network. The selection of the learning rate is also very important for training. If the learning rate is relatively large, the weights are adjusted more substantially during training, which speeds up training but can cause the network to oscillate on the error surface, so that training may fail to converge and may overshoot the optimum. Conversely, a relatively small learning rate lets the network approach the optimum steadily, but it may also get trapped in a local optimum. Experimentally, the learning rate of the filtering neural network is set to 1e-3 and that of the estimation neural network is set to 1e-4.
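The effect of the learning rate can be seen on a toy problem. The sketch below uses plain gradient descent on the one-dimensional loss w², not the Adam optimizer or the networks of this paper, so it only illustrates the overshoot and slow-progress behaviour; a convex loss of this kind has no local optima, and the three rates are arbitrary example values.

```python
def gradient_descent(lr, steps=50, w0=1.0):
    """Minimize w**2 starting from w0; the gradient is 2*w."""
    w = w0
    for _ in range(steps):
        w -= lr * 2.0 * w
    return w


large = gradient_descent(lr=1.1)       # each step overshoots and grows
moderate = gradient_descent(lr=0.1)    # converges quickly towards 0
small = gradient_descent(lr=0.001)     # stable but still far from 0
```

With the large rate every update multiplies the weight by a factor of magnitude above one, so the iterate oscillates and diverges, while the small rate barely moves in the same number of steps.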

The testing data is fed into the trained neural network to calculate the prediction accuracy and mean square error, so as to measure the effectiveness of the neural network. In addition, during the testing phase, we must make sure that the data used for testing was not used for training, so that the evaluation of our neural network is valid.

4. Simulation parameters and results

4.1. Simulation setup

In our experiments, the proposed convolutional neural network is implemented in Python 3.5 with TensorFlow 1.12, and the conventional correlation and MUSIC based methods are implemented in MATLAB R2019a. All experiments are performed on a lab server with two NVIDIA GeForce GTX TITAN Xp graphics processing units (GPUs) with 24 GB of memory.

4.2. Dataset generation

In the simulation, a 4 × 4 uniform grid array is used, with 16 single-polarized antennas evenly distributed in the array at half-wavelength spacing. It is assumed that the signal emitted by the source impinges on the antenna array via the direct and reflected paths, with the center frequency set to 2 GHz and the ratio of the variance of the power of the two paths set to 10 dB. All data are generated by simulation software rather than by direct measurements in real scenarios. The received signal impinging on the antenna array is an OFDM signal with K subcarriers and a subcarrier spacing of 30 kHz. The CSI of the received signal can be obtained by (8), and the information matrix is calculated from 50 snapshots of the CSI of the received signal.
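As a rough numerical check of this setup, the following sketch derives the element spacing and the relative amplitude of the reflected path. It assumes that the 10 dB figure is the power ratio of the direct path to the reflected path and that the array is centred on the origin; the variable names are illustrative only.

```python
C = 299_792_458.0            # speed of light in m/s

carrier_hz = 2e9             # center frequency from the setup
wavelength = C / carrier_hz  # about 0.15 m
spacing = wavelength / 2     # half-wavelength element spacing

power_ratio_db = 10.0        # assumed: direct-path power over reflected-path power
reflected_amplitude = 10 ** (-power_ratio_db / 20)  # relative to a unit direct path

# Cartesian positions of the 4 x 4 grid, centred on the origin (assumption).
positions = [((ix - 1.5) * spacing, (iy - 1.5) * spacing)
             for ix in range(4) for iy in range(4)]
```

The spacing comes out at roughly 7.5 cm and the reflected path at about 0.32 of the direct-path amplitude, which are the values a generator for this scenario would use under the stated assumptions.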

Our proposed neural network is used to estimate both the DoA and ToA of OFDM signals. DoA contains AoA and ZoA, and this paper focuses on the estimation of AoA, with ZoA assumed to be a constant value. The neural network is trained by treating the data of the direct path as the true output of the signal. We assume that the AoA of the signal transmitted through the reflected path is 20° larger than that of the direct path and that the signal arrives 30 ns later. We assume that the AoA of the direct path is uniformly distributed in (-60°, 60°] and the angular search resolution is set to 1°, giving a total of 120 AoA incident directions. For each AoA, the corresponding ToA is assumed to be uniformly distributed in (10, 50] ns with a resolution of 1 ns, so there are a total of 120 × 40 combinations of direction of arrival and time of arrival in the dataset. For each DoA and ToA pair, 90 independent noisy signal vectors, generated by adding noise to the UGA's received signal vector, are used for training.

4.3. Neural network parameters initialization

The initial values of the network weights also have a great influence on the training of neural networks. If the initial weights are not set properly, training may be slow or suffer from vanishing or exploding gradients. In general, the connection weights and thresholds of the network are initialized in a relatively small interval with zero mean. In this paper, the weight parameters w of the filtering and estimation networks obey a truncated Gaussian distribution with mean 0 and standard deviations of 0.01 and 0.1, respectively, and all bias parameters b are set to 0.01 and 0.1, respectively.

4.4. Simulation result

First, to verify that our neural network works, we explored the variation of loss with the number of iterations for both networks during training.

Figure 6 and Figure 7 show the loss functions of the filtering and estimation neural networks versus the number of iterations, respectively. It can be seen that the loss functions decrease as the number of iterations increases and finally converge to a range. The neural network can therefore learn the mapping between the estimation parameters and the input matrix. ToA training performs better than AoA training, as shown specifically in the simulations below.

To evaluate the effectiveness and robustness of our proposed convolutional neural network structure, we compared our proposed cascaded neural network with four other methods:

1. MUSIC-enhanced: An algorithm based on MUSIC. The time diversity of the OFDM subcarriers and a virtual spatial smoothing method are used to construct the correlation matrix. DoA and ToA estimation are then performed based on the MUSIC algorithm.

2. AML: An efficient approximate maximum likelihood algorithm for indoor localization, which updates the DoA and ToA parameters alternately.

3. CNN-class: A CNN-based estimation method that first classifies the signal-to-noise ratio and then selectively uses two neural networks for ToA and DoA estimation.

4. CNN-base: Estimates DoA and ToA directly through a CNN.

The first two methods are physically driven, while the latter two and our proposed method are data driven.

The two evaluation metrics chosen in this paper are the mean absolute error (MAE) and the mean squared error (MSE) of DoA estimation. MAE is a better reflection of the actual error in the predicted values, and MSE indicates the accuracy of the predicted values. The MAE can be expressed as:

$$\mathrm{MAE} = \frac{1}{N} \sum_{n=1}^{N} \left| y_n - \hat{y}_n \right| \tag{15}$$

The MSE can be calculated as follows:

$$\mathrm{MSE} = \frac{1}{N} \sum_{n=1}^{N} (y_n - \hat{y}_n)^2 \tag{16}$$

where $y_n$ and $\hat{y}_n$ are the estimated value and the true value of the $n$-th sample, and $N$ is the number of samples.

It is clear from the figures that our proposed method performs better than the other four regardless of the SNR. The estimation errors decrease as the SNR increases, and the performance of the neural network-based estimation methods is comparable to that of the physically driven methods at different SNRs due to the influence of the generalization ability of the neural networks. The classification-based CNN network has the same structure as the network in this paper at SNR ≥ 10 dB, and both have the same MSE and MAE. When SNR < 0 dB, the proposed method performs better than the other methods owing to its noise filtering neural network. Although it can be seen from Fig. 6 that the CSI matrix after noise filtering is similar to the noise-free one, the estimation performance is not as good as at high SNR, due to the inherent correlation between the matrix data and the loss of some correlation properties after training. Nevertheless, it still performs better than the others.

Figure 9 shows the MSE and MAE of the ToA estimates for different SNRs. Our proposed method has a higher accuracy for ToA estimation. As can be seen from the figure, the data-driven approach is much more sensitive to changes in ToA than the physically driven approach at low SNRs. At SNR ≥ 0 dB, the neural network completes the classification task at a resolution of 1 ns perfectly, achieving an accuracy of 100% with no error in these cases. When SNR < 0 dB, the error of the physically driven approach increases sharply, but the ToA estimation of the data-driven approach has some noise immunity and still provides a relatively accurate estimate of ToA.

We then investigated the relationship between the estimation performance of the neural network and the number of subcarriers.

5. Conclusion

In this paper, we propose a deep learning CNN-based method for estimating the DoA and ToA of OFDM signals. A cascaded neural network is used to filter noise and estimate DoA and ToA. Extensive simulation results show that the proposed CNN-based estimation method is more resistant to multipath and noise than the conventional estimation methods, which demonstrates the potential of the data-driven approach in parameter estimation for accurate positioning.

6. References

[1] Chen, Qi, Liu, E. Yuan, Zhao, and G. Ding. "Joint 2-D DoA and ToA estimation for multipath OFDM signals based on three antennas." IEEE Communications Letters 22.2 (2017): 324-327.

[2] Shamaei, Kimia, Joe Khalife, and Zaher Kassas. "A joint TOA and DOA approach for positioning with LTE signals." 2018 IEEE/ION Position, Location and Navigation Symposium (PLANS). 2018.

[3] Wen, F., Liu, P., Wei, H., Zhang, Y., and Qiu, R. C. "Joint azimuth, elevation, and delay estimation for 3-D indoor localization." IEEE Transactions on Vehicular Technology 67.5 (2018): 4248-4261.

[4] Hinton, Geoffrey E., and Ruslan Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science 313.5786 (2006): 504-507.

[5] Liu, W., Wang, Z., Liu, X., Zeng, N., Liu, Y., and Alsaadi, F. E. "A survey of deep neural network architectures and their applications." Neurocomputing 234 (2017): 11-26.

[6] Neumann, David, Thomas Wiese, and Wolfgang Utschick. "Learning the MMSE channel estimator." IEEE Transactions on Signal Processing 66.11 (2018): 2905-2917.

[7] Chen, Min, Yi Gong, and Xingpeng Mao. "Deep Neural Network for Estimation of Direction of Arrival With Antenna Array." IEEE Access 8 (2020): 140688-140698.

[8] Kase, Y., Nishimura, T., Ohgane, T., Ogawa, Y., Kitayama, D., and Kishiyama. "Fundamental Trial on DoA Estimation with Deep Learning." IEICE Transactions on Communications (2020): 2019EBP3260.

[9] Xiang, H., Chen, B., Yang, M., Yang, T., and Liu, D. "A novel phase enhancement method for low-angle estimation based on supervised DNN learning." IEEE Access 7 (2019): 82329-82336.

[10] Xiang, H., Chen, B., Yang, M., Yang, T., and Liu, D. "Phase enhancement model based on supervised convolutional neural network for coherent DoA estimation." Applied Intelligence (2020): 1-12.

[11] Xiang, H., Chen, B., Yang, M., Yang, T., and Liu, D. "Improved de-multipath neural network models with self-paced feature-to-feature learning for DoA estimation in multipath environment." IEEE Transactions on Vehicular Technology 69.5 (2020): 5068-5078.

[12] Xiang, H., Chen, B., Yang, M., Yang, T., and Liu, D. "Improved direction-of-arrival estimation method based on LSTM neural networks with robustness to array imperfections." Applied Intelligence (2021): 1-14.

[13] Wajid, M., Kumar, B., Goel, A., Kumar, A., and Bahl, R. "Direction of arrival estimation with uniform linear array based on recurrent neural network." 2019 5th International Conference on Signal Processing, Computing and Control (ISPCC). IEEE, 2019.

[14] Guo, Y., Zhang, Z., Huang, Y., and Zhang, P. "DoA estimation method based on cascaded neural network for two closely spaced sources." IEEE Signal Processing Letters 27 (2020): 570-574.

[15] Sun, H., Kaya, A. O., Macdonald, M., Viswanathan, H., and Hong, M. "Deep learning based preamble detection and ToA estimation." 2019 IEEE Global Communications Conference (GLOBECOM). IEEE, 2019.

[16] Hsiao, Yao-Shan, Mingyu Yang, and Hun-Seok Kim. "Super-Resolution Time-of-Arrival Estimation using Neural Networks." 2020 28th European Signal Processing Conference (EUSIPCO). IEEE, 2021.

[17] Luo, Zhe, Tao Tao, and Jianguo Liu. "ToA Estimation Scheme Based on CNN for B-IFDM-Based Preambles." 2019 IEEE 89th Vehicular Technology Conference (VTC2019-Spring). IEEE, 2019.