Region Classification using Wi-Fi and Magnetic Field Strength

Suhardi Azliy Junoh¹, Jae-Young Pyun¹
¹ Department of Information and Communication Engineering, Chosun University, Gwangju, South Korea

Abstract
The widespread deployment of Wi-Fi access points (APs) makes Wi-Fi an appealing candidate for indoor positioning. However, the major drawback of Wi-Fi-based positioning is that the signal is hard to exploit reliably in a dynamic environment. Magnetic fields, on the other hand, provide long-term stability in an indoor environment, while the available Wi-Fi APs can supplement the low number of magnetic field elements present indoors. The hybrid use of Wi-Fi and magnetic field data therefore compensates for the limitations encountered when each is used independently. In this paper, we propose applying the long short-term memory (LSTM) model to the spatial information from Wi-Fi and magnetic fields for region classification, owing to its advantages in time-series prediction and characterization. The results demonstrate that the proposed approach can perform indoor region classification with precision, recall, and F1 score each above 95.0%.

Keywords
Indoor region classification, Wi-Fi, magnetic fields, long short-term memory (LSTM)

1. Introduction

With the growth of mobile computing and Internet of Things technologies, location-based services (LBS) are becoming an integral part of our lives both indoors and outdoors, in fields such as emergency response services, staff management, and vehicle tracking [1]. The Global Positioning System (GPS) has proven effective in enabling users to establish their whereabouts outdoors. However, in indoor environments, the use of GPS can lead to considerable position calculation errors due to multipath effects, missing line-of-sight between user and satellite, and complicated settings.
Therefore, the challenge of creating an LBS for indoor situations with sufficient accuracy and robustness remains. A reliable and accurate LBS is often crucial for applications focused on indoor areas, but the exact needs depend on the requirements of a particular application. For example, smart building applications often require the ability to distinguish different workplaces. In some applications, such as emergency response services, locating a mobile device in a subregion [2] rather than at a precise location may be sufficient because the user can determine his or her location inside the subarea by visual inspection. Similarly, in large indoor environments such as airports and department stores, a few meters of accuracy is sufficient to establish visual contact and discover the appropriate area.

IPIN 2022 WiP Proceedings, September 5 - 7, 2022, Beijing, China
* Corresponding author.
Email: suhardi@chosun.kr (S. A. Junoh); *jypyun@chosun.ac.kr (J. Pyun)
ORCID: 0000-0001-8530-8327 (S. A. Junoh); 0000-0002-1143-8281 (J. Pyun)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

To date, indoor positioning systems (IPS) have been investigated extensively (e.g., Wi-Fi, Bluetooth Low Energy (BLE), ultra-wideband (UWB), and radio frequency identification (RFID)). These technologies can be categorized into infrastructure-based and infrastructure-free approaches. The former requires dedicated hardware and software to run the IPS, as with Wi-Fi, UWB, and BLE. The latter employs commonly available positioning technologies and can work without extra infrastructure (e.g., pedestrian dead reckoning (PDR) and magnetic fields). Infrastructure-based methods provide higher accuracy but are costly due to the infrastructure requirements.
In contrast, an infrastructure-free solution provides pervasive positioning and is less expensive to implement and maintain.

Recently, smartphone-based localization methods using geomagnetic sensors have piqued the interest of many researchers owing to the pervasiveness of magnetic fields in indoor environments and their independence from external infrastructure [3, 4]. These approaches use the distorted indoor magnetic fields created by ferromagnetic materials (e.g., steel frames and electrical equipment). Additional infrastructure is unnecessary because geomagnetic data is pervasive. However, relying solely on magnetic fields does not guarantee high localization accuracy [5]. Among other possible approaches, integrating data from additional sources (e.g., Wi-Fi, BLE, and PDR) can increase the accuracy of the IPS.

This paper proposes an LBS based on indoor region classification utilizing Wi-Fi received signal strength (RSS) and magnetic data. We present a long short-term memory (LSTM) model that takes advantage of time-series characteristics for classification. The experimental results are compared against classical machine learning (ML) and deep learning (DL) algorithms in terms of precision, recall, and F1 score.

The remainder of the paper is organized as follows. Section 2 introduces recent literature on general smartphone localization technologies and fusion methods. The proposed system is presented in Section 3. Section 4 presents the experimental setup and a discussion of the results. Finally, Section 5 concludes the paper.

2. Related Work

2.1. Smartphone General Localization Technologies

Modern smartphones are embedded with various sensors, making them both communication tools and sensing equipment [6]. These sensors (e.g., wireless, proximity, light, vision, and magnetometer) can be utilized for indoor localization.
Such sensors are inexpensive, convenient, and user friendly, which makes the smartphone an ideal platform for IPS.

In recent years, various smartphone-based IPS have been investigated. Among the localization technologies, methods based on wireless signals (Wi-Fi and BLE) are the most popular owing to the low cost of infrastructure. The studies in [7, 8, 9] utilize the widely available Wi-Fi signal in an indoor area and propose Gaussian process regression to enrich sparsely collected fingerprinting data. [10] combines the RSS from BLE beacons, a particle filter, and a floor plan with map constraints for indoor navigation inside a building.

The development of internal smartphone sensors (e.g., accelerometer, gyroscope, and magnetometer) improves IPS accuracy. For example, JustWalk [11] utilized smartphone sensors (e.g., accelerometers, compasses, and gyroscopes) to construct user motion traces on the building's floorplan. However, the method has high computational complexity because various mathematical expressions and visual processing approaches are applied to the acquired motion traces to recognize the floorplan shape. [12] uses magnetic fields derived from smartphone magnetometer data together with convolutional neural networks (CNN) to locate a user in indoor environments. [13] proposes an impulse-radio ultra-wideband (IR-UWB) radar for detecting individuals in an indoor environment in two poses (i.e., standing and lying down). In the experiment, the approach demonstrated robust detection performance in a cluttered indoor setting.

2.2. Data Fusion-based Localization

Many fusion-based localization technologies have been developed by combining two or more technologies. For example, IndoorWaze [14] utilizes Wi-Fi fingerprinting and PDR data collected by the store owner and the customer.
The system uses audio instructions to direct shoppers to the targeted section of a large department store. ViNav [15] is a low-cost system that utilizes image-based localization and Wi-Fi fingerprinting to identify the user's position and calibrate dead reckoning for the trajectories. However, ViNav requires a significant number of high-quality photos to build a 3D model of the testbed. [16] combines multiple technologies available in smartphone sensors (e.g., Wi-Fi, PDR, magnetic, and light sensors) to detect building landmarks; a landmark graph can then be constructed from landmarks such as elevators, corners, and stairs.

The motivation for combining infrastructure-based (Wi-Fi) and infrastructure-free (magnetic field) technologies for indoor region classification is that the two compensate for each other's drawbacks and provide complementary information to improve localization [17, 18, 19]. For example, Magicol [17] and WAIPO [18] utilize magnetic fields and Wi-Fi to compare the measured signals with a fingerprint database. These two systems additionally employ particle filters to further enhance localization accuracy. MagFi [20] and UbiFin [21] propose crowdsourcing methods that construct the Wi-Fi and magnetic fingerprints simultaneously and automatically while reducing the site survey cost.

A Wi-Fi-based approach alone suffers from signal attenuation in a dynamic environment, especially in the presence of human mobility and occupancy, whereas human mobility and occupancy have little effect on magnetic fields. Conversely, because data from smartphone sensors include noise and variation, the magnetic field alone provides poor accuracy due to hand shaking and the user's movement. This drawback can be compensated for by combining it with the Wi-Fi approach. Furthermore, Wi-Fi access points (APs) and magnetic fields generated by many sources are almost always present in a public indoor environment.
Therefore, we can exploit these two abundant resources in public places for localization.

[Figure 1: Heatmap of the RSS from three APs in region 13. (a) AP1, (b) AP2, and (c) AP3. Axes: x (m), y (m).]

[Figure 2: Heatmap of the magnitude of the magnetic field in three scenarios on the same floor. (a) Region 1, (b) Region 11, and (c) Region 13. Axes: x (m), y (m).]

3. System overview

3.1. Background

The rapid change in Wi-Fi RSS has a substantial impact on localization accuracy and is one of the challenges of Wi-Fi-based IPS. In contrast to Wi-Fi, magnetic fields in indoor environments exhibit long-term stability. When used in a dynamic environment with human mobility and occupancy, Wi-Fi-based techniques suffer from a significant loss in positioning accuracy because of variations in RSS over time. Human motion, in contrast, has a negligible effect on magnetic field measurements because humans contain no significant ferromagnetic elements. Fig. 1 and Fig. 2 present heatmaps of the data collected in the selected scenarios. As shown in Fig. 1 and Fig. 2, the RSS and magnetic field information provide distinctive features and unique patterns in an indoor environment.

Fig. 3 shows the magnetometer values along the $x$, $y$, and $z$ axes collected by a smartphone along the 100-meter corridor. We can see from Fig. 3 that the magnetic field varies from location to location, but the changes are not significant. The magnetic field's $x$ and $y$ components point towards the north and east, while the $z$ component points vertically towards the earth.
Therefore, the normalized magnitude $B_T$ of the magnetometer is calculated as follows:

$$B_T = \sqrt{B_x^2 + B_y^2 + B_z^2} \tag{1}$$

[Figure 3: The magnetic field data collected by smartphone along a 100-meter corridor. Curves: $B_x$, $B_y$, $B_z$, $B_T$. Axes: Time (s), Magnetic field strength (µT).]

Given the widespread use of Wi-Fi and the pervasiveness of magnetic fields in an indoor environment, both signals can be acquired concurrently in many places. Anomalies in the geomagnetic field induced by local disturbances from ferromagnetic construction materials can be employed to enable pervasive indoor positioning that is not reliant on infrastructure.

3.2. System Architecture

The architecture of the proposed system is depicted in Fig. 4. First, the corridor area is partitioned into 15 regions. Then, RSS values from numerous APs and magnetic field data are collected for each region to create a dataset for training the model. During the data collection, we recorded the RSS and magnetic data simultaneously for each subregion in the corridor. If the RSS for a specific AP is not detected, the Android program sets the corresponding RSS reading to -200 dBm. During data preprocessing, we utilize linear [22] and median interpolation methods to handle the missing RSS data. Using interpolation, we were able to recreate the missing RSS value of each AP in a specific region by utilizing the spatial correlation with the adjacent region. The preprocessed data from each region is combined into the database according to location, and the data is fed into our proposed LSTM model for training. We divided the dataset into 80% for training and 20% for testing. During the testing stage, the trained LSTM model classifies the region from the test data.

[Figure 4: The workflow of the system architecture. Wi-Fi and magnetic field data collection (RSS + magnetic data for Region 1, Region 2, ..., Region N), data preprocessing, database, LSTM network, prediction stage, region estimation.]

3.3. Structure of LSTMs

This paper applies LSTMs to Wi-Fi and magnetic datasets for indoor subarea classification based on time-series data. LSTM is a form of recurrent neural network (RNN) designed to avoid the problem of long-term dependency. Unlike a plain RNN, which suffers from vanishing or exploding gradients when dealing with long time series, LSTM employs memory cells and a gating mechanism to solve the problem. Each LSTM unit manages the learning and forgetting of timing information via a memory cell and several nonlinear gating units. Fig. 5 depicts the structure of an LSTM unit. The three gates in LSTM (i.e., forget, input, and output) manage the memory and update the cell state information. Given the previous hidden state $h_{t-1}$ and the current input $x_t$ as inputs to the forget gate, a standard LSTM can be formulated as follows:

$$r_t = \sigma(w_r \cdot [h_{t-1}, x_t] + b_r) \tag{2}$$

where $r_t$, $w_r$, and $b_r$ denote the output, weight, and bias of the forget gate. The sigmoid activation function is responsible for determining how much of each value to let through (between 0 and 1) and is given as follows:

$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{3}$$

[Figure 5: Basic LSTM architecture. Inputs: $x_t$, hidden state $h_{t-1}$, cell state $c_{t-1}$. Forget gate ($r_t$; $w_r$, $b_r$), input gate ($z_t$; $w_z$, $b_z$), tanh candidate ($g_t$; $w_g$, $b_g$), output gate ($s_t$; $w_s$, $b_s$). Outputs: next cell state $c_t$, next hidden state $h_t$, output $y_t$.]

The second layer is divided into two parts (i.e., a sigmoid function and a $\tanh$ function). In the first part, the input gate is updated as follows:
𝑧 𝑑 = 𝜎(𝑀𝑧 Β· [β„Žπ‘‘βˆ’1 , π‘₯𝑑 ] + 𝑏𝑧 ) (4) The second part generates a candidate state value 𝑔 𝑑 by π‘‘π‘Žπ‘›β„Ž activation layer as 𝑔 𝑑 = π‘‘π‘Žπ‘›β„Ž(𝑀𝑔 Β· [β„Žπ‘‘βˆ’1 , π‘₯𝑑 ] + 𝑏𝑔 ) (5) The π‘‘π‘Žπ‘›β„Ž activation assigns weightage to the values, determining their level of relevance (-1 to 1), and is formulated as follows: 𝑒π‘₯ βˆ’ π‘’βˆ’π‘₯ π‘‘π‘Žπ‘›β„Ž(π‘₯) = (6) 𝑒π‘₯ + π‘’βˆ’π‘₯ The next cell state 𝑐𝑑 is determined as follows: 𝑐𝑑 = π‘Ÿπ‘‘ * π‘π‘‘βˆ’1 + 𝑧 𝑑 * 𝑔 𝑑 (7) The last stage is to determine the output. First, a sigmoid layer is run to identify which elements of the cell state are sent to the output. The cell state is then passed via the π‘‘π‘Žπ‘›β„Ž function to shift the values between -1 and 1 multiplied by the sigmoid gate output. The process can be summarized as follows: 𝑠𝑑 = 𝜎(𝑀𝑠 Β· [β„Žπ‘‘βˆ’1 , π‘₯𝑑 ] + 𝑏𝑠 ) (8) β„Žπ‘‘ = 𝑠𝑑 * π‘‘π‘Žπ‘›β„Ž(𝑐𝑑 ) (9) In this paper, the LSTM networks are employed for fusing Wi-Fi and magnetic fields based on the time-series feature. Our final model integrates elements from LSTM and dense neural networks based on hyperparameter selection. We use a three-layer network comprising two Min-max normalization One hot encoder LSTM(50) LSTM(50) Dense(15) Figure 6: Basic LSTM architecture. LSTM layers with 50 number of hidden states and one dense node for output classification as shown in Fig. 6. We use min-max normalization to normalize the data [23] into the range [0,1] and one hot encoder to convert a categorical variable into a format that can be used by our model during data preprocessing. The final layer was used as a sigmoid layer to generate an LSTM prediction. The sigmoid activation function determines the output values from 0 to 1. We implemented our model on the Jupyter Notebook platform and trained the model on a computer with i5-9500 CPU with a 3GHz processor and 32G RAM running on Windows 10. 
We utilized the Adam optimization algorithm with a learning rate of 0.0001 and categorical cross-entropy as the loss function to train our network.

3.4. Indoor Region Classification using Machine Learning Algorithms

There are two types of machine learning models: parametric and nonparametric. In parametric models, we estimate parameters from the training dataset to learn a function that can classify new data points without needing the original training dataset. Logistic regression and the linear Support Vector Machine (SVM) are two examples of parametric models. Nonparametric models, on the other hand, cannot be defined by a fixed set of parameters, and the number of parameters grows with the amount of training data. Examples of nonparametric models in ML are K-Nearest Neighbor (KNN), random forest, and decision tree.

3.4.1. K-Nearest Neighbor

KNN is a nonparametric model used for classification that estimates the likelihood that a data point belongs to a group based on the similarity of the patterns. It is a simple yet efficient algorithm and provides a benchmark for comparison with other ML algorithms. The KNN algorithm can be summarized as follows. First, a distance metric and a value of $k$ are chosen. Second, the $k$ nearest neighbors of the sample to be classified are located by calculating the Euclidean distance between the test sample and the training samples and ranking the distances in increasing order. Finally, votes are tallied and the sample is assigned the class label held by the largest number of neighbors.

3.4.2. Support Vector Machine

An SVM is an ML model that solves linear and nonlinear classification problems. The algorithm first draws a line (hyperplane) that provides a decision boundary to segregate the data into classes. Then, the SVM algorithm finds the points from both classes that lie closest to the hyperplane; these points are known as "support vectors."
The goal is to maximize the margin (the distance between the support vectors and the hyperplane), and the best hyperplane is the one with the largest margin. To deal with multi-class classification ($N > 2$), $N$ binary SVM classifiers are used. The $i$th SVM is trained so that samples in the $i$th class are labeled as positive and the rest as negative. During the classification stage, a test sample is evaluated by each of the $N$ SVMs and labeled according to the classifier with the highest output. We adopted the radial basis function as the kernel because the relationship between classes and features is nonlinear.

3.4.3. Random Forest

Random forest is also a supervised ML algorithm for classification. It comprises several decision trees, and increasing the number of trees increases the robustness and accuracy of the algorithm. Each tree in the forest delivers a class prediction, and the class that receives the most votes becomes the model's prediction.

4. Experiment and Analysis

4.1. Experimental Setup

The experiment was conducted on the second floor of the university's building, which has an area measuring 110 m × 16.9 m, as shown in Fig. 7. The environment consists of a number of main corridors that connect to different rooms. A total of 142 APs were detected during the experiment. To construct the database, the user records the indoor APs' information (e.g., MAC address, RSS values, and timestamp) and the surrounding magnetic fields. The data was collected by a single user using a Samsung Note S8 model. During the data collection, the user walked randomly inside each region of the building over the course of two hours. For Wi-Fi localization, the sampling frequency is typically set between 0.25 Hz and 1 Hz [20] to differentiate two adjacent signals. However, the sampling frequency for the magnetic field is much higher (25 Hz).
In this experiment, the Wi-Fi and magnetic fields are sampled at the same frequency of 0.25 Hz; thus, the RSS data varies during the random walk instead of repeating the same RSS signal. During data collection, the average walking speed was between 0.65 m/s and 0.8 m/s. Overall, 1815 samples were collected for each area (27225 samples for all 15 regions). The mobile phone is held in front of the user's body, and the user can change direction freely during the data collection. To minimize magnetic fluctuations during walking, we fixed the phone's attitude by having the user hold it horizontally in the hand with the Y-axis towards the heading direction. Because data is automatically collected while the user is walking, the method of database construction in this study offers considerable cost savings. Furthermore, it can be extended to a crowdsourcing approach in the future. The walking method of collecting the Wi-Fi and magnetic field data is proposed to overcome the drawbacks of the static point-to-point method in the conventional fingerprinting approach.

[Figure 7: Layout of indoor scenes in the corridor with region partition. (a) Floor plan, 110 m × 16.9 m, partitioned into regions R1 to R15. (b) Regions R4, R5, R10, R13, R14, and R15.]

4.2. Performance metrics

Given the true and predicted labels, we utilize precision, recall, and F1 score to evaluate the model and compare it with the benchmark algorithms. These metrics are widely used in evaluating ML models. Since we examine a multiclass classification model with $M = 15$ regions, we used a 15-class confusion matrix to map each region against all classifications. We define the parameters for each region $j$ as follows:

1. True positive ($TP_j$) = number of samples from region $j$ correctly labeled as region $j$. In a true positive event, the classifier predicts a positive sample and the actual value is positive.
2. False positive ($FP_j$) = number of samples from other regions incorrectly labeled as region $j$. In a false positive event, the classifier makes a mistake by predicting positive for a sample that is actually negative.
3. False negative ($FN_j$) = number of samples from region $j$ not classified as region $j$. In a false negative event, the classifier predicts a negative result while the actual sample is positive.

Based upon those definitions, we define our performance metrics as follows:

4.2.1. Precision

Precision is the ratio of true positives to all predicted positives. It indicates the fraction of subregion predictions in an indoor environment that are correct.

$$\mathrm{Precision}\,(\%) = \frac{\sum_{j=1}^{M} TP_j}{\sum_{j=1}^{M} (TP_j + FP_j)} \times 100 \tag{10}$$

4.2.2. Recall

Recall is the ratio of true positives to the total number of actual positives. It indicates the fraction of subregion samples in an indoor environment that are accurately classified.

$$\mathrm{Recall}\,(\%) = \frac{\sum_{j=1}^{M} TP_j}{\sum_{j=1}^{M} (TP_j + FN_j)} \times 100 \tag{11}$$

4.2.3. F1 Score

The F1 score is a metric that takes into account both precision ($P$) and recall ($R$) and is defined as follows:

$$F1\,(\%) = 2 \times \frac{P \times R}{P + R} \times 100 \tag{12}$$

4.3. Result and Discussion

The proposed LSTM-Sigmoid-5500 method outperforms all ML and DL benchmark methods with a classification accuracy of 95.7%. The classification accuracy of the 15 regions in the indoor environment is displayed in the confusion matrix in Fig. 8. As shown in Fig. 8, each indoor region is correctly identified, with the prediction accuracy not going below 90%. This means that the proposed LSTM-Sigmoid-5500 performs consistently well across different areas.

We tested the performance of the proposed LSTM-Sigmoid-5500 with various numbers of LSTM units and hidden layers, as shown in Fig. 9. Increasing the number of LSTM units resulted in a corresponding improvement in accuracy.
However, performance can degrade in some cases when the network goes deeper. As shown in Fig. 9, the best accuracy is obtained with two hidden layers and 50 LSTM units per hidden layer. Hence, we chose a 2-layer network with 50 LSTM units per layer as our model.

[Figure 8: Confusion matrix for classification results in 15 different regions (LSTM-Sigmoid). Axes: Predicted Label, True Label.]

Table 1: Precision, recall, and F1 scores of the different classification models.

Model                 Precision (%)   Recall (%)   F1 score (%)
KNN                   93.9            93.3         93.3
SVM                   90.1            86.5         86.7
Random Forest         88.1            87.1         84.9
Dense-Softmax-5500    72.4            54.2         53.0
Dense-Sigmoid-5500    66.2            50.1         47.3
LSTM-Softmax-5500     48.7            45.6         42.2
LSTM-Sigmoid-5500     95.7            95.6         95.6

We compare the validation results of a 2-layer LSTM using the sigmoid activation function with those of one using the softmax activation function. We selected the same hidden-state dimensions for all LSTM architectures. Fig. 10 shows the cross-entropy loss and accuracy curves of LSTM-Sigmoid and LSTM-Softmax over 5500 epochs. From Fig. 10, it can be observed that LSTM-Sigmoid always has a lower loss than LSTM-Softmax on the training data. This pattern is also seen in the test data, especially at the tails of the loss graphs. Sigmoid thus provides a lower loss and consequently produces better accuracy than softmax. In softmax, increasing the probability of one class reduces the overall probability of all other classes because the activation function requires the probabilities of the output classes to sum to one. Conversely, increasing the probability of one class does not affect the total probability of the other classes when using a sigmoid.
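The coupling described above can be checked numerically with a small sketch in plain Python. The logits below are made-up values for three classes, not outputs of the trained model, and the helper names `softmax` and `sigmoid_outputs` are introduced here for illustration only.

```python
import math


def softmax(logits):
    """Softmax: outputs are coupled because they must sum to one."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid_outputs(logits):
    """Element-wise sigmoid: each output depends only on its own logit."""
    return [1.0 / (1.0 + math.exp(-v)) for v in logits]


base = [1.0, 0.5, -0.5]    # made-up logits for three regions
raised = [3.0, 0.5, -0.5]  # same logits with only the first one increased

soft_base, soft_raised = softmax(base), softmax(raised)
sig_base, sig_raised = sigmoid_outputs(base), sigmoid_outputs(raised)

print("softmax:", [round(p, 3) for p in soft_base], "->", [round(p, 3) for p in soft_raised])
print("sigmoid:", [round(p, 3) for p in sig_base], "->", [round(p, 3) for p in sig_raised])
```

Raising the first logit lowers the softmax outputs of the second and third classes, while their sigmoid outputs stay exactly the same.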
This is why sigmoid outperforms softmax in multi-label classification.

As shown in Table 1, the average precision was 93.9% for KNN, 90.1% for SVM, 88.1% for the random forest, 72.4% for Dense-Softmax-5500, and 66.2% for Dense-Sigmoid-5500. Meanwhile, LSTM-Softmax-5500 produces the lowest precision at 48.7%. The proposed method outperformed the baseline KNN by 1.8%, 2.4%, and 2.4% in terms of the average precision, recall, and F1 score, respectively. Compared to LSTM-Softmax-5500, LSTM-Sigmoid-5500 increased the average precision, recall, and F1 score by 47.0%, 50.1%, and 53.5%, respectively.

[Figure 9: The accuracy of the LSTM-Sigmoid-5500 with respect to the number of LSTM units and layers. Axes: Number of LSTM Units, Accuracy; one curve per number of layers.]

[Figure 10: The cross-entropy loss and accuracy for the sigmoid and softmax functions under 5500 epochs. Panels: training loss and training accuracy versus epochs, LSTM-Sigmoid vs. LSTM-Softmax.]

LSTM-Sigmoid-5500 is designed to perform indoor classification by retrieving features from the Wi-Fi and magnetic field dataset. The accuracy measures of the training and test data are close to each other, indicating that the proposed model is not overfitted. As shown in Fig. 8, the lowest prediction accuracy, 90.8%, occurs in region 8, whose samples are wrongly predicted as regions 4, 6, 12, 13, and 14. Meanwhile, the highest percentage of misprediction occurs in region 4, with 7.1% of its samples predicted as region 9.

5. Conclusions and Future Directions

This study proposed an indoor region classification method that fuses Wi-Fi and magnetic data collected by smartphone users. An LSTM network was used for data fusion and region classification based on Wi-Fi and magnetic field time series data.
We compared our proposed LSTM-Sigmoid-5500 system with six benchmark schemes (i.e., KNN, SVM, random forest, LSTM-Softmax-5500, Dense-Sigmoid-5500, and Dense-Softmax-5500). Performance was analyzed in terms of average precision, recall, and F1 score for the different classification models. The proposed LSTM-Sigmoid-5500 method outperforms the benchmark methods, shows no sign of overfitting, and works well with time-series data. In the future, we plan to provide fine-grained localization instead of coarse-grained localization and to improve the localization performance by using landmarks and more advanced ML and clustering methods. Moreover, we plan to perform extensive experiments in diverse scenarios with different dimensions and interference to verify the results.

6. Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2022R1A2B5B01002385).

References

[1] M. Zhang, J. Jia, J. Chen, Y. Deng, X. Wang, A. H. Aghvami, Indoor localization fusing wifi with smartphone inertial sensors using lstm networks, IEEE Internet of Things Journal 8 (2021) 13608–13623.
[2] Q. Chen, B. Wang, Finccm: Fingerprint crowdsourcing, clustering and matching for indoor subarea localization, IEEE Wireless Communications Letters 4 (2015) 677–680.
[3] M. Zhang, J. Jia, J. Chen, L. Yang, L. Guo, X. Wang, Real-time indoor localization using smartphone magnetic with lstm networks, Neural Computing and Applications 33 (2021) 10093–10110.
[4] B. Bhattarai, R. K. Yadav, H.-S. Gang, J.-Y. Pyun, Geomagnetic field based indoor landmark classification using deep learning, IEEE Access 7 (2019) 33943–33956.
[5] I. Ashraf, Y. B. Zikria, S. Hur, Y. Park, A comprehensive analysis of magnetic field based indoor positioning with smartphones: Opportunities, challenges and practical limitations, IEEE Access 8 (2020) 228548–228571.
[6] S. A. Junoh, S. Subedi, J.-Y. Pyun, Smartphone-based indoor navigation system using particle filter and map-constraints, in: The 9th International Conference on Smart Media and Applications, 2020, pp. 354–357.
[7] X. Wang, X. Wang, S. Mao, J. Zhang, S. C. Periaswamy, J. Patton, Indoor radio map construction and localization with deep gaussian processes, IEEE Internet of Things Journal 7 (2020) 11238–11249.
[8] M. Xue, W. Sun, H. Yu, H. Tang, A. Lin, X. Zhang, R. Zimmermann, Locate the mobile device by enhancing the wifi-based indoor localization model, IEEE Internet of Things Journal 6 (2019) 8792–8803.
[9] W. Sun, M. Xue, H. Yu, H. Tang, A. Lin, Augmentation of fingerprints for indoor wifi localization based on gaussian process regression, IEEE Transactions on Vehicular Technology 67 (2018) 10896–10905.
[10] S. A. Junoh, S. Subedi, J.-Y. Pyun, Floor map-aware particle filtering based indoor navigation system, IEEE Access 9 (2021) 114179–114191.
[11] M. Elhamshary, M. Alzantot, M. Youssef, Justwalk: A crowdsourcing approach for the automatic construction of indoor floorplans, IEEE Transactions on Mobile Computing 18 (2018) 2358–2371.
[12] I. Ashraf, M. Kang, S. Hur, Y. Park, Minloc: Magnetic field patterns-based indoor localization using convolutional neural networks, IEEE Access 8 (2020) 66213–66227.
[13] J.-E. Kim, J.-H. Choi, K.-T. Kim, Robust detection of presence of individuals in an indoor environment using ir-uwb radar, IEEE Access 8 (2020) 108133–108147.
[14] T. Li, D. Han, Y. Chen, R. Zhang, Y. Zhang, T. Hedgpeth, Indoorwaze: A crowdsourcing-based context-aware indoor navigation system, IEEE Transactions on Wireless Communications 19 (2020) 5461–5472.
[15] J. Dong, M. Noreikis, Y. Xiao, A. Ylä-Jääski, Vinav: A vision-based indoor navigation system for smartphones, IEEE Transactions on Mobile Computing 18 (2018) 1461–1475.
[16] C. Zhang, P. Patras, H. Haddadi, Deep learning in mobile and wireless networking: A survey, IEEE Communications Surveys & Tutorials 21 (2019) 2224–2287.
[17] Y. Shu, C. Bo, G. Shen, C. Zhao, L. Li, F. Zhao, Magicol: Indoor localization using pervasive magnetic field and opportunistic wifi sensing, IEEE Journal on Selected Areas in Communications 33 (2015) 1443–1457.
[18] F. Gu, J. Niu, L. Duan, Waipo: A fusion-based collaborative indoor localization system on smartphones, IEEE/ACM Transactions on Networking 25 (2017) 2267–2280.
[19] X. Guo, W. Shao, F. Zhao, Q. Wang, D. Li, H. Luo, Wimag: Multimode fusion localization system based on magnetic/wifi/pdr, in: 2016 International Conference on Indoor Positioning and Indoor Navigation (IPIN), IEEE, 2016, pp. 1–8.
[20] H. Wu, Z. Mo, J. Tan, S. He, S.-H. G. Chan, Efficient indoor localization based on geomagnetism, ACM Transactions on Sensor Networks (TOSN) 15 (2019) 1–25.
[21] J. Tan, H. Wu, K.-H. Chow, S.-H. G. Chan, Implicit multimodal crowdsourcing for joint rf and geomagnetic fingerprinting, IEEE Transactions on Mobile Computing (2021).
[22] J. Talvitie, M. Renfors, E. S. Lohan, Distance-based interpolation and extrapolation methods for rss-based localization with indoor wireless signals, IEEE Transactions on Vehicular Technology 64 (2015) 1340–1353.
[23] S. M. Sultan, M. Waleed, J.-Y. Pyun, T.-W. Um, Energy conservation for internet of things tracking applications using deep reinforcement learning, Sensors 21 (2021) 3261.