=Paper=
{{Paper
|id=Vol-3248/paper18
|storemode=property
|title=Position Estimation at Indoors using Wi-Fi and Magnetic Field Sensors
|pdfUrl=https://ceur-ws.org/Vol-3248/paper18.pdf
|volume=Vol-3248
|authors=Ganduri Chandra
|dblpUrl=https://dblp.org/rec/conf/ipin/Chandra22
}}
==Position Estimation at Indoors using Wi-Fi and Magnetic Field Sensors==
Ganduri Chandra

Theatro Labs India Pvt Ltd, Bangalore, India

===Abstract===
Indoor location estimation is important for many applications, including home automation, security, and surveillance. It is accomplished with sensors such as cameras and passive infrared (PIR) sensors, as well as radio frequency-based technologies such as Wi-Fi, Bluetooth, and radio frequency identification (RFID). Because of its ease of implementation, Wi-Fi is the most extensively used technology. Aside from Wi-Fi, a growing number of researchers are experimenting with the sensors embedded in mobile phones. To increase localization performance, deep learning, probabilistic, and statistical algorithms are applied to the raw data collected from sensors and Wi-Fi access points. In this paper, we present a sensor-based indoor localization test bed. We feed data from magnetometer sensors and Wi-Fi access points to Machine Learning (ML) algorithms. With our proposed method, we obtain a high accuracy of 98%. We also analyze in detail how the performance of the ML classifiers depends on the size of the test data set.

Keywords: Extreme Gradient Boosting (XG-Boost), Indoor Localization, Machine learning, Magnetic Field Sensor, Random forest, Wi-Fi RSSI.

===1. Introduction===
Indoor Localization Systems (ILS), also referred to as Indoor GPS, detect the location of a user inside a room with high accuracy. Using the information provided by an ILS, Indoor Navigation Systems (INS) guide a user along optimal paths with "turn-by-turn" directions. GPS signals from satellites are ineffective for these systems because of their poor position resolution indoors and their attenuation by walls, iron materials, and the like. A variety of ILS and INS therefore use different technologies, such as Wi-Fi [1], RFID [2], Bluetooth [3], and visible light communication [4].
Among these, Wi-Fi is the most commonly used because of its pervasiveness. For estimating a user's location, Wi-Fi based localization methods offer a wide range of solutions: Angle of Arrival (AoA), Time of Arrival (ToA), Time Difference of Arrival (TDoA), Return Time of Flight (RToF), Phase of Arrival (PoA) (see [5] for an account), Particle Filtering [6], and MAP estimation [7]. In both particle filtering and MAP based approaches, the training phase requires prior knowledge of the building design map, and new test data are compared with the existing database to estimate the location accurately.

IPIN 2022 WiP Proceedings, September 5 - 7, 2022, Beijing, China. Email: gjnv.manichandra@gmail.com (G. Chandra). © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.

Various fingerprinting approaches have recently been studied and deployed in real time [1]. Many works, such as [8] and [9], have demonstrated that fingerprinting-based approaches achieve high precision. Methods involving Wi-Fi, including all of the approaches discussed above, require a huge collection of data for calibration, which is one of their major drawbacks. The range-based approaches compute the distances between a user's end device and the Wi-Fi access points, and employ geometrical models (for example, a triangulation model) to estimate the user's location. Several path loss models, such as the log-distance path loss model, have been investigated in [10] and [11] to estimate distances from Received Signal Strength (RSS) information. As RSS is in general not stable indoors, Channel State Information (CSI) based ranging has been introduced. Notably, the majority of the aforementioned approaches are not stand-alone, i.e., they cannot be implemented independently.
More precisely, Wi-Fi based localization requires the use of a network, which raises concerns about data security. Furthermore, when only a small number of access points are installed, a Wi-Fi approach alone is not very effective. These approaches also require prior knowledge of the geographical locations of the access points deployed indoors.

Wireless fingerprinting-based localization is the most popular of all the localization techniques because of its ease of deployment. The Radio Frequency signals (RF signatures) collected from the access points help to predict the location accurately. Typically, the RF measurement values and the Received Signal Strength (RSS) values of Wi-Fi, Bluetooth, or any other wireless technology are one and the same.

Indoor localization methods based on magnetic field signatures generally assume that the magnetic field does not vary much within a small range of altitudes; see, for example, [12]. This assumption is known to be very reasonable, and applies well when a magnetic field sensor (used to measure the magnetic field signatures) is placed at any point between the floor and the ceiling of a room; note that pillars and beams usually contain iron bars, which are ferromagnetic. More generally, [13] considers a larger range of altitudes for placing a magnetic field sensor, such as the altitudes of buildings, together with other factors, for example that buildings are made of multiple vertical and horizontal beams (which are ferromagnetic and hence can influence the sensor measurements). One of the purposes of this article is to quantify critically the impact of altitude on the accuracy of traditional Machine Learning classifiers for indoor position estimation (across the height of a room) using magnetic field data.
In particular, we devised small-scale experiments in a slightly restricted environment (a departmental laboratory) at two different altitudes (2 feet and 4 feet), which helped us to answer questions such as:

1. Does the magnetic field indoors vary significantly with altitude?

2. Suppose that the altitude is the same for both training and testing an ML model. What, then, is the role of the height of the device whose magnetic field intensity we measure?

The broad goal of the present paper is to provide an ML-based solution for estimating location from data extracted from a realistic embedded test bed. Our contributions in this paper are as follows:

1. We present an efficient method for describing the user's location indoors with high resolution.

2. The novel aspect of our method is that we use a multi-label and multi-output approach to predict the location of the user's end device.

3. We validate our results by training and testing on the data we collect from a realistic test bed.

4. We also investigate the impact of the sensor's height on the prediction accuracy.

The rest of the paper is structured as follows. Section 2 provides a synopsis of some related works. Section 3 discusses the motivation of the paper. The experimental setup, including a brief description of the sensors used in this work, is presented in Section 4. Section 5 presents the experimental results. Section 6 gives some concluding remarks.

===2. Related Works===
In the recent past, extensive research has been conducted on indoor localization and on improving its prediction accuracy. Many researchers have proposed novel schemes for indoor localization and navigation that involve technologies such as RFID [2], Bluetooth [3], and Wi-Fi [1], to name but a few. In [2], the authors proposed a 'sensor by proxy' approach that involves sensor fusion of radio frequency identification measurements to reduce false-off error rates in indoor localization.
In particular, they derived a novel score function, used in developing distance-based algorithms, with the aim of guaranteeing maximum coverage using a limited number of sensors indoors. The solutions provided in that work yielded localization with reasonably good accuracy. Sensor-based approaches, however, generally have some limitations in real-time use, for instance a limited coverage area when deployed.

In [14], device-free techniques were employed to analyze user behavior in shopping malls through Wi-Fi signals. Usually, Received Signal Strength Information (RSSI) and Channel State Information (CSI) data are used for localizing users indoors; see [1]. The authors of [14] claimed that their proposed method works efficiently even when the user is not connected to any of the access points. For real-time deployment, the approach presented in [14] works efficiently only if the client devices are equipped with specific hardware: the Intel 5300 802.11n Wi-Fi NIC.

Device-free localization systems with accurate positioning have been proposed in [15]. The work in [15] combines the RSSI of the user's end device with a filtering technique to estimate the user's location inside a closed environment. The same paper claims that the approach does not require any additional or dedicated hardware infrastructure.

In [16], a machine learning based algorithm was proposed and a reasonably good accuracy of around 85% was reported. The sensors used in that paper were accelerometer, Wi-Fi, light, color, and sound, and the approach involves translating the pixels from the light sensor to Hue-Saturation-Lightness (HSL). The classifiers used there were K-means clustering (a distance-based approach) and the Support Vector Machine (SVM).
Recall that an SVM is not the most suitable choice for localization: it is natively a binary classifier (multi-class use requires extensions such as one-vs-rest), so in its basic form one can only predict the presence or absence of a user.

The work in [17] showed that magnetic field strength can vary between different rooms within a building. The authors of [17] also concluded that, given a place in a building, the magnetic signatures there are unique, and hence magnetic field data can be used for localization with both Machine Learning and Deep Learning (DL) frameworks. In that paper, the coefficients extracted by applying the Fast Fourier Transform (FFT) to the magnetic field data were fed to Convolutional Neural Networks (CNN) for real-time predictions.

Hybrid Deep Learning Model (HDLM) based localization systems, which use RSSI heat maps instead of the raw RSSI signals from the access points, have been proposed in [18], resulting in better localization performance for Wi-Fi RSSI based positioning systems. The HDLM is a combination of a convolutional neural network and a Long Short-Term Memory (LSTM) network. It has also been claimed in [18] that the HDLM achieves reasonably good prediction accuracy with very low localization error when compared to existing solutions based on fingerprinting, trilateration, Wi-Fi fusion, and similar approaches.

A Deep Learning based approach was introduced in [19] to predict the location of a user indoors. Although Deep Learning approaches perform very well, the bottleneck is that training the network requires a huge amount of data and time. In [20], the authors used Long Short-Term Memory. In general, LSTM networks are used for predicting time series data by learning from the data collected at previous time instances. For real-time deployment, all neural network based models consume a large amount of time in training and processing, and also occupy more memory.
In general, in any positioning approach, the accuracy is evaluated by calculating the Euclidean distance between the true and estimated positions of the users. One major drawback of this is that floor transitions are ignored. Addressing further challenges in predicting an accurate indoor position, the work in [21] introduces a new methodology for measuring the positioning error that considers the length of the path connecting the estimated position to the true position. It uses the visibility graph of the floor map to compute the shortest distance, and navigational meshes for vector maps to identify similar paths with less computation time.

To improve the accuracy of RSSI-based user localization and reduce the complexity relative to widely used fingerprinting methods, an efficient compressive sensing based approach has been introduced in [22]. Localization using the Least Square Lateration (LSL) method has been introduced in [23]. That paper also compares the accuracy of LSL against that of Pure Lateration (PL), using different Gaussian noise parameters for a varying number of access points and a varying distance between the access points and the user's end device. Further, the same paper shows that the least-squares algorithm with a curve fitting approach provides significant performance improvements over PL.

A novel scheme for localization based on Angle Difference of Arrival information and triangulation using mm-waves indoors has been proposed in [24]. The authors state that their algorithm performs very well in both vacant and office environments, using a commercial 60-GHz mm-wave test bed for their experiment. Though the results are promising, deploying the same scheme for real-time usage will be a bottleneck, given that the equipment is expensive.
To study and understand the behaviour of the Earth's magnetic field inside a room, a novel methodology has been introduced in [17, 25, 26]. In these works, dense and spatially referenced samples of the magnetic vector field, taken towards the surface of the ground as well as in the free space above, were considered for both the analysis and the map estimation of the surroundings. The approach followed in [6] uses both the Rao-Blackwellized particle filter and the Kalman filter to estimate the pose and calibration parameters.

Figure 1: Block level representation of our approach.

Figure 2: (a) Magnetic sensor (HMC5883L); (b) ESP-CAM32 micro-controller with inbuilt Wi-Fi and SD card modules.

So far, several research papers have been published that use either RSSI ([27] and [28]) or CSI data [1] for Wi-Fi based localization. The bottleneck in such methods is that a large number of access points must be deployed to monitor the user's location and, importantly, the user must be connected to a network when the widely used triangulation method is applied.

Recent works, such as [29] and [30], have demonstrated that using magnetic signatures indoors, and more broadly in the corridors of even large buildings, is quite beneficial in terms of consistency and stability. Furthermore, the sensor signatures do not change over time. In addition, they are unaffected by changes in objects, such as furniture in different locations and moving objects such as doors, windows, and users. However, some extreme conditions can influence these signatures, for example large-scale reconstructions or the relocation of heavy metallic objects.

===3. Motivation===
Motivated by the findings mentioned in the previous section, we developed a novel machine learning-based approach that makes use of data from both magnetic sensors and Wi-Fi access points. Interestingly, our approach does not require any network connectivity for localization.
Furthermore, in this work, we critically examined how the location predictions of ML classifiers depend on the altitude above the floor at which the sensors are placed. We therefore aim to overcome the majority of these limitations with the novel idea of combining magnetic field and Wi-Fi Beacon information to describe the indoor position. Finally, we would like to emphasize that the following can be immensely helpful for more efficient indoor localization:

1. A detailed study on the magnetic field data analysis.

2. A study of more recent state-of-the-art

===4. Experimental Setup===
To begin, we gathered data points spanning an area of 364 sq.ft. in the corridors of a building (outdoors). This area is subdivided into cells of size 4 sq.ft., resulting in a grid (matrix) of size 7×13. Similarly, inside a room (indoors), data points from an area of 180 sq.ft. were divided into cells of 4 sq.ft. each, resulting in a 5×9 grid. The problem is posed as a multi-output and multi-label classification problem, i.e., predicting the coordinates (of the user's position) on the grid from the inputs of both the magnetometer sensor (HMC5883L) and the Beacon RSSI values of the selected Wi-Fi access points. For ease of exposition, we divide the rest of the description of the experiment (its setup and the various elements involved) into separate subsections.

====4.1. Magnetometer sensor====
The Honeywell HMC5883L magnetometer sensor is used in this work. This 'low power' module is useful for a variety of general purposes, including measuring the magnetization of a material (for example, ferromagnetic materials, iron bars inside concrete pillars, and so on), measuring the field strength, and measuring the direction of the magnetic field at a point. It is intended for magnetic field sensing with a digital interface for a wide range of real-time embedded applications at low cost and with high resolution.
The HMC5883L communicates with microcontrollers over the I2C interface. The sensor supports 3V to 5V input-output levels on the I2C Serial Clock Line (SCL) and Serial Data Line (SDA) pins. The sensor's output is measured in milli-Gauss (mG) units.

====4.2. Data collection procedure====
To collect sensor data and beacon information, we created an embedded test bed that includes an RF-enabled Wi-Fi module and a magnetometer sensor. We then connected an SD card module to the microcontroller to store the raw sensor data in files (.csv format). These files contain both the magnetic field data (along the x, y, and z axes, in reference to the sensor's alignment with the test bed) and the Beacon information of three selected Wi-Fi access points (among the nearest, chosen at random). The block level representation of our experimental setup is shown in Fig. 1.

We collected magnetic field data and Wi-Fi beacon information in both the outdoor and indoor environments of our research laboratory by placing the sensor at various altitudes. The collected data has six attributes: the first three are the X, Y, and Z axis readings of the magnetometer sensor, and the remaining three are the Beacon RSSI values of the three selected Wi-Fi access points. Over the 364 sq.ft. outdoor area described above, we collected a total of 54,600 samples. Recall that this area has 7×13 cells, each with an area of 4 sq.ft., and we collected 600 samples from each of these cells. Similarly, we divided the indoor floor into a 5×9 grid and collected 450 samples at each cell of the grid (a total of 20,250 samples).

=====4.2.1. Wi-Fi Data=====
During each scan, the signal strength measurements of all nearby access points can theoretically be obtained. However, not all access points in a location are usually observed or captured in real time.
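Because a scan may miss some access points, the three Beacon RSSI attributes need a rule for absent readings. The following is a minimal sketch of one such rule and not the procedure used on our test bed: the BSSIDs, the helper name `rssi_features`, and the fill value of -100 dBm for an unobserved access point are all assumptions made for illustration.

```python
# Hypothetical sketch: build the three Wi-Fi RSSI attributes from one scan.
# The BSSIDs and the -100 dBm fill value are illustrative assumptions,
# not values taken from the experiment.

MISSING_RSSI_DBM = -100  # assumed placeholder for an access point not seen in a scan

# Assumed identifiers of the three selected access points.
SELECTED_BSSIDS = [
    "aa:bb:cc:00:00:01",
    "aa:bb:cc:00:00:02",
    "aa:bb:cc:00:00:03",
]


def rssi_features(scan, selected=SELECTED_BSSIDS, missing=MISSING_RSSI_DBM):
    """Return the RSSI values (dBm) of the selected access points in a fixed order.

    `scan` maps BSSID -> RSSI for every access point observed in one scan.
    A selected access point that is absent from the scan gets `missing`.
    """
    return [scan.get(bssid, missing) for bssid in selected]


# One scan in which the third selected access point was not observed.
scan = {
    "aa:bb:cc:00:00:01": -48,
    "aa:bb:cc:00:00:02": -63,
    "de:ad:be:ef:00:09": -71,  # observed, but not one of the selected access points
}
print(rssi_features(scan))  # [-48, -63, -100]
```

Keeping the order of the selected access points fixed matters here, since each position in the returned list corresponds to one attribute column of the data set.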
The Service Set Identifier (SSID) of any nearby Wi-Fi access point, as well as its BSSID (MAC address) and Received Signal Strength Information (RSSI), are immediately visible in the database obtained by scanning the Wi-Fi network coverage area. In our paper, we only considered the three access points that were closest to the data collection location.

=====4.2.2. Magnetic Field Data=====
The data from the magnetic field sensor used in this work yields three column vectors, each with dimension equal to the number of samples taken:

$$\mathbf{H} = \big[\mathbf{H}_x, \mathbf{H}_y, \mathbf{H}_z\big]_{\text{number of samples} \times 3}.$$

Above, $\mathbf{H}_x$, $\mathbf{H}_y$ and $\mathbf{H}_z$ denote the measurements from the sensor along the $x$, $y$ and $z$ axes, respectively. Recall that these measurements are recorded in mG units. In general, the raw measurements obtained from the magnetic field sensor can be used in two different ways: 1) to calculate the magnitude of $\mathbf{H}$, i.e., the magnitude of each row of $\mathbf{H}$, which is a point in 3-dimensional space; 2) to use $\mathbf{H}$ directly. To enhance the accuracy, option 2) is usually followed. For each $1 \le i \le$ the number of samples, recall:

$$\text{Magnitude of } \mathbf{H}[i] = \sqrt{(H_x[i])^2 + (H_y[i])^2 + (H_z[i])^2}, \quad \text{where } \mathbf{H}[i] = \big(H_x[i], H_y[i], H_z[i]\big) \in \mathbb{R}^3.$$

Figures 3(a), 3(b) and 4 show plots of the magnitude of the sensor data collected in the outdoor and indoor environments, for a few specific columns of the grids in Fig. 5(a), 5(b) and 5(c), respectively. (The latter three figures are explained in the next subsection.)

====4.3. Statistical analysis====
The 2D heat maps of the measurements are shown in Fig. 5. Fig. 5(a) corresponds to the experiment outdoors, while the remaining two correspond to indoors at the two different altitudes (2 feet and 4 feet) considered. Furthermore, for a better understanding, we have plotted the 3D heat map for the experiment outdoors; see Fig. 6.

Figure 3: (a), (b) The magnitude plot of the magnetic field at a few locations in the outdoor environment shows that, irrespective of the time of data collection, the signature remains similar and stable, indicating that the field strength is time-invariant. The lag or delay between the patterns can be attributed to differences in moving the embedded device.

To verify the consistency of the measurements by the magnetic sensor (embedded on the test bed), the process was repeated to collect a large number of magnetic field signatures in different columns of the grids in Fig. 5. From all of these subfigures, one observes:

• The signatures collected across different columns on the floor can be seen in Fig. 3. Clearly, the patterns in these two figures differ.

• Consistency: the patterns in the plots in Fig. 3 do not vary with the time at which the samples were collected (morning, afternoon, and evening).

• At each location indoors, there is a small variation in the signatures as we change the height of the sensors; see Fig. 4.

The data collected in both the outdoor and indoor environments is processed to obtain the mean magnitude at every individual cell of the defined grid; these are plotted as heat maps in Fig. 5(a), 5(b) and 5(c). The data is then normalised over all the cells for a better representation of the heat maps. Observe in Fig. 6 (outdoor environment) the oscillations, or non-uniformity, in the normalized magnitude values as we move among the cells of the grid.

====4.4. Machine learning approach====
In our paper, we employ the Random Forest (RF) and XG-Boost classifiers, since these are purely tree-based ensemble algorithms. In the RF classifier, a large number of Decision Trees (DT) are fitted to multiple sub-samples of the data, so as to increase the prediction performance, prevent overfitting while training, and enhance stability and accuracy. The RF classifier is a bootstrap ensemble of decision trees.
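The bootstrap-ensemble idea can be sketched in a few lines of plain Python. This is an illustrative toy and not our experimental code: to keep it short, each decision tree is replaced by a stand-in base learner (a one-nearest-neighbour rule), the training points are made-up (magnitude in mG, RSSI in dBm) pairs labelled with a (row, column) cell, and all function names are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def nearest_label(sample, x):
    """Stand-in base learner: label of the closest training point (1-NN).

    A random forest would fit a decision tree on `sample` instead.
    """
    def squared_distance(point):
        features, _label = point
        return sum((a - b) ** 2 for a, b in zip(features, x))

    return min(sample, key=squared_distance)[1]


def bagged_predict(data, x, n_learners=25, seed=0):
    """Majority vote over base learners, each built on its own bootstrap sample."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_learners):
        sample = bootstrap_sample(data, rng)
        votes[nearest_label(sample, x)] += 1
    return votes.most_common(1)[0][0]


# Made-up training points: ((magnitude, rssi), (row, column)).
data = [
    ((0.38, -50), (0, 0)), ((0.39, -52), (0, 0)), ((0.37, -49), (0, 0)),
    ((0.46, -70), (3, 5)), ((0.47, -72), (3, 5)), ((0.45, -69), (3, 5)),
]
print(bagged_predict(data, (0.385, -51)))
```

Note that each label is a (row, column) pair, so a single vote decides both outputs at once; this is only one possible way to handle two outputs and is chosen here for brevity.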
Usually, hyperparameters such as the number of trees and their depth are tuned for better prediction performance.

Figure 4: The plot of the magnitudes of the two magnetic signatures at altitudes 2 feet and 4 feet, with some similarity at column-3 in both Fig. 5(b) and 5(c) (indoor environment). Axes: number of samples (horizontal) against magnitude of sensor data in mGauss (vertical); series: 2-feet and 4-feet.

The DT predicts the label of test data by calculating the entropy of the original data set and the information gain at each node of the tree. For better understanding, we state the formula for the entropy:

$$H(D) = -\sum_{m,n=0}^{M,N} P(m,n) \log P(m,n) \tag{1}$$

where $D$ denotes the set of data points in a node, $m$ and $n$ index the labels (label $m$ and label $n$ correspond to the row and column information of the room, in our case), and $P(m,n)$ denotes the probability of data belonging to a label. The average entropy $A_e$ of a split $D: \{D_i\}_{i=1}^{K}$ and the information gain $I_g$ can be calculated using the equations below:

$$A_e = \left[\sum_{i=1}^{K} \frac{|D_i|}{|D|} H(D_i)\right] \tag{2}$$

$$I_g = H(D) - A_e = H(D) - \left[\sum_{i=1}^{K} \frac{|D_i|}{|D|} H(D_i)\right]. \tag{3}$$

The XG-Boost (Xgb) classifier was found to be effective compared with off-the-shelf ML classifiers on categorical data. In simple words, Xgb is an implementation of gradient boosted DTs, and it is widely used in applied machine learning nowadays. One of the objectives of Xgb is to minimize the cost function $\ell(y, F(x))$ by applying a gradient descent algorithm. In this algorithm, DTs are created sequentially. Samples that are predicted wrongly by the current DTs are given more emphasis when the next DT is fitted. The individual DTs are then combined into an ensemble to give a stronger and more accurate model. Xgb can work on both regression and classification problems.
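The sequential construction of DTs described above can be sketched in plain Python. This is a deliberately simplified illustration and not the XGBoost library: it assumes a squared-error loss ½(y − F(x))² (whose negative gradient is the residual and whose second derivative is 1), a single input feature, depth-one trees (stumps), and no regularisation. All function names and the toy data are hypothetical.

```python
def fit_stump(xs, targets):
    """Least-squares regression stump on one feature.

    Returns (threshold, left_value, right_value): inputs with x <= threshold
    are predicted as left_value, all others as right_value.
    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:  # largest x excluded so both sides are non-empty
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_value = sum(left) / len(left)
        right_value = sum(right) / len(right)
        error = (sum((t - left_value) ** 2 for t in left)
                 + sum((t - right_value) ** 2 for t in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]


def stump_value(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Fit stumps one after another, each on the residuals of the current model."""
    f0 = sum(ys) / len(ys)  # the constant that minimises the squared-error loss
    predictions = [f0] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]  # negative gradient
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_value(stump, x)
                       for p, x in zip(predictions, xs)]
    return f0, learning_rate, stumps


def predict(model, x):
    f0, learning_rate, stumps = model
    return f0 + learning_rate * sum(stump_value(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = boost(xs, ys)
print(mse(boost(xs, ys, n_rounds=0), xs, ys), mse(model, xs, ys))
```

With zero rounds the model is just the constant initial guess; each additional round shrinks the remaining residual by the learning rate on this toy data, which is the behaviour the sequential scheme is meant to achieve.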
Figure 5: (a) Normalised heat map of the floor in the outdoor environment; (b) magnetic field at 2 feet altitude in the indoor environment; (c) magnetic field at 4 feet altitude in the indoor environment.

The training data set is $D = \{(x_i, y_i)\}_{i=1}^{N}$, where $x_i$ and $y_i$ correspond to the input and output attributes (a total of 8 attributes) of the data set, and $y_i = \{[y_m, y_n]\}_{m,n=1}^{M,N}$. Note that the lengths of $y_m$ and $y_n$ are the same. Let the differentiable loss function for our model be $\ell(y, F(x))$, where $F(x)$ is the prediction corresponding to the input of the classifier. In what follows, $F_{(k)}$ denotes the model after $k$ rounds and $F_k$ the increment added in round $k$. We begin by initializing the model with some constant value:

$$F_0(x) = \arg\min_{\gamma} \sum_{i=1}^{N} \ell(y_i, \gamma) \tag{4}$$

Let $K$ be a positive integer. For each $k \in \{1, \dots, K\}$ and for each $i \in \{1, \dots, N\}$, we compute the entries of the gradient and Hessian iteratively (using a for loop) as follows:

$$r_{ik} = -\left[\frac{\partial \ell(y_i, F(x_i))}{\partial F(x_i)}\right]_{F(x) = F_{(k-1)}(x)} \tag{5}$$

$$s_{ik} = -\left[\frac{\partial^2 \ell(y_i, F(x_i))}{\partial F(x_i)^2}\right]_{F(x) = F_{(k-1)}(x)} \tag{6}$$

Figure 6: Map showing the variation in magnetic field intensities of cells in a grid (3D view).

We fit a base learner (decision tree) to the training set $\left\{x_i, -\frac{r_{ik}}{s_{ik}}\right\}_{i=1}^{N}$ by solving the following optimization problem:

$$\gamma_k = \arg\min_{\gamma} \sum_{i=1}^{N} \frac{1}{2} s_{ik} \left[-\frac{r_{ik}}{s_{ik}} - \gamma(x_i)\right]^2 \tag{7}$$

$$F_k(x) = v \, \gamma_k(x) \tag{8}$$

where $v$ is the learning rate. After finding $\gamma_k$ and thereby $F_k(x)$ as above, we update the model (in the for loop):

$$F_{(k)}(x) = F_{(k-1)}(x) + F_k(x) \tag{9}$$

By doing all the above, we obtain the final output:

$$F(x) = F_{(K)}(x) = \sum_{k=0}^{K} F_k(x) \tag{10}$$

====4.5. Experimental Procedure====
To begin with, we collected data inside as well as outside our research laboratory building. We captured the real-time data from both the Wi-Fi module and the magnetic sensor, and stored all of this data on an SD card for further processing. The data extracted from the magnetic sensor consist of an individual reading for each Cartesian axis (x, y and z).
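Since each sample consists of three axis readings, the magnitude defined in Section 4.2.2 can be computed per sample as in the following sketch. The readings are made-up values (in mG) chosen so that the results are easy to check by hand, and the function name `magnitude` is our own choice for illustration.

```python
import math


def magnitude(hx, hy, hz):
    """Euclidean norm of one magnetometer reading (same unit as the inputs)."""
    return math.sqrt(hx * hx + hy * hy + hz * hz)


# Made-up (Hx, Hy, Hz) readings in mG, one tuple per sample.
readings = [
    (3.0, 4.0, 0.0),
    (2.0, 3.0, 6.0),
]
magnitudes = [magnitude(hx, hy, hz) for hx, hy, hz in readings]
print(magnitudes)  # [5.0, 7.0]
```

Using the magnitude collapses the three attributes into one and discards the direction of the field, which is why the raw three-axis readings are usually preferred as classifier inputs.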
In real time, the data from the magnetic sensors can be anomalous because of sudden changes in the magnetic field or erroneous sensor readings. The data stored on the SD cards was transferred to a laptop for further processing, which includes training and testing of the ML models. We then split the target attributes into two outputs. The first output corresponds to the row of the cell in the grid, and the second to its column.

Figure 7: Architecture illustrating the procedures and their flow: first the data collection process, followed by database creation, and then feeding to the ML classifiers for position estimation.

The floor map of the outdoor environment can be found in Fig. 5(a). The values 0–12 on the x-axis of this figure represent individual columns of the grid, while the values 0–6 on the y-axis represent individual rows. Similarly, for the indoor environment, we divided the total area into a 5×9 grid, as can be seen in both Fig. 5(b) and 5(c). The values 0–8 on the x-axis of these two figures represent individual columns of the grid, while 0–4 on the y-axis represent individual rows. We now present our experimental results, comparing the performance of the ML models used in this paper for position estimation.

===5. Experimental Results===
We observed very good testing accuracy in a series of experiments on the collected data set with varying test sizes. Fig. 8 depicts the results obtained after training and testing the classifiers with both the magnetic field data and the Wi-Fi data collected from our embedded test bed. Observe in the same figure that the RF classifier has a very good accuracy of above 95%. Because we are working with a fixed number of samples, the test size increases as the size of the training data set decreases.
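This relationship between test size and training size can be tabulated directly from the size of the outdoor data set (54,600 samples over 7×13 = 91 cells). The sketch below is only arithmetic: the helper name `split_sizes` is hypothetical, the listed test percentages other than 95% are illustrative choices, and the per-class count assumes the training samples are spread evenly over the classes.

```python
TOTAL_SAMPLES = 54_600  # outdoor data set: 91 cells, 600 samples per cell
N_CLASSES = 7 * 13      # one class per grid cell


def split_sizes(total, test_percent):
    """Return (n_train, n_test) for an integer test percentage."""
    n_test = total * test_percent // 100
    return total - n_test, n_test


for test_percent in (20, 50, 80, 95):  # illustrative test sizes
    n_train, n_test = split_sizes(TOTAL_SAMPLES, test_percent)
    per_class = n_train // N_CLASSES   # assumes an even spread over classes
    print(test_percent, n_train, n_test, per_class)
```

At a 95% test size this gives 2730 training samples, i.e. 30 per class under the even-spread assumption.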
Our accuracy results show that we are able to predict with high accuracy and, more importantly, with only a few training samples. The following example gives a quick grasp:

• At a 95% test sample size, the number of samples used for training is 2730 (the leftover 5%).

• These 2730 samples are uniformly distributed across 91 classes, i.e., 30 samples per class.

• Our prediction accuracy (with only 5% of the data set used for training) is above 95%.

Figure 8: Results obtained after training different ML classifiers with different test sizes outdoors.

Figure 9: Results obtained with different ML classifiers indoors.

The prediction accuracy was calculated using the formula

$$\text{Accuracy} = \frac{\sum_{m=1}^{M} \big(\text{Prediction}(m) == \text{Original label}(m)\big)}{\text{Total no. of samples used for testing}},$$

where $m$ indexes the elements of the array of test samples.

Motivated by the outdoor results, we considered a small number of samples at each cell in an indoor environment at two different altitudes. The results shown in Fig. 9 were obtained by applying the ML classifiers to two separate data sets collected at two different altitudes within the room. It is clear that all of the classifiers we used perform very well.

===6. Conclusions and Future scope===
In this paper, we presented a novel approach for estimating the user's position indoors with high accuracy (around 98%). The motivation for using statistical approaches is that deep learning-based models are computationally expensive and require a large amount of memory for real-time deployment. We created a machine learning-based multi-label and multi-output strategy for indoor localization and demonstrated it using our own embedded test bed. The motivation for this work, however, is to predict the position of the user indoors as well. In our experiment, we used industry standard sensors on our test bed to evaluate the performance of ML classifiers with very good resolution (in terms of accuracy) for indoor localization.
Changes in the size of the training data set were also used to evaluate performance. We achieved best accuracies of around 98.6% and 97.6% with the RF and XGB classifiers, respectively, by training the models with 5% of the original data set. Furthermore, we evaluated the effectiveness of our approach for indoor location detection at various altitudes. Our method is simple to implement, and the time required for data collection and prediction is minimal. We mapped a small area within a large building, but given the promising accuracy, this approach can be scaled up to larger infrastructures such as large office buildings and shopping malls, among others.

Note that this paper does not examine which materials, configurations of material, and other properties of buildings cause our observations. Indeed, if we can understand the elements playing a role in our observations, such as those just listed, then we can make predictions about the magnetic field variations that might occur in future, in the buildings we worked in or in others. Such predictions would improve the evaluation and scientific grounding of localization techniques. To begin with, to study possible links between variables, one can investigate using a building information model to predict the magnetic fields inside a building. This approach could help to associate magnetic field variations with specific building elements, for instance steel beams, or heating and cooling systems such as air coolers. In parallel, one can extend the study of the performance of any machine learning model by conducting the same experiments in different environments over large areas, such as a multi-story building constructed of brick and concrete, a warehouse or factory building, or a high-rise building made of steel.
We would like to explore the above avenues in the future, bearing in mind that there is no guarantee that the results shown in this paper would hold for every building, even one with a similar structure.

References

[1] X. Wang, L. Gao, S. Mao, S. Pandey, CSI-based fingerprinting for indoor localization: A deep learning approach, IEEE Transactions on Vehicular Technology 66 (2017) 763–776.
[2] Y. Ma, C. Tian, Y. Jiang, A multitag cooperative localization algorithm based on weighted multidimensional scaling for passive UHF RFID, IEEE Internet of Things Journal 6 (2019) 6548–6555.
[3] P. Sthapit, H.-S. Gang, J.-Y. Pyun, Bluetooth based indoor positioning using machine learning algorithms, in: 2018 IEEE International Conference on Consumer Electronics - Asia (ICCE-Asia), 2018, pp. 206–212.
[4] T. Akiyama, M. Sugimoto, H. Hashizume, Time-of-arrival-based smartphone localization using visible light communication, in: 2017 International Conference on Indoor Positioning and Indoor Navigation (IPIN), 2017, pp. 1–7.
[5] A. Hilal, I. Arai, S. El-Tawab, DataLoc+: A data augmentation technique for machine learning in room-level indoor localization, in: 2021 IEEE Wireless Communications and Networking Conference (WCNC), 2021, pp. 1–7.
[6] B. Siebler, S. Sand, U. D. Hanebeck, Localization with magnetic field distortions and simultaneous magnetometer calibration, IEEE Sensors Journal 21 (2021) 3388–3397.
[7] H. Zou, C.-L. Chen, M. Li, J. Yang, Y. Zhou, L. Xie, C. J. Spanos, Adversarial learning-enabled automatic WiFi indoor radio map construction and adaptation with mobile robot, IEEE Internet of Things Journal 7 (2020) 6946–6954.
[8] J. Torres-Sospedra, R. Montoliu, A. Martínez-Usó, J. P. Avariento, T. J. Arnau, M. Benedito-Bordonau, J. Huerta, UJIIndoorLoc: A new multi-building and multi-floor database for WLAN fingerprint-based indoor localization problems, in: 2014 International Conference on Indoor Positioning and Indoor Navigation (IPIN), 2014, pp. 261–270.
[9] T. Koike-Akino, P. Wang, M. Pajovic, H. Sun, P. V. Orlik, Fingerprinting-based indoor localization with commercial mmWave WiFi: A deep learning approach, IEEE Access 8 (2020) 84879–84892.
[10] J. Zhang, G. Han, N. Sun, L. Shu, Path-loss-based fingerprint localization approach for location-based services in indoor environments, IEEE Access 5 (2017) 13756–13769.
[11] H. K. Rath, S. Timmadasari, B. Panigrahi, A. Simha, Realistic indoor path loss modeling for regular WiFi operations in India, in: 2017 Twenty-third National Conference on Communications (NCC), 2017, pp. 1–6.
[12] H.-S. Kim, W. Seo, K.-R. Baek, Indoor positioning system using magnetic field map navigation and an encoder system, Sensors 17 (2017).
[13] D. Hanley, A. B. Faustino, S. D. Zelman, D. A. Degenhardt, T. Bretl, MagPIE: A dataset for indoor positioning with magnetic anomalies, in: 2017 International Conference on Indoor Positioning and Indoor Navigation, 2017, pp. 1–8.
[14] Y. Zeng, P. H. Pathak, P. Mohapatra, Analyzing shopper’s behavior through WiFi signals, in: Proceedings of the 2nd Workshop on Workshop on Physical Analytics, Association for Computing Machinery, 2015, p. 13–18. doi:10.1145/2753497.2753508.
[15] F. Alam, N. Faulkner, B. Parr, Device-free localization: A review of non-RF techniques for unobtrusive indoor positioning, IEEE Internet of Things Journal 8 (2021) 4228–4249.
[16] A. Gadhgadhi, Y. Hachaichi, H. Zairi, A machine learning based indoor localization, in: 2020 4th International Conference on Advanced Systems and Emergent Technologies (ICASET), 2020, pp. 33–38.
[17] N. Lee, S. Ahn, D. Han, AMID: Accurate magnetic indoor localization using deep learning, Sensors 18 (2018).
[18] A. Poulose, D. Han, Hybrid deep learning model based indoor positioning using Wi-Fi RSSI heat maps for autonomous applications, Electronics 10 (2020).
[19] R. Ayyalasomayajula, A. Arun, C. Wu, S. Sharma, A. R. Sethi, D. Vasisht, D. Bharadia, Deep Learning Based Wireless Localization for Indoor Navigation, 2020.
[20] H. J. Bae, L. Choi, Large-scale indoor positioning using geomagnetic field with deep neural networks, in: ICC 2019 - 2019 IEEE International Conference on Communications (ICC), 2019, pp. 1–6.
[21] G. M. Mendoza-Silva, J. Torres-Sospedra, F. Potortì, A. Moreira, S. Knauth, R. Berkvens, J. Huerta, Beyond Euclidean distance for error measurement in pedestrian indoor location, IEEE Transactions on Instrumentation and Measurement 70 (2021) 1–11.
[22] C. Feng, W. S. A. Au, S. Valaee, Z. Tan, Received-signal-strength-based indoor positioning using compressive sensing, IEEE Transactions on Mobile Computing 11 (2012) 1983–1993.
[23] K. Cengiz, Comprehensive analysis on least-squares lateration for indoor positioning systems, IEEE Internet of Things Journal 8 (2021) 2842–2856.
[24] J. Palacios, G. Bielsa, P. Casari, J. Widmer, Single- and multiple-access point indoor localization for millimeter-wave networks, IEEE Transactions on Wireless Communications 18 (2019) 1927–1942.
[25] R. Shirai, M. Hashimoto, DC magnetic field based 3D localization with single anchor coil, IEEE Sensors Journal 20 (2020) 3902–3913.
[26] D. Hanley, A. S. D. d. Oliveira, X. Zhang, D. H. Kim, Y. Wei, T. Bretl, The impact of height on indoor positioning with magnetic fields, IEEE Transactions on Instrumentation and Measurement 70 (2021) 1–19.
[27] M. T. Hoang, B. Yuen, X. Dong, T. Lu, R. Westendorp, K. Reddy, Recurrent neural networks for accurate RSSI indoor localization, IEEE Internet of Things Journal 6 (2019) 10639–10651.
[28] S. Knauth, M. Storz, H. Dastageeri, A. Koukofikis, N. A. Mähser-Hipp, Fingerprint calibrated centroid and scalar product correlation RSSI positioning in large environments, in: 2015 International Conference on Indoor Positioning and Indoor Navigation (IPIN), 2015, pp. 1–6.
[29] M. Kwak, C. Hamm, S. Park, T. T. Kwon, Magnetic field based indoor localization system: A crowdsourcing approach, in: 2019 International Conference on Indoor Positioning and Indoor Navigation (IPIN), 2019, pp. 1–8.
[30] I. Ashraf, S. Hur, Y. Park, Enhancing performance of magnetic field based indoor localization using magnetic patterns from multiple smartphones, Sensors 20 (2020).