Algorithmic Support for the Detection Characteristics Improving of the Monitoring Object

Elena Chernetsova1[0000-0001-5805-3111] and Anatoly Shishkin2[0000-0003-1992-5663]

1 Russian State Hydrometeorological University, Metallistov Av.,3, St-Peterburg, 195027, Russia
chernetsova@list.ru
2 Russian State Hydrometeorological University, Metallistov Av.,3, St-Peterburg, 195027, Russia
an.dm.shishkin@mail.ru

Abstract. The paper presents two algorithms: an object detection algorithm for coherent reception of signals coming from a monitoring network consisting of several sensors, and an algorithm for detecting an extended object from the analog signals of the sensors of a monitoring network. These algorithms use statistics that take into account the most stable features of the distribution of the source data. They can be implemented in an automated decision support system, which makes the system's decisions on the detection of a monitoring object more reliable.

Keywords: Monitoring, Object, Sensor, Signal, Algorithm, Detection.

1 Introduction

Environmental monitoring requires continuous observations over time, based on a well-thought-out distribution of measuring instruments in space, which calls for a stationary distributed multi-sensor remote monitoring system [1]. The system should work efficiently, preferably in real time. Efficiency also means reducing the time needed to decide on the classification of the observed object. Therefore, it is necessary to automate not only the data collection process but also the classification of the monitoring object, so that the attention of the human operator is drawn only to objects that actually threaten the ecological state of the observed area, while objects that do not threaten the zone of responsibility are weeded out at the stage of automated data processing.
A stationary network of stations included in the monitoring system requires communication channels with a Monitoring Control Point (MCP) [2]. Laying a cable communication network is often unprofitable, so a radio channel or satellite communication has to be used [3]. Since the sensors of the monitoring network are powered by batteries, it is often justified, in order to save energy, not to pre-process the signal on the sensor but to send analog signals to the MCP, which is charged with processing the sensor signals and detecting the monitoring object [4]. Information exchange over the radio channel raises the problem of detecting an analog signal with an unknown law of fluctuations against the background of noise with an unknown distribution [5]. To solve this problem, this paper develops the following algorithms:
• an algorithm for detecting a monitoring object during coherent reception of signals coming from a monitoring network consisting of several sensors;
• an algorithm for detecting an extended object from the analog signals of the sensors of the monitoring network.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

2 Theoretical Analysis

2.1 The algorithm most powerful according to the signal-to-noise ratio criterion for processing spatially distributed data from a monitoring network consisting of several sensors

Let us consider the problem of coherent detection of a signal from an object distributed over $N$ resolution elements, which are the sensors of a monitoring network.
It was shown in [6] that the detector calculates the likelihood ratio

$$l(X) = \frac{1}{\binom{N}{k}} \left(\frac{\sigma_0}{\sigma_1}\right)^{k} \sum_{(k)} \exp\left[\left(\frac{1}{2\sigma_0} - \frac{1}{2\sigma_1}\right) \sum_{n=1}^{k} x_n^2\right] \tag{1}$$

where $x_n$ ($n = 1, 2, \dots, N$) are the detector output envelope samples, and $\sigma_0$ and $\sigma_1$ are the variances of the signals received from the $(N - k)$ sensors that did not fix the object and from the $k$ sensors that fixed the object, respectively.

Equation (1) shows that the detector that is most powerful according to the signal-to-noise ratio criterion requires a rather complex circuit. In addition, its implementation requires a priori information about the parameters of the signal ($\sigma_1$) and the noise ($\sigma_0$), which are, as a rule, unknown in real monitoring conditions. Therefore, rule (1) characterizes the potential for detecting an object and cannot be realized in many practical cases.

It is necessary to develop an algorithm, most powerful by the signal-to-noise criterion, for coherent detection of a signal from a monitoring object against a background of noise interference, provided that the signal and noise parameters, as well as the position of the $k$ sensors that fixed the object among the $N$ sensors of the monitoring network, are a priori unknown. Detection is formulated as the statistical task of testing general linear hypotheses [7-10], and the rule is sought in the class of so-called invariant rules [11].

We use the following premises:
1. There are statistically independent radio pulses sent by $N$ ($N > 1$) sensors. In the absence of the object of observation, these pulses have the same average power. The distribution of the noise background is assumed to be normal.
2. In the presence of an object of observation, the resulting fluctuation in a resolution element is the additive sum of the signal with unknown amplitude $\mu_m$ ($m = 1, 2, \dots, k$) and Gaussian noise with unknown variance $\sigma^2$. Coherent processing is assumed.
Independent voltage samples $x_n$ ($n = 1, 2, \dots, N$) are taken at the output of the linear path of the MCP receiver at time instants spaced by the resolution interval.
3. Processing is carried out over $p$ periods of the signal, so that each resolution element $n$ corresponds to a sample vector $X_n = (x_{n1}, \dots, x_{np})$ with multidimensional normal probability density

$$g(X_n) = \frac{|A|^{1/2}}{(2\pi)^{p/2}} \exp\left\{-\frac{1}{2} \sum_{i=1}^{p} \sum_{j=1}^{p} a_{ij} (x_{ni} - \mu_{ni})(x_{nj} - \mu_{nj})\right\}. \tag{2}$$

The mean values and the covariance matrix of the vector are determined from the expressions $E(x_{ni}) = \mu_{ni}$; $E[(x_{ni} - \mu_{ni})(x_{nj} - \mu_{nj})] = \sigma_{ij}$; $(\sigma_{ij}) = A^{-1}$, where $E$ denotes mathematical expectation, $\mu_n = 0$ for the $(N - k)$ sensors that did not fix the object, and $\mu_n \neq 0$ for the $k$ sensors that fixed it. The matrix $A = (a_{ij})$ is assumed to be common to all $N$ vectors of dimension $p$, but unknown. The task is to determine, from the sample

$$X = \begin{pmatrix} x_{11} & \cdots & x_{1p} \\ \vdots & \ddots & \vdots \\ x_{N1} & \cdots & x_{Np} \end{pmatrix},$$

the presence or absence of a signal indicating the existence of a monitoring object. The matrix $X$ consists of $p$ column vectors $(x_{1i}, \dots, x_{Ni})$, and each such vector has its own mean value vector $\mu_i = (\mu_{1i}, \dots, \mu_{Ni})$. Under the accepted assumptions, the detection task is to test the composite hypotheses $H_0$ and $H_1$ regarding the parameters $\mu_i$ and $A$:

$$H_0: \mu_i = 0; \quad H_1: \mu_i \neq 0; \quad i = 1, 2, \dots, p; \quad A \text{ is unknown}. \tag{3}$$

Testing hypotheses (3) fits into the scheme of testing multidimensional linear hypotheses. As follows from the general theory [12], the principles of invariance and sufficiency allow the sample $X$, when testing hypotheses (3), to be reduced to the maximal invariant statistic

$$T = \sum_{i=1}^{p} \sum_{j=1}^{p} \frac{N \bar{x}_i \bar{x}_j}{\sum_{n=1}^{N} (x_{ni} - \bar{x}_i)(x_{nj} - \bar{x}_j)} \tag{4}$$

and the set of parameters $\mu_i$ and $(a_{ij})$ to the maximal invariant

$$\lambda^2 = N \sum_{i=1}^{p} \sum_{j=1}^{p} a_{ij} \bar{\mu}_i \bar{\mu}_j. \tag{5}$$

In expressions (4) and (5), $\bar{x}_{i(j)} = N^{-1} \sum_{n=1}^{N} x_{ni(j)}$ and $\bar{\mu}_{i(j)} = E(\bar{x}_{i(j)})$.

The numerator of formula (4) has a non-central $\chi^2$ distribution with noncentrality parameter $\lambda^2$ and $p$ degrees of freedom, and the denominator has a central $\chi^2$ distribution with $(N - p)$ degrees of freedom, so the statistic $(N - p)T/p$ has a non-central $F$ distribution with $p$ and $(N - p)$ degrees of freedom and noncentrality parameter $\lambda^2$. In terms of the parameter $\lambda^2$ of the $F$ distribution, the initial hypotheses (3) can now be formulated as follows:

$$H_0: \lambda = 0; \quad H_1: \lambda > 0. \tag{6}$$

Using the method of constructing optimal rules [13], it can be shown that the most powerful invariant criterion for testing hypotheses (6) has a critical region of the form

$$T \ge C. \tag{7}$$

The threshold level $C$ is determined by the given probability of false alarm $\alpha$ from the condition

$$\int_{C}^{\infty} F_{p,(N-p)}(y)\,dy = \alpha, \tag{8}$$

where $F_{p,(N-p)}$ is the density of the central $F$ distribution with $p$ and $(N - p)$ degrees of freedom.

Expressions (4) and (7) determine the functional scheme of the detector when the correlation properties of the vectors $(x_{n1}, \dots, x_{np})$ are completely unknown. For practical implementation, expressions (7) and (8) can be made concrete, for example, for the case of no inter-period correlation. In this case, the detection rule and the parameter $\lambda^2$ take the form

$$\sum_{i=1}^{p} \frac{N \bar{x}_i^2}{\sum_{n=1}^{N} (x_{ni} - \bar{x}_i)^2} \ge C, \tag{9}$$

$$\lambda^2 = N \sum_{i=1}^{p} q_i, \tag{10}$$

where $q_i = \left(\sum_{n=1}^{N} \mu_{ni}\right)^2 / (N^2 \sigma^2)$ is the signal-to-noise ratio averaged over all $N$ sensors for one observation period. The detector efficiency is determined by the power function of rule (7), (8), which shows the dependence of the probability of correct detection on the parameter $\lambda^2$. It can be calculated directly from tables of the non-central $F$ distribution [14].
2.2 The algorithm most powerful according to the signal-to-noise ratio criterion for detecting an extended object by analog signals of monitoring network sensors

To develop an algorithm for classifying an extended object (for example, classifying the observed water surface as clean or polluted by oil emissions) using a distributed multisensor geographic information system, suppose:
• The central post decides whether or not an object (contamination) is detected based on signals received from $N$ sensors under the same observation conditions;
• The resulting radio signal of each sensor is the additive sum of a non-fluctuating signal of unknown amplitude $\mu_i^{(j)} \ge 0$ ($j = 1, 2$ is the number of the object, e.g. clean water surface and dirty water surface; $i = 1, \dots, N$) and Gaussian noise with unknown variance $\sigma_i^2$. At the output of the receiver's linear path, the amplitude samples $x_i^{(j)}$ are taken for the signal of each sensor.
• The objects are observed for some time $T$, during which $n$ readings are taken for the signal of each sensor.

Thus, for each object, the sample space is represented by $n$ sample vectors $x_\nu^{(1)} = (x_{\nu 1}^{(1)}, \dots, x_{\nu N}^{(1)})$; $x_\nu^{(2)} = (x_{\nu 1}^{(2)}, \dots, x_{\nu N}^{(2)})$; $\nu = 1, \dots, n$. The vectors $x_\nu^{(1)}$ and $x_\nu^{(2)}$ have a normal probability density

$$g(x_\nu^{(j)}) = \frac{|A|^{1/2}}{(2\pi)^{N/2}} \exp\left[-\frac{1}{2} \sum_{i=1}^{N} \sum_{k=1}^{N} \alpha_{ik} (x_{\nu i}^{(j)} - \mu_{\nu i}^{(j)})(x_{\nu k}^{(j)} - \mu_{\nu k}^{(j)})\right]. \tag{11}$$

The mean values and the elements of the covariance matrix are determined from the expressions $E(x_{\nu k}^{(j)}) = \mu_{\nu k}^{(j)}$; $E[(x_{\nu i}^{(j)} - \mu_{\nu i}^{(j)})(x_{\nu k}^{(j)} - \mu_{\nu k}^{(j)})] = \sigma_{ik}$; $(\sigma_{ik}) = A^{-1}$, where $E$ denotes mathematical expectation. We assume that the matrix $A = (\alpha_{ik})$ is common to the vectors $x_\nu^{(1)}$ and $x_\nu^{(2)}$, but its elements are unknown.

The classification task is to determine, from the sample $x_\nu^{(1)}, x_\nu^{(2)}$, $\nu = 1, \dots, n$, whether the objects belong to the same class or to different classes. Based on the assumptions made, this problem can be formulated as a choice between two hypotheses: (1) the objects are of the same type; (2) the objects are of different types:

$$H_0: \mu_i^{(1)} = \mu_i^{(2)}; \quad H_1: \mu_i^{(1)} \neq \mu_i^{(2)} \quad \text{for all } i = 1, \dots, N. \tag{12}$$

In expression (12) the parameter $\mu_i^{(j)}$ is a column matrix of dimension $(n \times 1)$ with elements $(\mu_{1i}^{(j)}, \dots, \mu_{ni}^{(j)})$.

As follows from the general theory [15], the principles of invariance and sufficiency allow the sample space $x_\nu^{(1)}, x_\nu^{(2)}$, $\nu = 1, \dots, n$, when testing hypotheses (12), to be reduced to the maximal invariant (MI) statistic

$$T = \sum_{i=1}^{N} \sum_{k=1}^{N} \frac{n (\bar{x}_i^{(1)} - \bar{x}_i^{(2)})(\bar{x}_k^{(1)} - \bar{x}_k^{(2)})}{\sum_{\nu=1}^{n} (x_{\nu i}^{(1)} - \bar{x}_i^{(1)})(x_{\nu k}^{(1)} - \bar{x}_k^{(1)}) + \sum_{\nu=1}^{n} (x_{\nu i}^{(2)} - \bar{x}_i^{(2)})(x_{\nu k}^{(2)} - \bar{x}_k^{(2)})} \tag{13}$$

and the parameter space to the MI

$$\lambda^2 = \frac{n}{2} \sum_{i=1}^{N} \sum_{k=1}^{N} \alpha_{ik} (\bar{\mu}_i^{(1)} - \bar{\mu}_i^{(2)})(\bar{\mu}_k^{(1)} - \bar{\mu}_k^{(2)}). \tag{14}$$

In expressions (13) and (14), $\bar{x}_k^{(j)} = n^{-1} \sum_{\nu=1}^{n} x_{\nu k}^{(j)}$; $\bar{\mu}_k^{(j)} = E(\bar{x}_k^{(j)})$; $k = 1, \dots, N$.

It can be shown that there is a uniformly most powerful (UMP) criterion for testing hypotheses (12), which rejects the hypothesis $H_0$ if

$$T > C, \tag{15}$$

where $C$ is the threshold constant.

The constant $C$ should be determined from the condition that, under the hypothesis $H_0$ ($\lambda^2 = 0$), the probability that condition (15) is fulfilled does not exceed a predetermined significance level $\alpha$. Since the statistic $(\nu_2/\nu_1)\,T$ under the hypothesis $H_0$ has a central $F$ distribution with $\nu_1 = N$ and $\nu_2 = (2n - N - 1)$ degrees of freedom [16], the constant $C$ can be found from the expression

$$\int_{C}^{\infty} F_{\nu_1 \nu_2}(\xi)\,d\xi = \alpha. \tag{16}$$

Rule (15) can be specified for the case when the matrix $A$ is diagonal. In this case, it has the form

$$\sum_{i=1}^{N} \frac{n (\bar{x}_i^{(1)} - \bar{x}_i^{(2)})^2}{\sum_{\nu=1}^{n} (x_{\nu i}^{(1)} - \bar{x}_i^{(1)})^2 + \sum_{\nu=1}^{n} (x_{\nu i}^{(2)} - \bar{x}_i^{(2)})^2} \ge C_1, \tag{17}$$

where $C_1 = CN/(2n - N - 1)$.
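The diagonal-matrix rule (17) can be sketched in Python as follows. This is an illustrative sketch rather than the authors' code: the names `two_object_statistic`, `scaled_threshold` and `objects_differ` are hypothetical, each object is passed as a list of $n$ readings with $N$ sensor values per reading, and the $F$-distribution quantile $C$ from condition (16) is assumed to be supplied by the caller (for example, from a table).

```python
def two_object_statistic(first, second):
    """Left-hand side of rule (17).

    first, second: lists of n readings, each reading a list of N sensor
    values, for the first and second object. Assumes at least one of the
    two objects shows non-zero scatter on every sensor.
    """
    n = len(first)
    n_sensors = len(first[0])
    total = 0.0
    for i in range(n_sensors):
        col1 = [reading[i] for reading in first]
        col2 = [reading[i] for reading in second]
        mean1 = sum(col1) / n                      # x-bar_i^(1)
        mean2 = sum(col2) / n                      # x-bar_i^(2)
        scatter = (sum((x - mean1) ** 2 for x in col1)
                   + sum((x - mean2) ** 2 for x in col2))
        total += n * (mean1 - mean2) ** 2 / scatter
    return total


def scaled_threshold(f_quantile, n, n_sensors):
    """C1 = C * N / (2n - N - 1), with C the caller-supplied F quantile."""
    dof = 2 * n - n_sensors - 1
    if dof <= 0:
        raise ValueError("need 2n - N - 1 > 0 readings-to-sensors margin")
    return f_quantile * n_sensors / dof


def objects_differ(first, second, f_quantile):
    """Decide 'different classes' when the statistic reaches C1."""
    c1 = scaled_threshold(f_quantile, len(first), len(first[0]))
    return two_object_statistic(first, second) >= c1
```

Keeping the quantile as an input avoids tying the sketch to any particular statistics library; only the rescaling to $C_1$ and the comparison are taken from the text.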
From expressions (13) and (17) it can be seen that $\bar{x}_i^{(j)}$ ($j = 1, 2$; $i = 1, \dots, N$) are the maximum likelihood estimates of the parameters $\mu_i^{(j)}$, calculated from the $N$ sensor signals for the first and second objects, and that the denominator is the sum of the estimates of the parameter $\sigma_i^2$ calculated from the signal of the $i$-th sensor for the first and second objects. Thus, to distinguish between objects, it is necessary to estimate the amplitudes of the $N$ sensor signals, calculate the squared distance between the signal parameters of the classified objects for each pair of like-numbered sensors, and sum these distances with weights inversely proportional to the noise variance. The resulting sum is compared with a threshold; if the threshold is exceeded, the objects are declared to belong to different classes.

Expression (15) can also be used to detect a distributed object if we set $x_\nu^{(2)} = 0$; $\nu = 1, \dots, n$. Formula (17) in this case takes the form [17]:

$$\sum_{i=1}^{N} \frac{n \bar{x}_i^2}{\sum_{\nu=1}^{n} (x_{\nu i} - \bar{x}_i)^2} \ge C_2, \quad \text{where } C_2 = CN/(n - N - 1). \tag{18}$$

Considering that under the hypothesis $H_1$ the statistic $T$ has a non-central $F$ distribution with noncentrality parameter $\lambda^2$ and $\nu_1, \nu_2$ degrees of freedom, the probability of correctly distinguishing between objects is determined by the expression [18]:

$$P(T \ge C) = \int_{C}^{\infty} F_{\nu_1 \nu_2}(\xi, \lambda^2)\,d\xi \tag{19}$$

and can be calculated from tables of the non-central $F$ distribution [19].

3 Results

Figure 1 shows the curves characterizing the effectiveness of the detector of oil pollution of the water surface, constructed in accordance with expressions (4), (7), as a function of the resolving power of the network of contact sensors.
The characteristics are calculated for the false alarm probability $\alpha = 10^{-2}$ and the number of received signal periods $p = 2$, provided that the signal-to-noise ratio $q_i$, averaged over all $N$ sensors for one observation period, is independent of resolution (uniform distribution of translational buoys (contact sensors) along the length of the contamination). For comparison, the same figure shows the power function of the potential most powerful (MP) rule of coherent detection of a known signal [20] in the presence of only one sensor ($N = 1$).

It can be seen from Figure 1 that ignorance of the noise and signal levels in the resolution elements leads to losses in the signal-to-noise ratio. However, the detector's efficiency increases with increasing resolution, because a larger $N$ allows a more accurate estimate of the noise and signal levels. For example, at $N = 8$ the loss in the signal-to-noise ratio is about 4 dB, and at $N = 22$ it is less than 1 dB.

Fig. 1. The probability of detecting oil pollution of the water surface by signals received from a network of contact sensors

Figure 2 shows the dependences of the probability $P(N)$ of correctly distinguishing between two objects, calculated by formula (19), for different values of the signal sample size $n$.

Fig. 2. The dependence of the probability of distinguishing objects $P(N)$ on the number of sensors in the monitoring system ($N$) for several values of the signal sample size ($n$)

In this case, the noncentrality parameter $\lambda^2$ of the $F$ distribution was assumed constant, independent of the number $N$ of sensors in the monitoring system. As can be seen from Figure 2, the dependences have an optimum in the probability of distinguishing between objects, and its position depends on the sample size $n$. The presence of an optimum and its position are apparently due to the following reasons.
On the one hand, with an increase in the number of sensors in the monitoring system, the difference between the signals increases, that is, the "distance" between objects in the parameter space increases. Consider the following example. Let the objects have the same area but be distributed differently among the sensors. The amplitude parameters $\mu_i^{(1)}$ and $\mu_i^{(2)}$ of the signals from the first and second objects for $N = 3$ are presented in Table 1. At $N = 1$, the distance in the parameter space between objects A and B is $\left(\sum_{i=1}^{3} \mu_i^{A} - \sum_{i=1}^{3} \mu_i^{B}\right)^2 = (9 - 9)^2 = 0$, and it is not possible to distinguish between them. At the same time, for $N = 3$ we get $\sum_{i=1}^{3} (\mu_i^{A} - \mu_i^{B})^2 = (3-1)^2 + (4-5)^2 + (2-3)^2 = 6$, i.e. the difference in parameters is significant. On the other hand, with a decrease in the number of sensors in the monitoring network, the correlation between the signals of objects of various classes increases. Moreover, the accuracy of parameter estimates can be improved by increasing the accumulation time, i.e., increasing the sample size $n$.

Table 1. The values of the amplitude parameters of the signals from the first and second objects

Object   Parameter    i = 1   i = 2   i = 3   Sum over i
A        $\mu_i^A$    3       4       2       9
B        $\mu_i^B$    1       5       3       9

4 Conclusion

The proposed algorithm, most powerful in the sense of the signal-to-noise ratio criterion, for processing spatially distributed data coming from a monitoring network consisting of several sensors has the following practically important properties: a) it does not depend on the a priori unknown parameters $\sigma^2$ and $\mu_n$ ($n = 1, 2, \dots, N$) and provides a constant probability of false alarm at any noise level; b) it is invariant to the location of the $k$ sensors that recorded the object and the $(N - k)$ sensors that did not, among the $N$ sensors of the monitoring network; c) it has the highest probability of correct detection, which depends on the average signal-to-noise ratio and, for large $N - p$, is close to the potential one.
The proposed algorithm for detecting an extended object by the analog signals of the sensors of the monitoring network can be used to identify objects if, for example, a priori estimates of the parameters of the recognized object are used as $\bar{x}_i^{(2)}$, $i = 1, \dots, N$.

The practical significance of the results lies in the development of analog signal detection algorithms that are resistant to changes in the signal-to-noise ratio in the communication channels between the sensors of the monitoring network and the monitoring and control post. The algorithms can be implemented in software using various programming languages and used to automate the process of classifying monitoring objects at a monitoring and control point.

References

1. Krapivin, Vladimir & Shutko, Anatolij. Information Technologies for Remote Monitoring of the Environment. 1st edn, Springer, Berlin Heidelberg (2012).
2. R. R. Brooks, P. Ramanathan, A. Sayeed, Distributed target tracking and classification in sensor networks, IEEE Signal Processing Magazine, vol. 19, no. 2, pp. 17-29 (2002).
3. Wu and J. Hu: Design and Implementation of Production Environment Monitoring System Based on GPRS-Internet. In 4th International Conference on Genetic and Evolutionary Computing, pp. 818-821. Shenzhen (2010).
4. Gabriel Nallathambi, Jose C. Principe, Theory and Algorithms for Pulse Signal Processing. CoRR abs/1901.01140 (2019).
5. Dan L., Wong K., Hu Y., Sayeed A., Detection, classification and tracking of targets in distributed sensor networks, IEEE Signal Processing Magazine, vol. 19, no. 2, pp. 17-29 (2002).
6. P. Venkatasubramaniam, S. Adireddy, L. Tong, Sensor networks with mobile access: Optimal random access and coding, IEEE J. Sel. Areas Commun. (Special Issue on Sensor Networks), vol. 22, pp. 1058-1068 (2004).
7. Keith A. McNeil, Isadore Newman, Francis J. Kelly, Testing Research Hypotheses with the General Linear Model, 1st edn, SIU Press, Illinois (1996).
8. Lei Song, Hongchang Hu and Xiaosheng Cheng, Hypothesis Testing in Generalized Linear Models with Functional Coefficient Autoregressive Processes, Hindawi Publishing Corporation, Mathematical Problems in Engineering, pp. 2-18 (2012).
9. R. Azrak and G. Mélard, "Asymptotic properties of quasi-maximum likelihood estimators for ARMA models with time-dependent coefficients," Statistical Inference for Stochastic Processes, vol. 9, no. 3, pp. 279–330 (2006).
10. R. A. Maller, "Asymptotics of regressions with stationary and nonstationary residuals," Stochastic Processes and Their Applications, vol. 105, no. 1, pp. 33–67 (2003).
11. Y. Bai, W. K. Fung, and Z. Zhu, "Weighted empirical likelihood for generalized linear models with longitudinal data," Journal of Statistical Planning and Inference, vol. 140, no. 11, pp. 3446–3456 (2010).
12. L. Fahrmeir and H. Kaufmann, "Consistency and asymptotic normality of the maximum likelihood estimator in generalized linear models," The Annals of Statistics, vol. 13, no. 1, pp. 342–368 (1985).
13. F. Carsoule and P. H. Franses, "A note on monitoring time-varying parameters in an autoregression," International Journal for Theoretical and Applied Statistics, vol. 57, no. 1, pp. 51–62 (2003).
14. A. Logothetis, A. Isaksson, On sensor scheduling via information theoretic criteria, in Proc. Amer. Control Conf., San Diego, CA, pp. 2402-2406 (1999).
15. W. A. Fuller, Introduction to Statistical Time Series, 2nd edn, John Wiley & Sons, New York, NY, USA (1996).
16. J. D. Hamilton, Time Series Analysis, 1st edn, Princeton University Press, Princeton, NJ, USA (1994).
17. Sergio Albeverio, Raphael Hoegh-Krohn, Jens Erik Fenstad, and Tom Lindstrom, Nonstandard methods in stochastic analysis and mathematical physics, 1st edn, Academic Press, Orlando (1990).
18. F. Zhao, J. Liu, J. Liu, L. Guibas, and J. Reich, "Collaborative signal and information processing: An information directed approach," Proceedings of the IEEE, vol. 91, no. 8, pp. 1199–1209 (2003).
19. The F Distribution and the F-Ratio, https://openstax.org/books/introductory-statistics/pages/13-2-the-f-distribution-and-the-f-ratio, last accessed 2020/01/21.
20. Data communication systems and their performance: proceedings of the IFIP TC6 Fourth International Conference on Data Communication Systems and Their Performance, Barcelona, Spain (1990).