On recognition of the shift register output function with a random input for a Markov model of an input sequence Sergey Yu. Melnikov, Konstantin E. Samouylov Peoples’ Friendship University of Russia (RUDN University), 6, Miklukho-Maklaya St., Moscow, 117198, Russia Abstract The problem of recognizing the output function of a binary non-autonomous shift register is considered, the input of which receives a sequence of random variables connected in a simple homogeneous stationary Markov chain. The statistics of symbol frequencies in the output register sequence is used. The recognition problem is reduced to the form of integer minimization of the linear form module under linear constraints. The results are compared with the case when the input sequence is Bernoulli. Keywords shift register, random input machine, recognition of output functions 1. Introduction and background In [1], the problem of recognizing the output function of a Moore automaton by the output sequence symbol statistics when the input is a sequence of independent identically distributed random variables was considered. Such a problem is a special case of the general problem of recognition of machines with random input and arises in a number of theoretical and applied problems. We list some of them. The problem of choosing the minimum set of multigrams in the output sequence, the probabilities of which characterize the automaton, leads to the problem of analyzing the algebraic properties of the probability distributions family of such multigrams as functions of the probability distribution at the input [2, 3]. In [4], a review of papers on a similar problem is given: the probabilities of multigrams of some stationary discrete random process are known, and it is required to determine whether this process is a function on a Markov chain. In [5], for a non-autonomous shift register with a random input, an estimate was obtained of the output n-grams size, the set of probabilities of which determines the equivalence class of such an automaton, and estimates of the number of equivalence classes. We can represent any probabilistic automaton [6, 7] in the form of a sequential connection of a “source of probability” that randomly and independently generates the symbols of an Workshop on information technology and scientific computing in the framework of the X International Conference Information and Telecommunication Technologies and Mathematical Modeling of High-Tech Systems (ITTMM-2020), Moscow, Russian, April 13-17, 2020 Envelope-Open melnikov@linfotech.ru (S. Yu. Melnikov); samuylov-ke@rudn.ru (K. E. Samouylov) Orcid 0000-0002-6368-9680 (K. E. Samouylov) © 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) 100 Sergey Yu. Melnikov et al. CEUR Workshop Proceedings 100–107 alphabet with given probabilities and a deterministic finite automaton. In a number of works (see the review [8]), the problem of synthesizing such a “source of probability” was considered, in particular, developing criteria to determine whether the desired value is generated by a combination of Bernoulli random variables with probabilities from a given set. The search for new calculation methods leads to the problem of analyzing a class of func- tions that describe the probabilities of multigrams in the output sequence as functions of real variables [9]. A large number of works are devoted to the problem of discrete devices testing by applying random sequences (tests) to them and analyzing the statistics at the outputs [10, 11], etc.). The problem of determining the malfunction of a device in such a formulation is the task of recognizing automata [12]. This direction is called random or pseudorandom testing. As noted in [13], the relevance of this direction is associated with the high volume of computations necessary to construct deterministic tests for complex discrete devices. A special place is taken by probabilistic compact testing methods, in which malfunctions are detected by comparing some statistical characteristics of the tested and reference device. Such a characteristic may be the frequency of occurrence in the output sequence of a fixed segment of the output symbols. The ability to change the analyzed segment and the parameters of the random sequence generator at the input of the device is a powerful tool in the hands of the researcher. In [Hamlet, 2006], situations are described where the use of deterministic tests is impractical and random tests should be used. The questions of estimating the length of random tests and the probability of detection of malfunctions are considered in [14] and [15]. We note the possibility of using probabilistic testing to detect covert channels in the analyzed equipment [16, 17]. In [18] and [19], the problem of diagnosing a discrete device with random input under conditions when the output sequence of the device is known with distortions is considered. In [20], a binary shift register with a random Bernoulli input is considered, the output function of which is known up to a certain transformation. The output sequence of the register is specified, and the input sequence is known in the variants. A statistical criterion is built to verify that the output sequence could be obtained by this input variant. The criterion is based on counting the frequencies of certain bigrams in the output sequence. The problem considered in [1] can be formulated as follows. What information about the output function of the automaton whose input is a sequence of independent random variables can we obtain if the parameters of the “input” distribution and the relative frequency of occurrence of the symbol in the output sequence are known? How will this information change if the parameters of the “input” distribution are changed? What output functions are indistinguishable? Similar problems can be considered in a more general form, assuming a simple or complex [21] Markov dependence in the input sequence. The main ideas of the developed approach, in principle, can be transferred from the “independent” to the “Markov” case. In fact, a sequence of random variables connected into a finite Markov chain can be considered as a sequence of states of a suitable finite automaton (the transition graph of which is the transition graph of the Markov chain under consideration) that processes a sequence of independent random variables with a properly selected distribution [22]. Let the output function of such an automaton be identical (output = state). Then the machine that processes the Markov chain can be represented as a serial connection of two machines, the input of the first of which receives a sequence of 101 Sergey Yu. Melnikov et al. CEUR Workshop Proceedings 100–107 independent random variables. Thus, the problem reduces to the one considered above, only for a more complex automaton (Figure 1). Figure 1: New automaton design The same can be done if the input sequence is a more general random process — a function on a finite Markov chain. In this case, the output function of the first automaton may not be identical. Certain difficulties in this direction are associated both with an increase in the number of states of a new combined automaton, and therefore with an increase in the dimension of computational problems, and with the requirement of strong connectivity of the serial connection of two automata, the first of which should provide all possible models of randomness in the input sequence. For the case of a binary shift register, we will use the direct representation of probability functions as functions of the Markov chain parameters. A Markov chain with two states is determined by the transition probability matrix and the initial probability distribution, which we assume to be stationary. The transition probability matrix has a size of 2 × 2 , and due tostochasticity it is determined by two real parameters 𝜆 and 𝜉, 0 < 𝜆, 𝜉 < 1. These parameters can be interpreted as the probabilities of transitions “from state 0 to state 1” and “from state 1 tostate 0”, respectively. In [23], for the case when the symbols of the input sequence are connected into a simple homogeneous stationary Markov chain, an expression is obtained for the probability of the symbol “1” in the output sequence of the register. The same work describes the equivalence relation on a set of Boolean functions that occurs when registers with these output functions have the same probability functions. Here we consider the problem of determining the equivalence class of the output function from symbol statistics, reduce it to the form of the discrete optimization problem, and compare the Bernoulli and Markov cases in terms of obtaining information for recognizing the output function and dimensions of discrete optimization problems. 2. Statement of the problem Let 𝑉𝑛 be the space of 𝑛-dimensional binary vectors, 𝐹𝑛 be the set of Boolean functions of 𝑛 arguments, 𝑛 = 1, 2, …. For 𝑓 (𝑥1 , 𝑥2 , … , 𝑥𝑛 ) ∈ 𝐹𝑛 by 𝐴𝑓 = (𝑋 = {0, 1}, 𝑉𝑛 , 𝑌 = {0, 1}, ℎ, 𝑓) we denote the Moore automaton with set of states 𝑉𝑛 , the transition function ℎ, determined by 102 Sergey Yu. Melnikov et al. CEUR Workshop Proceedings 100–107 the rule ℎ ((𝑎1 , … , 𝑎𝑛 ), 𝑥) = (𝑎2 , … , 𝑎𝑛 , 𝑥), where 𝑥, 𝑎𝑖 ∈ {0, 1}, 𝑖 = 1, 2, … , 𝑛, and output function 𝑓 (𝑥1 , 𝑥2 , … , 𝑥𝑛 ). The automaton 𝐴𝑓 is a shift register with a size of 𝑛. Suppose that the automaton 𝐴𝑓 , 𝑓 ∈ 𝐹𝑛 , 𝑛 = 1, 2, … receives a sequence of binary random variables 𝑥 (𝑖) , 𝑖 = 1, 2, …, connected in a simple homogeneous stationary Markov chain with the transition probability matrix 1−𝜆 𝜆 ( ), 0 < 𝜆, 𝜉 < 1. (1) 𝜉 1−𝜉 Our task is to determine the accuracy with which the output function 𝑓 (𝑥1 , 𝑥2 , … , 𝑥𝑛 ) can be recognized by the relative frequency of the symbol “1” in the output sequence of automaton 𝐴𝑓 and propose a recognition method. 3. Recognition of the equivalence class by the value of the probability function In [23], the definition of the Markov weight structure of a Boolean function 𝑓 ∈ 𝐹𝑛 was introduced as an integer vector 𝑚(𝑓 ) consisting of sums of values 𝑓 (𝑥1 , 𝑥2 , … , 𝑥𝑛 ) on special subsets 𝑉𝑛 formed by vectors with the same bigram markings. The vector 𝑚(𝑓 ) of the Markov weight structure defines the class of statistical equivalence of the function 𝑓. It is easy to see that the dimension 𝑑(𝑛) of the vector 𝑚(𝑓 ) grows quadratically with 𝑛. Direct calculation of the number of different bigram markings of vectors from 𝑉𝑛 leads to the following result. Theorem 1. If 𝑛 ⩾ 2: 3 2 𝑛 − 𝑛 + 2, if 𝑛 is even, 𝑑(𝑛) = { 43 (2) 4 (𝑛 − 1)2 − 𝑛 + 2, if 𝑛 is odd. Consider the set 𝐴 = {𝑚(𝑓 ), 𝑓 ∈ 𝐹𝑛 } of vectors formed by the vectors of all possible Markov weight structures as the set of integer points in the 𝑑(𝑛)-dimensional Euclidean space 𝑅𝑑(𝑛) . Using the result of the Lemma of [23], it is easy to see that 𝐴 = 𝐴1 × 𝐴2 × 𝐴3 , where (1) (1) 𝑛−𝑗−1 𝑗−1 (1) 𝐴1 = {(𝑎𝑖𝑗 , (𝑖, 𝑗) ∈ 𝐼1 ) |0 ≤ 𝑎𝑖𝑗 ≤ ( )( ), 𝑎𝑖𝑗 – integer} , 𝑖 𝑖−1 (2) (2) 𝑛−𝑗−1 𝑗−1 (2) 𝐴2 = {(𝑎𝑖𝑗 , (𝑖, 𝑗) ∈ 𝐼2 ) |0 ≤ 𝑎𝑖𝑗 ≤ 2 ( )( ), 𝑎𝑖𝑗 – integer} , (3) 𝑖−1 𝑖−1 (3) (3) 𝑛−𝑗−1 𝑗−1 (3) 𝐴3 = {(𝑎𝑖𝑗 , (𝑖, 𝑗) ∈ 𝐼3 ) |0 ≤ 𝑎𝑖𝑗 ≤ ( )( ), 𝑎𝑖𝑗 – integer} . 𝑖−1 𝑖 Thus, 𝐴 is the set of integer points of a 𝑑(𝑛)-dimensional parallelepiped. Let 𝑀𝑛 be a family 𝑇 of real functions defined on a square 0 < 𝜆, 𝜉 < 1 of the form 𝑅(𝜆, 𝜉 ) (𝑎 − 𝑏) , where 𝑎, 𝑏 ∈ 𝐴, 𝑎 ≠ 𝑏, the vector 𝑅(𝜆, 𝜉 ) is defined in [23], the coordinates of this vector are fractional rational functions of 𝜆 and 𝜉. Let Ω𝑛 be the set of all those points of the square at which at least one 103 Sergey Yu. Melnikov et al. CEUR Workshop Proceedings 100–107 function in the family 𝑀𝑛 is equal to zero. It is easy to see that Ω𝑛 is the union of a finite number of smooth curves in a unit square. In particular, it can be shown that both diagonals of the square, {(𝜆, 𝜉 )|𝜆 + 𝜉 = 1} and {(𝜆, 𝜉 )|𝜆 = 𝜉} belong to the set Ω𝑛 , if 𝑛 ⩾ 3. The set Ω𝑛 has Lebesgue measure zero. Theorem 2. For (𝜆, 𝜉 ) ∈ {(0, 1) × (0, 1)} \Ω𝑛 , by the value of 𝑃𝑓 (𝜆, 𝜉 ), we can unambiguously indicate the class of statistical equivalence of the function 𝑓. This equivalence class corresponds to the vector of the Markov weight structure 𝜇0 , which is the only solution of the equation 𝑃𝑓 (𝜆, 𝜉 ) = 𝑅(𝜆, 𝜉 ) (𝜇0 )𝑇 under restriction 𝜇0 ∈ 𝐴. (4) If (𝜆, 𝜉 ) ∈ Ω𝑛 , then this equation has more than one solution, and the vector of the Markov weight structure of 𝑓 is contained among these solutions. 4. Recognition of the equivalence class using the symbols statistics Let 𝑦 (𝑖) = 𝑓 (𝑥 (𝑖) , 𝑥 (𝑖+1) , … , 𝑥 (𝑖+𝑛−1) ), 𝑖 = 1, 2, … be the output sequence of the automaton 𝐴𝑓 in the scheme described above, 𝑁 1 𝑌𝑁 = ∑ 𝑦 (𝑖) , 𝑁 = 1, 2, … . (5) 𝑁 𝑖=1 The binary sequence 𝑦 (𝑖) is stationary, therefore, the law of large numbers is valid [24]: 𝑌𝑁 → 𝑃𝑓 (𝜆, 𝜉 ) (probability convergence). (6) Theorem 3. Let (𝜆, 𝜉 ) ∈ {(0, 1) × (0, 1)} \Ω𝑛 . The probability that the minimization problem |𝑌𝑁 − 𝑅(𝜆, 𝜉 )𝜇𝑇 | → min { (7) 𝜇∈𝐴 has a unique solution, tends to 1 with 𝑁 → ∞. This unique solution corresponds the class of statistical equivalence of output function 𝑓. For the case (𝜆, 𝜉 ) ∈ Ω𝑛 , we can only say that the vector of the weighted Markov structure of the true output function 𝑓 is contained among the solutions of problem (7). 5. Conclusion Comparing the results obtained for the case when the input sequence is a Markov chain with the results obtained for the case of a Bernoulli sequence, the following can be noted. 1. Probabilistic functions in both cases are determined by the weights (sums of values) of the Boolean function of the outputs on special subsets of space 𝑉𝑛 . These subsets consist of vectors that have the same probability in a particular model (in the independent case, vectors of the same weight, in Markov, vectors with the same frequency of occurrence of 104 Sergey Yu. Melnikov et al. CEUR Workshop Proceedings 100–107 bigrams). In the first case, the probability function is a polynomial in the parameter 𝑝 of the Bernoulli scheme; in the second, it is a fractional rational function in the parameters 𝜆 and 𝜉, that determine the Markov chain. 2. The relations of statistical equivalence on 𝐹𝑛 are introduced on the basis of the identity of probability functions; from the statistical equivalence of two Boolean functions with a Markov input dependence follows their statistical equivalence with a Bernoulli input. The equivalence of two functions is equivalent to the equality of their weights on the mentioned subsets 𝑉𝑛 . In the independent case the number of such subsets is 𝑛 + 1, in the Markov case the number of such subsets is 34 𝑛2 𝑂(𝑛). The number of equivalence classes 2 for the Bernoulli case is equal to exp ( 𝑛2 + 𝑂(𝑛 ln 𝑛)), for the Markov case is equal to 3 exp ( 5𝑛4 ln 𝑛 + 𝑂(𝑛3 )). 3. In the set of all kinds of distributions (the parameter 𝑝 of the Bernoulli distribution belongs to the interval (0, 1), the parameters (𝜆, 𝜉 ) of the transition matrix of the Markov chain belong to the square (0, 1)×(0, 1)), a subset Ω of the “special” distributions is distinguished. For all distributions not from the set Ω, the value of the probability function allows us to uniquely indicate the equivalence class of the output function. If the distribution belongs Ω, then there are at least two nonequivalent output functions for which the probability values of “1” in the output sequence coincide. The Lebesgue measure of the set Ω of “special distributions” in the distribution space is zero: in the first case it turns out to be a finite set of points of the segment, in the second – the set of points of a finite number of smooth curves inside a square. 4. The problem of determining the equivalence class of a function from the value of a probability function reduces to the problem of solving the equation in integers under linear constraints; the problem of determining the equivalence class of a function from the statistic value of a sample average output sequence is reduced to the problem of minimizing a linear form module under linear constraints. The number of unknowns in the Bernoulli case is equal 𝑛 + 1, in the case of Markov dependence it is approximately equal 34 𝑛2 − 𝑛. Thus, analyzing the problem of recognizing an unknown function of outputs by symbol statistics, we can say that the proposed approach, while complicating the probabilistic nature of the input sequence (switching from the Bernoulli scheme to the Markov chain), leads, on the one hand, to the possibility of obtaining more detailed information about the recognizable function; on the other hand, the dimension of discrete optimization problems increases. Acknowledgments The publication has been prepared with the support of the “RUDN University Program 5- 100”. The reported study was funded by RFBR, project numbers 18-00-01555 (18-00-01685), 19-07-00933. 105 Sergey Yu. Melnikov et al. CEUR Workshop Proceedings 100–107 References [1] S. Y. Melnikov, K. E. Samouylov, The recognition of the output function of a finite automaton with random input, in: V. M. Vishnevskiy, D. V. Kozyrev (Eds.), Distributed Computer and Communication Networks, volume 919, Springer International Publishing, Cham, 2018, pp. 525–531. doi:1 0 . 1 0 0 7 / 9 7 8 - 3 - 3 1 9 - 9 9 4 4 7 - 5 _ 4 5 . [2] A. S. Barashko, O range i statisticheskom otobrazhenii sil’nosvyaznogo avtomata, Kiber- netika (1987) 56–60. [3] H. Ito, S. Amari, K. Kobayashi, Identifiability of hidden markov information sources and their minimum degrees of freedom, IEEE Transactions on Information Theory 38 (1992) 324–333. doi:1 0 . 1 1 0 9 / 1 8 . 1 1 9 6 9 0 . [4] M. Vidyasagar, The complete realization problem for hidden markov models: a survey and some new results, Math. Control Signals Syst. (2011) 1–65. doi:1 0 . 1 0 0 7 / s 0 0 4 9 8 - 0 1 1 - 0 0 6 6 - 7 . [5] M. I. Rozhkov, Algoritmicheskie voprosy identifikacii konechnyh avtomatov po rasprede- leniyu vyhodnyh m-gramm, dis. … d.t.n., 05.13.19 - metody i sistemy zashchity informacii. informacionnaya bezopasnost, Moskva, MGIEM, 2012. [6] R. G. Buharaev, Osnovy teorii veroyatnostnyh avtomatov, Nauka, Moskva, 1985. [7] A. Paz, Introduction to probabilistic automata, Academic Press, London, 1971. [8] R. M. Kolpakov, Diskretnye preobrazovaniya veroyatnostnyh raspredelenij, in: Sovre- mennye problemy matematiki i mekhaniki, volume III. Matematika, Izd-vo Moskovskogo universiteta, Moskva, 2009, pp. 35–50. [9] A. V. Ryabinin, Stohasticheskie funkcii konechnyh avtomatov, in: Algebra, logika i teoriya chisel, 7 temat. konf. mekh.-mat. fak. MGU, Moskva, 1986, pp. 77–80. [10] D. Y. Golembiovskij, Diagnostika cifrovyh ustrojstv. Mikroprogrammnoe i veroyatnostnoe testirovanie, Politekhn. in-t, Saratov, 1993. [11] C. N. Hadjicostis, Probabilistic detection of fsm single state-transition faults based on state occupancy measurements, IEEE Transactions on Automatic Control 50 (2005) 2078–2083. doi:1 0 . 1 1 0 9 / T A C . 2 0 0 5 . 8 6 0 2 7 0 . [12] N. G. Kushik, Metody sinteza ustanovochnyh i razlichayushchih eksperimentov s nede- terminirovannymi avtomatami, dis. … k.f.-m.n., 05.13.01 - sistemnyj analiz, upravlenie, obrabotka informacii, Tomsk, 2013. [13] A. S. Barashko, Y. A. Skobcov, D. V. Speranskij, Modelirovanie i testirovanie diskretnyh ustrojstv, Naukova dumka, Kiev, 1992. [14] A. S. Barashko, N. V. Cherevko, Nekotorye sposoby ocenki dliny sluchajnogo testa, in: Teoriya i modelirovanie upravlyayushchih sistem, Naukova dumka, Kiev, 1989, pp. 10–15. [15] G. V. Petruhnova, Ocenka dliny sluchajnoj posledovatel’nosti dlya operacii vosproizve- deniya informacii v kontrol’nom ispytanii, in: O. Y. Kravca (Ed.), Sovremennye problemy informatizacii v sistemah modelirovaniya, programmirovaniya i telekommunikaciyah, volume 9, Izd. ”Nauchnaya kniga”, Voronezh, 2004, pp. 303–305. [16] E. E. Timonina, Analiz ugroz skrytyh kanalov i metody postroeniya garantirovanno zashchishchennyh raspredelennyh avtomatizirovannyh sistem, dis. … d.t.n. 05.13.17 - teoreticheskie osnovy informatiki, Moskva, RGGU, 2004. [17] A. A. Grusho, N. A. Grusho, E. E. Timonina, Vklyuchenie novyh zapretov v sluchajnye posledovatel’nosti, Inform. i ee primen. 8 (2014) 46–52. 106 Sergey Yu. Melnikov et al. CEUR Workshop Proceedings 100–107 [18] E. Athanasopoulou, C. N. Hadjicostis, Probabilistic approaches to fault detection in networked discrete event systems, IEEE Transactions on Neural Networks 16 (2005) 1042–1052. doi:1 0 . 1 1 0 9 / T N N . 2 0 0 5 . 8 5 3 4 3 0 . [19] E. Athanasopoulou, L. Li, C. N. Hadjicostis, Probabilistic failure diagnosis in finite state machines under unreliable observations, in: in Proceedings of the 8th International Workshop on Discrete Event Systems - WODES’06, Ann Arbor, MI., 2006, pp. 301–306. doi:1 0 . 1 1 0 9 / W O D E S . 2 0 0 6 . 1 6 7 8 4 4 6 . [20] B. A. Sevastyanov, The conditional distribution of the output of an automaton without memory for given characteristics of the input, Discrete Mathematics and Applications 4 (1994) 1–6. doi:1 0 . 1 5 1 5 / d m a . 1 9 9 4 . 4 . 1 . 1 . [21] A. S. Barashko, O raspoznavanii avtomatov s ispol’zovaniem generatorov zavisimyh sluchajnyh signalov, in: Naukovi praci Donec’kogo nacional’nogo tekhnichnogo univer- sitetu, Problemi modelyuvannya ta avtomatizacii proettuvannya (MAP-2002), DonNTU, Donec’k, 2002, pp. 61–71. [22] A. S. Davis, Markov chains as random input automata, The American Mathematical Monthly 68 (1961) 264–267. [23] S. Y. Mel’nikov, K. E. Samujlov, Veroyatnostnye funkcii i statisticheskaya ekvivalentnost’ dvoichnyh registrov sdviga so sluchajnym markovskim vhodom, in: Informacionno- telekommunikacionnye tekhnologii i matematicheskoe modelirovanie vysokotekhno- logichnyh sistem : materialy Vserossijskoj konferencii s mezhdunarodnym uchastiem. Moskva, RUDN, 13–17 aprelya 2020 g., 2020, pp. 49–54. [24] V. S. Korolyuk, N. I. Portenko, A. V. Skorohod, A. F. Turbin, Spravochnik po teorii veroyat- nostej i matematicheskoj statistike, Nauka, Moskva, 1985. 107