<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Application of Genetic Algorithm in Problems of Approximation of Complex Multidimensional Dependencies and Identification of Parameters of Theoretical Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dmitry Glukhov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hlukh</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>r Luky</string-name>
          <email>3a.lukyanov@psu.by</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Polotsk State University</institution>
          ,
          <addr-line>Blokhin str., 29, Novopolotsk, 211440</addr-line>
          ,
          <country>Republic of Belarus</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The article proposes a method for constructing an analytical approximation of n-dimensional data, based on the use of a genetic algorithm. A feature of the method is that the encoding of the search space is performed in the form of a parsing tree for an algebraic expression by the parser of the context-free grammar of the class LR (1). In addition, during the evolutionary process, in addition to the use of structure mutations (subject to their positive influence), the stage of mutation of the coefficients is performed, which allows avoiding the target function falling into local extremum. And also at each step of the evolutionary process, there is a stage for searching for an extremum in the space of coefficients and a stage for simplifying the analytical model.</p>
      </abstract>
      <kwd-group>
        <kwd>genetic algorithm</kwd>
        <kwd>approximation</kwd>
        <kwd>theoretical models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>By the complexity of a multidimensional dependency, we mean the uncertainty
regarding the functional dependencies present in the data between input variables and
the output value of the function. Such tasks include problems of identifying
parameters of analytical models of approximation of multidimensional dependencies,
identifying parameters of theoretical probability distributions, identifying parameters of
trend models in forecasting.</p>
      <p>Data approximation in the absence of a priori information about the real model
requires the use of universal approximators, such as artificial neural networks, fuzzy
logical approximators. However, the analytical representation of the model is the most
compact and most accurate. Approximation methods based on the identification of
parameters of some theoretical model solve the problem using an immutable model
adopted from any assumptions of reasonableness.</p>
      <p>We are making an attempt to use the genetic algorithm to find the most accurate
analytical model of the approximator. The genetic algorithm performs not only the
identification of parameters of a theoretical model, but also the search for the model
itself in the space of possible modifications of the structure. The purposeful
movement of the population of models to models providing a qualitative approximation is
carried out due to the law of evolutionary selection according to the criterion of
minimizing the standard deviation.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Implementation of genetic algorithm</title>
      <p>A genetic algorithm (GA) is a heuristic search algorithm used to solve optimization
and simulation problems by sequentially selecting, combining, and varying the
desired parameters using mechanisms that resemble biological evolution.</p>
      <p>
        Previously, we made attempts to solve this problem on multi-agent systems, which
for analytical models in the space of 3 variables showed results superior to results of
artificial neural networks, but on a limited variety of models [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ]. The main
disadvantage of such approximators is a limited set of structure transformations
formulated in the form of local heuristics. Heuristics are obvious ways for a human expert
to change the structure. Accordingly, modifications that are not obvious to the expert
become impossible. In our opinion, the use of GA removes this restriction.
      </p>
      <p>
        In addition, we have built approximators based on fuzzy logic (FLA). Fuzzy logic
systems proposed by Lotfi Zadeh [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] were developed in works [
        <xref ref-type="bibr" rid="ref9">9, 10</xref>
        ]. If we consider
the unknown parameter as continuous, then in this case we can draw a parallel
between the conclusion about the value of the unknown parameter and the
approximation of the function and talk about the property of a fuzzy system to act as a linguistic
approximator. The advantage of such approximators is their speed, and the main
disadvantage – limited opportunities to improve approximation accuracy.
      </p>
      <p>The context-free grammar Gexp of the LR(1) class of algebraic expressions
proposed by us is presented below in the Beckus-Naur form:
formula: exp {ANode* o = (ANode*)$1;
createdNodes.push_back(o);}
| formula ';' exp
Nodes.push_back(o);}
;
exp : NUM {ANodeNUM* o = new ANodeNUM($1); $$ =
(void*) o;}
| '(' exp ')' {$$ = $2;}
| exp '+' exp {ANode* o = (ANode*) new
ANodeOp((ANode*)$1, (ANode*)$3, PLUS); $$ = (void*) o;}
| exp '-' exp {ANode* o = (ANode*) new
ANodeOp((ANode*)$1, (ANode*)$3, MINUS); $$ = (void*) o;}
| exp '/' exp {ANode* o = (ANode*) new
ANodeOp((ANode*)$1, (ANode*)$3, DIV); $$ = (void*) o;}
{ANode* o = (ANode*)$3;
created| exp '*' exp {ANode* o = (ANode*) new
ANodeOp((ANode*)$1, (ANode*)$3, MUL); $$ = (void*) o;}
| 'p' 'o' 'w' '(' exp ',' exp ')' {ANode* o = (ANode*)
new ANodeOp((ANode*)$5, (ANode*)$7, POW);</p>
      <p>$$ = (void*) o;}
| 's' 'i' 'n' '(' exp ')' {ANode* o = (ANode*) new
ANodeFunc((ANode*)$5, SIN); $$ = (void*) o;}
| 'c' 'o' 's' '(' exp ')' {ANode* o = (ANode*) new
ANodeFunc((ANode*)$5, COS); $$ = (void*) o;}
| 'l' 'o' 'g' '(' exp ')' {ANode* o = (ANode*) new
ANodeFunc((ANode*)$5, LOG); $$ = (void*) o;}
| 'e' 'x' 'p' '(' exp ')' {ANode* o = (ANode*) new
ANodeFunc((ANode*)$5, EXP); $$ = (void*) o;}
| 't' 'a' 'n' '(' exp ')' {ANode* o = (ANode*) new
ANodeFunc((ANode*)$5, TAN); $$ = (void*) o;}
| 'c' 't' 'a' 'n' '(' exp ')' {ANode* o = (ANode*) new
ANodeFunc((ANode*)$6, CTAN); $$ = (void*) o;}
| NAME {ANode* o = (ANode*) new
ANodeNAME(string((char*)$1)); $$ = (void*) o; delete $1;}
;</p>
    </sec>
    <sec id="sec-3">
      <title>3 Parsing process</title>
      <p>In the process of syntactic parsing, as we can see, the right-side parsing tree is built in
the form of an object-oriented structure. In order to be able to compile an algebraic
expression into a tree of objects nested into each other, we have developed a class
systempresented in the class diagram(Figure 1).</p>
      <p>As a result of syntactic parsing the initialization file, an initial population of
analytical n-dimensional models is created. Initial variations of the models are formulated
byan expert in the formal language of the Gexp grammar.</p>
      <p>Objects of the base class ANode and derived classes ANodeNUM, ANodeNAME,
ANodeOp, ANodeFunc, ANodeProxy attribute semantic stack nodes and form a tree
of right-side parsing of an algebraic expression.</p>
      <p>The Genetic class implements the main stages of the GA. Namely:
1. initialization() - initialization of the population with expert-defined algebraic
expressions for the case of n-variables;
2. calcAllErrors() - calculating of the approximation error for each individual in its
current state;
3. coefficientMutation() – the process of random cloning of individuals with mutation
of coefficients to ensure the possibilityof exit from local extremes;
4. structureMutation() – the process of cloning with a structure mutation, subject to
the positive effect of the mutation;
5. fitCoefficiets() – the process of searching for the extremum of the target function
in the coefficient space. In the implementation of this process, we have chosen the
gradient descent algorithm, but using coordinate relaxation. This algorithm is not
fast, but it is reliable enough;
6. simlifyStructure() – simplifies the structure of an expression in specific cases.
Simplification of the structure leads to the removal and replacement of semantic tree
nodes (the ANodeProxy class is provided for the simplification procedure) and is
provided in the following cases:
 x * 0 | 0 * x  0;
 x0  1; x1  x;
 x * 1 | 1 * x  x;
Simplification is triggered when the algorithm detects the convergence of the
coefficient to 0 or 1.</p>
      <p>The block diagram of the evolutionary search algorithm is shown in Figure 2.
As the objective function we have chosen the standard deviation between the
theoretical and calculated values of the function (SD). In the m-dimensional space of the
coefficients of analytical models, the SD surface may have local extremes due to the
limited data area and the complexity of approximated dependencies. However, the
nature of the surface allows us to suggest the possibility of using the gradient descent
algorithm to search for a local or global extremum. The nature of the SD surface in
the space of two coefficients a, b of a function of one variable is illustrated
in Figure 3.</p>
      <p>
        In a number of works devoted to this topic, it is proposed to search for coefficients
that minimize SD as a result of mutation of coefficients and to use the Nelder-Mead
algorithm to search for a local extremum. [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6, 11</xref>
        ]
      </p>
      <p>Given that we are trying to minimize the standard error of the deviation of some
analytical function from a discrete data set, we conducted a comparative analysis of
the applicability of various numerical optimization methods and provide the user of
the system with a choice of a specific method. Our approach is based on two points:
1. The mutation of coefficients is necessary for an abrupt transition to a new
extremum search point, which allows the algorithm to run several extremum search
strategies from different areas of the n-dimensional search space, and, therefore, to
avoid getting into local extremes.
2. Application of the most qualitative methods for finding the extremum depending
on the specifics of the problem (gradient descent method, simplex method, etc.) for
identification of coefficients with a given accuracy.
In our numerical experiments, the best convergence rate was shown by the coordinate
descent method, in which the step is performed along the coordinate giving the
maximum gain. However, the search and implementation of more efficient
optimization algorithms remains the subject of further research.</p>
      <p>It is important to note that we have proposed a new method for screening out
equivalent formulas. Since the same formula, due to the commutativity of operations
(+, -, /, *) and the distributivity of the corresponding pairs of operations, as well as the
possibility of arbitrary arrangement of brackets, can be written in many different
ways, we proposed to form numerical hashes of formulas using special codes of
variables, coefficients and functions and a mirror - modified algebra (+ instead of
*,instead of /, * instead of +, / instead of -).</p>
      <p>An example of the operation of this hash function is shown in the Table 2:</p>
      <p>An example of the gradient descent method for 7 population individuals with an
accuracy of 0.001 is shown in Table 1. The example also shows that we have
developed a mechanism for processing results in the form of non-numbers and infinities, as
large SD values. This approach allows you to continue searching without the
occurrence of exceptional overflow situations and divisions by zero.
5</p>
    </sec>
    <sec id="sec-4">
      <title>Algorithm application example</title>
      <p>As an example of the application of the GA we constructed for solving an applied
problem, we consider the problem of accounting for transients in stationary
nonisothermal models of gas transportation.</p>
      <p>Formulation of the problem: when the compressor station is switched on, gas
heating begins and, despite the fact that the heated gas has not yet spread through the
pipeline, the stationary model describes the state of the pipeline that will occur after
the transition to steady state. Thus, significant sections of the pipeline in the model
are found to have an overvalued gas temperature and, given the strong dependence of
gas density on temperature, with an underestimated value of the gas reserve.</p>
      <p>We calculated the correlation coefficient, which for the considered areas with
temperature increase showed a high negative correlation, so for the 4 graphs given for
example in Figure 4, the correlation coefficient was Correl(X, Y) = { -0,84369,
0,88839, -0,88509, -0,85517}.
As the authors of the software package for calculating the gas reserve on the main gas
pipeline of OJSC "Gazprom transgaz Belarus», we attempted to eliminate the
influence of such transients by introducing the inertia of the temperature change at the
compressor output during the time of system transition to the steady state.</p>
      <p>We used GA to find an analytical relationship between the value of the transition
time y and the current value of the gas flow Q, the ground temperature Tg, and the
temperature difference at the output of the compressor station dT. The task of the
approximator is to find such an analytical dependence that has a minimum SD for
describing y(Q, Tg, dT).</p>
      <p>Comparative results of GA operation are given in Table 3.</p>
      <p>Approximation
method
GA</p>
      <p>Model
1.66597 * pow(1.7573, 1.1225 *
log(3.21177 * pow(dT * 4.23434, 1.1878))) /
pow(log(0.0858575 * Q), 0.0742341)
4.15915 * dT
52 - 14.48 * pow(dT, 0.08961) / pow(Q,
0.1755)
102.607 * pow(dT, -0.369501)
log(0.265488 * Q)
The model is described in our work [10]
SD
1.078879
1.330954
/ 1.200513</p>
      <p>The solution found by GA exceeded in accuracy the fuzzy logical approximator
with the selected optimal smoothing settings, while giving the most compact
representation of the approximator.</p>
      <p>The dynamics of improving the approximation quality can be demonstrated
by the graph of positive mutations shown in Figure 5.</p>
      <p>One of the parameters of the proposed GA is the limit size of the formula, which
allows obtaining a wide variety of compact variants of approximators.</p>
      <p>Developed GA has the possibility of flexible adjustment of limit thresholds of
population size, percentage and death period of unpromising individuals, the
probability of a period and type of mutations of the structure and coefficients, formulations of
model simplification tools, algorithms of search of extremum. In particular, to solve
the problem of accounting for transients in stationary non-isothermal models of gas
transportation, we are interested not only in time, but also in the nature of temperature
inertia, which requires collecting and processing of additional information.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>This article proposes a new method for constructing an analytical approximation of
complex n-dimensional dependencies based on the use of the genetic algorithm.</p>
      <p>The developed GA has the following features:
1. Search space encoding is performed as a parsing tree of an algebraic expression by
the parser of the context-free grammar of the class LR(1);
2. In the course of the evolutionary process, in addition to the use of structure
mutations (provided their positive effect), the stage of coefficients mutation is
performed, which allows avoiding the target function falling into local extremes;
3. At each step of the evolutionary process there is a stage of searching for an
extremum in the coefficient space and a stage of simplifying the analytical model.</p>
      <p>The developed GA helped us solve the problem of accounting for transients in
stationary non-isothermal gas transportation models. The resulting models were
implemented as part of the software package for calculating the gas reserve on the main gas
pipeline of OJSC "Gazprom transgaz Belarus», which reduced the impact of
transients on the gas imbalance.
10. Glukhov D.О., Glukhova T.M., Kundas S.P.: Soft calculations for organization of computer
representation of nomograms on the example of calculating the limit creep coefficient.
Vestnik Polotskogo gosudarstvennogo universiteta. – Series C: Basic Sciences, №3. pp. 2–6
(2010).
11. Zvonkov V.B., Popov A.M.: Comparative study of classical optimization methods and
genetic algorithms. Vestnik Sibirskogo gosudarstvennogo aehrokosmicheskogo universiteta
im. Akademika M.F. Reshetneva, pp. 23–27. Izdatelstvo: Federalnoe gosudarstvennoe
byudzhetnoe obrazovatelnoe uchrezhdenie vysshego obrazovaniya «Sibirskij
gosudarstvennyj universitet nauki i texnologij imeni akademika M.F. Reshetneva», Krasnoyarsk (2013).
ISSN: 1816-9724</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Glukhov</surname>
            <given-names>A.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trofimov</surname>
            <given-names>V</given-names>
          </string-name>
          .V.
          <article-title>: Multi-agent structures for solving the traveling salesman problem. Problemy menedzhmenta: sb. nauchn, tr</article-title>
          . Vypusk №
          <article-title>3; pod obshch</article-title>
          .
          <source>red. prof. O.A. Strakhovoy</source>
          , pp.
          <fpage>71</fpage>
          -
          <lpage>76</lpage>
          .
          <string-name>
            <surname>Izd-vo</surname>
            <given-names>SPbGUEF</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saint-Petersburg</surname>
          </string-name>
          (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Trofimov</surname>
            <given-names>V.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Glukhov</surname>
            <given-names>A.O.:</given-names>
          </string-name>
          <article-title>Approximation on multi-agent structures. Ekonomicheskaya kibernetika: sistemnyy analiz v ekonomike i upravlenii: sb</article-title>
          . nauchn. tr. Vypusk №1;
          <string-name>
            <surname>pod</surname>
            <given-names>red. D.V. Sokolova i N.N.</given-names>
          </string-name>
          <string-name>
            <surname>Pogostinskoy</surname>
          </string-name>
          , pp.
          <fpage>40</fpage>
          -
          <lpage>53</lpage>
          .
          <string-name>
            <surname>Izd-vo</surname>
            <given-names>SPbGUEF</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saint-Petersburg</surname>
          </string-name>
          (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Glukhov</surname>
            <given-names>A.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Glukhov</surname>
            <given-names>D.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trofimov</surname>
            <given-names>V.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trofimova</surname>
            <given-names>L.A.</given-names>
          </string-name>
          :
          <article-title>Modified hybrid genetic algorithm of discreet optimization problems</article-title>
          .
          <source>2017 XX IEEE International Conference on Soft Computing and Measurements (SCM</source>
          <year>2017</year>
          ). Saint-Petersburg (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Pozharskiy</surname>
            <given-names>D. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zolotov</surname>
            <given-names>N.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Semenov</surname>
            <given-names>I.E.</given-names>
          </string-name>
          :
          <article-title>Genetic algorithm for finding the approximation coefficients of a function in contact problems for a cylinder</article-title>
          . Molodoy uchenyy №
          <volume>24</volume>
          (
          <issue>158</issue>
          ), pp.
          <fpage>122</fpage>
          -
          <lpage>125</lpage>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kildiushov</surname>
            <given-names>M.S.:</given-names>
          </string-name>
          <article-title>A program for reconstructing approximated algebraic functions from several variables from a set of discrete values of a function</article-title>
          .
          <source>Online magazine «NAUKOVEDENIE»</source>
          Vol.
          <volume>7</volume>
          , №5, (
          <year>2015</year>
          ) http://naukovedenie.ru/PDF/136TVN515.pdf. doi:
          <volume>10</volume>
          .15862/136TVN515
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Kildiushov</surname>
            <given-names>M.S.:</given-names>
          </string-name>
          <article-title>The use of genetic algorithms for the restoration of approximated algebraic functions with a certain accuracy. Nauka i biznes: puti razvitiya. - Fond razvitiya nauki i kultury (Tambov</article-title>
          ) №
          <volume>1</volume>
          (
          <issue>55</issue>
          ), pp.
          <fpage>25</fpage>
          -
          <lpage>31</lpage>
          (
          <year>2016</year>
          ). ISSN:
          <fpage>2221</fpage>
          -
          <lpage>5182</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Driankov</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hellendoorn</surname>
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reinfrank</surname>
            <given-names>M.:</given-names>
          </string-name>
          <article-title>An introduction to fuzzy control</article-title>
          .
          <source>Springerverlag</source>
          (
          <year>1993</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Lotfi</surname>
            <given-names>A</given-names>
          </string-name>
          . Zadeh: Fuzzy Sets.
          <source>Information &amp; Control</source>
          vol.
          <volume>8</volume>
          , pp.
          <fpage>338</fpage>
          -
          <lpage>353</lpage>
          (
          <year>1965</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Glukhov</surname>
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>О</surname>
          </string-name>
          .:
          <article-title>Dynamic expert system by fuzzy inference rules to automations an examination of complex objects</article-title>
          .
          <source>Budownictwo i Inzynieria Srodowiska</source>
          . - Zielonogorsk: Politechnika Zielonogorska, pp.
          <fpage>105</fpage>
          -
          <lpage>109</lpage>
          (
          <year>1998</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>