<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Adaptive ADALINE Robust Training Algorithm Under the Maximum Correntropy Criterion With Variable Center</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Oleg G. Rudenko</string-name>
          <email>oleh.rudenko@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oleksandr O. Bezsonov</string-name>
          <email>oleksandr.bezsonov@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrzej Szajna</string-name>
          <email>a.szajna@dtpoland.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kharkiv National University of Radio Electronics</institution>
          ,
          <addr-line>Nauky Ave. 14, Kharkiv, 61166, Ukraine</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Uniwersytet Zielonogórski</institution>
          ,
          <addr-line>ul. Licealna 9, Zielona Gora, 65-417</addr-line>
          ,
          <country country="PL">Poland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The problem of training ADALINE in the presence of non-Gaussian interference is considered. The learning algorithm is a gradient procedure for maximizing the correntropy functional. The commonly used Gaussian kernels are centered at zero and are effective for distributions with zero mean; the paper therefore considers a modification of the criterion suitable for distributions with nonzero mean, namely correntropy with a variable center. Gaussian kernels with a variable center make it possible to estimate unknown parameters under Gaussian and non-Gaussian noise with zero and nonzero mean. The convergence properties of the algorithm in the stationary and non-stationary cases under Gaussian and non-Gaussian noise are investigated.</p>
      </abstract>
      <kwd-group>
        <kwd>Correntropy</kwd>
        <kwd>maximization</kwd>
        <kwd>functional</kwd>
        <kwd>gradient algorithm</kwd>
        <kwd>asymptotic convergence</kwd>
        <kwd>non-stationary</kwd>
        <kwd>steady state</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The adaptive linear element (ADALINE) was the first linear neural network; it was proposed by B. Widrow and M. Hoff as an alternative to the perceptron [1]. Since then, this element and its learning algorithm have been widely used in identification, control, filtering and similar problems. The Widrow–Hoff learning algorithm is the Kaczmarz algorithm for solving systems of linear algebraic equations [2]. The properties of this algorithm in solving the identification problem are described in sufficient detail in [3].</p>
    </sec>
    <sec id="sec-2">
      <title>2. The problem of ADALINE training</title>
      <sec id="sec-2-1">
        <title>ADALINE is described by the equation</title>
        <p>$$y_{n+1} = c^{*T} x_{n+1} + \xi_{n+1}, \tag{1}$$
where $y_{n+1}$ is the observed output signal; $x_{n+1} = (x_{1,n+1}, x_{2,n+1}, \ldots, x_{N,n+1})^T$ is the $N \times 1$ vector of input signals; $c^* = (c_1^*, c_2^*, \ldots, c_N^*)^T$ is the $N \times 1$ vector of desired parameters; $\xi_{n+1}$ is the noise; $n$ is the discrete time.</p>
        <p>The learning task consists in determining (estimating) the parameter vector $c^*$ and reduces to minimizing a performance functional (identification criterion) chosen in advance,
$$F(e) = \sum_{i=1}^{n} \rho(e_i), \tag{2}$$
where $e_i = y_i - \hat{y}_i$; $\hat{y}_i = c_{i-1}^T x_i$ is the model output signal; $c$ is the estimate of the vector $c^*$; $\rho(e_i)$ is a differentiable loss function satisfying the conditions
$$\rho(e_i) \ge 0; \quad \rho(0) = 0; \quad \rho(e_i) = \rho(-e_i); \quad \rho(e_i) \ge \rho(e_j) \ \text{for} \ |e_i| \ge |e_j|.$$</p>
        <p>The training objective is to find the estimate $c$ defined as the solution of the extremal problem
$$F(c) = \min, \tag{3}$$
or as the solution of the system of equations
$$\frac{\partial F(e)}{\partial c_j} = \sum_{i=1}^{n} \rho'(e_i) \frac{\partial e_i}{\partial c_j} = 0, \tag{4}$$
where $\rho'(e_i) = \partial \rho(e_i) / \partial e_i$ is the influence function.</p>
        <p>If we introduce the weight function $\omega(e) = \rho'(e) / e$, the system of equations (4) can be written as
$$\sum_{i=1}^{n} \omega(e_i) e_i \frac{\partial e_i}{\partial c_j} = 0, \tag{5}$$
while minimization of functional (2) becomes equivalent to minimizing the weighted quadratic functional most often encountered in practice,
$$\min \sum_{i=1}^{n} \omega(e_i) e_i^2. \tag{6}$$</p>
        <p>The quadratic functional, the one most widely used in parameter estimation, relies on the second-order statistics of the error signal and is optimal under the assumption of linearity and Gaussian signals. Indeed, for $\rho(e_i) = 0.5 e_i^2$ the influence function is $\rho'(e_i) = e_i$, i.e. it grows linearly with $e_i$, which explains the sensitivity of least-squares estimates to outliers and to noise whose distribution has heavy tails.</p>
        <p>A robust M-estimate is likewise an estimate $c$ defined as the solution of the extremal problem (3) or of the system of equations (4), but with a loss function $\rho(e_i)$ other than the quadratic one.</p>
        <p>Many functionals yield robust M-estimates, but the most common are the combined functionals proposed by Huber [4] and Hampel [5]. They consist of a quadratic part, which ensures optimal estimates for the Gaussian distribution, and a modular (absolute-value) part, which gives an estimate that is more robust to distributions with heavy tails (outliers). However, the effectiveness of the resulting robust estimates depends significantly on the many parameters used in these criteria, which are chosen according to the experience of the researcher.</p>
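        <p>The influence and weight functions introduced above can be compared numerically. The following Python sketch is illustrative only: the function names are hypothetical, and the Huber threshold k = 1.345 is a commonly quoted tuning constant assumed here, not a value taken from this paper.</p>
        <preformat>
```python
def quadratic_influence(e):
    """Influence function rho'(e) of the quadratic loss rho(e) = 0.5 * e**2."""
    return e


def huber_influence(e, k=1.345):
    """Influence function of the Huber loss: linear near zero, clipped at +/- k.

    The threshold k is an assumed tuning constant.
    """
    return max(-k, min(k, e))


def weight(influence, e, eps=1e-12):
    """Weight function omega(e) = rho'(e) / e.

    At e = 0 the limit value 1 is returned, which holds for both losses above.
    """
    if abs(e) > eps:
        return influence(e) / e
    return 1.0


# The quadratic loss keeps weight 1 for any error, while the Huber weight
# decays as k / |e| once the error exceeds the threshold.
for e in (0.5, 2.0, 20.0):
    print(e, weight(quadratic_influence, e), weight(huber_influence, e))
```
        </preformat>
        <p>The constant weight of the quadratic loss is what lets a single large outlier dominate the weighted sum (6), whereas a bounded influence function limits its contribution.</p>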
        <p>The practical application of these functionals to the identification problem was considered in many works, in particular in [6, 7].</p>
        <p>Another approach to obtaining robust estimates, free of this drawback, is to use the fourth-degree criterion [8] or combined criteria: the quadratic criterion with the least-moduli criterion [9–11], the quadratic criterion with the fourth-degree criterion [12], or the fourth-degree criterion with the least-moduli criterion [13]. The combined criteria turned out to be very effective and much simpler to use when implementing the identification procedure.</p>
        <p>One more approach that is now widely used is based on information characteristics of signals, in particular entropy. The functional used in this case is an explicit functional of the probability density function (PDF) and includes all the higher-order statistical properties defined by the PDF. Since entropy measures the mean uncertainty contained in a given PDF, minimizing it reduces the error. In [14, 15], the concept of information theoretic learning (ITL) was introduced, which uses as a criterion the Rényi quadratic entropy, for which a nonparametric estimate based on Parzen windows with Gaussian kernels is determined directly from data samples. These works proved that training with the Rényi entropy minimizes the Rényi distance between the conditional probability density functions of the desired and actual output signals for the given input signals.</p>
        <p>The results of numerous studies indicate that when the measurements contain non-Gaussian, in particular impulse, noise, an approach based on information characteristics of signals is very effective, and a criterion that takes into account all higher-order statistics of the error signal is more appropriate. Correntropy was introduced in [16] as a generalized measure of similarity; its maximization underlies the development of fairly simple and efficient robust algorithms.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Correntropy as a measure of similarity</title>
      <p>Correntropy, defined as a localized measure of similarity, has proven very efficient for obtaining robust estimates because it is less sensitive to outliers. Its name emphasizes the relationship with correlation and also indicates that its average value over time or over measurements is associated with entropy, more precisely, with the argument of the logarithm in the quadratic Rényi entropy estimated with the help of Parzen windows [17].</p>
      <sec id="sec-3-1">
        <title>For two random variables X and Y, the correntropy is defined as</title>
        <p>$$V(X, Y) = M\{k_\sigma(X, Y)\}, \tag{7}$$
where $M\{\cdot\}$ is the expectation symbol; $k_\sigma(\cdot)$ is a rotation-invariant Mercer kernel; $\sigma$ is the kernel width.</p>
        <p>The kernels most widely used in calculating the correntropy are Gaussian ones, defined by the formula
$$k_\sigma(X, Y) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{\|X - Y\|^2}{2\sigma^2}\right). \tag{8}$$</p>
        <p>Calculating the correntropy requires the joint distribution of the random variables $X$ and $Y$, which, as a rule, is not known. Since in practice only a finite number of samples $\{x_i, y_i\}$, $i = 1, 2, \ldots, N$, is available, the simplest estimate of the correntropy is calculated as follows:
$$\hat{V}(X, Y) = \frac{1}{N} \sum_{i=1}^{N} k_\sigma(x_i - y_i). \tag{9}$$</p>
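        <p>The sample estimate (9) is straightforward to compute. The Python sketch below is a minimal illustration for scalar samples; the function names and the test data are hypothetical and are not part of the original work.</p>
        <preformat>
```python
import math


def gaussian_kernel(u, sigma):
    """Gaussian kernel of width sigma evaluated at the difference u = x - y."""
    return math.exp(-u * u / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def sample_correntropy(xs, ys, sigma):
    """Sample estimate of correntropy: mean kernel value over paired samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    return sum(gaussian_kernel(x - y, sigma) for x, y in zip(xs, ys)) / len(xs)


clean = [0.0, 1.0, 2.0, 3.0]
noisy = [0.1, 0.9, 2.1, 103.0]  # the last pair is a gross outlier

print(sample_correntropy(clean, clean, sigma=1.0))
print(sample_correntropy(clean, noisy, sigma=1.0))
```
        </preformat>
        <p>The outlier pair contributes a kernel value that is practically zero, so it lowers the estimate by at most one N-th of the maximum instead of dominating it, as it would in a mean-square measure.</p>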
        <p>In identification, filtering and similar tasks, the functional is the correntropy between the required output signal $d_i$ and the (actual) model output signal $y_i$. With Gaussian kernels, the optimized functional takes the form
$$J_{corr}(n) = \frac{1}{\sqrt{2\pi}\,\sigma} \frac{1}{N} \sum_{i=n-N+1}^{n} \exp\left(-\frac{e_i^2}{2\sigma^2}\right), \tag{10}$$
where $e_i = d_i - y_i$ is the identification (filtering) error.</p>
        <p>The Taylor series expansion of the Gaussian kernel makes it possible to write the correntropy as follows:
$$V(X, Y) = \frac{1}{\sqrt{2\pi}\,\sigma} \sum_{n=0}^{\infty} \frac{(-1)^n}{2^n \sigma^{2n} n!} M\{\|X - Y\|^{2n}\}. \tag{11}$$</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Correntropy maximization algorithms</title>
      <p>The gradient algorithm for optimizing (10) at $N = 1$ has the form [18, 19]
$$w_{n+1} = w_n + \gamma \exp\left(-\frac{e_{n+1}^2}{2\sigma^2}\right) e_{n+1} x_{n+1}, \tag{12}$$
where $\gamma$ is the parameter affecting the rate of convergence.</p>
      <p>A significant drawback of this algorithm is its low convergence rate, which considerably limits its use in identifying nonstationary objects. It should be noted that choosing the optimal value of the parameter $\gamma$, which provides the maximum convergence rate of the algorithm and, as is easy to show, is equal to
$$\gamma_{n+1} = \left(\psi_{n+1} \|x_{n+1}\|^2\right)^{-1}, \tag{13}$$
where $\psi_{n+1} = \exp\left(-\frac{e_{n+1}^2}{2\sigma^2}\right)$, leads to an analogue of the Kaczmarz (Widrow–Hoff) algorithm.</p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref16">20–23</xref>
        ], to reduce the effect of impulse noise, a recurrent weighted least squares (RWLS) method was proposed, which minimizes a weighted quadratic criterion with the weights $\psi_{n+1} = \exp\left(-\frac{e_{n+1}^2}{2\sigma^2}\right)$ and has the form
$$c_{n+1} = c_n + \frac{\psi_{n+1} P_n x_{n+1}}{\lambda + \psi_{n+1} x_{n+1}^T P_n x_{n+1}} \left(y_{n+1} - c_n^T x_{n+1}\right), \tag{15}$$
$$P_{n+1} = \lambda^{-1} \left(P_n - \frac{\psi_{n+1} P_n x_{n+1} x_{n+1}^T P_n}{\lambda + \psi_{n+1} x_{n+1}^T P_n x_{n+1}}\right), \tag{16}$$
where $0 &lt; \lambda \le 1$ is the weighting factor.
      </p>
      <p>Thus, formula (16) for calculating $P_{n+1}$ was derived using the approximation
$$P_{n+1}^{-1} = \lambda P_n^{-1} + \psi_{n+1} x_{n+1} x_{n+1}^T. \tag{17}$$</p>
      <p>As is known, introducing the parameter $\lambda$ into the algorithm is advisable when identifying nonstationary parameters.</p>
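      <p>A single-sample gradient step for maximizing the Gaussian-kernel correntropy of the error scales the usual LMS correction e·x by the factor exp(−e²/(2σ²)). The Python sketch below illustrates such a step on synthetic data; the function name, the target parameters, the step size, the kernel width and the outlier model are all assumptions made for the illustration, not settings from the paper.</p>
      <preformat>
```python
import math
import random


def mcc_step(w, x, y, gamma, sigma):
    """One gradient step: w += gamma * exp(-e^2 / (2 sigma^2)) * e * x."""
    e = y - sum(wi * xi for wi, xi in zip(w, x))
    psi = math.exp(-e * e / (2.0 * sigma ** 2))  # kernel weight of this sample
    return [wi + gamma * psi * e * xi for wi, xi in zip(w, x)]


random.seed(0)
c_true = [1.0, -2.0, 0.5]  # hypothetical target parameters
w = [0.0, 0.0, 0.0]
for n in range(5000):
    x = [random.gauss(0.0, 1.0) for _ in c_true]
    noise = random.gauss(0.0, 0.1)
    if random.random() > 0.95:  # roughly 5 % impulsive outliers (assumed model)
        noise += random.choice((-1.0, 1.0)) * 50.0
    y = sum(ci * xi for ci, xi in zip(c_true, x)) + noise
    w = mcc_step(w, x, y, gamma=0.05, sigma=2.0)

print([round(wi, 2) for wi in w])
```
      </preformat>
      <p>For a sample hit by an impulse the kernel weight is practically zero, so the step is suppressed; for small errors the weight is close to one and the step coincides with the ordinary LMS correction.</p>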
      <p>Since the function $G_\sigma(e)$ is a local function of the error $e$, correntropy can be used as an error measure in information processing and machine learning problems:</p>
      <p>$$G_\sigma(e) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{e^2}{2\sigma^2}\right). \tag{18}$$</p>
      <p>It can be seen from (18) that the Gaussian kernel is centered at zero. As a result, if the distribution of errors (noise) has a nonzero mean, function (18) will not match this distribution. The problem therefore arises of choosing a correntropy function that is effective for noise with a nonzero mean.</p>
      <p>One approach to solving this problem is to use correntropy with a variable center [24]</p>
      <p>$$V_{\sigma,c}(T, Y) = M\{G_{\sigma,c}(e)\}, \qquad G_{\sigma,c}(e) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(e - c)^2}{2\sigma^2}\right), \tag{19}$$
where $c \in R$ is the center.</p>
      <sec id="sec-4-1">
        <title>In this case</title>
        <p>$$V_{\sigma,c}(T, Y) = \frac{1}{\sqrt{2\pi}\,\sigma} \sum_{n=0}^{\infty} \frac{(-1)^n}{2^n n!} M\left\{\frac{(e - c)^{2n}}{\sigma^{2n}}\right\}. \tag{20}$$</p>
        <p>As $\sigma$ increases, the higher-order moments about the center decay faster, so the second-order moment dominates the value of $V_{\sigma,c}(T, Y)$. In particular, for $c = M\{e\}$ and $\sigma \to \infty$, maximizing the correntropy with the center $c$ is equivalent to minimizing the error variance.</p>
        <p>Complex correntropy with a variable center was proposed in [27], the generalized correntropy criterion was introduced in [28], and the maximum mixture correntropy criterion was considered in [29].</p>
        <p>
          The solution of practical problems based on the minimization of the corresponding criteria was
considered in [
          <xref ref-type="bibr" rid="ref2 ref6">30–33</xref>
          ].
        </p>
        <p>The sparsity-constrained recursive generalized maximum correntropy criterion (MCC) algorithm with a variable center was studied in [34]. Work [35] deals with distributed MCC algorithms based on a divide-and-conquer strategy.</p>
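        <p>Minimizing functional (19) with respect to the parameters of the model, we obtain
$$\frac{\partial E_{n+1}}{\partial w} = -\exp\left(-\frac{(e_{n+1} - c)^2}{2\sigma^2}\right) \frac{(e_{n+1} - c)}{2\sigma^2} x_{n+1}; \tag{21}$$
$$\frac{\partial E_{n+1}}{\partial c} = w \exp\left(-\frac{(e_{n+1} - c)^2}{2\sigma^2}\right) \frac{(e_{n+1} - c)}{\sigma^2}; \tag{22}$$
$$\frac{\partial E_{n+1}}{\partial \sigma^2} = -w \exp\left(-\frac{(e_{n+1} - c)^2}{2\sigma^2}\right) \frac{(e_{n+1} - c)^2}{\sigma^3}. \tag{23}$$</p>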
        <p>Taking these expressions into account, the algorithms for correcting the network parameters take the form</p>
        <p>$$w_{n+1} = w_n + \gamma_w \exp\left(-\frac{(e_{n+1} - c_{n+1})^2}{2\sigma_{n+1}^2}\right) (e_{n+1} - c_{n+1}) x_{n+1}, \tag{24}$$
$$c_{n+1} = c_n + \gamma_c \exp\left(-\frac{(e_{n+1} - c_{n+1})^2}{2\sigma_{n+1}^2}\right) (e_{n+1} - c_n); \tag{25}$$
$$\sigma_{n+1}^2 = \sigma_n^2 - \gamma_\sigma w_{n+1} \exp\left(-\frac{(e_{n+1} - c_{n+1})^2}{2\sigma_n^2}\right) \frac{(e_{n+1} - c_{n+1})^2}{\sigma_n^3}, \tag{26}$$
where $\gamma_w$, $\gamma_c$, $\gamma_\sigma$ are the parameters of the algorithm that regulate the step size and affect its convergence rate.</p>
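        <p>Training with a variable kernel center can be sketched by updating the weights and the center from the centered error e − c. The Python code below is a simplified illustration rather than the authors' implementation: the kernel width is held fixed, the current center is used in both updates, and the function names, step sizes and target parameters are assumptions. Rayleigh noise is used because its mean is nonzero.</p>
        <preformat>
```python
import math
import random


def mcc_vc_step(w, c, x, y, gamma_w, gamma_c, sigma):
    """One update of the weights w and the kernel center c (fixed width sigma)."""
    e = y - sum(wi * xi for wi, xi in zip(w, x))
    d = e - c                                   # error measured from the center
    g = math.exp(-d * d / (2.0 * sigma ** 2))   # Gaussian weighting of the step
    w_new = [wi + gamma_w * g * d * xi for wi, xi in zip(w, x)]
    c_new = c + gamma_c * g * d
    return w_new, c_new


def rayleigh(scale=1.0):
    """Rayleigh sample by inverse-CDF sampling (mean is about 1.25 * scale)."""
    return scale * math.sqrt(-2.0 * math.log(1.0 - random.random()))


random.seed(1)
c_true = [1.0, -2.0, 0.5]  # hypothetical target parameters
w, c = [0.0, 0.0, 0.0], 0.0
for n in range(20000):
    x = [random.gauss(0.0, 1.0) for _ in c_true]
    y = sum(ci * xi for ci, xi in zip(c_true, x)) + rayleigh(1.0)
    w, c = mcc_vc_step(w, c, x, y, gamma_w=0.01, gamma_c=0.01, sigma=2.0)

print([round(wi, 2) for wi in w], round(c, 2))
```
        </preformat>
        <p>In this sketch the center drifts toward the bulk of the noise distribution, so the kernel stays aligned with the errors even though their mean is not zero.</p>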
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4.1. Multidimensional object</title>
      <p>If the object under study has several outputs, the output signal is a vector and the error is also a vector, and the learning algorithm takes the form
$$w_{n+1} = w_n + \gamma \exp\left(-\|e_{n+1} - c\|_{R^{-1}}^2\right) e_{n+1} x_{n+1}, \tag{27}$$
where $\|e_{n+1} - c\|_{R^{-1}}^2 = (e_{n+1} - c)^T R^{-1} (e_{n+1} - c)$; $R^{-1}$ is the inverse of the covariance matrix of the input vector, adjusted as
$$R_{n+1}^{-1} = R_n^{-1} - \gamma_R w_{n+1} \exp\left(-\|e_{n+1} - c_{n+1}\|_{R_n^{-1}}^2\right) (e_{n+1} - c_{n+1})(e_{n+1} - c_{n+1})^T. \tag{28}$$</p>
    </sec>
    <sec id="sec-6">
      <title>4.2. Investigation of the convergence of the algorithm</title>
      <sec id="sec-6-1">
        <title>Consider the estimation error</title>
        <p>$$\theta_{n+1} = c_{n+1} - c^*. \tag{29}$$
Then
$$e_{n+1} = \theta_{n+1}^T x_{n+1} + \xi_{n+1} = e_{n+1}^a + \xi_{n+1}, \tag{30}$$
where $e_{n+1}^a = \theta_{n+1}^T x_{n+1}$ is the a priori error.</p>
      </sec>
      <sec id="sec-6-2">
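        <title>In this case, the estimation algorithm can be written as</title>
        <p>$$w_{n+1} = w_n + \gamma f(e_{n+1}) x_{n+1}, \tag{31}$$
where $f(e_{n+1}) = \exp\left(-\frac{(e_{n+1} - c)^2}{2\sigma^2}\right) (e_{n+1} - c)$.</p>
      </sec>
      <sec id="sec-6-3">
        <title>Writing down algorithm (31) with respect to estimation errors, we have</title>
        <p>$$\theta_{n+1} = \theta_n - \gamma f(e_{n+1}) x_{n+1}.$$
Multiplying both sides of this expression on the left by $\theta_{n+1}^T$, we get
$$\|\theta_{n+1}\|^2 = \|\theta_n\|^2 - 2\gamma f(e_{n+1}) e_{n+1}^a + \gamma^2 f^2(e_{n+1}) \|x_{n+1}\|^2. \tag{32}$$</p>
        <p>Averaging both sides of (32), i.e.
$$M\{\|\theta_{n+1}\|^2\} = M\{\|\theta_n\|^2\} - 2\gamma M\{f(e_{n+1}) e_{n+1}^a\} + \gamma^2 M\{f^2(e_{n+1}) \|x_{n+1}\|^2\}, \tag{33}$$
we obtain the condition for the convergence of algorithm (31) in the mean square
$$0 &lt; \gamma \le \frac{2 M\{f(e_{n+1}) e_{n+1}^a\}}{M\{f^2(e_{n+1}) \|x_{n+1}\|^2\}}.$$</p>
      </sec>
      <sec id="sec-6-4">
        <title>Consider a steady state</title>
        <p>Since in the steady state
$$\lim_{n \to \infty} M\{\|\theta_{n+1}\|^2\} = \lim_{n \to \infty} M\{\|\theta_n\|^2\},$$
it follows from (33) that
$$2 \lim_{n \to \infty} M\{e_{n+1}^a f(e_{n+1})\} = \gamma\, \mathrm{tr} R_x \lim_{n \to \infty} M\{f^2(e_{n+1})\}, \tag{34}$$
where $\mathrm{tr}$ denotes the trace operator.</p>
        <p>To calculate the steady-state value of the estimation error $S = \lim_{n \to \infty} M\{(e_{n+1}^a)^2\}$, we need $M\{f^2(e_{n+1}) \|x_{n+1}\|^2\}$ and $M\{e_{n+1}^a f(e_{n+1})\}$. Consider the case of Gaussian noise $\xi \sim N(0, \sigma_\xi^2)$, so that the error $e_{n+1}$ has a Gaussian density with mean $c_e$ and variance $\sigma_e^2$. Using Price's theorem [36], we obtain
$$\lim_{n \to \infty} M\{e_{n+1}^a f(e_{n+1})\} = \lim_{n \to \infty} M\{e_{n+1}^a f(e_{n+1}^a + \xi_{n+1})\} = \lim_{n \to \infty} M\{(e_{n+1}^a)^2\} M\{f'(e_{n+1})\} = \lim_{n \to \infty} S\, M\left\{\exp\left(-\frac{(e_{n+1} - c)^2}{2\sigma^2}\right) \left(1 - \frac{(e_{n+1} - c)^2}{\sigma^2}\right)\right\} = \frac{S B}{\sqrt{2\pi}\,\sigma_e}, \tag{35}$$
$$B = \lim_{n \to \infty} \int_{-\infty}^{+\infty} \exp\left(-\frac{(e_{n+1} - c)^2}{2\sigma^2}\right) \left(1 - \frac{(e_{n+1} - c)^2}{\sigma^2}\right) \exp\left(-\frac{(e_{n+1} - c_e)^2}{2\sigma_e^2}\right) de_{n+1};$$
$$\lim_{n \to \infty} M\{f^2(e_{n+1})\} = \frac{1}{\sqrt{2\pi}\,\sigma_e} \lim_{n \to \infty} \int_{-\infty}^{+\infty} \exp\left(-\frac{(e_{n+1} - c)^2}{\sigma^2}\right) (e_{n+1} - c)^2 \exp\left(-\frac{(e_{n+1} - c_e)^2}{2\sigma_e^2}\right) de_{n+1}. \tag{36}$$</p>
        <p>Substitution of (35) and (36) into (34) gives the expression for the steady-state error
$$\lim_{n \to \infty} M\{(e_{n+1}^a)^2\} = \frac{A}{2B}, \qquad A = \gamma\, \mathrm{tr} R_x \lim_{n \to \infty} \int_{-\infty}^{+\infty} \exp\left(-\frac{(e_{n+1} - c)^2}{\sigma^2}\right) (e_{n+1} - c)^2 \exp\left(-\frac{(e_{n+1} - c_e)^2}{2\sigma_e^2}\right) de_{n+1}.$$</p>
        <p>This expression shows that $\lim_{n \to \infty} M\{(e_{n+1}^a)^2\} = 0$ as $\gamma \to 0$.</p>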
        <p>Consider the case of non-Gaussian interference. In this case, we use the Taylor series expansion. In the steady state, the estimated parameters change (are corrected) insignificantly. Therefore, we can rewrite (34) as follows:
$$2 M\{e^a f(e)\} = \gamma\, \mathrm{tr} R_x M\{f^2(e)\}. \tag{40}$$
The derivatives of $f$ are
$$f'(\xi) = \exp\left(-\frac{(\xi - c)^2}{2\sigma^2}\right) \left(1 - \frac{(\xi - c)^2}{\sigma^2}\right); \tag{41}$$
$$f''(\xi) = \exp\left(-\frac{(\xi - c)^2}{2\sigma^2}\right) \left(\frac{(\xi - c)^3}{\sigma^4} - \frac{3(\xi - c)}{\sigma^2}\right). \tag{42}$$</p>
        <p>Assuming that the interference is uncorrelated with the signals and with the a priori error $e^a$, and denoting $S = M\{(e^a)^2\}$, we can write
$$M\{e^a f(e)\} = M\{e^a f(\xi) + f'(\xi)(e^a)^2 + o((e^a)^2)\} \approx S M\{f'(\xi)\}; \tag{43}$$
$$M\{f^2(e)\} \approx M\{f^2(\xi)\} + S M\{f(\xi) f''(\xi) + f'(\xi)^2\}. \tag{44}$$</p>
      </sec>
      <sec id="sec-6-5">
        <title>Substituting (43) and (44) into (40), we have</title>
        <p>$$S = \frac{\gamma\, \mathrm{tr} R_x M\{f^2(\xi)\}}{2 M\{f'(\xi)\} - \gamma\, \mathrm{tr} R_x M\{f(\xi) f''(\xi) + f'(\xi)^2\}}. \tag{45}$$</p>
      </sec>
      <sec id="sec-6-6">
        <title>Substitution of (41), (42) into (45) gives</title>
        <p>$$S = \frac{\gamma\, \mathrm{tr} R_x M\{K (\xi - c)^2\}}{2 M\left\{K' \left(1 - \frac{(\xi - c)^2}{\sigma^2}\right)\right\} - \gamma\, \mathrm{tr} R_x M\left\{K \left(1 + \frac{2(\xi - c)^4}{\sigma^4} - \frac{5(\xi - c)^2}{\sigma^2}\right)\right\}}, \tag{46}$$
where
$$K = \exp\left(-\frac{(\xi - c)^2}{\sigma^2}\right); \qquad K' = \exp\left(-\frac{(\xi - c)^2}{2\sigma^2}\right).$$</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>4.3. Non-stationary case</title>
      <sec id="sec-7-1">
        <title>Let us assume that the estimated parameters are non-stationary, i.e.</title>
        <p>$$c_{n+1}^* = c_n^* + \Delta c^*, \tag{47}$$
where $\Delta c^* = (\Delta c_1^*, \Delta c_2^*, \ldots, \Delta c_N^*)^T$ is an $N \times 1$ vector of a random sequence whose components have zero mathematical expectation and whose correlation matrix is $R_c = M\{\Delta c^* \Delta c^{*T}\}$.
Consider the error vector $\theta_{n+1} = c_{n+1} - c_{n+1}^*$.</p>
      </sec>
      <sec id="sec-7-2">
        <title>Then, taking into account (30), the estimation algorithm can be written as</title>
        <p>$$\theta_{n+1} = \theta_n - \Delta c^* - \gamma f(e_{n+1}) x_{n+1}. \tag{48}$$
Multiplying both sides of (48) on the left by $\theta_{n+1}^T$ and calculating the mathematical expectation, we get
$$M\{\|\theta_{n+1}\|^2\} = M\{\|\theta_n\|^2\} - 2\gamma M\{x_{n+1}^T \theta_n f(e_{n+1})\} + \gamma^2 M\{f^2(e_{n+1}) \|x_{n+1}\|^2\} + M\{\|\Delta c^*\|^2\} - M\{\theta_n^T \Delta c^*\} - M\{\Delta c^{*T} \theta_n\} + 2\gamma M\{x_{n+1}^T \Delta c^* f(e_{n+1})\}.$$</p>
      </sec>
      <sec id="sec-7-3">
        <title>Taking into account the statistical properties of signals and noise, we have</title>
        <p>$$M\{\|\theta_{n+1}\|^2\} = M\{\|\theta_n\|^2\} - 2\gamma M\{e_{n+1}^a f(e_{n+1})\} + \gamma^2 M\{f^2(e_{n+1}) \|x_{n+1}\|^2\} + M\{\|\Delta c^*\|^2\}. \tag{49}$$</p>
        <p>For Gaussian interference, using Price's theorem and denoting $S = \lim_{n \to \infty} M\{(e_{n+1}^a)^2\}$ gives
$$\lim_{n \to \infty} M\{e_{n+1}^a f(e_{n+1})\} = \lim_{n \to \infty} M\{e_{n+1}^a f(e_{n+1}^a + \xi_{n+1})\} = \lim_{n \to \infty} M\{(e_{n+1}^a)^2\} M\{f'(e_{n+1})\} = \lim_{n \to \infty} S\, M\left\{\exp\left(-\frac{(e_{n+1} - c)^2}{2\sigma^2}\right) \left(1 - \frac{(e_{n+1} - c)^2}{\sigma^2}\right)\right\} = \frac{S \sigma^3}{(\sigma^2 + \sigma_\xi^2 + S)^{3/2}}; \tag{50}$$
$$M\{f^2(e_{n+1})\} = \lim_{n \to \infty} M\left\{\exp\left(-\frac{(e_{n+1} - c)^2}{\sigma^2}\right) (e_{n+1} - c)^2\right\} = \frac{1}{\sqrt{2\pi}\,\sigma_e} \lim_{n \to \infty} \int_{-\infty}^{+\infty} \exp\left(-\frac{(e_{n+1} - c)^2}{\sigma^2}\right) (e_{n+1} - c)^2 \exp\left(-\frac{(e_{n+1} - c_e)^2}{2\sigma_e^2}\right) de_{n+1} = \frac{\sigma^3 (S + \sigma_\xi^2)}{(2\sigma_\xi^2 + \sigma^2 + 2S)^{3/2}}. \tag{51}$$</p>
      </sec>
      <sec id="sec-7-4">
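        <title>Considering that</title>
        <p>$$M\{\|\Delta c^*\|^2\} = M\{\Delta c^{*T} \Delta c^*\} = \mathrm{tr} R_c,$$
for the steady state, when $\lim_{n \to \infty} M\{\|\theta_{n+1}\|^2\} = \lim_{n \to \infty} M\{\|\theta_n\|^2\}$, from expression (49) we obtain
$$\frac{2S}{(\sigma^2 + \sigma_\xi^2 + S)^{3/2}} = \frac{\gamma\, \mathrm{tr} R_x (\sigma_\xi^2 + S)}{(\sigma^2 + 2\sigma_\xi^2 + 2S)^{3/2}} + \frac{\mathrm{tr} R_c}{\gamma \sigma^3}. \tag{52}$$</p>
        <p>From this relation, we can determine the value of $S$:
$$S = \frac{\gamma\, \mathrm{tr} R_x (\sigma_\xi^2 + S)(\sigma^2 + \sigma_\xi^2 + S)^{3/2}}{2 (\sigma^2 + 2\sigma_\xi^2 + 2S)^{3/2}} + \frac{\mathrm{tr} R_c (\sigma^2 + \sigma_\xi^2 + S)^{3/2}}{2 \gamma \sigma^3}.$$</p>
        <p>For $\sigma^2 \to \infty$, we obtain the value of $S$ for the least-squares algorithm
$$\lim_{\sigma \to \infty} S = \frac{\gamma\, \mathrm{tr} R_x \sigma_\xi^2 + \gamma^{-1} \mathrm{tr} R_c}{2 - \gamma\, \mathrm{tr} R_x}.$$</p>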
      </sec>
      <sec id="sec-7-5">
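        <title>In the case of non-Gaussian noise, we have</title>
        <p>$$M\{e_{n+1}^a f(e_{n+1})\} \approx M\{e_{n+1}^a f(\xi_{n+1}) + (e_{n+1}^a)^2 f'(\xi_{n+1})\} \approx S M\{f'(\xi_{n+1})\}, \tag{54}$$
$$M\{f^2(e_{n+1})\} \approx M\left\{\left(f(\xi_{n+1}) + e_{n+1}^a f'(\xi_{n+1}) + 0.5 f''(\xi_{n+1}) (e_{n+1}^a)^2\right)^2\right\} \approx M\{f^2(\xi_{n+1})\} + S M\{f(\xi_{n+1}) f''(\xi_{n+1}) + (f'(\xi_{n+1}))^2\}, \tag{55}$$
where
$$f'(\xi_{n+1}) = \exp\left(-\frac{(\xi_{n+1} - c)^2}{2\sigma^2}\right) \left(1 - \frac{(\xi_{n+1} - c)^2}{\sigma^2}\right);$$
$$f''(\xi_{n+1}) = \exp\left(-\frac{(\xi_{n+1} - c)^2}{2\sigma^2}\right) \left(\frac{(\xi_{n+1} - c)^3}{\sigma^4} - \frac{3(\xi_{n+1} - c)}{\sigma^2}\right).$$</p>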
      </sec>
      <sec id="sec-7-6">
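        <title>Substituting (54) and (55) into (49), after simple transformations we obtain</title>
        <p>$$S = \frac{\gamma A + \gamma^{-1} B}{C - \gamma D}, \tag{56}$$
where
$$A = \mathrm{tr} R_x M\left\{(\xi_{n+1} - c)^2 \exp\left(-\frac{(\xi_{n+1} - c)^2}{\sigma^2}\right)\right\};$$
$$B = \mathrm{tr} R_c;$$
$$C = 2 M\left\{\left(1 - \frac{(\xi_{n+1} - c)^2}{\sigma^2}\right) \exp\left(-\frac{(\xi_{n+1} - c)^2}{2\sigma^2}\right)\right\};$$
$$D = \mathrm{tr} R_x M\left\{\left(1 + \frac{2(\xi_{n+1} - c)^4}{\sigma^4} - \frac{5(\xi_{n+1} - c)^2}{\sigma^2}\right) \exp\left(-\frac{(\xi_{n+1} - c)^2}{\sigma^2}\right)\right\}.$$
This expression shows that $S$ does not depend monotonically on the parameter $\gamma$: the term $\gamma^{-1} B$ grows as $\gamma \to 0$, while the term $\gamma A$ grows with $\gamma$.</p>
        <p>From the condition $\partial S / \partial \gamma = 0$, an equation can be obtained to determine the optimal value of the parameter $\gamma$ that provides the minimum value of $S$:</p>
        <p>$$AC\gamma^2 + 2BD\gamma - BC = 0.$$</p>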
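        <p>The optimal step can be checked numerically. Differentiating S(γ) = (γA + B/γ)/(C − γD) with respect to γ and setting the result to zero gives the quadratic ACγ² + 2BDγ − BC = 0, whose positive root is computed in the Python sketch below. The function names are hypothetical, and the values of A, B, C and D are arbitrary numbers chosen only to exercise the formulas.</p>
        <preformat>
```python
import math


def steady_state_error(gamma, A, B, C, D):
    """S(gamma) = (gamma*A + B/gamma) / (C - gamma*D), for gamma between 0 and C/D."""
    return (gamma * A + B / gamma) / (C - gamma * D)


def optimal_gamma(A, B, C, D):
    """Positive root of A*C*g^2 + 2*B*D*g - B*C = 0 (the condition dS/dg = 0)."""
    disc = (B * D) ** 2 + A * B * C * C
    return (-B * D + math.sqrt(disc)) / (A * C)


# Hypothetical values of A, B, C, D, not taken from the paper.
A, B, C, D = 0.8, 1e-4, 1.2, 0.5
g_opt = optimal_gamma(A, B, C, D)
print(g_opt, steady_state_error(g_opt, A, B, C, D))

# Moving the step away from g_opt in either direction increases S.
for factor in (0.5, 0.9, 1.1, 2.0):
    print(factor, steady_state_error(factor * g_opt, A, B, C, D))
```
        </preformat>
        <p>A smaller step reduces the noise-driven part of the error but makes the algorithm track parameter drift more slowly, which is why the minimum lies at an intermediate value of γ.</p>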
      </sec>
    </sec>
    <sec id="sec-8">
      <title>5. Numerical experiments</title>
      <p>The problem of adjusting the ADALINE parameters was considered. Sequences of normally distributed values $x(k) \sim N(0; 1)$ were chosen as the input signal $x(k)$. To test the robustness of the algorithms, independent noise distributed according to the Rayleigh law with $\sigma = 1$ was added to the output signal of the object. The histogram of this noise is shown in Fig. 2. The simulation results for various values of the parameter are shown in Fig. 3. Fig. 4 shows the graphs of the error for the RWLS algorithm (15)–(16) and for algorithm (31), respectively; here</p>
      <p>$$RMSE = \frac{1}{2} \|c_n - c^*\|^2,$$
where $c_n$ and $c^*$ denote the estimated and target parameter vectors, respectively.</p>
    </sec>
    <sec id="sec-9">
      <title>6. Conclusion</title>
      <p>The work considered an adaptive robust learning algorithm for ADALINE that uses the information criterion of correntropy with a variable center as the learning criterion.</p>
      <p>The properties of its convergence in the stationary and non-stationary cases in conditions
of non-Gaussian noises are investigated.</p>
      <p>The importance of choosing the width of the Gaussian kernel, which affects both the convergence rate of the estimation algorithms and the steady-state error, is noted, and the expediency of developing procedures for adaptive correction of the kernel width is indicated.</p>
      <p>The estimates obtained are quite general and depend both on the degree of nonstationarity
of the object and on the statistical characteristics of useful signals and interference.</p>
    </sec>
    <sec id="sec-10">
      <title>7. References</title>
      <p>[1] Widrow, B., Hoff, M. Adaptive switching circuits. IRE WESCON Convention Record, Part 4. New York: Institute of Radio Engineers, pp. 96–104 (1960).</p>
      <p>[2] Kaczmarz, S. Angenäherte Auflösung von Systemen linearer Gleichungen. Bull. Int. Acad. Polon. Sci. Lett., Cl. Sci. Math. Nat., Ser. A, pp. 355–357 (1937).</p>
      <p>[25] Zhu, L., Song, C., Pan, L., Li, J. Adaptive Filtering Under the Maximum Correntropy Criterion With Variable Center. IEEE Access, pp. 105902–105908 (2019).</p>
      <p>[26] Wang, X., Han, J. Affine Projection Algorithm by Employing Maximum Correntropy Criterion for System Identification of Mixed Noise. IEEE Access, 7, pp. 182515–182526 (2019).</p>
      <p>[27] Dong, F., Qian, G., Wang, S. Complex Correntropy with Variable Center: Definition, Properties, and Application to Adaptive Filtering. Entropy (Basel), 22(1), 70 (2020).</p>
      <p>[28] Yang, J., Cao, J., Xue, A. Robust Maximum Mixture Correntropy Criterion-Based Semi-Supervised ELM With Variable Center. IEEE Transactions on Circuits and Systems II: Express Briefs, 67, 12, pp. 3572–3576 (2020).</p>
      <p>[29] Zhang, J., Huang, G., Zhan, L. Generalized Correntropy Criterion-Based Performance Assessment for Non-Gaussian Stochastic Systems. Entropy, 23, 764 (2021).</p>
      <p>[<xref ref-type="bibr" rid="ref6">30</xref>] Li, Y., Wang, Y., Sun, L. A Proportionate Normalized Maximum Correntropy Criterion Algorithm with Correntropy Induced Metric Constraint for Identifying Sparse Systems. Symmetry, 10, 683 (2018).</p>
      <p>[31] Zhang, J., Huang, G., Zhan, L. Generalized Correntropy Criterion-Based Performance Assessment for Non-Gaussian Stochastic Systems. Entropy, 23, 764 (2021).</p>
      <p>[<xref ref-type="bibr" rid="ref2">32</xref>] Wang, X., Han, J. Affine Projection Algorithm by Employing Maximum Correntropy Criterion for System Identification of Mixed Noise. IEEE Access, 7, pp. 182515–182526 (2019).</p>
      <p>[33] Sun, Q., Zhang, H., Wang, X., Ma, W., Chen, B. Sparsity Constrained Recursive Generalized Maximum Correntropy Criterion With Variable Center Algorithm. IEEE Transactions on Circuits and Systems II: Express Briefs, 67, 12, pp. 3517–3521 (2020).</p>
      <p>[34] Xie, F., Hu, T., Wang, S., Wang, D. Maximum Correntropy Criterion with Distributed Method. Mathematics, 10, 304, 17 p. (2022).</p>
      <p>[35] Wang, X., Han, J. Affine Projection Algorithm by Employing Maximum Correntropy Criterion for System Identification of Mixed Noise. IEEE Access, 7, pp. 182515–182526 (2019).</p>
      <p>[36] Price, R. A useful theorem for nonlinear devices having Gaussian inputs. IEEE Transactions on Information Theory, 4 (2), pp. 69–72 (1958).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Control</surname>
          </string-name>
          ,
          <year>1993</year>
          ,
          <volume>57</volume>
          , p.
          <fpage>1269</fpage>
          -
          <lpage>1271</lpage>
          . [3]
          <string-name>
            <surname>Либероль</surname>
            ,
            <given-names>Б.Д.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Руденко</surname>
          </string-name>
          . О.Г, Бессонов, А.А.
          <article-title>Исследование сходимости одношаговых</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <volume>32</volume>
          (
          <year>2018</year>
          ). [4]
          <string-name>
            <surname>Хьюбер</surname>
            ,
            <given-names>П.</given-names>
          </string-name>
          <article-title>Робастность в статистике</article-title>
          .
          <source>М.:Мир</source>
          .
          <volume>304</volume>
          <fpage>с</fpage>
          .(
          <year>1984</year>
          ) [5]
          <string-name>
            <surname>Hampel</surname>
            ,
            <given-names>F.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ronchetti</surname>
            ,
            <given-names>E.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rousseeuw</surname>
            ,
            <given-names>P.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stahel</surname>
            ,
            <given-names>W.A. Robust</given-names>
          </string-name>
          <string-name>
            <surname>Statistics</surname>
          </string-name>
          . The Approach
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <article-title>Based on Influence Functions</article-title>
          . N.Y.: John Wiley and Sons, 526 p.(
          <year>1986</year>
          ) [6]
          <string-name>
            <surname>Rudenko</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bezsonov</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          <article-title>Function approximation using robust radial basis function networks</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <source>J. of Intelligent Learning Systems and Applications</source>
          ,
          <volume>3</volume>
          , pp.
          <fpage>17</fpage>
          -
          <lpage>25</lpage>
          (
          <year>2011</year>
          ). [7]
          <string-name>
            <surname>Руденко</surname>
            ,
            <given-names>О.Г.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Бессонов</surname>
          </string-name>
          , А.А.
          <article-title>М-обучение радиально-базисных сетей с использованием</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          (
          <year>2012</year>
          ). [8]
          <string-name>
            <surname>Walach</surname>
            ,
            <given-names>E. Widrow D.</given-names>
          </string-name>
          <article-title>The least mean fourth (LMF) adaptive algorithm and its family</article-title>
          , IEEE
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Trans</surname>
          </string-name>
          , IT 30, pp.
          <fpage>275</fpage>
          -
          <lpage>283</lpage>
          (
          <year>1984</year>
          ). [9]
          <string-name>
            <surname>Chambers</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Avlonitis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <article-title>A Robust Mixed-Norm Adaptive Filter Algorithm</article-title>
          . IEEE Signal
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <source>Processing Letters</source>
          ,
          <volume>4</volume>
          ,
          <issue>2</issue>
          , pp.
          <fpage>46</fpage>
          -
          <lpage>48</lpage>
          (
          <year>1997</year>
          ). [10]
          <string-name>
            <surname>Papoulis</surname>
            ,
            <given-names>E.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stathaki</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <article-title>A Normalized Robust Mixed-Norm Adaptive Algorithm for System</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          Identification
          ,
          <source>IEEE Signal Processing Letters</source>
          ,
          <year>2004</year>
          ,
          <volume>11</volume>
          ,
          <issue>1</issue>
          , pp.
          <fpage>56</fpage>
          -
          <lpage>59</lpage>
          [11]
          <string-name>
            <surname>Chambers</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tanrikulu</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Constantinides</surname>
            ,
            <given-names>A.G.</given-names>
          </string-name>
          <article-title>Least mean mixed-norm adaptive filtering</article-title>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <source>Electronics Letters</source>
          ,
          <volume>30</volume>
          ,
          <issue>19</issue>
          , pp.
          <fpage>1574</fpage>
          -
          <lpage>1575</lpage>
          (
          <year>1994</year>
          ). [12]
          <string-name>
            <surname>Zerguine</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <article-title>A variable-parameter normalized mixed-norm (VPNMN) adaptive algorithm</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <source>EURASIP Journal on Advances in Signal Processing</source>
          ,
          <volume>55</volume>
          , 13 p. (
          <year>2012</year>
          ) [13]
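          <string-name>
            <surname>Rudenko</surname>
            ,
            <given-names>O.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bezsonov</surname>
            ,
            <given-names>O.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Serdiuk</surname>
            ,
            <given-names>N.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oliinyk</surname>
            ,
            <given-names>K.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Romaniuk</surname>
            ,
            <given-names>O.S.</given-names>
          </string-name>
          Robust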
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <source>інформації</source>
          ,
          <volume>1</volume>
          (
          <issue>160</issue>
          ), pp.
          <fpage>80</fpage>
          -
          <lpage>88</lpage>
          (
          <year>2020</year>
          ). [14]
          <string-name>
            <surname>Principe</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fisher</surname>
            ,
            <given-names>J.W.</given-names>
          </string-name>
          <article-title>Learning from examples with information theoretic</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          criteria.
          <source>J. VLSI Signal Process. Syst.</source>
          ,
          <volume>26</volume>
          ,
          <issue>1-2</issue>
          , pp.
          <fpage>61</fpage>
          -
          <lpage>77</lpage>
          (
          <year>2000</year>
          ). [15]
          <string-name>
            <surname>Principe</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
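          ,
          <string-name>
            <surname>Fisher</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <article-title>Information theoretic learning</article-title>
          . In: S. Haykin (Ed.),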
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <source>Unsupervised Adaptive Filtering</source>
          . New York: Wiley, pp.
          <fpage>265</fpage>
          -
          <lpage>319</lpage>
          (
          <year>2000</year>
          ). [16]
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pokharel</surname>
            ,
            <given-names>P.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Principe</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          <article-title>Correntropy: Properties and Applications in Non-Gaussian</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          Signal Processing.
          <source>IEEE Trans. on Signal Processing</source>
          ,
          <volume>55</volume>
          (
          <issue>11</issue>
          )
          , pp.
          <fpage>5286</fpage>
          -
          <lpage>5298</lpage>
          (
          <year>2007</year>
          ) [17]
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Principe</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          <article-title>An adaptive kernel width update method of</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <source>(DSP)</source>
          , pp.
          <fpage>916</fpage>
          -
          <lpage>920</lpage>
          (
          <year>2015</year>
          ). [18]
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xing</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zheng</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Principe</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          <article-title>Steady-state mean-square error analysis</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <volume>21</volume>
          (
          <issue>7</issue>
          ), pp.
          <fpage>880</fpage>
          -
          <lpage>884</lpage>
          (
          <year>2014</year>
          ). [19]
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gui</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <article-title>Maximum correntropy criterion based</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          environments.
          <source>J. of the Franklin Institute</source>
          ,
          <volume>352</volume>
          ,
          <issue>2</issue>
          , pp.
          <fpage>2708</fpage>
          -
          <lpage>2727</lpage>
          (
          <year>2015</year>
          ) [20]
          <string-name>
            <surname>Xiong</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schindelhauer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>So</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          <article-title>Maximum Correntropy Criterion for Robust</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <source>Processing</source>
          ,
          <volume>40</volume>
          , pp.
          <fpage>6325</fpage>
          -
          <lpage>6339</lpage>
          (
          <year>2021</year>
          ). [21]
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ho</surname>
            ,
            <given-names>K.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <article-title>Robust Ellipse Fitting With Laplacian Kernel Based</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <article-title>Maximum Correntropy Criterion</article-title>
          .
          <source>IEEE Transactions on Image Processing</source>
          ,
          <volume>30</volume>
          , pp.
          <fpage>3127</fpage>
          -
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <lpage>3141</lpage>
          (
          <year>2021</year>
          ). [22]
          <string-name>
            <surname>Flores</surname>
            ,
            <given-names>T.K.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Villanueva</surname>
            ,
            <given-names>J.M.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gomes</surname>
            ,
            <given-names>H.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Catunda</surname>
            ,
            <given-names>S.Y.C.</given-names>
          </string-name>
          <article-title>Adaptive Pressure Control</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          System Based on the Maximum Correntropy Criterion.
          <source>Sensors</source>
          ,
          <volume>21</volume>
          (
          <issue>15</issue>
          ),
          <elocation-id>5156</elocation-id>
          (
          <year>2021</year>
          ). [23]
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <article-title>Kernel-based maximum correntropy criterion with gradient method</article-title>
          ,
          <source>CPAA</source>
          ,
          <volume>19</volume>
          (
          <issue>8</issue>
          ),
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <fpage>4159</fpage>
          -
          <lpage>4177</lpage>
          (
          <year>2020</year>
          ). [24]
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Principe</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          <article-title>Maximum Correntropy Criterion with Variable Center</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <source>IEEE Signal Process. Letters</source>
          ,
          <volume>26</volume>
          (
          <issue>5</issue>
          )
          , pp.
          <fpage>1212</fpage>
          -
          <lpage>1216</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>