<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Methods Assessment the Probability Density of Discrete Signals in Telecommunications</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yuriy Kropotov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aleksey Belov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Murom Institute (branch) "Vladimir State University named after Alexander and Nicholay Stoletovs"</institution>
          ,
          <addr-line>Murom</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <fpage>745</fpage>
      <lpage>754</lpage>
      <abstract>
        <p>This paper investigates problems and methods of modeling acoustic signals in information and control systems for audio exchange communications. It addresses the estimation and approximation of probability density functions, which can help to distinguish acoustic speech signals from external acoustic noise. We consider direct and indirect methods, histogram estimation techniques, and ways of overcoming the ill-posedness of the estimation problem.</p>
      </abstract>
      <kwd-group>
        <kwd>Probability density</kwd>
        <kwd>discrete signals</kwd>
        <kwd>telecommunication systems</kwd>
        <kwd>distribution function</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Estimates of the distributions of speech signals and noise, like those of data of any other
nature, are based on empirical data obtained from measurements [1]. There are many
methods of obtaining such estimates; they are divided into parametric and
nonparametric methods, and into direct and indirect methods.</p>
      <p>Parametric, or classical, methods are understood to be those in which
the probability density is known up to its parameters, that is, it has the form $f(x, \theta)$
with an unknown parameter vector $\theta$.</p>
      <p>Copyright © by the paper's authors. Copying permitted for private and academic purposes.</p>
      <p>In: A. Kononov et al. (eds.): DOOR 2016, Vladivostok, Russia, published at http://ceur-ws.org</p>
      <p>If the function $f(x, \theta)$ is not a probability density, the methods of estimating the
parameter vector $\theta$ are considered nonparametric. In this case the task is one of
approximating, or fitting, the observed data. The resulting approximating
function $f(x, \theta)$ must satisfy the constraints [1, 3]</p>
      <disp-formula>
        <label>(1)</label>
        <tex-math>f(x, \theta) \geq 0 \quad \text{and} \quad \int_{-\infty}^{\infty} f(x, \theta)\, dx = 1.</tex-math>
      </disp-formula>
      <p>A clear distinction between parametric and nonparametric methods is not always
possible. Thus, the problem of approximating data by a mixture of known distributions
represented by the density functions $\varphi_k(x, \theta_k)$,</p>
      <disp-formula>
        <tex-math>f(x, \theta) = \sum_k a_k \varphi_k(x, \theta_k), \qquad \sum_k a_k = 1,</tex-math>
      </disp-formula>
      <p>is more appropriately classified as a nonparametric task. However, if the coefficients
$a_k \geq 0$ are known, the task can be regarded as parametric. Nonparametric problems
also include least squares problems and linear and nonlinear regression. Methods for solving
such problems are also called projection methods. It should be noted that the above definition
of nonparametric methods is used only in mathematical statistics. In systems theory,
optimization, and approximation theory these methods are, on the contrary,
called parametric [4, 7], because the task is to find a finite number of
unknown parameters.</p>
    </sec>
    <sec>
      <title>Direct and indirect methods of estimating the probability density</title>
      <p>In a number of studies, the methods of estimating the probability density are divided into
direct and indirect ones. The hallmark of direct methods is that they use a direct relationship
between the required density and the empirical data. For example, direct methods include
those based on solving the integral equation that relates the probability density to
the empirical distribution function,</p>
      <disp-formula>
        <label>(2)</label>
        <tex-math>\int_{-\infty}^{\infty} I(x - v)\, f(v)\, dv = F_n(x),</tex-math>
      </disp-formula>
      <p>where $F_n(x)$ is the empirical distribution function, which is of stepped type, and
$I(x - v)$ is the unit step function, equal to 1 for $v \leq x$ and to 0 otherwise. The
solution of equation (2) gives the desired estimate of the probability density.
The empirical distribution function is given by</p>
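      <disp-formula>
        <label>(3)</label>
        <tex-math>F_n(x) = \frac{1}{N} \sum_{l=1}^{N} I_{(-\infty, x]}(x_l),</tex-math>
      </disp-formula>
      <p>where $I_{(-\infty, x]}(x_l)$ is the indicator of the set $(-\infty, x]$,</p>
      <disp-formula>
        <tex-math><![CDATA[I_{(-\infty, x]}(x_l) = \begin{cases} 1, & x_l \in (-\infty, x], \\ 0, & x_l \notin (-\infty, x], \end{cases}]]></tex-math>
      </disp-formula>
      <p>and $N$ is the sample size.</p>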
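      <p>As an illustration, the empirical distribution function (3) can be computed as in the following minimal Python sketch. The function name <monospace>empirical_cdf</monospace> and the sample values are hypothetical and serve only as an example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return F_n as a callable, following formula (3)."""
    ordered = sorted(sample)
    size = len(ordered)

    def cdf(x):
        # bisect_right counts the observations x_l with x_l <= x,
        # i.e. those for which the indicator of (-inf, x] equals 1.
        return bisect_right(ordered, x) / size

    return cdf


sample = [0.3, -1.2, 0.8, 0.3, 2.1]  # hypothetical data
F = empirical_cdf(sample)
for x in (-2.0, 0.3, 1.0, 5.0):
    print(x, F(x))
```
]]></preformat>
      <p>The result is a step function that jumps by $1/N$ at every observation, which is why equation (2) with this right-hand side cannot be solved by simple differentiation.</p>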
      <p>Solving equation (2) with the right-hand side (3) is, as already indicated, an
ill-posed problem and requires the use of special techniques. The ill-posedness is
especially pronounced for small sample sizes [5]. At the same time, the need to recover the
probability density from a limited amount of data arises frequently, for example in
the analysis and segmentation of nonstationary signals, in particular speech signals, whose
statistical characteristics can be considered constant only over intervals of similar
sounds.</p>
      <p>Unlike direct methods, indirect methods are based on minimizing the average risk
functional, described by expressions of the form</p>
      <disp-formula>
        <tex-math>R(\theta) = \int Q(x, \theta)\, dF(x),</tex-math>
      </disp-formula>
      <p>or the corresponding empirical functionals</p>
      <disp-formula>
        <tex-math>R_n(\theta) = \frac{1}{N} \sum_{l=1}^{N} Q(x_l, \theta).</tex-math>
      </disp-formula>
      <p>By this criterion, the indirect methods include, for example, the maximum
likelihood method [6].</p>
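      <p>The connection between the empirical risk and maximum likelihood can be illustrated with a short Python sketch. It assumes the loss $Q(x, \theta) = -\ln f(x, \theta)$ and a Gaussian density with $\theta = (\mu, \sigma)$; both choices, as well as the sample values and function names, are assumptions made only for this example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def neg_log_gaussian(x, mu, sigma):
    """Assumed loss Q(x, theta) = -ln f(x, theta), Gaussian f, theta = (mu, sigma)."""
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + 0.5 * ((x - mu) / sigma) ** 2


def empirical_risk(sample, mu, sigma):
    """R_n(theta) = (1/N) * sum over l of Q(x_l, theta)."""
    return sum(neg_log_gaussian(x, mu, sigma) for x in sample) / len(sample)


sample = [0.9, 1.4, 2.1, 1.7, 0.4, 1.5]  # hypothetical observations
N = len(sample)

# For this loss the minimizer of R_n is known in closed form:
# the sample mean and the (biased) sample standard deviation.
mu_hat = sum(sample) / N
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in sample) / N)

risk_min = empirical_risk(sample, mu_hat, sigma_hat)
print("mu:", mu_hat, "sigma:", sigma_hat, "risk:", risk_min)

# Moving away from the maximum likelihood estimates increases the risk.
print(empirical_risk(sample, mu_hat + 0.1, sigma_hat) > risk_min)
print(empirical_risk(sample, mu_hat, 1.2 * sigma_hat) > risk_min)
```
]]></preformat>
      <p>With this loss, minimizing the empirical functional is the same as maximizing the likelihood, so the density estimate is obtained indirectly, through the parameter estimates.</p>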
      <p>Other methods are, in principle, also direct, such as histogram techniques and methods
based on approximating the $\delta$-function by a regular function in the expression</p>
      <disp-formula>
        <label>(4)</label>
        <tex-math>f(x) = \int_{-\infty}^{\infty} \delta(x - v)\, f(v)\, dv.</tex-math>
      </disp-formula>
      <p>However, a clear distinction between direct and indirect methods is, in general, not
always possible, because in both cases the problem of finding the
density estimate can be reduced, in one way or another, to the problem of minimizing a
functional of the empirical data, in particular of the empirical distribution function.</p>
    </sec>
    <sec>
      <title>On kernel and projection estimates of the probability density</title>
      <p>The kernel method of obtaining density estimates is based on approximating the
$\delta$-function under the integral sign in (4) by a function $K(x)$ defined on some
interval of the argument. This function must satisfy the condition</p>
      <disp-formula>
        <tex-math>\lim_{h \to 0} \frac{1}{h} K\left(\frac{x}{h}\right) = \delta(x).</tex-math>
      </disp-formula>
      <p>A frequently used choice of $K(x)$ is</p>
      <disp-formula>
        <tex-math><![CDATA[K(x) = \begin{cases} 1/2, & |x| \leq 1, \\ 0, & |x| > 1. \end{cases}]]></tex-math>
      </disp-formula>
      <p>After this substitution, the right-hand side of equation (4) is the expectation of the
function $\frac{1}{h} K\left(\frac{x - v}{h}\right)$, which can be replaced by its empirical mean.
If the parameter $h$ is chosen depending on the sample size $n$, the probability
density estimate in accordance with equation (4) can be written as</p>
      <disp-formula>
        <label>(5)</label>
        <tex-math>\hat{f}(x) = \frac{1}{n h(n)} \sum_{l=1}^{n} K\left(\frac{x - x_l}{h(n)}\right).</tex-math>
      </disp-formula>
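      <p>Estimate (5) can be sketched in a few lines of Python. The rectangular kernel is the one given above; the sample values and the two values of $h$ are hypothetical and are chosen only to show how the result depends on $h$.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def rect_kernel(u):
    """Rectangular kernel: 1/2 for |u| <= 1 and 0 otherwise."""
    return 0.5 if abs(u) <= 1.0 else 0.0


def kernel_estimate(x, sample, h):
    """Estimate (5): (1 / (n * h)) * sum over l of K((x - x_l) / h)."""
    n = len(sample)
    return sum(rect_kernel((x - xl) / h) for xl in sample) / (n * h)


sample = [-1.1, -0.9, -1.0, 1.0, 0.8, 1.2]  # hypothetical data, two clusters
for h in (0.25, 2.5):
    values = [kernel_estimate(x, sample, h) for x in (-1.0, 0.0, 1.0)]
    print("h =", h, "estimate at -1, 0, 1:", values)
```
]]></preformat>
      <p>For the small value of $h$ the estimate vanishes between the two clusters, whereas for the large value it is constant over all three evaluation points, so the two clusters are no longer distinguishable.</p>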
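      <p>The convergence of this expression to the desired density is ensured by
the conditions: 1) $h(n) \to 0$ as $n \to \infty$, and 2) for any number $a > 0$ the inequality</p>
      <disp-formula>
        <tex-math><![CDATA[\sum_{n=1}^{\infty} e^{-a\, n\, h(n)} < \infty]]></tex-math>
      </disp-formula>
      <p>holds.</p>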
      <p>It can be seen from the definition of the function $K(x)$ that decreasing the parameter
$h$ increases the accuracy with which the $\delta$-function is approximated, but at the same
time increases the chance that the estimate is erroneously assigned to the class of multimodal
densities. Conversely, increasing this parameter may lead to the estimate being erroneously
assigned to the unimodal densities. The resulting problem of choosing the parameter $h$
stems from the ill-posedness of the density estimation problem and for this
reason has no unique solution. We can only assert that estimating unimodal
distributions requires larger values of $h$ than estimating multimodal ones.</p>
      <p>Equation (4) can also be used when estimating the probability density by the projection
method. In this method, the unknown probability density is represented by a
polynomial in a system of normalized orthogonal functions $\{\varphi_k(x)\}_1^m$, and the estimate is [1]</p>
      <disp-formula>
        <label>(6)</label>
        <tex-math>\hat{f}(x) = \sum_{k=1}^{m} a_k \varphi_k(x).</tex-math>
      </disp-formula>
      <p>Substitution of this polynomial in (4) gives the equations</p>
      <disp-formula>
        <tex-math>f(x) = \sum_{k=1}^{m} a_k \varphi_k(x) \quad \text{and} \quad a_k = \int_c^d \varphi_k(x)\, f(x)\, dx \approx \frac{1}{n} \sum_{l=1}^{n} \varphi_k(x_l).</tex-math>
      </disp-formula>
      <p>Substituting this expression in (6) leads to the formula</p>
      <disp-formula>
        <tex-math>\hat{f}(x) = \frac{1}{n} \sum_{l=1}^{n} \sum_{k=1}^{m} \varphi_k(x)\, \varphi_k(x_l).</tex-math>
      </disp-formula>
      <p>Finally, if we introduce the kernel function
$K(x, x_l) = \sum_{k=1}^{m} \varphi_k(x)\, \varphi_k(x_l)$, the estimate of
the density takes a form similar to (5):</p>
      <disp-formula>
        <tex-math>\hat{f}(x) = \frac{1}{n} \sum_{l=1}^{n} K(x, x_l).</tex-math>
      </disp-formula>
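      <p>The projection estimate can be sketched as follows. The cosine system used here is only one possible choice of functions orthonormal on $[c, d]$ and is an assumption of this example, as are the sample values and the function names. The sketch computes the coefficients as empirical means and checks that the polynomial form (6) and the kernel form give the same value.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def basis(k, x, c, d):
    """k-th function (k = 1, 2, ...) of a cosine system orthonormal on [c, d]."""
    length = d - c
    if k == 1:
        return 1.0 / math.sqrt(length)
    return math.sqrt(2.0 / length) * math.cos((k - 1) * math.pi * (x - c) / length)


def projection_coefficients(sample, m, c, d):
    """a_k is approximated by the empirical mean of phi_k over the sample."""
    n = len(sample)
    return [sum(basis(k, xl, c, d) for xl in sample) / n for k in range(1, m + 1)]


def projection_estimate(x, coeffs, c, d):
    """Polynomial (6): sum over k of a_k * phi_k(x)."""
    return sum(a * basis(k, x, c, d) for k, a in enumerate(coeffs, start=1))


def projection_kernel(x, xl, m, c, d):
    """K(x, x_l) = sum over k of phi_k(x) * phi_k(x_l)."""
    return sum(basis(k, x, c, d) * basis(k, xl, c, d) for k in range(1, m + 1))


sample = [0.12, 0.25, 0.31, 0.48, 0.52, 0.77, 0.81, 0.90]  # hypothetical, in [0, 1]
c, d, m = 0.0, 1.0, 4
coeffs = projection_coefficients(sample, m, c, d)

x0 = 0.4
via_polynomial = projection_estimate(x0, coeffs, c, d)
via_kernel = sum(projection_kernel(x0, xl, m, c, d) for xl in sample) / len(sample)
print(via_polynomial, via_kernel)

# Midpoint-rule check that the estimate integrates to one over [c, d].
steps = 200
grid = [c + (i + 0.5) * (d - c) / steps for i in range(steps)]
area = sum(projection_estimate(x, coeffs, c, d) for x in grid) * (d - c) / steps
print(area)
```
]]></preformat>
      <p>An estimate of this kind integrates to one when all observations lie in $[c, d]$, but it is not guaranteed to be nonnegative everywhere, so the first constraint in (1) has to be checked separately.</p>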
      <p>The use of projection methods in which the estimate is represented by formula (6) is
not limited to the case considered. There are problems that rely equally on
projection methods and on the integral equation (2). In one such approach, the estimate is
constructed from smoothed data, where the smoothing is provided by a non-degenerate linear
operator of the form</p>
      <disp-formula>
        <tex-math>B g(x) = \int_c^d K(x, v)\, g(v)\, dv.</tex-math>
      </disp-formula>
      <p>The action of this operator on (2) leads to the equation</p>
      <disp-formula>
        <label>(7)</label>
        <tex-math>G f(x) = Q_n(x),</tex-math>
      </disp-formula>
      <p>where</p>
      <disp-formula>
        <tex-math>G f(x) = \int_c^d \int_c^d K(x, z)\, I(z - v)\, f(v)\, dv\, dz,</tex-math>
      </disp-formula>
      <disp-formula>
        <tex-math>Q_n(x) = B F_n(x) = \frac{1}{n} \sum_{l=1}^{n} \int_{x_l}^{d} K(x, v)\, dv,</tex-math>
      </disp-formula>
      <p>and (6) is an expansion with respect to the eigenfunctions $\{\varphi_k(x)\}_1^m$ of the
operator $G^H G$.</p>
      <p>Because equation (7) is ill-posed, its solution is reduced to the problem of
minimizing the functional</p>
      <disp-formula>
        <label>(8)</label>
        <tex-math>J(\hat{f}) = \int_c^d \left( G \hat{f}(x) - Q_n(x) \right)^2 dx + \alpha_j \int_c^d \hat{f}^2(x)\, dx.</tex-math>
      </disp-formula>
      <p>It is shown that this functional reaches its minimum at the following values of the
coefficients of the polynomial (6):</p>
      <disp-formula>
        <tex-math>a_k = \frac{\lambda_k b_k}{\lambda_k^2 + \alpha_j},</tex-math>
      </disp-formula>
      <p>where $b_k = \int_c^d Q_n(x)\, \varphi_k(x)\, dx$, and $\varphi_k(x)$ and $\lambda_k$ are the
eigenfunctions and eigenvalues of the operator $G^H G$.</p>
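      <p>In the particular case when the kernel of the operator $B$ has the form
$K(x, v) = K(x - v)$, the operator reduces to the convolution</p>
      <disp-formula>
        <tex-math>B f(x) = \int_c^d K(x - v)\, f(v)\, dv.</tex-math>
      </disp-formula>
      <p>This makes it possible to use the Fourier transform when minimizing the functional (8)
[1, 6, 7]. The resulting density estimate is</p>
      <disp-formula>
        <tex-math>\hat{f}(x) = \frac{1}{n} \sum_{l=1}^{n} g(x - x_l).</tex-math>
      </disp-formula>
      <p>Here the function</p>
      <disp-formula>
        <tex-math>g(u) = \frac{1}{2\pi} \int_{-\infty}^{\infty} g(\omega)\, e^{j \omega u}\, d\omega</tex-math>
      </disp-formula>
      <p>is the inverse Fourier transform of</p>
      <disp-formula>
        <tex-math>g(\omega) = \frac{K(\omega) K^{*}(\omega)}{K(\omega) K^{*}(\omega) + \alpha_j \omega^2},</tex-math>
      </disp-formula>
      <p>and</p>
      <disp-formula>
        <tex-math>K(\omega) = \int_{-\infty}^{\infty} K(u)\, e^{-j \omega u}\, du.</tex-math>
      </disp-formula>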
    </sec>
    <sec>
      <title>Histogram estimation of the probability density</title>
      <p>A histogram is a bar chart of the distribution of a random variable. The
height of each bar represents the number of values of the random variable that fall
within the corresponding interval; the intervals generally have different widths (see Fig. 1).
The ratio of the number $n_l$ of values of the random variable in the interval
$(x_{l-1}, x_l]$ to the total number of values $N$ is the empirical probability of the event
$x \in (x_{l-1}, x_l]$.</p>
      <p>[Fig. 1. Histogram of a mixture of normal distributions; vertical axis from 0 to 3500, horizontal axis from -2 to 3.]</p>
      <p>The theoretical value of this probability is expressed through the probability density:</p>
      <disp-formula>
        <tex-math>P\{x \in (x_{l-1}, x_l]\} = \int_{x_{l-1}}^{x_l} f(x)\, dx.</tex-math>
      </disp-formula>
      <p>If we equate the theoretical and empirical probabilities and assume that the change of the
probability density within each interval can be neglected, the density estimate can
be written as</p>
      <disp-formula>
        <label>(9)</label>
        <tex-math>f_l = \frac{n_l}{\Delta x_l\, N}, \quad l = 1, \ldots, q,</tex-math>
      </disp-formula>
      <p>where $\Delta x_l = x_l - x_{l-1}$ is the length of the $l$-th interval.</p>
      <p>When the range $(c, d]$ of values of the random variable is split into $q$ equal intervals of
length $\Delta x_l = (d - c)/q$, formula (9) can be written as</p>
      <disp-formula>
        <label>(10)</label>
        <tex-math>f_l = \frac{n_l\, q}{(d - c)\, N}, \quad l = 1, \ldots, q.</tex-math>
      </disp-formula>
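      <p>Formula (10) can be sketched in Python as follows. The function name, the sample, and the choice of $c$, $d$ and $q$ are hypothetical; the intervals are taken as closed on the right, as in the text.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def histogram_density(sample, c, d, q):
    """Formula (10): f_l = n_l * q / ((d - c) * N) on q equal intervals of (c, d]."""
    N = len(sample)
    width = (d - c) / q
    counts = [0] * q
    for x in sample:
        if c < x <= d:
            # Zero-based index of the interval (x_{l-1}, x_l] that contains x;
            # min/max guard against rounding at the interval boundaries.
            index = int(math.ceil((x - c) / width)) - 1
            counts[min(max(index, 0), q - 1)] += 1
    return [n_l * q / ((d - c) * N) for n_l in counts]


sample = [0.1, 0.2, 0.2, 0.5, 0.9, 1.0]  # hypothetical data in (0, 1]
c, d, q = 0.0, 1.0, 4
f = histogram_density(sample, c, d, q)
print(f)

# The piecewise-constant estimate integrates to one when every
# observation falls inside (c, d].
print(sum(f) * (d - c) / q)
```
]]></preformat>
      <p>Changing $q$ in this sketch changes the estimate noticeably for a sample this small, which is the partition problem discussed in this section.</p>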
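      <p>The sample values obtained from formula (9) or (10) can be used to
approximate the probability density. To a first approximation, the coefficients of the
approximating polynomial can be found from the interpolation conditions
$a^T \varphi(x_l) = f_l$, $l = 1, \ldots, m$, which give</p>
      <disp-formula>
        <label>(11)</label>
        <tex-math>a = \left( \Phi^T \right)^{-1} f,</tex-math>
      </disp-formula>
      <p>where $f = (f_1, \ldots, f_m)^T$ and $\Phi = (\varphi(x_1), \varphi(x_2), \ldots, \varphi(x_m))$.</p>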
      <p>The estimate can also be obtained by the generalized method of local
interpolation. In this method, a sequence of formulas of the form (11) is defined for the
corresponding sequence of interpolation intervals. These formulas
are supplemented by constraints that ensure the required matching conditions between the
local solutions, and the order of the polynomial is not required to match the number of
points $(x_l, f_l)$, that is, $q$ and $m$ need not be equal.</p>
      <p>Approximating the probability density by smoothing is a least squares problem.
The task is to minimize the sum of squared residuals between the smoothing polynomial
and the density estimates $f_l$. The functional to be minimized is written as</p>
      <disp-formula>
        <label>(12)</label>
        <tex-math>J(a) = \sum_{l=1}^{n} \left( a^T \varphi(x_l) - f_l \right)^2.</tex-math>
      </disp-formula>
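      <p>A minimal sketch of minimizing a functional of the form (12) is given below. It assumes a monomial basis $\varphi(x) = (1, x, \ldots, x^{m-1})^T$ and solves the normal equations by Gaussian elimination; the basis, the solver, the data points and all function names are assumptions of this example rather than part of the method described above.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def phi(x, m):
    """Assumed basis vector (1, x, ..., x^(m-1))."""
    return [x ** k for k in range(m)]


def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= factor * M[col][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def least_squares_coefficients(xs, fs, m):
    """Minimize J(a) by solving the normal equations."""
    rows = [phi(x, m) for x in xs]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    rhs = [sum(r[i] * f for r, f in zip(rows, fs)) for i in range(m)]
    return solve_linear(normal, rhs)


def functional_J(a, xs, fs):
    """J(a) = sum over l of (a^T phi(x_l) - f_l)^2."""
    m = len(a)
    return sum(
        (sum(ak * pk for ak, pk in zip(a, phi(x, m))) - f) ** 2
        for x, f in zip(xs, fs)
    )


xs = [0.0, 0.5, 1.0, 1.5, 2.0]        # hypothetical interval centers
fs = [0.10, 0.32, 0.41, 0.29, 0.12]   # hypothetical histogram estimates f_l
a_hat = least_squares_coefficients(xs, fs, 3)
print(a_hat, functional_J(a_hat, xs, fs))
```
]]></preformat>
      <p>Because the number of points exceeds the number of coefficients, the polynomial does not pass through the points $(x_l, f_l)$ exactly; it smooths them. Nothing in this sketch forces the result to be nonnegative or to integrate to one.</p>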
      <p>To smooth the data one can, as in interpolation, use the methods of
local approximation, generalizing them as required, in particular to
smoothly joined polynomials that are defined on a sequence of intervals and deliver the
minimum of functionals of the form (12) under the constraints imposed by the matching
conditions.</p>
      <p>Histogram methods [1, 6] of estimating the probability density,
especially those based on interpolation, have an inherent problem: the partition of the set of
values of the random variable into intervals when the sample size is small. Fig. 2a, b shows two
histograms of a mixture of normal distributions, the same as in Fig. 1, for a sample of size
100. The figure shows that when the set of values of the random variable is partitioned into 20
intervals (Fig. 2a), the interpolation approach does not make it possible to restore the true
form of the distribution or to draw the right conclusions. The situation improves when the
set is split into 10 intervals (Fig. 2b). In this case the graph is similar in shape to the true
bimodal probability density. This problem can, in principle, be solved within the framework
of an adaptive partition of the set of values of the random variable into intervals that are not
necessarily of the same length. The optimal partition is then found by varying the lengths
and centers of the intervals and comparing the results, possibly followed by
averaging them.</p>
      <p>With local approximation, including generalized local approximation, the partition problem
is less acute and consists, on the contrary, in ensuring a number of intervals sufficient for
smoothing. However, a new question arises: the choice of an algorithm that
ensures the optimal degree of smoothing of the empirical estimates (9). This question can,
in principle, be resolved by methods based on varying the free parameters of the
algorithm and then selecting the best variant according to some quality criterion.</p>
      <p>Another problem for interpolation and smoothing approximation methods is
ensuring that the estimate $\hat{f}(x)$ belongs to the class of probability density functions.
Within local approximation, these conditions can be taken into account by
introducing the corresponding constraints into the problem of minimizing the functional (12),
and within the interpolation approach, by varying the lengths and centers of the
partition intervals.</p>
      <p>Finding the coefficients of the polynomial (6) by optimization methods is a
linear regression problem. In practice, however, these polynomials are often built on
systems of standard probability densities that depend nonlinearly on a certain set of
parameters. In this case, if we introduce the parameter vector
$\theta = (\theta_1, \ldots, \theta_r)^T$, it is possible to define the polynomial</p>
      <disp-formula>
        <tex-math>P(x, a, \theta) = \sum_{k=1}^{m} a_k \varphi_k(x, \theta) = a^T \varphi(x, \theta),</tex-math>
      </disp-formula>
      <p>where $\varphi(x, \theta) = (\varphi_1(x, \theta), \ldots, \varphi_m(x, \theta))^T$.</p>
      <p>Accordingly, the estimate of the density can be written as</p>
      <disp-formula>
        <tex-math>\hat{f}(x) = \sum_{k=1}^{m} \hat{a}_k \varphi_k(x, \hat{\theta}) = \hat{a}^T \varphi(x, \hat{\theta}),</tex-math>
      </disp-formula>
      <p>where the parameter estimates are the solution of the minimization problem</p>
      <disp-formula>
        <tex-math>(\hat{a}, \hat{\theta}) = \arg\min_{a, \theta} \sum_{l=1}^{n} \left( a^T \varphi(x_l, \theta) - f_l \right)^2.</tex-math>
      </disp-formula>
      <p>Finding the parameter vectors $a$ and $\theta$ in this case belongs to the class of
nonlinear problems, which are usually solved by constrained optimization methods.</p>
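      <p>The structure of this nonlinear problem can be illustrated with a deliberately simplified Python sketch. It assumes a single Gaussian component ($m = 1$, $\theta = (\mu, \sigma)$) and replaces the constrained optimization methods mentioned above with an exhaustive search over a small grid of $\theta$; for each fixed $\theta$ the coefficient $a$ is found from a linear least squares formula. The component, the grid, the data and the function names are all assumptions of this example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def gaussian(x, mu, sigma):
    """Assumed standard density phi_1(x, theta) with theta = (mu, sigma)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def best_amplitude(xs, fs, mu, sigma):
    """For fixed theta the optimal a solves a linear least squares problem."""
    phis = [gaussian(x, mu, sigma) for x in xs]
    a = sum(p * f for p, f in zip(phis, fs)) / sum(p * p for p in phis)
    residual = sum((a * p - f) ** 2 for p, f in zip(phis, fs))
    return a, residual


def fit(xs, fs, mus, sigmas):
    """Grid search over theta; returns (residual, a, mu, sigma) with the least residual."""
    best = None
    for mu in mus:
        for sigma in sigmas:
            a, residual = best_amplitude(xs, fs, mu, sigma)
            if best is None or residual < best[0]:
                best = (residual, a, mu, sigma)
    return best


# Hypothetical points (x_l, f_l), generated here from a known density
# so that the result of the search can be checked.
xs = [-2.0 + 0.5 * i for i in range(11)]
fs = [gaussian(x, 0.5, 1.0) for x in xs]

mus = [-1.0 + 0.25 * i for i in range(9)]     # candidate values of mu
sigmas = [0.5 + 0.25 * i for i in range(7)]   # candidate values of sigma
residual, a_hat, mu_hat, sigma_hat = fit(xs, fs, mus, sigmas)
print(a_hat, mu_hat, sigma_hat, residual)
```
]]></preformat>
      <p>A grid search scales poorly with the number of parameters and does not enforce constraints such as nonnegative coefficients; it is used here only to make the separation between the linear parameters $a$ and the nonlinear parameters $\theta$ visible.</p>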
    </sec>
    <sec id="sec-2">
      <title>Conclusion</title>
      <p>This paper has examined direct and indirect methods of estimating the probability density
of acoustic signals and interference in information and control
telecommunication systems. Kernel and projection models of probability
density estimation were considered, based on approximating the probability density of signals
in the cases of unimodal and multimodal distributions. Applying the histogram method to a
mixture of normal distributions shows that the true form of the distribution cannot always
be restored when the set of values of the random variable is partitioned into different
numbers of intervals. The problem can be addressed by an adaptive optimal partition obtained
by varying the lengths and centers of the partition intervals.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Kropotov</surname>
            <given-names>Y.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paramonov</surname>
            <given-names>A.A.</given-names>
          </string-name>
          <article-title>Methods of designing information processing telecommunications systems sharing audio algorithms: monograph</article-title>
          .-Moscow-Berlin: Direct Media,
          <year>2015</year>
          . 226 p (in Russian).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Kropotov</surname>
            <given-names>Y.A.</given-names>
          </string-name>
          <article-title>The time interval determine the probability distribution of the amplitude of the speech signal law</article-title>
          <source>Radiotekhnika</source>
          ,
          <year>2006</year>
          . № 6. pp.
          <fpage>97</fpage>
          -
          <lpage>98</lpage>
          (in Russian).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Ermolaev</surname>
            <given-names>V.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kropotov</surname>
            <given-names>Y.A.</given-names>
          </string-name>
          <article-title>About correlation estimating model parameters of acoustic echo</article-title>
          .
          <source>Questions electronics</source>
          , Vol.
          <volume>1</volume>
          . №1. pp.
          <fpage>46</fpage>
          -
          <lpage>50</lpage>
          (in Russian).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Kropotov</surname>
            <given-names>Y. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bykov</surname>
            <given-names>A.A.</given-names>
          </string-name>
          <article-title>Algorithm acoustic noise suppression and interference with concentrated formant distribution rejection bands</article-title>
          .
          <source>Questions electronics</source>
          ,
          <year>2010</year>
          . Vol.
          <volume>1</volume>
          . № 1. pp.
          <fpage>60</fpage>
          -
          <lpage>65</lpage>
          (in Russian).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kropotov</surname>
            <given-names>Y.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bykov</surname>
            <given-names>A.A.</given-names>
          </string-name>
          <article-title>Approximation of law probability distribution of acoustic noise signal samples</article-title>
          .
          <source>Radio engineering and telecommunication systems</source>
          ,
          <year>2011</year>
          . № 2
          . pp.
          <fpage>61</fpage>
          -
          <lpage>67</lpage>
          (in Russian).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Ermolaev</surname>
            <given-names>V.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eremenko</surname>
            <given-names>V.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karasev</surname>
            <given-names>O.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kropotov</surname>
            <given-names>Y.A.</given-names>
          </string-name>
          <article-title>Identification of model of discrete linear systems with variable, slowly varying parameters</article-title>
          .
          <source>Radio Engineering and Electronics</source>
          ,
          <year>2010</year>
          . Vol.
          <volume>55</volume>
          . №1. pp.
          <fpage>57</fpage>
          -
          <lpage>62</lpage>
          (in Russian).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Ermolaev</surname>
            <given-names>V. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karasev</surname>
            <given-names>O.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kropotov</surname>
            <given-names>Y.A.</given-names>
          </string-name>
          <article-title>Interpolation method filtration in problems of speech signal processing in the time domain</article-title>
          .
          <source>Journal of Computer and Information Technology</source>
          ,
          <year>2008</year>
          . № 7. pp.
          <fpage>12</fpage>
          -
          <lpage>17</lpage>
          (in Russian).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>