Nonparametric Test for Change Point Detection in Time Series
Dmitriy Klyushin¹, Irina Martynenko²
¹ Taras Shevchenko National University of Kyiv, prospect Glushkova 4D, Kyiv, 03680, Ukraine
² Academy of Labour, Social Relations and Tourism, 3-A Kiltseva doroha, Kyiv, 03187, Ukraine


    Abstract
    We describe new nonparametric tests for detecting change points in time series, i.e. points that divide a time series into segments of random values obeying different distributions. The proposed tests are based on the Dempster–Hill theory and on generalizations of the Bernoulli scheme. To recognize change points, we propose simplified versions of the Klyushin–Petunin homogeneity test. The significance level of these tests does not exceed 0.05, and their accuracy is comparable to that of the original version. The tests have high sensitivity and specificity in recognizing the heterogeneity of two random samples with different means and the same variance, or with equal means but different variances. The described tests can be useful in a wide variety of areas, from healthcare (for example, analyzing time series generated by pulse oximeters) to IoT devices in industry and everyday life.

    Keywords
    Time series, regression analysis, modeling and prediction, nonparametric statistics

1. Introduction
    The modern epoch is characterized by the intensive development of information networks and IoT technologies. More and more information sources generate large amounts of sequential data, forming time series that must be processed in a short time. The problem has been exacerbated by the COVID-19 outbreak, which has raised questions related to the processing of signals from medical devices, tracking tools, etc. For example, a change in the distribution of the random variables representing the blood oxygen level of a patient with COVID-19 may require an immediate response from clinicians, which means that the change point in the time series generated by the pulse oximeter must be found in real time.
    A time series is considered homogeneous if all its segments consist of random values that obey the same distribution. A change point is a point before and after which the values of the time series obey different distributions. Consequently, the problem of detecting a change point can be reduced to testing the homogeneity of samples in adjacent segments of the time series.
    Change point detection methods are divided into online and offline. Online methods analyze fragments of a time series in real time. Offline methods analyze a complete time series from the first to the last point. A fairly detailed review of methods for finding change points in time series is given in [1]. Although we restrict ourselves to one-dimensional time series, the approach can easily be generalized to the multivariate case (see the survey in [2]).
    Since the problem of finding a change point can be reduced to the problem of sample homogeneity, it should be mentioned that homogeneity tests are divided into parametric and nonparametric ones. The former are based on the assumption that the time series data obey a specific (e.g., Gaussian) or parameterized distribution [3–8]. The latter do not depend on assumptions about the shape of the distribution and use only its most general properties, for example,


continuity [9–19]. Online methods for finding change points in time series are considered in [20–24]. These methods analyze individual fragments of a time series rather than the entire series from beginning to end, in contrast to offline methods, which makes it unnecessary to accumulate data in large storage. The purpose of this article is to describe new nonparametric online methods for detecting change points in a time series. The originality of the proposed methods lies in simplifying the Klyushin–Petunin test [25] without loss of accuracy.

2. Simplified versions of the Klyushin–Petunin test for homogeneity
   Consider the problem of testing the homogeneity of two samples, which is equivalent to the problem of change point detection. Usually, the Kolmogorov–Smirnov test and the Mann–Whitney–Wilcoxon test are used to solve this problem. Recently, we proposed to use the Klyushin–Petunin test [25] and demonstrated its high performance [26].
   Suppose that samples $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_m)$ are drawn from distributions $F_1$ and $F_2$, respectively. The null hypothesis states that $x$ and $y$ obey the same distribution, i.e. $F_1 = F_2$. The alternative hypothesis states that $F_1 \neq F_2$.
   Standard nonparametric two-sample homogeneity tests (Kolmogorov–Smirnov, Mann–Whitney–Wilcoxon, etc.) produce only a one-sided confidence limit at the given significance level. This means that if the test statistic exceeds a given threshold, the samples are considered heterogeneous; otherwise the answer is uncertain. Therefore, it is desirable to construct tests with two-sided confidence limits for the test statistic. One such test is the Klyushin–Petunin homogeneity test [25], based on the Matveychuk–Petunin model [27–29]. It allows constructing a two-sided confidence interval with a given significance level for both the true and the false null hypothesis and estimating the probabilities of type I and type II errors.

2.1.        The Dempster–Hill theory and p‐statistics

   Let $x = (x_1, x_2, \ldots, x_n) \sim F_1$, $y = (y_1, y_2, \ldots, y_n) \sim F_2$, and let $x_{(0)} = -\infty, x_{(1)}, x_{(2)}, \ldots, x_{(n)}, x_{(n+1)} = +\infty$ be the corresponding variational series. The null hypothesis $H_0$ states that $F_1 = F_2$. If the null hypothesis is true, then by the Dempster–Hill assumption [30] we have

$$p_{ij}^{(k,n)} = P\left(y_k \in \left(x_{(i)}, x_{(j)}\right)\right) = \frac{j-i}{n+1}, \quad i < j. \qquad (1)$$
   Corollary 1. Selecting random natural numbers $i$ and $j$ such that $i < j$ $N$ times, for a sufficiently large $N$ we have

$$p_{ij}^{(k,n)} = P\left(y_k \in \left(x_{(i)}, x_{(j)}\right)\right) \approx \frac{j-i}{n+1}, \quad i < j.$$


   Corollary 2. Selecting random natural numbers $j$ $N$ times ($N < n$), from (1) we have

$$p_{1j}^{(k,n)} = P\left(y_k \in \left(x_{(1)}, x_{(j)}\right)\right) = \frac{j-1}{n+1}.$$
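
   As a quick plausibility check of (1) (our own illustration, not part of the original derivation), the following Python sketch estimates the coverage probability by simulation; for n = 20, i = 5, j = 12 the observed frequency should approach (j − i)/(n + 1) = 1/3:

    import numpy as np

    rng = np.random.default_rng(0)
    n, i, j, trials = 20, 5, 12, 100_000
    hits = 0
    for _ in range(trials):
        x = np.sort(rng.normal(size=n))      # variational series x_(1) <= ... <= x_(n)
        y = rng.normal()                     # a new observation from the same distribution
        hits += x[i - 1] < y < x[j - 1]      # event y in (x_(i), x_(j))
    print(hits / trials, (j - i) / (n + 1))  # both values are close to 0.333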
                                                                    j −i
   If $F_1 \neq F_2$, then $p_{ij}^{(k,n)}$ deviates significantly from $\frac{j-i}{n+1}$. Therefore, we must estimate the difference between the observed relative frequency $h_{ij}^{(k,n)}$ of the event $y_k \in \left(x_{(i)}, x_{(j)}\right)$ and this theoretical value. To do this, we use confidence limits $\left(p_{ij}^{(1)}, p_{ij}^{(2)}\right)$ for the binomial proportion $p_{ij}^{(k,n)}$ with a given significance level $\beta$. Let $L_{xy}$ be the number of intervals $\left(p_{ij}^{(1)}, p_{ij}^{(2)}\right)$ containing $p_{ij}^{(k,n)}$, and let $h_{xy} = \frac{2L_{xy}}{n(n-1)}$ be the proportion of such intervals; likewise, let $h_{yx} = \frac{2L_{yx}}{m(m-1)}$ be the proportion of the intervals $\left(p_{ij}^{(1)}, p_{ij}^{(2)}\right)$ containing $p_{ij}^{(k,m)}$ in the scheme where the samples $x$ and $y$ are switched. Then the p-statistic [25] is

$$\rho(x, y) = \frac{1}{2}\left(h_{xy} + h_{yx}\right). \qquad (2)$$

It is the probability that the samples $x$ and $y$ are homogeneous, i.e. that they are drawn from the same distribution.
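
   A minimal Python sketch of the p-statistic (2), assuming Wilson confidence limits with z = 3 for the binomial proportions (the choice of z and the function names are our illustrative assumptions, not prescriptions from [25]):

    import numpy as np
    from itertools import combinations

    def wilson_interval(h, trials, z=3.0):
        # Wilson confidence limits for an observed proportion h over `trials` trials
        denom = 1.0 + z ** 2 / trials
        center = (h + z ** 2 / (2 * trials)) / denom
        half = (z / denom) * np.sqrt(h * (1 - h) / trials + z ** 2 / (4 * trials ** 2))
        return center - half, center + half

    def h_proportion(x, y):
        # proportion of intervals (x_(i), x_(j)) whose confidence limits
        # cover the Dempster-Hill probability (j - i)/(n + 1)
        xs, y = np.sort(x), np.asarray(y)
        n, m = len(xs), len(y)
        hits = 0
        for i, j in combinations(range(1, n + 1), 2):
            h = np.mean((xs[i - 1] < y) & (y < xs[j - 1]))  # observed frequency h_ij
            lo, hi = wilson_interval(h, m)
            hits += lo < (j - i) / (n + 1) < hi
        return hits / (n * (n - 1) / 2)                     # h_xy in the notation above

    def p_statistic(x, y):
        # formula (2): rho(x, y) = (h_xy + h_yx) / 2
        return 0.5 * (h_proportion(x, y) + h_proportion(y, x))

For two samples drawn from the same distribution, ρ stays close to the nominal coverage probability of the confidence limits; heterogeneity pushes it down.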

2.2.     The Klyushin–Petunin test and its simplifications

   The original version of the Klyushin–Petunin test uses $N = \frac{n(n-1)}{2}$ confidence intervals $\left(p_{ij}^{(1)}, p_{ij}^{(2)}\right)$. Let $L$ be the number of intervals $\left(p_{ij}^{(1)}, p_{ij}^{(2)}\right)$ containing $p_{ij}^{(k,n)}$; then (2) is the relative frequency of the random event $\left\{p_{ij} \in \left(p_{ij}^{(1)}, p_{ij}^{(2)}\right)\right\}$, which has probability $1 - \beta$. Construct the confidence interval $I_n$ for the probability of this event with the given significance level (for definiteness, in this paper we use the Wilson confidence interval [31]). The decision rules may be formulated in the following way (a sketch of all three rules follows the list):
   1)   Original version. If the confidence interval $I_n$ covers 0.95, the null hypothesis is accepted; otherwise it is rejected.
   2)   First simplified version. Generate $N$ random natural numbers $i$ and $j$ such that $j > i$, where $N \geq 30$. Compute $\rho(x, y)$ using only the intervals $\left(x_{(i)}, x_{(j)}\right)$ with these $i$ and $j$, and construct the confidence interval $I_n$. If $I_n$ covers 0.95, the null hypothesis is accepted; otherwise it is rejected.
   3)   Second simplified version. Compute $\rho(x, y)$ using only the intervals $\left(x_{(1)}, x_{(j)}\right)$ and construct the confidence interval $I_n$. If $I_n$ covers 0.95, the null hypothesis is accepted; otherwise it is rejected.
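
   A minimal sketch of the three decision rules under the same assumptions (Wilson intervals, z = 3). For brevity only the x→y direction of ρ is computed here, whereas the full statistic (2) averages both directions:

    import numpy as np
    from itertools import combinations

    def wilson_interval(h, trials, z=3.0):    # same helper as in Section 2.1
        denom = 1.0 + z ** 2 / trials
        center = (h + z ** 2 / (2 * trials)) / denom
        half = (z / denom) * np.sqrt(h * (1 - h) / trials + z ** 2 / (4 * trials ** 2))
        return center - half, center + half

    def rho_over_pairs(x, y, pairs):
        # relative frequency of intervals (x_(i), x_(j)) covering (j - i)/(n + 1)
        xs, y = np.sort(x), np.asarray(y)
        n, m = len(xs), len(y)
        hits = 0
        for i, j in pairs:
            lo, hi = wilson_interval(np.mean((xs[i - 1] < y) & (y < xs[j - 1])), m)
            hits += lo < (j - i) / (n + 1) < hi
        return hits / len(pairs)

    def accepts_h0(x, y, version="original", N=100, rng=None):
        n = len(x)
        all_pairs = list(combinations(range(1, n + 1), 2))
        if version == "original":                # all n(n - 1)/2 intervals
            pairs = all_pairs
        elif version == "first":                 # N >= 30 random pairs i < j
            rng = rng or np.random.default_rng()
            pairs = [all_pairs[k] for k in rng.choice(len(all_pairs), N, replace=False)]
        else:                                    # "second": only intervals (x_(1), x_(j))
            pairs = [(1, j) for j in range(2, n + 1)]
        rho = rho_over_pairs(x, y, pairs)
        lo, hi = wilson_interval(rho, len(pairs))   # confidence interval I_n
        return lo < 0.95 < hi                       # accept H0 iff I_n covers 0.95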

3. Numerical experiments
    Since there are many formulas for confidence intervals for binomial proportions [31], we must justify our choice. To compare p-statistics based on different confidence intervals, we used samples of 100 and 300 numbers and parameterized distributions. The step for the parameters was set to 0.1. To avoid the bias of pseudorandom number generators, we conducted 10 experiments and averaged the results. We considered eight methods:
        I.    Clopper–Pearson interval.
        II.   Bayesian interval.
        III.  Wilson interval with continuity correction.
        IV.   Wilson interval without continuity correction.
        V.    Normal approximation with continuity correction.
        VI.   Normal approximation without continuity correction.
        VII.  Arcsine transformation with correction.
        VIII. Arcsine transformation without correction.
    To compare the statistics, samples from the following general categories were considered: 1) same variance and different means; 2) same mean and different variances; 3) both different means and different variances. Samples from a parametric family of hypothetical distributions were compared with a sample from a general population with a reference distribution (Fig. 1–6).
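
    For reference, several of these intervals are available off the shelf; a minimal sketch, assuming statsmodels is installed and assuming the "Bayesian interval" above corresponds to the Jeffreys prior (the continuity-corrected and arcsine variants would have to be coded manually):

    from statsmodels.stats.proportion import proportion_confint

    count, nobs = 42, 100                     # e.g. 42 "successes" out of 100 trials
    methods = {
        "beta": "I: Clopper-Pearson",
        "jeffreys": "II: Bayesian (Jeffreys prior)",
        "wilson": "IV: Wilson, no continuity correction",
        "normal": "VI: normal approximation, no correction",
    }
    for method, label in methods.items():
        lo, hi = proportion_confint(count, nobs, alpha=0.05, method=method)
        print(f"{label:42s} [{lo:.4f}, {hi:.4f}]")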
Figure 1: N(0, α), size = 100
Figure 2: N(0, α), size = 300
Figure 3: N(α, 1), size = 100
Figure 4: N(α, 1), size = 300
Figure 5: αU(0, 1) + (1–α)U(1/2, 3/2), size = 100
Figure 6: αU(0, 1) + (1–α)U(1/2, 3/2), size = 300

    Analyzing Figures 1–6, we see that the choice of the confidence intervals $I_{ij}$ does not affect the final result: the homogeneity measures are equivalent. Therefore, the family of homogeneity measures based on the Dempster–Hill theory has high performance for both small and large samples, and the p-statistic may be considered independent of the type of confidence interval selected.
    In [26] we showed that, in the context of change point detection, the sensitivity and specificity of the Klyushin–Petunin test are higher than those of the classical tests (Kolmogorov–Smirnov and Mann–Whitney–Wilcoxon). Now we consider the simplified versions of the Klyushin–Petunin test, restricting ourselves to 100 randomly selected intervals $\left(x_{(i)}, x_{(j)}\right)$, following the recommendations given in [32]. We generated samples containing 40 random numbers obeying distributions with the same mean and different variances, different means and the same variance, and different means and different variances, and averaged the results over 10 runs.
    Here N(µ, σ) is the Gaussian distribution with mean µ and standard deviation σ, U(a, b) is the uniform distribution on the interval (a, b), LN(µ, σ) is the lognormal distribution with parameters µ and σ, Exp(λ) is the exponential distribution with parameter λ, and Γ(α, β) is the gamma distribution with parameters α and β (Fig. 7–9). Consider a time series $x_1, x_2, \ldots, x_n, \ldots$. A change point in the time series is a point $x_m$ such that the sample $(x_1, x_2, \ldots, x_m)$ is drawn from a distribution $F_1$ and the sample $(x_{m+1}, x_{m+2}, \ldots, x_n)$ is drawn from a distribution $F_2 \neq F_1$. Let the sample $(x_1, x_2, \ldots, x_n)$ be fixed. Consider the sliding window $(x_i, x_{i+1}, \ldots, x_{i+n})$, where $i = 1, \ldots, n$. As $i$ increases, the sliding window is “contaminated” by the elements of the sample $(x_{n+1}, x_{n+2}, \ldots, x_{2n})$.
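
    A minimal sketch of this sliding-window scheme, assuming the p_statistic helper from Section 2.1 is in scope; the fixed alarm threshold is an arbitrary illustrative simplification of the 0.95-coverage rule from Section 2.2:

    import numpy as np

    def detect_change(series, window=40, threshold=0.8):
        # compare the fixed reference window with each sliding window and
        # raise an alarm when the p-statistic drops below the threshold
        reference = np.asarray(series[:window])
        for start in range(1, len(series) - window + 1):
            rho = p_statistic(reference, series[start:start + window])
            if rho < threshold:   # window is "contaminated" by the new distribution
                return start      # first window position that signals a change
        return None

    # usage: a change at position 40 between N(0, 1) and N(3, 1) segments
    rng = np.random.default_rng(1)
    data = np.concatenate([rng.normal(0, 1, 40), rng.normal(3, 1, 40)])
    print(detect_change(data))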
Figure 10: P‐statistics for a) Gaussian distributions with different means (N(0,1)–N(3,1)–N(0,1), N(0,1)–N(2,1)–N(0,1), N(0,1)–N(1,1)–N(0,1)) and b) lognormal distributions with different means (LN(0,1)–LN(3,1)–LN(0,1), LN(0,1)–LN(2,1)–LN(0,1), LN(0,1)–LN(1,1)–LN(0,1))


Figure 11: P‐statistics for a) uniform distributions (U(0,1)–U(2,3)–U(0,1), U(0,1)–U(1,2)–U(0,1), U(0,1)–U(0.5,1.5)–U(0,1)) and b) gamma distributions (Γ(1,2)–Γ(4,2)–Γ(1,2), Γ(1,2)–Γ(3,2)–Γ(1,2), Γ(1,2)–Γ(2,2)–Γ(1,2))
Figure 12: Simplified P‐statistics (first version) for a) Gaussian distributions (N(0,1)–N(3,1)–N(0,1), N(0,1)–N(2,1)–N(0,1), N(0,1)–N(1,1)–N(0,1)) and b) lognormal distributions (LN(0,1)–LN(3,1)–LN(0,1), LN(0,1)–LN(2,1)–LN(0,1), LN(0,1)–LN(1,1)–LN(0,1))


Figure 13: Simplified P‐statistics (first version) for a) uniform distributions (U(0,1)–U(2,3)–U(0,1), U(0,1)–U(1,2)–U(0,1), U(0,1)–U(0.5,1.5)–U(0,1)) and b) gamma distributions (Γ(1,2)–Γ(4,2)–Γ(1,2), Γ(1,2)–Γ(3,2)–Γ(1,2), Γ(1,2)–Γ(2,2)–Γ(1,2))
Figure 14: Simplified P‐statistics (second version) for a) Gaussian distributions (N(0,1)–N(3,1)–N(0,1), N(0,1)–N(2,1)–N(0,1), N(0,1)–N(1,1)–N(0,1)) and b) lognormal distributions (LN(0,1)–LN(3,1)–LN(0,1), LN(0,1)–LN(2,1)–LN(0,1), LN(0,1)–LN(1,1)–LN(0,1))


Figure 15: Simplified P‐statistics (second version) for a) uniform distributions (U(0,1)–U(2,3)–U(0,1), U(0,1)–U(1,2)–U(0,1), U(0,1)–U(0.5,1.5)–U(0,1)) and b) gamma distributions (Γ(1,2)–Γ(4,2)–Γ(1,2), Γ(1,2)–Γ(3,2)–Γ(1,2), Γ(1,2)–Γ(2,2)–Γ(1,2))
   The p-statistic attains its minimum at a change point and increases as the sliding window moves further (see Figures 10–15). The sensitivity of the versions of the Klyushin–Petunin test is shown in Table 1. The earlier a test detects the “contamination”, the more sensitive it is.
Table 1
Sensitivity of versions of the Klyushin–Petunin test (order number of the detected change point)

Distribution            Original version    First version (N=100)    Second version (N=40)
N(0,1)–N(3,1)                  12                    14                        8
N(0,1)–N(2,1)                  14                    17                       15
N(0,1)–N(1,1)                  22                    29                       28
N(0,1)–N(0,4)                  15                    12                       14
N(0,1)–N(0,3)                  26                    17                       14
N(0,1)–N(0,2)                  27                    32                       18
LN(0,1)–LN(3,1)                12                    12                        9
LN(0,1)–LN(2,1)                23                    12                       10
LN(0,1)–LN(1,1)                28                    21                       28
LN(0,1)–LN(0,4)                14                    13                        9
LN(0,1)–LN(0,3)                18                    17                        9
LN(0,1)–LN(0,2)                25                    21                       28
U(0,1)–U(2,3)                  12                    11                        6
U(0,1)–U(1,2)                  12                    11                        6
U(0,1)–U(0.5,1.5)              16                    16                       12
Exp(1)–Exp(4)                  17                    14                       14
Exp(1)–Exp(3)                  18                    18                       29
Exp(1)–Exp(2)                  22                    37                       32
Γ(1,2)–Γ(4,1)                  17                    10                       18
Γ(1,2)–Γ(4,2)                  15                    18                       10
Γ(1,2)–Γ(2,2)                  22                    19                       30
   If one test detects a change point earlier than the others do, it is considered more sensitive. This does not affect the accuracy of change point detection, because regardless of when the contamination is first detected, the p-statistic decreases monotonically until the change point reaches the left end of the sliding window. After this point, the p-statistic becomes monotonically increasing.
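
   In code, this observation suggests estimating the change point position from the whole trajectory of ρ values rather than from the first alarm; a minimal sketch under the same assumptions:

    import numpy as np

    def locate_change(rho_values):
        # the p-statistic attains its minimum when the left end of
        # the sliding window reaches the change point
        return int(np.argmin(rho_values))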
   All the results are consistent. For example, when the first segment $(x_1, x_2, \ldots, x_{40})$ has the distribution N(0,1) and the second segment $(x_{41}, x_{42}, \ldots, x_{80})$ has the distribution N(3,1), the first sample is considered contaminated when m > 16 according to the original Klyushin–Petunin test (see Table 1). If the detected change point is equal to 40, the corresponding test is considered failed. As expected, the Klyushin–Petunin test is in general more robust and sensitive than its simplified versions.
    Table 1 shows that the Klyushin–Petunin test is sensitive both for distributions with different means and the same standard deviation (for example, N(0,1) vs N(1,1) and similar variants) and for distributions with the same mean and different standard deviations (for example, N(0,1) vs N(0,4) and similar variants). It also demonstrates high performance for distributions that differ in both mean and standard deviation (the exponential and gamma distributions). The more the distributions differ from each other, the earlier the change points are detected.
    Table 1 also demonstrates that the original Klyushin–Petunin test is the most robust: the first and second simplified versions of the p-statistic are not monotonic. Due to its high sensitivity and robustness, the original Klyushin–Petunin test may be considered an effective method for testing sample heterogeneity and detecting change points. The instability of the first and second simplified versions is quite understandable, since they are based on a random choice of intervals (the former) or on incomplete information (the latter). Therefore, despite the more complicated computations, the original version of the Klyushin–Petunin test is the preferred choice.
4. Conclusion
   The accuracy, sensitivity, and specificity of the simplified versions of the Klyushin–Petunin test are comparable to those of the original version. However, the original version, despite its computational complexity, is more robust. None of the versions of the Klyushin–Petunin test depends on assumptions about the parameters of the distributions, and all are equally sensitive to differences between means and between standard deviations. Their significance levels are less than 0.05. They do not require large storage for saving data. All the versions of the Klyushin–Petunin test are more effective than the Kolmogorov–Smirnov and Mann–Whitney–Wilcoxon tests for small samples (size less than 40).

REFERENCES
[1] C. Truong, L. Oudre, N. Vayatis, Selective review of offline change point detection methods, Signal Processing, 167 (2020) 107299. doi:10.1016/j.sigpro.2019.107299.
[2] S. Aminikhanghahi, D. J. Cook, A survey of methods for time series change point detection, Knowledge and Information Systems, 51 (2017) 339–367. doi:10.1007/s10115-016-0987-z.
[3] E. Gombay, L. Horvath, An application of the maximum likelihood test to the change-point problem, Stochastic Processes and Their Applications, 50 (1994) 161–171. doi:10.1016/0304-4149(94)90154-6.
[4] E. Gombay, L. Horvath, On the rate of approximations for maximum likelihood tests in change-point models, Journal of Multivariate Analysis, 56 (1996) 120–152. doi:10.1006/jmva.1996.0007.
[5] G. Gurevich, Retrospective parametric tests for homogeneity of data, Communications in Statistics. Theory and Methods, 36 (2007) 2841–2862. doi:10.1080/03610920701386968.
[6] B. James, K. James, D. Siegmund, Tests for a change-point, Biometrika, 74 (1987) 71–83. doi:10.1093/biomet/74.1.71.
[7] A. Vexler, G. Gurevich, Average most powerful tests for a segmented regression, Communications in Statistics. Theory and Methods, 38 (2009) 2214–2231. doi:10.1080/03610920802521208.
[8] G. Gurevich, A. Vexler, Retrospective change point detection: from parametric to distribution free policies, Communications in Statistics. Simulation and Computation, 39 (2010) 1–22. doi:10.1080/03610911003663881.
[9] B. Brodsky, B. Darkhovsky, Nonparametric Methods in Change-Point Problems, Kluwer Academic Publishers, Dordrecht, Boston, 1993. doi:10.1007/978-94-015-8163-9.
[10] B. Brodsky, B. Darkhovsky, Non-Parametric Statistical Diagnosis: Problems and Methods, Springer, Netherlands, 2000. doi:10.1007/978-94-015-9530-8.
[11] B. Brodsky, Change-Point Analysis in Nonstationary Stochastic Models, CRC Press, Boca Raton, 2017. doi:10.1201/9781315367989.
[12] J. Chen, A. Gupta, Parametric Statistical Change Point Analysis with Applications to Genetics, Medicine, and Finance, Birkhäuser, 2012. doi:10.1007/978-0-8176-4801-5.
[13] D. Ferger, On the power of nonparametric changepoint-tests, Metrika, 41 (1994) 277–292. doi:10.1007/BF01895324.
[14] A. Pettitt, A non-parametric approach to the change-point problem, Applied Statistics, 28 (1979) 126–135. doi:10.2307/2346729.
[15] D. Wolfe, E. Schechtman, Nonparametric statistical procedures for the changepoint problem, Journal of Statistical Planning and Inference, 9 (1984) 389–396. doi:10.1016/0378-3758(84)90013-2.
[16] E. Gombay, U-statistics for sequential change detection, Metrika, 52 (2000) 113–145. doi:10.1007/PL00003980.
[17] E. Gombay, U-statistics for change under alternatives, Journal of Multivariate Analysis, 78 (2001) 139–158. doi:10.1006/jmva.2000.1945.
[18] C. Zou, Y. Liu, P. Qin, Z. Wang, Empirical likelihood ratio test for the change-point problem, Statistics & Probability Letters, 77 (2007) 374–382. doi:10.1016/j.spl.2006.08.003.
[19] M. Holmes, I. Kojadinovic, J. Quessy, Nonparametric tests for change-point detection à la Gombay and Horváth, Journal of Multivariate Analysis, 115 (2013) 16–32. doi:10.1016/j.jmva.2012.10.004.
[20] P. Fearnhead, Z. Liu, On-line inference for multiple change point problems, Journal of the Royal Statistical Society, Series B, 69 (2007) 203–213. doi:10.1111/j.1467-9868.2007.00601.x.
[21] Y. Mei, Sequential change-point detection when unknown parameters are present in the pre-change distribution, The Annals of Statistics, 34 (2006) 92–122. doi:10.1214/009053605000000859.
[22] D. Siegmund, Sequential Analysis, Springer Series in Statistics, Springer-Verlag, New York, 1985. doi:10.1007/978-1-4757-1862-1.
[23] H. Poor, O. Hadjiliadis, Quickest Detection, Cambridge University Press, Cambridge, 2009. doi:10.1017/CBO9780511754678.
[24] A. Tartakovsky, B. Rozovskii, R. Blazek, H. Kim, A novel approach to detection of intrusions in computer networks via adaptive sequential and batch-sequential change-point detection methods, IEEE Transactions on Signal Processing, 54 (2006) 3372–3382. doi:10.1109/TSP.2006.879308.
[25] D. Klyushin, Yu. Petunin, A nonparametric test for the equivalence of populations based on a measure of proximity of samples, Ukrainian Mathematical Journal, 55 (2003) 181–198. doi:10.1023/A:1025495727612.
[26] D. Klyushin, I. Martynenko, Nonparametric test for change-point detection in data stream, in: 2020 IEEE Third International Conference on Data Stream Mining and Processing (DSMP), 2020, pp. 281–286. doi:10.1109/DSMP47368.2020.9204193.
[27] S. Matveichuk, Yu. Petunin, A generalization of the Bernoulli model occurring in order statistics. I, Ukrainian Mathematical Journal, 42 (1990) 459–466. doi:10.1007/BF01071335.
[28] S. Matveichuk, Yu. Petunin, A generalization of the Bernoulli model occurring in order statistics. II, Ukrainian Mathematical Journal, 43 (1991) 728–734. doi:10.1007/BF01058940.
[29] N. Johnson, S. Kotz, Some generalizations of Bernoulli and Polya–Eggenberger contagion models, Statistical Papers, 32 (1991) 1–17. doi:10.1007/BF02925473.
[30] B. Hill, Posterior distribution of percentiles: Bayes' theorem for sampling from a population, Journal of the American Statistical Association, 63 (1968) 677–691. doi:10.1080/01621459.1968.11009286.
[31] A. Pires, C. Amado, Interval estimators for a binomial proportion: comparison of twenty methods, REVSTAT–Statistical Journal, 6 (2008) 165–197.
[32] H. Freudenthal, The "empirical law of large numbers" or "The stability of frequencies", Educational Studies in Mathematics, 4 (1972) 484–490. doi:10.1007/BF00567002.