<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Novel Test for Survival Data Analysis of Cancer Patients</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dmitriy Klyushin</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pavel Yakovlev</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Feofaniya Clinical Hospital</institution>
          ,
          <addr-line>Akademika Zabolotnogo 21, Kyiv, 03143</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Taras Shevchenko National University of Kyiv, Ukraine</institution>
          ,
          <addr-line>Akademika Glushkova Avenue 4D, Kyiv, 03680</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Modern medical information systems necessarily include functions for assessing the effectiveness of treatment provided to patients. As a rule, this problem is solved by estimating survival functions in order to assess the risk of death. Traditionally, three nonparametric tests are used to compare survival functions: the Cochran–Mantel–Haenszel log-rank test, the Wilcoxon test for censored data, and the Tarone–Ware test. In these tests, testing the hypothesis that two survival functions are equivalent is, as a rule, reduced to comparing a test statistic with a critical value of the standard normal distribution. These tests give reliable results only if the samples are large enough and additional conditions are met. Consequently, the development of effective medical information systems that perform survival analysis requires nonparametric tests that rely on a minimum of preliminary assumptions and allow the use of small samples. The paper proposes a test of the hypothesis that two survival functions are equivalent; the test does not depend on the sample size and uses no additional preconditions except continuity of the distribution functions of the initial data.</p>
      </abstract>
      <kwd-group>
        <kwd>Survival analysis</kwd>
        <kwd>risk of death</kwd>
        <kwd>Kaplan–Meier curve</kwd>
        <kwd>Log-rank test</kwd>
        <kwd>Wilcoxon test</kwd>
        <kwd>Tarone−Ware test</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        To assess the effectiveness of the treatment provided to patients and the risk of death during a
given period, many cancer healthcare facilities design information systems that analyze data and
assess patient survival using the Kaplan–Meier curve [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Three nonparametric tests are usually
used in survival analysis based on the Kaplan–Meier estimator: the Cochran–Mantel–Haenszel
log-rank test [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], the Wilcoxon test [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and the Tarone–Ware test [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. To test statistical
hypotheses about the identity of the survival functions, these tests mainly compare a test statistic
with critical values of the standard normal distribution. The most popular is the log-rank test, which
has maximum power under proportional-hazards alternatives [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. However, these
tests give reliable results only if the samples are large enough and additional conditions are met.
For example, the Wilcoxon test is preferable when deaths at early time points carry more weight
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], and the Tarone–Ware test also places heavier weight on hazards at early times [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>The nonparametric Kaplan–Meier estimator is built from the survival times of patients, i.e. the
intervals between a certain date (for example, the date of surgery) and the moment of
death or censoring. It allows the construction of survival functions from data on the lifetimes
of patients and gives an estimate of the risk of death during a given time period. Similarly, it
can be used to estimate the time to equipment failure or another significant event. Thus, it can be
used to assess the risk of a specific event (death, failure, etc.) from observations, both
censored and uncensored.</p>
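      <p>As an illustration of how such a survival function is built, the following minimal Python sketch computes the Kaplan–Meier product-limit estimate from follow-up times and event indicators. The function name kaplan_meier and the toy data are hypothetical and are not taken from the study; the sketch assumes right-censored data in which 1 marks an observed death and 0 a censored observation.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at every distinct time with at least one death.

    times  -- observed follow-up times
    events -- 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # each subject leaves the risk set at its own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the share of survivors at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set after t.
        at_risk -= leaving[t]
    return curve


# Toy data (hypothetical, not the patients described in this paper).
toy_times = [5, 8, 8, 12, 15, 20]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
```
]]></preformat>
      <p>For the toy data the estimated survival probability drops to 5/6 after the first death and reaches zero at the last observed death; censored observations only reduce the number of subjects at risk.</p>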
      <p>
        The aim of this paper is to describe an alternative nonparametric test that uses no
assumption other than the most general one (continuity of the distribution) and can be applied to small
samples (size less than 50). This test uses the p-statistics investigated in [
        <xref ref-type="bibr" rid="ref10 ref11 ref8 ref9">8–11</xref>
        ] and is based on
Hill's assumption A(n) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. The theoretical background of the p-statistics was developed by
Matveichuk and Petunin [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ] and later by Johnson and Kotz [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and by Klyushin and Petunin
[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. The high sensitivity and specificity of the nonparametric test for homogeneity of two
samples based on the p-statistics were demonstrated in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Here we propose a new application of
this test: the comparison of two survival curves.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Theoretical background</title>
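      <p>
        Consider samples <inline-formula><tex-math>x = (x_1, x_2, \ldots, x_n) \in G_1</tex-math></inline-formula> and <inline-formula><tex-math>y = (y_1, y_2, \ldots, y_n) \in G_2</tex-math></inline-formula> from absolutely continuous
distributions <inline-formula><tex-math>F_1</tex-math></inline-formula> and <inline-formula><tex-math>F_2</tex-math></inline-formula>. Hill's assumption A(n) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] states that for exchangeable random
values <inline-formula><tex-math>x_1, x_2, \ldots, x_n \in G</tex-math></inline-formula> following an absolutely continuous distribution function,
      </p>
      <disp-formula id="eq1">
        <label>(1)</label>
        <tex-math><![CDATA[P\left( x \in \left( x_{(i)}, x_{(j)} \right) \right) = \frac{j - i}{n + 1}, \quad i < j,]]></tex-math>
      </disp-formula>
      <p>where <inline-formula><tex-math>x_{(i)}</tex-math></inline-formula> and <inline-formula><tex-math>x_{(j)}</tex-math></inline-formula> are the i-th and j-th order statistics and <inline-formula><tex-math>x</tex-math></inline-formula> is a further sample value from <inline-formula><tex-math>G</tex-math></inline-formula>. Find the relative frequency <inline-formula><tex-math>h_{ij}</tex-math></inline-formula>
of the event <inline-formula><tex-math>y_m \in (x_{(i)}, x_{(j)})</tex-math></inline-formula> for the elements of <inline-formula><tex-math>y</tex-math></inline-formula> and estimate the deviation of <inline-formula><tex-math>h_{ij}</tex-math></inline-formula> from the expected
probability <inline-formula><tex-math>\frac{j - i}{n + 1}</tex-math></inline-formula> using the Wilson confidence interval <inline-formula><tex-math>I_{ij}^{(n)} = (p_{ij}^{(1)}, p_{ij}^{(2)})</tex-math></inline-formula>, where</p>
      <disp-formula id="eq2">
        <label>(2)</label>
        <tex-math><![CDATA[\begin{aligned}
p_{ij}^{(1)} &= \frac{h_{ij} n + g^2/2 - g \sqrt{h_{ij} (1 - h_{ij}) n + g^2/4}}{n + g^2}, \\
p_{ij}^{(2)} &= \frac{h_{ij} n + g^2/2 + g \sqrt{h_{ij} (1 - h_{ij}) n + g^2/4}}{n + g^2}.
\end{aligned}]]></tex-math>
      </disp-formula>
      <p>
        The significance level of this interval is a function of <inline-formula><tex-math>g</tex-math></inline-formula>. When <inline-formula><tex-math>g = 3</tex-math></inline-formula>, the significance level
of <inline-formula><tex-math>I_{ij}^{(n)}</tex-math></inline-formula> does not exceed 0.05 [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. The p-statistics, which estimates the homogeneity of the samples <inline-formula><tex-math>x</tex-math></inline-formula> and <inline-formula><tex-math>y</tex-math></inline-formula>, is
      </p>
      <disp-formula id="eq3">
        <label>(3)</label>
        <tex-math><![CDATA[h = \frac{\#\left\{ p_{ij} = \frac{j - i}{n + 1} \in I_{ij}^{(n)} \right\}}{n (n - 1) / 2}.]]></tex-math>
      </disp-formula>
      <p>It is the relative frequency of the event <inline-formula><tex-math>\left\{ p_{ij} = \frac{j - i}{n + 1} \in I_{ij}^{(n)} \right\}</tex-math></inline-formula> over all pairs of order statistics. Therefore, using (2) and (3) we
may construct the Wilson interval <inline-formula><tex-math>I</tex-math></inline-formula> for the p-statistics and formulate the following test: the null
hypothesis of identity of the survival functions is accepted if the upper bound of <inline-formula><tex-math>I</tex-math></inline-formula> is greater
than 0.95; otherwise it is rejected.</p>
      <p>
        If the null hypothesis is true, the events <inline-formula><tex-math>\left\{ p_{ij} = \frac{j - i}{n + 1} \in I_{ij}^{(n)} \right\}</tex-math></inline-formula> form a generalized
Bernoulli scheme [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ]. If the null hypothesis is false, they form a modified Bernoulli scheme. If
the null hypothesis may be either true or false, they form the Matveichuk–Petunin scheme [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        If the null hypothesis is true, <inline-formula><tex-math>\lim_{n \to \infty} \frac{j - i}{n + 1} \in (0, 1)</tex-math></inline-formula>, and <inline-formula><tex-math>\lim_{n \to \infty} \frac{i}{n + 1} \in (0, 1)</tex-math></inline-formula>, then the asymptotic
significance level <inline-formula><tex-math>\beta</tex-math></inline-formula> of a sequence of confidence intervals <inline-formula><tex-math>I_{ij}^{(n)}</tex-math></inline-formula> is less than 0.05 [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>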
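      <p>The procedure of this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the number of trials in each Wilson interval is taken to be the size of the second sample (equal to n when both samples have the same size, as in the text), the intervals are treated as open, and the data are assumed uncensored and free of ties.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def wilson_interval(h, n, g=3.0):
    """Wilson interval (p1, p2) for a relative frequency h observed in n trials."""
    centre = h * n + g * g / 2.0
    half = g * math.sqrt(h * (1.0 - h) * n + g * g / 4.0)
    denom = n + g * g
    return (centre - half) / denom, (centre + half) / denom


def p_statistic(x, y, g=3.0):
    """Share of pairs i < j whose probability (j - i)/(n + 1) lies in the
    Wilson interval built from the frequency of y inside (x_(i), x_(j))."""
    xs = sorted(x)
    n, m = len(xs), len(y)
    hits = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            h_ij = sum(1 for v in y if xs[i] < v < xs[j]) / m
            low, high = wilson_interval(h_ij, m, g)
            if low < (j - i) / (n + 1) < high:
                hits += 1
    pairs = n * (n - 1) // 2
    return hits / pairs, pairs


def accept_identity(x, y, g=3.0, threshold=0.95):
    """Accept the null hypothesis if the upper Wilson bound for h exceeds 0.95."""
    h, pairs = p_statistic(x, y, g)
    _, upper = wilson_interval(h, pairs, g)
    return upper > threshold, h, upper


# Hypothetical toy samples: y_near interleaves with x, y_far is shifted away.
x = list(range(1, 21))
y_near = [v + 0.5 for v in x]
y_far = [v + 100 for v in x]
near_ok, near_h, near_upper = accept_identity(x, y_near)
far_ok, far_h, far_upper = accept_identity(x, y_far)
```
]]></preformat>
      <p>For the interleaved toy samples every pair passes the check, so the p-statistics equals 1 and the null hypothesis is accepted; for the shifted sample only 99 of the 190 pairs pass and the null hypothesis is rejected. The double loop over pairs makes the cost grow quadratically with the sample size.</p>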
    </sec>
    <sec id="sec-3">
      <title>3. Experiments and results</title>
      <p>To confirm the high sensitivity and specificity of the proposed test, we considered two groups of
patients with a nondifferential diagnosis of bladder cancer of stages T2 and T3 who, in 1998–2016,
received special surgical care (radical and salvage cystectomy) at the urology department
of the Kiev City Clinical Oncological Dispensary. Only patients with
a complete history and an exactly known (uncensored) survival outcome were included in the analysis. The
extent of the malignant process was characterized according to the TNM clinical classification,
7th ed. (2010).</p>
      <p>The first group (stage T2) consists of 38 patients: 22 underwent
radical cystectomy (17 died and 5 are alive), and 16 underwent salvage cystectomy
(7 died and 9 are alive). The second group (stage T3) consists of 51 patients: 33
underwent radical cystectomy (24 died and 9 are alive), and 18 underwent
salvage cystectomy (10 died and 8 are alive). The survival curves for the first and second
groups are shown in Fig. 1 and Fig. 2, where the mark 1 denotes radical cystectomy and
0 denotes salvage cystectomy. Tables 1–4 contain the mean survival times and the results of
testing the identity of the survival curves using four tests: log-rank, Wilcoxon, Tarone–Ware, and
p-statistics.</p>
      <p>[Figure 1. Survival curves for the first group (stage T2). Horizontal axis: survival time, days (0 to 4000); vertical axis: values from 0 to 1; legend: 0 = salvage cystectomy, 1 = radical cystectomy.]</p>
      <p>
As we see, in the first group (stage T2) the survival curve of the patients who underwent
radical cystectomy lies above the survival curve of the patients who underwent salvage
cystectomy. Therefore, intuitively, the risk of death for the former patients is lower than for the
latter, and radical cystectomy prolongs the life of patients better than salvage
cystectomy. However, this hypothesis must be rigorously tested using statistical tests.
Traditionally, the significance of the deviation between two survival curves is estimated with the
log-rank test, the Wilcoxon test, and the Tarone–Ware test. The decision in each of these tests
is based on its p-value.</p>
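      <p>The three traditional tests can be viewed as one weighted log-rank statistic with different weights at each event time: 1 for the log-rank test, the number at risk for the Gehan-type Wilcoxon test, and its square root for the Tarone–Ware test. The Python sketch below illustrates this under the usual large-sample normal approximation; the function name weighted_logrank and the toy data are hypothetical and do not reproduce the study data or the software used by the authors.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math

# Weight applied at an event time with n subjects at risk.
WEIGHTS = {
    "logrank": lambda n: 1.0,
    "wilcoxon": lambda n: float(n),
    "tarone-ware": lambda n: math.sqrt(n),
}


def weighted_logrank(times1, events1, times2, events2, kind="logrank"):
    """Return (z, two-sided p-value) of a weighted log-rank comparison.

    events: 1 = death observed, 0 = censored.
    """
    weight = WEIGHTS[kind]
    pooled = [(t, e, 1) for t, e in zip(times1, events1)]
    pooled += [(t, e, 2) for t, e in zip(times2, events2)]
    score = 0.0
    variance = 0.0
    for t in sorted({s for s, e, _ in pooled if e == 1}):
        n1 = sum(1 for s, _, g in pooled if g == 1 and s >= t)
        n2 = sum(1 for s, _, g in pooled if g == 2 and s >= t)
        d1 = sum(1 for s, e, g in pooled if g == 1 and s == t and e == 1)
        d2 = sum(1 for s, e, g in pooled if g == 2 and s == t and e == 1)
        n, d = n1 + n2, d1 + d2
        w = weight(n)
        # Observed minus expected deaths in group 1 at time t.
        score += w * (d1 - n1 * d / n)
        if n > 1:
            # Hypergeometric variance of the deaths in group 1 at time t.
            variance += w * w * n1 * n2 * d * (n - d) / (n * n * (n - 1))
    if variance == 0.0:
        raise ValueError("no informative event times")
    z = score / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical toy data, all events observed.
a_times, a_events = [1, 3], [1, 1]
b_times, b_events = [2, 4], [1, 1]
z_lr, p_lr = weighted_logrank(a_times, a_events, b_times, b_events)
```
]]></preformat>
      <p>Because the p-value comes from the standard normal distribution, it is only an approximation for samples as small as those in the toy example, which is the limitation of these tests that motivates the p-statistics test.</p>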
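      <p>[Figure 2. Survival curves for the second group (stage T3). Horizontal axis: survival time, days (0 to 2500); vertical axis: values from 0 to 1; legend: 0 = salvage cystectomy, 1 = radical cystectomy.]</p>
      <p>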
In the second group (stage T3) the survival curve of the patients who underwent radical
cystectomy also lies above the survival curve of the patients who underwent salvage
cystectomy. We may again suppose that the risk of death for the former patients is lower than for
the latter. Note that since stage T3 is more advanced than T2, the survival interval became much
shorter. The maximum survival time in the first group is about 4000 days (almost 11 years), but
in the second group it is about 2500 days (almost 7 years). Thus, the effectiveness of cystectomy
in this group is offset by the stage of the tumors. To estimate the significance of the deviation
between the two survival curves we again used the log-rank test, the Wilcoxon test, and the
Tarone–Ware test with their p-values.</p>
      <p>In both cases we complemented the traditional analysis by computing the p-statistics as an
alternative to the three tests above. Descriptive statistics of the data are provided in Tables 1–3.
The hypothesis of the identity of the two survival functions (0 denotes salvage cystectomy and
1 denotes radical cystectomy) in the first and second groups (stages T2 and T3, respectively) was
tested using four tests at a significance level of 0.05. In all the results, there were no statistically
significant differences between the survival curves, since the observed values did not exceed the
critical value and the upper confidence bound for the p-statistics exceeds 0.95. The log-rank test,
the Wilcoxon test and the Tarone–Ware test accept the null hypothesis if the corresponding
p-values are greater than 0.05, whereas the test based on the p-statistics accepts the null
hypothesis if the upper bound of its confidence interval is greater than 0.95.</p>
      <p>It is noteworthy that the observed p-value (the probability of rejecting the null
hypothesis, provided that it is true) in the p-statistics test is an order of magnitude smaller than in the
three traditional nonparametric tests used in survival analysis. We regard this as evidence of the high
sensitivity and specificity of the proposed test.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions</title>
      <p>The mathematical basis of modern medical information systems for assessing the effectiveness of
treatment and the risk of death during a given time period must be more rigorously justified.
Traditional nonparametric tests used in survival analysis (the log-rank test, the Wilcoxon test,
and the Tarone–Ware test) assume conditions that are not always met in practice. These tests
reduce the verification of statistical hypotheses about the equivalence of survival functions to
a comparison with a critical value of the standard normal distribution. This is justified only when
the samples are large enough and additional conditions are met. Thus, to develop an effective
medical information system for survival analysis, we need nonparametric tests with minimal
preliminary assumptions and minimal requirements on the sample size.</p>
      <p>In this paper, we described a test for verifying the hypothesis of the equivalence of the
survival functions and of the risk of death during a given time period. The test does not depend on the
sample size and uses no additional preconditions except the condition that the samples
have no ties.</p>
      <p>We have provided the mathematical background and demonstrated the high sensitivity and
specificity of the test for homogeneity of two random samples from continuous
distributions in comparison with three traditional tests. We have shown the practical application
of this test in the survival analysis of patients with bladder cancer and demonstrated its high
performance. This test may be used for the development of effective medical information
systems that perform survival analysis of cancer patients. Note that the scheme described in the
paper is easily extended to a much wider range of problems connected with the assessment of
the risk of device failure or of some other significant event based on censored and uncensored
observations.</p>
      <p>Future work will be directed at reducing the computational complexity of the
proposed test and extending it to various problems of risk assessment.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Morris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Landon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Reguilon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Butler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>McKee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Nolte</surname>
          </string-name>
          ,
          <article-title>Understanding the link between health systems and cancer survival: A novel methodological approach using a system-level conceptual model</article-title>
          ,
          <source>Journal of Cancer Policy</source>
          ,
          <volume>25</volume>
          ,
          <issue>202</issue>
          , 100233. doi:
          <pub-id pub-id-type="doi">10.1111/codi.15622</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.M.</given-names>
            <surname>Bland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.G.</given-names>
            <surname>Altman</surname>
          </string-name>
          ,
          <article-title>The logrank test</article-title>
          ,
          <source>British Medical Journal</source>
          ,
          <volume>328</volume>
          ,
          <year>2004</year>
          ,
          <fpage>1073</fpage>
          . doi:
          <pub-id pub-id-type="doi">10.1136/bmj.328.7447.1073</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.A.</given-names>
            <surname>Proschan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.E.</given-names>
            <surname>Dodd</surname>
          </string-name>
          ,
          <article-title>Re-randomization tests in clinical trials</article-title>
          ,
          <source>Statistics in Medicine</source>
          ,
          <volume>38</volume>
          ,
          <year>2019</year>
          , pp.
          <fpage>2292</fpage>
          -
          <lpage>2302</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1002/sim.8093</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.E.</given-names>
            <surname>Tarone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ware</surname>
          </string-name>
          ,
          <article-title>On distribution-free tests for equality of survival distributions</article-title>
          ,
          <source>Biometrika</source>
          ,
          <volume>64</volume>
          ,
          <year>1977</year>
          , pp.
          <fpage>156</fpage>
          -
          <lpage>160</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1093/biomet/64.1.156</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.G.</given-names>
            <surname>Karrison</surname>
          </string-name>
          ,
          <article-title>Versatile tests for comparing survival curves based on weighted log-rank statistics</article-title>
          ,
          <source>The Stata Journal</source>
          ,
          <volume>16</volume>
          ,
          <year>2016</year>
          , pp.
          <fpage>678</fpage>
          -
          <lpage>690</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hazra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Gogtay</surname>
          </string-name>
          ,
          <article-title>Biostatistics Series Module 9: Survival Analysis</article-title>
          ,
          <source>Indian Journal of Dermatology</source>
          ,
          <volume>62</volume>
          ,
          <year>2017</year>
          , pp.
          <fpage>251</fpage>
          -
          <lpage>257</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.4103/ijd.IJD_201_17</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.G.</given-names>
            <surname>Karadeniz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Ercan</surname>
          </string-name>
          ,
          <article-title>Examining tests for comparing survival curves with right censored data</article-title>
          ,
          <source>Statistics in Transition New Series</source>
          ,
          <volume>18</volume>
          ,
          <year>2017</year>
          , pp.
          <fpage>311</fpage>
          -
          <lpage>328</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.21307/stattrans2016-072</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.A.</given-names>
            <surname>Matveichuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yu.I.</given-names>
            <surname>Petunin</surname>
          </string-name>
          ,
          <article-title>Generalization of Bernoulli schemes that arise in order statistics, I</article-title>
          ,
          <source>Ukrainian Mathematical Journal</source>
          ,
          <volume>42</volume>
          ,
          <year>1990</year>
          , pp.
          <fpage>459</fpage>
          -
          <lpage>466</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1007/BF01058940</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.A.</given-names>
            <surname>Matveichuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yu.I.</given-names>
            <surname>Petunin</surname>
          </string-name>
          ,
          <article-title>Generalization of Bernoulli schemes that arise in order statistics, II</article-title>
          ,
          <source>Ukrainian Mathematical Journal</source>
          ,
          <volume>43</volume>
          ,
          <year>1991</year>
          , pp.
          <fpage>728</fpage>
          -
          <lpage>734</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1007/BF01058940</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>N.</given-names>
            <surname>Johnson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kotz</surname>
          </string-name>
          ,
          <article-title>Some generalizations of Bernoulli and Polya-Eggenberger contagion models</article-title>
          ,
          <source>Statistical Papers</source>
          ,
          <volume>32</volume>
          ,
          <year>1991</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1007/BF02925473</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D.A.</given-names>
            <surname>Klyushin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yu.I.</given-names>
            <surname>Petunin</surname>
          </string-name>
          ,
          <article-title>A Nonparametric Test for the Equivalence of Populations Based on a Measure of Proximity of Samples</article-title>
          ,
          <source>Ukrainian Mathematical Journal</source>
          ,
          <volume>55</volume>
          ,
          <year>2003</year>
          , pp.
          <fpage>181</fpage>
          -
          <lpage>198</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1023/A:1025495727612</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>B.M.</given-names>
            <surname>Hill</surname>
          </string-name>
          ,
          <article-title>Posterior distribution of percentiles: Bayes' theorem for sampling from a population</article-title>
          ,
          <source>Journal of the American Statistical Association</source>
          ,
          <volume>63</volume>
          ,
          <year>1968</year>
          , pp.
          <fpage>677</fpage>
          -
          <lpage>691</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>