<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Pattern Structures for Treatment Optimization in Subgroups of Patients</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Natalia V. Korepanova</string-name>
          <email>korepanova.natalia@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sergei O. Kuznetsov</string-name>
          <email>skuznetsov@hse.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National Research University Higher School of Economics</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>A comparison of di erent treatment strategies does not always result in determining the best one for all patients, one needs to study subgroups of patients with signi cant di erence in e ciency between treatment strategies. To solve this problem an approach to subgroups generation is proposed, where data are described in terms of a pattern structure and pattern concepts stay for patient subgroups and their descriptions. To nd the most promising pattern concepts in terms of the di erence of treatment strategies in e ciency a version of CbO algorithm is proposed. An application to the analysis of data on childhood acute lymphoblastic leukemia is considered.</p>
      </abstract>
      <kwd-group>
        <kwd>pattern structure</kwd>
        <kwd>subgroup analysis</kwd>
        <kwd>acute lymphoblastic leukaemia</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Randomized controlled trial (RCT)[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is a common approach in evidence-based
medicine to prove the superiority of a disease treatment over another one. There
are three main types of hypotheses which can be tested with the help of RCT:
superiority, noninferiority or equivalence. The main goal of all intervention, drug,
or therapy inventions is to nd a better way of treating patients. So, superiority
trials are possibly the most popular ones, and they are the subject of interest
of this paper. A superiority trial allows physicians to nd the optimal
treatment and improve the curability of the disease. However if comparing treatment
strategies are similar enough it is a big success to nd and prove the superiority
of any of them for all patients. But the e ect of treatment strategies may
depend on patients initial features (physiological characteristics and/or results of
diagnostics). For example, if we compare dosages of a toxic drug a small dosage
may be more suitable for patients with light disease manifestation because it
throws o a disease and reduces the negative consequences of toxicity while for
patients with intense disease manifestation a small dosage is not enough to cure
them of the disease. The question is how to nd such subgroups of quite similar
patients where the e ciency di erence of treatment strategies is signi cant.
      </p>
      <p>
        Several approaches were proposed in [2{11]. Most of them [5{11] are based on
the idea of decision or regression trees which locally optimize some measure at
every iteration of the algorithm. This approach is more suitable when we operate
on the big datasets or constantly increasing dataset because they allow to nd
some subgroups quickly. In the case of treatment optimization the datasets, as
a rule, are not very big and do not increase in size rapidly. Moreover, collecting
such datasets demands a lot of time and e orts. So, it is more important to carry
out more detailed analysis of the data then to make it fast. Also, in RCT on
cancer, heart conditions or chronic diseases the outcome of the therapy can be
censored, while only few papers like [
        <xref ref-type="bibr" rid="ref11 ref9">9, 11</xref>
        ] report on analysis of censored data.
In this paper we propose a universal approach to nding subgroups of patients
with signi cantly di erent responses to di erent treatment strategies, which is
not biased by any local optimization criterion. Within this approach subgroups of
patients are generated, which are determined by subsets of patients' features.The
approach in based on computing closed patterns [14{17] that satisfy criteria of
treatment e ciency. The approach was proposed for the analysis of the database
of randomized controlled trial on childhood acute lymphoblastic leukemia (ALL)
[
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ] which was performed in several hospitals in Russia and Byelorussia. In
this dataset each patient under study is assigned one of the studied treatment
strategies, he/she is described by a set of initial features that can be nominal or
numerical, and some outcome which is used to estimate treatment e ciency.
      </p>
      <p>
        The rest of the paper is organized as follows. In section 2 we recall basic
de nitions of pattern structures and give examples of pattern structures [14{17]
relevant to the analyzed data. In section 3 a version of Close-by-One (CbO)
algorithm [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] performing on the attributes is proposed. Section 4 presents stopping
criterion added to the version of CbO to generate only subgroups with the di
erence in the e ciency of treatment strategies. Section 5 presents an application
of the proposed approach to the ALL dataset, and we conclude in section 6.
2
2.1
      </p>
    </sec>
    <sec id="sec-2">
      <title>Pattern Structures</title>
      <sec id="sec-2-1">
        <title>Main De nitions</title>
        <p>In this section we recall pattern structures and examples of pattern structures
used for nominal and numerical features.</p>
        <p>Let G be a set (of objects), (D; u) be a meet-semilattice (of all possible
object descriptions), and : G ! D be a mapping. Then (G; (D; u); ) is called
a pattern structure, provided that the set (G) = f (g) j g 2 Gg generates
a complete subsemilattice (D ; u) of (D; u), i.e. every subset X of (G) has
an in mum FX in (D; u). Elements of D are called patterns and are naturally
ordered by subsumption relation v: c v d , cud = c, where c; d 2 D. Operation
u is also called a similarity operation. If (G; (D; u); ) is a pattern structure we
de ne the derivation operators which form a Galois connection between the
powerset of G and (D; u) as:</p>
        <p>A = gF2A (g)
d = fg 2 G j d v (g)g
for A G
for d 2 D
(1)
The pairs (A; d) satisfying A G, d 2 D, A = d, and A = d are called pattern
concepts of (G; (D; u); ), with pattern extent A and pattern intent d. Pattern
concepts are ordered with respect to set inclusion on extents. The ordered set of
pattern concepts makes a lattice, called pattern concept lattice. Operator ( )
is an algebraical closure operator on patterns, since it is idempotent, extensive,
and monotone.</p>
        <p>If objects are described by binary attributes from set M , then D = }(M ),
the powerset of M , and (g) is prime operator ( )0 in the context (G; M; I):
(g) = fm 2 M j gImg, and d1 u d2 = d1 \ d2 where d1; d2 2 D. So, subsumption
corresponds to set inclusion: d1 v d2 , d1 u d2 = d1 , d1 \ d2 = d1 , d1 d2.</p>
        <p>So, if all patients' initial features are binary we can use them as binary
attributes directly. However we also aim at dealing with nominal initial features.
Consider patients are described by k initial features f 1; :::; kg, all of them
are nominal (binary is a particular case) and their values are coded as natural
numbers. So, if i takes li values we assume that the range of i is f1; :::; lig. For
each i we construct li binary attributes f i1; :::; ili g such that ij : G ! f0; 1g
and ij (g) : g ! i(g) = j where g 2 G; j = 1; :::; li. As a result we get
Pi=1;:::;n li binary attributes to which pattern structures can be applied as it is
shown above.
2.2</p>
      </sec>
      <sec id="sec-2-2">
        <title>Pattern Structures on Intervals</title>
        <p>To operate with numerical features interval pattern structures [15{17] can be
applied. Let us consider each patient is described by n numerical and no nominal
initial features. In our notation G corresponds to the set of patients. So, let
f'1; '2; :::; 'ng be a set of functions represented patients' initial features such
that 'i : G ! R for i = 1; :::; n. For each feature 'i we construct a corresponding
interval attribute i : G ! [R; R] such that if 'i(g) = x for g 2 G, then
i(g) = [x; x], where x 2 R. These attributes are used for pattern structures
construction.</p>
        <p>Each object is described by a n-dimensional tuple of intervals. Let a and b
be tuples of n intervals, so a = h[vi; wi]ii=1;:::;n and b = h[xi; yi]ii=1;:::;n, where
vi; wi; xi; yi 2 R 8i = 1; :::; n. In this case the similarity operation u is de ned
by the meet of tuple components:</p>
        <p>a u b = h[vi; wi]ii=1;:::;n u h[xi; yi]ii=1;:::;n = h[vi; wi] u [xi; yi]ii=1;:::;n;
where [vi; wi] u [xi; yi] = [min(vi; xi); max(wi; yi)].</p>
        <p>Hence, subsumption on tuples of interval is de ned as:
a v b , [vi; wi] v [xi; yi]i=1;:::;n , [vi; wi] u [xi; yi] = [vi; wi]i=1;:::;n ,
, [min(vi; xi); max(wi; yi)] = [vi; wi]i=1;:::;n , [vi; wi] [xi; yi]i=1;:::;n:
For example, h[2; 6]; [4; 5]i v h[3; 4]; [5; 5]i as [2; 6] v [3; 4] and [4; 5] v [5; 5].
(2)
(3)
2.3</p>
      </sec>
      <sec id="sec-2-3">
        <title>Pattern Structures on Mixed Tuples</title>
        <p>In the previous sections we consider separately the cases of nominal and
numerical initial features. However, the situation when patients have both nominal
and numerical initial features seems more natural. As it is described above we
associate the set of binary attributes with each nominal feature and an interval
attribute - with each numerical one. So, d 2 D is a tuple, where components are
intervals and binary attributes. Let us have k binary and n interval attributes.
Assume d = h ; i where is the tuple of intervals of length n, and is the
subset of binary attributes. If d1; d2 2 D, d1 = h 1; 1i, and d2 = h 2; 2i similarity
operator can be set as d1 u d2 = h 1 u 2; 1 u 2i where similarity operators for
the tuples of intervals and the sets of binary attributes are de ned above. The
subsumption is also de ned by subsumption on the tuples of intervals and the
sets of binary attributes:
Pattern concepts can be computed by Close by One (CbO) algorithm. It
produces a tree structure on pattern concepts where edges represent a subset of
lattice edges. For the following analysis we do not need the whole lattice but
the tree-structure provided by CbO is helpful for implementation of stopping
criterion. Considering the top of the lattice is the pair (G; G ), and the bottom
is (;; ; ) CbO starts from the bottom and proceeds \object-wise". However,
when the number of objects is larger than the number of attributes processing
top-down allows one to reduce computation time. Moreover, for the given
problem we need to generate pattern concepts with as large extents as possible to
detect the di erence in treatment e ciency. So, it is more reasonable to start
generation process from the top of the lattice. As we operate on descriptions
consisting from interval and binary attributes classical CbO must be adapted to
such descriptions. For this purpose a version of CbO is proposed below. The idea
is to order elements of descriptions, and start to reduce descriptions by changing
its elements in this order.</p>
        <p>Let objects be described by n interval and k binary attributes. So, we can
rewrite description as d = hv1; w1; :::; vn; wn; b1; :::; bki, where vi and wi are the
left and right bounds of the i-th interval attribute for i = 1; :::; n, and bj indicates
whether description d contains the j-th binary attribute for j = 1; :::; k. So, if
bj is 0 (g) does not contain the j-th binary attribute for all g 2 d , and if
bj is 1, then (g) may or may not contain the j-th binary attribute for all
g 2 d . In other words, 1 u 0 = 0 or 0 v 1. The de nition of similarity operator
remains the same: we take minimum of left bounds, maximum of right bounds,
and set intersection, which in the given notation can be written as element-wise
conjunction of indicator vectors:
d1 u d2 = hhmin(v1;i; v2;i); max(w1;i; w2;i)ii=1;:::;n; hb1;j ^ b2;j ij=1;:::;ki :
(5)</p>
        <p>The introduced version of CbO starts from the most general description
and speci es it by reducing intervals and adding binary features. For interval
reduction it is necessary to choose some step value si for each interval attribute
(i = 1; :::n;). If we aim at some sort of scaling we can set these values by ourselves,
or if scaling is unwanted si is set to the smallest di erence between values of the
initial feature corresponding to the i-th interval attribute.</p>
        <p>Further we denote dall = gF2G (g), min(d1; d2) denotes the minimum
position of unequal elements of tuples d1 and d2 in element-wise comparison. Let
suc(d) denote the set of all children of the node corresponding to the description
d. Let prev(d) return the parent of d, address(d) return the address of d, and
nexti(d) store the position of the description tuple d which must be changed
at the next algorithm returning to d. Let denote integer division, % denote
residue of division, and [ ] be an operator of taking the element of the tuple.
Function AddConcept set required links between tree nodes when a new node is
added. Function OneIteration changes the description dcurr given as argument
in position nexti(dcurr) and takes closure of the changed description by ( ) .
If min of the closure and dcurr is not less than nexti(dcurr) then the function
returns the closure, otherwise it returns dcurr.
1.
2.
3.
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.</p>
        <p>def AddConcept(parent; child)
suc(parent) address(child)
prev(child) := parent
nexti(child) := nexti(parent)
def OneIteration(dcurr)
dnew := dcurr
i := nexti(dcurr)
if i 2n then
q = i 2
r = i%2
dnew[i] := dnew[i] sq(2r 1)
dadd := dnew
if dnew[2q] dnew[2q + 1] and min(dcurr; dadd)</p>
        <p>AddConcept(dcurr; dadd)
return dadd
else return dcurr
else
dnew[i] := d[i] ^ 1
dadd := dnew
if dnew[i] 6= d[i] and min(d; dadd)</p>
        <p>AddConcept(dcurr; dadd)
return dadd
else return dcurr
i then
0. d := dall; nexti(d) := 1; prev(d) := ;; suc(d) := ;
1. until d = dall and nexti(d) &gt; 2n + k do
2. until nexti(d) &gt; 2n + k do
3. dadd:=OneIteration(d)
4. if d 6= dadd then
i then
5.
6.
7.
8.</p>
        <p>d := dadd
output(d; d )
else nexti(d) := nexti(d) + 1
d := prev(d)</p>
        <p>Lines 7 and 14 of function OneIteration has complexity O((2n + k)jGj), and
all lines 3{8 of the main part of the algorithm are performed at most in this
time. The loop starting at line 2 is repeated 2njGj + k times at worst (as in the
worst case each boundary of all interval attributes can take jGj values), while the
loop starting at line 1 is repeated jLj times exactly, where jLj is the number of
pattern concepts. So, the algorithm has complexity O((2n+k)jGj(2njGj+k)jLj).
The complexity is higher than that of CbO, O((2n + k)jGj2jLj), but in practice
our algorithm may become faster when n and k are small, and the number of
numerical values in data is less than jGj.
4</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Stopping Criterion</title>
      <p>As it is mentioned above it may not be required to generate all pattern concepts,
and the version of CbO may stop when subgroups with di erence in treatment
e ciency and maximal possible extent are generated. To estimate the di erence
in e ciency for some description d we de ne a di erence measure which depends
on the sets of outcomes of patients who match d and have received the same
treatment, and if the value of this measure satis es some criterion of signi cance
the proposed version of CbO stops to generate speci cation of the description
which is currently in work.</p>
      <p>Let p be the number of comparing treatment strategies, and d is the
description currently processed by the algorithm. Assume Qi = foutcome(g) j g 2
d ; treatment(g) = ig for all i = 1; :::; p, where outcome(g) is the outcome of g,
and treatment(g) is the treatment assigned to g. So, jQij denotes the number of
patients received treatment i. We de ne di erence measure which takes the sets
of outcomes corresponding to each treatment strategy and returns the
estimation of di erence, and set threshold ".The criterion looks like if (Q1; :::; Qp) &gt; "
the algorithm does not generate the children of the currently processed node and
returns to its parent node.</p>
      <p>Except the stopping criterion itself several additional restrictions are
necessary. So, if a subgroup does not contain patients per each of p treatment
strategies we are not able to compute and need to return to the parent subgroup
without generating children nodes, since the antimonotonicity of operator ( )
ensures this property for all descendants of the current node. Also, it may be
important to result in descriptions with approximately equal number of patients
per treatment strategy in corresponding subgroups. Therefore, additional
parameter 2 [0; 1] is provided to control the ratio of each treatment strategy in
a subgroup (see line 4 in function Restrictions). If is set to one this restriction
is deactivated, if it is set to zero the numbers of patients per each treatment
strategy must be equal. Let outc(d) be hQiii=1;:::;n, outc(d)[i] be Qi, and j j
return the power of the set. Function Restrictions checks ful llment of these
restrictions for a particular description. Let isempty(d) be F alse if the subgroups
corresponding to description d do not contain patients from every treatment
strategy, and T rue otherwise. Let also notBalanced(d) be F alse if the subgroup
corresponding to d is not balanced (do not satisfy restriction on proportion of
treatment strategies in the subgroup), and T rue otherwise.</p>
      <p>def Restrictions(d)
0. isempty(d) := F alse; notBalanced(d) := F alse
1. for j from 1 to p do
2. isempty(d) := isempty(d) _ (joutc(d)[j]j = 0)
43.. snhoatBreal:=ancjoeudtj(cd(dd)j)[:j=]j notBalanced(d) _ ( 1 p</p>
      <p>The algorithm outputs the set of maximal size subgroups (i.e. maximal
extents) with signi cant di erence in treatment e ciency. Since we do not
construct the whole pattern lattice the resulting set may contain subgroups which
subsumed under the other subgroups from this set. Smaller subgroups should be
excluded from the output by post-processing.
5
5.1</p>
      <sec id="sec-3-1">
        <title>Dataset</title>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Application to ALL Dataset</title>
      <p>
        The dataset consists of more than 2000 patients from 1 to 18 years old with newly
diagnosed ALL. All of them were included into the standard risk group (SRG) or
into the intermediate risk group (ImRG) of randomized clinical trial
MB-ALL2008 [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. The protocol of this trial contains three stages of treatment for SRG
and ImRG: induction (36 days), consolidation (25 weeks), and maintenance (2-3
years). In this paper we only focus on the induction stage. Induction therapy
aims at bringing a patient into remission and is very toxic. At this stage of
treatment patients from SRG and ImRG are randomized into 3 and 2 treatment
strategies correspondingly. Let us code them as T1, T2, and T3 in SRG and T4
and T5 in ImRG. SRG and ImRG parts of the dataset may be considered as two
independent datasets because SRG and ImRG therapies di er considerably.
      </p>
      <p>Each patient from the dataset is described by the set of initial features,
treatment strategy which he or she was assigned at randomization, and outcome
features. From all initial features 8 were chosen for the analysis: sex (male or
female), age (in years from the birth date to the start of the therapy), initial
white blood count (per nl) (WBC), immunophenotype (B- or T-ALL), central
nervous system (CNS) status (normal, cytosis is less than 5 per mcl and blast
cells, neuroleukemia), liver enlargement (in cm), spleen enlargement (in cm),
mediastinum status (normal or pathological). As for outcome features of the
dataset each outcome feature consists of two parts: the result of the therapy
and time from the beginning of the therapy to the date when the result was
xed (in years). In this dataset we have two such features. One of them is xing
the time before patient's death.This feature has three possible states: alive, lost
to follow-up or death. When a patient is alive or lost to follow-up we say that
censoring happened because we cannot measure exactly the time before death.
The other outcome feature represents the time before a negative event . This
feature can possess the following states: alive in remission, lost to follow-up,
death in remission, secondary tumor, relapse or metastases, nonresponse to the
therapy or disease progression, or death in induction. Alive in remission and lost
to follow-up events correspond to censoring. Two types of outcomes are used for
di erent variants of treatment e ciency estimation which are presented below.</p>
      <p>Finally, we exclude patients with missed values from the further analysis and
result in 1221 SRG patients: 387, 366, and 368 patients received T1, T2, and
T3 respectively. 929 patients in ImRG: 467 and 462 patients received T4 and T5
respectively.
5.2</p>
      <sec id="sec-4-1">
        <title>Data Preprocessing</title>
        <p>As it was shown above the initial features of the patients should be transformed
into the tuple of interval and binary attributes. Age, WBC, liver enlargement,
and spleen enlargement are numerical features, so for each of them an interval
attribute is created. Moreover, we scale them a little to obtain subgroup
descriptions which make sense for physicians. For example, it is not correct enough in
medical terms that one treatment is more e ective for patients, let's say, up to
5.4 years old. Similar limitations should also be applied to other 3 numerical
features. To answer these limitations we propose the following way of interval
attributes construction:
1. Let one set the steps of interval reduction in CbO to sage = sliver = sspleen =
1 and sW BC = 10.
2. For each name 2 fage; liver; spleen; W BCg and for every g 2 D, where D
is the set of all patients, if name(g) = x then the value of the corresponding
interval attribute is set to [sname bx=snamec; sname dx=snamee].</p>
        <p>All nominal features (sex, immunophenotype, CNS status, and mediastinum
status) are converted to the set of binary attributes in the way presented in
section 2.1. For instance, we construct two binary attributes corresponding to
sex: one indicating males and another indicating females.
5.3</p>
      </sec>
      <sec id="sec-4-2">
        <title>Methodology of Generating Hypotheses</title>
        <p>
          The algorithm of subgroup descriptions generation is proposed in Section 4. It
requires to set the di erence measure. For the data of the childhood ALL we
choose log-rank statistics [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. As it is a statistical method of detecting the di
erence between two (or several) survival curves [20{22] the threshold is naturally
chosen to satisfy 95% level of con dence of log-rank test. So, if p-value of
twosided log-rank test is less than 5%, we output current description as a potential
subgroup description and do not generate its children, otherwise we continue
to generate children descriptions in accordance with the introduced algorithm.
Finally, from the set of potential subgroup descriptions we delete descriptions
subsumed under other potential subgroup descriptions.
        </p>
        <p>As in SRG three treatment strategies (T1; T2; T3) are compared, the
algorithm selected those subgroups where the di erence between any pair of the
treatment strategies is signi cant. Assume one of descriptions we get is d. For
the subgroup described by this description we should compare every pair of
treatment strategies: T1 and T2, T1 and T3, and T2 and T3. For each pair where
log-rank test detects the di erence with con dence 95% and power 80% we make
a hypothesis. For instance, if pair Ti and Tj satis es these requirements then a
hypothesis says that treatment strategies Ti and Tj a ect patients described by
d di erently. At the same time if the survival curve for Ti, for instance, is located
above the survival curve for Tj we can even say that Ti is better for patients
described by d than Tj . For patients from ImRG only two treatment strategies
are compared. Therefore it is enough to estimate power, and if it is more than
80% we make a hypothesis in the same way.
5.4</p>
      </sec>
      <sec id="sec-4-3">
        <title>Summary of Results</title>
        <p>
          The proposed algorithm was run to compare separately overall survival (OS)[
          <xref ref-type="bibr" rid="ref23">23</xref>
          ],
event-free survival (EFS)[
          <xref ref-type="bibr" rid="ref24">24</xref>
          ], and relapse-free survival (RFS)[
          <xref ref-type="bibr" rid="ref25">25</xref>
          ] in each risk
group with set to 13 and 15 for SRG and ImRG respectively. We also add
restrictions on the size of the subgroups: not less than 20 and 200 patients per
each treatment strategy for SRG and ImRG, respectively (the choice is explained
by the greater number of patients per treatment and possible descriptions for
ImRG). So, as a result three sets of subgroup descriptions were obtained for each
risk group (SRG and ImRG), one per each type of survival.
        </p>
        <p>
          The results of experiments for SRG are presented in Table 1. OS, EFS, and
RFS stand for three types of survival described above, CbO stands for the
proposed approach, and IT stands for Interaction Tree from [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. To construct an
interaction tree the same restriction on the size of subgroups (i.e. leaves) was
set: not less than 20 patients per each treatment strategy. Performing IT with
pruning results in no subgroups, therefore we compared to unpruned trees. To
estimate subgroups in each of them paired logrank test p-values for every pair of
compared treatment strategies are estimated. We have also performed bootstrap
sampling on 1000 samples, and for each subgroup we estimate p-value median
and 0.95 unpivotal con dence interval for the di erence between long-term
survival estimations for every pair of compared treatment strategies. We count the
number of subgroups where the result of comparison of even one pair of
compared treatment strategies in this subgroups satis es restrictions at the heading.
So, p corresponds to p-value of paired logrank test, pm is a median estimated
by bootstrap, dl and dr are the left and the right boundaries of 0.95 bootstrap
con dence interval for the di erence in long-term survival.
For ImRG we obtained 2559 and 153 subgroups based on OS and EFS
respectively and no subgroups based on RFS since the superiority of T5 over T4 in
RFS holds for the whole set of ImRG patients. Moreover, all obtained subgroups
based on OS and EFS con rm that T5 is better than T4. For this reason we did
not carry out an experiments on ImRG by applying Interaction Trees.
Given all hypotheses for ImRG propose the superiority of T5 over T4 it is more
interesting to look at the hypotheses for SRG. Short-term estimation of the
treatment strategy e ciencies carried out by physicians shows that T1 is signi cantly
worse than T2 and T3 for all patients from SRG while long-term estimations
show no signi cant di erence. However, by applying the proposed algorithm we
found several subgroups where T1 is better than T2 and T3 in long-term. For
instance, the description was obtained on the basis of di erence in OS: 4 age ,
3 liver enlargement 7, and normal mediastinum status. Testing T1 vs T2 and
T1 vs T3 we got p-values 1.4% and 1.9% and power estimations 86% and 94%.
There are approximately 60 patients per treatment strategy in the subgroup.
OS curves for the patients which can be described by even one of these
descriptions is presented in Fig. 1. Con dence intervals of p-values obtained from 1000
sample bootstrap: [0.3%, 1.6%] and [0.7%, 1.7%]. So, the advantage of strategy
T1 over T2 and T3 for patients matching the description seems con dent and
independent from the certain data.
In this paper we have introduced an approach to solving the problem of
determining relevant subgroups of patients for therapy optimization. The approach
is based on representing data by numerical pattern structures and applying the
version of CbO algorithm. The algorithm computes the pattern lattice top-down
(starting with the most general descriptions) and its stopping criterion allows
one to generate subgroups with signi cant di erences in the e ciency of
treatment strategies containing the maximal possible number of patients to satisfy
statistical power restrictions. This approach allows one to avoid binarization or
using similarity measures on patients, which can result in artifacts. The approach
is also not biased by local optimization heuristics used for constructing decision
trees and random forests. The situations when various subgroups are not disjoint
will be the subject of further study.
        </p>
      </sec>
      <sec id="sec-4-4">
        <title>Acknowledgments</title>
        <p>We would like to thank Prof. Alexander I. Karachunskiy and his colleagues from
the Federal Research Institute of Pediatric Hematology, Oncology and
Immunology, Moscow, Russia for helpful discussions and for providing us with data on
MB-ALL-2008 protocol.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Levin</surname>
            ,
            <given-names>K.A.</given-names>
          </string-name>
          :
          <article-title>Study Design VII. Randomised Controlled Trials</article-title>
          .
          <source>Evident-Based Dentistry 8</source>
          ,
          <fpage>22</fpage>
          -
          <lpage>23</lpage>
          , Nature Publishing Group (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Zeileis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hothorn</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hornik</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <article-title>Model-Based Recursive Partitioning</article-title>
          .
          <source>J. Comput. Graph. Stat</source>
          .
          <volume>17</volume>
          (
          <issue>2</issue>
          ),
          <fpage>492</fpage>
          -
          <lpage>514</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Foster</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taylor</surname>
          </string-name>
          , J.,
          <string-name>
            <surname>Ruberg</surname>
            ,
            <given-names>S. Subgroup</given-names>
          </string-name>
          <article-title>Identi cation from Randomized Clinical Trial Data</article-title>
          .
          <source>Stat. Med</source>
          .
          <volume>30</volume>
          (
          <issue>24</issue>
          ),
          <fpage>2867</fpage>
          -
          <lpage>2880</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zeng</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rush</surname>
            <given-names>A.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kosorok</surname>
            ,
            <given-names>M.R.</given-names>
          </string-name>
          <string-name>
            <surname>Estimating</surname>
          </string-name>
          <article-title>Individualized Treatment Rules Using Outcome Weighted Learning</article-title>
          .
          <source>J. Am. Stat. Assoc</source>
          .
          <volume>107</volume>
          (
          <issue>449</issue>
          ),
          <fpage>1106</fpage>
          -
          <lpage>1118</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Korepanova</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karachunskiy</surname>
            ,
            <given-names>A.I.</given-names>
          </string-name>
          :
          <article-title>Matchings and Decision Trees for Determining Optimal Therapy</article-title>
          . In:
          <string-name>
            <surname>Ignatov D.I</surname>
          </string-name>
          . et al.
          <source>Analysis of Images, Social Networks and Texts, Proc. 3rd International Conference (AIST2014)</source>
          ,
          <source>Communications in Computer and Information Science</source>
          , Vol.
          <volume>436</volume>
          , pp.
          <fpage>101</fpage>
          -
          <lpage>110</lpage>
          . Springer (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Dusseldorp</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Conversano</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Os</surname>
            ,
            <given-names>B.J.</given-names>
          </string-name>
          <article-title>Combining an Additive and TreeBased Regression Model Simultaneously: STIMA</article-title>
          .
          <source>J. Comput. Graph. Stat</source>
          .
          <volume>19</volume>
          (
          <issue>3</issue>
          ),
          <fpage>514</fpage>
          -
          <lpage>530</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Dusseldorp</surname>
          </string-name>
          , E.,
          <string-name>
            <surname>Van Mechelen</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          <article-title>Qualitative Interaction Trees: a Tool to Identify Qualitative Treatment-Subgroup Interactions</article-title>
          .
          <source>Stat. Med</source>
          .
          <volume>33</volume>
          ,
          <issue>219</issue>
          {
          <fpage>237</fpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Lipkovich</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dmitrienko</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Denne</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Enas</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <article-title>Subgroup Identi cation Based on Di erential E ect Search a Recursive Partitioning Method for Establishing Response to Treatment in Patient Subpopulations</article-title>
          .
          <source>Stat. Med</source>
          .
          <volume>30</volume>
          (
          <issue>21</issue>
          ),
          <fpage>2601</fpage>
          -
          <lpage>2621</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Su</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yan</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <article-title>Interaction Trees with Censored Survival Data</article-title>
          .
          <source>Int. J. Biostat</source>
          .
          <volume>4</volume>
          (
          <issue>1</issue>
          ),
          <volume>2</volume>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Su</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsai</surname>
            ,
            <given-names>C.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nickerson</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>B. Subgroup</given-names>
          </string-name>
          <article-title>Analysis via Recursive Partitioning</article-title>
          .
          <source>J. Mach. Learn. Res</source>
          .
          <volume>10</volume>
          ,
          <fpage>141</fpage>
          -
          <lpage>158</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Loh</surname>
          </string-name>
          , W.-Y.,
          <string-name>
            <surname>He</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Man</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <article-title>A Regression Tree Approach to Identifying Subgroups with Di erential Treatment E ects</article-title>
          .
          <source>Stst. Med</source>
          .
          <volume>34</volume>
          ,
          <year>1818</year>
          {
          <year>1833</year>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Ching-Hon</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robison</surname>
            ,
            <given-names>L.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Look</surname>
          </string-name>
          , A.T. Acute Lymphoblastic Leukaemia.
          <source>The Lancet</source>
          .
          <volume>9617</volume>
          (
          <issue>371</issue>
          ),
          <volume>1030</volume>
          {
          <fpage>1043</fpage>
          ,
          <string-name>
            <given-names>The</given-names>
            <surname>Lancet</surname>
          </string-name>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Karachunskiy</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roumiantseva</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lagoiko</surname>
          </string-name>
          , S. (eds.)
          <article-title>E cacy and Toxicity of Dexamethasone vs Methylprednisolone Long-Term Results in More than 1000 Patients from the Russian Randomized Multicentric trial ALL-MB 2002</article-title>
          ,
          <article-title>Letter to the Editor</article-title>
          .
          <source>Leukemia. 29</source>
          ,
          <year>1955</year>
          {1958, Nature Publishing Group (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O. Pattern</given-names>
          </string-name>
          <string-name>
            <surname>Structures</surname>
            and
            <given-names>Their</given-names>
          </string-name>
          <string-name>
            <surname>Projections</surname>
          </string-name>
          .
          <source>In: Proc. Stumme</source>
          ,
          <string-name>
            <given-names>G.</given-names>
            ,
            <surname>Delugach</surname>
          </string-name>
          , H. (eds.) 9th
          <source>International Conference on Conceptual Structures (ICCS 2001). Lecture Notes in Arti cial Intelligence</source>
          , vol.
          <volume>2120</volume>
          , pp.
          <volume>129</volume>
          {
          <issue>142</issue>
          ,
          <string-name>
            <surname>Springer</surname>
          </string-name>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          <article-title>Pattern Structures for Analyzing Complex Data</article-title>
          . In: Sakai H. et al.
          <source>(eds.) Proc. 12th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing (RSFDGrC</source>
          <year>2009</year>
          ).
          <source>Lecture Notes in Arti cial Intelligence</source>
          , vol.
          <volume>5908</volume>
          , pp.
          <volume>33</volume>
          {
          <issue>44</issue>
          ,
          <string-name>
            <surname>Springer</surname>
          </string-name>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Kaytoue</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Napoli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duplessis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <article-title>Mining Gene Expression Data with Pattern Structures in Formal Concept Analysis</article-title>
          .
          <source>Information Sciences</source>
          ,
          <volume>10</volume>
          (
          <issue>181</issue>
          ),
          <year>1989</year>
          {2001, Elsevier, New York (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          <article-title>Scalable Knowledge Discovery in Complex Data with Pattern Structures</article-title>
          . In: Maji,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Ghosh</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          , et al.
          <source>(eds.) Proc. 5th International Conference Pattern Recognition and Machine Intelligence (PReMI'2013). Lecture Notes in Computer Science</source>
          , vol.
          <volume>8251</volume>
          , pp.
          <volume>30</volume>
          {
          <issue>41</issue>
          ,
          <string-name>
            <surname>Springer</surname>
          </string-name>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          <article-title>A Fast Algorithm for Computing All Intersections of Objects from an Arbitrary Semilattice</article-title>
          .
          <source>Nauchno-Tekhnicheskaya Informatsiya Seriya 2 - Informatsionnye Protsessy i Sistemy</source>
          ,
          <volume>1</volume>
          , 17{
          <fpage>20</fpage>
          (
          <year>1993</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <article-title>A service of the U</article-title>
          .S. National Institutes of Health, https://clinicaltrials.gov/ct2/show/NCT01953770
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Kleinbaum</surname>
            ,
            <given-names>D.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Kaplan-Meier Survival Curves and the Log-Rank Test</article-title>
          .
          <source>In: Survival Analysis</source>
          , pp.
          <volume>55</volume>
          {
          <issue>96</issue>
          , Springer New York (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Kaplan</surname>
            ,
            <given-names>E.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meier</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <article-title>Nonparametric Estimation from Incomplete Observations</article-title>
          .
          <source>Journal of the American Statistical Association</source>
          ,
          <volume>282</volume>
          (
          <issue>53</issue>
          ),
          <volume>457</volume>
          {
          <fpage>481</fpage>
          (
          <year>1958</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>May</surname>
          </string-name>
          , W.L.
          <article-title>Kaplan-Meier Survival Analysis</article-title>
          .
          <source>In: Encyclopedia of Cancer</source>
          , pp.
          <fpage>1590</fpage>
          -
          <lpage>1593</lpage>
          , Springer Berlin Heidelberg (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23. National Cancer Institute, http://www.cancer.gov/publications/dictionaries/cancerterms?cdrid=
          <fpage>655245</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24. National Cancer Institute, http://www.cancer.gov/publications/dictionaries/cancerterms?cdrid=
          <fpage>655269</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25. National Cancer Institute, http://www.cancer.gov/publications/dictionaries/cancerterms?cdrid=
          <fpage>655254</fpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>