=Paper= {{Paper |id=Vol-2718/paper02 |storemode=property |title=Extraction of Classification Rules from Sequences of Crystal Growth Data |pdfUrl=https://ceur-ws.org/Vol-2718/paper02.pdf |volume=Vol-2718 |authors=Radek Buša,Yann Dauxais,Stefan Ecklebe,Natasha Dropka,Martin Holeňa |dblpUrl=https://dblp.org/rec/conf/itat/BusaDEDH20 }} ==Extraction of Classification Rules from Sequences of Crystal Growth Data== https://ceur-ws.org/Vol-2718/paper02.pdf
    Extraction of Classification Rules from Sequences of Crystal Growth Data

                       Radek Buša1 , Yann Dauxais2 , Stefan Ecklebe3 , Natasha Dropka4 , Martin Holeňa5,6
                1   Faculty of Information Technology, Czech Technical University, Thákurova 9, Prague, Czech Republic
                                           2 KU Leuven, Celestijnenlaan 200a, Leuven, Belgium
                         3 Institute of Control Theory, TU Dresden, Georg-Schumann-Str. 7a, Dresden, Germany
                                  4 Leibniz-Institut für Kristallzüchtung, Max-Born-Str. 2, Berlin, Germany
                          5 Institute of Computer Science, Czech Academy of Sciences, Prague, Czech Republic
                                6 Leibniz Institute for Catalysis, Albert-Einstein Str. 29a, Rostock, Germany


Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract: The paper presents a generalization of a data mining method for the extraction of classification rules for sequences of events, which is called discriminant chronicles mining. The generalization is motivated by the objective of extracting classification rules from crystal growth data, for which the original method needs to be extended to events with vectors of attributes and to real-valued attributes. The paper elaborates the incorporation of both extensions into the theoretical fundamentals of the original method, and describes a corresponding modification of a system for discriminant chronicles mining, which was developed three years ago to implement the original method. Finally, an application of the generalized method, using the modified system for discriminant chronicles mining, to data from the growth of GaAs crystals by the vertical gradient freeze method is briefly sketched.

1    Introduction

This paper deals with data mining of crystal growth data, obtained either experimentally or from simulations. Such data records the crystal growth process, its performance, and conditions in the melt, such as temperatures at various control points, the power of heaters, or parameters of the magnetic fields influencing melt convection [6, 7]. In particular, we consider the common situation in which the performance data indicates whether the crystal growth process can be classified as satisfactory according to a given criterion, e.g., according to the shape or position of the solid/liquid interface. Hence, the primary data mining approach to that data is the extraction of classification rules.

   Although a plethora of methods for classification rules extraction exists [10, 11], most of them cannot be used for our data. The reason is that crystal growth proceeds sequentially; hence, the data is inherently sequential. Therefore, we have chosen a specific rules extraction method for the classification of sequences of events, which was proposed in [3]. It is called “discriminant chronicles mining” because it was originally developed for events described with attributes conveying a temporal meaning. However, it cannot be directly applied to crystal growth data, due to the following two restrictions:

(i)  The events in [3] are described with scalar values of the temporal attribute. On the other hand, members of sequences of crystal growth data, which we will for simplicity also call events, are described with vectors of attribute values.

(ii) The temporal attribute describing events in [3] has a finite number of values, thus it can be represented by a finite subset of integers. On the other hand, attributes describing events in sequences of crystal growth data are real-valued.

   Therefore, we have extended the method from [3] to sequences of events described by real-valued vector attributes. This extension is the main contribution of the paper.

   The next section briefly recalls the original method proposed in [3]. Its extension removing the restrictions (i) and (ii) above is described in Section 3. Finally, an application of the proposed method to crystal growth data is sketched in Section 4.

2    Classification Rules Extraction with Discriminant Chronicles

Let $\mathbb{E}$ be a finite set, the elements of which are called event types, and let $\mathbb{T}$ be an arbitrary subset of the extended reals, $\mathbb{T} \subset \overline{\mathbb{R}}$. For a multiset of $m$ event types, the rather unusual notation $\{\{e_1, \dots, e_m\}\}$ has been introduced in [3], and a couple $(e, t) \in \mathbb{E} \times \mathbb{T}$ is called an event.

   Assume further that some ordering is imposed on the event types in the application domain. In [3], where temporal relationships are investigated, a total ordering corresponding to their order of occurrence is considered. For the rules extraction method proposed in Section 3, however, the weaker concept of a partial ordering $\prec$ will be sufficient, with semantics tailored to a particular application (see Section 4 for the real-world application considered in this paper). For $e_1, e_2 \in \mathbb{E}$ and $t^-, t^+ \in \overline{\mathbb{R}}$ such that $e_1 \prec e_2$ and $t^- \le t^+$, a temporal constraint is a tuple $(e_1, e_2, t^-, t^+)$, also denoted $e_1[t^-, t^+]e_2$. The semantics of such a temporal constraint is as follows: the difference between the timestamp $t_2$ of an event $(e_2, t_2)$ of type $e_2$
and the timestamp $t_1$ of an event $(e_1, t_1)$ of type $e_1$ fulfills $t^- \le t_2 - t_1 \le t^+$. A temporal constraint $e_1[t^-, t^+]e_2$ is called satisfied by a couple of events $((e, t), (e', t'))$ if $e_1 = e$, $e_2 = e'$ and $t' - t \in [t^-, t^+]$. Because constraining two events to occur at a fixed temporal distance is too strict for most applications, the simplest way to represent temporal constraints is by using duration intervals. These intervals can be interpreted as two constraints defining the lowest and highest accepted duration, respectively.

   Using a set $\mathbb{E}$ of event types and a set $\mathcal{T}$ of temporal constraints, two complementary concepts can be introduced:

(i)  If we are interested in finding events that pairwise satisfy a given set $\mathcal{T}$ of temporal constraints, then the concept of a simple temporal constraint network [12], alternatively called simple temporal problem [4], is useful, which can be defined as the triple

     $$(\mathbb{E}, \prec, \mathcal{T}), \text{ where } \mathcal{T} \text{ is a set of temporal constraints } e[l, u]e' \text{ such that } e, e' \in \mathbb{E},\ e \prec e'. \tag{1}$$

(ii) If we are interested in mining temporal constraints from given sequences of events, then the concept of a chronicle [3, 9] is useful, which can be defined as the couple

     $$(\mathcal{E}, \mathcal{T}), \text{ where } \mathcal{E} = \{\{e_1, \dots, e_m\}\},\ e_i \in \mathbb{E}, \text{ and } \mathcal{T} \text{ is a set of temporal constraints } e[l, u]e' \text{ such that } (\exists i, j = 1, \dots, m)\ i \ne j \ \&\ e_i \prec e_j \ \&\ e = e_i \ \&\ e' = e_j. \tag{2}$$

     A chronicle is a temporal extension of episodes or partial orders introduced in [8], a type of pattern dedicated to summarizing sequential data. Chronicles have proven their usefulness in applications where the temporal dimension is mandatory to differentiate two different behaviors. Their first application was to alarm log data in [5], where the temporal distance between two alarm events is very important.

     Observe that the constraint $e[-\infty, \infty]e'$ holds for any $e, e' \in \mathbb{E}$, meaning that this constraint actually does not constrain anything. In case that no constraint from $\mathcal{T}$ constrains anything, the set $\mathcal{T}$ will be denoted $\mathcal{T}_\infty$. Hence, $(\mathcal{E}, \mathcal{T}_\infty)$ is a chronicle for the set of temporal constraints $\mathcal{T}_\infty$, of the elements $e[-\infty, \infty]e'$ of which it is only required:

     $$(\exists i, j = 1, \dots, m)\ i \ne j \ \&\ e_i \prec e_j \ \&\ e = e_i \ \&\ e' = e_j. \tag{3}$$

   Because the research reported in this paper concerns data mining, it relies on the latter concept, as well as on several additional concepts concerning chronicles.

   Let $m, n \in \mathbb{N}$, $2 \le m \le n$, let $C = (\{\{e'_1, \dots, e'_m\}\}, \mathcal{T})$ be a chronicle and let $s = ((e_1, t_1), \dots, (e_n, t_n))$ be a sequence of events. Here and below, $\hat{k}$ denotes the index set $\{1, \dots, k\}$. An occurrence of $C$ in $s$ is a subsequence $\tilde{s} = ((e_{f(1)}, t_{f(1)}), \dots, (e_{f(m)}, t_{f(m)}))$ of $s$ such that:

(i)   $f : \hat{m} \to \hat{n}$ is an injective function;
(ii)  $e'_i = e_{f(i)}$, $i = 1, \dots, m$;
(iii) if $i \ne j$ and $e'_i \prec e'_j$, then $t_{f(j)} - t_{f(i)} \in [a, b]$, where $e'_i[a, b]e'_j \in \mathcal{T}$.

   We say that $C$ occurs in $s$ if there exists at least one occurrence of $C$ in $s$.

   Let further $S$ be a set of sequences. The support of a chronicle $C$ in $S$ is the number of sequences from $S$ in which it occurs:

     $$\mathrm{supp}(C, S) = \#\{s \in S \mid C \text{ occurs in } s\}. \tag{4}$$

   If

     $$\mathrm{supp}(C, S) \ge \sigma_{\min} \tag{5}$$

for a given $\sigma_{\min} > 0$ or equivalently

     $$\frac{\mathrm{supp}(C, S)}{\#S} \ge f_{\min} \tag{6}$$

for a given $f_{\min} = \frac{\sigma_{\min}}{\#S}$, then $C$ is called frequent in $S$ on the level $f_{\min}$.

   Finally, let $S^+$ and $S^-$ be two disjoint sets of sequences. The growth rate of $C$ for $S^+$ with respect to $S^-$ is defined as:

     $$g(C, S) = \begin{cases} \dfrac{\mathrm{supp}(C, S^+)}{\mathrm{supp}(C, S^-)} & \text{if } \mathrm{supp}(C, S^-) > 0, \\ +\infty & \text{if } \mathrm{supp}(C, S^-) = 0. \end{cases} \tag{7}$$

   If $C$ is frequent and $g(C, S) \ge g_{\min}$ for a given minimal growth rate $g_{\min} \ge 1$, then $C$ is called discriminant for $S^+$ with respect to $S^-$ on the level $g_{\min}$.

   The sequence sets $S^+$ and $S^-$ can be viewed as two classes of their union $S = S^+ \cup S^-$. Hence, the algorithm DCM for discriminant chronicles mining presented in [3] is actually a sophisticated algorithm for the extraction of classification rules. Before searching for frequent chronicles satisfying some temporal constraints, it searches for frequent chronicles with the set of temporal constraints $\mathcal{T}_\infty$, which is equivalent to the extraction of classification rules without temporal constraints. To this end, any rules extraction algorithm can be used. In the implementation of DCM in [3], the algorithm Ripper [2], based on the minimal description length principle, has been employed.

3    Proposed Rules Extraction Method

3.1    Discriminant Multi-dimensional Chronicles

Let $d, n \in \mathbb{N}$, $d, n \ge 2$. For a set $\mathbb{E}$ of event types with a partial ordering $\prec$, $\mathbb{T} \subset \overline{\mathbb{R}}^d$ and a label set $\mathbb{L} = \{+, -\}$, an event is a couple $(e, t)$, where $e \in \mathbb{E}$, $t \in \mathbb{T}$, and a labelled sequence of events is defined as a tuple $(SID, (e_1, t_1), \dots, (e_n, t_n), L)$, where $SID \in \mathbb{N}$ is a sequence index, unique among all considered labelled sequences of events, $(e_1, t_1), \dots, (e_n, t_n)$ are events, and $L \in \mathbb{L}$.

   Let $\vec{a} = (a_1, \dots, a_d)$, $\vec{b} = (b_1, \dots, b_d)$, $\vec{c} = (c_1, \dots, c_d) \in \mathbb{R}^d$, with $b_i \ge a_i$, $i = 1, \dots, d$, and $R(\vec{a}, \vec{b}) = [a_1, b_1] \times [a_2, b_2] \times \dots \times [a_d, b_d]$. The relation

     $$\vec{c} \subseteqq R(\vec{a}, \vec{b}) \iff \forall i \in \hat{d} : c_i \in [a_i, b_i]$$

will be called hyperrectangle test. A hyperrectangle constraint is a tuple $(e_1, e_2, \vec{t}_1, \vec{t}_2)$, also denoted as $e_1[[\vec{t}_1, \vec{t}_2]]e_2$, where $e_1, e_2 \in \mathbb{E}$ and $\vec{t}_1, \vec{t}_2 \in \overline{\mathbb{R}}^d$. A hyperrectangle constraint $e_1[[\vec{t}_1, \vec{t}_2]]e_2$ is said to be satisfied by a couple of events $((e, \vec{t}), (e', \vec{t}'))$ if and only if $e = e_1 \ \&\ e' = e_2 \ \&\ \vec{t}' - \vec{t} \subseteqq R(\vec{t}_1, \vec{t}_2)$.

   A multi-dimensional chronicle is a couple $(\mathcal{E}, \mathcal{T})$ such that $\mathcal{E} = \{\{e_1, e_2, \dots, e_n\}\}$, $e_i \in \mathbb{E}$, $i \in \hat{n}$, is a multiset of event types and $\mathcal{T} = \{e_1[[\vec{t}_1, \vec{t}_2]]e_2 \mid e_1, e_2 \in \mathcal{E},\ e_1 \prec e_2\}$ is a set of hyperrectangle constraints. If in particular all its constraints are $e[[(-\infty, \dots, -\infty), (\infty, \dots, \infty)]]e'$, i.e., they do not constrain anything, then this $\mathcal{T}$ is again denoted $\mathcal{T}_\infty$:

     $$\mathcal{T}_\infty = \{e[[(-\infty, \dots, -\infty), (\infty, \dots, \infty)]]e' \mid (\exists i, j \in \hat{n})\ i \ne j \ \&\ e_i \prec e_j \ \&\ e = e_i \ \&\ e' = e_j\}. \tag{8}$$

   Let $s = ((e_1, \vec{t}_1), \dots, (e_n, \vec{t}_n))$ be a sequence of events, $m \in \hat{n}$ and $C = (\mathcal{E} = \{\{e'_1, e'_2, \dots, e'_m\}\}, \mathcal{T})$ be a multi-dimensional chronicle. An occurrence of the multi-dimensional chronicle $C$ in $s$ is a subsequence $\tilde{s} = ((e_{f(1)}, \vec{t}_{f(1)}), (e_{f(2)}, \vec{t}_{f(2)}), \dots, (e_{f(m)}, \vec{t}_{f(m)}))$ such that $f : \hat{m} \to \hat{n}$ is an injective function, $\forall i : e'_i = e_{f(i)}$, and if $i \ne j$, then $\vec{t}_{f(j)} - \vec{t}_{f(i)} \subseteqq R(\vec{a}, \vec{b})$, where $e'_i[[\vec{a}, \vec{b}]]e'_j \in \mathcal{T}$. A multi-dimensional chronicle $C$ is said to occur in sequence $s$ if there exists at least one occurrence of $C$ in $s$.

   The support of a multi-dimensional chronicle $C$ in a sequence set $S$ is again defined by (4), like for chronicles in Section 2. Finally, the definitions of frequent chronicles and of chronicles discriminant for one set of sequences with respect to another also transfer to multi-dimensional chronicles.

3.2    Discriminant Multi-dimensional Chronicles Mining

The DCM-MD algorithm illustrated in Listing 1 is a modification of the DCM algorithm for discriminant chronicles mining proposed in [3]. The main aspects of the modification are the data model (substituting vectors of real numbers for scalar integer values) and a new algorithm for mining discriminant hyperrectangle constraints (replacing the algorithm for mining discriminant temporal constraints proposed in [3]). It operates with multi-dimensional input data and multi-dimensional chronicles, mining an incomplete set of discriminant multi-dimensional chronicles, determined by the user-supplied argument values $f_{\min}$ (in the pseudocode fmin) and $g_{\min}$ (in the pseudocode gmin).

   The branching statement in Listing 1 containing the condition

    supp(S+,{m,tinf}) > (gmin*supp(S-,{m,tinf}))

is used to check whether a given frequent multiset without further specific hyperrectangle constraints is discriminant (in the pseudocode, $\mathcal{T}_\infty$ is represented by the tinf symbol). If the given condition is true, no discriminant temporal constraints are mined using the extractDC(...) function.

    DCM-MD(S+, S-, fmin, gmin):
      M := extractMultiSet(S+,fmin). // M is a set of frequent multisets
      C := emptySet().               // C is a set of resulting discriminant
                                     // multi-dimensional chronicles

      for (m of M):
        if supp(S+,{m,tinf}) > (gmin * supp(S-,{m,tinf})):
          C.add({m,tinf}). // adds a discriminant chronicle
                           // without temporal constraints
        else:
          for t of extractDC(S+,S-,m,fmin,gmin):
            C.add({m,t}).  // adds a discriminant chronicle
                           // with temporal constraints

      return C.

                        Listing 1: DCM-MD pseudocode

   The extractMultiSet(...) function extracts a set of frequent multisets from a given sequence set and a user-supplied minimal support threshold ($f_{\min}$). It applies a regular frequent itemset mining algorithm where an event type $a \in \mathbb{E}$ occurring $n$ times in a sequence is encoded by $n$ items $I^a_1, I^a_2, \dots, I^a_n$. An intermediate frequent itemset of size $m$, denoted as $(I^{e_k}_{i_k})_{1 \le k \le m}$, is extracted from the supplied sequence set and is further transformed into the resulting multiset. The last phase of the algorithm converts each frequent itemset $(I^{e_k}_{i_k})_{1 \le k \le m}$ to a multiset containing mutually different event types $e_k$, $k = 1, \dots, m$, each of them exactly $i_k$ times.

   The extractDC(...) function is used to mine discriminant hyperrectangle constraints from a given frequent multiset $\mathcal{E} = \{\{a_1, a_2, \dots, a_n\}\}$, disjoint sequence sets $S^+$ and $S^-$, and the user-defined parameters $f_{\min}$ and $g_{\min}$. Conceptual and implementation details regarding the extraction of discriminant hyperrectangle constraints are further elaborated in [1].

4    Application to Crystal Growth Data

The need for affordable high-quality semiconducting crystals such as gallium arsenide (GaAs) is continuously increasing, particularly for electronic and photovoltaic applications. Although GaAs has a number of outstanding physical properties, its production is hampered by challenging process control due to the high melting temperature (1238 °C) and the chemically aggressive environment. In particular, in-situ measurements of the process variables (e.g. temperatures, velocities, concentrations) in the GaAs have a high contamination potential and lead to low crystal quality. Moreover, in-situ visual observations of the crystal growth are not possible. Prediction of the position of the crystallization front, i.e. the length of the grown crystal after usage of a certain growth recipe (i.e. temporal
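The support, frequency and growth-rate definitions (Eqs. 4 to 7) translate almost directly into code. The following is a minimal Python sketch, not the authors' implementation: the occurrence test is passed in as a callable, `toy_occurs` is a hypothetical stand-in that ignores temporal constraints, and checking frequency in S+ is an assumption, since the text only says "frequent".

```python
import math
from typing import Callable, Sequence, TypeVar

C = TypeVar("C")   # chronicle representation (left abstract here)
S = TypeVar("S")   # sequence representation


def support(chronicle: C, sequences: Sequence[S], occurs: Callable[[C, S], bool]) -> int:
    """Eq. (4): number of sequences in which the chronicle occurs."""
    return sum(1 for s in sequences if occurs(chronicle, s))


def is_frequent(chronicle, sequences, f_min, occurs) -> bool:
    """Eq. (6): relative support supp(C, S) / #S is at least f_min."""
    if not sequences:
        return False
    return support(chronicle, sequences, occurs) / len(sequences) >= f_min


def growth_rate(chronicle, s_plus, s_minus, occurs) -> float:
    """Eq. (7): ratio of supports, +inf when the chronicle never occurs in S-."""
    neg = support(chronicle, s_minus, occurs)
    if neg == 0:
        return math.inf
    return support(chronicle, s_plus, occurs) / neg


def is_discriminant(chronicle, s_plus, s_minus, f_min, g_min, occurs) -> bool:
    # Assumption: frequency is checked in S+ (the text only says "frequent").
    return (is_frequent(chronicle, s_plus, f_min, occurs)
            and growth_rate(chronicle, s_plus, s_minus, occurs) >= g_min)


# Hypothetical toy occurrence test: a "chronicle" is a set of event types with
# no temporal constraints; it occurs if all its types appear in the sequence.
def toy_occurs(types: frozenset, seq: list) -> bool:
    return types <= set(seq)


s_plus = [["A", "B"], ["A", "B", "C"], ["A"]]
s_minus = [["A", "B"], ["C"]]
ab = frozenset({"A", "B"})
print(support(ab, s_plus, toy_occurs))               # 2
print(growth_rate(ab, s_plus, s_minus, toy_occurs))  # 2.0
```

Passing the occurrence test as a parameter keeps the sketch independent of how chronicles and sequences are represented.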
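To make the chronicle concept concrete, the sketch below checks by brute force whether a chronicle with scalar temporal constraints occurs in a sequence, by trying injective assignments of multiset positions to sequence positions. It is illustrative only and rests on assumptions: constraints are keyed by positions `(i, j)` in the multiset rather than by event types, pairs without an entry are treated as unconstrained (as in the set denoted T∞), and the search is exponential in the chronicle size.

```python
from itertools import permutations
from typing import Dict, List, Tuple

Event = Tuple[str, float]  # (event type, timestamp)
# (i, j) -> (a, b): the time difference t_f(j) - t_f(i) must lie in [a, b].
Constraints = Dict[Tuple[int, int], Tuple[float, float]]


def occurs(multiset: List[str], constraints: Constraints, seq: List[Event]) -> bool:
    """Return True if some injective map f embeds the chronicle into seq."""
    m, n = len(multiset), len(seq)
    # permutations(range(n), m) enumerates all injective maps {0..m-1} -> {0..n-1};
    # it yields nothing when m > n.
    for f in permutations(range(n), m):
        # The event types must match position by position.
        if any(seq[f[i]][0] != multiset[i] for i in range(m)):
            continue
        # Every listed constraint must hold; unlisted pairs are unconstrained.
        if all(a <= seq[f[j]][1] - seq[f[i]][1] <= b
               for (i, j), (a, b) in constraints.items()):
            return True
    return False


# Chronicle A[1, 3]B: an event of type B follows one of type A by 1 to 3 time units.
multiset = ["A", "B"]
constraints = {(0, 1): (1.0, 3.0)}

print(occurs(multiset, constraints, [("A", 0.0), ("B", 2.0)]))              # True
print(occurs(multiset, constraints, [("A", 0.0), ("B", 5.0)]))              # False
print(occurs(multiset, constraints, [("A", 0.0), ("B", 5.0), ("A", 3.0)]))  # True
```

With an empty constraint dictionary the check reduces to testing whether the multiset of event types is contained in the sequence.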
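The control flow of Listing 1 can be sketched in Python as follows. This is only an illustration of the branching logic under stated assumptions: the multiset extractor, the support function and the constraint miner are passed in as callables, and the `toy_*` helpers and `no_constraints` below are hypothetical stand-ins that do not reproduce the actual extraction of hyperrectangle constraints.

```python
T_INF = "tinf"  # placeholder for the unconstrained constraint set


def dcm_md(s_plus, s_minus, f_min, g_min, extract_multisets, supp, extract_dc):
    """Mirror of the DCM-MD pseudocode; returns a list of (multiset, constraints)."""
    chronicles = []
    for m in extract_multisets(s_plus, f_min):
        if supp(s_plus, (m, T_INF)) > g_min * supp(s_minus, (m, T_INF)):
            # The multiset alone is already discriminant.
            chronicles.append((m, T_INF))
        else:
            # Otherwise try to find discriminating constraints for it.
            for t in extract_dc(s_plus, s_minus, m, f_min, g_min):
                chronicles.append((m, t))
    return chronicles


# Hypothetical stand-ins: sequences are plain lists of event types.
def toy_supp(sequences, chronicle):
    multiset, _ = chronicle
    return sum(1 for s in sequences
               if all(s.count(e) >= multiset.count(e) for e in set(multiset)))


def toy_extract_multisets(sequences, f_min):
    # Only single-type multisets, kept if frequent enough.
    types = sorted({e for s in sequences for e in s})
    return [(e,) for e in types
            if sum(e in s for s in sequences) / len(sequences) >= f_min]


def no_constraints(s_plus, s_minus, multiset, f_min, g_min):
    return []  # a real miner would return discriminant constraint sets


s_plus = [["A", "B"], ["A"], ["A", "B"]]
s_minus = [["B"], ["B"], ["C"]]
result = dcm_md(s_plus, s_minus, 0.5, 2, toy_extract_multisets, toy_supp, no_constraints)
print(result)  # [(('A',), 'tinf')]
```

In this toy run, type A is kept without constraints because it never occurs in the negative set, while type B falls through to the (empty) constraint miner.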
profiles of a power of heaters) is a key information for the
process monitoring.                                                            Table 1: Centers c of the 20 clusters in R2 defining event
   Here, we considered Vertical Growth Freeze (VGF)                            types. They were obtained through clustering the first
method for the growth of GaAs crystals. VGF growth                             500 numeric simulations underlying [7] using the k-means
method involves the progressive freezing of the lower end                      algorithm in Matlab
of a melt upward by moving the desired temperature gra-                           Cluster       c1      c2 Cluster          c1      c2
dient in a furnace via temporal change of heating power.                               A 13400 8560                K 14300 8720
1-dimensional model of VGF-GaAs growth is shown in                                     B 12300 8690                L 11900 8650
Figure 1.                                                                              C 15200 8840                M 12700 8660
                                                                                       D 13900 8590                N 12100 8670
                                                                                        E 13600 8730               O 13100 8720
4.1    Used Data                                                                        F 14900 8800                P 13700 8550
                                                                                       G 13200 8570                Q 14100 8710
The above described implementation extending the me-                                   H 12900 8590                R 13300 8730
thod proposed in [3] has been applied to data gathered in                               I 13800 8740                S 14600 8760
the German Research Foundation (DFG) project “Model-                                    J 12500 8680               T 12900 8720
based control and regulation of the VGF crystal growth
process using distributed parametric methods”. The data
records the position of the solid/liquid interface of GaAs
crystals grown by the vertical gradient freeze (VGF) me-                         S− = {((e1 , T1 ), . . . , (e20 , T20 )|ei , Ti , i = 1, . . . , 20,
thod, which involves progressive freezing of the lower end                           originated in a simulation ending with the position
of a melt upward by moving the temperature gradient in                                   of the solid/liquid interface 17.25–25 cm)}                    (10)
a furnace, together with the evolution of temperatures in
0th–4th quarter of the GaAs height. They have been ob-                         As to the number of sequences in both sets, #S+ =
tained by solving the inverse problem for a simplified one                     90, #S− = 165.
dimensional model of the VGF process for different de-                            Finally, the considered partial ordering ≺ of event types
sired growth rates as described in [7], using as input the                     is given by the order of ocurrence of events of those types
evolution of 2-dimensional vectors describing the heat flux                    in any of the event sequencesin S+ or in S− , i.e.,
in and heat flux out (Figure 1). All simulations were per-
formed for 100 times, among which the 5th, 10th, . . . ,
95th, 100th time will in the following serve as milestone                        e ≺ e0 iff (∃((e1 , T1 ), . . . , (e20 , T20 ) ∈ S+ ∪ S− )
times.                                                                                    (∃i, j = 1, . . . , 20) i < j & = ei & e0 = e j . (11)
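The ordering in Eq. (11) can be derived mechanically from the event sequences. The sketch below is a literal reading of that definition in Python, under the assumption that a sequence is a list of `(event_type, temperature_vector)` pairs; the helper `conflicting_pairs` is a hypothetical addition that lists pairs related in both directions, which a literal reading does not rule out.

```python
from itertools import combinations
from typing import List, Sequence, Set, Tuple

Event = Tuple[str, Tuple[float, ...]]  # (event type, vector of temperatures)


def precedence_relation(sequences: Sequence[List[Event]]) -> Set[Tuple[str, str]]:
    """Pairs (e, e') such that e is recorded before e' in at least one sequence."""
    relation = set()
    for seq in sequences:
        types = [event_type for event_type, _ in seq]
        # combinations over index positions yields exactly the pairs with i < j.
        for i, j in combinations(range(len(types)), 2):
            relation.add((types[i], types[j]))
    return relation


def conflicting_pairs(relation: Set[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Pairs of distinct types related in both directions (hypothetical helper)."""
    return {(a, b) for (a, b) in relation if a < b and (b, a) in relation}


# Tiny made-up example with 2-dimensional vectors instead of 5-dimensional ones.
seqs = [
    [("A", (1.0, 2.0)), ("B", (1.5, 2.5)), ("C", (2.0, 3.0))],
    [("B", (0.5, 1.0)), ("A", (1.0, 1.5))],
]
rel = precedence_relation(seqs)
print(sorted(rel))              # [('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C')]
print(conflicting_pairs(rel))   # {('A', 'B')}
```

Listing the conflicting pairs is one simple way to see where the relation obtained from real data departs from a strict partial order.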
   For an application of the method presented in Section 3,
event types and events have been defined as follows. The
2-dimensional inputs of the 500 numeric simulations un-                        4.2    Experimental Setup
derlying [7] have been clustered into k = 20 clusters using
                                                                               The experimental setup aimed at a chronicle set contain-
the Matlab implementation of the standard k-means clus-
                                                                               ing about 20-30 elements and including both chronicles
tering algorithm. The centers of the resulting clusters are
                                                                               discriminant for S+ with respect to S− and chronicles
listed in Table 1. An event type is now the fact that the
                                                                               discriminant for S− with respect to S+ . Each chroni-
input belongs to a particular cluster. For each numeric
                                                                               cle (E , T ) ∈ C should contain only a minimal number
simulation, an event type is recorded at every milestone
                                                                               of T∞ constraints.
time. Consequently, the size of any multiset of event types
from one numeric simulation is at most 20. . An event                             Assume that C = (E , T ) is a chronicle, C is a set
is a pair (e, T ), where e is an event type and T ∈ R5 is a                    of chronicles and ts ∈ [0, 1]. Chronicle specificity denoted
vector of temperatures obtained in the numeric simulation                      as s(C) is defined as:
and at the milestone time when e was recorded, provided
                                                                                                   #{e[[t,t 0 ]]e0 ∈ T |e[[t,t 0 ]]e0 6∈ T∞ }
the position of the solid/liquid interface at the end of that                            s(C) =                                               .
simuation was at least 17.25 cm. There were 255 such                                                                 #T
simulations available, thus we have 255 event sequences of length 20, due to the 20 milestone times. They were divided into two disjoint sequence sets as follows:

  S+ = {((e1, T1), ..., (e20, T20)) | ei, Ti, i = 1, ..., 20,
        originated in a simulation ending with the position
        of the solid/liquid interface > 25 cm}                (9)

Figure 1: Illustration explaining the used crystal growth data

   The subset of a chronicle set 𝒞 that is specific for a specificity threshold t_s, denoted s(𝒞, t_s), is defined as s(𝒞, t_s) = {C | C ∈ 𝒞 & s(C) ≥ t_s}.
   The metrics used for evaluating the suitability of the parameters passed to the DC-PBC component are described in the rest of this paragraph. #M is the size of the set of frequent multisets as introduced in the pseudocode of the DCM-MD algorithm in Listing 1. #E is the count of distinct frequent multisets which occurred in some discriminant chronicle of the resulting chronicle set 𝒞:

  #E = #{E | there exists a set T of hyperrectangle constraints such that (E, T) ∈ 𝒞}.

max s(𝒞) = max{s(C) | C ∈ 𝒞} is the maximal specificity value found among the chronicles in 𝒞. #s(𝒞, t_s) is the count of chronicles specific for t_s found in 𝒞.
   The following parameters were tuned. f_min, implemented by the --fmin parameter, is the minimal support threshold. g_min, implemented by the --gmin parameter, is the minimal growth rate threshold of the DCM-MD algorithm as introduced in Listing 1. min(#E), implemented by the --mincs parameter, is the minimal chronicle event multiset size. max(#E), implemented by the --maxcs parameter, is the maximal chronicle event multiset size. t_s is the specificity threshold of a custom tool implemented for extracting specific discriminant chronicles from a set of discriminant chronicles.
   After evaluating the metrics for each parameter tuning step, the argument values f_min = 0.1, g_min = 5000, min(#E) = 2, max(#E) = 5, t_s = 0.7 proved sufficient for retrieving a set of specific discriminant chronicles with the desired properties.

4.3   Examples of Extracted Rules

The implementation of generalized DC-PBC available at github.com/busarade-itat/md-dc-pbc was invoked for the data described above with argument values --mincs 2, --maxcs 5, --fmin 0.1, --gmin 5000.
   The resulting set of discriminant chronicles was afterwards filtered to include only specific discriminant chronicles. To this end, a tool chronicle_statgen available at github.com/busarade-itat was invoked with arguments --minspec 0.7, --vecsize 5.
   The final result is presented in Tables 2 and 3, counting a total of 26 specific discriminant chronicles: 18 of them are discriminant for S+ with respect to S−, and the remaining 8 are discriminant for S− with respect to S+.
   The proposed method enables prediction of the conditions for reaching a targeted crystal length by following the differences among segments in the temporal profiles of temperatures at characteristic points in the GaAs. If the same approach is further applied to the experimental temperature profiles measured by thermocouples in heaters (outside of the melt and crystal), as in real experiments, it will be possible to determine the moment of reaching the desired crystal length without visual observations and GaAs contamination. At that moment, the crystal growth step terminates and the cooling of the furnace starts. Such an accurate prediction of the end of the solidification step will be very beneficial for the process economy and the final crystal quality.
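The tuning metrics and the specificity filter described before Section 4.3 can be illustrated with a minimal Python sketch. It is not the chronicle_statgen tool: the class Chronicle and the functions specific_subset and tuning_metrics are hypothetical names, the specificity s(C) is assumed to be computed elsewhere, and the metric #M is omitted because it requires the internal set of frequent multisets of DCM-MD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chronicle:
    events: tuple        # event multiset E, stored as a sorted tuple of labels
    specificity: float   # s(C), assumed to be computed elsewhere


def specific_subset(chronicles, t_s):
    """Return the chronicles whose specificity is at least t_s."""
    return [c for c in chronicles if c.specificity >= t_s]


def tuning_metrics(chronicles, t_s):
    """Compute #E, the maximal specificity and #s for a chronicle set."""
    return {
        "#E": len({c.events for c in chronicles}),
        "max_s": max((c.specificity for c in chronicles), default=0.0),
        "#s": len(specific_subset(chronicles, t_s)),
    }


# Hypothetical chronicle set; the specificity values are made up.
toy = [
    Chronicle(("A", "K"), 0.9),
    Chronicle(("A", "K"), 0.4),
    Chronicle(("G", "I", "Q"), 0.7),
]
metrics = tuning_metrics(toy, t_s=0.7)
```

In this toy set two distinct multisets occur, and two of the three chronicles reach the threshold t_s = 0.7.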
  Table 2: Resulting set of specific chronicles discriminant for S+ with respect to S− , rounded to 3 significant digits
           E(C)              T(C)                                                                                   supp(C, S+ )   supp(C, S− )
{{e1 = A, e2 = A, e3 = Q}}   {e1 [[(−∞, −∞, −∞, −∞, −∞), (−33.7, −51.3, −68.9, −86.7, −105)]]e2 ,                       23              0
                             e1 [[(−∞, −∞, −∞, −∞, −∞), (73.4, 73.5, 73.7, 73.8, 74.0)]]e3 ,
                             e2 [[(−∞, −∞, −∞, −∞, −∞), (159, 160, 161, 162, 180)]]e3 }
{{e1 = A, e2 = I, e3 = Q}}   {e1 [[(−∞, −∞, −∞, −∞, −∞), (15.5, 17.9, 18.0, 18.1, 18.2)]]e2 ,                           10              0
                             e1 [[(43.1, 60.6, 73.7, 73.8, 74.0), (∞, ∞, ∞, ∞, ∞)]]e3 ,
                             e2 [[(−∞, −∞, −∞, −∞, −∞), (144, 145, 146, 146, 163)]]e3 }
{{e1 = A, e2 = I, e3 = Q}}   {e1 [[(−∞, −∞, −∞, −∞, −∞), (36.9, 37.0, 37.1, 37.1, 37.2)]]e2 ,                           13              0
                             e1 [[(43.1, 60.6, 73.7, 73.8, 74.0), (∞, ∞, ∞, ∞, ∞)]]e3 ,
                             e2 [[(−∞, −∞, −∞, −∞, −∞), (144, 145, 146, 146, 163)]]e3 }
{{e1 = G, e2 = A, e3 = I}}   {e1 [[(−∞, −∞, −∞, −∞, −∞), (−128, −128, −129, −130, −130)]]e2 ,                           13              0
                             e1 [[(55.9, 73.8, 91.6, 109, 127), (∞, ∞, ∞, ∞, ∞)]]e3 ,
                             e2 [[(398, 412, 414, 416, 418), (∞, ∞, ∞, ∞, ∞)]]e3 }
{{e1 = G, e2 = A, e3 = I}}   {e1 [[(−∞, −∞, −∞, −∞, −∞), (−374, −376, −378, −380, −382)]]e2 ,                           13              0
                             e1 [[(−∞, −∞, −∞, −∞, −∞), (88.6, 107, 125, 144, 162)]]e3 ,
                             e2 [[(398, 412, 414, 416, 418), (∞, ∞, ∞, ∞, ∞)]]e3 }
{{e1 = G, e2 = A, e3 = Q}}   {e1 [[(139, 140, 141, 142, 142), (∞, ∞, ∞, ∞, ∞)]]e2 ,                                      9              0
                             e1 [[(−∞, −∞, −∞, −∞, −∞), (249, 268, 287, 306, 325)]]e3 ,
                             e2 [[(43.5, 61.3, 73.7, 73.8, 74.0), (∞, ∞, ∞, ∞, ∞)]]e3 }
{{e1 = G, e2 = I, e3 = Q}}   {e1 [[(104, 117, 134, 151, 163), (∞, ∞, ∞, ∞, ∞)]]e2 ,                                     18              0
                             e1 [[(−∞, −∞, −∞, −∞, −∞), (189, 205, 222, 239, 256)]]e3 ,
                             e2 [[(−60.4, −60.5, −60.6, −60.7, −60.8), (∞, ∞, ∞, ∞, ∞)]]e3 }
{{e1 = G, e2 = Q, e3 = Q}}   {e1 [[(−∞, −∞, −∞, −∞, −∞), (249, 267, 286, 305, 324)]]e2 ,                                14              0
                             e1 [[(60.4, 78.3, 96.3, 114, 132), (∞, ∞, ∞, ∞, ∞)]]e3 ,
                             e2 [[(−32.5, −32.5, −32.6, −32.7, −32.7), (−31.8, −31.8, −31.9, −32.0, −32.0)]]e3 }
    {{e1 = A, e2 = A}}       {e1 [[(−135, −135, −136, −137, −137), (−127, −128, −128, −129, −130)]]e2 }                 38              0
    {{e1 = A, e2 = I}}       {e1 [[(431, 451, 470, 490, 510), (∞, ∞, ∞, ∞, ∞)]]e2 }                                     12              0
    {{e1 = A, e2 = K}}       {e1 [[(286, 305, 324, 343, 362), (∞, ∞, ∞, ∞, ∞)]]e2 }                                     30              0
    {{e1 = A, e2 = Q}}       {e1 [[(372, 391, 411, 430, 449), (∞, ∞, ∞, ∞, ∞)]]e2 }                                     21              0
    {{e1 = G, e2 = A}}       {e1 [[(−∞, −∞, −∞, −∞, −∞), (−375, −377, −379, −381, −383)]]e2 }                           16              0
   {{e1 = G, e2 = G}}        {e1 [[(−124, −125, −125, −126, −127), (−122, −123, −123, −124, −124)]]e2 }                  8              0
   {{e1 = G, e2 = K}}        {e1 [[(26.5, 43.7, 44.6, 44.8, 45.1), (161, 180, 198, 217, 236)]]e2 }                      17              0
    {{e1 = I, e2 = K}}       {e1 [[(72.5, 72.7, 72.8, 73.0, 73.1), (137, 144, 162, 180, 198)]]e2 }                      12              0
    {{e1 = I, e2 = Q}}       {e1 [[(68.6, 68.8, 68.9, 69.1, 69.2), (69.0, 69.1, 69.3, 69.4, 69.5)]]e2 }                  7              0
   {{e1 = Q, e2 = K}}        {e1 [[(−27.3, −27.4, −27.4, −27.5, −27.5), (∞, ∞, ∞, ∞, ∞)]]e2 }                           72              0
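To make the rows of Table 2 easier to read, the following Python sketch tests whether one chronicle occurs in an event sequence. It rests on two assumptions that are not confirmed by this section: a constraint e1[[l, u]]e2 is taken to bound the component-wise difference of the attribute vectors of e2 and e1, by analogy with scalar chronicles, and the bounds are treated as closed. The functions in_hyperrectangle and occurs are hypothetical names, and the brute-force search is for illustration only, not the matching procedure of DC-PBC.

```python
import math
from itertools import permutations

INF = math.inf


def in_hyperrectangle(diff, lower, upper):
    """True if lower[k] <= diff[k] <= upper[k] for every component k."""
    return all(lo <= d <= hi for d, lo, hi in zip(diff, lower, upper))


def occurs(labels, constraints, sequence):
    """Brute-force test whether a chronicle occurs in an event sequence.

    labels:      event types of the chronicle, e.g. ["A", "K"]
    constraints: {(i, j): (lower, upper)} for chronicle events i and j
    sequence:    list of (event_type, attribute_vector) pairs
    """
    for pick in permutations(range(len(sequence)), len(labels)):
        # Each chronicle event must be mapped to an event of the same type.
        if any(sequence[p][0] != lab for p, lab in zip(pick, labels)):
            continue
        # Assumed semantics: attributes of event j minus attributes of event i.
        if all(
            in_hyperrectangle(
                [b - a for a, b in zip(sequence[pick[i]][1], sequence[pick[j]][1])],
                lower,
                upper,
            )
            for (i, j), (lower, upper) in constraints.items()
        ):
            return True
    return False


# Chronicle {{e1 = A, e2 = K}} from Table 2.
labels = ["A", "K"]
constraints = {(0, 1): ((286, 305, 324, 343, 362), (INF,) * 5)}

# Hypothetical two-event sequences; the attribute vectors are made up.
seq_hit = [("A", (0, 0, 0, 0, 0)), ("K", (300, 310, 330, 350, 370))]
seq_miss = [("A", (0, 0, 0, 0, 0)), ("K", (300, 310, 330, 350, 361))]
```

Under these assumptions the chronicle occurs in seq_hit but not in seq_miss, whose last component falls just below the lower bound 362.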




  Table 3: Resulting set of specific chronicles discriminant for S− with respect to S+ , rounded to 3 significant digits
           E(C)              T(C)                                                                                   supp(C, S+ )   supp(C, S− )
{{e1 = G, e2 = I, e3 = Q}}   {e1 [[(−∞, −∞, −∞, −∞, −∞), (80.3, 80.5, 80.6, 80.7, 80.9)]]e2 ,                            0             19
                             e1 [[(−∞, −∞, −∞, −∞, −∞), (50.2, 50.2, 50.3, 50.4, 50.5)]]e3 ,
                             e2 [[(−∞, −∞, −∞, −∞, −∞), (67.4, 67.5, 67.6, 77.0, 93.3)]]e3 }
   {{e1 = A, e2 = Q}}        {e1 [[(12.1, 12.1, 12.1, 12.1, 12.1), (13.6, 13.6, 13.6, 13.6, 13.7)]]e2 }                  0             13
   {{e1 = G, e2 = A}}        {e1 [[(316, 333, 351, 368, 385), (∞, ∞, ∞, ∞, ∞)]]e2 }                                      0             16
   {{e1 = G, e2 = G}}        {e1 [[(−121, −122, −122, −123, −123), (∞, ∞, ∞, ∞, ∞)]]e2 }                                 0             39
   {{e1 = G, e2 = I}}        {e1 [[(318, 334, 351, 367, 384), (∞, ∞, ∞, ∞, ∞)]]e2 }                                      0             31
   {{e1 = G, e2 = Q}}        {e1 [[(393, 411, 429, 447, 465), (∞, ∞, ∞, ∞, ∞)]]e2 }                                      0             18
    {{e1 = I, e2 = I}}       {e1 [[(−30.9, −31.0, −31.0, −31.1, −31.1), (∞, ∞, ∞, ∞, ∞)]]e2 }                            0             37
   {{e1 = Q, e2 = Q}}        {e1 [[(−30.8, −30.8, −30.9, −31.0, −31.0), (−30.1, −30.2, −30.2, −30.3, −30.3)]]e2 }        0             14
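Every row of Tables 2 and 3 has zero support in the opposite sequence set. A short Python sketch suggests why, assuming the growth rate is the ratio of the two supports as in the original discriminant chronicles mining [3], with the ratio taken as infinite when the opposite support is zero. The functions growth_rate and passes_growth_threshold are hypothetical names, not part of the DC-PBC implementation.

```python
import math


def growth_rate(supp_target, supp_other):
    """Assumed definition: ratio of supports, infinite if supp_other is 0."""
    if supp_other == 0:
        return math.inf if supp_target > 0 else 0.0
    return supp_target / supp_other


def passes_growth_threshold(supp_target, supp_other, g_min):
    """True if the growth rate reaches the threshold g_min."""
    return growth_rate(supp_target, supp_other) >= g_min


# First row of Table 2: supp(C, S+) = 23, supp(C, S-) = 0.
first_row = passes_growth_threshold(23, 0, g_min=5000)

# Hypothetical chronicle with a single occurrence in the opposite set.
hypothetical = passes_growth_threshold(254, 1, g_min=5000)
```

If the supports count sequences, a finite ratio is bounded by the 255 available sequences, so under this assumption g_min = 5000 can only be reached when the opposite support is zero.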
5   Conclusion

The paper has presented a generalization of the method for discriminant chronicles mining proposed in [3]. This generalization has been motivated by the objective to extract classification rules from crystal growth data, which brings two additional problems that do not arise in the data to which the original method had been applied: the events are described with a vector of attributes instead of a single scalar attribute, and the attributes are real-valued instead of integer-valued. The theoretical fundamentals of the method in [3] have been extended to tackle those two problems, and the system for discriminant chronicles mining based on [3] has been adapted to accommodate those extensions, together with some additional implementation improvements such as refactoring. As a proof of concept of the presented generalization, it has been applied, using the modified system, to real-world data with events characterizing the heat fluxes for the growth of GaAs crystals by the vertical gradient freeze method, and with a vector of 5 attributes recording the temperatures at different heights. Although most of the hyperrectangles in Tables 2 and 3 are not very restrictive, the extracted classification rules nevertheless show that the proposed approach allows assessing whether the grown crystal will have a desired length based solely on the temperature profiles. Regarding future research, it would be interesting to assess how small changes to the mined hyperrectangle constraints affect the manufacturing process of the VGF-GaAs crystals.

Acknowledgement
The research reported in this paper has been supported by the Czech Science Foundation (GAČR) grant 18-18080S.


References
 [1] Radek Buša. Implementation of a generalized version of a system for discriminant chronicles mining. Czech Technical University in Prague. Computing and Information Centre, Cham, 2020.
 [2] W.W. Cohen. Fast effective rule induction. pages 115–123, 1995.
 [3] Yann Dauxais, Thomas Guyet, David Gross-Amblard, and André Happe. Discriminant chronicles mining. 2017.
 [4] R. Dechter, I. Meiri, and J. Pearl. Temporal constraint networks. Artificial Intelligence, 49:61–95, 1991.
 [5] Christophe Dousson and Thang Vu Duong. Discovering chronicles with numerical time constraints from alarm logs for monitoring dynamic systems. In IJCAI, pages 620–626, 1999.
 [6] N. Dropka and M. Holeňa. Optimization of magnetically driven directional solidification of silicon using artificial neural networks and Gaussian process models. Journal of Crystal Growth, 471:53–61, 2017.
 [7] N. Dropka, M. Holeňa, S. Ecklebe, C. Frank-Rotsch, and J. Winkler. Fast forecasting of VGF crystal growth process by dynamic neural networks. Journal of Crystal Growth, 521:9–14, 2019.
 [8] Gemma C. Garriga. Summarizing sequential data with closed partial orders. In SDM, 2005.
 [9] M. Ghallab, D. Nau, and P. Traverso. Automated Planning and Acting. Cambridge University Press, Cambridge, 2016.
[10] D.J. Hand. Construction and Assessment of Classification Rules. John Wiley and Sons, New York, 1997.
[11] M. Holeňa, P. Pulc, and M. Kopp. Classification Methods for Internet Applications. Springer, 2020.
[12] H.C. Lau, T. Ou, and M. Sim. Robust temporal constraint networks. In International Conference on Tools with Artificial Intelligence, pages 82–88, 2005.