<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>This work was partially supported by RFBR (grant</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Association Rules Discovery in Multivariate Time Series</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>© Elena Lutsiv</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of St.-Petersburg, Faculty of Mathematics and Mechanics</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2007</year>
      </pub-date>
      <volume>0</volume>
      <fpage>7</fpage>
      <lpage>07</lpage>
      <abstract>
        <p>A problem of association rules discovery in a multivariate time series is considered in this paper. A method for finding interpretable association rules between frequent qualitative patterns is proposed. A pattern is defined as a sequence of mixed states. The multivariate time series is transformed into a set of labeled intervals and mined for frequently occurring patterns. Then these patterns are analyzed to find out which of them occur close to each other frequently. Some modifications and improvements of the method are proposed and discussed.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Many classes of data that we deal with in our everyday
life are temporal in their nature: medical information
about patients and their diseases, goods selling data,
financial data from stock markets, etc. In most cases
they describe development of some variables or objects
over time. All these data provide relatively small
amount of knowledge by themselves, but much more
information can be obtained from objects behavior
analysis. Such analysis that aimed at extracting
previously unknown rules from temporal data is called
Temporal Data Mining or Knowledge Discovery. A
comprehensive and detailed overview of temporal data
mining techniques can be found, for example, in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and
[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>
        Discovered knowledge can be used both to improve
understanding of underlying processes and for time
series prediction given some observations in the past. In
both cases it is important for all found regularities,
patterns and rules to be understandable and interpretable
by a domain expert. It seems that the best way for this is
to create a model of time series behavior (e.g., ARMA
or ARIMA models [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]), but in many cases these models
are difficult to understand for a human. Furthermore,
for many series, such as stock market data, it is
impossible to develop a global model due to their
chaotic nature. On the other hand, when someone thinks
of the system behavior, he rather keeps in mind some
dependencies between typical local development pieces
or scenarios. When an expert inspects system behavior
over time, usually he tries to find some relatively short
and simple pieces of history that occur frequently
enough and are easy to interpret. The next natural step
is to find out some cause-effect, coincidence or some
other dependencies between frequent episodes. For
example, he may find that “IF pressure falls and it is
summer THEN rain will start in 24 hours with high
confidence”. In this paper we propose a method for
such association rules discovery from a multivariate
temporal sequence.
      </p>
      <p>
        Automation of this process requires a definition of
time series similarity. Frequent episodes obtained with
different measures traditionally used for estimating
similarity (e.g. Euclidian distance) provide a little
information about system behavior that can be
interpreted by a human ([
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]). In this paper we use a
different way to discover frequent time series episodes
(or patterns). Similar approaches are used, for example,
in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. This kind of process is believed to be
much closer to a process used by a human (see [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]).
The main idea is to divide up the time series into a set
of labeled intervals. Each interval is a time interval
during which some condition is true in the original
series (for example, “a series decreases” or “a patient
has flu”). This can be done in different ways: intervals
and their labels can be identified empirically or as a
result of some automatic process, e.g. short
subsequences clustering [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Once we have the initial
set of intervals, we build all intersections of all subsets
of intervals and add them to our set. Finally we know
all segments of time series where one or more simple
conditions hold simultaneously.
      </p>
      <p>This paper proposes the method of frequent patterns
discovery from the described interval set. A pattern is
defined as a sequence of state labels (simple or mixed).
More formally it will be defined later. An algorithm of
finding such pattern’s occurrences has been developed
and is described in the paper.</p>
      <p>The paper is organized as follows: Section 2
contains a brief overview of similar works; Section 3
defines a notion of a pattern and describes frequent
patterns discovery procedure; in Section 4 a process of
association rules generation is briefly outlined; Section
5 is devoted to discussions of the method and some
modifications are proposed there. Section 6 contains
conclusions and outlines further work directions.
matching even if some conditions except ones defined
by the pattern are held on it.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Related Work</title>
      <p>
        There are a number of approaches dealing with symbol
sequences. In [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] a set of symbols is obtained from a
time series by clustering short subsequences extracted
with a sliding window. Then the sequence is mined for
associations between symbols that are close enough to
each other. In [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] an a priori style algorithm is used to
discover frequent Episodes in a symbol sequence. In [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]
authors apply genetic programming for pattern mining
and use special hardware for efficient sequence
matching.
      </p>
      <p>
        The following approaches work with interval
sequences produced from the time series, for example,
using feature extraction. In [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] a sequence of intervals
is searched for containment relations. In [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] a time
series is divided into a set of labeled intervals. Then the
interval set is mined for patterns described in terms of
Allen’s [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] interval logic. Interesting patterns are
restricted to A1 patterns, where operators can be
appended only on the right hand side. Another method
that uses Allen’s operators for pattern definition is
proposed in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. A set of labeled intervals is obtained
from a multivariate time series and mined for patterns
with a sliding window to restrict pattern length. Then
patterns are filtered by their interestingness using
Jmeasure [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
      <p>
        The last mentioned method is close to the proposed
approach, but its has two main disadvantages. The first
is that pattern’s frequency strongly depends on the
width of the sliding window. If an occurrence of a
pattern is a little longer than the window, it’s not taken
into account at all. The second disadvantage is that
patterns in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] are expressed in terms of interval logic,
that makes them difficult to understand.
      </p>
      <p>The approach proposed in this paper solves both of
these problems. It uses “windows” of all widths, that
significantly smooths the difference between
frequencies of patterns with instances of close lengths.
Patterns are expressed in terms of simultaneous states –
in the way that is closer to the human reasoning. The
author believes that a description like “pressure
increases and temperature decreases for 1 hour, and
then (may be after some time) pressure slightly
decreases” is more natural then “an interval where
pressure increases is overlapped by an interval where
temperature decreases, and the latter one is met by an
interval where pressure slightly decreases”. Moreover,
since the first statement involves only important
sequence pieces, it matches more segments of the time
series than the second one.</p>
      <p>
        Ther is another work ([
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]) that uses an idea of
simultaneous states, but in rather simplified form: it’s
authors state that a labelled interval matches an “Event”
(a set of simultaneously holding states) if a set of states
holding on this interval exactly coincides with the
Event. The method proposed in the current paper uses
more flexible approach that allows to treat an interval as
      </p>
    </sec>
    <sec id="sec-3">
      <title>3 Frequent Patterns Discovery</title>
      <p>The frequent patterns discovery procedure includes
three general steps:
1. Extract a basic set of simple intervals from the time
series (i.e. intervals labeled with simple states);
2. Convert an initial time series into a sequence of
nonoverlapping intervals labeled with mixed states;
3. Mine the sequence for frequent patterns.</p>
      <p>This section gives a definition of a pattern and describes
all these stages in details.</p>
      <sec id="sec-3-1">
        <title>3.1 Basic Interval Set</title>
        <p>As was mentioned above, to be able to discover time
series behavior qualitative patterns, we need to extract a
set of labeled intervals from the series. Let’s denote as
S0 a set of all simple states or conditions based on which
we divide our series into intervals (for example, “a
patient has a high temperature” or “humidity goes up”).
States in S0 are not required to be mutually exclusive, so
intervals of the resulting set may overlap. This fact
allows us not to restrict ourselves with only single
univariate time series, but freely work with multivariate
time series or even with several ones as well. Thus we
obtain a set I0 from our time series – a set of intervals
labeled with simple states:</p>
        <p>I0 = {(s1, b1, e1), (s2, b2, e2), … , (sn, bn, en)}
(1)</p>
        <p>Here (si, bi, ei) denotes an interval labeled with a
state si ∈ S0, beginning at bi and ending at ei. All these
intervals are required to be maximal intervals. This
means that in I0 there are no adjacent intervals or
intervals with non-empty intersection labeled with the
same state. I.e.:
∀ i=1..n, j=1..n: si = sj =&gt; ej &lt; bi ∨ ei &lt; bj
(2)</p>
        <p>For the next step we need to order states of S0 in
some way. A choice of ordering procedure is not very
important. For example, states may be ordered
lexicographically according to names of their labels or
somehow else. This is needed only to enable ordering of
all intervals of I0 according to the rules below:
∀ (si, bi, ei), (sj, bj, ej) ∈ I0:
• If bi &lt; bj then (si, bi, ei) &lt; (sj, bj, ej);
• If bi &gt; bj then (si, bi, ei) &gt; (sj, bj, ej);
• If bi = bj then:
− If si &lt; sj then (si, bi, ei) &lt; (sj, bj, ej);
− If si &gt; s j then (si, bi, ei) &gt; (sj, bj, ej);
− Note that si = sj is impossible due to the
maximality requirement.</p>
        <p>I0 – is a set of maximal simple intervals – that is
ordered as described above, we call it a basic interval
set. An example is shown on the Fig. 1. Here S0 = {A,
B, C, D} and I0 = {(A, 1, 4), (B, 2, 3), (C, 2, 6), (D, 3,
5)}.</p>
        <p>A
B</p>
        <p>C
4
D
1
2
3
5
6
As was mentioned above, the main goal of this method
is to find association rules for patterns expressed in
terms of simultaneous or mixed states. A mixed state is
any subset of S0. An interval i is labeled with a mixed
state s = {s1 , …, sn} if:
i = i1 ∩ i2 ∩ … ∩ in: ik ∈ I0 and ik is labeled with
sk, k = 1..n
(3)</p>
        <p>In other words, if two or more simple intervals
overlap, then their intersection is labeled with a mixed
state, composed of all simple state labels of these
intervals. Since I0 consists of maximal intervals, all
simple states in a mixed state are different.</p>
        <p>For further description of patterns and matching
procedure we need to define a sequence I that will be
searched for pattern occurrences. Let D = {d1, d2, …,
dm} be an ordered sequence of all different beginning
and ending points of I0 intervals. I consists of all
intervals (s, dk, dk+1), where s is a mixed state that
includes all simple states holding during this time
interval. For example, Fig. 2 (a) illustrates I for
intervals shown on Fig. 1:
D = {1, 2, 3, 4, 5, 6};
I = (({A}, 1, 2), ({A, B, C}, 2, 3), ({A, C, D}, 3, 4), ({C,
D}, 4, 5), ({C}, 5, 6))</p>
        <p>Note that I consists of maximal intervals due to
maximality of all intervals of I0.</p>
        <p>A pattern is defined as a sequence of mixed states.
For example, &lt;{A}, {A, B}, {C}&gt; denotes a pattern
that consists of mixed states {A}, {A, B} and {C},
where A, B and C are simple states.</p>
        <p>Consider:
− a pattern P = &lt;ps1, …, psn&gt;
− a subsequence Q of I: Q = ((s1, b1, e1), (s2, b2,
e2), … , (sm, bm, em)), where (sk, bk, ek) ∈ I and ek ≤
bk+1.</p>
        <p>An algorithm that matches P and Q is outlined
below. After successful matching the sequence Q is
broken into n groups of consecutive intervals so that
kth group is associated with psk. Q matches P only if u &gt;
n, v &gt; m and all intervals of Q are associated with states
of P when this process finishes.</p>
        <p>u := 1; v := 1;
while (u ≤ n and v ≤ m and psu
while ((v = 1 or ev-1 = bv
or (sv-1,bv-1,ev-1) is</p>
        <p>associated with psu-1)
and v ≤ m and psu ⊆ sv
⊆ sv)
and (u = n or psu+1 {
sv
or psu+1 ⊆ psu))
associate (sv,bv,ev) with psu;
v := v + 1;
end while
u := u + 1;
end while.</p>
        <p>There are some facts that should be noted:
1. If an interval (sk, bk, ek) belongs to a group associated
with psu, then psu ⊆ sk.
2. If ek &lt; bk+1, then (sk, bk, ek) and (sk+1, bk+1, ek+!)
cannot be associated with the same element of P. I. e.
non-adjacent intervals cannot be associated with the
same element of P.
3. If psu+1 ⊆ psu, (sk-1, bk-1, ek-1) is associated with psu
and psu+1 ⊆ sk, then (sk, bk, ek) will be associated with
psu+1 only if psu {</p>
        <p>sk. This is due to semantics of
such patterns. A pattern &lt;{A, B}, {A}&gt; means that at
first conditions A and B must hold for some time
simultaneously and then there must be some time
when A is true and B is false. If an existence of one
of these intervals is not important, then the pattern is
to be reduced to &lt;{A}&gt; or &lt;{A, B}&gt;.</p>
        <p>Fig. 2 (b, c) shows how a sequence ({A}, {A, B, C},
{A, C, D}, {C, D}, {D}) matches patterns &lt;{A}, {A, B},
{C, D}, {D}&gt; and &lt;{A}, {B}, {C}, {D}&gt;.</p>
        <p>a)
b)
c)</p>
        <p>A
A
A</p>
        <p>ABC
AB
B</p>
        <p>ACD</p>
        <p>CD</p>
        <p>CD
C</p>
        <p>D</p>
        <p>D
D</p>
        <p>To find all subsequences of I that match P –
occurrences of P – the following procedure is used:
1. Let u = 1.
2. Find and add to the resulting subsequence the earliest
interval (sk(u), bk(u), ek(u)) so that psu ⊆ sk(u).
3. If psu+1 ⊆ psu, then find the maximum d(u) so that
(sk(u), bk(u), ek(u)), (sk(u)+1, bk(u)+1, ek(u)+1), …, (sk(u)+d(u),
bk(u)+d(u), ek(u)+d(u)) are adjacent intervals and psu ⊆
sk(u), psu ⊆ sk(u)+1, …, psu ⊆ sk(u)+d(u). Add these
intervals to the resulting subsequence.
4. If psu+1 {</p>
        <p>psu, then find the maximum d(u) so that
(sk(u), bk(u), ek(u)), (sk(u)+1, bk(u)+1, ek(u)+1), …, (sk(u)+d(u),
bk(u)+d(u), ek(u)+d(u)) are adjacent intervals and psu ⊆
sk(u), psu(u) ⊆ sk(u)+1, …, psu ⊆ sk(u)+d(u) and psu+1 {
sk(u), psu+1 {
sk(u)+1, …, psu+1 {
sk(u)+d(u). Add these
intervals to the resulting subsequence.
5. Repeat steps 2 – 4 with the next u and all intervals
beginning later than ek(u)+d(u) until u &gt; n or the end of
I is reached. If all elements of P are processed, then
the resulting sequence matches P.
6. Repeat steps 1 – 5 with all intervals beginning later
than ek(1)+d(1) until the end of I is reached.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.3 Frequent Patterns Discovery</title>
        <p>In this section the procedure of frequent patterns
discovery is described.</p>
        <p>Let a time series under consideration be of length L.
Consider a time “window” of width w ≤ L that overlaps
[0, L]. There are (L + w) different windows of width w.
Consider a set of all different windows of length ≤ L.
This set is of size 3L2/2.</p>
        <p>Consider a pattern P of cardinality n and a set of all
intervals {[pb1, pe1], [pb2, pe2], …, [pbr, per]} so that
for each 1 ≤ i ≤ r:
− pbi &lt; pbi+1 (if i &lt; r);
− there is a subsequence ((s1, b1, e1), (s2, b2, e2), …, (sm,
bm, em)) of I matching P so that:
• if n ≥ 2, then:
− ex = pbi and by = pei (where (sx, bx, ex) is the last
interval associated with ps1, and (sy, by, ey) is
the first interval associated with psn during
matching);
− there are no shorter (i.e. with less |by – ex|)
subsequences of I matching P within [pbi, pei].
• if n = 1, then pbi = b1 and pei = e1.
n ≥ 2:
n = 1:</p>
        <p>r ⎛
supp(P) = ∑ ⎜⎜ ( pbk − pbk−1)(L − plk ) −
k=1 ⎝
( pbk − pbk−1)2 ⎞⎟
2 ⎟⎠
(4)
supp(P) = ∑⎜⎜( pek − pek−1)(L + plk ) − ( pek − pek−1) − plk 2 ⎞⎟
r ⎛ 2
k=1 ⎝ 2 2 ⎠⎟
supp(P) – a support of P – is a set of all windows of
length ≤ L that contain at least one of [pbi, pei] (if n ≥ 2)
or overlap with at least one of [pbi, pei] (if n = 1), 1 ≤ i
≤ r.</p>
        <p>
          This definition is similar to a support definition
given in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], but considers “sliding windows” of any
length between 0 and L, that allows not to limit a length
of a pattern occurrence. In common words it can be
expressed like this: if we take any window of length ≤
L, and consider it as a time series, and try to find
occurrences of P in it, and succeed, then this window is
an element of supp(P).
        </p>
        <p>|supp(P)| is a cardinality of supp(P). A pattern is
called frequent if |supp(P)| ≥ suppmin. Assuming pb0 =
L + pe1 and pe0 = -L + pb1, (4) is a formula for
|supp(P)| evaluation.</p>
        <p>An important property of these definitions is that
any subpattern of a frequent pattern is also frequent.
This enables the following procedure of frequent pattern
discovery: find all frequent patterns of length 1, then
construct patterns of length k+1 by appending one
mixed state to frequent patterns of length k and get a set
Fk+1 – a set of all frequent patterns of length k+1. The
process finishes when no new frequent patterns can be
found. In more details the procedure is described below:</p>
        <p>Step 1: Scan I and find all patterns &lt;ps&gt; of length 1
so that there exists an interval (s, b, e) ∈ I with ps ⊆ s.
Add all frequent patterns to F1.</p>
        <p>Step k+1: For each P=&lt;ps1, …, psk&gt; ∈ Fk find all
patterns Q=&lt;qs1, …, qsk&gt; ∈ Fk so that qs1 = ps2, …,
qsk-1 = psk, and generate patterns &lt;ps1, …, psk, qsk&gt;
from P and each described Q. Test all generated
patterns and add all frequent ones to Fk+1.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4 Association Rules Discovery</title>
      <p>Once all frequent patterns are discovered, we can try to
find association rules. Let’s denote a rule as A B. But
first of all we need to define which patterns will be
considered as associated. This must be some condition
(cond) describing how antecedent pattern (A) and
succedent pattern (B) are disposed. For example, this
can be “B starts during A and ends within 2 hours after
A ends” or “B starts within 5 minutes after A ends”.
This condition is to be chosen by a domain expert so
that discovered rules would be easy to interpret.</p>
      <p>Consider all occurrences of A and B that satisfy
condition cond. Let suppA B(A) and suppA B(B) be a
supports of A and B respectively, that are built basing
only on such occurrences. Support of A B is defined
large set of relatively useless rules. For example, if
occurrences of B cover almost all time series, then all
rules with B as a succedent will be selected, but most of
them will be of small interest.</p>
      <p>
        ⎛ ⎛1− p(B | A) ⎞⎟⎞⎟
J(B; A) = p(A)⎜⎜⎝ p(B | A)log⎜⎛⎝ p(pB( B|)A) ⎞⎟⎠ + (1− p(B | A))log⎜⎝ 1− p(B) ⎠⎟⎠
(6)
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] gives a survey of the most successful
interestingness measures used in knowledge discovery.
One of the most popular measures for association rules
evaluation is a J-measure [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] (see (6)). It is a special
case of Shannon’s mutual information and takes into
account both rule frequency and rule “surprisingness”.
If J-measure of a rule is less than some threshold value,
then it is not interesting enough and should be excluded
from consideration.
      </p>
      <p>In our context p(B) is a probability of pattern B
occurring in randomly selected window of length ≤ L,
and p(A B) is a probability of the fact that B occurs
in a randomly selected window of length ≤ L that
contains an occurrence of A and cond is true for these
occurrences. The probabilities are evaluated as:
supp(B)
There are a number of issues that should be discussed
about the proposed method. One of them is a number of
mixed states. In general case, having N simple states,
we obtain 2N possible mixed states. But in practice
some of them cannot take place simultaneously (or can
hold only simultaneously) due to some reason. For
example, consider 3 time series: pressure, temperature
and humidity data, and 6 simple states for each of them:
“increase”, “slow increase”, “fast increase”, “decrease”,
“slow decrease”, “fast decrease”. There are 23*6 possible
mixed states, but it is obvious that slowly increasing
and fast increasing segments of the same time series
cannot overlap, or decreasing and fast decreasing
segments of the series always overlap, and so on. Thus
the number of possible mixed states is reduced to 73 (or
even 53 if any decreasing/increasing segment is either
fast or slowly decreasing/increasing). Moreover,
possibly not all of them have support exceeding suppmin,
so they will be removed from consideration after the
first step of frequent patterns discovery procedure.</p>
      <p>Another problem is very short mixed-state intervals
appearing due to inaccurate initial time series division
(especially if it was performed by a human). On Fig. 3
(a) the bounds of intervals labeled with simple states A
and B were defined roughly, and this resulted in a small
interval labeled with a mixed state {A, B}. A simple
solution is to ignore intervals shorter than some
threshold ε when searching for pattern occurrences. For
example, if l &lt; ε, then a sequence from the Fig. 3 (a)
does not match a pattern &lt;{A}, {A, B}, {B}&gt;. The
same idea can be applied to situation when there is a
small gap between intervals labeled with the same state
(see Fig. 3 (b)). Such gaps may appear, for example,
due to smoothing faults. In this case it may be ignored
during patterns discovery and both intervals may be
considered as one continuous interval.</p>
      <p>But what if the user wants to have patterns that
exactly define which intervals must be adjacent and
which may have a gap between them in a matching
sequence? For example, he wants to extend pattern
&lt;{A}, {B}, {C}&gt; with some information like “in a
matching sequence {A}-interval must be met by
{B}interval, but there may be a gap between {B}-interval
and {C}-interval”. Such patterns can be enabled by
introducing of a special symbol “gap” ( ). A pattern
described above can be expressed like &lt;{A}, {B},
{C}&gt;. “Gap” is not a mixed state, so pattern cardinality
does not change, but a space of all candidate patterns
increases.
10%
5000
4500
4000
3500
3000
2500
2000
1500
1000
500
0
12%
14%
16%
18%
20%
22%
24%
26%
28%
30%</p>
    </sec>
    <sec id="sec-5">
      <title>6 Conclusion and Future Work</title>
      <p>This paper proposes a method and an algorithm for
frequent patterns and rules discovery in labeled interval
sequence. The interval sequence can be obtained, for
example, from multivariate time series by dividing each
variable’s behavior into labeled intervals. Since pattern
is a sequence of mixed states, a domain expert can
easily interpret all found frequent patterns and rules.
Inaccuracy of series division and data smoothing
defects can be compensated by filtering intervals with
minimum interval length. The method was tested over
simulated interval set and dependency of a frequent
patterns number from minimum support and minimum
interval length were investigated.</p>
      <p>The problem of the proposed method is a large
number of generated frequent patterns and rules. A
Jmeasure can help to reduce it, but there still remains a
problem of ranking patterns by their interpretability and
“importance”. Also an effective algorithm for frequent
patterns discovery needs to be developed. Although an
initial interval set is converted to a simple sequence of
mixed state intervals and a pattern is either a mixed
states sequence, classical methods of finding
subsequence in a sequence are not suitable for pattern
matching, since this process is rather complicated.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>K.-L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sawhney</surname>
            ,
            <given-names>H.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shim</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Fast Similarity Search in The Presence of Noise, Scaling and Translation in Time-Series Databases</article-title>
          .
          <source>In: Proc. of the 21st Int. Conf. on Very Large Databases</source>
          , Zurich, Switzerland (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Allen</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          :
          <article-title>Maintaining knowledge about temporal intervals</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>26</volume>
          (
          <issue>11</issue>
          ) (
          <year>1983</year>
          )
          <fpage>832</fpage>
          -
          <lpage>843</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Antunes</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oliveira</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Temporal Data Mining: an Overview</article-title>
          .
          <source>In: KDD Workshop on Temporal Data Mining</source>
          , San Francisco (
          <year>2001</year>
          )
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Box</surname>
            ,
            <given-names>G.E.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jenkins</surname>
            ,
            <given-names>G.M.</given-names>
          </string-name>
          :
          <source>Time Series Analysis: Forecasting and Control</source>
          . San-Francisco, Holden-Day (
          <year>1976</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Brendt</surname>
            ,
            <given-names>D.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clifford</surname>
          </string-name>
          , J.:
          <article-title>Finding Patterns in Time Series: A Dynamic Programming Approach</article-title>
          . In:
          <article-title>Advances in Knowledge Discovery And Data Mining, chapter 9</article-title>
          . MIT Press (
          <year>1996</year>
          )
          <fpage>229</fpage>
          -
          <lpage>248</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>K.-I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mannila</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Renganathan</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smyth</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Rule Discovery from Time Series</article-title>
          .
          <source>In: Proc. of the 4th Int. Conf. on Knowledge Discovery and Data Mining</source>
          , AAAI Press (
          <year>1998</year>
          )
          <fpage>16</fpage>
          -
          <lpage>22</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Fu</surname>
            ,
            <given-names>A.W.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kam</surname>
            ,
            <given-names>P.S.</given-names>
          </string-name>
          :
          <article-title>Discovering Temporal Patterns for Interval-Based Events</article-title>
          . In: Kambayashi,
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Mohania</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.K.</given-names>
            ,
            <surname>Tjoa</surname>
          </string-name>
          , A.M., eds.:
          <source>Second International Conference on Data Warehousing and Knowledge Discovery (DaWaK</source>
          <year>2000</year>
          ).
          <source>Volume</source>
          <year>1874</year>
          ., London, UK, Springer (
          <year>2000</year>
          )
          <fpage>317</fpage>
          -
          <lpage>326</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Hetland</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saetrom</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Temporal Rule Discovery Using Genetic Programming and Specialized Hardware</article-title>
          .
          <source>In: Proc. of 4th Int. Conf. on Recent Advances in Soft Computing</source>
          (
          <year>2002</year>
          )
          <fpage>182</fpage>
          -
          <lpage>188</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Hilderman</surname>
            ,
            <given-names>R.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hamilton</surname>
            ,
            <given-names>H.J.</given-names>
          </string-name>
          :
          <article-title>Knowledge discovery and interestingness measures: A survey</article-title>
          .
          <source>Technical Report CS 99-04</source>
          , Department of Computer Science, University of Regina (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Höppner</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Discovery of Temporal Patterns - Learning Rules about the Qualitative Behavior of Time Series</article-title>
          .
          <source>In: Proc. of the 5th European Conference on Principles and Practice of Knowledge Discovery in Databases, Lecture Notes in Artificial Intelligence 2168</source>
          , Springer (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Hua</surname>
            ,
            <given-names>K.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maulik</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tran</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Villafane</surname>
          </string-name>
          , R.:
          <source>Mining Interval Time Series. In: Data Warehousing and Knowledge Discovery</source>
          . (
          <year>1999</year>
          )
          <fpage>318</fpage>
          -
          <lpage>330</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Orgun</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>G.J.:</given-names>
          </string-name>
          <article-title>An Overview of Temporal Data Mining</article-title>
          .
          <source>In: Proceedings of the 1st Australian Data Mining Workshop</source>
          , Canberra, Australia (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Mannila</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toivonen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verkamo</surname>
            ,
            <given-names>A.I.</given-names>
          </string-name>
          :
          <article-title>Discovery of Frequent Episodes in Event Sequences</article-title>
          .
          <source>Data Mining and Knowledge Discovery</source>
          <volume>1</volume>
          (
          <year>1997</year>
          )
          <fpage>259</fpage>
          -
          <lpage>289</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Moerchen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ultsch</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Mining Hierarchical Temporal Patterns in Multivariate Time Series</article-title>
          .
          <source>In: Proc. of the 27th German Conference on Artificial Intelligence (KI)</source>
          . Ulm,
          <string-name>
            <surname>Germany</surname>
          </string-name>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Smyth</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goodman</surname>
            ,
            <given-names>R. M.</given-names>
          </string-name>
          <article-title>Rule Induction Using Information Theory</article-title>
          . In:
          <article-title>Knowledge Discovery in Databases, chapter 9</article-title>
          . MIT Press (
          <year>1991</year>
          )
          <fpage>159</fpage>
          -
          <lpage>176</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>