<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using relational operations to express association rules</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>c Michal Burda</string-name>
          <email>Michal.Burda@vsb.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marian Mindek</string-name>
          <email>Marian.Mindek@vsb.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jana Sˇ armanova´</string-name>
          <email>Jana.Sarmanova@vsb.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. of Computer Science</institution>
          ,
          <addr-line>FEI, VS</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Proceedings of the Spring Young Researcher's Colloquium on Database and Information Systems SYRCoDIS</institution>
          ,
          <addr-line>St.-Petersburg, Russia, 2005</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Association rules are essential data mining tool. There exist many types of them. Indeed, association rules of different types have often completely dissimilar syntax. In this paper, we try to design a framework based on relational operations enabling one to express various association rules using the same syntax. Doing so we will be able to study association rules at high level - it will be possible to formally describe common characteristics of different rule types.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>This paper focuses on the possibility of representing the
association rules using the relational operations. We do
a lot of motivating in the beginning to make the reader
clear why it is important to introduce precise formal
language suitable for writing down various association rules
in the same way.</p>
      <p>After that, we get through many precise definitions
which result in a formal logic called Probabilistic Logic
of Typed Relations (PLTR).</p>
      <p>At last, we reveal the reader some challenging task
our research faces up today.</p>
      <p>This paper expects the reader to be familiar with
basics of association rule mining, mathematical logic, and
with most important notations of statistical science.</p>
    </sec>
    <sec id="sec-2">
      <title>Quick tour over association rule types</title>
      <p>Association rule is an explicit representation of
knowledge about some possibly interesting relationship that
holds good in data. There exist many approaches to get
such rules from data. They use sophisticated statistical
methods or empirical procedures to measure the
relevance of the rule. The best-known type of such
relationship are rules of the type:</p>
      <p>a1 ∧ a2 ∧ a3 ⇒ b.</p>
      <p>
        It express the fact that when an object has got the
attributes a1, a2 and a3 it is very probably that it has got
b, too. For example, the Market-basket analysis (see [1],
[9]) works mainly on data describing shopping activities
of the customers and can produce the subsequent rule:
tequila ∧ salt ⇒ lemon
(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )
(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )
indicating that customers buying tequila and salt often
buy lemons too. Such rules may be generalized to
arbitrary categorial data. For example, let’s have got a
database of patients suffering certain disease. We can
mine the following rule:
weight &gt; 100kg ∧ smoker ∧ ¬sport ⇒ heart-failure.
(
        <xref ref-type="bibr" rid="ref3">3</xref>
        )
The rule says, there is strong evidence of getting heart
failure for non-sporting smokers heavy at least 100 kg.
Please note, the rule is rather rational and it tells probably
nothing to the experts. However, many surprising rules
may be mined, in practice.
      </p>
      <p>
        We are not restricted on implicational rules only.
Book [7] introduces for example the rules of the form:
mathematics = “excellent” ∼ physics = “excellent”
(
        <xref ref-type="bibr" rid="ref4">4</xref>
        )
saying that in database about pupils the excellentness
in mathematics appears together with excelentness in
physics. In the other words, the excellenteness in
mathematics is associated with the excellenteness in physics.
      </p>
      <p>Let’s get more complicated. Work [7] presents further
the subsequent rules:</p>
      <p>
        X corr Y / C.
(
        <xref ref-type="bibr" rid="ref5">5</xref>
        )
The interpretation is: “The values of attributes A and
B are correlated when looking only on objects fulfilling
condition C.” Rules of that type are well applicable on
quantitative data.
      </p>
      <p>Authors of paper [2] have developed the consecutive
rules:
sex = “female” ⇒</p>
      <p>
        wage: mean = $7.90/hr
(overall mean wage = $9.02)
(
        <xref ref-type="bibr" rid="ref6">6</xref>
        )
indicating that the women’s wage mean is significantly
different to the rest of examined objects.
      </p>
      <p>We can be even more complicated and ask for the
significance of the mean difference in data D between two
subsets A1, A2 ⊂ D whose union isn’t the set D itself
(A1 ∩ A2 = ∅, A1 ∪ A2 ⊂ D) as in [12], [13]. E.g. we
may to compare the mean wage of people with age lower
than 30 and people older than 50 etc.</p>
      <p>Even the rule saying that there is some difference
between two sets of objects in an array of attributes
without telling us the kind of the dissimilarity (motivated by
multi-dimensional statistical tests, see [11]) is better than
knowing nothing.
One of the ways to increase the power of data mining
software is to employ the results of the statistic science
in association rules mining process. The rule types
mentioned above are only a few of many possibilities when
considering the large amount of statistical tests that may
be conceivable for rules mining.</p>
      <p>Statistical tests of hypotheses are very powerful tool.
In statistics, wide range of tests was designed. The tests
are used to make decisions about miscellaneous
characteristics; so, each test could be used to mine a rule of
different type.</p>
      <p>Moreover, statistical tests of hypotheses provide a
basic measure of rule interest. Every statistical test is based
on the computation of probability, at which the tested
characteristic is valid. Thus, the basic measure of rule
interest could be that probability. The more the tested
characteristic is probable, the stronger tested relationship
is, and therefore the more interesting the resulting rule
could be.</p>
      <p>A perfunctory understanding of statistical tests of
hypotheses is assumed. We denote a zero hypothesis H0
and alternative hypothesis HA. This subsection tries to
summarize basic characteristics of statistical tests and
make clear, what can be tested at all.</p>
      <p>Partitioning of statistical tests according to distribution
dependence:
1. Parametric – these tests rely on a special type of
distribution of sample data. These tests should be
used with care in association rule mining process
since we can not say anything specific to the
analyzed data, in general.
2. Non-parametric – often called rank tests, too. These
tests does not depend on a specific shape of
sample’s distribution. Weak apriory assumptions make
them more useful in association rule mining
process. However, weakness of assumptions is paid
with lesser strength of tests and the conversion of
quantitative data to ranks is sometimes relatively
complex because of the need to sort the data, etc.
Partitioning of statistical tests according to the number
of compared samples:
1. One-sample tests – are used to test, whether a
specific sample characteristic ψ is equal to a given
value ψ0. For example, we can test, whether the
weights of flour bags filled by automatic machine
are in average equal to ψ0 = 1 kg (H0 : ψ = ψ0) or
if there is present a systematic error which causes
under-weights or over-weights (HA : ψ 6= ψ0).
Such tests may be utilized in pair-comparing rules;
for example, whether the salary of people increased
after a training course.</p>
      <p>
        Furthermore, one kind of the well-known
association rule (
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) is defined as one-sample test on
probability of binomial distribution; it tests that the
simultaneous occurrence of condition A ∧ S in the rule is
very probable in contrast to A ∧ ¬S (see [8]).
2. Two-sample tests – are more general. They are able
to compare a characteristic of a sample against
characteristic of second sample. For example, we can
test, whether the mean weight of some race A of
men is significantly different from race B (H0 :
ψA = ψB , HA : ψA 6= ψB ). Such tests are just
ideal to be utilized in association rule mining.
3. Multi-sample tests – are used to compare a specific
characteristics of finite number of samples A, B,
. . . , Z. (H0 : ψA = ψB = . . . = ψZ is tested
against HA : ∃i, j ∈ {A, B, . . . , Z} : ψi 6= ψj .)
These tests could be used in rule mining too.
      </p>
      <p>Partitioning of statistical tests according to the number
of compared quantities:
1. One-dimensional – these tests work only with one
quantity (e.g. tests used to compare weight, wage
or density of something).
2. Multi-dimensional – does not compare the only
quantity, but in general the vector of quantities. We
can compare, for example, weight, height, blood
pressure etc. of two samples with one test trying
to state whether there is certain difference in at least
one attribute.</p>
      <p>Both one-dimensional and multi-dimensional tests may
be used to discover rules.</p>
      <p>Partitioning of statistical tests according to the type of
tested characteristics:</p>
      <p>There exist tests dealing with mean, variance
(dispersion), correlation, and so on. The type of tested
characteristic denotes in fact the type of mined rule. . .
2.2</p>
      <sec id="sec-2-1">
        <title>Where is the problem?</title>
        <p>What do we want to tell by this rule types enumeration?
We can claim, the actually well-known rule types provide
the possibility to mine various knowledge and thus are
very different and the rule notation is very dissimilar too.</p>
        <p>However, to be able to study such different rule types
deeply, to be able to explore the similarities and
relationships between various rule types and to infer formal
conclusions about the rules, we should have got a tool for
uniform rule notation. That is, we need a formal
language capable to express association rules of any type –
and just this topic is addressed in this paper.</p>
        <p>We do so in the hope of uncovering some hidden
truths about association rules which gives us the ideas
of further improvements in the array of association rule
mining algorithms, visualization and so on. We believe
that without good understanding of causalities and
relationships existing between various complex association
rule types we won’t be able to do this. (In fact, the usage
of the framework presented here brings us immediately
the benefit of finding the idea of cosymmetric association
rules, which are discussed shortly in conclusion.)</p>
        <p>Some work at the main topic of this paper was also
done in [7]. However, we present diametrally different
approach here and we think it is easier and more
versatile.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Representing association rules using projection and selection</title>
      <p>We can see from the previous chapter that association
rules express some strong relationship in data. Our idea
of how to design a formal logic language suitable for
association rule notation is based on considering a rule as a
relationship between two or more sub-tables of data table
R.</p>
      <p>
        For example, rule (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) can be taken as a relationship
between a sub-table A that consists of all R’s rows
satisfying criterion a1 ∧ a2 ∧ a3 and sub-table B satisfying
criterion b (see also figure 1(a)).
      </p>
      <p>
        When looking on rule (
        <xref ref-type="bibr" rid="ref5">5</xref>
        ) we see that it expresses
relationship between two sub-tables A and B. Both A and
B have got just rows that satisfy condition C, but table
A consists of R’s attribute X only and B of attribute Y
only (vide also figure 1(b)).
      </p>
      <p>
        Rule of type (
        <xref ref-type="bibr" rid="ref6">6</xref>
        ) models a relationship between two
disjunctive sub-tables having one identical column. Such
relationships are also depicted in figure 1(c).
      </p>
      <p>We can continue the same way and transform every
rule mentioned in section 2 to the proposition based on
relationship between R’s sub-tables.</p>
      <p>Okay, you might say, we can treat association rules
as relationships between sub-tables of analysed data
table, but how to describe such sub-tables to obtain formal
and easy-to-understand formulae? The answer inheres in
commonly known relational-database operations:
selection and projection. While selection is a tool for choosing
a certain rows satisfying given condition, the projection
takes out only specified columns of data table.
Combining these two operations together, we can describe
almost every sub-table of analysed data table R (see also
figure 2).</p>
      <p>To provide some concrete motivating examples, we
must be a little more formal. Thus, let R be the data
table. The expression of the form
means selection of R’s rows that satisfy given condition
C, and the expression of the form</p>
      <p>
        R(C)
R[A1, A2, . . . , An]
(
        <xref ref-type="bibr" rid="ref7">7</xref>
        )
(
        <xref ref-type="bibr" rid="ref8">8</xref>
        )
states projection on columns (attributes) A1, A2, . . ., An.
      </p>
      <p>It is important for our topic to treat projection as the
operation which holds duplicities. This is a small
difference in contrast to commonly known projection on
relations, since usual relations are sets and sets can not
contain the same item twice. Hence, such relational
projection may cause some information lost – duplicities are
removed. Please see table 1 for illustrative projection
and selection results. Precise definition of notions such
as data table, attribute, projection or selection will be
provided later in section 4.</p>
      <p>
        Relationships between data can be modelled as usual
mappings assigning a truth value to a tuple of sub-tables
given as arguments. That way we are able to write every
rule of section 2 in the new style. To show it, we will
start with rule (
        <xref ref-type="bibr" rid="ref5">5</xref>
        ). We can write it in our new notation as
follows:
corr R(C)[X], R(C)[Y ]
(
        <xref ref-type="bibr" rid="ref9">9</xref>
        )
In formula (
        <xref ref-type="bibr" rid="ref9">9</xref>
        ) (resp. (
        <xref ref-type="bibr" rid="ref10">10</xref>
        )), we have prepared two
subtables using selections and projections (R(C)[A] and
R(C)[B]). These sub-tables act as arguments to a
mapping named corr which results to “true” when the two
columns given as its arguments are strongly correlated.
      </p>
      <p>
        Rule (
        <xref ref-type="bibr" rid="ref6">6</xref>
        ) can be represented by subsequent formula:
R(sex = “female”)[wage] &lt;Z–test
      </p>
      <p>
        R(sex 6= “female”)[wage].
It says that the statistical Z–test (this is the test which
Aumann and Lindell used in [2] to mine rule (
        <xref ref-type="bibr" rid="ref6">6</xref>
        )) showed
significant difference in wage between the set of women
and the rest of data table.
      </p>
      <p>The advantage of our approach also lies in the fact that
the mapping &lt;Z–test and others may be defined as to result
not in only true or false, but generally in arbitrary values
– for example in the p–value of the statistical test on the
background. That way we can construct a multi–valued
logic based on probability.</p>
      <p>
        Our research shows that representation of rule (
        <xref ref-type="bibr" rid="ref4">4</xref>
        ) is
not such straightforward as in case of preceding rules.
The benefit of a small complications is rise of an idea
of new association rules type. A rule of type (
        <xref ref-type="bibr" rid="ref4">4</xref>
        ) we can
write in new notation this way:
∼ R, R(C1), R(C2)
(
        <xref ref-type="bibr" rid="ref12">12</xref>
        )
where C1 is condition mathematics = “excellent” and
C2 is physics = “excellent”. The reason of the need the
mapping ∼ to be ternary is as follows. To determine (
        <xref ref-type="bibr" rid="ref4">4</xref>
        )’s
truth value we must know the number of records
satisfying also conditions C1 ∧ ¬C2, ¬C1 ∧ C2 and ¬C1 ∧ ¬C2;
it equals to the number of rows in R(C1) − R(C2),
R(C2) − R(C1) and R − R(C1) − R(C2), respectively.
(For more details of how the (
        <xref ref-type="bibr" rid="ref4">4</xref>
        )’s confidence is
computed see [7], [8], [15].) The rule (
        <xref ref-type="bibr" rid="ref12">12</xref>
        ) can be interpreted
as follows: “In the context of data table R the conditions
C1 and C2 are mostly satisfied both or neither.” The new
type of association rules arises when we put selection at
condition C0 to R in the first argument – that is, when
we modify the context:
∼ R(C0), R(C1), R(C2) .
(
        <xref ref-type="bibr" rid="ref13">13</xref>
        )
Such rule means: “In the context of R’s records
satisfying condition C0. . . ” The equivalent to this rule type is
mining rules (
        <xref ref-type="bibr" rid="ref4">4</xref>
        ) on sub-table R0 that is created from R
by omitting all rows that do not satisfy condition C0.
      </p>
      <p>
        The representation of rules (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) or (
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) is
straightforward. We can apply the same principle as with (
        <xref ref-type="bibr" rid="ref4">4</xref>
        ):
⇒ R, R(a1 ∧ a2 ∧ a3), R(b) ,
(
        <xref ref-type="bibr" rid="ref14">14</xref>
        )
      </p>
      <p>Please note the possibility of representing more
general rules than described former. We are not limited to
use only one column sub-tables as in figure 1(c). When
we employ some multi-dimensional statistical test (see
subsection 2.1) we can compare many columns at a time.</p>
      <p>Moreover, sometimes even the rules relating two
subtables of type depicted in figure 1(d) are profitable.
Suppose sub-table A contains quality characteristics of some
metal a measured before some process of quality
improvement (A = R(type = a)[Q1, Q2, Q3]) and
subtable B represents the same quality characteristics of
metal b measured after the process of quality
improvement (B = R(type = b)[Q01, Q02, Q03]). Then a rule
A &gt; B denotes the metal of type a is better even before
the improvement process than metal of type b after
improvement is performed. (However, this is rather tricky
example, we know it.)</p>
      <p>We hope the preceding examples give the reader much
motivation to accept our proposition of treating the
association rule mining process as follows. “Mining
association rules of certain type from data table R is a
process of searching rather simple mixtures of selections and
projections on data table R for which the mapping
representing the certain rule type gives truth value of our
demand.”
4</p>
    </sec>
    <sec id="sec-4">
      <title>Basic detailed definitions</title>
      <p>We see that the language for writing association rules
using relational operations is very expressible. We won’t
remain only in coarse motivation examples. In this
section, we provide a detailed definitions of the
Probabilistic logic of typed relations (PLTR) which we propose to
represent association rules in.
We start with basic definitions of the mathematical
representations of data tables, attributes etc. Definitions of
this sub-section are mainly based on the work [14].
However, we had to make some important modifications to
adapt the notations to our needs.</p>
      <p>In the subsequent, we sometimes use the phrase “set
of abstract elements”. It is simply a set whose elements
are not concrete, e.g. X = {x1, x2, x3} is abstract set of
three items.</p>
      <p>Definition 1 Let Ω be an infinite set of abstract elements
which we will call as attributes. Let each a ∈ Ω has
assigned a non-empty set δ(a) = Da called domain of
attribute a. Type A (of relations) is any finite subset of
the set Ω. We denote TΩ a set of all types.</p>
      <p>A type A of relation is something like description
of the data table. It says, what attributes (columns) are
present in the data table and what data can be stored in
that attributes (attribute domain).</p>
      <p>Example 2 Let a1, a2, a3 ∈ Ω, δ(a1) = N, δ(a2) =
Q and δ(a3) is equal to a set of all words made from
English letters of length maximum 30, then a set A =
{a1, a2, a3} is type.</p>
      <p>It is obvious that if A and B are types then A ∪ B,
A ∩ B and A − B are types.</p>
      <p>Definition 3 Let Υ is infinitive set of abstract elements
which we will call as objects. Let A ∈ TΩ is type. Let’s
denote DA = {Da = δ(a) : a ∈ A} (specially D∅ = ∅).
A tuple of type A is pair hk, li, where k ∈ Υ and l is
such mapping l : A → DA that (∀a ∈ A)(l(a) ∈ Da).
A set of all tuples of type A will be denoted as 1A. A set
of all tuples of type {a} (a ∈ Ω) will be denoted as 1a.</p>
      <p>A tuple of type A is intuitively a representation of one
row of data table. We can observe that (∀a ∈ Ω)(1a 6= ∅)
and (∀A ∈ TΩ)(1A 6= ∅) because definitions 1 and 3 say
that Da 6= ∅, DA 6= ∅.</p>
      <p>k, {ha1, 1i , ha2, 0, 25i , ha3, Jamesi}
Example 4 Consider type A from example 2. Then u =
D E
where k ∈ Υ
is a tuple of type A.</p>
      <sec id="sec-4-1">
        <title>Definition 5 (typed relation) Let A ∈ TΩ be type. A</title>
        <p>relation R of type A is any finite subset of the set 1A.
Symbol RA denotes a set of all relations of type A.
Similarly, a set of all relations of type {a} where a ∈ Ω is
denoted with symbol Ra.</p>
        <p>As one can see, the notion of typed relations is
simply a representation of data table. Now, we introduce
definitions of the important relational operations called
selection and projection.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Definition 6 (selection) Let A ∈ TΩ and let R ∈ RA.</title>
        <p>A selection from relation R of type A according to
condition C is relation</p>
        <p>R(C) = {u : u ∈ R ∧ C(u)}
of type A, too. The notation C(u) denotes a selection
condition and it constitutes the fact that condition C
holds on tuple u.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Definition 7 (projection) Let A ∈ TΩ, B ⊆ A and let</title>
        <p>R ∈ RA. A projection of relation R to the type B is
relation</p>
        <p>R[B] = n
u = hku, lui ∈ 1B :
∃v = hku, lvi ∈ R
∀b ∈ B
lu(b) = lv(b) o
(16)
of type B.</p>
        <p>Example 8 As an example of relation R of type A (see
example 2) we can take table 1(a) (ki on the table are
objects, that is, ki ∈ Υ). The selection (a1 &gt; 6) on that
relation is presented in table 1(b). Projection [a3] one can
see in table 1(c).</p>
        <p>Please note that projection is defined as to hold
duplicities. Why this is a matter see section 3.
4.2</p>
        <sec id="sec-4-3-1">
          <title>Modelling relationships</title>
          <p>With these basic definitions we can go forth to define
general notion of a predicate of relationship. Predicate of
relationship is simply a mapping that assigns truth value
to relations given as arguments. Since we are building
probabilistic logic, the truth value will be a probability.
Concretely, truth value will be defined as a pair of values
hl, hi where 0 ≤ l ≤ h ≤ 1 that form an interval of
probability (l is its lower bound and h its upper bound).
Definition 9 Probability interval is a pair i = hl, hi,
where l, h ∈ R and 0 ≤ l ≤ h ≤ 1. A set V of truth
values is a set of any probability interval i. By notation
q ∈0 i = hl, hi where q ∈ R we mean l ≤ q ≤ h.
Definition 10 Let A1, A2, . . . , An ∈ TΩ and let V be a
set of truth values then n-ary relationship predicate is a
mapping p? : Dp? → V where Dp? ⊆ RA1 × RA2 ×
. . . × RAn is a set called domain of relationship
predicate p?.</p>
          <p>So, relationship predicate is a mapping that assigns
to certain relations a truth value. It is obvious, we can
model various relationships that way. The definition
presented above assumes the predicate to result in the
interval of probability. However, we can modify the
definitions to suit classical two-valued logic.
5</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Intermezzo: Simultaneous statistical inference and data mining</title>
      <p>Before going further we should do a little discussion on
data-mining from the statistician’s point of view.</p>
      <p>Nowadays usual practice is to use statistical tests in
association rules mining process. Statistical tests are
based on hypothesis testing. One formulates a zero
hypothesis H0 and appropriate alternative hypothesis HA
such that the simultaneous turning up of H0 and HA is
impossible; that is, P (H0 ∩ HA) = 0 where P states
probability. Hypotheses are constructed as to HA be the
event whose validity we want to be convinced of. The
appropriate statistical test then computes the
probability α at which the zero hypothesis H0 is valid. If this
probability is very small (α &lt; 0, 05, α &lt; 0, 01 or even
α &lt; 0, 001) we are almost sure of H0’s invalidity. In the
other words, we know at least with 1 − α certainty that
HA is valid.</p>
      <p>Please take into account the following important note:
when α isn’t small, we aren’t sure of H0’s validity; we
simply know nothing – vide e.g. [10].</p>
      <p>Thus, stating whether a rule is important or not is
done by the certain test of hypotheses. The importance
of the rule is then derived from the level of significance
at which the zero hypothesis can be rejected – that is,
the truth value of the rule is simply 1 − α. One can see
that the same approach is enabled in this work; we use
interval of probability as the rule’s truth value.</p>
      <p>However, when performing many tests of hypotheses
at the time and rejecting many H0 hypotheses at very
small level of significance α, the resultant probability of
validity is greater than α. In fact, the probability of
erroneous reject of at least one H0 tends rapidly to 1.</p>
      <p>This phenomenon is often called as simultaneous
statistical inference (see [10] etc.). We can explain it using
Bonferroni’s theorem (vide [11]) which sounds as
follows.</p>
      <p>Consider random events A1, A2, . . . , An. Each of
them has got some certain probability P (Ai).
Bonferroni’s theorem says that for the probability of the event
A1 ∩A2 ∩. . .∩An (the simultaneous occurrence of every
event Ai) holds the following:
n
P (A1 ∩ . . . ∩ An) ≤ max 0, 1 − X 1 − P (Ai)
!</p>
      <p>.
(17)
Take Ai to be the validity of any i-th HA. So P (Ai) =
1 − αi, where α is level of significance at which we
have rejected corresponding zero hypothesis H0. Thus
the probability that every H0 was rejected rightfully is
i=1
P (A1 ∩ . . . ∩ An) ≤ max 0, 1 −
n !
X(αi) . (18)
i=1
The sum tends fast to 1 so the right side gets to zero very
quickly. For example, when performing 20 tests and
rejecting zero hypotheses at α = 0, 05 in each test, the
overall probability of at least one unauthorized rejection
is greater or equal to 20 · 0, 05 = 1.</p>
      <p>What is the consquence? In contrast to common
statistical analyses, data-mining results can not be treated
as natural law or undisputed truth. Data-mining gives
answers to the questions of type “what maybe holds in
the area described by my data?” E.g. work [7] proposes
to understand the association rules as enumeration of
hypotheses that are supported with data or as hypotheses
the data tend to.
6</p>
    </sec>
    <sec id="sec-6">
      <title>Advanced definitions</title>
      <p>Although we have vindicate the possibility to ignore
simultaneous statistical inference (vide section 5),
sometimes it is useful to prevent it even in data-mining results.
The concrete task we are accually working on is the
situation when some mined rules belong somehow together.</p>
      <p>For example, let attribute A be categorical with
domain δ(A) = {1, 2, 3, 4} and let attribute B be
quantitative, that is δ(B) = R. Consider the following rules:
R(A = 1)[B] &gt;Z–test R(A = 2)[B],
R(A = 2)[B] &lt;Z–test R(A = 4)[B],</p>
      <p>R(A = 3)[B] &lt;Z–test R(A = 4)[B].</p>
      <p>These rules logically belong together because they
express strong differences in attribute B between objects
that differ in attribute A. We are working on a method
that treats such rule sets specially and since we manage
a set of rules as one complicated rule, it is useful for us
to know the overall p–value. Videlicet, we are interested
what is the probability of the truthfulness of the rule</p>
      <p>R(A = 1)[B] &gt;Z–test R(A = 2)[B] ∧
∧ R(A = 2)[B] &lt;Z–test R(A = 4)[B] ∧
∧ R(A = 3)[B] &lt;Z–test R(A = 4)[B] .
6.1</p>
      <sec id="sec-6-1">
        <title>Probabilistic connectives</title>
        <p>In this sub-section we will define logical connectives
able to work with probability intervals as presented in
definition 9.</p>
        <p>Definition 11 Let V be a set of truth values.
Probabilistic connective is a mapping ι? : V n → V where n is
arity of probabilistic connective ι?.</p>
        <p>Definition 12 Let V be a set of truth values.
Probabilistic negation is unary probabilistic connective ¬? : V →
V such that ¬?(hl, hi) = h1 − h, 1 − li .</p>
        <p>We can see that the traditional feature of negations
holds for our probabilistic negation too:
(19)
(20)
(21)
(22)
¬? ¬?(i) = i.
(23)</p>
        <p>Definitions of probabilistic conjunctions and
disjunctions will be based on the commonly known notations of
t-norm and t-conorm.</p>
        <p>Definition 13 Let V be a set of truth values. Triangular
norm (t-norm) is any binary probabilistic connective T :
V × V → V where ∀x, y, z ∈ V holds:
1. T (x, y) = T (y, x);
(commutativity)
2. T x, T (y, z) = T T (x, y), z ;
(asociativity)
3. if min(x) ≤ min(y) then</p>
        <p>min T (x, z) ≤ min T (y, z)
and if max(x) ≤ max(y) then
max T (x, z) ≤ max T (y, z) ;
(monotony)
4. T h0, 0i , x = h0, 0i and T h1, 1i , x = x.
(restriction)
Definition 14 Let V be a set of truth values. Triangular
co-norm (t-conorm) is such binary probabilistic
connective S : V × V → V where ∀x, y ∈ V holds:
S(x, y) = ¬? T ¬?(x), ¬?(y)
(24)
and T is t-norm.</p>
        <p>Concrete t-norms are interpreted as probabilistic
conjunctions and t-conorms as probabilistic disjunctions.
We can easily check the validity of the subsequent
equation:</p>
        <p>T (x, y) = ¬? S ¬?(x), ¬?(y) .
(25)</p>
        <p>Specific t-norm and t-conorm is defined accordingly
to Bonferroni’s theorem about probability of random
event conjunction (see [11]).</p>
        <p>Definition 15 Let i1, i2 ∈ V and i1 = hl1, h1i, i2 =
?
hl2, h2i. Binary probabilistic connective ∧safe defined
as
∧s?afe(i1, i2) = D max(0, l1 + l2 − 1), min(h1, h2)E
is called as safe conjunction.</p>
        <p>Binary probabilistic connective ∨s?afe defined as
∨s?afe(i1, i2) = D max(l1, l2), min(1, h1 +h2)E (27)
(26)
is called as safe disjunction.
?
Theorem 16 Safe conjunction ∧safe is t-norm and safe
? ?
disjunction ∨safe is t-conorm of ∧safe.</p>
        <p>Proof. Evident.</p>
        <p>With connectives defined in this sub-section we can
even join various relationship predicates together.
7</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Probabilistic Logic of Typed Relations</title>
      <p>Now we have defined all notions necessary for creation
of language of Probabilistic Logic of Typed Relations
(PLTR).</p>
      <p>Definition 17 (alphabet) The PLTR language alphabet
consists from the following symbols:
• Symbols for typed relations: R, S, T , . . .
• Symbols for attributes: a, b, c, . . .
• Logical connectives for creation of selection
conditions: ¬, ∧, ∨, ⇒, ⇔
• Relational operations: −, ∪, ∩
• Symbols of n-ary relationship predicates: p?, q?,
r?, . . . , ≤?, . . .
• Probabilistic logical connectives: ¬?, ∧?, ∨?
• Auxiliary symbols (parentheses, brackets): (, ), [, ]</p>
      <sec id="sec-7-1">
        <title>Definition 18 (language) Language of PLTR is a set</title>
        <p>J of formulae created from the alphabet of the PLTR
language accordingly to the following conditions. Let
A1, A2, A3, A, B ∈ TΩ be types of relations and let
A ⊆ B. We define:
• selection condition of type A:
1. Any symbol a ∈ A of attribute’s name is a
selection condition of type A.
2. If C1, C2 are selection conditions of type A
then also ¬C1, (C1 ∨ C2), (C1 ∧ C2), (C1 ⇒
C2) and (C1 ⇔ C2) are selection conditions
of type A.
3. There is no other selection condition of type</p>
        <p>A.
• term of type A:
1. Symbols of relations of type A are terms of
type A.
2. If T1, T2 are terms of type A then also (T1 ∪
T2), (T1 ∩ T2) and (T1 − T2) are terms of type
A.
3. If T is term of type A and C is selection
condition of type A then also T (C) is term of type
A.
4. If A = {a1, a2, . . . , ak} ⊆ B and T is term of
type B then T [a1, a2, . . . , ak] is term of type
A.</p>
        <p>5. There is no other term of type A.
• formula PLTR:
1. If T1, T2, . . . , Tn are terms of type A1, A2,
. . . , An and p? is n-ary relationship predicate
then p?(T1, T2, . . . , Tn) is formula.
2. If F1, F2 are formulae PLTR then also ¬?F1,
(F1 ∨? F2) and (F1 ∧? F2) are formulae PLTR.
3. There is no other formula PLTR.</p>
        <p>Definition 19 (model) Model of PLTR is a pair M =
hQ, gi where Q is a set of typed relations R1, R2, . . . , Rn
of types A1, A2, . . . , An and g is a mapping assigning
an element from Q to every symbol of typed relations. A
set of all models is denoted M.</p>
        <p>The model definition assumes the relationship
predicates to have fixed interpretations all the time. (That is,
the relationship predicates interpretations are the same
for all models.)
Definition 20 Let V be a set of truth values (that is, V is
a set of all probability intervals). The evaluating function
V al is a function which maps formula F and model M =
hQ, gi to the interval of probability. That is, V al : J ×
M → V . Domain of function V al depends on domains
of relationship predicates used in formula F .</p>
        <p>The receipt to get the value of function V al is as
follows. Let M = hQ, gi be a model of PLTR then:
1. One must find out values of all terms recursively.
(For that case let us introduce a function V al0 for
term evaluation):
(a) Let R be the symbol of typed relation then</p>
        <p>V al0(R) = g(R).
(b) Let R, S be the terms then</p>
        <p>V al0(R ∪ S) =
V al0(R ∩ S) =
V al0(R − S) =</p>
        <p>V al0(R) ∪ V al0(S),
V al0(R) ∩ V al0(S),</p>
        <p>V al0(R) − V al0(S).
(c) Let T be the term and a1, . . . , an ∈ Ω then</p>
        <p>V al0(T [a1, . . . , an]) = V al0(T )[a1, . . . , an].
(d) Let T be the term and C be the selection
condition then</p>
        <p>V al0 T (C) = V al0(T ) (C).
2. One must figure out the formula’s truth value
recursively:
(a) Let T1, . . . , Tn be the terms and let p? be
the n-ary relationship predicate (defined on
hV al0(T1), . . . , V al0(Tn)i) then</p>
        <p>V al p?(T1, . . . , Tn) =
= p? V al0(T1), . . . , V al0(Tn) .
(b) Let F, G be formulae then</p>
        <p>V al ¬?F
V al(F ∧? G)
V al(F ∨? G)
= ¬? V al(F ) ,
= ∧? V al(F ), V al(G) ,
= ∨? V al(F ), V al(G) .</p>
        <p>One can find many well formed formulae of PLTR
language in section 3.
8</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Semantic system of PLTR</title>
      <p>This section provides definitions of logical followings
which are necessary for writing down considerations
about PLTR formulae. We start with definition taken
from book [7]:
Definition 21 A semantic system is determined by an
non-empty set Sent of formulae, a non-empty set M of
models, a non-empty set V of truth values and an
evaluating function V al : (Sent × M) → V .</p>
      <p>We need one more definition from book [7]:
Definition 22 Let S = hSent, M, V, V ali be a
semantic system and let V0 ⊆ V be a set of designated values.
• A formula ϕ ∈ Sent is V0-true in a model M
(notation M |=V0 ϕ) if</p>
      <p>V al(ϕ, M ) ∈ V0.
• A formula ϕ ∈ Sent is V0-tautology (notation |=V0
ϕ) if it is V0-true for each model M ∈ M such that
the formula ϕ is defined on it.
• Formula ϕ ∈ Sent is a logical V0-consequence of
a set Ψ ⊆ Sent of formulae (notation Ψ |=V0 ϕ) if
∀M ∈ M the following holds:
(∀ψ ∈ Ψ)(M |=V0 ψ)
⇒ (M |=V0 ϕ).
• Formulae ϕ ∈ Sent and ψ ∈ Sent are logically
V0-equivalent (notation ϕ ≡V0 ψ) if</p>
      <p>(ϕ |=V0 ψ) ∧ (ψ |=V0 ϕ).</p>
      <p>For new we introduce even stronger concepts:
Definition 23 Let S = hSent, M, V, V ali be a
semantic system.</p>
      <p>• Formula ϕ ∈ Sent is logical consequence of a set
Ψ ⊆ Sent of formulae (notation Ψ |= ϕ) if
• Formulae ϕ ∈ Sent and ψ ∈ Sent are logically
equivalent (notation ϕ ≡ ψ) if
(∀V0 ⊆ V )(Ψ |=V0 ϕ).</p>
      <p>(∀V0 ⊆ V )(ϕ ≡V0 ψ).</p>
      <p>Example 24 Here are examples of some simple logical
equivalences:
p? R(C1) ∪ R(C2)
p? R(C1) ∩ R(C2)
p? R(C1) − R(C2)
≡ p? R(C1 ∨ C2) ,
≡ p? R(C1 ∧ C2) ,
≡ p? R(C1 ∧ ¬C2) .
9</p>
    </sec>
    <sec id="sec-9">
      <title>Related work</title>
      <p>Association rules have been well researched. First
mention about them is dated in sixtieth years of the twentieth
century. Czech authors Ha´jek, Havel and Chytil in 1966
presented the work [6] about automated hypothesis
testing in method called GUHA. Their work appeard long
time before concepts such as data mining or knowledge
discovery becomes familiar. Although their work was
presented many times in the whole world, many
people interested in data mining often erroneously present
Agrawal, Imielinski and Swami ([1]) as the first people
who tried to mine association rules. (See e.g. [9] on page
276.)</p>
      <p>Ha´jek and Havra´nek have written the book [7] where
the concepts of the automatized hypotheses generation
were very well described. They based the thinking of
the rules as of formulae interconnected with generalized
quantifiers.</p>
      <p>Generalized quantifiers is natural generalization of
classical quantifiers ∀ (universal) and ∃ (existential). For
example, Rescher’s (1962) plurality quantifier W says
that “most objects satisfy the formula”. The second
example, the Church’s (1951) quantifier of implication (not
to be confused with the logicall connective of
implication) ⇒o (ϕ1(o), ϕ2(o)) says that the ϕ2 formula is true
for all objects for which the formula ϕ1 is true.</p>
      <p>Authors of the GUHA method have introduced many
such generalized quantifiers that model various
relationships.</p>
      <p>This work proposes the other look at association rules.
We don’t treat association rules as formulae
interconnected with quantifiers but rather as pieces of data
described with relational operations interconnected with
predicates. The difference is in the level of logical
notions where the knowledge is presented. GUHA uses
predicates simply to denote attributes and it models
relationships with quantifiers. We hide the fashion of
describing the objects figuring in a rule in functional
symbols and the relationship modelling arises in predicates,
already.
10</p>
    </sec>
    <sec id="sec-10">
      <title>Conclusion and future work</title>
      <p>
        In this paper, we have presented summary overview of
the best-known association rule types. We have designed
a new language usable for rule representations at that
basis and we have shown many examples of how to use this
language (see also our work [4]). The usage of the
presented language has directly led to some new types of
association rules (vide rule (
        <xref ref-type="bibr" rid="ref13">13</xref>
        )).
      </p>
      <p>Please note that our primary goal of the language
design was to create logic suitable for our further research
on similar properties of various association rules. We
confess the fact that our language may not be ideal
instrument for rule representations to the end-user (analyst).</p>
      <p>Our future work is focused to the study of properties
of so-called cosymmetric rules. Actually, we don’t have
precise definition of what a cosymmetric rule is. (Some
notes may be found in [5].) Intuitively, we can say that
cosymmetric rule is a rule of the form</p>
      <p>
        D(C1)[A1, . . . , An] &gt;? D(C2)[B1, . . . , Bm]
(28)
where two sub-tables are compared and the mapping &gt;?
tells us whether the first sub-table is somehow greater
than the second one. For example, (
        <xref ref-type="bibr" rid="ref11">11</xref>
        ) is a typical
member of cosymmetric rule class. It comes to light that the
class of cosymmetric rules is very large. Our research
shows that even the rules (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ), (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ), (
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) could be treated as
cosymmetric.
      </p>
      <p>We are also working on fast algorithms for mining the
cosymmetric rules. See [3] for the implementation of the
generic framework for mining frequent conjunctions.</p>
      <p>In the centre of interest stands the problem of
visualization of cosymmetric rules – the usage of
conceptual lattices for such purposes is elaborated. We also
study the benefits of displaying the cosymmetric rules as
quasi–ordered set in slightly modified Hasse’s diagram.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Rakesh</given-names>
            <surname>Agrawal</surname>
          </string-name>
          , Tomasz Imielinski, and
          <string-name>
            <given-names>Arun</given-names>
            <surname>Swami</surname>
          </string-name>
          .
          <article-title>Mining associations between sets of items in massive databases</article-title>
          .
          <source>In ACM SIGMOD 1993 Int. Conference on Management of Data</source>
          , pages
          <fpage>207</fpage>
          -
          <lpage>216</lpage>
          , Washington D.C.,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Yonatan</given-names>
            <surname>Aumann</surname>
          </string-name>
          and
          <string-name>
            <given-names>Yehuda</given-names>
            <surname>Lindell</surname>
          </string-name>
          .
          <article-title>A statistical theory for quantitative association rules</article-title>
          .
          <source>In Knowledge Discovery and Data Mining</source>
          , pages
          <fpage>261</fpage>
          -
          <lpage>270</lpage>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Michal</given-names>
            <surname>Burda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Hynar</surname>
          </string-name>
          , and
          <article-title>Jana Sˇ armanova´</article-title>
          .
          <article-title>Object-oriented interface to algorithms searching frequent conjunctions</article-title>
          .
          <source>In ISIM, Hradec nad Moravic´ı</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Michal</given-names>
            <surname>Burda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Hynar</surname>
          </string-name>
          , and
          <article-title>Jana Sˇ armanova´</article-title>
          .
          <article-title>Probabilistic logic of typed relations (in czech)</article-title>
          .
          <source>In Znalosti, poster proceedings</source>
          , pages
          <fpage>13</fpage>
          -
          <lpage>16</lpage>
          , Vysoke´ Tatry,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Michal</given-names>
            <surname>Burda</surname>
          </string-name>
          , Marian Mindek, and
          <article-title>Jana Sˇ armanova´</article-title>
          .
          <article-title>Characteristics of cosymmetric association rules</article-title>
          .
          <source>In Dateso</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Petr</given-names>
            <surname>Ha</surname>
          </string-name>
          <article-title>´jek, Ivan Havel, and Metodeˇj Chytil. The GUHA method of automatic hypotheses determination</article-title>
          .
          <source>In Computing 1</source>
          , pages
          <fpage>293</fpage>
          -
          <lpage>308</lpage>
          ,
          <year>1966</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Petr</given-names>
            <surname>Ha</surname>
          </string-name>
          <article-title>´jek and Toma´sˇ Havra´nek</article-title>
          .
          <source>Mechanizing Hypothesis Formation</source>
          . Springer-Verlag, Berlin,
          <year>1978</year>
          . Internet: http://www.cs.cas.cz/ ~hajek/guhabook/ (May
          <year>2004</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Petr</given-names>
            <surname>Ha</surname>
          </string-name>
          ´jek, Toma´sˇ Havra´nek, and
          <string-name>
            <given-names>M. K.</given-names>
            <surname>Chytil</surname>
          </string-name>
          .
          <article-title>The GUHA method - automatized hypothesis creation (in czech)</article-title>
          .
          <source>Academia, Praha</source>
          ,
          <year>1983</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Jiawei</given-names>
            <surname>Han</surname>
          </string-name>
          and
          <string-name>
            <given-names>Micheline</given-names>
            <surname>Kamber</surname>
          </string-name>
          .
          <article-title>Data Mining: Concepts and Techniques</article-title>
          . Morgan Kaufmann Publishers, USA,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <article-title>Toma´sˇ Havra´nek. Statistics for biological and medical sciences (in czech)</article-title>
          .
          <source>Academia, Praha</source>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Petr</given-names>
            <surname>Heba</surname>
          </string-name>
          <article-title>´k and Jirˇ´ı Hustopecky´. Multi-dimensional statistical methods and its applications (in czech)</article-title>
          .
          <source>SNTL, Praha</source>
          ,
          <year>1987</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <article-title>Toma´sˇ Karban</article-title>
          .
          <article-title>SDS-rules (in czech)</article-title>
          .
          <source>In Znalosti, poster proceedings</source>
          , pages
          <fpage>17</fpage>
          -
          <lpage>20</lpage>
          , Brno,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <article-title>Toma´sˇ Karban and Milan Sˇimu˚nek. SDS-rules and association rules</article-title>
          .
          <source>In Track on Data Mining (DM)</source>
          ,
          <source>ACM Symposium on Applied Computing</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Wendy</surname>
            <given-names>MacCaull</given-names>
          </string-name>
          and
          <string-name>
            <given-names>Ewa</given-names>
            <surname>Orłowska</surname>
          </string-name>
          .
          <article-title>A calculus of typed relations</article-title>
          .
          <source>In RelMiCS/Kleene-Algebra Ws</source>
          <year>2003</year>
          , LNCS
          <volume>3051</volume>
          , pages
          <fpage>191</fpage>
          -
          <lpage>201</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Jan</given-names>
            <surname>Rauch</surname>
          </string-name>
          .
          <article-title>Association rules and mathematical logic (in czech)</article-title>
          .
          <source>In Znalosti</source>
          , pages
          <fpage>114</fpage>
          -
          <lpage>125</lpage>
          , Brno,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>