<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On Modal Logic Association Rule Mining</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mauro Milella</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Emilio Muñoz-Velasco</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Pagliarini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Paradiso</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Guido Sciavicco</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ionel Eduard Stan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Ferrara</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Málaga</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Parma</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The most iconic duality of machine learning models is symbolic learning versus functional learning. While functional learning is based on a numerical approach to knowledge extraction and modelling, the purpose of symbolic machine learning is to extract knowledge from data in such a form that it can be understood, discussed, modified, and applied by humans, as well as serve as the basis of artificial intelligence applications. The typical problems associated with machine learning are classification and rule extraction; while classification can be dealt with using both functional and symbolic learning, rule extraction is essentially symbolic. One element common to nearly all definitions and tools for rule extraction is that they are applied to static datasets and based on propositional logic; unfortunately, very often real-world applications give rise to non-static sets of data (e.g., temporal, spatial, graph-based data) and may require more-than-propositional expressive power. In order to extract association rules from non-static data, in this paper we propose a definition of modal association rules based on modal logic, and we study how a standard rule extraction algorithm such as APRIORI can be generalized to the modal case while keeping the properties of the canonical, non-modal case, namely, correctness and completeness.</p>
      </abstract>
      <kwd-group>
        <kwd>Modal logic</kwd>
        <kwd>Association rule mining</kwd>
        <kwd>Modal symbolic learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The most iconic and fundamental separation between the sub-fields of machine learning is the
one between functional and symbolic learning. Functional learning is the process of learning a
function that represents the theory underlying a certain phenomenon. Symbolic learning, on
the other hand, is the process of learning a logical description that represents that phenomenon.
Whether one or the other approach should be preferred has raised a long-standing debate among
experts, which is rooted in the fact that functional methods tend to be more versatile and statistically
accurate than symbolic ones, while symbolic methods are able to extract models that can be
interpreted, explained, and then enhanced using human-expert knowledge. These characteristics
of symbolic methods, both for political reasons (consider, for example, the recent General
Data Protection Regulation (GDPR) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] of the European Union, which highlights the need for
interpretable/explainable automatic learning-based decision-making processes, including those
involving AI technologies) and technical ones (interpretable models are often easier to train,
explore, integrate, and implement), are sometimes used as arguments for preferring a symbolic
approach over a functional one. Classification
and rule extraction are, among others, typical
problems of machine learning; while classification can be dealt with using both functional and
symbolic learning, rule extraction is essentially symbolic.
      </p>
      <p>
        From a logical standpoint, canonical symbolic learning methods are mostly characterized by
being based on propositional logic, and by being designed for static data, in which every instance
is described by the value of n attributes. Non-static data, such as dimensional (temporal, spatial,
spatio-temporal) data, graph-based data, or textual data, however, cannot be successfully dealt
with within the same schema in a native way. The standard approach is to pre-process such
data, so that their aspect becomes static again (e.g., by averaging the value of attributes along
all dimensions), so that off-the-shelf static symbolic methods can be used. Modal symbolic
learning [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ] takes a different point of view: modal symbolic methods are characterized by
being based on modal logic (e.g., temporal, spatial logic) so that non-static data can be dealt
with natively, and that the extracted knowledge takes the form of interpretable modal logic
formulas. Modal symbolic learning has been mainly focused on classification; here, we consider
the case of association rule extraction. At the propositional level, there are two main approaches
to association rule extraction, or, more particularly, to its key step, frequent sets extraction,
namely APRIORI [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and FP-Growth [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In this paper, we consider the most basic one, APRIORI,
and we generalize it to the modal level. A first approach for extracting interval temporal logic
association rules, which can be considered a particular case of this work, has been proposed in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>This paper is organized as follows. First, in Section 2, we lay down some basic preliminaries,
and we define the concept of modal datasets. Then, in Section 3, we define modal association
rules from a logical viewpoint, and in Section 4 we extend APRIORI to extract modal frequent
sets and we study its properties, namely, correctness and completeness. In Section 5 we give
examples of possible applications of modal association rules, before concluding.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Preliminaries</title>
      <p>Given a set of propositional letters P, the well-formed formulas of the propositional modal logic
(ML) are obtained by the following grammar:</p>
      <p>φ ::= p | ¬φ | φ ∧ φ | ♢φ,
where the remaining classic Boolean operators can be obtained as shortcuts. In what follows,
we use □φ to denote ¬♢¬φ, and ≡ to denote logical equivalence. The modality ♢ (resp., □)
is usually referred to as it is possible that (resp., it is necessary that). Modal logic is considered
the archetype of temporal, spatial, and spatio-temporal logics, and it is an extension of propositional
logic (PL). Its semantics is given in terms of Kripke models. A (finite) Kripke model K =
(W, R, V) is composed of a finite set of worlds W (which contains a distinguished world w0,
called the initial world), a binary accessibility relation R ⊆ W × W, and a valuation function
V : W → 2^P, which associates each world with the set of propositional letters that are true on
it. The truth relation K, w ⊩ φ for a generic model K and a generic world w in that model is
given by the following clauses:</p>
      <p>K, w ⊩ p     iff p ∈ V(w);
K, w ⊩ ¬φ    iff K, w ⊮ φ;
K, w ⊩ φ ∧ ψ iff K, w ⊩ φ and K, w ⊩ ψ;
K, w ⊩ ♢φ    iff there exists v ∈ W s.t. w R v and K, v ⊩ φ.</p>
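      <p>As an illustration of these clauses, the following minimal sketch evaluates formulas over a finite Kripke model (the representation and function names are ours, not from the paper; □ is obtained as ¬♢¬, as noted above).</p>

```python
def holds(rel, val, w, phi):
    """K, w ⊩ phi, with K = (W, R, V) given as rel (a set of pairs) and
    val (a dict mapping each world to its set of true letters).

    Formulas: a letter "p", ("not", f), ("and", f, g), ("dia", f);
    ("box", f) is sugar for ¬♢¬f.
    """
    if isinstance(phi, str):                      # propositional letter
        return phi in val[w]
    op = phi[0]
    if op == "not":
        return not holds(rel, val, w, phi[1])
    if op == "and":
        return holds(rel, val, w, phi[1]) and holds(rel, val, w, phi[2])
    if op == "box":                               # □f ≡ ¬♢¬f
        return holds(rel, val, w, ("not", ("dia", ("not", phi[1]))))
    # ♢f: f holds at some accessible world
    return any(holds(rel, val, v, phi[1]) for (u, v) in rel if u == w)

# A two-world model: world 0 sees world 1, and p is true only at 1.
rel, val = {(0, 1)}, {0: set(), 1: {"p"}}
```

      <p>In this model, ♢p holds at world 0 (its successor satisfies p), while p itself does not.</p>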
      <p>
        Most classic temporal [
        <xref ref-type="bibr" rid="ref10 ref8 ref9">8, 9, 10</xref>
        ] and spatial logics [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ] are specializations of the uni-modal
logic with more than one accessibility relation (possibly of higher arity), and associated
modalities, subject to constraints that range from very simple and intuitive ones (e.g., transitivity,
antisymmetry) to very complex ones (e.g., when worlds are assumed to be intervals in time and
modalities are assumed to mimic relations between intervals).
      </p>
      <p>Definition 1 (modal dataset). Given a set of propositional letters P, a modal dataset ℐ =
{K1, . . . , Km} is a finite collection of m instances, each of which is a finite Kripke model Ki =
(Wi, Ri, Vi), for each i = 1, . . . , m.</p>
      <p>The above definition generalizes the conventional notion of propositional, or static, dataset, in
which each of the m instances is associated with the value of n distinct numerical attributes. An
example of a modal dataset is illustrated in Fig. 1. In machine learning, several problems are
associated with a dataset ℐ. The most common one is the classification/regression problem; here,
however, we are interested in the equally common learning problem known as association rule
extraction: in a generic dataset, it is the problem of extracting meaningful associations among
the values of the attributes.</p>
      <p>In the following, we first define the concept of meaningful association rule, and then we
focus our attention to the problem of extracting (all) meaningful rules from a modal dataset.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Modal Logic Association Rules</title>
      <p>
        Rule-based methods are ubiquitous in modern machine learning and data mining
applications [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]; they aim at learning regularities in data which can be expressed in the form of if-then
rules. In this paper, we are interested in descriptive (or association) rules, which collectively cover
the instances’ space.
      </p>
      <p>
        Let A be an alphabet on ℐ. In its simplest form, propositional association rules are objects of
the type [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]:
      </p>
      <p>
        ρ : l1 ∧ . . . ∧ lk ⇒ lk+1, (1)
where l1, . . . , lk, lk+1 ∈ A are positive propositional literals (or simply propositional literals);
recalling the standard nomenclature in the association rule mining literature, propositional
literals are also known as items [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. We use the notation:
      </p>
      <p>
        ρ : X ⇒ Y, (2)
where X = {l1, . . . , lk} is called the antecedent, Y = {lk+1} is called the consequent, and X ∩ Y = ∅.
Association analysis must be interpreted with caution: causality is not implied in the inference
process by an association rule, as it simply represents a strong co-occurrence relationship of
the literals in the antecedent with those in the consequent [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Therefore, to emphasize the
difference with respect to →, which is a logical implication, in (association) rules we use the
symbol ⇒. In this sense, ρ is not a formula of propositional logic. When treated like one,
however, it becomes a formula that belongs to a well-known subset of propositional logic, that
is, Horn propositional logic; this usually happens after the extraction phase, when rules are
no longer at risk of not being meaningful. Horn propositional logic is interesting from the
deductive point of view (its satisfiability problem is P-complete), which becomes relevant when
data-driven knowledge is mixed, in some sense, with expert, top-down knowledge in intelligent
applications. Non-empty subsets of A are also called propositional patterns [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Moreover, the
length of a (propositional) pattern X is the set-theoretic cardinality of X. In this paper we focus
only on Horn-like association rules; note, however, that in other settings the consequent Y
of a rule ρ : X ⇒ Y could be a set with more than one literal, that is, it could have k′ literals
Y = {lk+1, . . . , lk+k′}, which, if interpreted as a disjunction, would result in a non-Horn
logic rule.
      </p>
      <p>In order to formalize the modal association rule mining problem, where instances of a modal
dataset are generic, finite Kripke models, we must first lift the concept of literals to the modal
case.</p>
      <p>Definition 2 (modal literals). For a fixed alphabet A, the set of positive modal literals (or simply
modal literals) Λ over A is defined by the grammar:</p>
      <p>λ ::= p | ♢p | □p,
where p ∈ A.</p>
      <p>Modal literals of the type p are called propositional, as in the propositional case, and modal
literals of the type ♢p (resp., □p) are called existential (resp., universal).</p>
      <p>Definition 3 (modal association rules). For a fixed set of modal literals Λ, modal association
rules are objects of the type:
ρ : λ1 ∧ . . . ∧ λk ⇒ λk+1, (3)
where λ1, . . . , λk, λk+1 ∈ Λ.</p>
      <p>
        Note that (3) has the same structure as (1), except that A has been replaced by Λ. Non-empty
subsets of Λ are called modal patterns, and we sometimes write a rule ρ in its set-theoretical
notation, that is, X ⇒ Y, where both X and Y are modal patterns and Y is a singleton.
Similarly to the propositional case, modal rules are not modal formulas; when treated like that,
they belong to the Horn fragment of modal logic, MLHorn. There are several, although very
similar, proposals for a Horn fragment of modal logic. Bresolin et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] studied MLHorn and
several of its sub-fragments from the deductive point of view; while the satisfiability problem
for MLHorn is still PSPACE-complete (as it is for full ML), sub-fragments of it, such as the
one with only universal and propositional literals, become P-complete.
      </p>
      <p>
        Not all rules are interesting or meaningful. As for the propositional case, in order to capture
the notion of meaningfulness in logical terms, in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] the authors implicitly defined a notion of a rule
holding on an instance, and then on a dataset. They did so by introducing two parameters,
namely support and confidence, which they used to modify the notion of truth of a literal, first,
and of a rule, then. After the first introduction of such concepts, and specifically of the concept
of confidence, alternative measures of meaningfulness have been proposed, among which we
mention lift and conviction (see, e.g., [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]). In the end, as we shall see, the key observation is
that their definitions are all consequences of the definition of support; our aim is to generalize
the latter to the modal case, so that all other measures can then be re-defined accordingly. If the
notion of propositional support was originally designed to capture the idea of how frequent
a literal (resp., a set of literals) was across the static dataset, we now have to capture the
additional idea of how frequent a modal literal (resp., a set of modal literals) is across a single
instance, as well. These two notions are only partially orthogonal: locally, we want to capture
the number of different worlds in which a certain phenomenon happens, in order to measure its
interestingness, while, globally, we want to capture the number of different instances in which
such an interesting phenomenon occurs. In the following, if X is a set of literals, K = (W, R, V)
an instance, and w ∈ W a world in K, we use K, w ⊩ X as a shortcut for K, w ⊩ ⋀ λ∈X λ.
      </p>
      <p>Definition 4 (support). Let ℐ be a modal dataset and X ⊆ Λ be a set of literals. The local
support of X on some instance K ∈ ℐ, with K = (W, R, V), is defined as:
s(K, X) = |{w ∈ W | K, w ⊩ X}| / |W|,
and, given a certain local support sl ∈ (0, 1] ⊂ R, the global support of X on ℐ relatively to sl is
defined as:
S(ℐ, X) = |{K ∈ ℐ | s(K, X) ≥ sl}| / |ℐ|.</p>
      <p>Similarly to the propositional case, we can define popular interestingness measures for modal
rules in such a way as to be (local and global) support-dependent.</p>
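      <p>Definition 4 translates directly into code. The following sketch uses a toy representation of our own (an instance is an accessibility relation plus a valuation; modal literals take the forms p, ("dia", p), and ("box", p)); the function names are illustrative, not from the paper.</p>

```python
def sat(rel, val, w, lit):
    """Truth of a single modal literal at world w of an instance."""
    if isinstance(lit, tuple):
        kind, p = lit
        succs = [v for (u, v) in rel if u == w]
        if kind == "dia":                          # ♢p: p true at some successor
            return any(p in val[v] for v in succs)
        return all(p in val[v] for v in succs)     # □p: p true at all successors
    return lit in val[w]                           # propositional literal

def local_support(inst, X):
    """s(K, X): fraction of worlds of K satisfying every literal in X."""
    rel, val = inst
    return sum(all(sat(rel, val, w, l) for l in X) for w in val) / len(val)

def global_support(dataset, X, sl):
    """S(I, X): fraction of instances whose local support of X reaches sl."""
    return sum(local_support(K, X) >= sl for K in dataset) / len(dataset)

# Two instances over worlds {0, 1}, each with world 0 seeing world 1.
K1 = ({(0, 1)}, {0: {"p"}, 1: {"p"}})
K2 = ({(0, 1)}, {0: set(), 1: {"p"}})
```

      <p>Here the local support of {p} is 1.0 on K1 and 0.5 on K2, so with sl = 0.75 only K1 counts and the global support of {p} is 0.5.</p>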
      <p>Definition 5 (confidence, lift). Let ℐ be a modal dataset, and ρ : X ⇒ Y a modal rule. The local
confidence of ρ on some instance K ∈ ℐ is defined as:
c(K, X ⇒ Y) = s(K, X ∪ Y) / s(K, X),
and, given a certain local support sl ∈ (0, 1] ⊂ R, the global confidence of ρ on ℐ relatively to sl
is defined as:
C(ℐ, X ⇒ Y) = S(ℐ, X ∪ Y) / S(ℐ, X).</p>
      <p>Similarly, the local lift of ρ on some instance K ∈ ℐ is defined as:
l(K, X ⇒ Y) = s(K, X ∪ Y) / (s(K, X) · s(K, Y)),
and, given a certain local support sl ∈ (0, 1] ⊂ R, the global lift of ρ on ℐ relatively to sl is
defined as:
L(ℐ, X ⇒ Y) = S(ℐ, X ∪ Y) / (S(ℐ, X) · S(ℐ, Y)).</p>
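      <p>Being support-dependent, both measures reduce to ratios of support values; a schematic sketch, where the support function passed in is an illustrative stand-in for either the local or the global support of Definition 4 (names and numbers are ours):</p>

```python
def confidence(supp, X, Y):
    """c(X ⇒ Y) = supp(X ∪ Y) / supp(X), for any support function supp."""
    return supp(X | Y) / supp(X)

def lift(supp, X, Y):
    """lift(X ⇒ Y) = supp(X ∪ Y) / (supp(X) · supp(Y))."""
    return supp(X | Y) / (supp(X) * supp(Y))

# Illustrative supports: p holds in 3/4 of the worlds, q in 1/2, both in 1/2.
table = {frozenset({"p"}): 0.75, frozenset({"q"}): 0.5,
         frozenset({"p", "q"}): 0.5}
supp = lambda S: table[frozenset(S)]
```

      <p>With these numbers, the confidence of {p} ⇒ {q} is 0.5/0.75 ≈ 0.67, and its lift is 0.5/0.375 ≈ 1.33; a lift above 1 indicates positive dependence between antecedent and consequent.</p>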
      <p>Confidence measures how frequently items in Y appear in instances (in the global setting) or
worlds (in the local setting) that contain X. Similarly, lift measures the dependence between
X and Y. From the above examples, we can derive a general notion of support-dependent
interestingness measure; we say that a measure is local support-dependent (resp., global
support-dependent) if and only if it can be computed by a function of the local support (resp., global
support) of its set of worlds (resp., instances). We denote by m (resp., M) a generic
local (resp., global) support-dependent measure.</p>
      <p>Definition 6 (holding). Let ℐ be a modal dataset, ρ : X ⇒ Y a modal rule, θ a (q + 1)-vector
of values in (0, 1] ⊂ R (indexed from 0 to q), and let m1, . . . , mq be q local
support-dependent measures. We say that ρ θ-holds on K ∈ ℐ if and only if: (i) s(K, X ∪ Y) ≥ θ[0],
and (ii) for each 1 ≤ i ≤ q, mi(K, ρ) ≥ θ[i]. We denote such a situation by:
K ⊩θ ρ.
Similarly, given sl ∈ (0, 1] ⊂ R, if Θ is a (q + 1)-vector of values in (0, 1] ⊂ R (indexed
from 0 to q), and M1, . . . , Mq are q global support-dependent measures, we say that
ρ Θ-holds on ℐ if and only if: (i) S(ℐ, X ∪ Y) ≥ Θ[0], and (ii) for each 1 ≤ i ≤ q,
Mi(ℐ, ρ) ≥ Θ[i]. We denote such a situation by:
ℐ ⊩Θ ρ.</p>
      <p>As defined in the literature, the association rule mining problem has two parts: frequent
pattern mining and the extraction of interesting rules. Therefore, in both settings, local and global,
the first conditions, s(K, X ∪ Y) ≥ θ[0] and S(ℐ, X ∪ Y) ≥ Θ[0], enforce a rule
to be frequent in terms of its support, locally and globally, respectively. The remaining
conditions can be arbitrarily combined, but they cannot be separated from imposing a minimal
support. Observe that the notion of a rule holding on a single instance collapses to the standard
notion of truth when instances are propositional (that is, with a single world); similarly, in
the same case, the notion of a rule holding on a dataset collapses to the (implicit) notion of a
propositional rule holding on a dataset (driven by the classical support and confidence/lift/. . . ).
The definition of such generalized notions is not unique, as different constraints could be
enforced. However, not every combination makes sense. For example, forcing a rule to be
globally supported if and only if it is locally supported on every instance would imply requiring
perfect global support, which is rather restrictive and possibly non-adherent to the intended
use of rules.</p>
      <p>In the next section we shall focus on solving the following natural problem: given a modal
dataset ℐ and a (q + 1)-vector Θ of values in (0, 1] ⊂ R for some sl ∈ (0, 1] ⊂ R, extract (or
mine) all and only the modal rules that Θ-hold on ℐ, which we regard as interesting.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Mining Modal Logic Rules</title>
      <p>
        As a consequence of focusing on support-dependent interestingness measures only, it turns
out that extracting interesting rules can be achieved by simply extracting sets of literals with
a relatively high support, which in the classic terminology are called frequent sets. To see
this, consider a rule ρ : X ⇒ Y, and assume that we want to check whether it Θ-holds on a
certain dataset ℐ, where Θ[1] is the minimum requested global confidence. To this end, it must
hold that S(ℐ, X ∪ Y) ≥ Θ[0] and that C(ℐ, X ⇒ Y) ≥ Θ[1]. To compute the latter, one
has to compute, first, S(ℐ, X ∪ Y) and S(ℐ, X), which are both by-products of
computing the frequent sets. Therefore, being able to extract all frequent sets of literals is
necessary and sufficient to be able to extract all interesting rules as we have defined them. This
phenomenon occurred at the propositional level too, obviously; it is interesting to observe that
it still holds at the modal level by inheriting exactly the same procedure (see again, e.g., [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]).
      </p>
      <p>
        The problem of enumerating all sets of frequent literals has been studied in the literature [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ];
in general, enumerating frequent sets has several applications, not necessarily related to rule
extraction. The first key observation is the so-called, and well-known, APRIORI principle.
      </p>
      <p>
        Proposition 1 (APRIORI principle [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]). If a set of literals is frequent, then all of its subsets must
be frequent.
      </p>
      <p>
        Naturally, there is a potentially exponential number of frequent sets in a given dataset ℐ, even
at the propositional level. Focusing on maximal frequent sets of literals does not change this,
and the complexity of the problem of extracting them therefore becomes interesting. As proved
in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], the counting problem associated with the problem of extracting all maximal frequent
sets is already #P-hard [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. In most cases, the interesting problems in machine learning are
computationally hard, and thus the go-to strategy is turning to approximation algorithms. This
is not the case with frequent set extraction: in many practical cases, indeed, there are only a
few maximal frequent sets in a given dataset, and enumerating them in a correct and complete
fashion is the typical approach. There are two classical approaches to this problem, called,
respectively, APRIORI [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and FP-Growth [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In the rest of this section, we consider the former,
and we study its modal generalization.
      </p>
      <p>ALGORITHM 1: Modal extension of the APRIORI algorithm for mining frequent modal patterns.</p>
      <p>APRIORI has been the first pruning-based algorithm for mining sets of propositional literals.
Borrowing the same concept from the propositional level, the global support models the (relative)
frequency of instances in ℐ that satisfy the modal pattern X, whereas the local support, in the
more general modal setting that we present here, models the frequency of worlds in K ∈ ℐ that
satisfy the modal pattern X. Observe that the anti-monotonic property (the APRIORI principle)
also holds for the local support, as longer modal patterns hold on fewer worlds, which is a
consequence of the logical conjunction in the definition of the truth K, w ⊩ X. The modal
extension of APRIORI is based on the following assumption, which also holds in the propositional
version of APRIORI: modal literals in modal patterns can be lexicographically ordered. The
reason behind such an assumption, which works in synergy with the APRIORI principle, is to
avoid useless computation during the learning routine. Algorithm 1 generalizes the APRIORI
algorithm to the modal case and takes as input a modal dataset ℐ, an alphabet A defined on ℐ,
three user-specified thresholds: sl, sg ∈ (0, 1] ⊂ R, and d ∈ N, for the minimum local support,
minimum global support, and maximum modal depth of modal literals, respectively. Setting a
maximum modal depth is a necessary step for the algorithm to work and terminate, as there is
no anti-monotonic property that holds for modal literals; it should be noticed, however, that
modal rules with too long modal literals would disfavour their interpretability. In Algorithm 1,
Fk and Ck denote the set of frequent modal patterns and the set of generated candidate
modal patterns of length k, for k ≥ 1, respectively. The algorithm starts by computing
the set of modal literals Λ from A and d. Then, it computes F1, whose patterns are the frequent
ones from C1. Next, the repeat-until loop of the algorithm iteratively computes Fk, with
k &gt; 1. The candidate generation procedure generates candidate modal patterns Ck by
joining (this is the original nomenclature) Fk−1 with itself: two lexicographically ordered frequent (k − 1)-length modal patterns
{λ1, . . . , λk−2, λk−1} and {λ1, . . . , λk−2, λ′k−1} of Fk−1 are merged together if they have the
same modal literals in the first k − 2 positions and if λk−1 &lt; λ′k−1, to form a candidate k-length
modal pattern {λ1, . . . , λk−2, λk−1, λ′k−1}. The candidates in Ck are then pruned based on the
APRIORI principle. It is important to stress the
fact that both the candidate generation and the pruning procedures are identical to those in the
canonical APRIORI algorithm. At this stage, the algorithm produces Fk, whose elements are
those of Ck which are frequent. Termination is guaranteed by the anti-monotonic property of the
support.</p>
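      <p>The generate-prune-filter loop described above can be sketched as follows. This is a simplified rendering of Algorithm 1's level-wise structure, not the paper's pseudocode: the frequency test (the combination of local and global support against sl and sg) is abstracted into a predicate, and the function names are ours.</p>

```python
from itertools import combinations

def generate(prev):
    """Join two lexicographically ordered (k-1)-patterns that share
    their first k-2 literals, forming candidate k-patterns."""
    prev = sorted(prev)
    return {a + (b[-1],) for a, b in combinations(prev, 2)
            if a[:-1] == b[:-1] and a[-1] < b[-1]}

def prune(cands, prev):
    """APRIORI principle: discard any candidate having an infrequent
    (k-1)-subset, i.e., one not present in the previous frequent level."""
    prev = set(prev)
    return {c for c in cands
            if all(tuple(x for x in c if x != l) in prev for l in c)}

def frequent_patterns(literals, is_frequent):
    """Level-wise enumeration of all frequent patterns (as sorted tuples)."""
    freq = []
    level = {(l,) for l in sorted(literals) if is_frequent((l,))}
    while level:                       # repeat-until: stop when no pattern survives
        freq.extend(sorted(level))
        cands = prune(generate(level), level)
        level = {c for c in cands if is_frequent(c)}
    return freq
```

      <p>For instance, with literals {p, q, r} and a frequency predicate that accepts every singleton but, among pairs and larger patterns, only subsets of {p, q}, the enumeration yields exactly {p}, {q}, {r}, and {p, q}.</p>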
      <p>We now study the correctness and completeness of our approach. It is important to stress the
fact that completeness is parametric in the maximum modal depth d.</p>
      <p>Theorem 1 (correctness). Algorithm 1 is correct.</p>
      <p>Proof. We want to show that Fk, for each k ≥ 1, is the set of frequent k-length modal
patterns. The proof is by induction on k. If k = 1, then F1 contains singleton sets of modal
literals of the type {λ}, where λ ∈ Λ, that are frequent. Thus, by construction, F1 is made
of frequent 1-length modal patterns. If k &gt; 1, to obtain the set of frequent sets of modal
literals Fk, the algorithm generates candidates by combining elements of Fk−1, which are
frequent by inductive hypothesis. Let {λ1, . . . , λk−2, λk−1} and {λ1, . . . , λk−2, λ′k−1} be
elements of Fk−1 used by the candidate generation procedure to generate C ∈ Ck, such that
C = {λ1, . . . , λk−2, λk−1, λ′k−1}. If C is frequent, that is, it meets the support, then C ∈ Fk.
Therefore, Fk is made of k-length frequent patterns.</p>
      <p>Theorem 2 (completeness). Algorithm 1 is complete.</p>
      <p>Proof. Let Fk be the set of frequent sets of modal literals and let Ck be the set of candidate
frequent sets of modal literals, whose modal patterns have length k. We want to show that
Fk ⊆ Ck, for all k ≥ 1. The proof is by induction on k. If k = 1, then C1 is the set of singleton
sets of the type {λ}, and F1 ⊆ C1 holds since F1 is defined as the subset of C1 that are frequent.
If k &gt; 1, assume, by inductive hypothesis, that Fk−1 ⊆ Ck−1. We must show that Fk ⊆ Ck.
Let C = {λ1, . . . , λk−2, λk−1, λ′k−1}. The pattern C is generated by two (k − 1)-length modal
patterns, say {λ1, . . . , λk−2, λk−1} and {λ1, . . . , λk−2, λ′k−1}, such that λk−1 &lt; λ′k−1, which
are necessarily frequent, by inductive hypothesis. Thus, C ∈ Ck must hold, and if C meets the
support, then C ∈ Fk, obtaining the thesis.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Applications</title>
      <p>Modal logics are, as we have observed, archetypal of many temporal, spatial, and
spatio-temporal logics. The purpose of modal symbolic learning is designing symbolic machine learning
techniques to deal with non-static data. Non-static data include, among others, dimensional data
(that is, data with temporal and/or spatial components), graph-based data (that is, data natively
in the form of graphs, such as social networks, or gene expression networks), and textual data
(that is, data in the form of natural language text). The most basic advantage of modal symbolic
learning is that of reuniting all such non-static data under the same viewpoint, that is, seeing
them all as modal data. How this is done in all cases is outside the scope of this paper; by way
of example we take here the dimensional, temporal case to show how the modal framework
emerges naturally.</p>
      <p>
        Dimensional datasets can be regarded as modal datasets in which the Kripke model associated
with an instance is implicit and presented in tabular form; this corresponds to the typical
presentation of dimensional data, as in Fig. 2, left-hand side. While in the classical sense instances
correspond to lines, here several lines make up a real-world instance. So, for example, each
instance of the left-hand side of the picture may be the status of a patient under observation,
each being described by two attributes A1 and A2 over the course of four time points t1 through
t4. Thus, the intended meaning of such a dataset is that we have in fact four patients (being
distinguished by their names p1 through p4), which become four multivariate time series,
denoted by T1 through T4, each composed of four values per attribute, as in Fig. 2, right-hand
side. By regarding each interval in the temporal domain t1, . . . , t4 as a world, one ends up
building a Kripke structure from each time series (in this example, each structure has 6 worlds,
that is, 6 non-instantaneous intervals). Connecting each pair of such intervals there is exactly
one of the 12 Allen’s relations [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]; the modal logic that emerges from this preprocessing, and that
allows one to express how the properties of a time series relate to each other through time, is
known as the interval temporal logic of Allen’s relations, or HS (from Halpern and Shoham, who
first introduced it [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]).
      </p>
      <p>Observe, now, that each interval in this setting accounts for a set of values for each attribute;
for example, in Fig. 2, the instance T1 has three values for A1 in the interval [t1, t3]. By
applying suitable feature extraction function(s) one can, at this point, aggregate such a set of values
into a single one; typical aggregations include the average, the maximum, and the minimum, but
in the literature on learning from temporal data one finds several more, and less trivial, examples.
Let us call f any such aggregation function; by applying f to every possible interval for the
attribute A, one ends up with a new set of values, that is, the domain dom(f(A)). At this point,
it is possible to extract a propositional alphabet, of the type:</p>
      <p>
        a1 ≤ f(A) ≤ a2,
with a1, a2 ∈ dom(f(A)). By labeling each world (interval) with the truth value of each of the
propositional letters, we extract from a dimensional, temporal dataset a modal one.</p>
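      <p>As a small illustration of this preprocessing step, the following sketch enumerates the non-instantaneous intervals of a short series and labels each interval world with the truth of one such letter (the function names, series values, and thresholds are ours, purely illustrative):</p>

```python
def intervals(n):
    """All non-instantaneous intervals [tx, ty] over time points t1, ..., tn."""
    return [(x, y) for x in range(1, n + 1) for y in range(x + 1, n + 1)]

def label(series, f, a1, a2):
    """Truth of the letter a1 <= f(A[x..y]) <= a2 at each interval world."""
    return {(x, y): a1 <= f(series[x - 1:y]) <= a2
            for (x, y) in intervals(len(series))}

A1 = [3, 7, 5, 4]              # one attribute over four time points
worlds = label(A1, max, 5, 6)  # letter: 5 <= max(A1) <= 6 on the interval
```

      <p>A series over four time points yields 6 interval worlds, matching the example above; here the letter holds over [t3, t4] (the maximum is 5) but not over [t1, t2] (the maximum is 7).</p>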
      <p>The explicit knowledge that can be extracted from electroencephalogram (EEG) recordings of
subjects that are in certain conditions or are carrying out certain activities is of the utmost
importance in the scientific enquiry into those conditions and activities. The current approaches to this
problem are entirely functional, but, despite the terminological affinity, black-box classification
models such as neural networks cannot be considered models of the human mind. EEG is a
real-world example of dimensional data that could be used for modal association rule extraction.
A typical EEG is performed with a headset that accommodates several electrodes (typically,
around fifty), each of which records, during a single experiment, the voltage expressed
by a specific area of the human brain; the signal of each electrode is then preprocessed through
Fourier-like transformations, so that each signal gives rise to several components at different
frequency intervals (typically, tens of them). All in all, a single EEG recording generates anywhere
between 500 and 2000 temporal attributes (electrode-frequency pairs) over the course of the
experiment. A recent trend in neurobiology focuses on how different experimental situations
can be distinguished by the form and the values of the corresponding EEG recording, which
helps neurobiologists understand the nature of the different brain areas in different settings.
While these problems are most often designed as classification problems, understanding how
the several attributes relate to each other through time in a specific situation may turn out to be
very interesting. Temporal relationships can be extracted as modal rules, specifically HS rules,
which take the form of assertions of the type: if the value of the voltage in this sensor at
this frequency is within this interval of values during an interval of time, then the value of the
voltage in this other sensor at this other frequency must be within this other interval immediately
after it.</p>
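      <p>The form of such a rule can be made concrete. In the following sketch (a hypothetical encoding, not tied to any specific EEG dataset), the propositions p and q are interval labellings as above, and “immediately after” is read as Allen’s meets relation:</p>

```python
def rule_holds(p, q):
    # HS-style rule "p implies <A> q": every interval [x, y] satisfying
    # p must be met by some interval [y, z] satisfying q, i.e., q must
    # hold immediately after p
    worlds = set(p) | set(q)
    for (x, y) in worlds:
        if p.get((x, y)):
            if not any(q.get((y, z)) for (s, z) in worlds if s == y):
                return False
    return True
```

      <p>For example, if p holds on [1, 2] and q holds on [2, 4], the rule is satisfied, since [1, 2] meets [2, 4]; if q holds only on [3, 4], it is not.</p>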
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>Symbolic learning is the sub-field of machine learning that deals with symbolic algorithms and
models, which have been known for decades and successfully applied to a variety of contexts.
The main limitation of current symbolic models is the fact that they are essentially based on
classical propositional logic, which implies that dimensional data, graph-based data, or textual
data must be pre-processed in order to be dealt with within the classic framework, at the
expense of decreasing the interpretability of the result and losing information.</p>
      <p>Modal symbolic learning is the modal logic generalization of symbolic learning; in this paper
we considered the problem of defining the concept of modal association rules and their logical
characterization, and we devised a correct and complete algorithm, ModalAPRIORI, which is
able to extract frequent sets of modal literals both at the global and local level, and therefore
modal symbolic rules. We also discussed potential applications of modal symbolic rules.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] Parliament and Council of the European Union,
          <source>General Data Protection Regulation</source>
          ,
          <year>2016</year>
          . URL: https://gdpr-info.eu/.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>Manzella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pagliarini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sciavicco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. E.</given-names>
            <surname>Stan</surname>
          </string-name>
          ,
          <article-title>Interval Temporal Random Forests with an Application to COVID-19 Diagnosis</article-title>
          ,
          <source>in: Proceedings of the 28th International Symposium on Temporal Representation and Reasoning (TIME)</source>
          , volume
          <volume>206</volume>
          of Leibniz International Proceedings in Informatics,
          <source>Schloss Dagstuhl - Leibniz-Zentrum für Informatik</source>
          ,
          <year>2021</year>
          , pp. 7:
          <fpage>1</fpage>
          -7:
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>Pagliarini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sciavicco</surname>
          </string-name>
          ,
          <article-title>Decision Tree Learning with Spatial Modal Logics</article-title>
          ,
          <source>in: Proceedings of the 12th International Symposium on Games, Automata, Logics, and Formal Verification (GandALF)</source>
          , volume
          <volume>346</volume>
          <source>of Electronic Proceedings in Theoretical Computer Science</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>273</fpage>
          -
          <lpage>290</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Sciavicco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. E.</given-names>
            <surname>Stan</surname>
          </string-name>
          ,
          <article-title>Knowledge extraction with interval temporal logic decision trees</article-title>
          ,
          <source>in: Proceedings of the 27th International Symposium on Temporal Representation and Reasoning (TIME)</source>
          , volume
          <volume>178</volume>
          of Leibniz International Proceedings in Informatics,
          <source>Schloss Dagstuhl - Leibniz-Zentrum für Informatik</source>
          ,
          <year>2020</year>
          , pp. 9:
          <fpage>1</fpage>
          -9:
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Agrawal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Srikant</surname>
          </string-name>
          ,
          <article-title>Fast Algorithms for Mining Association Rules in Large Databases</article-title>
          ,
          <source>in: Proceedings of 20th International Conference on Very Large Data Bases (VLDB)</source>
          ,
          <year>1994</year>
          , pp.
          <fpage>487</fpage>
          -
          <lpage>499</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <article-title>Mining Frequent Patterns without Candidate Generation</article-title>
          ,
          <source>in: Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data</source>
          ,
          <year>2000</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Bresolin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Cominato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gnani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Muñoz-Velasco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sciavicco</surname>
          </string-name>
          ,
          <article-title>Extracting Interval Temporal Logic Rules: A First Approach</article-title>
          ,
          <source>in: Proceedings of the 25th International Symposium on Temporal Representation and Reasoning (TIME)</source>
          , volume
          <volume>120</volume>
          of Leibniz International Proceedings in Informatics,
          <source>Schloss Dagstuhl - Leibniz-Zentrum für Informatik</source>
          ,
          <year>2018</year>
          , pp. 7:
          <fpage>1</fpage>
          -7:
          <lpage>15</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Clarke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Emerson</surname>
          </string-name>
          ,
          <article-title>Design and Synthesis of Synchronization Skeletons Using Branching-Time Temporal Logic</article-title>
          ,
          <source>in: Proceedings of the Workshop on Logics of Programs (LOP)</source>
          , volume
          <volume>131</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>1981</year>
          , pp.
          <fpage>52</fpage>
          -
          <lpage>71</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J. Y.</given-names>
            <surname>Halpern</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shoham</surname>
          </string-name>
          ,
          <article-title>A Propositional Modal Logic of Time Intervals</article-title>
          ,
          <source>Journal of the ACM</source>
          <volume>38</volume>
          (
          <year>1991</year>
          )
          <fpage>935</fpage>
          -
          <lpage>962</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Pnueli</surname>
          </string-name>
          ,
          <article-title>The Temporal Logic of Programs</article-title>
          ,
          <source>in: Proceedings of the 18th Annual Symposium on Foundations of Computer Science (FOCS)</source>
          ,
          <source>IEEE Computer Society</source>
          ,
          <year>1977</year>
          , pp.
          <fpage>46</fpage>
          -
          <lpage>57</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Aiello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>van Benthem</surname>
          </string-name>
          ,
          <article-title>A Modal Walk Through Space</article-title>
          ,
          <source>Journal of Applied Non-Classical Logics</source>
          <volume>12</volume>
          (
          <year>2002</year>
          )
          <fpage>319</fpage>
          -
          <lpage>364</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wolter</surname>
          </string-name>
          ,
          <article-title>Modal Logics of Topological Relations</article-title>
          ,
          <source>Logical Methods in Computer Science</source>
          <volume>2</volume>
          (
          <year>2006</year>
          )
          <fpage>1</fpage>
          -
          <lpage>41</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Fürnkranz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gamberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Lavrac</surname>
          </string-name>
          ,
          <source>Foundations of Rule Learning</source>
          ,
          <source>Cognitive Technologies</source>
          , Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Horn</surname>
          </string-name>
          ,
          <article-title>On Sentences Which are True of Direct Unions of Algebras</article-title>
          ,
          <source>Journal of Symbolic Logic</source>
          <volume>16</volume>
          (
          <year>1951</year>
          )
          <fpage>14</fpage>
          -
          <lpage>21</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>P. N.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Steinbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Karpatne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <source>Introduction to Data Mining</source>
          , 2nd ed.,
          <source>Pearson</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>D.</given-names>
            <surname>Bresolin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Muñoz-Velasco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sciavicco</surname>
          </string-name>
          ,
          <article-title>On Sub-Propositional Fragments of Modal Logic</article-title>
          ,
          <source>Logical Methods in Computer Science</source>
          <volume>14</volume>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>G.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>The complexity of mining maximal frequent itemsets and maximal frequent patterns</article-title>
          ,
          <source>in: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD)</source>
          ,
          <year>2004</year>
          , pp.
          <fpage>344</fpage>
          -
          <lpage>353</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>L. G.</given-names>
            <surname>Valiant</surname>
          </string-name>
          ,
          <article-title>The Complexity of Enumeration and Reliability Problems</article-title>
          ,
          <source>SIAM Journal on Computing</source>
          <volume>8</volume>
          (
          <year>1979</year>
          )
          <fpage>410</fpage>
          -
          <lpage>421</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>J. F.</given-names>
            <surname>Allen</surname>
          </string-name>
          ,
          <article-title>Maintaining Knowledge about Temporal Intervals</article-title>
          ,
          <source>Communications of the ACM</source>
          <volume>26</volume>
          (
          <year>1983</year>
          )
          <fpage>832</fpage>
          -
          <lpage>843</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>