1. Introduction

Italian Conference on Theoretical Computer Science, September

ModalFP-Growth: Eficient Extraction of Modal Association Rules from Non-Tabular Data

Mauro Milella

Giovanni Pagliarini

Guido Sciavicco

Ionel Eduard Stan

0 1 0 Department of Mathematics and Computer Science, University of Ferrara , Italy 1 Faculty of Engineering, Free University of Bozen-Bolzano , Italy

2024

1 1 13

This paper explores the extraction of modal association rules from non-tabular data using a novel algorithm, ModalFP-Growth. By extending the FP-Growth algorithm to modal logic, ModalFP-Growth processes instances represented as Kripke models, facilitating eficient rule extraction from temporal, spatial, and spatio-temporal datasets. Each instance is transformed into a tabular form where worlds correspond to rows and literals to columns, enabling the application of the original FP-Growth. The algorithm, then, aggregates locally frequent itemsets from individual instances to identify globally supported itemsets across the dataset. We prove the soundness and completeness of ModalFP-Growth, ensuring that all and only frequent itemsets are included in the ifnal output. Additionally, we present an open-source implementation within the Sole learning and reasoning suite. Experimental evaluations using Halpern and Shoham's Interval Temporal Logic on a public temporal dataset demonstrate the algorithm's practical eficiency and the interpretability of the extracted rules.

eol>Modal logic Association rule mining Modal symbolic learning

1. Introduction

The distinction between sub-symbolic and symbolic learning is a fundamental separation in machine learning. Sub-symbolic learning involves learning a function to represent a phenomenon, ofering versatility and statistical accuracy. Conversely, symbolic learning creates a logical description of the phenomenon, valued for its interpretability and explainability. This interpretability is crucial for both political reasons, such as compliance with the EU’s General Data Protection Regulation (GDPR) [ 1 ],that highlights the need for interpretable/explainable automatic learning-based decision-making processes [ 2, 3 ], and technical reasons, as symbolic models are easier to train, explore, integrate, and implement. In machine learning, classification and rule extraction are common tasks. While classification can utilize both learning types, rule extraction is inherently symbolic. Traditional symbolic methods are based on propositional logic and designed for tabular data, with propositions typically expressed as ◁▷ or ◁▷ ◁▷ (◁▷ ∈ {<, ≤} ) and rules formulated as 1 ∧ . . . ∧ ⇒ +1, where ⇒ denotes a strong co-occurrence relationship between the antecedent and the consequent.

For temporal or spatial data, which are non-tabular, a pre-processing step is usually required to transform the data into a tabular format. However, recent research suggests that native learning methods may yield better results. Modal symbolic learning [ 4, 5, 6 ] uses modal logic to process non-tabular data directly, resulting in interpretable modal logic formulas.

Modal symbolic learning has primarily focused on classification, but the concept of modal association rules has been formalized in [7, 8], introducing the ModalApriori algorithm based on the well-known standard algorithm Apriori. Modal association rules of the type 1 ∧ . . . ∧ ⇒ +1 where are positive modal literals, generalize propositional rules, and can easily cover the cases of both spatial and temporal data, among many others. Allowing rules to work in dimensions such as space and time is a

Instance/Kripke model 1 2 3 4 0 0 . . .

1 1 2 2 3 3 images and text (see, e.g. [9, 10]).

FP-Growth [11] is a more eficient alternative to Apriori for generating frequent patterns from tabular data. This paper extends FP-Growth to the modal case, introducing the ModalFP-Growth algorithm and proving its soundness and completeness. We provide an open-source implementation of both ModalApriori and ModalFP-Growth within the Sole learning and reasoning suite [12]. Finally, we propose a customizable algorithm for probing rules from frequent patterns, and we test our approach in the particular case of temporal rule extraction from a public dataset, discussing the results regarding practical eficiency and the meaningfulness of the extracted rules. 2. Modal Logic, Frequent Patterns, and Association Rules using the following grammar: Given a set of propositions , the well-formed formulas of propositional modal logic are constructed

::= | ¬ | ∧ | ♢ , where the remaining classic Boolean operators can be derived as shortcuts. In this context, we use □ to denote ¬♢ ¬ , and ≡ to indicate logical equivalence. The modality ♢ (resp., □ ) is typically interpreted as it is possible that (resp., it is necessary that). Modal logic is considered archetypical of propositional temporal, spatial, and spatio-temporal logics and is a non-conservative extension of propositional logic.

Formulas are interpreted on Kripke models. A Kripke model = (, , ) consists of a finite set of worlds (including a distinguished world 0, called the initial world), a binary accessibility relation ⊆ propositions true in that world. The pair (, ) is known as the frame. The truth relation , ⊩ ×

, and a valuation function : → 2 , which associates each world with the set of for a model and a world within that model is defined by the following clauses: , ⊩ , ⊩ ¬

, ⊩ ∧ , ⊩ ♢ if if if if ∈ (); , ⊮ ; , ⊩ and , ⊩ ; ∃′ s.t. ′ and , ′ ⊩ .

Note that the truth relation is inherently internal, as formulas are evaluated within models at specific worlds, and local, since a world is compared only with those accessible through the relation .

We use modal logic to express properties of non-tabular data. A non-tabular instance can be seen as a finite Kripke model, and a set of non-tabular instances as a modal dataset.

Definition 1 (modal dataset). Given a set of propositions , a modal dataset ℐ = {1, . . . , } is a finite collection of instances, each of which is a finite Kripke model = (, , ), for each = 1, . . . , .

Definition 2 (modal literals, modal rules). For a fixed alphabet , a positive modal literal (or, simply, a literal) is an object of the type

::= | ♢ | □ .

: 1 ∧ . . . ∧ ⇒ +1, The set of all modal literals over is denoted by Λ . A modal rule (or, simply, a rule) is an object of the type where 1, . . . , , +1 ∈ Λ .

Modal literals of the type are called propositional, as in the propositional case, and modal literals of the type ♢ (resp., □ ) are called existential (resp., universal). Non-empty subsets of Λ are called modal patterns. In a rule , = {1, . . . , } is called antecedent, and it is also denoted by ( ), = {+1} is called consequent, also denoted by ( ), and ∩ = ∅. The length of a (propositional) pattern is the set-theoretic cardinality of . An example of modal dataset with six instances is shown in Fig. 2. Observe that in the example, all instances have the same frame; although this is not a requirement in modal association rule extraction, it is often the case.

Association analysis should be interpreted with caution. Causality is not implied by an association rule: it simply indicates a strong co-occurrence of the literals in the antecedent with those in the consequent [13]. To emphasize the distinction from logical implication (denoted →), we use the symbol ⇒ in rules. Consequently, the conjunction in antecedents should not be interpreted as a logical conjunction. Thus, is not a modal logic formula. After the extraction phase, however, rules are considered meaningful and can be treated as logical formulas. A modal association rule then becomes a formula of the Horn fragment of modal logic [14, 15] (similar to the propositional case, where an association rule becomes a Horn formula [16]). Horn propositional and modal logics are interesting from a deductive perspective, as satisfiability often becomes computationally simpler. This is relevant when data-driven knowledge is integrated with expert, top-down knowledge in intelligent applications.

Not all rules are interesting or meaningful. Similar to the propositional case, in order to capture the notion of meaningfulness in logical terms, in [17] the authors implicitly defined the notion of a rule holding on an instance, and then on a dataset, by introducing two parameters: support and confidence. These parameters modify the notion of truth of a literal, and subsequently of a rule. Following the introduction of these concepts, especially confidence, alternative measures of meaningfulness have been proposed, including lift and conviction (see, e.g., [13]). More recently, additional measures of rule interestingness have been introduced to avoid extracting trivially true rules.

Generalizing these concepts to the modal case requires generalizing the notion of a set of literals, that is, a pattern, being frequent in a dataset. At the propositional level, a literal is considered to hold on some instance if it is true in it; at the modal level, a literal should be considered interesting if it is both locally and globally frequent: via local frequency we aim to evaluate on how many worlds of a given instance occurs, while via global frequency we aim to evaluate in how many instances this happens. This notion, that requires two parameters , (resp., the local and the global support), induces a generalized notion of a pattern holding on some instance and on some dataset ℐ. Definition 3 (support). Let ℐ be a modal dataset and ⊆ Λ be a set of modal literals. The local support of on some instance ∈ ℐ, being = (, , ), is defined as: and, given a certain local support ∈ (0, 1], the global support of on ℐ relatively to is defined as: (, ) = |{ ∈ | , ⊩ }| ,

| | (ℐ, ) = |{ ∈ ℐ | (, ) ≥ }| .

|ℐ| Notation-wise, given certain local and global supports , ∈ (0, 1], we write ⊩ (resp., ℐ ⊩ , ) to denote the fact that (, ) ≥ (resp., (ℐ, ) ≥ ), and we say that locally holds (resp., globally holds) on (resp., ℐ). A set of literals that globally holds on a dataset ℐ is said to be frequent.

Rule extraction relies on frequent pattern extraction, which is determined by support. Rules are derived from frequent patterns using a probing algorithm that assesses the interestingness of a potential rule in terms of its confidence, lift, or other support-dependent indices. These indices must also be generalized to the modal case.

Definition 4 (confidence, lift) . Let ℐ be a modal dataset, and : ⇒ a modal rule. The local confidence of on some instance ∈ ℐ is defined as: and, given a certain local support ∈ (0, 1], the global confidence of on ℐ relatively to is defined as: (, ⇒ ) = (ℐ, ⇒ ) = (, ∪ ) (, ) Similarly, the local lift of on some instance ∈ ℐ is defined as: (, ⇒ ) =

(, ∪ ) (, ) · (, ) and, given a certain local support ∈ (0, 1], the global lift of on ℐ relatively to is defined as: (ℐ, ⇒ ) =

(ℐ, ∪ ) (ℐ, ) · (ℐ, ) .

As before, and again generalizing the propositional case, local and global confidence (as well as lift and the other interestingness measures) induce a notion of a rule holding on an instance and on a dataset; for example, given a certain local and global support, and global confidence , , ∈ (0, 1] ⊂ R, for a rule : ⇒ we write ℐ ⊩ ,, to denote the fact that ℐ ⊩ , ∪ and that (ℐ, ⇒ ) ≥ . This is not the only possible notion of a rule holding on a dataset; diferent ones depend on the subset of measures that are considered in a particular case, and both a local (to a single instance) and a global (to a whole dataset) analysis can be performed. To ease the notation, in general we write ℐ ⊩ (or ⊩ , if the analysis is purely local), where collects all chosen thresholds.

In summary, a proper modal association rule extraction process is characterized by the following choices: () the modal logic of reference: deciding how to interpret the instances of a non-tabular dataset as modal instances, identifying worlds, relations, and propositions; () the learning parameters: setting appropriate parameters to determine minimal local and global support; and () the rule probing algorithm: listing all frequent sets of literals, probing diferent rules on each, and accepting them if they meet minimal confidence and lift, and possibly other specific conditions relevant to the case at hand. 3. Frequent Modal Pattern Extraction with ModalFP-Growth and

Association Rule Mining The FP-Growth algorithm is designed to eficiently extract frequent itemsets from a propositional dataset by compressing the dataset into an FP-Tree structure. This compression enables eficient reading and processing of the crucial information needed for mining frequent itemsets. We start by summarizing the essential characteristics of an FP-Tree and the operation of FP-Growth as originally introduced by Han et al. [11]. Subsequently, we generalize this approach to handle modal datasets, which we call ModalFP-Growth, following the pseudocode in Alg. 1. Finally, we prove both ModalFP-Growth’s soundness and completeness, before presenting the association rule mining algorithm.

3.1. Overview of FP-Growth

An FP-Tree is a data structure composed of two main components: ., a prefix tree, and .ℎ, a hash table known as the header table. Each node ∈ . has five attributes: . , a literal from a fixed alphabet , .ℎ , a collection of child nodes, . , a reference to the parent node (possibly null), . , a counter indicating the number of identical nodes represented by , and . , a pointer to next node ∈ . with the same content (. = . ).

The header table .ℎ maps each literal ∈ to a node ∈ . where . = . This mapping facilitates eficient horizontal traversal of the tree, which is frequently required during the construction of conditional pattern bases in FP-Growth.

The first step in FP-Growth is to compress the initial dataset into an FP-Tree. This is achieved by inserting each instance into the tree as a branch, considering items in order of decreasing frequency to maximize compression. Nodes representing non-frequent items are pruned to ensure the tree contains only necessary information. The process involves two iterative phases. First, it extracts a conditional pattern base for a frequent item , which is a projection of the original dataset retaining only relevant information for mining itemsets containing . Then, it builds a new FP-Tree from the conditional pattern base. This iterative process stops when the generated FP-Tree degenerates into a list, at which point all itemsets and their combinations are mined from the list. The reader can refer to FP-Growth’s original paper by Han et al. [11] for comprehensive details.

3.2. Transition to ModalFP-Growth

To handle modal datasets, we extend the FP-Growth algorithm to the ModalFP-Growth algorithm. A modal dataset consists of instances represented as Kripke models. From an operational perspective, each Kripke model is transformed into a tabular form: rows of the table correspond to worlds in the Kripke model, and the columns correspond to literals (see Fig. 3).

Alg. 1 is the generalization of FP-Growth to modal datasets. The transition from FP-Growth to ModalFP-Growth involves the following key modifications. Each instance in the modal dataset is represented as a Kripke model. These models are transformed into a tabular form where rows correspond to worlds and columns to literals (modal data representation). The literals Λ are generated based on the propositions and the maximum modal depth (literal generation). For each world ∈ in the Kripke model = (, , ) ∈ ℐ, locally frequent itemsets ⊆ Λ are identified based on a local support threshold (local support calculation). These locally frequent itemsets are inserted into an FP-Tree, which is then used to extract (locally) frequent patterns () through the FP-Growth process ♢ , □ 1 ♢, ♢ 0

3 , ♢ , □ 2 , ♢ , ♢ , □ 4 ♢ , □ , ♢ , □ 6 5 , ♢ , □ can be seen as ♢ □

♢ □ 0 1 2 3 4 5 6 ⊥ ⊥ ⊥ ⊥ ⊤ ⊤ ⊥ ⊤ ⊥ ⊥ ⊥ ⊤ ⊥ ⊤ ⊥ ⊥ ⊥ ⊥ ⊥ ⊥ ⊤ ⊤ ⊥ ⊥ ⊤ ⊤ ⊤ ⊤ ⊤ ⊤ ⊥ ⊤ ⊤ ⊤ ⊤ ⊥ ⊤ ⊥ ⊤ ⊤ ⊤ ⊤ input : Modal dataset ℐ, propositions , maximum modal depth , user-specific parameterization 1 function -ℎ(ℐ, , , ): Λ ← ← ←

(, ) ℎℎ( ) ℎℎ( ) ← ∅ foreach = (, , ) ∈ ℐ do

- () ← foreach ∈ do end ← ← end ← return 19 end 20 function (, ): (., ) ← { ← ∈ Λ | , ⊩ and (, ) ≥ ()

} (.ℎ) -ℎ(, , ∅)

∪ (, ℐ, ) 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 21 22 23 24 25 26 27 28 29 30 31 34 35 36 37 38 39 40 41 44 45 46 (, ) ( , )

∪ -ℎ( , , { }) , ˜ ← if || = 0 then return

() if ∃ ∈ .ℎ and . = then

( = , = , = 1) if ℎ(.) then return { ∪ | ∈ (.)} else end end ← . ← . + 1 ← ←

ℎ(, ) ← (, ˜ ) 32 end 33 function -ℎ(, , ): foreach ← ∅

∈ .ℎ do ← ← ← return 42 end

() 43 function (, ℐ, ): ← { return ∈ . | |ℐ| ≥ }

[] (FP-Tree construction). The globally frequent itemsets () are identified by filtering the combined locally frequent itemsets () based on the global support threshold (global ifltering ). The ModalFP-Growth algorithm ensures the eficient extraction of frequent modal itemsets, maintaining the interpretability and explainability of the symbolic models.

3.3. Soundness and Completeness

The ModalFP-Growth algorithm is both sound and complete, ensuring that all and only the frequent modal itemsets are identified.

Theorem 1 (Soundness of ModalFP-Growth). Let ℐ = {1, . . . , } be a modal dataset, where each is a Kripke model represented in tabular form. Then, if ModalFP-Growth returns an ⊆ Λ then is frequent.

Proof. To prove the soundness of the ModalFP-Growth algorithm, we need to show that every itemset included in the final output is frequent, meaning that it meets the global support criterion.

First, we consider the FP-Growth algorithm applied to the tabular representation of each Kripke model instance = (, , ). This algorithm correctly identifies all itemsets that are locally frequent. Formally, for an itemset to be considered locally frequent in an instance , it must satisfy: (, ) = |{ ∈ | , ⊩ }| ≥ , || where is the local support threshold.

After running FP-Growth on each instance ∈ ℐ, we obtain a collection of locally frequent itemsets for each instance. These itemsets are then combined to form a global collection. An itemset is included in the final result if and only if it appears in a suficient number of instances with the required local support. Specifically, is included if: (ℐ, ) = |{ ∈ ℐ | (, ) ≥ }| ≥ , |ℐ| where is the global support threshold.

Since the FP-Growth algorithm ensures that any itemset identified for an instance satisfies (, ) ≥ , the final step of ModalFP-Growth ensures that an itemset is included in the output if and only if it satisfies the global support criterion. By the definition of global support, for an itemset to be included in the final output, it must satisfy:

(ℐ, ) ≥ .

This means that is frequent in the modal dataset ℐ according to the defined global support criterion. Therefore, every itemset included in the final output of the ModalFP-Growth algorithm is frequent with respect to the specified support thresholds.

Theorem 2 (Completeness of ModalFP-Growth). Let ℐ = {1, . . . , } be a modal dataset, where each is a Kripke model represented in tabular form. Then, if ⊆ Λ is frequent, then ModalFP-Growth returns it.

Proof. To prove the completeness of the ModalFP-Growth algorithm, we need to show that every itemset that is frequent, meaning it meets the global support criterion (ℐ, ) ≥ , is included in the final output of the algorithm.

First, consider an itemset that is frequent in the modal dataset ℐ. By definition, this means: (ℐ, ) = |{ ∈ ℐ | (, ) ≥ }| ≥ , |ℐ| implying that has a local support of at least in at least a fraction of the instances in ℐ. Let ⊆ ℐ be the subset of instances where has a local support of at least :

= { ∈ ℐ | (, ) ≥ }.

By the definition of global support, we have: | | ≥ .

|ℐ|

Next, consider the FP-Growth algorithm applied to each instance ∈ . Since has a local support of at least in each ∈ , the FP-Growth algorithm will correctly identify as a locally frequent itemset for these instances. ModalFP-Growth then combines the locally frequent itemsets identified by FP-Growth from each instance. Since is identified as locally frequent in each instance of , it will be included in the collection of itemsets combined by ModalFP-Growth.

Finally, ModalFP-Growth checks if meets the global support criterion. Since is frequent by assumption, it satisfies (ℐ, ) ≥ . Therefore, will be included in the final output of the ModalFP-Growth algorithm.

3.4. Association Rule Mining

Alg. 2 extracts significant relationships from frequent itemsets using a user-defined interestingness measure, generically denoted by (e.g., confidence or lift, among others) and its threshold . The algorithm processes the modal dataset ℐ and the set of frequent itemsets , by using parameters from , which is supposed to include the value of .

For each frequent itemset ∈ , the algorithm considers all non-empty subsets ⊂ , generating potential rules of the type ⇒ , where = ∖ . Each rule is evaluated based on , which is parametric in (recall Definition 4), and those meeting or exceeding are retained. Generating these potential rules follows specific policies defined by the user in and checked by the function ℎ. Such policies range from rather standard ones, such as imposing that is always a singleton, to specific ones to lower the probability of getting insignificant results. To understand this, recall that at the propositional level a literal is considered to hold on some instance if it is true on it, but at the modal level literals may sometimes be trivially true on a world of a finite Kripke structure, such as in the case of □ on a world without successors; another representative example of potential problem is that of a proposition being true on some world which is the successor of many worlds, forcing ♢ to hold on all of them and making it dificult to establish its interesting degree only by the cardinality of its support set. Rule selection policies may partially address these problems; forcing an antecedent to include at least one non-modal literal, and avoiding specific combinations of propositions between antecedent and consequent of a rule are two among many possible examples.

HS modality

Definition w.r.t. the interval structure ⟨⟩ (after) ⟨⟩ (later) ⟨⟩ (begins) ⟨⟩ (ends) ⟨⟩ (during)

This approach ofers flexibility by allowing any user-defined interestingness measure to be used to ensure that the algorithm adapts to various analytical needs. Additionally, it supports local mining within individual modal instances, leveraging local definitions of support, confidence, lift, and other measures to generate significant rules. These locally mined rules can then be assessed for their global significance.

4. Experiments

As we have explained, modal association rules can be extracted from temporal, spatial, or other types of non-tabular data; to show the efectiveness of our approach, we focus here on a temporal case. 4.1. Interval Temporal Logic and HS Modalities A temporal dataset is generally presented as dataset of multivariate time series. We choose to describe temporal logical patterns using interval temporal logic, a specialization of modal logic; among the various interval temporal logics proposed in recent literature [18], Halpern and Shoham’s Modal Logic for Time Intervals (HS) [19] has received significant attention.

Let D = ⟨, <⟩ be a linear order with domain . A strict interval over D is an ordered pair [, ], where , ∈ D and < . Excluding the identity relation, there are 12 diferent binary ordering relations between two strict intervals on a linear order, often called Allen’s interval relations [20], and depicted in Tab. 1. Interval structures are interpreted as Kripke structures, with Allen’s relations serving as accessibility relations. Each Allen’s relation is associated with an existential modality ⟨⟩. Additionally, for each ∈ {, , , , , }, the transpose of modality ⟨⟩ is ⟨⟩, corresponding to the inverse relation .

Well-formed HS formulas are built from a set of propositions , classical connectives ∧ and ¬, and a modality for each Allen’s interval relation:

::= | ¬ | ∧ | ⟨⟩, where ∈ and ∈ . Syntactically, HS is a modal logic with 12 diamond operators and their corresponding box versions.

The strict semantics of HS is defined in terms of interval models = ⟨I(D), ⟩, where I(D) is the set of all strict intervals over D, and : I(D) → 2 , which associates each interval with the set of propositions true in that world. The truth of a formula on an interval [, ] in an interval model , denoted , [, ] ⊩ , is defined by structural induction: , [, ] ⊩ , [, ] ⊩ ¬

, [, ] ⊩ ⟨⟩ , [, ] ⊩ 1 ∧ 2 if if if where ∈ .

4.2. Time Series as Interval Models

, [, ] ⊮ , ∈ ([, ]), for each ∈ , , [, ] ⊩ 1 and , [, ] ⊩ 2, if there exists [, ] s.t. [, ] [, ] and , [, ] ⊩ , propositions A single multivariate time series with variables = {1, . . . , }, where each temporal variable is defined over points, can be interpreted as an interval model. To this end, we use a set of feature extraction functions ℱ = {1, . . . , }, where each function is defined as : R → R for some natural value ≤ ; examples include maximum and mean. To interpret a time series as an interval model, we fix D = ⟨{1, . . . , }, <⟩, compute the set of strict intervals I(D), and define the set of = { ≤ ( ) ≤ | ∈ ℱ , ∈ , ∈ R ∪ {−∞} , ∈ R ∪ {+∞}}.

In a way, feature extraction functions reduce the dimensionality of time series by summarizing information over intervals, allowing for more sophisticated analysis of the temporal relationships. By applying feature extraction functions to intervals within a time series, we can study their values and their relative qualitative positions. Such transformation enables expressing complex temporal relationships through propositional interval temporal logic, such as, for instance, describing an interval where the average of 1 ≥ 4 and such that it is overlapped by an interval where the maximum of 2 ≤ 12.

4.3. Experimental Setup

In our experiments we use a well-known public dataset, namely NATOPS, introduced in [21]. Each instance in this dataset is a time series representing the , , coordinates of sensors placed on various body parts of subjects performing aircraft handling signals. These signals are standardized in the Naval Air Training and Operating Procedures Standardization (NATOPS) manual. The dataset, initially designed for classification, includes several signals such as "I have command", "All clear", "Not clear", "Spread wings", "Fold wings", and "Lock wings", as depicted in Fig. 4.

Our objective is to describe common patterns within the same family of movements and highlight unique patterns across diferent classes. We focus on the six classes in Fig. 4, so that our actual dataset comprises 360 instances balanced across classes (i.e., 60 instances per class).

4.4. Experimental Procedure

The experiments are organized as follows. First, we decide the policy following which the set is built; we analyze the variable behaviour via a pre-processing step to identify the informative ones and the significant thresholds. Then, we set the local and global support thresholds; in our case, = = 0.1, and = 0.3. Finally, we set the policy for rule extraction; in particular, we choose to ⟨⟩(ℎ) ≥ 1 ∧ (ℎ) ≥ − 0.5 ⇒ [](ℎ) ≥ 0

(ℎ) ≥ 1 ⇒ [](ℎ) ≥ − 0.5 (ℎ) ≥ 1 ∧ ⟨⟩(ℎ) ≥ 1 ⇒ [](ℎ) ≥ − 0.5 (ℎ) ≥ − 0.5 ⇒ [](Δℎ) ≤ 0.0 (Δℎ) ≤ 0.0 ⇒ [](ℎ) ≥ − 0.5

(ℎ) ≥ − 1.0 ⇒ () ≤ − 0.25 ⟨⟩(ℎ) ≥ − 1.0 ∧ () ≤ − 0.25 ∧ ⟨⟩() ≥ − 0.5 ∧ ⟨⟩() ≥ − 0.3 ⇒ ⟨⟩(ℎ) ≥ 0.5

Target class I have command I have command I have command

Not clear Not clear Lock wings Lock wings extract only rules with a singleton consequent, to exclude rules where the same variable appears in both antecedent and consequent (self-absorbing rules), and to exclude rules with antecedents containing only non-propositional literals (non-anchored rules). After mining the most interesting association rules for the target class, we compute the entropy of the set {1, . . . , 6}, where is the global confidence of the rule on the instances of class . Setting = ∑︀6 =1 and = /, the entropy is defined as = − ∑︀6=1 2( ), and it is one way to determine a rule’s efectiveness in describing only the target class. Finally, we interpret the results by reading the rules in natural language, in order to understand and to explain, whenever possible, the extracted patterns.

4.5. Results and Analysis

We conducted three experiments, with results summarized in Tab. 2. Each rule is associated with a target class, and its global support and confidence are reported. The entropy of the confidence across other classes is also included; a 1 − value close to 1 indicates that the rule is very specific to the target class. Variables in the rules represent coordinates , , with subscripts indicating the body part (e.g., ℎ for right hand, for left elbow). The variable ∆ ℎ denotes the diference between the height of the right hand and the right thumb.

Experiment 1: "I have command". The first two rules are common between "I have command" (resp., = 0.61 and = 1.00) and "Lock wings" (resp., = 0.30 and = 1.00) classes. The second rule, for instance, translates to when the right hand is above the ears, it is also slightly to the right of the body. The third rule uniquely identifies a pattern typical to the target class, translating to when the right hand is distant from the body forward and at ear height, it is also to the right.

Experiment 2: "Not clear". The fourth rule is common between "All clear" ( = 0.42) and "Not clear" ( = 0.80) classes, describing a less precise movement. It translates to when the arm is just below shoulder height, the right thumb points downward. The fifth rule uniquely identifies the target class, translating to when the right thumb points downward, the right hand is just below shoulder height, indicating that the thumb is pointed downward during the hand’s ascent.

Experiment 3: "Lock wings". The second-to-last rule captures a common behavior across three classes: "Spread wings", "Fold wings", and "Lock wings." The final rule uniquely describes the target class, translating to when the left hand is at chest height and the left elbow is retracted towards the sternum, while the right elbow is raised and to the right, the right hand is above the chin.

5. Conclusions

This paper introduced a novel approach for extracting modal association rules from non-tabular data. Our extension of the FP-Growth algorithm to handle modal data allows for mining frequent patterns and interpretable association rules, which may be crucial in certain applications. We demonstrated the efect of non-propositional rule extraction on a public dataset in the temporal case.

The primary value of our work lies in its ability to process non-tabular data natively, maintaining interpretability and relevance for domain experts. A significant challenge identified is the selection of appropriate modal relations and alphabets for specific problems, which we propose as a future research direction. We have open-sourced our implementation to promote transparency and reproducibility, enabling others to apply and extend our methods.

Our research provides a robust framework for modal association rule mining and is a first step toward showing its applicability in practice to capture temporal relationships. In future work we will optimize language selection, integrate additional interestingness measures, and extend the framework to other forms of modal logics and relational data.

Acknowledgments

We acknowledge the support of the FIRD project Methodological Developments in Modal Symbolic Geometric Learning, funded by the University of Ferrara, and the INDAM-GNCS project Symbolic and Numerical Analysis of Cyberphysical Systems (code CUP_E53C23001670001), funded by INDAM; Giovanni Pagliarini, Guido Sciavicco, and Ionel Eduard Stan are GNCS-INdAM members. Moreover, this research has also been funded by the Italian Ministry of University and Research through PNRR - M4C2 - Investimento 1.3 (Decreto Direttoriale MUR n. 341 del 15/03/2022), Partenariato Esteso PE00000013 "FAIR - Future Artificial Intelligence Research" - Spoke 8 "Pervasive AI", funded by the European Union under the NextGeneration EU programme". [7] D. Bresolin, E. Cominato, S. Gnani, E. Muñoz-Velasco, G. Sciavicco, Extracting Interval Temporal Logic Rules: A First Approach, in: Proc. of the 25th International Symposium on Temporal Representation and Reasoning (TIME), volume 120 of LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018, pp. 7:1–7:15. [8] M. Milella, E. Muñoz-Velasco, G. Pagliarini, A. Paradiso, G. Sciavicco, I. E. Stan, On Modal Logic Association Rule Mining, in: Proc. of the 23rd Italian Conference on Theoretical Computer Science (ICTCS), volume 3284 of CEUR, CEUR-WS.org, 2022, pp. 53–65. [9] J. Martinet, S. Satoh, A study of intra-modal association rules for visual modality representation, 2007, pp. 344 – 350. [10] M. Ceci, C. Loglisci, L. Rudd, D. Malerba, Discovering knowledge through multi-modal association rule mining for document image analysis, in: Proceedings of 1st AI*IA Workshop on Intelligent Techniques At LIbraries and Archives co-located with XIV Conference of the Italian Association for Artificial Intelligence, IT@LIA@AI*IA 2015, Ferrara, Italy, September 22, 2015, volume 1509 of CEUR Workshop Proceedings, 2015. [11] J. Han, J. Pei, Y. Yin, Mining Frequent Patterns without Candidate Generation, in: Proc. of the 2000 ACM SIGMOD International Conference on Management of Data, 2000, pp. 1–12. [12] F. Manzella, G. Pagliarini, A. Paparella, G. Sciavicco, I. E. Stan, Sole.jl – Symbolic Learning in Julia, https://github.com/aclai-lab/Sole.jl, 2024. [13] P. N. Tan, M. S. Steinbach, A. Karpatne, V. Kumar, Introduction to Data Mining, 2nd ed., Pearson, 2019. [14] L. Fariñas Del Cerro, M. Penttonen, A Note on the Complexity of the Satisfiability of Modal Horn

Clauses, The Journal of Logic Programming 4 (1987) 1–10. [15] D. Bresolin, E. Muñoz-Velasco, G. Sciavicco, On Sub-Propositional Fragments of Modal Logic,

Logical Methods in Computer Science 14 (2018). [16] A. Horn, On Sentences Which are True of Direct Unions of Algebras, Journal of Symbolic Logic 16 (1951) 14–21. [17] R. Agrawal, R. Srikant, Fast Algorithms for Mining Association Rules in Large Databases, in: Proc.

of 20th International Conference on Very Large Data Bases (VLDB), 1994, pp. 487–499. [18] V. Goranko, A. Montanari, G. Sciavicco, A Road Map of Interval Temporal Logics and Duration

Calculi, J. of Applied Non-Classical Logics 14 (2004) 9–54. [19] J. Y. Halpern, Y. Shoham, A Propositional Modal Logic of Time Intervals, Journal of the ACM 38 (1991) 935–962. [20] J. Allen, Maintaining knowledge about temporal intervals, Communications of the ACM 26 (1983) 832–843. [21] Y. Song, D. Demirdjian, R. Davis, Tracking body and hands for gesture recognition: NATOPS aircraft handling signals database, in: Proc. of the 9th IEEE International Conference on Automatic Face and Gesture Recognition (FG), 2011, pp. 500 – 506.

[1] Parliament and Council of the European Union , General Data Protection Regulation , 2016 . URL: https://gdpr-info.eu/.

[2]

A. F.

Markus ,

J. A.

Kors ,

P. R.

Rijnbeek , The role of explainability in creating trustworthy artificial intelligence for health care: A comprehensive survey of the terminology, design choices, and evaluation strategies , J. Biomed. Informatics 113 ( 2021 ) 103655 .

[3]

Vermesan ,

Piuri ,

Scotti ,

Genovese ,

Labati ,

Coscia , Explainability and Interpretability Concepts for Edge AI Systems , 2024 , pp. 197 - 227 .

[4]

Manzella , G. Pagliarini, G. Sciavicco,

I. E.

Stan , Interval Temporal Random Forests with an Application to COVID-19 Diagnosis , in : Proc. of the 28th International Symposium on Temporal Representation and Reasoning (TIME) , volume 206 of LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum für Informatik , 2021 , pp. 7 : 1 - 7 : 18 .

[5]

Pagliarini , G. Sciavicco, Decision Tree Learning with Spatial Modal Logics , in: Proc. of the 12th International Symposium on Games, Automata, Logics, and Formal Verification (GandALF) , volume 346 of EPCTS , 2021 , pp. 273 - 290 .

[6]

Sciavicco ,

I. E.

Stan , Knowledge Extraction with Interval Temporal Logic Decision Trees , in: Proc. of the 27th International Symposium on Temporal Representation and Reasoning (TIME) , volume 178 of LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum für Informatik , 2020 , pp. 9 : 1 - 9 : 16 .