CLA 2015
Proceedings of the Twelfth International Conference on
Concept Lattices and Their Applications



CLA Conference Series
cla.inf.upol.cz
Blaise Pascal University, LIMOS laboratory, Clermont-Ferrand, France

ISBN 978–2–9544948–0–7
Blaise Pascal University, LIMOS laboratory, Clermont-Ferrand, France




         The Twelfth International Conference on
         Concept Lattices and Their Applications




                             CLA 2015
                  Clermont-Ferrand, France
                    October 13–16, 2015




                             Edited by

                         Sadok Ben Yahia
                           Jan Konecny
CLA 2015
© paper author(s), 2015, for the included papers
© Sadok Ben Yahia, Jan Konecny, Editors, for the volume
Copying permitted only for private and academic purposes.




This work is subject to copyright. All rights reserved. Reproduction or publication
of this material, even partial, is allowed only with the editors’ permission.




Technical Editor:
Jan Konecny, jan.konecny@upol.cz

Cover photo from blognature.fr




Page count:      xiii+254
Impression:      50
Edition:         1st
First published: 2015




Published and printed by:
Blaise Pascal University, LIMOS laboratory, Clermont-Ferrand, France
                           Organization


CLA 2015 was organized by the Blaise Pascal University, LIMOS laboratory,
Clermont-Ferrand.


Steering Committee
Radim Belohlavek             Palacký University, Olomouc, Czech Republic
Sadok Ben Yahia              Faculté des Sciences de Tunis, Tunisia
Jean Diatta                  Université de la Réunion, France
Peter Eklund                 IT University of Copenhagen, Denmark
Sergei O. Kuznetsov          State University HSE, Moscow, Russia
Engelbert Mephu Nguifo       LIMOS, University Blaise Pascal, Clermont-Ferrand,
                             France
Amedeo Napoli                LORIA, Nancy, France
Manuel Ojeda-Aciego          Universidad de Málaga, Spain
Jan Outrata                  Palacký University, Olomouc, Czech Republic

Program Chairs
Sadok Ben Yahia              Faculté des Sciences de Tunis, Tunis, Tunisia
Jan Konecny                  Palacký University, Olomouc, Czech Republic

Program Committee
Simon Andrews                Sheffield Hallam University, Sheffield, United Kingdom
Jaume Baixeries              Universitat Politècnica de Catalunya, Barcelona,
                             Catalonia
Radim Belohlavek             Palacký University, Olomouc, Czech Republic
Karell Bertet                L3i – Université de La Rochelle, La Rochelle, France
François Brucker             École Centrale, Marseille, France
Ana Burusco                  Universidad Pública de Navarra, Pamplona, Spain
Claudio Carpineto            Fondazione Ugo Bordoni, Roma, Italy
Pablo Cordero                Universidad de Málaga, Málaga, Spain
Jean Diatta                  Université de la Réunion, Saint-Denis, France
Felix Distel                 Technische Universität Dresden, Dresden, Germany
Florent Domenach             University of Nicosia, Nicosia, Cyprus
Vincent Duquenne             Institut de Mathématiques de Jussieu, Paris, France
Peter Eklund                 IT University of Copenhagen, Denmark
Sébastien Ferré              IRISA – Université de Rennes 1, Rennes, France
Bernhard Ganter              Technische Universität Dresden, Dresden, Germany
Cynthia Vera Glodeanu          Technische Universität Dresden, Dresden, Germany
Alain Gély                     Université de Lorraine, Metz, France
Tarek Hamrouni                 ISAMM, Manouba University, Tunisia
Marianne Huchard               LIRMM – Université Montpellier 2, Montpellier,
                               France
Dmitry Ignatov                 State University HSE, Moscow, Russia
Mehdi Kaytoue                  Liris – Insa, Lyon, France
Stanislav Krajči               Univerzita Pavla Jozefa Šafárika v Košiciach, Košice,
                               Slovakia
Francesco Kriegel              Technische Universität Dresden, Dresden, Germany
Michal Krupka                  Palacký University, Olomouc, Czech Republic
Marzena Kryszkiewicz           Warsaw University of Technology, Warsaw, Poland
Sergei O. Kuznetsov            State University HSE, Moscow, Russia
Léonard Kwuida                 Bern University of Applied Sciences, Bern, Switzerland
Jesús Medina                   Universidad de Cádiz, Cádiz, Spain
Engelbert Mephu Nguifo         LIMOS, Clermont-Ferrand, France
Rokia Missaoui                 LARIM – Université du Québec en Outaouais,
                               Gatineau, Canada
Amedeo Napoli                  LORIA, Nancy, France
Lhouari Nourine                LIMOS, Clermont-Ferrand, France
Sergei Obiedkov                State University HSE, Moscow, Russia
Manuel Ojeda-Aciego            Universidad de Málaga, Málaga, Spain
Petr Osička                    Palacký University, Olomouc, Czech Republic
Jan Outrata                    Palacký University, Olomouc, Czech Republic
Uta Priss                      Ostfalia University of Applied Sciences, Wolfenbüttel, Germany
Francois Rioult                GREYC – Université de Caen Basse-Normandie,
                               Caen, France
Sebastian Rudolph              Technische Universität Dresden, Dresden, Germany
Christian Sacarea              Babeș-Bolyai University, Cluj-Napoca, Romania
Barış Sertkaya                 SAP Research Center, Dresden, Germany
László Szathmáry               University of Debrecen, Debrecen, Hungary
Petko Valtchev                 Université du Québec, Montréal, Canada
Francisco Valverde             Universidad Carlos III, Madrid, Spain

Additional Reviewers
Ľubomír Antoni               Univerzita Pavla Jozefa Šafárika v Košiciach, Košice,
                             Slovakia
Slim Bouker                  LIMOS, Clermont-Ferrand, France
Maria Eugenia Cornejo Piñero Universidad de Cádiz, Cádiz, Spain
Philippe Fournier-Viger      Université du Québec, Montréal, Canada
Eloisa Ramírez Poussa        Universidad de Cádiz, Cádiz, Spain
Stefan E. Schmidt            Technische Universität Dresden, Dresden, Germany
Vilém Vychodil               Palacký University, Olomouc, Czech Republic
Organization Committee
Olivier Raynaud (chair)   LIMOS, Clermont-Ferrand, France

Violaine Antoine          LIMOS, Clermont-Ferrand, France
Anne Berry                LIMOS, Clermont-Ferrand, France
Diyé Dia                  LIMOS, Clermont-Ferrand, France
Kaoutar Ghazi             LIMOS, Clermont-Ferrand, France
Dhouha Grissa             INRA Theix, Clermont-Ferrand, France
Yannick Loiseau           LIMOS, Clermont-Ferrand, France
Engelbert Mephu Nguifo    LIMOS, Clermont-Ferrand, France
Séverine Miginiac         LIMOS, Clermont-Ferrand, France
Lhouari Nourine           LIMOS, Clermont-Ferrand, France
                                   Table of Contents



Preface
Invited Contributions
User Models as Personal Ontologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      1
   Gabriella Pasi

Formal Concept Analysis from the Standpoint of Possibility Theory . . . . .                               3
   Didier Dubois

Clarifying Lattice Structure by Data Polishing . . . . . . . . . . . . . . . . . . . . . . . .            5
   Takeaki Uno

Tractable Interesting Pattern Mining in Large Networks . . . . . . . . . . . . . . .                      7
   Jan Ramon

Extended Dualization: Application to Maximal Pattern Mining . . . . . . . . .                             9
   Lhouari Nourine

Full Papers
Subset-generated complete sublattices as concept lattices . . . . . . . . . . . . . . .                   11
   Martin Kauer, Michal Krupka

RV-Xplorer: A Way to Navigate Lattice-Based Views over RDF Graphs . .                                     23
  Mehwish Alam, Amedeo Napoli, Matthieu Osmuk

Finding p-indecomposable Functions: FCA Approach . . . . . . . . . . . . . . . . . .                      35
   Artem Revenko

Putting OAC-triclustering on MapReduce . . . . . . . . . . . . . . . . . . . . . . . . . . . .            47
   Sergey Zudin, Dmitry V. Gnatyshak, Dmitry I. Ignatov

Concept interestingness measures: a comparative study . . . . . . . . . . . . . . . .                     59
   Sergei O. Kuznetsov, Tatyana P. Makhalova

Why concept lattices are large – Extremal theory for the number of
minimal generators and formal concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . .          73
   Alexandre Albano, Bogdan Chornomaz

An Aho-Corasick Based Assessment of Algorithms Generating Failure
Deterministic Finite Automata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   87
   Madoda Nxumalo, Derrick G. Kourie, Loek Cleophas and Bruce W.
   Watson
Context-Aware Recommender System Based on Boolean Matrix
Factorisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   99
   Marat Akhmatnurov, Dmitry I. Ignatov
Class Model Normalization – Outperforming Formal Concept Analysis
approaches with AOC-posets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
   André Miralles, Guilhem Molla, Marianne Huchard, Clémentine
   Nebut, Laurent Deruelle, Mustapha Derras

Partial enumeration of minimal transversals of a hypergraph . . . . . . . . . . . 123
   Lhouari Nourine, Alain Quilliot, Hélène Toussaint
An Introduction to Semiotic-Conceptual Analysis with Formal Concept
Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
   Uta Priss

Using the Chu construction for generalizing formal concept analysis . . . . . 147
   Ľubomír Antoni, Inmaculada P. Cabrera, Stanislav Krajči, Ondrej
   Krídlo, Manuel Ojeda-Aciego
From formal concepts to analogical complexes . . . . . . . . . . . . . . . . . . . . . . . . 159
   Laurent Miclet, Jacques Nicolas
Pattern Structures and Their Morphisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
   Lars Lumpe, Stefan E. Schmidt
NextClosures: Parallel Computation of the Canonical Base . . . . . . . . . . . . . 181
   Francesco Kriegel, Daniel Borchmann

Probabilistic Implicational Bases in FCA and Probabilistic Bases of
GCIs in EL⊥ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
   Francesco Kriegel
Category of isotone bonds between L-fuzzy contexts over different
structures of truth degrees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
    Jan Konecny, Ondrej Krídlo
From an implicational system to its corresponding D-basis . . . . . . . . . . . . . 217
   Estrella Rodríguez-Lorenzo, Kira Adaricheva, Pablo Cordero,
   Manuel Enciso, Angel Mora

Using Linguistic Hedges in L-rough Concept Analysis . . . . . . . . . . . . . . . . . . 229
   Eduard Bartl, Jan Konecny
Revisiting Pattern Structures for Structured Attribute Sets . . . . . . . . . . . . . 241
   Mehwish Alam, Aleksey Buzmakov, Amedeo Napoli, Alibek
   Sailanbayev

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
                                  Preface


Formal Concept Analysis is a method for the analysis of logical data based on the
formalization of conceptual knowledge by means of lattice theory. It has proved
to be of interest to various applied fields such as data visualization, knowledge
discovery and data mining, database theory, and many others.
The International Conference “Concept Lattices and Their Applications (CLA)”
has been organized since 2002, and its aim is to bring together researchers from
various backgrounds to present and discuss their research related to FCA.
The twelfth edition of CLA was held in Clermont-Ferrand, France, from October
13 to 16, 2015. The event was jointly organized by the LIMOS laboratory, CNRS,
and Blaise Pascal University, France.
This volume includes the selected papers and the abstracts of 5 invited talks.
We would like to express our warmest thanks to the keynote speakers.
This year, there were initially 39 submissions, from which the Program Committee
selected 20 papers, representing an acceptance rate of 51.2%.
The program of the conference consisted of five keynote talks given by the follow-
ing distinguished researchers: Didier Dubois, Lhouari Nourine, Gabriella Pasi,
Jan Ramon, and Takeaki Uno, together with twenty communications authored
by researchers from 11 countries, namely: Austria, Czech Republic, France,
Germany, Kazakhstan, Republic of South Africa, Russia, Slovakia, Spain, Sweden,
and Ukraine.
Each paper was reviewed by 3–4 members of the Program Committee and/or ad-
ditional reviewers. We thank them all for their valuable assistance. It is planned
that extended versions of the best papers will be published in a well-established
journal, after another reviewing process.
The success of such an event is mainly due to the hard work and dedication of
many people and the collaboration of several institutions. We thank the contributing
authors, who submitted high-quality works; we thank the CLA Steering Committee,
which gave us the opportunity of chairing this edition; and we thank the
Program Committee, the additional reviewers, and the local Organization Com-
mittee. We are also thankful to the following institutions, which have helped
the organization of the twelfth edition of CLA: Coffreo, IPLeanware, Axège, and
Oorace.
We also thank the EasyChair conference system, which eased most of our
administrative tasks related to paper submission, selection, and reviewing. Last
but not least we thank Jan Outrata, who offered his files for preparing the
proceedings.


October 2015                                                 Sadok Ben Yahia
                                                                  Jan Konecny
                                                   Program Chairs of CLA 2015
            User Models as Personal Ontologies

                                     Gabriella Pasi

  Laboratorio di Information Retrieval – Università degli Studi di Milano Bicocca,
                                   Milano, Italia

Abstract. The problem of defining user profiles has been a research issue for a long
time; user profiles are employed in a variety of applications, including Information
Filtering and Information Retrieval. In particular, considering the Information Retrieval
task, user profiles are functional to the definition of approaches to personalized search,
which is aimed at tailoring the search outcome to users. In this context the quality of
a user profile is clearly related to the effectiveness of the proposed personalized search
solutions. A user profile represents the user interests and preferences; these can be
captured either explicitly or implicitly. User profiles may be formally represented as
bags of words, as vectors of words or concepts, or still as conceptual taxonomies. More
recent approaches are aimed at formally representing user profiles as ontologies, thus
allowing a richer, more structured and more expressive representation of the knowledge
about the user.
This talk will address the issue of the automatic definition of personal ontologies, i.e.
user-related ontologies. In particular, a method that applies a knowledge extraction
process from the general-purpose ontology YAGO will be described. Such a process is
activated by a set of texts (or just a set of words) representative of the user's interests,
and it is aimed at defining a structured and semantically coherent representation of the
user's topical preferences. The issue of the evaluation of the generated representation
will be discussed too.
Formal Concept Analysis from the Standpoint of
             Possibility Theory

                                      Didier Dubois

                   IRIT – Université Paul Sabatier, Toulouse, France

Abstract. Formal concept analysis (FCA) and possibility theory (PoTh) are two the-
oretical frameworks that are addressing different concerns in the processing of infor-
mation. Namely FCA builds concepts from a relation linking objects to the properties
they satisfy, which has applications in data mining, clustering and related fields, while
PoTh deals with the modeling of (graded) epistemic uncertainty. This difference of
focus explains why the two settings have been developed completely independently for
a very long time. However, it is possible to build a formal analogy between FCA and
PoTh. Both theories heavily rely on the comparison of sets, in terms of containment
or overlap. The four set-functions at work in PoTh actually determine all possible rel-
ative positions of two sets. Then the FCA operator defining the set of objects sharing
a set of properties, which is at the basis of the definition of formal concepts, appears
to be the counterpart of the set function expressing strong (or guaranteed) possibility
in PoTh. Then, it suggests that the three other set functions existing in PoTh should
also make sense in FCA, which leads to considering their FCA counterparts and new
fixed-point equations in terms of the new operators. One of these pairs of equations,
paralleling the one defining formal concepts, defines independent sub-contexts of
objects and properties that have nothing in common.
The parallel of FCA with PoTh can still be made more striking using a cube of op-
position (a device extending the traditional square of opposition existing in logic, and
exhibiting a structure at work in many theories aiming at representing some aspects
of the handling of information). The parallel of FCA with PoTh extends to conceptual
pattern structures, where objects may, e.g., be described by possibilistic knowledge
bases.
In the talk we shall indicate various issues pertaining to FCA that could be worth
studying in the future. For instance, the object-property links in formal contexts of
FCA may be a matter of degree. These degrees may refer to very different notions,
such as the degree of satisfaction of a gradual property, the degree of certainty that an
object has, or not, a property, or still the typicality of an object with respect to a set of
properties. These different intended semantics call for distinct manners of handling the
degrees, as advocated in the presentation. Lastly, applications of FCA to the mining of
association rules, to the fusion of conflicting pieces of information issued from multiple
sources, to clustering of sets of objects on the basis of approximate concepts, or to the
building of conceptual analogical proportions, will be discussed as other examples of
lines of interest for further research.
  Clarifying Lattice Structure by Data Polishing

                                    Takeaki Uno

                Institute of Informatics (NII) of Japan, Tokyo, Japan

Abstract. Concept lattices can be built from many kinds of data. We want to use
large-scale data for the construction, to capture wide and deep meanings, but the
construction usually yields a huge lattice that is impossible to handle. Several
techniques have been proposed to cope with this problem, but to the best of our
knowledge no algorithm attains good granularity, coverage, size distribution, and
independence of concepts at the same time. We believe this difficulty arises because
the concepts are not clear in the data, so a good approach is to clarify the concepts
in the data by some operations. In this direction, we propose “data polishing”, which
modifies the data according to feasible hypotheses so that the concepts become clear.
   Tractable Interesting Pattern Mining in Large
                      Networks

                                     Jan Ramon

                  University of Leuven – INRIA, Leuven, Belgium

Abstract. Pattern mining is an important data mining task. While the simplest
setting, itemset mining, has been thoroughly studied, real-world data is getting
increasingly complex and network-structured, e.g. in the context of social networks,
economic networks, traffic networks, administrative networks, chemical interaction
networks and biological regulatory networks.
This presentation will first provide an overview of graph pattern mining work, and will
then discuss two important questions. First, what is an interesting concept, and can
we obtain suitable mathematical properties to order concepts in some way, obtaining
a lattice or other exploitable structure?
Second, how can we extract collections of interesting patterns from network-structured
data in a computationally tractable way? In the case of graphs, having a lattice on
the class of patterns turns out to be insufficient for computational tractability. We
will discuss difficulties related to pattern matching and related to enumeration, and
additional difficulties arising when considering condensed pattern mining variants.
  Extended Dualization: Application to Maximal
                Pattern Mining

                                   Lhouari Nourine

                           Limos, Clermont-Ferrand, France

Abstract. Hypergraph dualization is a crucial step in many applications in logic,
databases, artificial intelligence and pattern mining, especially for hypergraphs or
boolean lattices. The objective of this talk is to study polynomial reductions of the
dualization problem on arbitrary posets to the dualization problem on boolean lattices,
for which output quasi-polynomial time algorithms exist.
The main application domain concerns pattern mining problems, i.e. the identification
of maximal interesting patterns in a database by asking membership queries (predicates)
to the database.
            Subset-generated complete sublattices
                     as concept lattices⋆

                           Martin Kauer and Michal Krupka

                             Department of Computer Science
                              Palacký University in Olomouc
                                      Czech Republic
                                  martin.kauer@upol.cz
                                 michal.krupka@upol.cz



          Abstract. We present a solution to the problem of finding the complete
          sublattice of a given concept lattice generated by a given set of elements.
          We construct the closed subrelation of the incidence relation of the cor-
          responding formal context whose concept lattice is equal to the desired
          complete sublattice. The construction does not require the presence of
          the original concept lattice. We introduce an efficient algorithm for the
          construction and give an example and experiments.


 1      Introduction and problem statement
 One of the basic theoretical results of Formal Concept Analysis (FCA) is the
 correspondence between closed subrelations of a formal context and complete
sublattices of the corresponding concept lattice [2]. In this paper, we study the
related problem of constructing the closed subrelation for a complete sublattice
generated by a given set of elements.
    Let ⟨X, Y, I⟩ be a formal context and B(X, Y, I) its concept lattice. Denote by
V the complete sublattice of B(X, Y, I) generated by a set P ⊆ B(X, Y, I).
As is known [2], there exists a closed subrelation J ⊆ I with the concept
lattice B(X, Y, J) equal to V. We show a method of constructing J without
the need of constructing B(X, Y, I) first. We also provide an efficient algorithm
(with polynomial time complexity) implementing the method. The paper also
 contains an illustrative example and results of experiments, performed on the
 Mushroom dataset from the UCI Machine Learning Repository.


 2      Complete lattices and Formal Concept Analysis
Recall that a partially ordered set U is called a complete lattice if each of its subsets
P ⊆ U has a supremum and an infimum. We denote these by ⋁P and ⋀P, respectively.
A subset V ⊆ U is a ⋁-subsemilattice (resp. ⋀-subsemilattice, resp. complete
sublattice) of U if for each P ⊆ V it holds that ⋁P ∈ V (resp. ⋀P ∈ V,
resp. {⋁P, ⋀P} ⊆ V).
 ⋆   Supported by the IGA of Palacký University Olomouc, No. PrF 2015 023


© paper author(s), 2015. Published in Sadok Ben Yahia, Jan Konecny (Eds.): CLA
 2015, pp. 11–21, ISBN 978–2–9544948–0–7, Blaise Pascal University, LIMOS
 laboratory, Clermont-Ferrand, 2015. Copying permitted only for private and
 academic purposes.

    For a subset P ⊆ U we denote by C⋁P the ⋁-subsemilattice of U generated
by P, i.e. the smallest (w.r.t. set inclusion) ⋁-subsemilattice of U containing P.
C⋁P always exists and is equal to the intersection of all ⋁-subsemilattices of
U containing P. The ⋀-subsemilattice of U generated by P and the complete
sublattice of U generated by P are defined similarly and are denoted by C⋀P
and C⋁⋀P, respectively.
    The operators C⋁, C⋀, and C⋁⋀ are closure operators on the set U. Recall
that a closure operator on a set X is a mapping C : 2^X → 2^X (where 2^X is the
set of all subsets of X) satisfying for all sets A, A1, A2 ⊆ X

 1. A ⊆ C A,
 2. if A1 ⊆ A2 then C A1 ⊆ C A2 ,
 3. CC A = C A.


   Concept lattices were introduced in [4]; our basic reference is [2]. A (formal)
context is a triple ⟨X, Y, I⟩ where X is a set of objects, Y a set of attributes,
and I ⊆ X × Y a binary relation between X and Y specifying for each object
which attributes it has.
   For subsets A ⊆ X and B ⊆ Y we set

               A↑I = {y ∈ Y | for each x ∈ A it holds ⟨x, y⟩ ∈ I},
               B↓I = {x ∈ X | for each y ∈ B it holds ⟨x, y⟩ ∈ I}.

The pair ⟨↑I, ↓I⟩ is a Galois connection between the sets X and Y, i.e. it satisfies

 1. If A1 ⊆ A2 then A2↑I ⊆ A1↑I; if B1 ⊆ B2 then B2↓I ⊆ B1↓I.
 2. A ⊆ A↑I↓I and B ⊆ B↓I↑I.

    The operator ↑I ↓I is a closure operator on X and the operator ↓I ↑I is a closure
operator on Y .
    A pair ⟨A, B⟩ satisfying A↑I = B and B↓I = A is called a (formal) concept
of ⟨X, Y, I⟩. The set A is then called the extent of ⟨A, B⟩, the set B the intent of
⟨A, B⟩. When there is no danger of confusion, we can use the term “an extent
of I” instead of “the extent of a concept of ⟨X, Y, I⟩”, and similarly for intents.
    A partial order ≤ on the set B(X, Y, I) of all formal concepts of ⟨X, Y, I⟩ is
defined by ⟨A1, B1⟩ ≤ ⟨A2, B2⟩ iff A1 ⊆ A2 (iff B2 ⊆ B1). B(X, Y, I) along with
≤ is a complete lattice and is called the concept lattice of ⟨X, Y, I⟩. Infima and
suprema in B(X, Y, I) are given by
        ⋀_{j∈J} ⟨Aj, Bj⟩ = ⟨ ⋂_{j∈J} Aj , ( ⋃_{j∈J} Bj )↓I↑I ⟩,                 (1)

        ⋁_{j∈J} ⟨Aj, Bj⟩ = ⟨ ( ⋃_{j∈J} Aj )↑I↓I , ⋂_{j∈J} Bj ⟩.                 (2)
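As an illustration of (1), the infimum of a family of concepts can be computed
by intersecting extents and closing the union of intents; the following sketch
(ours) reuses up and down from the sketch above:

    def concept_meet(concepts, X, Y, I):
        # infimum by (1): <intersection of extents, closure of union of intents>
        # concepts is a nonempty family of (extent, intent) pairs
        extent = set.intersection(*(set(A) for A, B in concepts))
        union_intents = set().union(*(B for A, B in concepts))
        intent = up(down(union_intents, X, Y, I), X, Y, I)
        return extent, intent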


One of the immediate consequences of (1) and (2) is that the intersection of any
system of extents, resp. intents, is again an extent, resp. intent, and that it can
be expressed as follows:

        ⋂_{j∈J} Bj = ( ⋃_{j∈J} Aj )↑I ,   resp.   ⋂_{j∈J} Aj = ( ⋃_{j∈J} Bj )↓I ,

for concepts ⟨Aj, Bj⟩ ∈ B(X, Y, I), j ∈ J.
    Concepts ⟨{y}↓I, {y}↓I↑I⟩ where y ∈ Y are attribute concepts. Each concept
⟨A, B⟩ is an infimum of some attribute concepts (we say the set of all attribute
concepts is ⋀-dense in B(X, Y, I)). More specifically, ⟨A, B⟩ is the infimum of the
attribute concepts ⟨{y}↓I, {y}↓I↑I⟩ for y ∈ B, and A = ⋂_{y∈B} {y}↓I.
    Dually, concepts ⟨{x}↑I↓I, {x}↑I⟩ for x ∈ X are object concepts; they are
⋁-dense in B(X, Y, I) and for each concept ⟨A, B⟩, B = ⋂_{x∈A} {x}↑I.

   A subrelation J ⊆ I is called a closed subrelation of I if each concept of
⟨X, Y, J⟩ is also a concept of ⟨X, Y, I⟩. There is a correspondence between closed
subrelations of I and complete sublattices of B(X, Y, I) [2, Theorem 13]: For
each closed subrelation J ⊆ I, B(X, Y, J) is a complete sublattice of B(X, Y, I),
and to each complete sublattice V ⊆ B(X, Y, I) there exists a closed subrelation
J ⊆ I such that V = B(X, Y, J).

3   Closed subrelations for generated sublattices
Let us have a context ⟨X, Y, I⟩ and a subset P of its concept lattice. Denote by
V the complete sublattice of B(X, Y, I) generated by P (i.e. V = C⋁⋀P). Our
aim is to find, without computing the lattice B(X, Y, I), the closed subrelation
J ⊆ I whose concept lattice B(X, Y, J) is equal to V.
    If B(X, Y, I) is finite, V can be obtained by alternating applications of the
closure operators C⋁ and C⋀ to P: we set V1 = C⋁P, V2 = C⋀V1, . . . , and,
generally, Vi = C⋁Vi−1 for odd i > 1 and Vi = C⋀Vi−1 for even i > 1. The
sets Vi are ⋁-subsemilattices (for odd i) resp. ⋀-subsemilattices (for even i) of
B(X, Y, I). Once Vi = Vi−1, we have the complete sublattice V.
    Note that for infinite B(X, Y, I), V can be infinite even if P is finite. Indeed,
denoting by FL(P) the free lattice generated by P [3] and setting X = Y = FL(P),
I = ≤, we have FL(P) ⊆ V ⊆ B(X, Y, I). (B(X, Y, I) is the Dedekind–MacNeille
completion of FL(P) [2], and we identify P and FL(P) with subsets of B(X, Y, I)
as usual.) Now, if |P| > 2 then FL(P) is infinite [3], and so is V.
    We always consider sets Vi together with the appropriate restriction of the
ordering on B(X, Y, I). For each i > 0, Vi is a complete lattice (but not a complete
sublattice of B(X, Y, I)).
    In what follows, we construct formal contexts with concept lattices isomor-
phic to the complete lattices Vi , i > 0. First, we find a formal context for the
complete lattice V1 . Let K1 ⊆ P × Y be given by
                          ⟨⟨A, B⟩, y⟩ ∈ K1    iff y ∈ B.                        (3)
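As a small illustration (ours), K1 can be materialized directly from P; a concept
is assumed here to be an (extent, intent) pair of frozensets:

    def make_k1(P):
        # K1 from (3): relate each concept in P to the attributes of its intent
        return {(c, y) for c in P for y in c[1]}

    c1 = (frozenset({'x1'}), frozenset({'y1', 'y4'}))
    K1 = make_k1({c1})  # relates c1 to y1 and to y4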


As we can see, rows in the context ⟨P, Y, K1⟩ are exactly the intents of concepts
from P.
Proposition 1. The concept lattice B(P, Y, K1) and the complete lattice V1 are
isomorphic. The isomorphism assigns to each concept ⟨B↓K1, B⟩ ∈ B(P, Y, K1)
the concept ⟨B↓I, B⟩ ∈ B(X, Y, I).
Proof. Concepts from V1 are exactly those with intents equal to intersections of
intents of concepts from P . The same holds for concepts from B(P, Y, K1 ).
    Next we describe formal contexts for the complete lattices Vi, i > 1. All of the
contexts are of the form ⟨X, Y, Ki⟩, i.e. they have the set X as the set of objects
and the set Y as the set of attributes (the relation K1 is different in this regard).
The relations Ki for i > 1 are defined in a recursive manner:

    for i > 1,  ⟨x, y⟩ ∈ Ki  iff  x ∈ {y}↓Ki−1↑Ki−1↓I  (for even i),
                                  y ∈ {x}↑Ki−1↓Ki−1↑I  (for odd i).             (4)
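The following Python sketch (ours; next_k and the parameter names are
illustrative) computes one step of recursion (4), reusing up and down from the
earlier sketch; G_prev is the object set of the previous context (P for K1, X
afterwards):

    def next_k(K_prev, G_prev, X, Y, I, i):
        # one step of (4): compute K_i from K_{i-1}
        K = set()
        for x in X:
            for y in Y:
                if i % 2 == 0:   # even i: x in {y}^{dK uK dI}
                    c = up(down({y}, G_prev, Y, K_prev), G_prev, Y, K_prev)
                    if x in down(c, X, Y, I):
                        K.add((x, y))
                else:            # odd i: y in {x}^{uK dK uI}
                    c = down(up({x}, G_prev, Y, K_prev), G_prev, Y, K_prev)
                    if y in up(c, X, Y, I):
                        K.add((x, y))
        return K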

Proposition 2. For each i > 1,
1. Ki ⊆ I,
2. Ki ⊆ Ki+1 .
Proof. We will prove part 1 for even i and part 2 for odd i; the remaining cases
are proved similarly.
    1. Let ⟨x, y⟩ ∈ Ki. From {y} ⊆ {y}↓Ki−1↑Ki−1 we get {y}↓Ki−1↑Ki−1↓I ⊆
{y}↓I. Thus, x ∈ {y}↓Ki−1↑Ki−1↓I implies x ∈ {y}↓I, which is equivalent to
⟨x, y⟩ ∈ I.
    2. As Ki ⊆ I, we have {y}↓Ki↑Ki↓I ⊇ {y}↓Ki↑Ki↓Ki = {y}↓Ki. Thus, x ∈
{y}↓Ki yields x ∈ {y}↓Ki↑Ki↓I.
    We can see that the definitions of Ki for even and odd i > 1 are dual. In
what follows, we prove properties of Ki for even i and give the versions for odd
i without proofs.
    First we give two basic properties of Ki that are equivalent to the definition.
The first one says that Ki can be constructed as a union of some specific
rectangles; the second one will be used frequently in what follows.
Proposition 3. Let i > 1.

1. If i is even then Ki = ⋃_{y∈Y} {y}↓Ki−1↑Ki−1↓I × {y}↓Ki−1↑Ki−1. If i is odd then
   Ki = ⋃_{x∈X} {x}↑Ki−1↓Ki−1↑I × {x}↑Ki−1↓Ki−1.
2. If i is even then for each y ∈ Y, {y}↓Ki = {y}↓Ki−1↑Ki−1↓I. If i is odd then
   for each x ∈ X, {x}↑Ki = {x}↑Ki−1↓Ki−1↑I.
Proof. We will prove only the assertions for even i.
    1. The “⊆” inclusion is evident. We will prove the converse inclusion. If
⟨x, y⟩ ∈ ⋃_{y′∈Y} {y′}↓Ki−1↑Ki−1↓I × {y′}↓Ki−1↑Ki−1 then there is y′ ∈ Y such that
x ∈ {y′}↓Ki−1↑Ki−1↓I and y ∈ {y′}↓Ki−1↑Ki−1. The latter implies {y}↓Ki−1↑Ki−1 ⊆
{y′}↓Ki−1↑Ki−1, whence {y′}↓Ki−1↑Ki−1↓I ⊆ {y}↓Ki−1↑Ki−1↓I. Thus, x belongs to
{y}↓Ki−1↑Ki−1↓I and by definition, ⟨x, y⟩ ∈ Ki.
    2. Follows directly from the obvious fact that x ∈ {y}↓Ki if and only if
⟨x, y⟩ ∈ Ki.
   A direct consequence of 2. of Prop. 3 is the following.
Proposition 4. If i is even then each extent of Ki is also an extent of I. If i
is odd then each intent of Ki is also an intent of I.
Proof. Let i be even. 2. of Prop. 3 implies that each attribute extent of Ki is an
extent of I. Thus, the proposition follows from the fact that each extent of Ki
is an intersection of attribute extents of Ki .
    The statement for odd i is proved similarly except for i = 1 where it follows
by definition.

Proposition 5. Let i > 1. If i is even then for each y ∈ Y it holds

                      {y}↓Ki−1 ↑Ki−1 = {y}↓Ki ↑Ki = {y}↓Ki ↑I .

If i is odd then for each x ∈ X we have

                      {x}↑Ki−1 ↓Ki−1 = {x}↑Ki ↓Ki = {x}↑Ki ↓I .

Proof. We will prove the assertion for even i. By Prop. 4, {y}↓Ki is an extent
of I. The corresponding intent is

                  {y}↓Ki ↑I = {y}↓Ki−1 ↑Ki−1 ↓I ↑I = {y}↓Ki−1 ↑Ki−1                 (5)

(by Prop. 4, {y}↓Ki−1 ↑Ki−1 is an intent of I). Moreover, as Ki ⊆ I (Prop. 2), we
have

                               {y}↓Ki ↑Ki ⊆ {y}↓Ki ↑I .                             (6)

We prove {y}↓Ki−1↑Ki−1 ⊆ {y}↓Ki↑Ki. Let y′ ∈ {y}↓Ki−1↑Ki−1. It holds

                          {y′}↓Ki−1↑Ki−1 ⊆ {y}↓Ki−1↑Ki−1

(↓Ki−1↑Ki−1 is a closure operator). Thus, {y}↓Ki−1↑Ki−1↓I ⊆ {y′}↓Ki−1↑Ki−1↓I
and so by 2. of Prop. 3, {y}↓Ki ⊆ {y′}↓Ki. Applying ↑Ki to both sides we obtain
{y′}↓Ki↑Ki ⊆ {y}↓Ki↑Ki, proving y′ ∈ {y}↓Ki↑Ki.
     This, together with (5) and (6), proves the proposition.

Proposition 6. Let i > 1 be even. Then for each intent B of Ki−1 it holds
B↓Ki = B↓I. Moreover, if B is an attribute intent (i.e. there is y ∈ Y such that
B = {y}↓Ki−1↑Ki−1) then ⟨B↓Ki, B⟩ is a concept of I.
    If i > 1 is odd then for each extent A of Ki−1 it holds A↑Ki = A↑I. If A is an
object extent (i.e. there is x ∈ X such that A = {x}↑Ki−1↓Ki−1) then ⟨A, A↑Ki⟩
is a concept of I.


Proof. We will prove the assertion for even i. Let B be an intent of Ki−1. It holds
B = ⋃_{y∈B} {y} (obviously) and hence B = ⋃_{y∈B} {y}↓Ki−1↑Ki−1 (since ↓Ki−1↑Ki−1
is a closure operator). Therefore (2. of Prop. 3),

    B↓Ki = ( ⋃_{y∈B} {y} )↓Ki = ⋂_{y∈B} {y}↓Ki = ⋂_{y∈B} {y}↓Ki−1↑Ki−1↓I
         = ( ⋃_{y∈B} {y}↓Ki−1↑Ki−1 )↓I = B↓I,

proving the first part.
    Now let B be an attribute intent of Ki−1, B = {y}↓Ki−1↑Ki−1. By 2. of Prop. 3
it holds B↓I = {y}↓Ki. By Prop. 5, B↓I↑I = {y}↓Ki↑I = {y}↓Ki−1↑Ki−1 = B.
   Now we turn to complete lattices Vi defined above. We have already shown
in Prop. 1 that the complete lattice V1 and the concept lattice B(P, Y, K1 ) are
isomorphic. Now we give a general result for i > 0.
Proposition 7. For each i > 0, the concept lattice B(P, Y, Ki) (for i = 1)
resp. B(X, Y, Ki) (for i > 1) and the complete lattice Vi are isomorphic. The
isomorphism is given by ⟨B↓Ki, B⟩ ↦ ⟨B↓I, B⟩ if i is odd and by ⟨A, A↑Ki⟩ ↦
⟨A, A↑I⟩ if i is even.
Proof. We will proceed by induction on i. The base step i = 1 has already been
proved in Prop. 1. We will do the induction step for even i; the other case is
dual.
   As Vi = C⋀Vi−1, we have to

 1. show that the set W = {⟨A, A↑I⟩ | A is an extent of Ki} is a subset of
    B(X, Y, I) containing Vi−1, and
 2. find for each ⟨A, A↑Ki⟩ ∈ B(X, Y, Ki) a set of concepts from Vi−1 whose
    infimum in B(X, Y, I) has extent equal to A.

    1. By Prop. 4, each extent of Ki is also an extent of I. Thus, W ⊆ B(X, Y, I).
If ⟨A, B⟩ ∈ Vi−1 then by the induction hypothesis B is an intent of Ki−1 (i − 1
is odd). By Prop. 6, B↓Ki = B↓I = A is an extent of Ki and so ⟨A, B⟩ ∈ W.
    2. Denote B = A↑Ki. For each y ∈ Y, {y}↓Ki−1↑Ki−1 is an intent of Ki−1. By
Prop. 3 and the induction hypothesis,

    ⟨{y}↓Ki, {y}↓Ki−1↑Ki−1⟩ = ⟨{y}↓Ki−1↑Ki−1↓I, {y}↓Ki−1↑Ki−1⟩ ∈ Vi−1.

Now, the extent of the infimum (taken in B(X, Y, I)) of these concepts for y ∈ B
is equal to ⋂_{y∈B} {y}↓Ki = B↓Ki = A.
    If X and Y are finite then 2. of Prop. 2 implies there is a number n > 1
such that Kn+1 = Kn . Denote this relation by J. According to Prop. 7, there
are two isomorphisms of the concept lattice B(X, Y, J) and Vn = Vn+1 = V . We
will show that these two isomorphisms coincide and B(X, Y, J) is actually equal
to V . This will also imply J is a closed subrelation of I.


Proposition 8. B(X, Y, J) = V .

Proof. Let ⟨A, B⟩ ∈ B(X, Y, J). It suffices to show that ⟨A, B⟩ ∈ B(X, Y, I). As
J = Kn+1 = Kn, we have J = Ki for some even i and also J = Ki for some odd i.
We can therefore apply both parts of Prop. 6 to J, obtaining A = B↓J = B↓I
and B = A↑J = A↑I.

   Algorithm 1 uses our results to compute the subrelation J for given ⟨X, Y, I⟩
and P.


Algorithm 1 Computing the closed subrelation J.
Input: formal context ⟨X, Y, I⟩, subset P ⊆ B(X, Y, I)
Output: the closed subrelation J ⊆ I whose concept lattice is equal to C⋁⋀P
  J ← relation K1 (3)
  i←1
  repeat
     L←J
     i←i+1
     if i is even then
         J ← {⟨x, y⟩ ∈ X × Y | x ∈ {y}↓L↑L↓I}
     else
         J ← {⟨x, y⟩ ∈ X × Y | y ∈ {x}↑L↓L↑I}
     end if
  until i > 2 and J = L
  return J
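A direct Python transcription of Algorithm 1 might look as follows; this is our
sketch, reusing make_k1 and next_k from the earlier sketches:

    def closed_subrelation(X, Y, I, P):
        # Algorithm 1: closed subrelation J whose concept lattice is
        # the complete sublattice generated by P
        J = make_k1(P)       # K1 from (3); objects of this context are P
        G_prev = set(P)
        i = 1
        while True:
            L, i = J, i + 1
            J = next_k(L, G_prev, X, Y, I, i)
            G_prev = set(X)  # from K2 on, the objects are X
            if i > 2 and J == L:
                return J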




Proposition 9. Algorithm 1 is correct and terminates after at most max(|I| +
1, 2) iterations.

Proof. Correctness follows from Prop. 8. The terminating condition ensures we
compare J and L only when they are both subrelations of the context ⟨X, Y, I⟩
(after the first iteration, L is a subrelation of ⟨P, Y, K1⟩ and the comparison
would not make sense).
    After each iteration, L holds the relation Ki−1 and J holds Ki (4). Thus,
except for the first iteration, we have L ⊆ J before the algorithm enters the
terminating condition (Prop. 2). As J is always a subset of I (Prop. 2), the
number of iterations will not be greater than |I| + 1. The only exception is
I = ∅. In this case, the algorithm will terminate after 2 steps due to the first
part of the terminating condition.


4   Examples and experiments

Let ⟨X, Y, I⟩ be the formal context from Fig. 1 (left). The associated concept
lattice B(X, Y, I) is depicted in Fig. 1 (right). Let P = {c1, c2, c3} where



                 I    y1  y2  y3  y4  y5
                 x1   ×           ×
                 x2   ×   ×   ×
                 x3           ×       ×
                 x4           ×
                 x5       ×

[lattice diagram not reproducible in text]

Fig. 1: Formal context ⟨X, Y, I⟩ (left) and concept lattice B(X, Y, I), together
with a subset P ⊆ B(X, Y, I), depicted by filled dots (right).


c1 = ⟨{x1}, {y1, y4}⟩, c2 = ⟨{x1, x2}, {y1}⟩, c3 = ⟨{x2, x5}, {y2}⟩ are concepts
from B(X, Y, I). These concepts are depicted in Fig. 1 by filled dots.
    First, we construct the context ⟨P, Y, K1⟩ (3). Rows in this context are intents
of concepts from P (see Fig. 2, left). The concept lattice B(P, Y, K1) (Fig. 2,
center) is isomorphic to the ⋁-subsemilattice V1 = C⋁P ⊆ B(X, Y, I) (Fig. 2,
right). It is easy to see that elements of B(P, Y, K1) and corresponding elements
of V1 have the same intents.




     K1   y1  y2  y3  y4  y5
     c1   ×           ×
     c2   ×
     c3       ×

[lattice diagrams not reproducible in text]

Fig. 2: Formal context ⟨P, Y, K1⟩ (left), the concept lattice B(P, Y, K1) (center)
and the ⋁-subsemilattice C⋁P ⊆ B(X, Y, I), isomorphic to B(P, Y, K1), depicted
by filled dots (right).


     The next step is to construct the subrelation K2 ⊆ I. By (4), K2 consists of
elements ⟨x, y⟩ ∈ X × Y satisfying x ∈ {y}↓K1↑K1↓I. The concept lattice B(X, Y, K2)
is isomorphic to the ⋀-subsemilattice V2 = C⋀V1 ⊆ B(X, Y, I). K2, B(X, Y, K2),
and V2 are depicted in Fig. 3.
     The subrelation K3 ⊆ I is computed again by (4). K3 consists of elements
⟨x, y⟩ ∈ X × Y satisfying y ∈ {x}↑K2↓K2↑I. The result can be viewed in Fig. 4.



     K2   y1  y2  y3  y4  y5
     x1   ×           ×
     x2   ×   ×   ·
     x3           ·       ·
     x4           ·
     x5       ×

[lattice diagrams not reproducible in text]

Fig. 3: Formal context ⟨X, Y, K2⟩ (left), the concept lattice B(X, Y, K2) (center)
and the ⋀-subsemilattice V2 = C⋀V1 ⊆ B(X, Y, I), isomorphic to B(X, Y, K2),
depicted by filled dots (right). Elements of I \ K2 are depicted by dots in the
table.



     K3   y1  y2  y3  y4  y5
     x1   ×           ×
     x2   ×   ×   ×
     x3           ·       ·
     x4           ·
     x5       ×

[lattice diagrams not reproducible in text]

Fig. 4: Formal context ⟨X, Y, K3⟩ (left), the concept lattice B(X, Y, K3) (center)
and the ⋁-subsemilattice V3 = C⋁V2 ⊆ B(X, Y, I), isomorphic to B(X, Y, K3),
depicted by filled dots (right). Elements of I \ K3 are depicted by dots in the
table. As K3 = K4 = J, it is a closed subrelation of I and V4 = C⋀V3 = V3 is
a complete sublattice of B(X, Y, I).


   Notice that already V3 = V2 but K3 ≠ K2. We cannot stop and have to
perform another step. After computing K4 we can easily check that K4 = K3 .
We thus obtained the desired closed subrelation J ⊆ I and V4 = V3 is equal to
the desired complete sublattice V ⊆ B(X, Y, I).


    In [1], the authors present an algorithm for computing a sublattice of a given
lattice generated by a given set of elements. Originally, we planned to include
a comparison between their approach and our Alg. 1. Unfortunately, the algorithm
in [1] turned out to be incorrect. It is based on the false claim that (using
our notation) the smallest element of V which is greater than or equal to an
element v ∈ B(X, Y, I) is equal to ⋀{p ∈ P | p ≥ v}. The algorithm from [1]
fails e.g. on the input depicted in Fig. 5.



[lattice diagram not reproducible in text: p2 at the top; p1, v, p3 below]

Fig. 5: An example showing that the algorithm from [1] is incorrect. A complete
lattice with a selected subset P = {p1, p2, p3}. The least element of the sublattice
V generated by P which is greater than or equal to v is p1 ∨ v. The algorithm
incorrectly chooses p2 and “forgets” to add p1 ∨ v to the output.


     The time complexity of our algorithm is clearly polynomial w.r.t. |X| and
|Y|. In Prop. 9 we proved that the number of iterations is O(|I|). Our experiments
indicate that this number might be much smaller in practice. We used
the Mushroom dataset from the UC Irvine Machine Learning Repository, which
contains 8124 objects, 119 attributes and 238710 concepts. For 39 different sizes
of the set P, we selected its elements randomly, 1000 times for each of the sizes.
For each P, we ran our algorithm and measured the number n of iterations after
which the algorithm terminated. Tbl. 1 shows the maximal and average values
of n, separately for each size of P.


 |P |(%)   Max n     Avg n    |P |(%)   Max n    Avg n     |P |(%)   Max n    Avg n
  0.005     11         7       0.25       6        3        0.90       5        3
  0.010     10         6       0.30       6        3        0.95       4        3
  0.015     10         5       0.35       6        3           1       4        3
  0.020     10         5       0.40       5        3           2       4        3
  0.025      8         5       0.45       5        3           3       4        3
  0.030      8         4       0.50       5        3           4       4        3
  0.035      8         4       0.55       6        3           5       4        2
  0.040      7         4       0.60       5        3           6       4        2
  0.045     10         4       0.65       4        3           7       4        2
  0.050      8         4       0.70       5        3           8       3        2
  0.100      6         4       0.75       6        3           9       3        2
  0.150      6         4       0.80       6        3         10        3        2
  0.200      6         4       0.85       4        3         11        3        2
Table 1: Results of experiments on Mushrooms dataset. The size of P is given
by the percentage of the size of the concept lattice.



From the results in Tbl. 1 we can see that the number of iterations (both maximal
and average values) is very small compared to the number of objects and attributes.
There is also an apparent decreasing trend in the number of iterations as the size
of P increases.


5    Conclusion and open problems

An obvious advantage of our approach is that we avoid computing the whole con-
cept lattice B(X, Y, I). This should lead to shorter computation time, especially
if the generated sublattice V is substantially smaller than B(X, Y, I).
    The following is an interesting observation and an open problem. It is men-
tioned in [2] that the system of all closed subrelations of I is not a closure
system and, consequently, there does not exist a closure operator assigning to
each subrelation of I a least greater (w.r.t. set inclusion) closed subrelation.
This is indeed true as the intersection of closed subrelations need not be a closed
subrelation. However, our method can be easily modified to compute for any
subrelation K ⊆ I a closed subrelation J ⊇ K, which seems to be minimal in
some sense. Indeed, we can set K1 = K and compute a relation J as described
by Alg. 1, regardless of the fact that K does not satisfy our requirements (intents
of K need not be intents of I). The relation J will be a closed subrelation of I
and it will contain K as a subset. Also note that the dual construction leads to
a different closed subrelation.
    Another open problem is whether it is possible to improve the estimation of
the number of iterations of Alg. 1 from Prop. 9. In fact, we were not able to
construct any example with the number of iterations greater than min(|X|, |Y |).


References
1. Bertet, K., Morvan, M.: Computing the sublattice of a lattice generated by a set of
   elements. In: Proceedings of Third International Conference on Orders, Algorithms
   and Applications. Montpellier, France (1999)
2. Ganter, B., Wille, R.: Formal Concept Analysis – Mathematical Foundations.
   Springer (1999)
3. Whitman, P.M.: Free lattices II. Annals of Mathematics 43(1), pp. 104–115 (1942)
4. Wille, R.: Restructuring lattice theory: an approach based on hierarchies of concepts.
   In: Rival, I. (ed.) Ordered Sets, pp. 445–470. Boston (1982)
      RV-Xplorer: A Way to Navigate Lattice-Based
               Views over RDF Graphs

                Mehwish Alam, Amedeo Napoli, and Matthieu Osmuk

            LORIA (CNRS – Inria Nancy Grand Est – Université de Lorraine)
                    BP 239, Vandoeuvre-lès-Nancy, F-54506, France
               {mehwish.alam,amedeo.napoli,matthieu.osmuk}@loria.fr



          Abstract. More and more data are being published in the form of
          machine-readable RDF graphs over the Linked Open Data (LOD) Cloud,
          accessible through SPARQL queries. This study provides interactive
          navigation of RDF graphs obtained by SPARQL queries using Formal
          Concept Analysis. With the help of the VIEW BY clause, a concept
          lattice is created as an answer to the SPARQL query, which can then be
          visualized and navigated using RV-Xplorer (Rdf View eXplorer).
          Accordingly, this paper discusses the support provided to the expert
          for answering certain questions through the navigation strategies
          provided by RV-Xplorer. Moreover, the paper also provides a comparison
          with existing state-of-the-art approaches.


 Keywords: RV-Xplorer, Lattice Navigation, SPARQL Query Views, Formal
 Concept Analysis


 1      Introduction
 Recently, the Web is turning into a “Web of Data” which contains the metadata
 about the web documents present in HTML and textual format. The goal
 behind this “Web of Data” is to make already existing data usable by
 not only human agents but also by machine agents. With the effort of the Semantic
 Web community, an emerging source of metadata is published on-line, called
 Linked Open Data (LOD), in the form of RDF data graphs. There has been
 a huge explosion of LOD in the recent past and it is still growing; as of 2014,
 LOD contained billions of triples. SPARQL1 is the standard query language for
 accessing RDF graphs. It integrates several resources to generate the required
 answers. For instance, queries such as What are the movements of the artists
 displayed in Musee du Louvre? cannot be answered by standard search engines.
 Nowadays, Google has introduced a way of answering questions directly, such
 as currency conversion, calculation, etc., but such queries are answered based on
 the most frequent queries posed by the experts.
     When an expert poses a query to a search engine, too many results are
 retrieved for the expert to navigate through, which may be cumbersome when an
  1
      http://www.w3.org/TR/rdf-sparql-query/


© paper author(s), 2015. Published in Sadok Ben Yahia, Jan Konecny (Eds.): CLA
 2015, pp. 23–34, ISBN 978–2–9544948–0–7, Blaise Pascal University, LIMOS
 laboratory, Clermont-Ferrand, 2015. Copying permitted only for private and
 academic purposes.


expert has to go through a number of links to find the interesting ones, hence
leading to the problem of information overload [3]. The same is the case with the
answers obtained by a SPARQL query with the SELECT clause [8]: even if there are
hundreds of answers, it becomes hard for the expert to find the interesting patterns.
The current study is a continuation of Lattice-Based View Access (LBVA) [1], which
provides a view over RDF graphs through SPARQL queries to give a complete
understanding of the part of an RDF graph that the expert wants to analyze, with
the help of Formal Concept Analysis. LBVA takes the SPARQL query and returns a
concept lattice, called a view, instead of the raw results of the SPARQL query. The
views created by LBVA are machine as well as human processable. Accordingly,
RV-Xplorer (Rdf View eXplorer) exploits the powerful mathematical structure
of these concept lattices, thus making them interpretable by humans. It also allows
human agents to interact with the concept lattice and perform navigation. The
expert can answer various questions while navigating the concept lattice.
RV-Xplorer provides several ways to guide the expert during this navigation process.
    This paper is structured as follows: section 2 gives the motivating example and
section 3 introduces the background knowledge required to understand the rest
of the paper. Section 4 details the elements of the Graphical User Interface, while
sections 5 and 6 detail the navigation operations as well as other functionalities
supported by RV-Xplorer. Section 7 briefly discusses the related work.
Finally, section 8 concludes the paper and discusses future work.


2    Motivating Example
Consider a scenario where an expert wants to pose the following questions based on
articles published in conferences or journals by a team working on data mining.
In the current study, we extract the papers published by the “Orpailleur Team” in
LORIA, Nancy, France. The following are questions in which an expert may be
interested:

 – What are the main research topics in the team and the key researchers
   w.r.t. these topics, for example, researchers involved in most of the papers
   in a prominent topic?
 – What is the major area of the research of the leader of the team and various
   key persons?
 – Can the diversity of the team leader and key persons be detected?
 – Given a paper, is it possible to retrieve similar papers published in the team?
 – Who are the groups of persons working together?
 – What are the research tendencies and possibly the forthcoming and new
   research topics (for example, single and recent topics which are not in the
   continuation of the present topics)?

   Such questions cannot be answered by Google. In this paper we want to
answer them through the lattice navigation supported by RV-Xplorer: the lattice
is built from an initial query and then explored by the expert according to her
preferences.


3      Preliminaries
Linked Open Data: Linked Open Data (LOD) [2] is a way of publishing
structured data in the form of RDF graphs. Given a set of URIs U, blank nodes
B and literals L, an RDF triple is represented as t = (s, p, o) ∈ (U ∪ B) × U ×
(U ∪ B ∪ L), where s is a subject, p is a predicate and o is an object. A finite
set of RDF triples is called an RDF graph G such that G = (V, E), where V is
a set of vertices and E is a set of labeled edges. Each pair of vertices connected
through a labeled edge records the information of a statement. Each statement
is represented as ⟨subject, predicate, object⟩, referred to as an RDF triple. V
includes subjects and objects while E includes the predicates.
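As a small illustration (ours, not from the paper), an RDF triple and graph can
be represented in Python with the rdflib library, assuming it is installed:

    from rdflib import Graph, URIRef, Literal

    g = Graph()
    s = URIRef("http://example.org/paper1")          # subject
    p = URIRef("http://purl.org/dc/terms/creator")   # predicate
    o = Literal("Jane Doe")                          # object
    g.add((s, p, o))   # one RDF triple <subject, predicate, object>
    print(len(g))      # number of triples in the graph: 1

The same Graph object can then be queried with g.query() using a SPARQL
string such as the one in Listing 1.1.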

SPARQL: A standard query language for RDF graphs is SPARQL2, which
mainly focuses on graph matching. A SPARQL query is composed of two parts:
the head and the body. The body of the query contains the Basic Graph Patterns
(present in the WHERE clause of the query). These graph patterns are matched
against the RDF graph, and the matched graph is retrieved and manipulated
according to the conditions given in the query. The head of the query is an
expression which indicates how the answers of the query should be constructed.
    Let us consider a query from the scenario in section 2, Q = Who is the team
leader of the data mining team in loria. For answering such questions consider
an RDF resource containing all the papers ever published in the data mining
team. With the help of SPARQL query the papers published in the last 5 years
in English language can be extracted. The SPARQL representation of the query
Q is shown in listing 1.1. Lines 1, 2 keep the information about the prefixes used
in the rest of the query. Line 5, 6 and 7 retrieve all the papers with their authors
and keywords. Line 8 and 9 retrieve the publication year of the paper and filter
according to the condition.
1 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
2 PREFIX dc: <http://purl.org/dc/terms/>

3 SELECT distinct ?title ?keywords ?author
4 WHERE {
5   ?paper dc:creator ?author .
6   ?paper dc:subject ?keywords .
7   ?paper dc:title ?title .
8   ?paper dc:issued ?publicationYear
9   FILTER (xsd:date(?publicationYear) >= '2011-01-01'^^xsd:date) }

                        Listing 1.1: SPARQL for extracting triples.


Lattice-Based View Access: Lattice-Based View Access (LBVA) [1] allows the
classification of SPARQL query results into a concept lattice, referred to as a view,
for data analysis, navigation, knowledge discovery and information retrieval
purposes. It introduces a new clause, VIEW BY, which enhances the functionality of
the existing GROUP BY clause of SPARQL by adding sophisticated classification
and knowledge discovery aspects.


    The variable appearing in the VIEW BY clause of the SPARQL query is
referred to as the object variable (the object here refers to an object in FCA);
the rest of the variables are the attribute variables. The answer tuples obtained
by the query are then processed based on the object and attribute variables.
The values obtained for the object variable are mapped to the objects of a formal
context K = (G, M, I), and the answers obtained for the attribute variables are
mapped to the attributes of the context. Consider the query given in Listing 1.1
with classification capabilities, i.e., containing the clause VIEW BY ?title. The
set of variables in the SELECT clause is V = {?title, ?keywords, ?author}; the
object variable is ?title and the attribute variables are ?keywords and ?author.
After applying LBVA, the objects are the titles of the papers and the attributes
are the keywords and authors. From this context, a concept lattice is built, which
is referred to as a lattice-based view.
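    As an illustration, the following minimal sketch (hypothetical Python, not part of the LBVA implementation) shows how answer tuples of the form (?title, ?keywords, ?author) could be turned into a formal context with ?title as the object variable:

    from collections import defaultdict

    def build_context(tuples, object_index=0):
        # Map the object variable to objects G; all other variables become attributes M.
        incidence = defaultdict(set)  # the relation I, stored as g -> set of attributes
        for row in tuples:
            g = row[object_index]
            for i, value in enumerate(row):
                if i != object_index:
                    incidence[g].add(value)
        G = sorted(incidence)
        M = sorted(set().union(*incidence.values()))
        return G, M, incidence

    # Hypothetical answer tuples (?title, ?keywords, ?author):
    answers = [
        ("Paper A", "Formal Concept Analysis", "Amedeo Napoli"),
        ("Paper B", "Pattern Structures", "Mehwish Alam"),
        ("Paper B", "Formal Concept Analysis", "Amedeo Napoli"),
    ]
    G, M, I = build_context(answers)  # G = ['Paper A', 'Paper B']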
    LBVA is oriented towards the classification of SPARQL query answers, but
the present research activity can be interpreted at a more general level as the
classification of LOD. Accordingly, what is proposed in this paper is a tool for
navigating a classification of LOD.


4      The RV-Xplorer

RV-Xplorer (Rdf View eXplorer) is a tool for navigating concept lattices generated
from the answers of SPARQL queries over parts of RDF graphs using Lattice-Based
View Access. Accordingly, this tool lets the expert navigate the classification of
SPARQL query answers to analyze the data, find hidden regularities and answer
several questions. At each navigation step it guides the expert in making decisions,
helping her avoid unnecessary selections. It also allows the user to change her point
of view while navigating, i.e., navigation by extent. Moreover, it allows the expert
to focus on a specific and interesting part of the concept lattice by hiding the parts
of the lattice which are not interesting for her.
    RV-Xplorer is a web-based tool for building and navigating concept lattices.
On the client side it uses D3.js (Data-Driven Documents), a JavaScript library for
developing interactive data visualizations in modern web browsers. It also follows
the model-view-controller (MVC) pattern, which separates presentation, data and
logic. On the server side we use PHP and MySQL for computing and storing the
data. In general, the data can be a graph or patterns generated by pattern mining
algorithms. Currently, the tool is not publicly available.
    Figure 1 shows the overall interface of RV-Xplorer, which consists of three
parts: (1) the middle part, called the local view, shows a detailed description of
the selected concept and allows interaction and level-wise navigation, (2) the left
panel, referred to as the Spy, shows a global view of the concept lattice, and
(3) the lower left part is the summarization index, which guides the expert in
deciding which node to choose in the next level by showing statistics about that
level. For the running scenario, the concept lattice is also available on-line at
http://rv-xplorer.loria.fr/#/graph/orpailleur_paper/1/.


4.1    Local View

Each selected node of the concept lattice is shown in the middle part of the
interface with its complete information. Let c be the selected concept, c ∈ C,
where C is the set of concepts of the complete lattice L = (C, ≤); the local view
then shows the complete information about this concept, i.e., its extent, its
intent and the links to its super-concepts and sub-concepts. The super- and
sub-concepts are linked to the selected node, each link representing the partial
order ≤. By default, the top node is the selected node shown in the local view.
    In Figure 1 (below), the orange part gives the label of the selected node,
which is the entry point of the concept, while the pink and yellow parts give the
labels of the super-concepts and sub-concepts connected to the selected concept,
respectively. The green and blue parts give the intent and the extent,
respectively.


4.2    Spy

The global view in the left panel shows a map of the complete lattice L = (C, ≤)
for a particular SPARQL query over an RDF graph. It tracks the position of the
expert in the concept lattice and the path she followed to reach the current
concept. It also supports several navigation tasks such as direct navigation,
changing the navigation space and navigating between points of view. All of
these navigation modes are discussed in Section 5.


4.3    Statistics about the next level

The statistics about the next level are computed with the help of a summarization
index, which depicts how the objects in the extent of the selected concept are
distributed over the linked sub-concepts, i.e., the concepts in the next level of the
concept lattice. Let c_i be a concept in the next level, where i ∈ {1, . . . , n} and
n is the number of concepts in the next level; ext(c_i) is the extent of this concept
and |ext(c_i)| is its size. The summarization index is then computed as

               summarization index = ( |ext(c_i)| / Σ_{j=1}^{n} |ext(c_j)| ) × 100          (1)
    Here, Σ_{j=1}^{n} |ext(c_j)| is the sum of the extent sizes of all the concepts in
the next level, so the summarization indices of all the sub-concepts add up to
100%. In Figure 1, the percentages are represented in the form of a pie chart
which shows this distribution. The sub-concept containing the most elements in
its extent has the highest percentage and hence the biggest slice of the pie chart.
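    As a minimal sketch (assuming extents are available as Python sets; this is not the tool's actual PHP implementation), the summarization index of equation (1) can be computed as follows:

    def summarization_index(extents):
        # Distribution of the selected concept's objects over the next level,
        # as in equation (1); the returned percentages add up to 100.
        total = sum(len(ext) for ext in extents)
        return [100.0 * len(ext) / total for ext in extents]

    # Hypothetical extents of the sub-concepts in the next level:
    next_level = [{"p1", "p2", "p3"}, {"p2", "p4"}, {"p5"}]
    print(summarization_index(next_level))  # [50.0, 33.33..., 16.66...]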




Fig. 1: The figure above shows the basic interface of RV-Xplorer displaying the top
concept. The figure below shows the local view of K#52, the concept containing all
the papers authored by Amedeo Napoli.


5     Navigation Operations

In this section we detail some of the classical [4] as well as more advanced
navigation operations implemented in RV-Xplorer. Navigation is performed
locally in the local view and mirrored globally in the global view. Navigation
operations allow the expert to locate particular pieces of information, which
helps in answering the expert's questions as well as in analyzing the data at
hand. Initially, the selected concept is the top concept, which contains all the
objects.


5.1   Guided Downward (Drill-down) / Upward (Roll-up) Navigation

The local view provides the expert with the drill-down operation, which is achieved
by selecting one of the sub-concepts given in the yellow part of the local view.
RV-Xplorer guides the expert in drilling down the concept lattice by showing, on
mouse-over, the contents of a sub-concept before the node is selected. Further
guidance is provided by the summarization index, which gives statistics about
the next level. This way the expert can avoid attributes or navigation paths
which may lead to uninteresting results. The local view also allows the expert
to roll up from a specific concept to a more general one: a super-concept can be
selected by following the corresponding link in the view.
    Consider the running scenario discussed in Section 2, where the expert wants
to know who the most influential researchers in the team are by analyzing the
publications of this particular team. Initially, the selected concept in the local
view is the top concept (see Figure 1 (above)). The summarization index shows
that most of the papers are contained in K#52, and hovering over K#52 reveals
that this concept keeps all the papers published by Amedeo Napoli. From here
it can safely be concluded that Amedeo Napoli is the leader of the team.
Similarly, several key team members, such as supervisors, can be identified at
the same level. If the expert wants to view the papers published by Amedeo
Napoli, a downward navigation is performed by selecting concept K#52. With
the help of the summarization index another question can be answered, i.e.,
what are the main research topics of these researchers? Consulting the index
again shows that K#4 keeps the largest percentage of the papers published by
Amedeo Napoli (see Figure 1 (below)) and that the keyword in this concept is
Formal Concept Analysis, meaning that his main area of research is Formal
Concept Analysis. However, he has worked in many other research areas, which
shows the diversity of an author across the areas in which he has published.
Moreover, the sub-lattice connected to this concept keeps information about the
community of authors with whom he publishes the most, on which topics, and
with which variants of formal concept analysis. Now, if the expert wants to
retrieve all the papers published by Amedeo Napoli, she can go back to K#52.


5.2   Direct Navigation

The Spy on the left part of RV-Xplorer (see Figure 1) allows the expert to
perform direct navigation. If the expert has navigated too deep in the view while
performing multiple drill-down operations, then the Spy, which keeps track of
her current position, shows all the paths from the selected concept to the top
concept and allows her to jump directly from one concept to another linked
concept without performing level-wise navigation. Unlike drill-down and roll-up,
direct navigation allows the expert to skip two or more hops and select a more
general or more specific concept.
    These three navigation modes are very common and are repeatedly discussed
for many navigational tools built for concept lattices, such as Camelis [11] and
CREDO [5], which may or may not be designed for a specific purpose. The main
difference between RV-Xplorer and these two approaches, and most navigational
tools, is that they use a folder-tree display. By contrast, we keep the original
structure of the concept lattice. An added advantage of RV-Xplorer is that these
navigation modes are guided at each step, meaning that the interface shows the
expert what is contained in the next node as well as the statistics about the
next level. This way the interface guides the expert in choosing the nodes
interesting for her, reducing unnecessary navigation and backtracking.


5.3   Navigating Across Points of View

The current interface allows the expert to toggle between points of view, i.e., at
any point the expert can start exploring the lattice with respect to the objects
(extents) of the concept lattice. Let c be the selected concept and suppose the
expert is interested in g1 ∈ ext(c), where ext(c) is the extent of the selected
concept. If the expert hovers her mouse over this object in the local view, the
Spy highlights all the concepts in which this object is present, with the object
concept of g1 highlighted in red.
    For instance, suppose the selected concept contains the keyword data
dependencies in its intent, and the expert is interested in the paper Computing
Similarity Dependencies with Pattern Structures and wants to retrieve all related
or similar papers. On mouse hover, the interface highlights all the concepts
containing this paper, and she selects the concept highlighted in red, i.e., the
object concept of this paper. The right side of Figure 2 shows the highlighted
object concept of Computing Similarity Dependencies with Pattern Structures in
RV-Xplorer. After this concept is selected, the Spy highlights all the paths from
this concept to the bottom and to the top, which together form the sub-lattice
associated with this paper. All the objects contained in the extents of the
concepts of this sub-lattice are similar to the paper at hand, i.e., they are papers
sharing some properties with Computing Similarity Dependencies with Pattern
Structures.
    With the folder-tree display used by most navigational tools, such as
Camelis [11], CREDO [5] and CEM [7], this kind of navigation is not possible,
because such tools only allow navigation w.r.t. the intent, the extent being
considered as the answer of the navigation. In the case of RV-Xplorer, it is
possible to obtain the sub-lattice related to an interesting object; the whole
sub-lattice connected to the object concept of the object of interest can be
navigated to retrieve similar objects, i.e., objects sharing at least one attribute
with the object of interest.
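    A minimal sketch of this lookup, assuming the lattice is given as a list of (extent, intent) pairs (a hypothetical representation, not the tool's internal one): the object concept of g is the concept with the smallest extent containing g.

    def concepts_containing_object(lattice, g):
        # All concepts whose extent contains g (highlighted by the Spy).
        return [c for c in lattice if g in c[0]]

    def object_concept(lattice, g):
        # The most specific concept containing g (highlighted in red).
        return min(concepts_containing_object(lattice, g), key=lambda c: len(c[0]))

    # Hypothetical lattice fragment, as (extent, intent) pairs:
    lattice = [({"p1", "p2", "p3"}, {"FCA"}),
               ({"p1", "p2"}, {"FCA", "data dependencies"}),
               ({"p1"}, {"FCA", "data dependencies", "pattern structures"})]
    print(object_concept(lattice, "p2"))  # ({'p1', 'p2'}, {'FCA', 'data dependencies'})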

5.4   Altering Navigation Space
The navigation space can be changed when the selected concept is deep down in
the concept lattice, without having to restart navigation from the top concept.
Let c be the selected concept with m1, m2 ∈ int(c) (int(c) is the intent of the
selected concept), and suppose the expert has navigated downwards from the
concept whose intent contains only m1. If the expert now wants to navigate the
lattice w.r.t. m2, then on mouse hover the interface highlights all the concepts
whose intents contain this attribute and highlights the attribute concept of m2
in red, which can then be selected. In the running example, suppose the expert
has navigated the lattice w.r.t. the author Amedeo Napoli and finds some papers
on FCA authored by him. If she now wants to navigate the concept lattice w.r.t.
the keyword FCA, she can easily locate the attribute concept of the keyword
FCA and navigate from there to get specific information. The left side of
Figure 2 shows the highlighted attribute concept of FCA in RV-Xplorer.
    In a folder-tree display, altering the navigation space w.r.t. the intent requires
the expert to locate the attribute concept herself by manually checking each of
the branches, because the concept lattice is represented as a tree. The problem
with such a display is that it is not easy to quickly alter the browsing space or
change the navigation point of view. Moreover, the sub-lattice connected to a
selected concept cannot be seen because of the restrictions imposed by the tree
display.

5.5   Area Expansion
Area expansion allows the expert to select several concepts scattered over the
concept lattice at the same time and gives an overall view of what these concepts
contain. These concepts are not necessarily part of the navigation path the
expert is following. It thus gives the expert an overall view of other concepts
without restarting the navigation process.
    This idea was first put forth in [14], which allows the expert to move from
one concept lattice to another based on the granularity level w.r.t. a taxonomy
and a similarity threshold. The concepts in the concept lattice built with a
higher threshold contain more detailed information than those in the concept
lattice built with a lower threshold. One drawback of this kind of zooming
operation is that it requires the computation of several concept lattices. In the
case of RV-Xplorer, we deal with a plain concept lattice instead of one created
using hierarchies, meaning that all such information has to be scaled to obtain a
binary context. As we deal with concept lattices built from binary contexts, we
adapt this functionality to our needs: neither the computation of several concept
lattices nor any re-computation is required.

Fig. 2: The left figure shows the attribute concept of FCA and the right figure shows
the object concept of Computing Similarity Dependencies with Pattern Structures.

6    Hiding Non-Interesting Parts of the View
One of the most interesting characteristics of RV-Xplorer is that it allows the
expert to hide the non-interesting parts of the lattice. Suppose the expert selects
a concept c which contains an attribute that is not interesting for her. She can,
at any point, right-click on the concept and select hide sub-lattice. A fundamental
property of a concept lattice is that if a concept contains some attribute in its
intent, then all its sub-concepts inherit this attribute. Thus, if the expert
considers one concept uninteresting, the whole sub-lattice below it is considered
uninteresting and is hidden from the expert during navigation. This functionality
enables the expert to reduce her navigation space so that, in the end, the concept
lattice contains only those concepts which are interesting for her.
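    A minimal sketch of this filtering (again over a hypothetical (extent, intent) representation): since every sub-concept of c inherits int(c), hiding the sub-lattice of c amounts to hiding every concept whose intent contains int(c).

    def hide_sublattice(lattice, c):
        # Keep only the concepts that are not sub-concepts of c;
        # a sub-concept is recognized by its intent containing int(c).
        _, intent_c = c
        return [(ext, intent) for (ext, intent) in lattice if not intent_c <= intent]

    lattice = [(frozenset("abc"), frozenset()),
               (frozenset("ab"), frozenset({"m1"})),
               (frozenset("a"), frozenset({"m1", "m2"}))]
    visible = hide_sublattice(lattice, (frozenset("ab"), frozenset({"m1"})))
    print(visible)  # only the top concept remains visible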
    Similar functionality was first introduced in the CreChainDo system [15].
Like CREDO [5], CreChainDo allows the expert to pose a query to a standard
search engine, which returns some results. These results are then organized into
a concept lattice and displayed to the expert as a folder-tree. An added advantage
of CreChainDo over CREDO is that the former allows expert interaction, i.e.,
the expert can mark concepts as relevant or irrelevant according to her priorities.
Once the expert has marked a concept as irrelevant, the sub-lattice linked to
that concept is deleted: the context is reduced based on this feedback and the
concept lattice is computed again from the reduced context. In the case of
RV-Xplorer, the concept lattice is built on top of RDF graphs, and we neither
recompute the lattice nor remove anything from it. We only hide the
non-interesting part of the lattice to reduce the expert's navigation space. This
way the navigation space is reduced without re-computing the concept lattice.


7     Related Tools
There have already been many efforts to provide the expert with facilities for
interacting with concept lattices in different domains. In [13], the authors discuss
query-based faceted search for the Semantic Web; by contrast, we mostly deal
with the navigational capabilities that can be provided by exploiting the powerful
structure of the Hasse diagram. [10] proposes another interesting way of
navigating a concept lattice, which allows a novice user to explore it without
having to know its structure. The same holds for SPARKLIS [12], where the
user performs selections and the tool acts as a query builder. By contrast,
RV-Xplorer provides exploration and navigation capabilities over SPARQL query
answers with the help of a view, i.e., a concept lattice, for data analysis and
information retrieval purposes. Conexp (http://conexp.sourceforge.net/) is
another tool for visualizing small lattices; by contrast, RV-Xplorer allows area
expansion and also provides guided navigation. [1] argues that the generated
views are easily navigable by machine as well as human agents. Machine agents
may access the datasets for application development purposes through generic
SPARQL queries generating a huge number of answers and, consequently, a
large number of concepts provided by the VIEW BY clause. Human agents,
however, run specialized queries which do not generate huge numbers of answers.
In the current study we focus on a manageable number of answers to be
visualized by human agents using our visualization software.
    An added advantage over these approaches is that RV-Xplorer guides the
expert at each step in deciding which concept to select. This guidance is provided
by showing the user, at each step, the contents of the intents at the next level
and the distribution of the extents via the summarization index; the global view
provides further guidance.


8     Discussion
In this study we introduce a new navigational tool for concept lattices called
RV-Xplorer, which provides exploration over SPARQL query answers. With the
help of the guided navigation implemented in RV-Xplorer we were able to answer
all the questions posed initially in the scenario. However, this tool is not designed
for a single purpose: any kind of concept lattice can be visualized and data from
any domain can be analyzed with it. The RV-Xplorer tool is still in development
and other functionalities should be added, such as incremental visualization
(w.r.t. a set of given objects and attributes), iceberg visualization (given a set
of attributes and objects, and a frequency threshold), integration of quality
measures, and visualization of implications and of the Duquenne-Guigues basis.
We believe, as many other researchers do (see the tools discussed in [6]), that
visualization tools are of major importance, not only for FCA but for data
mining in general. Accordingly, a new generation of visualization tools should be
studied and designed; RV-Xplorer is an example of these new tools and of what
can be imagined for supporting the analyst in the mining activity. We also want
to perform a human evaluation of the tool, as discussed in [10] and [13].

References
 1. Mehwish Alam and Amedeo Napoli. Defining views with formal concept analysis
    for understanding SPARQL query results. In Proceedings of the Eleventh Interna-
    tional Conference on Concept Lattices and Their Applications., 2014.
 2. Christian Bizer, Tom Heath, and Tim Berners-Lee. Linked data - the story so far.
    Int. J. Semantic Web Inf. Syst., 5(3):1–22, 2009.
 3. Claudio Carpineto, Stanislaw Osiński, Giovanni Romano, and Dawid Weiss. A
    survey of web clustering engines. ACM Comput. Surv., 41(3):17:1–17:38, 2009.
 4. Claudio Carpineto and Giovanni Romano. A lattice conceptual clustering system
    and its application to browsing retrieval. Machine Learning, 24(2):95–122, 1996.
 5. Claudio Carpineto and Giovanni Romano. Exploiting the potential of concept
    lattices for information retrieval with CREDO. J. UCS, 10(8):985–1013, 2004.
 6. Víctor Codocedo and Amedeo Napoli. Formal concept analysis and information
    retrieval - A survey. In Formal Concept Analysis - 13th International Conference,
    ICFCA 2015, Nerja, Spain, June 23-26, 2015, Proceedings, pages 61–77, 2015.
 7. Richard Cole and Gerd Stumme. CEM - A conceptual email manager. In 8th Inter-
    national Conference on Conceptual Structures, ICCS 2000, Darmstadt, Germany,
    August 14-18, 2000, Proceedings, pages 438–452, 2000.
 8. Claudia d’Amato, Nicola Fanizzi, and Agnieszka Lawrynowicz. Categorize by:
    Deductive aggregation of semantic web query results. In ESWC (1), 2010.
 9. Peter W. Eklund, editor. Concept Lattices, Second International Conference on
    Formal Concept Analysis, ICFCA 2004, Sydney, Australia, February 23-26, 2004,
    Proceedings, Lecture Notes in Computer Science. Springer, 2004.
10. Peter W. Eklund, Jon Ducrou, and Peter Brawn. Concept lattices for information
    visualization: Can novices read line-diagrams? In Eklund [9], pages 57–73.
11. Sébastien Ferré. Camelis: a logical information system to organise and browse a
    collection of documents. Int. J. General Systems, 38(4):379–403, 2009.
12. Sébastien Ferré. Expressive and scalable query-based faceted search over SPARQL
    endpoints. In The Semantic Web - ISWC 2014 - 13th International Semantic Web
    Conference, Riva del Garda, Italy, October 19-23, 2014. Proceedings, Part II, 2014.
13. Sébastien Ferré and Alice Hermann. Reconciling faceted search and query lan-
    guages for the semantic web. IJMSO, 7(1):37–54, 2012.
14. Nizar Messai, Marie-Dominique Devignes, Amedeo Napoli, and Malika Smaïl-
    Tabbone. Using domain knowledge to guide lattice-based complex data explo-
    ration. In ECAI 2010 - 19th European Conference on Artificial Intelligence, Lisbon,
    Portugal, August 16-20, 2010, Proceedings, pages 847–852, 2010.
15. Emmanuel Nauer and Yannick Toussaint. Dynamical modification of context for an
    iterative and interactive information retrieval process on the web. In Proceedings
    of the Fifth International Conference on Concept Lattices and Their Applications,
    CLA 2007, Montpellier, France, October 24-26, 2007, 2007.
      Finding p-indecomposable Functions: FCA
                      Approach

                                  Artem Revenko¹,²
                                      ¹ TU Wien
                          Karlsplatz 13, 1040 Vienna, Austria
                                     ² TU Dresden
                    Zellescher Weg 12-14, 01069 Dresden, Germany



        Abstract. The parametric expressibility of functions is a generalization
        of expressibility via composition. All parametrically closed classes of
        functions (p-clones) form a lattice. For finite domains the lattice is known
        to be finite; however, straightforward iteration over all functions is
        infeasible, and so far the p-indecomposable functions are only known
        for domains with two and three elements. In this work we show how
        p-indecomposable functions can be computed more efficiently by means of
        an extended version of attribute exploration (AE). Due to the growing
        number of attributes, standard AE is not able to guarantee the discovery
        of all p-indecomposable functions. We introduce an extension of AE,
        investigate its properties, and study the conditions that allow us to
        guarantee the success of the exploration. In experiments, the lattice of
        p-clones on the three-valued domain was reconstructed.

        Keywords: parametric expressibility, attribute exploration,
        p-indecomposable function


 1    Introduction

The expressibility of functions is a major topic in mathematics and has a long
history of investigation. The interest is understandable: when one aims at
investigating any kind of functional property, which classes of functions should
one consider? If a function f is expressible through a function h, then f often
inherits properties of h and should not be treated separately. Moreover, if h in
turn is expressible through f, then both have similar or even the same properties.
Therefore, a partition with respect to expressibility is meaningful and can be the
first step in an investigation of functions.
    With the development of electronics and logical circuits a new question arose:
if one wants to be able to express all possible functions, which minimal set of
functions should one have at hand? One of the first investigations in this
direction was carried out in [Pos42], where all the Boolean classes of functions
closed under expressibility are found and described. Afterwards many important
works were dedicated to related problems, such as the investigation of the
structure of the lattice of functional classes, for example [Yab60,Ros70]. However, it
is known that the lattice of classes of functions closed under expressibility is in
general uncountably infinite. In [Kuz79] a more general type of functional
expressibility was introduced: parametric expressibility. A significant advantage
of this type of expressibility is that for any finite domain A_k, |A_k| = k, the
lattice of all classes of functions closed under parametric expressibility (p-clones)
is finite [BW87]. However, finding this lattice is a complex task. For k = 3, a
thorough and tedious investigation [Dan77] proved that a system of 197 functions
gives rise to the lattice of all p-clones; the investigation was carried out without
the use of computers.
    In this paper we introduce, develop, and investigate methods and tools for
automating the exploration of the lattice of p-clones. This paper, "applied" to
A_3, can therefore be seen as complementing [Dan77], where a proof of the
correctness of the results obtained using the tools elaborated in this paper can
be found. Namely, in this paper we answer the question of how to find all the
p-clones, whereas in [Dan77] it is proved that certain functions allow us to
construct the desired lattice. The presented methods and tools are extensible to
larger domains as well.

Contributions
 – A new, original approach to exploring the lattice of p-clones is introduced;
 – An extension of the standard exploration procedure is introduced and
   investigated;
 – The whole procedure is implemented and executed; the obtained results
   agree with previously known results;
 – It is proved that, for certain starting conditions, the desired lattice will
   necessarily be discovered eventually.


2    Formal Concept Analysis
In what follows we keep to the standard definitions of FCA [GW99]. Let G and
M be sets and let I ⊆ G × M be a binary relation between G and M. The
triple K := (G, M, I) is called a (formal) context. The set G is called the set of
objects. The set M is called the set of attributes. A context (G∗, M∗, I∗) such
that G∗ ⊆ G, M∗ ⊆ M, and I∗ = I ∩ (G∗ × M∗) is called a subcontext of K.
    Consider the mappings ϕ: 2^G → 2^M and ψ: 2^M → 2^G:

                     ϕ(X) := {m ∈ M | gIm for all g ∈ X},

                     ψ(A) := {g ∈ G | gIm for all m ∈ A}.

The mappings ϕ and ψ define a Galois connection between (2^G, ⊆) and (2^M, ⊆),
i.e., A ⊆ ϕ(X) ⇔ X ⊆ ψ(A). Usually, instead of ϕ and ψ, a single notation (·)′
is used.
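    As a small illustration of the two derivation operators (a toy context, not from the paper):

    G = {"g1", "g2"}
    M = {"a", "b", "c"}
    I = {("g1", "a"), ("g1", "b"), ("g2", "b"), ("g2", "c")}

    def phi(X):
        # X' : the attributes shared by all objects in X
        return {m for m in M if all((g, m) in I for g in X)}

    def psi(A):
        # A' : the objects having all attributes in A
        return {g for g in G if all((g, m) in I for m in A)}

    print(phi({"g1", "g2"}))  # {'b'}
    print(psi({"b"}))         # {'g1', 'g2'}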
    Let X ⊆ G, A ⊆ M. A formal concept C of a formal context (G, M, I) is a
pair (X, A) such that X′ = A and A′ = X. The subset of objects X is called the
extent of C and is denoted by ext(C), and the subset of attributes A is called
the intent of C and is denoted by int(C). For a context (G, M, I), a concept
C1 = (X, A) is a subconcept of a concept C2 = (Y, B) (written C1 ≤ C2) if
X ⊆ Y or, equivalently, B ⊆ A. This defines a partial order on formal concepts.
The set of all formal concepts of (G, M, I) is denoted by B(G, M, I).
    An implication of K = (G, M, I) is defined as a pair (A, B), where A, B ⊆ M,
written A → B. A is called the premise and B the conclusion of the implication
A → B. The implication A → B is respected by a set of attributes N if A ⊈ N
or B ⊆ N. We say that an implication is respected by an object g if it is
respected by the intent of g. If g does not respect an implication then g is called
a counter-example. The implication A → B holds (is valid) in K if it is respected
by g′ for all g ∈ G, i.e., every object that has all the attributes from A also has
all the attributes from B (A′ ⊆ B′). A unit implication is defined as an
implication with only one attribute in its conclusion, i.e., A → b, where A ⊆ M,
b ∈ M. Every implication A → B can be regarded as a set of unit implications
{A → b | b ∈ B}.
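    As a small illustration, the respect condition can be checked directly on attribute sets (a toy sketch, not from the paper):

    def respects(N, A, B):
        # N respects A -> B iff A is not contained in N or B is contained in N.
        return not A <= N or B <= N

    print(respects({"a", "b"}, {"a"}, {"c"}))  # False: N has the premise but not the conclusion
    print(respects({"b"}, {"a"}, {"c"}))       # True: the premise is not contained in N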
    An implication basis of a context K is defined as a set L_K of implications
of K from which any valid implication of K can be obtained as a consequence,
such that no proper subset of L_K has this property. We call the set of all
implications valid in K the implicative theory of K. A basis minimal in the
number of implications was defined in [GD86] and is known as the canonical
implication basis.
    An object g is called reducible in a context K := (G, M, I) iff ∃X ⊆ G \ {g} :
g′ = X′. Note that a new object is reducible iff the context already contains a
formal concept whose intent equals the intent of the new object. Reducible
objects contribute neither to any implication basis nor to the concept lattice
[GW99]; therefore, if one is only interested in the implicative theory or in the
concept lattice of the context, reducible objects can be eliminated. In what
follows we introduce other types of reducibility; we therefore refer to this type
of reducibility as plain reducibility.
    In what follows the canonical implication basis is used; however, the
investigation could be performed using another implication basis.
    Attribute Exploration (AE) consists in iterating the following steps until
stabilization: computing the implication basis of a context, finding
counter-examples to implications, updating the context with the counter-examples
as new objects, and recomputing the basis. AE has been successfully used for
investigations in many, mostly analytical, areas of research. For example, in
[KPR06] AE is used for studying Boolean algebras, in [Dau00] lattice properties
are studied, and in [Rev14] algebraic identities are studied.
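    A schematic sketch of this loop is given below; canonical_basis, find_counterexample and add_object stand for the basis computation, the (domain-specific) counter-example search and the context update, and are passed in as functions since their implementations depend on the application.

    def attribute_exploration(context, canonical_basis, find_counterexample, add_object):
        # Iterate until no implication of the current basis can be refuted.
        while True:
            refuted = False
            for imp in canonical_basis(context):
                g = find_counterexample(context, imp)
                if g is not None:
                    context = add_object(context, g)  # extend the context
                    refuted = True
                    break  # recompute the basis for the updated context
            if not refuted:
                return context  # stabilization: all implications are valid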


3    Expressibility of Functions

Consider a set A_k, |A_k| = k, k ∈ N. Consider a function f : A_k^ar(f) → A_k
(ar(f) denotes the arity of f); the set of all functions over A_k of all arities is
denoted by U_k. The particular functions p_i^n(x_1, . . . , x_n) = x_i are called
the projections. The set of all projections is denoted by Pr. In what follows,
instead of writing (x_1, . . . , x_n) we use the shorter notation (x).
    Let H ⊆ U_k. We say that f is compositionally expressible through H
(denoted f ≤ H) if the following condition holds:

                     f(x) ≡ h(j_1(x), . . . , j_ar(h)(x)),                          (1)

for some h, j_1, . . . , j_ar(h) ∈ H ∪ Pr.
    A functional clone is a set of functions containing all projections and closed
under composition. The set of all functional clones over a domain of size k = 2
forms a countably infinite lattice [Pos42]. However, if k > 2 then the set of all
functional classes is uncountable [YM59].
    Let H ⊆ U_k and, for every i ∈ [1, m], let t_i, s_i ∈ H ∪ Pr. We say that
f ∈ U_k is parametrically expressible through H (denoted f ≤_p H) if the
following condition holds:

            f(x) = y ⟺ ∃w ⋀_{i=1}^{m} t_i(x, w, y) = s_i(x, w, y).                 (2)

The notation J ≤_p H means that every function from J is parametrically
expressible through H. A parametric clone (or p-clone) is a set of functions
closed under parametric expressibility and containing all projections. We consider
a special relation f• of arity ar(f) + 1 on A_k called the graph of the function f;
f• consists of the tuples of the form (x, f(x)). If a function h is compatible with
f•, i.e., if the identity (with ar(f) = n, ar(h) = m)

f(h(x_11, . . . , x_1m), . . . , h(x_n1, . . . , x_nm)) ≡ h(f(x_11, . . . , x_n1), . . . , f(x_1m, . . . , x_nm))

holds for all valuations of the variables x_ij in A_k, then we say that the functions
f and h commute (denoted f ⊥ h). For a set of functions H we write f ⊥ H to
denote that f ⊥ h for all h ∈ H. The commutation relation is symmetric, i.e.,
f ⊥ h iff h ⊥ f.
    The centralizer of H is defined by H^⊥ = {g ∈ U_k | g ⊥ H}. In [Kuz79] it is
shown that if f ≤_p H then f ⊥ H^⊥.
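    A minimal sketch of the commutation check over a finite domain (a hypothetical helper written for illustration): f ⊥ h is verified by testing the identity above on every n × m matrix of arguments.

    from itertools import product

    def commutes(f, n, h, m, domain):
        # Test the identity: f applied to the h-images of the rows must equal
        # h applied to the f-images of the columns, for every n x m matrix.
        for flat in product(domain, repeat=n * m):
            x = [flat[i * m:(i + 1) * m] for i in range(n)]
            lhs = f(*(h(*row) for row in x))
            rhs = h(*(f(*col) for col in zip(*x)))
            if lhs != rhs:
                return False
        return True

    # The two Boolean functions of Fig. 1: f is equivalence, h is implication.
    f = lambda x1, x2: int(x1 == x2)
    h = lambda x1, x2: int(x1 <= x2)
    print(commutes(f, 2, h, 2, (0, 1)))  # False: f and h do not commute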


     x1 x2 | f(x1, x2)                                   x1 x2 | h(x1, x2)
     0  0  |    1                                        0  0  |    1
     0  1  |    0                                        0  1  |    1
     1  0  |    0                                        1  0  |    0
     1  1  |    1                                        1  1  |    1

(Applying h to the rows of a suitable argument matrix and then f gives a value
different from applying f to the columns and then h.)

                     Fig. 1. Functions f and h do not commute


   A function f is called p-indecomposable if each system H parametrically
equivalent to {f} (i.e., f ≤_p H and H ≤_p f) contains a function parametrically
equivalent to f. Hence, for each p-indecomposable function there exists a class of
p-indecomposable functions that are parametrically equivalent to it. From each
such class we take only one representative (only one p-indecomposable function)
and gather them in a set of p-indecomposable functions denoted by F_k^p. A
p-clone H cannot be represented as an intersection of p-clones strictly containing
H if and only if there exists a p-indecomposable function f such that H = f^⊥⊥.
Hence, in order to construct the lattice of all p-clones it suffices to find all
p-indecomposable functions. The lattice of all p-clones for any finite k is finite
[BW87]; hence, F_k^p is finite.
    In [BW87] it is proved that it suffices to consider p-indecomposable functions
of arity at most k^k; however, the authors conjecture that the actual arity bound
should be equal to k for k ≥ 3. The conjecture is still open. Nevertheless, thanks
to the results reported in [Dan77], we know that the conjecture holds for k = 3.


4     Exploration of P-clones

The knowledge about the commutation properties of a finite set of functions
F ⊆ U_k can be represented as a formal context K_F = (F, F, ⊥F), where
⊥F ⊆ F², and a pair (f1, f2) ∈ F² belongs to the relation ⊥F iff f1 ⊥ f2. Note
that the relation ⊥F is symmetric; hence, the objects and the attributes of the
context are the same functions.
    The goal of this paper is to develop methods for constructing the lattice
of all p-clones on A_3. As already noted, for the purpose of constructing the
lattice of p-clones it suffices to find all the p-indecomposable functions F_k^p.
The set of supremum-irreducible elements of the lattice of p-clones is exactly the
set {f^⊥⊥ | f ∈ F_k^p}.
    For any domain of size k there exist k^(k^k) functions of arity k. Therefore,
to compute the context of all commuting functions K_{U_k} one has to perform
O(k^(k^k) · k^(k^k) · k^(k²)) operations (taking into consideration only functions
of arity k and the cost of a commutation check in the worst case). For k = 3 this
amounts to about 10³⁰ operations. Therefore, already for k = 3 a brute-force
solution is infeasible (one can use dualities, but this does not yield a feasible
solution either, as there exist only k · (k − 1) dualities).
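    As a quick check of this estimate (a throwaway computation, not from the paper):

    k = 3
    ops = (k ** (k ** k)) ** 2 * k ** (k * k)
    print(f"{ops:.2e}")  # about 1.1e+30 operations for k = 3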
    We intend to apply AE to commuting functions. For this purpose we
developed and implemented methods for finding counter-examples to implications
over functions from U_k [Rev15]; these methods are not presented in this paper
for the sake of compactness. However, as the number of attributes is not fixed,
the success of applying AE is not guaranteed, i.e., it is not guaranteed that the
complete lattice of p-clones will eventually be discovered using AE.


4.1     Object-Attribute Exploration

We now describe which commutation properties a new function g ∉ F should
possess in order to alter the concept lattice of the original context
K_F = (F, F, ⊥F), despite the fact that the intent of g equals an intent from
B(F, F, ⊥F).


    To distinguish between binary relations on different sets of functions we use
subscripts. The commutation relation on F is denoted by ⊥F, i.e., ⊥F = {(h, j) ∈
F² | h ⊥ j}. The context with the new function, (F ∪ {g}, F ∪ {g}, ⊥F∪g), is
denoted by K_{F∪g}. The derivation operator of the context K_{F∪g} is denoted
by (·)^{⊥F∪g}.

Proposition 1. Let C ∈ B(F, F, ⊥F) be such that ext(C) ⊈ int(C). Let g ∈ U_k,
g ∉ F be a function such that g^{⊥F∪g} ∩ F = int(C) (g is reducible in K_F). Then
   g is irreducible in K_{F∪g} ⇔ g ⊥ g.

Proof. As ext(C) ⊈ int(C) and g ⊥̸ f for all f ∈ F \ int(C), it follows that
g ⊥̸ ext(C). We prove the contrapositive statement: g is reducible in K_{F∪g} ⇔
g ⊥̸ g.

⇐ As g ⊥̸ g we have g^{⊥F∪g} = int(C) = ext(C)^{⊥F∪g}. Therefore, g is reducible.
⇒ As g is reducible we obtain g^{⊥F∪g} = H^{⊥F∪g} for some H ⊆ F. Fix this H.
  As H^{⊥F∪g} = int(C) we have H^{⊥F∪g⊥F∪g} = ext(C). Suppose H ⊆ int(C);
  then H^{⊥F∪g⊥F∪g} ⊆ int(C)^{⊥F∪g⊥F∪g} = int(C). As H^{⊥F∪g⊥F∪g} = ext(C) and
  ext(C) ⊈ int(C), we arrive at a contradiction. Therefore, H ⊈ int(C). Hence
  g ⊥̸ H, therefore g ∉ H^{⊥F∪g}, hence g ∉ g^{⊥F∪g}.

Corollary 1. If g is reducible in K_F but irreducible in K_{F∪g} and g ⊥ g, then
ext(C) → g holds in K_{F∪g}.

Proof. As g^{⊥F∪g} = int(C) ∪ {g} and ext(C)^{⊥F∪g} = int(C), we have
ext(C)^{⊥F∪g} ⊂ g^{⊥F∪g}, therefore ext(C) → g.

    The statement dual to Proposition 1 holds as well.

Proposition 2. Let C ∈ B(F, F, ⊥F) be such that ext(C) ⊆ int(C). Let g ∈ U_k,
g ∉ F be a function such that g^{⊥F∪g} ∩ F = int(C) (g is reducible in K_F). Then
    g is irreducible in K_{F∪g} ⇔ g ⊥̸ g.

Proof. As ext(C) ⊆ int(C) and g ⊥ int(C), we have g ⊥ ext(C). We prove the
contrapositive statement: g is reducible in K_{F∪g} ⇔ g ⊥ g.

⇐ As g ⊥ g and g ⊥ ext(C) we have ext(C)^{⊥F∪g} = int(C) ∪ {g} = g^{⊥F∪g}.
  Hence, g is reducible.
⇒ As g is reducible we obtain g^{⊥F∪g} = H^{⊥F∪g} for some H ⊆ F. Fix this H.
  As g ⊥ int(C) we have H ⊥ int(C); hence H ⊆ ext(C). As g ⊥ ext(C) we
  have g ⊥ H; hence g ∈ H^{⊥F∪g}, therefore g ∈ g^{⊥F∪g} and g ⊥ g.

Corollary 2. If g is reducible in K_F but irreducible in K_{F∪g} and g ⊥̸ g, then
g → ext(C) holds in K_{F∪g}.

Proof. As g^{⊥F∪g} = int(C) and ext(C)^{⊥F∪g} = int(C) ∪ {g}, we have
g^{⊥F∪g} ⊂ ext(C)^{⊥F∪g}, therefore g → ext(C).

   In order to distinguish reducibility in the old context K_F from reducibility
in the new context K_{F∪g}, we introduce a new notion.


Definition 1. We call a function g that is reducible in K_F but irreducible in
K_{F∪g} first-order irreducible for K_F. If g is reducible in K_F and reducible in
K_{F∪g}, we call it first-order reducible for K_F.

    Recall that if g is irreducible in (F ∪ {g}, F, ⊥F ∪ {(g, f) ∈ {g} × F | f ⊥ g})
we call it plainly irreducible. Hence, if a function is first-order reducible for K_F
then it is also plainly reducible in K_F. Note that g is plainly irreducible in K_F
iff g is a counter-example to some implication valid in K_F.
     Next we present an example with functions from U_3; to show this explicitly
we add 3 to the subscript of every function. The numbering of the functions is
induced by the lexicographic ordering on the outputs of the functions [Rev15].
We use the superscripts ·u for unary, ·b for binary, and ·t for ternary functions.

Example 1. The context under consideration, K_0^(3), is presented in Figure 2.
The implication basis of K_0^(3) is empty; therefore, there exist no plainly
irreducible functions. The function f^b_3,756 has the following commutation
properties: f^b_3,756 ⊥ {f^u_3,0, f^b_3,12015} and f^b_3,756 ⊥̸ f^u_3,1. Moreover,
f^b_3,756 ⊥̸ f^b_3,756, and for the corresponding concept C it holds that
ext(C) = {f^u_3,0} ⊂ {f^u_3,0, f^b_3,12015} = int(C). As follows from
Proposition 2, the function f^b_3,756 is first-order irreducible for K_0^(3).


                                f^u_3,0   f^u_3,1   f^b_3,12015
                 f^u_3,0           ×                     ×
                 f^u_3,1                     ×           ×
                 f^b_3,12015       ×         ×

    Fig. 2. Context K_0^(3) of functions on domain A_3 containing f^u_3,0, f^u_3,1, f^b_3,12015




Corollary 3. Let C ∈ B(F, F, ⊥F), g ∈ U_k, g ∉ F, and let g be first-order
reducible for K_F. Then

                                ext(C) ⊥ g  ⇔  g ⊥ g.

Proof. Follows from Propositions 1 and 2 and the fact that ext(C) ⊥ g ⇔
ext(C) ⊆ int(C).

    There remains the possibility that a union of sets of reducible functions is
irreducible. We proceed with the simplest case, when there are only two sets,
each containing a single first-order reducible function for the current context.
We prove several propositions about such pairs of first-order reducible functions;
the consequences of these propositions are investigated more deeply in
Section 4.2.
    We consider a context K_F and new functions g1, g2 ∈ U_k, g1, g2 ∉ F. We
denote {g1, g2} by G and set ⊥F∪G = {(h, j) ∈ (F ∪ G)² | h ⊥ j}; the context
(F ∪ G, F ∪ G, ⊥F∪G) is denoted by K_{F∪G}, and the corresponding derivation
operator by (·)^{⊥F∪G}. As in the case of one function, for i ∈ {1, 2}, g_i is not
a counter-example to any valid implication iff g_i^{⊥F∪G} ∩ F is an intent of K_F.
We denote the corresponding intents by int(C1) and int(C2), respectively.

Proposition 3. Let C1, C2 ∈ B(F, F, ⊥F) and let g1, g2 ∉ F be first-order
reducible for K_F. Suppose g1 ⊥ g2. Then
    both g1, g2 are irreducible in K_{F∪G} ⇔ ext(C1) ⊈ int(C2).

Proof. As g1 is irreducible, it holds that g1^{⊥F∪G} ≠ ext(C1)^{⊥F∪G}. From
Corollary 3 it follows that g1 ∈ ext(C1)^{⊥F∪G} iff g1 ∈ g1^{⊥F∪G}. Therefore,
ext(C1)^{⊥F∪G} = g1^{⊥F∪G} \ {g2}. Hence ext(C1) ⊥̸ g2, hence ext(C1) ⊈ int(C2).
Similarly for g2: ext(C2) ⊈ int(C1).

Proposition 4. Let C1, C2 ∈ B(F, F, ⊥F) and let g1, g2 ∉ F be first-order
reducible for K_F. Suppose g1 ⊥̸ g2. Then
    both g1, g2 are irreducible in K_{F∪G} ⇔ ext(C1) ⊆ int(C2).

Proof. As g1 is irreducible, it holds that g1^{⊥F∪G} ≠ ext(C1)^{⊥F∪G}. From
Corollary 3 it follows that g1 ∈ ext(C1)^{⊥F∪G} iff g1 ∈ g1^{⊥F∪G}. Therefore,
ext(C1)^{⊥F∪G} = g1^{⊥F∪G} ∪ {g2}. Hence ext(C1) ⊥ g2, hence ext(C1) ⊆ int(C2).
By the properties of derivation operators, ext(C2) ⊆ int(C1).

    The functions mentioned in Propositions 3 and 4 can be called second-order
irreducible for K_F. In the next proposition we show that it is not necessary to
look for three functions at once in order to find all p-indecomposable functions;
therefore, we do not need to define third-order irreducibility.
    Here we use the following notation: for I ⊆ {1, 2, 3}, L_I = {g_i | i ∈ I}. We
omit the curly brackets in I, i.e., L_{1,2} = L12 = {g1, g2}.

Proposition 5. Let G = {g1, g2, g3} be a set of functions such that G ∩ F = ∅
and, for i ∈ {1, 2, 3}, g_i^{⊥F∪G} ∩ F = int(C_i). If not all functions from G are
reducible in K_{F∪G}, then there exists L ⊂ G such that not all functions from L
are reducible in K_{F∪L}.

Proof. Let g1 be reducible in K_{F∪L12} and in K_{F∪L13}. Then there exist
H ⊆ F ∪ {g2} with H^{⊥F∪L12} = g1^{⊥F∪L12} and J ⊆ F ∪ {g3} with
J^{⊥F∪L13} = g1^{⊥F∪L13}. Fix these H and J. If either g2 is irreducible in
K_{F∪L2} or g3 is irreducible in K_{F∪L3}, then the proposition is proved.
Therefore, we can assume that they are reducible in the corresponding contexts.
Hence, without loss of generality, we can assume that H, J ⊆ F (i.e.,
H ∩ G = J ∩ G = ∅). Note that

         g1^{⊥F∪G} = g1^{⊥F∪L13} ∪ g1^{⊥F∪L12} = J^{⊥F∪L13} ∪ H^{⊥F∪L12}.         (3)

    Let g3 ∈ H^{⊥F∪G}. Then g3 ⊥ H. As g3^{⊥F∪G} ∩ F = int(C3), we obtain
H ⊆ int(C3). Moreover, as int(C3) is an intent in K_F, we have H^{⊥F⊥F} ⊆ int(C3).
As g1^{⊥F∪G} ∩ F = H^{⊥F} = J^{⊥F} = int(C1), we have J^{⊥F⊥F} ⊆ int(C3) and,
by the properties of closure operators, J ⊆ int(C3). Therefore, g3 ⊥ J and
g3 ∈ J^{⊥F∪G}. Similarly, if g2 ∈ J^{⊥F∪G} then g2 ∈ H^{⊥F∪G}. Hence,

                  H^{⊥F∪L12} ∪ J^{⊥F∪L13} = H^{⊥F∪G} ∪ J^{⊥F∪G}.                   (4)

    Combining (3) and (4) we obtain g1^{⊥F∪G} = H^{⊥F∪G} ∪ J^{⊥F∪G}. Therefore,
g1^{⊥F∪G} = (H ∩ J)^{⊥F∪G}. Hence, g1 is reducible in K_{F∪G} and we arrive at
a contradiction with the initial assumption.
    Therefore, if g1, g2 are reducible in K_{F∪L12}, then at least g1 is irreducible
in K_{F∪L13}. If g3 is reducible in K_{F∪L13}, then g1 is reducible in K_{F∪L1}.
Otherwise, both g1, g3 are irreducible in K_{F∪L13}.
    Suppose that a context K_F contains all p-indecomposable functions, and
the task is to prove this fact, i.e., to prove that no further p-indecomposable
functions exist. Suppose it has been checked that no counter-examples exist and
that every single function g ∈ U_k is first-order reducible for K_F. According to
the above propositions, it is necessary to look for exactly two functions at once
in order to prove the desired statement. Therefore, in order to complete the
proof, for every C1, C2 ∈ B(K_F) one has to find all the functions g1, g2 such
that g1^{⊥F∪g1} ∩ F = int(C1) and g2^{⊥F∪g2} ∩ F = int(C2), and then check
whether g1 commutes with g2. Therefore, one has to check the commutation
property between all functions (if the context indeed contains all
p-indecomposable functions). As already discussed, this task is infeasible. This
result is discouraging. However, having knowledge about the final result, in some
cases we can guarantee that all p-indecomposable functions will be found even
without looking for two functions at once.

4.2   Implicatively Closed Subcontexts
During the exploration of p-clones one can discover a subcontext of functions
such that no further function is a counter-example to the existing implications.
We shall say that such a subcontext is implicatively closed, meaning that all the
implications valid in this subcontext are valid in the final context as well. An
analysis of similar constructions can be found in [Gan07].
    In order to guarantee the discovery of all p-indecomposable functions (the
success of the exploration) it would suffice to find a subcontext that is neither
implicatively closed nor contained in any other implicatively closed subcontext.
Suppose the context K_F = (F, F, ⊥F), F ⊆ U_k, has been discovered. As earlier,
we denote the context of all p-indecomposable functions on U_k by K_{F_k^p}.
Let S = F_k^p \ F. It would be desirable to be able to guarantee the discovery
of the functions in S by considering only the discovered part of the relation, ⊥F,
and the part ⊥FS (= ⊥SF⁻¹), see Figure 3. Unfortunately, as the next example
shows, this is in general not possible.
                              F       S

                        F    ⊥F      ⊥FS

                        S    ⊥SF     ⊥S

     Fig. 3. Partitioning of the context K_{F_k^p} of all p-indecomposable functions

Example 2. Consider the context in Figure 4. The context contains all the
p-indecomposable functions from U_2 and three additional objects g1, g2, g3.
Functions with commutation properties like those of g1, g2, g3 do not exist.
However, if functions with the commutation properties of g1, g2, g3 existed,
then the functions g1, g2 would not be counter-examples to any implication
valid in K_{F_2^p ∪ g3}. Note that g3 is a counter-example to an implication
valid in K_{F_2^p}. Therefore, the subcontext containing the functions F_2^p ∪ g3
would be implicatively closed. Moreover, it is even closed with respect to finding
first-order irreducible functions, as g1 is reducible in K_{F_2^p ∪ {g1,g3}} and g2
is reducible in K_{F_2^p ∪ {g2,g3}}.
    However, if instead of g3 we consider the function g4, which differs from g3
only in that g4 commutes with both g1 and g2, then the subcontext containing
F_2^p ∪ g4 is neither implicatively closed nor contained in any implicatively
closed subcontext of the context K_{F_2^p ∪ {g1,g2,g4}}. The difference between
g3 and g4 is contained in ⊥S in Figure 3. Therefore, in general it is not possible
to guarantee the discovery of the functions S without considering ⊥S.


   Fig. 4. Context K_{F_2^p ∪ {g1,g2,g3}} from Example 2 (the cross-table of commutation
   between the p-indecomposable functions f^u_0, f^u_1, f^b_14, f^b_8, f^t_212, f^t_150, f^u_3
   and the additional objects g3, g4, g1, g2)




Definition 2. Let K_H be a context, K_F ⊆ K_H, S = H \ F. An object s ∈ S
is called an essential counter-example for K_F if there exists an implication Imp
valid in K_F such that

1. s is a counter-example to Imp;
2. there does not exist an object p ∈ S \ {s} such that p is a counter-example
   to Imp.
   It is clear that all the essential counter-examples will necessarily be added to
the context during the exploration. The next proposition suggests how one can
check whether a counter-example is essential or not.
   In the context K_{F_3^p} there are several pairs of functions (f1, f2) that
commute with the same functions, except that one commutes with itself and the
other does not. Such functions cannot be essential counter-examples, because
they are counter-examples to the same implications, if any. However, if they are
the only counter-examples to some valid implication, then these functions will
eventually be discovered by object-attribute exploration.
Proposition 6. Let s1 , s2 ∈ S be such that s2 ⊥̸ s2 and s1^{⊥Uk} = s2^{⊥Uk} ∪ {s2 }.
If there exists an implication Imp valid in KF such that the counter-examples
to it are exactly s1 , s2 ∈ S, then s1 is first-order irreducible for KF ∪s2 and s2 is
first-order irreducible for KF ∪s1 .
Proof. s1 in KF ∪s2 . As Imp is valid in KF , the set s2^{⊥F ∪s1 } is closed in KF .
    Therefore, as follows from Proposition 1 for the object concept of s2
    (ext(Cs2 ) ⊈ int(Cs2 )), the function s1 (s1 ⊥ s1 ) is first-order irreducible.
s2 in KF ∪s1 . As Imp is valid in KF , the set s1^{⊥F ∪s2 } is closed in KF . Therefore,
    as follows from Proposition 2 for the object concept of s1 (ext(Cs1 ) ⊆
    int(Cs1 )), the function s2 (s2 ⊥̸ s2 ) is first-order irreducible.
    We have investigated different types of reducibility and have shown that
third-order irreducible functions do not exist. However, the task of finding
second-order irreducible functions is infeasible. Fortunately, it is possible to find
not only zero-order irreducible functions but also first-order irreducible ones.
Moreover, if it were possible to prove that the functions undiscovered at the
moment are not second-order irreducible, then we could guarantee that all the
p-indecomposable functions will eventually be discovered.

5    Results
We take all unary functions as the starting point. Thanks to the earlier investi-
gation in [Dan77] we know the final context. When we investigate all possible
implicatively closed partitions such that the implicatively closed subcontext
contains all unary functions, we find the following:
 – We start with 27 unary functions, 26 of which are p-indecomposable;
 – After adding all essential counter-examples we obtain 147 functions;
 – After using Proposition 6 we obtain 155 functions;
 – There remain 42 functions to be discovered. By direct check we find that
   there does not exist an implicatively closed subcontext containing the 155
   functions mentioned above such that all the undiscovered functions are
   second-order irreducible.


Hence, if we start from all unary functions on A3 , all the functions F3p will
eventually be discovered.
   The experiment was conducted three times, starting from different initial
contexts; all three times the exploration was successful. The exploration starting
from a single constant function f3,0u took 207 steps.


References
[BW87] S. Burris and R. Willard. Finitely many primitive positive clones. Proceedings
        of the American Mathematical Society, 101(3):427–430, 1987.
[Dan77] A.F. Danil’chenko. Parametric expressibility of functions of three-valued logic.
        Algebra and Logic, 16(4):266–280, 1977.
[Dau00] F. Dau. Implications of properties concerning complementation in finite lat-
        tices. In: Contributions to General Algebra 12 (D. Dorninger et al., eds.),
        Proceedings of the 58th workshop on general algebra “58. Arbeitstagung All-
        gemeine Algebra”, Vienna, Austria, June 3-6, 1999, Verlag Johannes Heyn,
        Klagenfurt, pages 145–154, 2000.
[Gan07] B. Ganter. Relational Galois connections. Formal Concept Analysis, pages
        1–17, 2007.
[GD86] J.-L. Guigues and V. Duquenne. Familles minimales d’implications informa-
        tives résultant d’un tableau de données binaires. Math. Sci. Hum, 24(95):5–18,
        1986.
[GW99] B. Ganter and R. Wille. Formal Concept Analysis: Mathematical Foundations.
        Springer, 1999.
[KPR06] L. Kwuida, C. Pech, and H. Reppe. Generalizations of boolean algebras. an
        attribute exploration. Mathematica Slovaca, 56(2):145–165, 2006.
[Kuz79] A.V. Kuznetsov. Means for detection of nondeducibility and inexpressibility.
        Logical Inference, pages 5–33, 1979.
[Pos42] E.L. Post. The two-valued iterative systems of mathematical logic. Princeton
        University Press, 1942.
[Rev14] A. Revenko. Automatized construction of implicative theory of algebraic
        identities of size up to 5. In Cynthia Vera Glodeanu, Mehdi Kaytoue, and
        Christian Sacarea, editors, Formal Concept Analysis, volume 8478 of Lecture
        Notes in Computer Science, pages 188–202. Springer International Publishing,
        2014.
[Rev15] A. Revenko. Automatic Construction of Implicative Theories for Mathemati-
        cal Domains. PhD thesis, TU Dresden, 2015.
[Ros70] I. Rosenberg. Über die funktionale Vollständigkeit in den mehrwertigen
        Logiken: Struktur der Funktionen von mehreren Veränderlichen auf endlichen
        Mengen. Academia, 1970.
[Yab60] S.V. Yablonsky. Functional Constructions in K-valued Logic. U.S. Joint Pub-
        lications Research Service, 1960.
[YM59] Yu.I. Yanov and A.A. Muchnik. On the existence of k-valued closed classes
        that have no bases. Doklady Akademii Nauk SSSR, 127:44–46, 1959.
       Putting OAC-triclustering on MapReduce

           Sergey Zudin, Dmitry V. Gnatyshak, and Dmitry I. Ignatov

     National Research University Higher School of Economics, Russian Federation
                                  dignatov@hse.ru
                                 http://www.hse.ru


        Abstract. In our previous work, an efficient one-pass online algorithm
        for triclustering of binary data (triadic formal contexts) was proposed.
        This algorithm is a modified version of the basic algorithm for the OAC-
        triclustering approach; it has linear time and memory complexities. In
        this paper we parallelise it within the MapReduce framework in order
        to make it suitable for big datasets. The results of computer experiments
        show the efficiency of the proposed algorithm; for example, it outper-
        forms its online counterpart on the BibSonomy dataset with ≈ 800, 000
        triples.

        Keywords: Formal Concept Analysis, triclustering, triadic data, data
        mining, big data, MapReduce


 1    Introduction
Mining of multimodal patterns is one of the hot topics in Data Mining and Ma-
chine Learning [1,2,3,4]. Thus, cluster analysis of multimodal data, and specifi-
cally of dyadic and triadic relations, is a natural extension of the original idea of
clustering. In the dyadic case, biclustering methods (the term bicluster was
coined in [5]) are used to simultaneously find subsets of the sets of objects and
attributes that form homogeneous patterns of the input object-attribute data.
In fact, one of the most popular applications of biclustering is gene expression
analysis in Bioinformatics [6,7]. Triclustering methods operate in the triadic
case, where for each object-attribute pair one assigns a set of some conditions
[8,9,10]. Both biclustering and triclustering algorithms are widely used in such
areas as gene expression analysis [11,12,13], recommender systems [14,15,16],
social network analysis [17], etc. The processing of numeric multimodal data is
also possible by modifications of existing approaches for mining dyadic binary
relations [18].
    Though there are methods that can enumerate all triclusters satisfying cer-
tain constraints [2] (in most cases they ensure that triclusters are dense), their
time complexity is rather high: in the worst case the maximal number of tri-
clusters is usually exponential (e.g. in the case of formal triconcepts), which
shows that these methods are hardly scalable. To process big data, algorithms
need to have at most linear time complexity (e.g., O(|I|) in the case of an n-ary
relation I) and be easily parallelisable. In addition, in most cases, it is necessary
that such algorithms output the results in one pass.
     Earlier, in order to create an algorithm satisfying these requirements, we
 adapted a triclustering method based on prime operators (prime OAC-triclustering



method) [10] and proposed its online version, which is linear, one-pass, and eas-
ily parallelisable [19]. However, its parallelisation is possible in different ways.
For example, one can use a popular framework for commodity hardware, Map-
Reduce (M/R) [20]. Notably, there have been several successful M/R implemen-
tations in the FCA community and other lattice-oriented domains. Thus, in [21],
the authors adapted the Close-by-One algorithm to the M/R framework and
showed its efficiency. In the same year, in [22], an efficient M/R algorithm for
computation of closed cube lattices was proposed. The authors of [23] demon-
strated that iterative algorithms like Ganter's NextClosure can benefit from the
usage of iterative M/R schemes.
    Note that experts warn potential users that M/R is like a big cannon that
requires long preparation to shoot but fires fast: “the entire distributed-file-
system milieu makes sense only when files are very large and are rarely updated
in place” [20]. In this work, in contrast to our previous study, we assume that
there is a large bulk of data to process that is not coming online.
    The rest of the paper is organized as follows: in Section 2, we recall the
original method and the online version of the prime OAC-triclustering algo-
rithm. In Section 3, we describe the M/R setting of the problem and the cor-
responding M/R version of the original algorithm along with important imple-
mentation aspects. Finally, in Section 4 we show the results of several experi-
ments which demonstrate the efficiency of the M/R version of the algorithm. As
an addendum, in the Appendix, the reader may find our proposal for alterna-
tive models of M/R-based variants of prime OAC-triclustering.


2    Prime object-attribute-condition triclustering
The prime object-attribute-condition triclustering method (OAC-prime), based
on Formal Concept Analysis [24,25], is an extension to the triadic case of the
object-attribute biclustering method [26]. Triclusters generated by this method
have a structure similar to that of the corresponding biclusters, namely the
cross-like structure of triples inside the input data cuboid (i.e. the formal tri-
context).
    Let K = (G, M, B, I) be a triadic context, where G, M , B are respectively
the sets of objects, attributes, and conditions, and I ⊆ G × M × B is a tri-
adic incidence relation. Each prime OAC-tricluster is generated by applying the
following prime operators to each pair of components of some triple:

             (X, Y )′ = {b ∈ B | (g, m, b) ∈ I for all g ∈ X, m ∈ Y },
             (X, Z)′ = {m ∈ M | (g, m, b) ∈ I for all g ∈ X, b ∈ Z},              (1)
             (Y, Z)′ = {g ∈ G | (g, m, b) ∈ I for all m ∈ Y, b ∈ Z},

where X ⊆ G, Y ⊆ M , and Z ⊆ B.
    Then the triple T = ((m, b)′ , (g, b)′ , (g, m)′ ) is called a prime OAC-tricluster
based on the triple (g, m, b) ∈ I. The components of the tricluster are called,
respectively, the tricluster extent, the tricluster intent, and the tricluster modus.
The triple (g, m, b) is called a generating triple of the tricluster T . Figure 1
shows the structure of an OAC-tricluster (X, Y, Z) based on a triple (g̃, m̃, b̃);
triples corresponding to the


gray cells are contained in the context; other triples may be contained in the
tricluster (cuboid) as well.
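    To make the definition concrete, the following minimal Python sketch (our
own illustration, not part of the original implementation; the tricontext is
assumed to be given as a set of (g, m, b) triples) computes the prime OAC-
tricluster generated by a triple:

    # A minimal sketch, assuming the tricontext I is a set of (g, m, b) triples.
    def prime_gm(I, g, m):
        # (g, m)' : all conditions b such that (g, m, b) is in I.
        return {b2 for (g2, m2, b2) in I if g2 == g and m2 == m}

    def prime_gb(I, g, b):
        # (g, b)' : all attributes m such that (g, m, b) is in I.
        return {m2 for (g2, m2, b2) in I if g2 == g and b2 == b}

    def prime_mb(I, m, b):
        # (m, b)' : all objects g such that (g, m, b) is in I.
        return {g2 for (g2, m2, b2) in I if m2 == m and b2 == b}

    def oac_prime_tricluster(I, g, m, b):
        # T = ((m, b)', (g, b)', (g, m)') generated by the triple (g, m, b).
        return (prime_mb(I, m, b), prime_gb(I, g, b), prime_gm(I, g, m))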




Fig. 1. Structure of prime OAC-triclusters: the dense cross-like central layer containing
g̃ (left) and the layer for an object g (right) in M × B dimensions.



    The basic algorithm for the prime OAC-triclustering method is rather
straightforward (see [10]). First, for each combination of elements from each
two sets of K we apply the corresponding prime operator (we call the resulting
sets prime sets). After that, we enumerate all triples from I, and at each step we
generate a tricluster based on the corresponding triple, check whether this tri-
cluster is already contained in the tricluster set (by using hashing), and also
check extra conditions.
    The total time complexity of the algorithm depends on whether there is a
non-zero minimal density threshold or not and on the complexity of the hashing
algorithm used. In case we use some basic hashing algorithm processing the
tricluster’s extent, intent and modus without a minimal density threshold, the
total time complexity is O(|G||M ||B| + |I|(|G| + |M | + |B|)); in case of a non-
zero minimal density threshold, it is O(|I||G||M ||B|). The memory complexity
is O(|I|(|G| + |M | + |B|)), as we need to keep the dictionaries with the prime
sets in memory.
    In the online setting, for triples coming from a triadic context K =
(G, M, B, I), the user has no a priori knowledge of the elements and even of
the cardinalities of G, M , B, and I. At each iteration we receive some set of
triples from I: J ⊆ I. After that we must process J and get the current version
of the set of all triclusters. It is important in this setting to consider two tri-
clusters as different whenever they have different generating triples, even if their
respective extents, intents, and modi are equal: a later triple can change one of
these two triclusters without changing the other.
    To efficiently access prime sets for their processing, the dictionaries contain-
ing the prime sets are implemented as hash-tables.
    The algorithm is straightforward as well (Alg. 1). It takes some set of triples
(J), the current tricluster set (T ), and the dictionaries containing prime sets
(P rimes) as input and outputs the modified versions of the tricluster set and


dictionaries. The algorithm processes each triple (g, m, b) of J sequentially (line
1). At each iteration the algorithm modifies the corresponding prime sets (lines
2-4).
    Finally, it adds a new tricluster to the tricluster set. Note that this tricluster
contains pointers to the corresponding prime sets (in the corresponding dictio-
naries) instead of copies of the prime sets (line 5), which lowers the memory
and access costs.


Algorithm 1 Add function for the online algorithm for prime OAC-triclustering.
Input: J is a set of triples;
    T = {T = (∗X, ∗Y, ∗Z)} is a current set of triclusters;
    P rimesOA, P rimesOC, P rimesAC.
Output: T = {T = (∗X, ∗Y, ∗Z)};
    P rimesOA, P rimesOC, P rimesAC.
 1: for all (g, m, b) ∈ J do
 2:    P rimesOA[g, m] := P rimesOA[g, m] ∪ {b}
 3:    P rimesOC[g, b] := P rimesOC[g, b] ∪ {m}
 4:    P rimesAC[m, b] := P rimesAC[m, b] ∪ {g}
 5:    T := T ∪ {(&P rimesAC[m, b], &P rimesOC[g, b], &P rimesOA[g, m])}
 6: end for



   The algorithm is one-pass, and its time and memory complexities are O(|I|).
   Duplicate elimination and selection of patterns by user-specific constraints
are done as post-processing in order to avoid pattern loss. The time complexity
of the basic post-processing is O(|I|), and it does not require any additional
memory.
   Finally, the algorithm can be easily parallelised by splitting the set of triples
J into several subsets, processing each of them independently, and merging the
resulting sets afterwards; a sketch is given below.
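    For illustration, a direct Python transcription of Alg. 1 might look as
follows (a sketch under the assumption that prime sets are shared by reference,
as in the original design; the names are ours):

    from collections import defaultdict

    def add_triples(J, T, primes_oa, primes_oc, primes_ac):
        # Online addition (cf. Alg. 1): J is an iterable of (g, m, b) triples;
        # the dictionaries map pairs to mutable prime sets, so every stored
        # tricluster holds references that keep growing as new triples arrive.
        for (g, m, b) in J:
            primes_oa[(g, m)].add(b)
            primes_oc[(g, b)].add(m)
            primes_ac[(m, b)].add(g)
            # Store references to the shared prime sets, not copies.
            T.append((primes_ac[(m, b)], primes_oc[(g, b)], primes_oa[(g, m)]))
        return T, primes_oa, primes_oc, primes_ac

    # Usage: the dictionaries are hash tables with empty sets as defaults.
    oa, oc, ac = defaultdict(set), defaultdict(set), defaultdict(set)
    triclusters, oa, oc, ac = add_triples([("g1", "m1", "b1")], [], oa, oc, ac)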


3     Map-reduce OAC-triclustering

3.1   Map-reduce decomposition

We use a two-stage M/R approach. The first M/R stage allows us to efficiently
calculate all the primes of the existing pairs. The second M/R stage assembles
the found primes into triclusters. During the first map phase, each triple from
the input context is indexed by a key using a hash function of one of the basic
entity types: object, attribute, or condition (see Alg. 2). The number of map
keys is equal to the number of reducers.
   Then each first-stage reducer receives the portion of data for a particular key
(see Alg. 3). The internal reducer algorithm is almost a replication of online
OAC-prime. However, it does not assemble all found triclusters into a final
collection; the reducer simply writes the current triclusters for the given portion
of data to a file or passes them to the second-stage mapper. Since in Hadoop
MapReduce


Algorithm 2 Distributed OAC-triclustering: First Map
Input: S is a set of input triples as strings;
    r is a number of reducers;
    i is a grouping index (objects, attributes or conditions).
Output: J̃ is a list of ⟨key, triple⟩ pairs.
 1: for all s ∈ S do
 2:    t := transform(s)
 3:    key := hash(t[i]) mod r
 4:    J̃ := J̃ ∪ {⟨key, t⟩}
 5: end for



we should work with text input files and our data are mainly in a tuple-based
form, we use the encode/decode functions encode()/transform() to switch be-
tween the internal tuple representation and the text-based one.


Algorithm 3 Distributed OAC-triclustering: First Reduce
Input: J is a list of triples (for a certain key);
    T = {T = (X, Y, Z)} is a current set of triclusters;
    P rimesOA, P rimesOC, P rimesAC.
Output: file of strings – encoded ⟨triple, tricluster⟩ pairs.
 1: P rimes ← initialise a new multimap
 2: for all (g, m, b) ∈ J do
 3:    P rimes[g, m] := P rimes[g, m] ∪ {b}
 4:    P rimes[g, b] := P rimes[g, b] ∪ {m}
 5:    P rimes[m, b] := P rimes[m, b] ∪ {g}
 6: end for
 7: for all (g, m, b) ∈ J do
 8:    T := (set(P rimes[m, b]), set(P rimes[g, b]), set(P rimes[g, m]))
 9:    s := {encode(⟨(g, m, b), T ⟩)}
10:    store s
11: end for



     The second mapper takes the found intermediate triclusters (with their keys)
as strings from the files produced by the first-stage reducers (see Alg. 4). It
fills the P rimes multimap in one pass through all ⟨triple, tricluster⟩ pairs. In
the next loop, for each key (g, m, b) the corresponding tricluster is formed, and
⟨tricluster, tricluster⟩ pairs are passed to the second-stage reducer (the key
tricluster can be efficiently implemented by proper hashing). In its turn, the
second-stage reducer eliminates duplicates and outputs the resulting file (Alg. 5).
The set() function helps to avoid duplicates among the values of P rimes[, ],
which reflects our actual implementation; however, one can easily omit set() in
line 8, provided that P rimes is properly implemented.
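    In a non-distributed prototype, the duplicate elimination performed by the
second-stage reducer amounts to hashing triclusters by their components; a
Python sketch (with hypothetical names, using frozensets as hash keys) is:

    def eliminate_duplicates(triclusters):
        # Keep one representative per distinct (extent, intent, modus).
        unique = {}
        for (X, Y, Z) in triclusters:
            key = (frozenset(X), frozenset(Y), frozenset(Z))
            unique.setdefault(key, (X, Y, Z))
        return list(unique.values())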
     The time complexity of the M/R solution is composed of two terms, one for
each stage: O(|I|/r) and O(|I|). However, there are communication costs that


Algorithm 4 Distributed OAC-triclustering: Second Map
Input: S is a list of strings.
Output: T̃ is a list of ⟨tricluster, tricluster⟩ pairs.
 1: P rimes ← initialise a new multimap
 2: for all s ∈ S do
 3:    ⟨(g, m, b), T ⟩ := decode(s)
 4:    update P rimes multimap appropriately
 5:    I := I ∪ {(g, m, b)}
 6: end for
 7: for all (g, m, b) ∈ I do
 8:    T := (set(P rimes[m, b]), set(P rimes[g, b]), set(P rimes[g, m]))
 9:    T̃ := T̃ ∪ {⟨T, T ⟩}
10: end for

Algorithm 5 Distributed OAC-triclustering: Second Reduce
Input: T̂ is a list of ⟨tricluster, list of triclusters⟩ pairs.
Output: File with a final set of triclusters {T = (X, Y, Z)}.
 1: for all ⟨T, [T, . . . , T ]⟩ ∈ T̂ do
 2:   store T
 3: end for



should inevitably be paid and can be theoretically estimated as follows [20]:
the replication rate for the first M/R stage is r1 = 1 (each triple is passed as
one key-value pair), and the reducer size is q1 = |I|/r; the replication rate for
the second M/R stage is r2 = 1 (it assigns one key-value pair to each triclus-
ter), but the reducer size varies between q2min = 1 (no duplicate triclusters) and
q2max = |I| (one final tricluster when all the initial triples belong to one abso-
lutely dense cuboid).
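    As a rough illustration of these estimates: for the BibSonomy context used
later in the experiments (|I| = 816,197 triples; see Table 1) with r = 16 first-
stage reducers, the expected first-stage reducer size is q1 = |I|/r ≈ 51,000
triples, while the second-stage reducer size depends entirely on how many du-
plicate triclusters the data produce.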

3.2     Implementation aspects and used technologies
The application1 has been implemented in Java (JRE 8); as the distributed
computation framework we use Apache Hadoop2 .
    We have used many other technologies: Apache Maven (a framework for
automatic project assembly), Apache Commons (for work with extended Java
collections), Google Guava (utilities and data structures), Jackson JSON (an
open-source library for serialising an object-oriented representation of an object
such as a tricluster to a string), TypeTools (for real-time type resolution of
inbound and outbound key-value pairs), etc.
ChainingJob module. During the development we found that in Hadoop one
MapReduce process can contain only one Mapper and one Reducer. Thus, in
order to develop an application with three “map” phases and one “reduce”,
one needs to create three processes. The creation of one process (even without
various adjustments) takes 8-10 lines of code. After searching in vain for an
appropriate
1 https://github.com/zydins/DistributedTriclustering
2 http://hadoop.apache.org/


library, we developed the “chaining-job” module3 . Its main class contains the
following fields: “jobs” (a list of all scheduled processes), “name” (a common
name for all processes), and “tempDir” (a folder name for intermediate results).
First, the algorithm sets the input path for the first job in the chain and the
output path for the result of the last job; the remaining jobs are connected by
their input and output “key-value” pairs and a directory for intermediate file
storage. Then the algorithm runs the processes according to the schedule and
waits for their completion. In other words, it connects the inputs and outputs
of the chained processes that run sequentially.
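    The underlying idea, stripped of Hadoop specifics, can be illustrated by a
small Python sketch (hypothetical names; each job is a callable reading from
an input path and writing to an output path):

    import os, tempfile

    def run_chain(jobs, input_path, output_path, name="chain"):
        # Run jobs sequentially; job i reads where job i-1 wrote.
        current_input = input_path
        for i, job in enumerate(jobs):
            last = (i == len(jobs) - 1)
            out = output_path if last else os.path.join(
                tempfile.mkdtemp(prefix=name), "step-%d" % i)
            job(current_input, out)
            current_input = out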
     Let us briefly describe the most important classes of our M/R implementation.
Entity. It is a basic class for object oriented representation of input strings and
maintains three entity types: EXTENT, INTENT, and MODUS. For example:
{“Leon”, EXTENT }.
Tuple. An object of this class stores references to objects of class Entity and
represents two basic entities: triple and tricluster. Mapper and Reducer classes
operate with objects of this type.
FormalContext. This class is an object oriented representation of the underlying
binary relation; it keeps the reference to an object of EntityStorage class (see
below). It also contains methods “add” (add triple) and “getTriclusters” (get
the output set of unique triclusters).
EntityStorage. This class manages the work with extents, intents and modi of
triclusters. It also contains three dictionaries with composite keys. For example,
for (g1, m1, c1) the object c1 will be added by the key (g1, m1) to the first
dictionary; analogously for the keys (g1, c1) and (m1, c1).
     The process-like M/R classes are summarised below.
TupleReadMapper. Its main goal is reading a triple from the input file and
transforming it into an object of class Tuple.
TupleContextReducer. It receives input tuples and fills the underlying tricontext
with them. It also sets the number of first-stage reducers. This number depends
on the available nodes in a distributed system and the structure of the input
data: the more unique entities occur in the triples, the larger this value should be.
PrepareMapper. The “map” method receives files from the previous stage; they
contain intermediate triclusters from each object of class TupleContextReducer.
It fills the dictionary with primes. Further, each tricluster triple is transformed
into the Tuple structure and is passed to the second reduce phase.
CollectReduce. This class gathers all intermediate triclusters and obtains the
final tricluster set. This process runs in several threads for speed-up; the number
of threads is a user-specified parameter.
Executor. It is the starting class of the application, which receives the input
parameters, activates the “chaining-job” utility for making a chain of jobs, and
starts the execution.

3 https://github.com/zydins/chaining-job


4     Experiments

Two series of experiments have been conducted in order to test the application
on synthetic contexts and real-world datasets with moderate and large numbers
of triples. In each experiment both versions of the OAC-triclustering algorithm
have been used to extract triclusters from a given context. Only the online and
M/R versions of the OAC-triclustering algorithm managed to produce patterns
for the large contexts, since the computation time of the compared algorithms
was too high (>3,000 s). To evaluate the runtime more carefully, for each context
the average over 5 runs of the algorithms has been recorded.


4.1   Datasets

Synthetic datasets. As mentioned above, the synthetic contexts were randomly
generated: 1) 20,000 triples (25 unique entities of each type); 2) 100,000 triples
(50 unique entities of each type); 3) 1,000,000 triples (all possible combinations
of 100 unique entities of each type). However, it is easy to see that some of these
datasets are not correct formal contexts from the algebraic viewpoint. Thus, the
first dataset inevitably contains duplicates, since 25 × 25 × 25 gives only 15,625
unique triples. The second one contains fewer triples than 503 = 125, 000, the
number of all possible combinations. The third one is just an absolutely dense
cuboid 100 × 100 × 100 (it contains only one formal concept (OAC-tricluster),
the whole context).
    These tests are more like crash tests, but they make sense, since in the M/R
setting triples can be (partially) repeated, e.g., because of M/R task failures on
some nodes (i.e. restarted processing of some key-value pairs). Even though the
third dataset does not result in 3^min(|G|,|M |,|B|) formal triconcepts, the worst
case for formal triconcept generation in terms of the number of patterns, it is
an example of the worst-case scenario for the second reducer, since its size is
maximal (q2max = |I|). Our algorithm should correctly assemble the single tri-
cluster (G, M, B), and it actually does.
IMDB. This dataset consists of the Top-250 list of the Internet Movie Database
(the 250 best movies based on user reviews). The following triadic context is
composed: the set of objects consists of movie names, the set of attributes con-
sists of tags, the set of conditions consists of genres, and each triple of the ternary
relation means that the given movie has the given genre and is assigned the
given tag.
BibSonomy. Finally, a sample of the data of bibsonomy.org from the ECML
PKDD discovery challenge 2008 has been used. This website allows users to
share bookmarks and lists of literature and to tag them. For the tests the fol-
lowing triadic context has been prepared: the set of objects consists of users,
the set of attributes consists of tags, the set of conditions consists of bookmarks,
and a triple of the ternary relation means that the given user has assigned the
given tag to the given bookmark.
    Table 1 summarises the contexts.

                       Table 1. Contexts for the experiments
                  Context     |G|    |M |    |B| # triples   Density
                     20k       25     25     25     20,000   1
                    100k       50     50     50    100,000   0.8
                      1m      100    100    100  1,000,000   1
                    IMDB      250    795     22      3,818   0.00087
                 BibSonomy  2,337 67,464 28,920    816,197   1.8 · 10^−7




4.2   Results

The experiments have been conducted on a computer running OS X 10, with a
1.8 GHz Intel Core i5, 4 GB of 1600 MHz DDR3 RAM, and 8 GB of free hard-
drive space (typical commodity hardware). Two M/R modes have been tested:
a sequential mode of task completion and an emulation of a distributed mode
with 16 first-stage reducers and 32 threads for the second stage.


             Table 2. Results of comparison (time is given in seconds)
      Algorithm/Context       IMDB         20k 100k        1m      Bibsonomy
                           (≈3k triples) triples triples triples (≈800k triples)
      Tribox                   324         800 1,265 >3,000          >3,000
      TRIAS                    189         362     862 >3,000        >3,000
      OAC Box                  374         756 1,265 >3,000          >3,000
      OAC Prime                  7           8     734 >3,000        >3,000
      Online OAC prime           3           3      3        5       >3,000
      M/R OAC prime seq.        12          30     81      166        1,534
      M/R OAC prime distr.       1          15     20       25         520




    In Table 2 we summarise the results of the performed tests. It is clear that
on average our application has a lower execution time than its competitors,
except for the online version of OAC-triclustering. If we compare the imple-
mented program with its original online version, the results are worse for not
that big but dense datasets (closer to the worst-case scenario q2 = |I|). This is
a consequence of the fact that the application architecture is aimed at process-
ing large amounts of data; in particular, it is implemented in two stages with
time-consuming communication. Launching and stopping Apache Hadoop and
writing and passing data between the Map and Reduce steps of both stages re-
quire substantial time; that is why, for not that big datasets, when the execution
time is comparable with the time for infrastructure management, the time per-
formance is not perfect. However, as the data size increases, the relative perfor-
mance improves. Thus, the last test, on the BibSonomy data, has been passed
successfully: the competitors were not able to finish it within 50 min, while our
M/R program completed it within 25 min even in sequential mode.


5    Conclusion

In this paper we have presented a MapReduce version of the OAC-triclustering
algorithm. We have shown that the algorithm is efficient from both theoretical
and practical points of view. It retains linear time complexity and is performed
in two stages (with each stage being M/R distributed); this allows us to use it
for big data problems. However, we believe that it is possible to propose other
variants of MapReduce-based algorithms where the reducer exploits composite
keys directly (see the Appendix). Such algorithms and their comparison with
the current M/R version on real and artificial data are still in our plans. More-
over, despite this step towards Big Data technologies, a proper comparison of
the proposed OAC-triclustering with the mining of noise-tolerant patterns in
n-ary relations by DataPeeler and its descendants [2] has not yet been con-
ducted.


Acknowledgments. The study was implemented in the framework of the Basic
Research Program at the National Research University Higher School of Eco-
nomics in 2014-2015, in the Laboratory of Intelligent Systems and Structural
Analysis. The last two authors were partially supported by Russian Foundation
for Basic Research, grant no. 13-07-00504. The authors would like to thank Yuri
Kudriavtsev from PM-Square and Dominik Slezak from Infobright and Warsaw
University for their encouragement given to our studies of M/R technologies.


References

 1. Georgii, E., Tsuda, K., Schölkopf, B.: Multi-way set enumeration in weight tensors.
    Machine Learning 82(2) (2011) 123–155
 2. Cerf, L., Besson, J., Nguyen, K.N., Boulicaut, J.F.: Closed and noise-tolerant
    patterns in n-ary relations. Data Min. Knowl. Discov. 26(3) (2013) 574–619
 3. Spyropoulou, E., De Bie, T., Boley, M.: Interesting pattern mining in multi-
    relational data. Data Mining and Knowledge Discovery 28(3) (2014) 808–849
 4. Ignatov, D.I., Gnatyshak, D.V., Kuznetsov, S.O., Mirkin, B.: Triadic formal con-
    cept analysis and triclustering: searching for optimal patterns. Machine Learning
    (2015) 1–32
 5. Mirkin, B.: Mathematical Classification and Clustering. Kluwer, Dordrecht (1996)
 6. Madeira, S.C., Oliveira, A.L.: Biclustering algorithms for biological data analysis:
    A survey. IEEE/ACM Trans. Comput. Biology Bioinform. 1(1) (2004) 24–45
 7. Eren, K., Deveci, M., Kucuktunc, O., Catalyurek, Umit V.: A comparative analysis
    of biclustering algorithms for gene expression data. Briefings in Bioinform. (2012)
 8. Mirkin, B.G., Kramarenko, A.V.: Approximate bicluster and tricluster boxes in the
    analysis of binary data. In Kuznetsov, S.O., et al., eds.: RSFDGrC 2011. Volume
    6743 of Lecture Notes in Computer Science., Springer (2011) 248–256
 9. Ignatov, D.I., Kuznetsov, S.O., Poelmans, J., Zhukov, L.E.: Can triconcepts be-
    come triclusters? International Journal of General Systems 42(6) (2013) 572–593
10. Gnatyshak, D.V., Ignatov, D.I., Kuznetsov, S.O.: From triadic FCA to tricluster-
    ing: Experimental comparison of some triclustering algorithms. In: CLA. (2013)
    249–260


11. Zhao, L., Zaki, M.J.: Tricluster: An effective algorithm for mining coherent clusters
    in 3d microarray data. In: SIGMOD 2005 Conference. (2005) 694–705
12. Li, A., Tuck, D.: An effective tri-clustering algorithm combining expression data
    with gene regulation information. Gene regul. and syst. biol. 3 (2009) 49–64
13. Kaytoue, M., Kuznetsov, S.O., Napoli, A., Duplessis, S.: Mining gene expression
    data with pattern structures in formal concept analysis. Inf. Sci. 181(10) (2011)
    1989–2001
14. Nanopoulos, A., Rafailidis, D., Symeonidis, P., Manolopoulos, Y.: Musicbox: Per-
    sonalized music recommendation based on cubic analysis of social tags. IEEE
    Transactions on Audio, Speech & Language Processing 18(2) (2010) 407–412
15. Jelassi, M.N., Yahia, S.B., Nguifo, E.M.: A personalized recommender system based
    on users’ information in folksonomies. In Carr, L., et al., eds.: WWW (Companion
    Volume), ACM (2013) 1215–1224
16. Ignatov, D.I., Nenova, E., Konstantinova, N., Konstantinov, A.V.: Boolean Matrix
    Factorisation for Collaborative Filtering: An FCA-Based Approach. In: AIMSA
    2014, Varna, Bulgaria, Proceedings. Volume LNCS 8722. (2014) 47–58
17. Gnatyshak, D.V., Ignatov, D.I., Semenov, A.V., Poelmans, J.: Gaining insight in
    social networks with biclustering and triclustering. In: BIR. Volume 128 of Lecture
    Notes in Business Information Processing., Springer (2012) 162–171
18. Kaytoue, M., Kuznetsov, S.O., Macko, J., Napoli, A.: Biclustering meets triadic
    concept analysis. Ann. Math. Artif. Intell. 70(1-2) (2014) 55–79
19. Gnatyshak, D.V., Ignatov, D.I., Kuznetsov, S.O., Nourine, L.: A one-pass triclus-
    tering approach: Is there any room for big data? In: CLA 2014. (2014)
20. Rajaraman, A., Leskovec, J., Ullman, J.D.: MapReduce and the New Software
    Stack. In: Mining of Massive Datasets. Cambridge University Press, England,
    Cambridge (2013) 19–70
21. Krajca, P., Vychodil, V.: Distributed algorithm for computing formal concepts
    using map-reduce framework. In: N. Adams et al. (Eds.): IDA 2009. Volume LNCS
    5772. (2009) 333–344
22. Kuznecov, S., Kudryavcev, Y.: Applying map-reduce paradigm for parallel closed
    cube computation. In: 1st Int. Conf. on Advances in Databases, Knowledge, and
    Data Applications, DBKDS 2009. (2009) 62–67
23. Xu, B., de Frein, R., Robson, E., Foghlu, M.O.: Distributed formal concept analysis
    algorithms based on an iterative mapreduce framework. In Domenach, F., Ignatov,
    D., Poelmans, J., eds.: ICFCA 2012. Volume LNAI 7278. (2012) 292–308
24. Wille, R.: Restructuring lattice theory: An approach based on hierarchies of con-
    cepts. In Rival, I., ed.: Ordered Sets. Volume 83 of NATO Advanced Study Insti-
    tutes Series. Springer Netherlands (1982) 445–470
25. Ganter, B., Wille, R.: Formal Concept Analysis: Mathematical Foundations. 1st
    edn. Springer-Verlag New York, Inc., Secaucus, NJ, USA (1999)
26. Ignatov, D.I., Kuznetsov, S.O., Poelmans, J.: Concept-based biclustering for inter-
    net advertisement. In: ICDM Workshops, IEEE Computer Society (2012) 123–130


Appendix. Alternative variants of two-stage MapReduce

First Map: Finding primes. During this phase every input triple (g, m, b)
is encoded by three key-value pairs ⟨(g, m), b⟩, ⟨(g, b), m⟩, and ⟨(m, b), g⟩. These
pairs are passed to the first reducer. The replication rate is r1 = 3.
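A sketch of this emission step in Python (our own illustration, outside any
concrete M/R framework):

    def first_map(triple):
        # Emit the three key-value pairs for a triple (g, m, b); r1 = 3.
        g, m, b = triple
        return [((g, m), b), ((g, b), m), ((m, b), g)]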


First Reduce: Finding primes. This reducer fills three corresponding dic-
tionaries with the primes of the keys. So, for example, the first dictionary,
P rimeOA, contains key-value pairs ⟨(g, m), {b1 , b2 , . . . , bn }⟩. The reducer size
is q1 = max(|G|, |M |, |B|).
    The process can be stopped after the first reduce phase, with all the triclus-
ters found as (P rime[g, m], P rime[g, b], P rime[m, b]), each by enumeration of
(g, m, b) ∈ I. However, to do this faster and keep the result for further compu-
tation, it is possible to use M/R as well.
Second Map: Tricluster generation. The second map does the tricluster-
combining job, i.e. for each triple (g, m, b) it composes the new key-value
pair ⟨(g, m, b), ∅⟩. And for each pair of either type, ⟨(g, m), P rime[g, m]⟩,
⟨(g, b), P rime[g, b]⟩, and ⟨(m, b), P rime[m, b]⟩, it generates the key-value pairs
⟨(g, m, b̃), P rime[g, m]⟩, ⟨(g, m̃, b), P rime[g, b]⟩, and ⟨(g̃, m, b), P rime[m, b]⟩,
where g̃ ∈ G, m̃ ∈ M , and b̃ ∈ B. Here r2 = (|I|+3|G||M ||B|)/(|I|+|G||M |+
|G||B|+|M ||B|) ≤ (ρ + 3)/(ρ + 3/max(|G|, |M |, |B|)), where ρ is the density of
the input tricontext.
Second Reduce: Tricluster generation. The second reducer assembles just
one value for each key (g, m, b) (the generating triple), namely its tricluster
(P rime[g, m], P rime[g, b], P rime[m, b]). If there is no key-value pair ⟨(g, m, b), ∅⟩
for a particular triple (g, m, b), it does not output any key-value pair for this
key. The reducer size q2 is either 3 (no output) or 4 (tricluster assembled).
Second Map: Tricluster generation with duplicate generating triples.
The second map does the tricluster-combining job, i.e. for each triple (g, m, b) it
composes a new key-value pair: ⟨(P rime[g, m], P rime[g, b], P rime[m, b]), (g, m, b)⟩.
Second Reduce: Tricluster generation with duplicate generating triples.
The second reducer just groups the values for each key: ⟨(X, Y, Z), {(g1 , m1 , b1 ),
. . . , (gn , mn , bn )}⟩.
    These two variations of the second stage have their merits: the first one is
beneficial for further computations with a new portion of triples, while the last
one is more compact and informative. Of course, each variant of the second stage
has its own runtime complexity, which depends not only on the model represen-
tation but is also sensitive to the data-structure implementation and to the M/R
communication costs and settings.
              Concept interestingness measures:
                    a comparative study

                 Sergei O. Kuznetsov1 and Tatiana P. Makhalova1,2
 1
     National Research University Higher School of Economics, Kochnovsky pr. 3,
                              Moscow 125319, Russia
      2
        ISIMA, Complexe scientifique des Cézeaux, 63177 Aubière Cedex, France

                     skuznetsov@hse.ru, t.makhalova@gmail.com



         Abstract. Concept lattices arising from noisy or high-dimensional data
         contain a huge number of formal concepts, which complicates the anal-
         ysis of concepts and dependencies in the data. In this paper, we consider
         several methods for pruning concept lattices and discuss the results of
         their comparative study.


 1      Introduction

Formal Concept Analysis (FCA) underlies several methods for rule mining,
clustering and building taxonomies. When constructing a taxonomy one often
deals with high-dimensional and/or noisy data, which results in a huge number
of formal concepts and dependencies given by implications and association rules.
To tackle this issue, different approaches were proposed for selecting the most
important or interesting concepts. In this paper we consider existing approaches,
which fall into the following groups: pre-processing of a formal context, modifi-
cation of the closure operator, and concept filtering based on interestingness
indices (measures). We mostly focus on the comparison of interestingness mea-
sures and study their correlations.


 2      FCA framework

Here we briefly recall FCA terminology [20]. A formal context is a triple
(G, M, I), where G is called a set of objects, M is called a set of attributes, and
I ⊆ G × M is a relation called the incidence relation, i.e. (g, m) ∈ I if the object
g has the attribute m. The derivation operators (·)′ are defined for A ⊆ G and
B ⊆ M as follows:

                            A′ = {m ∈ M | ∀g ∈ A : gIm}
                            B′ = {g ∈ G | ∀m ∈ B : gIm}

A′ is the set of attributes common to all objects of A and B′ is the set of objects
sharing all attributes of B. The double application of (·)′ is a closure operator,


i.e. (·)′′ is extensive, idempotent and monotone. Sets A ⊆ G and B ⊆ M such
that A = A′′ and B = B′′ are said to be closed.
     A (formal) concept is a pair (A, B), where A ⊆ G, B ⊆ M and A′ = B,
B′ = A. A is called the (formal) extent and B is called the (formal) intent of
the concept (A, B). A partial order ≤ is defined on the set of concepts as follows:
(A, B) ≤ (C, D) iff A ⊆ C (equivalently, D ⊆ B); the pair (A, B) is then a
subconcept of (C, D), while (C, D) is a superconcept of (A, B).
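     For concreteness, the derivation operators and the closure test can be
sketched in a few lines of Python (our own illustration; the context is given as
a set of (g, m) pairs):

    def up(A, I):
        # A' : attributes shared by all objects in A.
        M = {m for (_, m) in I}
        return {m for m in M if all((g, m) in I for g in A)}

    def down(B, I):
        # B' : objects having all attributes in B.
        G = {g for (g, _) in I}
        return {g for g in G if all((g, m) in I for m in B)}

    def is_closed_extent(A, I):
        # A is closed iff A'' == A.
        return down(up(A, I), I) == set(A)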


3     Methods for simplifying a lattice structure

With the growth of the dimension of a context, the size of the lattice can increase
exponentially, and it becomes almost impossible to deal with the huge number
of formal concepts. In this respect a wide variety of methods have been pro-
posed; a classification of them was presented in [16], where the authors proposed
to divide techniques for lattice pruning into three classes: redundant informa-
tion removal, simplification, and selection. In this paper, we also consider other
classes of methods and their application to concept pruning.


3.1    Pre-processing

Algorithms for computing a concept lattice are time-consuming. To decrease
the computational costs one can reduce the size of a formal context. Cheung
and Vogel [13] applied Singular Value Decomposition (SVD) to obtain a low-
rank approximation of the term-document matrix and constructed a concept
lattice using the pruned concepts. Since this method is also computationally
complex [25], alternative methods such as spherical k-Means [14], fuzzy k-Means
[17], and Non-negative Matrix Decomposition [33] were proposed.
    Dimensionality reduction can dramatically decrease the computational load
and simplify the lattice structure, but in most cases it is very difficult to interpret
the obtained results.
    Another way to solve the described problems without changing the dimen-
sion of the context was proposed in [18], where an algorithm that significantly
improves the lattice structure by making small changes to the context was pre-
sented. The central notion of the method is concept incomparability w.r.t. the
≤ relation; the goal of the proposed method is to diminish the total incompa-
rability of the concepts in the lattice.
    The authors note that the result is close to that of fuzzy k-Means, but the
former is achieved with fewer context changes than required by the latter. How-
ever, such transformations do not always lead to a decrease in the number of
formal concepts: the transformations of a context are aimed at increasing the
share of comparable concepts, so this method does not ensure a significant sim-
plification of the lattice structure.
    Context pruning by clustering objects was introduced in [15]. Firstly, one
assigns a weight wm to each attribute m ∈ M ; the similarity between objects is
then defined as the weighted sum of their shared attributes, and the original
context is replaced by a reduced one.
    Objects are considered similar if sim(g, h) ≥ ε, where ε is a predefined
threshold. In order to avoid the generation of large clusters, another threshold
α was proposed. Thus, the algorithm is an agglomerative clustering procedure
such that at each step clusters are merged if the similarity between them is not
less than ε and the volume of the clusters is less than α|G| objects.


3.2   Reduction based on background knowledge or predefined
      constraints

Another approach to tackling computation and representation issues is to de-
termine constraints on the closure operator. This can be done using background
knowledge about attributes. In [8] an extended closure operator was presented.
It is based on the notion of AD-formulas (attribute-dependency formulas), which
establish the dependence of attributes and their relative importance. Put differ-
ently, the occurrence of certain attributes implies that more important ones
should also occur. Concepts which do not satisfy this condition are not included
in the lattice.
    In [5] a numerical approach to defining attribute importance was proposed.
The importance of a formal concept can be defined by various aggregation func-
tions (average, minimum, maximum) over different intent subsets (a generator,
a minimal generator, or the intent itself). It was shown [5] that there is a corre-
spondence between this numerical approach and AD-formulas.
    Carpineto and Romano [12] considered a document-term relation and pro-
posed to use a thesaurus of terms to prune the lattice. Two different attributes
are considered the same if they have a common ancestor in the hierarchy. They
used a thesaurus to enrich the set of attributes, but in general it may be quite
difficult to establish this kind of relationship between arbitrary attributes.
    Computing only the concepts whose extents exceed a threshold in size was
proposed in [26] and studied in relation to frequent itemset mining in [34]. The
main drawback of this approach, called “iceberg lattice” mining, is missing rare
and possibly interesting concepts.
    Several polynomial-time algorithms for computing Galois sub-hierarchies were
proposed, see [9, 3].


3.3   Filtering concepts

Selecting the most interesting concepts by means of interestingness measures
(indices) is the most widespread way of dealing with the huge number of con-
cepts. The situation is aggravated by the complexity of computing some of the
indices. Nevertheless, this approach may be fruitful, since it provides flexible
tools for the exploration of a derived taxonomy. In this section we consider dif-
ferent indices for filtering formal concepts. These indices can be divided into
the following groups: measures designed to assess closed itemsets (formal con-
cepts), measures for arbitrary itemsets, and measures for assessing membership
in a basic level (a psychology-motivated approach).

Indices for formal concepts

Stability Stability indices were introduced in [27, 28] and modified in [29]. One
distinguishes intensional and extensional stability. The former allows estimating
the strength of the dependence of an intent on each object of the respective
extent; extensional stability is defined dually.

                  Stab_i (A, B) = |{C ⊆ A | C′ = B}| / 2^{|A|}
    The problem of computing stability is #P -complete [28], which makes this
measure impractical for large contexts. In [4] its Monte Carlo approximation
was introduced; a combination of Monte Carlo and an upper-bound estimate
was proposed in [10]. Since for large contexts the stability is close to 1 [21], the
logarithmic scale of stability (inducing the same ranking as stability) [10] is
often used:
                        LStab (c) = −log2 (1 − Stab (c))
     The bounds of stability are given by

   ∆min (c) − log2 (|M |) ≤ −log2 ( Σ_{d∈DD(c)} 2^{−∆(c,d)} ) ≤ LStab (c) ≤ ∆min (c),

where ∆min (c) = min_{d∈DD(c)} ∆ (c, d), DD (c) is the set of all direct descendants
of c in the lattice, and ∆ (c, d) is the size of the set-difference between the extents
of the formal concepts c and d.
    In our experiments we used the bounds of logarithmic stability, because the
combined method is still computationally demanding.
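    By way of illustration, the definition of intensional stability can be checked
directly on small contexts by brute-force enumeration of subsets of the extent
(a Python sketch of ours, exponential in |A| and thus suitable only for toy data):

    from itertools import chain, combinations

    def intensional_stability(A, B, I):
        # Fraction of subsets C of the extent A with derivation C' equal to B;
        # I is the context as a set of (object, attribute) pairs.
        A = list(A)
        M = {m for (_, m) in I}
        def up(C):
            return {m for m in M if all((g, m) in I for g in C)}
        subsets = chain.from_iterable(
            combinations(A, k) for k in range(len(A) + 1))
        hits = sum(1 for C in subsets if up(C) == set(B))
        return hits / 2 ** len(A)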

Concept Probability The stability of a formal concept may be interpreted as
the probability of retaining its intent after removing some objects from the ex-
tent, assuming that all subsets of the extent are equally probable. In [24] it was
noticed that some interesting concepts with a small number of objects usually
have low stability values. To ensure the selection of interesting infrequent closed
patterns, the concept probability was introduced. It is equivalent to the proba-
bility of a concept introduced earlier by R. Emilion [19].
    The probability that an arbitrary object has all attributes from the set B
is defined as follows:
                                 p_B = Π_{m∈B} p_m

    Concept probability is defined as the probability of B being closed:

 p (B = B′′ ) = Σ_{k=0}^{n} p (|B′ | = k, B = B′′ )
             = Σ_{k=0}^{n} C(n, k) p_B^k (1 − p_B)^{n−k} Π_{m∉B} (1 − p_m^k)


where n = |G|.
    The concept probability has the following probabilistic components: the oc-
currence of each attribute from B in all k objects, the absence of at least one
attribute from B in each of the other objects, and the absence of other at-
tributes shared by all k objects.
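    A direct Python sketch of this formula (our own illustration; it assumes the
per-attribute probabilities p[m] are given as a dictionary, with M and B as sets
of attributes, and spells out the binomial factor for choosing which k objects
contain B):

    from math import comb

    def concept_probability(B, p, M, n):
        # p(B = B''): probability that B is closed in a random n-object
        # context where attribute m occurs independently with probability p[m].
        pB = 1.0
        for m in B:
            pB *= p[m]
        total = 0.0
        for k in range(n + 1):
            outside = 1.0
            for m in M - B:
                outside *= 1.0 - p[m] ** k
            total += comb(n, k) * pB ** k * (1 - pB) ** (n - k) * outside
        return total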

Robustness Another probabilistic approach to assessing a formal concept was
proposed in [35]. Robustness is defined as the probability of the intent of a
formal concept remaining closed while deleting objects, where every object of
the formal context is retained with probability α. For a formal concept c =
(A, B) the robustness is given as follows:

              r (c, α) = Σ_{d≤c} (−1)^{|B_d|−|B_c|} (1 − α)^{|A_c|−|A_d|}


Separation The separation index was considered in [24]. The main idea behind
this measure is to describe the area covered by a formal concept among all
nonzero elements of the corresponding rows and columns of the formal context.
Thus, the value characterizes how specific the relationship between the objects
and attributes of the concept is with respect to the formal context:

            s (A, B) = |A||B| / ( Σ_{g∈A} |g′ | + Σ_{m∈B} |m′ | − |A||B| )
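    A direct Python sketch of the separation index (our own illustration; the
context is a set of (g, m) pairs):

    def separation(A, B, I):
        # Area |A||B| relative to all crosses in the rows of A and columns of B.
        row_sum = sum(sum(1 for (gg, mm) in I if gg == g) for g in A)
        col_sum = sum(sum(1 for (gg, mm) in I if mm == m) for m in B)
        return len(A) * len(B) / (row_sum + col_sum - len(A) * len(B))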


Basic Level Metrics The group of so-called “basic level” measures was consid-
ered by Belohlavek and Trnecka [6, 7]. These measures were proposed to formal-
ize the existing psychological approach to defining the basic level of a concept
[31].

Similarity approach (S) A similarity approach to the basic level was proposed
in [32] and subsequently formalized and applied to FCA in [6]. The authors
defined the basic level as a combination of three fuzzy functions that correspond
to the formalized properties outlined by Rosch: a high cohesion of the concept,
a considerably greater cohesion with respect to its upper neighbors, and a
slightly smaller cohesion with respect to its lower neighbors. The membership
degree of the basic level is defined as follows:

              BL_S = coh∗∗ (A, B) ⊗ coh∗∗_un (A, B) ⊗ coh∗∗_ln (A, B),

where each factor is a fuzzy function corresponding to one of the conditions
defined above and ⊗ is a t-norm [23].
    The cohesion of a formal concept is a measure of the pairwise similarity of
all objects in the extent. Various similarity measures can be used for the cohe-
sion functions:
            sim_SMC (B1 , B2 ) = (|B1 ∩ B2 | + |M − (B1 ∪ B2 )|) / |M |

            sim_J (B1 , B2 ) = |B1 ∩ B2 | / |B1 ∪ B2 |


    The first similarity index sim_SMC takes into account both the shared at-
tributes and the attributes missing in both sets, while the Jaccard similarity
sim_J takes exactly the proportion of attributes shared by the two sets. There
are two ways to compute the cohesion of a formal concept: taking the average
or the minimal similarity among the attribute sets of the objects of the con-
cept extent; the formulas are presented below (for average and minimal simi-
larity respectively).
       coh^a_... (A, B) = ( Σ_{x1 ,x2 ∈A, x1 ≠x2 } sim_... (x1′ , x2′ ) ) / ( |A| (|A| − 1) / 2 )

       coh^m_... (A, B) = min_{x1 ,x2 ∈A} sim_... (x1′ , x2′ )

The Rosch properties for upper and lower neighbors take the following forms:

  coh^{a∗}_{...,un} (A, B) = 1 − ( Σ_{c∈UN(A,B)} coh∗_... (c) / coh∗_... (A, B) ) / |UN (A, B)|

  coh^{a∗}_{...,ln} (A, B) = ( Σ_{c∈LN(A,B)} coh∗_... (A, B) / coh∗_... (c) ) / |LN (A, B)|

  coh^{m∗}_{...,un} (A, B) = 1 − max_{c∈UN(A,B)} coh∗_... (c) / coh∗_... (A, B)

  coh^{m∗}_{...,ln} (A, B) = min_{c∈LN(A,B)} coh∗_... (A, B) / coh∗_... (c)

where UN (A, B) and LN (A, B) are the upper and lower neighbors of the formal
concept (A, B), respectively.
    As the authors noted, experiments revealed that the type of cohesion func-
tion does not affect the result, while the choice of similarity measure can greatly
affect the outcome. Moreover, in some cases upper (lower) neighbors may have
a higher (lower) cohesion than the formal concept itself (for example, in some
boundary cases, when a neighbor's extent (intent) consists of identical rows
(columns) of the formal context). To tackle this issue of neighbors that are
non-monotonic w.r.t. the similarity function, the authors proposed to set
coh∗∗_{...,ln} and coh∗∗_{...,un} to 0 if the rate of non-monotonic neighbors is larger
than a threshold.
    In our experiments we used the following notation: SMC∗∗ and J∗∗ , where
the first star is replaced by a cohesion type and the second one by the type of
a similarity function. Below, we consider four more metrics that were intro-
duced in [7].

Predictability approach (P) The predictability of a formal concept is computed
in quite a similar way to BL_S : the cohesion function is replaced by a predicta-
bility function:

           P (A, B) = pred∗∗ (A, B) ⊗ pred∗∗_un (A, B) ⊗ pred∗∗_ln (A, B)

The main idea behind this approach is to assign a high score to a concept (A, B)
with a low conditional entropy of the presence of attributes outside B in the
intents of objects from A (i.e., requiring few attributes outside B in the objects
from A) [7]:



\[ E\big(I[\langle x, y\rangle \in I] \,\big|\, I[x \in A]\big) = -\frac{|A \cap y'|}{|A|} \log \frac{|A \cap y'|}{|A|} \]

\[ \mathrm{pred}(A,B) = 1 - \sum_{y \in M \setminus B} \frac{E\big(I[\langle x, y\rangle \in I] \,\big|\, I[x \in A]\big)}{|M \setminus B|}. \]
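A minimal sketch of this computation, assuming base-2 logarithms and a context given as a set of (object, attribute) pairs (all names are ours):

    from math import log2

    def pred(extent, intent, attributes, incidence):
        # pred(A, B): one minus the mean conditional entropy, over the
        # attributes outside B, of an attribute's presence in objects of A
        outside = [y for y in attributes if y not in intent]
        if not outside:
            return 1.0
        total = 0.0
        for y in outside:
            p = sum(1 for g in extent if (g, y) in incidence) / len(extent)
            if 0 < p < 1:
                total += -p * log2(p)   # the entropy term vanishes for p in {0, 1}
        return 1 - total / len(outside)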

Cue Validity (CV), Category Feature Collocation (CFC), Category Utility (CU). The following measures, based on the conditional probability of an object g ∈ A given that y ∈ g′, were introduced in [7]:
\[ CV(A,B) = \sum_{y \in B} P(A \mid y') = \sum_{y \in B} \frac{|A|}{|y'|} \]

\[ CFC(A,B) = \sum_{y \in M} p(A \mid y')\, p(y' \mid A) = \sum_{y \in M} \frac{|A \cap y'|}{|y'|} \cdot \frac{|A \cap y'|}{|A|} \]

\[ CU(A,B) = p(A) \sum_{y \in M} \left[ p(y' \mid A)^2 - p(y')^2 \right] = \frac{|A|}{|G|} \sum_{y \in M} \left[ \left(\frac{|A \cap y'|}{|A|}\right)^2 - \left(\frac{|y'|}{|G|}\right)^2 \right] \]

    The main intuition behind CV is to express the probability of the extent given attributes from the intent; the CFC index takes into account the relationship between all attributes of the context and the formal concept, while CU evaluates how much an attribute in an intent is characteristic for a given concept rather than for the whole context [36].
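The three indices translate directly into code. The sketch below (names are ours) assumes a map deriv from each attribute y to its extent y′, and that every attribute occurs in at least one object:

    def cue_validity(extent, intent, deriv):
        # CV(A, B) = sum over y in B of |A| / |y'|
        return sum(len(extent) / len(deriv[y]) for y in intent)

    def cfc(extent, attributes, deriv):
        # CFC(A, B) = sum over all y of p(A | y') * p(y' | A)
        a = set(extent)
        return sum((len(a & deriv[y]) / len(deriv[y])) * (len(a & deriv[y]) / len(a))
                   for y in attributes)

    def category_utility(extent, attributes, deriv, n_objects):
        # CU(A, B) = p(A) * sum over all y of (p(y' | A)^2 - p(y')^2)
        a = set(extent)
        s = sum((len(a & deriv[y]) / len(a)) ** 2 - (len(deriv[y]) / n_objects) ** 2
                for y in attributes)
        return (len(a) / n_objects) * s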

Metrics for arbitrary itemsets

Frequency (support). This is one of the most popular measures in the theory of pattern mining. According to this index, the most “interesting” concepts are the frequent ones (those having high support). For an arbitrary formal concept the support is defined as follows:

\[ \mathrm{supp}(A,B) = \frac{|A|}{|G|} \]

Support enables efficient level-wise algorithms for constructing semilattices, since it is anti-monotonic (the a priori property [2, 30]):

\[ B_1 \subset B_2 \Rightarrow \mathrm{supp}(B_1) \geq \mathrm{supp}(B_2) \]

Lift. In the previous section, different methods involving background knowledge were considered. Another way to add knowledge to the data is proposed in [11]. Under the assumption of attribute independence, one can compute the individual frequencies of the attributes and take their product as the expected frequency. The ratio of the observed frequency to its expectation is called the lift. The lift of a formal concept (A, B) is defined as follows:

\[ \mathrm{lift}(B) = \frac{P(A)}{\prod_{b \in B} P(b')} = \frac{|A|/|G|}{\prod_{b \in B} |b'|/|G|} \]


Collective Strength. The collective strength [1] combines the ideas of comparing the observed data with the expectation under the assumption of attribute independence. To calculate this measure for a formal concept (A, B), one needs to define for B the set of objects V_B that have at least one attribute in B, but not all of them at once. Denoting q = ∏_{b∈B} supp(b′) and v = supp(V_B), the collective strength of a formal concept has the following form:

\[ cs(B) = \frac{1-v}{v} \cdot \frac{q}{1-q} \]
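Under the same context representation as above, the three itemset metrics can be sketched as follows (helper names are hypothetical; the formulas are the ones just given, and the code assumes a non-empty intent, 0 < v and q < 1):

    def supp(extent, n_objects):
        # supp(A, B) = |A| / |G|
        return len(extent) / n_objects

    def lift(intent, deriv, objects):
        # observed frequency of B over its expected frequency under independence
        common = set(objects)
        expected = 1.0
        for b in intent:
            common &= deriv[b]
            expected *= len(deriv[b]) / len(objects)
        return (len(common) / len(objects)) / expected

    def collective_strength(intent, deriv, objects):
        # cs(B) = ((1 - v) / v) * (q / (1 - q)), with q the product of the
        # individual attribute supports and v = supp(V_B)
        n = len(objects)
        q = 1.0
        has_all = set(objects)
        for b in intent:
            q *= len(deriv[b]) / n
            has_all &= deriv[b]
        has_some = set().union(*(deriv[b] for b in intent))
        v = len(has_some - has_all) / n   # at least one attribute of B, not all
        return ((1 - v) / v) * (q / (1 - q))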


4     Experiments

In this section, we compare the measures with respect to their ability to help select the most interesting concepts and to filter out concepts coming from noisy datasets. For both goals, one is interested in a ranking of the concepts rather than in the particular values of the measures.


4.1   Formal Concept Mining

Usually, concept lattices constructed from empirical data contain a huge number of formal concepts, many of them redundant, excessive and useless. In this connection, the measures can be used to estimate how meaningful a concept is. Since the “interestingness” of a concept is a fairly subjective notion, a direct comparison of the indices in terms of their ability to select meaningful concepts is impossible. We therefore focus on the similarity of the indices described above. To identify how similar the indices are, we use the Kendall tau rank correlation coefficient [22]. Put differently, we consider the pairwise similarity of lists of the same concepts ordered by the values of the chosen indices. A set of strongly correlated measures can then be replaced by the one with the lowest computational complexity.
    We randomly generated 100 formal contexts of random sizes. The number of attributes ranged between 10 and 40, while the number of objects varied from 10 to 70. For the generated contexts we computed the pairwise Kendall tau for all indices of each context. The averaged correlation coefficients are reported in Table 1.
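Such an experiment can be sketched with scipy's implementation of the coefficient; scores is a hypothetical (concepts × indices) matrix of index values for one generated context:

    import numpy as np
    from scipy.stats import kendalltau

    def pairwise_kendall(scores):
        # scores[i, j]: value of index j on concept i (one such matrix per
        # generated context); returns the matrix of pairwise Kendall taus
        k = scores.shape[1]
        taus = np.ones((k, k))
        for i in range(k):
            for j in range(i + 1, k):
                taus[i, j] = taus[j, i] = kendalltau(scores[:, i],
                                                     scores[:, j]).correlation
        return taus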
    In [7] it was shown that CU, CFC and CV are correlated, while S and P are not strongly correlated to the other metrics. The results of our simulations allow us to conclude that CU, CFC and CV are also pairwise correlated to separation and support. Moreover, support is strongly correlated to separation and probability. Since the computational complexity of support is lower than that of separation and probability, it is preferable to use support. It is worth noting that predictability (P) and robustness are not correlated to any other metrics and hence cannot be replaced by the metrics introduced so far.
    Thus, based on the correlation analysis, it is possible to reduce the computational complexity by choosing the most easily computable index within each class of correlated metrics.

Table 1. Kendall tau correlation coefficients for the indices. The full symmetric matrix is split into two triangular panels; J^xy and SMC^xy abbreviate the similarity-based basic level measures S^xy_J and S^xy_SMC.

         J^mm  J^ma  J^am  J^aa  SMC^mm SMC^ma SMC^am SMC^aa   P    CU   CFC   CV  Rob0.8 Rob0.5 Rob0.3
Prob     0.18  0.15  0.14  0.14   0.04   0.03   0.00  -0.02  0.04  0.30  0.49 -0.01 -0.07  -0.11  -0.14
Sep      0.20  0.20  0.18  0.18   0.07   0.07   0.14   0.12  0.05  0.36  0.45  0.54 -0.11  -0.12  -0.13
CS      -0.08 -0.05 -0.06 -0.05  -0.07  -0.07   0.02   0.04 -0.09  0.04 -0.12  0.29  0.00   0.02   0.04
Lift    -0.16 -0.13 -0.08 -0.07  -0.09  -0.08   0.02   0.03 -0.15 -0.07 -0.25  0.25  0.07   0.10   0.11
Sup      0.17  0.17  0.21  0.21  -0.01  -0.02   0.03   0.00 -0.06  0.54  0.80  0.31 -0.10  -0.15  -0.18
Stab     0.08  0.08  0.11  0.11   0.01   0.01  -0.02  -0.02 -0.18 -0.05  0.08  0.12  0.23   0.14   0.06
Stabl    0.06  0.06  0.11  0.11   0.02   0.02   0.01   0.01 -0.17 -0.16 -0.05  0.07  0.24   0.21   0.14
Stabh    0.15  0.14  0.15  0.14   0.02   0.01  -0.04  -0.05 -0.11  0.24  0.45  0.23  0.13   0.00  -0.09
Rob0.1  -0.09 -0.09 -0.02 -0.02   0.00   0.00  -0.01   0.00 -0.02 -0.11 -0.16 -0.09  0.56   0.73   0.86
Rob0.3  -0.10 -0.10 -0.03 -0.02   0.00   0.00  -0.02   0.00 -0.03 -0.12 -0.18 -0.09  0.68   0.86
Rob0.5  -0.08 -0.08 -0.02 -0.02   0.02   0.02  -0.02  -0.01 -0.03 -0.12 -0.15 -0.07  0.82
Rob0.8  -0.06 -0.06 -0.03 -0.02   0.03   0.03  -0.03  -0.02 -0.03 -0.11 -0.12 -0.06
CV       0.08  0.09  0.15  0.15  -0.04  -0.04   0.05   0.05 -0.14  0.50  0.52
CFC      0.09  0.08  0.15  0.15  -0.13  -0.13  -0.05  -0.06 -0.18  0.72
CU       0.03  0.04  0.10  0.11  -0.13  -0.13  -0.06  -0.07 -0.17
P        0.43  0.42  0.28  0.27   0.50   0.50   0.40   0.41
SMC^aa   0.39  0.39  0.56  0.56   0.49   0.50   0.92
SMC^am   0.39  0.38  0.58  0.57   0.48   0.49
SMC^ma   0.51  0.50  0.37  0.37   0.96
SMC^mm   0.51  0.48  0.36  0.36
J^aa     0.41  0.42  0.95
J^am     0.42  0.41
J^ma     0.90

         Sep    CS   Lift   Sup   Stab  Stabl Stabh Rob0.1
Prob     0.17 -0.53 -0.73  0.76   0.15   0.02  0.48  -0.14
Sep            0.14  0.01  0.42   0.03  -0.02  0.20  -0.13
CS                   0.64 -0.32  -0.09  -0.04 -0.25   0.03
Lift                      -0.47  -0.04   0.05 -0.29   0.10
Sup                               0.18   0.02  0.58  -0.17
Stab                                     0.86  0.59   0.03
Stabl                                          0.39   0.09
Stabh                                                -0.11




4.2    Noise Filtering

In practice, we often have to deal with noisy data. In this case, the number of formal concepts can be very large and the lattice structure becomes too complicated [24]. To test the ability to filter out noise we took 5 lattices of different structure. Four of them are quite simple (Fig. 1) and the fifth one is a binarized fragment of the Mushroom data set¹ with 500 objects and 14 attributes, whose concept lattice consists of 54 formal concepts.




[Diagrams (a)–(d) omitted.]

Fig. 1. Concept lattices for formal contexts with 300 objects and 6 attributes (a–c) and with 400 objects and 4 attributes (d)

¹ https://archive.ics.uci.edu/ml/datasets/Mushroom


    For a generated 0-1 data table we changed table elements (0 to 1 and 1 to 0) with a given probability. The rate of noise (the probability of replacement) varied in the range from 0.05 to 0.5. We tested the ability of a measure to filter out redundant concepts in terms of precision and recall. For the top-n (w.r.t. a measure) formal concepts, the recall and precision are defined as follows:

\[ \mathrm{recall}_{top\text{-}n} = \frac{|\mathrm{original\ concepts}_{top\text{-}n}|}{|\mathrm{original\ concepts}|}, \qquad \mathrm{precision}_{top\text{-}n} = \frac{|\mathrm{original\ concepts}_{top\text{-}n}|}{|top\text{-}n\ \mathrm{concepts}|} \]
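A sketch of the noise-injection and evaluation step under the conventions above (concepts identified, say, by their extents as frozensets; all names are ours):

    import random

    def add_noise(incidence, objects, attributes, rate, seed=0):
        # flip every 0/1 cell of the context with probability `rate`
        rng = random.Random(seed)
        noisy = set()
        for g in objects:
            for m in attributes:
                bit = (g, m) in incidence
                if rng.random() < rate:
                    bit = not bit
                if bit:
                    noisy.add((g, m))
        return noisy

    def topn_quality(ranked, original, n):
        # recall and precision of the top-n concepts of the noisy context
        # w.r.t. the concepts of the original one
        hits = len(set(ranked[:n]) & original)
        return hits / len(original), hits / n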


                 Table 2. Precision of indices at recall = 0.6

            Noise rate  Prob   Sep  Stabl Stabh   CV   CFC    CU  Freq Rob0.5
Antichain      0.1      0.03     1     1     1     1  0.15  0.25  0.13   0.05
               0.3      0.03     1     1     1     1  0.09  0.20  0.10   0.02
               0.5      0.02  0.20  0.12  0.13  0.29  0.07  0.10  0.06   0.02
Chain          0.1      0.80  0.44     1     1  0.67  0.27  0.13  0.27   0.80
               0.3      0.21  0.18  0.67     1  0.22  0.18  0.17  0.19      1
               0.5      0.29  0.13  0.25  0.57  0.21  0.14  0.16  0.14   0.57
Context 3      0.1      0.20     1     1     1  0.36  0.33  0.44  0.67   0.40
               0.3      0.16  0.67  0.80  0.80  0.44  0.33  0.44  0.50   0.40
               0.5      0.19  0.50  0.50  0.50  0.44  0.27  0.33  0.50   0.57
Context 4      0.1      0.44  1.00  1.00  1.00  1.00  0.80  0.57  0.80   0.50
               0.3      0.22  1.00  1.00  1.00  1.00  0.80  0.57  0.80   0.57
               0.5      0.14  0.67  1.00  1.00  0.44  0.80  0.57  0.80   0.67
Mushroom       0.1      0.28  0.29  0.84  0.84  0.32  0.28  0.32  0.31   0.30
               0.3      0.16  0.16  0.36  0.39  0.25  0.18  0.20  0.22   0.09
               0.5      0.08  0.10  0.17  0.17  0.14  0.11  0.16  0.11   0.06


    Figure 2 shows the averaged ROC curves for the measures. The curves that are close to the upper left corner correspond to the most powerful measures.
    The best and most stable results correspond to the high estimate of stability (stability_h). The lower estimate of stability has similar precision (Table 2), whereas the precision of separation and probability depends on the proportion of noise as well as on the lattice structure. The basic level measures that utilize the similarity and predictability approaches become zero for some concepts. The rate of vanished concepts (including original ones) increases as the noise probability gets bigger. In our study we treat such concepts as “false negatives”, so in this case the ROC curves do not pass through the point (1,1).




Fig. 2. Averaged ROC curves of the indices over contexts 1–5 with different noise rates (0.1–0.5)


More than that, recall and precision are unstable with respect to the noise rate and the lattice structure. This group of measures is therefore inappropriate for noise filtering.
    The other basic level measures, such as CU, CFC and CV, demonstrate much better recall than the previous ones. However, in general the precision of CU, CFC and CV is determined by the lattice structure (Table 2).
    Frequency has the highest precision among the indices that are applicable to the assessment of arbitrary sets of attributes. Frequency is stable with respect to the noise rate, but can vary under different lattice structures. For lift and collective strength, precision depends on the lattice structure, and collective strength also has quite unstable recall.
    The precision of robustness depends both on the lattice structure and on the value of α (Fig. 2). In our study, the highest precision was obtained for α close to 0.5.
    Thus, the most suitable metrics for noise filtering are the stability estimates, CV, frequency and robustness (with α greater than 0.4).
    In [24] it was noticed that a combination of indices can improve their filtering power. In this regard, we have studied the top-n concepts selected by pairwise combinations of measures. As the experiments showed, a combination of measures may improve the recall of the top-n set, while precision becomes lower than that of the more accurate measure alone. Figure 3 shows the recall and precision of different combinations of measures: in the best case the recall improves, whereas the precision on small sets of top-n concepts is lower than the precision of a single measure by itself.

Fig. 3. Recall and precision of metrics and their combinations on a fragment of the Mushroom dataset with noise probability 0.1


5   Conclusion

In this paper we have considered various methods for selecting interesting concepts and for noise reduction. We have focused on the most promising and well-interpretable approach, based on interestingness measures of concepts. Since the “interestingness” of a concept is a subjective measure, we have compared several measures known in the literature and identified groups of the most correlated ones. CU, CFC, CV, separation and frequency make up the first group; frequency is also correlated to separation and probability.
    Another part of our experiments was focused on noise filtering. We have found that the stability estimates work well with data of various noise rates and different structures of the original lattice. Robustness and three of the basic level metrics (the cue validity, category utility and category feature collocation approaches) can also be applied to noise reduction. The combination of measures can improve the recall as well, but only in the case of a high noise rate.

Acknowledgments
The authors were supported by the project “Mathematical Models, Algorithms, and Software Tools for Mining of Structural and Textual Data” within the Basic Research Program of the National Research University Higher School of Economics.


References
 1. Aggarwal, C.C., Yu, P.S.: A new framework for itemset generation. In: Proceedings
    of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of
    database systems. pp. 18–24. ACM (1998)
 2. Agrawal, R., Srikant, R., et al.: Fast algorithms for mining association rules. In:
    Proc. 20th int. conf. very large data bases, VLDB. vol. 1215, pp. 487–499 (1994)
 3. Arévalo, G., Berry, A., Huchard, M., Perrot, G., Sigayret, A.: Performances of
    galois sub-hierarchy-building algorithms. In: Formal Concept Analysis, pp. 166–
    180. Springer (2007)
 4. Babin, M.A., Kuznetsov, S.O.: Approximating concept stability. In: Domenach,
    F., Ignatov, D., Poelmans, J. (eds.) Formal Concept Analysis. Lecture Notes in
    Computer Science, vol. 7278, pp. 7–15. Springer Berlin Heidelberg (2012)
 5. Belohlavek, R., Macko, J.: Selecting important concepts using weights. In:
    Valtchev, P., Jäschke, R. (eds.) Formal Concept Analysis, Lecture Notes in Com-
    puter Science, vol. 6628, pp. 65–80. Springer Berlin Heidelberg (2011)
 6. Belohlavek, R., Trnecka, M.: Basic level of concepts in formal concept analysis. In:
    Domenach, F., Ignatov, D., Poelmans, J. (eds.) Formal Concept Analysis, Lecture
    Notes in Computer Science, vol. 7278, pp. 28–44. Springer Berlin Heidelberg (2012)


 7. Belohlavek, R., Trnecka, M.: Basic level in formal concept analysis: Interesting
    concepts and psychological ramifications. In: Proceedings of the Twenty-Third In-
    ternational Joint Conference on Artificial Intelligence. pp. 1233–1239. IJCAI ’13,
    AAAI Press (2013)
 8. Belohlavek, R., Vychodil, V.: Formal concept analysis with background knowledge:
    attribute priorities. Systems, Man, and Cybernetics, Part C: Applications and
    Reviews, IEEE Transactions on 39(4), 399–409 (2009)
 9. Berry, A., Huchard, M., McConnell, R., Sigayret, A., Spinrad, J.: Efficiently com-
    puting a linear extension of the sub-hierarchy of a concept lattice. In: Ganter, B.,
    Godin, R. (eds.) Formal Concept Analysis, Lecture Notes in Computer Science,
    vol. 3403, pp. 208–222. Springer Berlin Heidelberg (2005)
10. Buzmakov, A., Kuznetsov, S.O., Napoli, A.: Scalable estimates of concept stabil-
    ity. In: Glodeanu, C., Kaytoue, M., Sacarea, C. (eds.) Formal Concept Analysis,
    Lecture Notes in Computer Science, vol. 8478, pp. 157–172. Springer International
    Publishing (2014)
11. Cabena, P., Choi, H.H., Kim, I.S., Otsuka, S., Reinschmidt, J., Saarenvirta, G.:
    Intelligent miner for data applications guide. IBM RedBook SG24-5252-00 173
    (1999)
12. Carpineto, C., Romano, G.: A lattice conceptual clustering system and its appli-
    cation to browsing retrieval. Machine Learning 24(2), 95–122 (1996)
13. Cheung, K., Vogel, D.: Complexity reduction in lattice-based information retrieval.
    Information Retrieval 8(2), 285–299 (2005)
14. Dhillon, I., Modha, D.: Concept decompositions for large sparse text data using
    clustering. Machine Learning 42(1-2), 143–175 (2001)
15. Dias, S.M., Vieira, N.: Reducing the size of concept lattices: The JBOS approach.
    In: Proceedings of the 7th International Conference on Concept Lattices and Their
    Applications, Sevilla, Spain, October 19-21, 2010. pp. 80–91 (2010)
16. Dias, S.M., Vieira, N.J.: Concept lattices reduction: Definition, analysis and clas-
    sification. Expert Systems with Applications 42(20), 7084 – 7097 (2015)
17. Dobša, J., Dalbelo-Bašić, B.: Comparison of information retrieval techniques: la-
    tent semantic indexing and concept indexing. Journal of Inf. and Organizational
    Sciences 28(1-2), 1–17 (2004)
18. Düntsch, I., Gediga, G.: Simplifying contextual structures. In: Kryszkiewicz, M.,
    Bandyopadhyay, S., Rybinski, H., Pal, S.K. (eds.) Pattern Recognition and Ma-
    chine Intelligence, Lecture Notes in Computer Science, vol. 9124, pp. 23–32.
    Springer International Publishing (2015)
19. Emilion, R.: Concepts of a discrete random variable. In: Brito, P., Cucumel, G.,
    Bertrand, P., de Carvalho, F. (eds.) Selected Contributions in Data Analysis and
    Classification, pp. 247–258. Studies in Classification, Data Analysis, and Knowl-
    edge Organization, Springer Berlin Heidelberg (2007)
20. Ganter, B., Wille, R.: Contextual attribute logic. In: Tepfenhart, W., Cyre, W.
    (eds.) Conceptual Structures: Standards and Practices, Lecture Notes in Computer
    Science, vol. 1640, pp. 377–388. Springer Berlin Heidelberg (1999)
21. Jay, N., Kohler, F., Napoli, A.: Analysis of social communities with iceberg and
    stability-based concept lattices. In: Medina, R., Obiedkov, S. (eds.) Formal Concept
    Analysis, Lecture Notes in Computer Science, vol. 4933, pp. 258–272. Springer
    Berlin Heidelberg (2008)
22. Kendall, M.G.: A new measure of rank correlation. Biometrika 30(1–2), 81–93 (1938)
23. Klement, E.P., Mesiar, R., Pap, E.: Triangular norms. Springer Netherlands (2000)


24. Klimushkin, M., Obiedkov, S., Roth, C.: Approaches to the selection of relevant
    concepts in the case of noisy data. In: Kwuida, L., Sertkaya, B. (eds.) Formal
    Concept Analysis, Lecture Notes in Computer Science, vol. 5986, pp. 255–266.
    Springer Berlin Heidelberg (2010)
25. Kumar, C.A., Srinivas, S.: Latent semantic indexing using eigenvalue analysis for
    efficient information retrieval. Int. J. Appl. Math. Comput. Sci 16(4), 551–558
    (2006)
26. Kuznetsov, S.O.: Interpretation on graphs and complexity characteristics of a
    search for specific patterns. Automatic Documentation and Mathematical Linguis-
    tics 24(1), 37–45 (1989)
27. Kuznetsov, S.O.: Stability as an estimate of degree of substantiation of hypotheses
    derived on the basis of operational similarity. Nauchn. Tekh. Inf., Ser. 2 (12), 21–29
    (1990)
28. Kuznetsov, S.O.: On stability of a formal concept. Annals of Mathematics and
    Artificial Intelligence 49(1-4), 101–115 (2007)
29. Kuznetsov, S.O., Obiedkov, S., Roth, C.: Reducing the representation complexity
    of lattice-based taxonomies. In: Conceptual Structures: Knowledge Architectures
    for Smart Applications, pp. 241–254. Springer Berlin Heidelberg (2007)
30. Mannila, H., Toivonen, H., Verkamo, A.I.: Efficient algorithms for discovering asso-
    ciation rules. In: KDD-94: AAAI workshop on Knowledge Discovery in Databases.
    pp. 181–192 (1994)
31. Murphy, G.L.: The big book of concepts. MIT press (2002)
32. Rosch, E.: Principles of categorization. In: Rosch, E., Lloyd, B.B. (eds.) Cognition and Categorization, pp. 27–48. Lawrence Erlbaum (1978)
33. Snasel, V., Polovincak, M., Abdulla, H.M.D., Horak, Z.: On concept lattices and
    implication bases from reduced contexts. In: ICCS. pp. 83–90 (2008)
34. Stumme, G., Taouil, R., Bastide, Y., Pasquier, N., Lakhal, L.: Computing iceberg
    concept lattices with titanic. Data Knowl. Eng. 42(2), 189–222 (Aug 2002)
35. Tatti, N., Moerchen, F., Calders, T.: Finding robust itemsets under subsampling.
    ACM Transactions on Database Systems (TODS) 39(3), 20 (2014)
36. Zeigenfuse, M.D., Lee, M.D.: A comparison of three measures of the association
    between a feature and a concept. In: Proceedings of the 33rd Annual Conference
    of the Cognitive Science Society. pp. 243–248 (2011)
                  Why concept lattices are large
                 Extremal theory for the number of
               minimal generators and formal concepts

                    Alexandre Albano1 and Bogdan Chornomaz2
                             1
                               Technische Universität Dresden
                      2
                          V.N. Karazin Kharkiv National University



        Abstract. A unique type of subcontexts is always present in formal
        contexts with many concepts: the contranominal scales. We make this
        precise by giving an upper bound for the number of minimal generators
        (and thereby for the number of concepts) of contexts without contranom-
        inal scales larger than a given size. Extremal contexts are constructed
        which meet this bound exactly. They are completely classified.


 1    Introduction
 The primitive data model of Formal Concept Analysis is that of a formal context,
 which is unfolded into a concept lattice for further analysis. It is well known that
 concept lattices may be exponentially larger than the contexts which gave rise
 to them. An obvious example is the boolean lattice B(k), having 2^k elements,
 the standard context of which is the k × k contranominal scale N^c(k). This is not
 the only example of contexts having large associated concept lattices: indeed, the
 lattice of any subcontext is embeddable in the lattice of the whole context [4],
 which means that contexts having large contranominal scales as subcontexts
 necessarily have large concept lattices as well. These considerations raise one
 natural question, namely, whether there are other reasons for a concept lattice
 to be large. As will be shown in this paper, the answer is no.
      The structure of the paper is as follows. Our starting point is a known up-
 per bound for the number of concepts, which we improve using the language of
 minimal generators. Then, we show that our result is the best possible by con-
 structing lattices which attain exactly the improved upper bound. These lattices,
 i.e., the extremal lattices, are characterized.


 2    Fundamentals
 For a set S, a context of the form (S, S, ≠) will be called a contranominal scale.
 We will denote by N^c(k) the contranominal scale with k objects (and k at-
 tributes), that is, the context ([k], [k], ≠), where [k] := {1, 2, . . . , k}. The expres-
 sion K₁ ≤ K denotes that K₁ is a subcontext of K. The symbol ≅ expresses
 the existence of an order-isomorphism whenever two ordered sets are involved



or, alternatively, the existence of a context isomorphism in the case of formal
contexts. For a context K to be N^c(k)-free means that there does not exist a
subcontext K₁ ≤ K with K₁ ≅ N^c(k). The boolean lattice with k atoms, that is,
B(N^c(k)), will be denoted by B(k). Similarly, we say that a lattice L is B(k)-free
whenever B(k) does not (order-)embed into L. Using Proposition 32 from [4]
one has that K is N^c(k)-free whenever B(K) is B(k)-free. The converse is also
true and is, in fact, the content of our first proposition. An example of a context
which has N^c(3) as a subcontext, along with its concept lattice, is depicted in
Figure 1. One may observe that the context is N^c(4)-free, since its lattice has
ten concepts (and would have at least sixteen otherwise).



[Diagram omitted: the context K (objects g–k, attributes m–q) and its concept lattice.]

Fig. 1: A context K with N^c(3) ≤ K and its concept lattice. The object and attribute
concepts belonging to the B(3) suborder are indicated on the diagram.


     We denote by J(L) and M(L), respectively, the sets of completely join-
irreducible and meet-irreducible elements of a lattice L. The least and greatest
elements of a lattice L will be denoted, respectively, by 0_L and 1_L. The symbol ≺
will designate the covering relation between elements, that is, x ≺ y if x < y and
for every z ∈ L, x < z ≤ y ⇒ z = y. The length of a finite lattice is the number
of elements in a maximum chain minus one. An atom is an element covering 0_L,
while a coatom is an element covered by 1_L. Whenever two elements x, y ∈ L
are incomparable, we will write x||y. We denote by A(L) the set of atoms of a
lattice L. For an element l ∈ L, we shall write ↓l := {x ∈ L | x ≤ l} as well as
↑l := {x ∈ L | x ≥ l}. Moreover, for l ∈ L we denote by A_l the set A(L) ∩ ↓l
and, similarly, by J_l the set J(L) ∩ ↓l. A complete lattice L is called atomistic if
x = ⋁A_x holds for every x ∈ L. In this case, A(L) = J(L).
Proposition 1. Let K be a context such that B(k) embeds into B(K). Then
N^c(k) ≤ K.

Proof. Let (A₁, B₁), . . . , (A_k, B_k) be the atoms of B(k) in B(K). Similarly, de-
note its coatoms by (C₁, D₁), . . . , (C_k, D_k) in such a way that (A_i, B_i) ≤ (C_j, D_j) ⇔
i ≠ j for each i, j. Note that the sets A_i, as well as the sets D_i, are non-empty.
Let i ∈ [k]. Since (A_i, B_i) ≰ (C_i, D_i), we may take an object/attribute pair
g_i ∈ A_i, m_i ∈ D_i with (g_i, m_i) ∉ I. For every chosen object g_i ∈ A_i, one has that
(g_i, m_j) ∈ I for every j ∈ [k] with j ≠ i, because of (A_i, B_i) ≤ (C_j, D_j), which implies
B_i ⊇ D_j. Consequently, k distinct objects g_i (as well as k distinct attributes m_i)
were chosen. Combining both relations results in (g_i, m_j) ∈ I ⇔ i ≠ j for each i, j ∈ [k];
that is, the objects and attributes g_i, m_i form a contranominal scale in K.

Definition 1. Let (G, M, I) be a formal context. A set S ⊆ G is said to be
a minimal generator (of the extent S′′) if T′′ ≠ S′′ for every proper subset
T ⊊ S. The set of all minimal generators of a context K will be denoted by
MinGen(K).

Observation: In contexts with finitely many objects, every extent has at least
one minimal generator. Clearly, two different extents cannot share the same
minimal generator. Thus, the upper bound |B(K)| ≤ |MinGen(K)| holds for
contexts with finite object sets.
    The problem of computing exactly the number of concepts does not admit
a polynomial-time algorithm, unless P=NP. This was shown by Kuznetsov [5]:
more precisely, this counting problem is #P-complete. However, there are results
which establish upper bounds for the number of concepts: see for example [1–3,
6, 7].


3    The upper bound

Our investigations were inspired by a result of Prisner, who gave the first upper
bound regarding contranominal-scale-free contexts. The original version is in
graph-theoretic language. Reformulated, it reads as follows:

Theorem 1 (Prisner [6]). Let K = (G, M, I) be an N^c(k)-free context. Then,

\[ |B(K)| \leq (|G|\,|M|)^{k-1} + 1. \]

    In this section we will show an improvement of Theorem 1. For that, we will
relate minimal generators with contranominal scales. The first step towards this
is the equivalence shown in Proposition 2. Note that, since the derivation operators
are antitone, the ≠ symbol may be substituted by ⊋.

Proposition 2. Let (G, M, I) be a formal context. A set S ⊆ G is a minimal
generator if and only if for every g ∈ S, it holds that (S \ {g})′ ≠ S′.

Proof. We will show the two equivalent contrapositions. If (S \ {g})′ = S′, then,
of course, (S \ {g})′′ = S′′, and S is not a minimal generator. For the converse,
suppose that S is not a minimal generator, and take a proper subset T of S with
T′′ = S′′. Note that T′′ = S′′ implies T′ = S′. Let g ∈ S \ T. On one hand,
(S \ {g}) ⊆ S implies (S \ {g})′ ⊇ S′. On the other hand, (S \ {g}) ⊇ T implies
(S \ {g})′ ⊆ T′ = S′. Combining both yields (S \ {g})′ = S′.

    The next lemma relates minimal generators and contranominal scales.


Lemma 1. Let K = (G, M, I) be a context and A ⊆ G. There exists a contra-
nominal scale K₁ ≤ K having A as its object set if and only if A is a minimal
generator. In particular, if G is finite:

\[ \max_{A \subseteq G} \{|A| : A \text{ is a minimal generator}\} = \max\{k \in \mathbb{N} : N^c(k) \leq K\}. \]

Proof. Suppose that A is a minimal generator and let g ∈ A. By Proposition 2,
one has that (A \ {g})′ ⊋ A′. Hence, there exists an attribute m with (g, m) ∉ I and
(h, m) ∈ I for every h ∈ A \ {g}. Clearly, two different objects g₁, g₂ ∈ A cannot give
rise to the same attribute m, since the two pairs of conditions (g_i, m) ∉ I and (h, m) ∈ I for
every h ∈ A \ {g_i} cannot be satisfied simultaneously (i = 1, 2). Thus, there exists
an injection ι : A → M with (g, ι(g)) ∉ I and (h, ι(g)) ∈ I for each g ∈ A and each h ∈ A \ {g}.
By setting N = ι(A), one has that (A, N, I ∩ (A × N)) is a contranominal scale.
For the converse, let K₁ = (S, S, ≠) ≤ K be a contranominal scale and let g ∈ S.
Clearly, g ∉ S′. Moreover, g ∈ (S \ {g})′. This amounts to (S \ {g})′ ⊋ S′ for
each g ∈ S. By Proposition 2, the set S is a minimal generator.
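Proposition 2 and Lemma 1 suggest a direct (exponential-time) test. The following sketch, with contexts given as sets of (object, attribute) pairs and all names our own, finds the largest k with N^c(k) ≤ K:

    from itertools import combinations

    def derivation(objs, attributes, incidence):
        # S': the attributes shared by all objects of S
        return {m for m in attributes if all((g, m) in incidence for g in objs)}

    def is_minimal_generator(s, attributes, incidence):
        # Proposition 2: S is a minimal generator iff removing any g changes S'
        sp = derivation(s, attributes, incidence)
        return all(derivation(s - {g}, attributes, incidence) != sp for g in s)

    def largest_contranominal(objects, attributes, incidence):
        # Lemma 1: the largest k with N^c(k) <= K is the size of the largest
        # minimal generator (subsets of minimal generators are again minimal
        # generators, so the search may stop at the first empty level)
        best = 0
        for r in range(1, len(objects) + 1):
            if not any(is_minimal_generator(set(s), attributes, incidence)
                       for s in combinations(objects, r)):
                return best
            best = r
        return best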

    A consequence of Lemma 1 is the following theorem, which improves on
Prisner's bound by a factor of the order of k! · |M|^k / k.

Theorem 2. Let K = (G, M, I) be an N^c(k)-free context with finite G. Then:

\[ |B(K)| \leq |\mathrm{MinGen}(K)| \leq \sum_{i=0}^{k-1} \binom{|G|}{i}. \]

In particular, if k ≤ |G|/2:

\[ |B(K)| \leq k \cdot \frac{|G|^{k-1}}{(k-1)!}. \]

Proof. Lemma 1 guarantees that K does not have any minimal generator of
cardinality greater than or equal to k. The sum above is the number of subsets of G
having cardinality at most k − 1.

Definition 2. We denote by f(n, k) the upper bound in Theorem 2:

\[ f(n,k) := \sum_{i=0}^{k-1} \binom{n}{i}. \]

    The estimate in Theorem 2 for f(n, k) gets worse as k gets close to |G|/2.
Tighter upper bounds for the sum of binomial coefficients may be found in [9].


4    Sharpness: preparatory results

The following property of f (n, k) is needed for the next two sections.


Proposition 3. The function f(n, k) satisfies the following identity:

\[ f(n,k) = f(n-1,k-1) + f(n-1,k). \]

Proof. This follows from a standard binomial identity:

\[ f(n-1,k) + f(n-1,k-1) = \sum_{i=0}^{k-1}\binom{n-1}{i} + \sum_{j=0}^{k-2}\binom{n-1}{j} = 1 + \sum_{i=1}^{k-1}\left[\binom{n-1}{i} + \binom{n-1}{i-1}\right] = 1 + \sum_{i=1}^{k-1}\binom{n}{i} = f(n,k). \]
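For illustration, f and the identity just proved can be checked numerically (a small sketch, not part of the original development):

    from math import comb

    def f(n, k):
        # the bound of Theorem 2: number of subsets of [n] of size at most k - 1
        return sum(comb(n, i) for i in range(k))

    # the identity of Proposition 3, checked on a small grid
    assert all(f(n, k) == f(n - 1, k - 1) + f(n - 1, k)
               for n in range(2, 12) for k in range(2, 12))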

    Consider a finite lattice L. It is well known that every element x ∈ L is the
supremum of some subset of J(L): for example, x = ⋁J_x. We call such a subset
a representation of x through join-irreducible elements (for brevity, we may say
a representation through irreducibles of x or even only a representation of x). A
representation S ⊆ J(L) of x is called irredundant if ⋁(S \ {y}) ≠ x for every
y ∈ S. Of course, every x ∈ L has an irredundant representation, but it does not
need to be unique. Note that irredundant representations are precisely minimal
generators when one takes the standard context of L, (J(L), M(L), ≤). Indeed,
in that formal context, the closure of object sets corresponds to the supremum
of join-irreducible elements of L. For an element x ∈ L, there may exist elements
in J_x which belong to every representation of x: the so-called extremal points.
An element z ∈ J_x is an extremal point of x if there exists a lower neighbor
y of x such that J_y = J_x \ {z}. Every representation of x must contain every
extremal point z of x since, in this case, the supremum ⋁(J_x \ {z}) is strictly
smaller than x (and is actually covered by x).
    In Section 5 we shall construct finite lattices for which every element has ex-
actly one irredundant representation. It turns out that, in the finite case, these
lattices are precisely the meet-distributive lattices. This is implied by Theorem
44 of [4], which actually gives information about the unique irredundant repre-
sentation as well: a finite lattice L is meet-distributive if and only if for every
x ∈ L the set Ex of all extremal points of x is a representation of x (and Ex is,
therefore, the unique irredundant representation of x, since every representation
of x must contain Ex ). Proposition 4 provides a characteristic property for the
finite case which will be used in our constructions.

Proposition 4. Let L be a finite lattice. The following assertions are equivalent:
  i) L is meet-distributive.
 ii) Every element x ∈ L is the supremum of its extremal points.
iii) For every x, y ∈ L with x ≺ y, it holds that |Jy \ Jx | = 1.

Proof. The equivalence between i) and ii) may be found in Theorem 44 of [4].
Let x ∈ L and define E_x = {z ∈ J_x | z is an extremal point of x}. We now show
that ii) implies iii). Let y ∈ L with y < x. This implies J_y ⊊ J_x. The set J_y
does not contain E_x, because this would force y ≥ x. Therefore, y = ⋁J_y is
upper bounded by some element in the set U = {⋁(J_x \ {z}) | z ∈ E_x} (note
that x ∉ U). Hence, every lower neighbor of x has a representation of the form
⋁(J_x \ {z}) with z ∈ E_x. Now we show that iii) implies ii). Define y = ⋁E_x
and suppose by contradiction that y < x. Then, there exists an element z such
that y ≤ z ≺ x and J_z ⊇ E_x. But then, z ≺ x implies J_x \ J_z = {w} for some
w ∈ J(L), which means that w is an extremal point of x. This contradicts the
fact that E_x contains all extremal points of x.

   The next lemma will be useful in Section 5, when we shall change the per-
spective from lattices to contexts.

Lemma 2. Let L be a finite lattice. If L is B(k)-free, then every element has a
representation through join-irreducibles of size at most k − 1. The converse holds
if L is meet-distributive.

Proof. Let K = (J(L), M(L), ≤) be the standard context of L. We identify the
elements of L with the extents of K via x ↦ J_x. Suppose that L is B(k)-free.
Then, K is N^c(k)-free. Let A be an arbitrary extent of K and S a minimal
generator of A. Then, by Lemma 1, it follows that |S| ≤ k − 1. Since A = S′′ =
⋁S, we have the desired representation. Now, suppose that |J_y \ J_x| = 1 holds
for every x, y ∈ L with x ≺ y (cf. Proposition 4). To prove the converse, we
suppose that B(k) embeds into L and our goal is to show that some x ∈ L does
not have any representation with fewer than k elements of J(L). Now, since B(k)
embeds into L, Proposition 1 implies that N^c(k) is a subcontext of K. Applying
Lemma 1, we have that there exists a minimal generator S ⊆ J(L) with |S| = k.
Equivalently, S is an irredundant representation of the element S′′ of L. By
Proposition 4, S is the unique irredundant representation of S′′. Therefore, S′′
cannot be expressed as the supremum of fewer than k join-irreducible elements.



5    Sharpness: construction of extremal lattices

In this section, we will consider only finite lattices. Our objective is to construct
lattices which prove that the bound in Theorem 2 is sharp.

Definition 3. For positive integers n and k, we call a lattice (n,k)-extremal
if it has at most n join-irreducible elements, is B(k)-free, and has exactly f (n, k)
elements.

   It is clear that every (n, 1)-extremal lattice is trivial, i.e., the lattice with
one element. To construct (n, k)-extremal lattices with larger k, we will use an
operation which we call doubling.

Definition 4. Let L be an ordered set and K ⊆ L. The doubling of K in L
is defined to be L[K] = L ∪ K̇, where K̇ is a disjoint copy of K, i.e., K̇ ∩ L = ∅.
The order in (L[K], ≤′) is defined as follows:

\[ \leq' \; = \; \leq \; \cup \; \{(x, \dot{y}) \in L \times \dot{K} \mid x \leq y\} \; \cup \; \{(\dot{x}, \dot{y}) \in \dot{K} \times \dot{K} \mid x \leq y\}. \]

    We will employ the notation ẋ to denote the image under doubling of an
element x ∈ K. Note that x ≺ ẋ for every x ∈ K, and that ẋ is the only upper
neighbor of x in K̇. When L is a set family C ⊆ P(G), then the diagram of
L[K] can be easily depicted: the doubling C[D] (with D ⊆ C) corresponds to
the set family C ∪ {D ∪ {g} | D ∈ D}, where g ∉ G is a new element. Figure 2
illustrates three doubling operations. The first one is the doubling of the chain
{∅, {2}, {1, 2}} inside the closure system C1 = P([2]), resulting in C2. The (a
fortiori) closure systems C3 and C4 are obtained by doubling, respectively, the
chains {∅, {3}, {2, 3}, {1, 2, 3}} and {∅, {2}, {2, 3}, {1, 2, 3}} inside C2.
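In this set-family form, doubling is a one-liner. The sketch below (names ours) reproduces the first doubling of Figure 2:

    def double(family, subfamily, g):
        # the doubling C[D] in set-family form: C u { D u {g} | D in D }
        assert all(d in family for d in subfamily)   # D must be a subfamily of C
        assert all(g not in c for c in family)       # g must be a fresh element
        return set(family) | {frozenset(d | {g}) for d in subfamily}

    # the first doubling of Figure 2: the chain {{}, {2}, {1,2}} inside P([2])
    c1 = {frozenset(s) for s in (set(), {1}, {2}, {1, 2})}
    chain = {frozenset(s) for s in (set(), {2}, {1, 2})}
    c2 = double(c1, chain, 3)                        # 7 closed sets, as in C2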


[Diagrams omitted: the closure systems C1–C4.]

                      Fig. 2: Doubling chains inside closure systems


    Since we are interested in constructing lattices, it is vital to guarantee that
the doubling operation produces a lattice. By a meet-subsemilattice of a lattice L
is meant a subset K of L, endowed with the inherited order, such that x ∧ y ∈ K
holds for every x, y ∈ K. It is called topped if 1L ∈ K.

Proposition 5. If K is a topped meet-subsemilattice of a lattice L, then L[K]
is a lattice.

Proof. Let x, y ∈ L[K]. If both x and y belong to L, then clearly x ∧ y and x ∨ y
belong to L ⊆ L[K]. Suppose that only one among x and y, say x, belongs to
L. Then y = ż with z ∈ K. We have that x ∧ y = x ∧ z ∈ L ⊆ L[K] because of
x ≱ 0̇_K and y = z ∨ 0̇_K. For the supremum, set S = {w ∈ K | w ≥ x, w ≥ z}
and u = ⋀S. Note that the fact that K is topped causes S ≠ ∅. Since K is a
meet-subsemilattice, we have that u ∈ K. It is clear that u is the least upper
bound of x and z which belongs to K. Therefore, u̇ is the least upper bound of
x and y, because of 0̇_K ≤ u̇ and y = z ∨ 0̇_K. The remaining case is x, y ∈ K̇
for which, clearly, x ∧ y exists. Moreover, writing x = ṫ, y = ż with t, z ∈ K
and setting S = {w ∈ K | w ≥ t, w ≥ z} as well as u = ⋀S makes clear that
u̇ = x ∨ y.

    When extrinsically considered, topped meet-subsemilattices are lattices. This
is compatible with the proof of Proposition 5, where the supremum and infimum
of two elements in K̇ may be easily verified to belong to K̇: that is, K̇ is actually
a sublattice of L[K].
    A suborder K of an ordered set L is called cover-preserving if x ≺K y implies
x ≺L y for every x, y ∈ K. This property plays a key role by preserving meet-
distributivity under a doubling operation:

Proposition 6. Let L be a meet-distributive lattice and let K be a cover-preserving,
topped meet-subsemilattice of L. Then, L[K] is a meet-distributive lattice.

Proof. The fact that L[K] is a lattice comes from Proposition 5. Every element
ẋ ∈ K̇ has a lower neighbor in K, namely, x. Thus, the total number of lower
neighbors of ẋ is one only if x does not cover any element in K, that is, x = 0_K.
Therefore, 0̇_K is the only join-irreducible of L[K] which is not a join-irreducible
of L. Let x, y ∈ L[K] with x ≺_{L[K]} y. We use J′(·) to denote our J-notation
in L[K] and J(·) in L. If x, y ∈ L, then clearly J′_y = J_y and J′_x = J_x, which
results in |J′_y \ J′_x| = |J_y \ J_x| = 1. If x, y ∉ L, then x = ż and y = ẇ with
z, w ∈ K and z ≺_K w. From the fact that K is cover-preserving, we conclude
that z ≺_L w. Because L is meet-distributive, it follows that |J_w \ J_z| = 1. Clearly
one has J′_x = J_z ∪ {0̇_K} and J′_y = J_w ∪ {0̇_K}, which yields |J′_y \ J′_x| = 1. For the
remaining case, one has necessarily x ∈ L and y ∉ L. In these conditions, x ≺ y
results in y = ẋ and, therefore, J′_y = J′_x ∪ {0̇_K}, implying |J′_y \ J′_x| = 1.

    Proposition 7 is the first assertion about extremal meet-subsemilattices. We
note that the set of join-irreducible elements of a meet-subsemilattice K of a
lattice L is not the same as the set J(L) ∩ K. Therefore, what is meant by an
(n, k)-extremal meet-subsemilattice of L is precisely the following: a lattice K
which is (n, k)-extremal and a meet-subsemilattice of L as well.
    Observe that chains with n + 1 elements are precisely the (n, 2)-extremal
lattices. Proposition 7 illustrates, in particular, that an n + 1 element chain may
be seen as the result of a doubling operation on an n element chain, provided
that the doubling operation is performed with respect to the trivial topped meet-
subsemilattice ↑ 1, which is (n, 1)-extremal.

Proposition 7. Let L be an (n − 1, k)-extremal lattice with n, k ≥ 2. Suppose
that K is a topped, (n − 1, k − 1)-extremal meet-subsemilattice. If L[K] is B(k)-
free, then it is an (n, k)-extremal lattice.


Proof. Proposition 5 guarantees that L[K] is indeed a lattice. As in the proof of
Proposition 6, we have that J(L[K]) = J(L) ∪ {0̇_K} and, in particular, L[K] has
at most n join-irreducible elements. The claim that L[K] has f(n, k) elements
follows from Proposition 3.
    It is clear now how (n, 2)-extremal lattices can be obtained by doubling
a trivial meet-subsemilattice of an (n − 1, 2)-extremal lattice. The succeeding
propositions and lemmas aim particularly towards a generalization of this opera-
tion: the doubling of topped, (n − 1, k − 1)-extremal meet-subsemilattices inside
(n − 1, k)-extremal lattices, yielding (n, k)-extremal lattices for k ≥ 3.
Proposition 8. Suppose that L is an (n, k)-extremal lattice. Then, for every
S, T ⊆ J(L) with |S|, |T| ≤ k − 1:

\[ \bigvee S = \bigvee T \;\Rightarrow\; S = T. \]

Moreover, if k ≥ 2 then |J(L)| = n.

Proof. We may suppose k ≥ 2, since the assertion holds trivially for k = 1.
Lemma 2 guarantees that every element x of L has a representation of size at
most k − 1. Therefore L = {⋁S | S ⊆ J(L), |S| ≤ k − 1}. Because of k ≥ 2
and the fact that |L| = f(n, k) is also the number of subsets of [n] having at
most k − 1 elements, one has |J(L)| ≥ n. In fact equality must hold, because
L has at most n join-irreducible elements. As a consequence of |J(L)| = n and
|L| = f(n, k), we have that no two sets S, T ⊆ J(L) with S ≠ T may lead to the
same supremum ⋁S = ⋁T.
   Chains are the only extremal lattices which are not atomistic, as a conse-
quence of the next lemma.
Lemma 3. Suppose that L is an (n, k)-extremal lattice with k ≥ 3. Then L is
atomistic and meet-distributive. In particular, the length of L equals the number
of its atoms and there exists an atom which is an extremal point of 1L .
Proof. If L were not atomistic, there would exist two comparable join-irreducible
elements, say, x, y with x < y. But then x ∨ y = y, which contradicts Proposi-
tion 8. Suppose that L is not meet-distributive and take x, y ∈ L with x ≺ y
such that A_y \ A_x has at least two elements. Clearly x ≠ 0_L and, therefore,
A_x ≠ ∅. Let u, v ∈ A_y \ A_x be any two distinct elements. From u ∉ A_x follows
that x < x ∨ u ≤ y which, in turn, implies x ∨ u = y. Similarly, v ∉ A_x implies
x < x ∨ v ≤ y which, in turn, implies x ∨ v = y. Let a ∈ A_x. Now, a ≤ x and
x||u imply a ∨ u = x ∨ u = y, as well as a ≤ x and x||v imply a ∨ v = x ∨ v = y.
We obtain a ∨ u = a ∨ v, contradicting Proposition 8. Choosing a maximal chain
x₀ ≺ x₁ ≺ . . . ≺ x_l in L and noticing that the sizes of the sets A_{x_i} grow by
exactly one element make the two remaining claims clear.
   Lemma 4 shows that non-trivial, extremal meet-subsemilattices are always
cover-preserving and topped. These two properties will be useful to assure that
a doubling L[K] is a meet-distributive lattice.


Lemma 4. Let L be an (n, k)-extremal lattice with k ≥ 3 and suppose that K
is an (n, k − 1)-extremal subsemilattice of L. Then, K is cover-preserving and
topped. If k ≥ 4, then K and L are atomistic with A(K) = A(L).

Proof. In case that k = 3, K is B(2)-free, that is, K is a chain. By Lemma 3,
K must be a maximal chain in order to have \(\sum_{i=0}^{1}\binom{n}{i} = n + 1\) elements. Hence,
1_K = 1_L. The maximality of K guarantees that K is cover-preserving. Now,
suppose that k ≥ 4. Again by Lemma 3, we have that both K and L are atom-
istic. Since K has n atoms, 0_K must be covered by n elements in L. But this is
possible only if 0_L = 0_K, because L also has n atoms. This forces A(K) = A(L)
as well as 1_L ∈ K, because 1_L is the only element that upper bounds each
a ∈ A(K). To prove that K is cover-preserving, we apply Lemma 3 twice, ob-
taining |A_y \ A_x| = 1 for every x, y ∈ K with x ≺_K y as well as |A_y \ A_x| = 1
for every x, y ∈ L with x ≺_L y. Both conditions hold simultaneously only if the
implication x ≺_K y ⇒ x ≺_L y holds, i.e., if K is cover-preserving.

    A complete meet-embedding is a meet-embedding which preserves arbitrary
meets, including ⋀∅. As a consequence, the greatest element of one lattice
gets mapped to the greatest element of the other. Images of complete meet-
embeddings are topped meet-subsemilattices. This notion is required for the
following simple fact, which aids us in the construction of sequences of (n, k)-
extremal lattices with fixed n and growing k. In Proposition 9, the symbol K[J]
(for instance) actually means the doubling of the image of J under the corre-
sponding embedding.

Proposition 9. Suppose that J, K and L are lattices with complete meet-embeddings
E1 : J → K and E2 : K → L. Then, there exists a complete meet-embedding from
K[J] into L[K].

Proof. The fact that K[J] and L[K] are lattices comes from Proposition 5. Of
course, there is an induced embedding from J̇ into K̇, for which we will use
the same symbol E₁. The mapping E₃ : K[J] → L[K] defined by E₃(x) = E₁(x)
for x ∈ J̇ and E₃(x) = E₂(x) for x ∈ K may be checked to be a complete
meet-embedding.

   As mentioned after Proposition 7, we will make use of an operation which
doubles an extremal meet-subsemilattice of an extremal lattice. The next theo-
rem shows that the lattice produced by this operation is indeed extremal.

Theorem 3. Let L be an (n − 1, k)-extremal lattice with n ≥ 2 and k ≥ 3 and
suppose that K is an (n − 1, k − 1)-extremal meet-subsemilattice of L. Then,
L[K] is an (n, k)-extremal lattice.

Proof. Lemma 3 guarantees that L is atomistic and meet-distributive. Moreover,
Lemma 4 guarantees that K is cover-preserving and topped, so that, in partic-
ular, L[K] is a meet-distributive lattice, as a consequence of Proposition 6. To
prove that L[K] is (n, k)-extremal it is sufficient to show that L[K] is B(k)-free,


because of Proposition 7. We will do so by proving that every element of L[K]
has a representation through join-irreducibles of size at most k − 1. This indeed
suffices because L[K] is meet-distributive, so that Lemma 2 may be applied.
Observe that J(L[K]) = A(L) ∪ {0̇_K}, since L is atomistic.
    Suppose that k = 3. In this case K is a chain, and a maximal one because of
Lemma 4. Let x ∈ L[K]. If x ∈ L, then Lemma 2 implies that x = ⋁S for some
S ⊆ A(L) with |S| ≤ 2, and the same representation may be used in L[K]. If x ∉ L,
then x = ẏ = y ∨ 0̇_K for some y ∈ K. If y ∈ A(L), we are done. Otherwise, take
z, w ∈ A(L) such that z ∨ w = y and thus x = z ∨ w ∨ 0̇_K. Exactly one among
z and w belongs to K. Without loss of generality, let it be z. Then, it is clear
that 0̇_K ≺ z ∨ 0̇_K and that z ∨ 0̇_K < w ∨ 0̇_K, since there exists only one element
covering 0̇_K. Hence, x = z ∨ w ∨ 0̇_K = w ∨ 0̇_K.
    Suppose that k ≥ 4. As noted after Proposition 5, one has that K̇ is, by itself,
a lattice. Moreover, Lemma 4 guarantees that K is atomistic with A(L) = A(K).
Let x ∈ L[K]. If x ∈ L then, in L, x = ⋁S for some S ⊆ A(L) ⊆ J(L[K]) with
|S| ≤ k − 1, because of Lemma 2 and the fact that L is B(k)-free. Of course, S
is also a representation of x in L[K]. If x ∉ L, then x = ẏ for some ẏ ∈ K̇. Since
K is B(k − 1)-free, it follows that, in K, y = ⋁S for some S ⊆ A(K) ⊆ J(L[K])
with |S| ≤ k − 2, once again as a consequence of Lemma 2. Clearly, in L[K],
one has ẏ = ⋁Ṡ = 0̇_K ∨ ⋁S, where the last equality follows from the fact that
ż = z ∨ 0̇_K for every z ∈ S. Thus, we have a representation of ẏ = x through no
more than k − 1 join-irreducible elements of L[K].

   Corollary 1 describes how (n, k)-extremal lattices can be non-deterministically
constructed. In particular, the upper bound present in Theorem 2 is the best
possible.

Corollary 1. For every n and k, there exists at least one (n, k)-extremal lattice.

Proof. Define a partial function Φ satisfying

\[
\Phi : \mathbb{N}^* \times \mathbb{N}^* \to \mathbb{L}, \qquad
(n,k) \mapsto
\begin{cases}
(\{[n]\}, \subseteq) & \text{if } k = 1,\\[2pt]
(\{\emptyset, \{1\}\}, \subseteq) & \text{if } k \geq 2,\ n = 1,\\[2pt]
\Phi(n-1,k)[\mathcal{E}(\Phi(n-1,k-1))] & \text{if } n, k \geq 2 \text{ and there exists a complete}\\
 & \text{meet-embedding } \mathcal{E} : \Phi(n-1,k-1) \to \Phi(n-1,k),
\end{cases}
\]

where L is the class of all lattices. We prove by induction on n that Φ(n, k) is
a total function. The cases n = 1 and n = 2 are trivial. Let n ∈ N with n ≥ 3
and suppose that Φ(n − 1, k) is defined for every k ∈ N∗ . Let k ∈ N, k ≥ 2. By
the induction hypothesis, the values Φ(n − 1, k) and Φ(n − 1, k − 1) are defined.
If k = 2, then Φ(n − 1, k − 1) is a trivial lattice and the existence of a complete
meet-embedding into Φ(n − 1, k) is clear and, thereby, Φ(n, k) is defined. We
therefore assume k ≥ 3. By the definition of Φ, one has that Φ(n−1, k) =
Φ(n−2, k)[E(Φ(n−2, k−1))] and that Φ(n−1, k−1) = Φ(n−2, k−1)[F(Φ(n−2, k−2))]
for some pair of complete meet-embeddings E and F. Applying Proposition 9
with Φ(n − 2, k − 2), Φ(n − 2, k − 1) and Φ(n − 2, k) results in the existence
of a complete meet-embedding G : Φ(n − 1, k − 1) → Φ(n − 1, k), which yields
that Φ(n, k) is defined. Since k is arbitrary, every Φ(n, k) is defined. The (n, k)-
extremality of each lattice can be proved by induction on n as well and by
invoking Theorem 3.
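The construction lends itself to a short program. The sketch below (ours, not the authors') realizes one deterministic instance of Φ on closure systems over {0, …, n−1}; for this particular recursive choice, the family Φ(n−1, k−1) is literally a subfamily of Φ(n−1, k), so inclusion plays the role of the complete meet-embedding:

    from math import comb

    def phi(n, k):
        # one instance of the map Phi: a closure system on {0, ..., n-1}
        # realizing an (n, k)-extremal lattice
        if k == 1:
            return {frozenset(range(n))}          # trivial (n, 1)-extremal lattice
        if n == 1:
            return {frozenset(), frozenset({0})}  # two-element chain
        base = phi(n - 1, k)
        sub = phi(n - 1, k - 1)                   # a subfamily of base, by induction,
        g = n - 1                                 # so inclusion is the meet-embedding
        return base | {s | {g} for s in sub}

    # sanity check: |phi(n, k)| = f(n, k) on a small grid
    assert all(len(phi(n, k)) == sum(comb(n, i) for i in range(k))
               for n in range(1, 7) for k in range(1, 7))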

    Figure 3 depicts the diagrams of nine (n, k)-extremal lattices which are con-
structible by Corollary 1. It is true that, in general, (n, k)-extremal lattices are
not unique up to isomorphism: note that the (3, 3) and (4, 3)-extremal lattices
in Figure 3 are also present in Figure 2 as the lattices C2 and C3 . The lattice C4 ,
depicted in that same figure, is a (4, 3)-extremal lattice which is not isomorphic
to C3 . We shall, however, show in the next section that every extremal lattice
arises from the construction described in Corollary 1.


6    Characterization of extremal lattices

In the last section, we constructed lattices whose sizes are exactly the upper
bound present in Theorem 2. In this section, we will show that every lattice
meeting those requirements must be obtained from our construction.

Lemma 5. Let L be an atomistic lattice, a an atom and c a coatom with A_c =
A(L) \ {a}. Then the mapping E : x ↦ c ∧ x is a complete meet-embedding of ↑a
into ↓c such that E(x) ≺ x for every x ∈ ↑a.

Proof. The fact that E preserves non-empty meets is clear, since c is a fixed
element. Also, 1_L is mapped to c = 1_{↓c}, so that E preserves arbitrary meets.
Note that
                     A_{E(x)} = A_{c∧x} = A_x ∩ A_c = A_x \ {a}.
Hence, E(x) ≺ x as well as E(x) ∨ a = x. The latter implies injectivity.

   The next theorem shows that every extremal lattice is constructible by the
process described in Corollary 1, and can be seen as a converse of that result.

Theorem 4. Let L be an (n, k)-extremal lattice with k ≥ 3. Then L = J ∪̇ K,
where J is an (n − 1, k)-extremal lattice and K is an (n − 1, k − 1)-extremal
lattice. Moreover, there exists a complete meet-embedding E : K → J such that
E(x) ≺ x for every x ∈ K. In particular, L ≅ J[E(K)].

Proof. From Lemma 3, one has that L is atomistic and we may take an atom a
which is an extremal point of 1_L, that is, A(L) \ {a} = A_c with c being a coatom
of L. Consider the lattices J = ↓c and K = ↑a. Observe that L = J ∪̇ K and
let E : K → J be the complete meet-embedding provided by Lemma 5. Clearly,
J has n − 1 atoms and is B(k)-free; therefore |J| ≤ f(n − 1, k). Moreover, K
must be B(k − 1)-free: indeed, if there existed B ≅ B(k − 1) inside K, then

[Fig. 3: Diagrams of (n, k)-extremal lattices with 2 ≤ n, k ≤ 4 (rows k = 2, 3, 4;
columns n = 2, 3, 4); diagrams not reproduced. Elements shaded in black represent
the doubled (n − 1, k − 1)-extremal lattice.]

B ∪ E(B) would be a boolean lattice with k atoms inside L, which is impossible.
The lattice K has at most n − 1 atoms, and consequently |K| ≤ f(n − 1, k − 1),
since the function n ↦ \sum_{i=0}^{k-1} \binom{n}{i} is monotonically increasing. Now, we have that
|J| + |K| = |L| = f(n, k) = f(n − 1, k) + f(n − 1, k − 1), where the last equality
follows from Proposition 3. Since |J| ≤ f(n − 1, k) and |K| ≤ f(n − 1, k − 1), those
two inequalities must hold with equality. Therefore, J and K are, respectively,
(n − 1, k)- and (n − 1, k − 1)-extremal.
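
For concreteness, the identity used in the last step follows from Pascal's rule,
assuming (as the monotonicity remark above suggests) that f(n, k) denotes the
binomial sum \sum_{i=0}^{k-1} \binom{n}{i}:

    f(n-1, k) + f(n-1, k-1)
        = \sum_{i=0}^{k-1} \binom{n-1}{i} + \sum_{i=0}^{k-2} \binom{n-1}{i}
        = \binom{n-1}{0} + \sum_{i=1}^{k-1} \left[ \binom{n-1}{i} + \binom{n-1}{i-1} \right]
        = \binom{n}{0} + \sum_{i=1}^{k-1} \binom{n}{i}
        = f(n, k).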


7    Conclusion and related work
We showed a sharp upper bound for the number of minimal generators of a
context. Extremal lattices were constructed and also characterized. The rôle
played by contranominal scales in formal contexts may be seen as analogous to
that of cliques in simple graphs, when one considers extremality with respect
to the number of edges. The Sauer-Shelah Lemma [8] provides an upper bound
which is similar to that of Theorem 2. This is not a coincidence: it can be
shown, though not without some effort, that the condition of a concept lattice
being B(k)-free is equivalent to the family of its extents not shattering a set of
size k. As for the sharpness of the bound (which we prove in Section 5), in our
case it is non-trivial, whereas the sharpness of the result of Sauer and Shelah is
immediate.


8    Acknowledgements
We deeply thank Bernhard Ganter for his invaluable feedback and fruitful
discussions.


References
1. Alexandre Albano. Upper bound for the number of concepts of contranominal-scale
   free contexts. In Formal Concept Analysis - 12th International Conference, ICFCA
   2014, Cluj-Napoca, Romania, June 10-13, 2014. Proceedings, pages 44-53, 2014.
2. Alexandre Albano and Alair Pereira do Lago. A convexity upper bound for the
   number of maximal bicliques of a bipartite graph. Discrete Applied Mathematics,
   165:12-24, 2014. 10th Cologne/Twente Workshop on Graphs and Combinatorial
   Optimization (CTW 2011).
3. David Eppstein. Arboricity and bipartite subgraph listing algorithms. Information
   Processing Letters, 51(4):207-211, 1994.
4. Bernhard Ganter and Rudolf Wille. Formal Concept Analysis: Mathematical
   Foundations. Springer, Berlin-Heidelberg, 1999.
5. Sergei O. Kuznetsov. On computing the size of a lattice and related decision
   problems. Order, 18(4):313-321, 2001.
6. Erich Prisner. Bicliques in graphs I: bounds on their number. Combinatorica,
   20(1):109-117, 2000.
7. Dieter Schütt. Abschätzungen für die Anzahl der Begriffe von Kontexten. Master's
   thesis, TH Darmstadt, 1987.
8. Saharon Shelah. A combinatorial problem; stability and order for models and
   theories in infinitary languages. Pacific J. Math., 41(1):247-261, 1972.
9. Thomas Worsch. Lower and upper bounds for (sums of) binomial coefficients, 1994.
9. Thomas Worsch. Lower and upper bounds for (sums of) binomial coefficients, 1994.
            An Aho-Corasick Based Assessment of
         Algorithms Generating Failure Deterministic
                     Finite Automata

     Madoda Nxumalo¹, Derrick G. Kourie²,³, Loek Cleophas²,⁴, and Bruce W. Watson²,³

     ¹ Computer Science, Pretoria University, South Africa
     ² FASTAR Research, Information Science, Stellenbosch University, South Africa
     ³ Centre for Artificial Intelligence Research, CSIR Meraka Institute, South Africa
     ⁴ Foundations of Language Processing, Computer Science, Umeå University, Sweden
       {madoda,derrick,loek,bruce}@fastar.org — http://www.fastar.org


          Abstract. The Aho-Corasick algorithm derives a failure deterministic
          finite automaton for finding matches of a finite set of keywords in a
          text. It has the minimum number of transitions needed for this task.
          The DFA-Homomorphic Algorithm (DHA) is more general, deriving
          from an arbitrary complete deterministic finite automaton a
          language-equivalent failure deterministic finite automaton. DHA takes
          formal concepts of a lattice as input. This lattice is built from a
          state/out-transition formal context that is derived from the complete
          deterministic finite automaton. In this paper, three general variants of the abstract
          DHA are benchmarked against the specialised Aho-Corasick algorithm.
          It is shown that when heuristics for these variants are suitably chosen,
          the minimality attained by the Aho-Corasick algorithm can be closely
          approximated. A published non-lattice-based algorithm is also shown to
          perform well in experiments.


          Keywords: Failure deterministic finite automaton, Aho-Corasick algorithm



 1       Introduction

 A deterministic finite automaton (DFA) defines a set of strings, called its
 language. It is represented as a graph with symbol-labelled transitions between
 states. There are efficient algorithms to determine whether an input string
 belongs to the DFA's language. An approach to reducing DFA memory
 requirements is the use of so-called failure deterministic finite automata (FDFAs, also
 defined in Section 2). An FDFA can be used to define the same language as a
 DFA with a reduced number of transitions and hence reduced space required
 to store transition information. Essentially, this is achieved by replacing certain
 DFA state transitions by so-called failure transitions. A small additional
 computational cost is incurred in recognising whether given strings are part of the

language. By using a trie (the term used for a DFA graph that is a tree) and
failure transitions, Aho and Corasick [1] generalised the so-called KMP
algorithm [7] to multi-keyword pattern matching. There are two versions of their
algorithm: the so-called optimal one, which we call aco, and a failure one, acf.
aco builds a minimal DFA to find all matches of a given keyword set in a text.
acf builds, in a first step, a trie using all prefixes of words from the set. Each
state therefore represents a string which is the prefix of a keyword. Moreover,
the string of a state is spelled out by the transitions that connect the start state
of the trie to that state. In a second step, acf then inserts a failure transition
from each state of the trie to some other state. To briefly illustrate the nature
of failure transitions, suppose p is a state in the trie representing the string and
keyword she, and suppose q is another state representing the prefix string he of
the keyword hers. Then a failure transition from p to q would indicate that he is
the longest suffix of she that matches a prefix of some other keyword. With
appropriate further elaboration (details may be found in [1, 11]), the output of acf is
an FDFA that is language-equivalent to the aco one. It can also be shown that acf
is minimal in the following sense: no other FDFA that is language-equivalent to
aco can have fewer transitions than acf.
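
To make acf's two steps concrete, here is a minimal sketch (our illustration, not
code from [1] or from any toolkit used in this paper) that builds the trie and then
computes the failure links by a breadth-first traversal; the list-of-dicts trie
encoding is an assumption made for illustration.

from collections import deque

def build_trie(keywords):
    # Step 1: a trie over all prefixes; state 0 is the start state.
    goto = [{}]                          # goto[s] maps a symbol to a child state
    for word in keywords:
        s = 0
        for ch in word:
            if ch not in goto[s]:
                goto[s][ch] = len(goto)
                goto.append({})
            s = goto[s][ch]
    return goto

def failure_links(goto):
    # Step 2: fail[s] is the state whose string is the longest proper suffix
    # of s's string that is also a prefix of some keyword.
    fail = [0] * len(goto)
    queue = deque(goto[0].values())      # depth-1 states fail to the start state
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]              # climb the failure chain
            fail[t] = goto[f].get(ch, 0)
            queue.append(t)
    return fail

For the keyword set {she, hers}, the failure link of the state for she points to the
state for he, exactly as in the p and q illustration above.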

An algorithm proposed in [8], called the DFA-Homomorphic Algorithm (DHA),
constructs from any complete DFA a language-equivalent FDFA. As predicted
by the theory [2], the resulting FDFA is not necessarily minimal. The abstract
version of the algorithm described in [8] involves the construction of a concept
lattice, as explained in Section 2. The original version of the algorithm leaves a
number of decisions as nondeterministic choices. It also strives for minimality by
following a "greedy" heuristic with respect to information embedded in the lattice.
However, concrete versions of DHA and the effect of this heuristic have not been
tested. Here we propose various alternative concrete versions of the algorithm for
different deterministic choices. The acf FDFAs provide a benchmark for assessing
the performance of these concrete variants of DHA. An aco DFA is used as
input to each of several variants of the DHA and the resulting DHA-FDFAs are
compared against the acf version.

An alternative approach to constructing FDFAs from an arbitrary DFA has been
proposed by Kumar et al. [10]⁵. Their technique is based on finding the maximal
spanning tree [9] of a suitably weighted undirected graph that reflects the
structure of the underlying DFA. Two algorithms were proposed: one based
on a maximal spanning tree, and the other on a redefined maximal spanning
tree. Further details about their algorithms may be found in the original
publication. The original maximal spanning tree based algorithm was included in
our comparative study, with acf as the benchmark for its performance assessment.

5
     Their research uses different terminology from that given above. They refer to an
     FDFA as a delayed-input DFA (abbreviated to D²FA) and failure transitions are
     called default transitions.


Various other FDFA-related research has been conducted in certain limited
contexts; see [5] for an overview. A recent example is discussed in [4], where ideas
from [8] were used to modify the construction of a so-called factor oracle automaton
to use failure transitions, saving up to 9% of symbol transitions.
Section 2 provides the formal preliminaries relevant to this research. In
Section 3 we introduce the deterministic variants of the DHA that are subsequently
benchmarked. Section 4 outlines the investigation's experimental environment,
the data generated, the methods of assessment and the results. Section 5 draws
conclusions and points to further research work currently underway.


2     Preliminaries

An alphabet is a set of symbols, Σ, of size |Σ|, and Σ∗ denotes the set of all
sequences over this alphabet, including the empty sequence, denoted by ε. A
string (or word), s, is an element of Σ∗ and its length is denoted |s|. Note that
|ε| = 0. The concatenation of strings p and q is represented as pq. If s = pqw
then q is a substring of s, p and pq are prefixes of s, and q and qw are suffixes
of s. Moreover, q is a proper substring iff ¬((p = ε) ∨ (w = ε)). Similarly, pq is a
proper prefix iff w ≠ ε, and qw is a proper suffix iff p ≠ ε.
A deterministic finite automaton (DFA) is a quintuple, D = (Q, Σ, δ, F, qs), where
Q is a finite set of states; Σ is an alphabet; δ : Q × Σ ⇸ Q is the (possibly
partial) symbol transition function mapping state/symbol pairs to states; qs ∈ Q
is the start state; and F ⊆ Q is a set of final states. If δ is a total function, then
the DFA is called complete. In the case of a complete DFA, the extension of δ is
defined as δ∗ : Q × Σ∗ → Q, where δ∗(p, ε) = p and, if δ(p, a) = q and w ∈ Σ∗,
then δ∗(p, aw) = δ∗(q, w). A finite string, w, is said to be accepted by the DFA
iff δ∗(qs, w) ∈ F. The language of a DFA is the set of accepted strings.
A failure DFA (FDFA) is a six-tuple, (Q, Σ, δ, f, F, qs), where D = (Q, Σ, δ, F, qs)
is a (not necessarily complete) DFA and f : Q ⇸ Q is a (possibly partial) failure
transition function. For all a ∈ Σ and p ∈ Q, the functions δ and f are related in
the following way: f(p) = q for some q ∈ Q if δ(p, a) is not defined. The extension
of δ in an FDFA context is similar to its DFA version in that δ∗ : Q × Σ∗ → Q
and δ∗(p, ε) = p. However:

             δ∗(p, aw) = δ∗(q, w)    if δ(p, a) = q,
             δ∗(p, aw) = δ∗(q, aw)   if δ(p, a) is not defined and f(p) = q.

An FDFA is said to accept string w ∈ Σ ∗ iff δ ∗ (qs , w) ∈ F . An FDFA’s language
is its set of accepted strings. It can be shown that every complete DFA has a
language-equivalent FDFA and vice-versa. When constructing an FDFA from a
DFA, care must be taken to avoid so-called divergent cycles of failure transitions
because they lead to an infinite sequence of failure traversals in string processing
algorithms. (Details are provided in [8].)


Subfigure 1a depicts a complete DFA for which Q = {q1, q2, q3} and Σ = {a, b, c}.
Its start state is q1 and q3 is the only final state. Its symbol transitions are
depicted as solid arrows between states. Subfigure 1b shows a language-equivalent
FDFA where dashed arrows indicate the failure transitions. Note, for example,
that ab is in the language of both automata. In the DFA case, δ∗(q1, ab) =
δ∗(q1, b) = δ∗(q3, ε). In the FDFA case, δ∗(q1, ab) = δ∗(q1, b) = δ∗(q2, b) =
δ∗(q3, b) = δ∗(q3, ε).
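
The following minimal sketch (an illustration assuming dictionary encodings of
δ and f, not code from the paper) processes a string exactly as the extended
transition function above prescribes:

def fdfa_accepts(delta, fail, start, finals, word):
    # delta: dict (state, symbol) -> state, possibly partial;
    # fail:  dict state -> state, possibly partial.
    state = start
    for ch in word:
        while (state, ch) not in delta:
            state = fail[state]   # terminates when no divergent failure cycles exist
        state = delta[(state, ch)]
    return state in finals

# Transitions recoverable from Subfigure 1b (others elided):
# delta = {('q1','a'): 'q1', ('q2','c'): 'q2', ('q3','b'): 'q3'}
# fail  = {'q1': 'q2', 'q2': 'q3'}
# fdfa_accepts(delta, fail, 'q1', {'q3'}, 'ab') == True, via the trace above.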



[Fig. 1: Example automata and state/out-transition lattice; diagrams not reproduced.
 (a) D = (Q, Σ, δ, q1, {q3})        (b) F = (Q, Σ, δ, f, q1, {q3})

 (c) State/out-transition context:

          ⟨a, q1⟩   ⟨b, q3⟩   ⟨c, q2⟩   ⟨c, q1⟩
     q1      X         X         X
     q2      X         X         X
     q3      X         X                   X

 (d) State/out-transition lattice (line diagram not reproduced)]



This text relies on standard formal concept analysis terminology and definitions
(see, for example, [6]). A so-called state/out-transition concept lattice can be
derived from any DFA. The objects of its formal context are the DFA states, q ∈ Q.
Each attribute is a pair of the form ⟨a, p⟩ ∈ Σ × Q. A state q is deemed to
have this attribute if δ(q, a) = p, i.e. if q is the source of a transition on a to p.
Subfigure 1c is the state/out-transition context for the DFA in Subfigure 1a and
Subfigure 1d is the line diagram of the associated state/out-transition concept
lattice. The latter subfigure shows two intermediate concepts, each larger than
the bottom concept and smaller than the top, but incomparable with one
another. The right-hand intermediate concept depicts the fact that states
q1 and q2 (its extent) are similar in that each has a transition on symbol a to q1,
on b to q3 and on c to q2; i.e. the concept's intent is {⟨a, q1⟩, ⟨b, q3⟩, ⟨c, q2⟩}.
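
A minimal sketch of this derivation (assuming δ is encoded as a dictionary keyed
by (state, symbol) pairs; this encoding is our choice, not the paper's):

def out_transition_context(states, delta):
    # Objects are the states; attributes are pairs (a, p) occurring as
    # out-transitions; q has attribute (a, p) iff delta(q, a) = p.
    attributes = sorted({(a, p) for (_, a), p in delta.items()})
    incidence = {q: {(a, p) for (r, a), p in delta.items() if r == q}
                 for q in states}
    return attributes, incidence

# For the DFA of Subfigure 1a, incidence['q1'] is
# {('a','q1'), ('b','q3'), ('c','q2')} — the first row of Subfigure 1c.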
Each concept in a state/out-transition lattice can be characterised by a certain
value, called its arc redundancy. For a concept c it is defined as ar(c) = (|int(c)|−
1)×(|ext(c)|−1), where ext(c) and int(c) denote the extent and intent of concept
    An Aho-Corasick Based Assessment of Algorithms Generating Failure DFAs            91


c respectively. The arc redundancy of a concept represents the number of arcs
that may be saved by doing the following:
 1. singling out one of the states in the concept’s extent;
 2. at all the remaining states in the concept’s extent, removing all out-transitions
    mentioned in the concept’s intent;
 3. inserting a failure arc from each of the states in step 2 to the singled out
    state in step 1.
The expression |ext(c)| − 1 represents the number of states in step 2 above.
At each such state, |int(c)| symbol transitions are removed and one failure arc is
inserted, so that |int(c)| − 1 is the net number of transitions saved at each of
the |ext(c)| − 1 states. Hence ar(c) is indeed the total number of arcs saved by the
above transformation.
The positive arc redundancy (PAR) set consists of all concepts whose arc
redundancy is greater than zero.
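
Both notions translate directly to code. A minimal sketch, assuming concepts are
given as (extent, intent) pairs (computing the concepts themselves, e.g. with an
FCA tool, is not shown here):

def ar(concept):
    # arc redundancy: (|int(c)| - 1) * (|ext(c)| - 1)
    extent, intent = concept
    return (len(intent) - 1) * (len(extent) - 1)

def par_set(concepts):
    # all concepts with positive arc redundancy
    return [c for c in concepts if ar(c) > 0]

# The right-hand intermediate concept of Subfigure 1d has extent {q1, q2} and
# an intent of size 3, so ar(c) = (3 - 1) * (2 - 1) = 2 arcs are saved.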


3     The DHA Variants

For the DFA-Homomorphic Algorithm (DHA) to convert a DFA into a language-
equivalent FDFA, a three-stage transformation of the DFA is undertaken.
Initially, the DFA is represented as a state/out-transition context. From the derived
concept lattice, the PAR set is extracted to serve as input for the DHA.
The basic DHA proposed in [8] is outlined in Algorithm 1. The variable O is used
to keep track of states that are not the source of any failure transitions. This
is to ensure that a state is never the source of more than one failure transition.
Initially all states qualify. A concept c is selected and removed from the PAR set,
so that c is no longer available in subsequent iterations. The initial version of
DHA proposed specifically selecting a concept, c, with maximum arc redundancy.
The specification given here leaves open how the choice is made.
From c's extent, one of the states, t, is chosen to be a failure transition target
state. DHA gives no specific criterion for which state in ext(c) to choose. The
remaining set of states in ext(c) is denoted by ext′(c). Then, for each state s in
ext′(c) that qualifies to be the source of a failure transition (i.e. that is also in O),
all transitions in int(c) are removed from s and a failure transition is installed
from s to t. Because state s has become a failure transition source state whose
target state is t, it may no longer be the source of any other failure transition,
and so is removed from O. These steps are repeated until it is no longer possible
to install any more failure transitions. It should be noted that in this particular
formulation of the abstract algorithm the PAR set is not recomputed to reflect
changes in arc redundancy as the DFA is progressively transformed into an
FDFA. This does not affect the correctness of the algorithm, but may affect its
optimality. Investigating such effects is not within the scope of this study. The
third and fifth lines of Algorithm 1, namely
      c := selectAConcept(PAR) and
      t := getAnyState(ext(c)),
are left non-specific in the original formulation of DHA.

Three variants of the algorithm are proposed with respect to the third line. For
convenience we represent each variant by a disjunct on the right-hand side of an
assignment where c is the assignment target. This is, of course, a slight abuse of
notation, since c is not the outcome of a logical operation but the selection of a
concept from the PAR set according to a criterion represented by h_mar(PAR),
h_me(PAR) or h_mi(PAR). Each selection option embodies a different greedy
heuristic for choosing concept c from the PAR set. By greedy we mean that an
element from the set is selected, based on some maximal or minimal feature,
without regard to possible opportunities lost in forthcoming iterations by making
these selections. In addition to these heuristics, a single heuristic is proposed for
the fifth line, relating to choosing the target state, t, of the failure transitions.
These choices are illustrated as colour-coded assignment statements in the
skeleton Algorithm 2 and are now briefly explained. The rationale for these
heuristics is discussed below.
Algorithm 1
O := Q; PAR := {c | ar(c) > 0};
do ((O ≠ ∅) ∧ (PAR ≠ ∅)) →
   c := selectAConcept(PAR);
   PAR := PAR \ {c};
   t := getAnyState(ext(c));
   ext′(c) := ext(c) \ {t};
   for each (s ∈ ext′(c) ∩ O) →
      if a failure cycle is not created →
         for each ((a, r) ∈ int(c)) →
            δ := δ \ {⟨s, a, r⟩}
         rof;
         f(s) := t;
         O := O \ {s};
      fi
   rof
od

Algorithm 2
O := Q; PAR := {c | ar(c) > 0};
do ((O ≠ ∅) ∧ (PAR ≠ ∅)) →
   c := h_mar(PAR) ∨ h_me(PAR) ∨ h_mi(PAR);
   PAR := PAR \ {c};
   t := ClosestToRoot(c);
   ext′(c) := ext(c) \ {t};
   for each (s ∈ ext′(c) ∩ O) →
      if a failure cycle is not created →
         for each ((a, r) ∈ int(c)) →
            δ := δ \ {⟨s, a, r⟩}
         rof;
         f(s) := t;
         O := O \ {s};
      fi
   rof
od




The heuristics for choosing concept c from the PAR set in each iteration are as
follows. The h_mar heuristic: c is a PAR concept with a maximum arc
redundancy. The h_mi heuristic: c is a PAR concept with a maximum intent size. The
h_me heuristic: c is a PAR concept with a minimum extent size. Once one of
these heuristics has been applied, the so-called ClosestToRoot heuristic is used
to select a state t in ext(c) to become the target state of failure transitions from
each of the remaining states in ext(c). This heuristic means that t is selected as
the state in ext(c) that is closest⁶ to aco's start state. Transition modifications
are subsequently made on the FDFA produced so far, provided that a divergent
failure cycle is not produced. These selection rules are sketched in code below.

6
     Since a trie has no cycles, the notion of “closest” here simply means a state with the
     shortest path from the start state to that state.
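
The three selection rules and ClosestToRoot reduce to extremum choices. A
minimal sketch, again over assumed (extent, intent) pairs, with an assumed depth
map giving each state's distance from the start state:

def h_mar(par):
    # maximum arc redundancy, ar(c) = (|int(c)| - 1) * (|ext(c)| - 1)
    return max(par, key=lambda c: (len(c[1]) - 1) * (len(c[0]) - 1))

def h_mi(par):
    # maximum intent size
    return max(par, key=lambda c: len(c[1]))

def h_me(par):
    # minimum extent size
    return min(par, key=lambda c: len(c[0]))

def closest_to_root(concept, depth):
    # target state: the extent state with the shortest path from the start state
    extent, _ = concept
    return min(extent, key=lambda q: depth[q])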


4     The Experiment

The experiments were conducted on an Intel i5 dual-core CPU machine, running
Ubuntu Linux 14.04. Code was written in C++ and compiled with the GCC
version 4.8.2 compiler.
It can easily be demonstrated that if there are no overlaps between proper
prefixes and proper suffixes of keywords in a keyword set, then the associated acf
FDFA's failure transitions will all loop back to its start state, and our
ClosestToRoot heuristic will behave similarly. To avoid keyword sets that lead to
such trivial acf FDFAs, the following keyword set construction algorithm was
devised.
Keywords (also referred to as patterns) are drawn from an alphabet of size 10.
Their lengths range from 5 to 60 characters. Keyword sets of sizes 5, 10, 15, . . . , 100
are generated. For each of these 20 different set sizes, twelve alternative keyword
sets are generated. Thus in total 12 × 20 = 240 keyword sets are available.
To construct a keyword set of size N, N random strings are initially generated⁷.
Each such string has random content taken from the alphabet and random length
between 5 and 30. However, for reasons given below, only a limited number of
these N strings, say M, are directly inserted into the keyword set. The set is then
incrementally grown to the desired size, N, by repeating the following:
      Select a prefix of random length, say p, from a randomly selected string
      in the current keyword set. Remove a string, say w, from the set of strings
      not yet in the keyword set. Insert either pw or wp into the keyword set.
Steps are taken to ensure that there is a reasonable representation of each of
these three differently constructed kinds of keywords in a given keyword set.
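
A minimal sketch of this construction (our illustration: the number M of directly
inserted strings and the equal pw/wp choice probability are assumptions, since
the text leaves them open; assumes 1 ≤ M ≤ N):

import random

def make_keyword_set(n, m, alphabet="abcdefghij", rng=random):
    # N random strings over an alphabet of size 10, lengths between 5 and 30.
    pool = ["".join(rng.choice(alphabet) for _ in range(rng.randint(5, 30)))
            for _ in range(n)]
    keywords = [pool.pop() for _ in range(m)]     # M strings inserted directly
    while pool:                                   # grow the set to size N
        base = rng.choice(keywords)               # randomly selected keyword
        p = base[:rng.randint(1, len(base))]      # prefix of random length
        w = pool.pop()                            # a string not yet in the set
        keywords.append(p + w if rng.random() < 0.5 else w + p)
    return keywords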
These keyword sets served as input to the SPARE-PARTS toolkit [12] to
create the associated acf FDFAs and aco DFAs. A routine was written to
extract state/out-transition contexts from the aco DFAs. These contexts were used
by the lattice construction software package FCART (version 0.9) [3],
supplied to us by the National Research University Higher School of Economics
(Moscow, Russia). The DHA variants under test used the resulting concept lattices
to generate their FDFAs. As previously mentioned, a Kumar et al. [10] algorithm
was also implemented to generate FDFAs from the DFAs. These will be
referenced as kum FDFAs.
Figures 2 and 3 give several views of the extent to which the DHA-based and
kum FDFAs correspond with the acf FDFAs. The experimental data is available
online⁸. For notational convenience, f^i_{jk} denotes the set of failure transitions of
the FDFA corresponding to the k-th keyword set of size 5j that was generated by
the algorithm variant i ∈ FA \ {aco}, where k ∈ [1, 12], j ∈ [1, 20] and FA =
{acf, aco, mar, mi, me, kum}. Similarly, δ^i_{jk} refers to the symbol transition sets of the
associated FDFAs and, in this case, also the aco DFAs if i = aco. A dot in the
subscript is used for averages. Thus, for some i ∈ FA \ {aco} we use |f^i_{j.}| to
denote the average number of failure transitions in the i-type FDFAs produced
from the 12 keyword sets of size 5j, and similarly |δ^i_{j.}| represents the average
number of symbol transitions.

⁷ Note that all random selections mentioned use a pseudo-random number generator.
⁸ The experimental data files can be found at
  http://www.fastar.org/wiki/index.php?title=Conference_Papers#2015

[Fig. 2: Arcs Savings (%) against Pattern Set Size for the variants mar, mi, me,
acf and kum, where the savings of variant i at set size 5j are computed as
(|δ^aco_{j.}| − (|δ^i_{j.}| + |f^i_{j.}|)) / |δ^aco_{j.}| × 100; plot not reproduced.]



Figure 2 shows how many more transitions aco automata require (as a percentage
of the aco transition count) compared to each of the FDFA variants. Note that the
data has been averaged over the 12 keyword set samples for each of the set sizes,
and that the FDFA transitions include both symbol and failure transitions. The
minimal acf FDFAs attain an average saving of about 80% over all sample sizes,
and the mi, me and kum FDFAs track this performance almost identically. Although
not clearly visible in the graph, the me heuristic shows a slight degradation for larger
set sizes, while the kum FDFAs consistently perform about 1% to 2% worse. By
way of contrast, the mar heuristic barely achieves a 50% saving for small sample
sizes, and drops below a 20% saving for a sample size of about 75, after which
there is some evidence that it might improve slightly.
The fact that the percentage transition savings of the various FDFA variants
closely correspond to those of acf does not mean that the positioning of the failure
and symbol transitions shows a one-to-one match. The extent to which the
transitions precisely match one another is shown in Figure 3. These box-and-

[Fig. 3: For each variant (mi, me, mar, kum), box-and-whisker plots of the number
of inequivalent symbol arcs and of the percentage of matching failure arcs against
pattern set size; plots not reproduced.]
                                             (a) |δjk  | − |δjk     i
                                                                 ∩ δjk |                                                                                                        |fijk ∩ facf
                                                                                                                                                                                         jk |
                                                                                                                                                                  (b)                                                           × 100
                                                                                                                                                                                        |facf
                                                                                                                                                                                          jk |


                                                                                                        Fig. 3: Transition Matches



whisker plots show explicitly the median, 25th and 75th percentiles as well as
outliers of each of the 12 sample keyword sets of a given size. Subfigure 3a shows
the number of symbol transitions in acf FDFAs that do not correspond with
those in mi, me, mar and kum respectively. Subfigure 3b shows the percentage
of acf failure transitions matching those of the FDFAs generated by mi, me, mar
and kum respectively.

The symbol transitions for the mi heuristic are practically identical to those of
acf, differing by at most two. Differences are not significantly related to sample
size. Differences for me are somewhat larger, increasing slightly with larger sam-
ple size, though still relatively modest in relation to the overall number of FDFA
transitions. (There are |Q| − 1 transitions in the underlying trie.) In the cases of
mar and kum, the differences are approximately linearly dependent on the size
of the keyword set, reaching over 9000 and 250 respectively for keyword sets of
size 100.

The failure transition differences with regard to mi and me show a very similar
pattern as keyword set size increases. Only in isolated instances do they fully match
those of acf, and the matching correspondence drops from a median of more
than 95% in the case of the smallest keyword sets to a median of about 50%
for the largest keyword sets. The median kum failure transition correspondence
with acf is in a range of about 12-18% for all pattern set sizes. However, in the
case of mar, the degree of correspondence is much worse: at best the median
value is just over 60% for small keyword sets, dropping close to zero for medium
range keyword set sizes, and then increasing slightly to about 10% for the largest
keyword sets.
Overall, Figures 2 and 3 reveal that there is a variety of ways in which failure
transitions may be positioned in an FDFA that lead to very good (in many
cases even optimal) transition savings. It is interesting to note that even in the
kum FDFAs, the total number of transition savings is very close to optimal,
despite relatively large differences in the positioning of the transitions. However,
the figures also show that this flexibility in positioning failure transitions to
achieve good arc savings eventually breaks down, as in the case of the mar
FDFAs.
One of the reasons for the differences between acf FDFAs and the others is that some
implementations of the acf algorithm, including the SPARE-PARTS implemen-
tation, insert a failure arc at every state (except the start state), even if there
is an out-transition on every alphabet symbol from a state. Such a failure arc is
of course redundant. Inspection of the data showed that some of the randomly
generated keyword sets lead to such “useless” failure transitions, but they are
so rare that they do not materially affect the overall observations.
The overall ranking of the output FDFAs of the various algorithms relative to acf
could broadly be stated as mi > me > kum > mar. This ranking is with respect to
closeness of transition placement to acf. Since the original focus of this study
was to explore heuristics for the DHA algorithm, further comments about the
kum algorithm are reserved for the next section.
The rationale for the mar heuristic is clear: it will cause the maximum savings
in transitions in a given iteration. It was in fact the initial criterion proposed
in [8]. It is therefore somewhat surprising that it did not perform very well in
comparison to other heuristics. It would seem that, in the present context, it is
too greedy: by selecting in one iteration a concept whose extent contains the set
of states that can effect maximal savings, it deleteriously eliminates from
consideration, in subsequent iterations, concepts whose extents contain some of
those states. Note that, being based on the maximum of the product
of extent and intent sizes, it will tend to select concepts in the middle of the
concept lattice.
When early trials on our data revealed mar's relatively poor performance, the
mi and me heuristics were introduced to prioritise concepts in the top or bottom
regions of the lattice. These latter two heuristics will maximize the number
of symbol transitions to be removed per state when replacing them with failure
transitions, insofar as concepts with large intents tend to have small extents and
vice-versa. Although such a relationship is, of course, data-dependent, random
data tends in that direction, as was confirmed by inspection of our data.
These two heuristics appear to be rather successful at attaining acf-like FDFAs.
However, the ClosestToRoot heuristic has also played a part in this success. Note
that the acf failure transitions are designed to record that a suffix of a state’s
string is also a prefix of some other state’s string. Thus, f (q) = p means that a
suffix of state q’s string is also a prefix of state p’s string. However, since there
may be several suffixes of q’s string and several states whose prefixes meet this
criterion, the definition of f requires that the longest possible suffix of q’s string
should be used. This ensures that there is only one possible state, p, in the trie
whose prefix corresponds to that suffix. Thus, on the one hand, acf directs a
failure transition “backwards” towards a state whose depth is less than that of
the current state. On the other hand, acf selects a failure transition’s target state
to be as far as possible from the start state, because the suffix (and therefore
also the prefix) used must be maximal in length.
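To make the definition concrete, the following is a small Python sketch (with illustrative names; this is not the SPARE-PARTS implementation) of how acf computes failure links over a keyword trie by breadth-first traversal:

from collections import deque

def build_trie(keywords):
    # State 0 is the start state; goto[q] maps a symbol to a successor state.
    goto = [{}]
    for word in keywords:
        q = 0
        for a in word:
            if a not in goto[q]:
                goto[q][a] = len(goto)
                goto.append({})
            q = goto[q][a]
    return goto

def acf_failure_links(goto):
    # f(q) is the trie state whose string is the longest proper suffix of
    # q's string that is also a prefix of some keyword; breadth-first order
    # guarantees f is already known for all shallower states.
    f = {}
    queue = deque()
    for q in goto[0].values():
        f[q] = 0                      # depth-1 states fail to the start state
        queue.append(q)
    while queue:
        r = queue.popleft()
        for a, s in goto[r].items():
            queue.append(s)
            t = f[r]
            while t != 0 and a not in goto[t]:
                t = f[t]              # shorten the suffix until it extends by a
            f[s] = goto[t].get(a, 0)
    return f

print(acf_failure_links(build_trie(["he", "she", "his", "hers"])))

Every f(q) points to a strictly shallower state, yet to the deepest such state available, which is exactly the tension with ClosestToRoot discussed below.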
The ClosestToRoot heuristic approximates the acf action in that it also directs
failure transitions backwards towards the start state. However, by selecting a
failure transition’s target state to be as close as possible from the start state,
it seems to contradict acf actions. It is interesting to note in Subfigure 3b that
both mi and me show a rapid and more or less linear decline in failure transition
matchings with respect to acf when pattern set size reaches about 65. We
conjecture that for smaller keyword set sizes, the ClosestToRoot heuristic does
not conflict significantly with acf's actions because there are few failure target
states from which to choose. When keyword set sizes become greater, there are
likely to be more failure target states from which to choose, and consequently
less correspondence between the failure transitions chosen according to differing
criteria. This is but one of several matters that have been left for further study.


5     Conclusions and Future Agenda

Our ultimate purpose is to investigate heuristics for building FDFAs from gen-
eralised complete DFAs—a domain where optimal behaviour is known a priori
to be computationally hard. The comparison against acf FDFAs outlined above
is a firm but limited starting point. The next step is to construct complete DFAs
from randomly generated FDFAs and examine the extent to which the heuristics
tested out in this study can reconstruct the latter from the former. Because gen-
eralised DFAs can have cycles, the ClosestToRoot heuristic will be generalised
by using Dijktra’s algorithm for calculating the shortest distance from the start
state to each DFA state. It remains to be seen whether mar will perform any
better in the generalised context.
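A sketch of that generalisation, assuming the DFA is given simply as a successor map (all symbol transitions have unit cost, so Dijkstra's algorithm degenerates to a breadth-first search here, but the priority-queue form carries over to weighted variants):

import heapq

def distances_from_start(successors, start):
    # Shortest number of symbol transitions from the start state to every
    # reachable DFA state; works on cyclic automata.
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, q = heapq.heappop(heap)
        if d > dist.get(q, float("inf")):
            continue                  # stale queue entry
        for s in successors[q]:
            if d + 1 < dist.get(s, float("inf")):
                dist[s] = d + 1
                heapq.heappush(heap, (d + 1, s))
    return dist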
The relatively small alphabet size of 10 was dictated by unavoidable growth in
the size of the associated concept lattices. Even though suitable strategies for
trimming the lattice (for example by not generating concepts with arc redun-
dancy less than 2) are being investigated, it is recognised that use of DHA will
always be constrained by the potential for the associated lattice to grow exponen-
tially. Nevertheless, from a theoretical perspective a lattice-based DHA approach
to FDFA generation is attractive because it encapsulates the solution space in
which a minimal FDFA might be found—i.e. each ordering of its concepts maps
to a possible language-equivalent FDFA that can be derived from the DFA, and at
least one such ordering will yield a minimal FDFA.
The kum FDFA generation approach is not as constrained by space limitations as
the DHA approach and in the present experiments it has performed reasonably
well. In the original publication, a somewhat more refined version is reported
that attempts to avoid unnecessary chains of failure transitions. Future research
should examine the minimising potential of this refined version using generalised
DFAs as input and should explore more fully the relationship between these kum-
based algorithms and the DHA algorithms.


References
 1. A. V. Aho and M. J. Corasick. Efficient string matching: An aid to bibliographic
    search. Commun. ACM, 18(6):333–340, 1975.
 2. H. Björklund, J. Björklund, and N. Zechner. Compression of finite-state automata
    through failure transitions. Theor. Comput. Sci., 557:87–100, 2014.
 3. A. Buzmakov and A. Neznanov. Practical computing with pattern structures in
    FCART environment. In Proceedings of the International Workshop ”What can
    FCA do for Artificial Intelligence?” (FCA4AI at IJCAI 2013), Beijing, China,
    August 5, 2013., pages 49–56, 2013.
 4. L. Cleophas, D. G. Kourie, and B. W. Watson. Weak factor automata: Comparing
    (failure) oracles and storacles. In J. Holub and J. Žďárek, editors, Proceedings of the
    Prague Stringology Conference 2013, pages 176–190, Czech Technical University in
    Prague, Czech Republic, 2013.
 5. M. Crochemore and C. Hancart. Automata for matching patterns. In G. Rozenberg
    and A. Salomaa, editors, Handbook of Formal Languages, volume 2: Linear Modeling:
    Background and Application, pages 399–462. Springer-Verlag, 1997.
 6. B. Ganter, G. Stumme, and R. Wille, editors. Formal Concept Analysis, Foun-
    dations and Applications, volume 3626 of Lecture Notes in Computer Science.
    Springer, 2005.
 7. D. E. Knuth, J. H. Morris Jr., and V. R. Pratt. Fast pattern matching in strings.
    SIAM J. Comput., 6(2):323–350, 1977.
 8. D. G. Kourie, B. W. Watson, L. Cleophas, and F. Venter. Failure deterministic
    finite automata. In J. Holub and J. Žďárek, editors, Proceedings of the Prague
    Stringology Conference 2012, pages 28–41, Czech Technical University in Prague,
    Czech Republic, 2012.
 9. J. B. Kruskal. On the shortest spanning subtree of a graph and the traveling
    salesman problem. Proceedings of the American Mathematical Society, 7(1):48–50,
    1956.
10. S. Kumar, S. Dharmapurikar, F. Yu, P. Crowley, and J. S. Turner. Algorithms
    to accelerate multiple regular expressions matching for deep packet inspection.
    In Proceedings of the ACM SIGCOMM 2006 Conference on Applications, Tech-
    nologies, Architectures, and Protocols for Computer Communications, Pisa, Italy,
    September 11-15, 2006, pages 339–350, 2006.
11. B. W. Watson. Taxonomies and Toolkits of Regular Language Algorithms. PhD
    thesis, Eindhoven University of Technology, Sept. 1995.
12. B. W. Watson and L. G. Cleophas. SPARE Parts: a C++ toolkit for string pattern
    recognition. Softw., Pract. Exper., 34(7):697–710, 2004.
  Context-Aware Recommender System Based on
          Boolean Matrix Factorisation

                    Marat Akhmatnurov and Dmitry I. Ignatov

          National Research University Higher School of Economics, Moscow
                                 dignatov@hse.ru



        Abstract. In this work we propose and study an approach for collabora-
        tive filtering, which is based on Boolean matrix factorisation and exploits
        additional (context) information about users and items. To avoid simi-
        larity loss in the case of Boolean representation we use an adjusted type
        of projection of a target user onto the obtained factor space. We have
        compared the proposed method with an SVD-based approach on the
        MovieLens dataset. The experiments demonstrate that the proposed
        method has better MAE and Precision and comparable Recall and
        F-measure. We also report an increase in quality in the presence of
        context information.

        Keywords: Boolean Matrix Factorisation, Formal Concept Analysis,
        Recommender Algorithms, Context-Aware Recommendations


 1    Introduction

 Recommender Systems have recently become one of the most popular applica-
 tions of Machine Learning and Data Mining. Their primary aim is to help users
 find suitable items such as movies, books or goods within an underlying informa-
 tion system. Collaborative filtering recommender algorithms based on matrix
 factorisation (MF) techniques are now considered industry standard [1]. The
 main assumption here is that similar users prefer similar items and MF helps to
 find (latent) similarity in the reduced space efficiently.
     Among the most often used types of MF we should definitely mention Sin-
 gular Value Decomposition (SVD) [2] and its various modifications like Proba-
 bilistic Latent Semantic Analysis (PLSA) [3]. However, several existing factori-
 sation techniques, for example, non-negative matrix factorisation (NMF) [4] and
 Boolean matrix factorisation (BMF) [5], seem to be less studied in the context of
 Recommender Systems. Another approach similar to MF is biclustering, which
 has also been successfully applied in recommender system domain [6,7]. For ex-
 ample, Formal Concept Analysis (FCA) [8] can be also used as a biclustering
 technique and there are several examples of its applications in the recommender
 systems domain [9,10]. A parameter-free approach that exploits a neighbour-
 hood of the object concept for a particular user also proved its effectiveness [11];
 it has a predecessor based on object-attribute biclusters [7] that also capture
 the neighbourhood of every user and item pair in an input formal context. Our
previous approach based on FCA exploits Boolean factorisation based on formal
concepts and follows user-based k-nearest neighbours strategy [12].
    The aim of this study is to continue comparing the recommendation qual-
ity of the aforementioned techniques on a real dataset and to investigate the
methods' interrelationship and applicability. In particular, in our previous study
it was especially interesting to conduct experiments and compare recommenda-
tion quality in the case of a numeric input matrix and its scaled Boolean counterpart,
in terms of Mean Absolute Error (MAE) as well as Precision and Recall. Our
previous results showed that the BMF-based approach is of comparable quality
with the SVD-based one [12]. Thus, one of the next steps is the usage of
auxiliary information containing users' and items' features, i.e. so-called context
information (for BMF vs SVD see Section 4).
    Another novelty of the paper lies in the fact that we have adjusted the
original Boolean projection of users onto the factor space by support-based weights,
which results in a substantial quality increase. We also investigate, in the
recommender setting, the approximate greedy algorithm proposed in [5], which
tends to generate factors with a large number of users, and a more balanced (in
terms of the ratio between the numbers of users and items per factor) modification
of the Close-by-One algorithm [13].
    The practical significance of the paper is determined by the demands of the
recommender systems industry, which is focused on gaining reliable quality in
terms of average MAE.
    The rest of the paper consists of five sections. Section 2 is an introductory
review of the existing MF-based approaches to collaborative filtering. In Section
3 we describe our recommender algorithm which is based on Boolean matrix
factorisation using closed sets of users and items (that is FCA). Section 4 contains
results of experimental comparison of two MF-based recommender algorithms
by means of cross-validation in terms of MAE, Precision, Recall and F -measure.
The last section concludes the paper.


2     Introductory review

In this section we briefly describe two approaches to the decomposition of real-
valued and Boolean matrices. In addition we provide the reader with the general
scheme of user-based recommendation that relies on MF and a simple way of
direct incorporation of context information into MF-based algorithms.


2.1   Singular Value Decomposition

Singular Value Decomposition (SVD) is a decomposition of a rectangular matrix
A ∈ R^{m×n} (m > n) into a product of three matrices

    A = U \begin{pmatrix} \Sigma \\ 0 \end{pmatrix} V^T,            (1)
where U ∈ R^{m×m} and V ∈ R^{n×n} are orthogonal matrices, and Σ ∈ R^{n×n} is a
diagonal matrix such that Σ = diag(σ_1, . . . , σ_n) and σ_1 ≥ σ_2 ≥ . . . ≥ σ_n ≥ 0.
The columns of the matrix U and V are called singular vectors, and the numbers
σi are singular values.
    In the context of recommender systems the rows of U and V can be interpreted
as vectors of users' and items' attitudes to certain topics (factors), and the
corresponding singular values as the importance of each topic among the others.
The main disadvantages are the dense output decomposition matrices and the
negative factor values, which are difficult to interpret.
    The advantage of SVD for recommender systems is that this method makes
it possible to obtain the vector of a new user's attitude to the topics without
recomputing the SVD of the whole matrix.
    The computational complexity of SVD according to [2] is O(mn^2) floating-
point operations if m ≥ n, or more precisely 2mn^2 + 2n^3.
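As a small NumPy sketch of both observations (toy data, illustrative names), a new user's topic vector can be obtained by the standard fold-in u_new = r_new · V_k · diag(s_k)^{-1} without refactorising the whole matrix:

import numpy as np

A = np.array([[5., 0., 4., 4.],      # toy utility matrix, users x items (m > n)
              [0., 5., 1., 0.],
              [4., 1., 5., 4.],
              [0., 4., 0., 1.],
              [5., 0., 4., 5.]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U diag(s) V^T

k = 2                                 # keep the two most important topics
Vk, sk = Vt[:k, :].T, s[:k]

r_new = np.array([4., 0., 5., 4.])    # ratings of a previously unseen user
u_new = r_new @ Vk @ np.diag(1.0 / sk)
print(u_new)                          # the new user's attitude to each topic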

2.2     Boolean Matrix Factorisation based on FCA
Description of FCA-based BMF. Boolean matrix factorisation (BMF) is a de-
composition of the original matrix I ∈ {0, 1}^{n×m}, where I_{ij} ∈ {0, 1}, into a
Boolean matrix product P ◦ Q of binary matrices P ∈ {0, 1}^{n×k} and Q ∈
{0, 1}^{k×m} for the smallest possible k. We define the Boolean matrix product
as follows:

    (P \circ Q)_{ij} = \bigvee_{l=1}^{k} P_{il} \cdot Q_{lj},            (2)

where ∨ denotes disjunction and · conjunction.
    Matrix I can be considered as a binary relation between a set X of
objects (users) and a set Y of attributes (items that users have evaluated). We
assume that xIy iff user x evaluated item y. The triple (X, Y, I) clearly
forms a formal context1.
   Consider a set F ⊆ B(X, Y, I), a subset of all formal concepts of context
(X, Y, I), and introduce matrices PF and QF :
                                                   
                              1, i ∈ Al ,             1, j ∈ Bl ,
                  (PF )il =               (QF )lj =               ,
                              0, i ∈
                                   / Al ,             0, j ∈
                                                           / Bl .
where (Al , Bl ) is a formal concept from F .
    We can consider the decomposition of the matrix I into the Boolean matrix
product of PF and QF as described above. The theorems on universality and optimality of
formal concepts are proved in [5].
    There are several algorithms for finding PF and QF by calculating formal
concepts based on these theorems [5]. The approximate algorithm we use for
comparison (Algorithm 2 from [5]) avoids computation of all possible formal
concepts and therefore works much faster [5]. Its worst-case time complexity
is O(k|G||M|^3), where k is the number of found factors, |G| is the number of
objects, and |M| is the number of attributes.
1
    We have to omit basic FCA definitions; for more details see [8].
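For illustration, the following Python sketch follows the spirit of that approximate algorithm (it is a simplification, not the exact pseudocode of [5]): each factor is a formal concept obtained by greedily growing an attribute set so that the induced concept covers as many still-uncovered 1s of I as possible:

import numpy as np

def greedy_bmf(I):
    I = I.astype(bool)
    uncovered = I.copy()
    factors = []                            # list of (extent, intent) masks
    while uncovered.any():
        D = np.zeros(I.shape[1], dtype=bool)
        best, improved = None, True
        while improved:
            improved = False
            for j in np.flatnonzero(~D):
                Dj = D.copy(); Dj[j] = True
                ext = I[:, Dj].all(axis=1)              # objects having D u {j}
                itn = I[ext].all(axis=0) if ext.any() else np.ones_like(D)
                gain = (uncovered & np.outer(ext, itn)).sum()
                if best is None or gain > best[0]:
                    best, improved = (gain, ext, itn), True
            if improved:
                D = best[2].copy()          # close D up to the concept intent
        gain, ext, itn = best
        if gain == 0:
            break                           # nothing coverable remains
        factors.append((ext, itn))
        uncovered &= ~np.outer(ext, itn)
    return factors

# P and Q are then the column-stack of extents and the row-stack of intents.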


2.3     Contextual information

Contextual Information is a multi-faceted notion that is present in several dis-
ciplines. In the recommender systems domain, the context is any auxiliary in-
formation concerning users (like gender, age, occupation, living place) and/or
items (like genre of a movie, book or music), which shows not only a user’s mark
given to an item but explicitly or implicitly describes the circumstances of such
evaluation (e.g., including time and place) [15].
    From the representational viewpoint context2 can be described by a binary
relation, which shows that a user or an item possesses a certain attribute-value
pair. In case the contextual information is described by finite-valued attributes, it
can be represented by finite number of binary relations; otherwise, when we have
countable or continuous values, their domains can be split into (semi)intervals
(cf. scaling in FCA). As a result one may obtain a block matrix:
                                                   
    I = \begin{pmatrix} R & C_{user} \\ C_{item} & O \end{pmatrix},

where R is the utility matrix of users' ratings of items, Cuser represents the context
information of users, Citem contains the context information of items, and O is a
zero-filled matrix.
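A small NumPy sketch of assembling such a block context; the shapes are chosen to match Table 1 below (7 users, 6 movies, 5 user attributes, 3 genres), but the data here is random and purely illustrative:

import numpy as np

R     = np.random.randint(0, 2, size=(7, 6))   # binarised utility matrix
Cuser = np.random.randint(0, 2, size=(7, 5))   # gender + scaled age intervals
Citem = np.random.randint(0, 2, size=(3, 6))   # genres per movie

O = np.zeros((Citem.shape[0], Cuser.shape[1]), dtype=int)
I = np.block([[R,     Cuser],
              [Citem, O    ]])
print(I.shape)                                 # (10, 11), as in Table 1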


                    Table 1. Adding auxiliary (context) information

             | Brave | Termi- | Gladi- | Millionaire | Hot  | God-   |  M  |  F  | 0-20 | 21-45 | 46+
             | Heart | nator  | ator   | from ghetto | Snow | father |     |     |      |       |
    Anna     |   5   |        |   5    |      5      |      |   2    |     |  +  |  +   |       |
    Vladimir |       |   5    |   5    |      3      |      |   5    |  +  |     |      |   +   |
    Katja    |   4   |        |   4    |      5      |      |   4    |     |  +  |      |   +   |
    Mikhail  |   3   |   5    |   5    |             |      |   5    |  +  |     |      |   +   |
    Nikolay  |       |        |   2    |             |  5   |   4    |  +  |     |      |       |  +
    Olga     |   5   |   3    |   4    |      5      |      |        |     |  +  |  +   |       |
    Petr     |   5   |        |        |      4      |  5   |   4    |  +  |     |      |       |  +
    Drama    |   +   |        |   +    |      +      |  +   |   +    |     |     |      |       |
    Action   |       |   +    |   +    |             |  +   |   +    |     |     |      |       |
    Comedy   |   +   |        |        |      +      |      |        |     |     |      |       |




    In case of a more complex rating scale, the ratings can be reduced to a binary
scale (e.g., “like/dislike”) by binary thresholding or by FCA-based scaling.
2
    In order to avoid confusion, please note that formal context is a different notion.


2.4   General scheme of user-based recommendations

Once a matrix of ratings is factorised we need to learn how to compute recom-
mendations for users and to evaluate whether a particular method handles this
task well.
    For the factorised matrices the already well-known algorithm based on the simi-
larity of users can be applied, where for finding the k nearest neighbours we use not
the original matrix of ratings R ∈ R^{m×n} but the matrix I ∈ R^{m×f}, where m is
the number of users and f is the number of factors. After the selection of the k users
most similar to a given user, based on the factors that are peculiar to them, it is
possible to calculate the prospective ratings for that user by means of collaborative
filtering formulas.
    After generation of recommendations the performance of the recommender
system can be estimated by measures such as MAE, Precision and Recall.
    Collaborative recommender systems try to predict the utility (in our case
ratings) of items for a particular user based on the items previously rated by
other users.
    Memory-based algorithms make rating predictions based on the entire col-
lection of items previously rated by the users. That is, the value of the unknown
rating ru,m for a user u and item m is usually computed as an aggregate of the
ratings of some other (usually, the k most similar) users for the same item m:

                                ru,m = aggrũ∈Ũ rũ,m ,

where Ũ denotes a set of k users that are the most similar to user u, who have
rated item m. For example, the function aggr may be the weighted average of
ratings [15]:

    r_{u,m} = \sum_{\tilde{u} \in \tilde{U}} sim(\tilde{u}, u) \cdot r_{\tilde{u},m} \Big/ \sum_{\tilde{u} \in \tilde{U}} sim(u, \tilde{u}).            (3)

    The similarity measure between users u and ũ, sim(ũ, u), is essentially an
inverse distance measure and is used as a weight, i.e., the more similar users u
and ũ are, the more weight rating rũ,m will carry in the prediction of ru,m.
    The similarity between two users is based on their ratings of items that both
users have rated. There are several popular approaches: Pearson correlation,
cosine-based, and Hamming-based similarities.
    We further compare the cosine-based and normalised Hamming-based simi-
larities:

    sim_{cos}(u, v) = \sum_{m \in \tilde{M}} r_{um} \cdot r_{vm} \Big/ \Big( \sum_{m \in \tilde{M}} r_{um}^2 \sum_{m \in \tilde{M}} r_{vm}^2 \Big)^{1/2},            (4)

    sim_{Ham}(u, v) = 1 - \sum_{m \in \tilde{M}} |r_{um} - r_{vm}| \Big/ |\tilde{M}|,            (5)
where M̃ is either the set of co-rated items (movies) for users u and v or the
whole set of items.
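A Python sketch of formulas (3)-(5) (illustrative only, not the evaluated C++ implementation; it assumes 0 encodes “not rated”, takes M̃ to be the co-rated items for the cosine measure and the whole item set for the Hamming measure, the latter applied to 0/1 profiles):

import numpy as np

def sim_cos(ru, rv):
    # Cosine similarity (4) over the co-rated items of u and v.
    mask = (ru > 0) & (rv > 0)
    if not mask.any():
        return 0.0
    u, v = ru[mask], rv[mask]
    return float(u @ v) / np.sqrt(float(u @ u) * float(v @ v))

def sim_ham(bu, bv):
    # Normalised Hamming similarity (5) on 0/1 profiles over all items,
    # e.g. rows of the factor-space matrices P or the weighted projection.
    bu, bv = np.asarray(bu, float), np.asarray(bv, float)
    return 1.0 - float(np.abs(bu - bv).sum()) / bu.size

def predict(R, u, m, k=20, sim=sim_cos):
    # Weighted-average prediction (3) from the k most similar users rating m.
    raters = [v for v in range(R.shape[0]) if v != u and R[v, m] > 0]
    nbrs = sorted(raters, key=lambda v: sim(R[u], R[v]), reverse=True)[:k]
    w = np.array([sim(R[u], R[v]) for v in nbrs])
    return float(w @ R[nbrs, m]) / float(w.sum()) if w.sum() > 0 else 0.0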
    To apply this approach in the case of the FCA-based BMF recommender
algorithm we simply take as input the user-factor matrices obtained after
factorisation of the initial data.
    For the input matrix in Table 1 the corresponding decomposition is below:
                                    
    I = P \circ Q, where

    P = \begin{pmatrix}
    1&0&0&1&0&0&0&1&1\\ 0&1&0&0&0&1&1&0&0\\ 1&0&0&0&0&1&0&1&1\\
    0&1&0&0&0&1&1&0&0\\ 0&0&1&0&1&0&0&0&0\\ 1&0&0&1&0&0&0&1&1\\
    1&0&1&0&1&0&0&0&0\\ 1&0&0&0&1&0&0&1&0\\ 0&0&0&0&1&0&1&0&0\\
    1&0&0&0&0&0&0&0&0
    \end{pmatrix}, \quad
    Q = \begin{pmatrix}
    1&0&0&1&0&0&0&0&0&0&0\\ 0&1&1&0&0&1&1&0&0&1&0\\ 0&0&0&0&1&1&1&0&0&0&1\\
    1&0&1&1&0&0&0&1&1&0&0\\ 0&0&0&0&1&1&0&0&0&0&0\\ 0&0&1&0&0&1&0&0&0&1&0\\
    0&1&1&0&0&1&0&0&0&0&0\\ 1&0&1&1&0&0&0&0&0&0&0\\ 1&0&1&1&0&0&0&1&0&0&0
    \end{pmatrix}.


3     Proposed Approach

In contrast to [5], for the recommender setting we are mostly interested in whether
concepts with more balanced extent and intent sizes may give us an advantage,
and we use the following criterion to this end:

                   W(A, B) = (2|A||B|)/(|A|^2 + |B|^2) ∈ [0; 1],               (6)

where (A, B) is a formal concept.
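A two-line check of the behaviour of this criterion (the values are easy to verify by hand):

def W(A, B):
    # Balance criterion (6): close to 1 for concepts whose extent and intent
    # have similar sizes, small for very wide or very tall concepts.
    a, b = len(A), len(B)
    return 2 * a * b / (a * a + b * b) if a + b else 0.0

print(W(range(4), range(5)))   # balanced concept:   0.9756...
print(W(range(1), range(9)))   # unbalanced concept: 0.2195...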
     In subsection 2.2 we recalled that finding Boolean factors reduces to the
task of finding covering formal concepts for the same input matrix.
     To this end we modified Close-by-One ([13]). This algorithm traverses the
tree of the corresponding concept lattice in a depth-first manner and returns the
set of all formal concepts, which is redundant for the Boolean decomposition task.
The deeper the algorithm goes in the tree, the larger the intents and the smaller
the extents of the formal concepts. Thus, for every branch of the tree the proposed
measure in eq. (6) grows until some depth and then (in case the traversal
continues) decreases.
     The proposed modifications are: 1) the traversal of a certain branch is carried
out as long as W grows together with the covered area (size of extent × size of
intent); 2) at each iteration we do not accept concepts whose intents are contained
in the union of the intents of previously generated concepts.
     In case the intent of a certain concept is covered by its children (fulfilling
condition 1), this concept is not included into F.
     For Close-by-One there is a linear order < on G. Assume C ⊂ G is generated
from A ⊂ G by addition of g ∈ G (C = A ∪ {g}) such that g > max(A); then the
set C′′ is called canonically generated if min(C′′ \ C) > g.


 Algorithm 1: Generation of balanced formal concepts
   Data: Formal context (U, M, I)
   Result: The set of balanced formal concepts F
   foreach u ∈ U do
      A ← {u};
      stack.push(A′);
      g ← u; g++;
      repeat
          if g ∉ U then
              if stack.Top ≠ ∅ then
                  add (A′′, A′) to F;
                  stack.Top ← ∅;
              while stack.Top = ∅ do
                  g ← max(A);
                  A ← A \ {g};
                  stack.pop;
                  g++;
          else
              C ← A ∪ {g};
              if (C′′ is a canonical generation) and W(C′′, C′) ≥ W(A′′, A′) and
              |C′′ × C′| ≥ |A′′ × A′| then
                  stack.Top ← (stack.Top \ C′);
                  A ← C;
              g++;
      until A = ∅;
   return F;



     The obtained set F is still redundant, which is why we further select factors
with maximal coverage until we have covered the whole matrix or a required
percentage of it.
     The main aim of factorisation is the reduction of computation steps and the
revealing of latent similarity, since users' similarities are computed in the factor
space. As the projection matrix of user profiles to the factor space one may use the
“user-factor” matrix from the Boolean factorisation of the utility matrix (P in (2)).
However, in this case most of the vector components of the obtained user profiles
become zero, and thus we lose similarity information.
     To smooth the loss effects we proposed the following weighted projection:

    \tilde{P}_{uf} = \frac{I_{u\cdot} \cdot Q_{f\cdot}}{\|Q_{f\cdot}\|_1} = \frac{\sum_{v \in V} I_{uv} \cdot Q_{fv}}{\sum_{v \in V} Q_{fv}},


where P̃uf indicates to what extent factor f covers user u, Iu· is a binary vector
describing the profile of user u, and Qf· is a binary vector of the items belonging to factor f
(the corresponding row of Q in decomposition eq. (2)). The coordinates of the
obtained projection vector lie within [0; 1].
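In matrix form the weighted projection is a one-line NumPy computation (a sketch; P̃ then replaces the Boolean P as the input of the neighbour search):

import numpy as np

def weighted_projection(I, Q):
    # P~[u, f] = (I[u] . Q[f]) / ||Q[f]||_1, so every entry lies in [0, 1]
    # instead of collapsing to the 0/1 entries of the Boolean P.
    return (I @ Q.T) / Q.sum(axis=1)

# With the 10 x 11 context of Table 1 and the 9 x 11 matrix Q above,
# the result is the 10 x 9 matrix shown next.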
   For Table 1 the weighted projection is as follows:
                                1                     
                                 1 5 0 1 0 31 13 1 1
                               0 1 1 1 1 1 1 1 1 
                                3 21 45 21 2 3 4 
                               1                      
                                5 41 15 21 1 3 11 11 
                               0 1                    
                                2 2 5 2 12 11 3 4 
                               0 1 0 1                
                          P̃ =    5          3 3 0 0
                               1 1 0 1 0 1 1 1 1 .
                                25 1 31 13 2 1 
                               1 1 1                  
                                25 1 25 32 23 3 23 
                               1                      
                                25 21 15 1 32 3 11 41 
                               0                      
                                   5 2 5 1 3 1 3 4
                                        1         2 1
                                 10 0 5 0 0 0 3 2

4     Experiments
The proposed approach and compared ones have been implemented in C++ and
evaluated on the MovieLens-100k data set. This data set features 100000 ratings
on a five-star scale, 1682 movies, contextual information about movies (19 genres),
943 users (each user has rated at least 20 movies), and demographic info for the
users (gender, age, occupation, zip code (ignored)). The users have been divided
into seven age groups: under 18, 18-25, 26-35, 36-45, 45-49, 50-55, 56+.
    Five star ratings are converted to binary scale by the following rule:
    I_{ij} = \begin{cases} 1, & R_{ij} > 3 \\ 0, & \text{otherwise} \end{cases}

    The scaled dataset is split into two sets according to the bimodal cross-validation
scheme [16]: training set and test set with a ratio 80:20, and 20% of ratings in
the test set are hidden3 .
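A sketch of this protocol as we read it (an interpretation of the description above; the exact procedure of [16] may differ in details):

import numpy as np

def bimodal_split(R, user_ratio=0.8, hidden_ratio=0.2, seed=0):
    # Split users 80:20 into training and test users, then hide 20% of each
    # test user's known ratings; the hidden cells are what must be predicted.
    rng = np.random.default_rng(seed)
    users = rng.permutation(R.shape[0])
    cut = int(user_ratio * len(users))
    train, test = users[:cut], users[cut:]
    hidden = np.zeros(R.shape, dtype=bool)
    for u in test:
        rated = np.flatnonzero(R[u] > 0)
        if rated.size == 0:
            continue
        n_hide = max(1, int(hidden_ratio * rated.size))
        hidden[u, rng.choice(rated, n_hide, replace=False)] = True
    visible = np.where(hidden, 0, R)
    return train, test, visible, hidden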

Measure of users similarity First of all, the influence of the similarity measure has
been compared. As we can see in Fig. 1, the Hamming distance based similarity is
significantly better in terms of MAE and Precision; however, it is worse in Recall
and F-measure. Even so, given the superiority in terms of MAE (a measure widely
adopted in the RS community), we decided to use the Hamming distance based
similarity.

Projection into factor space In this series of tests the influence of the projection
method has been studied. The weighted projection keeps more information and as
a result helps us to find similar users with higher accuracy. That is why this method
has a significant advantage in terms of all investigated quality measures (Fig. 2).
3
    This partition into test and training set is done 5 times resulting in 25 hidden
    submatrices and differs from the one provided by MovieLens group; hence the results
    might be different.
[Fig. 1. Comparison of two similarity measures (BMF at 80% coverage): MAE, Precision, Recall and F-measure versus the number of neighbours, for the Hamming-based and the cosine-based similarity.]

[Fig. 2. Comparison of two types of projection into factor space: MAE, Precision, Recall and F-measure versus the number of neighbours, for the weighted and the Boolean projection.]


FCA-based algorithm and factors number The main algorithm studied for finding
Boolean factors as formal concepts is the modified Close-by-One algorithm. It was
compared with the greedy algorithm from [5] in terms of the number of factors and
the final RS quality measures.
                  Coverage                50%   60%   70%   80%   90%
                  Modified Close-by-One   168   228   305   421   622
                  Greedy algorithm        222   297   397   533   737
     CbO covers the input matrix with a smaller number of factors, but it requires
more time (in our experiments, 180 times more on average with single-threaded
calculation). At the same time we have to admit that there is no influence on RS
quality: Recall, Precision and MAE differ mainly only in the third decimal digit.

Incorporation of context information and comparison with SVD For the SVD-
based approach additional (context) information has been attached in a similar
way, but there we use the maximal rating (5 stars) in the attached columns and
rows.
         Coverage                         50%   60%   70%   80%   85%   90%
         BMF                              168   228   305   421   508   622
         BMF (no context information)     163   220   294   401   479   596
         SVD                              162   218   287   373   430   496
         SVD (no context information)     157   211   277   361   416   480

    BMF and SVD give a similar number of factors, especially for small coverage;
context information does not significantly change their number, but it gives an
increase in precision (1-2% more accurate predictions, see Table 2).


           Table 2. Influence of contextual information (80% coverage)

         Number of      Precision        Recall        F-measure         MAE
        neighbours   clean   cntxt   clean   cntxt   clean   cntxt   clean   cntxt
             1      0.3589 0.3609 0.2668 0.2647 0.3061 0.3054 0.2446 0.2434
             5      0.6353 0.6442 0.1420 0.1412 0.2321 0.2317 0.2371 0.2359
            10      0.6975 0.7045 0.1126 0.1114 0.1938 0.1924 0.2399 0.2388
            15      0.7168 0.7258 0.0994 0.0979 0.1746 0.1726 0.2422 0.2411
            20      0.7282 0.7373 0.0911 0.0903 0.1619 0.1610 0.2442 0.2429
            25      0.7291 0.7427 0.0861 0.0853 0.1540 0.1531 0.2457 0.2445
            30      0.7318 0.7426 0.0823 0.0818 0.1480 0.1474 0.2472 0.2459
            40      0.7342 0.7508 0.0767 0.0759 0.1389 0.1379 0.2497 0.2484
            50      0.7332 0.7487 0.0716 0.0712 0.1304 0.1301 0.2518 0.2504
            60      0.7314 0.7478 0.0682 0.0678 0.1247 0.1243 0.2536 0.2522
            70      0.7333 0.7477 0.0658 0.0654 0.1208 0.1202 0.2552 0.2538
            80      0.7342 0.7449 0.0632 0.0624 0.1164 0.1151 0.2567 0.2553
           100      0.7299 0.7461 0.0590 0.0583 0.1092 0.1081 0.2594 0.2580




    With a similar number of factors (SVD at 85% coverage and BMF at 80%),
Boolean factorisation results in smaller MAE and higher Precision when the num-
ber of neighbours is not high. This can be explained by the different nature of the
factors in these factorisation models.


5     Conclusion
In the paper we considered two modifications of Boolean matrix factorisation
which are suitable for Recommender Systems. They were compared on a real
dataset in the presence of auxiliary (context) information. We found out that
the MAE of our BMF-based approach is substantially lower than the MAE of the
SVD-based approach for almost the same number of factors at a fixed coverage
level of BMF and SVD. The Precision of the BMF-based approach is slightly lower
when the number of neighbours is about a couple of dozen and comparable over the
[Fig. 3. Comparison of different matrix factorisation approaches: MAE, Precision, Recall and F-measure versus the number of neighbours, for BMF80+context, SVD85+context and SVD85.]


remaining part of the observed range. The Recall is lower, which results in a
lower F-measure. The proposed weighted projection alleviates the information
loss of the original Boolean projection, resulting in a substantial quality gain.
    We also revealed that the presence of contextual information results in a
small quality increase (about 1-2%) in terms of MAE, Recall and Precision.
    We studied the influence of more balanced factors in terms of the ratio between
the numbers of users and items per factor. Finally, we should report that the greedy
approximate algorithm [5], even though it results in more factors with a larger
user component, is faster and demonstrates almost the same quality. So, its use
is beneficial for recommender systems due to its polynomial-time computational
complexity.
    As a future research direction we would like to compare the proposed
approach with the previously ([9,6,10,7]) and recently introduced FCA-based
ones ([11,12,17]). As for Boolean matrix factorisation in the case of context-aware
information, since the data can be naturally represented as multi-relational, we
would like to continue our collaboration with the authors of the paper [18]. We
definitely need to use information that is independent of both users and items, like
time and location, which can be considered purely contextual in nature and
treated by n-ary methods [19].


Acknowledgments. We would like to thank Alexander Tuzhilin, Elena Nen-
ova, Radim Belohlavek, Vilem Vychodil, Sergei Kuznetsov, Sergei Obiedkov,
Vladimir Bobrikov, Mikhail Roizner, and anonymous reviewers for their com-
ments, remarks and explicit and implicit help during the paper preparations.
This work was supported by the Basic Research Program at the National Re-
search University Higher School of Economics in 2014-2015 and performed in
the Laboratory of Intelligent Systems and Structural Analysis. The first author
was also supported by the Russian Foundation for Basic Research (grant #13-07-00504).


References
 1. Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender
    systems. Computer 42(8) (2009) 30–37
 2. Trefethen, L.N., Bau, D.: Numerical Linear Algebra. SIAM (1997)
 3. Hofmann, T.: Unsupervised learning by probabilistic latent semantic analysis.
    Machine Learning 42(1-2) (2001) 177–196
 4. Lin, C.J.: Projected gradient methods for nonnegative matrix factorization. Neural
    Comput. 19(10) (October 2007) 2756–2779
 5. Belohlavek, R., Vychodil, V.: Discovery of optimal factors in binary data via a
    novel method of matrix decomposition. Journal of Computer and System Sciences
    76(1) (2010) 3–20. Special Issue on Intelligent Data Analysis.
 6. Symeonidis, P., Nanopoulos, A., Papadopoulos, A., Manolopoulos, Y.: Nearest-
    biclusters collaborative filtering based on constant and coherent values. Informa-
    tion Retrieval 11(1) (2008) 51–75
 7. Ignatov, D.I., Kuznetsov, S.O., Poelmans, J.: Concept-Based Biclustering for In-
    ternet Advertisement. In: Data Mining Workshops (ICDMW), 2012 IEEE 12th
    International Conference on. (Dec 2012) 123–130
 8. Ganter, B., Wille, R.: Formal Concept Analysis: Mathematical Foundations.
    Springer, Berlin/Heidelberg (1999)
 9. du Boucher-Ryan, P., Bridge, D.: Collaborative recommending using formal con-
    cept analysis. Knowledge-Based Systems 19(5) (2006) 309–315 (AI 2005 Special Issue)
10. Ignatov, D.I., Kuznetsov, S.O.: Concept-based recommendations for internet ad-
    vertisement. In Belohlavek, R., Kuznetsov, S.O., eds.: Proc. of The Sixth Inter-
    national Conference Concept Lattices and Their Applications (CLA’08), Palacky
    University, Olomouc (2008) 157–166
11. Alqadah, F., Reddy, C., Hu, J., Alqadah, H.: Biclustering neighborhood-based
    collaborative filtering method for top-n recommender systems. Knowledge and
    Information Systems (2014) 1–17
12. Ignatov, D.I., Nenova, E., Konstantinova, N., Konstantinov, A.V.: Boolean Matrix
    Factorisation for Collaborative Filtering: An FCA-Based Approach. In: Artificial
    Intelligence: Methodology, Systems, and Applications - 16th Int. Conf., AIMSA
    2014, Varna, Bulgaria, September 11-13, 2014. Proceedings. (2014) 47–58
13. Kuznetsov, S.O.: A fast algorithm for computing all intersections of objects in a
    finite semilattice. Automatic Documentation and Math. Ling. 27(5) (1993) 11–21
14. Birkhoff, G.: Lattice Theory. 11th edn. Harvard University, Cambridge, MA (2011)
15. Adomavicius, G., Tuzhilin, A.: Toward the next generation of recommender sys-
    tems: A survey of the state-of-the-art and possible extensions. IEEE Trans. on
    Knowl. and Data Eng. 17(6) (June 2005) 734–749
16. Ignatov, D.I., Poelmans, J., Dedene, G., Viaene, S.: A New Cross-Validation Tech-
    nique to Evaluate Quality of Recommender Systems. In Kundu, M., Mitra, S.,
    Mazumdar, D., Pal, S., eds.: Perception and Machine Intelligence. Volume 7143 of
    LNCS. Springer (2012) 195–202
17. Ignatov, D.I., Kornilov, D.: Raps: A recommender algorithm based on pattern
    structures. In: Proceeding of FCA4AI 2015 workshop at IJCAI 2015. (2015)
18. Trnecka, M., Trneckova, M.: An algorithm for the multi-relational boolean fac-
    tor analysis based on essential elements. In: Proceedings of 11th International
    Conference on Concept Lattices and their Applications. (2014)
19. Ignatov, D., Gnatyshak, D., Kuznetsov, S., Mirkin, B.: Triadic formal concept
    analysis and triclustering: searching for optimal patterns. Machine Learning (2015)
    1–32
                    Class Model Normalization
     Outperforming Formal Concept Analysis approaches
                     with AOC-posets

                A. Miralles1,2 , G. Molla1 , M. Huchard2 , C. Nebut2 ,
                           L. Deruelle3 , and M. Derras3

                           (1) Tetis/IRSTEA, France
          andre.miralles@teledetection.fr, guilhem.molla@irstea.fr
              (2) LIRMM, CNRS & Université de Montpellier, France
                           huchard,nebut@lirmm.fr
                          (3) Berger Levrault, France
  laurent.deruelle@berger-levrault.com,mustapha.derras@berger-levrault.com



        Abstract. Designing or reengineering class models in the domain of
        programming or modeling involves capturing technical and domain con-
        cepts, finding the right abstractions and avoiding duplications. Performing
        this last task in a systematic way corresponds to a kind of model nor-
        malization. Several approaches have been proposed, which all converge
        towards the use of Formal Concept Analysis (FCA). An extension of
        FCA to linked data, Relational Concept Analysis (RCA), helped to mine
        better reusable abstractions. But RCA relies on iteratively building con-
        cept lattices, which may cause a combinatorial explosion in the number
        of built artifacts. In this paper, we investigate the use of an alterna-
        tive RCA process, relying on a specific sub-order of the concept lattice
        (AOC-poset) which preserves the most relevant part of the normal form.
        We measure, on case studies from Java models extracted from Java code
        and from UML models, the practical reduction that AOC-posets bring
        to the normal form of the class model.


 Keywords: Inheritance hierarchy, class model normalization, class model reengi-
 neering, Formal Concept Analysis, Relational Concept Analysis


 1    Introduction
 In object-oriented software or information systems, the specialization-generaliza-
 tion hierarchy is a main dimension in class model organization, as is the is-a rela-
 tion in the design of domain ontologies. Indeed, it captures a classification of the
 domain objects which is structuring for human comprehension and which makes
 the representation efficient. Designing or reengineering class models in the do-
 main of programming or in the domain of modeling still remains a tricky task.
 It includes the integration of technical and domain concepts sometimes with
 no clear semantics, and the definition of the adequate abstractions while avoid-
 ing the duplication of information. In many aspects, this task corresponds to a

kind of class model normalization, focusing on specialization and redundancy,
by analogy with database schema normalization. Normalization is important
to assist forward engineering of a reliable and maintainable class model. It is
also useful to address the erosion of the specialization-generalization hierarchy
during software evolution.
    After the seminal paper of R. Godin and H. Mili at OOPSLA'93 [1], several
approaches have been proposed to address this normalization, all of which
converged on the implicit or explicit use of Formal Concept Analysis (FCA [2])
techniques. In this context, FCA was used to mine descriptions that are common
to groups of classes and to suggest refactorings that create more reusable
super-classes. Several approaches more specifically used a specific sub-order of
the concept lattice which captures the most relevant new super-classes (the
AOC-poset, for Attribute-Object Concept poset [3]). Then, Relational Concept
Analysis (RCA [4]), an extension of FCA to linked data, was proposed to find
deeper refactorizations. However, RCA iterates on building concept lattices,
which leads to a combinatorial explosion in the number of built artifacts
(including classes).
    In this paper, we investigate the use of an alternative version of RCA,
relying on AOC-posets. With AOC-posets, RCA might not converge, so the
iteration mechanism has to be handled carefully; but when it converges, we
expect better efficiency. We measure, on case studies from UML models and
from Java models rebuilt from industrial Java code, the reduction brought in
practice by this approach to the normal form of the class model. We also show
that, with realistic tuning, the number of new artifacts remains reasonable,
allowing the designer to analyze them and decide which should be kept in the
model.
    In Section 2, the bases of FCA and RCA in the context of class models are
outlined. Then we present related work in this domain and the motivation for
this study (Section 3). In Section 4, we give the setup of the experiment, as well
as the results. We conclude in Section 5 with a few perspectives on this work.


2     Class Model Normalization

FCA: A classical approach for applying FCA to class model normalization
involves building a formal context K-class=(G, M, I), where the classes (in G)
are associated with their characteristics (attributes, roles, operations, in M )
through I ⊆ G × M . There are many variations in this description of classes. For
example, Fig. 1 (right-hand side) shows such a formal context for the class model
presented in Fig. 2(a). A concept is a pair (Extent, Intent) where Extent = {g ∈
G|∀m ∈ Intent, (g, m) ∈ I} and Intent = {m ∈ M |∀g ∈ Extent, (g, m) ∈ I}.
The concept extent represents the objects that own all the characteristics of the
intent; the concept intent represents the characteristics shared by all objects
of the extent. The specialization order between two formal concepts is given
by: (Extent 1, Intent 1) < (Extent 2, Intent 2) ⇔ Extent 1 ⊂ Extent 2. It
provides the concepts with a lattice structure.
    In the concept lattice, there is an ascending inheritance of objects and a de-
scending inheritance of characteristics. The simplified intent of a formal concept

   Anemometer formal context (left-hand side):
     Device:          (no characteristic)
     CupAnemometer:   recommendedHeight, cupNumber
     VaneAnemometer:  recommendedHeight, windVaneDimension, vaneType
     PlateAnemometer: recommendedHeight, windVaneDimension, plateDimension
     PitotAnemometer: windVaneDimension, tubeLength

   Formal context K-class (right-hand side):
     RainGauge:  rainfall, tubeHeight
     Anemometer: wind, measureInterval, accuracy
     Rainfall:   measuringDate, codeQuality, waterAmount
     Wind:       measuringDate, codeQuality, windStrength, windDirection

Fig. 1. Anemometer formal context (left-hand side); formal context
K-class = (G, M, I), where G is the set of classes of Fig. 2(a), associated by I
with the set M of their attribute and role names (right-hand side)

Fig. 2. Example of class model normalization with FCA and RCA [5]: (a) initial class
model; (b) (resp. (c)) class model refactored with FCA (resp. RCA).


is its intent without the characteristics inherited from its super-concept intents.
The simplified extent is defined in a similar way. In our example, among the for-
mal concepts that can be extracted from K-class, Concept C = ({Rainfall,
Wind}, {measuringDate, codeQuality}) highlights two classes that share the
two attributes measuringDate and codeQuality. This concept C is interpreted
as a new super-class of the classes of its extent, namely Rainfall and Wind. The
new super-class, called here Measure, appears in Fig. 2(b). New super-classes
are named by the class model designer.
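
To make these definitions concrete, here is a minimal Python sketch (our own
illustration, not part of the original tooling; all variable names are ours) that
enumerates the formal concepts of K-class by brute force:

    from itertools import combinations

    # Formal context K-class of Fig. 1 (right-hand side): each class (object
    # in G) is described by its attribute and role names (attributes in M).
    I = {
        "RainGauge":  {"rainfall", "tubeHeight"},
        "Anemometer": {"wind", "measureInterval", "accuracy"},
        "Rainfall":   {"measuringDate", "codeQuality", "waterAmount"},
        "Wind":       {"measuringDate", "codeQuality", "windStrength", "windDirection"},
    }
    G, M = set(I), set().union(*I.values())

    def intent(A):
        # Characteristics shared by all objects of A (all of M for the empty extent).
        return set.intersection(*(I[g] for g in A)) if A else set(M)

    def extent(B):
        # Objects owning all characteristics of B.
        return {g for g in G if B <= I[g]}

    # A pair (A, B) is a concept iff A = extent(B) and B = intent(A); closing
    # every subset of G yields all concepts (fine at this toy scale).
    concepts = {(frozenset(extent(intent(set(A)))), frozenset(intent(set(A))))
                for r in range(len(G) + 1) for A in combinations(sorted(G), r)}

    for A, B in sorted(concepts, key=lambda c: len(c[0])):
        print(sorted(A), "|", sorted(B))

Among the printed pairs appears ({Rainfall, Wind}, {codeQuality, measuringDate}),
the concept interpreted above as the new super-class Measure.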

AOC-posets: In the framework of class model analysis with FCA, AOC-posets
are often used rather than concept lattices. The formal context of Fig. 1 (left-hand
side) is used to illustrate the difference between the concept lattice and the
AOC-poset. The concept lattice in Fig. 3(a) contains all the concepts of the
formal context. Some concepts, like Concept_Device_2, inherit all their
characteristics from their super-concepts and all their objects from their
sub-concepts. In the particular case of object-oriented modeling, they would
correspond to classes with an empty description, with no correspondence to any
initial class description, and would rarely be considered. In the AOC-poset of
Fig. 3(b), only the concepts that introduce at least one characteristic or one
object are kept, which drastically simplifies the structure on large datasets. The
number of concepts in the concept lattice can grow up to 2^min(|G|,|M|), while
it is bounded by |G| + |M| in the AOC-poset. The Iceberg lattice, as introduced
in [6], is another well-known subset of the concept lattice used in many
applications. The iceberg lattice is induced by the subset of concepts whose
extent support is greater than a given threshold. In our case, this would mean
keeping only the new super-classes that have a minimum number of sub-classes,
which is not relevant in modeling and programming: a super-class may have
only one sub-class.
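
Continuing the sketch above (same I, extent and intent; again our own
illustration), the AOC-poset is obtained by keeping only the concepts that
introduce at least one attribute or one object:

    def aoc_poset(concepts):
        # Keep attribute concepts (the largest concept whose intent contains m,
        # i.e. whose extent equals extent({m})) and object concepts (the smallest
        # concept containing g, i.e. whose intent equals I[g]); drop the rest.
        kept = []
        for A, B in concepts:
            introduces_attribute = any(set(A) == extent({m}) for m in B)
            introduces_object = any(set(B) == I[g] for g in A)
            if introduces_attribute or introduces_object:
                kept.append((A, B))
        return kept

Applied to the concepts computed above, this filter drops the top and bottom
concepts and keeps, among others, the Measure concept.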



RCA: RCA helps to go further and obtain the class model of Fig. 2(c). In this
additional normalization step, RainGauge and Anemometer get a new super-class,
discovered because both have a role towards a sort of Measure (resp. Rainfall
and Wind). To this end, the class model is encoded in a Relational Context
Family (RCF), like the one in Table 1, composed of several formal contexts that
separately describe classes (K-class), attributes (K-attribute) and roles
(K-role), and of several relations, including the relation between classes and
attributes (r-hasAttribute), between classes and roles (r-hasRole), and between
roles and their type (r-hasTypeEnd). Here again, this encoding can vary and
integrate other modeling artifacts, like operations or associations.



   (a) Concept lattice:
       Concept_Device_0 (top)
       Concept_Device_1: recommendedHeight
       Concept_Device_3: windVaneDimension
       Concept_Device_5: cupNumber / CupAnemometer
       Concept_Device_2: (empty simplified intent and extent)
       Concept_Device_8: tubeLength / PitotAnemometer
       Concept_Device_6: vaneType / VaneAnemometer
       Concept_Device_7: plateDimension / PlateAnemometer
       Concept_Device_4 (bottom)

   (b) AOC-poset: the same concepts without Concept_Device_0, Concept_Device_2
       and Concept_Device_4.

  Fig. 3. Concept lattice (a) and AOC-poset (b) for anemometers (Fig. 1, left) [5]


    RCA is an iterative process where concepts emerge at each step. Relations
and concepts discovered at one step are integrated into the contexts through
relational attributes for computing the concept lattices of the next step. At
step 0, attributes with the same name (e.g. the two attributes measuringDate,
or the two attributes codeQuality) are grouped. At step 1, classes that share
attributes from an attribute group are grouped into a concept that produces a
new super-class (e.g. Wind and Rainfall are grouped to produce the super-class
Measure). At step 2, the roles rainfall and wind share the fact that they end at
a sub-class of Measure, so they are grouped into a new role, shown under the
name measure in Fig. 2(c). At step 3, the current context of classes (extended
with relational attributes) shows that both RainGauge and Anemometer have a
role ending at Measure. Then a new super-class, called Device by the designer,
is extracted.
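
The core of an RCA step is relational scaling. The following sketch is our own
simplified illustration of existential scaling (not the authors' implementation);
the context and relation literals reuse the names of the running example:

    # Relation r-hasTypeEnd and the role context K-role of Table 1, plus the
    # Measure concept discovered over K-class at the previous step.
    r_hasTypeEnd = {"rainfall": {"Rainfall"}, "wind": {"Wind"}}
    K_role = {"rainfall": {"rainfall"}, "wind": {"wind"}}
    measure = ({"Rainfall", "Wind"}, {"measuringDate", "codeQuality"})

    def scale(ctx, rel, target_concepts, rel_name):
        # Existential scaling: object g receives the relational attribute
        # "exists rel(C)" when rel(g) intersects the extent of concept C.
        scaled = {g: set(attrs) for g, attrs in ctx.items()}
        for cid, (ext, _) in enumerate(target_concepts):
            for g in scaled:
                if rel.get(g, set()) & set(ext):
                    scaled[g].add(f"exists {rel_name}(C{cid})")
        return scaled

    print(scale(K_role, r_hasTypeEnd, [measure], "r-hasTypeEnd"))

Both roles end up sharing the relational attribute exists r-hasTypeEnd(C0),
which is precisely what groups them into the role concept interpreted as measure
at step 2.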


             Table 1. Context family for the set of classes of Fig. 2(a)

   K-class: objects RainGauge, Anemometer, Rainfall, Wind (no native attribute
   columns; the classes are described through the relations below).

   K-attribute (each attribute object is marked with its name):
     RG::tubeHeight -> tubeHeight          A::measureInterval -> measureInterval
     A::accuracy -> accuracy               R::measuringDate -> measuringDate
     W::measuringDate -> measuringDate     R::codeQuality -> codeQuality
     W::codeQuality -> codeQuality         R::waterAmount -> waterAmount
     W::windStrength -> windStrength       W::windDirection -> windDirection

   K-role: rainfall -> rainfall, wind -> wind

   r-hasAttribute:
     RainGauge:  RG::tubeHeight
     Anemometer: A::measureInterval, A::accuracy
     Rainfall:   R::measuringDate, R::codeQuality, R::waterAmount
     Wind:       W::measuringDate, W::codeQuality, W::windStrength, W::windDirection

   r-hasRole: RainGauge -> rainfall, Anemometer -> wind

   r-hasTypeEnd: rainfall -> Rainfall, wind -> Wind




3    Previous work on RCA and Class Model Normalization

RCA was first assessed on small [7] or medium [8] class models, encoding
technical information (multiplicity, visibility, abstractness, initial value) in the
RCF, which was the source of many irrelevant concepts.
   In [9], the authors assessed RCA on Ecore models, Java programs and UML
models. The encoding was similar to the one presented in Section 2 to illustrate
RCA (classes, attributes and roles, described by their names and their
relationships).


While for Java models the number of discovered class concepts (about 13%) was
very reasonable, for UML class models the increase (about 600%) made
post-analysis impossible to achieve.
    Recently, we systematically studied various encodings of the design class
model of an information system about Pesticides [10, 11]. We noticed a strong
impact of the association encoding: encoding only the named and navigable ends
of associations was feasible, while encoding all ends (including those without a
name and those that are not navigable) led to an explosion of the number of
concepts. Restricting to named and navigable ends gives strong importance to
the semantics chosen by the designer, so the lost concepts are more likely to be
uninteresting.
    Guided by the intuition of the model designer, we recently proposed a
controlled approach with progressive concept extraction [5]. In this approach, at
each step of the RCA process the designer chooses the formal contexts and the
relations he wants to explore. For example, he may begin with classes and
attributes, then add roles, then associations, then remove all information about
classes and consider only classes and operations, etc. Such a choice is memorized
in a factorization path. In [5], we used AOC-posets, but we did not evaluate the
difference between AOC-posets and concept lattices in the controlled process.
The objective was to evaluate the number of discovered concepts at each step
and to observe trends in their progression. It is worth noting that the curves of
the same factorization path applied to different models had the same shape.
    In this paper, we use the 15 versions of the analysis class model of the same
information system on Pesticides, as well as a dataset coming from industrial
Java models. Contrary to the experiments in [9–11], our objective is to evaluate
the benefits of the variant of RCA which builds AOC-posets (rather than concept
lattices) during the construction process. Studying the concepts discovered at
each step, we can also evaluate what was called the automatic factorization path
in the controlled approach of [5]. It is clear that AOC-posets are smaller than
concept lattices; the expected results thus focus on assessing the amount of the
reduction in the number of discovered concepts that are brought to the designer
for analysis.


4     Case study

Experimental setup: Figure 4 presents the metamodel used in practice to
define the RCF for our experiments. Object-attribute contexts are associated
with classes, attributes, operations, roles and associations. Attributes, operations,
roles and associations are described by their names in UML and Java models.
Classes are described by their names in UML models.
    Object-object contexts correspond to the meta-relations hasAttribute, has-
Operation, hasRole (between Class and Role and between Association and
Role), hasTypeEnd, and isRoleOf. This last meta-relation is not used in the
Java model RCF, because from Java code we can only extract unidirectional
associations (corresponding to Java attributes whose type is a class).


   All the roles extracted from Java code have a name. In UML models, we only
consider the named roles and the navigable ends of associations, to focus on the
most meaningful elements. We do not consider multiplicities in the role
descriptions, nor the initializer methods (constructors).




                      Fig. 4. Metamodel used to define the RCF.




    Each of the 15 UML models corresponds to a version of an information
system on Pesticides. These models, described in [10], were collected during the
development and subsequent evolution of the Pesticides information system over
6 years. The RCF for UML models contains all the UML model elements. We
also used 15 Java models coming from Java programs developed by the company
Berger-Levrault in the domain of human resource management. These 15 Java
models come from 3 Java programs: Agents, Chapter and Appraisal Campaign.
For each program, we determined a central meaningful class and navigated from
this central class through association roles at several distances: 1, 2, 4, 8 and 16.
For 5 of the Java models, we could not get results due to the size of the models.
The Java programs are the Java counterpart of the database accesses, so we
focused on Java attributes, which are encoded as attributes when their type is
primitive (integer, real, boolean, etc.) and as roles of unidirectional associations
when their type is a class. The operations were access and update operations
associated with these attributes, so they do not bring any meaningful
information. For the sake of space, we only present results on 4 representative
Java models, from the Chapter program (distances 1, 2, 4, 8). Depending on the
version, 254 to 552 model elements are involved in the Pesticides models, and
34 to 171 of these model elements are classes. The Chapter models are composed
of 204 to 3979 model elements, of which 37 to 282 are classes.
   The experiments have been made with the help of UML profiles of Objecteer-
ing1 to extract the RCF from the UML models and Java modules to extract the
RCF from the Java models. RCAExplore2 is used to compute the lattices and

1
    www.objecteering.com/
2
    dolques.free.fr/rcaexplore/


the AOC-posets; Talend3 is used to extract information from the RCAExplore
outputs, to integrate the data, to build the curves and to analyze the results.

Results: We computed various metrics on the built lattices and AOC-posets,
such as the number of concepts in several categories: merged concepts (the
simplified extent has a cardinality strictly greater than 1), perennial concepts
(the simplified extent has a cardinality equal to 1) and new concepts (the
simplified extent is empty). A merged concept corresponds to a set of model
elements that have the same description; for example, a merged attribute concept
groups attributes with the same name. A perennial concept corresponds to a
model element which has at least one characteristic that makes it distinct from
the others. A new concept corresponds to a group of model elements that share
part of their description, while no model element has exactly this description
(each always has some additional information). In this paper we focus on classes
and on the number of new class concepts, because they are the predominant
elements revealing potential new abstractions and a possible lack of factorization
in the current model. To highlight the observed increase brought by the
conceptual structures (lattice and AOC-poset), we present the ratio of the
number of new class concepts to the number of initial classes in the model
(Fig. 5, 6, 7, 8). To compare lattices and AOC-posets, we compute the ratio
(#new class concepts in lattice) / (#new class concepts in AOC-poset) (Fig. 9,
10). For the Chapter models, lattices are computed for steps 0 to 6 and, in all
other cases, lattices or AOC-posets are determined up to step 24.
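
As an illustration of these categories (reusing the context I from the Python
sketch of Section 2; the classification rule below is our reading of the
definitions), the simplified extent of a concept (A, B) contains the objects it
introduces, i.e. those g ∈ A described by exactly B:

    def category(A, B):
        # The cardinality of the simplified extent decides the category.
        own = [g for g in A if I[g] == set(B)]
        if len(own) > 1:
            return "merged"
        return "perennial" if len(own) == 1 else "new"

    # No class is described by exactly {measuringDate, codeQuality}, so the
    # Measure concept is a "new" class concept, i.e. a candidate super-class.
    print(category({"Rainfall", "Wind"}, {"measuringDate", "codeQuality"}))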




          Fig. 5. New class concept number in lattices for Pesticides models



    In lattices for the Pesticides models (Fig. 5), the process always stops
between steps 6 and 14, depending on the model. At step 6, the ratio of new
class concepts to the number of classes varies between 165% (V10) and 253%
(V05). Results are hard to analyze for a human designer without any further
filtering.

3
    www.talend.com/


        Fig. 6. New class concept number in AOC-posets for Pesticides models


In lattices
for the Chapter models (Fig. 7), we stopped the process at step 6 because we
observed a high complexity: for example, for distance 8, at step 6, we get 43656
new class concepts. It is humanly impossible to analyze so many new class
concepts. At distance 16, the Chapter model could not even be processed. For
the model at distance 1 (resp. 2), at step 6, the ratio is 11% (resp. 62%), which
remains reasonable (resp. begins to be hard to analyze). We also observe
step-shaped curves, explained by the metamodel: at step 1, new class concepts
are introduced due to the attribute and role concepts created at step 0, then
they remain stable until step 2, while role and association concepts increase, etc.




          Fig. 7. New class concept number in lattices for Chapter models


    Curves for the AOC-posets of the Pesticides models are shown in Fig. 6. The
ordering of the Pesticides AOC-poset curves roughly corresponds to the need for
factorization: the highest curves correspond to V00 and V01, where few
inheritance relations (none in V00) have been used. This shows the factorization
work done by the designer during model versioning. The ratio at step 6 varies
between 56% (V10) and 132% (V00). In the Pesticides lattices, we observed
many curve crossings, and the curves have different shapes, while with the
Pesticides AOC-posets, the curves have a regular shape and globally decrease.
    For the Chapter models (Fig. 8), convergence is obtained for all AOC-posets
(distances 1 to 8) and the process stops between steps 5 and 23. The curves are
also ordered and, as for lattices, the highest curve corresponds to the highest
distance. The curves reveal many opportunities for factorization. The ratio at
step 6 varies between 5% (distance 1) and 161% (distance 8).




       Fig. 8. New class concept number in AOC-posets for Chapter models



    Fig. 9 and 10 allow lattices and AOC-posets to be compared. As expected,
we always have more concepts in lattices than in AOC-posets. In the AOC-poset
of the Chapter model at distance 1, there is only one new class concept, while in
the lattice there are three (including the top and bottom concepts), which
explains the beginning of the curve (Fig. 10). For version V00, the behaviors of
the lattice-based process and of the AOC-based process are similar. Except for
version V04 (at version V05, a part of the model was duplicated for working
purposes), the highest curves correspond to the last models, which have been
better factorized, and this is where we notice the largest difference between the
two processes. We may hypothesize that, in these cases, lattices contain many
uninteresting factorizations compared to AOC-posets. This experiment shows
that, in practice, AOC-posets generally produce results that can be analyzed,
while lattices are difficult to compute in some cases and are often too large to
be used for our purpose.




        Fig. 9. Ratio (#new class concepts in lattice) / (#new class concepts in
        AOC-poset) for Pesticides models




5   Conclusion

For class model normalization, concept lattices and AOC-posets are two
structures giving two different normal forms. UML models rebuilt from these
structures are, in both cases, interesting from a thematic point of view.
Nevertheless, lattices are often too huge, and AOC-posets offer a good technique
to reduce the complexity.
    As future work, we plan to go more deeply into an exploratory approach,
defining different factorization paths, with model rebuilding at each step and
expert validation. This would allow the complexity to be controlled by
introducing fewer new concepts at each step. We also plan to use domain
ontologies to guide the acceptance of a new formal concept, by checking that it
corresponds to a thematic concept.


Acknowledgment

This work has been supported by Berger Levrault. The authors also warmly
thank X. Dolques for the RCAExplore tool which has been used for experiments.


References

 1. Godin, R., Mili, H.: Building and Maintaining Analysis-Level Class Hierarchies
    Using Galois Lattices. In: Proceedings of the Eight Annual Conference on Object-
    Oriented Programming Systems, Languages, and Applications (OOPSLA 93),
    ACM (1993) 394–410
 2. Ganter, B., Wille, R.: Formal Concept Analysis: Mathematical Foundations.
    Springer-Verlag, Berlin (1999)




         Fig. 10. Ratio (#new class concepts in lattice) / (#new class concepts in
         AOC-poset) for Chapter models




 3. Dolques, X., Le Ber, F., Huchard, M.: AOC-Posets: a Scalable Alternative to
    Concept Lattices for Relational Concept Analysis. In: Proceedings of the Tenth
    International Conference on Concept Lattices and Their Applications (CLA 2013).
    Volume 1062 of CEUR Workshop Proceedings., CEUR-WS.org (2013) 129–140
 4. Rouane-Hacène, M., Huchard, M., Napoli, A., Valtchev, P.: Relational concept
    analysis: mining concept lattices from multi-relational data. Ann. Math. Artif.
    Intell. 67(1) (2013) 81–108
 5. Miralles, A., Huchard, M., Dolques, X., Le Ber, F., Libourel, T., Nebut, C.,
    Guédi, A.O.: Méthode de factorisation progressive pour accroı̂tre l’abstraction
    d’un modèle de classes. Ingénierie des Systèmes d’Information 20(2) (2015) 9–39
 6. Stumme, G., Taouil, R., Bastide, Y., Pasquier, N., Lakhal, L.: Computing Iceberg
    Concept Lattices with TITANIC. Data Knowl. Eng. 42(2) (2002) 189–222
 7. Rouane-Hacène, M.: Relational concept analysis, application to software re-
    engineering. Thèse de doctorat, Université du Québec à Montréal (2005)
 8. Roume, C.: Analyse et restructuration de hiérarchies de classes. Thèse de doctorat,
    Université Montpellier 2 (2004)
 9. Falleri, J.R., Huchard, M., Nebut, C.: A generic approach for class model nor-
    malization. In: Proceedings of the 23rd IEEE/ACM International Conference on
    Automated Software Engineering (ASE 2008). (2008) 431–434
10. Osman Guédi, A., Miralles, A., Huchard, M., Nebut, C.: A practical application
    of relational concept analysis to class model factorization: Lessons learned from
    a thematic information system. In: Proceedings of the Tenth International Con-
    ference on Concept Lattices and Their Applications (CLA 2013). Volume 1062 of
    CEUR Workshop Proceedings., CEUR-WS.org (2013) 9–20
11. Osman Guédi, A., Huchard, M., Miralles, A., Nebut, C.: Sizing the underlying
    factorization structure of a class model. In: Proceedings of the 17th IEEE In-
    ternational Enterprise Distributed Object Computing Conference, (EDOC 2013).
    (2013) 167–172
 Partial enumeration of minimal transversals of a
                   hypergraph

               Lhouari Nourine, Alain Quilliot and Hélène Toussaint

        Clermont-Université, Université Blaise Pascal, LIMOS, CNRS, France
                   {nourine, quilliot, helene.toussaint}@isima.fr



        Abstract. In this paper, we propose the first approach to deal with
        enumeration problems with a huge number of solutions, when
        interestingness measures are not known. The idea developed in the
        following is to partially enumerate the solutions, i.e. to enumerate only
        a representative sample of the set of all solutions. Much work has been
        done on data sampling, where a data set is given and the objective is to
        compute a representative sample. But, to our knowledge, we are the
        first to deal with sampling when the data is given implicitly, i.e. when
        the data is obtained using an algorithm. The experiments show that the
        proposed approach gives good results according to several criteria (size,
        frequency, lexicographical order).


 1    Introduction
 Most problems in data mining ask for the enumeration of all solutions that
 satisfy some given property [1, 10]. This is a natural process in many applications,
 e.g. market basket analysis [1] and biology [2], where experts have to choose
 between those solutions. An enumeration problem asks for the design of an
 output-polynomial algorithm listing the set of all solutions without duplications.
 An output-polynomial algorithm is an algorithm whose running time is bounded
 by a polynomial in the sum of the sizes of the input and the output.
     There are several approaches to enumerate all solutions of a given enumeration
 problem. Johnson et al. [13] have given a polynomial-delay algorithm to
 enumerate all maximal cliques or maximal stable sets of a given graph. Fredman
 and Khachiyan [7] have proposed a quasi-polynomial-time algorithm to
 enumerate all minimal transversals of a hypergraph. For enumeration problems,
 the size of the output may be exponential in the size of the input, which differs
 in general from optimization problems, where the size of the output is
 polynomially related to the size of the input. The drawback of enumeration
 algorithms is thus that the number of solutions may be exponential in the size
 of the input, which is infeasible in practice. In data mining, interestingness
 measures or constraints are used to bound the size of the output; e.g. these
 measures can be explicitly specified by the user [8]. In operations research,
 quality criteria are used in order to make appropriate decisions [21].
     In this paper, we deal with enumeration problems with a huge number of
 solutions, when interestingness measures are not known. This case happens when



the expert has no idea about the data and the knowledge being sought. The
objective is to enumerate only a representative sample of the set of all solutions.
Much work has been done on data sampling, where a data set is given and the
objective is to compute a representative sample. To our knowledge, this idea is
new for sampling when the data is given implicitly, i.e. when the data is obtained
using an algorithm. One could use the naive approach which first enumerates all
the solutions and then applies sampling methods, but this is not possible for a
huge number of solutions.
    To evaluate our approach, we consider a challenging enumeration problem,
which is related to mining maximal frequent itemsets [1, 10], the dualization of
monotone boolean functions [5] and other problems [10]. We applied our
approach to several instances of transversal hypergraphs [17, 20] and obtained
good results.


2     Related works

Golovach et al. [9] have proposed an algorithm to enumerate all minimal
dominating sets of a graph. First they generate maximal independent sets, and
then apply a flipping operation to them to generate new minimal dominating
sets; the enumeration of maximal independent sets can be done with polynomial
delay. Clearly, a relaxation of the flipping operation leads to a partial
enumeration, since the number of minimal dominating sets can be exponential
in the number of maximal independent sets, e.g. for co-bipartite graphs. Jelassi
et al. [12] and Raynaud et al. [19] have considered some kinds of redundancy in
hypergraphs, like twin elements, to obtain a concise representation. Their ideas
can avoid the enumeration of similar minimal transversals of a hypergraph.


3     Transversal hypergraph enumeration

A hypergraph H = (V, E) consists of a finite collection E of sets over a finite set
V. The elements of E are called hyperedges, or simply edges. A hypergraph is
said to be simple if for any two distinct E, E' ∈ E, E ⊄ E'. A transversal (or
hitting set) of H is a set T ⊆ V that intersects every edge of E. A vertex x in a
transversal T is said to be redundant if T \ {x} is still a transversal. A transversal
is minimal if it does not contain any redundant vertex. The set T of all minimal
transversals of H = (V, E), together with V, constitutes a hypergraph
Tr(H) = (V, T), which is called the transversal hypergraph of H. We denote
k = Σ_{E∈E} |E|.

Example 1. Consider the hypergraph H = (V, E) with V = {1, 2, 3, 4, 5} and
E = {E1, E2, E3}, where E1 = {1, 3, 4}, E2 = {1, 3, 5} and E3 = {1, 2}. The set
of all minimal transversals is T = {{1}, {2, 3}, {2, 4, 5}} and k = 3 + 3 + 2 = 8.
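
For illustration, here is a short Python check of these definitions on Example 1
(a sketch of ours, not taken from the paper):

    V = {1, 2, 3, 4, 5}
    E = [{1, 3, 4}, {1, 3, 5}, {1, 2}]

    def is_transversal(T):
        # T intersects every hyperedge.
        return all(T & e for e in E)

    def is_minimal(T):
        # T is a transversal with no redundant vertex.
        return is_transversal(T) and all(not is_transversal(T - {x}) for x in T)

    assert all(is_minimal(T) for T in ({1}, {2, 3}, {2, 4, 5}))
    assert not is_minimal({1, 2})        # vertex 2 is redundant
    print(sum(len(e) for e in E))        # k = 8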

    Given a simple hypergraph H = (V, E), the transversal hypergraph
enumeration problem concerns the enumeration, without repetitions, of Tr(H).
This problem has been intensively studied due to its links with several problems
such as data mining and learning [3, 4, 11, 15, 18]. Recently, Kante et al. [14]
have shown that the enumeration of all minimal transversals of a hypergraph is
polynomially equivalent to the enumeration of all minimal dominating sets of a
graph. It is known that the corresponding decision problem belongs to coNP,
but it is still open whether there exists an output-polynomial-time algorithm.


4     Partial transversal hypergraph enumeration
We introduce a partial (or incomplete) search algorithm for enumerating minimal
transversals of a hypergraph H. The search space is the set of all transversals,
which is very large. The strategy is divided into two steps:
 – The initialization procedure considers a transversal T of H and then applies
   a randomized reduction algorithm to T in order to obtain a minimal
   transversal Tm of H. This step is detailed in Section 4.1.
 – The local search algorithm considers a minimal transversal Tm and then
   applies local changes to Tm, adding and deleting vertices according to some
   ordering of the vertices. This step is detailed in Section 4.2.
   These steps are repeated for at most k transversals, depending on the input
hypergraph H. Figure 1 illustrates the proposed approach.




          Fig. 1. Approach to partial enumeration of minimal transversals




4.1   Initialization
Let H = (V, E) be the input hypergraph, E ∈ E and x ∈ E. The initialization
step starts with the transversal (V \ E) ∪ {x} and then applies a reduction
algorithm to obtain a minimal transversal. The following property shows that
the set (V \ E) ∪ {x} is a transversal and that any minimal transversal contained
in (V \ E) ∪ {x} contains the vertex x.

Property 1. Let H = (V, E) be a simple hypergraph, E ∈ E and x ∈ E. Then
(V \ E) ∪ {x} is a transversal of H. Moreover, any minimal transversal
T ⊆ (V \ E) ∪ {x} contains x.

Proof. Let H = (V, E) be a simple hypergraph and E' ∈ E with E' ≠ E. Since
H is simple, there exists at least one element y ∈ E' such that y ∉ E. So
y ∈ (V \ E) and thus E' ∩ ((V \ E) ∪ {x}) ≠ ∅. We conclude that (V \ E) ∪ {x}
is a transversal, since x ∈ E.
   Now let T ⊆ (V \ E) ∪ {x} be a minimal transversal. Then
E ∩ ((V \ E) ∪ {x}) = {x} and thus x must belong to T, otherwise T would not
intersect E. □
    According to Property 1, we can apply the initialization procedure to any
pair (x, E) where E ∈ E and x ∈ E. In other words, the initialization is applied
to at most k transversals of H, as shown in Algorithm 1.
 Algorithm 1: Initialization
  Input : A hypergraph H = (V, E) and an ordering σ of V
  Output: A sample of minimal transversals
  begin
         STRANS = ∅;
         for E ∈ E do
            for x ∈ E do
               T = (V \ E) ∪ {x};   {initial transversal}
               Tm = Reduce(T, σ);
               STRANS = STRANS ∪ {Tm};
         return(STRANS);

    Now we describe the reduction process, which takes a transversal T and
a random ordering σ of V and returns a minimal transversal Tm . Indeed, we
delete vertices from T according to the ordering σ until we obtain a minimal
transversal.

 Algorithm 2: Reduce(T, σ)
  Input : A transversal T and an ordering σ = σ1 . . . σ|V| of the vertices
          of H.
  Output: A minimal transversal
      for i = 1 to |V| do
         if T \ {σi} is a transversal then
             T ← T \ {σi};
      return(T);
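
Algorithms 1 and 2 translate directly into Python; the sketch below follows the
conventions of the Example 1 snippet above (V a set of vertices, E a list of edge
sets) and is our own rendering, not the authors' code:

    def reduce_transversal(T, sigma, E):
        # Algorithm 2: scan sigma and drop every vertex whose removal
        # still leaves a transversal.
        T = set(T)
        for v in sigma:
            if v in T and all((T - {v}) & e for e in E):
                T.remove(v)
        return frozenset(T)

    def initialization(V, E, sigma):
        # Algorithm 1: reduce the transversal (V \ E) + {x} for every pair (x, E).
        strans = set()
        for edge in E:
            for x in edge:
                strans.add(reduce_transversal((V - edge) | {x}, sigma, E))
        return strans

On Example 1 with sigma = (1, 2, 3, 4, 5), initialization(V, E, sigma) returns
{{1}, {2, 3}, {2, 4, 5}}, reproducing Example 2.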


Example 2 (continued). Suppose we are given the hypergraph of Example 1 and
σ = (1, 2, 3, 4, 5) as the input to Algorithm 1. First, it takes the hyperedge
E = {1, 3, 4}; for x = 1 we obtain the minimal transversal {1}, for x = 3 we
obtain {2, 3} and for x = 4 we obtain {2, 4, 5}. Then the algorithm continues
with the hyperedges {1, 3, 5} and {1, 2}. Finally the algorithm returns
STRANS = {{1}, {2, 3}, {2, 4, 5}}, i.e. the other iterations do not add new
minimal transversals.
Theorem 1. Algorithm 1 computes at most k minimal transversals of an input
hypergraph H = (V, E).
Proof. The initialization procedure considers at most k transversals of H. Since
a minimal transversal can be obtained several times, the result follows. □

    The following proposition shows that any minimal transversal of the
hypergraph H = (V, E) can be obtained using the initialization procedure.
Indeed, the choice of the ordering σ is important in the proposed strategy.

Proposition 1. Let H = (V, E) be a hypergraph and T a minimal transversal
of H. Then there exist a total order σ, E ∈ E and x ∈ E such that
T = Reduce((V \ E) ∪ {x}, σ).
Proof. Let T be a minimal transversal of H = (V, E). Then there exists at least
one hyperedge E ∈ E such that T ∩ E = {x} for some x ∈ V, otherwise T would
not be minimal. Thus T ⊆ (V \ E) ∪ {x}. Now, if σ places the elements that are
not in T before the elements of T, the algorithm Reduce((V \ E) ∪ {x}, σ)
returns T. □
    The initialization procedure guarantees that, for any vertex x ∈ V, at least
one minimal transversal containing x is listed. The experiments in Section 5
show that the sample of minimal transversals obtained by the initialization
procedure is a representative sample of the set of all minimal transversals.

4.2   Local search algorithms
The local search algorithm takes each minimal transversal found in the
initialization step and searches for new minimal transversals to improve the
initial solution. The search for neighbors is based on vertex orderings.
    Let H = (V, E) be a hypergraph and x ∈ V. We define the frequency of
x as the number of minimal transversals of H that contain x. The algorithm
takes a minimal transversal T and a bound Nmax which bounds the number of
iterations and the number of neighbors generated from T. Each iteration of the
while loop starts with a minimal transversal T and computes two orderings as
follows:
 – σc is an ordering of the vertices of V \ T by increasing frequency in the
   minimal transversals already obtained by the current call. This ordering
   gives a better coverage of the solution set, by keeping the rarest vertices in
   the transversals.


 – σ is a random ordering of the vertices in T .

    Algorithm 3: Neighbor(T, Nmax)
     Input : A minimal transversal T of H = (V, E) and an integer Nmax
     Output: A set of minimal transversals
      Q = {T};
      i = 1;
      while i ≤ Nmax do
          σc ← the set V \ T sorted in increasing order of the frequency of
          vertices in the minimal transversals in Q;
          σ ← the elements of T sorted at random;
          add the elements of σc to T, one by one, until a vertex x ∈ T \ σc
          becomes redundant in T;
          T = Reduce(T, σ);
          Q = Q ∪ {T};
          i = i + 1;
      return(Q);

      Now we give the global procedure of the proposed approach.
    Algorithm 4: Global procedure for partial enumeration of minimal
    transversals
      Input : A hypergraph H = (V, E) and an integer Nmax
      Output: A sample of minimal transversals of H
      σ = choose a random ordering of V;
      STRANS = Q = Initialization(H, σ);
      while Q ≠ ∅ do
         T = choose and remove a minimal transversal T from Q;
         STRANS = STRANS ∪ Neighbor(T, Nmax);
      return(STRANS);
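
Putting Algorithms 3 and 4 together (continuing the sketch after Algorithm 2:
reduce_transversal and initialization are reused; the frequency counting and the
redundancy test follow our reading of the pseudocode):

    import random
    from collections import Counter

    def neighbor(T, nmax, V, E):
        # Algorithm 3: explore around a minimal transversal T.
        Q = {frozenset(T)}
        freq = Counter(v for S in Q for v in S)
        for _ in range(nmax):
            old, T = set(T), set(T)
            # sigma_c: vertices outside T, rarest (within Q) first.
            for v in sorted(V - T, key=lambda u: freq[u]):
                T.add(v)
                # Stop as soon as some original vertex becomes redundant.
                if any(all((T - {u}) & e for e in E) for u in old):
                    break
            sigma = random.sample(sorted(old), len(old))  # random order on the old T
            T = reduce_transversal(T, sigma, E)
            Q.add(frozenset(T))
            freq.update(T)
        return Q

    def partial_enumeration(V, E, nmax):
        # Algorithm 4: initialization seeds, then local search around each seed.
        sigma = random.sample(sorted(V), len(V))
        strans = set(initialization(V, E, sigma))
        Q = set(strans)
        while Q:
            T = Q.pop()                   # choose and remove a seed
            strans |= neighbor(T, nmax, V, E)
        return strans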

   In the following, we describe experiments to evaluate the results that have
been obtained.


5      Experimentation
The purpose of the experiments is to see if the proposed approach allows us to
generate a representative set of solutions. For this reason, we have conducted
the experiments on two different classes of hypergraphs (see [20]) for which the
number of minimal transversals is huge compared to the size of the input. We
use Uno's algorithm SHD (Sparsity-based Hypergraph Dualization, ver. 3.1) [20]
to enumerate all minimal transversals. The experiments were run on Linux
CentOS with an Intel Xeon 3.6 GHz CPU, using the C++ language.
   In the following, we denote by Tpartial the set of minimal transversals
generated by Algorithm 4, and by Texact the set of all minimal transversals.
First, we analyze the percentage |Tpartial| / |Texact|, and then we evaluate the
representativeness of the sample Tpartial.


5.1      The size of Tpartial

We distinguish between the minimal transversals obtained using Algorithm 1
(the initialization procedure) and those generated using the local search. For
these tests we set the maximal number of neighbors Nmax to 3.
Tables 1 and 2 show the results for the two classes of hypergraph instances,
namely "lose" and random "p8".

      The first three columns have the following meaning:

 – instance: instance name.
 – instance size: the size of the instance (number of edges × number of vertices).
 – total # of transv.: the exact number of minimal transversals | Texact |.

   The next three columns (resp. the last three columns) give the results for the
initialization procedure (resp. the global algorithm):

 – # transv. found : the number of minimal transversals found.
 – % transv. found : the percentage of minimal transversals found.
 – cpu (s): the running time in seconds.




                       Table 1. Results for all ”lose” instances



    According to these tests, we can see that the percentage of minimal
transversals found using the initialization procedure is very low, and that it
decreases as the size of Texact increases. Clearly, this percentage is strongly
related to the input. Indeed, the number k (entropy) of the hypergraph grows
with the size of the input hypergraph. We can also see that the local search
significantly increases the number of solutions found, by a factor of
approximately 2 to 2.5.




                       Table 2. Results for all ”p8 ” instances




This remains consistent with the chosen value Nmax = 3, and it shows that the
local search finds transversals that are found neither by the initialization
procedure nor by previous local search iterations. Notice that the parameter
Nmax can be increased whenever the size of Tpartial is not sufficient.


5.2   The representativeness of Tpartial

To evaluate the representativeness of Tpartial , we consider three criteria:

 – Size of the minimal transversals in Tpartial .
 – Frequency of vertices in Tpartial .
 – Lexicographical rank of the minimal transversals in Tpartial .

    Each criterion is illustrated using a bar graph for two instances from
different classes. The bar graphs in Figures 2 and 3 are surprising. Indeed, they
vary in nearly the same manner for the initialization and for the local search
algorithm, on all the considered criteria.

    Figures 2(a) and 3(a) show that the percentage of minimal transversals of
each size (found either by the initialization procedure or by the local search)
matches the corresponding percentage over all minimal transversals.
    Figures 2(b) and 3(b) show that the same analysis holds when ordering
minimal transversals lexicographically (based on a total ordering of the vertices).
Clearly, the lexicographical rank of a minimal transversal belongs to the interval
[1..2^|V|]. For visualization purposes, we divide this interval into |V|
subintervals, where subinterval i counts the minimal transversals with a rank
r ∈ [ (2^|V|/|V|) i ; (2^|V|/|V|) (i + 1) [, for i = 0, . . . , |V| − 1.
    Figures 2(c) and 3(c) confirm this behavior when considering the frequency
of vertices. Indeed, the frequency of vertices in the minimal transversals of
Tpartial is the same as when considering all minimal transversals, i.e. the set
Texact.
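
The paper does not fix the exact rank function, so as an aside we sketch the
bucketing under the assumption (ours) that membership is read as binary digits,
most significant vertex first:

    def lex_rank(T, vertices):
        # Rank in [1 .. 2^|V|] of transversal T under the assumed encoding;
        # `vertices` is the fixed total order of V.
        r = 0
        for v in vertices:
            r = 2 * r + (1 if v in T else 0)
        return r + 1

    def bucket(T, vertices):
        # Index i of the subinterval [ (2^n / n) i , (2^n / n) (i + 1) ).
        n = len(vertices)
        return min(n - 1, (lex_rank(T, vertices) - 1) * n // (2 ** n))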




Fig. 2. The bar graphs for "lose100": a) number of minimal transversals per size; b)
number of minimal transversals according to the lexicographical rank; c) frequency of
vertices in minimal transversals.




Fig. 3. The bar graphs for "p8 200": a) number of minimal transversals per size; b)
number of minimal transversals according to the lexicographical rank; c) frequency of
vertices in minimal transversals.




Fig. 4. Visualizing the solution space of "lose100". The abscissa is given by the size
of the transversal (transversals of the same size are spread out using a norm) and the
ordinate corresponds to the frequency of its vertices.




Fig. 5. Visualizing the solution space of "p8 200". The abscissa is given by the size
of the transversal (transversals of the same size are spread out using a norm) and the
ordinate corresponds to the frequency of its vertices.


    Figures 4 and 5 show that the set Tpartial is also representative even when
considering minimal transversals of the same size. Indeed, minimal transver-
sals having the same size are spread out using a norm. We notice that the points
corresponding to minimal transversals in Tpartial are scattered across the image.
    This experiment allows us to conclude that the sample Tpartial produced by
Algorithm 4 is representative with respect to the criteria under consideration. Other
results can be found at http://www2.isima.fr/~toussain/


6   Conclusion and discussions

We are convinced that the initialization procedure is the most important part of this
approach. Indeed, the set of minimal transversals obtained using this procedure is
a representative sample, since it guarantees that for any vertex of the hypergraph
there is at least one minimal transversal which contains it (see Property 1).
Moreover, the local search procedure can be used to increase the number of
solutions and, as we have seen in the experiments, it keeps the same properties
as the initialization procedure.
    We hope that this approach improves enumeration in big data settings and will
be of interest to readers investigating heuristic methods [6] for enumeration
problems.
    This paper opens new challenges related to partial and approximate enu-
meration problems. For example, given a hypergraph H = (V, E), is there an
algorithm that, for any given ε, enumerates a set Tpartial ⊆ Tr(H) such that
(1 − ε)|Tr(H)| ≤ |Tpartial| ≤ |Tr(H)|? We also require the algorithm to be
output-polynomial in the sizes of H, Tpartial and 1/ε. To our knowledge, there is
no work on approximation algorithms for enumeration problems, but results on
approximate counting problems may be applied [16].

   Acknowledgment: This work has been funded by the French National Re-
search Agency (ANR DAG project, 2009–2013) and CNRS (Mastodons PETASKY
project, 2012–2015).


References
 1. R. Agrawal, T. Imielinski, and A. Swami. Mining associations between sets of items
    in massive databases. In ACM SIGMOD 1993, Washington D.C., pages 207–216,
    1993.
 2. J. Y. Chen and S. Lonardi. Biological Data Mining. Chapman and Hall/CRC,
    2009.
 3. T. Eiter and G. Gottlob. Identifying the minimal transversals of a hypergraph and
    related problems. SIAM J. Comput., 24(6):1278–1304, 1995.
 4. T. Eiter, G. Gottlob, and K. Makino. New results on monotone dualization and
    generating hypergraph transversals. SIAM J. Comput., 32(2):514–537, 2003.
 5. T. Eiter, K. Makino, and G. Gottlob. Computational aspects of monotone dual-
    ization: A brief survey. Discrete Applied Mathematics, 156(11):2035–2049, 2008.


 6. T. Feo and M. Resende. Greedy randomized adaptive search procedures. Journal
    of Global Optimization, 6(2):109–133, 1995.
 7. M. Fredman and L. Khachiyan. On the complexity of dualization of monotone
    disjunctive normal forms. Journal of Algorithms, 21:618–628, 1996.
 8. L. Geng and H. J. Hamilton. Interestingness measures for data mining: A survey.
    ACM Comput. Surv., 38(3), Sept. 2006.
 9. P. Golovach, P. Heggernes, D. Kratsch, and Y. Villanger. An incremental polyno-
    mial time algorithm to enumerate all minimal edge dominating sets. Algorithmica,
    72(3):836–859, 2015.
10. D. Gunopulos, R. Khardon, H. Mannila, S. Saluja, H. Toivonen, and R. S. Sharma.
    Discovering all most specific sentences. ACM Trans. Database Syst., 28(2):140–174,
    2003.
11. D. Gunopulos, R. Khardon, H. Mannila, and H. Toivonen. Data mining, hyper-
    graph transversals, and machine learning. In PODS, pages 209–216, 1997.
12. M. Jelassi, C. Largeron, and S. Ben Yahia. Concise representation of hypergraph
    minimal transversals: Approach and application on the dependency inference prob-
    lem. In Research Challenges in Information Science (RCIS), 2015 IEEE 9th In-
    ternational Conference on, pages 434–444, May 2015.
13. D. S. Johnson, C. H. Papadimitriou, and M. Yannakakis. On generating all maxi-
    mal independent sets. Inf. Process. Lett., 27(3):119–123, 1988.
14. M. M. Kanté, V. Limouzy, A. Mary, and L. Nourine. On the enumeration of mini-
    mal dominating sets and related notions. SIAM Journal on Discrete Mathematics,
    28(4):1916–1929, 2014.
15. L. Khachiyan, E. Boros, K. M. Elbassioni, and V. Gurvich. An efficient implemen-
    tation of a quasi-polynomial algorithm for generating hypergraph transversals and
    its application in joint generation. Discrete Applied Mathematics, 154(16):2350–
    2372, 2006.
16. J. Liu and P. Lu. FPTAS for counting monotone CNF. In Proceedings of the Twenty-
    Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’15, pages
    1531–1548. SIAM, 2015.
17. K. Murakami and T. Uno. Efficient algorithms for dualizing large-scale hyper-
    graphs. Discrete Applied Mathematics, 170:83–94, 2014.
18. L. Nourine and J.-M. Petit. Extending set-based dualization: Application to pat-
    tern mining. In ECAI 2012, IOS Press, Montpellier, France, 2012.
19. O. Raynaud, R. Medina, and C. Noyer. Twin vertices in hypergraphs. Electronic
    Notes in Discrete Mathematics, 27:87–89, 2006.
20. T. Uno. http://research.nii.ac.jp/~uno/dualization.html.
21. C. A. Weber, J. R. Current, and W. Benton. Vendor selection criteria and methods.
    European Journal of Operational Research, 50(1):2 – 18, 1991.
         An Introduction to Semiotic-Conceptual Analysis
                  with Formal Concept Analysis

                                              Uta Priss

                            Zentrum für erfolgreiches Lehren und Lernen
                              Ostfalia University of Applied Sciences
                                       Wolfenbüttel, Germany
                                     www.upriss.org.uk



           Abstract. This paper presents a formalisation of Peirce’s notion of ‘sign’ using
           a triadic relation with a functional dependency. The three sign components are
           then modelled as concepts in lattices which are connected via a semiotic map-
           ping. We call the study of relationships relating to semiotic systems modelled in
           this manner a semiotic-conceptual analysis. It is argued that semiotic-conceptual
           analyses allow for a variety of applications and serve as a unifying framework for
           a number of previously presented applications of FCA.


 1 Introduction

 The American philosopher C. S. Peirce was a major contributor to many fields with a
 particular interest in semiotics. The following quote shows one of his definitions for the
 relationships involved in using a sign:

       A REPRESENTAMEN is a subject of a triadic relation TO a second, called
       its OBJECT, FOR a third, called its INTERPRETANT, this triadic relation be-
       ing such that the REPRESENTAMEN determines its interpretant to stand in
       the same triadic relation to the same object for some interpretant. (Peirce, CP
       1.541)1

     According to Peirce a sign consists of a physical form (representamen) which could,
 for example, be written, spoken or represented by neurons firing in a brain, a meaning
 (object) and another sign (interpretant) which mirrors the original sign, for example, in
 the mind of a person producing or observing a sign. It should be pointed out that the use
 of the term ‘relation’ by Peirce is not necessarily the same as in modern mathematics
 which distinguishes more clearly between a ‘relation’ and its ‘instances’. Initially Peirce
 even referred to mathematical relations as ‘relatives’ (Maddux, 1991).
     We have previously presented an attempt at mathematically formalising Peirce's
 definition (Priss, 2004). In that attempt we tried to presuppose as few assumptions
 about semiotic relations as possible, which led to a fairly open structural description
 that, however, appeared to be of limited usefulness in applications.
  1
      It is customary among Peirce scholars to cite Peirce in this manner using an abbreviation of
      the publication series, volume number and paragraph or page numbers.




Now we are presenting another formalisation which imposes a functional dependency
on the triadic relation. The notions from Priss (2004) are translated into this new for-
malism which is in many ways simpler, more clearly defined and appears to be more
useful for applications.
     In order to avoid confusion with the notion of ‘object’ in FCA2 , we use the term
‘denotation’ instead of Peirce’s ‘object’ in the remainder of this paper. We translate the
first half of Peirce’s definition into modern language as follows: “A representamen is
a parameter of a function resulting in a second, called its denotation, where the third,
the function instance, is called interpretant.” – or in other words as a function of type
‘third(first) = second’. In our modelling, a set of such functions together with their
parameter/value pairs constitute a triadic semiotic relation. We use Peirce’s notion of
‘interpretant’ for the function instances whereas the functions themselves are called
‘interpretations’. A sign is then an instance of this triadic relation consisting of rep-
resentamen, denotation and interpretation. This is more formally defined in the next
section.
     The second half of Peirce’s definition refers to the mental image that a sign in-
vokes in participants of communication acts. For Peirce, interpretants are mental images
which can themselves be thought about and thus become representamens for other inter-
pretants and so on. Because the focus of this paper is on formal, not natural languages,
mental images are not important. We suggest that in formal languages, interpretants are
not mental representations but instead other formal structures, for example, states in a
computer program.
     The data structures used in formal languages (such as programming languages,
XML or UML) can contain a significant amount of complexity. A semiotic-conceptual
analysis as proposed in this paper allows to investigate the components of such struc-
tures as signs with their representamens, denotations and interpretations and their rela-
tionships to each other. As a means of structuring the semiotic components we use FCA
concept lattices.
     It should be noted that there has recently been an increase of interest in triadic
FCA (e.g., Gnatyshak et al. (2013), Belohlavek & Osicka (2012)) which could also be
used to investigate triadic semiotic relations. But Peirce tends to see triadic relations as
consisting of three components of increasing complexity:

      The First is that whose being is simply in itself, not referring to anything nor
      lying behind anything. The Second is that which is what it is by force of some-
      thing to which it is second. The Third is that which is what it is owing to
      things between which it mediates and which it brings into relation to each other.
      (Peirce, EP 1:248; CP 1.356)

    In our opinion this is better expressed by a function instance of type ‘third(first)
= second’ than by an instance ‘(first, second, third)’ of a triadic relation. Other re-
searchers have already suggested formalisations of Peirce’s philosophy. Interestingly,
Marty (1992), Goguen (1999) and Zalamea (2010) all suggest using Category Theory
 2
     Because Formal Concept Analysis (FCA) is the main topic of this conference, this paper does
     not provide an introduction to FCA. Information about FCA can be found, for example, on-line
     (http://www.fcahome.org.uk) and in the main FCA textbook by Ganter & Wille (1999).


for modelling Peirce’s philosophy even though they appear to have worked indepen-
dently of each other. Marty even connects Category Theory with FCA in his modelling.
Goguen develops what he calls ‘algebraic semiotics’. Zalamea is more focussed on Ex-
istential Graphs than semiotics (and unfortunately most of his papers are in Spanish).
Nevertheless our formalisation is by far not as abstract as any of these existing formal-
isations which are therefore not further discussed in this paper.
    This paper has five further sections. Section 2 presents the definitions of signs, semi-
otic relations and NULL-elements. Section 3 continues with defining concept lattices
for semiotic relations. Section 4 explains degrees of equality among signs. Section 5
discusses mappings among the lattices from Section 3 and presents further examples.
The paper ends with a concluding section.


2 Core definitions of a semiotic-conceptual analysis

The main purpose of this work is to extend Peirce’s sign notion to formal languages
such as computer programming languages and formal representations. Figure 1 displays
a simple Python program which is called ’Example 1’ in the remainder of this paper.
The table underneath shows the values of the variables of Example 1 after an execution.
The variables are representamens and their values are denotations. Because Peirce’s
definition of signs seems to indicate that there is a separate interpretant for each sign,
there are at least eight different interpretants in column 3 of the table. It seems more
interesting, however, to group interpretants than to consider them individually. We call
such groupings of interpretants interpretations. In natural language examples, one could
group all the interpretants that belong to a sentence or paragraph. In programming lan-
guages starting a loop or calling a subroutine might start a new interpretation. As a
condition for interpretations we propose that each representamen must have a unique
denotation in an interpretation, or in other words, interpretations are functions. There
are many different possibilities for choosing sets of interpretations. Two possibilities,
called IA and IB in the rest of the paper, are shown in the last two columns of the ta-
ble. Each contains two elements which is in this case the minimum required number
because some variables in Example 1 have two different values. In our modelling an
interpretant corresponds to a pair of representamen and interpretation. For R and IA
there are ten interpretants (and therefore ten signs) whereas there are eight for R and
IB . The first three columns of the table can be automatically derived using a debugging
tool. The interpretants are numbered in the sequence in which they are printed by the
debugger.
     Definition 1: A semiotic relation S ⊆ I × R × D is a relation between three sets
(a set R of representamens, a set D of denotations and a set I of interpretations) with
the condition that any i ∈ I is a partial function i : R → D. A relation instance (i, r, d)
with i(r) = d is called a sign. In addition to S, we define the semiotic (partial) mapping
S : I × R → D with S(i, r) = d iff i(r) = d. The pairs (i, r) for which d exists are
called interpretants.
     It follows that there are as many signs as there are interpretants. Example 1 shows
two semiotic relations using either IA or IB for the interpretations. The interpretations
firstLoop, secondLoop and firstValue are total functions. The interpretation second-

input_end = "no"
while input_end != "yes":
    input1 = raw_input("Please type something: ")
    input2 = raw_input("Please type something else: ")
    if (input1 == input2):
       counter = 1
       error = 1
       print "The two inputs should be different!"
    else:
       counter = 2
    input_end = raw_input("End this program? ")

representamens R  denotations D   interpretants  interpretations IA  interpretations IB
  (variables)       (values)                     (∼10 interpretants) (∼8 interpretants)
    input1       "Hello World"        j1            firstLoop           firstValue
    input2       "Hello World"        j2            firstLoop           firstValue
    counter            1              j3            firstLoop           firstValue
    input_end          no             j4            firstLoop           firstValue
    error              1              j5            firstLoop           firstValue
    input1       "Hello World"    j6 (or j1)       secondLoop           firstValue
    input2       "How are you"        j7           secondLoop          secondValue
    counter            2              j8           secondLoop          secondValue
    input_end         yes             j9           secondLoop          secondValue
    error              1         j10 (or j5)       secondLoop           firstValue

                  Fig. 1. A Python program (called ‘Example 1’ in this paper)



Value is a partial function. Because for (i1 , r1 , d1 ) and (i2 , r2 , d2 ) we have i1 = i2 , r1 = r2 ⇒
d1 = d2 , it follows that all r ∈ R with r(i) = d are also partial functions r : I → D.
The reason for having a relation S and a mapping S is that Peirce defines a relation
but in applications a mapping might be more usable. In this paper the sets R, D and I
are meant to be finite and not in any sense universal but built for an application. The
assignment operation (written ‘:=’ in mathematics or ‘=’ in programming languages)
is an example of i(r) = d except that i is usually implied and not explicitly stated in
that case.
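    As a minimal illustration of Definition 1 (using the values of Example 1 from
Fig. 1; the Python names are ours), each interpretation can be modelled as a
dictionary from representamens to denotations, and the semiotic mapping S becomes
a double lookup:

# Each interpretation in I_A is a (partial) function R -> D, here a dict.
I_A = {
    "firstLoop":  {"input1": "Hello World", "input2": "Hello World",
                   "counter": 1, "input_end": "no", "error": 1},
    "secondLoop": {"input1": "Hello World", "input2": "How are you",
                   "counter": 2, "input_end": "yes", "error": 1},
}

def S(i, r):
    # Semiotic mapping of Definition 1: S(i, r) = d iff i(r) = d.
    return I_A[i][r]

# The pairs (i, r) with a defined value are the interpretants; the signs
# are the triples (i, r, S(i, r)).
signs = [(i, r, d) for i, f in I_A.items() for r, d in f.items()]
print(len(signs))   # 10, matching the ten interpretants for R and I_A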
    Using the terminology from database theory, we call a semiotic relation a triadic re-
lation with functional dependency. This is because, on the one hand, Peirce calls it not a
mapping but a ‘triadic relation’, on the other hand, without this functional dependency
it would not be possible to determine the meaning of a sign given its representamen and
an interpretation. Some philosophers might object to Definition 1 because of the func-
tional dependency. We argue that the added functional dependency yields an interesting
structure which can be explored as shown in this paper.
    The idea of using interpretations as a means of assigning meaning to symbols is
already known from formal semantics and model theory. But this paper has a different
focus by treating interpretations and representamens as dual structures. Furthermore in
applications, S(i, r) might be implemented as an algorithmic procedure which deter-
mines d for r based on information about i at runtime. A debugger as in Example 1


is not part of the original code but at a meta-level. Since the original code might re-
quest user input (as in Example 1), the relation instances (i, r, d) are only known while
or after the code was executed. Thus the semiotic relation is dynamically generated in
an application. This is in accordance with Peirce’s ideas about how it is important for
semiotics to consider how a sign is actually used. The mathematical modelling (as in
Definition 1) which is conducted after a computer program finished running, ignores
this and simply considers the semiotic relation to be statically presented.
    Priss (2004) distinguishes between triadic signs and anonymous signs which are
less complex. In the case of anonymous signs, the underlying semiotic relation can be
reduced to a binary or unary relation because of additional constraints. Examples of
anonymous signs are constants in programming languages and many variables used
in mathematical expressions. For instance, the values of variables in the Pythagorean
equation a2 + b2 = c2 are all the values of all possibly existing right-angled triangles.
But, on the one hand, if a2 + b2 = c2 is used in a proof, it is fine to assume |I| = 1
because within the proof the variables do not change their values. On the other hand,
if someone uses the formula for an existing triangle, one can assume S(i, r) = r be-
cause in that case the denotations can be set equal to the representamens. Thus within a
proof or within a mathematical calculation variables can be instances of binary or unary
relations and thus anonymous signs. However, in the following Python program:
a = input("First side: ")
b = input("Second side: ")
print a*a + b*b

     the values of the variables change depending on what is entered by a user. Here the
signs a and b are triadic.
     A sign is usually represented by its representamen. In a semiotic analysis it may
be important to distinguish between ‘sign’ and ‘representamen’. In natural language
this is sometimes indicated by using quotes (e.g., the word ‘word’). In Example 1, the
variable ‘input1’ is a representamen whereas the variable ‘input1’ with a value ‘Hello
World’ in the context of firstLoop is a sign. It can happen that a representamen is taken
out of its context of use and loses its connection to an interpretation and a denotation.
For example, one can encounter an old file which can no longer be read by any current
program. But a sign always has three components (i, r, d). Thus just looking at the
source code of such a file creates an interpretant in the reader's mind even though this new
sign and the original sign may have nothing in common other than the representamen.
Using the next definition, interpretations that are partial functions can be converted into
total functions.
     Definition 2: For a semiotic relation, a NULL-element d⊥ is a special kind of deno-
tation with the following conditions: (i) i(r) undefined in D ⇒ i(r) := d⊥ in D∪{d⊥ }.
(ii) d⊥ ∈ D ⇒ all i are total functions.
     Thus by enlarging D with one more element, one can convert all i into total func-
tions. If all i are already total functions, then d⊥ need not exist. The semiotic mapping
S can be extended to a total function S : I × R → D ∪ {d⊥ }. There can be different
reasons for NULL-elements: caused by the selection of interpretations or by the code
itself. Variables are successively added to a program and thus undefined for any inter-
pretation that occurs before a variable is first defined. In Example 1, secondValue is a
partial function because secondValue(input1) = secondValue(error) = d⊥ . But IA shows


that all interpretations can be total functions. On the other hand, if a user always enters
two different values, then the variable ‘error’ is undefined for all interpretations. This
could be avoided by changing the code of Example 1. In more complex programs it
may be more difficult to avoid d⊥ , for example if a call to an external routine returns
an undefined value. Programming languages tend to allow operations with d⊥ , such as
evaluating whether a variable is equal to d⊥ , in order to avoid run-time errors resulting
from undefined values. Because the modelling in the next section would be more com-
plex if conditions for d⊥ were added we decided to mostly ignore d⊥ in the remainder
of this introductory paper.
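    The totalisation of Definition 2 is easy to sketch in Python; the sentinel object
below stands for d⊥ and its naming is ours:

NULL = object()   # stands for the NULL-element d_bot of Definition 2

def totalize(interpretation, representamens):
    # Extend a partial interpretation to a total function by sending
    # every undefined representamen to the NULL-element.
    return {r: interpretation.get(r, NULL) for r in representamens}

R = ["input1", "input2", "counter", "input_end", "error"]
secondValue = {"input2": "How are you", "counter": 2, "input_end": "yes"}
print(totalize(secondValue, R)["input1"] is NULL)  # True: secondValue(input1) = d_bot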


3 Concept lattices of a semiotic relation
In order to explore relationships among signs we suggest modelling the compo-
nents of signs as concept lattices. The interpretations which are (partial) functions from
R to D then give rise to mappings between the lattice for R and the lattice for D. Fig-
ure 2 shows an example of concept lattices for the semiotic relation from Example 1.
The objects are the representamens, denotations and interpretations of Example 1. The
attributes are selected for characterising the sets and depend on the purpose of an appli-
cation. If the denotations are values of a programming language, then data types are a
fairly natural choice for the attributes of a denotation lattice.
     Attributes for representamens should focus on representational aspects. In Example
1, all input variables start with the letters ‘input’ because of a naming style used by
the programmer of that program. In some languages certain variables start with upper
or lowercase letters, use additional symbols (such as ‘@’ for arrays) or are complex
structures (such as ’root.find(”file”).attrib[”size”]’) which can be analysed in a repre-
sentamen lattice. In strongly-typed languages, data types could be attributes of repre-
sentamens but in languages where variables can change their type, data types do not
belong into a representamen lattice. Rules for representamens also determine what is to
be ignored. For example white space is ignored in many locations of a computer pro-
gram. The font of written signs is often ignored but mathematicians might use Latin,
Greek and Fraktur fonts for representamens of different types of denotations.
     One way of deriving a lattice for interpretations is to create a partially ordered set
using the ordering relation as to whether one interpretation precedes another one or
whether they exist in parallel. A lattice is then generated using the Dedekind closure. In
Figure 2 the attributes represent some scaling of the time points of the interpretations.
Thus temporal sequences can be expressed but any other ordering can be used as well.
     Definition 3: For a set R of representamens, a concept lattice B(R, MR , JR ) is
defined where MR is a set of attributes used to characterise representamens and JR
is a binary relation JR ⊆ R × MR . B(R, MR , JR ) is called representamen lattice.
B(R, MR , JR ) is complete for a set of interpretations3 if for all r1 , r2 ∈ R: ∀i∈I : γ(r1 ) =
γ(r2 ) ⇒ i(r1 ) = i(r2 ) and γ(r1 ) ≠ γ(r2 ) ⇒ ∃i∈I : i(r1 ) ≠ i(r2 ).
     Definition 4: For a set I of interpretations, a concept lattice B(I, MI , JI ) is defined
where MI is a set of attributes used to characterise the interpretations and JI is a binary
 3
     For an object o its object concept γ(o) is the smallest concept which has the object in its
     extension.


[Figure: three concept lattices for Example 1 side by side: a representamen lattice
(attributes 'input', 'other'), a denotation lattice (attributes 'is defined', 'string',
'number', 'binary', 'positive', 'negative') and an interpretation lattice (attributes
'>0', '>1'), with dashed lines connecting concepts across the lattices.]

                                    Fig. 2. Lattices for Example 1




relation JI ⊆ I × MI . B(I, MI , JI ) is called interpretation lattice. B(I, MI , JI ) is
complete for a set of representamens if for all i1 , i2 ∈ I: ∀r∈R : γ(i1 ) = γ(i2 ) ⇒ i1 (r) =
i2 (r) and γ(i1 ) ≠ γ(i2 ) ⇒ ∃r∈R : i1 (r) ≠ i2 (r).
     The representamen lattice in Figure 2 is not complete for IA because, for exam-
ple, 'input_end' and 'input1' have the same object concept but different denotations.
The interpretation lattice is complete for R because no objects have the same object
concept and firstLoop and secondLoop have different denotations, for example, for
'counter'. Completeness means that exactly those representamens or interpretations
that share their object concepts can be used interchangeably without any impact on
their relationship with the other sets.
The dashed lines in Figure 2 are explained below in Section 5.
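    The completeness condition of Definition 3 can be checked mechanically. The
sketch below is a hypothetical helper, assuming the object-concept map γ is given
as a dictionary and interpretations as dictionaries:

def complete_for(gamma, interpretations, R):
    # Definition 3: representamens with the same object concept must get
    # equal values under every interpretation, and representamens with
    # different object concepts must be separated by some interpretation.
    # (Undefined values compare as None here, a simplification.)
    for r1 in R:
        for r2 in R:
            same_concept = gamma[r1] == gamma[r2]
            always_equal = all(i.get(r1) == i.get(r2) for i in interpretations)
            if same_concept != always_equal:
                return False
    return True

# 'input1' and 'input_end' share an object concept but differ in value,
# so the representamen lattice is not complete for I_A (as noted above).
gamma = {"input1": "c_input", "input_end": "c_input"}
I_A = [{"input1": "Hello World", "input_end": "no"},
       {"input1": "Hello World", "input_end": "yes"}]
print(complete_for(gamma, I_A, ["input1", "input_end"]))   # False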
     Definition 5: For a set D \ {d⊥ } of denotations, a concept lattice B(D, MD , JD )
is defined where MD is a set of attributes used to characterise the denotations and JD
is a binary relation JD ⊆ D × MD . B(D, MD , JD ) is called denotation lattice.


4 Equality and other sign properties

Before continuing with the consequences of the definitions of the previous section,
equality of signs should be discussed because there are different degrees of equality.
Two signs, (i1 , r1 , d1 ) and (i2 , r2 , d2 ), are equal if all three components are equal. Be-
cause of the functional dependency this means that two signs are equal if i1 = i2 and
r1 = r2 . In normal mathematics the equal sign is used for denotational equality. For
example, x = 5 means that x has the value of 5 although clearly the representamen x
has nothing in common with the representamen 5. Since signs are usually represented
by their representamens denotational equality needs to be distinguished from equal-
ity between signs. Denotational equality is called ‘strong synonymy’ in the definition
below. Even strong synonymy is sometimes too much. For example natural language
synonyms (such as ‘car’ and ‘automobile’) tend to always still have subtle differences
in meaning. In programming languages, if a counter variable increases its value by 1,
it is still thought of as the same variable. But if such a variable changes from ‘3’ to
‘Hello World’ and then to ‘4’, depending on the circumstances, it might indicate an


error. Therefore we are defining a tolerance relation4 T ⊆ D × D to express that some
denotations are close to each other in meaning. With respect to the denotation lattice,
the relation T can be defined as the equivalence relation of having the same object con-
cept or via a distance metric between concepts. The following definition is an adaptation
from Priss (2004) that is adjusted to the formalisation in this paper.
    Definition 6: For a semiotic relation with tolerance relations TD ⊆ D × D and
TI ⊆ I × I the following are defined:
 • i1 and i2 are compatible ⇔ ∀r∈R, i1 (r) ≠ d⊥ , i2 (r) ≠ d⊥ : (i1 (r), i2 (r)) ∈ TD
 • i1 and i2 are mergeable ⇔ ∀r∈R, i1 (r) ≠ d⊥ , i2 (r) ≠ d⊥ : i1 (r) = i2 (r)
 • i1 and i2 are TI -mergeable ⇔ (i1 , i2 ) ∈ TI and i1 and i2 are mergeable
 • (i1 , r1 , d1 ) and (i2 , r2 , d2 ) are strong synonyms ⇔ r1 ≠ r2 and d1 = d2
 • (i1 , r1 , d1 ) and (i2 , r2 , d2 ) are synonyms ⇔ r1 ≠ r2 and (d1 , d2 ) ∈ TD
 • (i1 , r1 , d1 ) and (i2 , r2 , d2 ) are equinyms ⇔ r1 = r2 and d1 = d2
 • (i1 , r1 , d1 ) and (i2 , r2 , d2 ) are polysemous ⇔ r1 = r2 and (d1 , d2 ) ∈ TD
 • (i1 , r1 , d1 ) and (i2 , r2 , d2 ) are homographs ⇔ r1 = r2 and (d1 , d2 ) ∉ TD
     It follows that if a representamen lattice is complete for a set of interpretations, rep-
resentamens that share their object concepts are strong synonyms for all interpretations.
In Example 1, if TD corresponds to {Hello World, How are you}, {yes, no}, {1, 2} then
firstLoop and secondLoop are compatible. Essentially this means that variables do not
radically change their meaning between firstLoop and secondLoop. Mergeable interpre-
tations have the same denotation for each representamen and could be merged into one
interpretation. In Example 1 the interpretations in IA (or in IB ) are not mergeable. Us-
ing TI -mergeability it can be ensured that only interpretations which have something
in common (for example temporal adjacency) are merged. There are no examples of
homographs in Example 1 but the following table shows some examples for the other
notions of Definition 6.

      strong synonyms (firstLoop, input2, "Hello World") (secondLoop, input1, "Hello World")
      synonyms        (firstLoop, input1, "Hello World") (secondLoop, input2, "How are you")
      equinyms        (firstLoop, input1, "Hello World") (secondLoop, input1, "Hello World")
      polysemous      (firstLoop, input2, "Hello World") (secondLoop, input2, "How are you")

    Some programming languages use further types of synonymy-like relations, for ex-
ample, variables can have the same value but not the same data type, or the same
value but not refer to the same object. An example of homographs in natural
languages is presented by the verb ‘lead’ and the metal ‘lead’. In programming lan-
guages, homographs are variables which have the same name but are used for totally
different purposes. If this happens in separate subroutines of a program, it does not
pose a problem. But if it involves global variables it might indicate an error in the code.
Thus algorithms for homograph detection can be useful for checking the consistency of
programs. Compatible interpretations are free of homographs.
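    As a sketch of such a homograph check (assuming the tolerance relation TD is
given as a list of blocks of mutually tolerant denotations; all names are ours):

def tolerant(d1, d2, blocks):
    # (d1, d2) is in T_D iff they are equal or some block contains both.
    return d1 == d2 or any(d1 in b and d2 in b for b in blocks)

def homographs(signs, blocks):
    # Pairs of signs with equal representamens whose denotations are not
    # tolerant of each other (Definition 6).
    return [(s, t) for a, s in enumerate(signs) for t in signs[a + 1:]
            if s[1] == t[1] and not tolerant(s[2], t[2], blocks)]

blocks = [{"Hello World", "How are you"}, {"yes", "no"}, {1, 2}]
signs = [("firstLoop", "counter", 1),
         ("secondLoop", "counter", "Hello World")]
print(homographs(signs, blocks))  # 'counter' radically changed meaning: a homograph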
    Definition 7: A semiotic relation with concept lattices as defined in Definitions
3-5 is called a semiotic system. The study of semiotic systems is called a semiotic-
conceptual analysis.
 4
     A tolerance relation is reflexive and symmetric.


5 Mappings between the concept lattices

A next step is to investigate how (and whether) the interpretations as functions from
R to D give rise to interesting mappings between the representamen and denotation
lattice. For example, if the representamen lattice has an attribute ‘starts with uppercase
letter’ and it is common practice in a programming language to use uppercase letters
for names of classes and there is an attribute ‘classes’ in the denotation lattice, then
one would want to investigate whether this information is preserved by the mapping
amongst the lattices. The following definition describes a basic relationship:
     Definition 8: For a semiotic relation, the power set P(R) and subsets I1 ⊆ I and R1 ⊆
R we define: I1∨ : P(R)\{∅} → B(D, MD , JD ) with I1∨ (R1 ) := ⋁i∈I1 ⋁r∈R1 γ(i(r)).
     Because the join relation in a lattice is commutative and associative it does not
matter whether one first iterates through interpretations or through representamens
(i.e., ⋁i∈I1 ⋁r∈R1 or ⋁r∈R1 ⋁i∈I1 ). An analogous function can be defined for infima. One
can also consider the inverse (I1∨ )⁻¹.
     Definition 8 allows for different types of applications. One can look at the results
for extensions (and thus concepts of the representamen lattice), one-element sets (cor-
responding to individual elements in R) or elements of a tolerance relation. The same
holds for the subsets of I. The question that arises in each application is whether the
mapping I1∨ has some further properties, such as being order-preserving or whether
it forms an ‘infomorphism’ in Barwise & Seligman’s (1997) terminology (together
with an inverse mapping). It may be of interest to find the subsets of R for which
(I1∨ )⁻¹(I1∨ (R1 )) = R1 .
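    On the denotation side, I1∨ can be computed via a standard FCA identity: the
intent of a join of concepts is the intersection of their intents. A minimal sketch
follows, with JD given as a set of object/attribute pairs; the toy incidence below only
loosely imitates the denotation lattice of Figure 2, and the names are ours:

def object_intent(d, J_D):
    # The intent of the object concept gamma(d): all attributes of d.
    return {m for (x, m) in J_D if x == d}

def I_join_intent(I1, R1, J_D):
    # Intent of I1_vee(R1): intersect the intents of the gamma(i(r))
    # (assumes at least one pair (i, r) is defined).
    intents = [object_intent(i[r], J_D) for i in I1 for r in R1 if r in i]
    out = intents[0].copy()
    for s in intents[1:]:
        out &= s
    return out

J_D = {("Hello World", "string"), ("Hello World", "is defined"),
       (1, "number"), (1, "is defined")}
firstLoop = {"input1": "Hello World", "counter": 1}
print(I_join_intent([firstLoop], ["input1", "counter"], J_D))  # {'is defined'}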
     In the case of Figure 2, 'input_end' is mapped onto the concepts with attribute 'pos-
itive’, ‘negative’ or ‘binary’ depending on which set of interpretations is used. The
other representamens are always mapped onto the same concepts no matter which set
of interpretations is used. For the extensions of the representamen lattice this leads to
an order-preserving mapping. Thus overall the structures of the representamen and de-
notation lattice seem very compatible in this example. Other examples could produce
mappings which change radically between different interpretations. In a worst case sce-
nario, every representamen is mapped to the top concept of the denotation lattice as
soon as more than one interpretation is involved.
     In Figure 2 the interpretation lattice is depicted without any connection to the other
two lattices. Furthermore even though a construction of R1∨ in analogy to I1∨ would
be possible it would not be interesting for most applications because most elements
would be mapped to the top element of the denotation lattice. Thus different strategies
are needed for the representamen and interpretation lattices. One possibility of connect-
ing the three lattices is to use a ‘faceted display’ similar to Priss (2000). The idea for
Figure 3 is to use two facets: the denotation lattice which also contains the mapped
representamens and the interpretation lattice. If a user ‘clicks’ on the upper concept in
the interpretation lattice, the lattice on the left-hand side of Figure 3 is displayed. If a
user clicks on the lower interpretation, the lattice on the right-hand side is displayed.
Switching between the two interpretations would show the movement of 'input_end'.
This is also reminiscent of the work by Wolff (2004) who uses ‘animated’ concept lat-
tices which show the movement of ‘complex objects’ (in contrast to formal objects)


across the nodes of a concept lattice. In our semiotic-conceptual analysis the interpreta-
tions are not necessarily linearly-ordered (as Wolff’s time units) but ordered according
to a concept lattice.



[Figure: the denotation lattice of Example 1 with mapped representamens, shown
once for the interpretation firstLoop (left) and once for secondLoop (right);
'input_end' is the only representamen that changes its position.]




                              Fig. 3. Switching between interpretations




    Instead of variables or strings, representamens can also be more complex structures,
such as graphs, UML diagrams, Peirce’s existential graphs, relations or other complex
mathematical structures which are then analysed using interpretations. Figure 4 shows
an example from Priss (1998) which was originally presented in terms of what Priss
called ‘relational concept analysis5 ’. The words in the figure are entries from the elec-
tronic lexical database WordNet6 . The solid lines in Figure 4 are subconcept relation
instances from a concept lattice although the lattice drawing is incomplete in the figure.
The dashed lines are part-whole relation instances that are defined among the concepts.
Using a semiotic-conceptual analysis, this figure can be generated by using represen-
tamens which are instances of a part-whole relation. Two interpretations are involved:
one maps the first component of each relation instance into the denotation lattice, the
other one maps the second component. Each dashed line corresponds to the mapping
of one representamen. For each representamen, I ∨ (r) is the whole and I ∧ (r) the part
of the relation instance. Priss (1999) calculates bases for semantic relations which in
this modelling as a semiotic-conceptual analysis correspond to searching for infima and
suprema of such representamens as binary relations.
    Figure 4 shows an example of a data error. The supremum of ‘hand’ and ‘foot’
should be the concept which is a part of ‘limb’. There should be a part-whole relation
from ‘digit’ to that ‘extremity’ concept. Although it would be possible to write an al-
gorithm that checks for this error systematically, this is probably again an example
where a user can detect an error in the data more easily (because of the lack of sym-
metry) if the data is graphically presented. We argue that there are so many different
ways in which semiotic-conceptual analyses can be used that it is not feasible to write

 5
   These days the notion ‘relational concept analysis’ is used in a different meaning by other
   authors.
 6
   https://wordnet.princeton.edu/


[Figure: an incomplete concept lattice of WordNet entries ('animal', 'human',
'body part', 'external body part', 'extremity, appendage', 'limb', 'arm', 'leg',
'extremity', 'hand', 'foot', 'digit', 'finger', 'toe', 'nail', 'fingernail', 'toenail',
'structure'); solid lines are subconcept relation instances, dashed lines are
part-whole relation instances.]


                   Fig. 4. Representamens showing a part-whole relation



algorithms for any possible situation. In many cases the data can be modelled for an
application and then interactively investigated.
    In Figure 4, the representamens are instances of a binary relation or pairs of deno-
tations. Thus there are no intrinsic differences between what is a representamen, deno-
tation or interpretation. Denotations are often represented by strings and thus are signs
themselves (with respect to another semiotic relation). A computer program as a whole
can also be a representamen. Since that is then a single representamen, the relation
between the program output (its denotations) and the succession of states (its interpre-
tations) is then a binary relation. Priss (2004) shows an example of a concept lattice for
such a relation.


6 Conclusion and outlook
This paper presents a semiotic-conceptual analysis that models the three components
of a Peircean semiotic relation as concept lattices which are connected via a semiotic
mapping. The paper shows that the formalisation of such a semiotic-conceptual analysis
provides a unified framework for a number of our previous FCA applications (Priss,
1998-2004). It also presents another view on Wolff’s (2004) animated concept lattices.
    But this paper only describes a starting point for this kind of modelling. Instead
of considering one semiotic system with sets R, D, I, one could also consider several
semiotic systems with sets R1 , D1 , I1 and so on as subsets of larger sets R, D, I. Then
one could investigate what happens if, for example, the signs from one semiotic system
become the representamens, interpretations or denotations of another semiotic system.


For example, in the second half of Peirce’s sign definition in Section 1 he suggests
that for i1 (r) = d there should be an i2 with i2 (i1 ) = d. Furthermore one could
consider a denotation lattice as a channel between different representamen lattices in
the terminology of Barwise & Seligman’s (1997) information flow theory as briefly
mentioned in Section 5 which also poses some other open questions.
    There are connections with existing formalisms (for example model-theoretic se-
mantics) that need further exploration. In some sense a semiotic-conceptual analy-
sis subsumes syntactic relationships (among representamens), semantic relationships
(among denotations) and pragmatic relationships (among interpretations) in one for-
malisation. Other forms of semiotic analyses which use the definitions from Section 2
and 4 but use other structures than concept lattices (as suggested in Section 3) are pos-
sible as well. Hopefully future research will address such questions and continue this
work.


References
1. Barwise, Jon; Seligman, Jerry (1997). Information Flow. The Logic of Distributed Systems.
   Cambridge University Press.
2. Belohlavek, Radim; Osicka, Petr (2012). Triadic concept lattices of data with graded at-
   tributes. International Journal of General Systems 41.2, p. 93-108.
3. Ganter, Bernhard; Wille, Rudolf (1999). Formal Concept Analysis. Mathematical Founda-
   tions. Berlin-Heidelberg-New York: Springer.
4. Gnatyshak, Dmitry; Ignatov, Dmitry; Kuznetsov, Sergei O. (2013). From Triadic FCA to Tri-
   clustering: Experimental Comparison of Some Triclustering Algorithms. CLA. Vol. 1062.
5. Goguen, Joseph (1999). An introduction to algebraic semiotics, with application to user in-
   terface design. Computation for metaphors, analogy, and agents. Springer Berlin Heidelberg,
   p. 242-291.
6. Priss, Uta (1998). The Formalization of WordNet by Methods of Relational Concept Analysis.
   In: Fellbaum, Christiane (ed.), WordNet: An Electronic Lexical Database and Some of its
   Applications, MIT press, p. 179-196.
7. Priss, Uta (1999). Efficient Implementation of Semantic Relations in Lexical Databases. Com-
   putational Intelligence, Vol. 15, 1, p. 79-87.
8. Priss, Uta (2000). Lattice-based Information Retrieval. Knowledge Organization, Vol. 27, 3,
   p. 132-142.
9. Priss, Uta (2004). Signs and Formal Concepts. In: Eklund (ed.), Concept Lattices: Second
   International Conference on Formal Concept Analysis, Springer Verlag, LNCS 2961, 2004, p.
   28-38.
10. Maddux, Roger D. (1991). The origin of relation algebras in the development and axiomati-
   zation of the calculus of relations. Studia Logica 50, 3-4, p. 421-455.
11. Marty, Robert (1992). Foliated semantic networks: concepts, facts, qualities. Computers &
   mathematics with applications 23.6, p. 679-696.
12. Wolff, Karl Erich (2004). Towards a conceptual theory of indistinguishable objects. Concept
   Lattices. Springer Berlin Heidelberg, p. 180-188.
13. Zalamea, Fernando (2010). Towards a Complex Variable Interpretation of Peirce's Existential
   Graphs. In: Bergman, M., Paavola, S., Pietarinen, A.-V., & Rydenfelt, H. (Eds.). Ideas in
   Action: Proceedings of the Applying Peirce Conference, p. 277-287.
               Using the Chu construction for
             generalizing formal concept analysis

       L. Antoni1 , I.P. Cabrera2 , S. Krajči1 , O. Krídlo1 , M. Ojeda-Aciego2
                 1
                    University of Pavol Jozef Šafárik, Košice, Slovakia *
       2
           Universidad de Málaga. Departamento Matemática Aplicada. Spain **



        Abstract. The goal of this paper is to show a connection between FCA
        generalisations and the Chu construction on the category ChuCors, the
        category of formal contexts and Chu correspondences. All needed cat-
        egorical properties like categorical product, tensor product and its bi-
        functor properties are presented and proved. Finally, the second order
        generalisation of FCA is represented by a category built up in terms of
        the Chu construction.

        Keywords: formal concept analysis, category theory, Chu construction


 1    Introduction
 The importance of category theory as a foundational tool was discovered soon
 after its very introduction by Eilenberg and MacLane about seventy years ago.
 On the other hand, Formal Concept Analysis (FCA) has largely shown both
 its practical applications and its capability to be generalized to more abstract
 frameworks, and this is why it has become a very active research topic in
 recent years; for instance, a framework for FCA has been recently introduced
 in [19] in which the sets of objects and attributes are no longer unstructured
 but have a hypergraph structure by means of certain ideas from mathematical
 morphology. On the other hand, for an application of the FCA formalism to
 other areas, in [11] the authors introduce a representation of algebraic domains
 in terms of FCA.
     The Chu construction [8] is a theoretical method that, from a symmetric
 monoidal closed (autonomous) category and a dualizing object, generates a *-
 autonomous category. This construction, or the closely related notion of Chu
 space, has been applied to represent quantum physical systems and their sym-
 metries [1, 2].
     This paper continues with the study of the categorical foundations of formal
 concept analysis. Some authors have noticed the property of being a cartesian
 closed category of certain concept structures that can be approximated [10, 20];
  *
    Partially supported by the Scientific Grant Agency of the Ministry of Education of
    Slovak Republic under contract VEGA 1/0073/15.
 **
    Partially supported by the Spanish Science Ministry projects TIN12-39353-C04-01
    and TIN11-28084.




others have provided a categorical construction of certain extensions of FCA [12];
morphisms have received a categorical treatment in [17] as a means for the
modelling of communication.
    There already exist some approaches [9] which consider the Chu construction
in terms of FCA. In the current paper, we continue the previous study by the
authors on the categorical foundation of FCA [13,15,16]. Specifically, the goal of
this paper is to highlight the importance of the Chu construction in the research
area of categorical description of the theory of FCA and its generalisations. The
Chu construction plays here the role of a recipe for constructing a suitable
category that covers the second order generalisation of FCA.
    The structure of this paper is the following: in Section 2 we recall the prelim-
inary notions required both from category theory and formal concept analysis.
Then, the various categorical properties of the input category which are required
(like the existence of categorical and tensor product) are developed in detail in
Sections 3 and 4. An application of the Chu construction is presented in Section 5
where it is also shown how to construct formal contexts of second order from
the category of classical formal contexts and Chu correspondences (ChuCors).


2     Preliminaries

In order to make the manuscript self-contained, the fundamental notions and their
required properties are recalled in this section.

Definition 1. A formal context is any triple C = ⟨B, A, R⟩ where B and A are
finite sets and R ⊆ B × A is a binary relation. It is customary to say that B is
a set of objects, A is a set of attributes and R represents a relation between
objects and attributes.
    On a given formal context (B, A, R), the derivation (or concept-forming)
operators are a pair of mappings ↑ : 2^B → 2^A and ↓ : 2^A → 2^B such that if
X ⊆ B, then ↑X is the set of all attributes which are related to every object in
X and, similarly, if Y ⊆ A, then ↓Y is the set of all objects which are related to
every attribute in Y .

    In order to simplify the description of subsequent computations, it is conve-
nient to describe the concept forming operators in terms of characteristic func-
tions, namely, considering the subsets as functions on the set of Boolean values.
Specifically, given X ⊆ B and Y ⊆ A, we can consider mappings ↑X : A → {0, 1}
and ↓Y : B → {0, 1}:

 1. ↑X(a) = ⋀b∈B ((b ∈ X) ⇒ ((b, a) ∈ R)) for any a ∈ A
 2. ↓Y(b) = ⋀a∈A ((a ∈ Y) ⇒ ((b, a) ∈ R)) for any b ∈ B

where the infimum is considered in the set of Boolean values and ⇒ is the truth-
function of the implication of classical logic.
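    These operators translate directly into Python. The following is a minimal
sketch, with a context given as sets B, A and an incidence relation R of
object/attribute pairs (all names are ours):

def up(X, A, R):
    # All attributes related to every object in X.
    return {a for a in A if all((b, a) in R for b in X)}

def down(Y, B, R):
    # All objects related to every attribute in Y.
    return {b for b in B if all((b, a) in R for a in Y)}

B, A = {"b1", "b2"}, {"a1", "a2"}
R = {("b1", "a1"), ("b1", "a2"), ("b2", "a1")}
print(up({"b1", "b2"}, A, R))   # {'a1'}
print(down({"a1"}, B, R))       # {'b1', 'b2'}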


Definition 2. A formal concept is a pair of sets ⟨X, Y ⟩ ∈ 2^B × 2^A which is a
fixpoint of the pair of concept-forming operators, namely, ↑X = Y and ↓Y = X.
The object part X is called the extent and the attribute part Y is called the intent.

   There are two main constructions relating two formal contexts: the bonds
and the Chu correspondences. Their formal definitions are recalled below:
Definition 3. Consider C1 = ⟨B1 , A1 , R1 ⟩ and C2 = ⟨B2 , A2 , R2 ⟩ two formal
contexts. A bond between C1 and C2 is any relation β ∈ 2^{B1×A2} such that its
columns are extents of C1 and its rows are intents of C2 . All bonds between such
contexts will be denoted by Bonds(C1 , C2 ).
    The Chu correspondence between contexts can be seen as an alternative
inter-contextual structure which, instead, links intents of C1 and extents of C2 .
Namely,
Definition 4. Consider C1 = ⟨B1 , A1 , R1 ⟩ and C2 = ⟨B2 , A2 , R2 ⟩ two formal
contexts. A Chu correspondence between C1 and C2 is any pair of multimappings
ϕ = hϕL , ϕR i such that

 – ϕL : B1 → Ext(C2 )
 – ϕR : A2 → Int(C1 )
 – ↑2(ϕL (b1 ))(a2 ) = ↓1(ϕR (a2 ))(b1 ) for any (b1 , a2 ) ∈ B1 × A2

All Chu correspondences between such contexts will be denoted by Chu(C1 , C2 ).
    The notions of bond and Chu correspondence are interchangeable; specifi-
cally, we will use the bond βϕ associated to a Chu correspondence ϕ from C1
to C2 defined for b1 ∈ B1 , a2 ∈ A2 as follows:

                 βϕ (b1 , a2 ) = ↑2 (ϕL (b1 ))(a2 ) = ↓1 (ϕR (a2 ))(b1 )

   The set of all bonds (resp. Chu correspondences) between any two formal
contexts endowed with set inclusion as ordering have a complete lattice structure.
Moreover, both complete lattices are dually isomorphic.
   In order to formally define the composition of two Chu correspondences, we
need to introduce the extension principle below:

Definition 5. Given a mapping ϕ : X → 2^Y we define its extended mapping
ϕ+ : 2^X → 2^Y defined by ϕ+ (M ) = ⋃x∈M ϕ(x), for all M ∈ 2^X .
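   In Python the extension principle is a one-line union; a minimal sketch with ϕ
modelled as a dictionary (a hypothetical toy mapping of ours):

def phi_plus(phi, M):
    # Extended mapping of Definition 5: union of phi(x) over all x in M.
    out = set()
    for x in M:
        out |= phi[x]
    return out

phi = {"x": {"p", "q"}, "y": {"q", "r"}}   # a toy mapping X -> 2^Y
print(phi_plus(phi, {"x", "y"}))           # {'p', 'q', 'r'}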


   The set of formal contexts together with Chu correspondences as morphisms
forms a category denoted by ChuCors. Specifically:

 – objects: formal contexts
 – arrows: Chu correspondences
 – identity arrow ι : C → C of context C = ⟨B, A, R⟩
     • ιL (b) = ↓↑ ({b}), for all b ∈ B
     • ιR (a) = ↑↓ ({a}), for all a ∈ A


 – composition ϕ2 ◦ ϕ1 : C1 → C3 of arrows ϕ1 : C1 → C2 , ϕ2 : C2 → C3 (where
   Ci = ⟨Bi , Ai , Ri ⟩, i ∈ {1, 2, 3})
     • (ϕ2 ◦ ϕ1 )L : B1 → 2^{B3} and (ϕ2 ◦ ϕ1 )R : A3 → 2^{A1}
     • (ϕ2 ◦ ϕ1 )L (b1 ) = ↓3 ↑3 (ϕ2L+ (ϕ1L (b1 )))
     • (ϕ2 ◦ ϕ1 )R (a3 ) = ↑1 ↓1 (ϕ1R+ (ϕ2R (a3 )))
   The category ChuCors is *-autonomous and equivalent to the category of
complete lattices and isotone Galois connections; more results on this category
and its L-fuzzy extensions can be found in [13, 15, 16, 18].


3     Categorical product on ChuCors
In this section, the category ChuCors is proved to contain all finite categorical
products, that is, it is a Cartesian category. To begin with, it is convenient to
recall the notion of categorical product.

Definition 6. Let C1 and C2 be two objects in a category. By a product of C1
and C2 we mean an object P with arrows πi : P → Ci for i ∈ {1, 2} satisfying
the following condition: For any object D and arrows δi : D → Ci for i ∈ {1, 2},
there exists a unique arrow γ : D → P such that πi ◦ γ = δi for all i ∈ {1, 2}.

   The construction will use the notion of disjoint union of two sets S1 ⊎ S2
which can be formally described as ({1} × S1 ) ∪ ({2} × S2 ) and, therefore, their
elements will be denoted as ordered pairs (i, s) where i ∈ {1, 2} and s ∈ Si . Now,
we can proceed with the construction:

Definition 7. Consider C1 = ⟨B1 , A1 , R1 ⟩ and C2 = ⟨B2 , A2 , R2 ⟩ two formal
contexts. The product of such contexts is a new formal context

                       C1 × C2 = ⟨B1 ⊎ B2 , A1 ⊎ A2 , R1×2 ⟩

where the relation R1×2 is given by

           ((i, b), (j, a)) ∈ R1×2 if and only if ((i = j) ⇒ (b, a) ∈ Ri )

for any (b, a) ∈ Bi × Aj and (i, j) ∈ {1, 2} × {1, 2}.
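   A minimal sketch of this construction, with contexts as triples of Python sets
(the tagging by 1 and 2 implements the disjoint union; all names are ours):

def product_context(C1, C2):
    # Definition 7: tag elements to form disjoint unions; pairs with
    # different tags are always incident, same-tag pairs inherit R_i.
    (B1, A1, R1), (B2, A2, R2) = C1, C2
    B = {(1, b) for b in B1} | {(2, b) for b in B2}
    A = {(1, a) for a in A1} | {(2, a) for a in A2}
    Rs = {1: R1, 2: R2}
    R = {((i, b), (j, a)) for (i, b) in B for (j, a) in A
         if i != j or (b, a) in Rs[i]}
    return B, A, R

C1 = ({"b"}, {"a"}, set())            # empty incidence relation
C2 = ({"c"}, {"d"}, {("c", "d")})
B, A, R = product_context(C1, C2)
print(((1, "b"), (1, "a")) in R)      # False: same tag and (b, a) not in R1
print(((1, "b"), (2, "d")) in R)      # True: mixed tags are always incident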

Lemma 1. The above defined contextual product fulfills the property of the cat-
egorical product on the category ChuCors.

Proof. We define the projection arrows ⟨πiL , πiR ⟩ ∈ Chu(C1 ×C2 , Ci ) for i ∈ {1, 2}
as follows
 – πiL : B1 ⊎ B2 → Ext(Ci ) ⊆ 2^{Bi}
 – πiR : Ai → Int(C1 × C2 ) ⊆ 2^{A1∪A2}
 – such that for any (k, x) ∈ B1 ⊎ B2 and ai ∈ Ai the following equality holds

                        ↑i (πiL (k, x))(ai ) = ↓1×2 (πiR (ai ))(k, x)
         Using the Chu construction for generalizing formal concept analysis        151


The definition of the projections is given below:

    πiL (k, x)(bi ) = ↓i ↑i (χx )(bi ) if k = i, and ↓i ↑i (0)(bi ) if k ≠ i,
        for any (k, x) ∈ B1 ⊎ B2 and bi ∈ Bi ;
    πiR (ai )(k, y) = ↑i ↓i (χai )(y) if k = i, and ↑k ↓k (0)(y) if k ≠ i,
        for any (k, y) ∈ A1 ⊎ A2 and ai ∈ Ai .
    The proof that the definitions above actually provide a Chu correspondence
is just a long, although straightforward, computation and it is omitted.
    Now, one has to show that for any formal context D = ⟨E, F, G⟩, where
G ⊆ E × F, and any pair of arrows (δ1, δ2) with δi : D → Ci for i ∈ {1, 2},
there exists a unique morphism γ : D → C1 × C2 such that the following diagram
commutes:
                          π1            π2
                    C1 ←——— C1 × C2 ———→ C2
                       ↖        ↑       ↗
                         δ1     γ     δ2
                                D
We give just the definition of γ as a pair of mappings γL : E → 2^{B1 ⊎ B2} and
γR : A1 ⊎ A2 → 2^F:

 – γL (e)(k, x) = δkL (e)(x) for any e ∈ E and (k, x) ∈ B1 ] B2 .
 – γR (k, y)(f ) = δkR (y)(f ) for any f ∈ F and (k, y) ∈ A1 ] A2 .

   Checking the condition of categorical product is again straightforward but
long and tedious and, hence, it is omitted.                                 □

    We have just proved that binary products exist, but a cartesian category
requires the existence of all finite products. Recalling the well-known categorical
theorem which states that if a category has a terminal object and binary products,
then it has all finite products, we just have to prove the existence of a terminal
object (namely, the nullary product) in order to prove that ChuCors is cartesian.
    Any formal context of the form ⟨B, A, B × A⟩, where the incidence relation
is the full cartesian product of the sets of objects and attributes, is (isomorphic
to) the terminal object of ChuCors. Such a formal context has just one formal
concept ⟨B, A⟩; hence, from any other formal context there is just one Chu
correspondence to ⟨B, A, B × A⟩.
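
This claim is easy to confirm by brute force on small instances. The naive
enumeration below is our sketch (not code from the paper); it computes all formal
concepts of a full context and finds exactly one.

    # Naive check (our sketch) that a full context has a single formal concept.
    from itertools import chain, combinations

    def concepts(B, A, R):
        up   = lambda X: {a for a in A if all((b, a) in R for b in X)}
        down = lambda Y: {b for b in B if all((b, a) in R for a in Y)}
        subsets = chain.from_iterable(combinations(B, k) for k in range(len(B) + 1))
        return {(frozenset(down(up(set(X)))), frozenset(up(set(X)))) for X in subsets}

    B, A = {1, 2}, {'a', 'b'}
    assert len(concepts(B, A, {(b, a) for b in B for a in A})) == 1   # only <B, A>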


4    Tensor product and its bifunctor property
Apart from the categorical product, another product-like construction can be
given in the category ChuCors, for which the notion of transposed context C ∗ is
needed.
    Given a formal context C = ⟨B, A, R⟩, its transposed context is C∗ = ⟨A, B, Rt⟩,
where Rt(a, b) holds iff R(b, a) holds. Now, if ϕ ∈ Chu(C1, C2), one can consider
ϕ∗ ∈ Chu(C2∗, C1∗) defined by ϕ∗L = ϕR and ϕ∗R = ϕL.


Definition 8. The tensor product of formal contexts Ci = ⟨Bi, Ai, Ri⟩ for i ∈
{1, 2} is defined as the formal context C1 ⊗ C2 = ⟨B1 × B2, Chu(C1, C2∗), R⊗⟩ where

                         R⊗((b1, b2), ϕ) = ↓2(ϕL(b1))(b2).

    Mori studied in [18] the properties of the tensor product above, and proved
that ChuCors with ⊗ is a symmetric monoidal category. Those results were
later extended to the L-fuzzy case in [13]. In both papers, the structure of the
formal concepts of a product context was established as an ordered pair formed
by a bond and a set of Chu correspondences.

Lemma 2. Let Ci = ⟨Bi, Ai, Ri⟩ for i ∈ {1, 2} be two formal contexts, and let
⟨β, X⟩ ∈ Bonds(C1, C2∗) × 2^{Chu(C1, C2∗)} be an arbitrary formal concept of C1 ⊗ C2.
Then β = ⋀_{ψ∈X} βψ and X = {ψ ∈ Chu(C1, C2∗) | β ≤ βψ}.

Proof. Let X be an arbitrary subset of Chu(C1, C2∗). Then, for all (b1, b2) ∈
B1 × B2, we have

    ↓_{C1⊗C2}(X)(b1, b2) = ⋀_{ψ∈Chu(C1,C2∗)} ( (ψ ∈ X) ⇒ ↓2(ψL(b1))(b2) )
                         = ⋀_{ψ∈X} ↓2(ψL(b1))(b2) = ⋀_{ψ∈X} βψ(b1, b2)

Let β be an arbitrary subset of B1 × B2. Then, for all ψ ∈ Chu(C1, C2∗),

    ↑_{C1⊗C2}(β)(ψ) = ⋀_{(b1,b2)∈B1×B2} ( β(b1, b2) ⇒ ↓2(ψL(b1))(b2) )
                    = ⋀_{(b1,b2)∈B1×B2} ( β(b1, b2) ⇒ βψ(b1, b2) )

Hence ↑_{C1⊗C2}(β) = {ψ ∈ Chu(C1, C2∗) | β ≤ βψ}.                              □

   We now introduce the notion of the product of a context with a Chu corre-
spondence.

Definition 9. Let Ci = ⟨Bi, Ai, Ri⟩ for i ∈ {0, 1, 2} be formal contexts, and
consider ϕ ∈ Chu(C1, C2). Then, the pair of mappings

    (C0 ⊗ ϕ)L : B0 × B1 → 2^{B0×B2}      (C0 ⊗ ϕ)R : Chu(C0, C2∗) → 2^{Chu(C0, C1∗)}

is defined as follows:
 – (C0 ⊗ ϕ)L(b, b1)(o, b2) = ↓_{C0⊗C2}↑_{C0⊗C2}(γϕ^{b,b1})(o, b2), where
   γϕ^{b,b1}(o, b2) = ((b = o) ∧ ϕL(b1)(b2)), for any b, o ∈ B0 and bi ∈ Bi with i ∈ {1, 2};
 – (C0 ⊗ ϕ)R(ψ2)(ψ1) = (ψ1 ≤ (ψ2 ◦ ϕ∗)), for any ψi ∈ Chu(C0, Ci∗).

    As one could expect, the result is a Chu correspondence between the products
of the contexts. Specifically:


Lemma 3. Let Ci = ⟨Bi, Ai, Ri⟩ be formal contexts for i ∈ {0, 1, 2}, and con-
sider ϕ ∈ Chu(C1, C2). Then C0 ⊗ ϕ ∈ Chu(C0 ⊗ C1, C0 ⊗ C2).

Proof. (C0 ⊗ ϕ)L(b, b1) ∈ Ext(C0 ⊗ C2) for any (b, b1) ∈ B0 × B1 follows directly
from its definition; (C0 ⊗ ϕ)R(ψ) ∈ Int(C0 ⊗ C1) for any ψ ∈ Chu(C0, C2∗) follows
from Lemma 2.
   Consider an arbitrary b ∈ B0, b1 ∈ B1 and ψ2 ∈ Chu(C0, C2∗):

  ↑_{C0⊗C2}((C0 ⊗ ϕ)L(b, b1))(ψ2)
      = ↑_{C0⊗C2}↓_{C0⊗C2}↑_{C0⊗C2}(γϕ^{b,b1})(ψ2)
      = ↑_{C0⊗C2}(γϕ^{b,b1})(ψ2)
      = ⋀_{(o,b2)∈B0×B2} ( γϕ^{b,b1}(o, b2) ⇒ ↓(ψ2R(b2))(o) )
      = ⋀_{(o,b2)∈B0×B2} ( ((o = b) ∧ ϕL(b1)(b2)) ⇒ ↓(ψ2R(b2))(o) )
      = ⋀_{o∈B0} ⋀_{b2∈B2} ( (o = b) ⇒ (ϕL(b1)(b2) ⇒ ↓(ψ2R(b2))(o)) )
      = ⋀_{o∈B0} ( (o = b) ⇒ ⋀_{b2∈B2} (ϕL(b1)(b2) ⇒ ↓(ψ2R(b2))(o)) )
      = ⋀_{b2∈B2} ( ϕL(b1)(b2) ⇒ ↓(ψ2R(b2))(b) )
      = ⋀_{b2∈B2} ( ϕL(b1)(b2) ⇒ ⋀_{a∈A} (ψ2R(b2)(a) ⇒ R(b, a)) )
      = ⋀_{a∈A} ( ⋁_{b2∈B2} (ϕL(b1)(b2) ∧ ψ2R(b2)(a)) ⇒ R(b, a) )
      = ⋀_{a∈A} ( ψ2R+(ϕL(b1))(a) ⇒ R(b, a) )
      = ↓(ψ2R+(ϕL(b1)))(b) = ↓↑↓(ψ2R+(ϕL(b1)))(b) = ↓((ϕ ◦ ψ2)R(b1))(b)

Note the use above of the extended mapping as given in Definition 5 in relation
to the composition of Chu correspondences.
    On the other hand, we have

  ↓_{C0⊗C1}((C0 ⊗ ϕ)R(ψ2))(b, b1)
      = ⋀_{ψ1∈Chu(C0,C1∗)} ( (C0 ⊗ ϕ)R(ψ2)(ψ1) ⇒ ↓(ψ1R(b1))(b) )
      = ⋀_{ψ1∈Chu(C0,C1∗)} ( (ψ1 ≥ ϕ ◦ ψ2) ⇒ ↓(ψ1R(b1))(b) )
      = ⋀_{ψ1∈Chu(C0,C1∗), ψ1≥ϕ◦ψ2} ↓(ψ1R(b1))(b) = ↓((ϕ ◦ ψ2)R(b1))(b)

   Hence ↑_{C0⊗C2}((C0 ⊗ ϕ)L(b, b1))(ψ2) = ↓_{C0⊗C1}((C0 ⊗ ϕ)R(ψ2))(b, b1). So if
ϕ ∈ Chu(C1, C2), then C0 ⊗ ϕ ∈ Chu(C0 ⊗ C1, C0 ⊗ C2).                          □
     Given a fixed formal context C, the tensor product C ⊗ (−) forms a mapping
between objects of ChuCors, assigning to any formal context D the formal context
C ⊗ D. Moreover, to any arrow ϕ ∈ Chu(C1, C2) it assigns an arrow C ⊗ ϕ ∈
Chu(C ⊗ C1, C ⊗ C2). We will show that this mapping preserves the unit arrows
and the composition of Chu correspondences; hence the mapping forms an
endofunctor on ChuCors, that is, a covariant functor from the category ChuCors to itself.
     To begin with, let us recall the definition of a functor between two categories:
Definition 10 (See [6]). A covariant functor F : C → D between categories C
and D is a mapping of objects to objects and arrows to arrows, in such a way
that:
 – For any morphism f : A → B, one has F (f ) : F (A) → F (B)
 – F (g ◦ f ) = F (g) ◦ F (f )
 – F (1A ) = 1F (A) .
Lemma 4. Let C = ⟨B, A, R⟩ be a formal context. Then C ⊗ (−) is an endofunctor
on ChuCors.
Proof. Consider the unit morphism ιC1 of a formal context C1 = ⟨B1, A1, R1⟩,
and let us show that (C ⊗ ιC1) = ιC⊗C1; in other words, C ⊗ (−) respects unit
arrows in ChuCors.

  ↑_{C⊗C1}((C ⊗ ιC1)(b, b1))(ψ)
      = ⋀_{(o,o1)∈B×B1} ( ((o = b) ∧ ιC1L(b1)(o1)) ⇒ ↓1(ψL(o))(o1) )
      = ⋀_{o1∈B1} ( ↓1↑1(χb1)(o1) ⇒ ↓1(ψL(b))(o1) )
      = ⋀_{o1∈B1} ( ↓1↑1(χb1)(o1) ⇒ ⋀_{a1∈A1} (ψL(b)(a1) ⇒ R1(o1, a1)) )
      = ⋀_{o1∈B1} ⋀_{a1∈A1} ( ↓1↑1(χb1)(o1) ⇒ (ψL(b)(a1) ⇒ R1(o1, a1)) )
      = ⋀_{o1∈B1} ⋀_{a1∈A1} ( ψL(b)(a1) ⇒ (↓1↑1(χb1)(o1) ⇒ R1(o1, a1)) )
      = ⋀_{a1∈A1} ( ψL(b)(a1) ⇒ ⋀_{o1∈B1} (↓1↑1(χb1)(o1) ⇒ R1(o1, a1)) )
      = ⋀_{a1∈A1} ( ψL(b)(a1) ⇒ ↑1↓1↑1(χb1)(a1) )
      = ⋀_{a1∈A1} ( ψL(b)(a1) ⇒ R1(b1, a1) ) = ↓1(ψL(b))(b1)



and, on the other hand, we have
  ↑_{C⊗C1}(ιC⊗C1(b, b1))(ψ) = ↑_{C⊗C1}(χ(b,b1))(ψ)
      = ⋀_{(o,o1)∈B×B1} ( χ(b,b1)(o, o1) ⇒ ↓1(ψL(o))(o1) )
      = ↓1(ψL(b))(b1)
As a result, we have obtained ↑_{C⊗C1}((C ⊗ ιC1)(b, b1))(ψ) = ↑_{C⊗C1}(ιC⊗C1(b, b1))(ψ)
for any (b, b1) ∈ B × B1 and any ψ ∈ Chu(C, C1∗); hence, ιC⊗C1 = (C ⊗ ιC1).
    We will now show that C ⊗ (−) preserves the composition of arrows. Specif-
ically, this means that for any two arrows ϕi ∈ Chu(Ci, Ci+1) for i ∈ {1, 2} it
holds that C ⊗ (ϕ1 ◦ ϕ2) = (C ⊗ ϕ1) ◦ (C ⊗ ϕ2).

  ↑_{C⊗C3}((C ⊗ (ϕ1 ◦ ϕ2))L(b, b1))(ψ3)
      = ⋀_{(o,b3)∈B×B3} ( ((o = b) ∧ (ϕ1 ◦ ϕ2)L(b1)(b3)) ⇒ ↓(ψ3R(b3))(o) )
      = ⋀_{b3∈B3} ( (ϕ1 ◦ ϕ2)L(b1)(b3) ⇒ ↓(ψ3R(b3))(b) )
        (by similar operations to those in the first part of the proof)
      = ↓((ϕ1 ◦ ϕ2 ◦ ψ3)L(b1))(b)
On the other hand, writing F for C ⊗ (−) in order to simplify the resulting
expressions, we have

  ↑_{F C3}((F ϕ1 ◦ F ϕ2)L(b, b1))(ψ3)
      = ↑_{F C3}↓_{F C3}↑_{F C3}((F ϕ2)L+((F ϕ1)L(b, b1)))(ψ3)
      = ⋀_{(o,b3)∈B×B3} ( ⋁_{(j,b2)∈B×B2} ((F ϕ1)L(b, b1)(j, b2) ∧ (F ϕ2)L(j, b2)(o, b3)) ⇒ ↓(ψ3R(b3))(o) )
      = ⋀_{b3∈B3} ⋀_{b2∈B2} ( (ϕ1L(b1)(b2) ∧ ϕ2L(b2)(b3)) ⇒ ↓(ψ3R(b3))(b) )
      = ⋀_{b3∈B3} ( ⋁_{b2∈B2} (ϕ1L(b1)(b2) ∧ ϕ2L(b2)(b3)) ⇒ ↓(ψ3R(b3))(b) )
      = ⋀_{b3∈B3} ( ϕ2L+(ϕ1L(b1))(b3) ⇒ ↓(ψ3R(b3))(b) )
      = ⋀_{b3∈B3} ( (ϕ1 ◦ ϕ2)L(b1)(b3) ⇒ ↓(ψ3R(b3))(b) )



From the previous equalities we see that C ⊗ (ϕ1 ◦ ϕ2) = (C ⊗ ϕ1) ◦ (C ⊗ ϕ2);
hence, composition is preserved.
   As a result, the mapping C ⊗ (−) forms a functor from ChuCors to itself.  □
   All the previous computations can be applied to the first argument without
any problems; hence, we can directly state the following proposition.

Proposition 1. The tensor product forms a bifunctor − ⊗ − from ChuCors ×
ChuCors to ChuCors.


5      The Chu construction on ChuCors and second order
       formal concept analysis
A second order formal context [14] focuses on the external formal contexts and
serves as a bridge between the L-fuzzy [3, 7] and heterogeneous [4] frameworks.
Definition 11. Consider two non-empty index sets I and J and an L-fuzzy
formal context ⟨⋃_{i∈I} Bi, ⋃_{j∈J} Aj, r⟩, whereby
 – Bi1 ∩ Bi2 = ∅ for any distinct i1, i2 ∈ I,
 – Aj1 ∩ Aj2 = ∅ for any distinct j1, j2 ∈ J,
 – r : ⋃_{i∈I} Bi × ⋃_{j∈J} Aj → L.
Moreover, consider two non-empty sets of L-fuzzy formal contexts (external for-
mal contexts) denoted by
 – {⟨Bi, Ti, pi⟩ : i ∈ I}, whereby Ci = ⟨Bi, Ti, pi⟩,
 – {⟨Oj, Aj, qj⟩ : j ∈ J}, whereby Dj = ⟨Oj, Aj, qj⟩.
A second order formal context is a tuple

    ⟨⋃_{i∈I} Bi, {Ci : i ∈ I}, ⋃_{j∈J} Aj, {Dj : j ∈ J}, ⋃_{(i,j)∈I×J} ri,j⟩,

whereby ri,j : Bi × Aj → L is defined as ri,j(o, a) = r(o, a) for any o ∈ Bi and
a ∈ Aj.
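
The following Python sketch (our illustration; the names and the toy data are
not from the paper) mirrors Definition 11 as a plain data structure, with each
restriction ri,j obtained by slicing the global relation r.

    # Structural sketch of a second order formal context (Definition 11).
    def restrict(r, Bi, Aj):
        """r_{i,j}: the L-valued relation r restricted to Bi x Aj."""
        return {(o, a): r[(o, a)] for o in Bi for a in Aj}

    # toy data over L = {0, 1}: two object blocks, one attribute block
    B = {1: {'o1'}, 2: {'o2'}}
    A = {1: {'a1', 'a2'}}
    r = {(o, a): 1 if o == 'o1' else 0 for o in B[1] | B[2] for a in A[1]}
    assert restrict(r, B[1], A[1]) == {('o1', 'a1'): 1, ('o1', 'a2'): 1}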
    The Chu construction [8] is a theoretical process that, from a symmetric
monoidal closed (autonomous) category and a dualizing object, generates a *-
autonomous category. The basic theory of *-autonomous categories and their
properties is given in [5, 6].
    In the following, the construction will be applied taking ChuCors and the dual-
izing object ⊥ = ⟨{}, {}, ≠⟩ as inputs. In this section it is shown how second
order FCA [14] is connected to the output of such a construction.
    The category generated by the Chu construction from ChuCors and ⊥ will be
denoted by CHU(ChuCors, ⊥):


 – Its objects are triplets of the form ⟨C, D, ρ⟩ where
     • C and D are objects of the input category ChuCors (i.e. formal contexts),
     • ρ is an arrow in Chu(C ⊗ D, ⊥).
 – Its morphisms are pairs of the form ⟨ϕ, ψ⟩ : ⟨C1, C2, ρ1⟩ → ⟨D1, D2, ρ2⟩ where
   Ci and Di are formal contexts for i ∈ {1, 2} and
     • ϕ and ψ are elements of Chu(C1, D1) and Chu(D2, C2), respectively,
       such that the following diagram commutes

                          C1 ⊗ ψ
                C1 ⊗ D2 ————————→ C1 ⊗ C2
                ϕ ⊗ D2 │              │ ρ1
                       ↓              ↓
                D1 ⊗ D2 ——————————→  ⊥
                            ρ2

         or, equivalently, the following equality holds:

                    (C1 ⊗ ψ) ◦ ρ1 = (ϕ ⊗ D2) ◦ ρ2

   There are some interesting facts about the previous construction with respect to
second order FCA [14]:

1. To begin with, every object ⟨C1, C2, ρ⟩ in CHU(ChuCorsL, ⊥), where ρ ∈
   Chu(C1 ⊗ C2, ⊥), can be represented as a second order formal context
   (from Definition 11). Simply take into account that, from basic properties of
   the tensor product, we can obtain Chu(C1 ⊗ C2, ⊥) ≅ Chu(C1, C2∗).
   Specifically, as ChuCors is a closed monoidal category, for every three formal
   contexts C1, C2, C3 the following isomorphism holds:

                   ChuCors(C1 ⊗ C2, C3) ≅ ChuCors(C1, C2 ⊸ C3),

   whereby C2 ⊸ C3 denotes the value at C3 of the right adjoint; recall that
   C2 ⊸ ⊥ ≅ C2∗ because ChuCors is *-autonomous. Further details about
   closed monoidal categories and the corresponding notation can be found in [6].
2. Similarly, any second order formal context (from Definition 11) is repre-
   sentable by an object of CHU(ChuCorsL, ⊥).


6   Conclusions and future work

After introducing the basic definitions needed from category theory and formal
concept analysis, in this paper we have studied two different product construc-
tions in the category ChuCors, namely the categorical product and the tensor
product. The existence of products allows one to represent tables and, hence,
binary relations; the tensor product is proved to fulfill the required properties
of a bifunctor, which enables us to consider the Chu construction on the cat-
egory ChuCors. As a first application, we have sketched the representation of
second order formal concept analysis [14] in terms of the Chu construction on
the category ChuCors.
    The use of different subcategories of ChuCors as input to the Chu construc-
tion seems to be an interesting way of obtaining different existing generalizations
of FCA. For future work, we plan to provide representations based on the
Chu construction for one-sided FCA, heterogeneous FCA, multi-adjoint FCA,
etc.

References
 1. S. Abramsky. Coalgebras, Chu Spaces, and Representations of Physical Systems.
    Journal of Philosophical Logic, 42(3):551–574, 2013.
 2. S. Abramsky. Big Toy Models: Representing Physical Systems As Chu Spaces.
    Synthese, 186(3):697–718, 2012.
 3. C. Alcalde, A. Burusco, R. Fuentes-González, The use of two relations in L-fuzzy
    contexts. Information Sciences, 301:1–12, 2015.
 4. L. Antoni, S. Krajči, O. Krı́dlo, B. Macek, L. Pisková, On heterogeneous formal
    contexts. Fuzzy Sets and Systems, 234:22–33, 2014.
 5. M. Barr, *-Autonomous categories, vol. 752 of Lecture Notes in Mathematics.
    Springer-Verlag, 1979.
 6. M. Barr, Ch. Wells, Category theory for computing science, 2nd ed., Prentice Hall
    International (UK) Ltd., 1995.
 7. R. Bělohlávek. Concept lattices and order in fuzzy logic. Annals of Pure and
    Applied Logic, 128:277–298, 2004.
 8. P.-H. Chu, Constructing *-autonomous categories. Appendix to [5], pages 103–107.
 9. J. T. Denniston, A. Melton, and S. E. Rodabaugh. Formal concept analysis and
    lattice-valued Chu systems. Fuzzy Sets and Systems, 216:52–90, 2013.
10. P. Hitzler and G.-Q. Zhang. A cartesian closed category of approximable concept
    structures. Lecture Notes in Computer Science, 3127:170–185, 2004.
11. M. Huang, Q. Li, and L. Guo. Formal Contexts for Algebraic Domains. Electronic
    Notes in Theoretical Computer Science, 301:79–90, 2014.
12. S. Krajči. A categorical view at generalized concept lattices. Kybernetika,
    43(2):255–264, 2007.
13. O. Krı́dlo, S. Krajči, and M. Ojeda-Aciego. The category of L-Chu correspondences
    and the structure of L-bonds. Fundamenta Informaticae, 115(4):297–325, 2012.
14. O. Krı́dlo, P. Mihalčin, S. Krajči, and L. Antoni. Formal concept analysis of higher
    order. Proceedings of Concept Lattices and their Applications (CLA), 117–128,
    2013.
15. O. Krı́dlo and M. Ojeda-Aciego. On L-fuzzy Chu correspondences. Intl J of
    Computer Mathematics, 88(9):1808–1818, 2011.
16. O. Krı́dlo and M. Ojeda-Aciego. Revising the link between L-Chu Correspondences
    and Completely Lattice L-ordered Sets. Annals of Mathematics and Artificial
    Intelligence 72:91–113, 2014.
17. M. Krötzsch, P. Hitzler, and G.-Q. Zhang. Morphisms in context. Lecture Notes
    in Computer Science, 3596:223–237, 2005.
18. H. Mori. Chu correspondences. Hokkaido Mathematical Journal, 37:147–214, 2008.
19. J.G. Stell. Formal Concept Analysis over Graphs and Hypergraphs. Lecture Notes
    in Computer Science, 8323:165–179, 2014.
20. G.-Q. Zhang and G. Shen. Approximable concepts, Chu spaces, and information
    systems. Theory and Applications of Categories, 17(5):80–102, 2006.
     From formal concepts to analogical complexes

                        Laurent Miclet1 and Jacques Nicolas2
         1
             Université de Rennes 1, UMR IRISA, Dyliss team, Rennes, France,
                                 miclet@univ-rennes1.fr
                   2
                     Inria Rennes, France, jacques.nicolas@inria.fr



        Abstract. Reasoning by analogy is an important component of common
        sense reasoning whose formalization has undergone recent improvements
        with the logical and algebraic study of the analogical proportion. The
        starting point of this study considers analogical proportions on a formal
        context. We introduce analogical complexes, a companion of formal con-
        cepts formed by using analogy between four subsets of objects in place
        of the initial binary relation. They represent subsets of objects and at-
        tributes that share a maximal analogical relation. We show that the set of
        all complexes can be structured in an analogical complex lattice and give
        explicit formulae for the computation of their infimum and supremum.

        Keywords: analogical reasoning, analogical proportion, formal concept,
        analogical complex, lattice of analogical complexes


 1    Introduction

 Analogical reasoning [4] plays an important role in human reasoning. It en-
 ables us to draw plausible conclusions by exploiting parallels between situations,
 and as such has been studied in AI for a long time, e.g., [5, 9] under various
 approaches [3]. A key pattern which is associated with the idea of analogical
 reasoning is the notion of analogical proportion (AP), i.e. a statement between
 two pairs (A, B) and (C, D) of the form 'A is to B as C is to D', where all
 elements A, B, C, D are in the same category.
     However, it is only in the last decade that researchers working in computa-
 tional linguistics have started to study these proportions in a formal way [6, 17,
 19]. More recently, analogical proportions have been shown to be of particu-
 lar interest for classification tasks [10] or for solving IQ tests [2]. Moreover, in
 the last five years, there have been a number of works, e.g., [11, 15], studying the
 propositional logic modeling of analogical proportions.
     In all previous cases, the ability to work on the set of all possible analogical
 proportions is required, either for checking missing objects or attributes, for
 making informed recommendations, or more generally for ensuring the completeness
 and efficiency of reasoning. In practice, the analysis of objects described by binary
 attributes, such as those studied by Formal Concept Analysis, is an important
 and easy setting where APs are used. The question is whether it is possible to
 obtain a good representation of the space of all APs by applying the principles of



FCA. A heuristic algorithm to discover such proportions by inspecting a lattice of
formal concepts has been proposed in [14]. Moreover, a definition of an analogical
proportion between formal concepts has been given in [13], as a particular case
of proportions between elements of a lattice, studied also in [18].
    In this paper, we are interested in a slightly different task involving a more
integrated view of concept categorization and analogy: looking for the structure
of the space of all APs. Our goal is to build an extension of formal concepts that
takes the presence of analogical proportions as the founding relation instead
of the initial binary relation between objects and attributes. We call this ex-
tension analogical complexes; they isolate subcontexts of formal contexts whose
structure reflects the existence of a maximal analogical proportion
between subsets of objects and subsets of attributes.


2     Basics on Analogical Proportion

Definition 1 (Analogical proportion [7, 12]). An analogical proportion (AP)
on a set X is a quaternary relation on X, i.e. a subset of X⁴, whose elements
(x, y, z, t), written x : y :: z : t and read 'x is to y as z is to t', must obey the
following two axioms:

1. Symmetry of ’as’: x : y :: z : t ⇔ z : t :: x : y
2. Exchange of means: x : y :: z : t ⇔ x : z :: y : t

In the case of formal contexts, objects are described by Boolean attributes. An AP
(x, y, z, t) between four Boolean variables exists if the following formula is true:

                   (x ∧ ¬y) ⇔ (z ∧ ¬t) and (y ∧ ¬x) ⇔ (t ∧ ¬z)

   Basically, the formula expresses that the dissimilarity observed between x
and y is the same as the dissimilarity between z and t. An equivalent formula is

                    x 6= y ⇔ (x = z ∧ y = t) and x = y ⇔ z = t

It has 6 models among the 16 possible Boolean 4-tuples. Note that this
includes the trivial cases where x = y = z = t. Since in this paper we are only
interested in non-trivial analogical proportions, we further require that x ≠ t
and y ≠ z. This reduces the number of possible Boolean 4-tuples in AP to four,
and it leads to the notion of analogical schema that we will use for the definition
of analogical complexes.

Definition 2 (Analogical schema). The binary matrix

         [ 0 0 1 1 ]
    AS = [ 0 1 0 1 ]
         [ 1 0 1 0 ]
         [ 1 1 0 0 ]

is called an analogical schema. We write AS(i, j) if the value at row i and column
j of matrix AS is 1 (e.g. AS(1,3) and AS(1,4)).
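
Both model counts quoted above are easy to verify mechanically, and the four
non-trivial 4-tuples are exactly the columns (and rows) of the matrix AS. The
Python sketch below is ours, purely for verification.

    # Verification (our sketch) of the Boolean AP model counts quoted above.
    from itertools import product

    def is_ap(x, y, z, t):
        # (x and not y) <=> (z and not t)  and  (y and not x) <=> (t and not z)
        return (x and not y) == (z and not t) and (y and not x) == (t and not z)

    models = [q for q in product([False, True], repeat=4) if is_ap(*q)]
    assert len(models) == 6
    nontrivial = [(x, y, z, t) for (x, y, z, t) in models if x != t and y != z]
    assert len(nontrivial) == 4    # the four column/row profiles of AS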


     The analogical schema may be seen as a formal context on four objects o1,
o2, o3, o4 that are in the non-trivial AP o1 : o2 :: o3 : o4. Figure 1 shows the
associated concept lattice. In this lattice, A ∧ D = B ∧ C and A ∨ D = B ∨ C.
The figure also gives names for the column and row profiles, which we call object
and attribute types: for instance, the first column has type 1 and the second row
has type b.


[Figure 1]

Fig. 1. Left: concept lattice of an analogical schema (reduced labeling). Right:
analogical schema with object and attribute types.



    We use in this paper the zoo dataset proposed by R. Forsyth [8] for illustra-
tion purposes. We call smallzoo the formal context extracted from this database,
restricted to attributes 2 to 9 and to the objects of the two largest classes, 1
and 2. Moreover, this context has been clarified and we have arbitrarily chosen
one object for each of the 10 different attribute profiles. The corresponding
table is given below.

   smallzoo    2       3     4    5       6       7       8        9     18
              hair feathers eggs milk airborne aquatic predator toothed type
   1 aardvark 1        0     0    1       0       0       1        1      1
  12 chicken 0         1     1    0       1       0       0        0      2
  17     crow 0        1     1    0       1       0       1        0      2
  20 dolphin 0         0     0    1       0       1       1        1      1
  22     duck 0        1     1    0       1       1       0        0      2
  28 fruitbat 1        0     0    1       1       0       0        1      1
  42      kiwi 0       1     1    0       0       0       1        0      2
  49     mink 1        0     0    1       0       1       1        1      1
  59 penguin 0         1     1    0       0       1       1        0      2
  64 platypus 1        0     1    1       0       1       1        0      1
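
For the checks that follow, the table can be encoded directly. The sketch below
(ours, not from the paper) stores smallzoo as a dictionary mapping each object
identifier to its attribute set.

    # Encoding (our sketch) of the smallzoo context; attributes numbered 2..9.
    smallzoo = {
        1:  {2, 5, 8, 9},       # aardvark
        12: {3, 4, 6},          # chicken
        17: {3, 4, 6, 8},       # crow
        20: {5, 7, 8, 9},       # dolphin
        22: {3, 4, 6, 7},       # duck
        28: {2, 5, 6, 9},       # fruitbat
        42: {3, 4, 8},          # kiwi
        49: {2, 5, 7, 8, 9},    # mink
        59: {3, 4, 7, 8},       # penguin
        64: {2, 4, 5, 7, 8},    # platypus
    }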


   The formal concept lattice is provided in figure 2, as computed by FCA
Extension [16]. It contains 31 elements. The central elements (those with at least
two objects and two attributes) are listed below:

    c(3)  = ⟨{20; 49; 59; 64}, {7; 8}⟩        c(6)  = ⟨{1; 20; 28; 49}, {5; 9}⟩
    c(7)  = ⟨{1; 20; 49; 64}, {5; 8}⟩         c(8)  = ⟨{1; 20; 49}, {5; 8; 9}⟩
    c(9)  = ⟨{20; 49; 64}, {5; 7; 8}⟩         c(10) = ⟨{20; 49}, {5; 7; 8; 9}⟩
    c(12) = ⟨{17; 42; 59; 64}, {4; 8}⟩        c(13) = ⟨{22; 59; 64}, {4; 7}⟩
    c(14) = ⟨{59; 64}, {4; 7; 8}⟩             c(15) = ⟨{12; 17; 22; 42; 59}, {3; 4}⟩
    c(16) = ⟨{17; 42; 59}, {3; 4; 8}⟩         c(17) = ⟨{22; 59}, {3; 4; 7}⟩
    c(19) = ⟨{12; 17; 22}, {3; 4; 6}⟩         c(22) = ⟨{1; 28; 49; 64}, {2; 5}⟩
    c(23) = ⟨{1; 28; 49}, {2; 5; 9}⟩          c(24) = ⟨{1; 49; 64}, {2; 5; 8}⟩
    c(25) = ⟨{1; 49}, {2; 5; 8; 9}⟩           c(26) = ⟨{49; 64}, {2; 5; 7; 8}⟩


Example 1. If one extracts from smallzoo the subcontext crossing (12, 28, 59, 49),
that is, (chicken, fruitbat, penguin, mink), and (7, 2, 3, 6), that is, (aquatic, hair,
feathers, airborne), it is clearly an analogical schema.
    The 4-tuple (chicken : fruitbat :: penguin : mink) is an analogical proportion
that finds a support using the attributes (aquatic, hair, feathers, airborne). Each
attribute reflects one of the four possible types of Boolean analogy. For instance,
hair is false for chicken and penguin and true for fruitbat and mink, whereas
feathers is true for chicken and penguin and false for fruitbat and mink. The
observed analogy can be explained thanks to this typology: the dissimilarity
between chicken and fruitbat based on the opposition feathers/hair is the same
as the dissimilarity between penguin and mink, and there are two other opposite
attributes, airborne and aquatic, that explain the similarity within each 'is to'
relation. Note that the analogical schema is fully symmetric, and thus one could
also in principle write an AP between attributes: hair : feathers :: aquatic : airborne.
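
Using the smallzoo encoding given after the table above, this observation can be
checked mechanically (our sketch; the matrix AS is the one of Definition 2):

    # Check (our sketch) that the subcontext of Example 1 matches the schema AS.
    AS = [[0, 0, 1, 1],
          [0, 1, 0, 1],
          [1, 0, 1, 0],
          [1, 1, 0, 0]]
    objects, attributes = [12, 28, 59, 49], [7, 2, 3, 6]
    assert all((attributes[j] in smallzoo[objects[i]]) == bool(AS[i][j])
               for i in range(4) for j in range(4))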


3     An analogical complex is to an analogical proportion as
      a concept is to a binary relation
3.1     Analogical complexes
A formal concept of a context (X, Y, I) is a maximal subcontext for which the
relation I is valid. We define analogical complexes in the same way: they are
maximal subcontexts whose 4-tuples are in AP. This requires splitting the
objects and attributes into four classes.
Definition 3 (Analogical complex). Given a formal context (X, Y, I), a set
of objects O ⊆ X with O = O1 ∪ O2 ∪ O3 ∪ O4, and a set of attributes A ⊆ Y
with A = A1 ∪ A2 ∪ A3 ∪ A4, the subcontext (O, A) forms an analogical complex
(O1,4, A1,4) iff




Fig. 2. Formal concept lattice of formal context smallzoo. Drawing from Concept Explorer [20].


1. The binary relation is compatible with the analogical schema AS:
   ∀o ∈ Oi, i = 1..4, ∀a ∈ Aj, j = 1..4: I(o, a) ⇔ AS(i, j).
2. The context is maximal with respect to the first property (⊕ denotes the ex-
   clusive or and \ the set-theoretic difference):
   ∀o ∈ X\O, ∀i ∈ [1, 4], ∃j ∈ [1, 4], ∃a ∈ Aj: I(o, a) ⊕ AS(i, j);
   ∀a ∈ Y\A, ∀j ∈ [1, 4], ∃i ∈ [1, 4], ∃o ∈ Oi: I(o, a) ⊕ AS(i, j).


    The first property states that the value of an attribute for an object in a com-
plex is a function of the object type and the attribute type (integers from 1 to 4)
given by the analogical schema. The second property states that adding an object
(resp. an attribute) to the complex would invalidate the first property for at least
one attribute (resp. object) value. Note that analogical schemas and analogical
complexes are defined in a completely symmetric way; thus the roles of objects and
attributes may be interchanged in all properties of analogical complexes.
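
Property 1 is directly executable. The sketch below (ours, reusing the smallzoo
and AS encodings from the earlier sketches) tests a candidate quadruple partition
for compatibility; the candidate shown is the first complex listed in Example 2 below.

    # Check (our sketch) of property 1 of Definition 3 on a candidate subcontext.
    def compatible(O, A, I, AS):
        """O, A: quadruples of object/attribute sets; I(o, a): incidence test."""
        return all(I(o, a) == bool(AS[i][j])
                   for i in range(4) for o in O[i]
                   for j in range(4) for a in A[j])

    I = lambda o, a: a in smallzoo[o]
    O = ({12}, {28}, {59}, {49})
    A = ({7, 8}, {2, 5, 9}, {3, 4}, {6})
    assert compatible(O, A, I, AS)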


Example 2. We extract two subcontexts from smallzoo, highlighting analogical
schemas by sorting rows and columns.

                        A1      A2        A3    A4
                        a7 a8   a2 a5 a9  a3 a4 a6
    O1 o12 (chicken)     0  0    0  0  0   1  1  1
    O2 o28 (fruitbat)    0  0    1  1  1   0  0  1
    O3 o59 (penguin)     1  1    0  0  0   1  1  0
    O4 o49 (mink)        1  1    1  1  1   0  0  0

              A1 A2 A3 A4
              a7       a6
    O1 o12     0        1
       o17     0        1
       o28     0        1
    O2 o12     0        1
       o17     0        1
       o28     0        1
    O3 o20     1        0
       o49     1        0
       o59     1        0
       o64     1        0
    O4 o20     1        0
       o49     1        0
       o59     1        0
       o64     1        0

   These subcontexts are maximal in the sense that it is not possible to add
an object or an attribute without breaking the analogical proportion. They are
associated with the following analogical complexes:

    ⟨({12}, {28}, {59}, {49}), ({7, 8}, {2, 5, 9}, {3, 4}, {6})⟩

    ⟨({12, 17, 28}, {12, 17, 28}, {20, 49, 59, 64}, {20, 49, 59, 64}), ({7}, ∅, ∅, {6})⟩

    The first example provides a strong analogical relation between four animals
in the context smallzoo, since it uses all attributes and all the types of analogy.
Attribute clusters correspond to aquatic predators, toothed animals with hair
and milk, birds (feathers and eggs), and flying animals (airborne). The second
example shows that some of the sets in analogical complexes can be empty; in such a
case some sets may be duplicated. Among all complexes, those that exhibit all
types of analogy are particularly meaningful: we call them complete complexes.


3.2      Complete analogical complexes (CAC)

Definition 4. A complex C = (O1,4 , A1,4 ) is complete if none of its eight sets
are empty.

    By construction, if CA = (O1,4, A1,4) is a complete analogical complex and
if A = ⋃_{i=1,…,4} Ai, the following formula holds:

    ∀(o1, o2, o3, o4) ∈ O1,4, ∀(a1, a2, a3, a4) ∈ A1,4:
        (o1↑ ∩ o4↑) ∩ A = (o2↑ ∩ o3↑) ∩ A = ∅   and   (o1↑ ∪ o4↑) ∩ A = (o2↑ ∪ o3↑) ∩ A = A


    The next proposition shows that CACs exhibit strong discrimination and sim-
ilarity properties among pairs of objects and attributes. The similarity condition
alone would lead to the concatenation of independent (non-overlapping) formal
concepts; the discrimination condition tempers this tendency by requiring the
simultaneous presence of opposite pairs.

Proposition 1. Let us define on a formal context F C = (X, Y, I) the relations:

  discrimination(oi , oj , ak , al ) = I(oi , ak ) ∧ I(oj , al ) ∧ ¬I(oi , al ) ∧ ¬I(oj , ak ).

      similarity(oi , oj , ak , al ) = I(oi , ak ) ∧ I(oj , ak ) ∧ I(oi , al ) ∧ I(oj , al ).
   A complete analogical complex (O1,4 , A1,4 ) in F C corresponds to a maximal
subcontext such that:

1. object pair discrimination (resp. similarity): ∀(oi, oj) ∈ Oi × Oj, i ≠ j,
   ∃(ak, al) ∈ Ak × Al such that discrimination(oi, oj, ak, al)
   (resp. similarity(oi, oj, ak, al));
2. attribute pair discrimination (resp. similarity): ∀(ak, al) ∈ Ak × Al, k ≠ l,
   ∃(oi, oj) ∈ Oi × Oj such that discrimination(oi, oj, ak, al)
   (resp. similarity(oi, oj, ak, al)).

Proof. Since objects and attributes play completely symmetric roles, it is suffi-
cient to prove the proposition for object pairs. It proceeds easily by enumerating
the possible type pairs with different elements. If the objects have types 1 and 2 or
3 and 4, attributes allowing object pair discrimination have types b and c, and
attributes allowing object pair similarity have types a and d. If the objects have types
1 and 3 or 2 and 4, attributes allowing object pair discrimination have types a
and d, and attributes allowing object pair similarity have types b and c. If the objects
have types 1 and 4 and if t1 ∈ T1 = {a, b} and t2 ∈ T2 = {c, d}, attributes allow-
ing object pair discrimination have types t1 and t2, and attributes allowing object
pair similarity have different types both in T1 or both in T2. If the objects have types
2 and 3 and if t1 ∈ T1 = {a, c} and t2 ∈ T2 = {b, d}, attributes allowing object
pair discrimination have types t1 and t2, and attributes allowing object pair sim-
ilarity have different types both in T1 or both in T2.                          □
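
The two relations translate directly into code. The sketch below is ours, with I
the incidence test defined in the previous sketch; it checks one discriminating
pair on smallzoo.

    # Direct encodings (our sketch) of the discrimination/similarity relations.
    def discrimination(I, oi, oj, ak, al):
        return I(oi, ak) and I(oj, al) and not I(oi, al) and not I(oj, ak)

    def similarity(I, oi, oj, ak, al):
        return I(oi, ak) and I(oj, ak) and I(oi, al) and I(oj, al)

    # feathers (3) and hair (2) discriminate chicken (12) from fruitbat (28)
    assert discrimination(I, 12, 28, 3, 2)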

    In the case of incomplete complexes, some of these properties are no longer relevant
and degenerate behaviour may appear: some of the sets may be identical. This
fact allows us to establish a new proposition on complete complexes:

Proposition 2. In a complete analogical complex, side-by-side intersections of
sets are empty.

Proof. This property holds since, when the intersection of two object (resp. at-
tribute) sets in an analogical complex AC is not empty, AC contains at
least two empty attribute (resp. object) sets. This fact is a consequence of prop-
erty 1.
Indeed, if an object belongs to two different types, the profiles of these types must
be the same. The discrimination property ensures that the profiles of two different ob-
ject types differ on at least two attributes with different types (e.g. if
the object has types 1 and 3, attributes of types b and c should have different
values). Thus there cannot exist attributes of the discriminant types (e.g. attributes
of types b and c in the previous case) and the corresponding sets are empty. This
completes the proof.                                                            □
    The converse of the proposition is not true: if all side-by-side intersections
of sets are empty, the complex is not necessarily complete. For instance, consider the
following context:
                                      a1 a2 a3 a4 a5 a6
                                   o1 0 0 0 1 1 1
                                   o2 0 1 1 1 1 1
                                   o3 1 0 0 0 0 1
                                   o4 1 1 1 0 0 0
                                   o5 1 0 1 1 0 0
It contains the following non-complete complex:

    ⟨({o1}, {o2}, {o3}, {o4}), ({a1}, {a2, a3}, ∅, {a4, a5})⟩

4     The lattice of analogical complexes
Definition 5 (Partial order on analogical complexes). Given two ana-
logical complexes C¹ = (O¹1,4, A¹1,4) and C² = (O²1,4, A²1,4), the partial order ≤ is
defined by

    C¹ ≤ C²  iff  (O¹i ⊆ O²i for i = 1, …, 4) and (A²i ⊆ A¹i for i = 1, …, 4).

C¹ is called a sub-complex of C² and C² is called a super-complex of C¹.
    As for formal concepts, the set of all complexes has a lattice structure. Let
us first define a derivation operator on analogical quadruplets:
Definition 6 (Derivation on set quadruplets).
   Let O = O1 ∪ O2 ∪ O3 ∪ O4 be a set of objects partitioned into four subsets, and
let A be a set of attributes. For all i and j ∈ [1, 4], one defines O′ij = {a ∈ A | ∀o ∈
Oi: I(o, a) ⇔ AS(i, j)}.
   Let A = A1 ∪ A2 ∪ A3 ∪ A4 be a set of attributes partitioned into four subsets,
and let O be a set of objects. For all i and j ∈ [1, 4], one defines A′ij = {o ∈ O | ∀a ∈
Ai: I(o, a) ⇔ AS(i, j)}.
   Finally, we define the derivation on quadruplets as follows:

    O′1,4 = ( ⋂_{j=1}^{4} O′j1, ⋂_{j=1}^{4} O′j2, ⋂_{j=1}^{4} O′j3, ⋂_{j=1}^{4} O′j4 )

    A′1,4 = ( ⋂_{j=1}^{4} A′j1, ⋂_{j=1}^{4} A′j2, ⋂_{j=1}^{4} A′j3, ⋂_{j=1}^{4} A′j4 )

Example 3. Consider O = ({12}, {28}, {59}, {49}). One has:

    O′11 = {a ∈ A | ¬I(12, a)} = {2, 5, 7, 8, 9};
    O′21 = {a ∈ A | ¬I(28, a)} = {3, 4, 7, 8};
    O′31 = {a ∈ A | I(59, a)} = {3, 4, 7, 8};
    O′41 = {a ∈ A | I(49, a)} = {2, 5, 7, 8, 9}.

Then O′1 = ⋂_{j=1}^{4} O′j1 = {7, 8}.
Finally, O′ = ({7, 8}, {2, 5, 9}, {3, 4}, {6}).
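
The derivation operator of Definition 6 is again directly computable. The sketch
below (ours, reusing I and AS from the earlier sketches) reproduces Example 3.

    # Derivation on object quadruples (our sketch), following Definition 6.
    ATTRS = {2, 3, 4, 5, 6, 7, 8, 9}

    def derive_objects(O, I, AS, attrs=ATTRS):
        """Map a quadruple of object sets to the quadruple O'_{1,4}."""
        def Oij(i, j):   # attributes behaving like AS(i, j) on all of O[i]
            return frozenset(a for a in attrs
                             if all(I(o, a) == bool(AS[i][j]) for o in O[i]))
        return tuple(frozenset.intersection(*(Oij(i, j) for i in range(4)))
                     for j in range(4))

    O = ({12}, {28}, {59}, {49})
    assert derive_objects(O, I, AS) == (frozenset({7, 8}), frozenset({2, 5, 9}),
                                        frozenset({3, 4}), frozenset({6}))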

   We exhibit a basic theorem for these complexes that naturally extends the
basic theorem on concepts:
Proposition 3. Given two analogical complexes C¹ = (O¹1,4, A¹1,4) and C² =
(O²1,4, A²1,4):
 – The join of C¹ and C² is defined by C¹ ∧ C² = (O1,4, A1,4) where

       Oi = Oi(C¹) ∩ Oi(C²) for all i ∈ [1, 4],
       A1,4 = (A1(C¹) ∪ A1(C²), A2(C¹) ∪ A2(C²), A3(C¹) ∪ A3(C²), A4(C¹) ∪ A4(C²))″

 – The meet of C¹ and C² is defined by C¹ ∨ C² = (O1,4, A1,4) where

       O1,4 = (O1(C¹) ∪ O1(C²), O2(C¹) ∪ O2(C²), O3(C¹) ∪ O3(C²), O4(C¹) ∪ O4(C²))″

Proof. The meet and the join are dual, and one only needs to prove the proposi-
tion for the join. The ordering by set inclusion requires the set of objects Oi of
C¹ ∧ C² to be included in Oi(C¹) ∩ Oi(C²) and its set of attributes Aj to be in-
cluded in Aj(C¹) ∪ Aj(C²). Taking exactly the intersection of objects thus ensures
the set of objects to be maximal. The corresponding maximal sets of attributes
may be inferred using the derivation operator ′ we have just defined. Another
way to generate these sets is to apply the derivation operator twice on the union
of the sets of attributes.

Example 4. The complex lattice of smallzoo has 24 elements, including 18 com-
plete complexes. It is sketched in figure 3.
   In this lattice, for example, the join of the analogical complexes numbered 9
and 12, which are as follows,

     9 = ⟨({12}, {28}, {59, 64}, {20, 49}), ({7}, {9}, {4}, {6})⟩
    12 = ⟨({12, 17}, {28}, {59}, {49, 64}), ({7}, {2, 5}, {3}, {6})⟩

is number 15, namely:

    15 = ⟨({12}, {28}, {59}, {49}), ({7, 8}, {2, 5, 9}, {3, 4}, {6})⟩

The resulting object sets are, for each type, the intersection of the two joined
object sets. The resulting attribute sets contain, for each type, the union of the
two joined attribute sets and may contain other elements with a correct profile
on all objects. For instance, A1(9 ∧ 12) = {7, 8} is made of the union of A1(9)
and A1(12) (both equal to {7}) plus attribute 8, since 8 has the right profile (0, 0, 1, 1)
on O1,4 (that is, ¬I(12, 8), ¬I(28, 8), I(59, 8) and I(49, 8)).
   The meet of the analogical complexes numbered 9 and 12 is number 19,
namely

    19 = ⟨({12, 17, 28}, {12, 17, 28}, {20, 49, 59, 64}, {20, 49, 59, 64}), ({7}, ∅, ∅, {6})⟩


5     Conclusion

We have introduced a new conceptual object called the analogical complex, which uses
a complex relation, analogical proportion, to compare objects with respect to
their attribute values. Although this relation works on set quadruplets instead
of simple sets as in formal concepts, we have shown that it is possible to keep
the main properties of concepts, that is, maximality and comparison at the
level of object or attribute pairs. The set of all complexes is structured as
a lattice that contains two types of elements. The most meaningful ones contain
only non-empty sets and are a strong support for analogical inference.
An interesting extension of this work would be to develop this inference process
for analogical data mining, in a way close to rule generation in FCA.
    The degenerate case where some of the sets are empty is more frequent than
in FCA, where the presence of empty sets is limited to the top or bottom of the lattice.
The presence of a single empty set may reflect the lack of some object or attribute,
and thus points to a possible new research direction for completing a knowledge base or
an ontology. In particular, analogy in a Boolean framework introduces a form of
negation through the search for dissimilarities (discrimination) between objects.
    We have written an implementation of the search for complete analogical com-
plexes, using the Answer Set Programming framework [1]. The properties of
Definition 3 are translated straightforwardly into logical constraints, and the search
for all complexes is achieved by an ASP solver looking for all solutions. The de-
scription of the ASP program is beyond the scope of this paper, but it
can be seen as a relatively simple exercise in extending the search for formal
concepts by adding a few logical constraints. It is likely that most of the existing
tools of FCA could be adapted in the same way for analogical complex analysis.
This would allow the inclusion of both categorization and analogy within common
data mining environments.


References

 1. Brewka, G., Eiter, T., Truszczyński, M.: Answer set program-
    ming at a glance. Commun. ACM 54(12), 92–103 (Dec 2011),
    http://doi.acm.org/10.1145/2043174.2043195
 2. Correa, W., Prade, H., Richard, G.: When intelligence is just a matter of copying.
    In: et al., L.D.R. (ed.) Proc. 20th Europ. Conf. on Artificial Intelligence, Montpel-
    lier, Aug. 27-31. pp. 276–281. IOS Press (2012)
                                From formal concepts to analogical complexes          169


 3. French, R.M.: The computational modeling of analogy-making. Trends in Cognitive
    Sciences 6(5), 200 – 205 (2002)
 4. Gentner, D., Holyoak, K.J., Kokinov, B.N.: The Analogical Mind: Perspectives
    from Cognitive Science. Cognitive Science, and Philosophy, MIT Press, Cambridge,
    MA (2001)
 5. Hofstadter, D., Mitchell, M.: The Copycat project: A model of mental fluidity and
    analogy-making. In: Hofstadter, D., The Fluid Analogies Research Group (eds.)
    Fluid Concepts and Creative Analogies: Computer Models of the Fundamental
    Mechanisms of Thought. pp. 205–267. Basic Books, Inc., New York, NY (1995)
 6. Lepage, Y.: Analogy and formal languages. Electr. Notes Theor. Comput. Sci. 53
    (2001)
 7. Lepage, Y.: Analogy and formal languages. In: Proc. FG/MOL 2001. pp. 373–378
    (2001), (see also http://www.slt.atr.co.jp/ lepage/pdf/dhdryl.pdf.gz)
 8. Lichman,         M.:      UCI      machine        learning      repository     (2013),
    http://archive.ics.uci.edu/ml
 9. Melis, E., Veloso, M.: Analogy in problem solving. In: Handbook of Practical Rea-
    soning: Computational and Theoretical Aspects. Oxford Univ. Press (1998)
10. Miclet, L., Bayoudh, S., Delhay, A.: Analogical dissimilarity: definition, algorithms
    and two experiments in machine learning. JAIR, 32 pp. 793–824 (2008)
11. Miclet, L., Prade, H.: Handling analogical proportions in classical logic and fuzzy
    logics settings. Proc. 10th Eur. Conf. on Symbolic and Quantitative Approaches
    to Reasoning with Uncertainty (ECSQARU’09),Verona (2009)
12. Miclet, L., Prade, H.: Handling analogical proportions in classical logic and fuzzy
    logics settings. In: Proc. 10th Eur. Conf. on Symbolic and Quantitative Approaches
    to Reasoning with Uncertainty (ECSQARU’09),Verona. pp. 638–650. Springer,
    LNCS 5590 (2009)
13. Miclet, L., Barbot, N., Prade, H.: From analogical proportions in lattices to pro-
    portional analogies in formal concepts. In: ECAI - 21th European Conference on
    Artificial Intelligence. Prague, Czech Republic (Aug 2014)
14. Miclet, L., Prade, H., Guennec, D.: Looking for Analogical Proportions in a Formal
    Concept Analysis Setting. In: Amedeo Napoli, V.V. (ed.) Conference on Concept
    Lattices and Their Applications. pp. 295–307. Nancy, France (Oct 2011)
15. Prade, H., Richard, G.: Homogeneous logical proportions: Their uniqueness and
    their role in similarity-based prediction. Proc. of the 13th International Conference
    on Principles of Knowledge Representation and Reasoning KR2012 pp. 402 – 412
    (2012)
16. Radvansky, M., Sklenar, V.: Fca extension for ms excel 2007,
    http://www.fca.radvansky.net (2010)
17. Stroppa, N., Yvon, F.: An analogical learner for morphological analysis. In: Online
    Proc. 9th Conf. Comput. Natural Language Learning (CoNLL-2005). pp. 120–127
    (2005)
18. Stroppa, N., Yvon, F.: Analogical learning and formal proportions: Definitions and
    methodological issues. ENST Paris report (2005)
19. Stroppa, N., Yvon, F.: Du quatrième de proportion comme principe inductif :
    une proposition et son application à l’apprentissage de la morphologie. Traitement
    Automatique des Langues 47(2), 1–27 (2006)
20. Yevtushenko, S.: System of data analysis ”concept explorer”. (in russian). In: Proc.
    of the 7th national conference on artificial intelligence (KII-2000) ,Russia. pp. 127–
    134 (2000)
[Figure 3]

Fig. 3. Hasse diagram of the analogical complex lattice for formal context smallzoo.
For reasons of space some nodes are not explicitly given.
                               Pattern Structures
                              and Their Morphisms

                            Lars Lumpe and Stefan E. Schmidt

                                      Institut für Algebra,
                                 Technische Universität Dresden
                            larslumpe@gmail.com, midt1@msn.com



        Abstract. Projections of pattern structures do not always lead to pattern struc-
        tures; however, residual projections and o-projections do. As a unifying approach,
        we introduce the notion of pattern morphisms between pattern structures and pro-
        vide a general sufficient condition for a homomorphic image of a pattern structure
        to be again a pattern structure. In particular, we obtain a better understanding of
        the theory of o-projections.


 1 Introduction

 Pattern structures within the framework of formal concept analysis have been intro-
 duced in [3]. Since then they have turned out to be a useful tool for analysing various
 real-world applications (cf. [3–7]). In this work we want to point out that the theoretical
 foundations of pattern structures still invite fruitful discussion. In particular,
 the role projections play within pattern structures for information reduction still needs
 further investigation.
 The goal of our paper is to establish an adequate concept of pattern morphism be-
 tween pattern structures, which also gives a better understanding of the concept of o-
 projections as recently introduced and investigated in [2]. In [8], we showed that pro-
 jections of pattern structures do not necessarily lead to pattern structures again, how-
 ever, residual projections do. It turns out that the concept of residual maps between the
 posets of patterns (w.r.t. two pattern structures) gives the key for a unifying view of
 o-projections and residual projections.
 We also derive that a pattern morphism from a pattern structure to a pattern setup (a
 notion introduced in this paper) that is surjective on the sets of objects yields again a
 pattern structure.
 Our main result states that a pattern morphism always induces an adjunction between
 the corresponding concept lattices. In case the underlying map between the sets of ob-
 jects is surjective, the induced residuated map between the concept lattices turns out to
 be surjective too.
 The fundamental order-theoretic concepts of our paper are nicely presented in the book
 on Residuation Theory by T.S. Blyth and M.F. Janowitz (cf. [1]).




© paper author(s), 2015. Published in Sadok Ben Yahia, Jan Konecny (Eds.): CLA
 2015, pp. 171–179, ISBN 978–2–9544948–0–7, Blaise Pascal University, LIMOS
 laboratory, Clermont-Ferrand, 2015. Copying permitted only for private and
 academic purposes.

2 Preliminaries
Definition 1 (Adjunction). Let P = (P, ≤) and L = (L, ≤) be posets; furthermore let
f : P → L and g : L → P be maps.
(1) The pair (f, g) is an adjunction w.r.t. (P, L) if f x ≤ y is equivalent to x ≤ gy for
    all x ∈ P and y ∈ L. In this case, we will refer to (P, L, f, g) as a poset adjunction.
(2) f is residuated from P to L if the preimage of a principal ideal in L under f is
    always a principal ideal in P, that is, for every y ∈ L there exists x ∈ P s.t.

                      f⁻¹{ t ∈ L | t ≤ y } = { s ∈ P | s ≤ x }.

(3) g is residual from L to P if the preimage of a principal filter in P under g is always
    a principal filter in L, that is, for every x ∈ P there exists y ∈ L s.t.

                      g⁻¹{ s ∈ P | x ≤ s } = { t ∈ L | y ≤ t }.

(4) The dual of L is given by Lᵒᵖ = (L, ≥) with ≥ := { (x, t) ∈ L × L | t ≤ x }. The pair
    (f, g) is a Galois connection w.r.t. (P, L) if (f, g) is an adjunction w.r.t. (P, Lᵒᵖ).
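
As a concrete illustration (ours, not part of the paper): for any map f : G → H between
sets, taking images and taking preimages forms an adjunction w.r.t. the powerset lattices
ordered by inclusion, since f X ⊆ Y holds iff X ⊆ f⁻¹Y. A minimal sketch in Java:

    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    // The pair (image, preimage) of a map f is an adjunction w.r.t. the powerset
    // lattices (2^G, ⊆) and (2^H, ⊆): image(f, X) ⊆ Y iff X ⊆ preimage(f, Y).
    final class PowersetAdjunction {
        static <G, H> Set<H> image(Map<G, H> f, Set<G> x) {
            Set<H> result = new HashSet<>();
            for (G g : x) result.add(f.get(g));
            return result;
        }

        static <G, H> Set<G> preimage(Map<G, H> f, Set<H> y) {
            Set<G> result = new HashSet<>();
            for (Map.Entry<G, H> e : f.entrySet())
                if (y.contains(e.getValue())) result.add(e.getKey());
            return result;
        }

        // The defining equivalence of Definition 1(1), checkable on finite data.
        static <G, H> boolean adjunctionCondition(Map<G, H> f, Set<G> x, Set<H> y) {
            return y.containsAll(image(f, x)) == preimage(f, y).containsAll(x);
        }
    }

This is exactly the adjunction (α, α⁺) that will be attached to a pattern morphism in
Section 4.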
      The following well-known facts are straightforward (cf. [1]).
Proposition 1. Let P = (P, ≤) and L = (L, ≤) be posets.
(1) A map f : P → L is residuated from P to L iff there exists a map g : L → P s.t.
    (f, g) is an adjunction w.r.t. (P, L).
(2) A map g : L → P is residual from L to P iff there exists a map f : P → L s.t.
    (f, g) is an adjunction w.r.t. (P, L).
(3) If (f, g) and (h, k) are adjunctions w.r.t. (P, L) with f = h or g = k, then f = h
    and g = k.
(4) If f is a residuated map from P to L, then there exists a unique residual map f⁺
    from L to P s.t. (f, f⁺) is an adjunction w.r.t. (P, L). In this case, f⁺ is called the
    residual map of f.
(5) If g is a residual map from L to P, then there exists a unique residuated map g⁻
    from P to L s.t. (g⁻, g) is an adjunction w.r.t. (P, L). In this case, g⁻ is called the
    residuated map of g.
Definition 2. Let P = (P, ≤) be a poset and T ⊆ P. Then
(1) The restriction of P onto T is given by P|T := (T, ≤ ∩ (T × T)), which clearly is a
    poset too.
(2) The canonical embedding of P|T into P is given by the map T → P, t ↦ t.
(3) T is a kernel system in P if the canonical embedding τ of P|T into P is residuated.
    In this case, the residual map ϕ of τ will also be called the residual map of T in P.
    The composition κ := τ ∘ ϕ is referred to as the kernel operator associated with T
    in P.
(4) Dually, T is a closure system in P if the canonical embedding τ of P|T into
    P is residual. In this case, the residuated map ψ of τ will also be called the
    residuated map of T in P. The composition γ := τ ∘ ψ is referred to as the closure
    operator associated with T in P.
(5) A map κ : P → P is a kernel operator on P if s ≤ x is equivalent to s ≤ κx for all
    s ∈ κP and x ∈ P.
    Remark: In this case, κP forms a kernel system in P, the kernel operator of which
    is κ.
(6) Dually, a map γ : P → P is a closure operator on P if x ≤ t is equivalent to γx ≤ t
    for all x ∈ P and t ∈ γP.
    Remark: In this case, γP forms a closure system in P, the closure operator of which
    is γ.
    The following known facts will be needed in the sequel (cf. [1]).
Proposition 2. Let P = (P, ≤) and L = (L, ≤) be posets.
(1) If f is a residuated map from P to L, then f preserves all existing suprema in P,
    that is, if s ∈ P is the supremum (least upper bound) of X ⊆ P in P, then f s is the
    supremum of f X in L.
    In case P and L are complete lattices, the converse holds too: if a map f from P to
    L preserves all suprema, that is,

                      f(sup_P X) = sup_L f X   for all X ⊆ P,

    then f is residuated.
(2) If g is a residual map from L to P, then g preserves all existing infima in L, that is,
    if t ∈ L is the infimum (greatest lower bound) of Y ⊆ L in L, then gt is the infimum
    of gY in P.
    In case P and L are complete lattices, the converse holds too: if a map g from L to P
    preserves all infima, that is,

                      g(inf_L Y) = inf_P gY   for all Y ⊆ L,

    then g is residual.
(3) For an adjunction (f, g) w.r.t. (P, L) the following hold:
   (a1) f is an isotone map from P to L.
   (a2) f ∘ g ∘ f = f.
   (a3) f P is a kernel system in L with f ∘ g as associated kernel operator on L. In
        particular, L → f P, y ↦ f gy is a residual map from L to L| f P.
   (b1) g is an isotone map from L to P.
   (b2) g ∘ f ∘ g = g.
   (b3) gL is a closure system in P with g ∘ f as associated closure operator on P. In
        particular, P → gL, x ↦ g f x is a residuated map from P to P|gL.

3   Adjunctions and Their Concept Posets
Definition 3. Let P := (P, S, σ, σ⁺) and Q := (Q, T, τ, τ⁺) be poset adjunctions. Then
a pair (α, β) forms a morphism from P to Q if (P, Q, α, α⁺) and (S, T, β, β⁺) are poset
adjunctions satisfying
                                  τ ∘ α = β ∘ σ.
Remark: This implies α⁺ ∘ τ⁺ = σ⁺ ∘ β⁺, that is, the following diagrams are commu-
tative:

             α                             α⁺
        P -------> Q                  Q -------> P
        |          |                  ^          ^
      σ |          | τ             τ⁺ |          | σ⁺
        v          v                  |          |
        S -------> T                  T -------> S
             β                             β⁺

Next we illustrate the involved poset adjunctions:

                        α
                   ----------->
              P                 Q
                   <-----------
                        α⁺

           σ ↓↑ σ⁺           τ ↓↑ τ⁺

                        β
                   ----------->
              S                 T
                   <-----------
                        β⁺

Definition 4 (Concept Poset). For a poset adjunction P = (P, S, σ, σ⁺) let

                 BP := { (p, s) ∈ P × S | σp = s ∧ σ⁺s = p }

denote the set of (formal) concepts in P. Then the concept poset of P is given by

                              BP := (P × S)|BP,

that is, (p₀, s₀) ≤ (p₁, s₁) holds iff p₀ ≤ p₁ iff s₀ ≤ s₁, for all (p₀, s₀), (p₁, s₁) ∈ BP.
If (p, s) is a formal concept in P, then p is referred to as an extent in P and s as an
intent in P.
Theorem 1. Let (α, β) be a morphism from a poset adjunction P = (P, S, σ, σ⁺) to a
poset adjunction Q = (Q, T, τ, τ⁺). Then

                              (BP, BQ, Φ, Ψ)

is a poset adjunction for

                   Φ : BP → BQ, (p, s) ↦ (τ⁺βs, βs)

and
                   Ψ : BQ → BP, (q, t) ↦ (α⁺q, σα⁺q).

In addition, if α is surjective then so is Φ.
Remark: In particular we want to point out that α⁺q is an extent in P for every extent
q in Q and, similarly, βs is an intent in Q for every intent s in P.

Proof. Let (p, s) ∈ BP and (q, t) ∈ BQ; then σp = s and σ⁺s = p and τq = t and
τ⁺t = q. This implies βs = βσp = ταp, thus

                       Φ(p, s) = (τ⁺βs, βs) ∈ BQ

(since ττ⁺βs = ττ⁺ταp = ταp = βs). Similarly, Ψ(q, t) ∈ BP.
Assume now that Φ(p, s) ≤ (q, t) holds, which implies βs ≤ t. It follows that

                          ταp = βσp = βs ≤ t

and hence
                            p ≤ α⁺τ⁺t = α⁺q,

that is, (p, s) ≤ Ψ(q, t).
Conversely, assume that (p, s) ≤ Ψ(q, t) holds, which implies p ≤ α⁺q. It follows that

                        p ≤ α⁺q = α⁺τ⁺t = σ⁺β⁺t,

and hence βs = βσp ≤ t, that is, Φ(p, s) ≤ (q, t).
Assume now that α is surjective; then α ∘ α⁺ = id_Q. Let (q, t) ∈ BQ, that is, τq = t
and τ⁺t = q. Then for p := α⁺q and s := σp we have (p, s) ∈ BP since

   σ⁺s = σ⁺σα⁺q = σ⁺σα⁺τ⁺t = σ⁺σσ⁺β⁺t = σ⁺β⁺t = α⁺τ⁺t = α⁺q = p.

Our claim is now that Φ(p, s) = (q, t) holds, that is, βs = t. The latter is true, since
αp = αα⁺q = q implies

                        βs = βσp = ταp = τq = t.                               □



Discussion for clarification: The question was raised whether, in the previous theorem,
the residuated map Φ from BP to BQ allows some modification, since the map

                    P × S → Q × T, (p, s) ↦ (αp, βs)

is obviously residuated from P × S to Q × T. However, in general the latter map does
not restrict to a map from BP to BQ. Indeed, our construction of the map Φ is of the
form (p, s) ↦ (α′p, βs). As a warning, we want to point out that, in general, there is no
residuated map from BP to BQ of the form (p, s) ↦ (αp, β′s). The simple reason for
this is that βs is an intent in Q for every intent s in P, while there may exist an extent p
in P such that αp is not an extent in Q.


4 Morphisms between Pattern Structures

Definition 5. A triple G = (G, D, δ) is a pattern setup if G is a set, D = (D, ⊑) is a
poset, and δ : G → D is a map. In case every subset of δG := { δg | g ∈ G } has an
infimum in D, we will refer to G as a pattern structure. Then the set

                         C_G := { inf_D δX | X ⊆ G }

forms a closure system in D.
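
To make this concrete, here is a small hypothetical example of ours (not from the
paper): patterns are integer intervals ordered dually to inclusion, so the infimum of a
non-empty family of intervals is its convex hull, and any map δ from objects to
intervals yields a pattern structure.

    import java.util.Collection;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    // Intervals [lo, hi] as patterns, ordered by d ⊑ e iff d contains e
    // (reverse inclusion); infima of non-empty families are convex hulls.
    final class Interval {
        final int lo, hi;
        Interval(int lo, int hi) { this.lo = lo; this.hi = hi; }

        boolean subsumes(Interval e) {                 // this ⊑ e
            return lo <= e.lo && e.hi <= hi;
        }

        static Interval infimum(Collection<Interval> ds) {  // convex hull; ds non-empty
            int lo = ds.stream().mapToInt(d -> d.lo).min().getAsInt();
            int hi = ds.stream().mapToInt(d -> d.hi).max().getAsInt();
            return new Interval(lo, hi);
        }

        // Derivation operators of the pattern structure (G, D, δ):
        static <G> Interval common(Set<G> x, Map<G, Interval> delta) {  // X ↦ inf δX
            Set<Interval> images = new HashSet<>();
            for (G g : x) images.add(delta.get(g));
            return infimum(images);
        }
        static <G> Set<G> extent(Interval d, Map<G, Interval> delta) {  // d ↦ {g | d ⊑ δg}
            Set<G> result = new HashSet<>();
            for (Map.Entry<G, Interval> e : delta.entrySet())
                if (d.subsumes(e.getValue())) result.add(e.getKey());
            return result;
        }
    }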


If G = (G, D, δ) and H = (H, E, ε) each is a pattern setup, then a pair (f, ϕ) forms a
pattern morphism from G to H if f : G → H is a map and ϕ is a residual map from D
to E satisfying ϕ ∘ δ = ε ∘ f, that is, the following diagram is commutative:

                                     f
                                G -------> H
                                |          |
                              δ |          | ε
                                v          v
                                D -------> E
                                     ϕ

In the sequel we show how our previous considerations apply to pattern structures.

Applications
(1) Let G be a pattern structure and H be a pattern setup. If (f, ϕ) is a pattern morphism
    from G to H with f being surjective, then H is also a pattern structure.
(2) Let G = (G, D, δ) and H = (H, E, ε) be pattern structures. Also let (f, ϕ) be a
    pattern morphism from G to H.
    To apply the previous theorem we give the following construction:
    f gives rise to an adjunction (α, α⁺) between the power set lattices 2^G := (2^G, ⊆)
    and 2^H := (2^H, ⊆) via
                            α : 2^G → 2^H, X ↦ f X
    and
                            α⁺ : 2^H → 2^G, Y ↦ f⁻¹Y.
    Further let ϕ⁻ denote the residuated map of ϕ w.r.t. (E, D), that is, (E, D, ϕ⁻, ϕ) is
    a poset adjunction. Then, obviously, (Dᵒᵖ, Eᵒᵖ, ϕ, ϕ⁻) is a poset adjunction too.

    For pattern structures the following operators are essential:

                       (·)^□ : 2^G → D,  X ↦ inf_D δX
                       (·)^□ : D → 2^G,  d ↦ { g ∈ G | d ⊑ δg }
                       (·)^◇ : 2^H → E,  Z ↦ inf_E εZ
                       (·)^◇ : E → 2^H,  e ↦ { h ∈ H | e ⊑ εh }

    It now follows that (α, ϕ) forms a morphism from the poset adjunction

                            P = (2^G, Dᵒᵖ, (·)^□, (·)^□)

    to the poset adjunction
                            Q = (2^H, Eᵒᵖ, (·)^◇, (·)^◇).

In particular, (f X)^◇ = ϕ(X^□) holds for all X ⊆ G.
Here we give an illustration of the constructed adjunctions:

                        α
                   ----------->
              2^G               2^H
                   <-----------
                        α⁺

      (·)^□ ↓↑ (·)^□        (·)^◇ ↓↑ (·)^◇

                        ϕ
              Dᵒᵖ  ----------->  Eᵒᵖ
                   <-----------
                        ϕ⁻

Replacing Dᵒᵖ by D and Eᵒᵖ by E we receive the following commutative diagrams:

              α                               α⁺
       2^G -------> 2^H                2^H -------> 2^G
        |            |                  ^            ^
  (·)^□ |            | (·)^◇      (·)^◇ |            | (·)^□
        v            v                  |            |
        D  -------> E                   E  -------> D
              ϕ                               ϕ⁻

In combination we receive the following diagram of Galois connections and adjunctions
between them:

                        α
                   ----------->
              2^G               2^H
                   <-----------
                        α⁺

      (·)^□ ↓↑ (·)^□        (·)^◇ ↓↑ (·)^◇

                        ϕ
               D   ----------->   E
                   <-----------
                        ϕ⁻

    For the following we recollect that the concept lattice of G is given by BG := BP;
    similarly, BH := BQ.
    Now we are prepared to give an application of Theorem 1 to concept lattices of
    pattern structures: (BG, BH, Φ, Ψ) is an adjunction for

                      Φ : BG → BH, (X, d) ↦ ((ϕd)^◇, ϕd)

    and
                      Ψ : BH → BG, (Z, e) ↦ (f⁻¹Z, (f⁻¹Z)^□).

    In case f is surjective, Φ is surjective too.
    Remark: This application implies a generalization of Proposition 1 in [2], that is, if
    Z is an extent in H, then f⁻¹Z is an extent in G, and if d is an intent in G then ϕd
    is an intent in H.
(3) Let G = (G, D, δ) be a pattern structure and let κ be a kernel operator on D. Then
    ϕ : D → κD, d ↦ κd forms a residual map from D to κD := D|κD, and (id_G, ϕ)
    is a pattern morphism from G to H := (G, κD, ϕ ∘ δ).
    Remark: In [2], ϕ is called an o-projection. The above clarifies the role of o-
    projections for pattern structures; a concrete sketch follows after this list.
(4) Let G = (G, D, δ) be a pattern structure, and let κ be a residual kernel operator on
    D. Then (id_G, κ) is a pattern morphism from G to H := (G, D, κ ∘ δ).
    Remark: In [8], κ is also referred to as a residual projection. The above clarifies the
    role of residual projections for pattern structures.
(5) Generalizing [2] and [8], we observe that if G = (G, D, δ) is a pattern structure and
    ϕ is a residual map from D to E, then (id_G, ϕ) is a pattern morphism from G to
    H = (G, E, ϕ ∘ δ) satisfying that

                      Φ : BG → BH, (X, d) ↦ ((ϕd)^◇, ϕd)

    is a surjective residuated map from BG to BH.
    In particular, X^◇ = ϕ(X^□) holds for all X ⊆ G.
    Remark: This application gives a better understanding of how to properly generalize
    the concept of projections as discussed in [3] and subsequently in [2, 4–8].
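
As announced in application (3), here is a concrete sketch of ours (a hypothetical
example, not from the paper), reusing the Interval type from the sketch after
Definition 5: rounding an interval outward to a fixed grid is a kernel operator on the
interval poset, and since it preserves convex hulls it is even residual, so it also fits
application (4).

    // Our own example of a kernel operator κ on intervals ordered by reverse
    // inclusion: κd covers d (so κd ⊑ d), κ is idempotent and isotone, and it
    // preserves infima (convex hulls). Composing κ with δ gives an o-projection.
    final class Coarsen {
        static Interval toGrid(Interval d, int grid) {
            int lo = Math.floorDiv(d.lo, grid) * grid;    // round lo down to the grid
            int hi = -Math.floorDiv(-d.hi, grid) * grid;  // round hi up to the grid
            return new Interval(lo, hi);
        }
    }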


References

  1. T.S. Blyth, M.F. Janowitz (1972), Residuation Theory, Pergamon Press, pp. 1-382.
  2. A. Buzmakov, S. O. Kuznetsov, A. Napoli (2015), Revisiting Pattern Structure Projections.
     Formal Concept Analysis. Lecture Notes in Artificial Intelligence (Springer), Vol. 9113,
     pp. 200-215.
  3. B. Ganter, S. O. Kuznetsov (2001), Pattern Structures and Their Projections. Proc. 9th Int.
     Conf. on Conceptual Structures, ICCS01, G. Stumme and H. Delugach (Eds.). Lecture
     Notes in Artificial Intelligence (Springer), Vol. 2120, pp. 129-142.
  4. T. B. Kaiser, S. E. Schmidt (2011), Some remarks on the relation between annotated ordered
     sets and pattern structures. Pattern Recognition and Machine Intelligence. Lecture Notes in
     Computer Science (Springer), Vol. 6744, pp. 43-48.
  5. M. Kaytoue, S. O. Kuznetsov, A. Napoli, S. Duplessis (2011), Mining gene expression data
     with pattern structures in formal concept analysis. Information Sciences (Elsevier), Vol. 181,
     pp. 1989-2001.
  6. S. O. Kuznetsov (2009), Pattern structures for analyzing complex data. In: H. Sakai et al.
     (Eds.), Proceedings of the 12th International Conference on Rough Sets, Fuzzy Sets, Data
     Mining and Granular Computing (RSFDGrC09). Lecture Notes in Artificial Intelligence
     (Springer), Vol. 5908, pp. 33-44.

7. S. O. Kuznetsov (2013), Scalable Knowledge Discovery in Complex Data with Pattern
   Structures. In: P. Maji, A. Ghosh, M.N. Murty, K. Ghosh, S.K. Pal, (Eds.). Proc. 5th Inter-
   national Conference Pattern Recognition and Machine Intelligence (PReMI2013). Lecture
   Notes in Computer Science (Springer), Vol. 8251, pp. 30-41.
8. L. Lumpe, S. E. Schmidt (2015), A Note on Pattern Structures and Their Projections. For-
   mal Concept Analysis. Lecture Notes in Artificial Intelligence (Springer), Vol. 9113,
   pp. 145-150.
                      NextClosures:
        Parallel Computation of the Canonical Base

                        Francesco Kriegel and Daniel Borchmann

             Institute of Theoretical Computer Science, TU Dresden, Germany
                 {francesco.kriegel,daniel.borchmann}@tu-dresden.de



        Abstract. The canonical base of a formal context plays a distinguished role
        in formal concept analysis. This is because it is the only minimal base so
        far that can be described explicitly. For the computation of this base several
        algorithms have been proposed. However, all those algorithms work sequentially,
        by computing only one pseudo-intent at a time – a fact which heavily impairs
        the practicability of using the canonical base in real-world applications. In this
        paper we shall introduce an approach that remedies this deficit by allowing
        the canonical base to be computed in a parallel manner. First experimental
        evaluations show that for sufficiently large data-sets the speedup is proportional
        to the number of available CPUs.

        Keywords: Formal Concept Analysis, Canonical Base, Parallel Algorithms


 1    Introduction
 The implicational theory of a formal context is of interest in a large variety of appli-
 cations. In those cases, computing the canonical base of the given context is often
 desirable, as it has minimal cardinality among all possible bases. On the other hand,
 conducting this computation often imposes a major challenge, endangering the
 practicability of the underlying approach.
    There are two known algorithms for computing the canonical base of a formal
 context [6, 12]. Both algorithms work sequentially, i.e. they compute one implication
 after the other. Moreover, both algorithms compute in addition to the implications
 of the canonical base all formal concepts of the given context. This is a disadvantage,
 as the number of formal concepts can be exponential in the size of the canonical base.
 On the other hand, the size of the canonical base can be exponential in the size of
 the underlying context [10]. Additionally, to date it is not known whether the
 canonical base can be computed in output-polynomial time, and certain complexity
 results hint at a negative answer [3]. For the algorithm from [6], and indeed for any
 algorithm that computes the pseudo-intents in a lectic order, it has been shown that
 it cannot compute the canonical base with polynomial delay [2].
    However, the impact of theoretical complexity results on practical applications is often
 hard to assess, and it is often worth investigating faster algorithms for theoretically in-
 tractable problems. A popular approach is to explore the possibilities to parallelize known se-
 quential algorithms. This is also true for formal concept analysis, as can be seen in the de-
 velopment of parallel versions for computing the concept lattice of a formal context [5, 13].
    In this work we want to investigate the development of a parallel algorithm for
 computing the canonical base of a formal context K. The underlying idea is actually

© paper author(s), 2015. Published in Sadok Ben Yahia, Jan Konecny (Eds.): CLA
 2015, pp. 181–192, ISBN 978–2–9544948–0–7, Blaise Pascal University, LIMOS
 laboratory, Clermont-Ferrand, 2015. Copying permitted only for private and
 academic purposes.

quite simple, and has been used by Lindig [11] to (sequentially) compute the concept
lattice of a formal context: to compute the canonical base, we compute the lattice
of all intents and pseudo-intents of K. This lattice can be computed bottom up, in
a level-wise order, and this computation can be done in parallel provided that the
lattice has a certain “width” at a particular level. The crucial fact now is that the upper
neighbours of an intent or pseudo-intent B can be easily computed by just iterating over
all attributes m ∉ B and computing the closure of B ∪ { m }. In the approach of Lindig
mentioned above this closure is just the usual double-prime operation B ↦ B^II of the
underlying formal context K. In our approach it is the closure operator whose closures
are exactly the intents and pseudo-intents of K. Surprisingly, despite the simplicity
of our approach, we are not aware of any prior work on computing the canonical base
of a formal context in a parallel manner. Furthermore, experimental results presented
 in this work indicate that for suitably large data-sets the computation of the canonical
 base can be sped up by a factor proportional to the number of available CPUs.
   The paper is structured as follows. After recalling all necessary notions of formal
concept analysis in Section 2, we shall describe in Section 3 our approach of computing
the canonical base in parallel. Benchmarks of this algorithm are presented in Section 4,
and we shall close this work with some conclusions in Section 5.

2     Preliminaries
This section gives a brief overview on the notions of formal concept analysis [7] that are
used in this document. The basic structure is a formal context K = (G, M, I) consisting
of a set G of objects, a set M of attributes, and an incidence relation I ⊆ G × M. For a
pair (g, m) ∈ I we also use the infix notation g I m, and say that the object g has the
attribute m. Each formal context K induces the derivation operators ·I : P(G) → P(M)
and ·I : P(M) → P(G) that are defined as follows for object sets A ⊆ G and attribute
sets B ⊆ M:
    A^I := { m ∈ M | ∀g ∈ A : (g, m) ∈ I }   and   B^I := { g ∈ G | ∀m ∈ B : (g, m) ∈ I }.

In other words, A^I is the set of all attributes that all objects from A have in common,
and dually B^I is the set of all objects which have all attributes from B. A formal
concept of K is a pair (A, B) such that A^I = B and B^I = A, and the set of all formal
concepts of K is denoted by B(K).
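
For illustration (our own sketch, not part of the paper; the minimal Context
representation is a hypothetical choice), the two derivation operators can be
implemented literally from their definitions:

    import java.util.AbstractMap.SimpleEntry;
    import java.util.HashSet;
    import java.util.Set;

    // A formal context (G, M, I) with the incidence relation stored as pairs (g, m).
    final class Context<G, M> {
        final Set<G> objects;
        final Set<M> attributes;
        final Set<SimpleEntry<G, M>> incidence;

        Context(Set<G> g, Set<M> m, Set<SimpleEntry<G, M>> i) {
            objects = g; attributes = m; incidence = i;
        }

        boolean incident(G g, M m) { return incidence.contains(new SimpleEntry<>(g, m)); }

        // A^I: all attributes that every object in A has
        Set<M> attributeDerivation(Set<G> a) {
            Set<M> result = new HashSet<>(attributes);
            result.removeIf(m -> a.stream().anyMatch(g -> !incident(g, m)));
            return result;
        }

        // B^I: all objects that have every attribute in B
        Set<G> objectDerivation(Set<M> b) {
            Set<G> result = new HashSet<>(objects);
            result.removeIf(g -> b.stream().anyMatch(m -> !incident(g, m)));
            return result;
        }
    }

A pair (A, B) with attributeDerivation(A) = B and objectDerivation(B) = A is then
exactly a formal concept.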
   An implication over the set M is an expression of the form X → Y where X, Y ⊆ M.
An implication X → Y over M is valid in K if X I ⊆ Y I . A set L of implications over
M is valid in K if each implication in L is valid in K. An implication X → Y follows
from the set L if X → Y is valid in every context with attribute set M in which L
is valid. Furthermore, a model of X → Y is a set T ⊆ M such that X ⊆ T implies
Y ⊆ T. A model of L is a model of all implications in L, and X^L is the smallest
superset of X that is a model of L. The set X^L can be computed as follows:

    X^L := ⋃_{n≥1} X^{L_n}   where   X^{L_1} := X ∪ ⋃ { B | A → B ∈ L and A ⊆ X }

                             and   X^{L_{n+1}} := (X^{L_n})^{L_1}   for all n ∈ ℕ.
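
A direct implementation of this fixed-point computation (our own sketch; the
Implication type below is our own auxiliary type, not from the paper) simply iterates
the one-step operator until nothing changes:

    import java.util.Collection;
    import java.util.HashSet;
    import java.util.Set;

    // An implication A -> B over the attribute set M.
    final class Implication<M> {
        final Set<M> premise, conclusion;
        Implication(Set<M> premise, Set<M> conclusion) {
            this.premise = premise; this.conclusion = conclusion;
        }
    }

    final class ImplicationClosure {
        // Computes X^L, the smallest superset of X that is a model of every
        // implication in L, by iterating the one-step operator X^{L_1}.
        static <M> Set<M> close(Set<M> x, Collection<Implication<M>> l) {
            Set<M> result = new HashSet<>(x);
            boolean changed = true;
            while (changed) {
                changed = false;
                for (Implication<M> imp : l)
                    if (result.containsAll(imp.premise)
                            && !result.containsAll(imp.conclusion)) {
                        result.addAll(imp.conclusion);
                        changed = true;
                    }
            }
            return result;
        }
    }

The operator L∗ introduced below differs only in that a premise must be a proper
subset of the current set before its conclusion is added.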
   The following lemma shows some well-known equivalent statements for entailment
of implications from implication sets. We will not prove them here.

Lemma 1. Let L ∪ { X → Y } be a set of implications over M. Then the following
statements are equivalent:
 1. X → Y follows from L.
 2. If K is a formal context with attribute set M such that L is valid in K, then
    X → Y is also valid in K.
 3. If T ⊆ M and T is a model of L, then T is a model of X → Y .
 4. Y ⊆ X^L.
   An attribute set B ⊆ M is called an intent of K = (G, M, I) if B = B^II. An
attribute set P ⊆ M is called a pseudo-intent of K if P ≠ P^II, and furthermore for
each pseudo-intent Q ⊊ P the set inclusion Q^II ⊆ P is satisfied. We denote the set of
all pseudo-intents of K by PsInt(K). Then the canonical implicational base of K is
defined as the following implication set:

                         { P → P^II | P ∈ PsInt(K) }.

The canonical base has the property that it is a minimal base of K, i.e. it is a base of K,
meaning that it is a set of valid implications of K such that every valid implication
of K is entailed by it. Furthermore, its cardinality is minimal among all bases of K.
   It is readily verified that a subset X ⊆ M is an intent or a pseudo-intent of K if
and only if X is a closure of the closure operator K∗ that is defined as follows:

    X^{K∗} := ⋃_{n≥1} X^{K∗_n}   where   X^{K∗_1} := X ∪ ⋃ { P^II | P ∈ PsInt(K) and P ⊊ X }

                               and   X^{K∗_{n+1}} := (X^{K∗_n})^{K∗_1}   for all n ∈ ℕ.

Of course, if L is the canonical base of K as described above, then the closure operators
K∗ and L∗ coincide, where L∗ is defined by the following equations:

    X^{L∗} := ⋃_{n≥1} X^{L∗_n}   where   X^{L∗_1} := X ∪ ⋃ { B | A → B ∈ L and A ⊊ X }

                              and   X^{L∗_{n+1}} := (X^{L∗_n})^{L∗_1}   for all n ∈ ℕ.


3    Parallel Computation of the Canonical Base
The well-known NextClosure algorithm developed by Ganter [6] can be used to enumer-
ate the implications of the canonical base. The mathematical idea behind this algorithm
is to compute all intents and pseudo-intents of our formal context K in a certain linear
order, namely the lectic order. As an advantage the next (pseudo-)intent is uniquely deter-
mined, but we potentially have to do backtracking in order to find it. It can be seen quite
easily that those sets form a complete lattice, and the NextClosure algorithm uses the
closure operator K∗ of this lattice to enumerate the pseudo-intents of K in the lectic order.
Furthermore, this algorithm is inherently sequential, i.e. it is not possible to parallelize it.
   In our approach we shall not make use of the lectic order. Indeed, our algorithm
will enumerate all intents and pseudo-intents of K in the subset-order, with no further
restrictions. As a benefit we get a very easy and obvious way to parallelize this enu-
meration. Moreover, in multi-threaded implementations no communication between
different threads is necessary. However, as it is the case with all other known algorithms

for computing the canonical base, we also have to compute all intents in addition to
all pseudo-intents of the given formal context.
   The main idea is very simple and works as follows. From the definition of pseudo-
intents we see that in order to decide whether an attribute set P ⊆ M is a pseudo-intent
we only need all pseudo-intents Q ⊊ P, i.e. it suffices to know all pseudo-intents with
a smaller cardinality than P . This allows for the level-wise computation of the canon-
ical base w.r.t. the subset inclusion order, i.e. we can enumerate the (pseudo-)intents
w.r.t. increasing cardinality.
   An algorithm that implements this idea works as follows. First we start by considering
the empty set, as it is the only set with cardinality 0. Of course, the empty set must
either be an intent or a pseudo-intent, and the distinction can be made by checking
whether ∅ = ∅^II. Then assuming inductively that all pseudo-intents with cardinality
< k have been determined, we can correctly decide whether a subset P ⊆ M with
|P | = k is a pseudo-intent or not.
   To compute the lattice of intents and pseudo-intents of K the algorithm manages a
set of candidates that contains the (pseudo-)intents on the current level. Then, whenever
a pseudo-intent P has been found, the ⊆-next closure is uniquely determined by its
closure P^II in the context K. If an intent B has been found, then the ⊆-next closures
must be of the form (B ∪ { m })^{K∗}, m ∉ B. However, as we are not aware of the full
implicational base of K yet, but only of an approximation L of it, the operators K∗ and
L∗ do not coincide on all subsets of M. We will show that they yield the same closure
for attribute sets B ⊆ M with cardinality |B| ≤ k if L contains all implications
P → P^II where P is a pseudo-intent of K with cardinality |P| < k. Consequently, the
L∗-closure of a set B ∪ { m } need not be an intent or pseudo-intent of K. Instead such
closures are added to the candidate list, and are processed once all pseudo-intents with
smaller cardinality have been determined. We will formally prove that this technique is
correct.
Furthermore, the computation of all pseudo-intents and intents of cardinality k can
be done in parallel, since they are independent of each other.
   In summary, we briefly describe the inductive structure of the algorithm as follows:
Let K be a finite formal context. We use four variables: k denotes the current cardinality
of candidates, C is the set of candidates, B is a set of formal concepts, and L is an
implication set. Then the algorithm works as follows.

 1. Set k := 0, C := { ∅ }, B := ∅, and L := ∅.
 2. In parallel: For each candidate set C ∈ C with cardinality |C| = k determine
    whether it is L∗-closed. If not, then add its L∗-closure to the candidate set C, and
    go to Step 5.
 3. If C is an intent of K, then add the formal concept (C^I, C) to B. Otherwise C
    must be a pseudo-intent, and thus we add the formal implication C → C^II to the
    set L, and add the formal concept (C^I, C^II) to the set B.
 4. For each observed intent C^II, add all its upper neighbours C^II ∪ { m } where
    m ∉ C^II to the candidate set C.
 5. Wait until all candidates of cardinality k have been processed. If k < |M|, then in-
    crease the candidate cardinality k by 1, and go to Step 2. Otherwise return B and L.

   In order to approximate the operator L∗ we furthermore introduce the following
notion: If L is a set of implications, then Lk denotes the subset of L that consists
of all implications whose premises have a cardinality of at most k.
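
In code, this restriction is a one-line filter (our own sketch, reusing the Implication
type from the sketch in Section 2):

    import java.util.List;
    import java.util.stream.Collectors;

    final class ImplicationSets {
        // L_k: all implications of L whose premise has cardinality at most k.
        static <M> List<Implication<M>> premiseAtMost(List<Implication<M>> l, int k) {
            return l.stream()
                    .filter(imp -> imp.premise.size() <= k)
                    .collect(Collectors.toList());
        }
    }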

Lemma 2. Let K = (G, M, I) be a formal context, L its canonical implicational base,
and X ⊆ M an attribute set. Then the following statements are equivalent:

 1. X is either an intent or a pseudo-intent of K.
 2. X is K∗-closed.
 3. X is L∗-closed.
 4. X is (L_{|X|−1})∗-closed.
 5. There is a k ≥ |X| − 1 such that X is (L_k)∗-closed.
 6. For all k ≥ |X| − 1 it holds that X is (L_k)∗-closed.

Proof. 1⇔2. If X is an intent or a pseudo-intent, then it is obviously K∗_1-closed,
i.e. K∗-closed. Vice versa, if X is K∗-closed, but no intent, then X contains the closure
P^II of every pseudo-intent P ⊊ X, and hence X must be a pseudo-intent. 2⇔3 is
obvious. 3⇔4 follows directly from the fact that P ⊊ X implies |P| < |X|. 4⇔5.
The only-if-direction is trivial. Conversely, consider k ≥ |X| − 1 such that X is
(L_k)∗-closed. Then X contains all conclusions B where A → B ∈ L is an implication
with premise A ⊊ X such that |A| ≤ k. Of course, A ⊊ X implies |A| < |X|, and thus
X is (L_{|X|−1})∗-closed as well. 4⇔6. The if-direction is trivial. Finally, assume that
k ≥ |X| − 1 and X is (L_{|X|−1})∗-closed. Obviously, there are no subsets A ⊊ X with
|X| ≤ |A| ≤ k, and so X must be (L_k)∗-closed, too.                                  □

  As an immediate consequence of Lemma 2 we infer that in order to decide the
K∗-closedness of an attribute set X it suffices to know all implications in the canonical
base whose premise has a lower cardinality than X.

Corollary 3. If L contains all implications P → P^II where P is a pseudo-intent
of K with |P| < k, and otherwise only implications with premise cardinality k, then
for all attribute sets X ⊆ M with |X| ≤ k the following statements are equivalent:

 1. X is an intent or a pseudo-intent of K.
 2. X is L∗-closed.

  This corollary allows us in a certain sense to approximate the set of all K∗-closures
w.r.t. increasing cardinality, and thus also permits the approximation of the closure
operator L∗ where L is the canonical base of K. In the following Lemma 4 we will
characterise the structure of the lattice of all K∗-closures, and also give a method to
compute upper neighbours. It is true that between comparable pseudo-intents there
must always be an intent. In particular, the unique upper K∗-closed neighbour of a
pseudo-intent must be an intent.

Lemma 4. Let K be a formal context. Then the following statements are true:

 1. If P ⊆ M is a pseudo-intent, then there is no intent or pseudo-intent strictly
    between P and P^II.
 2. If B ⊆ M is an intent, then the next intents or pseudo-intents are of the form
    (B ∪ { m })^{K∗} for attributes m ∉ B.
 3. If X ⊊ Y ⊆ M are neighbouring K∗-closures, then Y = (X ∪ { m })^{K∗} for all
    attributes m ∈ Y \ X.

Algorithm 1 NextClosures (K)
 1 k := 0, C := { ∅ }, B := ∅, L := ∅
 2 while k ≤ |M| do
 3      for all C ∈ C with |C| = k do in parallel
 4            C := C \ { C }
 5            if C = C^{L∗} then
 6                 if C ≠ C^II then
 7                      L := L ∪ { C → C^II }
 8                 B := B ∪ { (C^I, C^II) }
 9                 C := C ∪ { C^II ∪ { m } | m ∉ C^II }
10            else
11                 C := C ∪ { C^{L∗} }
12      Wait for termination of all parallel processes.
13      k := k + 1
14 return (B, L)



Proof. 1. Let P ⊆ M be a pseudo-intent of K. Then for every intent B between P
    and P^II, i.e. P ⊆ B ⊆ P^II, we have B = B^II = P^II. Thus, there cannot be an
    intent strictly between P and P^II. Furthermore, if Q were a pseudo-intent such
    that P ⊊ Q ⊆ P^II, then P^II ⊆ Q, and thus Q = P^II, a contradiction.
 2. Let B ⊆ M be an intent of K, and X ⊇ B an intent or pseudo-intent of K such that
    there is no other intent or pseudo-intent between them. Then B ⊆ B ∪ { m } ⊆ X
    for every m ∈ X \ B. Thus, B = B^{K∗} ⊊ (B ∪ { m })^{K∗} ⊆ X^{K∗} = X. Then
    (B ∪ { m })^{K∗} is an intent or a pseudo-intent between B and X that strictly
    contains B, and hence X = (B ∪ { m })^{K∗}.
 3. Consider an attribute m ∈ Y \ X. Then X ∪ { m } ⊆ Y, and thus X ⊊
    (X ∪ { m })^{K∗} ⊆ Y, as Y is already closed. Therefore, (X ∪ { m })^{K∗} = Y.   □
   We are now ready to formulate our algorithm NextClosures in pseudo-code, see
Algorithm 1. In the remainder of this section we shall show that this algorithm always
terminates for finite formal contexts K, and that it returns the canonical base as well as
the set of all formal concepts of K. Beforehand, let us introduce the following notation:
 1. NextClosures is in state k if it has processed all candidate sets with a cardinality
    ≤ k, but none of cardinality > k.
 2. Ck denotes the set of candidates in state k.
 3. Lk denotes the set of implications in state k.
 4. Bk denotes the set of formal concepts in state k.
Proposition 5. Let K be a formal context, and assume that NextClosures has been
started on K and is in state k. Then the following statements are true:
 1. Ck contains all pseudo-intents of K with cardinality k + 1, and all intents of K
    with cardinality k + 1 whose corresponding formal concept is not already in Bk .
 2. Bk contains all formal concepts of K whose intent has cardinality ≤ k.
 3. Lk contains all implications P → P II where the premise P is a pseudo-intent of K
    with cardinality ≤ k.
 4. Between the states k and k + 1 an attribute set with cardinality k + 1 is an intent
    or pseudo-intent of K if and only if it is L∗-closed.

Proof. We prove the statements by induction on k. The base case handles the initial
state k = −1. Of course, ∅ is always an intent or a pseudo-intent of K. Furthermore, it is
the only attribute set of cardinality 0 and contained in the candidate set C. As there are
no sets with cardinality ≤ −1, B_{−1} and L_{−1} trivially satisfy Statements 2 and 3. Finally,
we have that L_{−1} = ∅, and hence every attribute set is (L_{−1})∗-closed, in particular ∅.
   We now assume that the induction hypothesis is true for k. For every implication
set L between states k and k + 1, i.e. L_k ⊆ L ⊆ L_{k+1}, the induction hypothesis yields
that L contains all formal implications P → P^II where P is a pseudo-intent of K with
cardinality ≤ k, and furthermore only implications whose premises have cardinality k + 1
(by definition of Algorithm 1). Additionally, we know that the candidate set C contains
all pseudo-intents P of K where |P| = k + 1, and all intents B of K such that |B| = k + 1
and (B^I, B) ∉ B. Corollary 3 immediately yields the validity of Statements 2 and 3
for k + 1, as those K∗-closures are recognized correctly in line 5. Then L_{k+1} contains
all implications P → P^II where P is a pseudo-intent of K with |P| ≤ k + 1, and hence
each implication set L with L_{k+1} ⊆ L ⊆ L_{k+2} contains all those implications and
furthermore only implications with a premise cardinality k + 2. By another application
of Corollary 3 we conclude that also Statement 4 is satisfied for k + 1.
   Finally, we show Statement 1 for k + 1. Consider any K∗-closed set X where
|X| = k + 2. Then Lemma 4 states that for all lower K∗-neighbours Y and all
m ∈ X \ Y it is true that (Y ∪ { m })^{K∗} = X. We proceed with a case distinction.
   If there is a lower K∗-neighbour Y which is a pseudo-intent, then Lemma 4 yields that
the (unique) next K∗-neighbour is obtained as Y^II, and the formal concept (Y^I, Y^II)
is added to the set B in line 8. Of course, it is true that X = Y^II.
   Otherwise all lower K∗-neighbours Y are intents, and in particular this is the case for
X being a pseudo-intent by Lemma 4. Then for all these Y we have (Y ∪ { m })^{K∗} = X
for all m ∈ X \ Y. Furthermore, all sets Z with Y ∪ { m } ⊊ Z ⊊ X are not K∗-closed.
Since X \ Y is finite, the following sequence must also be finite:

      C_0 := Y ∪ { m }   and   C_{i+1} := C_i^{L∗}   where L_{|C_i|−1} ⊆ L ⊆ L_{|C_i|}.

The sequence is well-defined, since implications from L_{|C_i|} \ L_{|C_i|−1} have no influence on
the closure of C_i. Furthermore, the sequence obviously ends with the set X, and contains
no further K∗-closed sets, and each of the sets C_0, C_1, . . . appears as a candidate during
the run of the algorithm, cf. lines 9 and 11.                                          □

   From the previous result we can infer that in the last state |M| the set B contains
all formal concepts of the input context K, and that L is the canonical base of K. Both
sets are returned from Algorithm 1, and hence we can conclude that NextClosures
is sound and complete. The following corollary summarises our results obtained so far,
and also shows termination.

Corollary 6. If the algorithm NextClosures is started on a finite formal context K
as input, then it terminates, and returns both the set of all formal concepts and the
canonical base of K as output.

Proof. The second part of the statement is a direct consequence of Proposition 5. In
the final state |M| the set L contains all formal implications P → P^II where P is a
pseudo-intent of K. In particular, L is the canonical implicational base. Furthermore,
the set B contains all formal concepts of K.

   Finally, the computation time between states k and k + 1 is finite, because there
are only finitely many candidates of cardinality k + 1, and the computation of closures
w.r.t. the operators L∗ and ·^II can be done in finite time. As there are exactly |M|
states for a finite formal context, the algorithm must terminate.                    □
   One could ask whether there are formal contexts that do not allow for a speedup
in the enumeration of all intents and pseudo-intents on parallel execution. This would
happen for formal contexts whose intents and pseudo-intents are linearly ordered.
However, this is impossible.
Lemma 7. Let K = (G, M, I) be a non-empty clarified formal context. Then the set
of its intents and pseudo-intents is not linearly ordered w.r.t. subset inclusion ⊆.
Proof. Assume that K := (G, M, I) with G := { g_1, . . . , g_n }, n > 0, were a clar-
ified formal context with intents and pseudo-intents P_1 ⊊ P_2 ⊊ . . . ⊊ P_ℓ. In
particular, then also all object intents form a chain g_1^I ⊊ g_2^I ⊊ . . . ⊊ g_n^I where
n ≤ ℓ. Since K is attribute-clarified, it follows that |g_{j+1}^I \ g_j^I| = 1 for all j, and hence
w.l.o.g. M = { m_1, . . . , m_n }, and g_i I m_j iff i ≥ j. Eventually, K is isomorphic to the
ordinal scale K_n := ({ 1, . . . , n }, { 1, . . . , n }, ≤). It is easily verified that the pseudo-
intents of K_n are either ∅, or of the form { m, n } where m < n − 1, a contradiction.   □
  Consequently, there is no formal context with a linearly ordered set of intents and
pseudo-intents. Hence, a parallel enumeration of the intents and pseudo-intents will
always result in a speedup compared to a sequential enumeration.


4     Benchmarks
The purpose of this section is to show that our parallel algorithm for computing the
canonical base indeed yields a speedup, both qualitatively and quantitatively, compared
to the classical algorithm based on NextClosure [6]. To this end, we shall present the
running times of our algorithm when applied to selected data-sets and with a varying
number of available CPUs. We shall see that, up to a certain limit, the running time of
our algorithm decreases in proportion to the number of available CPUs. Furthermore,
we shall also show that this speedup is not only qualitative, but indeed yields a real
speedup compared to the original sequential algorithm for computing the canonical base.
   The presented algorithm NextClosures has been integrated into Concept Ex-
plorer FX [8]. The implementation is a straightforward adaptation of Algorithm 1 to the
programming language Java 8, and heavily uses the new Stream API and thread-safe
concurrent collection classes (like ConcurrentHashMap). As we have described before,
the processing of all candidates on the current cardinality level can be done in parallel,
i.e. for each of them a separate thread is started that executes the necessary operations
for lines 4 to 11 in Algorithm 1. Furthermore, as the candidates on the same level cannot
affect each other, no communication between the threads is needed. More specifically,
we have seen that the decision whether a candidate is an intent or a pseudo-intent is
independent of all other sets with the same or a higher cardinality.
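
The core of such an implementation fits in a few lines. The following is a compressed
sketch of ours (not the actual Concept Explorer FX source; the concept set B is omitted
for brevity, and the two closure operators are assumed to be supplied as functions),
showing how Java 8 parallel streams realize the per-level barrier:

    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Function;
    import java.util.stream.Collectors;

    // Sketch of the level-wise parallel loop. lClose computes the L∗-closure
    // w.r.t. the implications found so far, doublePrime computes C^II in the
    // context. The base map (premise -> conclusion) should be a concurrent map,
    // since several threads may add implications to it.
    final class ParallelLevelLoop {
        static <M> void run(Set<M> attributes,
                            Function<Set<M>, Set<M>> lClose,
                            Function<Set<M>, Set<M>> doublePrime,
                            Map<Set<M>, Set<M>> base) {
            Set<Set<M>> candidates = ConcurrentHashMap.newKeySet();
            candidates.add(new HashSet<>());
            for (int k = 0; k <= attributes.size(); k++) {
                final int level = k;
                List<Set<M>> current = candidates.stream()
                        .filter(c -> c.size() == level)
                        .collect(Collectors.toList());
                // Candidates of one cardinality are mutually independent, so
                // they can be processed in parallel without communication.
                current.parallelStream().forEach(c -> {
                    candidates.remove(c);
                    Set<M> lClosure = lClose.apply(c);
                    if (lClosure.equals(c)) {                // C is L∗-closed
                        Set<M> intent = doublePrime.apply(c);
                        if (!intent.equals(c))               // pseudo-intent found
                            base.put(c, intent);
                        for (M m : attributes)               // upper neighbours of C^II
                            if (!intent.contains(m)) {
                                Set<M> next = new HashSet<>(intent);
                                next.add(m);
                                candidates.add(next);
                            }
                    } else {
                        candidates.add(lClosure);            // revisit on a later level
                    }
                });  // forEach acts as the barrier of line 12 in Algorithm 1
            }
        }
    }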
   The formal contexts used for the benchmarks¹ are listed in Figure 1, and are either
obtained from the FCA Data Repository [4] (a to d, and f to p), randomly created
1
    Readers who are interested in the test contexts should send a mail to one of the authors.

                   Formal Context       Objects Attributes Density
                 a car.cxt               1728       25      28 %
                 b mushroom.cxt          8124      119      19 %
                 c tic-tac-toe.cxt        958       29      34 %
                 d wine.cxt               178       68      20 %
                 e algorithms.cxt        2688       54      22 %
                 f o1000a10d10.cxt       1000       10      10 %
                 g o1000a20d10.cxt       1000       20      10 %
                 h o1000a36d17.cxt       1000       36      16 %
                 i o1000a49d14.cxt       1000       49      14 %
                 j o1000a50d10.cxt       1000       50      10 %
                 k o1000a64d12.cxt       1000       64      12 %
                 l o1000a81d11.cxt       1000       81      11 %
                 m o1000a100d10-001.cxt  1000      100      11 %
                 n o1000a100d10-002.cxt  1000      100      11 %
                 o o1000a100d10.cxt      1000      100      11 %
                 p o2000a81d11.cxt       2000       81      11 %
                 q 24.cxt                 17        26      51 %
                 r 35.cxt                 18        24      43 %
                 s 51.cxt                 26        17      76 %
                 t 54.cxt                 20        20      48 %
                 u 79.cxt                 25        26      68 %

                          Fig. 1. Formal Contexts in Benchmarks


(q to u), or created from experimental results (e). For each of them we executed
the implementation at least three times, and recorded the average computation times.
The experiments were performed on the following two systems:

Taurus (1 Node of Bull HPC-Cluster, ZIH)
    CPUs: 2x Intel Xeon E5-2690 with eight cores @ 2.9 GHz, RAM: 32 GB
Atlas (1 Node of Megware PC-Farm, ZIH)
    CPUs: 4x AMD Opteron 6274 with sixteen cores @ 2.2 GHz, RAM: 64 GB

   The benchmark results are displayed in Figure 2. The charts have both axes
logarithmically scaled, to emphasise the correlation between the execution times and the
number of available CPUs. We can see that the computation time is almost inverse linear
proportional to the number of available CPUs, provided that the context is large enough.
In this case there are enough candidates on each cardinality level for the computation
to be done in parallel. However, we shall note that there are some cases where the
computation times increase when utilising all available CPUs. We are currently not aware
of an explanation for this exception – maybe it is due to some technical details of the
platforms or the operating systems, e.g. some background tasks that are executed during
the benchmark, or overhead caused by thread maintenance. Note that we did not have full
system access during the experiments, but could only execute tasks by scheduling them
in a batch system. Additionally, for some of the test contexts only benchmarks for a large
number of CPUs could be performed, due to the time limitations on the test systems.
   Furthermore, we have performed the same benchmark with small-sized contexts
having at most 15 attributes. The computation times were far below one second. We
have noticed that there is a certain number of available CPUs for which there is no

[Fig. 2: log-log charts of the computation times against the number of available CPUs
for the test contexts a to u, measured on the systems Taurus and Atlas.]
                                                                              a
                                          g                                       a
                                 s                                                                                     g
                                          s         s
                            q                  g
                                               s        s    s                                                  s
                                                                 s s s ss
                                 q                                                                                          g
                                                    g
                            r                           g                                                              s
                            f                                                                                   q
                            t             q                                                                                               s
                                                             g                                                              s    g   s
                                 rf                              g                                                     q         s
                   1s




                                                                                                         1s




                                 t                                                                                                   g
                                                                      g g
                                               q                              gg                                r
                                          r         q                                                                                     g
                                                                          t                                     tf
                                          t             q                                                                   q
                                          f
                                                             q                    t
                                                r                                 q                                    r
                                                    r            q                                                     tf        q
                                                f                     q qq
                                                t   t                                                                       r
                                                    f   t
                                                        r                                                                            q    q
                                                        f
                                                             t
                                                             r                                                              tf   t
                                                             f   rf
                                                                 t
                                                                    rf rf r
                                                                    t     tf rf                                                  r
                                                                                                                                     r
                                                                                                                                 f        r
                                                                                                                                     tf   t
                                                                                                                                          f



                            1   2         4    8        16       32           64                                1      2    4    8        16
                                       Number of CPUs                                                                Number of CPUs
                                      Fig. 2. Benchmark Results (left: Atlas, right: Taurus)

[Fig. 3. Performance Comparison. Log-scale plot of computation time (1 s – 1 h) per test context for NextClosure (1 CPU) and for NextClosures with 1, 2, and 4 CPUs.]


further increase in speed of the algorithm. This happens when the number of candidates
is smaller than the number of available CPUs.
   Finally, we compared our two implementations of NextClosure and NextClosures
when only one CPU is utilised. The comparison was performed on a notebook with an
Intel Core i7-3720QM CPU with four cores @ 2.6 GHz and 8 GB RAM. The results
are shown in Figure 3. We conclude that our proposed algorithm is on average as fast
as NextClosure on the test contexts. The computation time ratio is between 1/3 and 3,
depending on the specific context. Low or no speedups are expected for formal contexts
where NextClosure does not have to backtrack, and hence can find the next intent or
pseudo-intent immediately.


5                      Conclusion
In this paper we have introduced the parallel algorithm NextClosures for the computa-
tion of the canonical base. It constructs the lattice of all intents and pseudo-intents of a
given formal context from bottom to top in a level-wise order w.r.t. increasing cardinality.
As the elements in a certain level of this lattice can be computed independently, they
can also be enumerated in parallel, thus yielding a parallel algorithm for computing the
canonical base. Indeed, first benchmarks show that NextClosures allows for a speedup
that is proportional to the number of available CPUs, up to a certain natural limit.
Furthermore, we have compared its performance to the well-known algorithm NextClo-
sure when utilising only one CPU. It could be observed that on average our algorithm
(on one CPU) has the same performance as NextClosure, at least for the test contexts.
   So far we have only introduced the core idea of the algorithm, but it should be clear
that certain extensions are possible. For example, it is not hard to see that our parallel
algorithm can be extended to also handle background knowledge given as a set of impli-
cations or as a constraint closure operator [1]. To enable attribute exploration, our
algorithm can also be extended to include expert interaction for the exploration of the
canonical base of partially known contexts, much in the same way as the classical algorithm.
One benefit is the possibility of having several experts answer questions in parallel. Another
advantage is the steady increase in the difficulty of the questions (i.e., premise cardinality),
compared to the questions posed by default attribute exploration in lectic order.
Those extensions have not been presented here due to a lack of space, but we shall present
them in a future publication. Meanwhile, they can be found in a technical report [9].
Acknowledgements. The authors thank Bernhard Ganter for helpful hints on optimal
formal contexts for his NextClosure algorithm. Furthermore, the authors thank the
anonymous reviewers for their constructive comments.
   The benchmarks were performed on servers at the Institute of Theoretical Computer
Science, and the Centre for Information Services and High Performance Computing
(ZIH) at TU Dresden. We thank them for their generous allocations of computer time.

References
 [1]   Radim Belohlávek and Vilém Vychodil. “Formal Concept Analysis with Constraints
       by Closure Operators”. In: Conceptual Structures: Inspiration and Application, 14th
       International Conference on Conceptual Structures, ICCS 2006, Aalborg, Denmark, July
       16-21, 2006, Proceedings. Ed. by Henrik Schärfe, Pascal Hitzler, and Peter Øhrstrøm.
       Vol. 4068. Lecture Notes in Computer Science. Springer, 2006, pp. 131–143.
 [2]   Felix Distel. “Hardness of Enumerating Pseudo-Intents in the Lectic Order”. In: Proceed-
       ings of the 8th International Conference on Formal Concept Analysis. (Agadir, Morocco).
       Ed. by Léonard Kwuida and Barış Sertkaya. Vol. 5986. Lecture Notes in Computer
       Science. Springer, 2010, pp. 124–137.
 [3]   Felix Distel and Barış Sertkaya. “On the Complexity of Enumerating Pseudo-Intents”.
       In: Discrete Applied Mathematics 159.6 (2011), pp. 450–466.
 [4]   FCA Data Repository. url: http://www.fcarepository.com.
 [5]   Huaiguo Fu and Engelbert Mephu Nguifo. “A Parallel Algorithm to Generate Formal
       Concepts for Large Data”. In: Proceedings of the Second International Conference on
       Formal Concept Analysis. (Sydney, Australia). Ed. by Peter W. Eklund. Vol. 2961.
       Lecture Notes in Computer Science. Springer, 2004, pp. 394–401.
 [6]   Bernhard Ganter. “Two Basic Algorithms in Concept Analysis”. In: Proceedings of the
       8th International Conference on Formal Concept Analysis. (Agadir, Morocco). Ed. by
       Léonard Kwuida and Barış Sertkaya. Vol. 5986. Lecture Notes in Computer Science.
       Springer, 2010, pp. 312–340.
 [7]   Bernhard Ganter and Rudolf Wille. Formal Concept Analysis: Mathematical Foundations.
       Springer, 1999.
 [8]   Francesco Kriegel. Concept Explorer FX. Software for Formal Concept Analysis. 2010-
       2015. url: https://github.com/francesco-kriegel/conexp-fx.
 [9]   Francesco Kriegel. NextClosures – Parallel Exploration of Constrained Closure Operators.
       LTCS-Report 15-01. Chair for Automata Theory, TU Dresden, 2015.
[10]   Sergei O. Kuznetsov. “On the Intractability of Computing the Duquenne-Guigues Base”.
       In: Journal of Universal Computer Science 10.8 (2004), pp. 927–933.
[11]   Christian Lindig. “Fast Concept Analysis”. In: Working with Conceptual Structures –
       Contributions to ICCS 2000. (Aachen, Germany). Ed. by Gerhard Stumme. Shaker
       Verlag, 2000, pp. 152–161.
[12]   Sergei A. Obiedkov and Vincent Duquenne. “Attribute-Incremental Construction of
       the Canonical Implication Basis”. In: Annals of Mathematics and Artificial Intelligence
       49.1-4 (2007), pp. 77–99.
[13]   Vilém Vychodil, Petr Krajča, and Jan Outrata. “Parallel Recursive Algorithm for FCA”.
       In: Proceedings of the 6th International Conference on Concept Lattices and Their
       Applications. Ed. by Radim Bělohlávek and Sergej O. Kuznetsov. Palacký University,
       Olomouc, 2008, pp. 71–82.
             Probabilistic Implicational Bases in FCA
             and Probabilistic Bases of GCIs in EL⊥

                                     Francesco Kriegel

             Institute for Theoretical Computer Science, TU Dresden, Germany
                             francesco.kriegel@tu-dresden.de
                        http://lat.inf.tu-dresden.de/˜francesco


        Abstract. A probabilistic formal context is a triadic context whose third dimen-
        sion is a set of worlds equipped with a probability measure. After a formal
        definition of this notion, this document introduces the probability of implications,
        and provides a construction for a base of implications whose probabilities satisfy
        a given lower threshold. A comparison between the confidence and the probability
        of implications is drawn, showing that the two measures neither coincide nor
        can be compared in general. Furthermore, the results are extended towards
        the light-weight description logic EL⊥ with probabilistic interpretations, and
        a method for computing a base of general concept inclusions whose probabilities
        fulfill a certain lower bound is proposed.

        Keywords: Formal Concept Analysis, Description Logics, Probabilistic Formal
        Context, Probabilistic Interpretation, Implication, General Concept Inclusion


 1 Introduction
Most data-sets from real-world applications contain errors and noise. Hence, special
techniques are necessary for mining them, in order to prevent the errors from being
reflected in the mined rules. This document focuses on rule mining; in particular, we
attempt to extract rules that are approximately valid in data-sets, or in families of
data-sets, respectively. There are at least two measures for the approximate soundness
of rules, namely confidence and probability. While confidence reflects the number of
counterexamples in a single data-set, probability reflects, roughly, the fraction of
data-sets in a data-set family that do not contain any counterexample. More specifically,
we consider implications in the formal concept analysis setting [7], and general concept
inclusions (GCIs) in the description logics setting [1] (in the light-weight description
logic EL⊥).
   Firstly, for axiomatizing rules from formal contexts that possibly contain wrong
incidences or lack some incidences, the notions of a partial implication (also called
association rule) and of confidence have been defined by Luxenburger in [12]. Further-
more, Luxenburger introduced a method for the computation of a base of all partial
implications holding in a formal context whose confidence is above a certain threshold.
In [2] Borchmann has extended these results to the description logic EL⊥ by adjusting
the notion of confidence to GCIs, and also gave a method for the construction of a base
of confident GCIs for an interpretation.
    Secondly, another perspective is a family of data-sets representing different views
 of the same domain, e.g., knowledge of different persons, or observations of an exper-
iment that has been repeated several times, since some effects could not be observed
in every case. In the field of formal concept analysis, Vityaev, Demin, and Ponomaryov
have introduced probabilistic extensions of formal contexts and of their formal concepts
and implications, and furthermore gave some methods for their computation, cf. [4].
In [9] the author has shown some methods for the computation of a base of GCIs in
probabilistic description logics where concept and role constructors are available to
express probability directly in the concept descriptions. Here, we want to use another
approach, and do not allow for probabilistic constructors, but define the notion of a
probability of general concept inclusions in the light-weight description logic EL⊥ .
Furthermore, we provide a method for the computation of a base of GCIs satisfying
a certain lower threshold for the probability. More specifically, we use the description
logic EL⊥ with probabilistic interpretations that have been introduced by Lutz and
Schröder in [11]. Beforehand, we consider only conjunctions in the language of formal
concept analysis, define the notion of a probabilistic formal context in a more general
form than in [4], and provide a technique for the computation of a base of implications
satisfying a given lower probability threshold.
   The document is structured as follows. In Section 2 some basic notions for probabilis-
tic extensions of formal concept analysis are defined. Then, in Section 3 a method for the
computation of a base for all implications satisfying a given lower probability threshold
in a probabilistic formal context is developed, and its correctness is proven. The fol-
lowing sections extend the results to the description logic EL⊥ . In particular, Section 4
introduces the basic notions for EL⊥ , and defines probabilistic interpretations. Section 5
shows a technique for the construction of a base of GCIs holding in a probabilistic
interpretation and fulfilling a lower probability threshold. Furthermore, a comparison
of the notions of confidence and probability is drawn at the end of Section 3.


2 Probabilistic Formal Concept Analysis

A probability measure P on a countable set W is a mapping P : 2^W → [0, 1] such that
P(∅) = 0 and P(W) = 1 hold, and P is σ-additive, i.e., for all pairwise disjoint
countable families (U_n)_{n∈ℕ} with U_n ⊆ W it holds that P(⋃_{n∈ℕ} U_n) = ∑_{n∈ℕ} P(U_n).
A world w ∈ W is possible if P{w} > 0 holds, and impossible otherwise. The set of
all possible worlds is denoted by W_ε, and the set of all impossible worlds is denoted
by W_0. Obviously, W_ε ⊎ W_0 is a partition of W.
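To make the measure-theoretic setup concrete, here is a minimal Python sketch (an illustration with assumed names, not taken from the paper) of a finitely supported probability measure and the induced split of W into possible and impossible worlds:

    # A probability measure on a finite set of worlds, given by the
    # weights P{w} of the singletons (toy values, assumed for illustration).
    P = {"w1": 0.5, "w2": 0.5, "w3": 0.0}

    def measure(U):
        # P(U) for a set of worlds U, by sigma-additivity over singletons
        return sum(P[w] for w in U)

    possible_worlds = {w for w, p in P.items() if p > 0}    # W_eps
    impossible_worlds = set(P) - possible_worlds            # W_0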

Definition 1 (Probabilistic Formal Context). A probabilistic formal context K is a
tuple (G, M, W, I, P) that consists of a set G of objects, a set M of attributes, a countable set
W of worlds, an incidence relation I ⊆ G × M × W, and a probability measure P on W.
For a triple (g, m, w) ∈ I we say that object g has attribute m in world w. Furthermore, we
define the derivations in world w as operators ·^{I_w} : 2^G → 2^M and ·^{I_w} : 2^M → 2^G where

                           A^{I_w} := { m ∈ M | ∀g ∈ A : (g, m, w) ∈ I }
                           B^{I_w} := { g ∈ G | ∀m ∈ B : (g, m, w) ∈ I }

for object sets A ⊆ G and attribute sets B ⊆ M, i.e., A^{I_w} is the set of all common attributes of
all objects in A in the world w, and B^{I_w} is the set of all objects that have all attributes in B in w.

Definition 2 (Implication, Probability). Let K = (G, M, W, I, P) be a probabilistic
formal context. For attribute sets X, Y ⊆ M we call X → Y an implication over M, and its
probability in K is defined as the measure of the set of worlds it holds in, i.e.,

                        P(X → Y) := P{ w ∈ W | X^{I_w} ⊆ Y^{I_w} }.

Furthermore, we define the following properties for implications X → Y:
 1. X → Y holds in world w of K if X^{I_w} ⊆ Y^{I_w} is satisfied.
 2. X → Y certainly holds in K if it holds in all worlds of K.
 3. X → Y almost certainly holds in K if it holds in all possible worlds of K.
 4. X → Y possibly holds in K if it holds in a possible world of K.
 5. X → Y is impossible in K if it does not hold in any possible world of K.
 6. X → Y is refuted by K if it does not hold in any world of K.
  It is readily verified that P(X → Y) = P{ w ∈ W_ε | X^{I_w} ⊆ Y^{I_w} } = ∑{ P{w} | w ∈
W_ε and X^{I_w} ⊆ Y^{I_w} }. An implication X → Y almost certainly holds if P(X → Y) = 1,
possibly holds if P(X → Y) > 0, and is impossible if P(X → Y) = 0. If X → Y
certainly holds, then it is almost certain, and if X → Y is refuted, then it is impossible.
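Continuing the sketch above (assumed names), the probability of an implication is simply the total weight of the worlds in which it holds:

    # Probability of an implication X -> Y, following Definition 2;
    # reuses G, M, W, I and extent from the previous sketch.
    P = {"w1": 0.5, "w2": 0.5}

    def probability(X, Y):
        return sum(P[w] for w in W if extent(X, w) <= extent(Y, w))

    # For this context: probability({"m2"}, {"m1"}) == 0.5, since the
    # implication {m2} -> {m1} holds in world w2 only.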

3 Probabilistic Implicational Bases
At first we introduce the notion of a probabilistic implicational base. Then we will
develop and prove a construction for such bases w.r.t. probabilistic formal contexts. If
the underlying context is finite, then the base is computable. We assume that the reader
is familiar with the standard notions of formal concept analysis, cf. [7]. Recall that an
implication follows from an implication set if, and only if, it can be syntactically deduced
using the so-called Armstrong rules: 1. From X ⊇ Y infer X → Y. 2. From X → Y
and Y → Z infer X → Z. 3. From X_1 → Y_1 and X_2 → Y_2 infer X_1 ∪ X_2 → Y_1 ∪ Y_2.
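Entailment w.r.t. the Armstrong rules need not be decided by enumerating derivations: X → Y follows from an implication set B if, and only if, Y is contained in the closure of X under B. A minimal Python sketch of this standard test (assumed encoding of B as a list of (premise, conclusion) pairs of sets):

    def closure(X, B):
        # Smallest superset of X closed under all implications in B.
        X = set(X)
        changed = True
        while changed:
            changed = False
            for premise, conclusion in B:
                if premise <= X and not conclusion <= X:
                    X |= conclusion
                    changed = True
        return X

    def follows(X, Y, B):
        # X -> Y is derivable from B iff Y is a subset of closure(X, B).
        return set(Y) <= closure(X, B)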
Definition 3 (Probabilistic Implicational Base). Let K = (G, M, W, I, P) be a proba-
bilistic formal context, and p ∈ [0, 1] a threshold. A probabilistic implicational base for K
and p is an implication set B over M that satisfies the following properties:
 1. B is sound for K and p, i.e., P(X → Y) ≥ p holds for all implications X → Y ∈ B, and
 2. B is complete for K and p, i.e., if P(X → Y) ≥ p, then X → Y follows from B.
A probabilistic implicational base is irredundant if none of its implications follows from the
others, and is minimal if it has minimal cardinality among all bases for K and p.
   It is readily verified that the above definition is a straightforward generalization of
implicational bases as defined in [7, Definition 37]; in particular, formal contexts coincide
with probabilistic formal contexts having only one possible world, and implications
holding in the formal context coincide with implications having probability 1.
   We now define a transformation from probabilistic formal contexts to formal contexts.
It makes it possible to decide whether an implication (almost) certainly holds, and furthermore it
can be utilized to construct an implicational base for the (almost) certain implications.
Definition 4 (Scaling). Let K be a probabilistic formal context. The certain scaling of K is
the formal context K^× := (G × W, M, I^×) where ((g, w), m) ∈ I^× iff (g, m, w) ∈ I, and
the almost certain scaling of K is the subcontext K^×_ε := (G × W_ε, M, I^×_ε) of K^×.

Lemma 5. Let K = (G, M, W, I, P) be a probabilistic formal context, and let X → Y be a
formal implication. Then the following statements are satisfied:
 1. X → Y certainly holds in K if, and only if, X → Y holds in K^×.
 2. X → Y almost certainly holds in K if, and only if, X → Y holds in K^×_ε.

Proof. It is readily verified that the following equivalences hold:

      P(X → Y) = 1 ⇔ ∀w ∈ W : X^{I_w} ⊆ Y^{I_w}
                   ⇔ X^{I^×} = ⊎_{w∈W} X^{I_w} × {w} ⊆ ⊎_{w∈W} Y^{I_w} × {w} = Y^{I^×}
                   ⇔ K^× |= X → Y.

The second statement can be proven analogously.                                              ⊓⊔
  Recall the notion of a pseudo-intent [6–8]: An attribute set P ⊆ M of a formal context
(G, M, I) is a pseudo-intent if P ≠ P^{II}, and Q^{II} ⊆ P holds for all pseudo-intents Q ⊊ P.
Furthermore, it is well-known that the canonical implicational base of a formal context
(G, M, I) consists of all implications P → P^{II} where P is a pseudo-intent, cf. [6–8].
Consequently, the next corollary is an immediate consequence of Lemma 5.

Corollary 6. Let K be a probabilistic formal context. Then the following statements hold:
 1. An implicational base for K^× is an implicational base for the certain implications of K,
    in particular this holds for the following implication set:

                           B_K := { P → P^{I^×I^×} | P ∈ PsInt(K^×) }.

 2. An implicational base for K^×_ε w.r.t. the background knowledge B_K is an implicational
    base for the almost certain implications of K, in particular this holds for the following
    implication set:

                     B_{K,1} := B_K ∪ { P → P^{I^×_ε I^×_ε} | P ∈ PsInt(K^×_ε, B_K) }.

Lemma 7. Let K = (G, M, W, I, P) be a probabilistic formal context. Then the following
statements are satisfied:
 1. Y ⊆ X implies that X → Y certainly holds in K.
 2. X_1 ⊆ X_2 and Y_1 ⊇ Y_2 imply P(X_1 → Y_1) ≤ P(X_2 → Y_2).
 3. X_0 ⊆ X_1 ⊆ … ⊆ X_n implies P(X_0 → X_n) ≤ ⋀_{i=1}^{n} P(X_{i−1} → X_i).

Proof. 1. If Y ⊆ X, then X^{I_w} ⊆ Y^{I_w} follows for all worlds w ∈ W.
 2. Assume X_1 ⊆ X_2 and Y_2 ⊆ Y_1. Then X_1^{I_w} ⊇ X_2^{I_w} and Y_2^{I_w} ⊇ Y_1^{I_w} follow for all
    worlds w ∈ W. Consider a world w ∈ W where X_1^{I_w} ⊆ Y_1^{I_w}. Of course, we may
    conclude that X_2^{I_w} ⊆ Y_2^{I_w}. As a consequence we get P(X_1 → Y_1) ≤ P(X_2 → Y_2).
 3. We prove the third claim by induction on n. For n = 0 there is nothing to show,
    and the case n = 1 is trivial. Hence, consider n = 2 for the induction base,
    and let X_0 ⊆ X_1 ⊆ X_2. Then we have that X_0^{I_w} ⊇ X_1^{I_w} ⊇ X_2^{I_w} is satisfied in
    all worlds w ∈ W. Now consider a world w ∈ W where X_0^{I_w} ⊆ X_2^{I_w} is true.
    Of course, it then follows that X_0^{I_w} ⊆ X_1^{I_w} ⊆ X_2^{I_w}. Consequently, we conclude
    P(X_0 → X_2) ≤ P(X_0 → X_1) and P(X_0 → X_2) ≤ P(X_1 → X_2).
    For the induction step let n > 2. The induction hypothesis yields that

                           P(X_0 → X_{n−1}) ≤ ⋀_{i=1}^{n−1} P(X_{i−1} → X_i).

    Of course, it also holds that X_0 ⊆ X_{n−1} ⊆ X_n, and it follows by induction
    hypothesis and the previous inequality that

        P(X_0 → X_n) ≤ P(X_0 → X_{n−1}) ∧ P(X_{n−1} → X_n) ≤ ⋀_{i=1}^{n} P(X_{i−1} → X_i).   ⊓⊔

Lemma 8. Let K = (G, M, W, I, P) be a probabilistic formal context. Then for all implica-
tions X → Y the following equalities are valid:

                 P(X → Y) = P(X^{I^×I^×} → Y^{I^×I^×}) = P(X^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε}).

Proof. Let X → Y be an implication. Then for all worlds w ∈ W it holds that

  g ∈ X^{I_w} ⇔ ∀m ∈ X : (g, m, w) ∈ I ⇔ ∀m ∈ X : ((g, w), m) ∈ I^× ⇔ (g, w) ∈ X^{I^×},

and we conclude that X^{I_w} = π_1(X^{I^×} ∩ (G × {w})). Furthermore, we then infer
X^{I_w} = X^{I^×I^×I_w}, and thus the following equations hold:

        P(X → Y) = P{ w ∈ W | X^{I_w} ⊆ Y^{I_w} }
                 = P{ w ∈ W | X^{I^×I^×I_w} ⊆ Y^{I^×I^×I_w} } = P(X^{I^×I^×} → Y^{I^×I^×}).

In particular, for all possible worlds w ∈ W_ε it holds that g ∈ X^{I_w} ⇔ (g, w) ∈ X^{I^×_ε}, and
thus X^{I_w} = π_1(X^{I^×_ε} ∩ (G × {w})) and X^{I_w} = X^{I^×_ε I^×_ε I_w} are satisfied. Consequently,
it may be concluded that P(X → Y) = P(X^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε}).   ⊓⊔
Lemma 9. Let K be a probabilistic formal context. Then the following statements hold:
 1. If B is an implicational base for the certain implications of K, then the implication X → Y
    follows from B ∪ { X^{I^×I^×} → Y^{I^×I^×} }.
 2. If B is an implicational base for the almost certain implications of K, then the implication
    X → Y follows from B ∪ { X^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε} }.

Proof. Of course, the implication X → X^{I^×I^×} holds in K^×, i.e., certainly holds in K
by Lemma 5, and hence follows from B. Thus, the implication X → Y^{I^×I^×} is entailed
by B ∪ { X^{I^×I^×} → Y^{I^×I^×} }, and because of Y ⊆ Y^{I^×I^×} the claim follows.
   The second statement follows analogously.   ⊓⊔
Lemma 10. Let K be a probabilistic formal context. Then the following statements hold:
 1. P(X^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε}) = P(X^{I^×_ε I^×_ε} → (X ∪ Y)^{I^×_ε I^×_ε}),
 2. (X ∪ Y)^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε} certainly holds in K, and
 3. X^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε} is entailed by { X^{I^×_ε I^×_ε} → (X ∪ Y)^{I^×_ε I^×_ε}, (X ∪ Y)^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε} }.

[Diagram: X^{I^×_ε I^×_ε} and Y^{I^×_ε I^×_ε} both lie below (X ∪ Y)^{I^×_ε I^×_ε}, the connecting implications being labelled with probability p.]

Proof. First note that (X^{I^×_ε I^×_ε} ∪ Y^{I^×_ε I^×_ε})^{I^×_ε I^×_ε} = (X ∪ Y)^{I^×_ε I^×_ε}. As Y^{I^×_ε I^×_ε} is a subset of
(X^{I^×_ε I^×_ε} ∪ Y^{I^×_ε I^×_ε})^{I^×_ε I^×_ε}, the implication (X ∪ Y)^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε} certainly holds in K, cf.
Statement 1 in Lemma 7.
   Furthermore, we have that X^{I_w} ⊆ Y^{I_w} if, and only if, X^{I_w} ⊆ X^{I_w} ∩ Y^{I_w} = (X ∪ Y)^{I_w}.
Hence, the implication X → Y has the same probability as X → X ∪ Y. Consequently,
we may conclude by means of Lemma 8 that

P(X^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε}) = P(X → Y) = P(X → X ∪ Y) = P(X^{I^×_ε I^×_ε} → (X ∪ Y)^{I^×_ε I^×_ε}).

Obviously, { X^{I^×_ε I^×_ε} → (X ∪ Y)^{I^×_ε I^×_ε}, (X ∪ Y)^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε} } entails X^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε}.   ⊓⊔

Lemma 11. Let K be a probabilistic formal context, and X, Y be intents of K^×_ε such that
X ⊆ Y and P(X → Y) ≥ p. Then the following statements are true:

 1. There is a chain of neighboring intents X = X_0 ≺ X_1 ≺ X_2 ≺ … ≺ X_n = Y in K^×_ε,
 2. P(X_{i−1} → X_i) ≥ p for all i ∈ { 1, …, n }, and
 3. X → Y is entailed by { X_{i−1} → X_i | i ∈ { 1, …, n } }.

Proof. The existence of a chain X = X_0 ≺ X_1 ≺ X_2 ≺ … ≺ X_{n−1} ≺ X_n = Y of
neighboring intents between X and Y in K^×_ε follows from X ⊆ Y.
   From Statement 3 in Lemma 7 it follows that all implications X_{i−1} → X_i have a
probability of at least p in K. It is trivial that they entail X → Y.   ⊓⊔

Theorem 12 (Probabilistic Implicational Base). Let K be a probabilistic formal context,
and p ∈ [0, 1) a probability threshold. Then the following implication set is a probabilistic
implicational base for K and p:

      B_{K,p} := B_{K,1} ∪ { X → Y | X, Y ∈ Int(K^×_ε) and X ≺ Y and P(X → Y) ≥ p }.

Proof. All implications in B_{K,1} hold almost certainly in K, and thus have probability 1.
By construction, all other implications X → Y in the second subset have a probability
≥ p. Hence, Statement 1 in Definition 3 is satisfied.
   Now consider an implication X → Y over M such that P(X → Y) ≥ p. We have
to prove Statement 2 of Definition 3, i.e., that X → Y is entailed by B_{K,p}.
   Lemma 8 yields that both implications X → Y and X^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε} have the same
probability. Lemma 9 states that X → Y follows from B_{K,1} ∪ { X^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε} }.
According to Lemma 10, the implication X^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε} follows from { X^{I^×_ε I^×_ε} →
(X ∪ Y)^{I^×_ε I^×_ε}, (X ∪ Y)^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε} }. Furthermore, it holds that

          P(X^{I^×_ε I^×_ε} → (X ∪ Y)^{I^×_ε I^×_ε}) = P(X^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε}) = P(X → Y) ≥ p,

and the second implication (X ∪ Y)^{I^×_ε I^×_ε} → Y^{I^×_ε I^×_ε} certainly holds, i.e., follows from
B_{K,1}. Finally, Lemma 11 states that there is a chain of neighboring intents of K^×_ε
starting at X^{I^×_ε I^×_ε} and ending at (X ∪ Y)^{I^×_ε I^×_ε}, i.e.,

           X^{I^×_ε I^×_ε} = X_0^{I^×_ε I^×_ε} ≺ X_1^{I^×_ε I^×_ε} ≺ X_2^{I^×_ε I^×_ε} ≺ … ≺ X_n^{I^×_ε I^×_ε} = (X ∪ Y)^{I^×_ε I^×_ε},

such that all implications X_{i−1}^{I^×_ε I^×_ε} → X_i^{I^×_ε I^×_ε} have a probability ≥ p, and are thus con-
tained in B_{K,p}. Hence, B_{K,p} entails the implication X → Y.   ⊓⊔
Corollary 13. Let K be a probabilistic formal context. Then the following set is an implica-
tional base for the possible implications of K:

     B_{K,ε} := B_{K,1} ∪ { X → Y | X, Y ∈ Int(K^×_ε) and X ≺ Y and P(X → Y) > 0 }.

  However, it is not possible to show irredundancy or minimality for the base of
probabilistic implications given above in Theorem 12. Consider the probabilistic formal
context K = ({ g1, g2 }, { m1, m2 }, { w1, w2 }, I, { { w1 } ↦ 1/2, { w2 } ↦ 1/2 }) whose
incidence relation I is defined as follows:

     w1 | m1 m2        w2 | m1 m2
     g1 |  ×  ×        g1 |  ×  ×
     g2 |     ×        g2 |  ×  ×

The only pseudo-intent of K^× is ∅, and the concept lattice of K^× consists of the two
formal concepts (G × W, { m2 }) and ({ (g1, w1), (g1, w2), (g2, w2) }, { m1, m2 }).
Hence, we have the following probabilistic implicational base for p = 1/2:

                     B_{K,1/2} = { ∅ → { m2 }, { m2 } → { m1, m2 } }.

However, the set B := { ∅ → { m1, m2 } } is also a probabilistic implicational base for
K and 1/2 with fewer elements.
   In order to compute a minimal base for the implications holding in a probabilistic
formal context with a probability ≥ p, one can, for example, determine the probabilistic
base given above and minimize it by constructing the Duquenne-Guigues base of it. This
requires either the transformation of the implication set into a formal context that has
this implication set as an implicational base, or the direct computation of all pseudo-
closures of the closure operator induced by the (probabilistic) implicational base, as in
the sketch below.
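A brute-force Python sketch of the second route (assumed names; exponential in |M| and hence only usable for tiny attribute sets): enumerate the pseudo-closed sets of the closure operator induced by an implication list L, and emit the Duquenne-Guigues base, reusing closure from the earlier sketch.

    from itertools import chain, combinations

    def dg_base(L, M):
        cl = lambda X: frozenset(closure(X, L))
        subsets = sorted((frozenset(s) for s in chain.from_iterable(
                              combinations(sorted(M), k)
                              for k in range(len(M) + 1))), key=len)
        pseudo = []
        for Q in subsets:           # by increasing cardinality
            if Q != cl(Q) and all(cl(R) <= Q for R in pseudo if R < Q):
                pseudo.append(Q)    # Q is pseudo-closed
        return [(set(Q), set(cl(Q))) for Q in pseudo]

For L = [(set(), {"m2"}), ({"m2"}, {"m1", "m2"})], i.e., the base B_{K,1/2} above, this yields the single implication ∅ → { m1, m2 }, which is exactly the smaller base B.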
   Recall that the confidence of an implication X → Y in a formal context (G, M, I)
is defined as conf(X → Y) := |(X ∪ Y)^I| / |X^I|, cf. [12]. In general, there is no corre-
spondence between the probability of an implication in K and its confidence in K^×
or K^×_ε. To prove this we will provide two counterexamples. As first counterexample
we consider the context K above. It is readily verified that P({ m2 } → { m1 }) = 1/2
and conf({ m2 } → { m1 }) = 3/4, i.e., the confidence is greater than the probability.
Furthermore, consider the following modification of K as second counterexample:

                                w1 | m1 m2           w2 | m1 m2
                                g1 |  ×  ×           g1 |     ×
                                g2 |                 g2 |     ×

Then we have that P({ m2 } → { m1 }) = 1/2 and conf({ m2 } → { m1 }) = 1/3, i.e., the
confidence is smaller than the probability.
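Both numbers of the first counterexample can be reproduced with the sketches above (assumed names); confidence is evaluated in the certain scaling K^×:

    # Confidence of X -> Y in the certain scaling, reusing extent_x and
    # probability from the earlier sketches.
    def confidence(X, Y):
        return len(extent_x(X | Y)) / len(extent_x(X))

    # probability({"m2"}, {"m1"}) == 0.5
    # confidence({"m2"}, {"m1"}) == 0.75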
200      Francesco Kriegel


4 The Description Logic EL⊥ and Probabilistic Interpretations

This section gives a brief overview of the light-weight description logic EL⊥ [1]. First,
assume that (N_C, N_R) is a signature, i.e., N_C is a set of concept names, and N_R is a
set of role names, respectively. Then EL⊥-concept descriptions C over (N_C, N_R) may be
constructed according to the following inductive rule (where A ∈ N_C and r ∈ N_R):

                             C ::= ⊥ | ⊤ | A | C ⊓ C | ∃r.C.

We shall denote the set of all EL⊥-concept descriptions over (N_C, N_R) by EL⊥(N_C, N_R).
Second, the semantics of EL⊥-concept descriptions is defined by means of interpre-
tations: An interpretation is a tuple I = (∆^I, ·^I) that consists of a set ∆^I, called domain,
and an extension function ·^I : N_C ∪ N_R → 2^{∆^I} ∪ 2^{∆^I × ∆^I} that maps concept names
A ∈ N_C to subsets A^I ⊆ ∆^I and role names r ∈ N_R to binary relations r^I ⊆ ∆^I × ∆^I.
The extension function is extended to all EL⊥-concept descriptions as follows:

                      ⊥^I := ∅,
                      ⊤^I := ∆^I,
               (C ⊓ D)^I := C^I ∩ D^I,
                (∃r.C)^I := { d ∈ ∆^I | ∃e ∈ ∆^I : (d, e) ∈ r^I and e ∈ C^I }.

A general concept inclusion (GCI) in EL⊥ is of the form C ⊑ D where C and D are EL⊥-
concept descriptions. It holds in an interpretation I if C^I ⊆ D^I is satisfied, and we then
also write I |= C ⊑ D, and say that I is a model of C ⊑ D. Furthermore, C is subsumed
by D if C ⊑ D holds in all interpretations, and we shall denote this by C ⊑ D, too. A
TBox is a set of GCIs, and a model of a TBox is a model of all its GCIs. A TBox T entails
a GCI C ⊑ D, denoted by T |= C ⊑ D, if every model of T is a model of C ⊑ D.
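For readers who prefer running code, here is a compact Python sketch of the extension function under an assumed encoding of concept descriptions as nested tuples (not from the paper):

    # Concept descriptions encoded as: "bot", "top", a concept name,
    # ("and", C, D), or ("ex", r, C); ext maps concept names to sets of
    # domain elements and role names to sets of pairs.
    def extension(C, domain, ext):
        if C == "bot":
            return set()
        if C == "top":
            return set(domain)
        if isinstance(C, str):              # concept name A
            return set(ext[C])
        if C[0] == "and":                   # (C1 ⊓ C2)^I = C1^I ∩ C2^I
            return extension(C[1], domain, ext) & extension(C[2], domain, ext)
        if C[0] == "ex":                    # (∃ r. D)^I
            r, D = C[1], C[2]
            DD = extension(D, domain, ext)
            return {d for d in domain if any((d, e) in ext[r] for e in DD)}

    def holds(C, D, domain, ext):
        # The GCI C ⊑ D holds in the interpretation (domain, ext).
        return extension(C, domain, ext) <= extension(D, domain, ext)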
   To introduce probability into the description logic EL⊥ , we now present the notion
of a probabilistic interpretation from [11]. It is simply a family of interpretations over
the same domain and the same signature, indexed by a set of worlds that is equipped
with a probability measure.
Definition 14 (Probabilistic Interpretation, [11]). Let (N_C, N_R) be a signature. A prob-
abilistic interpretation I is a tuple (∆^I, (·^{I_w})_{w∈W}, W, P) consisting of a set ∆^I, called
domain, a countable set W of worlds, a probability measure P on W, and an extension
function ·^{I_w} for each world w ∈ W, i.e., (∆^I, ·^{I_w}) is an interpretation for each w ∈ W.
  For a general concept inclusion C ⊑ D its probability in I is defined as follows:

                         P(C ⊑ D) := P{ w ∈ W | C^{I_w} ⊆ D^{I_w} }.

Furthermore, for a GCI C ⊑ D we define the following properties (as for probabilistic formal
contexts): 1. C ⊑ D holds in world w if C^{I_w} ⊆ D^{I_w}. 2. C ⊑ D certainly holds in I if it
holds in all worlds. 3. C ⊑ D almost certainly holds in I if it holds in all possible worlds.
4. C ⊑ D possibly holds in I if it holds in a possible world. 5. C ⊑ D is impossible in
I if it does not hold in any possible world. 6. C ⊑ D is refuted by I if it does not hold in
any world.

 It is readily verified that P(C ⊑ D) = P{ w ∈ W_ε | C^{I_w} ⊆ D^{I_w} } = ∑{ P{w} | w ∈
W_ε and C^{I_w} ⊆ D^{I_w} } for all general concept inclusions C ⊑ D.

5 Probabilistic Bases of GCIs
In the following we construct from a probabilistic interpretation I a base of GCIs that
entails all GCIs whose probability w.r.t. I is at least a given threshold p.
Definition 15 (Probabilistic Base). Let I be a probabilistic interpretation, and p ∈ [0, 1]
a threshold. A probabilistic base of GCIs for I and p is a TBox B that satisfies the following
conditions:
 1. B is sound for I and p, i.e., P(C ⊑ D) ≥ p for all GCIs C ⊑ D ∈ B, and
 2. B is complete for I and p, i.e., if P(C ⊑ D) ≥ p, then B |= C ⊑ D.
A probabilistic base B is irredundant if none of its GCIs follows from the others, and is
minimal if it has minimal cardinality among all probabilistic bases for I and p.
   For a probabilistic interpretation I we define its certain scaling as the disjoint union
of all interpretations I_w with w ∈ W, i.e., as the interpretation I^× := (∆^I × W, ·^{I^×})
whose extension mapping is given as follows:

                   A^{I^×} := { (d, w) | d ∈ A^{I_w} }                        (A ∈ N_C),
                   r^{I^×} := { ((d, w), (e, w)) | (d, e) ∈ r^{I_w} }         (r ∈ N_R).

Furthermore, the almost certain scaling I^×_ε of I is the disjoint union of all interpretations
I_w where w ∈ W_ε is a possible world. Analogously to Lemma 5, a GCI C ⊑ D certainly
holds in I iff it holds in I^×, and almost certainly holds in I iff it holds in I^×_ε.
  In [5] the so-called model-based most-specific concept descriptions (mmscs) have been
defined w.r.t. greatest fixpoint semantics as follows: Let J be an interpretation, and X ⊆
∆^J. Then a concept description C is an mmsc of X in J if X ⊆ C^J is satisfied, and C ⊑
D for all concept descriptions D with X ⊆ D^J. It is easy to see that all mmscs of X are
unique up to equivalence, and hence we denote the mmsc of X in J by X^J. Please note
that there is also a role-depth bounded variant w.r.t. descriptive semantics given in [3].
Lemma 16. Let I be a probabilistic interpretation. Then the following statements hold:
 1. C^{I_w} × {w} = C^{I^×} ∩ (∆^I × {w}) for all concept descriptions C and worlds w ∈ W.
 2. C^{I_w} × {w} = C^{I^×_ε} ∩ (∆^I × {w}) for all concept descriptions C and possible worlds
    w ∈ W_ε.
 3. P(C ⊑ D) = P(C^{I^×I^×} ⊑ D^{I^×I^×}) = P(C^{I^×_ε I^×_ε} ⊑ D^{I^×_ε I^×_ε}) for all GCIs C ⊑ D.
Proof. 1. We prove the claim by structural induction on C. By definition, the statement
    holds for ⊥, ⊤, and all concept names A ∈ N_C. Consider a conjunction C ⊓ D, then

                        (C ⊓ D)^{I_w} × {w} = (C^{I_w} ∩ D^{I_w}) × {w}
                                            = (C^{I_w} × {w}) ∩ (D^{I_w} × {w})
                                            = C^{I^×} ∩ D^{I^×} ∩ (∆^I × {w})      (by induction hypothesis)
                                            = (C ⊓ D)^{I^×} ∩ (∆^I × {w}).

      For an existential restriction ∃r.C the following equalities hold:

                  (∃r.C)^{I_w} × {w}
            = { d ∈ ∆^I | ∃e ∈ ∆^I : (d, e) ∈ r^{I_w} and e ∈ C^{I_w} } × {w}
            = { (d, w) | ∃(e, w) : ((d, w), (e, w)) ∈ r^{I^×} and (e, w) ∈ C^{I_w} × {w} }
            = { (d, w) | ∃(e, w) : ((d, w), (e, w)) ∈ r^{I^×} and (e, w) ∈ C^{I^×} }      (by induction hypothesis)
            = (∃r.C)^{I^×} ∩ (∆^I × {w}).

 2. Analogously.
 3. Using the first statement we may conclude that the following equalities hold:

               P(C ⊑ D)
            = P{ w ∈ W | C^{I_w} × {w} ⊆ D^{I_w} × {w} }
            = P{ w ∈ W | C^{I^×} ∩ (∆^I × {w}) ⊆ D^{I^×} ∩ (∆^I × {w}) }
            = P{ w ∈ W | C^{I^×I^×I^×} ∩ (∆^I × {w}) ⊆ D^{I^×I^×I^×} ∩ (∆^I × {w}) }
            = P{ w ∈ W | C^{I^×I^×I_w} × {w} ⊆ D^{I^×I^×I_w} × {w} }
            = P(C^{I^×I^×} ⊑ D^{I^×I^×}).

      The second equality follows analogously.   ⊓⊔

  For a probabilistic interpretation I = (∆^I, (·^{I_w})_{w∈W}, W, P) and a set M of EL⊥-concept
descriptions we define their induced context as the probabilistic formal context K_{I,M} :=
(∆^I, M, W, I, P) where (d, C, w) ∈ I iff d ∈ C^{I_w}.

Lemma 17. Let I be a probabilistic interpretation, M a set of EL⊥-concept descriptions, and
X, Y ⊆ M. Then the probability of the implication X → Y in the induced context K_{I,M} equals
the probability of the GCI ⨅X ⊑ ⨅Y in I, i.e., it holds that P(X → Y) = P(⨅X ⊑ ⨅Y).

Proof. The following equivalences are satisfied for all Z ⊆ M and worlds w ∈ W:

      d ∈ Z^{I_w} ⇔ ∀C ∈ Z : (d, C, w) ∈ I ⇔ ∀C ∈ Z : d ∈ C^{I_w} ⇔ d ∈ (⨅Z)^{I_w}.

Now consider two subsets X, Y ⊆ M, then it holds that

          P(X → Y) = P{ w ∈ W | X^{I_w} ⊆ Y^{I_w} }
                   = P{ w ∈ W | (⨅X)^{I_w} ⊆ (⨅Y)^{I_w} } = P(⨅X ⊑ ⨅Y).   ⊓⊔
  Analogously to [5], the context K_I is defined as K_{I,M_I} with the following attributes:

                     M_I := { ⊥ } ∪ N_C ∪ { ∃r.X^{I^×_ε} | ∅ ≠ X ⊆ ∆^I × W_ε }.

  For an implication set B over a set M of EL⊥-concept descriptions we define its
induced TBox by ⨅B := { ⨅X ⊑ ⨅Y | X → Y ∈ B }.
Corollary 18. If B contains an almost certain implicational base for K_I, then ⨅B is complete
for the almost certain GCIs of I.

Proof. We know that a GCI almost certainly holds in I if, and only if, it holds in
I^×_ε. Let B_0 ⊆ B be an almost certain implicational base for K_I, i.e., an implicational
base for (K_I)^×_ε = K_{I^×_ε}. Then according to Distel in [5, Theorem 5.12] it follows that
the TBox ⨅B_0 is a base of GCIs for I^×_ε, i.e., a base for the almost certain GCIs of I.
Consequently, ⨅B is complete for the almost certain GCIs of I.   ⊓⊔

Theorem 19. Let I be a probabilistic interpretation, and p ∈ [0, 1] a threshold. If B is a
probabilistic implicational base for K_I and p that contains an almost certain implicational base
for K_I, then ⨅B is a probabilistic base of GCIs for I and p.

Proof. Consider a GCI ⨅X ⊑ ⨅Y ∈ ⨅B. Then Lemma 17 yields that the implication
X → Y and the GCI ⨅X ⊑ ⨅Y have the same probability. Since B is a probabilistic
implicational base for K_I and p, we conclude that P(⨅X ⊑ ⨅Y) ≥ p is satisfied.
   Assume that C ⊑ D is an arbitrary GCI with probability ≥ p. We have to show
that ⨅B entails C ⊑ D. Let J be an arbitrary model of ⨅B. Consider an impli-
cation X → Y ∈ B, then ⨅X ⊑ ⨅Y ∈ ⨅B holds, and hence it follows that
(⨅X)^J ⊆ (⨅Y)^J. Consequently, the implication X → Y holds in the induced con-
text K_{J,M_I}. (We here mean the non-probabilistic formal context that is induced by
a non-probabilistic interpretation, cf. [2, 3, 5].)
   Furthermore, since all model-based most-specific concept descriptions of I^×_ε are
expressible in terms of M_I, we have that E ≡ ⨅π_{M_I}(E) holds for all mmscs E of I^×_ε,
cf. [2, 3, 5]. Hence, we may conclude that

                  P(C ⊑ D) = P(C^{I^×_ε I^×_ε} ⊑ D^{I^×_ε I^×_ε})
                           = P(⨅π_{M_I}(C^{I^×_ε I^×_ε}) ⊑ ⨅π_{M_I}(D^{I^×_ε I^×_ε}))
                           = P(π_{M_I}(C^{I^×_ε I^×_ε}) → π_{M_I}(D^{I^×_ε I^×_ε})).

Consequently, B entails the implication π_{M_I}(C^{I^×_ε I^×_ε}) → π_{M_I}(D^{I^×_ε I^×_ε}), hence it holds
in K_{J,M_I}, and furthermore the GCI C^{I^×_ε I^×_ε} ⊑ D^{I^×_ε I^×_ε} holds in J. As J is an arbitrary
model of ⨅B, ⨅B entails C^{I^×_ε I^×_ε} ⊑ D^{I^×_ε I^×_ε}.
   Corollary 18 yields that ⨅B is complete for the almost certain GCIs of I. In par-
ticular, the GCI C ⊑ C^{I^×_ε I^×_ε} almost certainly holds in I, and hence follows from ⨅B.
We conclude that ⨅B |= C ⊑ D^{I^×_ε I^×_ε}. Of course, the GCI D^{I^×_ε I^×_ε} ⊑ D holds in all
interpretations. Finally, we conclude that ⨅B entails C ⊑ D.   ⊓⊔

Corollary 20. Let I be a probabilistic interpretation, and p ∈ [0, 1] a threshold. Then
⨅B_{K_I,p} is a probabilistic base of GCIs for I and p, where B_{K_I,p} is defined as in Theorem 12.


6 Conclusion

We have introduced the notion of a probabilistic formal context as a triadic context
whose third dimension is a set of worlds equipped with a probability measure. Then
the probability of implications in such probabilistic formal contexts has been defined, a
construction of a base of implications whose probability meets a given lower threshold
has been proposed, and its correctness has been verified. Furthermore, the results have
been applied to the light-weight description logic EL⊥ with probabilistic interpretations,
yielding a method for the computation of a base of general concept inclusions whose
probability satisfies a given lower threshold.
  For finite input data-sets all of the provided constructions are computable. In partic-
ular, [3, 5] provide methods for the computation of model-based most-specific concept
descriptions, and the algorithms in [6, 10] can be utilized to compute concept lattices
and canonical implicational bases (or bases of GCIs, respectively).
  The author thanks Sebastian Rudolph for proof reading and a fruitful discussion,
and the anonymous reviewers for their constructive comments.

References
 [1]   Franz Baader et al., eds. The Description Logic Handbook: Theory, Implementation, and
       Applications. New York, NY, USA: Cambridge University Press, 2003.
 [2]   Daniel Borchmann. “Learning Terminological Knowledge with High Confidence from
       Erroneous Data”. PhD thesis. TU Dresden, Germany, 2014.
 [3]   Daniel Borchmann, Felix Distel, and Francesco Kriegel. Axiomatization of General Concept
       Inclusions from Finite Interpretations. LTCS-Report 15-13. Chair for Automata Theory,
       Institute for Theoretical Computer Science, TU Dresden, Germany, 2015.
 [4]   Alexander V. Demin, Denis K. Ponomaryov, and Evgenii Vityaev. “Probabilistic Concepts
       in Formal Contexts”. In: Perspectives of Systems Informatics - 8th International Andrei Ershov
       Memorial Conference, PSI 2011, Novosibirsk, Russia, June 27-July 1, 2011, Revised Selected
       Papers. Ed. by Edmund M. Clarke, Irina Virbitskaite, and Andrei Voronkov. Vol. 7162.
       Lecture Notes in Computer Science. Springer, 2011, pp. 394–410.
 [5]   Felix Distel. “Learning Description Logic Knowledge Bases from Data using Methods
       from Formal Concept Analysis”. PhD thesis. TU Dresden, Germany, 2011.
 [6]   Bernhard Ganter. “Two Basic Algorithms in Concept Analysis”. In: Formal Concept Analysis,
       8th International Conference, ICFCA 2010, Agadir, Morocco, March 15-18, 2010. Proceedings.
       Ed. by Léonard Kwuida and Baris Sertkaya. Vol. 5986. Lecture Notes in Computer Science.
       Springer, 2010, pp. 312–340.
 [7]   Bernhard Ganter and Rudolf Wille. Formal Concept Analysis - Mathematical Foundations.
       Springer, 1999.
 [8]   Jean-Luc Guigues and Vincent Duquenne. “Famille minimale d’implications informatives
       résultant d’un tableau de données binaires”. In: Mathématiques et Sciences Humaines 95
       (1986), pp. 5–18.
 [9]   Francesco Kriegel. “Axiomatization of General Concept Inclusions in Probabilistic Descrip-
       tion Logics”. In: Proceedings of the 38th German Conference on Artificial Intelligence, KI 2015,
       Dresden, Germany, September 21-25, 2015. Vol. 9324. Lecture Notes in Artificial Intelligence.
       Springer Verlag, 2015.
[10]   Francesco Kriegel. NextClosures – Parallel Exploration of Constrained Closure Operators. LTCS-
       Report 15-01. Chair for Automata Theory, Institute for Theoretical Computer Science, TU
       Dresden, Germany, 2015.
[11]   Carsten Lutz and Lutz Schröder. “Probabilistic Description Logics for Subjective Uncer-
       tainty”. In: Principles of Knowledge Representation and Reasoning: Proceedings of the Twelfth
       International Conference, KR 2010, Toronto, Ontario, Canada, May 9-13, 2010. Ed. by Fangzhen
       Lin, Ulrike Sattler, and Miroslaw Truszczynski. AAAI Press, 2010.
[12]   Michael Luxenburger. “Implikationen, Abhängigkeiten und Galois Abbildungen”. PhD
       thesis. TH Darmstadt, Germany, 1993.
      Category of isotone bonds between L-fuzzy contexts
           over different structures of truth degrees

                 Jan Konecny¹ and Ondrej Krídlo²

       ¹ Data Analysis and Modeling Lab, Dept. Computer Science,
  Palacky University, Olomouc, 17. listopadu 12, CZ-77146 Olomouc, Czech Republic
      ² Institute of Computer Science, Faculty of Science,
 Pavol Jozef Šafárik University in Košice, Jesenná 5, 040 01 Košice, Slovakia
          jan.konecny@upol.cz            ondrej.kridlo@upjs.sk




        Abstract. We describe properties of compositions of isotone bonds be-
        tween L-fuzzy contexts over different complete residuated lattices and
        we show that L-fuzzy contexts as objects and isotone bonds as arrows
        form a category.



 1    Introduction

 In Formal Concept Analysis, bonds represent relationships between formal contexts.
 One of the motivations for introducing this notion is to provide a tool for studying
 mappings between formal contexts which correspond to Galois connections between
 their concept lattices. The notions of bonds, scale measures, and infomorphisms were
 studied in [14], aiming at a thorough study of the theory of morphisms in FCA.
     In our previous works [12, 11], we studied generalizations of bonds to the L-fuzzy
 setting. In [13] we also provided a study of bonds between formal fuzzy contexts over
 different structures of truth degrees. The bonds were based on mappings between
 complete residuated lattices, called residuation-preserving Galois connections. These
 mappings were too strict, and in [9] we proposed to replace them by residuation-preserving
 (l, k)-connections or residuation-preserving dual (l, k)-connections between complete
 residuated lattices.
     In the present paper we continue our study [12] of properties of bonds between
 formal contexts over different structures of truth degrees; this time we are concerned
 with bonds mimicking isotone Galois connections between concept lattices formed by
 isotone concept-forming operators. In particular, we describe the category of formal
 fuzzy contexts and isotone bonds between them. The paper also extends [13, 9], as we
 consider a setting with fuzzy formal contexts over different complete residuated lattices.

    The structure of the paper is as follows. First, in Section 2 we recall basic notions
 required in the rest of the paper. Section 3.1 considers weak homogeneous L-bonds
 w.r.t. isotone concept-forming operators and their compositions. Section 3.2 then
 generalizes the results to the setting of formal fuzzy contexts over different structures
 of truth degrees. Finally, we summarize our results and outline our future research in
 this area in Section 4.


 2     Preliminaries

 2.1   Residuated lattices, fuzzy sets, and fuzzy relations

 We use complete residuated lattices as basic structures of truth degrees. A complete
 residuated lattice is a structure L = ⟨L, ∧, ∨, ⊗, →, 0, 1⟩ such that

  (i) ⟨L, ∧, ∨, 0, 1⟩ is a complete lattice, i.e. a partially ordered set in which arbitrary
      infima and suprema exist;
 (ii) ⟨L, ⊗, 1⟩ is a commutative monoid, i.e. ⊗ is a binary operation which is
      commutative, associative, and a ⊗ 1 = a for each a ∈ L;
(iii) ⊗ and → satisfy adjointness, i.e. a ⊗ b ≤ c iff a ≤ b → c.

 0 and 1 denote the least and greatest elements. The partial order of L is denoted
 by ≤. Throughout this work, L denotes an arbitrary complete residuated lattice.
     Elements a of L are called truth degrees. Operations ⊗ (multiplication) and
 → (residuum) play the role of (truth functions of) "fuzzy conjunction" and
 "fuzzy implication".
     An L-set (or L-fuzzy set) A in a universe set X is a mapping assigning to
 each x ∈ X some truth degree A(x) ∈ L. The set of all L-sets in a universe X is
 denoted L^X.
     The operations with L-sets are defined componentwise. For instance, the
 intersection of L-sets A, B ∈ L^X is an L-set A ∩ B in X such that (A ∩ B)(x) =
 A(x) ∧ B(x) for each x ∈ X, etc.
     An L-set A ∈ L^X is called crisp if A(x) ∈ {0, 1} for each x ∈ X. Crisp L-sets
 can be identified with ordinary sets. For a crisp A, we also write x ∈ A for
 A(x) = 1 and x ∉ A for A(x) = 0. An L-set A ∈ L^X is called empty (denoted
 by ∅) if A(x) = 0 for each x ∈ X. For a ∈ L and A ∈ L^X, the a-multiplication
 a ⊗ A and a-shift a → A are L-sets defined by

                                (a ⊗ A)(x) = a ⊗ A(x),
                                (a → A)(x) = a → A(x).

     Binary L-relations (binary L-fuzzy relations) between X and Y can be thought
 of as L-sets in the universe X × Y. That is, a binary L-relation I ∈ L^{X×Y} between
 a set X and a set Y is a mapping assigning to each x ∈ X and each y ∈ Y
 a truth degree I(x, y) ∈ L (a degree to which x and y are related by I).
     For an L-relation I ∈ L^{X×Y} we define its transpose as the L-relation I^T ∈
 L^{Y×X} given by I^T(y, x) = I(x, y) for each x ∈ X, y ∈ Y.
   Various composition operators for binary L-relations were extensively studied
in [6]; we will use the following composition operators, defined for relations
A ∈ L^{X×F} and B ∈ L^{F×Y}:

    (A ∘ B)(x, y) = ⋁_{f∈F} A(x, f) ⊗ B(f, y),                              (1)
    (A ◁ B)(x, y) = ⋀_{f∈F} B(f, y) → A(x, f).                              (2)

    Note also that for L = {0, 1}, A ∘ B coincides with the well-known composition
of binary relations.
    We will occasionally use some of the following properties concerning the
associativity and distributivity of these composition operators; see [2].

Theorem 1. The operator ∘ from above has the following properties concerning
composition.

 – Associativity:
       R ∘ (S ∘ T) = (R ∘ S) ∘ T.                                           (3)
 – Distributivity:
       (⋃_i R_i) ∘ S = ⋃_i (R_i ∘ S)   and   R ∘ (⋃_i S_i) = ⋃_i (R ∘ S_i).  (4)


2.2   Formal fuzzy concept analysis

An L-context is a triplet ⟨X, Y, I⟩ where X and Y are (ordinary nonempty) sets
and I ∈ L^{X×Y} is an L-relation between X and Y. Elements of X are called
objects, elements of Y are called attributes, and I is called an incidence relation.
I(x, y) = a is read: "The object x has the attribute y to degree a."
    Consider the following pair ⟨∩, ∪⟩ of operators ∩: L^X → L^Y and ∪: L^Y → L^X
induced by an L-context ⟨X, Y, I⟩:

    A^∩(y) = ⋁_{x∈X} A(x) ⊗ I(x, y),    B^∪(x) = ⋀_{y∈Y} I(x, y) → B(y)      (5)

for all A ∈ L^X and B ∈ L^Y. When we consider concept-forming operators induced
by multiple L-relations, we write the inducing L-relation as the subscript of the
symbols of the operators. For example, the pair of concept-forming operators
induced by an L-relation I is written as ⟨∩_I, ∪_I⟩.

Remark 1. Notice that the pair of concept-forming operators can be interpreted
as instances of the composition operators between relations. Applying the
isomorphisms L^{1×X} ≅ L^X and L^{Y×1} ≅ L^Y whenever necessary, one could write
them, alternatively, as

    A^∩ = A ∘ I    and    B^∪ = I ▷ B   (= B ◁ I^T).
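    As a quick illustration of (5), here is a hedged sketch reusing tensor and
residuum from the previous listing; the dictionary encoding of L-sets is again our
own choice, not the paper's.

    def iup(A, I, X, Y):
        # A^(cap)(y) = sup_{x in X} A[x] (*) I[x, y], cf. (5)
        return {y: max(tensor(A[x], I[x, y]) for x in X) for y in Y}

    def idown(B, I, X, Y):
        # B^(cup)(x) = inf_{y in Y} I[x, y] -> B[y], cf. (5)
        return {x: min(residuum(I[x, y], B[y]) for y in Y) for x in X}

    # A pair <A, B> is a fixed point, i.e. an attribute-oriented L-concept,
    # iff iup(A, I, X, Y) == B and idown(B, I, X, Y) == A.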
      Furthermore, denote the set of fixed points of ⟨∩, ∪⟩ by B^{∩∪}(X, Y, I), i.e.

    B^{∩∪}(X, Y, I) = {⟨A, B⟩ ∈ L^X × L^Y | A^∩ = B, B^∪ = A}.               (6)

The set of fixed points endowed with ≤, defined by

    ⟨A₁, B₁⟩ ≤ ⟨A₂, B₂⟩ if A₁ ⊆ A₂ (equivalently B₁ ⊆ B₂),

is a complete lattice [5], called an attribute-oriented L-concept lattice associated
with I, and its elements are called (attribute-oriented) formal L-concepts (or just
L-concepts). For thorough studies of attribute-oriented concept lattices, see [5,
7, 15]. In a formal concept ⟨A, B⟩, A is called an extent, and B is called an
intent. The set of all extents and the set of all intents are denoted by Ext^{∩∪} and
Int^{∩∪}, respectively. That is,

    Ext^{∩∪}(X, Y, I) = {A ∈ L^X | ⟨A, B⟩ ∈ B^{∩∪}(X, Y, I) for some B},
    Int^{∩∪}(X, Y, I) = {B ∈ L^Y | ⟨A, B⟩ ∈ B^{∩∪}(X, Y, I) for some A}.     (7)

Equivalently, we can characterize Ext^{∩∪}(X, Y, I) and Int^{∩∪}(X, Y, I) as follows:

    Ext^{∩∪}(X, Y, I) = {B^∪ | B ∈ L^Y},
    Int^{∩∪}(X, Y, I) = {A^∩ | A ∈ L^X}.                                     (8)
      We will need the following lemma from [4].

Lemma 1. Consider L-contexts ⟨X, Y, I⟩, ⟨X, F, A⟩, and ⟨F, Y, B⟩.
(a) Int^{∩∪}(X, Y, I) ⊆ Int^{∩∪}(F, Y, B) if and only if there exists A′ ∈ L^{X×F} such
    that I = A′ ∘ B;
(b) Ext^{∩∪}(X, Y, A ∘ B) ⊆ Ext^{∩∪}(X, F, A).

Definition 1. An L-relation β ∈ L^{X₁×Y₂} is called a homogeneous weak L-bond³
from L-context ⟨X₁, Y₁, I₁⟩ to L-context ⟨X₂, Y₂, I₂⟩ if

    Ext^{∩∪}(X₁, Y₂, β) ⊆ Ext^{∩∪}(X₁, Y₁, I₁),
    Int^{∩∪}(X₁, Y₂, β) ⊆ Int^{∩∪}(X₂, Y₂, I₂).                              (9)

   In this paper we consider only weak homogeneous L-bonds w.r.t. ⟨∩, ∪⟩. In
what follows, we omit the words 'weak homogeneous' and the pair of concept-
forming operators, and call them just 'L-bonds'.
   We will utilize the following characterization of L-bonds.

Lemma 2 ([7]). An L-relation β ∈ L^{X₁×Y₂} is an L-bond from ⟨X₁, Y₁, I₁⟩ to
⟨X₂, Y₂, I₂⟩ iff there is an L-relation S̃ such that β = S̃ ∘ I₂ and ∪_{S̃} maps extents
of B^{∩∪}(X₂, Y₂, I₂) to extents of B^{∩∪}(X₁, Y₁, I₁).

Remark 2. Note that, due to results on fuzzy relational equations, the L-relation
S̃ from Lemma 2 is equal to β ◁ I₂^T (see [2]).
³
    The notion of L-bond was introduced in [12]; however, we adapt its definition in the
    same way as in [8, 10] w.r.t. ⟨∩, ∪⟩.
3     Results

First, we describe compositions of L-bonds and show that they form a category.
Later we generalize the results to the setting of isotone bonds between fuzzy contexts
over different complete residuated lattices.

3.1     Setting with uniform structures of truth degrees

We start with the notion of composition of L-bonds.

Definition 2. Let β₁ be an L-bond from ⟨X₁, Y₁, I₁⟩ to ⟨X₂, Y₂, I₂⟩ and β₂ be
an L-bond from ⟨X₂, Y₂, I₂⟩ to ⟨X₃, Y₃, I₃⟩. Define the composition of β₁ and β₂ as
the L-relation (β₁ ◁ I₂^T) ∘ β₂ ∈ L^{X₁×Y₃} and denote it β₁ • β₂.
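    Definition 2 is directly executable with the relational operators sketched earlier;
a minimal rendering under our dictionary encoding (β₁ ∈ L^{X₁×Y₂}, I₂ ∈ L^{X₂×Y₂},
β₂ ∈ L^{X₂×Y₃}; all function names are illustrative):

    def transpose(I, X, Y):
        # I^T(y, x) = I(x, y)
        return {(y, x): I[x, y] for x in X for y in Y}

    def bond_composition(beta1, I2, beta2, X1, X2, Y2, Y3):
        # beta1 . beta2 = (beta1 <| I2^T) o beta2, cf. Definition 2;
        # the middle factor S is the relation S~ of Remark 2.
        S = subproduct(beta1, transpose(I2, X2, Y2), X1, Y2, X2)
        return circ(S, beta2, X1, X2, Y3)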
Theorem 2. The composition of L-bonds is an L-bond.

Proof. Let β₁ be an L-bond from ⟨X₁, Y₁, I₁⟩ to ⟨X₂, Y₂, I₂⟩ and β₂ be an L-bond
from ⟨X₂, Y₂, I₂⟩ to ⟨X₃, Y₃, I₃⟩. By Lemma 2 there are S̃ ∈ L^{X₁×X₂}, S̃′ ∈
L^{X₂×X₃} such that β₁ = S̃ ∘ I₂ and β₂ = S̃′ ∘ I₃. By Definition 2 and Remark 2 we
have

    β₁ • β₂ = (β₁ ◁ I₂^T) ∘ β₂ = S̃ ∘ S̃′ ∘ I₃.

Hence we have

    Int^{∩∪}(X₁, Y₃, β₁ • β₂) ⊆ Int^{∩∪}(X₃, Y₃, I₃)                        (10)

by Lemma 1 (a). Note that the mapping ∪_{S̃} maps extents of I₂ to extents of I₁
by Lemma 2 and that B^{∪_{β₂}} is an extent of I₂ for any B ∈ Int^{∩∪}(X₃, Y₃, I₃) by (8).
Thus we have

    B^{∪_{β₁•β₂}} = B^{∪_{β₂}∪_{S̃}} ∈ Ext^{∩∪}(X₁, Y₁, I₁),

hence

    Ext^{∩∪}(X₁, Y₃, β₁ • β₂) ⊆ Ext^{∩∪}(X₁, Y₁, I₁).                       (11)

The inclusions (10) and (11) imply that β₁ • β₂ is an L-bond.                ∎
Lemma 3. Let β be an L-bond from L-context ⟨X₁, Y₁, I₁⟩ to L-context ⟨X₂, Y₂, I₂⟩.
For any L-set A ∈ L^{X₁} we have that A^{∩_{I₁}∪_{I₁}∩_β} = A^{∩_β}.

Proof. Let A be an arbitrary L-set from L^{X₁}. Then

    A^{∩_{I₁}∪_{I₁}∩_β} ⊇ A^{∩_β}   (since (−)^{∩_β} is isotone and A^{∩_{I₁}∪_{I₁}} ⊇ A)
                        = A^{∩_β∪_β∩_β}
                        = A^{∩_β∪_β∩_{I₂}∪_{I₂}∩_β}   (due to the definition of L-bond)
                        ⊇ A^{∩_{I₁}∪_{I₁}∩_β}   (since the mapping (−)^{∩_{I₁}∪_{I₁}∩_β} is isotone).

Hence A^{∩_{I₁}∪_{I₁}∩_β} = A^{∩_β}.                                         ∎
The equality from Lemma 3, written in relational form, is A ∘ β = ((A ∘ I₁) ◁ I₁^T) ∘ β;
we use it to prove the following theorem.

Theorem 3. Composition of L-bonds is associative.

Proof. Let β₁ be an L-bond from ⟨X₁, Y₁, I₁⟩ to ⟨X₂, Y₂, I₂⟩, β₂ be an L-bond
from ⟨X₂, Y₂, I₂⟩ to ⟨X₃, Y₃, I₃⟩, and β₃ be an L-bond from ⟨X₃, Y₃, I₃⟩ to
⟨X₄, Y₄, I₄⟩. We have

    (β₁ • β₂) • β₃ = (((β₁ ◁ I₂^T) ∘ β₂) ◁ I₃^T) ∘ β₃      (by Definition 2)
                   = ((S̃ ∘ β₂) ◁ I₃^T) ∘ β₃               (by Remark 2)
                   = ((S̃ ∘ (S̃′ ∘ I₃)) ◁ I₃^T) ∘ β₃        (by Lemma 2)
                   = (((S̃ ∘ S̃′) ∘ I₃) ◁ I₃^T) ∘ β₃        (by (3))
                   = (S̃ ∘ S̃′) ∘ β₃                        (by Lemma 3)
                   = S̃ ∘ (S̃′ ∘ β₃)                        (by (3))
                   = S̃ ∘ (β₂ • β₃) = β₁ • (β₂ • β₃)       (by Remark 2 and Definition 2).

                                                                             ∎

      We obtain a category of L-contexts and L-bonds.

Theorem 4. The structure of L-contexts and L-bonds forms a category:
 – objects are L-contexts,
 – arrows are L-bonds, where
   – the identity arrow of any formal L-context ⟨X, Y, I⟩ is its incidence relation I,⁴
   – the composition of arrows β₁ • β₂ is given by Definition 2.

Remark 3. The category is equivalent to the category of attribute-oriented concept
lattices and isotone Galois connections; this is analogous to the results in [12]. We
will elaborate on this in the full version of the paper.

3.2     Setting with different structures of truth degrees

In this section we generalize the previous results to a setting in which fuzzy contexts
are defined over different complete residuated lattices. To do that, we need to
explore compositions of the underlying morphisms, called residuation-preserving
(l, k)-connections between complete residuated lattices.

(l, k)-connections and their compositions

First, let us recall the definition and basic properties of the (l, k)-connections
introduced in [9].

Definition 3 ([9]). Let L₁, L₂ be complete residuated lattices, let l ∈ L₁, k ∈ L₂,
and let λ: L₁ → L₂, κ: L₂ → L₁ be mappings such that
⁴
    Clearly, I is an L-bond from ⟨X, Y, I⟩ to ⟨X, Y, I⟩.
Fig. 1. Six-element residuated lattice with ⊗ and → given by the tables below
(011010:00A0B0BCAB in [3]), the five-element Łukasiewicz chain (111:000AB in [3]),
and the (c, 0.5)-connection between them.

                 ⊗   0   a   b   c   d   1         →    0   a   b   c   d   1
                 0   0   0   0   0   0   0         0    1   1   1   1   1   1
                 a   0   0   0   a   0   a         a    d   1   d   1   1   1
                 b   0   0   b   0   b   b         b    c   c   1   c   1   1
                 c   0   a   0   c   a   c         c    b   d   b   1   d   1
                 d   0   0   b   a   b   d         d    a   c   d   c   1   1
                 1   0   a   b   c   d   1         1    0   a   b   c   d   1


 – ⟨λ, κ⟩ is an isotone Galois connection between L₁ and L₂,
 – κλ(a₁) = l →₁ (l ⊗₁ a₁) for each a₁ ∈ L₁,
 – λκ(a₂) = k ⊗₂ (k →₂ a₂) for each a₂ ∈ L₂.

We call ⟨λ, κ⟩ an (l, k)-connection from L₁ to L₂. An (l, k)-connection from L₁
to L₂ is called residuation-preserving if

    κ(k ⊗₂ (λ(a) →₂ λ(b))) = κλ(a) →₁ κλ(b)                                 (12)

holds true for any a, b ∈ L₁.
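    On finite carriers, the three conditions of Definition 3 and condition (12) are
all finitely quantified and can be verified by brute force. A hedged Python sketch,
assuming the lattices are given by carrier lists and their truth functions, and
taking λ as the lower adjoint of the isotone Galois connection:

    from itertools import product

    def is_lk_connection(L1, L2, lam, kap, l, k,
                         tensor1, res1, tensor2, res2, le1, le2):
        # Brute-force check of the three conditions of Definition 3.
        galois = all(le1(a1, kap(a2)) == le2(lam(a1), a2)  # isotone Galois connection
                     for a1, a2 in product(L1, L2))
        cond_kl = all(kap(lam(a1)) == res1(l, tensor1(l, a1)) for a1 in L1)
        cond_lk = all(lam(kap(a2)) == tensor2(k, res2(k, a2)) for a2 in L2)
        return galois and cond_kl and cond_lk

    def is_residuation_preserving(L1, lam, kap, k, tensor2, res2, res1):
        # Brute-force check of condition (12) for all a, b in L1.
        return all(kap(tensor2(k, res2(lam(a), lam(b))))
                   == res1(kap(lam(a)), kap(lam(b)))
                   for a, b in product(L1, L1))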

Theorem 5 ([9]). Let ⟨λ, κ⟩ be a residuation-preserving (l, k)-connection from
L₁ to L₂. The algebra ⟨fix(λ, κ), ∧, ∨, ⊗, →, 0, 1⟩, where ∧ and ∨ are given by
the order

    ⟨a₁, a₂⟩ ≤ ⟨b₁, b₂⟩ if a₁ ≤₁ b₁ (equivalently, if a₂ ≤₂ b₂),            (13)

and the adjoint pair is given by

    ⟨a₁, a₂⟩ → ⟨b₁, b₂⟩ = ⟨a₁ →₁ b₁, k ⊗₂ (a₂ →₂ b₂)⟩                       (14)
                        = ⟨a₁ →₁ b₁, k ⊗₂ ((k →₂ a₂) →₂ (k →₂ b₂))⟩,        (15)
    ⟨a₁, a₂⟩ ⊗ ⟨b₁, b₂⟩ = ⟨l →₁ (l ⊗₁ a₁ ⊗₁ b₁), a₂ ⊗₂ (k →₂ b₂)⟩           (16)
                        = ⟨l →₁ (l ⊗₁ a₁ ⊗₁ b₁), (k →₂ a₂) ⊗₂ b₂⟩,          (17)

is a complete residuated lattice.
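    The operations (14) and (16) on fix(λ, κ) act componentwise up to the correcting
factors k and l; a small hedged sketch in the same parameter-passing style as the
previous listing (pairs are plain 2-tuples, names illustrative):

    def fix_operations(l, k, tensor1, res1, tensor2, res2):
        # Residuum (14) and multiplication (16) on fix(lambda, kappa), cf. Theorem 5.
        def res(p, q):
            (a1, a2), (b1, b2) = p, q
            return (res1(a1, b1), tensor2(k, res2(a2, b2)))
        def mult(p, q):
            (a1, a2), (b1, b2) = p, q
            return (res1(l, tensor1(l, tensor1(a1, b1))), tensor2(a2, res2(k, b2)))
        return res, mult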
Figure 1 shows an example of an (l, k)-connection. We refer the reader to [9] for
the ideas behind (l, k)-connections, examples, and further details.
    Now we define the composition of (l, k)-connections and show that it is an
(l, k)-connection as well. In addition, the composition preserves the residuation-
preservation property: the composition of residuation-preserving (l, k)-connections
is a residuation-preserving (l, k)-connection as well.
Theorem 6. Let ⟨λ₁, κ₁⟩ be an (l₁, k₂)-connection from L₁ to L₂ and ⟨λ₂, κ₂⟩
be a (k₂, j₃)-connection from L₂ to L₃. Then the pair of mappings λ: L₁ → L₃,
κ: L₃ → L₁, defined by

    λ(a₁) = λ₂(k₂ →₂ λ₁(a₁)),
    κ(a₃) = κ₁(k₂ ⊗₂ κ₂(a₃))                                                (18)

for each a₁ ∈ L₁ and a₃ ∈ L₃, is an (l₁, j₃)-connection from L₁ to L₃.

Proof. First, we prove that κλ(a₁) = l₁ →₁ (l₁ ⊗₁ a₁) for each a₁ ∈ L₁ and
λκ(a₃) = j₃ ⊗₃ (j₃ →₃ a₃) for each a₃ ∈ L₃. For each a₁ ∈ L₁, we have

    κλ(a₁) = κ₁(k₂ ⊗₂ κ₂(λ₂(k₂ →₂ λ₁(a₁))))
           = κ₁(k₂ ⊗₂ (k₂ →₂ (k₂ ⊗₂ (k₂ →₂ λ₁(a₁)))))
           = κ₁(k₂ ⊗₂ (k₂ →₂ λ₁(a₁)))
           = κ₁(λ₁(κ₁(λ₁(a₁))))
           = κ₁(λ₁(a₁))
           = l₁ →₁ (l₁ ⊗₁ a₁).

Similarly, we have for each a₃ ∈ L₃

    λκ(a₃) = λ₂(k₂ →₂ λ₁(κ₁(k₂ ⊗₂ κ₂(a₃))))
           = λ₂(k₂ →₂ (k₂ ⊗₂ (k₂ →₂ (k₂ ⊗₂ κ₂(a₃)))))
           = λ₂(k₂ →₂ (k₂ ⊗₂ κ₂(a₃)))
           = λ₂(κ₂(λ₂(κ₂(a₃))))
           = λ₂(κ₂(a₃))
           = j₃ ⊗₃ (j₃ →₃ a₃).

Since κλ(a₁) = l₁ →₁ (l₁ ⊗₁ a₁) ≥₁ a₁ and λκ(a₃) = j₃ ⊗₃ (j₃ →₃ a₃) ≤₃ a₃, we
only need to show monotonicity to prove that ⟨λ, κ⟩ is an isotone Galois connection.
For each a₁, b₁ ∈ L₁ we have:

    a₁ ≤₁ b₁ implies λ₁(a₁) ≤₂ λ₁(b₁)   (since λ₁ is monotone),
             implies k₂ →₂ λ₁(a₁) ≤₂ k₂ →₂ λ₁(b₁)   (since →₂ is monotone
                                                     in its second argument),
             implies λ₂(k₂ →₂ λ₁(a₁)) ≤₃ λ₂(k₂ →₂ λ₁(b₁))   (since λ₂ is monotone).

Thus a₁ ≤₁ b₁ implies λ(a₁) ≤₃ λ(b₁) for each a₁, b₁ ∈ L₁. Similarly, one can
show that a₃ ≤₃ b₃ implies κ(a₃) ≤₁ κ(b₃).
                                                                             ∎
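    The composed pair (18) itself is a two-liner; a hedged sketch using closures over
the component mappings (all names purely illustrative):

    def compose_connections(lam1, kap1, lam2, kap2, k2, tensor2, res2):
        # Composition (18) of an (l1,k2)-connection with a (k2,j3)-connection.
        lam = lambda a1: lam2(res2(k2, lam1(a1)))     # lam(a1) = lam2(k2 ->2 lam1(a1))
        kap = lambda a3: kap1(tensor2(k2, kap2(a3)))  # kap(a3) = kap1(k2 (*)2 kap2(a3))
        return lam, kap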
Theorem 7. Let ⟨λ₁, κ₁⟩ be a residuation-preserving (l₁, k₂)-connection from
L₁ to L₂ and ⟨λ₂, κ₂⟩ be a residuation-preserving (k₂, j₃)-connection from L₂ to
L₃. Then the pair of mappings λ: L₁ → L₃, κ: L₃ → L₁, defined by (18), is a
residuation-preserving (l₁, j₃)-connection from L₁ to L₃.

Proof. For each a₁, b₁ ∈ L₁ we have

    κλ(a₁) →₁ κλ(b₁)
    = (l₁ →₁ (l₁ ⊗₁ a₁)) →₁ (l₁ →₁ (l₁ ⊗₁ b₁))
    = κ₁λ₁(a₁) →₁ κ₁λ₁(b₁)
    = κ₁(k₂ ⊗₂ (λ₁(a₁) →₂ λ₁(b₁)))
    = κ₁(k₂ ⊗₂ (λ₁κ₁λ₁(a₁) →₂ λ₁κ₁λ₁(b₁)))
    = κ₁(k₂ ⊗₂ ((k₂ ⊗₂ (k₂ →₂ λ₁(a₁))) →₂ (k₂ ⊗₂ (k₂ →₂ λ₁(b₁)))))
    = κ₁(k₂ ⊗₂ ((k₂ ⊗₂ (k₂ →₂ (k₂ ⊗₂ (k₂ →₂ λ₁(a₁))))) →₂ (k₂ ⊗₂ (k₂ →₂ λ₁(b₁)))))
    = κ₁(k₂ ⊗₂ ((k₂ →₂ (k₂ ⊗₂ (k₂ →₂ λ₁(a₁)))) →₂ (k₂ →₂ (k₂ ⊗₂ (k₂ →₂ λ₁(b₁))))))
    = κ₁(k₂ ⊗₂ (κ₂λ₂(k₂ →₂ λ₁(a₁)) →₂ κ₂λ₂(k₂ →₂ λ₁(b₁))))
    = κ₁(k₂ ⊗₂ κ₂(j₃ ⊗₃ (λ₂(k₂ →₂ λ₁(a₁)) →₃ λ₂(k₂ →₂ λ₁(b₁)))))
    = κ₁(k₂ ⊗₂ κ₂(j₃ ⊗₃ (λ(a₁) →₃ λ(b₁))))
    = κ(j₃ ⊗₃ (λ(a₁) →₃ λ(b₁))).
                                                                             ∎

    We call ⟨λ, κ⟩ from (18) the composition of ⟨λ₁, κ₁⟩ and ⟨λ₂, κ₂⟩, and we denote
it ⟨λ₁, κ₁⟩ • ⟨λ₂, κ₂⟩ = ⟨λ₁ • λ₂, κ₁ • κ₂⟩. Now we show that the composition
of (l, k)-connections is associative.

Theorem 8. Let ⟨λ₁, κ₁⟩ be an (l₁, k₂)-connection from L₁ to L₂, ⟨λ₂, κ₂⟩ be
a (k₂, j₃)-connection from L₂ to L₃, and ⟨λ₃, κ₃⟩ be a (j₃, m₄)-connection from
L₃ to L₄. Then

    ⟨λ₁, κ₁⟩ • (⟨λ₂, κ₂⟩ • ⟨λ₃, κ₃⟩) = (⟨λ₁, κ₁⟩ • ⟨λ₂, κ₂⟩) • ⟨λ₃, κ₃⟩.

Proof. We have for each a₁ ∈ L₁

    (λ₁ • (λ₂ • λ₃))(a₁) = (λ₂ • λ₃)(k₂ →₂ λ₁(a₁))
                         = λ₃(j₃ →₃ λ₂(k₂ →₂ λ₁(a₁)))
                         = λ₃(j₃ →₃ (λ₁ • λ₂)(a₁))
                         = ((λ₁ • λ₂) • λ₃)(a₁),

and similarly for the κ-part.                                                ∎

Theorem 9. The following structure forms a category:
 – objects are pairs ⟨L, e⟩, where L is a complete residuated lattice and e ∈ L;
 – arrows from ⟨L₁, l⟩ to ⟨L₂, k⟩ are (l, k)-connections from L₁ to L₂, where
   – the identity arrow on any ⟨L, e⟩ is the (e, e)-connection ⟨λ, κ⟩ with λ(a) = e ⊗ a
     and κ(a) = e → a for each a ∈ L,
   – the composition of arrows is as defined in (18).

If we use just residuation-preserving (l, k)-connections, we obtain a subcategory.

    Now we can explore bonds based on residuation-preserving (l, k)-connections.

Definition 4. Let L₁, L₂ be complete residuated lattices, ⟨λ, κ⟩ be a residuation-
preserving (l, k)-connection from L₁ to L₂, and let ⟨X₁, Y₁, I₁⟩ and ⟨X₂, Y₂, I₂⟩
be an L₁-context and an L₂-context, respectively. We call β ∈ L_{⟨λ,κ⟩}^{X₁×Y₂} a ⟨λ, κ⟩-bond
from ⟨X₁, Y₁, I₁⟩ to ⟨X₂, Y₂, I₂⟩ if the following inclusions hold:

    Ext^{MO}(X₁, Y₂, β) ⊆ Ext^{∩∪}(X₁, Y₁, κλ(I₁)),                         (19)
    Int^{MO}(X₁, Y₂, β) ⊆ Int^{∩∪}(X₂, Y₂, λκ(I₂)).                         (20)

    The concept-forming operators ⟨M, O⟩ induced by a ⟨λ, κ⟩-bond β from ⟨X₁, Y₁, I₁⟩
to ⟨X₂, Y₂, I₂⟩ are given by⁵

    A^{M_β} = λ(A)^{∩_{proj₂(β)}},
    B^{O_β} = κ(B)^{∪_{proj₁(β)}}.                                          (21)

Theorem 10. Let ⟨X₁, Y₁, I₁⟩ be an L₁-context, ⟨X₂, Y₂, I₂⟩ be an L₂-context,
and ⟨λ, κ⟩ an (l, k)-connection from L₁ to L₂. Then β ∈ L_{⟨λ,κ⟩}^{X₁×Y₂} is a ⟨λ, κ⟩-bond
from ⟨X₁, Y₁, I₁⟩ to ⟨X₂, Y₂, I₂⟩ if and only if it is an L_{⟨λ,κ⟩}-bond w.r.t. ⟨∩, ∪⟩ from
⟨X₁, Y₁, ⟨κλ(I₁), λ(I₁)⟩⟩ to ⟨X₂, Y₂, ⟨κ(I₂), λκ(I₂)⟩⟩.

Proof. Directly from the definition and (21).                                ∎

    For what follows we will need the following product of fuzzy relations. Let
⟨λ₁, κ₁⟩ be an (l₁, k₂)-connection from L₁ to L₂, ⟨λ₂, κ₂⟩ be a (k₂, m₃)-connection
from L₂ to L₃, and I ∈ L_{⟨λ₁,κ₁⟩}^{X×Y}, J ∈ L_{⟨λ₂,κ₂⟩}^{Y×Z}. Then I ⊗ J ∈ L_{⟨λ₁•λ₂,κ₁•κ₂⟩}^{X×Z} is
defined as

    I ⊗ J = ⟨κ₁(K), λ₂(k₂ →₂ K)⟩ where K = proj₂(I) ∘₂ proj₁(J)             (22)

and ∘₂ is the composition (1) of L₂-relations.

Lemma 4. Let ⟨X₁, Y₁, I₁⟩ be an L₁-context, ⟨X₂, Y₂, I₂⟩ be an L₂-context, and
⟨λ, κ⟩ an (l, k)-connection from L₁ to L₂.

(a) An L_{⟨λ,κ⟩}-relation β for which there exist L_{⟨λ,κ⟩}-relations S̃ ∈ L_{⟨λ,κ⟩}^{X₁×X₂} and
    Ŝ ∈ L_{⟨λ,κ⟩}^{Y₁×Y₂} such that

        β = ⟨κλ(I₁), λ(I₁)⟩ ⊗ Ŝ = S̃ ⊗ ⟨κ(I₂), λκ(I₂)⟩

    is a ⟨λ, κ⟩-bond from ⟨X₁, Y₁, I₁⟩ to ⟨X₂, Y₂, I₂⟩.
⁵
    proj₁, proj₂ denote the projections of the first and second component of a pair, respectively.
(b) Each ⟨λ, κ⟩-bond β from ⟨X₁, Y₁, I₁⟩ to ⟨X₂, Y₂, I₂⟩ satisfies that there is
    S̃ ∈ L_{⟨λ,κ⟩}^{X₁×X₂} such that

        β = S̃ ⊗ ⟨κ(I₂), λκ(I₂)⟩.

Proof. From Theorem 10 and Lemma 1.                                          ∎
Theorem 11. Let ⟨λ₁, κ₁⟩ be an (l₁, k₂)-connection from L₁ to L₂, ⟨λ₂, κ₂⟩ be
a (k₂, j₃)-connection from L₂ to L₃, β₁ be a ⟨λ₁, κ₁⟩-bond from ⟨X₁, Y₁, I₁⟩ to
⟨X₂, Y₂, I₂⟩, and β₂ be a ⟨λ₂, κ₂⟩-bond from ⟨X₂, Y₂, I₂⟩ to ⟨X₃, Y₃, I₃⟩. Then

    β = S̃ ⊗ β₂,                                                            (23)

where S̃ = β₁ ◁ ⟨κ₁(I₂^T), λ₁κ₁(I₂^T)⟩, is a ⟨λ₁ • λ₂, κ₁ • κ₂⟩-bond from ⟨X₁, Y₁, I₁⟩
to ⟨X₃, Y₃, I₃⟩.

    Let us denote β from (23) as β = β₁ • β₂ and call it the composition of isotone
⟨λ, κ⟩-bonds. Now we show the associativity of this composition.
Theorem 12. Let ⟨λ₁, κ₁⟩ be an (l₁, k₂)-connection from L₁ to L₂, ⟨λ₂, κ₂⟩ be
a (k₂, j₃)-connection from L₂ to L₃, ⟨λ₃, κ₃⟩ be a (j₃, m₄)-connection from L₃
to L₄, and let βᵢ be a ⟨λᵢ, κᵢ⟩-bond from ⟨Xᵢ, Yᵢ, Iᵢ⟩ to ⟨Xᵢ₊₁, Yᵢ₊₁, Iᵢ₊₁⟩. Then

    β₁ • (β₂ • β₃) = (β₁ • β₂) • β₃.

Proof. Follows from Theorem 3, Theorem 8, and Theorem 10.                    ∎
   Finally, we can state that L-contexts over different structures of truth degrees
and bonds between them form a category.

Theorem 13. The following structure forms a category:
 – objects are pairs ⟨K, e⟩, where K is an L-context and e ∈ L;
 – arrows between ⟨K₁, l⟩ and ⟨K₂, k⟩, where K₁ is an L₁-context, K₂ is an
   L₂-context, l ∈ L₁, and k ∈ L₂, are ⟨λ, κ⟩-bonds, where ⟨λ, κ⟩ is an (l, k)-
   connection, where
   – the identity arrow for a pair ⟨K, e⟩ of an L-context ⟨X, Y, I⟩ and e is the
     ⟨λ, κ⟩-bond I, where ⟨λ, κ⟩ is the (e, e)-connection with λ(a) = e ⊗ a and
     κ(a) = e → a for each a ∈ L,
   – the composition of arrows β₁ • β₂ is given by (23).

4    Future Research

Our future research in this area includes addressing the following issues:

 – Antitone bonds between fuzzy contexts over different complete residuated
   lattices were studied in [9]; the basics of isotone bonds are presented in this
   paper. We want to extend this study to heterogeneous bonds [11]; we will
   present results on them and their compositions in the full version of this paper.
 – As block relations are a special case of bonds, they share many properties
   (see [11]). It could be fruitful to study the compositions described in this paper
   in the context of block L-relations. In addition, the composition applied to block
   (crisp) relations corresponds to the multiplication used in the calculus studied in
   [1]. This observation deserves deeper study; we believe it can bring new and
   interesting insight into that calculus.
Acknowledgments

Jan Konecny is supported by grant No. 15-17899S, "Decompositions of Matrices
with Boolean and Ordinal Data: Theory and Algorithms", of the Czech Science
Foundation.
Ondrej Krídlo is supported by grant VEGA 1/0073/15 of the Ministry of Education,
Science, Research and Sport of the Slovak Republic, and by the University Science
Park TECHNICOM for Innovation Applications Supported by Knowledge
Technology, ITMS: 26220220182, supported by the Research & Development
Operational Programme funded by the ERDF.

References
 1. Eduard Bartl and Michal Krupka. Residuated lattices of block relations: Size
    reduction of concept lattices. to appear in IJGS, 2015.
 2. Radim Belohlavek. Fuzzy Relational Systems: Foundations and Principles. Kluwer
    Academic Publishers, Norwell, USA, 2002.
 3. Radim Belohlavek and Vilem Vychodil. Residuated lattices of size ď 12. Order,
    27(2):147–161, 2010.
 4. Radim Belohlavek and Jan Konecny. Row and Column Spaces of Matrices over
    Residuated Lattices. Fundam. Inform., 115(4):279–295, 2012.
 5. George Georgescu and Andrei Popescu. Non-dual fuzzy connections. Arch. Math.
    Log., 43(8):1009–1039, 2004.
 6. Ladislav J. Kohout and Wyllis Bandler. Relational-product architectures for infor-
    mation processing. Information Sciences, 37(1-3):25–37, 1985.
 7. Jan Konecny. Isotone fuzzy Galois connections with hedges. Information Sciences,
    181(10):1804–1817, 2011.
 8. Jan Konecny. Antitone L-bonds. In: Information Processing and Management of
    Uncertainty in Knowledge-Based Systems – 15th International Conference, IPMU
    2014, pages 71–80, 2014.
 9. Jan Konecny. Bonds between L-fuzzy contexts over different structures of truth-
    degrees. In: 13th International Conference, ICFCA 2015, Nerja, Spain, June 23–26,
    pages 81–96, 2015.
10. Jan Konecny and Manuel Ojeda-Aciego. Isotone L-bonds. In Manuel Ojeda-Aciego
    and Jan Outrata, editors, CLA, volume 1062 of CEUR Workshop Proceedings,
    pages 153–162. CEUR-WS.org, 2013.
11. Jan Konecny and Manuel Ojeda-Aciego. On homogeneous L-bonds and heteroge-
    neous L-bonds. to appear in IJGS, 2015.
12. Ondrej Krídlo, Stanislav Krajči, and Manuel Ojeda-Aciego. The Category of L-Chu
    Correspondences and the Structure of L-Bonds. Fundam. Inform., 115(4):297–325,
    2012.
13. Ondrej Kridlo and Manuel Ojeda-Aciego. CRL-Chu Correspondences. In Manuel
    Ojeda-Aciego and Jan Outrata, editors, CLA, volume 1062 of CEUR Workshop
    Proceedings, pages 105–116. CEUR-WS.org, 2013.
14. Markus Krötzsch, Pascal Hitzler, and Guo-Qiang Zhang. Morphisms in context.
    In Conceptual Structures: Common Semantics for Sharing Knowledge, 13th Inter-
    national Conference on Conceptual Structures, ICCS 2005, Kassel, Germany, July
    17-22, 2005, Proceedings, pages 223–237, 2005.
15. Jesús Medina. Multi-adjoint property-oriented and object-oriented concept lat-
    tices. Inf. Sci., 190:95–106, 2012.
            From an implicational system to its
                  corresponding D-basis

     Estrella Rodríguez-Lorenzo¹, Kira Adaricheva², Pablo Cordero¹, Manuel
                            Enciso¹, and Angel Mora¹
                    ¹ University of Málaga, Andalucía Tech, Spain,
  e-mail: {estrellarodlor,amora}@ctima.uma.es, pcordero@uma.es, enciso@lcc.uma.es
                          ² Nazarbayev University, Kazakhstan
                            e-mail: kira.adaricheva@nu.edu.kz


        Abstract. The closure system is a fundamental concept appearing in several
        areas such as databases, formal concept analysis, artificial intelligence,
        etc. It is well known that there exists a connection between a closure
        operator on a set and the lattice of its closed sets. Furthermore, the
        closure system can be replaced by a set of implications, but this set
        usually contains a lot of redundancy, inducing undesired properties.
        In the literature, there is a common interest in the search for minimality
        of a set of implications because of the importance of bases. The
        well-known Duquenne-Guigues basis satisfies this minimality condition.
        However, several authors emphasize the relevance of optimality in
        order to reduce the size of the implications in the basis. In addition,
        some bases have been defined to improve the computation of closures
        by relying on the directness property. The efficiency of computation with
        a direct basis is achieved due to the fact that the closure is computed
        in one traversal.
        In this work, we focus on the D-basis, which is ordered-direct. An open
        problem is to obtain it from an arbitrary implicational system, and this is
        our aim in this paper. We introduce a method to compute the D-basis
        by means of minimal generators calculated using the Simplification Logic
        for implications.


 1    Introduction

 Discovering knowledge and information retrieval are currently active issues for which
 Formal Concept Analysis (FCA) provides tools and methods for data analysis.
 The notions around the concept lattice may be considered the main attractions
 in Formal Concept Analysis, and they are strongly connected to the notion of
 closure.
     The closure system is a fundamental concept appearing in several areas such as
 database theory, formal concept analysis, artificial intelligence, etc. It is well
 known that there exists a connection between a closure operator on a set and
 the lattice of its closed sets. Furthermore, the closure system can be presented,
 dually, as a set of attribute implications, namely an implicational system, but
 this set usually contains a lot of redundancy, inducing undesired properties.

     We cannot fail to mention the relevance of the role of the implication notion
in different areas. It was the main actor of normalization theory in the database
area; it has an outstanding character in Formal Concept Analysis, and it was
prominently used in Frequent Set Mining and Learning Spaces; see the survey
of M. Wild [10]. The latter is devoted to the mathematical theory of implications
and the different faces of the concept of an implication. Implications link data
represented in several forms, ranging from the relationship between itemsets in
transactions (Frequent Set Mining) to Boolean functions (Horn theory).
     Nonetheless, as V. Duquenne says in [6], "it is surprising if not hard to ac-
knowledge that we did not learn much more on their intimacy in the meantime,
despite many interesting papers using or revisiting them". We believe there is a
long way to go, and a deeper theory on the properties of implications, with automated
and efficient methods to manipulate them, can be developed.
     In this paper, we focus on the Formal Concept Analysis area, and its
fundamental notions are assumed (see [7]). The task of information retrieval
carried out by the tools in FCA leads to inferring concepts from the data set,
i.e. to deducing (in an automated way) a set of objects that may be precisely
characterized by a set of attributes. Such concepts inherit an order relation
induced by attribute-set inclusion, providing a lattice structure on the set of concepts.
Here implications are retrieved from a binary table (formal context) representing
the relationship between a set of objects and a set of attributes. Implications
represent an alternative way of expressing the underlying information contained in the
formal context.
     Many applications must massively compute closures of sets of attributes, and
any improvement in execution time is relevant. In [9] the author establishes the
necessity of obtaining succinct representations of closure operators to achieve
efficient computational usage. In this direction, properties associated with implica-
tions are studied to render equivalent sets fulfilling desired properties, namely directness
and optimality.
     An important matter in FCA is to transform implicational systems into canon-
ical forms for special purposes in order to allow efficient further man-
agement. Hence, some alternative definitions have been established: the Duquenne-
Guigues basis, the direct optimal basis, the D-basis, etc. In this work we focus on the
last one [1], because it combines, in a balanced way, a brief representation (it
has a small number of elements) and an efficient computation of closures (the closure is
computed in just one traversal). To this end, the D-basis prescribes an order in which
implications will be attended.
     The major point is that the execution of the D-basis in one iteration is more
efficient than the execution of a shorter, but unordered one, for instance the
canonical basis of Duquenne and Guigues. K. Adaricheva et al. prove in [1] that
one can extract the D-basis from any direct unit basis Σ in time polynomial
in the size of Σ, and that it takes only time linear in the number of implications of the
D-basis to put it into a proper order.
     In [5] we proposed a method to calculate all the minimal generators from
a set of implications as a way to remove redundancy in the basis. The method
to compute all the minimal generators is based on the Simplification Logic for
implications [8]. Using this logic we are able to remove redundancy in the impli-
cations [4], and following the same style of application of the Simplification Rule
to the set of implications, we can obtain all the minimal generators.
     Currently, the retrieval of the D-basis from an arbitrary implicational sys-
tem is an open problem, so it becomes our aim in this paper. We introduce a
method to compute the D-basis by means of minimal generators calculated us-
ing the Simplification Logic for implications. The relationship among minimal
generators, covers, minimal covers, and the D-basis is presented, and an algorithm to
calculate the D-basis from an arbitrary set of implications is proposed.
     Section 2 presents the main notions necessary for the understanding of the
new method: closure operators, the D-basis, Simplification Logic, and the method
to calculate minimal generators. In Section 3, the relationships between covers
and generators are presented. In Section 4, the new method to obtain the D-
basis from a set of implications is shown, and some conclusions and extensions
are proposed in Section 5.


2      Background

2.1     Closure systems

Given a non-empty set M and the set¹ 2^M of all its subsets, a closure operator
is a map φ: 2^M → 2^M that satisfies the following, for all X, Y ∈ 2^M:
(1) increasing: X ⊆ φ(X);
(2) isotone: X ⊆ Y implies φ(X) ⊆ φ(Y);
(3) idempotent: φ(φ(X)) = φ(X).
We will refer to the pair ⟨M, φ⟩ of a set M and a closure operator on it as a
closure system.
    In the next two subsections we will follow the introduction of the implica-
tional system based on minimal proper covers² given in [1], which was named
there the D-basis.
    We will call a closure system reduced if φ({x}) = φ({y}) implies x = y, for any
x, y ∈ M³. If the closure system ⟨M, φ⟩ is not reduced, one can modify it to
produce an equivalent one that is reduced; see [1] for more details.
    We will now define a closure operator φ*, which is associated with a given
operator φ.

Definition 1. Let ⟨M, φ⟩ be a closure system. Define φ* as a self-map on 2^M
such that φ*(X) = ⋃_{x∈X} φ(x), for X ∈ 2^M.
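    Since φ* is determined pointwise by φ, Definition 1 is immediate to compute;
a minimal Python sketch, assuming sets are encoded as frozensets and phi is any
closure operator given as a function (an illustration, not the authors' implementation):

    def phi_star(phi, X):
        # phi*(X) = union of phi({x}) over x in X, cf. Definition 1
        result = frozenset()
        for x in X:
            result |= phi(frozenset({x}))
        return result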
      It is straightforward to verify that
¹
  In the FCA framework, the set M can be thought of as a set of attributes of a context.
²
  Although in [1] it was introduced as a minimal cover, here we name it a minimal proper
  cover because in this paper we generalize the notion of cover in Section 3.
³
  To simplify the notation, φ({x}) will be represented as φ(x) if there is no risk of confusion.
Lemma 1. φ* is a closure operator on M.

      Given a closure system ⟨M, φ⟩, we introduce several important concepts.

Definition 2 ([1]). For x ∈ M we call a subset X ⊆ M a proper cover for x if
x ∈ φ(X) \ φ*(X). If X is a proper cover for x, it will be denoted as x ∼p X.

2.2     The D-basis

In this subsection, we briefly summarize the introduction of the D-basis in [1].
Its definition is strongly based on the notion of a minimal proper cover:

Definition 3. A proper cover Y for x is called minimal if, for any other proper
cover Z for x, Z ⊆ φ*(Y) implies Y ⊆ Z.

    The existence of several proper covers for the same element induces the need
to introduce the notion of minimality.

Lemma 2. If x ∼p X, then there exists Y such that x ∼p Y, Y ⊆ φ*(X), and
Y is a minimal proper cover for x. In other words, every proper cover can be
reduced to a minimal proper cover under the subset relation combined with the φ*
operator.

    These ideas bring us to the following definition of the implicational system defin-
ing the reduced closure system by means of the minimal proper covers; a brute-force
rendering of this definition is sketched after Lemma 3.

Definition 4. Given a reduced closure system ⟨M, φ⟩, define the D-basis Σ_D
as a union of two subsets of implications:
1. {y → x : x ∈ φ(y) \ {y}, y ∈ M} (such implications are called binary);
2. {X → x : X is a minimal proper cover for x}.

    Note that the D-basis belongs to the family of unit bases, i.e. implica-
tional sets where each implication A → b has a singleton b ∈ M as its consequent.

Lemma 3. Σ_D generates ⟨M, φ⟩.
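    For a small M, Definition 4 can be realized naively by enumerating proper covers
(Definition 2) and filtering the minimal ones (Definition 3). The following exponential
brute-force sketch, building on phi_star above, is purely illustrative and is not the
algorithm proposed in this paper:

    from itertools import combinations

    def d_basis_bruteforce(M, phi):
        # Binary part: y -> x with x in phi({y}) \ {y}, cf. Definition 4 (1).
        binary = {(frozenset({y}), x)
                  for y in M for x in phi(frozenset({y})) - {y}}
        # All proper covers for each x: x in phi(X) \ phi*(X), cf. Definition 2.
        covers = {x: [frozenset(X) for r in range(1, len(M) + 1)
                      for X in combinations(M, r)
                      if x in phi(frozenset(X)) - phi_star(phi, frozenset(X))]
                  for x in M}
        # Minimal proper covers, cf. Definition 3: Y is minimal iff every proper
        # cover Z for x with Z <= phi*(Y) already contains Y.
        nonbinary = {(Y, x)
                     for x, cs in covers.items()
                     for Y in cs
                     if all(Y <= Z for Z in cs if Z <= phi_star(phi, Y))}
        return binary | nonbinary

Implications are encoded here as (premise, conclusion-attribute) pairs, matching the
unit-basis form noted above.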

2.3     Ordered direct set of implications

Here we recall the notion of the ordered direct basis introduced in [1], which is
designed for a quick computation of closures based on some fixed order of
implications. First we recall the definition of the ordered iteration of implications.

Definition 5. Suppose the set of implications Σ is equipped with some linear
order, or equivalently, the implications are indexed as Σ = {s₁, s₂, ..., sₙ}. De-
fine a mapping ρ_Σ: 2^M → 2^M associated with this ordering as follows. For any
X ⊆ M, let X₀ = X. If Xₖ is computed and implication sₖ₊₁ is A → B, then

    Xₖ₊₁ = Xₖ ∪ B, if A ⊆ Xₖ,
    Xₖ₊₁ = Xₖ,     otherwise.

Finally, ρ_Σ(X) = Xₙ. We will call ρ_Σ an ordered iteration of Σ.
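    Definition 5 amounts to a single ordered pass over Σ; a minimal sketch, with
implications as (A, B) pairs of frozensets and the order given by the list (the
encoding is our illustrative choice):

    def ordered_iteration(sigma, X):
        # rho_Sigma(X): one pass over sigma = [s1, ..., sn], cf. Definition 5
        Xk = frozenset(X)
        for A, B in sigma:      # implication s_{k+1} = A -> B
            if A <= Xk:         # if A is contained in X_k ...
                Xk |= B         # ... then X_{k+1} = X_k union B
        return Xk

If ⟨Σ, <⟩ is ordered direct in the sense of Definition 6 below, one pass already
yields the closure φ_Σ(X); for an arbitrary Σ, one would repeat the pass until a
fixed point is reached.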
    The concept of the ordered iteration is central for the definition of the ordered
direct basis. For any given set of implications Σ on a set M, by φ_Σ we understand
the closure operator on M defined by Σ. Equivalently, the fixed points of φ_Σ
are exactly the subsets X ⊆ M which are stable for all implications A → B in Σ: if
A ⊆ X, then B ⊆ X.

Definition 6. The set of implications with some linear ordering on it, ⟨Σ, <⟩, is
called ordered direct if ρ_Σ(X) = φ_Σ(X) for every X ⊆ M.

Fig. 1. Stages of the D-basis algorithm: implicational system; set of (non-trivial)
closed sets and minimal generators; set of covers; set of minimal covers.



Thus, let a be an attribute and mg be the set of minimal generators whose
closure contains a. We write this association as a pair ⟨a, mg⟩. Let Φ be a
set of such pairs of attributes with their generators. We define the function Add,
which builds the set of covers produced in Stage 2, as follows:

    Add(⟨a, mg⟩, Φ) = {⟨a, {g ∈ mg | a ∉ g} ∪ {mg_a}⟩ : ⟨a, mg_a⟩ ∈ Φ}
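    Encoding Φ as a dictionary from attributes to their current sets of covers, the
function Add becomes a one-step update; a hedged sketch (the dictionary encoding
is our choice, not the paper's; generators g are frozensets):

    def add(a, mg, Phi):
        # Add(<a, mg>, Phi): join to a's entry the generators of mg not containing a.
        new_covers = {g for g in mg if a not in g}
        Phi[a] = Phi.get(a, frozenset()) | new_covers
        return Phi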
   Then, in Stage 3, the algorithm picks out the set of minimal covers from the set
obtained in Stage 2 using the function MinimalCovers. The method ends with
the function OrderedComp, which applies the Composition Rule and at the same time
orders the implications in the following sense: the first implications in the
D-basis are the binary ones (those with the left-hand side being a singleton).

 Algorithm 1: D-basis
      input : An implicational system Σ on M
      output: The D-basis Σ_D on M
      begin
         MinGen := MinGen0(M, Σ)
         𝒞 := ∅
         foreach ⟨C, mg(C)⟩ ∈ MinGen do
            foreach a ∈ C do
                𝒞 := Add(⟨a, mg(C)⟩, 𝒞)
         Σ_D := ∅
         foreach ⟨a, mg_a⟩ ∈ 𝒞 do
            mg_a := MinimalCovers(mg_a)
            foreach g ∈ mg_a do Σ_D := Σ_D ∪ {g → a}
         OrderedComp(Σ_D)
         return Σ_D




Example 5. Algorithm 1 returns the following D-basis from the input implica-
tional system of Example 3:

    Σ_D = {a → d, bce → ad, ab → ce, ae → bc, bde → ac, cd → abe}
   We emphasize that although ac is a minimal generator, it is not a minimal
cover; thus an implication with ac in the left-hand side would be redundant (deducible
from the inference axioms) and hence should not appear in the D-basis.


A detailed illustrative example

In the conclusion of this section we show the execution of the method, in all its
stages, on a set of implications from [3], which was later used to illustrate the
D-basis definition in [1]:

Σ = {5 → 4, 23 → 4, 24 → 3, 34 → 2, 14 → 235, 25 → 134, 35 → 124, 15 → 24, 123 → 45}

    As a first step of the algorithm, MinGen0 renders the following set of pairs of
closed sets and their non-trivial minimal generators (see Figure 2):

    {⟨12345, {123, 14, 15, 25, 35}⟩, ⟨234, {23, 24, 34}⟩, ⟨45, {5}⟩, ⟨∅, ∅⟩}


                              Fig. 2. MinGen0 Execution


    Then, for each closed set and each of its elements, our algorithm renders the
following set of pairs of elements and covers:

    {⟨1, {25, 35}⟩, ⟨2, {14, 15, 35, 34}⟩, ⟨3, {14, 15, 25, 24}⟩,
     ⟨4, {123, 15, 25, 35, 5, 23}⟩, ⟨5, {123, 14}⟩}

For each element, the function MinimalCovers picks out its minimal covers:

    {⟨1, {25, 35}⟩, ⟨2, {14, 34}⟩, ⟨3, {14, 24}⟩, ⟨4, {5, 23}⟩, ⟨5, {14, 123}⟩}

   Finally, at the last step, the algorithm turns these pairs into implications and
applies ordered composition, resulting in the D-basis:

    Σ_D = {5 → 4, 23 → 4, 24 → 3, 34 → 2, 14 → 235, 25 → 1, 35 → 1, 123 → 5}
5     Conclusion and future works

In this work we have presented a way to obtain the D-basis from any implica-
tional system. In [1] an algorithm was proposed to compute the D-basis from
any direct basis, but the computation from an arbitrary implicational system was left
open. There also exists an efficient algorithm for the computation of the D-basis
from the context using the method of finding the minimal transversals of the
associated hypergraphs [2], but this assumes a different input for the closure
system, which is outside the scope of this paper.
    The function MinimalCovers renders the D-basis within the framework of
closure systems without the need of any transformation. A key point of our
work is the connection between covers and generators. Using minimal gener-
ators, the D-basis is obtained by reducing the set of minimal generators and
transforming it into a set of minimal covers.
    As future work, we propose to develop an algorithm which computes the
D-basis with a better integration of the minimal generator computation, in order to
render the minimal covers in a more direct way. In addition, we plan to design
an empirical study and to make a comparison between this algorithm and other
techniques proposed in previous papers.


Acknowledgment
Supported by Grants TIN2011-28084 and TIN2014-59471-P of the Science and
Innovation Ministry of Spain.

References
1. K. Adaricheva, J. B. Nation, and R. Rand, Ordered direct implicational basis of
   a finite closure system, Discrete Applied Mathematics, 161(6): 707–723, 2013.
2. K. Adaricheva and J.B. Nation, Discovery of the D-basis in binary tables based on
   hypergraph dualization, http://arxiv.org/abs/1504.02875, 2015.
3. K. Bertet, B. Monjardet, The multiple facets of the canonical direct unit implica-
   tional basis, Theor. Comput. Sci., 411(22-24): 2155–2166, 2010.
4. P. Cordero, A. Mora, M. Enciso, I. Pérez de Guzmán, SLFD Logic: Elimination of
   Data Redundancy in Knowledge Representation, LNCS, 2527: 141–150, 2002.
5. P. Cordero, M. Enciso, A. Mora, M. Ojeda-Aciego, Computing Minimal Generators
   from Implications: a Logic-guided Approach, CLA 2012: 187–198, 2012.
6. V. Duquenne, Some variations on Alan Day’s Algorithm for Calculating Canonical
   Basis of Implications, CLA 2007: 192–207, 2007.
7. B. Ganter, Two basic algorithms in concept analysis,       Technische Hochschule,
   Darmstadt, 1984.
8. A. Mora, M. Enciso, P. Cordero, I. Fortes, Closure via functional dependence sim-
   plification, International Journal of Computer Mathematics, 89(4): 510–526, 2012.
9. S. Rudolph, Some Notes on Managing Closure Operators, LNCS, 7278: 278–291,
   2012.
10. M. Wild, The joy of implications, aka pure Horn functions: mainly a survey, http:
   //arxiv.org/abs/1411.6432, 2014.
                       Using Linguistic Hedges
                     in L-rough Concept Analysis

                            Eduard Bartl and Jan Konecny

                          Data Analysis and Modeling Lab
                 Dept. Computer Science, Palacky University, Olomouc
                         17. listopadu 12, CZ-77146 Olomouc
                                    Czech Republic



        Abstract. We enrich the concept-forming operators in L-rough Concept Anal-
        ysis with linguistic hedges, which model the semantics of the logical connectives
        'very' and 'slightly'. Using hedges as parameters of the concept-forming
        operators allows us to modify the uncertainty involved when forming con-
        cepts. As a consequence, by the selection of these hedges we can control the
        size of the concept lattice.


 Keywords: Formal concept analysis; concept lattice; fuzzy set; linguistic hedge;
 rough set; uncertainty.


 1 Introduction

 In [2] we presented a framework which allows us to work with positive and
 negative attributes in the fuzzy setting by applying two unipolar scales for
 intents: a positive one and a negative one. The positive scale is implicitly
 modeled by an antitone Galois connection, while the negative scale is modeled
 by an isotone Galois connection. In this paper we extend this approach in two
 ways.
     First, we work with uncertain information. To do this, we extend formal
 fuzzy contexts to contain two truth degrees for each object-attribute pair. The
 two truth degrees represent the necessity and the possibility of the fact that an object
 has an attribute. The interval between these degrees represents the uncertainty
 present in the given data.
     Second, we parametrize the concept-forming operators used in the frame-
 work by unary operators called truth-stressing and truth-depressing linguistic
 hedges. Their intended use is to model the semantics of the statements 'it is very sure
 that this attribute belongs to a fuzzy set (intent)' and 'it is slightly possible that an
 attribute belongs to a fuzzy set (intent)', respectively. In the paper, we demonstrate
 how the hedges influence the size of the concept lattice.


 2 Preliminaries
 In this section we summarize the basic notions used in the paper.



Residuated Lattices and Fuzzy Sets

     We use complete residuated lattices as basic structures of truth degrees.
A complete residuated lattice [4, 12, 17] is a structure L = ⟨L, ∧, ∨, ⊗, →, 0, 1⟩
such that ⟨L, ∧, ∨, 0, 1⟩ is a complete lattice, i.e. a partially ordered set in which
arbitrary infima and suprema exist; ⟨L, ⊗, 1⟩ is a commutative monoid, i.e. ⊗ is
a binary operation which is commutative, associative, and a ⊗ 1 = a for each
a ∈ L; ⊗ and → satisfy adjointness, i.e. a ⊗ b ≤ c iff a ≤ b → c. 0 and 1 denote the
least and greatest elements. The partial order of L is denoted by ≤. Throughout
this work, L denotes an arbitrary complete residuated lattice.
     Elements of L are called truth degrees. Operations ⊗ (multiplication) and →
(residuum) play the role of (truth functions of) “fuzzy conjunction” and “fuzzy
implication”. Furthermore, we define the complement of a ∈ L as ¬a = a → 0.
     An L-set (or fuzzy set) A in a universe set X is a mapping assigning to each
x ∈ X some truth degree A(x) ∈ L. The set of all L-sets in a universe X is denoted
L^X.
     The operations with L-sets are defined componentwise. For instance, the
intersection of L-sets A, B ∈ L^X is an L-set A ∩ B in X such that (A ∩ B)(x) =
A(x) ∧ B(x) for each x ∈ X. An L-set A ∈ L^X is also denoted {A(x)/x | x ∈ X}.
If for all y ∈ X distinct from x₁, …, xₙ we have A(y) = 0, we also write
{A(x₁)/x₁, …, A(xₙ)/xₙ}.
     An L-set A ∈ L^X is called normal if there is x ∈ X such that A(x) = 1. An
L-set A ∈ L^X is called crisp if A(x) ∈ {0, 1} for each x ∈ X. Crisp L-sets can be
identified with ordinary sets. For a crisp A, we also write x ∈ A for A(x) = 1 and
x ∉ A for A(x) = 0.
     For A, B ∈ L^X we define the degree of inclusion of A in B by S(A, B) =
⋀_{x∈X} A(x) → B(x). Graded inclusion generalizes the classical inclusion relation.
Described verbally, S(A, B) represents the degree to which A is a subset of B. In
particular, we write A ⊆ B iff S(A, B) = 1. As a consequence, we have A ⊆ B iff
A(x) ≤ B(x) for each x ∈ X.
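     To make the preliminaries concrete, here is a minimal sketch (ours, not part
of the paper) of the 5-element Łukasiewicz chain used later in the paper, together
with the graded inclusion S; all names (tensor, residuum, S) are our own.

    # A minimal sketch: the 5-element Lukasiewicz chain as a complete
    # residuated lattice, with graded inclusion of L-sets.
    L5 = [0.0, 0.25, 0.5, 0.75, 1.0]

    def tensor(a, b):        # Lukasiewicz multiplication ("fuzzy conjunction")
        return max(0.0, a + b - 1.0)

    def residuum(a, b):      # Lukasiewicz residuum ("fuzzy implication")
        return min(1.0, 1.0 - a + b)

    def complement(a):       # derived complement: a -> 0
        return residuum(a, 0.0)

    def S(A, B):             # degree of inclusion of L-set A in L-set B
        # A, B: dicts mapping elements of the universe X to truth degrees
        return min(residuum(A[x], B[x]) for x in A)

    A = {'x1': 0.5, 'x2': 1.0}
    B = {'x1': 0.75, 'x2': 0.75}
    print(S(A, B))           # 0.75: A is a subset of B to degree 0.75

One can verify by brute force that adjointness, a ⊗ b ≤ c iff a ≤ b → c, holds for
these two operations.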
     By L⁻¹ we denote L with the dual lattice order. An L-rough set A in a universe
X is a pair of L-sets, A = ⟨A̲, A̅⟩ ∈ (L × L⁻¹)^X. The L-set A̲ is called a lower
approximation of A and A̅ an upper approximation of A.¹
     The operations with L-rough sets are again defined componentwise, i.e.

    ⋂_{i∈I} ⟨A̲_i, A̅_i⟩ = ⟨⋂_{i∈I} A̲_i, ⋂⁻¹_{i∈I} A̅_i⟩ = ⟨⋂_{i∈I} A̲_i, ⋃_{i∈I} A̅_i⟩,
    ⋃_{i∈I} ⟨A̲_i, A̅_i⟩ = ⟨⋃_{i∈I} A̲_i, ⋃⁻¹_{i∈I} A̅_i⟩ = ⟨⋃_{i∈I} A̲_i, ⋂_{i∈I} A̅_i⟩.

Similarly, the graded subsethood is applied componentwise,

    S(⟨A̲, A̅⟩, ⟨B̲, B̅⟩) = S(A̲, B̲) ∧ S⁻¹(A̅, B̅) = S(A̲, B̲) ∧ S(B̅, A̅),

¹ In our setting we consider intents to be L-rough sets; the lower and upper approxima-
  tion are interpreted as the necessary intent and the possible intent, respectively.


and the crisp subsethood is then defined using the graded subsethood:

    ⟨A̲, A̅⟩ ⊆ ⟨B̲, B̅⟩   iff   S(⟨A̲, A̅⟩, ⟨B̲, B̅⟩) = 1,   iff   A̲ ⊆ B̲ and B̅ ⊆ A̅.

An L-rough set ⟨A̲, A̅⟩ is called natural if A̲ ⊆ A̅.
    Binary L-relations (binary fuzzy relations) between X and Y can be thought
of as L-sets in the universe X × Y. That is, a binary L-relation I ∈ L^{X×Y} between
a set X and a set Y is a mapping assigning to each x ∈ X and each y ∈ Y a truth
degree I(x, y) ∈ L (a degree to which x and y are related by I). L-rough relations
are then (L × L⁻¹)-sets in X × Y. For an L-relation I ∈ L^{X×Y} we define its inverse
I⁻¹ ∈ L^{Y×X} by I⁻¹(y, x) = I(x, y) for all x ∈ X, y ∈ Y.

Formal Concept Analysis in the Fuzzy Setting

    An L-context is a triplet ⟨X, Y, I⟩ where X and Y are (ordinary) sets and
I ∈ L^{X×Y} is an L-relation between X and Y. Elements of X are called objects,
elements of Y are called attributes, and I is called an incidence relation. I(x, y) = a is
read: “The object x has the attribute y to degree a.”
    Consider the following pairs of operators induced by an L-context ⟨X, Y, I⟩.
First, the pair ⟨↑, ↓⟩ of operators ↑ : L^X → L^Y and ↓ : L^Y → L^X is defined by

    A^↑(y) = ⋀_{x∈X} A(x) → I(x, y)   and   B^↓(x) = ⋀_{y∈Y} B(y) → I(x, y).

Second, the pair ⟨∩, ∪⟩ of operators ∩ : L^X → L^Y and ∪ : L^Y → L^X is defined by

    A^∩(y) = ⋁_{x∈X} A(x) ⊗ I(x, y)   and   B^∪(x) = ⋀_{y∈Y} I(x, y) → B(y).

    To emphasize that the operators are induced by I, we also denote the opera-
tors by ⟨↑_I, ↓_I⟩ and ⟨∩_I, ∪_I⟩.
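    The four operators translate directly into code. The following sketch reuses
tensor and residuum from the sketch above and stores an L-context as a dict
keyed by object–attribute pairs; the function names are ours.

    # Sketch: the antitone pair <up, down> and the isotone pair <icap, icup>
    # induced by an L-context (X, Y, I); I maps pairs (x, y) to truth degrees.
    def up(A, X, Y, I):      # A^up(y) = meet over x of A(x) -> I(x, y)
        return {y: min(residuum(A[x], I[(x, y)]) for x in X) for y in Y}

    def down(B, X, Y, I):    # B^down(x) = meet over y of B(y) -> I(x, y)
        return {x: min(residuum(B[y], I[(x, y)]) for y in Y) for x in X}

    def icap(A, X, Y, I):    # A^cap(y) = join over x of A(x) (x) I(x, y)
        return {y: max(tensor(A[x], I[(x, y)]) for x in X) for y in Y}

    def icup(B, X, Y, I):    # B^cup(x) = meet over y of I(x, y) -> B(y)
        return {x: min(residuum(I[(x, y)], B[y]) for y in Y) for x in X}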
    Fixpoints of these operators are called formal concepts. The set of all formal
concepts (along with set inclusion) forms a complete lattice, called an L-concept
lattice. We denote the sets of all concepts (as well as the corresponding L-concept
lattices) by B^{↑↓}(X, Y, I) and B^{∩∪}(X, Y, I), i.e.

    B^{↑↓}(X, Y, I) = {⟨A, B⟩ ∈ L^X × L^Y | A^↑ = B, B^↓ = A},
    B^{∩∪}(X, Y, I) = {⟨A, B⟩ ∈ L^X × L^Y | A^∩ = B, B^∪ = A}.

   For an L-concept lattice B(X, Y, I), where B is either B^{↑↓} or B^{∩∪}, denote the
corresponding sets of extents and intents by Ext(X, Y, I) and Int(X, Y, I). That is,

    Ext(X, Y, I) = {A ∈ L^X | ⟨A, B⟩ ∈ B(X, Y, I) for some B},
    Int(X, Y, I) = {B ∈ L^Y | ⟨A, B⟩ ∈ B(X, Y, I) for some A}.

   An (L₁, L₂)-Galois connection between the sets X and Y is a pair ⟨f, g⟩ of
mappings f : L₁^X → L₂^Y, g : L₂^Y → L₁^X, satisfying

    S(A, g(B)) = S(B, f(A))


for every A ∈ L₁^X, B ∈ L₂^Y.
    One can easily observe that the pair ⟨↑, ↓⟩ forms an (L, L)-Galois connection
between X and Y, while ⟨∩, ∪⟩ forms an (L, L⁻¹)-Galois connection between
X and Y.

L-rough Contexts and L-rough Concept Lattices

    An L-rough context is a quadruple ⟨X, Y, I̲, I̅⟩, where X and Y are (crisp) sets
of objects and attributes, respectively, and ⟨I̲, I̅⟩ is an L-rough relation. The
meaning of ⟨I̲, I̅⟩ is as follows: I̲(x, y) (resp. I̅(x, y)) is the truth degree to which
the object x surely (resp. possibly) has the attribute y.
    An L-rough context induces two operators defined as follows. Let ⟨X, Y, I̲, I̅⟩
be an L-rough context. Define the L-rough concept-forming operators by

    A^M = ⟨A^{↑I̲}, A^{∩I̅}⟩,
    ⟨B̲, B̅⟩^O = B̲^{↓I̲} ∩ B̅^{∪I̅}                                                      (1)

for A ∈ L^X and B̲, B̅ ∈ L^Y. Fixed points of ⟨M, O⟩, i.e. tuples ⟨A, ⟨B̲, B̅⟩⟩ ∈ L^X × (L × L⁻¹)^Y
such that A^M = ⟨B̲, B̅⟩ and ⟨B̲, B̅⟩^O = A, are called L-rough concepts. The L-sets B̲ and B̅
are called the lower intent approximation and the upper intent approximation, respectively.
    In [2] we showed that the pair of operators (1) is an (L, L × L⁻¹)-Galois
connection.
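    Under the reading of (1) reconstructed above (the antitone pair acting on I̲,
the isotone pair on I̅), the L-rough operators are simple compositions of the four
operators sketched earlier; again, the names are ours.

    # Sketch: the L-rough concept-forming operators of (1).
    def M(A, X, Y, I_low, I_upp):
        # returns the L-rough intent <lower, upper> of an L-set of objects A
        return (up(A, X, Y, I_low), icap(A, X, Y, I_upp))

    def O(B_low, B_upp, X, Y, I_low, I_upp):
        # returns the extent of the L-rough intent <B_low, B_upp>
        d = down(B_low, X, Y, I_low)
        u = icup(B_upp, X, Y, I_upp)
        return {x: min(d[x], u[x]) for x in X}   # componentwise intersection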

Linguistic Hedges

    Truth-stressing hedges were studied from the point of view of fuzzy logic as
truth functions of the logical connective ‘very true’, see [13]. Our approach is close
to that in [13]. A truth-stressing hedge is a mapping * : L → L satisfying

    1* = 1,   a* ≤ a,   a ≤ b implies a* ≤ b*,   a** = a*                        (2)

for each a, b ∈ L. Truth-stressing hedges were used to parametrize antitone L-
Galois connections e.g. in [3, 5, 9], and also to parametrize isotone L-Galois
connections in [1].
    On every complete residuated lattice L, there are two important truth-
stressing hedges:
 (i) identity, i.e. a* = a (a ∈ L);
(ii) globalization, i.e. a* = 1 if a = 1, and a* = 0 otherwise.
    A truth-depressing hedge is a mapping ◦ : L → L such that the following
conditions are satisfied:

    0◦ = 0,   a ≤ a◦,   a ≤ b implies a◦ ≤ b◦,   a◦◦ = a◦


for each a, b ∈ L. A truth-depressing hedge is a (truth function of a) logical
connective ‘slightly true’, see [16].
    On every complete residuated lattice L, there are two important truth-
depressing hedges:

 (i) identity, i.e. a◦ = a (a ∈ L);
(ii) antiglobalization, i.e. a◦ = 0 if a = 0, and a◦ = 1 otherwise.
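    On a finite chain, the border hedges take one line each and the axioms (2)
can be checked by brute force; a small sketch (names ours):

    # Sketch: the two border truth-stressing hedges and the antiglobalization.
    def identity(a):
        return a

    def globalization(a):        # truth-stressing: 'very true'
        return 1.0 if a == 1.0 else 0.0

    def antiglobalization(a):    # truth-depressing: 'slightly true'
        return 0.0 if a == 0.0 else 1.0

    def is_truth_stresser(h, L):         # brute-force check of the axioms (2)
        return (h(1.0) == 1.0
                and all(h(a) <= a for a in L)
                and all(h(a) <= h(b) for a in L for b in L if a <= b)
                and all(h(h(a)) == h(a) for a in L))

    print(is_truth_stresser(globalization, L5))   # True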




[Figure 1 shows, as step-function plots, eight truth-stressing hedges *_G, *_1, …, *_6, id
(top) and eight truth-depressing hedges ◦_G, ◦_1, …, ◦_6, id (bottom) on the chain
{0, 0.25, 0.5, 0.75, 1}.]

Fig. 1. Truth-stressing hedges (top) and truth-depressing hedges (bottom) on the 5-element
chain with Łukasiewicz operations, L = ⟨{0, 0.25, 0.5, 0.75, 1}, min, max, ⊗, →, 0, 1⟩. The
leftmost truth-stressing hedge *_G is the globalization, the leftmost truth-depressing hedge
◦_G is the antiglobalization. The rightmost hedges denoted by id are the identities.


    For a truth-stressing/truth-depressing hedge * we denote by fix(*) the set of
its idempotent elements in L, i.e. fix(*) = {a ∈ L | a* = a}.
    Let *₁, *₂ be truth-stressing hedges on L such that fix(*₁) ⊆ fix(*₂); then
a^{*₁*₂} = a^{*₁} holds for each a ∈ L. The same holds true for *₁, *₂ being truth-
depressing hedges.
    We naturally extend the application of truth-stressing/truth-depressing hedges
to L-sets: A*(x) = A(x)* for all x ∈ X.

3 Results

The L-rough concept-forming operator M assigns to each L-set of objects two
L-sets of attributes. The first one represents the necessity of having the attributes
and the second one the possibility of having the attributes. We add linguistic
hedges to the concept-forming operators to control the shape of these two L-sets.
    Since the L-rough concept-forming operators are defined via ⟨↑, ↓⟩ and ⟨∩, ∪⟩,
we first recall the parametrization of these operators as described in [8, 15].

3.1   Linguistic Hedges in Formal Fuzzy Concept Analysis

Let ⟨X, Y, I⟩ be an L-context and let r, q be truth-stressing hedges on L. The
antitone concept-forming operators parametrized by r and q induced by I are
defined by

    A^{↑r}(y) = ⋀_{x∈X} A(x)^r → I(x, y),
    B^{↓q}(x) = ⋀_{y∈Y} B(y)^q → I(x, y)

for all A ∈ L^X, B ∈ L^Y.
    Let r and ♠ be a truth-stressing hedge and a truth-depressing hedge on L,
respectively. The isotone concept-forming operators parametrized by r and ♠
induced by I are defined by

    A^{∩r}(y) = ⋁_{x∈X} A(x)^r ⊗ I(x, y),
    B^{∪♠}(x) = ⋀_{y∈Y} I(x, y) → B(y)^♠

for all A ∈ L^X, B ∈ L^Y.
    Properties of the hedges in the setting of multi-adjoint concept lattices with
heterogeneous conjunctors were studied in [14].
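    As an illustration, the parametrized operators differ from the unparametrized
ones only in applying a hedge to the values of A (or, for ∪♠, to the values of B);
a sketch reusing tensor and residuum from above, with the hedge passed in as a
function:

    # Sketch: hedge-parametrized concept-forming operators of Section 3.1.
    def up_h(A, X, Y, I, star):    # A^{up r}(y) = meet of A(x)^r -> I(x, y)
        return {y: min(residuum(star(A[x]), I[(x, y)]) for x in X) for y in Y}

    def down_h(B, X, Y, I, star):  # B^{down q}(x) = meet of B(y)^q -> I(x, y)
        return {x: min(residuum(star(B[y]), I[(x, y)]) for y in Y) for x in X}

    def icap_h(A, X, Y, I, star):  # A^{cap r}(y) = join of A(x)^r (x) I(x, y)
        return {y: max(tensor(star(A[x]), I[(x, y)]) for x in X) for y in Y}

    def icup_h(B, X, Y, I, flat):  # B^{cup s}(x) = meet of I(x, y) -> B(y)^s
        return {x: min(residuum(I[(x, y)], flat(B[y])) for y in Y) for x in X}

Choosing star = globalization collapses many L-sets to the same image, which
is what shrinks the concept lattices in the experiments below.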

3.2   L-rough Concept-Forming Operators with Linguistic Hedges

Let r, q be truth-stressing hedges on L and let ♠ be a truth-depressing hedge on
L. We parametrize the L-rough concept-forming operators as

    A^N = ⟨A^{↑r}, A^{∩r}⟩   and   ⟨B̲, B̅⟩^H = B̲^{↓q} ∩ B̅^{∪♠}                        (3)

for A ∈ L^X and B̲, B̅ ∈ L^Y.

Remark 1. When all three hedges are identities, the pair ⟨N, H⟩ coincides with
⟨M, O⟩; so it is an (L, L × L⁻¹)-Galois connection. For arbitrary hedges this
does not hold.

    The following theorem describes properties of ⟨N, H⟩.

Theorem 1. The pair ⟨N, H⟩ of L-rough concept-forming operators parametrized by
hedges has the following properties:

(a) A^N = (A^r)^M = (A^r)^N and ⟨B̲, B̅⟩^H = ⟨B̲^q, B̅^♠⟩^O = ⟨B̲^q, B̅^♠⟩^H;
(b) A^M ⊆ A^N and ⟨B̲, B̅⟩^O ⊆ ⟨B̲, B̅⟩^H;
(c) S(A₁^r, A₂^r) ≤ S(A₂^N, A₁^N) and S(⟨B̲₁, B̅₁⟩, ⟨B̲₂, B̅₂⟩) ≤ S(⟨B̲₂, B̅₂⟩^H, ⟨B̲₁, B̅₁⟩^H);
(d) A^r ⊆ A^{NH} and ⟨B̲^q, B̅^♠⟩ ⊆ ⟨B̲, B̅⟩^{HN};
(e) A₁ ⊆ A₂ implies A₂^N ⊆ A₁^N, and ⟨B̲₁, B̅₁⟩ ⊆ ⟨B̲₂, B̅₂⟩ implies ⟨B̲₂, B̅₂⟩^H ⊆ ⟨B̲₁, B̅₁⟩^H;
(f) S(A^r, ⟨B̲, B̅⟩^H) = S(⟨B̲^q, B̅^♠⟩, A^N);
(g) (⋃_{i∈I} A_i^r)^N = ⋂_{i∈I} A_i^N and ⟨⋃_{i∈I} B̲_i^q, ⋂_{i∈I} B̅_i^♠⟩^H = ⋂_{i∈I} ⟨B̲_i, B̅_i⟩^H;
(h) A^{NH} = A^{NHNH} and ⟨B̲, B̅⟩^{HN} = ⟨B̲, B̅⟩^{HNHN}.

Proof. (a) Follows immediately from the definition of N and H and the idempotency
of the hedges.
    (b) From (2) we have A^r ⊆ A; by properties of Galois connections this in-
clusion implies A^M ⊆ (A^r)^M, which is by (a) equivalent to A^M ⊆ A^N. The proof of
the second statement in (b) is similar.
    (c) Follows from (a) and properties of Galois connections.
    (d) By [2, Corollary 1(a)] we have A^r ⊆ (A^r)^{MO}. Using (a) we get A^r ⊆ A^{NO} and
from (b) we have A^{NO} ⊆ A^{NH}, so A^r ⊆ A^{NH}. Similarly for the second claim.
    (e) Follows directly from [2, Corollary 1(c)] and properties of Galois connec-
tions.
    (f) Since ⟨M, O⟩ forms an (L, L × L⁻¹)-Galois connection, using (a) we have

    S(A^r, ⟨B̲, B̅⟩^H) = S(A^r, ⟨B̲^q, B̅^♠⟩^O) = S(⟨B̲^q, B̅^♠⟩, (A^r)^M) = S(⟨B̲^q, B̅^♠⟩, A^N).

    (g) We can easily get

    (⋃_{i∈I} A_i^r)^N = ⟨(⋃_{i∈I} A_i^r)^{↑r}, (⋃_{i∈I} A_i^r)^{∩r}⟩
                      = ⟨⋂_{i∈I} A_i^{↑r}, ⋃_{i∈I} A_i^{∩r}⟩ = ⋂_{i∈I} A_i^N,

and

    ⟨⋃_{i∈I} B̲_i^q, ⋂_{i∈I} B̅_i^♠⟩^H = (⋃_{i∈I} B̲_i^q)^{↓q} ∩ (⋂_{i∈I} B̅_i^♠)^{∪♠}
                                      = ⋂_{i∈I} (B̲_i^{↓q} ∩ B̅_i^{∪♠}) = ⋂_{i∈I} ⟨B̲_i, B̅_i⟩^H.


    (h) Using (a), (d) and (e) twice, we have A^{NH} ⊆ A^{NHNH}. Using (d) for
⟨B̲, B̅⟩ = A^N we get A^N ⊆ A^{NHN}; then applying (e) we get A^{NHNH} ⊆ A^{NH},
which proves the first claim. The second claim can be proved analogously. □
      The set of fixed points of ⟨N, H⟩ endowed with the partial order ≤ given by

    ⟨A₁, B̲₁, B̅₁⟩ ≤ ⟨A₂, B̲₂, B̅₂⟩   iff   A₁ ⊆ A₂   iff   ⟨B̲₁, B̅₁⟩ ⊆ ⟨B̲₂, B̅₂⟩           (4)

is denoted by B^{NH}_{r,q,♠}(X, Y, I̲, I̅).

Remark 2. Note that from (4) it is clear that if a concept has a non-natural L-rough
intent then all its subconcepts have non-natural intents. If such concepts are
not desired, one can simply ignore them and work with the iceberg lattice of
concepts with natural L-rough intents.

      The next theorem shows a crisp representation of B^{NH}_{r,q,♠}(X, Y, I̲, I̅).
Theorem 2. B^{NH}_{r,q,♠}(X, Y, I̲, I̅) is isomorphic to the ordinary concept lattice
B^{↑↓}(X × fix(r), Y × fix(q) × fix(♠), I^×) where

    ⟨⟨x, a⟩, ⟨y, b, c⟩⟩ ∈ I^×   iff   a ⊗ b ≤ I̲(x, y) and a → c ≥ I̅(x, y).

Proof. The proof can be done by following the same steps as in [8, 15]. □
      The following theorem explains the structure of B^{NH}_{r,q,♠}(X, Y, I̲, I̅).

Theorem 3. B^{NH}_{r,q,♠}(X, Y, I̲, I̅) is a complete lattice with infima and suprema
given by

    ⋀_i ⟨A_i, ⟨B̲_i, B̅_i⟩⟩ = ⟨(⋂_i A_i)^{NH}, ⟨⋃_i B̲_i^q, ⋂_i B̅_i^♠⟩^{HN}⟩,
    ⋁_i ⟨A_i, ⟨B̲_i, B̅_i⟩⟩ = ⟨(⋃_i A_i^r)^{NH}, ⟨⋂_i B̲_i, ⋃_i B̅_i⟩^{HN}⟩

for all A_i ∈ L^X, B̲_i, B̅_i ∈ L^Y.

Proof. Follows from Theorem 2. □
Remark 3. Note that if we alternatively define (3) as

    A^N = ⟨(A^{↑r})^q, (A^{∩r})^♠⟩   and   ⟨B̲, B̅⟩^H = (B̲^{↓q} ∩ B̅^{∪♠})^r              (5)

or

    A^N = ⟨(A^↑)^q, (A^∩)^♠⟩   and   ⟨B̲, B̅⟩^H = (B̲^↓ ∩ B̅^∪)^r                        (6)

or

    A^N = ⟨(A^{↑r})^q, (A^{∩r})^♠⟩   and   ⟨B̲, B̅⟩^H = ⟨B̲, B̅⟩^O

or

    A^N = A^M   and   ⟨B̲, B̅⟩^H = (B̲^{↓q} ∩ B̅^{∪♠})^r,

we obtain an isomorphic concept lattice. In addition, (5) and (6) produce the
same concept lattice.


3.3    Size Reduction of Fuzzy Rough Concept Lattices

This part provides results on reduction with truth-stressing and truth-depressing
hedges analogous to those of [10] for the antitone fuzzy concept-forming operators
and of [15] for the isotone fuzzy concept-forming operators.
    For the next theorem we need the following lemma.

Lemma 1. Let r, ♥, q, ♦ be truth-stressing hedges on L such that fix(r) ⊆ fix(♥) and
fix(q) ⊆ fix(♦); let ♠, s be truth-depressing hedges on L such that fix(♠) ⊆ fix(s). We
have

    A^{N_♥} ⊆ A^{N_r}   and   ⟨B̲, B̅⟩^{H_{♦,s}} ⊆ ⟨B̲, B̅⟩^{H_{q,♠}}.

Proof. We have A^{r♥} ⊆ A^♥ from (2). From the assumption fix(r) ⊆ fix(♥) we get
A^{r♥} = A^r; whence A^r ⊆ A^♥. Theorem 1(e) implies (A^♥)^N ⊆ (A^r)^N, which is
by claim (a) of that theorem equivalent to A^{N_♥} ⊆ A^{N_r}. The second claim can
be proved similarly. □
Theorem 4. Let r, ♥, q, ♦ be truth-stressing hedges on L such that fix(r) ⊆ fix(♥)
and fix(q) ⊆ fix(♦); let ♠, s be truth-depressing hedges on L such that fix(♠) ⊆ fix(s).
Then

    |B^{NH}_{r,q,♠}(X, Y, I̲, I̅)| ≤ |B^{NH}_{♥,♦,s}(X, Y, I̲, I̅)|

for all L-rough contexts ⟨X, Y, I̲, I̅⟩.

In addition, if r = ♥ = id, we have

    Ext^{NH}_{r,q,♠}(X, Y, I̲, I̅) ⊆ Ext^{NH}_{♥,♦,s}(X, Y, I̲, I̅).

Similarly, if q = ♦ = ♠ = s = id, we have

    Int^{NH}_{r,q,♠}(X, Y, I̲, I̅) ⊆ Int^{NH}_{♥,♦,s}(X, Y, I̲, I̅).

Proof. The first claim follows directly from Theorem 2 and results on subcontexts
in [11].
    Now we show the second claim. Note that for each A ∈ Ext^{NH}_{r,q,♠}(X, Y, I̲, I̅)
we have

    A = A^{N_r H_{q,♠}} = A^{N_♥ H_{q,♠}} ⊇ A^{N_♥ H_{♦,s}} ⊇ A,

so that A ∈ Ext^{NH}_{♥,♦,s}(X, Y, I̲, I̅). The third claim can be proved similarly. □
Example 1. Consider the truth-stressing hedges *_G, *_1, *_2, id and the truth-
depressing hedges ◦_G, ◦_1, ◦_2, id from Figure 1. One can easily observe that

    fix(*_G) ⊆ fix(*_1) ⊆ fix(*_2) ⊆ fix(id),
    fix(◦_G) ⊆ fix(◦_1) ⊆ fix(◦_2) ⊆ fix(id).

Consider the L-context of books and their graded properties in Fig. 2, with L
being the 5-element Łukasiewicz chain. Using various combinations of the hedges
we obtain a smooth transition in the size of the associated fuzzy rough concept
lattice, going from 10 concepts up to 498 (see Tab. 1). When the 5-element Gödel
chain is used instead, we again get a transition, going from 10 concepts up to
298 (see Tab. 2).



                 High rating     Large no. of pages    Low price     Top sales rank
      1             0.75                  0                1                0
      2              0.5                  1              0.25              0.5
      3               1                   1              0.25              0.5
      4             0.75                 0.5             0.25               1
      5             0.75                0.25             0.75               0
      6               1                   0              0.75             0.25


Fig. 2. L-context of books and their graded properties; this L-context was used in [1, 15]
to demonstrate reduction of L-concept lattices using hedges.



            ♠ = ◦_G  *_G  *_1  *_2   id        ♠ = ◦_1  *_G  *_1  *_2   id
                *_G   10   16   59   61            *_G   15   28   71  110
                *_1   12   22   65   93            *_1   15   28   71  170
                *_2   15   26   69  103            *_2   22   28   79  195
                 id   19   41   97  152             id   28   28  110  264

            ♠ = ◦_2  *_G  *_1  *_2   id        ♠ = id   *_G  *_1  *_2   id
                *_G   15   53  134  211            *_G   27   75  160  297
                *_1   15   53  134  290            *_1   27   75  160  372
                *_2   22   63  146  327            *_2   32   80  165  396
                 id   28   80  181  415             id   40   99  202  498

Table 1. Numbers of concepts of the L-context from Fig. 2 formed by ⟨N, H⟩ parametrized
by r, q, and ♠. The 5-element Łukasiewicz chain is used as the structure of truth degrees.
The rows represent the hedge r and the columns represent the hedge q.




             ♠ = ◦_G  *_G  *_1  *_2   id        ♠ = ◦_1  *_G  *_1  *_2   id
                 *_G   10   18   24   24            *_G   15   29   36   45
                 *_1   12   21   33   36            *_1   15   32   49   63
                 *_2   15   29   45   48            *_2   22   57   78  106
                  id   19   33   51   54             id   28   66   89  117

             ♠ = ◦_2  *_G  *_1  *_2   id        ♠ = id   *_G  *_1  *_2   id
                 *_G   15   32   48   59            *_G   27   50   66  125
                 *_1   15   32   59   75            *_1   27   50   80  167
                 *_2   22   57   88  118            *_2   32   79  113  257
                  id   28   66  100  130             id   40   90  127  298

Table 2. Numbers of concepts of the L-context from Fig. 2 formed by ⟨N, H⟩ parametrized
by r, q, and ♠. The 5-element Gödel chain is used as the structure of truth degrees. The rows
represent the hedge r and the columns represent the hedge q.


4 Conclusion and further research

We have shown that the L-rough concept-forming operators can be parametrized
by truth-stressing and truth-depressing hedges, similarly to the antitone and
isotone fuzzy concept-forming operators.
    Our future research includes a study of attribute implications whose semantics
is related to the present setting. That will combine results on fuzzy attribute
implications [7] and attribute containment formulas [6].


Acknowledgment

Supported by grant No. 15-17899S, “Decompositions of Matrices with Boolean
and Ordinal Data: Theory and Algorithms”, of the Czech Science Foundation.


References

 1. Eduard Bartl, Radim Belohlavek, Jan Konecny, and Vilem Vychodil. Isotone Galois
    connections and concept lattices with hedges. In IEEE IS 2008, Int. IEEE Conference
    on Intelligent Systems, pages 15-24–15-28, Varna, Bulgaria, 2008.
 2. Eduard Bartl and Jan Konecny. Formal L-concepts with Rough Intents. In CLA 2014:
    Proceedings of the 11th International Conference on Concept Lattices and Their Applications,
    pages 207–218, 2014.
 3. Radim Belohlavek. Reduction and simple proof of characterization of fuzzy concept
    lattices. Fundamenta Informaticae, 46(4):277–285, 2001.
 4. Radim Belohlavek. Fuzzy Relational Systems: Foundations and Principles. Kluwer
    Academic Publishers, Norwell, USA, 2002.
 5. Radim Belohlavek, Tatana Funioková, and Vilem Vychodil. Fuzzy closure operators
    with truth stressers. Logic Journal of the IGPL, 13(5):503–513, 2005.
 6. Radim Belohlavek and Jan Konecny. A logic of attribute containment. 2008.
 7. Radim Belohlavek and Vilem Vychodil. A logic of graded attributes. Submitted to
    Artificial Intelligence.
 8. Radim Belohlavek and Vilem Vychodil. Reducing the size of fuzzy concept lattices
    by hedges. In FUZZ-IEEE 2005, The IEEE International Conference on Fuzzy Systems,
    pages 663–668, Reno (Nevada, USA), 2005.
 9. Radim Belohlavek and Vilem Vychodil. Fuzzy concept lattices constrained by
    hedges. JACIII, 11(6):536–545, 2007.
10. Radim Belohlavek and Vilem Vychodil. Formal concept analysis and linguistic
    hedges. Int. J. General Systems, 41(5):503–532, 2012.
11. Bernard Ganter and Rudolf Wille. Formal Concept Analysis – Mathematical Foundations.
    Springer, 1999.
12. Petr Hájek. Metamathematics of Fuzzy Logic (Trends in Logic). Springer, November
    2001.
13. Petr Hájek. On very true. Fuzzy Sets and Systems, 124(3):329–333, 2001.
14. Jan Konecny, Jesús Medina, and Manuel Ojeda-Aciego. Multi-adjoint concept lattices
    with heterogeneous conjunctors and hedges. Annals of Mathematics and Artificial
    Intelligence, 72(1):73–89, 2011.


15. Jan Konecny. Isotone fuzzy Galois connections with hedges. Information Sciences,
    181(10):1804–1817, 2011. Special Issue on Information Engineering Applications
    Based on Lattices.
16. Vilem Vychodil. Truth-depressing hedges and BL-logic. Fuzzy Sets and Systems,
    157(15):2074–2090, 2006.
17. Morgan Ward and R. P. Dilworth. Residuated lattices. Transactions of the American
    Mathematical Society, 45:335–354, 1939.
       Revisiting Pattern Structures for Structured
                      Attribute Sets

               Mehwish Alam1 , Aleksey Buzmakov1 , Amedeo Napoli1 , and
                                 Alibek Sailanbayev2?
      1
          LORIA (CNRS – Inria NGE – U. de Lorraine), Vandœuvre-lès-Nancy, France
                       2
                         Nazarbayev University, Astana, Kazakhstan
               {mehwish.alam, aleksey.buzmakov, amedeo.napoli}@loria.fr,
                              alibek.sailanbayev@nu.edu.kz



            Abstract. In this paper, we revisit an original proposition on pattern
            structures for structured sets of attributes. There are several reasons for
            carrying out this kind of research work. The original proposition does
            not give many details on the whole framework, and especially on the
            possible ways of implementing the similarity operation. There exists an
            alternative definition without any reference to pattern structures, and
            we would like to draw a parallel between the two points of view. Moreover,
            we discuss an efficient implementation of the intersection operation in
            the corresponding pattern structure. Finally, we discovered that pattern
            structures for structured attribute sets are very well adapted to the clas-
            sification and analysis of RDF data. We conclude the paper with an
            experimental section where it is shown that the provided implementation
            of pattern structures for structured attribute sets is quite efficient.


 Keywords: Formal Concept Analysis, Pattern Structures, Structured Attribute
 Sets, Least Common Ancestor, Range Minimum Query.


 1        Introduction

 In this paper, we want to make precise and develop a section of [1] related to
 pattern structures and structured sets of attributes. There are several reasons
 for carrying out this kind of research work. Firstly, the pattern structures, the
 similarity operator ⊓ and the associated subsumption order ⊑ for structured
 sets of attributes are based on antichains and are rather briefly sketched in
 the original paper. Secondly, there is an alternative and more “qualitative”
 point of view on the same subject in [2, 3] without any reference to pattern
 structures, and we would like to draw a parallel between these two points of
 view. Finally, for classifying RDF triples in the analysis of the content of Linked
 Open Data (LOD), we discovered that pattern structures for structured sets of
 attributes are actually very well adapted to this problem [4]. Moreover, the
  ?
      This work was done during the stay of Alibek Sailanbayev at LORIA, France.




classification of RDF triples provides a very good and practical example for illus-
trating the use of such a pattern structure and helps to reconcile the two above
points of view.
    Accordingly, in this paper, we will go back to the two original definitions and
show how they are related. To complete the history, it is worth mentioning
that antichains, whose intersection is the basis of the similarity operation in
the pattern structure for structured attribute sets studied in our paper, are
discussed in the book [5]. Moreover, this book cites as an application of antichain
intersection an older paper from 1994 [6], written in French, about the decom-
position of total orderings and its application to knowledge discovery.
    Then, we proceed to present a way of efficiently working with antichains and
intersections of antichains, which can be very useful, especially in the case of large
sets of data. The last section details a series of experiments where it is shown that
pattern structures can be implemented with an efficient intersection operation
and that they have a generally better behavior than scaled contexts.


2     Pattern Structures for Structured Attributes

2.1   Pattern Structures

Formal Concept Analysis [7] can process only binary contexts. Pattern structures
are an extension of FCA which allows a direct processing of complex, non-binary
data. The formalism of pattern structures was introduced in [1].
    A pattern structure is a triple (G, (D, ⊓), δ), where G is a set of objects,
(D, ⊓) is a meet-semilattice of descriptions, and δ : G → D maps an object to
its description. In other words, a pattern structure is composed of a set of objects
and a set of descriptions equipped with a similarity operation denoted by ⊓. This
similarity operation is idempotent, commutative and associative. If (G, (D, ⊓), δ)
is a pattern structure then the derivation operators (forming a Galois connection)
are defined as:

    A^□ := ⨅_{g∈A} δ(g)                  for A ⊆ G,
    d^□ := {g ∈ G | d ⊑ δ(g)}            for d ∈ D.

Each element in D is referred to as a pattern. The natural order on (D, ⊓), given
by c ⊑ d ⇔ c ⊓ d = c, is called the subsumption order. Now a pattern concept
can be defined as follows:

Definition 1 (Pattern Concept). A pattern concept of a pattern structure
(G, (D, ⊓), δ) is a pair (A, d) where A ⊆ G and d ∈ D such that A^□ = d and
d^□ = A, where A is called the concept extent and d is called the concept intent.

    A pattern extent corresponds to the maximal set of objects A whose descrip-
tions subsume the description d, where d is the maximal common description
of the objects in A. The set of all pattern concepts is partially ordered w.r.t. inclu-
sion on extents, i.e., (A₁, d₁) ≤ (A₂, d₂) iff A₁ ⊆ A₂ (or, equivalently, d₂ ⊑ d₁),
forming a lattice, called the pattern lattice.
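    For illustration, the derivation operators can be written generically, with the
similarity operation passed in as a function; the helper names are ours, and
descriptions are assumed to support equality tests (e.g. frozensets with inter-
section as ⊓).

    # Sketch: derivation operators of a pattern structure (G, (D, meet), delta).
    from functools import reduce

    def extent_to_intent(A, delta, meet):
        # common description (similarity) of all objects in a non-empty set A
        return reduce(meet, (delta(g) for g in A))

    def intent_to_extent(d, G, delta, meet):
        # objects whose description subsumes d: d subsumed iff d meet delta(g) == d
        return {g for g in G if meet(d, delta(g)) == d}

    # Toy usage with set descriptions:
    delta = {'g1': frozenset('abc'), 'g2': frozenset('ab'), 'g3': frozenset('c')}.get
    meet = frozenset.intersection
    d = extent_to_intent({'g1', 'g2'}, delta, meet)                # frozenset({'a', 'b'})
    print(intent_to_extent(d, {'g1', 'g2', 'g3'}, delta, meet))    # {'g1', 'g2'}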

2.2   Two original propositions on structured attribute sets
We briefly recall the two original propositions supporting the present study. The
first was published by Carpineto & Romano in [2] and then developed in [3].
The second is related to the definition of pattern structures by Ganter &
Kuznetsov in [1].
     In [2, 3], the authors consider a formal context (G, M, I) and an extended set
of attributes M* ⊃ M where attributes are organized within a subsumption hi-
erarchy according to a partial ordering denoted by ≤_{M*}. The following condition
should be satisfied:

     ∀g ∈ G, m₁ ∈ M, m₂ ∈ M*: [(g, m₁) ∈ I, m₁ ≤_{M*} m₂] ⟹ (g, m₂) ∈ I.
     The subsumption hierarchy can be either a tree or an acyclic graph with a
unique maximal element, as is the case for attributes lying in a thesaurus, for
example. The building of a concept lattice from such a context can then be done
in two main ways. The first is to use a scaling and to complete the description of
an object with all attributes implied by the original attributes. We discuss this
scaling operation in detail later. The problem would be the space necessary to
store the scaled context, especially in the case of big data. The second way is to use
an “extended intersection operation” between sets of attributes, which is defined
as follows. The intersection of two sets of attributes Y₁ and Y₂ is obtained by
finding for each pair (m₁, m₂), m₁ ∈ Y₁, m₂ ∈ Y₂, the most specific attributes in
M* that are more general than m₁ and m₂, and then retaining only the most
specific elements of the set of attributes generated in this way. Then if (X₁, Y₁)
and (X₂, Y₂) are two concepts, we have:

     (X₁, Y₁) ≤ (X₂, Y₂) ⟺ ∀m₂ ∈ Y₂, ∃m₁ ∈ Y₁: m₁ ≤_{M*} m₂.
     In other words, this intersection operation corresponds to the intersection of
two antichains, as explained in [1], where the authors define the formalism
of pattern structures and take structured attribute sets as an instantiation. More
formally, it is assumed that the attribute set (M, ≤_M) is finite and partially
ordered, and that all attribute combinations that can occur must be order ideals
(downsets) of this order. Any order ideal O can then be described by the set
of its maximal elements, O = {x | x ≤ y for some maximal y ∈ O}. It should be
noticed that the order considered on the attribute sets in [1] is reversed with
respect to the order considered in [2, 3]. However, we keep the original definitions
used in [1] in the present paragraph. These maximal elements form an antichain,
and conversely, each antichain is the set of maximal elements of some order ideal.
Thus, the semilattice (D, ⊓) of patterns in the pattern structure consists of all
antichains of the ordered attribute set. In addition, it is isomorphic to the lattice
of all order ideals of the ordered set, and thus isomorphic to the concept lattice
of the context (P, P, ≱). For two antichains AC₁ and AC₂, the infimum AC₁ ⊓ AC₂
consists of all maximal elements of the order ideal

     {m ∈ P | ∃ac₁ ∈ AC₁, ∃ac₂ ∈ AC₂: m ≤ ac₁ and m ≤ ac₂}.
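     A direct, unoptimized rendering of this infimum for a finite poset may be
helpful; `leq` is a hypothetical helper deciding the partial order, and the pairwise
scan over the whole poset is exactly what Section 2.4 improves on for trees.

    # Sketch: infimum of two antichains in a finite poset P (as in [1]).
    def maximal(elements, leq):
        # keep only elements with nothing strictly above them in the set
        return {m for m in elements
                if not any(m != n and leq(m, n) for n in elements)}

    def antichain_meet(AC1, AC2, P, leq):
        # maximal elements of {m in P | m <= some ac1 and m <= some ac2}
        candidates = {m for m in P
                      if any(leq(m, a) for a in AC1) and any(leq(m, b) for b in AC2)}
        return maximal(candidates, leq)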


    There is a “canonical representation context” (or an associated scaling oper-
ator) for the pattern structure (G, (D, ⊓), δ) related to structured attribute sets,
which is defined by the set of “principal ideals ↓p” as follows: (G, P, I) with
(g, p) ∈ I ⟺ p ≤ δ(g).
    In the next section, we make precise and discuss the pattern structure for
structured attribute sets by taking the point of view of filters and not of ideals,
in agreement with the order from [2, 3], with the most general attributes above.


2.3    From Structured Attributes to Tree-shaped Attributes

An important case of structured attributes is “tree-shaped attributes”, i.e., when
the attributes are organized within a partial order corresponding to a rooted tree.
If this is the case, then the root of the tree, denoted by ⊤, can be matched against
the description of any object, while the leaves of this tree are the most detailed
descriptions. For example, the root can correspond to the attribute ‘Animal’ and
a leaf can correspond to the attribute ‘Cat’; somewhere in between there could
be the attribute ‘Mammal’.
     Such data naturally appear in the domain of the semantic web. For example,
Figure 1 gives a small part of ACCS¹. This attribute tree will be used as a running
example and should be read as follows. If an object belongs to class C1 (and
possibly to some other classes), then it necessarily belongs to classes C10, C12,
and ⊤; e.g., if an object is a cat, then it is a mammal and an animal. Accordingly,
the description of an object can include several classes, e.g., classes C1, C5 and C8.
Thus, some of the tree-shaped attributes can be omitted from the description of
an object. However, they should always be taken into account when computing
the intersection between descriptions. Thus, in order to avoid redundancy in the
descriptions, we can allow only antichains of the tree as possible elements of the
set D of descriptions, and then, accordingly, compute the intersection of antichains.
     An efficient way of computing the intersection of antichains is explained in the
next section. Here it is important to notice that although it is a hard task to
efficiently compute the intersection of antichains in an arbitrary partial order of
attributes, the intersection of antichains in a tree can help in computing this
more general intersection. Indeed, in a partial order of attributes, we can add an
artificial attribute ⊤ that can be matched against any description. Then, instead
of considering an intersection of antichains in an arbitrary poset, we can take a
spanning tree of it with ⊤ taken as the root. Although we have lost some relations
between attributes, and, thus, the size of the antichains is probably larger, we
can apply the efficient intersection of antichains of a tree discussed below.


2.4    On Computing Intersection of Antichains in a Tree

In this subsection we show how to efficiently solve the problem of intersection
of antichains in a tree. The problem is formalized as follows. A partial order is
1
    https://www.acm.org/about/class/2012


[Figure 1 shows a rooted tree: ⊤ has children C12 and C6; C12 has children C10 and
C11; C6 has child C13; C10 has leaves C1 and C2; C11 has leaves C4 and C5; C13 has
leaves C7, C8 and C9.]

  Fig. 1: A small part from the ACM Computing Classification System (ACCS).



described by the Hasse diagram corresponding to the tree. The root is denoted
by ⊤ and it is larger w.r.t. the partial order than any other element in the tree.
Given a rooted tree T and two antichains X and Y, we should find an antichain
Z such that (1) for all x ∈ X ∪ Y there is z ∈ Z such that x ≤ z, and (2) no
z ∈ Z can be removed or changed to z̃ < z without violating requirement (1).
    If the cardinality of the antichains X and Y is 1, then this task reduces to
the well-known problem of the Least Common Ancestor (LCA). Already in 1984
it was shown that the LCA problem can be reduced to the Range Minimum Query
(RMQ) problem [8]. Later, several simpler approaches were introduced for solving
the LCA problem. Here we briefly introduce the reduction of LCA to RMQ in
accordance with [9].

Reduction of LCA to RMQ. Given an array of numbers, the RMQ problem
consists in efficiently answering queries on the position of the minimal value in a
given range (interval) of positions of this array. For example, given the array

    Array     [ 2 1 0 3 2 ]
    Positions   1 2 3 4 5

where the first value is in position 1 and the last value is in position 5, the answer
to the query on the position of the minimal number in the range 2–4, i.e., the
corresponding part of the array is [1;0;3], is 3 (the value of the 3rd element in the
array is 0 and it is the minimal value in this range). Accordingly, the position
of the minimal number in the range 1–2 (the part of the array is [2;1]) is 2. The
good point about this problem is that it can be solved in O(n) preprocessing
time and in O(1) time per query [9], where n is the number of elements in
the array.
    In order to introduce the reduction of LCA to RMQ we need to know what
the depth of a tree vertex is. The depth of a vertex in a rooted tree is the number
of edges in the shortest path from that vertex to the root of the tree.
    We create the array of the depths of the vertices in the tree; it is used as the
input array for RMQ. We build this array in the following way. We traverse the
tree in depth first order (see Figure 2). Every time the algorithm considers a


[Figure 2 shows a rooted tree: vertex 0 has children 1 and 6; vertex 1 has children 2
and 3; vertex 3 has children 4 and 5; vertex 6 has child 7.]

      Depth array D        [ 0  1  2  1  2  3  2  3  2  1  0  1  2  1  0 ]
      Corresponding vertex  v0 v1 v2 v1 v3 v4 v3 v5 v3 v1 v0 v6 v7 v6 v0
      Positions              1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

Fig. 2: Reducing the LCA task to RMQ. Arrows in the tree show the depth first order
traversal. The depth array D is accompanied by the corresponding vertices and positions.


vertex, i.e., the first visit or a return to the vertex, we should put the depth
of that vertex at the end of the depth array D. We also keep track of a vertex
corresponding to each depth in D. The depth array D has 2|T | − 1 values, where
|T | is the number of vertices in the tree.
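    The construction just described can be sketched as follows. For simplicity we
use a sparse-table RMQ with O(n log n) preprocessing instead of the O(n)/O(1)
solution cited from [9], and 0-based positions instead of the 1-based positions of
the text.

    # Sketch: Euler-tour depth array and LCA via RMQ (sparse table).
    import math

    def euler_tour(tree, root):
        # tree: dict vertex -> list of children. Returns the depth array D,
        # the corresponding-vertex array V, and a first position per vertex.
        D, V, first = [], [], {}
        def dfs(v, depth):
            first.setdefault(v, len(D))
            D.append(depth); V.append(v)
            for c in tree.get(v, []):
                dfs(c, depth + 1)
                D.append(depth); V.append(v)    # record the return to v
        dfs(root, 0)
        return D, V, first                       # len(D) == 2|T| - 1

    def sparse_table(D):
        st = [list(range(len(D)))]               # st[k][i]: argmin of D[i..i+2^k-1]
        for k in range(1, math.floor(math.log2(len(D))) + 1):
            half, prev = 1 << (k - 1), st[-1]
            st.append([prev[i] if D[prev[i]] <= D[prev[i + half]] else prev[i + half]
                       for i in range(len(D) - (1 << k) + 1)])
        return st

    def rmq(D, st, i, j):                        # position of the minimum of D[i..j]
        if i > j:
            i, j = j, i
        k = math.floor(math.log2(j - i + 1))
        a, b = st[k][i], st[k][j - (1 << k) + 1]
        return a if D[a] <= D[b] else b

    def lca(D, V, first, st, u, v):
        return V[rmq(D, st, first[u], first[v])]

    # The tree of Figure 2 reproduces the depth array from the text:
    tree = {0: [1, 6], 1: [2, 3], 3: [4, 5], 6: [7]}
    D, V, first = euler_tour(tree, 0)
    print(D)                  # [0, 1, 2, 1, 2, 3, 2, 3, 2, 1, 0, 1, 2, 1, 0]
    st = sparse_table(D)
    print(lca(D, V, first, st, 3, 6))            # 0, as in the text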
    Now for any value in D we know the corresponding vertex of the tree and
any vertex of the tree is associated with several positions in D. For example, in
Figure 2 the value in the first position of D, i.e., D[1], is 0, corresponding to
the root of the tree. If we take vertex 3, then the associated values of D are on
positions 5, 7, and 9.
    Given two vertices A, B ∈ T , let a be one of the positions in D corresponding
to vertex A, let b be one of the positions in D corresponding to B. Then it
can be shown that the vertex corresponding to the minimal value in D in the
range a–b is the least common ancestor of A and B. For example, to find LCA
between vertices 3 and 6 in Figure 2, one should first take two positions in D
corresponding to vertices 3 and 6. Positions 5,7, and 9 in array D correspond to
vertex 3, positions 12 and 14 correspond to vertex 6. Thus, we can query RMQ
for ranges 5–14, 7–14, 7–12, etc. The minimal value in D for all these ranges is 0
located at position 11 in D, i.e., RMQ(5, 14) = 11. Thus, the vertex corresponding
to position 11, i.e., vertex 0, is the least common ancestor for vertices 3 and 6.
    Let us notice that if A ∈ T is an ancestor of B ∈ T and a and b are two
positions corresponding to the vertices A and B, then the position RMQ(a, b) in
D always corresponds to the vertex A, in most of the cases RMQ(a, b) = a. Thus
we are also able to check if a vertex of T is an ancestor of another vertex of T .
    Now we know how to solve the LCA problem in O(|T |) preprocessing com-
putational time and O(1) computational time per query. Let us return to the
problem of intersecting antichains of a tree.


Antichain intersection problem. Let us first discuss the naive approach
to this problem. Given two antichains A, B ⊂ T , one can compute the set

    D                    [ 0   1   2   3   2   3   2   1   2   3   2   3   2   1   0   1   2   3   2   3   2   3   2   1   0 ]
    Corresponding vertex   ⊤  C12 C10  C1 C10  C2 C10 C12 C11  C4 C11  C5 C11 C12  ⊤   C6 C13  C7 C13  C8 C13  C9 C13  C6  ⊤
    Positions              1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25

Fig. 3: The depth array, the corresponding vertices, and the indices for the tree in Fig-
ure 1.


{LCA(a, b) | ∀a ∈ A and ∀b ∈ B}. Then this set should be filtered for remov-
ing the comparable elements in order to get an antichain. It is easy to see that
the result is the intersection of A and B but it requires at least |A|·|B| operations.
    Let us reformulate this naive approach in terms of RMQ. Given a depth array
D and two sets of indices A, B ⊆ ℕ_{|D|}, each forming an antichain, we should
compute the set Z = {RMQ(a, b) | a ∈ A, b ∈ B} and then remove all elements
z ∈ Z such that there is x ∈ Z \ {z} with the position RMQ(z, x) corresponding
to the same vertex as z, i.e., elements z corresponding to an ancestor of another
element from Z.
    Let us consider for example the tree T given in Figure 1. Figure 3 shows the
depth array, the corresponding vertices, and the indices of this array. Let us show
how to compute the intersection of A = {C1, C5, C8} and B = {C1, C7, C9}. The
expected result is {C1, C13}. First we translate the sets A and B to the indices
in the array D for RMQ, i.e., A = {4, 12, 20} and B = {4, 18, 22}. Then we compute
RMQ for all pairs from A and B:

  {RMQ(4, 4) = 4, RMQ(4, 18) = 15, RMQ(4, 22) = 15, · · · , RMQ(20, 18) = 19, · · · }.

Now we should remove the positions corresponding to ancestors in the tree, e.g.,
RMQ(4, 15) = 15 and, hence, 15 should be removed. The result is {4, 21}, repre-
senting exactly {C1, C13}.
    Let us discuss two points that help us to reduce the complexity of the naive
approach. Consider positions i ≤ l ≤ m ≤ j and k = RMQ(i, j), n = RMQ(l, m).
Then the depth at position k is not larger than the depth at position n,
D[k] ≤ D[n]. Hence the position RMQ(k, n) corresponds to the same vertex as
position k. For example, in Figure 3, RMQ(4, 6) = 5 and RMQ(2, 7) = 2. The value
at position 5 in the array D is D[5] = 2. It is larger than the value at position 2,
D[2] = 1. Thus, the value at the position returned by RMQ for the larger range is
not larger than the value at the position returned by RMQ for the smaller range.
    Thus, given two sets of indices A, B ⊆ ℕ_{|D|} corresponding to antichains, we
can modify the naive algorithm by ordering the set A ∪ B and computing RMQ
only for consecutive elements from different sets, rather than for all pairs from
different sets. For example, for intersecting A = {4, 12, 20} and B = {4, 18, 22},
we join them into the set Z = {4_A, 4_B, 12_A, 18_B, 20_A, 22_B}. Then, we compute
RMQ only for consecutive elements from different sets, i.e., RMQ(4, 4) = 4,
RMQ(4, 12) = 8, RMQ(12, 18) = 15, RMQ(18, 20) = 19, and RMQ(20, 22) = 21. The
cardinality of A ∪ B is at most |A| + |B|, hence the number of consecutive
elements is O(|A| + |B|), and, thus, the number of RMQs of consecutive elements
is O(|A| + |B|).


     However, the set Z of RMQs of consecutive elements does not necessarily
correspond to an antichain in T. Thus we should filter this set in order to remove
all elements that are ancestors of other elements of Z. Accordingly, it is clear that
to filter the set Z it is enough to check only consecutive elements of Z. For example,
the intersection of A = {4, 12, 20} and B = {4, 18, 22} gives us the set
Z = {4, 8, 15, 19, 21}. Let us now check the RMQs of consecutive elements.
RMQ(4, 8) = 8, thus 8 is an ancestor of 4 and 8 can be removed. Since 8 is
removed, we compare RMQ(4, 15) = 15; thus 15 should also be removed. Then we
compute RMQ(4, 19) = 15, i.e., the positions 4 and 19 are not ancestors of one
another and both are kept. Now we compute RMQ(19, 21) = 19 and, thus, 19 should
be removed (actually positions 19 and 21 correspond to the same vertex C13 and
one of them should be removed). Thus, the result of intersecting A and B is {4, 21},
corresponding to the antichain {C1, C13}.
     Since the number of elements in the set Z is O(|A| + |B|), the overall com-
plexity of computing the intersection of two antichains A, B ⊂ T of a tree T is
O(|A| + |B|) or, taking into account that the cardinality of an antichain in a tree
is at most the number of leaves (vertices having no descendants) of this tree,
the complexity of computing the intersection of two antichains is O(|Leaves(T)|).
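    A sketch of this improved intersection, following the worked example above;
it reuses rmq and the arrays D, V from the previous sketch, with 0-based positions,
and takes each antichain as a sorted list of positions in D.

    # Sketch: intersection of two antichains of a tree via consecutive RMQs.
    def intersect_antichains(A, B, D, st, V):
        merged = sorted([(a, 0) for a in A] + [(b, 1) for b in B])
        Z = [rmq(D, st, merged[i][0], merged[i + 1][0])
             for i in range(len(merged) - 1)
             if merged[i][1] != merged[i + 1][1]]   # consecutive, different sets
        kept = []
        for z in Z:                                 # filter out ancestors
            drop = False
            while kept:
                p = rmq(D, st, kept[-1], z)
                if V[p] == V[z]:
                    drop = True                     # z is an ancestor (or duplicate)
                    break
                if V[p] == V[kept[-1]]:
                    kept.pop()                      # kept[-1] is an ancestor of z
                else:
                    break                           # incomparable: keep both
            if not drop:
                kept.append(z)
        return [V[z] for z in kept]

    # With the tree of Figure 2: intersecting {2} with {4, 5} yields vertex 1.
    print(intersect_antichains([first[2]], [first[4], first[5]], D, st, V))   # [1]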

Antichain intersection by scaling. An equivalent approach for computing
the intersection of antichains is to scale the antichains to the corresponding filters.
A filter corresponding to an antichain in a poset is the set of all elements of
the poset that are larger than at least one element of the antichain. For
example, let us consider the tree-shaped poset in Figure 1. The filter corresponding
to the antichain {C1, C5, C8} is the set of all ancestors of all elements of the
antichain, i.e., it is equal to {C1, C10, C12, ⊤, C5, C11, C8, C13, C6}.
    The set-intersection of the filters corresponding to the given antichains is the
filter corresponding to the antichain resulting from the intersection of the antichains.
However, this approach has a higher complexity. Indeed, the size of a filter is O(|T|)
and, thus, the computational complexity of intersecting two antichains by means
of a scaling is O(|T|), which is worse than the O(|Leaves(T)|) for intersecting an-
tichains directly. Indeed, the number of leaves in a tree can be dramatically
smaller than the number of vertices in this tree. For example, the number of
vertices in Figure 1 is 13, while the number of leaves is only 7. Thus, the direct
intersection of antichains is more efficient than the intersection by means of a
scaling procedure.
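    For comparison, the scaling-based intersection takes only a few lines; `parents`
is a hypothetical map from each vertex to its parent (None for the root, written
'T' below for ⊤). On the running example it reproduces {C1, C13}, but it touches
O(|T|) vertices instead of O(|Leaves(T)|).

    # Sketch: antichain intersection by scaling antichains to filters.
    def filter_of(antichain, parents):
        F = set()
        for v in antichain:
            while v is not None and v not in F:
                F.add(v)
                v = parents[v]                     # walk up to the root
        return F

    def meet_by_scaling(A, B, parents):
        common = filter_of(A, parents) & filter_of(B, parents)
        # minimal elements of the intersected filter form the result antichain
        return {v for v in common
                if not any(parents[c] == v for c in common)}

    parents = {'C1': 'C10', 'C2': 'C10', 'C4': 'C11', 'C5': 'C11',
               'C10': 'C12', 'C11': 'C12', 'C7': 'C13', 'C8': 'C13',
               'C9': 'C13', 'C13': 'C6', 'C12': 'T', 'C6': 'T', 'T': None}
    print(meet_by_scaling({'C1', 'C5', 'C8'}, {'C1', 'C7', 'C9'}, parents))
    # {'C1', 'C13'}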

Relation to intersection of antichains in partially ordered sets of at-
tributes. As mentioned in the previous section, the intersection of an-
tichains in arbitrary posets can be reduced to the intersection of antichains in a
tree. However, the size of the antichain representing a description of an object
can increase. Indeed, since we have reduced a poset to a tree, some relations
have been lost, and thus the attributes that are subsumed in the poset for a
given antichain A are no longer subsumed in the tree for A, and hence should be
added to A. However, the reduction is still more computationally efficient than



Table 1: Results of the experiments with different kinds of data.
#objects is the number of objects in the corresponding dataset. #attributes is the number of nu-
merical attributes before scaling. |G| is the number of objects used for building the lattice. |T| is
the size of the attribute tree and the number of attributes in the scaled context |M|. Leaves(T) is
the number of leaves in the attribute tree. |L| is the size of the concept lattice for the corresponding
data. tT is the computational time for data represented as sets of antichains in the attribute tree.
tK is the computational time for data represented by a scaled context, i.e., by sets of filters in the
attribute tree; '*' shows that we are not able to build the whole lattice. tnum is the computational
time for numerical data represented by an interval pattern structure.


                                            (a) Real data experiments.

                 Dataset          |G|    |T|   Leaves(T)     |L|      tT      tK
                 DBLP            5293  33207     33198      10134   45 sec  21 sec
                 Biomedical Data   63   1490       933    1725582  145 sec 162 sec

                                    (b) Numerical data experiments.

    Dataset  #objects  #attributes  |G|   |T|  |Leaves(T)|     |L|        tT         tK       tnum
    BK           96         5        35   626      10        840897     37 sec    42 sec*    19 sec
    LO           16         7        16   224      26          1875  0.043 sec  0.088 sec 0.024 sec
    NT          131         3       131   140       6        128624    3.6 sec    6.8 sec   3.1 sec
    PO           60        16        22  1236      58        416837     49 sec    57 sec*  10.7 sec
    PT         5000        49        22  4084      60        452316     50 sec    38 sec*    15 sec
    PW          200        11        94   436      21       1148656     60 sec    49 sec*    48 sec
    PY           74        28        36   340      53        771569     46 sec    40 sec*    21 sec
    QU         2178         4        44  8212       8        783013     28 sec    30 sec*  15.4 sec
    TZ          186        61        31   626      88        650041     58 sec    43 sec*    22 sec
    VY           52         4        52   202      15        202666    5.9 sec   11.6 sec     3 sec




computing the intersection of antichains in a poset by means of a scaling, as
discussed in the previous paragraph. However, for the reduction it could be
interesting to find the spanning tree with the minimal number of leaves. Unfortu-
nately, this is an NP-complete task and it thus cannot be applied for increasing
the computational efficiency [10]. We should notice here that there is some work
that solves the LCA problem for more general cases, e.g., lattices [11] or partially
ordered sets [9]. However, it is an open question whether these works can help to
efficiently compute the intersection of antichains in the corresponding structures.


3     Experiments and Discussion

Several experiments are conducted using publicly available data on a MacBook
with a 1.3GHz Intel Core i5, 4GB of RAM running OS X Yosemite 10.3. We
have used FCAPS2 software developed in C++ for dealing with different kinds of
pattern structures. It can build a concept lattice starting from a standard formal
context or from object descriptions given as antichains of a given tree. The last
one is based on the similarity operation that is discussed above.
     We performed our experiments on two datasets from different domains, namely
DBLP and biomedical data. In these datasets, object descriptions are given as
subsets of attributes, and a taxonomy of the attributes is already known from
domain knowledge. We compute a concept lattice in two different ways. In the
first one, we directly compute the concept lattice from the antichains in the
taxonomy. In the second one, we scale every description to the corresponding
filter of the taxonomy; after this the taxonomy is no longer needed, and the
scaled context is processed with standard FCA, as sketched below.
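
For illustration, scaling a single description is just an upward closure in the
taxonomy. A minimal sketch in Python (hypothetical names, assuming a
parent-pointer representation of the taxonomy):

    # Minimal sketch (hypothetical names): scale an object description,
    # given as an antichain of taxonomy vertices, to the corresponding
    # filter, i.e., its upward closure containing all ancestors.

    def to_filter(antichain, parent, root):
        scaled = set()
        for v in antichain:
            while v != root:
                scaled.add(v)
                v = parent[v]
        scaled.add(root)
        return scaled

On the scaled context, the similarity of two descriptions is the plain set
intersection of their filters, which is exactly what standard FCA computes.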
     The first dataset is DBLP, from which we extracted a subset of papers,
together with their keywords, published in conferences in the machine learning
domain. The taxonomy used for classifying the keywords is the ACM Computing
Classification System (ACCS)3.
     The second dataset belongs to the domain of life sciences. It contains in-
formation about drugs, their side effects (SIDER4), and their categories (Drug-
Bank5). The taxonomies related to this dataset are MedDRA6 for side effects
and MeSH7 for drug categories.
     The parameters of the datasets and the computational results are shown in
Table 1a. For DBLP, the context consists of 5293 objects and 33207 attributes;
the attribute taxonomy has 33198 leaves, meaning that most attributes are
mutually incomparable. It took 45 seconds to produce a lattice of 10134 concepts
directly from the descriptions given by antichains of the taxonomy, while
producing the same lattice from the scaled context took only 21 seconds. For the
biomedical data, however, the approach based on antichains is better: it takes
145 seconds, while the computation starting from the scaled context takes 162
seconds. In this case, the taxonomy contains 1490 attributes, of which only 933
are leaves. Thus, the direct approach is faster when the number of leaves is
significantly smaller than the number of vertices, since the antichains are then
significantly smaller than the corresponding filters. When the number of leaves
is comparable to the number of vertices, our approach is slower.
2 https://github.com/AlekseyBuzmakov/FCAPS
3 https://www.acm.org/about/class/2012
4 http://sideeffects.embl.de/
5 http://www.drugbank.ca/
6 http://meddra.org/
7 http://www.ncbi.nlm.nih.gov/mesh/


Although in this case our approach has the same computational complexity as the
scaling approach, antichain intersection requires more effort than set intersection.
     Since the antichain approach is efficient for trees with a small number of
leaves, we can use it to increase the efficiency of standard FCA for special
kinds of contexts. In a context (G, M, I), an attribute m1 can be considered as
an ancestor of another attribute m2 if every object having m2 also has m1, i.e.,
(m2)′ ⊆ (m1)′. Accordingly, we can construct an attribute tree T and rely on it
for computing the intersection operation; a sketch is given below. In this case
the set of attributes M and the set of vertices of T coincide, so |M| = |T|.
The second part of the experiments is based on this observation.
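
As an illustration, the following sketch (hypothetical names) derives such a
parent relation from the attribute extents, assuming the containment order is
tree-shaped, as it is for the interordinally scaled contexts used below:

    # Minimal sketch (hypothetical names): build a parent relation on
    # attributes from extent containment, (m2)' being a subset of (m1)'
    # making m1 an ancestor of m2; assumes containments form a tree
    # (as with interordinal scaling), ties broken arbitrarily.

    def attribute_tree(extents):
        """extents: dict mapping attribute -> frozenset of objects."""
        parent = {}
        for m2, e2 in extents.items():
            ancestors = [m1 for m1, e1 in extents.items()
                         if m1 != m2 and e2 < e1]   # strict containment
            # the closest ancestor has the smallest extent; attributes
            # without ancestors hang under an artificial root
            parent[m2] = min(ancestors, key=lambda m: len(extents[m]),
                             default='ROOT')
        return parent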
     In the second part of the experiments we used numerical data from Bilkent
University8. It was converted to formal contexts by standard interordinal
scaling, sketched after this paragraph. The scaled attributes are closely
connected, i.e., there are many pairs of attributes (m1, m2) such that the set
of objects described by m1 is a subset of the set of objects described by m2,
i.e., (m1)′ ⊆ (m2)′; in this case we can say that m1 ≤ m2. Using this property
we built attribute trees from the scaled contexts. These trees have many more
vertices than leaves, so the approach introduced in this paper should be
efficient here. We compare our approach with the scaling approach. Moreover, it
was recently shown that interval pattern structures (IPS) can efficiently
process this kind of data [12], so we also compare our approach with IPS.
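
A minimal sketch of standard interordinal scaling (hypothetical names; assuming
every object has a value for every numerical attribute): each numerical
attribute with values w1 < ... < wn is replaced by the binary attributes
"x ≤ wi" and "x ≥ wi".

    # Minimal sketch (hypothetical names): interordinal scaling of a
    # numerical table {object: {attribute: value}} into a formal context
    # mapping each object to its set of binary scaled attributes.

    def interordinal_scale(table):
        attrs = {a for row in table.values() for a in row}
        context = {g: set() for g in table}
        for a in attrs:
            thresholds = sorted({row[a] for row in table.values()})
            for g, row in table.items():
                for w in thresholds:
                    if row[a] <= w:
                        context[g].add((a, '<=', w))
                    if row[a] >= w:
                        context[g].add((a, '>=', w))
        return context

In the resulting context the extent of (a, '<=', w) shrinks as w decreases, so
the scaled attributes of each numerical attribute form two chains, which is why
the derived attribute trees have many more vertices than leaves.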
     The results are shown in Table 1b. Compared to Table 1a, it has several
additional columns. First, since numerical data typically yield large lattices,
in most cases we considered only a part of the objects. The actual number of
objects used is given in the column |G|, while the total size of the dataset is
given in the column '#objects'; e.g., the BK dataset contains 96 objects, of
which we used only 35. In addition, for every dataset we provide the number of
numerical attributes; e.g., BK has 5 numerical attributes. For some datasets,
the lattice built by standard FCA was so large that memory started swapping and
we stopped the computation. This was not the case for our approach, since
antichains require less memory than the corresponding filters. Swapping is
indicated by '*' next to the computational time in column tK. We also report
the time needed by IPS to process the same dataset. For example, processing the
BK dataset took 37 seconds with our approach, more than 42 seconds with standard
FCA (at which point memory started swapping), and 19 seconds with IPS.
     This experiment shows that our approach not only takes less time to compute
the concept lattice, but also requires less memory, since no memory swapping
occurs. We can also see that the computation time of IPS is smaller than that of
our approach. However, IPS is only applicable to numerical data, while our
approach can be applied whenever the attributes of a context are structured. For
example, we can deal with graph data scaled to a set of frequent subgraphs,
where many attributes are subgraphs of other attributes.

8 http://funapp.cs.bilkent.edu.tr/DataSets/


4     Conclusion

In this paper we recalled two approaches for dealing with structured attributes
and explained how to compute the intersection of antichains in tree-shaped
posets of attributes, an essential operation for working with structured
attributes. Our experiments showed the computational efficiency of the proposed
approach. Accordingly, we are interested in applying it to other kinds of data,
such as graph data. Moreover, the generalization of our approach to other kinds
of posets is also of high interest.


References
 1. Ganter, B., Kuznetsov, S.O.: Pattern structures and their projections. In: ICCS.
    LNCS 2120, Springer (2001) 129–142
 2. Carpineto, C., Romano, G.: A lattice conceptual clustering system and its appli-
    cation to browsing retrieval. Machine Learning 24(2) (1996) 95–122
 3. Carpineto, C., Romano, G.: Concept Data Analysis: Theory and Applications.
    John Wiley & Sons, Chichester, UK (2004)
 4. Alam, M., Napoli, A.: Interactive exploration over RDF data using Formal Concept
    Analysis. In: International Conference on Data Science and Advanced Analytics,
    DSAA 2015, Paris, France, October 19–21, 2015, IEEE (2015)
 5. Caspard, N., Leclerc, B., Monjardet, B.: Finite Ordered Sets. Cambridge University
    Press, Cambridge, UK (2012) First published in French as “Ensembles ordonnés
    finis : concepts, résultats et usages”, Springer 2009.
 6. Pichon, E., Lenca, P., Guillet, F., Wang, J.W.: Un algorithme de partition d'un
    produit direct d'ordres totaux en un nombre fini de chaînes. Mathématiques,
    Informatique et Sciences Humaines 125 (1994) 5–15
 7. Ganter, B., Wille, R.: Formal Concept Analysis: Mathematical Foundations.
    Springer, Berlin/Heidelberg (1999)
 8. Gabow, H.N., Bentley, J.L., Tarjan, R.E.: Scaling and Related Techniques for
    Geometry Problems. In: Proc. Sixt. Annu. ACM Symp. Theory Comput. STOC
    ’84, New York, NY, USA, ACM (1984) 135–143
 9. Bender, M.A., Farach-Colton, M., Pemmasani, G., Skiena, S., Sumazin, P.: Lowest
    common ancestors in trees and DAGs. J. Algorithms 57(2) (2005) 75–94
10. Salamon, G., Wiener, G.: On finding spanning trees with few leaves. Inf. Process.
    Lett. 105(5) (2008) 164–169
11. Aït-Kaci, H., Boyer, R., Lincoln, P., Nasr, R.: Efficient Implementation of Lattice
    Operations. ACM Trans. Program. Lang. Syst. 11(1) (January 1989) 115–146
12. Kaytoue, M., Kuznetsov, S.O., Napoli, A., Duplessis, S.: Mining gene expression
    data with pattern structures in formal concept analysis. Inf. Sci. 181(10)
    (2011) 1989–2001
                               Author Index


Adaricheva, Kira, 217
Akhmatnurov, Marat, 99
Alam, Mehwish, 23, 241
Albano, Alexandre, 73
Antoni, Ľubomír, 147
Bartl, Eduard, 229
Borchmann, Daniel, 181
Buzmakov, Aleksey, 241
Cabrera, Inmaculada P., 147
Chornomaz, Bogdan, 73
Cleophas, Loek, 87
Cordero, Pablo, 217
Derras, Mustapha, 111
Deruelle, Laurent, 111
Dubois, Didier, 3
Enciso, Manuel, 217
Gnatyshak, Dmitry V., 47
Huchard, Marianne, 111
Ignatov, Dmitry I., 47, 99
Kauer, Martin, 11
Konecny, Jan, 205, 229
Kourie, Derrick G., 87
Krajči, Stanislav, 147
Krídlo, Ondrej, 147, 205
Kriegel, Francesco, 181, 193
Krupka, Michal, 11
Kuznetsov, Sergei O., 59
Lumpe, Lars, 171
Makhalova, Tatyana P., 59
Miclet, Laurent, 159
Miralles, André, 111
Molla, Guilhem, 111
Mora, Angel, 217
Napoli, Amedeo, 23, 241
Nebut, Clémentine, 111
Nicolas, Jacques, 159
Nourine, Lhouari, 9, 123
Nxumalo, Madoda, 87
Ojeda-Aciego, Manuel, 147
Osmuk, Matthieu, 23
Pasi, Gabriella, 1
Priss, Uta, 135
Quilliot, Alain, 123
Ramon, Jan, 7
Revenko, Artem, 35
Rodríguez-Lorenzo, Estrella, 217
Sailanbayev, Alibek, 241
Schmidt, Stefan E., 171
Toussaint, Hélène, 123
Uno, Takeaki, 5
Watson, Bruce W., 87
Zudin, Sergey, 47
Editors:                Sadok Ben Yahia, Jan Konecny


Title:                  CLA 2015, Proceedings of the Twelfth International
                        Conference on Concept Lattices and Their Applications


Publisher & Print:      UBP, LIMOS,
                        Campus Universitaire des Cézeaux,
                        63178 AUBIERE CEDEX – FRANCE


Place, year, edition:   Clermont-Ferrand, France, 2015, 1st


Page count:             xiii+254


Impression:             50


Archived at:            cla.inf.upol.cz




                                   Not for sale



                             ISBN 978–2–9544948–0–7