CLA 2011
Proceedings of the Eighth International Conference on
Concept Lattices and Their Applications



CLA Conference Series
http://cla.inf.upol.cz
 INRIA Nancy – Grand Est and LORIA, France




The Eighth International Conference on
Concept Lattices and Their Applications




                CLA 2011
             Nancy, France
         October 17–20, 2011




                 Edited by

              Amedeo Napoli
              Vilem Vychodil
CLA 2011, October 17–20, 2011, Nancy, France.
Copyright © 2011 by paper authors.
Copying permitted only for private and academic purposes.
This volume is published and copyrighted by its editors.




Technical Editors:
Jan Outrata, jan.outrata@upol.cz
Vilem Vychodil, vychodil@acm.org




Page count:      xii+419
Impression:      100
Edition:         1st
First published: 2011




Printed version published by INRIA Nancy – Grand Est and LORIA, France
ISBN 978–2–905267–78–8
                         Organization


CLA 2011 was organized by the INRIA Nancy – Grand Est and LORIA


Steering Committee
Radim Belohlavek         Palacky University, Olomouc, Czech Republic
Sadok Ben Yahia          Faculté des Sciences de Tunis, Tunisia
Jean Diatta              Université de la Réunion, France
Peter Eklund             University of Wollongong, Australia
Sergei O. Kuznetsov      State University HSE, Moscow, Russia
Michel Liquière         LIRMM, Montpellier, France
Engelbert Mephu Nguifo   LIMOS, Clermont-Ferrand, France


Program Chairs
Amedeo Napoli            INRIA NGE/LORIA, Nancy, France
Vilem Vychodil           Palacky University, Olomouc, Czech Republic


Program Committee
Jaume Baixeries          Polytechnical University of Catalonia
Jose Balcazar            University of Cantabria and UPC Barcelona, Spain
Radim Belohlavek         Palacky University, Olomouc, Czech Republic
Karell Bertet            University of La Rochelle, France
François Brucker        University of Marseille, France
Claudio Carpineto        Fondazione Ugo Bordoni, Roma, Italy
Jean Diatta              Université de la Réunion, France
Felix Distel             TU Dresden, Germany
Florent Domenach         University of Nicosia, Cyprus
Mireille Ducassé        IRISA Rennes, France
Alain Gély              University of Metz, France
Cynthia Vera Glodeanu    TU Dresden, Germany
Marianne Huchard         LIRMM, Montpellier, France
Vassilis G. Kaburlasos   TEI, Kavala, Greece
Stanislav Krajci         University of P.J. Safarik, Kosice, Slovakia
Sergei O. Kuznetsov      State University HSE, Moscow, Russia
Léonard Kwuida          Zurich University of Applied Sciences, Switzerland
Mondher Maddouri         URPAH, University of Gafsa, Tunisia
Rokia Missaoui           UQO, Gatineau, Canada
Lhouari Nourine          LIMOS, University of Clermont Ferrand, France
Sergei Obiedkov         State University HSE, Moscow, Russia
Manuel Ojeda-Aciego     University of Malaga, Spain
Jan Outrata             Palacky University, Olomouc, Czech Republic
Pascal Poncelet         LIRMM, Montpellier, France
Uta Priss               Napier University, Edinburgh, United Kingdom
Olivier Raynaud         LIMOS, University of Clermont Ferrand, France
Camille Roth            EHESS, Paris, France
Stefan Schmidt          TU Dresden, Germany
Baris Sertkaya          SAP Research Center, Dresden, Germany
Henry Soldano            Université Paris 13, France
Gerd Stumme             University of Kassel, Germany
Petko Valtchev          Université du Québec à Montréal, Canada


Additional Reviewers
Mikhail Babin           State University HSE, Moscow, Russia
Daniel Borchmann        TU Dresden, Germany
Peggy Cellier           IRISA Rennes, France
Sebastien Ferre         IRISA Rennes, France
Nathalie Girard         University of La Rochelle, France
Alice Hermann           IRISA Rennes, France
Mehdi Kaytoue           INRIA NGE/LORIA, Nancy, France
Petr Krajca             Palacky University, Olomouc, Czech Republic
Christian Meschke       TU Dresden, Germany
Petr Osicka             Palacky University, Olomouc, Czech Republic
Violaine Prince         LIRMM, Montpellier, France
Chedy Raissy            INRIA NGE/LORIA, Nancy, France
Yoan Renaud             LIRIS, Lyon, France
Heiko Reppe             TU Dresden, Germany
Lucie Urbanova          Palacky University, Olomouc, Czech Republic
Jean Villerd            ENSAIA, Nancy, France


Organization Committee
Mehdi Kaytoue (chair)   INRIA NGE/LORIA, Nancy, France
Elias Egho              INRIA NGE/LORIA, Nancy, France
Felipe Melo             INRIA NGE/LORIA, Nancy, France
Amedeo Napoli           INRIA NGE/LORIA, Nancy, France
Chedy Raïssi            INRIA NGE/LORIA, Nancy, France
Jean Villerd            ENSAIA, Nancy, France
                                        Table of Contents



Preface
Invited Contributions
Mathematical Morphology, Lattices, and Formal Concept Analysis . . . . . .                                                 1
  Isabelle Bloch

Random concept lattices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .              3
  Richard Emilion

Galois and his Connections—A retrospective on the 200th birthday of
Evariste Galois . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      5
   Marcel Erné

Canonical extensions, Duality theory, and Formal Concept Analysis . . . . .                                                7
   Mai Gehrke

Galois connections and residuation: origins and developments II . . . . . . . . .                                          9
   Bruno Leclerc

Galois connections and residuation: origins and developments I . . . . . . . . .                                          11
   Bernard Monjardet

Metrics, Betweenness Relations, and Entropies on Lattices and Applications                                                13
  Dan Simovici

Long Papers
Vertical decomposition of a lattice using clique separators . . . . . . . . . . . . . .                                   15
   Anne Berry, Romain Pogorelcnik and Alain Sigayret

Building up Shared Knowledge with Logical Information Systems . . . . . . .                                               31
   Mireille Ducasse, Sebastien Ferre and Peggy Cellier

Comparing performance of algorithms for generating the Duquenne-
Guigues basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   43
   Konstantin Bazhanov and Sergei Obiedkov

Filtering Machine Translation Results with Automatically Constructed
Concept Lattices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      59
    Yılmaz Kılıçaslan and Edip Serdar Güner

Concept lattices in fuzzy relation equations . . . . . . . . . . . . . . . . . . . . . . . . . . .                        75
   Juan Carlos Díaz and Jesús Medina-Moreno
Adaptation knowledge discovery for cooking using closed itemset
extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   87
   Emmanuelle Gaillard, Jean Lieber and Emmanuel Nauer

Fast Computation of Proper Premises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
   Uwe Ryssel, Felix Distel and Daniel Borchmann

Block relations in fuzzy setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
   Jan Konecny and Michal Krupka

A closure algorithm using a recursive decomposition of the set of Moore
co-families . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
    Pierre Colomb, Alexis Irlande, Olivier Raynaud and Yoan Renaud

Iterative Software Design of Computer Games through FCA . . . . . . . . . . . . 143
    David Llansó, Marco Antonio Gómez-Martín, Pedro Pablo
    Gomez-Martin and Pedro Antonio González-Calero

Fuzzy-valued Triadic Implications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
   Cynthia Vera Glodeanu

Mining bicluster of similar values with triadic concept analysis . . . . . . . . . . 175
   Mehdi Kaytoue, Sergei Kuznetsov, Juraj Macko, Wagner Meira and
   Amedeo Napoli

Fast Mining of Iceberg Lattices: A Modular Approach Using Generators . 191
   Laszlo Szathmary, Petko Valtchev, Amedeo Napoli, Robert Godin,
   Alix Boc and Vladimir Makarenkov

Boolean factors as a means of clustering of interestingness measures of
association rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
   Radim Belohlavek, Dhouha Grissa, Sylvie Guillaume, Engelbert
   Mephu Nguifo and Jan Outrata

Combining Formal Concept Analysis and Translation to Assign Frames
and Thematic Grids to French Verbs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
   Ingrid Falk and Claire Gardent

Generation algorithm of a concept lattice with limited access to objects . . 239
  Christophe Demko and Karell Bertet

Homogeneity and Stability in Conceptual Analysis . . . . . . . . . . . . . . . . . . . . 251
  Paula Brito and Géraldine Polaillon

A lattice-based query system for assessing the quality of hydro-ecosystems 265
   Agnés Braud, Cristina Nica, Corinne Grac and Florence Le Ber

The word problem in semiconcept algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
   Philippe Balbiani
Looking for analogical proportions in a formal concept analysis setting . . . 295
   Laurent Miclet, Henri Prade and David Guennec

Random extents and random closure systems . . . . . . . . . . . . . . . . . . . . . . . . . 309
  Bernhard Ganter
Extracting Decision Trees From Interval Pattern Concept Lattices . . . . . . 319
   Zainab Assaghir, Mehdi Kaytoue, Wagner Meira and Jean Villerd

A New Formal Context for Symmetric Dependencies . . . . . . . . . . . . . . . . . . 333
   Jaume Baixeries
Cheating to achieve Formal Concept Analysis over a large formal context                                    349
   Víctor Codocedo, Carla Taramasco and Hernán Astudillo

A FCA-based analysis of sequential care trajectories . . . . . . . . . . . . . . . . . . . 363
   Elias Egho, Nicolas Jay, Chedy Raissi and Amedeo Napoli
Querying Relational Concept Lattices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
  Zeina Azmeh, Mohamed Hacéne-Rouane, Marianne Huchard,
  Amedeo Napoli and Petko Valtchev

Links between modular decomposition of concept lattice and bimodular
decomposition of a context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393
   Alain Gély

Short Papers
Abduction in Description Logics using Formal Concept Analysis and
Mathematical Morphology: application to image interpretation . . . . . . . . . 405
  Jamal Atif, Céline Hudelot and Isabelle Bloch

A local discretization of continuous data for lattices: Technical aspects . . . 409
   Nathalie Girard, Karell Bertet and Muriel Visani
Formal Concept Analysis on Graphics Hardware . . . . . . . . . . . . . . . . . . . . . . 413
   W. B. Langdon, Shin Yoo, and Mark Harman

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
                                    Preface


The Eighth International Conference “Concept Lattices and Their Applications
(CLA 2011)” is held in Nancy, France, from October 17th until October 20th, 2011.
CLA 2011 aims at providing everyone interested in Formal Concept Analysis and,
more generally, in Concept Lattices or Galois Lattices (students, professors,
researchers and engineers) with a global and advanced view of some of the latest
research trends and applications in this field. As the diversity of the selected
papers shows, there is a wide range of theoretical and practical research
directions around data and knowledge processing, e.g. data mining, knowledge
discovery, knowledge representation, reasoning, pattern recognition, together
with logic, algebra and lattice theory.
This volume includes the selected papers and the abstracts of the 7 invited
talks. This year there were initially 47 submissions, from which 27 papers were
accepted as full papers and 3 papers as posters. We would like to thank the
authors for their work, often of very good quality, as well as the members of the
program committee and the external reviewers, who did a great job, as can be seen
in their reviews. This is one more witness of the growing quality and importance
of CLA, highlighting its leading position in the field.
Moreover, this year is a little special, as the bicentennial of the birth of
Evariste Galois (1811–1832) is celebrated, particularly in France. Evariste Galois
has something to do with Concept Lattices, as they are based on a so-called
“Galois connection”. Some of the invited speakers will discuss these fundamental
aspects of Concept Lattices. This is also the occasion to thank the seven invited
speakers, who, we hope, will meet the wishes of the attendees.
We would first like to thank our sponsors, namely the CNRS GDR I3 and the
Institut National Polytechnique de Lorraine (INPL). We would also like to thank
the steering committee of CLA for giving us the opportunity of leading this
edition of CLA, the conference participants for their participation and support,
and the people in charge of the organization, especially Anne-Lise Charbonnier,
Nicolas Alcaraz and Mehdi Kaytoue, whose help was very precious on many occasions.
Finally, we do not forget that the conference was managed (quite easily) with the
EasyChair system for paper submission, selection, and reviewing, and that Jan
Outrata offered his files for preparing the proceedings.




October 2011                                                      Amedeo Napoli
                                                                  Vilem Vychodil
                                                      Program Chairs of CLA 2011
           Mathematical Morphology, Lattices,
             and Formal Concept Analysis

                                     Isabelle Bloch

                    Telecom ParisTech, CNRS LTCI, Paris, France

Abstract. Lattice theory has become a popular mathematical framework in differ-
ent domains of information processing, and various communities employ its features
and properties, e.g. in knowledge representation, in logics, automated reasoning and
decision making, in image processing, in information retrieval, in soft computing, in
formal concept analysis. Mathematical morphology is based on adjunctions and on the
algebraic framework of posets, and more specifically of complete lattices, which endows
it with strong properties and allows for multiple applications and extensions. In this
talk we will summarize the main concepts of mathematical morphology and show their
instantiations in different settings, where a complete lattice can be built, such as sets,
functions, partitions, fuzzy sets, bipolar fuzzy sets, formal logics . . . We will detail in
particular the links between morphological operations and formal concept analysis,
thus initiating links between two domains that were quite disconnected until now,
which could therefore open new interesting perspectives.
                     Random concept lattices

                                  Richard Emilion

                       MAPMO, University of Orléans, France

Abstract. After presenting an algorithm providing concepts and frequent concepts,
we will study the random size of concept lattices in the case of a Bernoulli(p) context.
Next, for random lines which are independent and identically distributed or more
generally outcomes of a Markov chain, we will show the almost everywhere convergence
of the random closed intents towards deterministic intents. Finally we will consider the
problem of predicting the number of concepts before choosing any algorithm.
Galois and his Connections—A retrospective on
     the 200th birthday of Evariste Galois

                                     Marcel Erné

                           University of Hannover, Germany

Abstract. A frequently used tool in mathematics is what Oystein Ore called “Galois
connections” (also “Galois connexions”, “Galois correspondences” or “dual adjunc-
tions”). These are pairs (ϕ, ψ) of maps between ordered sets in opposite direction so
that x ≤ ψ(y) is equivalent to y ≤ ϕ(x). This concept looks rather simple but proves
very effective. The primary gain of such “dually adjoint situations” is that the ranges
of the involved maps are dually isomorphic: thus, Galois connections present two faces
of the same medal.
Many concrete instances are given by what Garrett Birkhoff termed “polarities”: these
are nothing but Galois connections between power sets. In slightly different terminology,
the fundamental observation of modern Formal Concept Analysis is that every “formal
context”, that is, any triple (J, M, I) where I is a relation between (the elements of)
J and M , gives rise to a Galois connection (assigning to each subset of one side its
“polar”, “extent” or “intent” on the other side), such that the resulting two closure
systems of polars are dually isomorphic; more surprising is the fact that, conversely,
every dual isomorphism between two closure systems arises in a unique fashion from a
relation between the underlying sets. In other words: the complete Boolean algebra of
all relations between J and M is isomorphic to that of all Galois connections between
the power sets of J and M , and also to that of all dual isomorphisms between closure systems on J
and M , respectively.
The classical example is the Fundamental Theorem of Galois Theory, establishing a
dual isomorphism between the complete lattice of all intermediate fields of a Galois ex-
tension and that of the corresponding automorphism groups, due to Richard Dedekind
and Emil Artin. In contrast to that correspondence, which does not occur explicitly
in Galois’ succinct original articles, a few other closely related Galois connections may
be discovered in his work (of course not under that name). Besides these historical
forerunners, we discuss a few other highlights of mathematical theories where Galois
connections enter in a convincing way through certain “orthogonality” relations, and
show how the Galois approach considerably facilitates the proofs. For example, each of
the following important structural isomorphisms arises from a rather simple relation
on the respective ground sets:

 – the dual isomorphism between the subspace lattice of a finite-dimensional linear
   space and the left ideal lattice of its endomorphism ring
 – the duality between algebraic varieties and radical ideals
 – the categorical equivalence between ordered sets and Alexandroff spaces
 – the representation of complete Boolean algebras as systems of polars.
      Canonical extensions, Duality theory, and
             Formal Concept Analysis

                                      Mai Gehrke

                     LIAFA CNRS – University of Paris 7, France

Abstract. The theory of canonical extensions, developed by Jonsson and Tarski in the
setting of Boolean algebras with operators, provides an algebraic approach to duality
theory. Recent developments in this theory have revealed that in this algebraic guise
duality theory is no more complicated outside than within the setting of Boolean
algebras or distributive lattices. This has opened the possibility of exporting the highly
developed machinery and knowledge available in the classical setting (e.g. in modal
logic) to the completely general setting of partially ordered and non-distributive lattice
ordered algebras. Duality theory in this setting is a special instance of the connection
between formal contexts and concept lattices and thus allows methods of classical
algebraic logic to be imported into FCA.

This will be an introductory talk on the subject of canonical extensions with the
purpose of outlining the relationship between the three topics of the title.
 Galois connections and residuation: origins and
                developments II

                                    Bruno Leclerc

        CAMS – École des Hautes Études en Sciences Sociales, Paris, France

Abstract. From the seventies, the uses of Galois connections (and residuated/residual
maps) multiplied in applied fields. Indeed Galois connections have been several times
rediscovered for one or another purpose, for instance in fuzzy set theory or aggregation
problems. In this talk, we illustrate the diversity of such applications. Of course, the
many developments in Galois lattices and Formal Concept Analysis, with their rela-
tion with Data Mining, will be only briefly evoked. Besides these developments, one
finds, among other uses, alternative methods to study a correspondence (binary rela-
tion) between two sets, models of classification and preferences, fitting and aggregation
problems.
 Galois connections and residuation: origins and
                 developments I

                                  Bernard Monjardet

    Centre d’Economie de la Sorbonne (University of Paris 1) and CAMS (Centre
                    Analyse et Mathématique Sociale), France

Abstract. The equivalent notions of Galois connexions, and of residual and residuated
maps occur in a great variety of “pure” as well as “applied” mathematical theories.
They explicitly appeared in the framework of lattice theory and the first of these talks
is devoted to the history of their appearance and of the revealing of their links in
this framework. So this talk covers more or less the period between 1940 (with the
notion of polarity defined in the first edition of Birkhoff’s book Lattice Theory) and 1972
(with Blyth and Janowitz’s book Residuation theory), a period containing fundamental
works like Ore’s 1944 paper Galois connexions or Croisot’s 1956 paper Applications
résiduées.
Metrics, Betweenness Relations, and Entropies on
           Lattices and Applications

                                     Dan Simovici

   Department of Computer Science, University of Massachusetts at Boston, USA

Abstract. We discuss an algebraic axiomatization of the notion of entropy in the
framework of lattices as well as characterizations of metric structures induced by such
entropies. The proposed new framework takes advantage of the partial orders defined
on lattices, in particular the semimodular lattice of partitions of a finite set to allow
multiple applications in data mining: data discretization, recommendation systems,
classification, and feature selection.
      Vertical decomposition of a lattice using clique
                       separators

                    Anne Berry, Romain Pogorelcnik, Alain Sigayret

                               LIMOS UMR CNRS 6158
      Ensemble Scientifique des Cézeaux Université Blaise Pascal, F-63 173 Aubière,
                                         France.
            berry@isima.fr, romain.pogorelcnik@isima.fr, sigayret@isima.fr




          Abstract. A concept (or Galois) lattice is built on a binary relation; this
          relation can be represented by a bipartite graph. We explain how we can
          use the graph tool of clique minimal separator decomposition to decom-
          pose some bipartite graphs into subgraphs in linear time; each subgraph
          corresponds to a subrelation. We show that the lattices of these subrela-
          tions easily yield the lattice of the global relation. We also illustrate how
          this decomposition is a tool to help displaying the lattice.


          Keywords: lattice decomposition, clique separator decomposition, lat-
          tice drawing



  1     Introduction

  In many algorithms dealing with hard problems, a divide-and-conquer approach
  is helpful in practical applications. Computing the set of concepts associated
  with a given context (or the set of maximal rectangles associated with a binary
  relation) is time-consuming, as there may be an exponential number of concepts.
  It would be interesting to decompose the lattice into smaller sublattices. What
  we propose here is to decompose the relation into smaller subrelations, compute
  the lattice of each subrelation, and then use these lattices to reconstruct the
  lattice of the global relation.
      For this, we use a graph decomposition, called ”clique separator decomposi-
  tion”, introduced by Tarjan [9], and refined afterwards (see [3] for an extensive
  introduction to this decomposition). The general principal is roughly the follow-
  ing: repeatedly find a set of vertices which are pairwise adjacent (called a clique)
  and whose removal disconnects the graph (called a separator), then copy this
  clique separator into the different connected components obtained. When the de-
  composition is completed, a set of subgraphs is obtained, inconveniently called
  ’atoms’: each subgraph is a maximal subgraph containing no clique separator.
      Research partially supported by the French Agency for Research under the DEFIS
      program TODO, ANR-09-EMER-010.


© 2011 by the paper authors. CLA 2011, pp. 15–29. Copying permitted only for private
  and academic purposes. Volume published and copyrighted by its editors. Local
  Proceedings in ISBN 978–2–905267–78–8, Inria NGE/LORIA, Nancy, France.


    In a previous work [2], we used graph algorithms on the complement of the
bipartite graph associated with the relation. In this paper, we will apply this
decomposition directly to the bipartite graph itself.
    It turns out upon investigation that the subgraphs obtained divide not only
the graph, but in a very similar fashion divide the matrix of the relation, the
set of concepts and the lattice. When the relation has a clique separator of size
two, the lattice, as we will explain further on, is divided along a vertical axis by
an atom and a co-atom which correspond to the two vertices of the separator.
Thus not only can the concepts be computed on the subrelations, but the Hasse
diagram of the lattice can be drawn better, as no edge need cross this vertical
line.
    Moreover, in a bipartite graph, this decomposition can be implemented with
a better worst-case time complexity than in the general case, as the clique sep-
arators can be of size one (in this case they are called articulation points) or of
size two. In both cases, the entire decomposition can be computed in linear time,
i.e. in the size of the relation, thanks to the works of [6] and [3]. Although some
graphs do not have a clique separator, when there is one, the decomposition is
thus a useful and non-expensive pre-processing step.
    The paper is organized as follows: we will first give some more preliminaries
in Section 2. In Section 3, we explain how a bipartite graph is decomposed. In
Section 4, we show how to reconstruct the global lattice from the concepts ob-
tained on the subrelations. In Section 5, we discuss using vertical decomposition
as a tool for layout help. Finally, in Section 6, we conclude with some general
remarks.


2    Preliminaries

We will first recall essential definitions and properties.
    All the graphs will be undirected and finite. For a graph G = (V, E), V is
the vertex set and E is the edge set. For xy ∈ E, x ≠ y, x and y are said to be
adjacent; we say that x sees y. A graph is connected if, for every pair {x, y} of
vertices, there is a path from x to y. When a graph is not connected, the maximal
connected subgraphs are called the connected components. For C ⊂ V , G(C)
denotes the subgraph induced by C. In a graph G = (V, E), the neighborhood of
a vertex x ∈ V is the set NG (x) = {y ∈ V | x ≠ y, xy ∈ E}. NG (x) is denoted
N (x) when there is no ambiguity.
    A clique is a set of vertices which induces a complete graph, i.e. with all possible
edges.
    A bipartite graph G = (X + Y, E), where + stands for disjoint union, is built
on two vertex sets, X and Y , with no edge between vertices of X and no edge
between vertices of Y . A maximal biclique of a bipartite graph G = (X + Y, E)
is a subgraph G(X′ + Y′) with all possible edges between the vertices of X′ and
the vertices of Y′.
    A relation R ⊆ O × A on a set O of objects and a set A of attributes is
associated with a bipartite graph G=(O + A, E), which we will denote Bip(R);


thus, for x ∈ O and y ∈ A, (x, y) is in R iff xy is an edge of G. The maximal
rectangles of the relation correspond exactly to the maximal bicliques (maximal
complete bipartite subgraphs) of Bip(R) and to the elements (the concepts) of
the concept lattice L(R) associated with the context (O, A, R). If O1 × A1 and O2 × A2
are concepts of L(R), then O1 × A1 ≤ O2 × A2 iff O1 ⊆ O2 iff A1 ⊇ A2 ; the
corresponding bicliques on vertex sets O1 + A1 and O2 + A2 of Bip(R) are
comparable the same way.
    An atom (resp. co-atom) of L(R) is a concept covering the minimum element
(resp. covered by the maximum element). In the bipartite graph Bip(R), the
neighborhoods are defined as follows: for x ∈ O, N (x) = R(x) = {y ∈ A|(x, y) ∈
R}, and for x ∈ A, N (x) = R⁻¹(x) = {y ∈ O|(y, x) ∈ R}.
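
To make these notions concrete, here is a minimal sketch (in Python, using the
networkx library; the toy relation below is made up and is not the running example
of Figure 1) of how a context can be encoded as the bipartite graph Bip(R), with
the neighborhoods N(x) = R(x) for objects and N(y) = R⁻¹(y) for attributes:

    import networkx as nx

    # Toy relation R ⊆ O x A: R_dict[x] is the set of attributes of object x.
    R_dict = {1: {"a", "c"}, 2: {"b", "c"}, 3: {"a", "b"}, 4: {"d", "e"}}

    # Bip(R): one vertex per object and per attribute, an edge xy iff (x, y) is in R.
    G = nx.Graph()
    G.add_nodes_from(R_dict, side="O")
    G.add_nodes_from({a for attrs in R_dict.values() for a in attrs}, side="A")
    G.add_edges_from((x, a) for x, attrs in R_dict.items() for a in attrs)

    # Neighborhoods in Bip(R): N(x) = R(x) for an object, N(y) = R^-1(y) for an attribute.
    assert set(G[2]) == R_dict[2]
    assert set(G["b"]) == {x for x, attrs in R_dict.items() if "b" in attrs}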
    A separator in a connected graph is a set of vertices, the removal of which
disconnects the graph. A clique separator is a separator which is a clique. Clique
separator decomposition [9] of a graph G = (V, E) is a process which repeat-
edly finds a clique separator S and copies it into the connected components of
G(V − S). When only minimal separators are used (see [3] for extensive general
definitions), the decomposition is unique and the subgraphs obtained in the end
are exactly the maximal subgraphs containing no clique separator [8], [3]. In a
bipartite graph, the clique minimal separators are of size one or two. A separator
of size one is called an articulation point. A clique separator S = {x, y} of size
two is minimal if there are two components C1 and C2 of G(V − S) such that x
and y both have at least one neighbor in C1 as well as in C2 .


3   Decomposing the bipartite graph and the relation

In the rest of the paper, we will use the bipartite graph Bip(R) defined by a rela-
tion R ⊆ O × A. Figure 1 shows an example of a relation with the corresponding
bipartite graph.




              Fig. 1. A relation and the corresponding bipartite graph




In this section, we will first discuss connectivity issues, then illustrate and give
our process to decompose the bipartite graph.


3.1   Decomposing the bipartite graph into connected components


When the bipartite graph Bip(R) is not connected, our process can be applied
separately (or in parallel) to each connected component. The lattice obtained is
characteristic: when the top and bottom elements are removed from the Hasse
diagram, the resulting diagram is a set of disjoint lattices, with a one-to-one
correspondence between the connected components of Bip(R) and the lattices
obtained.
    Figure 2 shows such a disconnected bipartite graph, its relation, and the
corresponding lattice.
    Note that trivially, if a connected component has only one vertex, this means
that the corresponding row or column of the relation is empty: such a component
corresponds to a lattice with only one element.
    In the rest of the paper, we will consider only relations whose bipartite graph
is connected.




Fig. 2. A disconnected bipartite graph, its relation, and the corresponding character-
istic lattice.


3.2   Illustrating the two decomposition steps

In order to make the process we use as clear as possible, we will first describe
what happens when one decomposition step is applied for each of the two de-
compositions involved (using clique separators of size one or of size two).
    It is important to understand, however, that to ensure a good (linear) time
complexity, each of the two decompositions is computed globally in a single
linear-time pass.

Step with an articulation point

     The removal of an articulation point {x} in a connected bipartite graph G
results in components C1 , ..., Ck , which correspond to a partition V = C1 + ... +
Ck + {x} of Bip(R). After a decomposition step using {x}, x is preserved, with
its local neighborhood, in each component, so that G is replaced by k subgraphs
G(C1 ∪ {x}), ..., G(Ck ∪ {x}).
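
As an illustration, one decomposition step can be sketched as follows (Python with
networkx, in the same hedged style as the earlier sketch; the same function also
covers the size-two clique separators described further below, with S = {x, y}):

    import networkx as nx

    def split_on_clique_separator(G, S):
        """One decomposition step: remove the clique separator S, then return
        the induced subgraphs G(C_i + S), S being copied into each component."""
        H = G.copy()
        H.remove_nodes_from(S)
        return [G.subgraph(set(C) | set(S)).copy()
                for C in nx.connected_components(H)]

    # Example 1 revisited (the graph of Figure 3 is not built in this sketch):
    # split_on_clique_separator(G, {1})  ->  G({1,2,3,a,b,c}) and G({1,4,d,e})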

Example 1. In the input graph of Figure 3, vertex 1 defines an articulation point
that induces two connected components {2, 3, a, b, c} and {4, d, e}. The decom-
position step results in subgraphs G({1, 2, 3, a, b, c}) and G({1, 4, d, e}).




                 Fig. 3. Decomposition by articulation point {1}.




Step with a separator of size two

     The removal of a clique minimal separator {x, y} in a connected bipartite
graph G results in components C1 , ..., Ck , corresponding to a partition
V = C1 + ... + Ck + {x, y}. The decomposition step replaces G with G(C1 ∪ {x, y}),
..., G(Ck ∪ {x, y}).

Example 2. In the input graph of Figure 4, {2, b} is a clique minimal separa-
tor of size two that induces two connected components {1, 2, 3, a, b, c, f } and
{2, 5, 6, b, g, h}.




                  Fig. 4. Decomposition by clique separator {2,b}



3.3   Ordering the steps
A clique minimal separator of size two may include an articulation point. Thus it
is important to complete the decomposition by the articulation points first, and
then go on to decompose the obtained subgraphs using their clique separators
of size two.
Example 3. In the input graph of Figure 5, {2} is an articulation point included
in clique minimal separator {2, b}. The decomposition will begin with {2}, in-
ducing components {2, i} and {1, 2, 3, 5, 6, a, b, c, f, g, h}. As there remains no
articulation point in these resulting components, the second component will then
be decomposed by {2, b} into {1, 2, 3, a, b, c, f } and {2, 5, 6, b, g, h}.




      Fig. 5. Articulation point {2} is processed before clique separator {2,b}



    After the bipartite graph decomposition is completed, we will obtain sub-
graphs with no remaining clique minimal separator, and the corresponding sub-
relations with their associated lattices.
Example 4. Figure 6 shows that the input graph of Figure 1 is decomposable
into four bipartite subgraphs: G1 = G({1, 2, i}), G2 = G({2, 5, 6, b, g, h}), G3 =
G({1, 2, 3, a, b, c, f }) and G4 = G({1, 4, d, e}).
    Note that in the general case all subgraphs obtained have at least two vertices,
since at least one vertex of a separator is copied into a component which has at
least one vertex.




               Fig. 6. Complete decomposition of a bipartite graph




3.4   The global decomposition process

To obtain the entire decomposition of a connected bipartite graph, we will thus
first decompose the graph using all articulation points, and then decompose each
of the subgraphs obtained using all its clique separators of size 2.
    The articulation points (clique minimal separators of size one) can be found
by a simple depth-first search [6], as well as the corresponding decomposition of
the graph (called decomposition into biconnected components).
    The search for clique separators of size two corresponds to a more complicated
algorithm, described in [5]: all separators of size 2 are obtained, whether they
are cliques or not.
    Once this list of separators is obtained, it is easy to check which are joined
by an edge. The desired decomposition can then be obtained easily.
    In both cases, the set of clique separators is output.
    Both algorithms run in linear time, so the global complexity is in O(|R|) to
obtain both the set of subgraphs and the set of clique separators of the original
graph.
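
A naive, non linear-time sketch of this global process (Python with networkx,
reusing the split_on_clique_separator helper sketched earlier; it tests every edge
instead of running the triconnected-component algorithm of [5], and it does not
check separator minimality):

    import networkx as nx

    def clique_separator_decomposition(G):
        """Decompose a connected bipartite graph: articulation points first
        (biconnected components already copy them), then separators of size two."""
        pieces = [G.subgraph(c).copy() for c in nx.biconnected_components(G)]
        separators = [{x} for x in set(nx.articulation_points(G))]
        result = []
        while pieces:
            H = pieces.pop()
            for x, y in H.edges():          # a clique of size two is an edge
                parts = split_on_clique_separator(H, {x, y})
                if len(parts) > 1:
                    separators.append({x, y})
                    pieces.extend(parts)
                    break
            else:                           # no size-two clique separator left
                result.append(H)
        return result, separators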


3.5   Sub-patterns defined in the matrix

When the clique separators involved do not overlap and each defines exactly
two connected components, this decomposition of the bipartite graph partitions
the graph and the underlying relation. This results in a significant pattern of
their binary matrix. As the components obtained are pairwise disconnected, the
matrix can be reorganized in such a way that zeroes are gathered into blocks. Two
components may appear as consecutive blocks, linked by a row corresponding
to the articulation point that has been used to split them, or linked by one cell
giving the edge between the two vertices of a size two clique minimal separator.
    In the general case, this pattern can occur in only some parts of the matrix,
and different patterns can be defined according to the different separators which
the user chooses to represent.

Example 5. The first of the two matrices below corresponds to our running ex-
ample from Figure 1 and has been reorganized, following the decomposition,
which results in the second matrix. Notice how {1} is an articulation point, so


row 1 is shared by blocks 231×bacf and 14×de; and how {2, b} is a clique separa-
tor of size two, so cell [2, b] is the intersection of blocks 562 × ghb and 231 × bacf .
[2, i] is not integrated into the pattern, because separator {2, b} of Bip(R) defines
3 connected components: {2, 5, 6, b, g, h}, {i} and {1, 3, 4, a, c, d, e, f }.




We will now describe a process to organize the lines and columns of the ma-
trix with such patterns. We will construct a meta-graph (introduced in [7] as the
’atom graph’), whose vertices represent the subgraphs obtained by our decompo-
sition, and where there is an edge between two such vertices if the two subgraphs
which are the endpoints have a non-empty intersection which is a clique minimal
separator separating the corresponding two subgraphs in the original bipartite
graph. In this meta-graph, choose a chordless path; the succession of subgraphs
along this path will yield a succession of rectangles in the matrix which corre-
spond to a pattern.
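
A simplified sketch of this meta-graph construction (Python; two subgraphs are
linked here as soon as their common vertices form one of the clique separators
returned by the decomposition, without re-checking the separation condition, and
the search for a chordless path is left out):

    import networkx as nx

    def meta_graph(subgraphs, separators):
        """Vertices: indices of the decomposition subgraphs; edge between two
        subgraphs whose intersection is one of the clique minimal separators."""
        M = nx.Graph()
        M.add_nodes_from(range(len(subgraphs)))
        for i in range(len(subgraphs)):
            for j in range(i + 1, len(subgraphs)):
                common = set(subgraphs[i]) & set(subgraphs[j])
                if common and common in separators:
                    M.add_edge(i, j)
        return M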

Example 6. Figure 7 gives the meta-graph for our running example from Figure
1. Chordless path ({2, 5, 6, b, g, h}, {1, 2, 3, a, b, c, f }, {1, 4, d, e}) was chosen for the
patterning. Another possible chordless path would be ({2, i}, {1, 2, 3, a, b, c, f },
{1, 4, d, e}). Finding a chordless path in a graph can be done in linear time; the
meta-graph has less than min(|A|, |O|) elements, so finding such a path costs
less than (min(|A|, |O|))².




                      Fig. 7. Meta-graph for graph from Figure 1


3.6   Decomposing the lattice

We will now examine how the set of concepts is modified and partitioned into
the subgraphs obtained. As clique minimal separators are copied in all the com-
ponents induced, most of the concepts will be preserved by the decomposition.
Furthermore, only initial concepts including a vertex of a clique minimal sepa-
rator may be affected by the decomposition.

Definition 1. We will say that a maximal biclique is a star maximal biclique if
it contains either exactly one object or exactly one attribute. This single object
or attribute will be called the center of the star.

Lemma 1. A star maximal biclique {x} ∪ N (x) of Bip(R) is an atomic concept
of L(R) (atom or co-atom), unless x is universal in Bip(R). More precisely,
{x} × N (x) is an atom if x ∈ O and N (x) ≠ A, or N (x) × {x} is a co-atom if
x ∈ A and N (x) ≠ O.

Proof. Let {x} ∪ N (x) be a star maximal biclique of Bip(R). As a maximal
biclique, it corresponds to a concept of L(R). Suppose the star has x ∈ O as
center. As a star, it contains no other element of O; as a biclique, it includes
all N (x) ⊆ A, and no other element of A by maximality. The corresponding
concept is {x} × N (x) which is obviously the first concept from bottom to top
including x. As the biclique is maximal, and as x is not universal, this concept
cannot be the bottom of L(R) but only an atom. A similar proof holds for x ∈ A
and co-atomicity.
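
In code, the classification of Lemma 1 could be sketched as follows (a hedged
Python sketch; O and A are the two vertex sets of Bip(R), and x is the center of a
star maximal biclique):

    def classify_star(G, x, O, A):
        """Classify the star maximal biclique {x} + N(x), following Lemma 1:
        an atom if x is a non-universal object, a co-atom if x is a
        non-universal attribute, None when x is universal."""
        N = set(G[x])
        if x in O:
            return "atom" if N != set(A) else None
        return "co-atom" if N != set(O) else None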

   We will now give the property which describes how the maximal bicliques are
dispatched or modified by the decomposition. In the next Section, we will give
a general theorem and its proof, from which these properties can be deduced.

Property 1. Let G = (X + Y, E) be a bipartite graph, let S be a clique minimal
separator of G which decomposes G into subgraphs G1 , ..., Gk . Then:

1. ∀x ∈ S, {x} ∪ NG (x) is a star maximal biclique of G.
2. ∀x ∈ S, {x} ∪ NG (x) is not a maximal biclique of any Gi .
3. ∀x ∈ S, {x} ∪ NGi (x) is a biclique of Gi , but it is maximal in Gi iff it is not
   strictly contained in any other biclique of Gi .
4. All the maximal bicliques of G which are not star bicliques with any x ∈ S
   as a center are partitioned into the corresponding subgraphs.

With the help of Lemma 1, this property may be translated in terms of lattices.
Given a relation R, its associated graph G, its lattice L(R), and a decomposition
step of G into some Gi s by articulation point {x}:
    If x ∈ O (resp. ∈ A) is an articulation point of G, {x}×NG (x) (resp. NG (x)×
{x}) is a concept of L(R). After the decomposition step, in each subgraph Gi of
G, either this concept becomes {x}×NGi (x), or this concept disappears from Gi ;
this latter case occurs when there is in Gi some x′ ∈ O, the introducer of which
appears after the introducer of x in L(R), from bottom to top (resp. from top


to bottom if x, x′ ∈ A). Every other concept will appear unchanged in exactly
one lattice associated with a subgraph Gi .
   The same holds for each vertex of a size two clique minimal separator.
Example 7. Figure 8 illustrates a decomposition step with articulation point {1}
using the graph from Figure 3. Concept {1, 4} × {d, e} disappears from the first
component {1, 2, 3, a, b, c}, but remains in the second component {1, 4, d, e}.




        Fig. 8. Example of lattice decomposition using articulation point {1}.




Example 8. Figure 9 illustrates a decomposition step with clique separator {2, b}
using the graph from Figure 4. Concept {2}×N (2) is duplicated into components
{2, 5, 6, b, g, h} and {1, 2, 3, a, b, c, f }; concept N (b) × {b} will appear as {2, 6} ×
{b} in the first component, but not in the second one, as {2, 3, b} is a biclique
included in maximal biclique {2, 3, b, f } of G.
Remark 1. The smaller lattices obtained cannot be called sublattices of the
initial lattice as some of their elements may not be the same: for example, in
Figure 9, {2} × {b, c, f } is an element of the third smaller lattice L(G3 ) but is
not an element of the initial lattice L(G).


4    Reconstructing the lattice
We will now explain how, given the subgraphs obtained by clique decomposi-
tion, as well as the corresponding subrelations and subsets of concepts, we can
reconstruct the set of concepts of the global input bipartite graph. We will then
go on to explain how to obtain the Hasse diagram of the reconstructed lattice.




         Fig. 9. Example of lattice decomposition using clique separator {2, b}.




4.1   Reconstructing the set of concepts

We will use the following Theorem, which describes the concepts of the global
lattice.
Theorem 1. Let G = (X + Y, E) be a bipartite graph, let Σ = {s1 , ...sh } be
the set of all the vertices which belong to a clique separator of G, let G1 , ...Gk
be the set of subgraphs obtained by the complete corresponding clique separator
decomposition. Then:

1. For every s ∈ Σ, {s} ∪ NG (s) is a star maximal biclique of G.
2. Any maximal biclique of a subgraph Gi which is not a star with a vertex of
   Σ as center is also a maximal biclique of G.
3. There are no other maximal bicliques in G: ∀s ∈ Σ, no other star maximal
   biclique of Gi with center s is a star maximal biclique of G, and these are
   the only maximal bicliques of some graph Gi which are not also maximal
   bicliques in G.

Proof.
 1. For every s ∈ Σ, {s} ∪ NG (s) is a star maximal biclique of G:
    Case 1: s is an articulation point, let Gi , Gj be two graphs which s belongs
    to; s must be adjacent to some vertex yi in Gi and to some vertex yj in
    Gj . Suppose {s} ∪ NG (s) is not a maximal biclique: there must be a vertex
    z in G which sees yi and yj , but then {s} cannot separate yi from yj , a
    contradiction.
    Case 2: s is not an articulation point; let s′ be a vertex of S such that


    {s, s′} is a clique separator of G, separating Gi from Gj . s must as above see
    some vertex yi in Gi and some vertex yj in Gj . Suppose {s} ∪ NG (s) is not
    maximal: there must be some vertex w in G which sees all of NG (s), but w
    must see yi and yj , so {s, s′} cannot separate Gi from Gj .
 2. Let B be a non-star maximal biclique of Gi , containing o1 , o2 ∈ O and
    a1 , a2 ∈ A. Suppose B is not maximal in G: there must be a vertex y in
    G − B which augments B. Let y be in Gj , wlog y ∈ A: y must see o1 and
    o2 . Since Gi is a maximal subgraph with no clique separator, Gi + {y} must
    have a clique separator. Therefore N (y) must be a clique separator of this
    subgraph, but this is impossible, since y sees two non-adjacent vertices of
    Gi .
 3. Any star maximal biclique B of Gi whose center v is not in Σ is also a star
    maximal biclique of G: suppose we can augment B in G.
    Case 1: v sees an extra vertex w; Gi + {w} contains as above a clique sepa-
    rator, which is impossible since N (w) = v and v ∉ S.
    Case 2: A vertex z of Gj is adjacent to all of N (v): again, G + {z} contains
    a clique separator, so N (z) is a clique separator, but that is impossible since
    N (z) contains at least two non-adjacent vertices.
 4. For s ∈ Σ, no star maximal biclique of Gi is a star maximal biclique of G:
    let B be a star maximal biclique of Gi , with s ∈ Σ as center. s ∈ Σ, so s
    belongs to some clique separator which separates Gi from some graph Gj .
    s must see a vertex yj in Gj , so B + {yj } is a larger star including B: B
    cannot be maximal in G.

Example 9. We illustrate Theorem 1 using graph G from Figure 6, whose decom-
position yields subgraphs G1 , ..., G4 , with G1                  =   G({1, 2, i}),
G2 = G({2, 5, 6, b, g, h}), G3 = G({1, 2, 3, a, b, c, f }) and G4 = G({1, 4, d, e}).
Finally, Σ = {1, 2, b}. The corresponding lattices are shown in Figure 10, and
their concepts are presented in the table below. In this table, braces have been
omitted; symbol ⇒ represents a concept of the considered subgraph Gi which is
identical to a concept of G (there can be only one ⇒ per row); the other concepts
of the subgraphs will not be preserved in G while recomposing.

         L(G)    L(G1 ) L(G2 ) L(G3 ) L(G4 ) star max. biclique of G ?
      1 × acde                 1 × ac                  yes
      2 × bcf hi 2 × i 2 × bh 2 × bcf                  yes
       3 × abf                   ⇒
       14 × de                         ⇒
        5 × gh            ⇒
        6 × bg            ⇒
        13 × a                   ⇒
       236 × b          26 × b                         yes
        12 × c                   ⇒
       23 × bf                   ⇒
        56 × g            ⇒
        25 × h            ⇒




                        Fig. 10. Reconstruction of a lattice



According to Theorem 1, the steps to reconstruct the maximal concepts of the
global lattice from the concepts of the smaller lattices are:

1. Compute Σ, the set of attributes and objects involved in a clique minimal
   separator. (In our example, Σ = {1, 2, b}.)
2. Compute the maximal star bicliques for all the elements of Σ. (In our exam-
   ple, we will compute star maximal bicliques 1 × acde, 2 × bcf hi and 236 × b.)
3. For each smaller lattice, remove from the set of concepts the atoms or co-
   atoms corresponding to elements of Σ; maintain all the other concepts as
   concepts of the global lattice. (In our example, for L(G3 ), we will remove
   1×ac and 2×bcf , and maintain 3×abf, 13×a, 12×c and 23×bf as concepts
   of L(G).)

    Step 1 requires O(|R|) time. Step 2 can be done while computing the smaller
lattices; Step 3 costs constant time per concept. Thus the overall complexity of
the reconstruction is in O(|R|) time.
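
A sketch of these three steps (Python; concepts are represented as pairs
(extent, intent) of frozensets, and the smaller lattices as lists of such pairs;
this is an illustration of the procedure above, not the authors' implementation):

    def reconstruct_concepts(G, O, sigma, sub_lattices):
        """Steps 1-3: star maximal bicliques of G centered on sigma, plus every
        concept of a smaller lattice that is not a star centered on sigma."""
        concepts = set()
        for s in sigma:                                   # steps 1 and 2
            N = frozenset(G[s])
            concepts.add((frozenset([s]), N) if s in O else (N, frozenset([s])))
        for lattice in sub_lattices:                      # step 3
            for extent, intent in lattice:
                star_on_sigma = (len(extent) == 1 and set(extent) <= set(sigma)) or \
                                (len(intent) == 1 and set(intent) <= set(sigma))
                if not star_on_sigma:
                    concepts.add((extent, intent))
        return concepts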


4.2   Reconstructing the edges of the Hasse diagram

According to Theorem 1, the maximal bicliques which are not star maximal
bicliques with a vertex of Σ as center are preserved; therefore, the corresponding
edges between the elements of the lattice are also preserved. In the process


described below, we will refer to labels in the lattice as being the ’reduced’
labels, such as the ones used in our lattice figures throughout this paper.
   To define the edges of the Hasse diagram of lattice L(G), we will, for each
smaller lattice L(Gi ):

 – find each atom (or co-atom) which corresponds to an element σ of Σ (such
   as 2 or b for L(G3 ) in our example).
 – If σ shares its label with some non-elements of Σ, remove all elements of
   Σ from the label. (In our example for L(G3 ), bf becomes f ). If σ does not
   share its label with some non-elements of Σ, remove the atom or co-atom.
   (In our example for L(G3 ), remove element 2).
 – Maintain the remaining edges as edges of L(G).
 – Compute the neighborhood in L(G) of each atom or co-atom which corre-
   sponds to an element of Σ.

   All this can be done in polynomial time: there are at most |A| + |O| vertices
in Σ, and the corresponding edges can be added in O((|A| + |O|)² |R|).
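
Recomputing the neighborhood of these concepts in L(G) amounts to a cover
computation. A naive sketch (Python, cubic in the number of concepts, far from the
bound above) is:

    def hasse_edges(concepts):
        """Cover (Hasse) edges of the lattice, with concepts as (extent, intent)
        pairs of frozensets, ordered by extent inclusion."""
        concepts = list(concepts)
        edges = []
        for c in concepts:
            for d in concepts:
                if c[0] < d[0] and not any(c[0] < e[0] < d[0] for e in concepts):
                    edges.append((c, d))   # d covers c: nothing strictly between
        return edges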


5    Vertical decomposition as a layout tool

When there is a size two clique separator in the bipartite graph which divides the
graph into two components, the concepts which are not involved in the separator
can be displayed on the two sides of the separator, thus helping to minimize the
number of line crossings in the Hasse diagram.





Fig. 11. (a) Lattice constructed by Concept Explorer using the minimal intersection
layout option (8 crossings). (b) Lattice re-drawn using the information on clique sep-
arators (5 crossings).




     To illustrate this, we have used our running example with ’Concept Explorer’
[1], which is a very nice and user-friendly tool for handling lattices. Notice how-
ever how clique separator {1, d} is better displayed when put at the right ex-
tremity.


    Figure 11 shows the lattice as proposed by Concept Explorer, and then re-
drawn with insight on the clique separators of the bipartite graph.
    The same technique of course also applies when there is a succession of such
clique separators.
    Let us add that if moreover both lattices are planar, as discussed in [4], merg-
ing the two lattices obtained using the clique separator as central will preserve
planarity.


6   Conclusion and perspectives
We have used a graph method, clique minimal separator decomposition, to pro-
vide simple tools which can help reduce the time spent computing the elements
of a lattice, as well as improve the drawing of its Hasse diagram.
    When there is no clique separator in the bipartite graph, it could be inter-
esting to investigate restricting the relation to a subgraph or partial subgraph
which does have one.
    We leave open the question of characterizing, without computing the relation,
the lattices whose underlying bipartite graph has a clique minimal separator.


Acknowledgments
The authors sincerely thank all the referees for their useful suggestions and
questions.


References
 1. Concept Explorer. Downloadable at http://sourceforge.net/projects/conexp/, ver-
    sion 1.3 (Java), 20/12/2009.
 2. Berry A., Sigayret A.: Representing a concept lattice by a graph. Discrete Applied
    Mathematics, 144(1-2):27–42, 2004.
 3. Berry A., Pogorelcnik R., Simonet G.: An introduction to clique minimal separator
    decomposition. Algorithms, 3(2):197–215, 2010.
 4. Eschen E.M., Pinet N., Sigayret A.: Consecutive-ones: handling lattice planarity
    efficiently. CLA’07, Montpellier (Fr), 2007.
 5. Hopcroft J. E., Tarjan R. E.: Dividing a graph into triconnected components. SIAM
    J. Comput., 2(3):135–158, 1973.
 6. Hopcroft J. E., Tarjan R. E.: Efficient algorithms for graph manipulation [H]
    (Algorithm 447). Commun. ACM, 16(6):372–378, 1973.
 7. Kaba B., Pinet N., Lelandais G., Sigayret A., Berry A.: Clustering gene expression
    data using graph separators. In Silico Biology, 7(4-5):433–52, 2007.
 8. Leimer H.-G.: Optimal decomposition by clique separators. Discrete Mathematics,
    113(1-3):99–123, 1993.
 9. Tarjan R. E.: Decomposition by clique separators. Discrete Mathematics,
    55(2):221–232, 1985.
            Building up Shared Knowledge with
                Logical Information Systems

              Mireille Ducassé1 , Sébastien Ferré2 , and Peggy Cellier1
                     1 IRISA-INSA de Rennes, France, {ducasse, cellier}@irisa.fr
                     2 IRISA-University of Rennes 1, France, ferre@irisa.fr


        Abstract. Logical Information Systems (LIS) are based on Logical Con-
        cept Analysis, an extension of Formal Concept Analysis. This paper de-
        scribes an application of LIS to support group decision. A case study
        gathered a research team. The objective was to decide on a set of po-
        tential conferences on which to send submissions. People individually
        used Abilis, a LIS web server, to preselect a set of conferences. Start-
        ing from 1041 calls for papers, the individual participants preselected 63
        conferences. They met and collectively used Abilis to select a shared
        set of 42 target conferences. The team could then sketch a publication
        plan. The case study provides evidence that LIS cover at least three
        of the collaboration patterns identified by Kolfschoten, de Vreede and
        Briggs. Abilis helped the team to build a more complete and relevant
        set of information (Generate/Gathering pattern); to build a shared un-
        derstanding of the relevant information (Clarify/Building Shared Un-
        derstanding); and to quickly reduce the number of target conferences
        (Reduce/Filtering pattern).


  1    Introduction
  Group work represents a large amount of time in professional life while many
  people feel that much of that time is wasted. Lewis [13] argues that this amount
  of time is even going to increase because problems are becoming more complex
  and are meant to be solved in a distributed way. Each involved person has a local
  and partial view of the problem; no one embraces the whole required knowledge.
  Lewis also emphasizes that it is common that “groups fail to adequately define a
  problem before rushing to judgment”. Building up shared knowledge in order to
  gather relevant distributed knowledge of a problem is therefore a crucial issue.
      Logical Information Systems (LIS) are based on Logical Concept Analysis
  (LCA), an extension of Formal Concept Analysis (FCA). In a previous work [5],
  Camelis, a single-user logical information system, has been shown useful to sup-
  port serene and fair meetings. This paper shows how Abilis, a LIS web server that
  implements OnLine Analytical Processing (OLAP [3]) features, can be applied
  to help build shared knowledge among a group of skilled users.
      The presented case study gathered a research team to decide on a publication
  strategy. Starting from 1041 calls for papers, each team member on his own

© 2011 by the paper authors. CLA 2011, pp. 31–42. Copying permitted only for private
  and academic purposes. Volume published and copyrighted by its editors. Local
  Proceedings in ISBN 978–2–905267–78–8, Inria NGE/LORIA, Nancy, France.


preselected a set of conferences matching their own focus of interest. The union of
individual preselections still contained 63 conferences. Then, participants met for
an hour and a half and collectively built a shared set of 42 target conferences. For
each conference, the team shared a deep understanding of why it was relevant.
The team could sketch a publication plan in a non-conflictual way.
    Kolfschoten, de Vreede and Briggs have classified collaboration tasks into 16
collaboration patterns [12]. The contribution of this paper is to give evidence
that LIS can significantly support three of these patterns, which are important as-
pects of decision making, namely Generate/Gathering, Clarify/Building Shared
Understanding and Reduce/Filtering. Firstly, the navigation and filtering capa-
bilities of LIS were helpful to detect inconsistencies and missing knowledge. The
updating capabilities of LIS enabled participants to add objects, features and
links between them on the fly. As a result the group had a more complete and
relevant set of information (Generate/Gathering pattern). Secondly, the com-
pact views provided by LIS and the OLAP features helped participants embrace
the whole required knowledge. The group could therefore build a shared under-
standing of the relevant information which was previously distributed amongst
the participants (Clarify/Building Shared Understanding pattern). Thirdly, the
navigation and filtering capabilities of LIS were relevant to quickly converge on
a reduced number of target conferences (Reduce/Filtering pattern).
    In the following, Section 2 briefly introduces logical information systems. Sec-
tion 3 describes the case study. Section 4 gives detailed arguments to support the
claim that logical information systems help build up shared knowledge. Section 5
discusses related work.


2      Logical Information Systems

Logical Information Systems (LIS) [7] belong to a paradigm of information re-
trieval that combines querying and navigation. They are formally based on a
logical generalization of Formal Concept Analysis (FCA) [8], namely Logical
Concept Analysis (LCA) [6]. In LCA, logical formulas are used instead of sets
of attributes to describe objects. LCA and LIS are generic in that the logic
is not fixed, but is a parameter of those formalisms. Logical formulas are also
used to represent queries and navigation links. The concept lattice serves as the
navigation structure: every query leads to a concept, and every navigation link
leads from one concept to another. The query can be modified in three ways: by
formula editing, by navigation (selecting features in the index in order to modify
the query), or by examples. Annotations can be performed in the same interface.
Camelis3 has been developed since 2002; a web interface, Abilis 4 , has recently
been added. It incorporates display paradigms based on On-Line Analytical Pro-
cessing (OLAP). Instead of being presented as a list of objects, an extent can be
partitioned as an OLAP cube, namely a multi-dimensional array [1].
3
     see http://www.irisa.fr/LIS/ferre/camelis/
4
     http://ledenez.insa-rennes.fr/abilis/


3     The Case Study

The reported case study gathered 6 participants (4 academics and 2 PhD
students), including the 3 authors. All participants were familiar with LIS; 4 of them
had not previously used a LIS tool as a decision support system. The objective
was to identify the publishing strategy of the team: to which conferences to
submit and why. The decision was not conflictual: the group agreed
very early that the set of selected conferences could be rather large provided
that there was a good reason to keep each of them.
    One person, the facilitator, spent an afternoon organizing the meeting and
preparing the raw data as well as a logical context according to the objective. She
collected data about roughly a year of conference calls for papers, related to themes
corresponding to the potential areas of the team, from WikiCFP, a semantic wiki
for Calls For Papers in science and technology fields 5 . There were 1041 events:
conferences, symposiums, workshops, but also special issues of journals.
    Then every participant, on their own, spent between half an hour and two hours
browsing the context, updating it if necessary and preselecting a number of con-
ferences (Section 3.1). The group met for an hour and a half. It collaboratively
explored the data and selected a restricted set of conferences (Section 3.2). After
the meeting, every participant filled in a questionnaire. The context used for the
case study can be freely accessed 6 .


3.1   Distributed Individual Preselection and Update

When the context was ready, every participant was asked to preselect a set of
conferences that could be possible submission targets. The instruction was to be
as liberal as wanted and in case of doubt to label the conference as a possible
target.
    During this phase, each of the academics preselected 20 to 30 conferences and
each of the PhD students preselected around 10 conferences. Each participant
had their own “basket”. There was some overlap; altogether 63 conferences were
preselected. Participants also introduced new conferences and new features, for
example, the ranking of the Australian CORE association 7 (Ranking), and the
person expected to be a possible first author for the target conference (Main
author).
    Figure 1 shows a state of Abilis during the preselection phase. LIS user in-
terfaces give a local view of the concept lattice, centered on a focus concept. The
local view is made of three parts: (1) the query (top left), (2) the extent (bottom
right), and (3) the index (bottom left). The query is a logical formula that typi-
cally combines attributes (e.g., Name), patterns (e.g., contains "conference"),
and Boolean connectors (and, or, not). The extent is the set of objects that are
matched by the query, according to logical subsumption. The extent identifies
5
  http://www.wikicfp.com/cfp/
6
  http://ledenez.insa-rennes.fr/abilis/, connect as guest, load Call for papers.
7
  http://core.edu.au/index.php/categories/conference rankings




        Fig. 1. Snapshot of Abilis during preselection: a powerful query


the focus concept. Finally, the index is a set of features, taken from a finite sub-
set of the logic, and is restricted to features associated to at least one object in
the extent. The index plays the role of a summary or inventory of the extent,
showing which kinds of objects there are, and how many of each kind there are
(e.g., in Figure 1, 8 objects in the extent have data mining as a theme). In the
index, features are organized as a taxonomy according to logical subsumption.
     The query area (top left) shows the current selection criteria: (Name contains
"conference" or Name contains "symposium") and not (Name contains
"agent" or Name contains "challenge" or Name contains "workshop")
and (Theme is "Knowledge Discovery" or Theme is "Knowledge
Engineering" or Theme is "Knowledge Management"). Note that the query
had been obtained solely by clicking on features of the index (bottom left). Let
us describe how it had been produced. Initially there were 1041 objects. Firstly,
opening the Name ? feature, the participant had noticed that names could con-
tain “conference” or “symposium” but also other keywords such as “special
issue”. He decided to concentrate on conferences and symposiums by clicking
on the two features and then on the zoom button. The resulting query was
(Name contains "conference" or Name contains "symposium") and there
were 495 objects in the extent. However, the displayed features under Name ?
showed that there were still objects whose name in addition to “conference” or
“symposium” also contained “agent”, “challenge” or “workshop”. He decided to
filter them out by clicking on the three features then on the Not button then on
the zoom button. The resulting query was (Name contains "conference" or
Name contains "symposium") and not (Name contains "agent" or Name
contains "challenge" or Name contains "workshop") and there were 475
objects in the extent. He opened the Theme ? feature, clicked on the three sub-
features containing “Knowledge”, then on the zoom button. The resulting query
is the one displayed on Figure 1 and there are 48 objects in the displayed extent.
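    To make the mechanics of the local view concrete, the following minimal Python
sketch shows how an extent can be computed from a Boolean query and how the
index counts the features present in that extent. It is an illustration only, under
simplified assumptions: the toy data and helper names (name_contains, theme_is)
are hypothetical and do not correspond to Abilis's actual code or logic.

    # Toy events described by attribute values (a simplified logical context).
    events = {
        "GDN2011":  {"Name": "conference on group decision",
                     "Theme": {"Decision Support Systems", "Knowledge Management"}},
        "ICCS2011": {"Name": "conference on conceptual structures",
                     "Theme": {"Knowledge Engineering"}},
        "WS-KD":    {"Name": "workshop on knowledge discovery",
                     "Theme": {"Knowledge Discovery"}},
    }

    def name_contains(word):
        return lambda desc: word in desc["Name"]

    def theme_is(theme):
        return lambda desc: theme in desc["Theme"]

    def And(*qs): return lambda d: all(q(d) for q in qs)
    def Or(*qs):  return lambda d: any(q(d) for q in qs)
    def Not(q):   return lambda d: not q(d)

    query = And(Or(name_contains("conference"), name_contains("symposium")),
                Not(name_contains("workshop")),
                Or(theme_is("Knowledge Discovery"), theme_is("Knowledge Engineering"),
                   theme_is("Knowledge Management")))

    # The extent is the set of objects matched by the query.
    extent = {o for o, desc in events.items() if query(desc)}
    # The index counts, for each theme feature, the objects of the extent having it.
    index = {}
    for o in extent:
        for t in events[o]["Theme"]:
            index[t] = index.get(t, 0) + 1
    print(sorted(extent), index)   # ['GDN2011', 'ICCS2011'] and the theme counts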
    In the extent area (bottom right), the 48 acronyms of the selected conferences
are displayed. In the index area, one can see which of the features are filled for
these objects. The displayed features have at least 1 object attached to them.
The number of objects actually attached to them is shown in parentheses. For
example, only 14 of the preselected conferences have an abstract deadline. All
of them have an acronym, a date of beginning, a date of end, a date for the
paper deadline, a name, some other (not very relevant) information, as well as
at least a theme and a town. The features shared by all selected objects have
that number in parentheses (48 in this case). For the readers who have a color
printout, these features are in green. The other features are attached to only
some of the objects. For example, only 16 objects have a ranking attached to
them: 4 core A, 6 core B, 2 core C, 1 ‘too recent event’, 4 unknown (to the Core
ranking).
    One way to pursue the navigation could be, for example, to click on Ranking ?
to select the conferences for which the information is filled. Alternatively, one
could concentrate on the ones for which the ranking is not filled, for example
to fill in this information on the fly for the conferences which are considered
interesting.
    Another way to pursue the navigation could be, for example, to notice that
under the Theme ? feature, there are more themes than the selected ones. One can
see that among the selected conferences, one conference is also relevant to the
Decision Support Systems theme. One could zoom into it; this would add and
Theme is "Decision Support Systems" to the query; the object area would
then display the relevant conference (namely GDN2011).


3.2   Collaborative Data Exploration, Update and Selection

The group eventually had a physical meeting where the current state of the
context was constantly displayed on a screen.
    Using the navigation facilities of Abilis, the conferences were examined by
decreasing ranking. Initially, the group put all the preselected A and A+
conferences into the selection. After some discussion, it was, however, decided
that Human Computer Interaction (HCI) was too far away from the core of
the team’s research. Subsequently, the HCI conferences already selected were
removed from the selection. For the conferences of rank B, the team decided
that most of them were quite good and deserved to be kept in the selection.
For the others, the group first investigated the unranked and very recent
conferences, trying to identify the ones with a high selection rate or other good
reasons to put them in the selection. Some of the arguments were added
into the context. Some others were taken into account on the fly to select some
conferences, but they seemed so obvious at the time of the discussion that they
were not added to the context.
    Figure 2 shows the selection made by the group at a given point. In the
extent area, on the right-hand side, the selected objects are partitioned according
to the deadline month and the anticipated main author thanks to the OLAP-
like facilities of Abilis [1]. Instead of being presented as a list of objects, an
extent can be partitioned as an OLAP cube, namely a multi-dimensional array.
Assuming object features are valued attributes, each attribute can play the role
of a dimension, whose values play the role of indices along this dimension. Users
can freely interleave changes to the query and changes to the extent view.
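    The OLAP-like partitioning itself can be sketched in a few lines: a two-dimensional
view is obtained by grouping the objects of the extent along two valued attributes.
The data and attribute names below are hypothetical; this is an illustration of the
principle, not of Abilis's implementation.

    # Partition an extent along two valued attributes (deadline month, main author).
    extent = [
        {"acronym": "IDA",   "deadline_month": "May", "main_author": "PC"},
        {"acronym": "KDD",   "deadline_month": "Feb", "main_author": "PC"},
        {"acronym": "ICCS",  "deadline_month": "Jan", "main_author": "SF"},
        {"acronym": "ISSTA", "deadline_month": "Jan", "main_author": "PC"},
    ]

    def partition(objects, dim1, dim2):
        cube = {}
        for o in objects:
            cube.setdefault(o[dim1], {}).setdefault(o[dim2], []).append(o["acronym"])
        return cube

    for month, row in sorted(partition(extent, "deadline_month", "main_author").items()):
        print(month, row)   # e.g. Jan {'SF': ['ICCS'], 'PC': ['ISSTA']}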
    The query is SelectionLIS02Dec and scope international. Note that
the partition display is consistent with the query. When the group added and
scope international to the query, the national conferences disappeared from
the array.
    Some conferences absent from WikiCFP were entered on the fly at that
stage (for example, ICCS 2011 - Concept). Not all the features had been entered
for all of them. In particular, one can see in the feature area that only 28 out
of 29 had been preselected. Nevertheless, the group judged that the deadline
month, the potential main author and the ranking were crucial for the decision
process and added them systematically. It is easy to find which objects do not
have a feature using Not and zoom, and then to attach features to them.




Fig. 2. Snapshot of Abilis during collaborative data exploration: a partition deadline
month/mainAuthor


    One can see that there are enough opportunities for each participant to pub-
lish throughout the year. One can also see at a glance where compromises and decisions
will have to be made. For example, PC will probably not be in a position to pub-
lish at IDA, ISSTA, KDD and ICCS the same year. Thanks to this global view,
PC can discuss with potential co-authors what the best strategy could be.
    A follow-up to the meeting was that participants made a personal publication
plan, knowing that their target conferences were approved by the group.


4     Discussion
In this section, we discuss how the reported case study provides evidence that
LIS help keep the group focused (Section 4.1) and that LIS also help build up
shared knowledge (Section 4.2). As already mentioned, participants filled in a
questionnaire after the meeting. In the following, for each item, we introduce
the arguments, present a summary of the relevant parts of the participant feedback,
and follow with an analysis of the features of LIS that are crucial for the arguments.

4.1   Logical Information Systems Help Keep the Group Focused
It is recognized that an expert facilitator can significantly increase the efficiency
of a meeting (see for example [2]). A study made by den Hengst and Adkins [10]
investigated which facilitation functions were found the most challenging by
facilitators around the world. It provides evidence that facilitators find that
“the most difficult facilitation function in meeting procedures is keeping the group
outcome focused.”
    In our case study, all participants reported that they could very easily stay
focused on the point currently discussed thanks to the query and the consistency
between the three views.
    As the objective was to construct a selection explicitly identified in Abilis
by a feature, the objective of the meeting was always present to everybody
and straightforward to bring back in case of digression. Furthermore, even if the
context contained over a thousand conferences, thanks to the navigation facilities
of LIS, only relevant information was displayed at a given time. Therefore, there
was no “noise” and no dispersion of attention; the displayed information was
always closely connected to the focus of the discussion.

4.2   Logical Information Systems Help Build Up Shared Knowledge
Kolfschoten, de Vreede and Briggs have identified 6 collaboration patterns: Gen-
erate, Reduce, Clarify, Organize, Evaluate, and Consensus Building [12]. We
discuss in the following three of their 16 sub-patterns, for which all participants
agreed that they are supported by Abilis in its current stage. The situation did
not demand much with respect to the other sub-patterns. For example,
the decision to be made was not conflictual and the set of selected conferences could
be rather large; there was, therefore, not much to experiment with about “consensus
building.” The descriptions of the patterns in italic are from Kolfschoten, de
Vreede and Briggs.

Generate/Gathering: move from having fewer to having more complete and rel-
evant information shared by the group.
     Before and during the meeting, information was added to the shared
knowledge repository of the group, namely the logical context. A new theme,
important for the team and missing from WikiCFP, was added: Decision
Support Systems. New conferences were added to the context either by
individual participants in the preselection phase or by the group during the
selection phase. New features were added. For example, it soon appeared that
some sort of conference ranking was necessary. The group added by hand, for the
conferences that were selected, the ranking of the Australian Core association.
Some conferences were added subsequently; sometimes the ranking was not added
at once.
     All participants acknowledged that the tool helped the group to set up a set
of features which was relevant and reflected the group’s point of view.
     The crucial characteristics of LIS for this aspect are those which enable in-
tegrated navigation and update.
     Firstly, the possibility to update the context while navigating in it enables
participants to enhance it on the fly, adding small pieces of relevant information
at a time. Secondly, for each feature, Abilis displays the number of objects which
have it. It is therefore immediate to detect when a feature is not systematically
filled. The query Not ⟨feature⟩ selects the objects that do not have the feature.
Users can then decide if they want to update them. Thanks to the query, as soon
as an object is updated, it disappears from the extent. Users can immediately
see what remains to be updated. Thirdly, updating the context does not divert
from the initial objective. Indeed, the Back button allows users to go back to
previous queries. Fourthly, the three views (query, features, objects) are always
consistent and provide a “global” understanding of the relevant objects. Lastly,
in the shared web server, participants can see what information the others had
entered. Hence each participant can inspire the others.
     For the last aspect, the facilitator’s inputs were decisive. Participants reported
that they did not invent much; they imitated and adapted from what the facili-
tator had initiated. This is consistent with the literature on group decision and
negotiation which emphasizes the key role of facilitators [2].

Clarify/Building Shared Understanding: Move from having less to more shared
understanding of the concepts shared by the group and the words and phrases
used to express them.
    Participants, even senior ones, discovered new conferences. Some were sur-
prised by the ranking of conferences that they had previously overlooked. Par-
ticipants had a much clearer idea of who was interested in what.
    All participants found that the tool helped them understand the points of
view of the others.


     The crucial characteristics of LIS for this aspect are those which allow one to
grasp a global understanding at a glance. Firstly, the query, as discussed earlier,
helps keep the group focused. Secondly, the consistency between the 3 views
helps participants to grasp the situation. Thirdly, irrelevant features are not in
the index; the features in the index thus reflect the current state of the group de-
cision. Fourthly, the partitions à la OLAP sort the information according to the
criteria under investigation. Lastly, the shared web server enables participants
to know before the meeting what the others have entered.

Reduce/Filtering: move from having many concepts to fewer concepts that meet
specific criteria according to the group members.
    Both at preselection time and during the meeting, participants could quickly
strip down the set of conferences of interest according to the most relevant
criteria.
     All participants said that the filtering criteria were relevant and reflected
the group’s point of view. They also all thought that the group was satisfied
with the selected set of conferences.
    The crucial characteristics of LIS for this aspect are those of the navigation
core of LIS. Firstly, the features of the index propose filtering criteria. They
are dynamically computed and they are relevant for the current selection of ob-
jects. Secondly, the query with its powerful logic capabilities enables participants
to express sophisticated selections. Thirdly, the navigation facilities enable par-
ticipants to build powerful queries, even without knowing anything about the
syntax. Lastly, users do not have to worry about the consistency of the set of
selected objects. The view consistency of Abilis guarantees that all conferences
fulfilling the expressed query are indeed present.
    This aspect is especially important. As claimed by Davis et al. [4], conver-
gence in meetings is a slow and painful process for groups. Vogel and Coombes [16]
present an experiment that supports the hypothesis that groups selecting ideas
from a multicriteria task formulation will converge better than groups working on
a single-criterion formulation, where convergence is defined as moving from many
ideas to a focus on a few ideas that are worthy of further attention. Convergence
is very close to the Reduce/Filtering collaboration pattern. They also underline
that people try to minimize the effects of information overload by employing con-
scious or even unconscious heuristic strategies in order to reduce information
load, where information overload is defined as having too many things to do at
once.
     With their powerful navigation facilities, LIS make it possible to address a large num-
ber of criteria and objects with limited information overload. Indeed, one can
concentrate on local aspects. The global consistency is maintained automatically
by the concept lattice.

5    Related work
Abilis in its current stage does not pretend to match up to operational group
support systems (GSS), which have a much broader scope. LIS, however, could be
integrated in some of the modules of GSS. For example, Meetingworks™ [13],
one of the most established GSS, is a modular toolkit that can be configured to
support a wide variety of group tasks. Its “Organize” module proposes a tree
structure to help analyze and sort ideas. That structure looks much like the
index of LIS. It can be edited by hand and some limited selection is possible.
The navigation capabilities of LIS based on the concept lattice are, however,
more powerful.
    Concept analysis has been applied to numerous social contexts, such as so-
cial networks [15], computer-mediated communication [9] and domestic violence
detection [14]. Most of those applications are intended to be applied a posteri-
ori, in order to get some understanding of the studied social phenomena. On
the contrary, we propose to use Logical Concept Analysis in the course and as a
support of the social phenomenon itself. In our case, the purpose is to support a
collaborative decision process. Our approach is to other social applications what
information retrieval is to data mining. Whereas data mining automatically com-
putes a global and static view on a posteriori data, information retrieval (i.e.
navigation in and update of the concept lattice) presents the user with a local
and dynamic view on live data, and only guides users in their choice.
    A specificity of LIS is the use of logics. This has consequences both on the
queries that can be expressed, and on the feature taxonomy. The use of logics
allows one to express inequalities over numerical attributes, as well as disjunctions and negations, in
queries. In pure FCA, only conjunctions of Boolean attributes can be expressed.
Previous sections have shown how disjunction and negation are important to
express selection criteria. In the taxonomy, criteria are organized according to
the logical subsumption relation between them; in pure FCA, criteria would be
presented as a long flat list. Logics help to make the taxonomy more concise and
readable by grouping and hierarchizing similar criteria. The taxonomy
can be dynamically updated by end-users.
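    As a rough illustration of how logical subsumption can organize valued features
into a taxonomy, the sketch below defines a toy subsumption test (exact values and
string labels under an "attribute ?" feature, numbers under an interval) and groups
concrete features under more general ones. It is a simplified stand-in, not the logic
used by Abilis.

    # Toy subsumption between valued features: ("attr", None) stands for "attr ?".
    def subsumed(feature, more_general):
        attr1, val1 = feature
        attr2, val2 = more_general
        if attr1 != attr2:
            return False
        if val2 is None:                                  # "attr ?" covers every value
            return True
        if isinstance(val2, tuple) and isinstance(val1, (int, float)):
            lo, hi = val2
            return lo <= val1 <= hi                       # interval containment
        return val1 == val2

    features = [("Ranking", "core A"), ("Ranking", "core B"), ("Deadline", 3)]
    general  = [("Ranking", None), ("Deadline", (1, 6))]
    for g in general:
        print(g, "->", [f for f in features if subsumed(f, g)])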

6   Conclusion
In this paper we have shown that a Logical Information System web server could
be used to support a group decision process consisting of 1) data preparation,
2) distributed individual preselection and update, and 3) collaborative data ex-
ploration, update and selection. We have presented evidence that, firstly, the navigation
and filtering capabilities of LIS were relevant to quickly reduce the number of
target conferences. Secondly, the same capabilities were also helpful to detect
inconsistencies and missing knowledge. The updating capabilities of LIS enabled
participants to add objects, features and links between them on the fly. As a
result, the group had a more complete and relevant set of information. Thirdly,
the group built a shared understanding of the relevant information.

Acknowledgments The authors thank Pierre Allard and Benjamin Sigonneau for
the development and maintenance of Abilis. They thank Pierre Allard, Annie
Foret and Alice Hermann for attending the experiment and giving much insight-
ful feedback.


References
 1. Allard, P., Ferré, S., Ridoux, O.: Discovering functional dependencies and associa-
    tion rules by navigating in a lattice of OLAP views. In: Kryszkiewicz, M., Obiedkov,
    S. (eds.) Concept Lattices and Their Applications. pp. 199–210. CEUR-WS (2010)
 2. Briggs, R.O., Kolfschoten, G.L., de Vreede, G.J., Albrecht, C.C., Lukosch, S.G.:
    Facilitator in a box: Computer assisted collaboration engineering and process sup-
    port systems for rapid development of collaborative applications for high-value
    tasks. In: HICSS. pp. 1–10. IEEE Computer Society (2010)
 3. Codd, E., Codd, S., Salley, C.: Providing OLAP (On-line Analytical Processing)
    to User-Analysts: An IT Mandate. Codd & Date, Inc, San Jose (1993)
 4. Davis, A., de Vreede, G.J., Briggs, R.: Designing thinklets for convergence. In:
    AMCIS 2007 Proceedings (2007), http://aisel.aisnet.org/amcis2007/358
 5. Ducassé, M., Ferré, S.: Fair(er) and (almost) serene committee meetings with logi-
    cal and formal concept analysis. In: Eklund, P., Haemmerlé, O. (eds.) Proceedings
    of the International Conference on Conceptual Structures. Springer-Verlag (July
    2008), Lecture Notes in Artificial Intelligence 5113
 6. Ferré, S., Ridoux, O.: A logical generalization of formal concept analysis. In:
    Mineau, G., Ganter, B. (eds.) International Conference on Conceptual Structures.
    pp. 371–384. No. 1867 in Lecture Notes in Computer Science, Springer (Aug 2000)
 7. Ferré, S., Ridoux, O.: An introduction to logical information systems. Information
    Processing & Management 40(3), 383–419 (2004)
 8. Ganter, B., Wille, R.: Formal Concept Analysis: Mathematical Foundations.
    Springer, Heidelberg (1999)
 9. Hara, N.: Analysis of computer-mediated communication: Using formal concept
    analysis as a visualizing methodology. Journal of Educational Computing Research
    26(1), 25–49 (2002)
10. den Hengst, M., Adkins, M.: Which collaboration patterns are most challenging:
    A global survey of facilitators. In: HICSS. p. 17. IEEE Computer Society (2007)
11. Kilgour, D.M., Eden, C.: Handbook of Group Decision and Negotiation, Advances
    in Group Decision and Negotiation, vol. 4. Springer Netherlands (2010)
12. Kolfschoten, G.L., de Vreede, G.J., Briggs, R.O.: Collaboration engineering. In:
    Kilgour and Eden [11], chap. 20, pp. 339–357
13. Lewis, L.F.: Group support systems: Overview and guided tour. In: Kilgour and
    Eden [11], chap. 14, pp. 249–268
14. Poelmans, J., Elzinga, P., Viaene, S., Dedene, G.: A case of using formal con-
    cept analysis in combination with emergent self organizing maps for detecting
    domestic violence. In: Perner, P. (ed.) Advances in Data Mining. Applications and
    Theoretical Aspects, Lecture Notes in Computer Science, vol. 5633, pp. 247–260.
    Springer Berlin / Heidelberg (2009), http://dx.doi.org/10.1007/978-3-642-03067-3_20
15. Roth, C., Bourgine, P.: Lattice-based dynamic and overlapping taxonomies:
    The case of epistemic communities. Scientometrics 69, 429–447 (2006),
    http://dx.doi.org/10.1007/s11192-006-0161-6
16. Vogel, D., Coombes, J.: The effect of structure on convergence activities using
    group support systems. In: Kilgour and Eden [11], chap. 17, pp. 301–311
       Comparing Performance of Algorithms for
       Generating the Duquenne–Guigues Basis

                     Konstantin Bazhanov and Sergei Obiedkov

                    Higher School of Economics, Moscow, Russia,
                  kostyabazhanov@mail.ru, sergei.obj@gmail.com



        Abstract. In this paper, we take a look at algorithms involved in the
        computation of the Duquenne–Guigues basis of implications. The most
        widely used algorithm for constructing the basis is Ganter’s Next Clo-
        sure, designed for generating closed sets of an arbitrary closure system.
        We show that, for the purpose of generating the basis, the algorithm can
        be optimized. We compare the performance of the original algorithm
        and its optimized version in a series of experiments using artificially
        generated and real-life datasets. An important computationally expen-
        sive subroutine of the algorithm generates the closure of an attribute
        set with respect to a set of implications. We compare the performance
        of three algorithms for this task on their own, as well as in conjunction
        with each of the two versions of Next Closure.


  1    Introduction

  Implications are among the most important tools of formal concept analysis
  (FCA) [9]. The set of all attribute implications valid in a formal context defines
  a closure operator mapping attribute sets to concept intents of the context (this
  mapping is surjective). The following two algorithmic problems arise with respect
  to implications:

   1. Given a set L of implications and an attribute set A, compute the closure
      L(A).
   2. Given a formal context K, compute a set of implications equivalent to the
      set of all implications valid in K, i.e., the cover of valid implications.

      The first of these problems has received considerable attention in the database
  literature in application to functional dependencies [14]. Although functional
  dependencies are interpreted differently than implications, the two are in many
  ways similar: in particular, they share the notion of semantic consequence and
  the syntactic inference mechanism (Armstrong rules [1]). A linear-time algo-
  rithm, LinClosure, has been proposed for computing the closure of a set with
  respect to a set of functional dependencies (or implications) [3], i.e., for solving
  the first of the two problems stated above. However, the asymptotic complexity
  estimates may not always be good indicators for relative performance of algo-
  rithms in practical situations. In Sect. 3, we compare LinClosure with two

other algorithms—a “naïve” algorithm, Closure [14], and the algorithm pro-
posed in [20]—both of which are non-linear. We analyze their performance in
several particular cases and compare them experimentally on several datasets.
    For the second problem, an obvious choice of the cover is the Duquenne–
Guigues, or canonical, basis of implications, which is the smallest set equivalent
to the set of valid implications [11]. Unlike for the other frequently occurring
FCA algorithmic task, the computation of all formal concepts of a formal con-
text [12], only few algorithms have been proposed for the calculation of the
canonical basis. The most widely-used algorithm was proposed by Ganter in
[10]. Another, attribute-incremental, algorithm for the same problem was de-
scribed in [17]. It is claimed to be much faster than Ganter’s algorithm for most
practical situations. The Concept Explorer software system [21] uses this algo-
rithm to generate the Duquenne–Guigues basis of a formal context. However,
we do not discuss it here, for we choose to concentrate on the computation of
implications in the lectic order (see Sect. 4). The lectic order is important in the
interactive knowledge-acquisition procedure of attribute exploration [8], where
implications are output one by one and the user is requested to confirm or reject
(by providing a counterexample) each implication.
    Ganter’s algorithm repeatedly computes the closure of an attribute set with
respect to a set of implications; therefore, it relies heavily on a subprocedure
implementing a solution to the first problem. In Sect. 4, we describe possible
optimizations of Ganter’s algorithm and experimentally compare the original
and optimized versions in conjunction with each of the three algorithms for
solving the first problem. A systematic comparison with the algorithm from [17]
is left for further work.

2    The Duquenne–Guigues Basis of Implications
Before proceeding, we quickly recall the definition of the Duquenne–Guigues
basis and related notions.
    Given a (formal) context K = (G, M, I), where G is called a set of objects, M
is called a set of attributes, and the binary relation I ⊆ G × M specifies which
objects have which attributes, the derivation operators (·)I are defined for A ⊆ G
and B ⊆ M as follows:
     A0 = {m ∈ M | ∀g ∈ A : gIm}                 B 0 = {g ∈ G | ∀m ∈ B : gIm}
In words, A0 is the set of attributes common to all objects of A and B 0 is the set
of objects sharing all attributes of B. The double application of (·)0 is a closure
operator, i.e., (·)00 is extensive, idempotent, and monotonous. Therefore, sets A00
and B 00 are said to be closed. Closed object sets are called concept extents and
closed attribute sets are called concept intents of the formal context K.
    In discussing the algorithms later in the paper, we assume that the sets G
and M are finite.
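    As a concrete illustration of the derivation operators, the following small Python
sketch (a toy context invented for this purpose, unrelated to the datasets used later)
computes A′, B′ and the closure B′′.

    # Toy formal context: each object is mapped to its set of attributes (the relation I).
    I = {
        "g1": {"a", "b"},
        "g2": {"a", "c"},
        "g3": {"a", "b", "c"},
    }
    M = {"a", "b", "c"}

    def up(A):      # A' : attributes common to all objects of A
        return set(M) if not A else set.intersection(*(I[g] for g in A))

    def down(B):    # B' : objects having all attributes of B
        return {g for g, attrs in I.items() if B <= attrs}

    print(up({"g1", "g3"}))   # {'a', 'b'}
    print(down({"a", "b"}))   # {'g1', 'g3'}
    print(up(down({"b"})))    # {'b'}'' = {'a', 'b'}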
    An implication over M is an expression A → B, where A, B ⊆ M are at-
tribute subsets. It holds in the context if A′ ⊆ B′, i.e., every object of the context
that has all attributes from A also has all attributes from B.


    An attribute subset X ⊆ M respects (or is a model of) an implication A → B
if A ⊈ X or B ⊆ X. Obviously, an implication holds in a context (G, M, I) if
and only if {g}′ respects the implication for all g ∈ G.
    A set L of implications over M defines the closure operator X ↦ L(X) that
maps X ⊆ M to the smallest set respecting all the implications in L:

        L(X) = ⋂ {Y | X ⊆ Y ⊆ M, ∀(A → B) ∈ L : A ⊈ Y or B ⊆ Y }.

    We discuss algorithms for computing L(X) in Sect. 3. Note that, if L is the
set of all valid implications of a formal context, then L(X) = X′′ for all X ⊆ M .
    Two implication sets over M are equivalent if they are respected by exactly
the same subsets of M . Equivalent implication sets define the same closure op-
erator. A minimum cover of an implication set L is a set of minimal size among
all implication sets equivalent to L. One particular minimum cover described in
[11] is defined using the notion of a pseudo-closed set, which we introduce next.
    A set P ⊆ M is called pseudo-closed (with respect to a closure operator (·)′′ )
if P ≠ P′′ and Q′′ ⊂ P for every pseudo-closed Q ⊂ P .
    In particular, all minimal non-closed sets are pseudo-closed. A pseudo-closed
attribute set of a formal context is also called a pseudo-intent.
    The Duquenne–Guigues or canonical basis of implications (with respect to
a closure operator (·)′′ ) is the set of all implications of the form P → P′′ ,
where P is pseudo-closed. This set of implications is of minimal size among
those defining the closure operator (·)′′ . If (·)′′ is the closure operator associated
with a formal context, the Duquenne–Guigues basis is a minimum cover of valid
implications of this context. The computation of the Duquenne–Guigues basis
of a formal context is hard, since even recognizing pseudo-intents is a coNP-
complete problem [2], see also [13, 7]. We discuss algorithms for computing the
basis in Sect. 4.
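    For very small attribute sets, the definition of pseudo-closed sets can be applied
directly, which gives a brute-force (exponential) way of producing the canonical basis.
The sketch below is meant only to make the definition concrete; the efficient lectic-order
algorithms are the subject of Sect. 4.

    from itertools import combinations

    # Enumerate subsets of M by increasing size; keep P with P != P'' such that
    # Q'' is a proper subset of P for every previously found pseudo-closed Q ⊂ P.
    def canonical_basis_bruteforce(M, closure):
        pseudo = []
        for k in range(len(M) + 1):
            for c in combinations(sorted(M), k):
                P = set(c)
                if P != closure(P) and all(closure(Q) < P for Q in pseudo if Q < P):
                    pseudo.append(P)
        return [(P, closure(P)) for P in pseudo]

    # With the toy context sketched in Sect. 2: closure = lambda B: up(down(set(B))).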


3    Computing the Closure of an Attribute Set

In this section, we compare the performance of algorithms computing the closure
of an attribute set X with respect to a set L of implications. Algorithm 1 [14]
checks every implication A → B ∈ L and enlarges X with attributes from B
if A ⊆ X. The algorithm terminates when a fixed point is reached, that is,
when the set X cannot be enlarged any further (which always happens at some
moment, since both L and M are assumed finite).
    The algorithm is obviously quadratic in the number of implications in L in
the worst case. The worst case happens when exactly one implication is applied
at each iteration (except the last one) of the repeat loop, resulting in |L|(|L| + 1)/2
iterations of the for all loop, each requiring O(|M |) time.

Example 1. A simple example is when X = {1} and the implications in L =
{{i} → {i + 1} | i ∈ N, 0 < i < n} for some n are arranged in the descending
order of their one-element premises.


Algorithm 1 Closure(X, L)
Input: An attribute set X ⊆ M and a set L of implications over M .
Output: The closure of X w.r.t. implications in L.
  repeat
      stable := true
      for all A → B ∈ L do
            if A ⊆ X then
                 X := X ∪ B
                 stable := false
                 L := L \ {A → B}
  until stable
  return X
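    A direct Python transcription of Algorithm 1 may look as follows; it is only a
sketch (implications are pairs of attribute sets), not the C++ implementation used in
the experiments below.

    # Sketch of Algorithm 1 (Closure): repeatedly fire implications whose premise
    # is contained in X until nothing changes; fired implications are removed.
    def closure_naive(X, L):
        X = set(X)
        L = list(L)                        # implications as (premise, conclusion) pairs
        stable = False
        while not stable:
            stable = True
            remaining = []
            for A, B in L:
                if A <= X:
                    X |= B
                    stable = False         # something was fired; loop once more
                else:
                    remaining.append((A, B))
            L = remaining
        return X

    # Example 1 with n = 5: premises {i} -> {i+1} in descending order, X = {1}.
    L = [(frozenset({i}), frozenset({i + 1})) for i in range(4, 0, -1)]
    print(sorted(closure_naive({1}, L)))   # [1, 2, 3, 4, 5]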


    In [3], a linear-time algorithm, LinClosure, is proposed for the same prob-
lem. Algorithm 2 is identical to the version of LinClosure from [14] except for
one modification designed to allow implications with empty premises in L. Lin-
Closure associates a counter with each implication, initializing it with the size
of the implication premise. Also, each attribute is linked to a list of implications
that have it in their premises. The algorithm then checks every attribute m of
X (the set whose closure must be computed) and decrements the counters for
all implications linked to m. If the counter of some implication A → B reaches
zero, attributes from B are added to X. Afterwards, they are used to decrement
counters along with the original attributes of X. When all attributes in X have
been checked in this way, the algorithm stops with X containing the closure of
the input attribute set.
    It can be shown that the algorithm is linear in the length of the input as-
suming that each attribute in the premise or conclusion of any implication in L
requires a constant amount of memory [14].
Example 2. The worst case for LinClosure occurs, for instance, when X ⊂ N,
M = X ∪ {1, 2, . . . , n} for some n such that X ∩ {1, 2, . . . , n} = ∅ and L consists
of implications of the form
                             X ∪ {i | 0 < i < k} → {k}
for all k such that 1 ≤ k ≤ n. During each of the first |X| iterations of the
for all loop, the counters of all implications will have to be updated with only
the last iteration adding one attribute to X using the implication X → {1}. At
each of the subsequent n − 1 iterations, the counter for every so far “unused”
implication will be updated and one attribute will be added to X. The next,
(|X| + n)th, iteration will terminate the algorithm.
    Note that, if the implications in L are arranged in the superset-inclusion
order of their premises, this example will present the worst case for Algorithm 1
requiring n iterations of the main loop. However, if the implications are arranged
in the subset-inclusion order of their premises, one iteration will be sufficient.
    Inspired by the mechanism used in LinClosure to obtain linear asymp-
totic complexity, but somewhat disappointed by the poor performance of the


Algorithm 2 LinClosure(X, L)
Input: An attribute set X ⊆ M and a set L of implications over M.
Output: The closure of X w.r.t. implications in L.
  for all A → B ∈ L do
       count[A → B] := |A|
       if |A| = 0 then
            X := X ∪ B
       for all a ∈ A do
            add A → B to list[a]
  update := X
  while update 6= ∅ do
       choose m ∈ update
       update := update \ {m}
       for all A → B ∈ list[m] do
            count[A → B] = count[A → B] − 1
            if count[A → B] = 0 then
                 add := B \ X
                 X := X ∪ add
                 update := update ∪ add
  return X
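    The same algorithm can be sketched in Python as follows; the counters and the
attribute-to-implication lists mirror the pseudocode above, and the code is again an
illustration rather than the implementation evaluated below.

    # Sketch of Algorithm 2 (LinClosure).
    def lin_closure(X, L):
        X = set(X)
        count = {}                         # premise attributes not yet seen, per implication
        lists = {}                         # attribute -> implications listing it in the premise
        for idx, (A, B) in enumerate(L):
            count[idx] = len(A)
            if not A:                      # empty premise: fires immediately
                X |= B
            for a in A:
                lists.setdefault(a, []).append(idx)
        update = set(X)
        while update:
            m = update.pop()
            for idx in lists.get(m, ()):
                count[idx] -= 1
                if count[idx] == 0:        # the whole premise is now inside X
                    A, B = L[idx]
                    add = B - X
                    X |= add
                    update |= add
        return X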



algorithm relative to Closure, which was revealed in his experiments, Wild
proposed a new algorithm in [20]. We present this algorithm (in a slightly more
compact form) as Algorithm 3. The idea is to maintain implication lists similar
to those used in LinClosure, but get rid of the counters. Instead, at each step,
the algorithm combines the implications in the lists associated with attributes
not occurring in X and “fires” the remaining implications (i.e., uses them to
enlarge X). When there is no implication to fire, the algorithm terminates with
X containing the desired result.
    Wild claims that his algorithm is faster than both LinClosure and Clo-
sure, even though it has the same asymptotic complexity as the latter. The worst
case for Algorithm 3 is when L \ L1 contains exactly one implication A → B and
B \ X contains exactly one attribute at each iteration of the repeat . . . until
loop. Example 1 presents the worst case for Algorithm 3, but, unlike for Closure, the
order of implications in L is irrelevant. The worst case for LinClosure (see
Example 2) is also the worst case for Algorithm 3, but it deals with it, perhaps,
in a more efficient way using n iterations of the main loop compared to n + |X|
iterations of the main loop in LinClosure.


Experimental Comparison

We implemented the algorithms in C++ using Microsoft Visual Studio 2010. For
the implementation of attribute sets, as well as sets of implications in Algorithm
3, we used dynamic bit sets from the Boost library [6]. All the tests described in
the following sections were carried out on an Intel Core i5 2.67 GHz computer
with 4 GB of memory running under Windows 7 Home Premium x64.


Algorithm 3 Wild’s Closure(X, L)
Input: An attribute set X ⊆ M and a set L of implications over M.
Output: The closure of X w.r.t. implications in L.
  for all m ∈ M do
       for all A → B ∈ L do
            if m ∈ A then
                 add A → B to list[m]
  repeat
       stable := true
       L1 := ⋃ {list[m] | m ∈ M \ X}
       for all A → B ∈ L \ L1 do
            X := X ∪ B
            stable := false
       L := L1
  until stable
  return X
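    A Python sketch of Algorithm 3 (an illustration of the idea rather than Wild's
original code):

    # Sketch of Algorithm 3 (Wild's Closure): at each round, the implications whose
    # premise contains an attribute outside X (the union of list[m] for m not in X)
    # are set aside; all the remaining implications fire at once.
    def wild_closure(X, L):
        X = set(X)
        L = list(L)
        stable = False
        while not stable:
            stable = True
            blocked, firing = [], []
            for A, B in L:
                (blocked if A - X else firing).append((A, B))
            for A, B in firing:
                X |= B
                stable = False
            L = blocked                    # only blocked implications are kept for the next round
        return X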




    Figure 1 shows the performance of the three algorithms on Example 1. Algo-
rithm 2 is the fastest algorithm in this case: for a given n, it needs n iterations
of the outer loop—the same as the other two algorithms, but the inner loop of
Algorithm 2 checks exactly one implication at each iteration, whereas the inner
loop of Algorithm 1 checks n − i implications at the ith iteration. Although the
inner loop of Algorithm 3 checks only one implication at the ith iteration, it has
to compute the union of n − i lists in addition.




        Fig. 1. The performance of Algorithms 1–3 for Example 1 (time in seconds for
        1000 tests as a function of n, for Closure, LinClosure, and Wild's Closure).


   Figure 2 shows the performance of the algorithms on Example 2. Here, the
behavior of Algorithm 2 is similar to that of Algorithm 1, but Algorithm 2 takes
more time due to the complicated initialization step.


Fig. 2. The performance of Algorithms 1–3 for Example 2 with implications in L
arranged in the superset-inclusion order of their premises and |X| = 50 (time in
seconds for 1000 tests as a function of n).


    Interestingly, Algorithm 1 works almost twice as fast on Example 2 as it does
on Example 1. This may seem surprising, since it is easy to see that the algorithm
performs essentially the same computations in both cases, the difference being
that the implications of Example 1 have single-element premises. However, this
turns out to be a source of inefficiency: at each iteration of the main loop, all
implications but the last fail to fire, but, for each of them, the algorithm checks
if their premises are included in the set X. Generally, when A 6⊆ X, this can
be established easier if A is large, for, in this case, A is likely to contain more
elements outside X. This effect is reinforced by the implementation of sets as
bit strings: roughly speaking, to verify that {i} ⊈ {1}, it is necessary to check
all bits up to {i}, whereas {i | 0 < i < k} ⊈ {k + 1} can be established by checking
only one bit (assuming that bits are checked from left to right). Alternative
data structures for set implementation might have less dramatic consequences
for performance in this setting. On the other hand, the example shows that
performance may be affected by issues not so obviously related to the structure
of the algorithm, thus suggesting additional paths to obtain optimal behavior
(e.g., by rearranging attributes or otherwise preprocessing the input data).
    We have experimented with computing closures using the Duquenne–Guigues
bases of formal contexts as input implication sets. Table 1 shows the results for
randomly generated contexts. The first two columns indicate the size of the at-
tribute set and the number of implications, respectively. The remaining three
columns record the time (in seconds) for computing the closures of 1000 ran-
domly generated subsets of M by each of the three algorithms. Table 3 presents
similar results for datasets taken from the UCI repository [5] and, if necessary,
transformed into formal contexts using FCA scaling [9].1 The contexts are de-
scribed in Table 2, where the last four columns correspond to the number of
objects, number of attributes, number of intents, and number of pseudo-intents
(i.e., the size of the canonical basis) of the context named in the first column.


Table 1. Performance on randomly generated tests (time in seconds per 1000 closures)

                                               Algorithm
                          |M |   |L|       1         2       3
                           30 557      0.0051 0.2593 0.0590
                           50 1115     0.0118 0.5926 0.1502
                          100 380      0.0055 0.2887 0.0900
                          100 546      0.0086 0.4229 0.1350
                          100 2269     0.0334 1.5742 0.5023
                          100 3893     0.0562 2.6186 0.8380
                          100 7994     0.1134 5.3768 1.7152
                          100 8136     0.1159 5.6611 1.8412




                    Table 2. Contexts obtained from UCI datasets


                             Context    |G| |M | # intents # pseudo-intents
                                 Zoo 101 28            379             141
               Postoperative Patient    90 26         2378             619
                Congressional Voting 435 18          10644             849
                             SPECT 267 23            21550            2169
                       Breast Cancer 286 43           9918            3354
                          Solar Flare 1389 49        28742            3382
             Wisconsin Breast Cancer 699 91           9824           10666




    In these experiments, Algorithm 1 was the fastest and Algorithm 2 was the
slowest, even though it has the best asymptotic complexity. This can be partly
explained by the large overhead of the initialization step (setting up counters
and implication lists). Therefore, these results can be used as a reference only
when the task is to compute one closure for a given set of implications. When
1
     The breast cancer domain was obtained from the University Medical Centre, Insti-
     tute of Oncology, Ljubljana, Yugoslavia (now, Slovenia). Thanks go to M. Zwitter
     and M. Soklic for providing the data.
a large number of closures must be computed with respect to the same set of
implications, Algorithms 2 and 3 may be more appropriate.


Table 3. Performance on the canonical bases of contexts from Table 2 (time in seconds
per 1000 closures)

                                                   Algorithm
                                  Context         1       2       3
                                     Zoo     0.0036 0.0905 0.0182
                   Postoperative Patient     0.0054 0.2980 0.0722
                    Congressional Voting     0.0075 0.1505 0.0883
                                 SPECT       0.0251 0.9848 0.2570
                           Breast Cancer     0.0361 1.7912 0.5028
                              Solar Flare    0.0370 2.1165 0.6317
                 Wisconsin Breast Cancer     0.1368 8.4984 2.4730




4   Computing the Basis in the Lectic Order
The best-known algorithm for computing the Duquenne–Guigues basis was de-
veloped by Ganter in [10]. The algorithm is based on the fact that intents and
pseudo-intents of a context taken together form a closure system. This makes it
possible to iteratively generate all intents and pseudo-intents using Next Clo-
sure (see Algorithm 4), a generic algorithm for enumerating closed sets of an
arbitrary closure operator (also proposed in [10]). For every generated pseudo-
intent P , an implication P → P′′ is added to the basis. The intents, which are
also generated, are simply discarded.


Algorithm 4 Next Closure(A, M , L)
Input: A closure operator X ↦ L(X) on M and a subset A ⊆ M .
Output: The lectically next closed set after A.
  for all m ∈ M in reverse order do
       if m ∈ A then
            A := A \ {m}
       else
            B := L(A ∪ {m})
            if B \ A contains no element < m then
                 return B
  return ⊥



   Next Closure takes a closed set as input and outputs the next closed set
according to a particular lectic order, which is a linear extension of the subset-
inclusion order. Assuming a linear order < on attributes in M , we say that a set
A ⊆ M is lectically smaller than a set B ⊆ M if
                          ∃b ∈ B \ A ∀a ∈ A(a < b ⇒ a ∈ B).
   In other words, the lectically larger of two sets is the one containing
the smallest element in which they differ.
Example 3. Let M = {a < b < c < d < e < f }, A = {a, c, e} and B = {a, b, f }.
Then, A is lectically smaller than B, since the first attribute in which they
differ, b, is in B. Note that if we represent sets by bit strings with smaller
attributes corresponding to higher-order bits (in our example, A = 101010 and
B = 110001), the lectic order will match the usual less-than order on binary
numbers.
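    This correspondence with binary numbers suggests a simple way of comparing sets
lectically, sketched below as an illustrative helper (it is not part of the algorithms
discussed in this paper).

    # Encode a subset of an ordered attribute list as an integer whose highest-order
    # bit corresponds to the smallest attribute, then compare the integers.
    def lectic_key(A, M_ordered):
        return sum(1 << (len(M_ordered) - 1 - i)
                   for i, m in enumerate(M_ordered) if m in A)

    M_ordered = ["a", "b", "c", "d", "e", "f"]
    A, B = {"a", "c", "e"}, {"a", "b", "f"}
    print(lectic_key(A, M_ordered) < lectic_key(B, M_ordered))   # True: A is lectically smaller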
    To be able to use Next Closure for iterating over intents and pseudo-
intents, we need access to the corresponding closure operator. This operator,
which we denote by • , is defined via the Duquenne–Guigues basis L as follows.2
For a subset A ⊆ M , put
                  A+ = A ∪ ⋃ {P′′ | P → P′′ ∈ L, P ⊂ A}.

Then, A• = A++···+ , where A•+ = A• ; i.e., • is the transitive closure of + .
    The problem is that L is not available when we start; in fact, this is precisely
what we want to generate. Fortunately, for computing a pseudo-closed set A, it is
sufficient to know only implications with premises that are proper subsets of A.
Generating pseudo-closed sets in the lectic order, which is compatible with the
subset-inclusion order, we ensure that, at each step, we have at hand the required
part of the basis. Therefore, we can use any of the three algorithms from Sect.
3 to compute A• (provided that the implication A• → A′′ has not been added
to L yet). Algorithm 5 uses Next Closure to generate the canonical basis. It
passes Next Closure the part of the basis computed so far; Next Closure
may call any of the Algorithms 1–3 to compute the closure, L(A ∪ {m}), with
respect to this set of implications.
    After Next Closure computes A• , the implication A• → A′′ may be added
to the basis. Algorithm 5 will then pass A• as the input to Next Closure, but
there is some room for optimizations here. Let i be the maximal element of A
and j be the minimal element of A′′ \ A. Consider the following two cases:
j < i: As long as m > i, the set L(A• ∪ {m}) will be rejected by Next Closure,
    since it will contain j. Hence, it makes sense to skip all m > i and continue as
    if A• had been rejected by Next Closure. This optimization has already
    been proposed in [17].
i < j: It can be shown that, in this case, the lectically next intent or pseudo-
    intent after A• is A′′ . Hence, A′′ could be used at the next step instead of
    A• .
      Algorithm 6 takes these considerations into account.
2
     We deliberately use the same letter L for an implication set and the closure operator
     it defines.




Algorithm 5 Canonical Basis(M , (·)′′ )
Input: A closure operator X ↦ X′′ on M , e.g., given by a formal context (G, M, I).
Output: The canonical basis for the closure operator.
  L := ∅
  A := ∅
  while A ≠ M do
       if A ≠ A′′ then
             L := L ∪ {A → A′′ }
       A := Next Closure(A, M, L)
  return L
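    Putting the pieces together, Algorithms 4 and 5 can be sketched in Python as
follows. The sketch assumes that the attributes of M are given as a list in their linear
order and that closure2 implements A ↦ A′′; it reuses the naive closure of Sect. 3 for
L(A ∪ {m}) and is not the optimized implementation evaluated below.

    def closure_naive(X, L):               # Algorithm 1, as sketched in Sect. 3
        X = set(X)
        changed = True
        while changed:
            changed = False
            for A, B in L:
                if A <= X and not (B <= X):
                    X |= B
                    changed = True
        return X

    def next_closure(A, M_ordered, L):     # Algorithm 4 with the closure given by L
        A = set(A)
        for m in reversed(M_ordered):
            if m in A:
                A.discard(m)
            else:
                B = closure_naive(A | {m}, L)
                if all(b >= m for b in B - A):    # B \ A contains no element < m
                    return B
        return None                        # corresponds to returning ⊥

    def canonical_basis(M_ordered, closure2):     # Algorithm 5
        L, A = [], set()
        while A != set(M_ordered):
            A2 = closure2(A)
            if A != A2:
                L.append((frozenset(A), frozenset(A2)))
            A = next_closure(A, M_ordered, L)
        return L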




Algorithm 6 Canonical Basis(M , (·)′′ ), an optimized version
Input: A closure operator X ↦ X′′ on M , e.g., given by a formal context (G, M, I).
Output: The canonical basis for the closure operator.
  L := ∅
  A := ∅
  i := the smallest element of M
  while A ≠ M do
       if A ≠ A′′ then
             L := L ∪ {A → A′′ }
       if A′′ \ A contains no element < i then
             A := A′′
             i := the largest element of M
       else
             A := {m ∈ A | m ≤ i}
       for all j ∈ M with j ≤ i, in reverse order do
             if j ∈ A then
                  A := A \ {j}
             else
                  B := L(A ∪ {j})
                  if B \ A contains no element < j then
                        A := B
                        i := j
                        exit for
  return L


Experimental Comparison

We used Algorithms 5 and 6 for constructing the canonical bases of the contexts
involved in testing the performance of the algorithms from Sect. 3, as well as
the context (M, M, 6=) with |M | = 18, which is special in that every subset of
M is closed (and hence there are no valid implications). Both algorithms have
been tested in conjunction with each of the three procedures for computing
closures (Algorithm 1–3). The results are presented in Table 4 and Fig. 3. It can
be seen that Algorithm 6 indeed improves on the performance of Algorithm 5.
Among the three algorithms computing the closure, the simpler Algorithm 1 is
generally more efficient, even though, in our implementation, we do not perform
the initialization step of Algorithms 2 and 3 from scratch each time we need
to compute a closure of a new set; instead, we reuse the previously constructed
counters and implication lists and update them incrementally with the addition
of each new implication. We prefer to treat these results as preliminary: it still
remains to see whether the asymptotic behavior of LinClosure will give it an
advantage over the other algorithms on larger contexts.


     Table 4. Time (in seconds) for building the canonical bases of artificial contexts

      Context     # intents  # pseudo-intents     5+1     5+2     5+3     6+1     6+2     6+3
  100 × 30, 4           307               557   0.0088  0.0145  0.0119  0.0044  0.0065  0.0059
 10 × 100, 25           129               380   0.0330  0.0365  0.0431  0.0073  0.0150  0.0169
  100 × 50, 4           251              1115   0.0442  0.0549  0.0617  0.0138  0.0152  0.0176
 10 × 100, 50           559               546   0.0542  0.1312  0.1506  0.0382  0.0932  0.0954
 20 × 100, 25           716              2269   0.3814  0.3920  0.7380  0.1219  0.1312  0.2504
 50 × 100, 10           420              3893   1.1354  0.7291  1.6456  0.1640  0.1003  0.2299
 900 × 100, 4          2472              7994   4.6313  2.7893  6.3140  1.5594  0.8980  2.0503
 20 × 100, 50         12394              8136   7.3097  8.1432  14.955  5.1091  6.0182  10.867
  (M, M, ≠)          262144                 0   0.1578  0.3698  0.1936  0.1333  0.2717  0.1656

 (The last six columns give running times of Algorithm 5 or 6 combined with closure Algorithm 1, 2, or 3.)




5      Conclusion

In this paper, we compared the performance of several algorithms computing the
closure of an attribute set with respect to a set of implications. Each of these
algorithms can be used as a (frequently called) subroutine while computing the
Duquenne–Guigues basis of a formal context. We tested them in conjunction
with Ganter’s algorithm and its optimized version.
    In our future work, we plan to extend the comparison to algorithms generat-
ing the Duquenne–Guigues basis in a different (non-lectic) order, in particular, to
incremental [17] and divide-and-conquer [19] approaches, possibly in conjunc-
tion with newer algorithms for computing the closure of a set [16]. In addition,



 Fig. 3. Time (in seconds) for building the canonical bases of contexts from Table 2:
 bar charts comparing the combinations 5+1, 5+2, 5+3, 6+1, 6+2, and 6+3 on the Zoo,
 Postoperative Patient, Congressional Voting, SPECT, Breast Cancer, Solar Flare, and
 Wisconsin Breast Cancer contexts.


we are going to consider algorithms that generate other implication covers: for
example, direct basis [15, 20, 4] or proper basis [18]. They can be used as an inter-
mediate step in the computation of the Duquenne–Guigues basis. If the number
of intents is much larger than the number of pseudo-intents, this two-step ap-
proach may be more efficient than direct generation of the Duquenne–Guigues
basis with Algorithms 5 or 6, which produce all intents as a side effect.


Acknowledgements

The second author was supported by the Academic Fund Program of the Higher
School of Economics (project 10-04-0017) and the Russian Foundation for Basic
Research (grant no. 08-07-92497-NTsNIL a).


References
 1. Armstrong, W.: Dependency structures of data base relationships. Proc. IFIP
    Congress pp. 580–583 (1974)
 2. Babin, M.A., Kuznetsov, S.O.: Recognizing pseudo-intents is coNP-complete. In:
    Kryszkiewicz, M., Obiedkov, S. (eds.) Proceedings of the 7th International Con-
    ference on Concept Lattices and Their Applications. pp. 294–301. University of
    Sevilla, Spain (2010)
 3. Beeri, C., Bernstein, P.: Computational problems related to the design of normal
    form relational schemas. ACM TODS 4(1), 30–59 (March 1979)
 4. Bertet, K., Monjardet, B.: The multiple facets of the canonical direct unit impli-
    cational basis. Theor. Comput. Sci. 411(22-24), 2155–2166 (2010)
 5. Blake, C., Merz, C.: UCI repository of machine learning databases (1998),
    http://archive.ics.uci.edu/ml
 6. Demming, R., Duffy, D.: Introduction to the Boost C++ Libraries. Datasim Edu-
    cation Bv (2010), see http://www.boost.org
 7. Distel, F., Sertkaya, B.: On the complexity of enumerating pseudo-intents. Discrete
    Appl. Math. 159, 450–466 (March 2011)
 8. Ganter, B.: Attribute exploration with background knowledge. Theor. Comput.
    Sci. pp. 215–233 (1999)
 9. Ganter, B., Wille, R.: Formal Concept Analysis: Mathematical Foundations.
    Springer, Berlin (1999)
10. Ganter, B.: Two basic algorithms in concept analysis. Preprint 831, Technische
    Hochschule Darmstadt, Germany (1984)
11. Guigues, J.L., Duquenne, V.: Familles minimales d’implications informatives re-
    sultant d’un tableau de donnees binaires. Math. Sci. Hum. 95(1), 5–18 (1986)
12. Kuznetsov, S., Obiedkov, S.: Comparing performance of algorithms for generating
    concept lattices. Journal of Experimental and Theoretical Artificial Intelligence
    14(2/3), 189–216 (2002)
13. Kuznetsov, S.O., Obiedkov, S.: Some decision and counting problems of the
    Duquenne–Guigues basis of implications. Discrete Appl. Math. 156(11), 1994–2003
    (2008)
14. Maier, D.: The theory of relational databases. Computer software engineering se-
    ries, Computer Science Press (1983)


15. Mannila, H., Räihä, K.J.: The design of relational databases. Addison-Wesley
    Longman Publishing Co., Inc., Boston, MA, USA (1992)
16. Mora, A., Aguilera, G., Enciso, M., Cordero, P., de Guzman, I.P.: A new closure
    algorithm based in logic: SLFD-Closure versus classical closures. Inteligencia Ar-
    tificial, Revista Iberoamericana de IA 10(31), 31–40 (2006)
17. Obiedkov, S., Duquenne, V.: Attribute-incremental construction of the canonical
    implication basis. Annals of Mathematics and Artificial Intelligence 49(1-4), 77–99
    (April 2007)
18. Taouil, R., Bastide, Y.: Computing proper implications. In Proc. ICCS-2001 In-
    ternational Workshop on Concept Lattices-Based Theory, Methods and Tools for
    Knowledge Discovery in Databases pp. 290–303 (2001)
19. Valtchev, P., Duquenne, V.: On the merge of factor canonical bases. In: Medina,
    R., Obiedkov, S. (eds.) ICFCA. Lecture Notes in Computer Science, vol. 4933, pp.
    182–198. Springer (2008)
20. Wild, M.: Computations with finite closure systems and implications. In: Comput-
    ing and Combinatorics. pp. 111–120 (1995)
21. Yevtushenko, S.A.: System of data analysis “Concept Explorer” (in Russian). In:
    Proceedings of the 7th national conference on Artificial Intelligence KII-2000. pp.
    127–134. Russia (2000), http://conexp.sourceforge.net/
               Filtering Machine Translation Results with
              Automatically Constructed Concept Lattices

                              Yılmaz Kılıçaslan and Edip Serdar Güner

                           Trakya University, Department of Computer Engineering,
                                         22100 Edirne, Turkey
                                   {yilmazk, eserdarguner}@trakya.edu.tr



           Abstract. Concept lattices can significantly improve machine translation
           systems when applied as filters to their results. We have developed a rule-based
           machine translator from Turkish to English in a unification-based programming
           paradigm and supplemented it with an automatically constructed concept
           lattice. The test results achieved by applying this translation system to a Turkish
           child story reveal that lattices used as filters on translation results have a
           promising potential to improve machine translation. We have compared our
           system with Google Translate on the same data. The comparison suggests that a
           rule-based system can even compete with this widely used statistical machine
           translation system.
           Keywords: Concept Lattices, Rule-based Machine Translation, Evaluation of
           MT systems.



    1 Introduction

    Paradigms of machine translation (MT) can be classified into two major categories
    depending on their focus: result-oriented paradigms and process-oriented ones.
    Statistical MT focuses on the result of the translation, not the translation process
    itself. In this paradigm, translations are generated on the basis of statistical models
    whose parameters are derived from the analysis of bilingual text corpora. Rule-based
    MT, a more classical paradigm, focuses on the selection of representations to be used
    and steps to be performed during the translation process.
        It is the rule-based paradigm that will be the concern of this paper. We argue for
    the viability of a rule-based translation model where a concept lattice functions as a
    filter for its results.
        In what follows, we first introduce the classical models for doing rule-based MT,
    illustrating particular problematic cases with translation pairs between Turkish and
    English (cf. Section 2). Then, we briefly introduce the basic notions of Formal
    Concept Analysis (FCA) and touch upon the question of how lattices built using FCA
    can serve as a bridge between two languages (cf. Section 3). This is followed by the
    presentation of our translation system (cf. Section 4). Subsequently, we report on and
    evaluate several experiments which we have performed by feeding our translation
    system with a Turkish child story text (cf. Section 5). The discussion ends with some
    remarks and with a summary of the paper (cf. Section 6).



© 2011 by the paper authors. CLA 2011, pp. 59–73. Copying permitted only for private
  and academic purposes. Volume published and copyrighted by its editors. Local
  Proceedings in ISBN 978–2–905267–78–8, Inria NGE/LORIA, Nancy, France.




 2 Models for Rule-Based Translation

 2.1 Direct Translation

 The most straightforward MT strategy is the so-called direct translation. Basically, the
 strategy is to translate each word into its target language counterpart while proceeding
 word-by-word through the source language text or speech. If the only difference
 between two languages were due to their lexical choices, this approach could be a
 very easy way of producing high quality translation. However, languages differ from
 each other not only lexically but also structurally.
    In fact, the direct translation strategy works very well only for very simple cases
 like the following:
    (1) Turkish:                                 Direct Translation to English:
          Köpek-ler havlar-lar.                Dogs bark.
          dog-pl    bark-3pl

 In this example, the direct translation strategy provides us with a perfect translation of
 the Turkish sentence (interpreted as a kind-level statement about dogs). But, consider
 now the following example:
    (2) Turkish:                                Direct Translation to English:
          Kadın onu tanıyor.                 Woman him/her/it is/are knowing.
          woman s/he-acc know-prog
 Supposing that the referent of the pronoun is a male person, the expected translation
 for the given Turkish sentence would be the following:
    (3) Correct Translation:
          The woman knows him.

    The direct translation approach fails in this example in the following respects: First,
 the translation results in a subject-object-verb (SOV) ordering, which does not
 comply with the canonical SVO ordering in English. SOV is the basic word order in
 Turkish. Second, the subject does not have the required definite article in the
 translation. The reason for this is another typological difference between the two
 languages: Turkish lacks a definite article. Third, the word-by-word translation leaves
 the English auxiliary verb ambiguous with respect to number, as the Turkish verb
 does not carry the number information. Fourth, the verb know is encoded in the
 progressive aspect in the translation, which is unacceptable as it denotes a mental
 state. This anomaly is the result of directly translating the Turkish continuous suffix
 –yor to the English suffix –ing. Fifth, the pronoun is left ambiguous with respect to
 gender in the translation, as Turkish pronouns do not bear this information.




2.2 Transfer Approach

2.2.1 Syntactic Transfer

As Jurafsky and Martin [6] point out, examples like those above suggest that the
direct approach to MT is too focused on individual words and that we need to add
phrasal and structural knowledge into our MT models to achieve better results. It is
through the transfer approach that a rule-based strategy incorporates the structural
knowledge into the MT model. In this approach, MT involves three phases: analysis,
transfer, and generation. In the analysis phase, the source language text is parsed into
a syntactic and/or semantic structure. In the transfer phase, the structure of the source
language is transformed to a structure of the target language. The generation phase
takes this latter structure as input and turns it to an actual text of the target language.
   Let us first see how the transfer technique can make use of syntactic knowledge to
improve the translation result of the example discussed above. Assuming a simple
syntactic paradigm, the input sentence can be parsed into the following structure:
(4)




    Once the sentence has been parsed, the resulting tree will undergo a syntactic
transfer operation to resemble the target parse tree and this will be followed by a
lexical transfer operation to generate the target text:
(5)




  The syntactic transfer exploits the following facts about English: a singular count
noun must have a determiner and the subject agrees in number and person with the
verb. Collecting the leaves of the target parse tree, we get the following output:
(6)   Translation via Syntactic Transfer:
          The woman is knowing him/her/it.

This output is free from the first three defects noted with the direct translation.
However, the problem of encoding the mental state verb in progressive aspect and the




 gender ambiguity of the pronoun still remain to be resolved. These require meaning-
 related knowledge to be incorporated into the MT model.


 2.2.2 Semantic Transfer

 The context-independent aspect of meaning is called semantic meaning. A crucial
 component of the semantic meaning of a natural language sentence is its lexical
 aspect, which determines whether the situation that the sentence describes is a
 (punctual) event, a process or a state. This information is argued to be inherently
 encoded in the verb. Obviously, knowing is a mental state and, hence, cannot be
 realized in the progressive aspect.
        We can apply a shallow semantic analysis to our previously obtained syntactic
 structure, which will give us a tree structure enriched with aspectual information, and
 thereby achieve a more satisfactory transfer:
  (7)




 The resulting translation is the following:
 (8)   Translation via Semantic Transfer:
           The woman knows him/her/it.


 2.3 Interlingua Approach

 There are two problems with the transfer model: it requires contrastive knowledge
 about languages and it requires such knowledge for every pair of languages. If the
 meaning of the input can be extracted and encoded in a language-independent form
 and the output can, in turn, be generated out of this form, there will be no need for any
 kind of contrastive knowledge. A language-independent meaning representation
 language to be used in such a scheme is usually referred to as an interlingua.
    A common way to visualize the three approaches to rule-based MT is with the
 Vauquois triangle shown below (adapted from [6]):




                                  Fig. 1. The Vauquois triangle.
  As Jurafsky and Martin point out:
    [t]he triangle shows the increasing depth of analysis required (on both the
    analysis and generation end) as we move from the direct approach through
    transfer approaches, to interlingual approaches. In addition, it shows the
    decreasing amount of transfer knowledge needed as we move up the triangle,
    from huge amounts of transfer at the direct level (almost all knowledge is
    transfer knowledge for each word) through transfer (transfer rules only for
    parse trees or thematic roles) through interlingua (no specific transfer
    knowledge). (p. 867)


3 Lattice-Based Interlingua Strategy

A question left open above is that of what kind of representation scheme can be used
as an interlingua. There are many possible alternatives such as predicate calculus,
Minimal Recursion Semantics or an event-based representation. Another interesting
possibility is to use lattices built using Formal Concept Analysis (FCA) as meaning
representations to this effect.
  FCA, developed by Ganter & Wille [5], assumes that data from an application are
given by a formal context, a triple (G, M, I) consisting of two sets G and M and a so-
called incidence relation I between these sets. The elements of G are called the objects
and the elements of M are called the attributes. The relation I holds between g and m,
(g, m) ∈ I, if and only if the object g has the attribute m. A formal context induces two
operators, both of which are usually denoted by ′. One of these operators maps each set of
objects A to the set of attributes A′ which these objects have in common. The other
operator maps each set of attributes B to the set of objects B′ which satisfy these
attributes. FCA is in fact an attempt to give a formal definition of the notion of a
‘concept’. A formal concept of the context (G, M, I) is a pair (A, B) with A ⊆ G and
B ⊆ M such that A′ = B and B′ = A. A is called the extent and B the intent of the
concept (A, B). The set of all concepts of the context (G, M, I) is denoted by C(G, M, I).
This set is ordered by a subconcept–superconcept relation, which is a partial order relation
denoted by ≤. If (A1, B1) and (A2, B2) are concepts in C(G, M, I), the former is said to




 be a subconcept of the latter (or, the latter a superconcept of the former), i.e., (A1, B1)
 ≤ (A2, B2), if and only if A1 ⊆ A2 (which is equivalent to B1 ⊇ B2). The ordered set
 C(G, M, I; ≤) is called the concept lattice (or Galois lattice) of the context (G, M, I).
 A concept lattice can be drawn as a (Hasse) diagram in which concepts are
 represented by nodes interconnected by lines going down from superconcept nodes to
 subconcept ones.
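As a small illustration of these definitions (not part of the formal development), the
derivation operators and a brute-force enumeration of all formal concepts can be coded in
a few lines of Python; the context below, with invented objects and attributes, is merely a
toy example:

    from itertools import chain, combinations

    # Toy formal context: objects -> the attributes they have (invented data).
    context = {
        "factory":        {"building", "commercial"},
        "office_block":   {"building", "commercial", "small"},
        "detached_house": {"building", "residential", "small"},
    }
    attributes = set().union(*context.values())

    def extent(B):
        # B' : all objects having every attribute in B.
        return {g for g, attrs in context.items() if B <= attrs}

    def intent(A):
        # A' : all attributes shared by every object in A.
        return attributes if not A else set.intersection(*(context[g] for g in A))

    def concepts():
        # Brute force over attribute subsets; fine for toy contexts, not for real ones.
        found = []
        for B in chain.from_iterable(combinations(sorted(attributes), r)
                                     for r in range(len(attributes) + 1)):
            A = extent(set(B))
            B_closed = intent(A)
            if (A, B_closed) not in found:
                found.append((A, B_closed))
        return found

    for A, B in concepts():
        print(sorted(A), sorted(B))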
   Priss [15], rewording an idea first mentioned by Kipke & Wille [8], suggests that
 once linguistic databases are formalized as concept lattices, the lattices can serve as
 an interlingua. She explains how a concept lattice can serve as a bridge between two
 languages with the aid of the figure below (taken from [13]):




                              Fig. 2. – A concept lattice as an interlingua.


     [This figure] shows separate concept lattices for English and German words for
     “building”. The main difference between English and German is that in English
     “house” only applies to small residential buildings (denoted by letter “H”),
     whereas in German even small office buildings (denoted by letter “O”) and larger
     residential buildings can be called “Haus”. Only factories would not normally be
     called “Haus” in German. The lattice in the top of the figure constitutes an
     information channel in the sense of Barwise & Seligman [2] between the German
     and the English concept lattice. ([15] p. 158)

   We consider Priss’s approach a promising avenue for interlingua-based translation
  strategies. We suggest that this approach can work not only for isolated words but
  even for whole text fragments. In what follows, we will sketch out a strategy with
 interlingual concept lattices serving as filters for refining translation results. The
 strategy proceeds as follows: 1) Compile a concept lattice from a data source like
 WordNet. 2) Link the nodes of the lattice to their possibly corresponding expressions
 in the source and target language. 3) Translate the input text into the target language
 with no consideration of the pragmatic aspects of its meaning. 4) Integrate the
 concepts derived from the input text into the concept lattice. The main motivation
 behind this strategy is to refine the translation results to a certain extent by means of
 pragmatic knowledge structured as formal contexts.




4 A Translation System with Interlingual Concept Lattices

4.1 A Concept Lattice Generator

Concept lattices to be used as machine translation filters should contain concept nodes
associated with both functional and substantive words. All languages have a finite
number of functional words. Therefore, a manual construction of the lattice fragments
that would contain them would be reasonable. However, manually constructing a
concept lattice for lexical words would have considerable drawbacks such as the
following:

     •    It is labor intensive.
     •    It is prone to yielding errors which are difficult to detect automatically.
     •    It generates incomplete lists that are costly to extend to cover missing
          information.
     •    It is not easy to adapt to changes and domain-specific needs.

Taking these potential problems into consideration, we have developed a tool for
generating concept lattices for lexical words automatically. As this is an FCA
application, it is crucial to decide on which formal context to use before delving into its
implementation details.
  Priss & Old [16] propose to construct concept neighborhoods in WordNet with a
formal context where the formal objects are the words of the synsets belonging to all
senses of a word, the formal attributes are the words of the hypernymic synsets and
the incidence relation is the semantic relation between the synsets and their
hypernymic synsets. The neighborhood lattice of a word in WordNet consists of all
words that share some senses with that word.1 Below is the neighborhood lattice their
method yields for the word volume:




                   Fig. 3. – Priss and Old’s neighborhood lattice for the word volume.



1 As lattices often grow very rapidly to a size too large to be visualized, Wille [18] describes a

  method for constructing smaller, so-called “neighborhood” lattices.




 Consider the bottom node. The concept represented by this node is not a naturally
 occurring one. Obviously, the adopted formal context causes two distinct natural
 concepts to collapse into one single formal concept here. The reason is simply that
 WordNet employs one single word, i.e., volume, for two distinct senses, i.e.,
 publication and amount. This could leave a translation attempt with the task of
 disambiguating this word. In fact, WordNet marks each sense with a single so-called
 synset number.
    When constructing concept lattices in WordNet, we suggest two amendments to the
 formal context adopted by Priss and Old. First, the formal objects are to be the synset
 numbers. Second, the formal attributes are to include also some information compiled
 from the glosses of the words. The first change allows us to distinguish between the
 two senses of the word volume, as shown in Fig. 4a. But, we are still far from
 resolving all ambiguities concerning this word, as indicated by the presence of two
 objects in the leftmost node. The problem is that the hypernymic attributes are not
 sufficiently informative to differentiate the 3-D space sense of the word volume from
 its relative amount sense. This extra information resides in the glosses of the word
 and once encoded as attributes it evokes the required effect, as shown in Fig. 4b.




     Fig. 4a. – A neighborhood lattice with the   Fig. 4b. – A more fine-grained neighborhood
     objects being synset numbers.                lattice with the objects being synset numbers.


 Each gloss, which is most likely a noun phrase, is parsed by means of a shift-reduce
 parser to extract a set of attributes. Having collected the objects (i.e. the synset
 numbers) and the associated attributes, the FCA algorithm that comes with the
 FCALGS library [9] is used for deriving a lattice-based ontology from that collection.
 FCALGS employs a parallel and recursive algorithm. Apart from its being parallel, it
 is very similar to Kuznetsov’s [10] Close-by-One algorithm.
    However, even the lattice in Fig. 4b is still defective in at least one respect. The
 names of the objects denoted are lost. To remedy this problem, we suggest encoding
 the objects as tuples of synset numbers and sets of names, as illustrated below.




                 Fig. 5. – A neighborhood lattice including the names of the objects.

Another point to note is that the name of a synset serves as the attribute of a
subconcept. For example, ‘entity’ is the name of the topmost synset. But, as
everything is an entity, any subconcept must treat it as an element of its set of
attributes.


4.2 A Sense Translator

Each WordNet node is associated with a set of synonymous English words, which is
referred to as its synset. Each synset, in effect, denotes a sense in English. Thus, one
task to accomplish is to translate synsets into Turkish to the furthest possible extent.
We should, of course, keep in mind that some synsets (i.e. some senses encoded in
English) may not have a counterpart in the target language. To find the Turkish
translation of a particular synset, the Sense Translator first downloads a set of relevant
articles via the links given in the disambiguation pages Wikipedia provides for the
words in this set. It searches for the hypernyms of the synset in these articles. It
assigns each article a score in accordance with the sum of the weighted points of the
hypernyms found in this article. More specifically, if a synset has N hypernyms, the
K-th hypernym starting from the top is assigned Weight_K = K/N. Let Frequency_K be the
number of occurrences of an item in a given article; then the score of the article is
calculated as follows:
    Article Score = Weight_1 * Frequency_1 + ... + Weight_N * Frequency_N.        (1)
If the article with the highest score has a link to a Turkish article, the title of the
article will be the translation of the English word under examination. Otherwise, the
word will be left unpaired with a Turkish counterpart. Figure 6 visualizes how the
word cat in WordNet is translated into its Turkish counterpart, kedi, via Wikipedia.
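The scoring step can be sketched as follows; the Python function, the simple word-counting
heuristic and the text snippet are illustrative assumptions rather than the actual
implementation of the Sense Translator:

    def article_score(article_text, hypernyms):
        # hypernyms are assumed to be ordered from the top of the chain downwards;
        # the K-th hypernym (1-based) gets weight K/N, so more specific hypernyms
        # count more, and the score is the weighted sum of their frequencies (formula (1)).
        words = article_text.lower().split()
        n = len(hypernyms)
        score = 0.0
        for k, h in enumerate(hypernyms, start=1):
            weight = k / n
            frequency = words.count(h.lower())
            score += weight * frequency
        return score

    # Hypothetical example: scoring a snippet for one sense of the word "cat".
    snippet = "the cat is a small domesticated carnivorous mammal ..."
    print(article_score(snippet, ["entity", "animal", "mammal", "carnivore", "feline"]))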




                  Fig. 6. - Translating the word cat into Turkish via Wikipedia.


    The Turkish counterparts will be added next to the English names, as shown
 below:




     Fig. 7. - A neighborhood lattice including the Turkish counterparts of the English names.



 4.3 A Rule-Based Machine Translator

 We have designed a transfer-based architecture for Turkish-English translation and
 implemented the translator in SWI-Prolog, an open-source implementation of the
 Prolog programming language. Below is a figure representing the main modules
 of the translator:




              Fig. 8. - The main modules of the rule-based machine translator.
    The word list extracted by the Preprocessor is used as an input to the Analysis
Module. We have devised a shift-reduce parser in the analysis phase for building up
the grammatical structure of expressions. Briefly, a shift-reduce parser uses a bottom-
up strategy with an ultimate goal of building trees rooted with a start symbol [1]. The
Generation Module first rearranges the constituents using transformation rules.
Afterwards, all the structures are lexically transferred into English using a bilingual
dictionary.

4.4 Filtering Translation Results with the Concept Lattice

Let us turn to our exemplary sentence introduced in (2) (i.e. Kadın onu tanıyor).
Failing to take the context of the sentence into account, the rule-based translator
generates the result in (8) (i.e. The woman knows him/her/it), where the pronoun is
left ambiguous with respect to gender.
  Our claim is that we can resolve such ambiguities using FCA and thereby refine our
translations. To this effect, we propose to generate transient formal concepts for noun
phrases. We make the following assumptions. Basically, personal pronouns,
determiners and proper names introduce formal objects whereas adjectives and nouns
encode formal attributes.
  Suppose that our sentence is preceded by (the Turkish paraphrase of) a sentence like
‘A man has arrived’. The indefinite determiner evokes a new formal object, say obj1.
As the source text is in Turkish, all attributes will be Turkish words. The Turkish
counterpart of the word man is adam. Thus, the transient concept for the subject of
this sentence will be ({obj1}, {adam}). The task is now to embed this transient
concept into the big permanent concept lattice. To do this, a node where the Turkish
counterpart of the synset name is ‘adam’ is searched for. Immediately below this node
is placed a new node with its set of objects being {obj1} and with no additional
attributes. As this is a physical object, the subconcept of this new node has to be the




 lowest one. As for the second sentence, the NP kadın (the woman) will be associated
 with the transient concept ({X},{kadın}) and the pronoun onu (him/her/it) with the
 transient concept ({Y},{entity}). X and Y are parameters to be anchored to particular
 formal objects. In other words, they are anaphoric. It seems plausible to assert that the
 attributes of an anaphoric object must constitute a (generally proper) subset or
 hypernym set of the attributes of the object serving as the antecedent. Assume that X
 is somehow anaphorically linked to an object obj2. Now, there are two candidate
 antecedents for Y. The woman, or the object obj2, is barred from being the antecedent of
 the pronoun by a locality principle like the one stated in Chomsky's [3] Binding Theory:
 roughly stated, a pronoun and its antecedent cannot occur in the same clause. There
 remains one single candidate antecedent, obj1. As its attribute set is a hyponym set of
 {entity}, it can be selected as a legitimate antecedent. The concept node created for
 the man will also be the one denoted by the pronoun with Y being instantiated with
 obj1. In the concept lattice constructed in WordNet, the concept named as ‘man’
 includes ‘male person’ in its set of attributes. Hence, the ambiguity is resolved and the
 pronoun translates into English as ‘him’.
    It is worth noting that in case there is more than one candidate antecedent, an
 anaphora resolution technique, especially a statistical one, can be employed to pick
 out the candidate most likely to be the antecedent. The interested reader is referred to
 Mitkov [12] for a survey of anaphora resolution approaches in general and to
 Kılıçaslan et al [7] for anaphora resolution in Turkish.
    The gender disambiguation process can also be carried out for common nouns.
 Consider the following fragment taken from a child story:
 (9)




 Turkish leaves not only pronouns but also many other words ambiguous with respect
 to the gender feature. The word ‘kardeş’ in this example is ambiguous between the
 translations sister and brother. This ambiguity will be resolved in favor of the former
 interpretation in a way similar to the disambiguation process sketched out for pronouns
 above.
   In fact, the problem of sense disambiguation is a kind of specification problem.
 Therefore, it cannot be confined to gender disambiguation. For example, given that
 we have somehow managed to compile the attributes listed in the column on the left-
 hand side, our FCA-based system generates the translations listed on the right-hand
 side:




         zehirli, diş ‘poisonous, tooth’               fang
         zehirli, mantar ‘poisonous, mushroom’         toadstool
         sivri, diş ‘sharp, tooth’                     fang
         arka, koltuk ‘rear, seat’                     rumble
         acemi, asker ‘inexperienced, soldier’         recruit
It will, of course, be interesting to try to solve other kinds of translation problems with
FCA-based techniques. We leave this task to future research.


5 Results and Evaluation

In the early years of MT, the quality of an MT system was determined by human
judgment. Though specially trained for the purpose, human judges are prone to
subjectivity. Besides, manual evaluation is almost always more costly and time-
consuming. Some automated evaluation metrics have been developed in order to
overcome such problems. Among these are BLEU, NIST, WER and PER.
    BLEU [14] and NIST [4] are rather specialized metrics. They are computed by
considering the fraction of output n-grams that also appear in a set of human
translations (n-gram precision). This allows the acknowledgment of a greater diversity
of acceptable MT results.
    As for WER (Word Error Rate) and PER (Position-independent Word Error Rate),
they are more general purpose measures and they rely on direct correspondence
between the machine translation and a single human-produced reference. WER is
based on the Levenshtein distance [11] which is the edit distance between a reference
translation and its automatic translation, normalized by the length of the reference
translation. This metric is formulated as:

                            WER = (S + D + I) / N                              (2)
where N is the total number of words in the reference translation, S is the number of
substituted words in the automatic translation, D is the number of words deleted from
the automatic translation and I is the number of words inserted in the reference not
appearing in the automatic translation.
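For concreteness, formula (2) can be computed with a standard dynamic-programming
implementation of the Levenshtein distance over words; the sketch below and its toy
sentences are only illustrative:

    def wer(reference, hypothesis):
        # Word Error Rate: word-level Levenshtein distance, normalized by
        # the length of the reference (formula (2)).
        ref, hyp = reference.split(), hypothesis.split()
        # dp[i][j] = edit distance between ref[:i] and hyp[:j]
        dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            dp[i][0] = i
        for j in range(len(hyp) + 1):
            dp[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                               dp[i][j - 1] + 1,         # insertion
                               dp[i - 1][j - 1] + cost)  # substitution
        return dp[len(ref)][len(hyp)] / len(ref)

    print(wer("the woman knows him", "woman is knowing him"))   # 0.75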
   While WER requires exactly the same order of the words in the automatic translation
and the reference, PER neglects word order completely [17]. It measures the
difference in the count of the words occurring in the automatic and reference translations.
The resulting number is divided by the number of words in the reference. It is worth
noting that PER is technically not a distance measure as it uses a position-independent
Levenshtein distance where the distance between a sentence and one of its
permutations is always taken to be zero.
   We used WER to evaluate the performance of our MT system. This is probably the
metric most commonly used for similar purposes. As we employed a single human-
produced reference, this metric suits our evaluation setup well. We fed our system




 with a Turkish child story involving 91 sentences (970 words).2 We post-edited the
 resulting translation in order to generate a reference. When the necessary calculations
 were done in accordance with formula (2), the WER turned out to be 38%.
    The next step was to see the extent to which the performance of our MT system
 could be improved using concept lattices as filters for the raw results. To this effect,
 we devised several concept lattices like that in Fig. 3 and filtered the lexical
 constituents of each automatic translation with them.
    A considerable reduction in error rate is observed in our system supplemented with
 concept lattices: the WER score drops to around 30%.
    One question that comes to mind at this point is that of whether the improvement
 achieved is statistically significant or not. To get an answer we had recourse to the
 Wilcoxon Signed-Rank test. This test is used to analyze matched-pair numeric data,
 looking at the difference between the two values in each matched pair. When applied
 to the WER scores of the non-filtered and filtered translation results, the test shows
 that the difference is statistically significant (p < 0.005).
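As an illustration of such a matched-pair analysis (the per-sentence scores below are
invented and SciPy is merely one possible tool), the test can be run as follows:

    from scipy.stats import wilcoxon

    # Invented per-sentence WER scores for the unfiltered and filtered outputs.
    wer_unfiltered = [0.42, 0.35, 0.50, 0.38, 0.44, 0.31, 0.47, 0.40]
    wer_filtered   = [0.33, 0.30, 0.41, 0.35, 0.36, 0.28, 0.39, 0.30]

    statistic, p_value = wilcoxon(wer_unfiltered, wer_filtered)
    print(statistic, p_value)   # a small p-value indicates a significant paired difference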
    Another question is that of whether the results are practically satisfactory. To get
 some insight into this question, we should employ a baseline system for a comparison
 on usability. Google Translate, a widely used statistical MT system, can serve this
 purpose. The WER score obtained employing Google Translate on our data is 34%.
 Recalling that the WER score of our system supplemented with concept lattices is
 30%, we seem to be entitled to argue for the viability of rule-based MT systems. Of
 course, we need to make this claim tentatively, since the size of the data on which the
 comparisons are made is relatively small. However, it should also be noted that we
 have employed a limited number of concept lattices of considerably small sizes. There
 is no doubt that increasing the number and size of the filtering lattices would improve
 the performance of our MT system.
    More importantly, we do not primarily have an NLP concern in this work. Rather,
 we would like the results to be evaluated from a computational linguistics perspective.
 Everything aside, the results show that even a toy lattice-based ontology can yield a
 statistically significant improvement for an MT system.


 6 Conclusion

    In this paper, we have illustrated some translation problems caused by some
 typological divergences between Turkish and English using a particular example. We
 have gone through the direct translation, syntactic transfer and semantic transfer
 phases of the rule-based translation model to see what problem is dealt with in what
 phase. We have seen that a context-dependent pragmatic process is necessary to get to
 a satisfactory result. Concept lattices appear to be very efficient tools for
 accomplishing this pragmatic disambiguation task. Supplementing a rule-based MT
 system with concept lattices not only yields a statistically significant improvement in
 the results of the system but also enables it to compete with a statistical MT system
 like Google Translate.


 2 This is the story where the example in (9) comes from.




References

1. Aho, A.V., Ullman, J.D.: The Theory of Parsing, Translation, and Compiling, Vol. 1.,
    Prentice Hall (1972)
2. Barwise J., Seligman, J.: Information Flow. The Logic of Distributed Systems. Cambridge
    University Press (1997)
3. Chomsky, N.: Lectures on Government and Binding, Foris, Dordrecht (1981).
4. Doddington, G.: “Automatic Evaluation of Machine Translation Quality Using N-gram Co-
    occurrence Statistics”. In Proceedings of HLT 2002 (2nd Conference on Human Language
    Technology). San Diego, California, 128-132 (2002)
5. Ganter, B., Wille, R.: Formale Begriffsanalyse: Mathematische Grundlagen. Berlin:
    Springer (1996)
6. Jurafsky, D., Martin, J. H.: Speech and Language Processing, 2nd Edition, Prentice Hall
    (2009)
7. Kılıçaslan, Y., Güner, E. S., Yıldırım, S.: Learning-based pronoun resolution for Turkish
   with a comparative evaluation, Computer Speech & Language, Volume 23, Issue 3,
   p. 311-331 (2009)
8. Kipke, U., Wille, R.: Formale Begriffsanalyse erläutert an einem Wortfeld. LDV–Forum, 5
    (1987)
9. Krajca, P., Outrata, J., Vychodil, V.: Parallel Recursive Algorithm for FCA. In: Belohlavek
    R., Kuznetsov S. O. (Eds.): Proc. CLA 2008, CEUR WS, 433, 71–82 (2008)
10. Kuznetsov, S.: Learning of Simple Conceptual Graphs from Positive and Negative
    Examples. PKDD 1999, pp. 384–391 (1999)
11. Levenshtein, V. I.: "Binary codes capable of correcting deletions, insertions, and reversals,"
    Tech. Rep. 8. (1966)
12. Mitkov, R.: Anaphora Resolution: The State of the Art. Technical Report, University of
     Wolverhampton (1999)
13. Old, L. J., Priss, U.: Metaphor and Information Flow. In Proceedings of the 12th Midwest
    Artificial Intelligence and Cognitive Science Conference, pp. 99-104 (2001)
14. Papineni, K., Roukos, S., Ward, T., Zhu, W. J.: "BLEU: a method for automatic evaluation
    of machine translation" in ACL-2002: 40th Annual meeting of the Association for
    Computational Linguistics pp. 311–318 (2002)
15. Priss, U.: Linguistic Applications of Formal Concept Analysis, Ganter; Stumme; Wille
    (eds.), Formal Concept Analysis, Foundations and Applications, Springer Verlag, LNAI
    3626, pp. 149-160 (2005)
16. Priss, U., Old, L. J.: "Concept Neighbourhoods in Lexical Databases.", In Proceedings of
   the 8th International Conference on Formal Concept Analysis, ICFCA'10, Springer
   Verlag, LNCS 5986, p. 283-295 (2010)
17. Tillmann C., Vogel, S., Ney, H., Zubiaga A., Sawaf, H.: Accelerated DP based search for
    statistical translation. In European Conf. on Speech Communication and Technology, pages
    2667–2670, Rhodes, Greece, September (1997)
18. Wille, R.: The Formalization of Roget’s International Thesaurus. Unpublished manuscript
   (1993)
      Concept lattices in fuzzy relation equations*

                        Juan Carlos Díaz and Jesús Medina**

                   Department of Mathematics, University of Cádiz
                   Email: {juancarlos.diaz,jesus.medina}@uca.es



        Abstract. Fuzzy relation equations are used to investigate theoretical
        and applicational aspects of fuzzy set theory, e.g., approximate reasoning,
        time series forecasting, decision making and fuzzy control. This paper
        relates these equations to a particular kind of concept lattice.


  1    Introduction
  Recently, multi-adjoint property-oriented concept lattices have been introduced
  in [16] as a generalization of property-oriented concept lattices [10,11] to a fuzzy
  environment. These concept lattices offer a new point of view on rough set the-
  ory [23] that considers two different sets: the set of objects and the set of at-
  tributes.
      On the other hand, fuzzy relation equations, introduced by E. Sanchez [28],
  are associated with the composition of fuzzy relations and have been used to in-
  vestigate theoretical and applicational aspects of fuzzy set theory [22], e.g., ap-
  proximate reasoning, time series forecasting, decision making and fuzzy control,
  and as an appropriate tool for handling and modeling nonprobabilistic forms of
  uncertainty. Many papers have investigated the solvability of (systems of) fuzzy
  relation equations, e.g., [1, 8, 9, 25, 26].
      In this paper, multi-adjoint relation equations are presented as a general-
  ization of fuzzy relation equations [24,28]. This general environment inherits
  the properties of the multi-adjoint philosophy; consequently, several conjunc-
  tors and residuated implications defined on general carriers, such as lattice struc-
  tures, can be used, which provides more flexibility in relating the variables
  considered in the system.
      Moreover, multi-adjoint property-oriented concept lattices and systems of
  multi-adjoint relation equations have been related in order to obtain results that
  ensure the existence of solutions in these systems. These definitions and results
  are illustrated by a toy example to improve the readability and comprehension
  of the paper.
      Among all concept lattice frameworks, we have related the multi-adjoint
  property-oriented concept lattices to the systems of multi-adjoint relation equa-
  tions, e.g., the extension and intension operators of this concept lattice can be
  *
    Partially supported by the Spanish Science Ministry TIN2009-14562-C05-03 and by
    Junta de Andalucía project P09-FQM-5233.
 **
    Corresponding author.


© 2011 by the paper authors. CLA 2011, pp. 75–86. Copying permitted only for private
  and academic purposes. Volume published and copyrighted by its editors. Local
  Proceedings in ISBN 978–2–905267–78–8, Inria NGE/LORIA, Nancy, France.


used to represent multi-adjoint relation equations, and, as a result, the solu-
tions of these systems of relation equations can be related to the concepts of the
corresponding concept lattice.
    The most important consequence is that this relation allows the properties
given, e.g., in [2–4,12,14,17,18,27] to be applied to obtain many properties
of these systems. Indeed, the algorithms presented, e.g., in [5, 6, 15] can be
considered as providing solutions for these systems.
    The plan of this paper is the following: in Section 2 we will recall the multi-
adjoint property-oriented concept lattices as well as the basic operators used and
some properties; later, in Section 3, an example will be introduced to motivate
the multi-adjoint relation equations. Once these equations have been presented,
in Section 4 the multi-adjoint property-oriented concept lattices and the systems
of multi-adjoint relation equations will be related in order to obtain results which
ensure the existence of solutions in these systems; the paper ends with some
conclusions and prospects for future work.

2      Multi-adjoint property-oriented concept lattices
The basic operators in this environment are the adjoint triples, which are formed
by three mappings: a (possibly non-commutative) conjunctor and two residuated
implications [13], which satisfy the well-known adjoint property.
Definition 1. Let (P1 , ≤1 ), (P2 , ≤2 ), (P3 , ≤3 ) be posets and & : P1 × P2 → P3 ,
↙ : P3 × P2 → P1 , ↖ : P3 × P1 → P2 be mappings; then (&, ↙, ↖) is an adjoint
triple with respect to P1 , P2 , P3 if:
1. & is order-preserving in both arguments.
2. ↙ and ↖ are order-preserving in the first argument1 and order-reversing
   in the second argument.
3. x ≤1 z ↙ y iff x & y ≤3 z iff y ≤2 z ↖ x, where x ∈ P1 , y ∈ P2
   and z ∈ P3 .
   Examples of adjoint triples are the Gödel, product and Łukasiewicz t-norms
together with their residuated implications.
Example 1. Since the Gödel, product and Łukasiewicz t-norms are commuta-
tive, the residuated implications satisfy ↙G = ↖G , ↙P = ↖P and ↙L = ↖L .
Therefore, the Gödel, product and Łukasiewicz adjoint triples are defined on
[0, 1] as:
           &P (x, y) = x · y                      z ↖P x = min(1, z/x)
           &G (x, y) = min(x, y)                  z ↖G x = 1 if x ≤ z, and z otherwise
           &L (x, y) = max(0, x + y − 1)          z ↖L x = min(1, 1 − x + z)

1
     Note that the antecedent will be evaluated on the right side, while the consequent
     will be evaluated on the left side, as in logic programming framework.




In [19] more general examples of adjoint triples are given.
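For concreteness, the three adjoint triples of Example 1 can be written down and
spot-checked in Python; the sketch below uses exact rational arithmetic merely to keep the
check free of floating-point artefacts, and, since these t-norms are commutative, only one
implication per triple is coded:

    from fractions import Fraction as F

    def and_godel(x, y):        return min(x, y)
    def imp_godel(z, x):        return F(1) if x <= z else z                  # z ↖G x

    def and_product(x, y):      return x * y
    def imp_product(z, x):      return F(1) if x == 0 else min(F(1), z / x)   # z ↖P x

    def and_lukasiewicz(x, y):  return max(F(0), x + y - 1)
    def imp_lukasiewicz(z, x):  return min(F(1), 1 - x + z)                   # z ↖L x

    def adjoint_property_holds(conj, imp, steps=10):
        # Spot-check "x & y <= z  iff  y <= z ↖ x" on a rational grid in [0, 1].
        grid = [F(i, steps) for i in range(steps + 1)]
        return all((conj(x, y) <= z) == (y <= imp(z, x))
                   for x in grid for y in grid for z in grid)

    for name, conj, imp in [("Gödel", and_godel, imp_godel),
                            ("product", and_product, imp_product),
                            ("Łukasiewicz", and_lukasiewicz, imp_lukasiewicz)]:
        print(name, adjoint_property_holds(conj, imp))   # True for all three triples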

    The basic structure, which allows the existence of several adjoint triples for
a given triplet of lattices, is the multi-adjoint property-oriented frame.

Definition 2. Given two complete lattices (L1 , ⪯1 ) and (L2 , ⪯2 ), a poset (P, ≤)
and adjoint triples with respect to P, L2 , L1 , (&i , ↙i , ↖i ), for all i = 1, . . . , l, a
multi-adjoint property-oriented frame is the tuple

                 (L1 , L2 , P, ⪯1 , ⪯2 , ≤, &1 , ↙1 , ↖1 , . . . , &l , ↙l , ↖l )

Multi-adjoint property-oriented frames are denoted as (L1 , L2 , P, &1 , . . . , &l ).
Note that the notation is similar to that of a multi-adjoint frame [18], although the
adjoint triples are defined on different carriers.
   The definition of context is analogous to the one given in [18].

Definition 3. Let (L1 , L2 , P, &1 , . . . , &l ) be a multi-adjoint property-oriented
frame. A context is a tuple (A, B, R, σ) such that A and B are non-empty sets
(usually interpreted as attributes and objects, respectively), R is a P -fuzzy rela-
tion R : A × B → P and σ : B → {1, . . . , l} is a mapping which associates any
element in B with some particular adjoint triple in the frame.2

    From now on, we will fix a multi-adjoint property-oriented frame and context,
(L1 , L2 , P, &1 , . . . , &l ), (A, B, R, σ).
    Now we define the following mappings ↑π : L2^B → L1^A and ↓N : L1^A → L2^B as

                       g↑π (a) = sup{R(a, b) &σ(b) g(b) | b ∈ B}                        (1)
                       f↓N (b) = inf{f (a) ↖σ(b) R(a, b) | a ∈ A}                       (2)

    Clearly, these definitions3 generalize the classical possibility and necessity
operators [11] and they form an isotone Galois connection [16]. There are two
dual versions of the notion of Galois connection. The most famous one, in which
the maps are order-reversing, is properly called a Galois connection; the other,
in which the maps are order-preserving, will be called an isotone Galois con-
nection. In order to make this contribution self-contained, we recall their
formal definitions:
    Let (P1 , ≤1 ) and (P2 , ≤2 ) be posets, and ↓ : P1 → P2 , ↑ : P2 → P1 mappings,
the pair (↑ , ↓ ) forms a Galois connection between P1 and P2 if and only if: ↑ and
↓
  are order-reversing; x ≤1 x↓↑ , for all x ∈ P1 , and y ≤2 y ↑↓ , for all y ∈ P2 .
    The one we adopt here is the dual definition: Let (P1 , ≤1 ) and (P2 , ≤2 ) be
posets, and ↓ : P1 → P2 , ↑ : P2 → P1 mappings, the pair (↑ , ↓ ) forms an isotone
Galois connection between P1 and P2 if and only if: ↑ and ↓ are order-preserving;
x ≤1 x↓↑ , for all x ∈ P1 , and y ↑↓ ≤2 y, for all y ∈ P2 .
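A small numerical illustration of the mappings (1) and (2) may be helpful; the Python
sketch below uses a single adjoint triple (the Gödel one) on [0, 1] and an invented relation
R, so that σ is constant and can be omitted:

    def conj(x, y):  return min(x, y)               # & (Gödel conjunctor)
    def imp(z, x):   return 1.0 if x <= z else z    # z ↖ x (Gödel residuated implication)

    A = ["a1", "a2"]                 # attributes
    B = ["b1", "b2", "b3"]           # objects
    R = {("a1", "b1"): 1.0, ("a1", "b2"): 0.5, ("a1", "b3"): 0.0,
         ("a2", "b1"): 0.3, ("a2", "b2"): 1.0, ("a2", "b3"): 0.8}

    def up_pi(g):
        # g↑π(a) = sup{ R(a, b) & g(b) | b ∈ B }
        return {a: max(conj(R[(a, b)], g[b]) for b in B) for a in A}

    def down_N(f):
        # f↓N(b) = inf{ f(a) ↖ R(a, b) | a ∈ A }
        return {b: min(imp(f[a], R[(a, b)]) for a in A) for b in B}

    g = {"b1": 0.4, "b2": 0.9, "b3": 0.2}
    f = up_pi(g)
    print(f)            # {'a1': 0.5, 'a2': 0.9}
    print(down_N(f))    # {'b1': 0.5, 'b2': 0.9, 'b3': 1.0}, pointwise >= g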
2
  A similar theory could be developed by considering a mapping τ : A → {1, . . . , l}
  which associates any element in A with some particular adjoint triple in the frame.
3
  From now on, to improve readability, we will write &b , ↖b instead of &σ(b) , ↖σ(b) .




    A concept, in this environment, is a pair of mappings ⟨g, f⟩, with g ∈ L2^B and
f ∈ L1^A , such that g↑π = f and f↓N = g, which will be called a multi-adjoint property-
oriented formal concept. In that case, g is called the extension and f the inten-
sion of the concept. The set of all these concepts will be denoted as MπN [16].

Definition 4. A multi-adjoint property-oriented concept lattice is the set

               MπN = {⟨g, f⟩ | g ∈ L2^B , f ∈ L1^A and g↑π = f, f↓N = g}

in which the ordering is defined by ⟨g1 , f1 ⟩ ⪯ ⟨g2 , f2 ⟩ iff g1 ⪯2 g2 (or equivalently
f1 ⪯1 f2 ).

    The pair (MπN , ⪯) is a complete lattice [16], which generalizes the concept
lattice introduced in [7] to a fuzzy environment.


3       Multi-adjoint relation equations

This section begins with an example that motivates the definition of multi-
adjoint relation equations, which will be introduced later.


3.1      Multi-adjoint logic programming

A short summary of the main features of multi-adjoint languages will be pre-
sented. The reader is referred to [20, 21] for a complete formulation.
    A language L contains propositional variables, constants, and a set of logical
connectives. In this fuzzy setting, the usual connectives are adjoint triples and
a number of aggregators.
    The language L is interpreted on a (biresiduated) multi-adjoint lattice,4
⟨L, ⪯, ↙1 , ↖1 , &1 , . . . , ↙n , ↖n , &n ⟩, which is a complete lattice L equipped with
a collection of adjoint triples ⟨&i , ↙i , ↖i ⟩, where each &i is a conjunctor in-
tended to provide a modus ponens rule with respect to ↙i and ↖i .
    A rule is a formula A ↙i B or A ↖i B, where A is a propositional symbol
(usually called the head) and B (which is called the body) is a formula built from
propositional symbols B1 , . . . , Bn (n ≥ 0), truth values of L and conjunctions,
disjunctions and aggregations. Rules with an empty body are called facts.
    A multi-adjoint logic program is a set of pairs ⟨R, α⟩, where R is a rule and
α is a value of L, which may express the confidence which the user of the system
has in the truth of the rule R. Note that the truth degrees in a given program
are expected to be assigned by an expert.
Example 2. Let us consider the multi-adjoint lattice

                            ⟨[0, 1], ≤, ←G , &G , ←P , &P , ∧L ⟩
4
     Note that a multi-adjoint lattice is a particular case of a multi-adjoint property-
     oriented frame.




where &G and &P are the Gödel and product conjunctors, respectively, and
←G , ←P their corresponding residuated implications. Moreover, the Łukasiewicz
conjunctor ∧L will be used in the program [13].
   Given the set of variables (propositional symbols)
  Π = {low oil, low water, rich mixture, overheating, noisy behaviour,
         high fuel consumption}
the following set of multi-adjoint rules forms a multi-adjoint program, which may
represent the behaviour of a motor.
           ⟨high fuel consumption ←G rich mixture ∧L low oil, 0.8⟩
                         ⟨overheating ←G low oil, 0.5⟩
                   ⟨noisy behaviour ←P rich mixture, 0.8⟩
                         ⟨overheating ←P low water, 0.9⟩
                   ⟨noisy behaviour ←G low oil, 1⟩
    The usual procedural is to measure the levels of “oil”, “water” and “mix-
ture” of a specific motor, after that the values for low oil, low water and
rich mixture are obtained, which are represented in the program as facts, for
instance, the next ones can be added to the program:
                                        hlow oil ←P >, 0.2i
                                     hlow water ←P >, 0.2i
                                hrich mixture ←P >, 0.5i
    Finally, the values for the rest of the variables (propositional symbols) are com-
puted [20]. For instance, in order to attain the value of overheating(o, w), for
a level of oil, o, and water, w, the rules ⟨overheating ←G low oil, ϑ1 ⟩ and
⟨overheating ←P low water, ϑ2 ⟩ are considered and the value is obtained as:
        overheating(o, w) = (low oil(o) &G ϑ1 ) ∨ (low water(w) &P ϑ2 )                        (3)
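    For illustration only, the evaluation in Equation (3) can be sketched in Python
(the function and variable names below are ours, and the unit interval is assumed
as the carrier):

    # Sketch of the evaluation in Equation (3); illustrative names only.
    def goedel_conj(x, y):
        return min(x, y)          # Goedel conjunctor on [0, 1]: the minimum

    def product_conj(x, y):
        return x * y              # product conjunctor on [0, 1]

    def overheating(low_oil_value, low_water_value, theta1, theta2):
        # value of 'overheating' obtained from the two rules with weights theta1, theta2
        return max(goedel_conj(low_oil_value, theta1),
                   product_conj(low_water_value, theta2))

    # With the facts of the program (low oil = 0.2, low water = 0.2) and the
    # rule weights 0.5 and 0.9:
    print(overheating(0.2, 0.2, 0.5, 0.9))   # max(min(0.2, 0.5), 0.2 * 0.9) = 0.2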
    Now, the problem could be to recompute the weights of the rules from
experimental instances of the variables, that is, the values of overheating,
noisy behaviour and high fuel consumption are known for particular mea-
sures of low oil, low water and rich mixture.
    Specifically, given the levels of oil, o1 , . . . , on , the levels of water, w1 , . . . , wn ,
and the measures of mixture, t1 , . . . , tn , we may experimentally know the values
of the variables: noisy behaviour(ti , oi ), high fuel consumption(ti , oi ) and
overheating(oi , wi ), for all i ∈ {1, . . . , n}.
    Considering Equation (3), the unknown elements could be ϑ1 and ϑ2 instead
of overheating(o, w). Therefore, the problem now is to look for the values of
ϑ1 and ϑ2 , which solve the following system obtained after assuming the exper-
imental data for the propositional symbols, ov1 , o1 , w1 , . . . , ovn , on , wn .
       overheating(ov1 ) = (low oil(o1 ) &G ϑ1 ) ∨ (low water(w1 ) &P ϑ2 )
                    ⋮
       overheating(ovn ) = (low oil(on ) &G ϑ1 ) ∨ (low water(wn ) &P ϑ2 )



   This system can be interpreted as a system of fuzzy relation equations in
which several conjunctors, &G and &P , are assumed. Moreover, these conjunctors
need be neither commutative nor associative and may be defined on general lattices,
as the multi-adjoint framework permits.
    The next sections introduce when these systems have solutions and a novel
method to obtain them using concept lattice theory.

3.2   Systems of multi-adjoint relation equations
The operators used in order to obtain the systems will be generalizations of the
sup-∗-composition, introduced in [29], and of the inf-→-composition, introduced in [1].
From now on, a multi-adjoint property-oriented frame (L1 , L2 , P, &1 , . . . , &l )
will be fixed.
    In the definition of a multi-adjoint relation equation an interesting mapping
σ : U → {1, . . . , l} will be considered, which relates each element in U to an
adjoint triple. This mapping will play a similar role as the one given in a multi-
adjoint context, defined in the previous section, for instance, this map provides
a partition of U in preference sets. A similar theory may be developed for V
instead of U .
    Let U = {u1 , . . . , um } and V = {v1 , . . . , vn } be two universes, R ∈ L2^{U ×V }
an unknown fuzzy relation, σ : U → {1, . . . , l} a map that relates each element
in U to an adjoint triple, and K1 , . . . , Kn ∈ P^U , D1 , . . . , Dn ∈ L1^V arbitrarily
chosen fuzzy subsets of the respective universes.
    A system of multi-adjoint relation equations with sup-&-composition, is the
following system of equations
                  ⋁_{u∈U} (Ki (u) &u R(u, v)) = Di (v),   i ∈ {1, . . . , n}                (4)

where &u represents the adjoint conjunctor associated with u by σ, that is, if
σ(u) = (&s , ↙s , ↖s ), for s ∈ {1, . . . , l}, then &u is exactly &s .
   If an element v of V is fixed and the elements Ki (uj ), R(uj , v) and Di (v) are
written as kij , xj and di , respectively, for each i ∈ {1, . . . , n}, j ∈ {1, . . . , m},
then System (4) can particularly be written as
                         k11 &u1 x1 ∨ · · · ∨ k1m &um xm = d1
                                        ⋮                                               (5)
                         kn1 &u1 x1 ∨ · · · ∨ knm &um xm = dn
where kij and di are known and xj must be obtained.
   Hence, for each v ∈ V , if we solve System (5), then we obtain a “column”
of R (i.e. the elements R(uj , v), with j ∈ {1, . . . , m}), thus, solving n similar
systems, one for each v ∈ V , the unknown relation R is obtained.
Example 3. Continuing Example 2, we will now try to solve the problem of
obtaining the weights associated with the rules from particular observed data
for the propositional symbols.




    The propositional symbols (variables) will be written in short as: hfc, nb,
oh, rm, lo and lw, and the measures of particular cases of the behaviour of the
motor will be: hi , ni , ovi , ri , oi , wi , for hfc, nb, oh, rm, lo and lw, respectively,
in each case i, with i ∈ {1, 2, 3}.
    For instance, the next system associated to overheating is obtained from
the computation provided in Example 2.
                     oh(ov1 ) = (lo(o1 ) &G ϑ_lo^oh ) ∨ (lw(w1 ) &P ϑ_lw^oh )
                     oh(ov2 ) = (lo(o2 ) &G ϑ_lo^oh ) ∨ (lw(w2 ) &P ϑ_lw^oh )
                     oh(ov3 ) = (lo(o3 ) &G ϑ_lo^oh ) ∨ (lw(w3 ) &P ϑ_lw^oh )

where ϑ_lo^oh and ϑ_lw^oh are the weights associated with the rules with head oh. Similar
systems can be obtained for high fuel consumption and noisy behaviour.
    Assuming the multi-adjoint frame with carrier L = [0, 1] and the Gödel and
product triples, these systems are particular systems of multi-adjoint relational
equations. The corresponding context is formed by the sets U = {rm, lo, lw, rm∧L
lo}, V = {hfc, nb, oh}; the mapping σ that relates the elements lo, rm ∧L lo to
the Gödel triple, and rm, lw to the product triple; the mappings K1 , . . . , Kn ∈
P U , defined as the values given by the propositional symbols in U on the ex-
perimental data, for instance, if u = lo, then K1 (lo) = lo(o1 ), . . . , Kn (lo) =
lo(on ); and the mappings D1 , . . . , Dn ∈ L1 V , defined analogously, for instance,
if v = rm, then D1 (rm) = rm(r1 ), . . . , Dn (rm) = rm(rn ).
    Finally, the unknown fuzzy relation R ∈ L2 U ×V is formed by the weights of
the rules in the program.
    In the system above, oh is the fixed element v ∈ V . Moreover, as
there are no rules with body rm or rm ∧L lo, that is, the weights for
those hypothetical rules are 0, the terms rm(ri ) &G 0 = 0 and (rm(ri ) ∧L
lo(oi )) &P 0 = 0 do not appear.
   Its counterpart is a system of multi-adjoint relation equations with inf-↖-
composition, that is,

               ⋀_{v∈V} (R(u, v) ↖uj Kj∗ (v)) = Ej (u),   j ∈ {1, . . . , m}         (6)

considered with respect to an unknown fuzzy relation R ∈ L1^{U ×V } , and where
K1∗ , . . . , Km∗ ∈ P^V and E1 , . . . , Em ∈ L2^U . Note that ↖uj represents the corre-
sponding adjoint implication associated with uj by σ, that is, if σ(uj ) = (&s , ↙s ,
↖s ), for s ∈ {1, . . . , l}, then ↖uj is exactly ↖s . Remark that in System (6),
the implication ↖uj does not depend on the element u, but on j. Hence, the
implications used in each equation of the system are the same.
    If an element u of U is fixed, and fuzzy subsets K1∗ , . . . , Km∗ ∈ P^V , E1 , . . . , Em ∈
L2^U are assumed, such that Kj∗ (vi ) = kij , R(u, vi ) = yi and Ej (u) = ej , for each
i ∈ {1, . . . , n}, j ∈ {1, . . . , m}, then System (6) can particularly be written as

                         y1 ↖u1 k11 ∧ · · · ∧ yn ↖u1 kn1 = e1
                                        ⋮                                               (7)
                         y1 ↖um k1m ∧ · · · ∧ yn ↖um knm = em



Therefore, for each u ∈ U , we obtain a “row” of R (i.e. the elements R(u, vi ), with
i ∈ {1, . . . , n}), consequently, solving m similar systems, the unknown relation
R is obtained.
   Systems (5) and (7) have the same goal, searching for the unknown relation
R, although the mechanism is different.
   Analyzing these systems, we have that their left-hand sides can be
represented by the mappings CK : L2^m → L1^n and IK ∗ : L1^n → L2^m , defined as:

         CK (x̄)i = ki1 &u1 x1 ∨ · · · ∨ kim &um xm ,    for all i ∈ {1, . . . , n}       (8)
         IK ∗ (ȳ)j = y1 ↖uj k1j ∧ · · · ∧ yn ↖uj knj ,   for all j ∈ {1, . . . , m}      (9)

where x̄ = (x1 , . . . , xm ) ∈ L2^m , ȳ = (y1 , . . . , yn ) ∈ L1^n , and CK (x̄)i , IK ∗ (ȳ)j
are the components of CK (x̄), IK ∗ (ȳ), respectively, for each i ∈ {1, . . . , n} and
j ∈ {1, . . . , m}.
   Hence, Systems (5) and (7) can be written as:

                            CK (x1 , . . . , xm ) = (d1 , . . . , dn )                  (10)
                             IK ∗ (y1 , . . . , yn ) = (e1 , . . . , em )               (11)

respectively.
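    As an illustration of Equations (8) and (9), the two operators can be sketched
as follows (illustrative names; the conjunctor and implication attached to each uj
by σ are passed as lists):

    # Sketch of C_K (Eq. (8)) and I_K* (Eq. (9)); illustrative names only.
    def C_K(K, conj, x):
        # K: n x m matrix of values k[i][j]; conj[j]: conjunctor for u_j; x: vector of length m
        return [max(conj[j](K[i][j], x[j]) for j in range(len(x)))
                for i in range(len(K))]

    def I_Kstar(Kstar, impl, y):
        # Kstar: n x m matrix k[i][j]; impl[j]: implication for u_j; y: vector of length n
        m = len(Kstar[0])
        return [min(impl[j](y[i], Kstar[i][j]) for i in range(len(y)))
                for j in range(m)]

According to Theorem 1 below, composing these two sketched mappings yields a
closure operator on L2^m and an interior operator on L1^n .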


4    Relation between multi-adjoint property-oriented
     concept lattices and multi-adjoint relation equations
This section shows that Systems (5) and (7) can be interpreted in a multi-
adjoint property-oriented concept lattice. Hence, the properties of the
isotone Galois connection (↑π , ↓N ), as well as those of the complete lattice M_πN , can
be used in the resolution of these systems.
    First of all, the environment must be fixed. Hence, a multi-adjoint context
(A, B, S, σ) will be considered, such that A = V ′ , B = U , where V ′ has the same
cardinality as V , σ is the mapping given by the systems and S : A × B → P
is defined as S(vi′ , uj ) = kij . Note that A = V ′ is related to the mappings Ki ,
since S(vi′ , uj ) = kij = Ki (uj ).
    Now, we will prove that the mappings defined at the end of the previous
section are related to the isotone Galois connection. Given µ ∈ L2^B , such that
µ(uj ) = xj , for all j ∈ {1, . . . , m}, the following equalities are obtained, for each
i ∈ {1, . . . , n}:

            CK (x̄)i = ki1 &u1 x1 ∨ · · · ∨ kim &um xm
                      = S(vi′ , u1 ) &u1 µ(u1 ) ∨ · · · ∨ S(vi′ , um ) &um µ(um )
                      = sup{S(vi′ , uj ) &uj µ(uj ) | j ∈ {1, . . . , m}}
                      = µ↑π (vi′ )
    Therefore, the mapping CK : L2^m → L1^n is equivalent to the mapping ↑π : L2^B →
L1^A , where an element x̄ in L2^m can be interpreted as a map µ in L2^B , such that



µ(uj ) = xj , for all j ∈ {1, . . . , m}, and the element CK (x̄) as the mapping µ↑π ,
such that µ↑π (vi′ ) = CK (x̄)i , for all i ∈ {1, . . . , n}.
   An analogy can be developed applying the above procedure to the mappings IK ∗
and ↓N , obtaining that the mappings IK ∗ : L1^n → L2^m and ↓N : L1^A → L2^B are
equivalent.
   As a consequence, the following result holds:
Theorem 1. The mappings CK : L2^m → L1^n , IK ∗ : L1^n → L2^m establish an iso-
tone Galois connection. Therefore, IK ∗ ◦ CK : L2^m → L2^m is a closure operator
and CK ◦ IK ∗ : L1^n → L1^n is an interior operator.
    As (CK , IK ∗ ) is an isotone Galois connection, any result about the solvability
of one system has its dual counterpart.
    The following result explains when these systems can be solved and how a
solution can be obtained.
Theorem 2. System (5) can be solved if and only if ⟨λd̄^↓N , λd̄ ⟩ is a concept
of M_πN , where λd̄ : A = {v1 , . . . , vn } → L1 is defined as λd̄ (vi ) = di , for all
i ∈ {1, . . . , n}. Moreover, if System (5) has a solution, then λd̄^↓N is the greatest
solution of the system.
    Similarly, System (7) can be solved if and only if ⟨µē , µē^↑π ⟩ is a concept of
M_πN , where µē : B = {u1 , . . . , um } → L2 is defined as µē (uj ) = ej , for all j ∈
{1, . . . , m}. Furthermore, if System (7) has a solution, then µē^↑π is the smallest
solution of the system.

   The main contribution of the relation introduced in this paper is not only the
above consequences, but also the many other properties of Systems (5) and (7) that
can be established from the results proved, for example, in [2–4, 12, 14, 17, 18, 27].
   The next example studies the system of multi-adjoint relation equations presented
in Example 3.
Example 4. The aim will be to solve a small system in order to improve the
understanding of the method. In the environment of Example 3, the following
system will be solved assuming the experimental data: oh(ov1 ) = 0.5, lo(o1 ) =
0.3, lw(w1 ) = 0.3, oh(ov2 ) = 0.7, lo(o2 ) = 0.6, lw(w2 ) = 0.8, oh(ov3 ) = 0.4,
lo(o3 ) = 0.5, lw(w3 ) = 0.2.

                   oh(ov1 ) = (lo(o1 ) &G ϑ_lo^oh ) ∨ (lw(w1 ) &P ϑ_lw^oh )
                   oh(ov2 ) = (lo(o2 ) &G ϑ_lo^oh ) ∨ (lw(w2 ) &P ϑ_lw^oh )
                   oh(ov3 ) = (lo(o3 ) &G ϑ_lo^oh ) ∨ (lw(w3 ) &P ϑ_lw^oh )

where ϑ_lo^oh and ϑ_lw^oh are the variables.
    The context is: A = V ′ = {1, 2, 3}, the set of observations, B = U = {lo, lw},
σ associates the propositional symbol lo to the Gödel triple and lw to the
product triple. The relation S : A × B → [0, 1] is defined in Table 1.
    Therefore, considering the mapping λoh : A → [0, 1] associated to the values
of overheating in each experimental case, that is λoh (1) = 0.5, λoh (2) = 0.7,



                                            Table 1. Relation S.

                                            low oil        low water
                                    1         0.3             0.3
                                    2         0.6             0.8
                                    3         0.5             0.2




and λoh (3) = 0.4; and the mapping CK : [0, 1]2 → [0, 1]3 , defined in Equation (8),
the system above can be written as

                                     CK (ϑ_lo^oh , ϑ_lw^oh ) = λoh

Since, by the comment above, there exists µ ∈ [0, 1]^B , such that CK (ϑ_lo^oh , ϑ_lw^oh ) =
µ^↑π , the goal will be to attain the mapping µ ∈ [0, 1]^B , such that µ^↑π = λoh ,
which can be found if and only if ((λoh )^↓N , λoh ) is a multi-adjoint property-
oriented concept in the considered context, by Theorem 2.
     First of all, we compute (λoh )^↓N .

(λoh )^↓N (lo) = inf{λoh (1) ↖G S(1, lo), λoh (2) ↖G S(2, lo), λoh (3) ↖G S(3, lo)}
               = inf{0.5 ↖G 0.3, 0.7 ↖G 0.6, 0.4 ↖G 0.5}
               = inf{1, 1, 0.4} = 0.4
(λoh )^↓N (lw) = inf{0.5 ↖P 0.3, 0.7 ↖P 0.8, 0.4 ↖P 0.2}
               = inf{1, 0.875, 1} = 0.875
     Now, the mapping (λoh )^↓N ↑π is obtained.

         (λoh )^↓N ↑π (1) = sup{S(1, lo) &G (λoh )^↓N (lo), S(1, lw) &P (λoh )^↓N (lw)}
                          = sup{0.3 &G 0.4, 0.3 &P 0.875}
                          = sup{0.3, 0.2625} = 0.3
         (λoh )^↓N ↑π (2) = sup{0.6 &G 0.4, 0.8 &P 0.875} = 0.7
         (λoh )^↓N ↑π (3) = sup{0.5 &G 0.4, 0.2 &P 0.875} = 0.4
    Therefore, ((λoh )^↓N , λoh ) is not a multi-adjoint property-oriented concept and
thus the considered system has no solution, although if the experimental value
for oh(ov1 ) had been 0.3 instead of 0.5, the system would have had a solution.
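    The computations of this example can be reproduced with a small script
(illustrative names; the usual Gödel and product residuated implications on
[0, 1] are assumed), which checks whether ((λoh )^↓N , λoh ) is a concept, as in
Theorem 2:

    # Sketch reproducing Example 4; illustrative names only.
    def goedel_conj(x, y): return min(x, y)
    def product_conj(x, y): return x * y
    def goedel_impl(d, k): return 1.0 if k <= d else d                  # residuum of min
    def product_impl(d, k): return 1.0 if k == 0 else min(1.0, d / k)   # residuum of product

    S = {("lo", 1): 0.3, ("lo", 2): 0.6, ("lo", 3): 0.5,                # relation S of Table 1
         ("lw", 1): 0.3, ("lw", 2): 0.8, ("lw", 3): 0.2}
    conj = {"lo": goedel_conj, "lw": product_conj}                      # assignment given by sigma
    impl = {"lo": goedel_impl, "lw": product_impl}
    lam = {1: 0.5, 2: 0.7, 3: 0.4}                                      # experimental values of oh

    down = {b: min(impl[b](lam[i], S[(b, i)]) for i in lam) for b in ("lo", "lw")}
    up = {i: max(conj[b](S[(b, i)], down[b]) for b in ("lo", "lw")) for i in lam}

    print(down)        # lo: 0.4, lw: 0.875 (up to floating-point rounding)
    print(up)          # 1: 0.3, 2: 0.7, 3: 0.4 (up to floating-point rounding)
    print(up == lam)   # False: the system has no solution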
    These changes could be considered in several applications where noisy vari-
ables exist and their values can be conveniently changed to obtain approximate
solutions for the systems. Thus, if the experimental data for overheating are
oh(ov1 ) = 0.3, oh(ov2 ) = 0.7 and oh(ov3 ) = 0.4, then the original system will
have at least one solution and the values ϑ_lo^oh , ϑ_lw^oh will be 0.4, 0.875, respectively,
for a solution. Consequently, the truth degree for the first rule is lower than for the
second, or it might be thought that it is more determinant in obtaining higher




values for lw than for lo. Another possibility is to consider that this conclusion
about the certainty of the rules is not correct, in which case another adjoint
triple might be associated with lo.
    As a result, the properties introduced in several fuzzy formal concept anal-
ysis frameworks can be applied in order to obtain solutions of fuzzy relation
equations, also in the general multi-adjoint framework.
    Furthermore, in order to obtain the solutions of Systems (5) and (7), the
algorithms developed, e.g., in [5, 6, 15], can be used.


5   Conclusions and future work
Multi-adjoint relation equations have been presented, which generalize the
definitions existing to date. In this general environment, different conjunc-
tors and residuated implications can be used, which provide more flexibility in
order to relate the variables considered in the system.
    A toy example has been introduced in the paper in order to improve its
readability and reduce the complexity of the definitions and results.
    As a consequence of the results presented in this paper, several of the prop-
erties provided, e.g., in [2–4, 12, 14, 17, 18, 27], can be used to obtain additional
characteristics of these systems.
    In the future, we will apply the results provided in the fuzzy formal con-
cept analysis environments to the general systems of fuzzy relational equations
presented here.


References
 1. W. Bandler and L. Kohout. Semantics of implication operators and fuzzy relational
    products. Int. J. Man-Machine Studies, 12:89–116, 1980.
 2. E. Bartl, R. Bělohlávek, J. Konecny, and V. Vychodil. Isotone galois connections
    and concept lattices with hedges. In 4th International IEEE Conference “Intelli-
    gent Systems”, pages 15.24–15.28, 2008.
 3. R. Bělohlávek. Lattices of fixed points of fuzzy Galois connections. Mathematical
    Logic Quarterly, 47(1):111–116, 2001.
 4. R. Bělohlávek. Concept lattices and order in fuzzy logic. Annals of Pure and
    Applied Logic, 128:277–298, 2004.
 5. R. Bělohlávek, B. D. Baets, J. Outrata, and V. Vychodil. Lindig’s algorithm
    for concept lattices over graded attributes. Lecture Notes in Computer Science,
    4617:156–167, 2007.
 6. R. Bělohlávek, B. D. Baets, J. Outrata, and V. Vychodil. Computing the lattice
    of all fixpoints of a fuzzy closure operator. IEEE Transactions on Fuzzy Systems,
    18(3):546–557, 2010.
 7. Y. Chen and Y. Yao. A multiview approach for intelligent data analysis based on
    data operators. Information Sciences, 178(1):1–20, 2008.
 8. B. De Baets. Analytical solution methods for fuzzy relation equations. In D. Dubois
    and H. Prade, editors, The Handbooks of Fuzzy Sets Series, volume 1, pages 291–
    340. Kluwer, Dordrecht, 1999.




 9. A. Di Nola, S. Sessa, W. Pedrycz, and E. Sanchez. Fuzzy Relation Equations and
    Their Applications to Knowledge Engineering. Kluwer, 1989.
10. I. Düntsch and G. Gediga. Approximation operators in qualitative data analysis.
    In Theory and Applications of Relational Structures as Knowledge Instruments,
    pages 214–230, 2003.
11. G. Gediga and I. Düntsch. Modal-style operators in qualitative data analysis. In
    Proc. IEEE Int. Conf. on Data Mining, pages 155–162, 2002.
12. G. Georgescu and A. Popescu. Non-dual fuzzy connections. Arch. Math. Log.,
    43(8):1009–1039, 2004.
13. P. Hájek. Metamathematics of Fuzzy Logic. Trends in Logic. Kluwer Academic,
    1998.
14. H. Lai and D. Zhang. Concept lattices of fuzzy contexts: Formal concept analysis
    vs. rough set theory. International Journal of Approximate Reasoning, 50(5):695–
    707, 2009.
15. C. Lindig. Fast concept analysis. In G. Stumme, editor, Working with Conceptual
    Structures-Contributions to ICCS 2000, pages 152–161, 2000.
16. J. Medina. Towards multi-adjoint property-oriented concept lattices. Lect. Notes
    in Artificial Intelligence, 6401:159–166, 2010.
17. J. Medina and M. Ojeda-Aciego. Multi-adjoint t-concept lattices. Information
    Sciences, 180(5):712–725, 2010.
18. J. Medina, M. Ojeda-Aciego, and J. Ruiz-Calviño. Formal concept analysis via
    multi-adjoint concept lattices. Fuzzy Sets and Systems, 160(2):130–144, 2009.
19. J. Medina, M. Ojeda-Aciego, A. Valverde, and P. Vojtáš. Towards biresiduated
    multi-adjoint logic programming. Lect. Notes in Artificial Intelligence, 3040:608–
    617, 2004.
20. J. Medina, M. Ojeda-Aciego, and P. Vojtáš. Multi-adjoint logic programming
    with continuous semantics. In Logic Programming and Non-Monotonic Reasoning,
    LPNMR’01, pages 351–364. Lect. Notes in Artificial Intelligence 2173, 2001.
21. J. Medina, M. Ojeda-Aciego, and P. Vojtáš. Similarity-based unification: a multi-
    adjoint approach. Fuzzy Sets and Systems, 146:43–62, 2004.
22. A. D. Nola, E. Sanchez, W. Pedrycz, and S. Sessa. Fuzzy Relation Equations
    and Their Applications to Knowledge Engineering. Kluwer Academic Publishers,
    Norwell, MA, USA, 1989.
23. Z. Pawlak. Rough sets. International Journal of Computer and Information Sci-
    ence, 11:341–356, 1982.
24. W. Pedrycz. Fuzzy relational equations with generalized connectives and their
    applications. Fuzzy Sets and Systems, 10(1-3):185 – 201, 1983.
25. I. Perfilieva. Fuzzy function as an approximate solution to a system of fuzzy relation
    equations. Fuzzy Sets and Systems, 147(3):363–383, 2004.
26. I. Perfilieva and L. Nosková. System of fuzzy relation equations with inf-→ com-
    position: Complete set of solutions. Fuzzy Sets and Systems, 159(17):2256–2271,
    2008.
27. A. M. Radzikowska and E. E. Kerre. A comparative study of fuzzy rough sets.
    Fuzzy Sets and Systems, 126(2):137–155, 2002.
28. E. Sanchez. Resolution of composite fuzzy relation equations. Information and
    Control, 30(1):38–48, 1976.
29. L. A. Zadeh. The concept of a linguistic variable and its application to approximate
    reasoning I, II, III. Information Sciences, 8–9:199–257, 301–357, 43–80, 1975.




          Adaptation knowledge discovery for cooking
               using closed itemset extraction

                 Emmanuelle Gaillard, Jean Lieber, and Emmanuel Nauer

                    LORIA (UMR 7503—CNRS, INRIA, Nancy University)
                        BP 239, 54506 Vandœuvre-lès-Nancy, France,
                             First-Name.Last-Name@loria.fr



            Abstract. This paper is about the adaptation knowledge (AK) discov-
            ery for the Taaable system, a case-based reasoning system that adapts
            cooking recipes to user constraints. The AK comes from the interpreta-
            tion of closed itemsets (CIs) whose items correspond to the ingredients
            that have to be removed, kept, or added. An original approach is pro-
            posed for building the context on which CI extraction is performed. This
            approach focuses on a restrictive selection of objects and on a specific
            ranking based on the form of the CIs. Several experiments are pro-
            posed in order to improve the quality of the AK being extracted and to
            decrease the computation time. This chain of experiments can be seen
            as an iterative knowledge discovery process: the analysis following each
            experiment leads to a more sophisticated experiment until some concrete
            and useful results are obtained.


      Keywords: adaptation knowledge discovery, closed itemset, data preprocess-
        ing, case-based reasoning, cooking.


      1    Introduction
      This paper addresses the adaptation challenge proposed by the Computer Cook-
      ing Contest (http://computercookingcontest.net/) which consists in adapt-
      ing a given cooking recipe to specific constraints. For example, the user wants
      to adapt a strawberry pie recipe, because she has no strawberry. The underlying
      question is: which ingredient(s) will the strawberries be replaced with?
          Adapting a recipe by substituting some ingredients by others requires cook-
      ing knowledge and adaptation knowledge in particular. Taaable, a case-based
      reasoning (CBR) system, addresses this problem using an ingredient ontology.
      This ontology is used for searching which is/are the closest ingredient(s) to the
      one that has to be replaced. In this approach the notion of “being close to”
      is given by the distance between ingredients in the ontology. In the previous
      example, Taaable proposes to replace the strawberries by other berries (e.g.
      raspberries, blueberries, etc.). However, this approach is limited because two in-
      gredients which are close in the ontology are not necessarily interchangeable and
      because introducing a new ingredient in a recipe may be incompatible with some
      other ingredient(s) of the recipe or may require adding other ingredients.





         This paper extends the approach proposed in [2] for extracting this kind
     of adaptation knowledge (AK). The approach is based on closed itemset (CI)
     extraction, in which items are the ingredients that have to be removed, kept,
     or added for adapting the recipe. This paper introduces two originalities. The
     first one concerns the way the binary context, on which the CI extraction is
     performed, is built, by focusing on a restrictive selection of objects according
     to the objectives of the knowledge discovery process. The second one concerns
     the way the CIs are filtered and ranked, according to their form. The paper is
     organised as follows: Section 2 specifies the problem in its whole context and
     introduces Taaable which will integrate the discovered AK in its reasoning
     process. Section 3 gives preliminaries for this work, introducing CI extraction,
     case-based reasoning, and related work. Section 4 explains our approach; several
     experiments and evaluations are described and discussed.


     2     Context and motivations

     2.1   Taaable

     The Computer Cooking Contest is an international contest that aims at compar-
     ing systems that make inferences about cooking. A candidate system has to use
     the recipe base given by the contest to propose a recipe matching the user query.
     This query is a set of constraints such as inclusion or rejection of ingredients, the
     type or the origin of the dish, and the compatibility with some diets (vegetarian,
     nut-free, etc.).
         Taaable [1] is a system that has been originally designed as a candidate
     of the Computer Cooking Contest. It is also used as a brain teaser for research
     in knowledge based systems, including knowledge discovery, ontology engineer-
     ing, and CBR. Like many CBR systems, Taaable uses an ontology to retrieve
     recipes that are the most similar to the query. Taaable retrieves and creates
     cooking recipes by adaptation. If there exist recipes exactly matching the query,
     they are returned to the user; otherwise the system is able to retrieve similar
     recipes (i.e. recipes that partially match the target query) and adapts these
     recipes, creating new ones. Searching similar recipes is guided by several ontolo-
     gies, i.e. hierarchies of classes (ingredient hierarchy, dish type hierarchy and dish
     origin hierarchy), in order to relax constraints by generalising the user query.
     The goal is to find the most specific generalisation of the query (the one with
     the minimal cost) for which recipes exist in the recipe base. Adaptation consists
     in substituting some ingredients of the retrieved recipes by the ones required by
     the query.
         Taaable retrieves recipes using query generalisation, then adapts them by
     substitution. This section gives a simplified description of the Taaable system.
     For more details about the Taaable inference engine, see e.g. [1]. For example,
     for adapting the “My Strawberry Pie” recipe to the no Strawberry constraint,
     the system first generalises Strawberry into Berry, then specialises Berry into,
     say, Raspberry.
  2.2   Domain ontology
  An ontology O defines the main classes and relations relevant to cooking. O is
  a set of atomic classes organised into several hierarchies (ingredient, dish type,
  dish origin, etc.). Given two classes B and A of this ontology, A is subsumed by
   B, denoted by B ⊒ A, if the set of instances of A is included in the set of instances
   of B. For instance, Berry ⊒ Blueberry and Berry ⊒ Raspberry.

  2.3   Taaable adaptation principle
  Let R be a recipe and Q be a query such that R does not exactly match Q (oth-
  erwise, no adaptation would be needed). For example, Q = no Strawberry and
   R = “My Strawberry Pie”. The basic ontology-driven adaptation in Taaable
   follows the generalisation/specialisation principle explained hereafter (in a sim-
   plified way). First, R is generalised (in a minimal way) into Γ (R) that matches
   Q. For example, Γ may be the substitution Strawberry ⇝ Berry. Second, Γ (R)
   is specialised into Σ(Γ (R)) that still matches Q. For example, Σ is the substitu-
   tion Berry ⇝ Raspberry (the class Berry is too abstract for a recipe and must
  be made precise). This adaptation approach has at least two limits. First, the
  choice of Σ is at random: there is no reason to choose raspberries instead of blue-
  berries, unless additional knowledge is given. Second, when such a substitution
  of ingredient is made, it may occur that some ingredients should be added or
  removed from R. These limits point out the usefulness of additional knowledge
  for adaptation.


  3     Preliminaries
  3.1   Itemset extraction
      Itemset extraction is a set of data-mining methods for extracting regularities
      in data, by aggregating object items appearing together. Like FCA [8], itemset
  extraction algorithms start from a formal context K, defined by K = (G, M, r),
  where G is a set of objects, M is a set of items, and r is the relation on G × M
  stating that an object is described by an item [8]. Table 1 shows an example of
  context, in which recipes are described by the ingredients they require: G is a set
  of 5 objects (recipes R, R1 , R2 , R3 , and R4 ), M is a set of 7 items (ingredients
  Sugar, Water, Strawberry, etc.).
          An itemset I is a set of items, and the support of I, support(I), is the number
      of objects of the formal context having every item of I. I is frequent, with respect
      to a threshold σ, whenever support(I) ≥ σ. I is closed if it has no proper superset
      J (I ⊊ J) with the same support. For example, {Sugar, Raspberry} is an item-
      set and support({Sugar, Raspberry}) = 2 because 2 recipes require both Sugar
      and Raspberry. However, {Sugar, Raspberry} is not a closed itemset, because
      {Sugar, PieCrust, Raspberry} has the same support. Another, equivalent, defi-
      nition of closed itemsets can be given on the basis of a closure operator ·′′ defined
      as follows. Let I be an itemset and I ′ be the set of objects that have all the items




             Table 1. Formal context representing ingredients used in recipes: the objects
             are the recipes R, R1 , R2 , R3 and R4 , and the items are the ingredients they
             require (Sugar, Water, Strawberry, PieCrust, Cornstarch, CoolWhip, Raspberry,
             Gelatin, etc.).




      of I: I ′ = {x ∈ G | ∀i ∈ I, x r i}. In a dual way, let X be a set of objects and X ′
      be the set of properties shared by all objects of X: X ′ = {i ∈ M | ∀x ∈ X, x r i}.
      This defines two operators: ·′ : I ∈ 2^M ↦ I ′ ∈ 2^G and ·′ : X ∈ 2^G ↦ X ′ ∈ 2^M .
      These operators can be composed into an operator ·′′ : I ∈ 2^M ↦ I ′′ ∈ 2^M . An
      itemset I is said to be closed if it is a fixed point of ·′′ , i.e., I ′′ = I.
         In the following, “CIs” stands for closed itemsets, and “FCIs” stands for
     frequent CIs. For σ = 3, the FCIs of this context are {Sugar, PieCrust},
     {Sugar, PieCrust, Cornstarch}, {Sugar, Water}, {Sugar}, {Water}, {PieCrust},
     and {Cornstarch}.
          For the following experiments, the Charm algorithm [12], which efficiently
      computes the FCIs, is used thanks to Coron, a software platform implementing
      a rich set of algorithmic methods for symbolic data mining [11].
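          For small contexts, the closure-based definition above can be checked directly
      with a naive enumeration, as in the following sketch (toy data and names are
      ours; this is not the Charm algorithm used in the experiments):

      from itertools import combinations

      # Naive closed-itemset enumeration via the closure operator I -> I''; toy data only.
      context = {                       # hypothetical toy context: object -> set of items
          "R":  {"Sugar", "Water", "Strawberry", "PieCrust", "Cornstarch", "CoolWhip"},
          "R1": {"Sugar", "PieCrust", "Cornstarch", "Raspberry", "Gelatin"},
      }

      def extent(itemset):              # I': objects having every item of I
          return {g for g, items in context.items() if itemset <= items}

      def intent(objects):              # X': items shared by all objects of X
          return set.intersection(*(context[g] for g in objects))

      def closure(itemset):             # I'' = (I')'
          return intent(extent(itemset))

      items = sorted(set.union(*context.values()))
      closed = {frozenset(closure(set(c)))
                for r in range(len(items) + 1)
                for c in combinations(items, r)
                if extent(set(c))}      # itemsets with an empty extent are skipped
      for ci in sorted(closed, key=len):
          print(sorted(ci), "support:", len(extent(ci)))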


     3.2   Case-based reasoning

     Case-based reasoning (CBR [10]) consists in answering queries with the help
     of previous experience units called cases. In Taaable, a case is a recipe and a
     query represents user constraints. In many systems, including Taaable, CBR
     consists in the retrieval of a case from the case base and in the adaptation of the
     retrieved case in an adapted case that solves the query. Retrieval in Taaable is
     performed by minimal generalisation of the query (cf. section 2.3). Adaptation
     can be a simple substitution (e.g., substitute strawberry with any berry) but it
     can be improved thanks to the use of some domain specific AK. This motivates
     the research on AK acquisition.


     3.3   Related work

      The AK may be acquired in various ways. It may be collected from experts [6],
     it may be acquired using machine learning techniques [9], or be semi-automatic,
     using data-mining techniques and knowledge discovery principles [3,4].
         This paper addresses automatic AK discovery. Previous works, such as the
     ones proposed by d’Aquin et al. with the Kasimir project in the medical do-

  main [5], and by Badra et al. in the context of a previous work on Taaable [2],
  are the foundations of our work.
      Kasimir is a CBR system applied to decision support for breast cancer
  treatment. In Kasimir, a case is a treatment used for a given patient. The
  patient is described by characteristics (age, tumour size and location, etc.) and
  the treatment consists in applying medical instructions. In order to discover
  AK, cases that are similar to the target case are first selected. Then, FCIs are
  computed on the variations between the target case and the similar cases. FCIs
  matching a specific form are interpreted for generating AK [5].
      Badra et al. use this approach to make cooking adaptations in Taaable [2].
  Their work aims at comparing pairs of recipes depending on the
  ingredients they contain. A recipe R is represented by the set of its ingredients:
  Ingredients(R). For example, the recipe “My Strawberry Pie” is represented
  by

  Ingredients(“My Strawberry Pie”) = {Sugar, Water, Strawberry, PieCrust,
                                             Cornstarch, CoolWhip}

   Let (R, R′ ) be a selected pair of recipes. According to [2], the represen-
   tation of a pair is denoted by ∆, where ∆ represents the variation of ingredients
   between R and R′ . Each ingredient ing is marked by −, =, or +:
    – ing − ∈ ∆ if ing ∈ Ingredients(R) and ing ∉ Ingredients(R′ ), meaning
      that ing appears in R but not in R′ .
    – ing + ∈ ∆ if ing ∉ Ingredients(R) and ing ∈ Ingredients(R′ ), meaning
      that ing appears in R′ but not in R.
    – ing = ∈ ∆ if ing ∈ Ingredients(R) and ing ∈ Ingredients(R′ ), meaning
      that ing appears both in R and in R′ .


  Building a formal context about ingredient variations in cooking reci-
   pes. Suppose we want to compare the recipe R with the four recipes (R1 , R2 ,
   R3 , R4 ) given in Table 1. Variations between R = “My Strawberry Pie” and a
   recipe Ri have the form ⋀_j ing_{i,j}^mark . For example:

     ∆R,R1 = Sugar= ∧ Water− ∧ Strawberry− ∧ PieCrust= ∧ Cornstarch=
               ∧ CoolWhip− ∧ Raspberry+ ∧ Gelatin+                                (1)

       According to these variations, a formal context K = (G, M, I) can be built
   (cf. Table 2 for the running example; a small sketch follows the list):
    – G = {∆R,Ri }i ;
    – M is the set of ingredient variations: M = {ing_{i,j}^mark }i,j . In particular, M
      contains all the conjuncts of ∆R,R1 (Strawberry− , etc., cf. (1));
    – (g, m) ∈ I, if g ∈ G, m ∈ M , and m is a conjunct of g, for example
      (∆R,R1 , Strawberry− ) ∈ I.
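       For illustration, the computation of ∆R,R′ and the construction of the binary
   context can be sketched as follows (the function names and the plain-set repre-
   sentation of recipes are ours; the set for R1 is read off Equation (1)):

   # Sketch: computing an ingredient variation delta(R, R'); illustrative names only.
   def delta(r, r_prime):
       removed = {ing + "-" for ing in r - r_prime}    # in R but not in R'
       added = {ing + "+" for ing in r_prime - r}      # in R' but not in R
       kept = {ing + "=" for ing in r & r_prime}       # common to R and R'
       return removed | added | kept

   R = {"Sugar", "Water", "Strawberry", "PieCrust", "Cornstarch", "CoolWhip"}
   R1 = {"Sugar", "PieCrust", "Cornstarch", "Raspberry", "Gelatin"}
   print(sorted(delta(R, R1)))
   # ['CoolWhip-', 'Cornstarch=', 'Gelatin+', 'PieCrust=', 'Raspberry+',
   #  'Strawberry-', 'Sugar=', 'Water-']

   The formal context is then obtained by taking the pairs (R, Ri ) as objects, the
   union of the computed variations as items, and membership of a marked ingredient
   in delta(R, Ri ) as the incidence relation.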




       Table 2. Formal context for ingredient variations in pairs of recipes (R, Rj ): the
       objects are the variations ∆R,R1 , . . . , ∆R,R4 and the items are marked ingredient
       variations such as Sugar= , Water− , Strawberry− , PieCrust= , Cornstarch= ,
       CoolWhip− , Raspberry+ and Gelatin+ (the full cross table is omitted).




     Interpretation. In the formal context, an ingredient marked with + (resp. −)
     is an ingredient that has to be added (resp. removed). An ingredient marked
     with = is an ingredient common to R and Ri .


     4     Adaptation Knowledge discovery

     AK discovery is based on the same scheme as knowledge discovery in databases
     (KDD [7]). The main steps of the KDD process are data preparation, data-
     mining, and interpretation of the extracted units of information. Data prepara-
     tion relies on formatting data for being used by data-mining tools and on filtering
     operations for focusing on special subsets of objects and/or items, according to
     the objectives of KDD. Data-mining tools are applied for extracting regularities
      in the data. These regularities then have to be interpreted; filtering operations
     may also be performed on this step because of the (often) huge size of the data-
     mining results or of the noise included in these results. All the steps are guided
     by an analyst.
          The objective of our work is the extraction of some AK useful for adapt-
      ing a given recipe to a query. The work presented in the following focuses
      on filtering operations, in order to extract the cooking adaptations from a
      formal context encoding ingredient variations between pairs of recipes. The
     database used as entry point of the process is the Recipe Source database
     (http://www.recipesource.com/) which contains 73795 cooking recipes. For
     the sake of simplicity, we consider in the following, the problem of adapting
     R by substituting one or several ingredient(s) with one or several ingredient(s)
     (but the approach can be generalised for removing more ingredients, and also be
     used for adding ingredient(s) in a recipe). Three experiments are presented; they
     address the same adaptation problem: adapting the R = “My Strawberry Pie”
     recipe, with Ingredients(“My Strawberry Pie”) = {Sugar, Water, Strawberry,
     PieCrust, Cornstarch, CoolWhip}, to the query no Strawberry. In each ex-
     periment, a formal context about ingredient variations in recipes is built. Then,
      FCIs are extracted and filtered for proposing cooking adaptations. The first two
  experiments focus on object filtering, selecting recipes which are more and more
   similar to the “My Strawberry Pie” recipe: the first experiment uses recipes of
   the same type (i.e. pie dish) as “My Strawberry Pie” instead of choosing recipes
  of any type; the second experiment focuses on a more precise filtering based on
  similarity between the “My Strawberry Pie” recipe and recipes used for gener-
  ating the formal context on ingredient variations.


  4.1   A first approach with closed itemsets

  As introduced in [2], a formal context is defined, where objects are ordered
   pairs of recipes (R, R′ ) and properties are ingredients marked with +, =, − for
   representing the ingredient variations from R to R′ . The formal context which
   is built is similar to the example given in Table 2. In each pair of recipes, the
   first element is the recipe R = “My Strawberry Pie” that must be adapted; the
   second element is a recipe of the same dish type as R which, moreover, does not
   contain the ingredient which has to be removed. In our example, it corresponds
   to pie dish recipes which do not contain strawberry. This formal context makes
   it possible to build CIs which have to be interpreted in order to acquire adaptation rules.


  Experiment. 3653 pie dish recipes that do not contain strawberry are found in
   the Recipe Source database. The formal context, with 3653 objects × 1355 items,
  produces 107,837 CIs (no minimal support is used).


  Analysis. Some interesting CIs can be found. For example, {PieCrust− ,
  Strawberry− , Cornstarch− , CoolWhip− , Water− , Sugar− } with support of 1657,
  contains all the ingredients of R with a − mark, meaning that there are 1657
  recipes which have no common ingredients with the R recipe. In the same
  way, {PieCrust− , Strawberry− , Cornstarch− , CoolWhip− , Water− } with sup-
  port 2590, means that 2590 recipes share only the Sugar ingredient with R
  because the sugar is the sole ingredient of R which is not included in this CI.
  The same analysis can be done for {PieCrust− , Strawberry− , Cornstarch− ,
  CoolWhip− , Sugar− } (support of 1900), for water, etc.


   Conclusion. The CIs are too numerous to be presented to the analyst. Only
   1996 of the 3653 pie dish recipes without strawberry share at least one ingredient
   with R. There are too many recipes with nothing in common. A first filter
   can be used to limit the size of the formal context in terms of number of objects.


  4.2   Filtering recipes with at least one common ingredient

  Experiment. The formal context, with 1996 objects × 813 items, produces 22,408
  CIs (no minimal support is used), ranked by decreasing support.

     Results. The top five FCIs are:
      – {Strawberry− } with support of 1996;
      – {Strawberry− , CoolWhip− } with support of 1916;
      – {Strawberry− , PieCrust− } with support of 1757;
      – {Strawberry− , PieCrust− , CoolWhip− } with support of 1679;
      – {Strawberry− , Cornstarch− } with support of 1631.

     Analysis. Several observations can be made. The first FCI containing an ingre-
     dient marked by + ({Strawberry− , Egg+ }, with support of 849) appears only at
     the 46th position. Moreover, there are 45 FCIs with one ingredient marked by +
     in the first 100 FCIs, and no FCI with more than one ingredient marked by +.
      A substituting ingredient ing can only be found in CIs containing ing + , meaning
      that there exists a recipe containing ing, which is not in R. So, FCIs that do not
      contain the + mark cannot be used for finding a substitution proposition, and
      they are numerous in the first 100 ones, based on a support ranking (recall that
      it has been chosen not to consider adaptation by simply removing ingredients).
          In the first 100 FCIs, there are only 15 FCIs containing both an ingredient
      marked by + and an ingredient marked by =. In an FCI I, the = mark on an
      ingredient ing means that ing is common to R and to the recipe(s) involved in
      the creation of I. So, an ingredient marked by = guarantees a certain similarity
      (based on ingredients that are used) between the recipes R and R′ compared
      by ∆R,R′ . If an FCI I contains a potential substituting ingredient, marked by +,
      but does not contain any =, the risk of proposing a cooking adaptation from
      I is very high, because there is no common ingredient with R in the recipe the
      potential substituting ingredient comes from.
          In the first 100 FCIs, the only potential substituting ingredients (so, the
     ingredients marked by +) are egg, salt, and butter, which are not satisfactory
     from a cooking viewpoint for substituting the strawberries.
         We have conducted similar experiments with other R and queries, and the
     same observations as above can be made.

      Conclusion. From these observations, it can be concluded that a ranking
      based on support alone is not efficient for finding relevant cooking adaptation rules,
      because the most frequent CIs do not contain potential substituting ingredients
     and, moreover, have no common ingredient with R.

     4.3   Filtering and ranking CIs according to their forms
      To extract realistic adaptations, CIs with a maximum of ingredients marked by =
      are searched for. We consider that a substitution is acceptable if 50% of the ingredients
      of R are preserved and if the adaptation does not introduce too many ingredients;
      we also limit the number of ingredients introduced to 50% of the initial number
      of ingredients in R. For the experiment with the recipe R = “My Strawberry Pie”,
      which initially contains 6 ingredients, it means that at least 3 ingredients must be
      preserved and at most 3 ingredients can be added. In terms of CIs, it corresponds
      to CIs containing at least 3 ingredients marked with = and at most 3 ingredients
      marked with +.

   Experiment. Using this filter on the CIs produced by the previous experiment re-
   duces the number of CIs to 505. However, because some CIs are more relevant
   than others, they must be ranked according to several criteria. We use the fol-
   lowing rules, given in priority order (a sketch encoding them as a sort key follows
   the list):

    1. A CI must have a + in order to provide a potential substituting ingredient.
    2. A CI which has more = than another one is more relevant. This criterion
       promotes the pairs which have the largest set of common ingredients.
    3. A CI which has fewer − than another one is more relevant. This criterion
       promotes adaptations which remove fewer ingredients.
    4. A CI which has fewer + than another one is more relevant. This criterion
       promotes adaptations which add fewer ingredients.
    5. If two CIs cannot be ranked according to the 4 criteria above, the more
       frequent CI is considered to be the more relevant.
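   These criteria can be encoded as a lexicographic sort key, as in the sketch below
   (illustrative names; a CI is represented as a set of marked ingredient strings and
   its support is given separately):

   # Sketch of the 5-criteria ranking of CIs; illustrative names only.
   def count(ci, mark):
       return sum(1 for item in ci if item.endswith(mark))

   def rank_cis(cis_with_support):
       # cis_with_support: list of (ci, support) pairs
       candidates = [(ci, supp) for ci, supp in cis_with_support
                     if count(ci, "+") > 0]              # criterion 1: at least one '+'
       # criteria 2-5: more '=', fewer '-', fewer '+', then higher support
       return sorted(candidates,
                     key=lambda pair: (-count(pair[0], "="),
                                       count(pair[0], "-"),
                                       count(pair[0], "+"),
                                       -pair[1]))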

  Results. The 5 first CIs ranked according to the previous criteria are:

   – {Water= , Sugar= , Strawberry− , CoolWhip− , Cornstarch= , PieCrust= ,
     Salt+ } with support of 5;
   – {Water= , Sugar= , Strawberry− , CoolWhip− , Cornstarch= , PieCrust= ,
     LemonJuice+ } with support of 4;
   – {Water= , Sugar= , Strawberry− , CoolWhip− , Cornstarch= , PieCrust= ,
     LemonJuice+ , CreamCheese+ } with support of 2;
   – {Water= , Sugar= , Strawberry− , CoolWhip− , Cornstarch= , PieCrust= ,
     LemonJuice+ , WhippingCream+ } with support of 2;
   – {Water= , Sugar= , Strawberry− , CoolWhip− , Cornstarch= , PieCrust= ,
     LemonJuice+ , LemonPeel+ } with support of 2.

   Analysis. One can observe that potential substituting ingredients appear in
   the first 5 CIs and that each CI preserves 4 (of 6) ingredients. The low supports of
   these CIs confirm that searching for frequent CIs is not compatible with our need,
   which is to extract CIs with a specific form.

   Conclusion. Ranking the CIs according to our particular criteria is more efficient
   than using a support-based ranking. This kind of ranking can also be seen as a
   filter on CIs. However, this approach requires computing all CIs because the
  support of interesting CIs is low.


  4.4   More restrictive formal context building according to the form
        of interesting CIs

  The computation time can be improved by applying a more restrictive selection
  of recipe pairs at the formal context building step, decreasing drastically the size
  of the formal context. Indeed, as the expected form of CIs is known, recipe pairs
  that cannot produce CIs of the expected form can be removed. This can also
      be seen as a selection of recipes that are similar enough to R. R′ is considered
      similar enough to R if R′ has a minimal threshold σ= = 50% of ingredients
      in common with R (cf. (2)) and a maximal threshold σ+ = 50% of
      ingredients that are not used in R (cf. (3)). These two conditions express for
      ∆R,R′ the same similarity conditions considered in Section 4.3 on CIs.

                   |Ingredients(R) ∩ Ingredients(R′ )| / |Ingredients(R)| ≥ σ=                (2)
                   |Ingredients(R′ ) \ Ingredients(R)| / |Ingredients(R)| ≤ σ+                (3)
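      A sketch of this selection of recipe pairs is given below (illustrative names; σ= is
      used as a lower bound and σ+ as an upper bound, as stated in the text):

      # Sketch of the selection of recipes similar enough to R (conditions (2) and (3)).
      def similar_enough(r, r_prime, sigma_eq=0.5, sigma_plus=0.5):
          common = len(r & r_prime) / len(r)     # proportion of ingredients of R kept
          extra = len(r_prime - r) / len(r)      # proportion of new ingredients w.r.t. |R|
          return common >= sigma_eq and extra <= sigma_plus

      R = {"Sugar", "Water", "Strawberry", "PieCrust", "Cornstarch", "CoolWhip"}
      R1 = {"Sugar", "PieCrust", "Cornstarch", "Raspberry", "Gelatin"}
      print(similar_enough(R, R1))   # True: 3/6 ingredients kept, 2/6 added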

     Experiment. Among the 1996 pie dish recipes not containing Strawberry, only
     20 recipes satisfy the two conditions. The formal context, with 20 objects × 40
     items, produces only 21 CIs (no minimal support is used).

      Results. The first 5 CIs, satisfying the form introduced in the previous section
      and ranked according to the criteria of Section 4.3, are:
      – {Water= , Sugar= , Cornstarch= , PieCrust= , Strawberry− , CoolWhip− ,
        RedFoodColoring+ , Cherry+ } with support of 1;
      – {Water= , Sugar= , Cornstarch= , PieCrust− , Strawberry− , CoolWhip− ,
        PieShell+} with support of 6;
      – {Water= , Sugar= , Cornstarch= , PieCrust− , Strawberry− , CoolWhip− ,
        Raspberry+ } with support of 3;
      – {Water= , Sugar− , Cornstarch= , PieCrust= , Strawberry− , CoolWhip− ,
        Apple+ , AppleJuice+ } with support of 3;
      – {Water= , Sugar= , Cornstarch= , PieCrust− , Strawberry− , CoolWhip− ,
        Peach+ , PieShell+} with support of 2.

      Analysis. According to these CIs, the first potential substituting ingredients are:
      RedFoodColoring, Cherry, PieShell, Raspberry, Apple, and Peach. Each CI
      preserves 3 or 4 (of 6) ingredients, and two CIs add 2 ingredients.

     Conclusion. This approach reduces the computation time without reducing the
      result quality. Moreover, it gives the best potential adaptations among the first CIs.

     4.5   From CIs to adaptation rules
     As Taaable must propose a recipe adaptation, CIs containing potentially sub-
     stituting ingredients must be transformed. Indeed, a CI does not represent a
     direct cooking adaptation. For example, the third CI of the last experiment
     contains Raspberry+ , simultaneously with CoolWhip− and PieCrust− . Remov-
     ing the pie crust (i.e. PieCrust− ) can look surprising for a pie dish, but one
     must keep in mind that a CI does not correspond to a real recipe, but to
      an abstraction of variations between R and a set of recipes. So, producing a
      complete adaptation requires going back to the ∆R,Ri to obtain all the ingredient
      variations that will take part in the adaptation. For example, for




      ∆R,R1 : Water= , Sugar= , Cornstarch= , PieCrust− , Strawberry− , CoolWhip− ,
              Raspberry+ , Gelatin+ , GCPieCrust+
      ∆R,R2 : Water= , Sugar= , Cornstarch= , PieCrust− , Strawberry− , CoolWhip− ,
              Raspberry+ , FoodColor+ , PieShell+
      ∆R,R3 : Water= , Sugar= , Cornstarch= , PieCrust− , Strawberry− , CoolWhip− ,
              Raspberry+ , PieShell+

   Table 3. Formal context for ingredient variations in pairs of recipes (R, Rj ): each
   object ∆R,Rj is described by the marked ingredient variations listed above.




  the CI {Water= , Sugar= , Cornstarch= , PieCrust− , Strawberry− , CoolWhip− ,
  Raspberry+ }, the ∆R,Ri (with i ∈ [1; 3]) are the ones given by Table 3.
     The adaptation rules extracted from these 3 recipe variations are:

    – {CoolWhip, PieCrust, Strawberry} ⇝ {Gelatin, GCPieCrust, Raspberry};
    – {CoolWhip, PieCrust, Strawberry} ⇝ {FoodColor, PieShell, Raspberry};
    – {CoolWhip, PieCrust, Strawberry} ⇝ {PieShell, Raspberry}.


      For R2 and R3 , PieShell is added as a replacement for PieCrust; in R1 ,
  GCPieCrust plays the role of PieCrust. All three recipe variations propose to
  replace Strawberry with Raspberry. For R1 (resp. R2 ), Gelatin (resp. FoodColor)
  is also added. Finally, the three recipe variations propose to remove CoolWhip.
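
      To make this transformation concrete, the following minimal sketch (plain Python; the
  function name delta_to_rule and the ASCII +/−/= encoding are illustrative assumptions,
  not part of the Taaable system) turns a recipe variation ∆R,Ri into an adaptation rule of
  the form "removed ingredients → added ingredients", the preserved ingredients (marked =)
  being left unchanged:

# Sketch only: encoding and names are assumptions, not the Taaable implementation.
def delta_to_rule(delta):
    """Turn a recipe variation (strings like 'Strawberry-', 'Raspberry+', 'Water=')
    into an adaptation rule (removed ingredients, added ingredients)."""
    removed = {item[:-1] for item in delta if item.endswith("-")}
    added = {item[:-1] for item in delta if item.endswith("+")}
    return removed, added

delta_R_R3 = {"Water=", "Sugar=", "Cornstarch=", "PieCrust-", "Strawberry-",
              "CoolWhip-", "Raspberry+", "PieShell+"}
lhs, rhs = delta_to_rule(delta_R_R3)
print(sorted(lhs), "->", sorted(rhs))
# ['CoolWhip', 'PieCrust', 'Strawberry'] -> ['PieShell', 'Raspberry']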
      Our approach guarantees ingredient compatibility, under the assumption
  that the recipe base used for the adaptation rule extraction process contains
  only good recipes, i.e. recipes which do not contain ingredient incompatibilities.
  Indeed, as adaptation rules are extracted from real recipes, good combinations
  of ingredients are preserved. So, when introducing a new ingredient ing1 (marked
  by ing1+ ), removing another ingredient ing2 (marked by ing2− ) may be required:
  the reason is that no recipe involved in the creation of the CI from which
  the adaptation rules are extracted uses both ing1 and ing2 . In the same way,
  adding a supplementary ingredient ing3 (marked by ing3+ ) in addition to ing1
  is obtained from recipes which use both ing1 and ing3 .
      Applying FCA to these ∆R,Ri produces the concept lattice presented in
  Fig. 1, in which the top node is the retained CI. This node can be seen as
  a generic cooking adaptation, and navigating the lattice leads to more specific
  adaptations. The KDD loop is thus closed: after having (1) selected and
  formatted the data, (2) applied a data-mining CI extraction algorithm, and
  (3) interpreted the results, a new set of data is selected, on which a data-mining
  algorithm (here FCA) can then be applied.
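
      For a context as small as the one of Table 3, this lattice computation can be
  illustrated by brute force; the following sketch (plain Python, with an assumed encoding
  of Table 3, and not the tool actually used in the experiments) enumerates all concept
  intents:

from itertools import combinations

# Assumed encoding of Table 3; sketch only, not the authors' implementation.
context = {
    "dR1": {"Water=", "Sugar=", "Cornstarch=", "PieCrust-", "Strawberry-",
            "CoolWhip-", "Raspberry+", "Gelatin+", "GCPieCrust+"},
    "dR2": {"Water=", "Sugar=", "Cornstarch=", "PieCrust-", "Strawberry-",
            "CoolWhip-", "Raspberry+", "FoodColor+", "PieShell+"},
    "dR3": {"Water=", "Sugar=", "Cornstarch=", "PieCrust-", "Strawberry-",
            "CoolWhip-", "Raspberry+", "PieShell+"},
}
all_attrs = set().union(*context.values())

def intent(objs):
    """Attributes shared by all objects in objs (all attributes if objs is empty)."""
    shared = set(all_attrs)
    for g in objs:
        shared &= context[g]
    return shared

# Every concept intent is the intent of its extent, so the intents of all object
# subsets are exactly the concept intents of the lattice.
intents = {frozenset(intent(objs)) for r in range(len(context) + 1)
           for objs in combinations(context, r)}
# The intent shared by all three variations is the retained CI (top of the lattice).
print(sorted(intent(context)))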
      We have chosen to return to the user the adaptation rules generated from the
  first 5 CIs. So the system proposes results in which Strawberry can be replaced
  (in addition to some other ingredient additions or removals) by “RedFoodColoring
  and Cherry”, by Raspberry with optional Gelatin or FoodColor, by Peach
  with optional FoodColor or LemonJuice, by “HeavyCream and LemonRind”, or
  by “Apple and AppleJuice”.

             Fig. 1. The lattice computed on the formal context given in Table 3.


     5    Conclusion

     This paper shows how adaptation knowledge can be extracted efficiently to
     address a cooking adaptation challenge. Our approach focuses on CIs with a
     particular form, because support alone is not a good ranking measure for this
     problem. A ranking method based on 5 criteria explicitly specified for this adap-
     tation problem is proposed; support is used in addition, to distinguish CIs
     which satisfy the 5 criteria equally well.
         Beyond the application domain, this study points out that KD is not only a
     data-mining issue: the preparation and interpretation steps are also important.
     Moreover, it highlights the iterative nature of KD: starting from a first experi-
     ment with few a priori assumptions about the form of the results, which are then
     too numerous to be interpreted, it arrives at an experiment with a precise aim
     that yields results that are easy to interpret as adaptation rules.
         It has been argued in the paper that this approach is better than the basic
     adaptation approach (based on substituting one ingredient for another, on
     the basis of the ontology), in that it avoids some ingredient incompatibilities
     and makes some specialisation choices. However, a careful study remains to be
     carried out in order to compare these approaches experimentally.
         A short-term future work is to integrate this AK discovery into the online
     system Taaable, following the principles of opportunistic KD [2].
         A mid-term future work consists in using the ontology during the KD process.
     The idea is to add new items deduced from the ontology (e.g. the properties
     Cream− and Milk+ entail the variation Dairy= ). First experiments have already
     been conducted, but they raise interpretation difficulties: the extracted
     CIs contain abstract terms (such as Dairy= or Flavoring+ ) that are not easy
     to interpret.

  References
   1. F. Badra, R. Bendaoud, R. Bentebitel, P.-A. Champin, J. Cojan, A. Cordier, S. De-
      sprés, S. Jean-Daubias, J. Lieber, T. Meilender, A. Mille, E. Nauer, A. Napoli, and
      Y. Toussaint. Taaable: Text Mining, Ontology Engineering, and Hierarchical Clas-
      sification for Textual Case-Based Cooking. In ECCBR Workshops, Workshop of
      the First Computer Cooking Contest, pages 219–228, 2008.
   2. F. Badra, A. Cordier, and J. Lieber. Opportunistic Adaptation Knowledge Dis-
      covery. In Lorraine McGinty and David C. Wilson, editors, 8th International Con-
      ference on Case-Based Reasoning - ICCBR 2009, volume 5650 of Lecture Notes
      in Computer Science, pages 60–74, Seattle, USA, July 2009. Springer. The
      original publication is available at www.springerlink.com.
   3. S. Craw, N. Wiratunga, and R. C. Rowe. Learning adaptation knowledge to im-
      prove case-based reasoning. Artificial Intelligence, 170(16-17):1175–1192, 2006.
   4. M. d’Aquin, F. Badra, S. Lafrogne, J. Lieber, A. Napoli, and L. Szathmary. Case
      base mining for adaptation knowledge acquisition. In International Joint Confer-
      ence on Artificial Intelligence, IJCAI’07, pages 750–756, 2007.
   5. M. D’Aquin, S. Brachais, J. Lieber, and A. Napoli. Decision Support and Knowl-
      edge Management in Oncology using Hierarchical Classification. In Katherina
      Kaiser, Silvia Miksch, and Samson W. Tu, editors, Proceedings of the Symposium
      on Computerized Guidelines and Protocols - CGP-2004, volume 101 of Studies in
      Health Technology and Informatics, pages 16–30, Prague, Czech Republic, 2004.
      Silvia Miksch and Samson W. Tu, IOS Press.
   6. M. d’Aquin, J. Lieber, and A. Napoli. Adaptation Knowledge Acquisition: a Case
      Study for Case-Based Decision Support in Oncology. Computational Intelligence
      (an International Journal), 22(3/4):161–176, 2006.
   7. U. M. Fayyad, G. Piatetsky-Shapiro, and P. Smyth. From data mining to knowledge
      discovery in databases. AI Magazine, pages 37–54, 1996.
   8. B. Ganter and R. Wille. Formal Concept Analysis: Mathematical Foundations.
      Springer, 1999.
   9. K. Hanney and M. T. Keane. Learning Adaptation Rules From a Case-Base.
      In I. Smith and B. Faltings, editors, Advances in Case-Based Reasoning – Third
      European Workshop, EWCBR’96, LNAI 1168, pages 179–192. Springer, 1996.
  10. C. K. Riesbeck and R. C. Schank. Inside Case-Based Reasoning. Lawrence Erlbaum
      Associates, Inc., Hillsdale, New Jersey, 1989.
  11. L. Szathmary and A. Napoli. CORON: A Framework for Levelwise Itemset Min-
      ing Algorithms. Supplementary Proc. of The Third International Conference on
      Formal Concept Analysis (ICFCA ’05), Lens, France, pages 110–113, 2005.
  12. M. J. Zaki and C.-J. Hsiao. CHARM: An efficient algorithm for closed itemset
      mining. In SIAM International Conference on Data Mining SDM’02, pages 33–43,
      2002.
             Fast Computation of Proper Premises

                  Uwe Ryssel1 , Felix Distel2 , and Daniel Borchmann3
   1
       Institute of Applied Computer Science, Technische Universität Dresden, Dresden,
                              Germany, uwe.ryssel@tu-dresden.de
        2
            Institute of Theoretical Computer Science, Technische Universität Dresden,
                        Dresden, Germany, felix@tcs.inf.tu-dresden.de
          3
             Institute of Algebra, Technische Universität Dresden, Dresden, Germany,
                                 borch@tcs.inf.tu-dresden.de



          Abstract. This work is motivated by an application related to refactor-
          ing of model variants. In this application an implicational base needs to
          be computed, and runtime is more crucial than minimal cardinality. Since
          the usual stem base algorithms have proven to be too costly in terms of
          runtime, we have developed a new algorithm for the fast computation of
          proper premises. It is based on a known link between proper premises and
          minimal hypergraph transversals. Two further improvements are made,
          which reduce the number of proper premises that are obtained multiple
          times and redundancies within the set of proper premises. We provide
          heuristic evidence that an approach based on proper premises will also
          be beneficial for other applications.


  1     Introduction

  Today, graph-like structures are used in many modeling languages to specify al-
  gorithms or problems in a more readable way. Examples are data-flow-oriented
  simulation models, such as MATLAB/Simulink, state diagrams, and diagrams
  of electrical networks. Generally, such models consist of blocks or elements and
  connections among them. Using techniques described in Section 5.2, a formal
  context can be obtained from such models. By computing an implicational base
  of this context, dependencies among model artifacts can be uncovered. These
  can help to represent a large number of model variants in a structured way.
      For many years, computing the stem base has been the default method for
  extracting a small but complete set of implications from a formal context. There
  exist mainly two algorithms to achieve this [10,15], and both of them compute
  not only the implications from the stem base, but also concept intents. This is
  problematic as a context may have exponentially many concept intents. Recent
  theoretical results suggest that existing approaches for computing the stem base
  may not lead to algorithms with better worst-case complexity [6,1].
      Bearing this in mind, we focus on proper premises. Just like pseudo-intents,
  which are used to obtain the stem base, proper premises yield a sound and com-
  plete set of implications. Because this set of implications does not have minimal
  cardinality, proper premises have been outside the focus of the FCA community
for many years. However, there are substantial arguments to reconsider using
them. Existing methods for computing proper premises avoid computing con-
cept intents. Thus, in contexts with many concept intents they may have a clear
advantage in runtime over the stem base algorithms. This is particularly true
for our application where the number of concept intents is often close to the
theoretical maximum. Here, attributes often occur together with their negated
counterparts, and the concept lattice can contain several millions of elements.
In Section 5.1 we provide arguments that we can expect the number of con-
cept intents to be larger than the number of proper premises in most contexts,
assuming a uniform random distribution.
    Often, in applications, runtime is the limiting factor, not the size of the basis.
But even where minimal cardinality is a requirement, computing proper premises
is worth considering, since there are methods to transform a base into the stem
base in polynomial time [16].
    In this paper we present an algorithm for the fast computation of proper
premises. It is based on three ideas. The first idea is to use a simple connection
between proper premises and minimal hypergraph transversals. The problem
of enumerating minimal hypergraph transversals is well-researched. Exploiting
the link to proper premises allows us to use existing algorithms that are known
to behave well in practice. A first, naïve algorithm iterates over all attributes
and uses a black-box hypergraph algorithm to compute proper premises of each
attribute.
    A drawback when iterating over all attributes is that the same proper premise
may be computed several times for different attributes. So we introduce a can-
didate filter in the second step: For each attribute m, the attribute set is filtered
and proper premises are searched only among the candidate attributes. We show
that this filtering method significantly reduces the number of proper premises
that are computed multiple times, while maintaining completeness. In a third step we exploit the
fact that there are obvious redundancies within the proper premises. These can
be removed by searching for proper premises only among the meet-irreducible
attributes.
    We argue that our algorithms are trivial to parallelize, leading to further
speedups. Due to their incremental nature, parallelized versions of the stem base
algorithms are not known to date. We conclude by providing experimental re-
sults. These show highly significant improvements for the contexts obtained from
the model refactoring application. For a sample context, where Next-Closure re-
quired several hours to compute the stem base, runtime has dropped to fractions
of a second. For contexts from other applications the improvements are not as
impressive but still large.


2     Preliminaries

We provide a short summary of the most common definitions in formal concept
analysis. A formal context is a triple K = (G, M, I) where G is a set of objects,
M a set of attributes, and I ⊆ G × M is a relation that expresses whether an
object g ∈ G has an attribute m ∈ M . If A ⊆ G is a set of objects then A′
denotes the set of all attributes that are shared among all objects in A, i.e.,
A′ = { m ∈ M | ∀g ∈ A : gIm }. Likewise, for a set B ⊆ M we define
B′ = { g ∈ G | ∀m ∈ B : gIm }. Pairs of the form (A, B) where A′ = B and
B′ = A are called formal concepts. Formal concepts of the form ({ m }′ , { m }′′ )
for some attribute m ∈ M are called attribute concepts and are denoted by µm.
We define the partial order ≤ on the set of all formal concepts of a context to
be the subset order on the first component. The first component of a formal
concept is called the concept extent, while the second component is called the
concept intent.
    Formal concept analysis provides methods to mine implicational knowledge
from formal contexts. An implication is a pair (B1 , B2 ) where B1 , B2 ⊆ M ,
usually denoted by B1 → B2 . We say that the implication B1 → B2 holds in a
context K if B1′ ⊆ B2′ . An implication B1 → B2 follows from a set of implications
L if for every context K in which all implications from L hold, B1 → B2 also
holds. We say that L is sound for K if all implications from L hold in K, and
we say that L is complete for K if all implications that hold in K follow from
L. For each context there exists a sound and complete set of implications of
minimal cardinality [12], called the stem base. The exact definition
of the stem base is outside the scope of this work.
    A sound and complete set of implications can also be obtained using proper
premises. For a given set of attributes B ⊆ M , define B • to be the set of those
attributes in M \ B that follow from B but not from a strict subset of B, i.e.,

                         B • = B ′′ \ ( B ∪ ⋃_{S ⊊ B} S ′′ ).

B is called a proper premise if B • is not empty. It is called a proper premise for
m ∈ M if m ∈ B • . It can be shown that L = { B → B • | B proper premise }
is sound and complete [11]. Several alternative ways to define this sound and
complete set of implications can be found in [2].
    We write g ↙ m if g ′ is maximal with respect to the subset order among all
object intents which do not contain m.
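
    The derivation operators and the set B • can be spelled out directly; the following
sketch (plain Python, names chosen for illustration only, brute force over all strict
subsets) mirrors the definitions above for a context given as a dictionary from objects
to attribute sets:

from itertools import combinations

def extent(K, B):
    """B' : the objects that have every attribute in B."""
    return {g for g, attrs in K.items() if B <= attrs}

def intent(K, A):
    """A' : the attributes shared by all objects in A (all of M if A is empty)."""
    M = set().union(*K.values())
    return M if not A else set.intersection(*(set(K[g]) for g in A))

def closure(K, B):
    """B'' : the attributes that follow from B."""
    return intent(K, extent(K, B))

def proper_conclusion(K, B):
    """B• : attributes in B'' that are neither in B nor in S'' for a strict subset S of B."""
    covered = set(B)
    for r in range(len(B)):
        for S in combinations(sorted(B), r):
            covered |= closure(K, set(S))
    return closure(K, B) - covered

# B is a proper premise iff proper_conclusion(K, B) is non-empty.
K = {"g1": {"a", "b"}, "g2": {"b", "c"}, "g3": {"a", "c", "d"}}
print(proper_conclusion(K, {"a", "c"}))   # {'d'}: {a, c} is a proper premise for d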


3   Proper Premises as Minimal Hypergraph Transversals

We present a connection between proper premises and minimal hypergraph
transversals, which forms the foundation for our enumeration algorithms. It has
been exploited before in database theory for the purpose of mining functional
dependencies from a database relation [14]. Implicitly, it has also been known
for a long time within the FCA community. However, the term hypergraph has
not been used in this context (cf. Prop. 23 from [11]).
    Let V be a finite set of vertices. Then a hypergraph on V is simply a pair
(V, H) where H is a subset of the power set 2^V . Intuitively, each set E ∈ H
represents an edge of the hypergraph, which, in contrast to classical graph theory,
may be incident to more or less than two vertices. A set S ⊆ V is called a
hypergraph transversal of H if it intersects every edge E ∈ H, i.e.,

                                ∀E ∈ H : S ∩ E ≠ ∅.

S is called a minimal hypergraph transversal of H if it is minimal with respect to
the subset order among all hypergraph transversals of H. The transversal hyper-
graph of H is the set of all minimal hypergraph transversals of H. It is denoted by
Tr (H). The problem of deciding for two hypergraphs G and H whether H is the
transversal hypergraph of G is called TransHyp. The problem of enumerating
all minimal hypergraph transversals of a hypergraph G is called TransEnum.
Both problems are relevant to a large number of fields and therefore have been
well-researched. TransHyp is known to be contained in coNP. Since it has
been shown that TransHyp can be decided in quasipolynomial time [9], it is
not believed to be coNP-complete. Furthermore, it has been shown that it can
be decided using only limited non-determinism [8]. For the enumeration problem
it is not known to date whether an output-polynomial algorithm exists. However,
efficient algorithms have been developed for several classes of hypergraphs [8,4].
     The following proposition can be found in [11] among others.

Proposition 1. P ⊆ M is a premise of m ∈ M iff

                                 (M \ g ′ ) ∩ P ≠ ∅

holds for all g ∈ G with g ↙ m. P is a proper premise for m iff P is minimal
(with respect to ⊆) with this property.

      We immediately obtain the following corollary.

Corollary 1. P is a premise of m iff P is a hypergraph transversal of (M, H)
where
                       H := { M \ g ′ | g ∈ G, g ↙ m }.

The set of all proper premises of m is exactly the transversal hypergraph

                           Tr ({ M \ g ′ | g ∈ G, g ↙ m }).
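
    As an illustration of this corollary, here is a brute-force sketch (plain Python,
illustrative names; the paper itself relies on existing TransEnum algorithms rather than
this naive enumeration) that computes the proper premises of a fixed attribute m as
minimal transversals of the hypergraph above, restricted to subsets of M \ { m }:

from itertools import combinations

def proper_premises_for(K, m):
    """Proper premises for attribute m via minimal hypergraph transversals (Corollary 1)."""
    M = set().union(*K.values())
    rows = [set(attrs) for attrs in K.values() if m not in attrs]
    maximal = [a for a in rows if not any(a < b for b in rows)]   # object intents g' with g ↙ m
    edges = [(M - {m}) - a for a in maximal]
    found = []
    for r in range(len(M)):                       # candidates by increasing size
        for cand in combinations(sorted(M - {m}), r):
            s = set(cand)
            if all(s & e for e in edges) and not any(t < s for t in found):
                found.append(s)                   # s is a minimal transversal
    return found

K = {"g1": {"a", "b"}, "g2": {"b", "c"}, "g3": {"a", "c", "d"}}
print([sorted(p) for p in proper_premises_for(K, "d")])   # [['a', 'c']]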

    In particular this proves that enumerating the proper premises of a given
attribute m is polynomially equivalent to TransEnum. This can be exploited
in a naïve algorithm for computing all proper premises of a formal context (Al-
gorithm 1). Being aware of the link to hypergraph transversals, we can benefit
from existing efficient algorithms for TransEnum in order to enumerate proper
premises similar to what has been proposed in [14]. Of course, it is also possible
to use other enumeration problems to which TransEnum can be reduced. Ex-
amples are the enumeration of prime implicants of Horn functions [2] and the
enumeration of set covers.
4     Improvements to the Algorithm
4.1     Avoiding Duplicates using Candidate Sets
We can further optimize Algorithm 1 by reducing the search space. In the naïve
algorithm proper premises are typically computed multiple times since they can
be proper premises of more than one attribute. Our goal is to avoid this wherever
possible.
   The first idea is shown in Algorithm 2. There we introduce a candidate set
C of particular attributes, depending on the current attribute m. We claim now
that we only have to search for minimal hypergraph transversals P of { M \ g ′ |
g ↙ m } with P ⊆ C. We provide some intuition for this idea.


Algorithm 1 Naïve Algorithm for Enumerating All Proper Premises
    Input: K = (G, M, I)
    P = ∅
    for all m ∈ M do
      P = P ∪ Tr ({ M \ g ′ | g ∈ G, g ↙ m })
    end for
    return P




Algorithm 2 A Better Algorithm for Enumerating All Proper Premises
    Input: K = (G, M, I)
    P = { { m } | m ∈ M, { m } is a proper premise of K }
    for all m ∈ M do
      C = { u ∈ M \ { m } | ∄ v ∈ M : µu ∧ µm ≤ µv < µm }
      P = P ∪ { P ⊆ C | P minimal hypergraph transversal of { M \ g ′ | g ↙ m } }
    end for
    return P
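
    The candidate set C can be computed directly from attribute extents, since
µu ∧ µm ≤ µv < µm amounts to { u, m }′ ⊆ { v }′ ⊊ { m }′ . A small sketch of this filter
(plain Python, brute force, illustrative names only):

def ext(K, B):
    """B' : objects having every attribute in B."""
    return frozenset(g for g, attrs in K.items() if B <= attrs)

def candidate_set(K, m):
    """C = { u != m | there is no v with {u,m}' subset of {v}' strictly below {m}' }."""
    M = set().union(*K.values())
    ext_m = ext(K, {m})
    C = set()
    for u in M - {m}:
        ext_um = ext(K, {u, m})                      # extent of µu ∧ µm
        if not any(ext_um <= ext(K, {v}) < ext_m for v in M):
            C.add(u)
    return C

K = {"g1": {"a", "b"}, "g2": {"b", "c"}, "g3": {"a", "c", "d"}}
print(sorted(candidate_set(K, "b")))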



   Let us fix a formal context K = (G, M, I), choose m ∈ M and let P ⊆ M be
a proper premise for m. Then we know that m ∈ P ′′ , which is equivalent to

                                 ⋀_{p∈P} µp ≤ µm.

If we now find another attribute n ∈ M \ { m } with

                               ⋀_{p∈P} µp ≤ µn < µm

it suffices to find the set P as a proper premise for n, because from µn < µm we
can already infer m ∈ P ′′ . Conversely, if we search for all proper premises for m,
we only have to search for those that are not proper premises for attributes n
with µn < µm. Now suppose that there exists an element u ∈ P and an attribute
v ∈ M such that
                            µm ∧ µu ≤ µv < µm.                            (1)
Then we know
                   ( ⋀_{p∈P} µp ) ∧ µm = ⋀_{p∈P} µp ≤ µv < µm,

i.e., P is already a proper premise for v. In this case, we do not have to search
for P , since it will be found in another iteration. On the other hand, if P is a
proper premise for m but not for any other attribute n ∈ M with µn < µm, the
argument given above shows that an element u ∈ P and an attribute v ∈ M
satisfying (1) cannot exist.

Lemma 1. Algorithm 2 enumerates for a given formal context K = (G, M, I)
all proper premises of K.

Proof. Let P be a proper premise of K for the attribute m. P is a proper premise
and therefore m ∈ P ′′ holds, which is equivalent to µm ≥ (P ′ , P ′′ ). Let c ∈ M
be such that µm ≥ µc ≥ (P ′ , P ′′ ) and µc is minimal with this property. We
claim that either P = { c } or P is found in the iteration for c of Algorithm 2.
    Suppose c ∈ P . Then m ∈ { c }′′ follows from µm ≥ µc. As a proper premise,
P is minimal with the property m ∈ P ′′ . It follows that P = { c } and P is found by
Algorithm 2 during the initialization.
    Now suppose c ∉ P . Consider

               C := { u ∈ M \ { c } | ∄ v ∈ M : µu ∧ µc ≤ µv < µc }.

We shall show P ⊆ C. To see this, consider some p ∈ P . Then p ≠ c holds by
assumption. Suppose that p ∉ C, i.e., there is some v ∈ M such that µp ∧ µc ≤
µv < µc. Because of p ∈ P , µp ≥ (P ′ , P ′′ ) and together with µc ≥ (P ′ , P ′′ ) we
have
                       (P ′ , P ′′ ) ≤ µp ∧ µc ≤ µv < µc
in contradiction to the minimality of µc. This shows p ∈ C and altogether
P ⊆ C.
    To complete the proof it remains to show that P is a minimal hypergraph
transversal of { M \ g ′ | g ↙ c }, i.e., that P is also a proper premise for c, not
only for m. Consider n ∈ P . Assume c ∈ (P \ { n })′′ . Since { c } implies m,
P \ { n } would then be a premise for m, in contradiction to the minimality of P . Thus
c ∉ (P \ { n })′′ holds for all n ∈ P and therefore P is a proper premise for c.


4.2   Irreducible Attributes

We go one step further and also remove attributes m from our candidate set C
whose attribute concept µm is the meet of other attribute concepts µx1 , . . . , µxn ,
where x1 , . . . , xn ∈ C, i.e., µm = ⋀_{i=1}^{n} µxi . This results in Algorithm 3, which no
longer computes all proper premises, but a subset that still yields a complete
implicational base. We show that we only have to search for proper premises P
with P ⊆ N where N is the set of irreducible attributes of K.
   To ease the presentation, let us assume for the rest of this paper that the
formal context K is attribute-clarified.


Algorithm 3 Computing Enough Proper Premises
  Input: K = (G, M, I)
  P = { { m } | m ∈ M, { m } is a proper premise of K }
  N = M \ { x ∈ M | µx = ⋀_{i=1}^{n} µxi for some n ∈ ℕ and xi ∈ M \ { x } for 1 ≤ i ≤ n }
  for all m ∈ M do
    C = { u ∈ N \ { m } | ∄ v ∈ M : µu ∧ µm ≤ µv < µm }
    P = P ∪ { P ⊆ C | P minimal hypergraph transversal of { M \ g ′ | g ↙ m } }
  end for
  return P
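
    The set N of meet-irreducible attributes can be computed from attribute extents
alone: µx is a meet of strictly greater attribute concepts exactly when { x }′ is the
intersection of the attribute extents that strictly contain it. A brute-force sketch
(plain Python, illustrative names, assuming an attribute-clarified context as above):

def attribute_extent(K, x):
    return frozenset(g for g, attrs in K.items() if x in attrs)

def meet_irreducible_attributes(K):
    """Attributes x whose concept µx is not a meet of strictly greater attribute concepts."""
    M = set().union(*K.values())
    ext = {x: attribute_extent(K, x) for x in M}
    N = set()
    for x in M:
        larger = [e for e in ext.values() if ext[x] < e]
        if not larger or frozenset.intersection(*larger) != ext[x]:
            N.add(x)
    return N

K = {"g1": {"a", "b"}, "g2": {"b", "c"}, "g3": {"a", "c", "d"}}
print(sorted(meet_irreducible_attributes(K)))   # ['a', 'b', 'c']; d is the meet of a and c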




Proposition 2. Let m be an attribute and let P be a proper premise for m. Let
x ∈ P , n ∈ ℕ, and for 1 ≤ i ≤ n let xi ∈ M be attributes satisfying
 – m ∉ { x1 , . . . , xn },
 – µx = ⋀_{i=1}^{n} µxi ,
 – xi ∉ ∅′′ for all 1 ≤ i ≤ n and
 – µx < µxi for all 1 ≤ i ≤ n.
Then { x } is a proper premise for all xi and there exists a nonempty set Y ⊆
{ x1 , . . . , xn } such that (P \ { x }) ∪ Y is a proper premise for m.

Proof. It is clear that { x } is a proper premise for all xi , since xi ∈ { x }′′ and
xi ∉ ∅′′ . Define
                               QY := (P \ { x }) ∪ Y
for Y ⊆ { x1 , . . . , xn }. We choose Y ⊆ { x1 , . . . , xn } such that Y is minimal with
respect to m ∈ QY ′′ . Such a set exists, since m ∈ ((P \ { x }) ∪ { x1 , . . . , xn })′′
because of { x1 , . . . , xn } → { x }. Furthermore, Y ≠ ∅, since m ∉ (P \ { x })′′ .
    We now claim that QY is a proper premise for m. Clearly m ∉ QY , since
m ∉ Y . For all y ∈ Y it holds that m ∉ (QY \ { y })′′ , or otherwise the minimality of
Y would be violated. It therefore remains to show that m ∉ (QY \ { y })′′ for all
y ∈ QY \ Y = P \ { x }. We have

                         (QY \ { y })′′ = ((P \ { x, y }) ∪ Y )′′
                                        ⊆ ((P \ { y }) ∪ Y )′′
                                        = (P \ { y })′′

since { x } → Y and x ∈ P \ { y }. Since m ∉ (P \ { y })′′ , we get m ∉ (QY \ { y })′′
as required. In sum, QY is a proper premise for m.
Lemma 2. Let N be the set of all meet-irreducible attributes of a context K.
Define

    P = { X ⊆ M | |X| ≤ 1, X proper premise } ∪ { X ⊆ N | X proper premise }.

Then the set L = { P → P • | P ∈ P } is sound and complete for K.

Proof. Let m be an attribute and let P be a proper premise for m. If P ∉ P then
it follows that P ⊈ N . Thus we can find y1 ∈ P \ N and elements x1 , . . . , xn ∈ M
with n ≥ 1 such that
 – m ∉ { x1 , . . . , xn },
 – µy1 = ⋀_{i=1}^{n} µxi ,
 – xi ∉ ∅′′ for all 1 ≤ i ≤ n and
 – µy1 < µxi for all 1 ≤ i ≤ n.
By Proposition 2 we can find a proper premise P1 such that P → { m } fol-
lows from { y1 } → { x1 , . . . , xn } and P1 → { m }. Clearly { y1 } ∈ P, since all
singleton proper premises are contained in P. If P1 ∉ P then we can apply
Proposition 2 again and obtain a new proper premise P2 , etc. To see that this
process terminates, consider the strict partial order ≺ defined by

                        P ≺ Q iff ∀q ∈ Q : ∃p ∈ P : µp < µq.

It is easy to see that with each application of Proposition 2 we obtain a new
proper premise that is strictly larger than the previous one with respect to ≺. Hence,
the process must terminate. This yields a set P ′ = { { y1 }, . . . , { yk }, Pk } ⊆ P
such that P → { m } follows from { Q → Q• | Q ∈ P ′ }. Thus L is a sound and
complete set of implications.

      Together with Lemma 1 this yields correctness of Algorithm 3.

Corollary 2. The set of proper premises computed by Algorithm 3 yields a
sound and complete set of implications for the given formal context.


5      Evaluation
5.1     Computing Proper Premises Instead of Intents
In both the stem base algorithms and our algorithms, runtime can be exponential
in the size of the input. In the classical case the reason is that the number
of intents can be exponential in the size of the stem base [13]. In the case of
our algorithms there are two reasons: the computation of proper premises is
TransEnum-complete, and there can be exponentially many proper premises.
The first issue is less relevant in practice because algorithms for TransEnum,
while still exponential in the worst case, behave well for most instances.
    To see that there can be exponentially many proper premises in the size of the
stem base, let us look at the context Kn from Table 1 for some n ≥ 2, consisting
of two contranominal scales of dimension n × n and one attribute a with empty
extent. It can be verified that the proper premises of the attribute a are exactly
the sets of the form { mi | i ∈ I } ∪ { m′i | i ∉ I } for some I ⊆ { 1, . . . , n }, while the
only pseudo-intents are the singleton sets and { m1 , . . . , mn , m′1 , . . . , m′n }. Hence
there are 2^n proper premises for a, while there are only 2n + 2 pseudo-intents.


           Table 1. Context Kn with Exponentially Many Proper Premises

                             m1 · · · mn    m′1 · · · m′n    a
                      g1
                       ..         I≠              I≠
                      gn

    Next-Closure behaves poorly on contexts with many intents while our algo-
rithms behave poorly on contexts with many proper premises. In order to provide
evidence that our algorithm should behave better in practice we use formulae
for the expectation of the number of intents and proper premises in a formal
context that is chosen uniformly at random among all n × m-contexts for fixed
natural numbers n and m.4 Derivations of these formulae can be found in [7].
    The expected value for the number of intents in an n × m-context is

      E_intent = Σ_{q=0}^{m} (m choose q) Σ_{r=0}^{n} (n choose r) 2^{−rq} (1 − 2^{−r})^{m−q} (1 − 2^{−q})^{n−r} ,
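
    As a quick sanity check, this expectation can be evaluated numerically for small n
and m (a minimal sketch in Python; the function name is illustrative):

from math import comb

def expected_intents(n, m):
    """Direct evaluation of the expectation formula above."""
    return sum(comb(m, q) * comb(n, r) * 2 ** (-r * q)
               * (1 - 2 ** (-r)) ** (m - q) * (1 - 2 ** (-q)) ** (n - r)
               for q in range(m + 1) for r in range(n + 1))

print(expected_intents(1, 1))   # 1.5: a random 1x1 context has 1 or 2 intents, each with probability 1/2
print(expected_intents(5, 5))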

The expected value for the number of proper premises for a fixed attribute
a in an n × m-context is

      E_pp = 2^{−n} Σ_{r=0}^{n} (n choose r) Σ_{q=0}^{m−1} (m choose q) q! 2^{−q²} Σ_{(p1 ,...,pq )∈ℕ^q, 1≤p1 <···} Π_{i=0}^{q} (1 − 2^{−q} (1 + i))^{p_{i+1} − p_i − 1} .