=Paper=
{{Paper
|id=Vol-487/paper-4
|storemode=property
|title=From Set Unification to Set Constraints
|pdfUrl=https://ceur-ws.org/Vol-487/paper4.pdf
|volume=Vol-487
|dblpUrl=https://dblp.org/rec/conf/birthday/Rossi08a
}}
==From Set Unification to Set Constraints==
<pdf width="1500px">https://ceur-ws.org/Vol-487/paper4.pdf</pdf>
<pre>
          Il Milione: A Journey in the Computational Logic in Italy


DALL’UNIFICAZIONE INSIEMISTICA AI VINCOLI SU
                  INSIEMI
  FROM SET UNIFICATION TO SET CONSTRAINTS
                         Gianfranco Rossi


                                     20
                           Il Milione: A Journey in the Computational Logic in Italy


SOMMARIO/ABSTRACT                                                    making set constraint solving more effective.

In questo articolo riassumiamo brevemente alcuni dei                 2 Set Unification
problemi più interessanti che nascono quando si per-
mette di trattare gli insiemi come oggetti di “prima                 Intuitively, the set unification problem is the problem of
classe” in un linguaggio logico, dall’unificazione di in-            computing (or simply testing the existence of) an assign-
siemi ben-fondati e non alla soluzioni di vincoli su insiemi.        ment of values to the variables occurring in two set terms
                                                                     which makes them denote the same set.
In this paper, we briefly summarize some of the most chal-              Various forms of set unification have been used in
lenging issues that arise when allowing sets to be dealt             various application areas, such as (see [13]): deductive
with as first-class objects in a logic language, ranging from        databases, AI and its various sub-fields (e.g., Automated
set unification of well-founded and non-well-founded sets            Deduction and Natural Language Processing), program
to set constraint solving.                                           analysis and security, declarative programming languages
                                                                     with sets.
Keywords: Set unification, hypersets, set constraints.                  Set unification can be thought of as an instance of E-
                                                                     unification, i.e., unification modulo an equational theory
                                                                     E, where the identities in E capture the properties of set
1   Introduction
                                                                     terms—i.e., the fact that the ordering and repetitions of el-
Sets are familiar mathematical objects, and they are often           ements in a set are immaterial.
used as a high-level abstraction to represent complex data              The equational theory E is strongly related to the rep-
structures, whenever the order and repetitions of elements           resentation adopted for set terms. Two main approaches
are immaterial.                                                      have been presented in the literature: the union-based rep-
   In the last two decades, a number of proposals have               resentation, and the list-like representation. The union-
emerged where sets are dealt with as primitive objects of            based representation makes use of the union operator (∪)
a (first-order) logic language. In this context, sets are of-        to construct sets. This representation has been often used
ten represented as first-order terms, called set terms, built        when dealing with the problem of set unification on its
from symbols of a suitable alphabet, using selected func-            own, where set unification is dealt with as an ACI uni-
tion symbols as set constructors. Furthermore, the lan-              fication problem—i.e., unification in presence of operators
guage usually provides the typical set-theoretic operations          satisfying the Associativity, Commutativity, and Idempo-
to manipulate set objects.                                           tence properties (e.g., [6]).
   These short notes summarize some of the most chal-                   The list-like representation builds sets using an element
lenging issues arising when manipulating finite sets in a            insertion constructor (typically denoted by {· | ·}). With
logic language. In Section 2 we briefly review set unifi-            this approach, the finite set {t0 , . . . , tn } is represented by
cation, i.e., the key problem of solving equations between           a sequence of element insertions
set terms. In Section 3 the unification problem is extended                                {t0 | {· · · {tn | ∅} · · ·}},
to the case of non-well-founded sets. In Section 4 we in-            where t0 , . . . , tn are either individuals or sets. While this
troduce set constraints as a way to allow set-theoretic op-          representation restricts the number of set variables which
erations other than set equality to be taken into account.           can occur in each set term to one, on the other hand it al-
Finally, in Section 5 we briefly review proposals aiming at          lows sets to be viewed and manipulated in a fashion sim-

                                                                21
                                        Il Milione: A Journey in the Computational Logic in Italy


ilar to lists. As a matter of fact, this representation has                 3    Hypersets
been adopted in a number of logic and functional-logic
programming languages with sets (e.g., CLP (SET ) [10]).                    Sets considered so far are the so called hereditarily finite
                                                                            sets, i.e. sets with a finite number of elements, all of which
   Various authors have investigated the problem of set uni-
                                                                            are themselves hereditarily finite. This definition leaves
fication using the list-like representation [3, 12, 23, 8].
                                                                            still a further possibility for infinity. Let us consider the
In particular, the algorithm presented in [9] considers an
                                                                            sets x and y that satisfy the equations x = {∅, y}, y =
equational theory E containing the two identities (Ab) and
                                                                            {x}. They are hereditarily finite, but they hide an infinite
(C`) stating the fundamental properties of the set construc-
                                                                            descending chain x 3 y 3 x 3 y 3 · · ·. These sets in
tor {· | ·}:
                                                                            which, roughly speaking, membership can form cycles are
         (Ab) {X | {X | Z}} ≈ {X | Z}                                       called non-well-founded sets (or hypersets). Hypersets are
         (C`) {X | {Y | Z}} ≈ {Y | {X | Z}}.                                very important in some areas, such as concurrency theory,
                                                                            but hyperset theory has been applied in a number of areas
The core of the unification algorithm is very similar in
                                                                            of logic, linguistics, and computer science, as well.
structure to the traditional unification algorithms for stan-
                                                                               Introducing hypersets as a data structure in a logic pro-
dard Herbrand terms (e.g., [20]). The main difference
                                                                            gramming language requires a unification algorithm that is
is represented by the reduction of equations between set
                                                                            able to deal with objects denoting hypersets. All set uni-
terms, {Y1 | V1 } = {Y2 | V2 }. The algorithm allows also to
                                                                            fication algorithms cited in the previous section, however,
account for equations of the form X = {t0 , . . . , tn | X},
                                                                            consider well-founded sets only.
with X 6∈ vars(t0 , . . . , tn ), which turns out to be satisfi-
                                                                               An hyperset unification algorithm is shown in [1]. The
able for any X containing t0 , . . . , tn thanks to (Ab) and
                                                                            key idea underlying this algorithm is that of enlarging the
(C`)). As an example, given the set unification problem
                                                                            domain of discourse from terms (i.e., finite trees) over the
                         {X | S} = {1, 2}                                   signature Σ to directed labeled graphs over Σ, possibly
                                                                            with cycles. This data structure, when involving the inter-
(where {1, 2} is a syntactic shorthand for {1 | {2 | ∅}})                   preted function symbol {· | ·} used as the set constructor,
the algorithm non-deterministically computes the follow-                    can be regarded as a convenient way to denote hypersets.
ing (complete) set of solutions: X = 1 ∧ S = {2}, X =                       For instance, a solution to the equation X = {X} is a
1 ∧ S = {1, 2}, X = 2 ∧ S = {1}, X = 2 ∧ S = {1, 2}.                        cyclic graph which can be interpreted as an hyperset con-
                                                                            taining itself as its only element. In addition, a notion of
  A general survey of the problem of unification in pres-                   bisimulation which applies to this kind of graphs is defined
ence of sets, across different set representations and differ-              and the interpretation domain is taken as the set of directed
ent admissible classes of set terms, can be found in [13].                  labeled graphs over Σ modulo the equivalence relation in-
   The computational complexity properties of the set                       duced by bisimulation.
unification have been investigated by Kapur and Naren-                         The algorithm in [1] can be seen as an adaptation of
dran [18], who established that these decision problems are                 the set unification algorithm of [9]. Many of the changes
NP-complete. Complexity of the set unification operation,                   required to move from set to hyperset unification are the
however, depends on which forms of set terms (e.g., flat                    same needed when moving from standard unification to
or nested sets, with zero, one, or more set variables) are al-              unification over (uninterpreted) rational trees, for which
lowed. The form of set terms in turn is influenced by the set               a number of efficient algorithms have been proposed in the
constructors used to build them. Thus, different complex-                   literature (e.g., [21]). In particular the hyperset unification
ity results can be obtained for different classes of set terms.             algorithm in [1] works on Herbrand systems of equations,
For instance, while the set equivalence test of ground set                  avoiding full variable substitution and adding simple non-
terms denoting flat sets, such as {a, b, c} and {b, c, a}, is               membership constraints to avoid the possibly endless re-
rather easy, when the decision problem deals with nested                    peated insertions of the same elements into hypersets.
set terms involving variables it becomes NP-complete.
   Various authors have considered simplified versions of                   4    Set Constraints
the (Ab)(C`) problem obtained by imposing restrictions
on the form of the set terms. In particular, various works                  The algorithms cited above focus only on equality between
have been proposed to study the simpler cases of match-                     set terms. Besides equality, however, other basic set opera-
ing1 and unification of Bound Simple set terms, i.e., bound                 tions, such as membership, inclusion, union, etc., are usu-
set terms of the form {s1 , . . . , sn }, where each si is either           ally required for dealing with sets in a more general way.
a constant or a variable [4, 16].                                              A number of proposals have been put forward in the last
                                                                            fifteen years in which general set-based formulae are con-
   1 Set matching can be seen as a special case of set unification, where   sidered and procedures to check their consistency are de-
variables are allowed to occur in only one of the two set terms which are   veloped. Most of these proposals have emerged in the con-
compared.                                                                   text of Constraint (Logic) Programming (see, e.g., [15]).

                                                                                22
                            Il Milione: A Journey in the Computational Logic in Italy


In this context, set-theoretical operations are conveniently           the so-called Computable Set Theory area (C.S.T.), a fruit-
dealt with as (set) constraints, that is arbitrary conjunc-            ful research stream born in the 1970’s at the New York
tions of positive and negative atomic predicates built us-             University thanks to the initial ideas and subsequent stim-
ing a fixed finite set of predicate symbols denoting set-              ulus of J. T. Schwartz (see [7] for a general survey). Work
theoretical operations, whose variables can range over the             in C.S.T. has identified increasingly larger classes of com-
domain of sets. Systems of (set) constraints are solved as a           putable formulae of suitable sub-theories of the general
whole by suitable (set) constraint solvers, which are able to          Zermelo-Fraenkel set-theory for which satisfiability is de-
reduce the given constraints either to false or to a simpli-           cidable. Further extensions of these classes are still under
fied form from which it is easier to obtain a solution (i.e.,          investigation at present. Recent related work is described
a substitution for the variables in the given constraints that         in [19]. However, efforts in this area are mainly concerned
make them satisfiable in the selected interpretation). For             with decidability results, rather than computing solutions
example,                                                               like it is usually required in constraint programming.
              X ∈ S ∧ T = S ∪ R ∧ X 6∈ T                                  Other classes of aggregates (akin to sets) have also been
                                                                       considered in the literature. In particular, various frame-
is a set constraint, where R, S, and T are set variables, that         works have introduced the use of multisets where repeated
the set constraint solver can reduce to false.                         elements are allowed to appear in the collection. An anal-
   Set based formalisms allow a natural formulation of a               ysis of the problems concerned with the introduction of
number of problems, in quite different areas: combina-                 multisets—as well as sets and lists—is reported in [11, 12].
torial search problems, warehouse location problems, di-
agnostic related problems (e.g., VLSI circuit verification),
program analysis, network design problems (e.g., weight
                                                                       5   Efficient Set Constraint Solving
setting). Dealing with such formulations as constraints al-            The proposals for (general) set constraints cited above do
low, on the one hand, to solve these problems even if not              not take into account efficiency adequately to allow them to
all sets are (fully) known a priori, and, on the other hand, to        be effectively applied in many concrete applications. For
compute solutions efficiently, provided constraint reason-             example the CLP(SET ) solvers often use a generate & test
ing enables the solver to prune the search space.                      approach, that non-deterministically assigns values to vari-
   As an example, the following is a very compact formu-               ables as soon as those values are available. For instance,
lation as a set constraint of the well-known map coloring              given the constraint X ∈ {1, 2, 3, 4, 5} ∧ X 6= 10, the
problem for a map of three regions, R1, R2, R3 (where                  CLP(SET ) solver enumerates all possible values of X be-
R1 borders R2 and R2 borders R3), using two colors, c1                 fore asserting that the constraint holds.
and c2:                                                                   A number of proposals have been developed in the last
{R1, R2, R3} = {c1, c2} ∧ M = {{R1, R2}, {R2, R3}}                     fifteen years that consider more restricted forms of set con-
                   ∧ {c1} 6∈ M ∧ {c2} 6∈ M.                            straints but equipped with constraint solving techniques
   A complete set constraint solver, i.e., one which is al-            that allow them to be processed in a quite more effective
ways able to decide if a given constraint is satisfiable               way. Works along these lines include [5, 14, 17].
or not is presented in [10]. The constraint language is                   In these proposals constraint variables have a finite do-
based on constructed sets using the list-like representa-              main attached to them. In the case of set constraints, the
tion and it provides the usual set-theoretic operations as             domain is a collection of sets, usually specified as a set
primitive constraints. Sets are allowed to be nested and               interval [l, u], where l and u are known sets (typically,
to contain unknown elements (i.e., uninstantiated logical              of integers). [l, u] represents a lattice of sets induced by
variables). The constraint solver rewrites any given con-              the subset partial ordering relation ⊆ having l and u as
straint C into an equi-satisfiable disjunction of constraints          the greatest lower bound and the least upper bound, re-
in solved form—proved to be correct and terminating. In                spectively. The constraint solver exploits the information
particular the solver uses the set unification algorithm de-           that the domain of variables provides to efficiently com-
veloped in [9] to deal with (set) equalities. A constraint             pute simplified forms of the original constraint or to detect
in solved form is guaranteed to be satisfiable in the corre-           failures. In its simplest form, the solver uses a local prop-
sponding structure. Therefore the ability to obtain a solved           agation algorithm that attempts to enforce consistency on
form guarantees that the original constraint is satisfiable.           the values in the variable domains by removing values that
   This constraint structure has been exploited to obtain a            cannot form part of a solution to the system of constraints.
specific instance of the general Constraint Logic Program-             For example, given the set constraint
ming scheme, called CLP(SET ) [10]. A Java implementa-                           S ∈ {1}..{1, 2, 3, 4} ∧ X ⊆ S ∧ Y ⊆ S
tion of (most of) CLP(SET ) facilities for set management                                 ∧ #X = 2 ∧ Z = Y \X
has been recently developed and made available as part of a            where S, X, Y , and Z are set variables and #X denotes
Java library, called JSetL [22], intended to support declar-           the cardinality of the set X, the constraint solver in [5]
ative programming in an object-oriented language.                      is able to infer that the constraint is satisfiable provided
   The study of set constraints is strongly related to work in         #Z ≤ 2 holds.

                                                                  23
                                  Il Milione: A Journey in the Computational Logic in Italy


   Most of these consistency algorithms are incomplete, so        [9] A. Dovier, E. G. Omodeo, E. Pontelli, and G. Rossi.
they have to be combined with a backtracking search pro-              {log}: A Language for Programming in Logic with
cedure to produce a complete constraint solver. For exam-             Finite Sets. J. of Logic Programming, 28(1):1–44,
ple, in the example above, such a procedure allows to enu-            1996.
merate all possible solutions for Z: Z = {1}, Z = {1, 2},        [10] A. Dovier, C. Piazza, E. Pontelli, and G. Rossi. Sets
Z = {1, 3}, Z = {1, 4}.                                               and Constraint Logic Programming. ACM TOPLAS,
   While these constraint languages turn out to allow more            22(5):861–931, 2000.
efficient handling of set constraints with respect to the pro-
posals cited in the previous section (e.g., CLP(SET )), the      [11] A. Dovier, C. Piazza, and G. Rossi. A uniform ap-
latter allows more general form of sets to be dealt with: ele-        proach to constraint-solving for lists, multisets, com-
ments can be of any type, possibly other sets, and possibly           pact lists, and sets. ACM Transactions on Computa-
unknown (e.g., {X, {a, 1}}). For example the set-based                tional Logic, 9(3), 2008.
formulation of the map coloring problem shown above can          [12] A. Dovier, A. Policriti, and G. Rossi. A uniform ax-
be written—and solved—using CLP(SET ) but not using                   iomatic view of lists, multisets, and sets, and the rel-
the constraint language in [14] and [5].                              evant unification algorithms. Fundamenta Informati-
   A current line of research (see [2]) is trying to combine          cae, 36(2/3):201–234, 1998.
the general set representation and management of propos-         [13] A. Dovier, E. Pontelli, and G. Rossi. Set unification.
als like CLP(SET ), with the efficient constraint solving             Theory and Practice of Logic Programming, 6:645–
of “Finite Domain” solvers, in order to have the expres-              701, 2006.
sive power of the former while retaining the execution ef-
                                                                 [14] C. Gervet. Interval Propagation to Reason about Sets:
ficiency of the latter.
                                                                      Definition and Implementation of a Practical Lan-
                                                                      guage. Constraints, 1(3):191–244, 1997.
REFERENCES                                                       [15] C. Gervet. Constraints over Structured Domains. In
                                                                      F.Rossi, P. van Beek, and T. Walsh, ed’s, Handbook
 [1] D. Aliffi, A. Dovier, and G. Rossi. From Set to Hy-
                                                                      of Constraint Programming. Elsevier, 2006.
     perset Unification. J. of Functional and Logic Pro-
     gramming, 1999(10):1–48, 1999.                              [16] S. Greco. Optimal Unification of Bound Simple Set
                                                                      Terms. In Proc. of Conf. on Information and Knowl-
 [2] A. Dal Palù, A. Dovier, E. Pontelli, and G. Rossi.              edge Management, 326–336, ACM Press, 1996.
     Integrating Finite Domain Constraints and CLP with
                                                                 [17] P. Hawkins, V. Lagoon, and P. J. Stuckey. Solving Set
     Sets. In D. Miller, ed., Proc. of 5th ACM-SIGPLAN
                                                                      Constraint Satisfaction Problems using ROBDDs. J.
     Int’l Conf. on Principles and Practice of Declarative
                                                                      of AI Research, 24: 109–156, 2005.
     Programming, 219–229, ACM Press, 2003.
                                                                 [18] D. Kapur and P. Narendran. Complexity of Unifica-
 [3] P. Arenas-Sánchez and A. Dovier. A Minimality                   tion Problems with Associative-Commutative Opera-
     Study for Set Unification. J. of Functional and Logic            tors. J. of Automated Reasoning, 9:261–288, 1992.
     Programming, 1997(7):1–49, 1997.
                                                                 [19] V. Kuncak. Polynomial Constraints for Sets with Car-
 [4] N. Arni, S. Greco, and D. Saccà. Matching of                    dinality Bounds. FoSSaCS’99, LNCS 4423, Springer-
     Bounded Set Terms in the Logic Language LDL++.                   Verlag, 2007.
     J. of Logic Programming, 27(1):73–87, 1996.                 [20] A. Martelli and U. Montanari. An Efficient Unifica-
 [5] F. Azevedo. Cardinal: A Finite Sets Constraint                   tion Algorithm. ACM TOPLAS, 4:258–282, 1982.
     Solver. Constraints, 12(37):93–129, 2007.                   [21] A. Martelli and G. Rossi. Efficient Unification with
 [6] W. Büttner. Unification in the Data Structure Sets.             Infinite Terms in Logic Programming. In Proc. of
     In J. K. Siekmann, ed., Proc. of the 8th Int’l Conf.             FGCS’84: Int’l Conf. on Fifth Generation Computer
     on Automated Deduction, v. 230, 470–488, Springer-               Systems, 1984.
     Verlag, 1986.                                               [22] G. Rossi, E. Panegai, and E. Poleo. JSetL: A Java
                                                                      Library for Supporting Declarative Programming in
 [7] D. Cantone, E. G. Omodeo, and A. Policriti. Set                  Java. Software-Practice & Experience, 37:115–149,
     Theory for Computing. From Decision Procedures to                2007.
     Declarative Programming with Sets. Monographs in
     Computer Science, Springer-Verlag, 2001.                    [23] F. Stolzenburg. An Algorithm for General Set Unifi-
                                                                      cation and Its Complexity. J. of Automated Reason-
 [8] E. Dantsin and A. Voronkov. A Nondeterminis-                     ing, 22(1):45–63, 1999.
     tic Polynomial-Time Unification Algorithm for Bags,
     Sets, and Trees. In W. Thomas, ed., FoSSaCS’99,
     LNCS 1578, 180–196, Springer-Verlag, 1999.

                                                                   24
                         Il Milione: A Journey in the Computational Logic in Italy


6   Contacts
Gianfranco Rossi
Dipartimento di Matematica
Università di Parma
Viale G. P. Usberti 53/A
43100 Parma (Italy)
Phone: +39 (0521) 90.6909
E-mail: gianfranco.rossi@unipr.it

7   Biography
Gianfranco Rossi received a degree in Computer Science from the University of Pisa in 1979. Since November 2001 he
is a Full Professor of Computer Science at the University of Parma. His research activity has been mainly devoted to
Programming Languages, with special attention to Logic Programming languages. Since December 2006 he is President
of the Association of Logic Programming (GULP).


                                                        25

</pre>