<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Modeling and Reasoning with Multirelations, and their encoding in Alloy</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Peiyuan Sun</string-name>
          <email>p25sun@gsd.uwaterloo.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zinovy Diskin</string-name>
          <email>zdiskin@gsd.uwaterloo.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michał Antkiewicz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Krzysztof Czarnecki</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Waterloo</institution>
        </aff>
      </contrib-group>
      <fpage>73</fpage>
      <lpage>88</lpage>
      <abstract>
        <p>Multisets and multirelations arise naturally in modeling. In this paper, we present a sound and practical mathematical framework, which encodes multisets and multirelations using only ordinary sets and total functions. We implement the encoding as a multiconcepts library in Alloy, which is declarative, compatible with ordinary sets and relations, and can be incorporated into existing models seamlessly.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Graphical notations are good for direct intuitive modeling but may be difficult to
formalize and lack precision. Text-based modeling languages are well-amenable to
formalization and formal reasoning, but the meaning of a textual specification may be
not immediate and difficult to grasp. Integration of the two frameworks is a challenge;
it needs understanding of how they can interact and how their mutual translation
could be mediated. In this paper, we investigate a special instance of the story with
the following two parties in the two roles.</p>
      <p>The graphical modeling party is represented by the UML-style relational modeling,
in which a relationship (an association in the UML parlance) between two classes
is seen as a set of links represented by arrows. As in general the same two objects
can be related by multiple links (the association ends are non-unique), we call such
a relationship a multirelation. Note that even if we compose two ordinary relations,
R1 : A → B and R2 : B → C, the result is, in general, a multirelation because there may
be multiple b ∈ B such that (a,b) ∈ R1 and (b,c) ∈ R2. Converting this multirelation
into an ordinary one means discarding some data and may be seen artificial and
unsatisfactory from the modeling perspective.</p>
      <p>The textual modeling party is Alloy, which is specially tailored for relational
modeling and analysis. However, Alloy does not provide a direct way to work with
multirelations, which requires the modeler to look for a workaround and makes
our story intriguing for an Alloy user. Indeed, considering the universe of ordinary
relations closed under composition is, as mentioned above, an artificial and often
counter-intuitive imposition on relational modeling.</p>
      <p>To integrate, we, first, formalize multirelations and operations over them in a
category-theoretical framework of spans and span operations. Although the framework
is well-known in category theory, its accurate presentation in the MDE literature
seems to be novel and makes a theoretical contribution of the paper. (For instance,
the fact that the disjoint union of two sets is generally a multiset, and the notion
of multiset product, in which elements’ roles are freely combined, have not yet been
considered in the UML literature up to our knowledge of it.) Second, as the span
framework for relational modeling is based on sets and total functions that are readily
modeled in Alloy, spans open the door for a seemingly natural Alloy encoding of
multirelational modeling. However, implementing the entire multirelational algebra in
Alloy as a proper (but conservative) extension of the ordinary relational algebra turned
out non-trivial and required solving numerous specific problems of Alloy encoding of
elementary operations over sets and functions. To ease using our solution for an Alloy
user, we implemented a library of utility modules allowing the user to seamlessly
integrate multirelations into existing models; conservativity then ensures that the
ordinary relation part of the model is not broken and can be used further if needed.</p>
      <p>Our plan for the paper is as follows. In Sect. 2 we present a simple running
example of multirelational modeling, and show the main notions at work. Sect. 3
describes the mathematical framework. In Sect. 4, we demonstrate the usage of our
Alloy module library by building an Alloy model for the running example. Sect. 5
presents the details of our encoding in Alloy, and in Sect. 6 we discuss how we evaluate
the Alloy encoding. Related work is discussed in Sect. 7, and Sect. 8 concludes.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Multiconcepts naturally arise in modeling</title>
      <p>We use multiconcept as a generic term referring to multisets, multirelations, and
specific multioperations over them. In this section, we present a simple modeling
scenario showing how naturally multiconcepts can appear in modeling. We then
formulate several basic requirements for a multiconcept modeling framework.
2.1</p>
      <sec id="sec-2-1">
        <title>Running example: seasonal sales and multirelations</title>
        <p>The manager of a grocery store asks employee to prepare some bundles for the coming
seasonal sale. A bundle contains several food items, and each item belongs to a certain
product category. The manager imposes some rules on the bundle content, e.g., the
following three:
(0) Every bundle must have (contains) at least two items
(1) Every bundle must have at least two dairy products;
(2) Every bundle must have items from at least two product categories.</p>
        <p>To model the scenario, we identify three classes of objects: Bundle, Item, and
Category, and two relations: contains : Bundle → Item and belongsTo : Item → Category
as shown in Fig. 1(a) (ignore dashed blue arrows for a while). Note that a bundle may
have repeated items according to the label ’non-unique’ at the respective association
end. A simple instance of this model is shown in Fig. 1(b). In this instance, the
relation contains consists of five links ci shown by arrows. Having two links relating
the same pair (B1,Bread) means that the bundle B1 has two breads; we also say
that the pair (B1,Bread) occurs twice in the relation, and the latter is then called
a multirelation. The relation belongsTo is an ordinary relation consisting of three
links/pairs bi (by default, it is considered bearing the ’unique’ label).
contains
belongsTo
Bundle</p>
        <p>B1
B2
2..*
non-unique
Category
(dashed lower arrow in Fig. 1a), where we use double semi-colon to denote the
composition operation. For the instance in Fig. 1b, the composition would consists
of five composed links shown by dashed blue arrows, e.g., c11;b1, c12;b1, etc.. As the
bundle B1 has only one composed link to Dairy, it violates rule (1), while bundle
B2 has two Dairy links (c3;b2 and c4;b3), and thus satisfies (1). As for rule (2), we
can see that bundle B1 satisfies whereas bundle B2 violates it. A bundle containing
one bread, one butter, and one milk would have satisfied both rules.</p>
        <p>Now suppose that the bundle B1 has only one bread and hence contains would be
an ordinary relation. Nevertheless, composition contains;;belongsTo of two ordinary
relations would still be a multirelation containing two links from B2 to Diary. Hence,
there are two types of composition of ordinary relations. One is precise and counts
each pair of composable links as a separate composed link as we did above by creating
two links from B2 to Diary. We call this composition multijoin and denote it by
double semi-column. The other relational composition only deals with reachability,
and links two elements (bundle B2 and Diary in the example) as soon as there is
at least one composable pair of links between those elements, but only retains one
link irrespective to the number of paths. We call this composition ordinary join
and denote it by single semi-column. Clearly, ordinary join loses some information,
which may be important to consider, e.g., for checking rule (1). Actually, this is a
typical situation: computing the price of a bundle composed from prices of its items
price : Item → int, or the weight of a bundle, or other aggregate queries (as they are
called in the database literature), requires multijoins and multirelations. On the other
hand, there are situations whereby we do need the ordinary join, e.g., checking rule
(2) in our example, where we are only concerned with reachability.
2.2</p>
      </sec>
      <sec id="sec-2-2">
        <title>Seasonal sales and multisets</title>
        <p>Unary multirelations or multisets is also a natural modeling concept. For example,
in Fig. 1(b), mapping contains defines bundle B1 as a multiset of items, in which
item Bread occurs twice, and we write B1 = { Bread,Bread,Butter} , where double
brackets indicate that we deal with multisets rather than sets. Importantly, as a
unary multirelation, bundle B1 is to be seen as a multi-subset of set Item so that
the expression above implicitly states that item Milk is not included into B1. Hence,
to be accurate, we should specify the type of a multiset and write B1:MSub(Item),
where MSub(Item) denotes the set of all multisubsets of Item. Similarly, bundle
B2 is another multisubset { Butter,Milk} of the same type, which happens to be
an ordinary (sub)set. The italicized reservation is important if, for example, we
need to consider the union of two bundles, which should include item Butter twice:
B1 ] B2 = { Bread,Bread,Butter,Butter,Milk} . Thus, union of two ordinary sets
seen as multisets can result in a multiset (exactly like composition of two ordinary
relations seen as multirelations can be a multirelation). On the other hand, in some
contexts we may need an ordinary (non-counting) union operation, in which B1∪ B2 =
{ Bread,Butter,Milk} . To distinguish these two operations, we will call them ordinary
union and multiunion (similarly to contrasting ordinary join and multijoin in relational
composition). Thus, the universe of ordinary (sub)sets is closed under ordinary union,
but is not closed under multiunion. Multiunion is often called disjoint union, but the
fact that disjoint union of two sets can result in a multiset is often not recognized.
2.3</p>
      </sec>
      <sec id="sec-2-3">
        <title>Requirements</title>
        <p>The discussion above leads to the following requirements for an effective multiconcepts
framework. A modeler should be able to do the following on top of the ordinary sets
and relations:</p>
        <sec id="sec-2-3-1">
          <title>1. directly declare a multi(sub)set or a multirelation;</title>
          <p>2. perform operations over multiconcepts;
3. control whether the result of an operation should be ordinary or multi.</p>
          <p>For example, it should be possible to compute the multijoin of an ordinary relation
with a multirelation, or the multijoin of two ordinary relations, or the multiunion of a
set and a multiset. Furthermore, the syntax and semantics of the extended framework
should be conservative w.r.t. the ordinary operations over ordinary objects, so that
the modelers could add multiconcepts to their existing models without having to
restructure them.
3</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Mathematical framework</title>
      <p>
        The diagram in Fig. 1(b) is intuitively simple: three boxes are sets, and two block
arrows represent mappings between them. To make the diagram formal, we need
to formalize (i) the notion of mapping as a collection of links, and (ii) mapping
composition. There are two ways of doing this. The first is well-known: a multirelation
is a relation, in which each pair of elements is assigned with an integer called the
multiplicity of the pair; correspondingly, multioperations amount to operations over
numbers (see the accompanying TR [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] for details). We refer to this framework as
multiplicity-based or numeric, but it is beyond our goals in the paper. The second
approach is borrowed from category theory and is based on reification of links as
indices, we call the approach index-based and present it below.
      </p>
      <p>We will begin with index-based formalization of relations, because it allows us
to introduce the approach in a natural way by discussing the example in Sect. 2. In
this way we come to our major construct of span. Then we introduce the index-based
version of multisets—the notion of a family of elements. We will also define operations
over spans, operations over families, and mixed compositions of spans with families.
3.1</p>
      <sec id="sec-3-1">
        <title>Relations as spans and operations over them.</title>
        <p>We formalize the mapping contains in Fig. 1 by reifying all its constituent links as
separate objects. This gives us a set Contains consisting of five elements ci as shown
in Fig. 2. The special nature of these elements (they represent links) is formalized
by mapping each of them to the source and the target elements of the respective
link. For example, as element c1 ∈ Contains reifies link c1 from B1 to Bread in Fig. 1,
it is linked to B1 by the source link c1s, and to Bread by the target link c1t. All
source links make a function slegContains : Bundle ← Contains called the source leg, and
all target links make a function tlegContains : Contains → Item called the target leg. The
triple (Contains,slegContains,tlegContains) is called a span with head set Contains and legs
as above. Similarly, relation belongsTo is formalized by the span with the head set
BelongsTo (see Fig. 2).</p>
        <p>Definition 1 (span). A (finite) span R from a set A to a set B is a triple (headR,
slegR,tlegR) with headR a set called the head, and slegR : A ← headR, tlegR : headR → B
two functions called legs. In formal diagrams, we will often use a shorter notation
(HR,sR,tR) for span’s components. We denote a span by a stroked arrowR : A 9 B,
and will often use the same name for both the span and its head.</p>
        <p>If a span represents a total single-valued relation, i.e., a function f : A → B, its
source leg is a bijection (e.g., the relation belongsTo in Fig. 1 is such). As the choice
of the index set is arbitrary, we can take set A to be the head, and its identity
idA : A → A as the source leg. Then the target leg is the function f itself.</p>
        <p>Note also that sets A and B in Def. 1 are actually placeholders (formal
parameters) for sets rather than actual sets. For example, if we want to specify a
multirelation R = Spouse on a set Person1, then we define A = Person, B = Person,
headSpouse = marriageContract, slegSpouse = spouse1 and tlegSpouse = spouse2, where
functions spouse1 and spouse2 map a contract to the two spouses it binds. Formally,
this procedure can be described as binding formal parameters, A, slegR etc. to actual
1 the same two persons can be multiply related, if, e.g., they got diversed and then
re-married again, and hence may have several marriage contracts
tlegiResult
tlegResult
values Person, spouse1 etc. Nodes and arrows in diagrams used in all our formal
definitions below are formal parameters to be substituted by actual values (sets for
nodes and functions for arrows), when applied in modeling situations.</p>
        <p>Now we proceed to the index-based formalization of sequential composition of
relations. Our discussion of Fig. 1 showed that the core process is composition of
links. As links now are reified as elements in the heads of the spans involved, first
of all we need to find composable pairs of links. It is clear that links ci ∈ Contains
and bj ∈ BelongsTo are composable iff the target of ci is the source of bj, i.e.,
tlegContains(ci) = slegBelongsTo(bj). If this condition holds, the two links are composable,
and we can create a new link from slegContains(ci)∈ Bundle to tlegBelongsTo(bj)∈ Category.
Thus, the set of all composable links iResult (where i stands for ’inner’, and later we
will also build an outer span)) is given by the following formula:</p>
        <p>iResult = (ci,bj) : tlegContains(ci) = slegBelongsTo(bj) .</p>
        <p>This set is equipped with two projection functions selecting, resp., the first or
the second element of pair (ci,bj), and we obtain a span iResult shown in Fig. 2
as the inner span. To finish the composition and obtain the resulting outer span
Result, we need function composition: slegResult = slegiResult;slegContains and tlegResult =
tlegiResult;tlegBelongsTo (see Fig. 2). Below we present an abstract formal specification
of the procedure.</p>
        <p>Selection of the composable links is provided by the operation called (in category
theory) pullback of functions.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Definition 2 (pullback). The pullback of two functions</title>
        <p>wwiitthh haeacodmPm=o{n(ata1r,ag2e)t,∈ fA11:×A1A→ 2 :Yf1(aan1d) =f2f:2Y(a← 2)} A.2Tihsealesgpsaonf AO 1 f1
the span are projections pi : P → Ai with pi(a1,a2) = ai, i = 1,2. p1
(TnhoetedtiahgeraamngolenntehaerrPighdtesnhootwinsgthsuecrhesspqeucatirvees)p.ullback square P p2
It is easy to check that such a pullback square is commutative: p1;f1 = p2;f2.
f2
/ YO
/ A2</p>
        <sec id="sec-3-2-1">
          <title>Now we can define span composition.</title>
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>Definition 3 (span composition). Let</title>
        <p>R1 : A 9 B, R2 : B 9 C be two composable
spans. Their (sequential) composition is a span B AO
R1;;R2 = (R,s,t) : A 9 C defined as shown on s1
ftuhnecrtiiognhst,cowmhperoeseadrcfroamrrotwwos f(usncatniodnst)spdaennnoetde s RO 1 t1 / BO
by arcs (we will use this convention further on). p1 s2
In more detail, we first compute the pullback
of functions t1 and s2 to select all pairs of com- R p2 / R2 t2 7/ C
posable links. Then we build the outer legs by
composing p1;s1 and p2;t2. t</p>
        <p>If span R2 represents a function (total single-valued relation), we can
perform span composition more effectively by taking R2’s source leg to be the
identity, and hence tlegR2=R2. Then we set headR1;;R2=headR1, slegR1;;R2=slegR1, and
tlegR1;;R2=tlegR1;tlegR2. This span is isomorphic to any other span representing
relation R2 with an arbitrary index set I bijective to B (because the relation is total
and single-valued). For example, as relation belongsTo in our running example is a
function, we can compose Contains and BelongsTo in this simpler way by identifying
b1,2,3 with Bread, Butter and Milk resp. The result will be isomorphic to the span
specified in Fig. 1 in the following sense.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Definition 4 (span isomorphism). Two parallel spans</title>
        <p>bRe1t:wAee9n Bth, eRir2 h:Aea9dsB,ba:rHeRi1s o→mHorRp2h,icsuifchthtehraetissRa1b=ijebc;tsioRn2 AO o sR2 9 HR2
and tR1 = b;tR2. The two commutativity conditions ensure sR1 b tR2
tbhoatth ilfinakslihnakveh∈ thHeRs1a misemsoauprpceedantod atalrignekt.b(h)∈ HR2, then HR1 tR1 / B</p>
        <p>
          Thus, span composition is defined up to span isomorphism. Moreover, the process
of relation indexing by a span is also defined up to isomorphism: we are free in
choosing objects reifying links, but the source and target projection links of these
objects are uniquely determined by the link being reified. In fact, everything in
the indexing world is defined up to natural isomorphisms (bijections between index
sets commuting with the respective functions). In this paper, we take particular
representatives of equivalence classes defined by isomorphism like above, and perform
operations over them. General results of category theory, in which standard operations
over sets and functions are redefined via so called limits (e.g., pullback) and colimits
(e.g., merge) up to isomorphism ensure that applying these operations to different
but isomorphic representatives produces isomorphic results (see, e.g., [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]).
3.2
        </p>
      </sec>
      <sec id="sec-3-5">
        <title>Multisets as families and operations over them.</title>
        <p>We use the same indexing idea by reifying element occurrences as unique indices to
distinguish the multiple occurrences of the same element. In a sense,
Definition 5 (families). Let A be a set (perhaps, infinite).</p>
        <p>A (finite) family over A is a function f : I → A from a finite
index set I to A. The latter is called the ground set of family
ifs,owftheinlecathlleedratnhgeeascetitveofdfoumnactinionofff, (i.iet.,issaetlw{ afy(si)fi:nii∈ teI} a⊂s sAet, @ A ^
I is such). To avoid confusion, we will use the term ’range set’ f f0
Trawthoeframthialines’, afct:iIv→ e dAomanaidn’f.0 : I0 → A, over the same ground I b / I0
set A are called isomorphic, and we write f ∼= f0, if there is
a bijection b : I → I0 such that b;f0 = f.</p>
      </sec>
      <sec id="sec-3-6">
        <title>Definition 6 (multiunion/sum/merge). Given two fam</title>
        <p>milieesrgoev)efr1A+,ff2k::II1k]→ I2 A→, Ak =is1g,2iv,etnhebirymthueltiduinaigoonna(olrasruromw, oinr I1 f1 / AO
the commutative diagram on the right, where ] denotes the i1
oapnedratthieondioafgdoinsjaolinfutnucntiioonn, aisrrdoewfisnie1d, ib2ya(ref1c+anfo2n)(ici)i=njefc1t(iio)nisf, I1]I2z o i2 I2
i∈ I1, and (f1+f2)(i) = f2(i) if i∈ I2.</p>
        <p>We can use the operation of family multiunion for building disjoint union of sets
(not surprisingly, as multiunion uses disjoint union of index sets). Given sets A and
B, we form their union U = A∪ B with two injections i : A → U and j : B → U. These
injections can be seen as two families over the same ground set. Summing them gives
us a family u : A]B → U, whose index set is the disjoint union of A and B.</p>
      </sec>
      <sec id="sec-3-7">
        <title>Mixed setting: Families and spans, ordinary and multi</title>
        <p>Connections between ordinary and multi(sub)sets are realized via the following two
functions. Given a set A, we have function dropA : MSub(A) → Sub(A) from multi- to
ordinary subsets of A that ignores indexes and only cares about the range set of a
family. Conversely, a (trivial) function liftA : MSub(A) ← Sub(A) makes an ordinary
subset X ⊂ A into a multiset by taking I = X and f(x) = x for any index x. Clearly,
dropA.liftAX = X for any X ⊂ A, but liftA.dropAf =6 f as dropping discards the
information about family f (where dot denotes function composition). Similarly, we
have functions dropAB : MRel(A,B) → Rel(A,B) from the universe of all multirelations
from A to B to the universe of all ordinary relations from A to B, and, conversely,
liftAB : MRel(A,B) ← Rel(A,B) defined in an obvious way.</p>
        <p>It is easy to see that the ordinary union of two ordinary sets X,Y can be
presented as drop(liftX +liftY ). Similarly, ordinary composition of two ordinary relations
X ⊂ A× B, Y ⊂ B× C is drop(liftX;;liftY ). Of course, an effective implementation of
the ordinary operations should not follow these patterns.</p>
        <p>In ordinary relational modleing, given a relation R : A 9 B, we often need to
build the R-image of a subset X ⊂ A, and the R-preimage of a subset Y ⊂ B. In
multi-modeling, subsets are families and relations are spans, hence, we need the
notions of the image and preimage of family w.r.t. a given span. Remarkably, both
notions can be formally defined via pullbacks as described below.</p>
        <p>Definition 7 (composing a family with a span). IO f / AO
pGoisvietinona ifsamafialymfily:I f→ ;;SA: Pa n→d BasdpeafnineSd: Aby9thBe ,dtiahgeriramcoomn- p1 s
the right (recall that arc arrows denote functions obtained P p2 / S t
by composition of two functions spanned by the arc). This
family is also called the S-image of f . f;;S</p>
        <p>The (inverse) composition of span S : A 9 B with a family g : J → B is a family
S;;g : P → A defined by a diagram dual to the above: we begin with pullback of g
and t, which gives us a pair of arrows (q1,q2), and then set S;;g = q1;s. This family
is called the S-preimage of g.
3.4</p>
      </sec>
      <sec id="sec-3-8">
        <title>Composition as navigation</title>
        <p>
          Span composition can be also defined in a more navigational way. Given a span
R : A 9 B, we can represent it as a function φ : A → MSub(B) by represeting an
element a ∈ A as a “family”a∗ :}∗{→ A with a∗ (∗ ) = a, and defining φ (a) = a∗ ;;R.
This is nothing but a special case of the general image-operation described in Def. 7.
The latter can be described as a function φ RM : MSub(A) → MSub(B). Now, if we have
spans R1 : A 9 B and R2 : B 9 C, we represent them as functions φ R1 : A → MSub(B)
and φ R2 : B → MSub(C) resp. Span composition is then represented by the function
composition φ R1.φ RM2, which is not difficult to show equals to φ R1;;R2 (see e.g. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] for an
elementary proof). That is, given an element a ∈ A, its R1;;R2-image (which is a family
(multiset) over C) can be computed either relationally as (a∗ ;;R1);;R2 = a∗ ;;(R1;;R2),
or navigationally as a.φ R1.φ RM2. Note also that the special case when multirelation R1
is not defined on a is well treated without exclusion, because then multiset φ R1(a) will
be empty (i.e., an empty family given by an empty function ∅ : ∅ → B), and φ RM2(∅ ) = ∅ .
        </p>
        <p>
          The description above actually shows that span composition effectively prevents the
NULL-navigation safety issue occurring in many textual languages such as OCL [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
Indeed, we use empty multisets to represent the case of non-existence similarly to
how other languages use NULLs. Any composition with an empty multiset results in
an empty multiset without special treatment. Hence, the navigation is always safe.
4
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Demonstration</title>
      <p>In this section, we demonstrate the usage of our index-based multiconcepts library
in Alloy to model the scenario introduced in Section 2. The commented Alloy code
below is the full model.
1 open multi
2
3 open mrel[Bundle, Item] as Contains
4 open mrel[Item, Category] as BelongsTo
5 open mrel[Bundle, Category] as Result
6
7 abstract sig Bundle {}
// a utility module providing the function drop
// declare a multirelation from Bundle to Item
// declare a multirelation from Item to Category
// to be the result of Contains;;BelongsTo</p>
      <p>// Bundle class
8 one sig B1, B2 extends Bundle {} // Bundle instances
9 abstract sig Item { // Item class
10 belongsTo: set Category // declare an ordinary relation from Item to Category
11 }
12 one sig Bread, Butter, Milk extends Item {} // Item instances
13 fact {
14 Bread.belongsTo = Bakery // Set up belongsTo relation
15 Butter.belongsTo = Dairy
16 Milk.belongsTo = Dairy
17 }
18 abstract sig Category {} // Category class
19 one sig Dairy, Bakery extends Category {} // Category instances
20
21 fact {
22 BelongsTo/liftedFrom[belongsTo] // lift the ordinary relation to a multirelation
23 Result/composedFrom[Contains/get, BelongsTo/get] // perform the multijoin
24 // rules
25 all b : Bundle | #(b&lt;:Result/get:&gt;Dairy) &gt;= 2 // at least two dairy products
26 all b : Bundle | #(b.(drop[Result/get])) &gt;= 2 // at least two categories
27 }
28
29 assert AllHaveBread { // based on our rules, all bundles should contain bread
30 all b: Bundle | some drop[b&lt;:Contains/get].Bread
31 }
32
33 run {} for 6 Contains/Head, 3 BelongsTo/Head, 20 Result/Head
34 check AllHaveBread for 6 Contains/Head, 3 BelongsTo/Head, 20 Result/Head</p>
      <sec id="sec-4-1">
        <title>Listing 1.1: The bundling model</title>
        <p>The goal of this example is to show two typical usages of Alloy: instance finding
and assertion checking. For instance finding, we ask the solver to find instances of the
multirelation Contains (line 3) such that the rules (lines 25-26) are satisfied. Based
on the rules, each bundle must contain some bread; the assertion AllHaveBread
(line 29-31) is to check this fact.</p>
        <p>The module mrel allows for declaring multirelations for the given domain and
range signatures (lines 3-5). Such a multi-relation can be lifted from an ordinary
relation by assigning each tuple in the ordinary relation a unique index. In our
example, the ordinary relation belongsTo declared on line 10 is fixed on lines 13-17.
The multirelation BelongsTo (declared at line 4) represents the same information
as the ordinary relation belongsTo by lifting (line 22).</p>
        <p>We need to multijoin Contains and BelongsTo to observe the categories of
products in each bundle. Line 23 shows how an unconstrained multirelation
Contains is multijoined with BelongsTo; the result is stored in Result. We then use
the built-in domain and range restriction operators &lt;: and :&gt; and the function
drop (drop the indices in a multirelation to make it ordinary) to state the rules
“every bundle must contain at least two dairy products” (we care about multiplicity
in this case) and “every bundle must contain items covering at least two product
categories” (we do not care about multiplicity in this case). Fig. 3 shows a bundling
instance generated by Alloy analyzer which satisfied the two rules.
5</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Index-based multiconcepts library in Alloy</title>
      <p>In this section, we present how we encode the index-based formalization of
multiconcepts introduced in Section 3. We demonstate how we leverage language features of
Alloy to design a multiconcept library following the mathematical framework while
emphasizing clarity and ease of use. The library comes in three parts: two parametric
modules named mset and mrel for declaring multisets and multirelations; one plain
module named multi providing operations over multiconcepts and a few utility functions.</p>
      <sec id="sec-5-1">
        <title>Encoding of family and span</title>
        <p>Family and span are naturally generic structures. By specifying the type of the ground
set in a family or the types of the source and target in a span, we can obtain concrete
family or span of specific types. Such polymorphism is provided by parametric
modules in Alloy. We employ this language feature to encode family and span:
1 module mset [g]
2
3 sig Idx { f : one g }
4 fun get[] : Idx -&gt; one g { f }
1 module mrel [s, t]
2
3 sig Head { sLeg: one s, tLeg: one t }
4 fun get[]: s -&gt; Head -&gt; t
5 { a:s, h:Head, b:t| h.sLeg=a &amp;&amp; h.tLeg=b }</p>
        <p>The first line of both listings declares a parametric module: module mset has
one type parameter g standing for the ground set of a family; module mrel has
two type parameters s and t standing for the source and target of a span.</p>
        <p>Family is encoded by the index set sig Idx, which has a totally-defined
singlevalued relation f (restricted by the quantifier one) to ground set g. Span is encoded
by the head set sig Head, which has two totally-defined single-valued relations
(restricted by the quantifier one) sLeg and tLeg to source set s and target set t.</p>
        <p>A family can be viewed as a totally-defined single-valued binary relation; a span
can be viewed as a ternary relation by constructing a triple in which head set sits
in the middle column along with source and target on the sides. Based on the views
above, each module exposes a function get which returns its internal structure as
a binary or a ternary relation (line 4). Returning them in such form has two benefits.
First, in the module multi, we provide a few generic predicates and functions, which
work on both families and spans. These predicates and functions can be generic
because we represent families and spans as ordinary binary and ternary relations
and we can type the parameters of these predicates and functions as univ-&gt;univ
or univ-&gt;univ-&gt;univ. Second, our encoding of multiconcepts is also automatically
compatible with ordinary sets and relations. For example, a span can be effectively
composed with a total single-valued relation by joining its target leg with that relation.
Since we have returned the span as a ternary relation, the effective composition can
simply be done by the built-in relational join operator (.).
5.2</p>
      </sec>
      <sec id="sec-5-2">
        <title>Encoding of composition</title>
        <p>The very important operation of composition over families and spans is pullback,
implemented as a predicate in the following listing which takes all the sets and
relations in the pullback diagram as input parameters.
1 // in module multi
2 pred pullback[ X1: univ, X2: univ, f1: X1 -&gt; one univ, f2: X2 -&gt; one univ,
3 P: univ, p1: P -&gt; X1, p2: P -&gt; X2 ] {
4 (no X1 or no X2) implies { no P } else {
5 all x1: X1 | all x2: X2 | x1.f1 = x2.f2 implies
6 { one p: P | p.p1 = x1 &amp;&amp; p.p2 = x2 }
7 #P = #(f1.~f2)
8 }
9 }
p1
AO 1
P
f1
p2
/ YO</p>
        <p>f2
/ A2</p>
        <p>In the pullback diagram above, we assume the sets X1, X2 and Y with the
functions f1 and f2 are given. By setting proper constraints, the solver will generate
all correct instances in set P together with functions p1 and p2 to form a pullback
square.</p>
        <p>We first deal with the case when set X1 or set X2 is empty (line 4), which implies
set P is also empty. The constraint in line 5 and 6 simply obeys the definition of
pullback. We iterate the elements in set X1 and X2. If elements x1:X1 and x2:X2
satisfy the condition x1.f1 = x2.f2, it implies that there must be one element p
in set P which makes the pullback diagram commute by applying the constraint
p.p1=x1 &amp;&amp; p.p2=x2.</p>
        <p>This constraint sets a lower bound to the solution space. With this constraint
solely, the solver will generate all the correct instances in set P, and possibly some
garbage instances due to the lack of an upper bound constraint. We set an upper
bound to the solution space using the fact that the cardinality of set P is equal to the
cardinality of the composition result f1.˜f2 according to the definition of pullback.</p>
        <p>We use the pullback predicate to implement the composition. Before we go into
details, we should notice that set P is fresh and disjoint with any other set in the
pullback diagram, this is why pullback cannot be encoded as a function in Alloy
because we cannot return a set that has not been declared before in a function. This
is also the reason that we need to manually declare a multiset or a multirelation to
hold the result of composition.
1 // in module mset [g] 1 // in module mrel [s, t]
2 open multi as m 2 open multi as m
3 // current family is composed from fami &amp; span 3 // current span is composed from span1 &amp; span2
4 4 pred composedFrom[ span1: s -&gt; univ -&gt; univ,
5 pred composedFrom[fami: univ -&gt; one univ, 5 span2: univ -&gt; univ -&gt; t ] {
6 span: univ -&gt; univ -&gt; g ] { 6 let Hd1=span1.head, Hd2=span2.head,
7 let I = fami.idx, Hd = span.head, 7 sLeg1=span1.sleg, tLeg1=span1.tleg,
8 sLeg = span.sleg, tLeg = span.tleg 8 sLeg2=span2.sleg, tLeg2=span2.tleg
9 | some p1: Idx-&gt;I, p2: Idx-&gt; Hd 9 | some p1: Head -&gt; Hd1, p2: Head -&gt; Hd2
10 | pullback[I, Hd, fami, sLeg, Idx, p1, p2] 10 | pullback[Hd1,Hd2, tLeg1,sLeg2, Head,p1,p2]
11 &amp;&amp; f = p2.tLeg 11 &amp;&amp; sLeg = p1.sLeg1 &amp;&amp; tLeg = p2.tLeg2
12 } 12 }</p>
        <p>Predicates composedFrom encodes the composition in Def. 3 and 6 respectively.
They restrict the current family or span to be the result of the composition of the given
input parameters. We use the utility functions to extract the components of a family or
a span. We then apply the pullback predicate on the corresponding sets and relations
(line 10) and get the result family or span (line 11). We rely on skolemization performed
by Alloy which enables higher-order quantification with an existential quantifier to
generate the relations p1 and p2 on the fly without explicitly declaring them (line 9).
5.3</p>
      </sec>
      <sec id="sec-5-3">
        <title>Other multioperations</title>
        <p>In our encoding, multiunion (merge) is simply a union of two families (assuming their
index sets are disjoint):
1 // in module multi
2 fun merge[f1, f2: univ-&gt;one univ] : univ-&gt;one univ { f1 + f2 }</p>
        <p>Since we transform a span to a ternary relation, it is straightforward to implement
the operation inverse by simply flipping the first and third column of the ternary
relation; the built-in domain and range restriction operators are directly applicable
to the returned ternary relation.
1 // flip the 1st and 3rd column
2 fun inverse[span: univ-&gt;univ-&gt;univ] : span {
3 ter/flip13[span]
4 }
5 // For the domain and range restriction, we can use the built-in operators &lt;: and :&gt;</p>
        <p>As discussed in Section 3.3, an ordinary set can be viewed as a multiset by
assigning each element a unique index. In the same way, an ordinary relation can be
seen as a multirelation. Therefore, we can transform an ordinary set or relation to a
multiset or multirelation in our encoding. The operation lift is implemented as follows:
1 // in module mset [g]
2
3 pred liftedFrom[ G: g ] {
4 Idx.f = G &amp;&amp; #Idx = #G
5 }
1 // in module mrel [s, t]
2
3 pred liftedFrom[r: s -&gt; t] {
4 (~sleg).tleg = r &amp;&amp; #Head = #r
5 }</p>
        <p>Conversely, we can transform a multiset or multirelation to an ordinary set or
relation by dropping the indices, which is encoded as follows:
1 // in module multi
2 fun drop[f: univ -&gt; one univ] : univ { rel/ran[f] }
3
4 fun drop[span: univ -&gt; univ -&gt; univ] : univ -&gt; univ {
5 ter/select13[span]
6 }</p>
        <p>We can see that our library fullfils the requirements from Section 2.3: it provides
1) a way to directly declare a multiset or a multirelation; 2) an algebra of operations
over multiconcepts; and 3) the lift and drop operations to control whether the result
is ordinary or multi.
6</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Evaluation</title>
      <p>
        We assess the correctness of our encoding through testing, which uses assertions to
check if the result obtained from the encoded operation is equal to the expected result
we manually compute. By running the assertions in Alloy, if a counter-example is
found, that means the encoding of the operation is incorrect, otherwise, the encoding
passes the test. All the encoding of operations have passed the tests which increases
the confidence that the implementation is correct, and we have released the test suite
together with the library [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
One of the core of the Alloy reasoning framework is instance finding, but dealing
with multiconcepts significantly increases the search space. In a nutshell, suppose we
are given sets A,B and a relationship R : A 9 B. If R is ordinary, then for every pair
(a,b) ∈ A× B, there are only two possibilities: either (a,b) ∈ R and hence there is a link
from a to b, or R does not include such a link. In contrast, if R is a multirelation, for
every pair (a,b) there are infinitely many possibilities for the number of links from
a to b. Hence, multiplicity constraints play an important role in narrowing the search
space in order to gain reasonal performance on reasoning.
      </p>
      <p>We use the bundling model in Sect. 4 as the benchmark model. By manually
controlling the scope of Contain/Head (Line 33), which sets an upper bound of the
multiplicity of multirelation Contain, we investigate how the performance of Alloy
analyzer’s instance finding on the model relates with the multiplicity of a multirelation.</p>
      <p>The results are shown in Table 1 and the visualization is shown in Fig. 4. The
data includes the size of the model’s CNF encoding (vars and clauses) and the total
analysis time. The Alloy analyzer is configured to use the MiniSAT solver and runs
on a 2.4Ghz Core i7 machine with 8GB RAM.</p>
      <p>
        According to the diagram, the size of the model increases linearly with the scope
of Contains/Head. The analysis time is fluctuating with the increase of scope but
the overall analysis time is momentary when the scope is under 20. This result shows
our Alloy multiconcept encoding is practical given a reasonable scope.
Translating UML class diagrams with OCL constraints (involving bags) into Alloy
has been studied in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In the former paper, bags/multisets were encoded
in Alloy as sequences, which is inadequate as bags are not ordered. Neither of the
papers considered translation of multirelations, which, as our example in Sect. 2
shows, restricts the practical applicability of the studies.
      </p>
      <p>
        From the Alloy world side, the closest related work is a question: “Are there
multisets in Alloy?” [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] on the website Stack Overflow. The selected answer to this
question proposes to represent a multiset as a function from a set of elements to the
natural numbers. For a given element, the function returns the multiplicity of the
element. The answer also provides implementation of merge, intersection, and difference,
which compute the new multiplicities of the results (add the multiplicities for the
merge, subtract them for difference, and take their minimum for intersection). The
answer does not provide any information about multirelations and other operations
over them such as composition.
      </p>
      <p>
        There is a large body of work about multisets treated numerically via multiplicities.
Blizard presents a historical survey of the multiset theory [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], and traces back the
idea of multisets as families to [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. A category theoretic treatment of multisets and
multirelations can be found in, resp., [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
We have shown that simple categorical notions of family, span, and their composition
via pullback can bridge the gap between graphical multirelational modeling and textual
modeling in Alloy. Alloy users can now use multiconcepts in their models by importing
our library module. There is previous work about transforming Alloy model to SMT [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
Our encoding is also a blueprint for implementing multiconcept support for
Satisfiability Modulo Theories (SMT) solvers, whereby only total functions are available.
      </p>
      <p>
        In the future, we will use the presented encoding to implement support for
multiconcepts as first-class constructs in the modeling language Clafer [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which uses
Alloy as one of the backends. In Clafer, modelers can directly declare multisets and
multirelations; however, both the syntax of Clafer has to be extended with the new
operators over multi-concepts and the translation to Alloy has to implement the
correct semantics.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. Are there multisets in Alloy? (
          <year>2014</year>
          ), http://stackoverflow.com/questions/ 23579928/are-there
          <article-title>-multisets-in-alloy, last accessed in Aug 2016</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Anastasakis</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bordbar</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Georg</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ray</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>On challenges of model transformation from UML to Alloy</article-title>
          .
          <source>Software &amp; Systems Modeling</source>
          <volume>9</volume>
          (
          <issue>1</issue>
          ),
          <fpage>69</fpage>
          -
          <lpage>86</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bką</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Diskin</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antkiewicz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Czarnecki</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wsąowski</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Partial instances via subclassing</article-title>
          .
          <source>In: International Conference on Software Language Engineering</source>
          . pp.
          <fpage>344</fpage>
          -
          <lpage>364</lpage>
          . Springer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bką</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Diskin</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antkiewicz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Czarnecki</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wsąowski</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Clafer: unifying class and feature modeling</article-title>
          .
          <source>Software &amp; Systems Modeling</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Barr</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wells</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Category theory for computing science</article-title>
          , vol.
          <volume>49</volume>
          . Prentice Hall New York (
          <year>1990</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Blizard</surname>
          </string-name>
          , W.D., et al.:
          <article-title>The development of multiset theory</article-title>
          .
          <source>Modern logic 1</source>
          (
          <issue>4</issue>
          ),
          <fpage>319</fpage>
          -
          <lpage>352</lpage>
          (
          <year>1991</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Bruni</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gadducci</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Some algebraic laws for spans</article-title>
          .
          <source>Electr. Notes Theor. Comput. Sci</source>
          .
          <volume>44</volume>
          (
          <issue>3</issue>
          ),
          <fpage>175</fpage>
          -
          <lpage>193</lpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>El</given-names>
            <surname>Ghazi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.A.</given-names>
            ,
            <surname>Taghdiri</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          :
          <article-title>Relational reasoning via smt solving</article-title>
          .
          <source>In: International Symposium on Formal Methods</source>
          . pp.
          <fpage>133</fpage>
          -
          <lpage>148</lpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Maoz</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ringert</surname>
            ,
            <given-names>J.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rumpe</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>CD2Alloy: Class diagrams analysis using alloy revisited</article-title>
          .
          <source>In: MoDELS</source>
          , pp.
          <fpage>592</fpage>
          -
          <lpage>607</lpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Monro</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>The concept of multiset</article-title>
          .
          <source>Mathematical Logic Quarterly</source>
          <volume>33</volume>
          (
          <issue>2</issue>
          ),
          <fpage>171</fpage>
          -
          <lpage>178</lpage>
          (
          <year>1987</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Rado</surname>
            ,
            <given-names>R.:</given-names>
          </string-name>
          <article-title>The cardinal module and some theorems on families of sets</article-title>
          .
          <source>Annali di Matematica pura ed applicata 102(1)</source>
          ,
          <fpage>135</fpage>
          -
          <lpage>154</lpage>
          (
          <year>1975</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Diskin</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antkiewicz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Czarnecki</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Modeling and reasoning with multisets and multirelations in Alloy</article-title>
          .
          <source>Tech. Rep. GSDLAB-TR 2016-01-22</source>
          , Generative Software Development Lab, University of Waterloo (
          <year>January 2016</year>
          ), http://gsd.uwaterloo.ca/node/658
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Willink</surname>
          </string-name>
          , E.D.:
          <article-title>Safe navigation in ocl</article-title>
          .
          <source>In: OCL 2015-15th International Workshop on OCL and Textual Modeling: Tools and Textual Model Transformations Workshop Proceedings</source>
          . p.
          <volume>81</volume>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>