The Online Abstraction Problem for Euler Diagrams Gennaro Cordasco1 , Rosario De Chiara2 , and Andrew Fish3 1 Dipartimento di Psicologia – Seconda Università di Napoli, ITALY, gennaro.cordasco@unina2.it 2 ISISLab, Dipartimento di Informatica – Università di Salerno, ITALY dechiara@dia.unisa.it 3 School of Computing, Engineering and Mathematics – University of Brighton, UK Andrew.Fish@brighton.ac.uk Abstract. A Euler diagrams are an accessible and effective visualisation of data involving simple set-theoretic relationships. Efficient algorithms to quickly com- pute the abstract regions of an Euler diagram upon curve addition and removal have been developed, but a strict set of drawing conventions (called wellformed- ness conditions) were enforced, meaning that some abstract diagrams are not representable as concrete diagrams. We present a variation and extension of the methodology which enables region computations for Euler diagrams under the relaxation of several drawing conventions. We provide complexity analysis and compare with the previous methodology. The algorithms are presented for generic curves, allowing for specialisations such as utilising fixed geometric shapes for curves that often occur in applications. 1 Introduction Venn [21] and Euler diagrams are a well known representation of sets and their re- lationships. Venn diagrams have had significant theoretical interest from the likes of Grünbaum and Hamburger in recent times; a detailed survey of Venn diagrams can be found in [16]. Euler diagrams are the modern incarnation of Euler circles [3], first in- troduced for the purposes of syllogistic reasoning. Whilst Venn diagrams ensure that every region determined by being inside some contours and outside the other contours is nonempty, Euler diagrams generalise Venn diagrams by relaxing this condition. This allows them to specify subset relations and disjointness relations amongst sets without any extra cognitive load since these semantic relationships are well-matched the spatial relationships of containment and disjointness [11, 17]. In a practical setting, Euler diagrams appear frequently in various application do- mains. For example, they have been used in biological areas for representing complex genetic set relations in [14], in computer-based resource management scenarios in [2], and in the information retrieval/visualisation context to depict the numbers of results of collections of library database query results in [20] and in network visualisation [15]. Euler diagrams, together with diagrammatic inference rules, form a diagrammatic logic, and comparisons of the effect of the choice of inference rules on automated searches for minimal proofs within Euler diagram-based reasoning systems [18] has been in- vestigated. There are many variations of the basic system, and they have also been incorporated into heterogeneous reasoning systems [19]. More complex diagrammatic logics such as Spider [12] or Constraint diagrams [4, 13] build on the underlying Eu- ler diagram logic, adding more syntax in order to increase the expressiveness of the languages. 3rd International Workshop on Euler Diagrams, July 2, 2012, Canterbury, UK. Copyright ⃝c 2012 for the individual papers by the papers’ authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. 63 Motivation. For any computer-based applications there is a natural disparity between the concrete level information that the user perceives and manipulates (the drawn or concrete diagrams) and the abstract information that the system requires or manipulates (the abstract models or abstract diagrams). Many important computations of the system tend to be defined at this abstract level. For instance, if one wished to present the se- mantics of a user-constructed diagram then the system needs to perform computations such as to identify the regions present in the diagram, to compute the set intersections that they represent, and to combine these into some set-theoretic statement. In a more general sense an efficient way to calculate the abstract diagram is also useful for the comparison of concrete diagrams. The efficient computation of the abstract model from a given concrete diagram, together with the ability to update the abstract model upon concrete changes such as curve addition, removal, translation and resizing represent an important challenge to be addressed. The relaxation of the drawing constraints is a significant extension because under these relaxed constraints, every abstract diagram has a concrete diagram repre- senting it [5]. Furthermore, for dynamic diagrams (e.g. sequences of diagrams con- structed during interactive user construction or the presentation of evolving data sets) it permits the temporary relaxation of the chosen set of drawing conventions imposed for a diagram in the sequence. This enables a natural construction or presentation, assisting in preserving a user’s mental map. Contribution and paper outline. In this paper we provide a solution to the on-line abstraction problem: compute the abstraction of a concrete Euler diagram (i.e. a drawn diagram), keep track of the concrete and abstract diagrams, and enable the automatic update of the abstract diagram upon concrete level manipulations. The algorithms pre- sented in [8] solved this problem for the wellformed diagrams of [5], but here we pro- vide a solution for the more general case in which several well-formedness conditions are relaxed, enabling much greater utility and flexibility. The algorithms have also been extended to permit further relaxation of the wellformedness constraints, enabling the processing of ‘generalised Euler diagrams’ (representing sets as unions of regions with holes), ensuring that any abstract diagram has a concrete realisation. However, this fur- ther extension is not presented in this paper due to space constraints. 2 Preliminaries We provide a definition of Euler diagrams, separating the abstract and concrete models as usual, together with the set of wellformedness conditions considered. Specifically, we incorporate some of the wellformedness conditions of [5] and [10] that are simplicity of contours (no self-intersection) and uniqueness of contour labels, into the main definition of concrete Euler diagram, which enables an omission of labels since the contours can be uniquely identified. Definition 1. A concrete Euler diagram is a pair d = ⟨C, Z⟩ where C is a set of simple closed curves, called (concrete) contours, in the plane and Z is the collection of (con- crete) zones z determined by being inside a set of contours Xz ⊆ C and outside the rest of the contours. That is, ∩ ∩ z= int (c) ∩ ext (c) , c∈Xz c∈C−Xz 64 for each Xz ⊆ C, provided this region is non-empty. Here int(c) and ext(c) denote the interior and the exterior of c, respectively, and the set Xz is called ∪ the zone descriptor for z. A minimal region of d is a connected component of R2 − c. c∈C Fig. 1. Nonwellformed Euler diagrams, breaking WF1a, 1b, 2 and 3, resp., from left to right. We say d is wellformed if the following wellformedness conditions (WFCs) hold (see Fig. 1 ): WF1 Transverse intersections: Contours that intersect do so transversely. This can be subdivided into: WF1a No tangential intersections. WF1b No concurrency (distinct contours meet at a discrete set of points). WF2 No multiple points: At most two contours can intersect at any given point. WF3 Connected concrete zones: Each concrete zone is a minimal region. An abstract Euler diagram (see Definition 2) is an abstraction (see Definition 3) of a concrete diagram. We overload the term zone, using it for the concrete zones, which are regions of the plane, as well as for abstract zones, which are the sets of containing contours of that region (or the labels of those contours in the generalised case); the context determines which is meant. Let PX denote the powerset of set X. Definition 2. An abstract Euler diagram is a pair: d = ⟨C(d), Z(d)⟩ where: C(d) is a ∪ from some alphabet L. The set of finite set of labels, called (abstract) contours, drawn (abstract) zones of d is Z(d) ⊆ PC(d), where z = C(d). z∈Z(d) Definition 3. Let d be a concrete Euler diagram and d′ an abstract Euler diagram. If there is a bijection between C(d) and C(d′ ) that induces a bijection between Z(d) and Z(d′ ), then d is said to be a realisation of d′ , and d′ is the abstraction of d. An abstract Euler diagram d′ is drawable if there is a realisation of d′ as a concrete Euler diagram. By convention, each concrete Euler diagram contains a zone o, called the outer zone, which is exterior to all the contours (that is, Xo = ∅). The left of Fig. 2 shows a concrete Euler diagram containing four contours (A, B, C, D). The zone descriptors for the concrete zones are graphically depicted in the right hand side of the figure; these sets can be viewed as the abstract zone set. We need terminology relating to the important operations of the addition and re- moval of the contours of an Euler diagram. Definition 4. Let d = ⟨C, Z⟩ be a concrete Euler diagram with A ∈ / C and B ∈ C. Let d + A and d − B denote the concrete Euler diagrams obtained by the addition of a new 65 Fig. 2. (a) A wellformed concrete Euler diagram, with contour identifiers (or labels); (b) a depic- tion of the zone descriptors. contour A to d and the removal of contour B from d, respectively. A region r of d is a union of minimal regions; it is: (i) a covered region (or is covered by A) if r ⊂ int(A) in d + A; (ii) split by A (a split region) if r ∩ int(A) ̸= ∅ and r ∩ ext(A) ̸= ∅ (i.e. r is partially covered by A). Analogously, a zone z of d is a covered zone (respectively a split zone) when it is covered (respectively partially covered) by A. Fig. 3 shows an example of contour addition. We observe that the zone described by {C} is split by the contour A but neither of its two minimal regions is split by A. A point of intersection x between two curves c1 and c2 is called a crossing point. We denote with Cross(d) the set of all the crossing points generated by contours in d. Fig. 3. An example of contour addition: (a) A non wellformed diagram d = ⟨{B, C}, {∅, {B}, {C}, {B, C}}⟩. The crossing points of d are shown with filled-in dots; (b) The crossing points of A with d (i.e. those in Cross(A)) are depicted as hollow dots. The set of all hollow and filled-in dots depicts the set of crossing points of d + A. 3 Computing the abstraction of Euler Diagrams The main problem that we address is the following, with variations according to the choice of wellformedness conditions imposed. Abstraction Update Problem Let d = ⟨C, Z⟩ be a concrete Euler diagram and d′ = ⟨C ′ , Z ′ ⟩ the abstraction of d. Let A ∈/ C and B ∈ C. Efficiently compute the abstractions of d + A and d − B. 66 In [6, 8] the single marked point approach (SMPA) (described below) was pre- sented, computing the abstraction for wellformed Euler diagrams, following an online approach where diagrams are viewed as a sequence of contour additions and removals. Other operations such as translation or resizing of a contour can be easily simulated by the addition and removal operations, and so such algorithms are applicable in a wider context. In this paper, we present an evolution of these algorithms, adopting the multiple marked point approach (MMPA) (described below), permitting the relaxation of condi- tion WF3 so that Euler diagrams whose zones are disconnected can be processed. First of all, we recall from [8] that the set of zones split by A, due to the addition of a contour A to (or its removal from) a given wellformed Euler diagram d, can be computed using the following observation. Observation 1 Let d be a wellformed Euler diagram and let {x0 , x1 , . . . , xm−1 } be all of the crossing points that we meet as we traverse the contour A from an arbitrary point on A. Then: (i) For each i = 0, . . . , m − 1 each arc (xi , xi+1 mod m ) splits one zone (note that two arcs can split the same zone but one arc cannot split more than one zone) of d. (ii) Two consecutive arcs (xi , xi+1 mod m ) and (xi+1 mod m , xi+2 mod m ) split two zones such that their zone descriptors differ by exactly one contour (the contour which inter- sects with A generating the crossing point xi+1 mod m ). Fig. 4. A schematic diagram which illustrates Observation 1. The addition of A generates six new crossing points shown with small blobs. The point x01 is an arbitrary point of the arc (x0 , x1 ) used to compute an initial zone descriptor for the zone split by arc (x0 , x1 ), whilst subsequent zone descriptors are computed using Observation 1. For the wellformed diagram case of [8], we adopted the SMPA where each zone z of d has a single point mp(z) ∈ R2 associated to it, where mp(z) lay in the boundary of the closure of the zone z (with the possible exception of the marker for the outside zone). These points keep track of the zone sets, and were used to update these sets according to their relationships with contours that are added or removed from a diagram. Then, the zones that are not split by A are covered by A if and only if mp(z) belongs to the interior of A. The SMPA is illustrated in Figure 5: each minimal region is marked by a single marked point (an arrowed dot indicates a marked point, the arrow indicating the 67 minimal region which is marked); additional marked points, or pseudo-crossing points, are used to mark the outside zone and any contour which has no crossing points, either in d or at some stage during its incremental construction (e.g. see the marked point for zone {B, D, E} in Figure 5). However, the SMPA is not sufficient to deal with the Euler diagrams with disconnected zones. In this case, there are two ways of splitting a zone: (i) a zone is split when one of its minimal regions is split by A; (ii) a zone is split when some of its constituent minimal regions are covered by A and some are not. To address case (ii) one can consider associating one marked point to each minimal region of the diagram. Figures 5 (c) illustrates a generalization of the case of [8] where each minimal region is associated with one marked point. Then, if a zone z has no minimal regions which are split by A (i.e. case (i) does not hold) we can analyse the relationships of the marked points with A to classify z as split by A, covered by A, or neither. In particular, if all of the marked points of z belong to int(A) then z is a covered zone, whilst if some of the marked points of z belong to int(A) while others do not, then z is a split zone, according to (ii) above. Fig. 5. Single marked point approach (SMPA): in (a) each zone of a wellformed Euler diagram is marked by a single point; (b) shows in grey the zone {B, C} and its marked point; in (c) a non wellformed Euler diagram (WF3 relaxed) with a zone {B}, shown in grey, which consists of two minimal regions, therefore requiring at least two marked points. However, the management of marked points (taking one for each minimal region) for non-wellformed diagrams (relaxing WF3) raises some tricky problems such as: if a zone z becomes split upon contour addition or deletion, how can one efficiently find a marked point for each of the minimal regions that comprise z? For example, Figure 6 shows two parallel examples which adopt the SMPA (using the algorithm of [8]) in which only the order of contour addiction has been varied. Whilst one of these gives a valid solution, the other does not. In detail, the first two steps (a) and (b) in the figure represent the addition of the first two contours (E and C) to the diagram. Then the two cases are depicted: on the left we add first contour D and then contour B, whilst on the right we first add B and then D. In the first case (on the left) the association between the marked points and minimal regions is correct, with the two minimal regions of zone {B} marked by points z0 and z2 . However, in the second case (on the right) the association is incorrect: there are two marked points associated to the same minimal region (top, shaded) and no marked point associated the other minimal region (bottom, shaded). The problem is that, in general, there is no easy way of discriminating between the case on the left from the case on the right; by just analyzing the relation between a marked point and the contours it is possible to discriminate between zones and not minimal regions. 68 Fig. 6. The influence of the order of contour addition on the marked points/minimal region asso- ciation, by using the algorithms of [8]. The dotted contour is the one that is going to be added to the current diagram. We avoid such problems by adopting the MMPA in which each zone is associated with a set of marked points (which comprises the set of crossing points laying on its boundary). This approach requires the tracking of a larger number of marked points but we accept this trade-off against a simpler marked point management (also making implementation easier), since when a zone is split, we just need to correctly partition the set of its marked points. The MMPA is illustrated in Figure 7: a set of points marks each minimal region r, including all of the crossing points on the boundary of r. Thus, each zone has marked point set including all of the crossing points laying on its boundary (i.e the boundaries of its constituent minimal regions). In the specific case in Figure 7, each marked point marks one, two or four zones. In the following we define a procedure for contour addition (with contour removal omitted for space reasons), which satisfies: Theorem 1. Let d = ⟨C, Z⟩ be an Euler diagram, satisfying WF1 and WF2 (i.e. with WF3 relaxed). Then i) If A ∈ / C, then the procedure NewContour(d,A) computes the new collection of zone descriptors for the zones Z ′ of d′ = ⟨C ∪ A, Z ′ ⟩. ii) If B ∈ C, then the procedure DeleteContour(d,B) computes the new collection of zone descriptors for the zones Z ′ of d′ = ⟨C − B, Z ′ ⟩. Moreover, both procedures: 1. compute, for each zone z ∈ Z ′ − {z0 }, where z0 is the zone in the exterior of all contours in d′ , the set of marked points of the zone, M P (z), that is comprised of the 69 Fig. 7. Multiple marked point approach (MMPA): in (a) each zone is marked by points including all of the crossing points belonging to its boundary; (b) shows in grey the zone {B, C} and its marked points; (c) a non wellformed Euler diagram (WF3 relaxed) with a zone {B}, shown in grey, comprised of two minimal regions, utilising eight marked points. set of all crossing points (or pseudo-crossing points) of d′ belonging to the closure of z. There is a single marked point mp(z0 ) in the exterior of all of the curves of d′ . 2. have running time O(|Z| + |Cross(d)| log(|Cross(d)|)). 3.1 The algorithms Initially, we consider the key case of Euler diagrams with WF3 relaxed (but WF2 and WF1 enforced). Subsequently, we will provide the ideas enabling the relaxation of WF2 and WF1. For space reasons we present only the algorithm for contour addition (omitting the algorithm for contour removal). Both algorithms are based on two auxil- iary algorithms, ComputeContourRelationships (which computes the relationship of a contour with the other contours in a diagram, and updates the marked point sets) and ComputeSplitRegions (which computes the zone descriptors of the split zones using Observation 1). Definition 5. Let d = ⟨C, Z⟩ be a concrete Euler diagram and A a contour which is not in C. Let Over(A) denote the collection of all of the contours in C that properly overlap A; that is Over(A) = {c ∈ C | int(A) ∩ int(c) ̸= ∅}; Cont(A) denote the collection of all of the contours in C that properly contain A; that is Cont(A) = {c ∈ C | c ∈/ Over(A) and int(c) ∩ int(A) = int(A)}. For example, Figure 3 (a) shows diagram d and Figure 3(b) shows the addition of contour A to d. We have Over(A) = {B, C}, Cont(A) = ∅ and Cross(A) contains the four crossing points between A and the contours in C. The methodology adopted makes use of the following low level computations, and we assume that, given two contours A and B of an Euler diagram with WF3 relaxed, we can quickly find: 1. the relationships between A and B; that is if A and B properly overlap, or if one contains the other; 2. their crossing points (if A and B properly overlap); 3. the relationship between any given point x ∈ R2 and A; that is whether x belongs to A, int(A) or ext(A). 70 Fig. 8. The addition of contour A splits eight minimal regions determined by the eight arcs that comprise A. However, it splits nine zones, eight of which are the distinct zones containing the eight minimal regions that are split. The ninth zone {B} is split, without splitting any of its constituent minimal regions, since one of its minimal regions is covered by A but the other is not. Placing restrictions on the geometric shapes used for contours (which is common in some applications) can enable particularly fast computations. For example, if each contour is a simple geometric shape, such as a circle or an ellipse, these computations reduce to solving a system of two quadratic equations (1 and 2) and a quadratic equa- tion (3), which can be computed very quickly (with different methods having different time/precision tradeoffs). Algorithm ComputeContourRelationships: (i) computes the relationship between the contours present in a diagram d (with WF3 relaxed) and a contour A; (ii) updates the set of crossing points of d. We will refer to Fig. 8 to assist with the explanation of the algorithms. Consider the addition of the dashed contour A to the diagram in Fig. 8 without A. After the execution of Algorithm ComputeContourRelationships we have Cont(A) = ∅, Over(A) = {B, C, D, E} whilst Cross(A) is the set of eight crossing points created by the addition of A. Algorithm ComputeSplitRegions uses the sets output by Algorithm ComputeCon- tourRelationships and calculates the collection of zone descriptors for the zones that contain a minimal region of d − A split by A. The crossing points of A can be used to decompose A into a set of arcs. This algorithm computes the zone descriptors of all of the zones of d − A that have at least one of their minimal regions split by A. In detail, the arcs are analysed in the sequence that they are met as one traverses the contour; see Fig. 4 for an example. The region that is split by the first arc (x0 , x1 ) is determined the set of contours that properly contain A and then by checking which of the contours that properly overlap with A contains the arc. Each successive region that is split is calculated by computing the difference with the previously computed region; this idea was present in Observation 1. Contour addition. Algorithm 1 updates the collection of zone descriptors upon the addition of a new contour A to a diagram d. There are two cases to consider. Firstly, if Over(A) = ∅ then no new crossing points are created by the addition of A, and so A forms a new connected component. Thus, A splits only the zone described by contours 71 Algorithm 1: NewContour(d, A) Input: An Euler diagram d = ⟨C, Z⟩ and a contour A such that A ∈ / C. Output: Z ′ , the collection of zone descriptors of d′ = ⟨C ∪ {A}, Z ′ ⟩. 1: (Cont(A), Over(A), Cross(A)) := ComputeContourRelationships(d, A) 2: if Over(A) = ∅ then // A does not properly overlap any contour present in C 3: s := Cont(A) // s is the zone split by A 4: Zs := {s} // Zs is the set of zones having a minimal region split 5: s.points := any point in A // the marked point for s 6: else 7: Zs := ComputeSplitRegions(d, A, Cont(A), Over(A), Cross(A)) 8: Z ′ := Z 9: forall z ∈ Zs do 10: s := z // s is the old zone 11: n := z ∪ {A} // n is the new zone 12: Ms := Mn := MA := ∅ 13: forall x ∈ s.points do 14: switch do 15: case x ∈ int(A) // the point x marks n 16: Mn := Mn ∪ {x} 17: case x ∈ ext(A) // the point x marks s 18: Ms := Ms ∪ {x} 19: case x ∈ A // the point x marks both s and n 20: MA := MA ∪ {x} 21: s.points := Ms ∪ MA 22: n.points := Mn ∪ MA 23: Z ′ := Z ′ ∪ {n} // The new zone n is added to the collection of zones of the diagram 24: forall z ∈ Z − Zs do 25: Mint := Mext := ∅ // Mint and Mext record the marked points of z that are in the interior and the exterior of A, respectively 26: forall x ∈ z.points do 27: if x ∈ int(A) then 28: Mint := Mint ∪ {x} 29: else 30: Mext := Mext ∪ {x} 31: if Mext = ∅ then // if z is covered by A then z should be removed 32: Z ′ := Z ′ − {z} 33: else 34: z.points := Mext 35: if Mint ̸= ∅ then // if z is split or covered by A then a new region should be added 36: n := z ∪ {A} 37: n.points := Mint 38: Z ′ := Z ′ ∪ {n} 39: return Z ′ 72 in Cont(A) (in Fig. 2 (a), contour D splits only the outer zone, exterior to all other contours, for example). Secondly, if Over(A) is not empty, then A splits several zones (contour A in Fig. 8 splits several zones, for example). Algorithm 1 computes the split zones in two steps: (i) the split zones which contain a split minimal region are computed by Algorithm ComputeSplitRegions; (ii) the split zones which do not have any split minimal regions are computed, together with the set of covered zones, by analysing the relationship between the collection of marked points of the zones and the contour A. For instance, in Fig. 8, the zones having a minimal region split by A are {∅, {C}, {B, C}, {B, C, E}, {B, E}, {B, D, E}, {B, D}, {D}} while the zone {B} is split by A even though neither of its two minimal regions are split by A since one of them is covered by A and the other is not. In detail, when Over(A) is empty, A does not properly overlap any of the contours in d and it does not generate any new crossing points. In this case there is exactly one zone of d split, defined by Cont(A), and any point on A can be chosen as the marked point for both of the zones of d+A that are described by Cont(A) and Cont(A)∪{A}, referred to as the old zone and the new zone respectively (lines 3 − 5). Thereafter the algorithm considers the marked points of each zone of d which has a minimal region that is split by A (line 7). The variable s refers to the zone in d that is split as well as the old zone that this becomes upon the addition of A in d′ , the variable n refers to the new zone that is created in d′ from s which is also inside A. For each such marked point x, the algorithm checks if x is a marked point for just the old zone, just the new zone or both (lines 13 − 20) in d′ (recorded using Ms , Mn and MA respectively). The new zone n is added to the collection of zones of the diagram (line 23). Subsequently (line 24) the algorithm checks the remaining zones (i.e. those that do not contain any split minimal regions) looking for covered or split zones of d. This is performed by verifying the relationship between the marked points of the zone z ∈ Z − Zs and A (lines 26 − 30). In particular, if all of the marked points belong to int(A) (i.e., Mext = ∅) then the zone z is covered by A and so the old zone z is removed from the diagram (lines 31 − 32). Moreover, if at least one marked point belongs to int(A) then the zone z is either split or covered and so a new zone is generated and added to the diagram (lines 35 − 38). 3.2 Relaxing the wellformedness condition WF2 and WF1 WF2. We extend the algorithms to also handle crossing points with multiplicity greater than 2 (i.e., more than two contours crossing transversely at a given point). For Euler diagrams with WF3 relaxed, we used Observation 1 in Section 3 to compute the set of zones split by the addition (or removal) of a contour A. Although Observation 1 (i) holds for Euler diagrams with WF2 also relaxed, part (ii) does not hold if the crossing point xi+1 mod m has multiplicity greater than 2. Fig. 9 (left) presents a schematic diagram, similar to that in Fig. 4, in which there is a crossing point, x2 , with multiplicity 3. Observation 2 provides the modification of the strategy to deal with diagrams that relax WF2 (as well as WF3). 73 Fig. 9. (left) A schematic diagram illustrating the need for Observation 2 when WF2 is relaxed (c.f. Fig. 4). (right) The addition of A, yielding a diagram with WF1a relaxed. We have four transverse crossing points (shown as filled dots) and one tangential intersection point (shown as a hollow dot). Observation 2 Let d be an Euler diagram with WF3 and WF2 relaxed. Let {x0 , x1 , . . . , xm−1 } be all of the crossing points that we meet as we traverse the contour A from an arbitrary point on A. If there are exactly ℓ ≥ 1 contours crossing A transversely at a point xi+1 mod m then the zone descriptors of the zones that are split by the arcs (xi , xi+1 mod m ) and (xi+1 mod m , xi+2 mod m ) differ by exactly ℓ contours, and these are the contours that intersect with A comprising the crossing point xi+1 mod m . The algorithms in Section 3.1 can now be altered to deal with the relaxation of WF2. WF1. Firstly, we relax WF1a. Since tangential intersection points do not affect the zones which are split by the corresponding arc (see Fig. 9 (right)), we adapt the algo- rithm to deal with tangential intersections by simply ignoring them. Hence even the relaxion of WF1a is straightforward. Secondly, we relax WF1b, allowing concurrency. For this case, we assume that, given two contours A and B of an Euler diagram, we can quickly: (i) check whether they meet in a concurrent arc (i.e. a maximal non-discrete set of points of intersection). (ii) check whether a concurrent arc is tangential (i.e. if a homotopy of the concurrent arc to a point would leave a tangential intersection point; see Figure 10 left) or transversal (i.e. if a homotopy of the concurrent arc to a point would leave a transverse crossing; see Figure 10 right) and (iii) find the split points (the points where two contours that meet in a concurrent arc separate). The split points will play essentially the same role as the crossing points within the extended algorithms: they will be used as marking points for zones and to compute the set of zones which are split by the addition of a new contour A (noting that the crossing points are also still used, as before). There are two cases to consider: (i) tangential concurrent arcs and (ii) transversal concurrent arcs. For a tangential concurrent arc, both of the split points of that arc do not affect the zones which are split (see Figure 10 left), and so we treat such points in the same manner as tangential intersections; For a transversal concurrent arc, the first split point that we encounter as we traverse the contour A does not affect the zone which is split, whilst the corresponding second split point does affect the zone which is split (see Figure 10 right). Therefore, the first point will be treated as a tangential intersection while the second one will be treated as a transverse crossing. 74 Fig. 10. The addition of A, yielding diagrams with WF1b relaxed. (left) The contour A creates two transverse crossing points, x0 and x2 , and one tangential concurrent arc between points x1 and x3 (shown as hollow dots). (right) The addition of contour A create three transverse crossing points, x0 , x2 and x4 , and one transverse concurrent arc between points x1 and x3 (shown as hollow dots). 3.3 Timing We provide complexity analysis for the MMPA for Euler diagrams. The invocation of procedure ComputeContourRelationships analyses the relation- ship between A and each contour in C and updates the set of crossing points, taking time O(|C| + |Cross(d)| log(|Cross(d)|)). Then, if there are intersection points, the procedure ComputeSplitRegions com- putes the set Zs of zones having a split region and, for each such zone updates the set of marked points. Hence the procedure ComputeSplitRegions requires O(|C| + |Cross(d)| log(|Cross(d)|) steps. Finally, for contour addition, lines 9 − 23 and 24 − 38 respectively compute the col- lection of marked points for zones having, and not having, split minimal regions, respec- tively. We can compute this two collections in O(|Z| + |Cross(d)|) steps. Collectively, algorithm NewContour operates within time O(|Z| + |Cross(d)| log(|Cross(d)|)). 4 Related work and conclusion The relationship of the Euler diagram abstraction problem with arrangements of Jordan curves in the plane [9] was discussed in [6]. We note that in our approach we do not need to store or manipulate graphs, but we work directly with the diagrams utilising its in- tersection points and the domain specific data structures, and we obtain a methodology with a straightforward means of implementation. In [22], an application is presented that interprets an Euler Diagram sketched with a pen or a mouse, and calculates the abstract diagram. The authors claim complexity that is asymptotically similar to ours, but this claim is not substantiated, with the paper not providing details of how the Zone List Refinement Step is performed. The only apparent method that would be effective with the generality that they describe is a pixel based inspection of the drawing (commonly available in programming languages) but which has the drawback of being dependent on the resolution of the image. Our methodology has the added advantage of being more general in that it is not dependent on the image of the contours, but only on their analytical representation. In [23], a methodology is provided which takes a set of polygons (meaning regions determined by sets of non-overlapping curves) and outputs a set of non-overlapping 75 polygons, which is essentially the boundary of a zone of the diagram in our termi- nology; this enables the computation of polygon operations such as union, intersection, difference and clipping. A graph based representation is constructed which consists of a binary tree structure, encapsulating the structure of non-overlapping contours, together with a winged-edge data structure which captures sets of polygons as a graph (indi- cating the intersection points) and provides a simple means of traversing faces of the graph. Their algorithm “corrects” input containing degeneracies (e.g. zero area contours or coincident edges, meaning concurrency), whereas we wish to develop a method that explicitly considers them. For ‘diagrams’ that consist of more than one connected com- ponent, they compare each output contour with the others to determine if one is inside the area of another or if they co-exist within the same area, and they record these contour relationships in the hierarchical tree structure of the contours. Our approach (utilising marked points) provides removes the need to compute these graph structures and then to operate on them, whilst providing a means to explicitly capture the ‘singularities’ that may occur within certain contexts or application domains. In [1], they prove that the Grünbaum encoding uniquely identifies simple Venn dia- grams (i.e. they are wellformed) which are monotone and polar symmetric, and develop an algorithm utilising a matrix representation to enumerate the monotone simple sym- metric 7-Venn diagrams. The codes considered in the paper rely upon numbering the curves (adopting certain conventions based on the curve segment in other and inner faces to fix the choices) and for a given curve recording the sequence of curve numbers that are given as one traverses the curve. Our methodology applies to a much wider class of diagrams, but investigating the computation of encodings from the data struc- tures utilised in our algorithms is an interesting line of future investigation. In this paper, we have developed a new methodology, called the multiple marked point approach, which enabled us to develop algorithms to solve the Euler diagram ab- straction problem for nonwellformed Euler diagrams relaxing the constraints imposed on the previous state of the art in [8], which utilised the single marked point approach and was limited to the wellformed diagram case. Furthermore, these algorithms extend in a natural manner to enable the processing of generalised Euler diagrams (i.e. unions of regions with holes). This enables greater flexibility for users of software tools, enabling them to work with nonwellformed diagrams where it is convenient, or necessary, to do so. For exam- ple, during user construction of wellformed diagrams if the user is permitted to con- struct intermediary non-wellformed diagrams then this aids them in adopting a natural construction method, whilst it can be very complex, or even impossible, to do so without allowing passage through non wellformed diagrams. Furthermore, the extension to gen- eralised Euler diagrams ensures that every abstract diagram has a concrete realisation (which is not the case for wellformed diagrams, as shown in [5]). The algorithms presented in this paper are available in a Java library, although to simplify implementation (and to improve the efficiency) it restricts the geometric shape of contours to be arbitrarily rotated ellipses [7]. This enables the specification of each contour in parametric form, enabling easy implementation of operations such as choos- ing a point on an arc, or checking contour relationships of intersection or containment. The applications of this work are widespread since the algorithms can be utilised in any software system that utilise Euler diagrams or their extensions. 76 References 1. T. Cao, K. Mamakani and F. Ruskey. Symmetric Monotone Venn Diagrams with Seven Curves. In Fifth Intern. Conference on Fun with Algorithms, LNCS 6099, 331-342, 2010. 2. R. De Chiara, U. Erra, and V. Scarano. VennFS: A Venn diagram file manager. In Proceed- ings of Information Visualisation, pages 120–126. IEEE Computer Society, 2003. 3. L. Euler. Lettres a une princesse dallemagne sur divers sujets de physique et de philosophie. Letters, 2:102–108, 1775. Berne, Socit Typographique. 4. A. Fish, J. Flower, and J. Howse. The Semantics of Augmented Constraint Diagrams. Jour- nal of Visual Languages and Computing, 16:541–573, 2005. 5. J. Flower, A. Fish, and J. Howse. Euler diagram generation. Journal of Visual Languages and Computing, 19:675–694, 2008. 6. G. Cordasco, R. De Chiara, and A. Fish. Interactive Visual Classification with Euler Dia- grams. In Proceedings of VL/HCC, pages 185–192. IEEE Press, 2009. 7. G. Cordasco, R. De Chiara, and A. Fish. EulerDiagramNWF: source code. http://isis.dia.unisa.it/projects/EulerNWF, 2010. 8. G. Cordasco, R. De Chiara, and A. Fish. Efficient on–line algorithms for Euler diagram region computation. Comput. Geometry: Theory and Application (CGTA),44:52–68, 2011. 9. H. Edelsbrunner, L. Guibas, J. Pach, R. Pollack, R. Seidel and M. Sharir. Arrangements of curves in the plane—topology, combinatorics, and algorithms. In Theoretical Computer Science, Vol. 92 N. 2, pages 319–336. Elsevier Science Publishers Ltd. 1992. 10. A. Fish. Euler Diagram Transformations. Graph Transformations & Visual Modelling Tech- niques, ECEASST, 18:1–17, 2009. 11. C. A. Gurr. Effective Diagrammatic Communication: Syntactic, Semantic and Pragmatic Issues. Journal of Visual Languages and Computing, 10:317–342, 1999. 12. J. Howse, F. Molina, J. Taylor, S. Kent, and J. Gil. Spider Diagrams: A Diagrammatic Reasoning System. Journal of Visual Languages and Computing, 12(3):299–324, 2001. 13. S. Kent. Constraint Diagrams: Visualizing Invariants in Object Oriented Models. In Pro- ceedings of OOPSLA97, pages 327–341. ACM Press, 1997. 14. H. Kestler, A. Muller, T. Gress, and M. Buchholz. Generalized Venn diagrams: A new method for visualizing complex genetic set relations. J. of Bioinformatics, 21(8), 2005. 15. N. Henry Riche, and T. Dwyer. Untangling Euler diagrams. IEEE Transactions on Visual- ization and Computer Graphics, 16(6):1090–1099, 2010. 16. F. Ruskey. A survey of Venn diagrams. Electronic Journal of Combinatorics. www.combinatorics.org/Surveys/ds5/VennEJC.html, 1997 17. A. Shimojima. Inferential and Expressive Capacities of Graphical Representations: Survey and Some Generalizations. In Proceedings of 3rd International Conference on the Theory and Application of Diagrams, Vol. 2980 of LNAI, pages 18–21. Springer-Verlag, 2004. 18. G. Stapleton, J. Masthoff, J. Flower, A. Fish, and J. Southern. Automated Theorem Proving in Euler Diagrams Systems. Journal of Automated Reasoning, 39:431–470, 2007. 19. N. Swoboda, and G. Allwein. Using DAG Transformations to Verify Euler/Venn Homo- geneous and Euler/Venn FOL Heterogeneous Rules of Inference. Journal on Software and System Modeling, 3(2):136–149, 2004. 20. J. Thièvre, M. Viaud, and A. Verroust-Blondet. Using Euler Diagrams in Traditional Library Environments. In Euler Diagrams 2004, Vol. 134 of ENTCS, pages 189–202, 2005. 21. J. Venn. On the Diagrammatic and Mechanical Representation of Propositions and Reason- ings. Phil.Mag, 1880. 22. M. Wang, B. Plimmer, P. Schmieder, G. Stapleton, P. Rodgers, and A. Delaney SketchSet: Creating Euler diagrams using pen or mouse. In Proceedings of the IEEE Symposium on Visual Languages and Human-Centric Computing VL/HCC, pages 75–82. IEEE Press, 2011. 23. K. Weiler. Polygon comparison using a graph representation. Computer Graphics (SIG- GRAPH ’80 Proceedings), 14(3):10–18, July 1980.