Enabling Hierarchical Exploration for Large-Scale
     Multidimensional Data with Abstract Parallel Coordinates
                    Gaëlle Richer, Joris Sansen, Frédéric Lalanne, David Auber, Romain Bourqui
                       {gaelle.richer,joris.sansen,frederic.lalanne,david.auber,romain.bourqui}@u-bordeaux.fr
                                                LaBRI, Univ. Bordeaux, CNRS, UMR5800
                                                                  France
                                                                                                           
ABSTRACT                                                                                         2 2 2 2    
                                                                                                 1 1 2 6     
As data collection grows more common in various domains, there                                                
                                                                                                 5 2 2 6       
is a call for adapted or newer methods of visualization to tackle                                6 3 3 1        
magnitudes exceeding the number of available pixels on screens
and challenging interactivity. Exploratory visualization of large                         (a) Toy data                 (b) Line-based    (c) Abstract
data present two major challenges: perceptual scalability and
processing scalability. The first is concerned with overcoming                           Figure 1: Examples of (b) line-based and (c) abstract paral-
the fundamental limitation of screens and human perception.                              lel coordinate plots for the same dataset (a). On the line-
The second deals with efficiently processing large volumes of                            based plot, colors match those of tuples on (a). On the
data to achieve responsive interactions. Multiscale visualizations                       abstract plot, the height of aggregates corresponds to the
are an effective technique for solving the first challenge that                          number of tuples they cover.
builds on several levels of data abstraction to provide the user
with an initial overview and subsequent incremental detail. The
focus of this paper is on multidimensional data, a ubiquitous form                       subsets of items across axes. Selection relates to interactive means
of data among large-scale data sets, and parallel coordinates, a                         of choosing subsets of items and enhancing them such that they
representation largely used for this type of data. For this represen-                    can be discriminated from the rest.
tation, defining abstractions and interactively generating levels is                        In this paper, we are interested in supporting the interactive
not straightforward. Building upon several previous aggregated                           exploration of multidimensional datasets with a large number
parallel coordinates representations, we propose a unifying and                          of items and a moderate number of dimensions. We built upon
thinking model for conceiving and describing multiscale par-                             the assumption that large datasets, scaling to billions of records,
allel coordinates and their interactions. Using this formalism,                          lead to major overplot when displayed using the traditional line-
we present a focus+context representation which bounds the                               based parallel coordinate representation. Indeed, as the number
number of visual items with a fixed resolution parameter while                           of records increases, the plot becomes cluttered and analyses
supporting exploration up to the item-level. Processing scala-                           may be hindered as specific patterns are concealed [10]. Heinrich
bility is addressed by carrying out computation in a distributed                         and Weiskopf [18] listed several clutter reduction approaches for
manner on a remote data-intensive infrastructure. Bounding the                           parallel coordinates. One solution lies in the display of aggregates
visual items ensures perceptual scalability but also bounds the                          computed per-dimension as in [20, 29, 32] instead of single items
data transfer between this infrastructure and the rendering client.                      (see Fig 1c). Despite the aggregation, these abstract plots success-
                                                                                         fully provide overview information similar to a traditional plot
KEYWORDS                                                                                 and perceptually scale for any size of input data. Rendering time
                                                                                         and transfer time in client-server architectures are also reduced
visualization, large-scale data, parallel coordinates, multiscale
                                                                                         since they are bounded by the number of aggregates. Sansen
visualization, distributed computing, focus+context
                                                                                         et al. [32] leveraged this bound by precomputing the result of
                                                                                         several interactions and thus clear the dependency of linear scans
1    INTRODUCTION                                                                        of the data upon interaction. Indeed, past a certain number of
Multidimensional data, a form of data common in many domains,                            records, data cannot fit in a desktop computer memory and linear
encompasses all lists of individuals composed of several attributes                      scans over the data affect performance more negatively since
(possibly temporal). Multidimensional items are studied from sev-                        they either involve reads on disk or network transfer between
eral aspects: the particular behavior of individual items relative                       several computing units.
to the whole, the relationship between values from two dimen-                               As for any abstract visualization, deeper analyses are limited
sions and the distribution of values along each dimension [7].                           by the amount of information conveyed by the visual aggregates.
Parallel coordinates, introduced by Inselberg and Dimsdale [19],                         Two types of interactions can alleviate this limitation: changing
is a well-known technique of visualization for such data. Each                           the level of detail (show more details) and adjusting the aggre-
item is represented by a polyline which anchors on axes are                              gation (show different details). Supporting these interactions is
positioned at the corresponding attribute value of the item (see                         essential but strongly increases the number of states of the visu-
Fig 1b). Axes are traditionally aligned in parallel forming a se-                        alization and reasonable precomputing and storing of all these
quence of two-dimensional subplots sharing one axis with their                           states as performed by Sansen et al. [32] is challenging. The
predecessor. Usual interactions are axis reordering to analyze                           processing scalability challenge here has been addressed by Rü-
relationships between all dimension pairs and selection to trace                         bel et al. [31] on a modern high-performance computing (HPC)
                                                                                         platform. HPC platforms are large and expensive computing sys-
© 2018 Copyright held by the owner/author(s). Published in the Workshop                  tems suited for highly complex and real-time computation. They
Proceedings of the EDBT/ICDT 2018 Joint Conference (March 26, 2018, Vienna,
Austria) on CEUR-WS.org (ISSN 1613-0073). Distribution of this paper is permitted        are composed of multiple processors connected through a fast
under the terms of the Creative Commons license CC-by-nc-nd 4.0.                         network and use fast memory. As such, they are particularly


                                                                                    76
adapted to tightly coupled tasks where several processors work               precomputation (e.g. [15, 24, 30]). This work is related to general
on the same task and exchange data. Distributed systems, on the              techniques addressing perceptual and processing scalability for
contrary, are networks of computing units, usually commodity                 interactive parallel coordinates.
hardware, connected in a shared-nothing architecture (memory
and storage are independent to each unit). On these systems,                 2.1    Perceptual Scalability in Parallel
data transfer between computing units uses a slower network                         Coordinates
connection and thus is critical for performances. Consequently,
                                                                             Various approaches have been proposed to improve the legibility
they are most adapted to loosely coupled data-parallel tasks on
                                                                             of parallel coordinate plots by either reducing the clutter pro-
large amounts of data. In addition to being cheaper alternatives
                                                                             duced by the multiplicity of overlapping and crossing lines or
for data-intensive computing, they offer easier horizontal scalabil-
                                                                             enhancing their patterns. Approaches can be categorized into
ity (allocation of additional computing units) as their hardware
                                                                             geometry-based relying on computer graphics techniques and
and architecture are less sophisticated. The filtration and ag-
                                                                             data reduction approaches that use approximation or summary of
gregation problems at stake in abstract parallel coordinates are
                                                                             the data. Geometry-based approaches display all items with shape
data-parallel tasks. We focus on these less expensive and more
                                                                             or position modifications to alleviate the overdraw in-between
accessible infrastructures to address them.
                                                                             axes, for instance by bundling lines (e.g. [33]). These techniques
    Our aim is to enable interactive exploration down to the item-
                                                                             have the advantage of producing no or few loss of information
level over large-scale multidimensional data with an abstract
                                                                             but have the drawback of still being prone to over-plot since no
parallel coordinate plot. The main contributions of this paper
                                                                             reduction of the number of displayed items is performed. Data
are summarized as follows: (i) a unifying model for abstract
                                                                             reduction approaches limit the number of visual items either by
parallel coordinate plots based on hierarchical aggregation, (ii)
                                                                             sampling items [9] or using meta-items. Model-based approaches
formalization of multiscale and regular interactions in this model,
                                                                             mathematically reduce the data to a continuous function, and
(iii) a focus+context representation with bounded visual items
                                                                             meta-items usually represents the density of the underlying data
and drill-down/roll-up interactions, extension of the work of [32],
                                                                             (e.g. [27]). Aggregation approaches have been presented for paral-
(iv) an example implementation using a distributed infrastructure
                                                                             lel coordinates through different schemes: aggregation of dimen-
with components that benefits from horizontal scalability.
                                                                             sions (e.g. [2]), aggregation of items or values (e.g. [13, 28, 29])
    The paper is organized as follows: we first present previous
                                                                             or combinations of both (e.g. [21, 26]).
work on perceptual scalability in parallel coordinate plots and
                                                                                 In this paper, we are interested in scalability relative to the
their interactions with a focus on multiscale approaches (Sec.
                                                                             number of items, not dimensions, hence we focus on aggregation
2). Then, we describe the proposed graph-based formalism for
                                                                             over items and their values. Previous work using aggregation
abstract parallel coordinates based on hierarchical aggregation
                                                                             have used kernel density estimation, independently applied to
(Sec. 3). In Sec. 4, we study how different interactions are ex-
                                                                             dimension [29], different hierarchical clustering algorithms over
pressed in this model. This yields a prototype implementation
                                                                             multidimensional items or dimension (e.g. [13]) but also binning
and design of a scalable parallel coordinate plot based on hierar-
                                                                             on two-dimensional sub-spaces (e.g. [14, 28]). Aggregates have
chical aggregation, using a so-called big data infrastructure. This
                                                                             been represented by their statistical properties: extrema (e.g. [3]),
visualization is described and its scalability evaluated in Sec. 5.
                                                                             cardinality (e.g. [20]), mean (e.g. [13]), and/or other metrics [29].
Finally, in Sec. 6 we draw conclusions and present directions for
                                                                                 As reflected by previous work, multiple dimensions allow for
future work.
                                                                             different levels of grouping: the item level (multidimensional), the
                                                                             value level (per-dimension), the line level (two-dimensional). For
2 RELATED WORK                                                               aggregation-based methods, the value level has the advantages
Recent works have focused on the scalability of visualization ap-            of losing less information, preserving continuity on axes (poten-
plications for large-scale data with different techniques, among             tially broken by two-dimensional aggregation) as well as an idea
which are: data reduction, multi-threading, GPU-acceleration,                of the relationship between dimensions (broken by item-level
and incremental or approximate data processing. In the case of               aggregation).
visualization, the scalability of a system often refers to its ca-
pacity to accommodate and handle growing amounts of data.                    2.2    Multiscale Parallel Coordinates
Handling massive datasets brings about two main challenges for               Abstract representations are usually conditioned by a parameter
exploratory visualization: perceptual scalability and processing             that controls the resolution of the display. Multiscale visualiza-
scalability as noted by [12, 16, 24]. The first is concerned with the        tions allow navigation between multiple levels of abstraction,
legibility of visualizations representing numerous items relative            using a drill-dow/roll-up interaction (respectively for increas-
to the space available on a screen (so-called screen real-estate             ing and decreasing levels of detail). Elmqvist and Fekete [11]
problem) and human capabilities to apprehend them. The sec-                  provided general guidelines for multiscale representations and
ond relates to the computational cost of processing numerous                 interactions, based on such hierarchical aggregation. Bikakis et al.
items on each user input, that can create latencies responsible for          [4] presented a framework for hierarchical aggregation oriented
decreasing user performances [23]. A taxonomy of different tech-             towards the computational aspects of hierarchical navigation.
niques regarding the perceptual scalability aspect was established           Interactively changing the level of detail can be integrated in dif-
by Ellis and Dix [10]. A general solution is data reduction, which           ferent ways (see [8] for a general review). A first approach, similar
can be categorized into two approaches: either representing a                to geometry zooming, either (1) filters part of the representation
subset of the data items (sampling, filtering) or meta-items (aggre-         to maintain a fixed number of visual items on screen or instead
gation, mathematical models). Several work proposed methods                  (2) purposely displays increasing number of items (e.g. [6, 29]).
(called multiresolution, multiscale, hierarchical or even stratified)        Filtering has the advantage of being scalable when the amount of
for navigating through multiple levels of detail supported by                displayed items is controlled; however, the overview and context


                                                                        77
are lost along the way. One drawback of displaying an uncon-
                                                                                        a         
                                                                                                              2 2 2 2 
                                                                                                                           
                                                                                                       
                                                                                  X = b  =
trolled number of items is that the screen pixel limitation could
                                                                                                            1 1 2 6 
always be reached for a sufficiently massive dataset and infor-                          c           
                                                                                                              5 2 2 6 
                                                                                                                                d1        d2                d3           d4
mation would be lost. Additionally, from a certain point it may
overwhelm human capacities. Consequently, zooming without                                            (a)                                            (b)
filtering does not allow interactive exploration to the item-level                                                                                  d3
complying with perceptual scalability. For zooming with filter-
ing, an alternative to the loss of context is the overview+detail
                                                                                  d1            d2            d3      d4
approach which displays both the filtered zoomed view and an                                                                                        d1
overview (e.g. dimension zooming of [13]).
    A second approach, called focus+context, consists in displaying                                                                   d2                            d4

heterogeneous levels of detail: a selected portion of the data is                 d3            d1            d4      d2
shown with greater details to the detriment of the rest. This ap-
                                                                                                     (c)                                            (d)
proach allows to gain detail locally while preserving the overview.
                                                                                                                                                     a1
Fisheye lenses used in the work of [25] are an example applied in
                                                                                               a1
screen space. In data space, [13] and [28], although using differ-
ent aggregation strategies, presented techniques where a subset                                                                            a4
                                                                                                                                                2
of items can be enhanced and displayed with fine-scale details                            a4                                   a 2, c 2                           a 3, c 3, b3

layered over the rest of the data, abstracted to some level.                     a2 = 2                a3 = 2
                                                                                                       c3 = 2
                                                                                                                                           c1            2
                                                                                 c2 = 2
    To the best of our knowledge, none of these solutions pro-                                        b3 = 2                                                 b1
pose to interactively change the level of detail locally, in a fo-                     c1                      b1                                                              b2
cus+context fashion, while bounding the number of visual items.                                                        b2
                                                                                                                                           b4, c 4
                                                                                       c4 = 6         b4 = 6

                                                                                                     (e)                                            (f)

                                                                             Figure 2: Rationale behind the graph model. (a) Example
                                                                             data. (b) Opacity-based parallel coordinates. (c) Parallel co-
2.3 Processing Scalability of Multiscale                                     ordinates matrix where all 2-dimensional subspaces are
    Parallel Coordinates                                                     represented. (d) Many-to-many plot. (e) Graph-based rep-
For the past ten years, large-scale or massive has been used to              resentation G (X ). Shadowed areas cover identical values
qualify increasingly big data, now up to tera or petabytes in                from the same dimension and form the partition R. (f)
size [12]. For multidimensional data, above about 107 items, with            Quotient graph G /R (X ) (cardinality mapped to size).
a dozen of dimensions, perceptual scalability and processing
scalability becomes problematic for interactive analysis. Abstract
                                                                             3     A GRAPH-BASED FORMALISM FOR
representation and precomputation of interactions are solutions
respectively addressing perceptual and processing scalability,                     ABSTRACT PARALLEL COORDINATES
as [32] presented. In addition to the enhancement of subsets of              Consider X ∈ Rn,d , a set of multidimensional quantitative data
items and the reordering of axes, abstract representations require           with n items and d dimensions. On a traditional parallel coor-
a drilling interaction to allow fine data analysis.                          dinate plot, only a fraction of the input data subspaces is rep-
   Processing scalability can be tackled by parallelism, distributed         resented. Fig. 2c and 2d show two examples of representations
processing and precomputing of complete or partial results for in-           that display all subspaces at once: respectively the parallel coor-
stance. For parallel coordinates, most interactions can be applied           dinate matrix [17] and the many-to-many plot [22]. The model
to two-dimensional subplots independently and on separate por-               we propose integrates all subspaces without assumption on the
tions of the dataset without needing any communication (they                 layout of axes and supports abstraction. A value in the matrix
are pleasantly parallel). This property has been exploited by Rübel          inherently belongs to a tuple (or item), and a dimension. We
et al. [31] on a HPC platform, and [32] on a distributed platform.           focus on abstraction based on aggregation of tuples and do not
Rübel et al. [31] addressed both scalability challenges with an              directly address aggregation of dimensions. The rationale behind
histogram-based representation adapted from [28]. Histograms                 this approach is that dimensions hold semantic meaning not em-
are two-dimensional aggregations of the data, precomputed in                 bedded in their numerical content. Therefore, automatically and
parallel, potentially for different resolutions. On more affordable          meaningfully aggregate them is not straightforward: some types
hardware, Sansen et al. [32] addressed the same challenges with a            of values may not make sense aggregated together. Consequently,
parallel sets representation for an Hadoop+Elasticsearch ecosys-             we target datasets that challenge scalability by their size in tuples
tem that also relies on full precomputation of certain types of              (more than millions of items) and hold moderate amounts of
interactions. Both works presented good scalability evaluations              well-chosen dimensions.
relative to the number of computing units used. However, these
techniques only support the display of balanced levels of detail i.e.        3.1       Abstraction in Parallel Coordinates
drill-down is global and necessarily increases the number of dis-            In parallel coordinates, the values of the input matrix are con-
played items. With this approach, gaining detail at the item-level           nected to each other by lines to materialize the tuple or tuples
does not scale to large datasets.                                            they belong to. Extending this metaphor, an n × d matrix can be
                                                                             seen a graph linking its values based on the relation induced by
                                                                             tuples. This undirected graph is composed of n separate cliques


                                                                        78
                                                                                                 a
      a 
                                                          a
       b   10 7 6 0 
                                                          e                                      e

           5 6 10 10 
                                                                                                 b
 X =  c  =  3 2 0 8 
                                                          b                                      c
                                                          c
           1 4 1 10 
                                                                                                 d
                            
                                                          d
          e   8 8 3 1 
            d
                                                                                                     d1     d2         d3   d4        d1     d2         d3   d4
                                                              d1   d2         d3   d4
                                                                                                                 (a)                              (b)
                     (a)                                                (b)
                                      b3
                                                                                             Figure 4: Aggregation of tuples. (a) Translation of the
                                b1                                                           tuple partition (color) into a vertex partition refining D
b4, c 4, d a                               a 2, e 2, b2
                                                                                             (shadowed regions). (b) Tuple-oriented abstract represen-
                                             e3
                                                                                             tation using mean and extrema values.
                               a 4, e 4

               d 3, c 3
                                 a3            a 1, e 1
    c 1, d 1              d 2, c 2
                                                                                             D. Fig. 4 shows an example of tuple partition refinement and a
                                                              d1   d2         d3   d4
                                                                                             tuple-oriented representation. Using our model and the intro-
                     (c)                                                (d)                  duced notations, the abstract representation is fully defined by:
                                                                                             (i) a valid partition of the matrix values, (ii) two aggregate func-
Figure 3: Aggregation of dimension values. (a) Exam-                                         tions assigning weights to meta-nodes and meta-edges, (iii) an
ple data (b) Parallel coordinate plot augmented by                                           axis layout and visual encodings for meta-node and meta-edge
per-dimension partitions. (c) Quotient graph. (d) Value-                                     weights (vertical position, color, etc).
oriented abstract representation using extrema values and
cardinality (with link opacity).                                                             3.2          Hierarchical Abstraction in Parallel
                                                                                                          Coordinates
                                                                                             Hierarchical aggregations are precomputation of different lev-
of d vertices, that is, n complete graphs (see Fig. 2e). We note this
                                                                                             els of abstraction that naturally support multiscale representa-
graph G = (V , E, w ), where V denotes the vertices, E the edges,
                                                                                             tions. Such precomputation outputs a rooted tree structure whose
w is the weighting function: w : V → R that stores vertex values.
                                                                                             leaves are the data objects to aggregate. Every node of a rooted
We note D = {Vi , i ∈ ⟦1, d ⟧} the partition of V that groups ver-
                                                                                             tree is said to cover the leaves of the subtree it is the root of.
tices along their origin dimension. Given a dimension sequence,
                                                                                             An antichain (also called tree cut) in a tree is a set of nodes S,
a parallel coordinate plot is built by taking the subgraph of G
                                                                                             such that no node of S is an ancestor of another node of S. An
composed of only the edges joining vertices of consecutive dimen-
                                                                                             antichain is maximal if it cannot be expanded by any other node
sions in the sequence. Vertices of this subgraph are positioned
                                                                                             without violating the antichain property. The subsets covered
on their corresponding axis along a vertical position given by
                                                                                             by the nodes of a maximal antichain form a partition of the tree
the weighting function. Shadowed regions on Fig. 2e represent a
                                                                                             leaves: the antichain property makes them pairwise disjoint, the
vertex partition R that groups identical values of the same dimen-
                                                                                             maximal property ensures that their union is the set of leaves.
sion. Such partition is a refinement of the partition D. Merging
                                                                                                In our model, an abstraction is directed by a partition refining
vertices of G belonging to the same subset of a partition R corre-
                                                                                             the partition D which is equivalent to d partitions, one for each
sponds to taking the quotient graph of G relative to R, noted G /R .
                                                                                             Vi of D. By augmenting each subset Vi with a tree, such partition
Two subsets S, S ′ in the partition R are represented by adjacent
                                                                                             can be defined by one maximal antichain for each tree. This is
vertices in G /R if, and only if, some vertex in S is adjacent to a
                                                                                             equivalent to defining a single maximal and non-trivial antichain
vertex in S ′ in G. Further, two aggregation functions are defined
                                                                                             in the unified tree where the root’s children are the roots of all
to compute the properties of the meta-nodes and meta-edges, i.e.
                                                                                             dimension trees. We note this unified tree T (V ), and call its direct
vertices and edges of the quotient graph. Aggregation functions
                                                                                             subtrees dimension hierarchies.
return a tuple of weights given a set of edges or vertices as input.
                                                                                                Overall, for some input data modeled by the clique graph
Weights (cardinality, extrema, mean, etc) provided by such func-
                                                                                             G = (V , E, w ) and its associated hierarchy T (V ), an abstraction
tions are subsequently used for assigning visual properties to
                                                                                             is defined as previously described with the valid partition of V
meta-edges and meta-nodes. For instance, on Fig. 2b, the opacity
                                                                                             induced by a maximal antichain in T (V ). This unifying approach
of lines on the parallel coordinate plot depends on the cardinal-
                                                                                             accommodates multiple layouts of parallel coordinates (see Fig. 2)
ity of meta-edges. On this example, the aggregation of identical
                                                                                             and several abstract parallel coordinates representations. We pre-
values allows to find the optimal number of lines to draw. In-
                                                                                             sented two abstract representations: per-dimension aggregation
deed, the aggregation done here in data-space relates to the one
                                                                                             on Fig. 3d that corresponds to the technique presented by Palmas
done in screen-space when rendering the polylines with alpha
                                                                                             et al. [29] and on Fig. 4 tuple aggregation like presented by Fua
composing.
                                                                                             et al. [13]. For tuple aggregation, the hierarchy defined on tu-
    Given some aggregation functions, the same process is applied
                                                                                             ples is used to form all dimension hierarchies. Additionally, to
to obtain an abstract representation from any valid partition R
                                                                                             represent the same tuple aggregates on all subplots, dimension
of V , that is, refinement of D. Fig. 3 presents an example of parti-
                                                                                             hierarchies should all be cut at the same level.
tion obtained with per-dimension clustering and another type of
representation based on extrema values. Tuple-based approaches
are incorporated into the model by refining the input tuple par-                             4       COMMON INTERACTIONS
tition into a refinement of D. Each subset of the tuple partition                            We take advantage of the hierarchical graph model described in
can be decomposed into d groups along their membership in                                    the previous section to explore several hierarchical operations


                                                                                        79
and usual operations for parallel coordinates and investigate the           a predefined budget. In this setting, drilling potentially implies
incremental computation they require.                                       modifying the current maximal antichain in multiple points such
                                                                            that previously acquired detail collapses to allow drilling-down
4.1 Level-of-Detail Operations                                              when the budget would have been exceeded otherwise. If these
                                                                            changes are restricted to a single dimension hierarchy, the incre-
The drill-down and roll-up operations correspond to changes in
                                                                            mental changes only involve one or two subplots. Computing
the displayed level of detail. They respectively mean moving
                                                                            these subplots costs a pass over n bottom-level edges for each, in
deeper in the hierarchical aggregation to show finer details, and
                                                                            addition to the cost of finding a suitable maximal antichain.
upper to present a coarser display of the data. Since the model
                                                                               For a budget defined in number of visual items, the minimal
rests upon precomputed hierarchies of dimension values, the
                                                                            budget allowing to define a maximal antichain that does contain
weights (cardinality, extrema, etc) of all possible meta-nodes, i.e.
                                                                            a leaf node depends on the arity and depth of the tree. In a tree T
nodes of the hierarchies, can be precomputed as well. The number
                                                                            with n leaves and arity a, a maximal antichain containing at least
of edges of a quotient graph with respect to a maximal antichain
                                                                            one leaf and minimal in size can be expected to have between
is always less than those of the original graph and generally
                                                                            depth(T ) and a · depth(T ) nodes. Considering the toy example
greatly depends on the properties of the hierarchies (arity, depth).
                                                                            of binary trees which nodes all have either 0 or 2 children, the
In general, the number of maximal antichains is exponential. For
                                                                            minimum depth is ⌈loд2 (n)⌉ + 1 and the maximal depth is n. Con-
example, complete binary trees, that are rooted binary trees with
                                                                            sequently, a visual budget allowing gaining detail up to the leaf
every level full except the last, have Ω(2n ) maximal antichains
                                                                            level has to be chosen with respect to the depth of the hierarchy.
where n is their number of leaves. This makes the precomputation
                                                                            In our context, since this depth can reach orders of magnitude
of all possible quotient graphs, i.e. all meta-edge weights, with
                                                                            same as the number of tuples, this method does not scale without
respect to every possible maximal antichain for given dimension
                                                                            strong constraints on the properties of hierarchies.
hierarchies not possible for the scale of data we aim to tackle.
   In traditional parallel coordinates plots, and in many derived
techniques, each two-dimensional subplot is independent of the
others and can be computed and displayed separately. When                      Dynamic Context Aggregation. An alternative solution to col-
axes hold visual elements, like frequency plots, the correspond-            lapsing nodes when focusing on deeper nodes is to aggregate
ing visual items are usually mirrored and the two composing                 them dynamically. The meta-nodes from the chosen antichain
parts conceptually included in both sides subplots. If no edge              that are not in focus (i.e. not the deepest in the hierarchy) are rep-
is precomputed, the cost of an abstraction can be decomposed                resented aggregated which lowers their impact on the visual item
into independent substeps corresponding to all subplots, each               budget. In general, these dynamically-computed aggregates, do
requiring aggregation over n edges, one for each tuple.                     not correspond to existing nodes in the precomputed hierarchy.
   In the context of hierarchical aggregation, gaining detail can           This approach limits the visual items on an axis to the siblings
be defined locally as the substitution of a visual aggregate for its        of each focus and the aggregates of the rest of the meta-nodes
children. In our model, the displayed level of detail is given by           that forms the context. Thus, for a dimension hierarchy of arity
a maximal antichain of the hierarchy. Drilling-down on a meta-              k, an axis with f foci corresponds to at most k · f + ( f + 1) visual
node replaces it in the current antichain by all of its children            nodes, whatever the depth of these foci in the hierarchy is. Form-
which still forms a maximal antichain of the tree. This entails             ing these aggregates induces an aggregation of their outgoing
computing the outgoing edges of these children since they are not           edges, therefore it similarly reduces the number of meta-edges.
precomputed. It corresponds to aggregating the tuples covered               Consequently, with additional constraints on the arity of dimen-
by the drilled meta-node. The opposite operation, rolling up or             sion hierarchies and the number of concurrent foci on an axis,
collapsing, is the replacement of sibling nodes with their parent           this approach enables drill-down up to the deepest level while
in the antichain.                                                           complying with a visual budget. Contrary to the previously de-
   Globally increasing detail may result in an uncontrolled num-            scribed approaches, dynamic context aggregation matches the
ber of elements on display, independently of the arity and depth            requirement of our system and thus is the one implemented (see
of hierarchies. We detail three drilling approaches that tackle this        Sec. 5.1 and 5.2).
issue by trading context detail for scalability.

   Detail & Filter. One approach is to define drilling on a meta-
node as a filtering operation. This corresponds to computing the
expansion of the meta-node over a filtered clique-graph, induced
                                                                            4.2    Other Operations
solely by vertices adjacent to the leaves it covers in the original         Axis Reordering can be conceived either as the computation of a
clique-graph. This method effectively bounds the number of                  full plot or, incrementally, as an edition of one or more subplots
aggregates on an axis by the arity of the dimension hierarchy, i.e.         which requires one or two passes over n bottom-level edges, for
the maximal number of siblings in the tree. This also bounds the            n the number of tuples. Subset Selection relates to the emphasis
number of edges in an abstract plot. The limit of this approach is          of a subset of tuples in a given representation. In an aggregate
that all subplots are modified since the weights of all meta-nodes          visualization setting, an abstraction can be computed over the
and meta-edges from the quotient graph are to be updated to                 subset of tuples and the resulting weights displayed over the
account for the filtering. Thus for each subplot, it costs a pass           complete abstraction using a discriminating encoding. Using this
over as many bottom-level edges as the drilled meta-node covers             approach, a subset selection of m tuples requires going through
items.                                                                      m bottom-level edges per subplot, one per tuple, to compute their
                                                                            contribution to the current meta-edge weights. The process is
   Budgeted Detail. A second approach lies in constraining the              the same for meta-node weights.
definition of maximal antichains such that their size comply with


                                                                       80
5 FOCUS+CONTEXT REPRESENTATION &
                                                                                                                        3
  SCALABLE IMPLEMENTATION
Our goal is to enable hierarchical exploration in abstract parallel
                                                                                                                        2
coordinates while complying with the following scaling proper-
ties. On the representation-side, exploration should be possible
down to the item level, in a top-down manner, while the num-                                                            1
ber of visual items on display should be bounded for any size
of input data. On the processing-side, network transfer latency
between the displaying unit and the computing unit should be
bounded and traditional operations (axis reordering, subset selec-                                         (a)
tion) should be supported. In the following, we present first the
cornerstone of our method: the budgeted number of visual items.
Then, we detail a novel focus+context display with intuitive drill-          context node
down capabilities using the dynamic context aggregation. And                with aggregate
                                                                              of 2 and 3
finally, we describe technical details and performance evaluation
                                                                                                     4
of the proposed system.

5.1 Bounded Number of Visual Items
                                                                                                                            5
                                                                              focus nodes
Bounding visual items is essential for perceptual scalability but            (children of 1)

also ensures that data transfer between our rendering client and
back-end unit remains bounded in size thus predictable in time.
To this end, the properties of dimension hierarchies are con-
strained and the number of foci restricted to one per dimension.
   Hierarchy Constraints. We use a user-defined k value that acts                              (b)                              (c)
as a resolution parameter, bounding the number of visual items
per displayed axis. This parameter is enforced as the maximal              Figure 5: Focus+context representation. (a) Initial display.
arity of dimension hierarchies. k should be chosen large enough            (b) Drill-down on 1 . The top context node represents the
for the representation to preserve an appropriate amount of infor-         aggregation of 2 and 3 . (c) Drill-down on 4 and 5 succes-
mation but small enough for the visual items to fit the available          sively.
screen space. Additionally, dimension hierarchies should order
their leaves with respect to their values such that every maximal
antichain defines partitions of dimension intervals since the cho-         detail. Context regions are displayed with slightly lower width
sen visual representation relies on this property. Binning (equal          to emphasize the focus regions. Since regions of interest are in-
range partitioning) and adaptive binning (equal size partitioning)         crementally refined by successive drill-down, they get smaller
are examples of partitioning algorithms that can be applied in a           in height relative to the whole both in terms of cardinality and
bottom-up fashion to produce hierarchies complying with these              interval of values. Therefore, the height of focus and context re-
two requirements. No depth constraint is required, we adopt the            gions are rendered with different scaling factors. Focus nodes are
dynamic context aggregation strategy to bound the number of                represented colored and augmented by a bidirectional smoothed
vertices from quotient graphs displayed.                                   histogram. Context nodes are augmented by a pile of level blocks
   Level-of-Detail Navigation. We define as focus nodes, the nodes         (see Fig. 5c). The wider these blocks are, the lower in the hierar-
that have the maximal depth in the current antichain. The rest             chy are the nodes they represent (notice on Fig. 5c the difference
of the antichain nodes from one dimension are aggregated into              in width of the outer blocks from the top and bottom context).
the minimal number of context nodes such that their order is pre-          The height and vertical order of level blocks follows the same
served. In the proposed implementation, the top-level nodes are            encoding as those of focus nodes. Level blocks are computed on
initially presented as focus nodes and each drill-down triggers            the client side and thus not transferred nor counted in the visual
a dynamic aggregation (examples are presented in Fig. 5). Con-             item budget.
sequently, there is no more than k focus nodes and two context                In the drilling mode, focus nodes and level blocks are clickable.
nodes at once per axis. Thus, the number of nodes on display is            Clicking on a focus node triggers its animated expansion which
bounded by k + 2 per axis and the number of edges by (k + 2) 2             split it into its children and merges its siblings into level blocks.
per subplot.                                                               Level blocks represent an aggregation of nodes previously in
                                                                           focus, thus they enable going back to this precise previous state.
5.2 Visual Encoding
We propose an extension of the work presented in [32] which                5.3      System Overview
included two visual encodings: one oriented towards the distri-            Our system consists of two main parts and follows a client/server
bution of tuples, the other towards the distribution of values             architecture where the client is the visualization endpoint and
themselves. Basically, the first encoding maps aggregate height            the server is an interface to an on-demand computing and pre-
to their cardinality, while the second maps it to their covered            processing back-end. The latter could be implemented both in
interval size (distance between extrema). We extend the encod-             a distributed environment for a Hadoop cluster and as a multi-
ing and positioning of aggregates to emphasize the focus regions,          threaded application for single machines (desktop computer or
composed on each dimension of the data displayed with the finest           dedicated server). Past a certain number of input records (for a


                                                                      81
given number of dimensions and resolution parameter), a dis-                                              104


                                                                                                                                                                                    in selection (number of tuples)
tributed platform should be more efficient while facilitating load                                                                                                            108
expansion.


                                                                                   execution time (ms)
   The client is a WebGL application that displays the represen-
tation, computes level blocks and context aggregation on node                                             102                                                                 106
drill-down, and queries the supporting back-end for other inter-
actions. The back-end server is a long-lived Spark application                                                                                   distributed
                                                                                                                                                 single
which runs distributed job on demand while keeping prepared                                                                                                                   104
                                                                                                                                                 selection cardinality
data in memory. It computes dimension hierarchies in a pre-
                                                                                                          100
computation step and stores the resulting meta-node weights                                                      104    105          106      107          108     109
(extrema and cardinality) for all the hierarchies in a distributed                                                                n (number of tuples)
database. Contrary to the previous system, the hierarchical as-
pect makes the number of displayable meta-edges too large to                (a) Execution time for node selection relative to data size for two
allow their precomputation in reasonable time (see Sec. 4.1).               implementations. The distributed version uses 15 executors. Er-
                                                                            rors bars are not symmetrical and missing their bottom parts on
   The membership of each input data value is stored in a hierar-
                                                                            two points due to the log-log scale.
chy matrix of the same size as the input data, where each value
                                                                                                         8,000
holds the list of computed ancestors for the matching input data


                                                                            execution time (ms)
                                                                                                                                                         mean measured
value. This matrix is kept in memory and split among computing                                           6,000
                                                                                                                                                         ideal
units which will pass over their slice of the data to filter and                                         4,000
aggregate results on demand. The aggregation outputs the meta-                                           2,000
edges (source, target and cardinality) as well as the weights of
                                                                                                            0
context nodes since they do not exist in the precomputed hierar-                                                              5                     10                   15
chies. Upon user interaction and if necessary, the client requests                                                                number of executors
the server which in turn runs a distributed operation and merges
                                                                                                           (b) Scalability test using node selection, n = 2 · 108 .
the partial results returned by computing units. Finally, the client
receives the incremental changes in plain text and updates the
view consequently.                                                          Figure 6: Performance evaluations. Experimental results
                                                                            are averaged over 1000 runs. Executor resources (12 cores,
                                                                            31GB memory) are sized to match those of physical units.
5.4 Performance Evaluation
Our system supports moving an axis, selecting an aggregate node
or edge, drill-down on a focus node and rolling-up by clicking on
a context part. Drill-down, roll-up and moving an axis correspond           of the largest nodes does not increase linearly with n. We observe
to the same low-level operation which consists in computing the             that, on our infrastructure, under n = 2 · 108 (about 108 selected
edges of two-dimensional subplots given the current focus on                tuples) the Spark application performances are stable despite in-
each dimension involved. Compared to the other operations that              creasing workload. This suggests that execution time for datasets
only modify two to three subplots, selection operations affect all          of smaller size is dominated by costs related to network and disk
subplots, thus are more computationally expensive.                          I/O, and/or by merging all executor results by the driver unit.
   Since we are interested in supporting "n ≫ d" datasets, we               Indeed, the cost of merging results remains approximately the
evaluate our system for varying number of tuples n and using                same as it is a function of the output size and the number of task
the most expensive interactive operation: node selection of the             running in parallel. Another visible aspect of the results on small
largest node (in covered tuples). To demonstrate the scalability            datasets is their high variation. The oversized allocated memory
of the approach we also evaluate performance relative to the                can be an explanation: when triggered, garbage collection incurs
resources allocated for computation.                                        a substantial delay. Overall, for n less than approximately 106 , the
   Each node of the distributed platform has 64GB RAM and 2x6               distributed infrastructure underperforms the single-computer de-
hyper-threaded cores at 2.10GHz each, connected via a 1Gbit/s               spite having more resources. Past this limit, the system is better
network. The single computer used for running the multi-threaded            tuned and seems to scale linearly relative to the number of tuples
implementation has 2x4 hyper-threaded cores at 3.3GHz and                   in the selection. This observation holds for the other experiments
64GB RAM. Test datasets are generated for varying n (from 104               carried out for edge selection, roll-up, and drill-down.
to 109 ), with d = 15 and k = 31. Test datasets are generated                  We investigate the scalability of the system using a dataset
such that pairs of dimensions present a close to null correlation           larger than this limit. Fig. 6b presents the median execution time
factor [5] which tends to create close to the maximum number of             for node selection on a 2 · 108 -tuple dataset with varying numbers
edges between dimensions. Dimension hierarchies are generated               of executors. The plotted ideal execution time corresponds to a
using Canopy clustering [1] applied in a bottom-up manner.                  linear speedup, that is, halving the execution time when doubling
   On Fig. 6a, execution times of the selection of the largest top-         the number of executors. This experiment shows that the system
level node are plotted for increasingly large datasets. We evaluate         performs close to the ideal.
both a multi-threaded single-computer implementation and the
distributed Spark implementation. The stairs-shaped curve of
the single computer can be explained by the similar trend of the            6                            CONCLUSION
selected node cardinality for every test dataset. Indeed, due to the        We presented a graph model for hierarchical aggregation and
hierarchy constraints and the bottom-up approach for clustering,            interaction strategies for parallel-coordinate-based visualization.
the number of nodes on the top-level varies and the cardinality             This model formalizes aggregation over multidimensional data at


                                                                       82
the most-expressive level by making use of per-dimension hier-                                [5] S. Börzsönyi, D. Kossmann, and K. Stocker. 2001. The Skyline Operator. In
archies. This approach treats all dimensions equivalently which                                   Proc. of Int. Conf. on Data Engineering. IEEE Computer Society, 421–430.
                                                                                              [6] K. Selçuk Candan, L. Di Caro, and M. L. Sapino. 2012. PhC: Multiresolution
matches the way dimensions are handled in parallel coordinates.                                   Visualization and Exploration of Text Corpora with Parallel Hierarchical
    Based on this model, we presented a client/server system for                                  Coordinates. ACM TIST 3, 2 (2012).
                                                                                              [7] J. H. T. Claessen and J. J. van Wijk. 2011. Flexible Linked Axes for Multivariate
interactive exploration of large multidimensional data using ab-                                  Data Visualization. IEEE Trans. Vis. Comput. Graph. 17, 12 (2011).
stract parallel coordinates. We address the limitation of the pre-                            [8] A. Cockburn, A. K. Karlson, and B. B. Bederson. 2008. A review of
vious system, based on fixed partitions of dimension values, with                                 overview+detail, zooming, and focus+context interfaces. Comput. Surveys 41,
                                                                                                  1 (2008).
hierarchies that allow to interactively change partition. The pro-                            [9] G. Ellis, E. Bertini, and A. Dix. 2005. The sampling lens. In CHI '05. ACM
posed drill-down operation enables the definition of an arbitrary-                                Press.
detailed focus region on each axis while retaining context in                                [10] G. P. Ellis and A. J. Dix. 2007. A Taxonomy of Clutter Reduction for Information
                                                                                                  Visualisation. IEEE Trans. Vis. Comput. Graph. 13, 6 (2007).
a reduced form. On the client-side, we showed how to display                                 [11] N. Elmqvist and J.-D. Fekete. 2010. Hierarchical Aggregation for Information
focus and context regions with intuitive navigation cues. The                                     Visualization: Overview, Techniques, and Design Guidelines. IEEE Trans. Vis.
                                                                                                  Comput. Graph. 16, 3 (2010).
strength of this approach is that it supports exploration down to                            [12] D. Fisher. 2016. Big data exploration requires collaboration between visualiza-
the item-level while controlling the number of visual items, thus                                 tion and data infrastructures. In Proc. Workshop on Human-in-the-Loop Data
ensuring perceptual scalability.                                                                  Analytics.
                                                                                             [13] Y.-H. Fua, M.O. Ward, and E.A. Rundensteiner. 1999. Hierarchical parallel
    On the server-side, the back-end processing is handled on a                                   coordinates for exploration of large datasets. In Proc. Visualization’99. IEEE.
distributed platform with components that scale horizontally. Be-                            [14] Z. Geng, Z. Peng, R. S. Laramee, J. C. Roberts, and R. Walker. 2011. Angular
sides drill-down/roll-up operations, the implementation supports                                  Histograms: Frequency-Based Visualizations for Large, High Dimensional
                                                                                                  Data. IEEE Trans. Vis. Comput. Graph. 17, 12 (2011).
standard parallel coordinate operations. Experimental results                                [15] P. Godfrey, , J. Gryz, P. Lasek, and N. Razavi. 2016. Interactive Visualization of
demonstrate the scalability of the back-end system relative to the                                Big Data. Springer International Publishing, 3–22.
                                                                                             [16] P. Godfrey, J. Gryz, and P. Lasek. 2016. Interactive Visualization of Large Data
size of the input data and to the resources allocated for computa-                                Sets. IEEE Trans. Knowl. Data Eng. 28, 8 (2016).
tion. The results indicate that the proposed system can support                              [17] J. Heinrich, J. Stasko, and D. Weiskopf. 2012. The Parallel Coordinates Matrix.
increasingly large datasets by expanding its network of comput-                                   EuroVis–Short Papers (2012).
                                                                                             [18] J. Heinrich and D. Weiskopf. 2013. State of the Art of Parallel Coordinates. In
ing units. To a certain extent, adding computing resources can                                    Eurographics 2013. Eurographics Association.
also reduce interaction latencies.                                                           [19] A. Inselberg and B. Dimsdale. 1990. Parallel Coordinates: A Tool for Visualizing
    One focus of the design is the bounded data transfer between                                  Multi-dimensional Geometry. In IEEE Visualization.
                                                                                             [20] R. Kosara, F. Bendix, and H. Hauser. 2006. Parallel Sets: Interactive Exploration
the client and server parts which relies on the single-focus ap-                                  and Visual Analysis of Categorical Data. IEEE Trans. Vis. Comput. Graph. 12,
proach (on each axis) and the precomputation of dimension hi-                                     4 (2006).
                                                                                             [21] A. Lex, M. Streit, C. Partl, K. Kashofer, and D. Schmalstieg. 2010. Comparative
erarchies with a bounded arity k of small orders of magnitude.                                    Analysis of Multidimensional, Quantitative Data. IEEE Trans. Vis. Comput.
The hierarchies being predefined allows for efficient filtering and                               Graph. 16, 6 (2010).
aggregation of items based on ancestors but may be limiting for                              [22] M. Lind, J. Johansson, and M. D. Cooper. 2009. Many-to-Many Relational
                                                                                                  Parallel Coordinates Displays. In IV 2009. IEEE Computer Society.
exploration. As a counterbalancing measure, future work could                                [23] Z. Liu and J. Heer. 2014. The Effects of Interactive Latency on Exploratory
add support for user-driven hierarchy refinement. Depending on                                    Visual Analysis. IEEE Trans. Vis. Comput. Graph. 20, 12 (2014).
the scope and scale of the refinements this could be handled on                              [24] Z. Liu, B. Jiang, and J. Heer. 2013. imMens: Real-time Visual Querying of Big
                                                                                                  Data. Comput. Graph. Forum 32, 3 (2013).
the client-side and/or enforced on the server-side.                                          [25] T. V. Long and L. Linsen. 2009. MultiClusterTree: Interactive Visual Exploration
    Another path for future work is methods for latency reduction                                 of Hierarchical Clusters in Multidimensional Multivariate Data. Comput.
                                                                                                  Graph. Forum 28, 3 (2009).
other than horizontal scaling. Space is a possible trade-off. Even                           [26] J. Lu, Y. Zhang, J. Xu, G. Xiao, and Q. Liang. 2014. Data Visualization of Web
if the number of possible meta-edges makes their total precom-                                    Service with Parallel Coordinates and NodeTrix. In SCC 2014. IEEE Computer
putation infeasible, partial precomputation could be investigated:                                Society.
                                                                                             [27] H. Nguyen and P. Rosen. 2017. DSPCP: A Data Scalable Approach for Identi-
either beforehand or as a background process targeting meta-                                      fying Relationships in Parallel Coordinates. IEEE Trans. Vis. Comput. Graph.
edges that are the most likely to be requested given the current                                  (2017).
state.                                                                                       [28] M. Novotny and H. Hauser. 2006. Outlier-Preserving Focus+Context Visual-
                                                                                                  ization in Parallel Coordinates. IEEE Trans. Vis. Comput. Graph. 12, 5 (2006).
                                                                                             [29] G. Palmas, M. Bachynskyi, A. Oulasvirta, H. P. Seidel, and T. Weinkauf. 2014.
                                                                                                  An Edge-Bundling Layout for Interactive Parallel Coordinates. In 2014 IEEE
ACKNOWLEDGMENTS                                                                                   Pacific Vis. Symp. IEEE.
This work has been carried out as part of the "REQUEST" (PIAO18062-                          [30] A. Perrot, R. Bourqui, N. Hanusse, F. Lalanne, and D. Auber. 2015. Large
                                                                                                  interactive visualization of density functions on big data infrastructure. In
645401) and "SpeedData" (PIAO17298-398711) projects supported                                     Symp. on Large Data Analysis and Visualization. IEEE.
by the French "Investissement d’Avenir" Program (Big Data –                                  [31] O. Rübel, Prabhat, K. Wu, H. Childs, J. S. Meredith, C. G. R. Geddes, E. Cormier-
Cloud Computing topic). The authors would also like to thank                                      Michel, S. Ahern, G. H. Weber, P. Messmer, H. Hagen, B. Hamann, and E. W.
                                                                                                  Bethel. 2008. High performance multivariate visual data exploration for
the anonymous referees for their valuable comments and helpful                                    extremely large data. In SC ’08.
suggestions.                                                                                 [32] J. Sansen, G. Richer, T. Jourde, F. Lalanne, D. Auber, and R. Bourqui. 2017.
                                                                                                  Visual Exploration of Large Multidimensional Data Using Parallel Coordinates
                                                                                                  on Big Data Infrastructure. Informatics 4, 3 (2017).
REFERENCES                                                                                   [33] J. Wang, X. Liu, H.-W. Shen, and Guang Lin. 2017. Multi-Resolution Climate
                                                                                                  Ensemble Parameter Analysis with Nested Parallel Coordinates Plots. IEEE
 [1] A.McCallum, K. Nigam, and L. H. Ungar. 2000. Efficient clustering of high-                   Trans. Vis. Comput. Graph. 23, 1 (2017).
     dimensional data sets with application to reference matching. In Knowledge
     Discovery and Data Mining. ACM, 169–178.
 [2] K. Andrews, M. Osmic, and G. Schagerl. 2015. Aggregated parallel coordinates:
     integrating hierarchical dimensions into parallel coordinates visualisations. In
     Proc. of Int. Conf. on Knowledge Technologies and Data-driven Business. ACM.
 [3] G. Andrienko and N. Andrienko. 2004. Parallel coordinates for exploring
     properties of subsets. In Proc. of Int. Conf. on Coordinated & Multiple Views in
     Exploratory Visualization. 93–104.
 [4] N. Bikakis, G. Papastefanatos, M. Skourla, and T. Sellis. 2017. A hierarchical
     aggregation framework for efficient multilevel visual exploration and analysis.
     Semantic Web 8, 1 (2017).


                                                                                        83