=Paper=
{{Paper
|id=Vol-1947/paper06
|storemode=property
|title=Towards the Design of Expressive Data Exploration Environments
|pdfUrl=https://ceur-ws.org/Vol-1947/paper06.pdf
|volume=Vol-1947
|authors=Thiago Nunes,Daniel Schwabe
|dblpUrl=https://dblp.org/rec/conf/semweb/NunesS17
}}
==Towards the Design of Expressive Data Exploration Environments==
Thiago Nunes and Daniel Schwabe
Department of Informatics
Pontifical Catholic University of Rio de Janeiro
R.M.S. Vicente 225
Gávea Rio de Janeiro, RJ, Brazil
+55 21 3527-1500
{tnunes, dschwabe}@inf.puc-rio.br
Abstract. Information exploration processes are usually recognized by their
inherent complexity, lack of knowledge and uncertainty, concerning both the
domain and the solution strategies. Even though there has been much work on
the development of computational systems supporting exploration tasks, the
lack of a formal understanding of the exploration process and the absence of a
proper separation of concerns approach in the design phase are the cause of many
expressivity issues and serious limitations. This work shows how the design
space for exploration environments can be structured by applying the separation
of concerns approach, with special emphasis on characterizing the issues that
must be addressed when designing the interface of such environments.
1 Introduction
Information exploration environments aim at supporting information gathering tasks
that often involve a high degree of complexity and a lack of user knowledge about the
data, which is spread over multiple items and data types, and that do not have a clear
ending [20]. Exploration tasks usually involve sequences of data interactions that
eventually lead to the desired outcome, which can be either a set of items or a
knowledge state acquired along the process [5]. These particular characteristics distinguish
exploration tasks from usual information retrieval tasks, where the user is assumed to
know precisely how to translate his/her information needs into a query and the
interaction is restricted to isolated sequences of query-responses [4]. As an example,
the task “write a paper about recently discovered treatments for diabetes” would
require exploratory behavior, while the task “discover who invented the light bulb”
can be solved in a single query against search engines. Given the complexity of the
tasks, exploration environments should be carefully designed to support a rich enough
variety of data processing actions and strategies, accessed through their interfaces.
A widely accepted motivation for exploration behavior is usually the need to
fill knowledge gaps that prevent the user from achieving his/her goals. Explorers
tend to experience high degrees of uncertainty along the process, mostly concerning
whether the desired knowledge state can be achieved given the available information
space, task constraints, and the computational support for data manipulation. Despite
the massive publication of semi-structured information, leveraged by the Linked Data
community1, and the number of exploration environments available, it is still hard to
assess to what extent an environment fits the exploration needs of the users.
In order to address this central research question, in previous works we proposed
the adoption of the separation of concerns principle, guided by Norman’s gulf
traversal theory [12], as a means to improve the discussions of the concerns involved
in the design of exploration tools. In that work we traced a parallel between Norman’s
semantic and articulatory distances, and the distinction between the exploration
operations and their realizations in the interface, which usually was not considered or
mentioned in state-of-the-art tools addressing exploration tasks [14]. Subsequently,
we proposed a first approximation of a framework of exploration operations
expressive enough to describe the majority of state-of-the-art tools, which allowed us
to analyze and compare the functional aspect of exploration tools, abstracting
interaction and interface issues [16]. As an example, consider a refinement operation
over a set of information items. A functional concern for refinements is whether the
tool allows disjunctions, conjunctions, or negations of filters. On the other hand, an
interaction/interface concern is which interface controls and interaction dialogues
support the specification of filters and logical connectors.
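The functional concern in this example can be illustrated independently of any interface. The following is a minimal sketch, assuming filters are plain predicates; the names all_of, any_of, negate, and the sample records are hypothetical and are not part of the framework's published definitions.

```python
def all_of(*filters):
    """Conjunction: accept an item only if every filter accepts it."""
    return lambda item: all(f(item) for f in filters)

def any_of(*filters):
    """Disjunction: accept an item if at least one filter accepts it."""
    return lambda item: any(f(item) for f in filters)

def negate(flt):
    """Negation: accept an item only if the filter rejects it."""
    return lambda item: not flt(item)

def refine(items, flt):
    """Keep the items accepted by the filter, preserving their order."""
    return [item for item in items if flt(item)]

# Hypothetical records, for illustration only.
papers = [
    {"title": "A", "year": 2015, "venue": "V1"},
    {"title": "B", "year": 2016, "venue": "V2"},
    {"title": "C", "year": 2017, "venue": "V1"},
]
recent = lambda p: p["year"] >= 2016
at_v1 = lambda p: p["venue"] == "V1"

recent_not_v1 = [p["title"] for p in refine(papers, all_of(recent, negate(at_v1)))]
recent_or_v1 = [p["title"] for p in refine(papers, any_of(recent, at_v1))]
```

Which widgets let the user build such a combination (checkboxes, a query line, a dialogue) is the separate interaction/interface concern.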
The proposed framework presents a rich semantics for describing the functional
aspect of exploration tools, and may guide the design of interaction dialogues and
interface widgets. Nevertheless, an open question remains: how to approach the
interaction/interface design space based on a formal layer of exploration operations?
This work presents a novel way to approach the design space for interfaces of
exploration environments, by leveraging an expressive formalization of exploration
actions and processes. To illustrate the discussions of the design issues and possible
solution alternatives, we use the case of the design and implementation of the XPlain
environment, which provides higher expressivity when compared to state-of-the-art
tools, as we demonstrate in [13]. The example uses the Linked Data OpenCitations
dataset1.
The contribution of this work is a framework that presents a novel approach to
characterize the interaction/interface issues. This framework serves as a guideline for
designing expressive exploration environments over semi-structured data. Even
though we discuss the issues and possible solutions in terms of a real exploration
environment, the framework allows abstractions that can be generalized to any
exploration environment design.
The remainder of this paper is organized as follows. Section 2 presents the related
work. Section 3 presents the separation of concerns approach applied to exploration
environments, leading to a three-layered architecture. Section 4 presents the data
model used in the definition of the exploration operations. Section 5 addresses issues
related to exploration actions and strategies. Section 6 addresses interaction/interface
design issues. Finally, we present the conclusions and future steps.
1 http://linkeddata.org/
2 Related Work
The need for separating visual representations from processing operations has been
established in the visualization area, which presents taxonomies, typologies, and
ontologies addressing at least these two concerns [2, 3, 7, 10, 18]. Chi’s work [10]
divides the design of a visualization system into a sequence of stages in a pipeline
that receives raw data as input and generates an interactive visualization of the raw
data as output. Each pipeline stage receives a set of data items, applies processing
operations to transform the data, and passes the transformed data to the next stage.
Operations such as “scroll”, “zoom”, “filter”, “rotate”, and “scale” can be carried
out in the view resulting from
the pipeline. The work in [2] presents a taxonomy of analytic operations for
describing visualization tasks containing, for example, “Cluster”, “Filter”, “Sort”, and
“Correlate” operations. The work in [7] presents a typology of abstract visualization
tasks addressing the Why, How, and What aspects of a visualization task
independently of the kind of visualization and of the task domain. The Why concerns
the goals, such as “discover” new information, “present” data, or “explore”. The How
presents the actions to achieve the goals, such as “select”, “navigate”, “filter”, and
“aggregate”. The What describes input and output resources handled by the tasks.
These approaches are valuable to promote some degree of separation of the
description of users’ goals, tasks, and operations from visual encoding details, but their
lack of formality makes it hard to analyze where they overlap and where they
differ. Moreover, they do not present detailed discussions of
interaction/interface design issues with respect to a given well-defined
conceptualization of exploration processes and strategies.
Visualization systems are concerned with encoding data in a visual representation
to foster human cognitive perception. Although interactive visualizations can be used
to explore a dataset to some extent, supporting exploration behavior is not their main
goal [19]. Moreover, even in interactive visualizations, the user is usually restricted
to a specific visual representation of the data aiming at highlighting a certain set of
dimensions. Since exploratory search tasks tend to be general and multifaceted [20], it
is very difficult to know in advance which data dimensions will suffice. Therefore, an
exploration environment should support a broader class of tasks that may even
include sense-making activities and manipulations of the raw data in order to select
proper dimensions to encode in visual representations. In this context, one advantage
of our reference framework is that it allows designers to situate visualization concerns
in the design of exploration environments. Within the framework, the projection of
the data onto a visual counterpart along with interaction controls and dialogue
structures is a concern of the interaction/interface layer. The data processing
interactions can be designed on top of the same framework of operations whose inputs
are visually encoded items and relations.
Beyond the works in the visualization field, there are some works addressing issues
related to the broader exploration field. The work in [6] presents a similar
architectural view of exploration environments and also abstracts the functional
aspects in the SeCoQL exploration language. However, it presents a restricted set of
operators containing only refine, pivot, and ranking, and does not approach interface
concerns in detail. The works in [1] and [21] propose a separation of interface details
from the underlying features, but there the main goal is to present a common
evaluation framework for comparison purposes and not a detailed separation of
concerns discussion. Moreover, the features are not formally defined, which is the
source of many ambiguities, such as different interactions for an operation being
mistakenly understood as being distinct features.
3 Reference Framework
A serious problem of exploration tools is the lack of a clear separation of the
exploration operators from their realization in the interface [15].
We propose the organization of the design space of exploration tools in a three-
layer architecture: Data Access, Functional, and Interaction/Interface. The Data
Access layer is responsible for providing access to data repositories and mapping their
data models to an abstract data model, which will be manipulated by the user
throughout the exploration task. The Functional layer presents a set of exploration
operators whose repeated compositions capture the solution strategies adopted by the
users. The Interaction/Interface layer provides proper access to both the operators and
the data items being manipulated. Fig. 1 shows a conceptual view of the design space.
Fig. 1. The layers of an exploration environment architecture.
The next sections discuss the design questions for each layer of the architecture. We
discuss solution ideas for each question in terms of the design and implementation of
XPlain, a generic and expressive exploration environment.
4 Data Model
The design questions addressed by this layer are how to provide access to the data
repositories and how to abstract the details of the data model adopted by the
repositories so that the same exploration actions can be executed over data
represented in different ways.
Independently of how the data is represented, the user effectively manipulates
items and relationships among them. Items can be organized in groups, such as papers
by author or papers by venue. Groups can also be formed along more than one
dimension, such as papers by author by publication year. Since such structures are not
directly mapped to traditional data models, we chose to model items and relations as
nested relations. As an example, “papers by author by publication year” is a three-
level nesting, having the papers grouped by year inside a group for each author.
Nested relations can be represented as trees, as Fig. 2 shows. We call “exploration
set” any nested relation generated by the execution of an exploration action.
Fig. 2. Nesting of papers by author by publication year.
This data model is very similar to the one used in NoSQL document-oriented
databases [9], where the nestings of an exploration set can be mapped to collections
and nested arrays. It can also be translated to the RDF model, where each path from
the root to the leaves can be mapped to a set of triples, using the reification
mechanism to describe statements over statements in case of trees having more than
three levels.
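The nested-relation model can be illustrated with ordinary nested mappings, in the spirit of the document-oriented representation mentioned above. The following is a minimal sketch with hypothetical item names; it is not XPlain's actual data structure. Inner nodes are dictionaries (groups) and the innermost values are lists of leaf items.

```python
# "Papers by author by publication year": author -> year -> papers.
# All identifiers below are hypothetical, for illustration only.
papers_by_author_by_year = {
    "author_a": {2015: ["paper_1"], 2016: ["paper_2", "paper_3"]},
    "author_b": {2016: ["paper_3"]},
}

def leaves(node):
    """Yield the leaf items of a nested relation, depth first."""
    if isinstance(node, dict):
        for child in node.values():
            yield from leaves(child)
    else:
        yield from node

def paths(node, prefix=()):
    """Yield every root-to-leaf path of the tree as a tuple."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from paths(child, prefix + (key,))
    else:
        for item in node:
            yield prefix + (item,)

all_leaves = list(leaves(papers_by_author_by_year))
all_paths = list(paths(papers_by_author_by_year))
```

Each tuple produced by paths corresponds to one root-to-leaf path of the tree, which is the unit that the text above maps to a set of RDF triples. Note that paper_3 appears under both authors, so it occurs twice among the leaves.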
5 Functional Layer
The main concern of this layer is to define exploration actions as operations over the
generic data model presented in Section 4 and the exploration process as a
composition of these operations. In order to find an expressive set of exploration
actions we carried out detailed analyses of a significant share, if not the majority, of the
state-of-the-art tools published in the literature. A summary can be found in [16]; for a
detailed description of the operations’ design process, refer to [13].
The operations of the functional layer are defined in terms of items and relations. A
dataset D = ⟨I, R⟩ is defined as a set of items I = {i1, i2,…, in} and a set of pairs of
items R = {⟨ia, ib⟩,…, ⟨ic, id⟩}, representing the relationships in the dataset.
We briefly present each proposed operation in terms of its input parameters, result
sets, and examples of use over the dataset defined before. The complete formalization
of the operations is beyond the scope of this paper and can be found in [13]. The
operations’ descriptions are as follows:
• Pivot(S, Rel): maps the leaf items of S onto a set of related items through the
relation Rel;
• Refine(S, Filter): restricts the leaves of S keeping only the items that match
the predicate Filter;
• Group(S, grel): maps each leaf item of S onto its group keys using the
grouping relation grel, where the leaf items are nested within their
respective group keys;
• Correlate(S, T): finds all paths from every source item in S to every
target item in T (many-to-many). A path is a set of items that connects a
source item to a target item, and each path is a different nesting in the result set;
• Rank(S, level, score): ranks a given nested level of S, where the relevance of
the items is obtained by a score function;
• Map(S, mRel): maps the items in S onto another set of items using the
computed relation mRel, where mRel is a relation/function provided by the
environment, such as counts and format and scale converters;
• Unite(S, T): receives two exploration sets and applies a set union to each
level of S and T;
• Intersect(S, T): receives two exploration sets and returns a tree that is the
intersection between the levels of the input set trees;
• Diff(S, T): receives two exploration sets and applies a set difference between
the leaves of S and the leaves of T.
We call auxiliary functions or auxiliary relations all functions/relations that are not
defined as an exploration operator, such as mapping relations for Map, score
functions for Rank, and filtering predicates for Refine.
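To make the operation signatures more concrete, the following is a minimal sketch of Pivot, Refine, and Group over flat sets of items. It is a simplification under stated assumptions: the complete definitions in [13] operate on nested exploration sets, whereas here sets are flat, each named relation is encoded as a set of (source, target) pairs, and all item and relation names are hypothetical.

```python
from collections import defaultdict

# Assumed encoding: relation name -> set of (source, target) pairs.
RELATIONS = {
    "author": {("p1", "ana"), ("p2", "ana"), ("p3", "bob")},
    "year": {("p1", 2015), ("p2", 2016), ("p3", 2016)},
}

def pivot(items, rel):
    """Map the items onto the set of items related to them through rel."""
    return {target for (source, target) in RELATIONS[rel] if source in items}

def refine(items, predicate):
    """Keep only the items that match the filtering predicate."""
    return {item for item in items if predicate(item)}

def group(items, grel):
    """Nest each item under its group keys, given by the relation grel."""
    groups = defaultdict(set)
    for source, key in RELATIONS[grel]:
        if source in items:
            groups[key].add(source)
    return dict(groups)

def related(rel, value):
    """Auxiliary predicate: the item is related to value through rel."""
    return lambda item: (item, value) in RELATIONS[rel]

papers = {"p1", "p2", "p3"}
papers_2016 = refine(papers, related("year", 2016))
authors_2016 = pivot(papers_2016, "author")
by_author = group(papers, "author")
```

In this flat simplification, Unite, Intersect, and Diff reduce to Python's set operators (|, &, -) applied to the leaves. The composition refine-then-pivot above is an example of a solution strategy expressed purely at the functional level.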
In summary, independently of the interface design, the range of solution strategies
for exploration tasks is directly related to the set of operations supported and the
possibilities of combinations of these operations. This is the main concern of this
layer. For details and more examples of applications of the operations in a real case,
refer to the case study published in [15].
6 Interaction/Interface
Having defined the operators and their possible combinations for exploration tasks,
the interface designer must focus exclusively on deciding which interaction
paradigms are more adequate. This section presents a discussion of
interaction/interface issues separating its concerns from the exploration actions and
compositions defined by the functional layer. Based on the concepts of the functional
layer, the main goals of the interface are: (1) to present one or many exploration sets,
where the items of each set may be hierarchically organized (nested); (2) to allow the
user to select an operator/composition and specify its parameters; (3) to allow the user
to visualize, manage, and browse the exploration trail. To ground the
discussion, we use a real problem situation discussed in [11] in the scientific
publications field using the Open Citations [17] Linked Data dataset. Fig. 3 shows a
screenshot of the interface of the XPlain environment. A screencast of a session can
be seen at https://vimeo.com/227356693.
Next, we discuss the interaction/interface design issues in the light of the
separation of concerns approach, using the XPlain interface to exemplify the issues
and possible solution ideas.
6.1 Requirement 1: Manipulation of Exploration Sets and Items
The first challenge for the design of exploration environment interfaces is how to
present the data being manipulated and its relationships. The biggest issue to be dealt
with is handling the potential excess of information to be presented, as the number of
items can be very large. Here, Shneiderman’s visual information seeking mantra
“Overview first, zoom and filter, then details-on-demand” [18] should be considered
as a guideline.
Considering the conceptualization of the exploration process as a functional
composition that results in multiple exploration sets, the design alternatives for
presenting those sets are: show one exploration set at a time (unifocal) or show many
sets at a time (multifocal).
Fig. 3. The interface of the XPlain environment. (A) keyword search controls; (B) Exploration
operations toolbar; (C) Exploration sets area; (E) Exploration trail view.
Unifocal interfaces have the advantage of reducing the amount of information shown
at a given time and requiring fewer focus management interactions. However, they do
not properly support operations that take more than one set as input, such as
comparisons of alternatives [8]. For example, imagine a user interested in comparing
the publication profiles (e.g., venues in common) of two researchers in a certain
period of time. S/he filters the publications of each researcher by the desired period,
pivots by venue, and computes the intersection (or difference) of the two sets. In a
unifocal interface s/he must apply this sequence of operations to each researcher one
at a time, and somehow apply the intersection operation (if available in the
functional layer) to the results, which will not both be available in the same interface.
Note that if the functional layer does not provide a set operation, s/he must annotate
the results and then make the comparisons offline. For web browsers, a common
strategy is to open two or more windows and organize the windows to support the
comparisons, but this is limited by available screen real estate.
In a multifocal view s/he is able to visualize the two sets of venues simultaneously
and compute the intersection straightforwardly. The drawback of multifocal interfaces
is the need to design focus management controls, such as maximization,
minimization, and restoration controls, to avoid information overload. If comparisons
between alternatives are not required or the device is very restricted in screen size,
unifocal interfaces may be more appropriate.
An additional design issue is the layout and presentation of the relationships
between the sets on the screen. To illustrate one possible set of options, in XPlain we
opted for a multifocal interface to better support operations over multiple sets, where
each set presentation has minimization/maximization controls. Fig. 3C shows two
exploration sets in the workspace. The last generated set is always placed at the top of
the screen, and the exploration trail presents the relationships between the sets and also
allows the user to navigate to intermediary sets by clicking on the corresponding node
in the graph. After deciding between unifocal and multifocal presentations, it is
necessary to define the organization and interactions for the exploration items and the
relations they participate in, within each exploration set.
According to the functional model, there are two types of item relations that must
be considered: schema relations and computed relations. Computed relations are
relations created along the exploration process, such as grouping relations or
mappings, which do not necessarily have an identifier, such as predicate URIs in RDF
or column names. Two common representations of schema relations found in the
literature are tabular format, where each relation becomes a column, and graph
format, where the relations are the edges between nodes. A tabular format may be
easier for spreadsheet users, while graph views favor the visualization of the joins
between the items.
In XPlain we took a different approach by adopting a directory metaphor, where
items are mapped to directories and both schema and computed relations are nested
directories, as shown in Fig. 4. This choice allows a natural representation of groups,
where each group is represented as a separate directory. The drawback is the
visualization of items that participate in more than one relation, as items related to
two different nested items will appear repeatedly, in two “directories”.
Even in a unifocal interface, the number of items within a single exploration set
can be large, so the designer should weigh alternative choices for presenting the set
using scroll or pagination controls.
It is also typically desirable to apply some natural ordering to the items. Although
our model describes ranking as an independent exploration operator, it can also be
used in conjunction with other operators. Thus, even when operators other than
ranking are selected, such as keyword refine or grouping, the interface can also
compose them with a ranking function and send the result to the server in order to
enforce a natural ranking for the result set. In XPlain we opted for pagination controls
with a limit of twenty items per page and an alphabetical or numerical ordering of
results.
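The combination of a natural ordering with pagination can be sketched as follows. This is a minimal illustration, assuming the ordering is applied before slicing; the function name page and the sample items are hypothetical, and only the page size of twenty comes from the text above.

```python
def page(items, number, size=20, key=None):
    """Return page `number` (1-based) of the items in their natural order.

    The ordering plays the role of the ranking step that the interface
    composes with the selected operator; `key` is an optional sort key.
    """
    ordered = sorted(items, key=key)
    start = (number - 1) * size
    return ordered[start:start + size]

# Hypothetical exploration set with 45 items.
items = [f"paper_{i:02d}" for i in reversed(range(45))]
first_page = page(items, 1)
last_page = page(items, 3)
```

Sorting the whole set before slicing keeps the page contents stable regardless of the order in which the items were produced, at the cost of ordering items that may never be displayed.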
Fig. 4. Visual representation of an exploration set as a nesting of items and relations.
In summary, the interface design issues for the manipulation of exploration sets and
items are:
1. Choosing between unifocal and multifocal views
a. If multifocal: design appropriate focus management controls for the sets,
such as maximization, minimization, restore, and hiding controls;
2. Deciding where and how to show relationships between exploration sets;
3. Designing focus management controls for the items within the set, e.g.,
pagination, scroll, or a combination of both;
4. Determining the best visualization for items and item relations (schema and
computed): graphs, trees, tables, lists, etc.;
5. Establishing a natural order for presenting exploration items and, possibly,
adding sorting controls.
6.2 Requirement 2: Applying Exploration Operations
The application of exploration operations presents another class of
interface/interaction design issues, which concern both the selection and activation of
an operator, and the definition of its parameters. The functional layer defines four
types of arguments: exploration items, auxiliary functions, relations, and relation
paths. Next, we argue that each argument type may require distinct interaction
models.
To invoke an exploration action the user must assign a value to each input
parameter of the invoked operation. Each assignment is a binding, i.e., a pair
⟨parameter, value⟩ that will be evaluated when the operation is executed. For example,
pivoting requires two bindings: the definition of the input set and the pivoting
relation. For binding definitions, the interaction issues are: defining the assignment
order for parameters, and defining the interaction that will support the binding
definition. The latter issue depends on both the argument type and the operation.
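The notion of bindings can be sketched as a small object that collects parameter/value pairs in whatever order the interface chooses and executes the operation once all parameters are bound. This is an illustrative sketch only; the class Invocation, its methods, and the sample data are hypothetical and do not describe XPlain's implementation.

```python
class Invocation:
    """Collects the bindings (parameter, value) for one operation."""

    def __init__(self, operation, parameters):
        self.operation = operation
        self.parameters = tuple(parameters)
        self.bindings = {}

    def bind(self, parameter, value):
        """Record one binding; the order of calls is up to the interface."""
        if parameter not in self.parameters:
            raise KeyError(f"unknown parameter: {parameter}")
        self.bindings[parameter] = value
        return self

    def missing(self):
        """List the parameters that still have no binding."""
        return [p for p in self.parameters if p not in self.bindings]

    def execute(self):
        """Evaluate the operation once every parameter is bound."""
        if self.missing():
            raise ValueError(f"unbound parameters: {self.missing()}")
        return self.operation(**self.bindings)

# Hypothetical pivoting relation, encoded as (source, target) pairs.
AUTHOR = {("p1", "ana"), ("p2", "bob")}

def pivot(input_set, relation):
    return {target for (source, target) in relation if source in input_set}

invocation = Invocation(pivot, ["input_set", "relation"])
invocation.bind("relation", AUTHOR)      # relation chosen first
still_missing = invocation.missing()
invocation.bind("input_set", {"p1"})     # input set chosen second
result = invocation.execute()
```

Because the bindings are kept apart from the interaction that produces them, the same operation can be driven by a set-first or a relation-first dialogue without changes to the functional layer.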
With regard to the order of the assignments, consider the pivoting action as an
example. Some design alternatives are: the user selects the input set, activates the
pivoting operator, and the system shows the relations for selection (e.g., interaction in
SeCo [6]); alternatively, when the user selects the set, the interface could show all
relations as selectable elements whose activation causes a pivoting over the selected
relation. For tabular presentations, the first alternative may be better due to layout
organization issues; however, for graph and list presentations the second option seems
to be closer to hypertext browsing, which may favor Web users. The second option is
the solution adopted by the majority of faceted search tools with pivoting
functionality.
Another example is the definition of bindings for the Refine operation, where the
user should select the filtering relation, the filtering predicate and a value. One option
is to simply allow the selection of values, where the relation is inferred and the
filtering predicate is always an equality test. Another option is to allow the selection
of the relation, the value, and the filtering predicate, which may be other than an
equality comparison, e.g., greater-than or less-than operators. Thus, there can be many
distinct interaction sequences for the definition of the bindings for each operation.
The next issue concerns specific interactions for different types of parameters.
Consider the case of the Refine operation, where the user must define bindings for
the relation or the relation path, the comparison operator (e.g. =, <, >), and the
restriction value for the relation. With regard to the relation, the interface has to
allow both the selection of relations and of relation paths.
Fig. 5. View for the Refine operation. The user selects relations (A) or relation paths (B) and
restriction values for each filter. Filters can be disjunctive or conjunctive according to the
selected logical operator.
For example, in Open Citations, if we want to refine papers by venue names, we must
bind the relation path :isDocumentContextFor:isHeldBy:name to the relation
parameter, as shown in Fig. 5B. One design option is to allow the user to pivot
relation by relation in this path until reaching the next to last relation, which is the
:isHeldBy relation in this case. At this point, the interface can show the possible
relations and values for the holders, which includes the :name relation and the actual
venue names for selection. The computation is, therefore, carried out relation by
relation until the desired path is achieved. Next, a selection of a venue name will
cause the refinement over the path :isDocumentContextFor:isHeldBy:name. This is
the most common interaction for path refinements found in faceted search interfaces,
but considering the size of paths, the possibility of mistakes, and the number of
refinements required for the task, this design can be cumbersome. Another option is to
allow the visualization of relation chains on demand, where the user can explore and
select relation paths without causing a context change (pivoting). XPlain implements
this design option with relation nestings built at runtime: when a relation :x is
nested with a relation :y, there is a relation path :x:y in the dataset representation.
This design allows the user to browse the nestings in order to find the desired path,
with reasonable performance. Fig. 5B shows the nesting of :isDocumentContextFor,
:isHeldBy, and :name relations.
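A refinement over a relation path can be sketched as follows. The relation names are the ones used in the example above, but the triples, the item identifiers, and the venue names are hypothetical, and the functions are an illustration rather than XPlain's implementation.

```python
import operator

# Hypothetical (subject, relation, object) triples, for illustration only.
TRIPLES = {
    ("paper1", ":isDocumentContextFor", "event1"),
    ("event1", ":isHeldBy", "holder1"),
    ("holder1", ":name", "Venue A"),
    ("paper2", ":isDocumentContextFor", "event2"),
    ("event2", ":isHeldBy", "holder2"),
    ("holder2", ":name", "Venue B"),
}

PATH = (":isDocumentContextFor", ":isHeldBy", ":name")

def follow(items, relation):
    """One step: the objects reached from the items through the relation."""
    return {o for (s, r, o) in TRIPLES if r == relation and s in items}

def follow_path(item, path):
    """Values reached from one item by following the relations in order."""
    frontier = {item}
    for relation in path:
        frontier = follow(frontier, relation)
    return frontier

def refine_by_path(items, path, compare, value):
    """Keep the items with at least one path value satisfying the comparison."""
    return {
        item
        for item in items
        if any(compare(found, value) for found in follow_path(item, path))
    }

refined = refine_by_path({"paper1", "paper2"}, PATH, operator.eq, "Venue A")
```

The three bindings discussed in the text appear here as separate arguments: the relation path, the comparison operator, and the restriction value. The input set itself is unchanged while the path is being followed, which mirrors the absence of a context change.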
Auxiliary functions, such as the comparison operators, the scoring function, and
the mapping functions can be picked out in the interface from a pre-defined set. For
example, the Refine modal dialogue presents a selection box with all comparison
operators available, as Fig. 5A shows. However, since it is very difficult to define a
complete range of functions that covers all problem domains, the interface can also
allow the user to describe the function in some computable language. Consider a user
wishing to convert a set of measures to a different scale. The functional layer provides
the Map function for such tasks, but the desired scale converter is not among the
available mapping functions. The user could simply type the formula, and the interface
would create the binding. Therefore, the interaction design should not only consider
interface selection, but also textual inputs, with some validation in the case of
function definitions, and filterable selection lists. The same issue also occurs for
bindings of exploration items. XPlain’s interface allows the definition of new
auxiliary functions using a Domain Specific Language (DSL) implemented by the
functional layer. We also chose filterable selection lists for the definition of the
filtering values, as shown in Fig. 5B.
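The idea of a typed formula with validation can be sketched as follows. XPlain's DSL is not described in this paper, so the sketch substitutes a restricted arithmetic expression over a single variable x, validated with Python's standard ast module; the function name compile_formula and the accepted grammar are assumptions made for illustration.

```python
import ast

# Node types accepted in a formula: arithmetic over numbers and `x` only.
_ALLOWED = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant, ast.Name, ast.Load,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub,
)

def compile_formula(text):
    """Validate a formula over `x` and return it as a one-argument function."""
    tree = ast.parse(text, mode="eval")  # raises SyntaxError on malformed input
    for node in ast.walk(tree):
        if not isinstance(node, _ALLOWED):
            raise ValueError(f"unsupported element: {type(node).__name__}")
        if isinstance(node, ast.Name) and node.id != "x":
            raise ValueError(f"unknown variable: {node.id}")
        if isinstance(node, ast.Constant) and not isinstance(node.value, (int, float)):
            raise ValueError("only numeric constants are allowed")
    code = compile(tree, "<formula>", "eval")
    # Evaluate without builtins; only `x` is visible to the formula.
    return lambda x: eval(code, {"__builtins__": {}}, {"x": x})

# A scale converter typed by the user (Celsius to Fahrenheit).
to_fahrenheit = compile_formula("x * 9 / 5 + 32")
converted = [to_fahrenheit(value) for value in (0, 100)]
```

The resulting function can then be bound to the mRel parameter of Map. Rejecting everything outside a small whitelist is one possible form of the validation mentioned above; a richer DSL would need a correspondingly richer checker.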
So far we have discussed the interaction/interface design issues and possible
solution ideas for the execution of single operations. For some recurring functional
compositions, there can be alternative interaction styles to the operation-at-a-time
approach. An example of such compositions can be found in the expansion of an
exploration item, shown in Fig. 4. When a user double-clicks an item in the
exploration set, XPlain executes a composition of Refine and Pivot to, respectively,
select the clicked item and pivot to its set of relations. The relations are shown as
nested items that can also be expanded. This interaction allows the user to browse the
graph of relations of an item in a follow-your-nose style without causing a context
change or the addition of a new exploration set for each Refine and Pivot executed.
This illustrates the fact that the designer should explore alternative interaction and
interface designs for combinations of operators that are more appropriate for a given
task context.
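The item expansion described above can be sketched as a single interface-level function built from Refine and Pivot. The triples and names below are hypothetical, and pivoting over every relation at once is a simplification adopted for the sketch.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples, for illustration only.
TRIPLES = [
    ("paper1", ":author", "author1"),
    ("paper1", ":year", 2016),
    ("paper2", ":author", "author2"),
    ("author1", ":name", "Author One"),
]

def refine(items, predicate):
    """Keep only the items that match the predicate."""
    return {item for item in items if predicate(item)}

def pivot_all(items):
    """Pivot over every relation of the items: relation -> related items."""
    nested = defaultdict(set)
    for subject, relation, obj in TRIPLES:
        if subject in items:
            nested[relation].add(obj)
    return dict(nested)

def expand(exploration_set, clicked):
    """Refine to the clicked item, then pivot to its relations.

    The result is returned for display as nested items; the exploration
    set is left untouched and no new exploration set is created.
    """
    selected = refine(exploration_set, lambda item: item == clicked)
    return pivot_all(selected)

workspace = {"paper1", "paper2"}
expanded = expand(workspace, "paper1")
nested_again = expand(expanded[":author"], "author1")
```

Expanding a nested item applies the same composition to the set it belongs to, which gives the follow-your-nose browsing style described above.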
Capturing common combinations that would require more appropriate interaction
designs is likely to be difficult, since the combinations may not be obvious for some
domains and contexts. However, we expect that these combination patterns will
emerge with the continued use of the environment. Since the patterns are formally
described and recorded, they can be mined from the environment log and analyzed
from the perspective of separate interface and interaction models.
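One simple way to look for recurring combinations in such a log is to count consecutive operation sequences. The sketch below assumes that each session is recorded as a list of operation names; the log format, the function name, and the sample sessions are hypothetical.

```python
from collections import Counter

def operation_ngrams(sessions, n=2):
    """Count every run of n consecutive operations across the sessions."""
    counts = Counter()
    for operations in sessions:
        for start in range(len(operations) - n + 1):
            counts[tuple(operations[start:start + n])] += 1
    return counts

# Hypothetical log: one list of operation names per exploration session.
log = [
    ["Refine", "Pivot", "Refine", "Pivot"],
    ["Refine", "Pivot", "Group"],
]
pairs = operation_ngrams(log)
most_frequent = pairs.most_common(1)[0]
```

Frequent sequences found this way are only candidates; whether a combination deserves a dedicated interaction remains a design decision.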
The interaction/interface design issues presented for the specification of
exploration operations are:
1. The ordering for the specification of bindings for each operation.
2. The interaction required for specifying bindings for each parameter,
depending on the parameter type and the operation. Some possibilities
discussed were: interface selection, textual inputs, filterable selects,
computable specification for auxiliary functions, and navigation through
relation paths for relation parameters.
3. The possibility of modeling particular interactions for specific combinations
of operators, such as for combinations of Refine and Pivot, and for
compositions of Refine, Intersect, and Unite that can be modeled as a faceted
search interaction.
6.3 Requirement 3: Exploration Trail Management and Browsing
It has been recognized that exploration tools should allow the user to visualize the
history of the exploration actions [19]. The functional layer defines relationships
between result sets, where the result set of a previous action can serve as the input for
the next. Hence, the design issues at this point are how to present the exploration trail
and how to allow its manipulation.
Some interface options for visualizing the exploration trail are trees or graphs. The
tree representation has the advantage of allowing the user to collapse or expand the
branches, which may be a good option considering the “details-on-demand” rule of
the information seeking mantra. However, since the functional layer presents
operations that receive two sets as input (e.g., unite, intersect, and diff), tree
representations present a drawback because the result set of these operations must be
repeated in two branches. With a tree representation, it is not easy to perceive these join
nodes, i.e., sets resulting from combinations of two input sets. In XPlain we chose a
graph representation in order to enhance the perception of join nodes. For the
following examples, consider the case of a user reviewing a paper. One review
strategy is to find relevant papers in the same area as the paper under review that it
does not cite. Fig. 6 shows an exploration trail example for the case study of
“finding relevant and not cited papers”. The join node is the set difference operation.
The graph in Fig. 6 is a visual representation of the sequence of operations applied
along the exploration process, where each node is a result set and the arrows
represent the operations applied. The “START” node represents the whole dataset and
the highlighted node “Relevant and not Cited Papers” is the result of the difference
between the citations of the paper
being reviewed and the top 20 most relevant papers of the Semantic Web area,
according to the number of incoming citations.
The graph in Fig. 6 is more than just a visual representation of the exploration trail:
it can also be used as a first-class object, where the user can parameterize the
operations and reevaluate dependent branches. For example, the user could replace
the set “Semantic Web Papers” in the exploration trail in Fig. 6 by a set of papers in
another research field and reevaluate the entire branch, thus reusing an exploration
trail for different papers of distinct research fields. In other words, it is possible to
reapply the strategy used to solve a task, as represented by the exploration trail.
Fig. 6. Graph representation of the functional composition for the task “finding relevant and not
cited papers”.
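A minimal sketch of an exploration trail treated as a first-class object follows. It assumes the trail is stored as a directed acyclic graph whose nodes hold the operation that produced them, and it simplifies the task of Fig. 6 by omitting the ranking step. The class `TrailNode`, the node name “Cited Papers”, and the sample sets are hypothetical.

```python
from typing import Callable, Dict, List, Set


class TrailNode:
    """A node of the exploration trail: a result set plus the operation that produced it."""

    def __init__(
        self,
        name: str,
        operation: Callable[..., Set[str]],
        inputs: List["TrailNode"],
    ) -> None:
        self.name = name
        self.operation = operation
        self.inputs = inputs  # two inputs make this a join node

    def evaluate(self, overrides: Dict[str, Set[str]]) -> Set[str]:
        """Evaluate this node, replacing any node named in `overrides` by the given set."""
        if self.name in overrides:
            return overrides[self.name]
        return self.operation(*(node.evaluate(overrides) for node in self.inputs))


# Hypothetical leaf sets standing in for results computed from the dataset.
semantic_web = TrailNode("Semantic Web Papers", lambda: {"p1", "p2", "p3"}, [])
cited = TrailNode("Cited Papers", lambda: {"p2"}, [])

# Join node: set difference between the relevant papers and the cited ones.
not_cited = TrailNode(
    "Relevant and not Cited Papers",
    lambda relevant, cited_papers: relevant - cited_papers,
    [semantic_web, cited],
)

original = not_cited.evaluate({})
reused = not_cited.evaluate({"Semantic Web Papers": {"d1", "p2"}})
```

Replacing one input set and reevaluating the dependent branch reuses the same strategy for a different research field without rebuilding the trail.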
Once we recognize that an exploration is, in the end, also a function, the interaction
issues for allowing the reevaluation of a functional composition become quite similar
to the issues concerning the definition of bindings for the operations presented in the
previous section. The additional step is to consider the union of bindings from all
operations in the composition as bindings of the exploration. Therefore, the
reevaluation of the functional composition requires the redefinition of one or many
bindings of some operations. The interface could show the bindings and ask which
ones must be replaced for the reevaluation. The same design decisions adopted for the
definition of bindings for each argument type also apply for the redefinition of
bindings of functional compositions (see footnote 2).
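The idea of treating the union of operation bindings as the bindings of the whole exploration can be sketched as follows. The representation of a composition as a list of (operation name, parameters) pairs, the key format, and all parameter names are assumptions made for this illustration.

```python
from typing import Any, Dict, List, Tuple

Operation = Tuple[str, Dict[str, Any]]  # (operation name, its parameter bindings)


def composition_bindings(operations: List[Operation]) -> Dict[str, Any]:
    """Union of the bindings of all operations, keyed by 'step.operation.parameter'."""
    bindings: Dict[str, Any] = {}
    for step, (name, params) in enumerate(operations):
        for param, value in params.items():
            bindings[f"{step}.{name}.{param}"] = value
    return bindings


def rebind(operations: List[Operation], replacements: Dict[str, Any]) -> List[Operation]:
    """Return a copy of the composition with the selected bindings replaced."""
    rebound: List[Operation] = []
    for step, (name, params) in enumerate(operations):
        new_params = {
            param: replacements.get(f"{step}.{name}.{param}", value)
            for param, value in params.items()
        }
        rebound.append((name, new_params))
    return rebound


# Hypothetical composition; parameter names are illustrative only.
COMPOSITION: List[Operation] = [
    ("Refine", {"relation": "area", "value": "Semantic Web"}),
    ("Diff", {"other": "citations of the reviewed paper"}),
]

bindings = composition_bindings(COMPOSITION)
rebound = rebind(COMPOSITION, {"0.Refine.value": "Databases"})
```

An interface could list the entries of `bindings`, let the user choose which ones to replace, and then reevaluate the composition returned by `rebind`.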
The interaction/interface design issues for exploration trail management and
browsing are:
1. The visual encoding for the exploration trail, which includes both
exploration sets and their dependencies;
2. Allow the user to browse the exploration sets in the nodes of the exploration
trail;
3. Allow the user to access the bindings for specific operation/functional
compositions;
4. Define the interaction for binding redefinitions and reevaluations from the
exploration trail.
Footnote 2: This feature is currently under development.
In summary, we have shown how separating the concerns of interaction/interface
design from the operations of the functional layer, together with the use of the functional
layer as a guide for what the interface should provide in specific task contexts, guides
the discussion of interaction possibilities. Since the main concern of this work is to
explore the design space of exploration environments, the XPlain interface is one
possible interface and interaction model for the functional layer; even though it
presents full expressivity, it may not be efficient for all exploration contexts and
users.
7 Conclusion and Future Directions
This work presents a novel way to approach the design space of exploration
environment interfaces based on the separation of interface/interaction aspects from
the exploration operations and compositions, and data access concerns. The main goal
is to present and discuss abstract design issues and solution ideas for the interaction
design of exploration environments from the perspective of a well-defined and
expressive framework of exploration actions. We also based the discussion on the
design and development of a new expressive exploration environment in order to
demonstrate the occurrence of the design issues and possible solutions in a realistic
design case.
As future work, we plan to conduct user studies and analyze the usage of each
operator within an exploration case. Once there is agreement on the operators and
compositions required, usability studies must be carried out to devise efficient
interfaces and interaction dialogs.
Moreover, we plan to investigate the benefits of the reuse of functional
compositions in future explorations.
Acknowledgement
The authors were partially supported by CNPq project 557128/200-9 National Science,
Technology Institute on Web Science, CAPES, and Google Research Program.
8 References
1. Alahmari, F. et al.: Evaluating Semantic Browsers for Consuming Linked Data.
Proceedings of the Twenty-Third Australasian Database Conference - Volume 124.
pp. 89–98 Australian Computer Society, Inc., Darlinghurst, Australia (2012).
2. Amar, R. et al.: Low-level components of analytic activity in information
visualization. Proc. - IEEE Symp. Inf. Vis. INFO VIS. 111–117 (2005).
3. Amar, R. a., Stasko, J.T.: Knowledge precepts for design and evaluation of
information visualizations. IEEE Trans. Vis. Comput. Graph. 11, 4, 432–442 (2005).
4. Bates, M.J.: The design of browsing and berrypicking techniques for the online search
interface. Online Inf. Rev. 13, 5, 407–424 (1989).
5. Belkin, N.J. et al.: Ask for Information Retrieval: Part I. Background and Theory. J.
Doc. 38, 2, 61–71 (1982).
6. Bozzon, A. et al.: Exploratory search framework for Web data sources. VLDB J. 22, 5,
641–663 (2013).
7. Brehmer, M., Munzner, T.: A multi-level typology of abstract visualization tasks.
IEEE Trans. Vis. Comput. Graph. 19, 12, 2376–2385 (2013).
8. Buschbeck, S. et al.: Parallel faceted browsing. CHI ’13 Extended Abstracts on
Human Factors in Computing Systems on - CHI EA ’13. p. 3023 ACM Press, New
York, New York, USA (2013).
9. Cattell, R.: Scalable SQL and NoSQL data stores. ACM SIGMOD Rec. 39, 4, 12
(2011).
10. Chi, E.H.: A Taxonomy of Visualization Techniques using the Data State Reference
Model. Inf. Vis. 2000. InfoVis 2000. IEEE Symp. 94301, Table 2, 69–75 (2000).
11. Di Iorio, A. et al.: Exploring Bibliographies for Research-related Tasks. Proceedings
of the 24th International Conference on World Wide Web - WWW ’15 Companion.
pp. 1001–1006 ACM Press, New York, New York, USA (2015).
12. Norman, D.A., Draper, S.W.: User Centered System Design; New Perspectives on
Human-Computer Interaction. L. Erlbaum Associates Inc., Hillsdale, NJ, USA (1986).
13. Nunes, T.: A Model for Exploration of Semi-Structured Datasets. Pontifical Catholic
University of Rio de Janeiro (2017).
14. Nunes, T., Schwabe, D.: Exploration of Semi-Structured Data Sources. 3rd Int. Work.
Intell. Explor. Semant. Data (IESD 2014). (2014).
15. Nunes, T., Schwabe, D.: Frameworks for Information Exploration – A Case Study. 4th
International Workshop on Intelligent Exploration of Semantic Data (IESD 2015).
(2015).
16. Nunes, T., Schwabe, D.: Frameworks of Information Exploration - Towards the
Evaluation of Exploration Systems. Proceedings of the 5th International Workshop on
Intelligent Exploration of Semantic Data - IESD ’16, Kobe, Japan (2016).
17. Peroni, S. et al.: Setting our bibliographic references free: towards open citation data.
J. Doc. 71, 2, 253–277 (2015).
18. Shneiderman, B.: The eyes have it: a task by data type taxonomy for information
visualizations. Proceedings 1996 IEEE Symposium on Visual Languages. pp. 336–343
IEEE Comput. Soc. Press (1996).
19. White, R.W., Roth, R.A.: Exploratory Search: Beyond the Query-Response Paradigm.
Synth. Lect. Inf. Concepts, Retrieval, Serv. 1, 1, 1–98 (2009).
20. Wildemuth, B.M., Freund, L.: Assigning search tasks designed to elicit exploratory
search behaviors. Proc. Symp. Human-Computer Interact. Inf. Retr. - HCIR ’12. C, 1–
10 (2012).
21. Wilson, M.L. et al.: Evaluating advanced search interfaces using established
information-seeking models. J. Am. Soc. Inf. Sci. Technol. 60, 7, 1407–1422 (2009).