Towards the Design of Expressive Data Exploration Environments

Thiago Nunes and Daniel Schwabe
Department of Informatics, Pontifical Catholic University of Rio de Janeiro
R.M.S. Vicente 225 Gávea, Rio de Janeiro, RJ, Brazil
+55 21 3527-1500
{tnunes, dschwabe}@inf.puc-rio.br

Abstract. Information exploration processes are characterized by inherent complexity, lack of knowledge, and uncertainty concerning both the domain and the solution strategies. Even though there has been much work on the development of computational systems supporting exploration tasks, the lack of a formal understanding of the exploration process and the absence of a proper separation of concerns in the design phase cause many expressivity issues and serious limitations. This work shows how the design space for exploration environments can be structured by applying the separation of concerns approach, with special emphasis on characterizing the issues that must be addressed when designing the interface of such environments.

1 Introduction

Information exploration environments aim at supporting information gathering tasks that often involve a high degree of complexity and a lack of user knowledge about the data, which is spread over multiple items and data types, and that do not have a clear ending [20]. Exploration tasks usually involve sequences of data interactions that eventually lead to the desired outcome, which can be either a set of items or a knowledge state acquired along the process [5]. These characteristics distinguish exploration tasks from usual information retrieval tasks, where the user is assumed to know precisely how to translate his/her information needs into a query and the interaction is restricted to isolated sequences of query-responses [4].
As an example, the task “write a paper about recently discovered treatments for diabetes” would require exploratory behavior, while the task “discover who invented the light bulb” can be solved with a single query against a search engine. Given the complexity of the tasks, exploration environments should be carefully designed to support a rich enough variety of data processing actions and strategies, accessed through their interfaces.

A widely accepted motivation for exploration behavior is the need to fill knowledge gaps that prevent the user from achieving his/her goals. Explorers tend to experience high degrees of uncertainty along the process, mostly concerning whether the desired knowledge state can be achieved given the available information space, task constraints, and the computational support for data manipulation. Despite the massive publication of semi-structured information, leveraged by the Linked Data community¹, and the number of exploration environments available, it is still hard to assess to what extent an environment fits the exploration needs of the users.

In order to address this central research question, in previous work we proposed the adoption of the separation of concerns principle, guided by Norman’s gulf traversal theory [12], as a means to improve the discussion of the concerns involved in the design of exploration tools. In that work we drew a parallel between Norman’s semantic and articulatory distances and the distinction between the exploration operations and their realizations in the interface, a distinction that was usually not considered or mentioned in state-of-the-art tools addressing exploration tasks [14].
Subsequently, we proposed a first approximation of a framework of exploration operations expressive enough to describe the majority of state-of-the-art tools, which allowed us to analyze and compare the functional aspect of exploration tools, abstracting interaction and interface issues [16]. As an example, consider a refinement operation over a set of information items. A functional concern for refinements is whether the tool allows disjunctions, conjunctions, or negations of filters. On the other hand, an interaction/interface concern is which interface controls and interaction dialogues support the specification of filters and logical connectors. The proposed framework presents a rich semantics for describing the functional aspect of exploration tools, and may guide the design of interaction dialogues and interface widgets. Nevertheless, a remaining open question is how to approach the interaction/interface design space based on a formal layer of exploration operations.

This work presents a novel way to approach the design space for interfaces of exploration environments, by leveraging an expressive formalization of exploration actions and processes. To illustrate the discussion of the design issues and possible solution alternatives, we use the case of the design and implementation of the XPlain environment, which provides higher expressivity when compared to state-of-the-art tools, as we demonstrate in [13]. The example uses the Linked Data OpenCitations dataset¹.

The contribution of this work is a framework that presents a novel approach to characterize the interaction/interface issues. This framework serves as a guideline for designing expressive exploration environments over semi-structured data. Even though we discuss the issues and possible solutions in terms of a real exploration environment, the framework allows abstractions that can be generalized to any exploration environment design.

The remainder of this paper is organized as follows.
Section 2 presents the related work. Section 3 presents the separation of concerns approach applied to exploration environments, leading to a three-layered architecture. Section 4 presents the data model used in the definition of the exploration operations. Section 5 addresses issues related to exploration actions and strategies. Section 6 addresses interaction/interface design issues. Finally, we present the conclusions and future steps.

¹ http://linkeddata.org/

2 Related Work

The need for separating visual representations from processing operations has been established in the visualization area, which presents taxonomies, typologies, and ontologies addressing at least these two concerns [2, 3, 7, 10, 18]. Chi’s work [10] divides the design of a visualization system into a sequence of stages in a pipeline that receives raw data as input and generates an interactive visualization of the raw data as output. Each pipeline stage receives a set of data items, applies processing operations to transform the data, and passes the transformed data to the next stage. Operations such as “scroll”, “zoom”, “filter”, “rotate”, and “scale” can be carried out in the view resulting from the pipeline. The work in [2] presents a taxonomy of analytic operations for describing visualization tasks containing, for example, “Cluster”, “Filter”, “Sort”, and “Correlate” operations. The work in [7] presents a typology of abstract visualization tasks addressing the Why, How, and What aspects of a visualization task independently of the kind of visualization and of the task domain. The Why concerns the goals, such as “discover” new information, “present” data, or “explore”. The How presents the actions to achieve the goals, such as “select”, “navigate”, “filter”, and “aggregate”. The What describes input and output resources handled by the tasks.
These approaches are valuable in promoting some degree of separation between the description of users’ goals, tasks, and operations and the visual encoding details, but their lack of formality makes it hard to analyze where they overlap and where they differ. Moreover, they do not present detailed discussions of interaction/interface design issues with respect to a given well-defined conceptualization of exploration processes and strategies.

Visualization systems are concerned with encoding data in a visual representation to foster human cognitive perception. Although interactive visualizations can be used to explore a dataset to some extent, supporting exploration behavior is not their main goal [19]. Moreover, even in interactive visualizations, the user is usually restricted to a specific visual representation of the data aiming at highlighting a certain set of dimensions. Since exploratory search tasks tend to be general and multifaceted [20], it is very difficult to know in advance which data dimensions will suffice. Therefore, an exploration environment should support a broader class of tasks that may even include sense-making activities and manipulations of the raw data in order to select proper dimensions to encode in visual representations.

In this context, one advantage of our reference framework is that it allows designers to situate visualization concerns in the design of exploration environments. Within the framework, the projection of the data onto a visual counterpart, along with interaction controls and dialogue structures, is a concern of the interaction/interface layer. The data processing interactions can be designed on top of the same framework of operations whose inputs are visually encoded items and relations.

Beyond the work in the visualization field, some works address issues related to the broader exploration field.
The work in [6] presents a similar architectural view of exploration environments and also abstracts the functional aspects in the SeCoQL exploration language. However, it presents a restricted set of operators containing only refine, pivot, and ranking, and does not approach interface concerns in detail. The works in [1] and [21] propose a separation of interface details from the underlying features, but there the main goal is to present a common evaluation framework for comparison purposes and not a detailed separation of concerns discussion. Moreover, the features are not formally defined, which is the source of many ambiguities, such as different interactions for the same operation being mistakenly understood as distinct features.

3 Reference Framework

A serious problem of exploration tools is the lack of a clear separation of the exploration operators from their realization in the interface [15]. We propose organizing the design space of exploration tools in a three-layer architecture: Data Access, Functional, and Interaction/Interface. The Data Access layer is responsible for providing access to data repositories and mapping their data models to an abstract data model, which will be manipulated by the user throughout the exploration task. The Functional layer presents a set of exploration operators whose repeated compositions capture the solution strategies adopted by the users. The Interaction/Interface layer provides proper access to both the operators and the data items being manipulated. Fig. 1 shows a conceptual view of the design space.

Fig. 1. The layers of an exploration environment architecture.

The next sections discuss the design questions for each layer of the architecture. We discuss solution ideas for each question in terms of the design and implementation of XPlain, a generic and expressive exploration environment.
4 Data Model

The design questions addressed by this layer are how to provide access to the data repositories and how to abstract the details of the data model adopted by the repositories so that the same exploration actions can be executed over data represented in different ways.

Independently of how the data is represented, the user effectively manipulates items and relationships among them. Items can be organized in groups, such as papers by author or papers by venue. Groups can also be formed along more than one dimension, such as papers by author by publication year. Since such structures are not directly mapped to traditional data models, we chose to model items and relations as nested relations. As an example, “papers by author by publication year” is a three-level nesting, having the papers grouped by year inside a group for each author. Nested relations can be represented as trees, as Fig. 2 shows. We call “exploration set” any nested relation generated by the execution of an exploration action.

Fig. 2. Nesting of papers by author by publication year.

This data model is very similar to the one used in NoSQL document-oriented databases [9], where the nestings of an exploration set can be mapped to collections and nested arrays. It can also be translated to the RDF model, where each path from the root to the leaves can be mapped to a set of triples, using the reification mechanism to describe statements over statements in the case of trees having more than three levels.

5 Functional Layer

The main concern of this layer is to define exploration actions as operations over the generic data model presented in Section 4 and the exploration process as a composition of these operations. In order to find an expressive set of exploration actions, we carried out detailed analyses of a significant portion, if not the majority, of the state-of-the-art tools published in the literature.
A summary can be found in [16]; for a detailed description of the operations design process, refer to [13].

The operations of the functional layer are defined in terms of items and relations. A dataset D = <I, R> is defined as a set of items I = {i1, i2, …, in} and a set R of pairs of items (R ⊆ I × I), representing the relationships in the dataset. We briefly present each proposed operation in terms of its input parameters, result sets, and examples of use over the dataset defined above. The complete formalization of the operations is beyond the scope of this paper and can be found in [13]. The operations’ descriptions are as follows:

• Pivot(S, Rel): maps the leaf items of S onto a set of related items through the relation Rel;
• Refine(S, Filter): restricts the leaves of S, keeping only the items that match the predicate Filter;
• Group(S, grel): maps each leaf item of S onto its group keys using the grouping relation grel, where the leaf items are nested within their respective group keys;
• Correlate(S, T): finds all paths from every source item in S to every target item in T (many-to-many). A path is a set of items that connects a source to a target item, and each path is a different nesting in the result set;
• Rank(S, level, score): ranks a given nested level of S, where the relevance of the items is obtained by a score function;
• Map(S, mRel): maps the items in S onto another set of items using the computed relation mRel, where mRel is a relation/function provided by the environment, such as counts and format and scale converters;
• Unite(S, T): receives two exploration sets and applies a set union to each level of S and T;
• Intersect(S, T): receives two exploration sets and returns a tree that is the intersection between the levels of the input set trees;
• Diff(S, T): receives two exploration sets and applies a set difference between the leaves of S and the leaves of T.
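To make the operation descriptions above more concrete, the following is a minimal Python sketch of Pivot, Refine, Group, and Diff. It is an illustration under simplifying assumptions, not the formalization given in [13]: exploration sets are flattened to plain sets of leaf items, nestings are represented as dictionaries, and the toy dataset, the relation names, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy dataset: each named relation is a set of (source, target) pairs.
RELATIONS = {
    "author": {("p1", "alice"), ("p2", "alice"), ("p3", "bob")},
    "year": {("p1", 2015), ("p2", 2016), ("p3", 2016)},
}


def pivot(leaves, rel):
    """Map the leaf items onto the items related to them through rel."""
    return {target for source, target in RELATIONS[rel] if source in leaves}


def refine(leaves, predicate):
    """Keep only the leaf items that satisfy the filtering predicate."""
    return {item for item in leaves if predicate(item)}


def group(leaves, grel):
    """Nest each leaf item under its group key(s) given by relation grel."""
    groups = defaultdict(set)
    for source, key in RELATIONS[grel]:
        if source in leaves:
            groups[key].add(source)
    return dict(groups)


def diff(s, t):
    """Set difference between the leaves of s and the leaves of t."""
    return s - t


papers = {"p1", "p2", "p3"}
published_2016 = refine(papers, lambda p: (p, 2016) in RELATIONS["year"])
authors_2016 = pivot(published_2016, "author")
by_author = group(papers, "author")
# "Papers by author by publication year": a nesting built by grouping twice.
by_author_by_year = {a: group(ps, "year") for a, ps in by_author.items()}
not_2016 = diff(papers, published_2016)
```

In this sketch, by_author_by_year plays the role of the nesting of Fig. 2, with authors at the first level, years at the second, and papers as leaves.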
We call auxiliary functions or auxiliary relations all functions/relations that are not defined as an exploration operator, such as mapping relations for Map, score functions for Rank, and filtering predicates for Refine.

In summary, independently of the interface design, the range of solution strategies for exploration tasks is directly related to the set of operations supported and the possible combinations of these operations. This is the main concern of this layer. For details and more examples of applications of the operations in a real case, refer to the case study published in [15].

6 Interaction/Interface

Having defined the operators and their possible combinations for exploration tasks, the interface designer can focus exclusively on deciding which interaction paradigms are more adequate. This section presents a discussion of interaction/interface issues, separating its concerns from the exploration actions and compositions defined by the functional layer. Based on the concepts of the functional layer, the main goals of the interface are: (1) to present one or many exploration sets, where the items of each set may be hierarchically organized (nested); (2) to allow the user to select an operator/composition and specify its parameters; (3) to allow the user to visualize, manage, and browse the exploration trail.

To ground the discussion, we used a real problem situation discussed in [11] in the scientific publications field, using the Open Citations [17] Linked Data dataset. Fig. 3 shows a screenshot of the interface of the XPlain environment. A screencast of a session can be seen at https://vimeo.com/227356693. Next, we discuss the interaction/interface design issues in the light of the separation of concerns approach, using the XPlain interface to exemplify the issues and possible solution ideas.
6.1 Requirement 1: Manipulation of Exploration Sets and Items

The first challenge for the design of exploration environment interfaces is how to present the data being manipulated and its relationships. The biggest issue to be dealt with is handling the potential excess of information to be presented, as the number of items can be very large. Here, Shneiderman’s visual information seeking mantra “Overview first, zoom and filter, then details-on-demand” [18] should be considered as a guideline. Considering the conceptualization of the exploration process as a functional composition that results in multiple exploration sets, the design alternatives for presenting those sets are: show one exploration set at a time (unifocal) or show many sets at a time (multifocal).

Fig. 3. The interface of the XPlain environment. (A) keyword search controls; (B) Exploration operations toolbar; (C) Exploration sets area; (E) Exploration trail view.

Unifocal interfaces have the advantage of reducing the amount of information shown at a given time and requiring fewer focus management interactions. However, they do not properly support operations that take more than one set as input, such as comparisons of alternatives [8]. For example, imagine a user interested in comparing the publication profiles (e.g., venues in common) of two researchers in a certain period of time. S/he filters the publications of each researcher by the desired period, pivots by venue, and computes the intersection (or difference) of the two sets. In a unifocal interface s/he must apply this sequence of operations to each researcher one at a time, and somehow apply the intersection operation (if available in the functional layer) to the results, which will not both be available in the same interface. Note that if the functional layer does not provide a set operation, s/he must annotate the results and then make the comparisons offline.
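The comparison scenario above can be read as a small functional composition over two exploration sets, which is why it needs both results to be available at once. The following Python sketch illustrates that reading under stated assumptions: the data, the researcher and venue identifiers, and the helper names are all hypothetical, and exploration sets are simplified to plain sets.

```python
# Hypothetical toy data: (paper, value) pairs for each relation.
AUTHOR = {("p1", "r1"), ("p2", "r1"), ("p3", "r2"), ("p4", "r2")}
YEAR = {("p1", 2012), ("p2", 2016), ("p3", 2016), ("p4", 2017)}
VENUE = {("p1", "venueA"), ("p2", "venueB"), ("p3", "venueB"), ("p4", "venueC")}


def related(pairs, item):
    """Return the targets related to item in a set of (source, target) pairs."""
    return {target for source, target in pairs if source == item}


def venues_in_period(researcher, first_year, last_year):
    """Refine a researcher's papers by period, then pivot to their venues."""
    papers = {p for p, author in AUTHOR if author == researcher}
    in_period = {
        p for p in papers
        if any(first_year <= y <= last_year for y in related(YEAR, p))
    }
    return {v for p in in_period for v in related(VENUE, p)}


venues_r1 = venues_in_period("r1", 2015, 2017)
venues_r2 = venues_in_period("r2", 2015, 2017)

# The final step takes two sets as input, so both must be reachable together.
venues_in_common = venues_r1 & venues_r2
venues_only_r2 = venues_r2 - venues_r1
```

The last two lines are the step that a unifocal presentation makes awkward, since each operand is produced by a separate sequence of refinements and pivots.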
For web browsers, a common strategy is to open two or more windows and organize the windows to support the comparisons, but this is limited by the available screen real estate. In a multifocal view s/he is able to visualize the two sets of venues simultaneously and compute the intersection straightforwardly. The drawback of multifocal interfaces is the need to design focus management controls, such as maximization, minimization, and restoration controls, to avoid information overload. If comparisons between alternatives are not required or the device is very restricted in screen size, unifocal interfaces may be more appropriate. An additional design issue is the layout and presentation of the relationships between the sets on the screen.

To illustrate one possible set of options, in XPlain we opted for a multifocal interface to better support operations over multiple sets, where each set presentation has minimization/maximization controls. Fig. 3C shows two exploration sets in the workspace. The last generated set is always placed at the top of the screen, and the exploration trail presents the relationships between the sets and also allows the user to navigate to intermediary sets by clicking on the corresponding node in the graph.

After deciding between unifocal and multifocal presentations, it is necessary to define the organization and interactions for the exploration items and the relations they participate in, within each exploration set. According to the functional model, there are two types of item relations that must be considered: schema relations and computed relations. Computed relations are relations created along the exploration process, such as grouping relations or mappings, which do not necessarily have an identifier, such as predicate URIs in RDF or column names.
Two common representations of schema relations found in the literature are the tabular format, where each relation becomes a column, and the graph format, where the relations are the edges between nodes. A tabular format may be easier for spreadsheet users, while graph views favor the visualization of the joins between the items. In XPlain we took a different approach by adopting a directory metaphor, where items are mapped to directories and both schema and computed relations are nested directories, as shown in Fig. 4. This choice allows a natural representation of groups, where each group is represented as a separate directory. The drawback is the visualization of items that participate in more than one relation, as items related to two different nested items will appear repeatedly, in two “directories”.

Even in a unifocal interface, the number of items within a single exploration set can be large, so the designer should weigh alternative choices for presenting the set using scroll or pagination controls. It is also typically desirable to apply some natural ordering to the items. Although our model describes ranking as an independent exploration operator, it can also be used in conjunction with other operators. Thus, even when operators other than ranking are selected, such as keyword refine or grouping, the interface can compose the selected operator with a ranking function and send the result to the server in order to enforce a natural ranking for the result set. In XPlain we opted for pagination controls with a limit of twenty items per page and an alphabetical or numerical ordering of results.

Fig. 4. Visual representation of an exploration set as a nesting of items and relations.

In summary, the interface design issues for the manipulation of exploration sets and items are:

1. Choosing between a unifocal and a multifocal view;
   a. If multifocal: designing appropriate focus management controls for the sets, such as maximization, minimization, restore, and hiding controls;
2. Deciding where and how to show relationships between exploration sets;
3. Designing focus management controls for the items within the set, e.g., pagination, scroll, or a combination of both;
4. Determining the best visualization for items and item relations (schema and computed): graphs, trees, tables, lists, etc.;
5. Establishing a natural order for presenting exploration items and, possibly, adding sorting controls.

6.2 Requirement 2: Applying Exploration Operations

The application of exploration operations presents another class of interface/interaction design issues, which concern both the selection and activation of an operator and the definition of its parameters. The functional layer defines four types of arguments: exploration items, auxiliary functions, relations, and relation paths. Next, we argue that each argument type may require distinct interaction models.

To invoke an exploration action the user must assign a value to each input parameter of the invoked operation. Each assignment is a binding, i.e., a (parameter, value) pair that will be evaluated when the operation is executed. For example, pivoting requires two bindings: the definition of the input set and the pivoting relation. For binding definitions, the interaction issues are: defining the assignment order for parameters, and defining the interaction that will support the binding definition. The latter issue depends on both the argument type and the operation.

With regard to the order of the assignments, consider the pivoting action as an example.
Some design alternatives are: the user selects the input set, activates the pivoting operator, and the system shows the relations for selection (e.g., the interaction in SeCo [6]); alternatively, when the user selects the set, the interface could show all relations as selectable elements whose activation causes a pivoting over the selected relation. For tabular presentations, the first alternative may be better due to layout organization issues; however, for graph and list presentations the second option seems to be closer to hypertext browsing, which may favor Web users. The second option is the solution adopted by the majority of faceted search tools with pivoting functionality.

Another example is the definition of bindings for the Refine operation, where the user should select the filtering relation, the filtering predicate, and a value. One option is to simply allow the selection of values, where the relation is inferred and the filtering predicate is always an equality test. Another option is to allow the selection of the relation, the value, and the filtering predicate, which may be different from an equality comparison, e.g., greater than or less than operators. Thus, there can be many distinct interaction sequences for the definition of the bindings for each operation.

The next issue concerns specific interactions for different types of parameters. Consider the case of the Refine operation, where the user must define bindings for the relation or the relation path, the comparison operator (e.g., =, <, >), and the restriction value for the relation. With regard to the relation, the interface has to allow both the selection of relations and of relation paths.

Fig. 5. View for the Refine operation. The user selects relations (A) or relation paths (B) and restriction values for each filter. Filters can be disjunctive or conjunctive according to the selected logical operator.
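The bindings required by Refine (a relation or relation path, a comparison operator, a restriction value, and a logical connector between filters) can be illustrated with a short Python sketch. This is a hedged illustration only and does not reproduce XPlain's implementation: the toy relations, the path notation as a tuple of relation names, and the function names are all assumptions.

```python
import operator

# Comparison operators the user could pick from a pre-defined set.
COMPARATORS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

# Hypothetical toy dataset: each named relation is a set of (source, target) pairs.
RELATIONS = {
    "year": {("p1", 2014), ("p2", 2016), ("p3", 2017)},
    "venue": {("p1", "v1"), ("p2", "v2"), ("p3", "v1")},
    "name": {("v1", "Venue One"), ("v2", "Venue Two")},
}


def follow(item, path):
    """Return the values reached from item by following each relation in path."""
    current = {item}
    for rel in path:
        current = {t for s, t in RELATIONS[rel] if s in current}
    return current


def make_filter(path, comparator, value):
    """Bind a relation path, a comparison operator, and a restriction value."""
    compare = COMPARATORS[comparator]
    return lambda item: any(compare(v, value) for v in follow(item, path))


def refine(items, filters, connector="and"):
    """Keep the items matching all filters ("and") or at least one ("or")."""
    combine = all if connector == "and" else any
    return {i for i in items if combine(f(i) for f in filters)}


papers = {"p1", "p2", "p3"}
after_2015 = make_filter(("year",), ">", 2015)           # single relation
at_venue_one = make_filter(("venue", "name"), "=", "Venue One")  # relation path

conjunctive = refine(papers, [after_2015, at_venue_one], "and")
disjunctive = refine(papers, [after_2015, at_venue_one], "or")
```

Treating a single relation as a path of length one lets the same binding mechanism serve both cases, which is one possible way to keep the interaction for relations and relation paths uniform.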
For example, in Open Citations, if we want to refine papers by venue names, we must bind the relation path :isDocumentContextFor:isHeldBy:name to the relation parameter, as shown in Fig. 5B. One design option is to allow the user to pivot relation by relation along this path until reaching the next-to-last relation, which is the :isHeldBy relation in this case. At this point, the interface can show the possible relations and values for the holders, which include the :name relation and the actual venue names for selection. The computation is, therefore, carried out relation by relation until the desired path is reached. Next, the selection of a venue name will cause the refinement over the path :isDocumentContextFor:isHeldBy:name. This is the most common interaction for path refinements found in faceted search interfaces, but considering the size of paths, the possibility of mistakes, and the number of refinements required for the task, this design can be cumbersome.

Another option is to allow the visualization of relation chains on demand, where the user can explore and select relation paths without causing a context change (pivoting). XPlain implements this design option with relation nestings built at runtime: when a relation :x is nested with a relation :y, there is a relation path :x:y in the dataset representation. This design allows the user to browse the nestings in order to find the desired path, with reasonable performance. Fig. 5B shows the nesting of the :isDocumentContextFor, :isHeldBy, and :name relations.

Auxiliary functions, such as the comparison operators, the scoring function, and the mapping functions, can be picked in the interface from a pre-defined set. For example, the Refine modal dialogue presents a selection box with all comparison operators available, as Fig. 5A shows.
However, since it is very difficult to define a complete range of functions that covers all problem domains, the interface can also allow the user to describe the function in some computable language. Consider a user wishing to convert a set of measures to a different scale. The functional layer provides the Map function for such tasks, but the desired scale converter is not among the available mapping functions. The user could simply type the formula, and the interface would create the binding. Therefore, the interaction design should consider not only interface selection, but also textual inputs, with some validation in the case of function definitions, and filterable selection lists. The same issue also occurs for bindings of exploration items. XPlain’s interface allows the definition of new auxiliary functions using a Domain Specific Language (DSL) implemented by the functional layer. We also chose filterable selection lists for the definition of the filtering values, as shown in Fig. 5B.

So far we have discussed the interaction/interface design issues and possible solution ideas for the execution of single operations. For some recurring functional compositions, there can be alternative interaction styles to the operation-at-a-time approach. An example of such compositions can be found in the expansion of an exploration item, shown in Fig. 4. When a user double-clicks an item in the exploration set, XPlain executes a composition of Refine and Pivot to, respectively, select the clicked item and pivot to its set of relations. The relations are shown as nested items that can also be expanded. This interaction allows the user to browse the graph of relations of an item in a follow-your-nose style without causing a context change or the addition of a new exploration set for each Refine and Pivot executed.
This illustrates the fact that the designer should explore alternative interaction and interface designs for combinations of operators that are more appropriate for a given task context. Capturing common combinations that would require more appropriate interaction designs is likely to be difficult, since the combinations may not be obvious for some domains and contexts. However, we expect that these combination patterns will emerge with the continued use of the environment. Since the patterns are formally described and recorded, they can be mined from the environment log and analyzed from the perspective of separate interface and interaction models.

The interaction/interface design issues presented for the specification of exploration operations are:

1. The ordering for the specification of bindings for each operation.
2. The interaction required for specifying bindings for each parameter, depending on the parameter type and the operation. Some possibilities discussed were: interface selection, textual inputs, filterable selects, computable specification for auxiliary functions, and navigation through relation paths for relation parameters.
3. The possibility of modeling particular interactions for specific combinations of operators, such as for combinations of Refine and Pivot, and for compositions of Refine, Intersect, and Unite that can be modeled as a faceted search interaction.

6.3 Requirement 3: Exploration Trail Management and Browsing

It has been recognized that exploration tools should allow the user to visualize the history of the exploration actions [19]. The functional layer defines relationships between result sets, where the result set of a previous action can serve as the input for the next. Hence, the design issues at this point are how to present the exploration trail and how to allow its manipulation. Some interface options for visualizing the exploration trail are trees or graphs.
The tree representation has the advantage of allowing the user to collapse or expand the branches, which may be a good option considering the “details-on-demand” rule of the information seeking mantra. However, since the functional layer presents operations that receive two sets as input (e.g., unite, intersect, and diff), tree representations present a drawback: the result set of these operations must be repeated in two branches. With a tree representation, it is not easy to perceive these join nodes, i.e., sets resulting from combinations of two input sets. In XPlain we chose a graph representation in order to enhance the perception of join nodes.

For the following examples, consider the case of a user reviewing a paper. One revision strategy is to find relevant papers, in the same area as the paper under review, that were not referenced. Fig. 6 shows an exploration trail example for the case study of “finding relevant and not cited papers”. The join node is the set difference operation. The graph in Fig. 6 is a visual representation of the sequence of operations applied along the exploration process, where each node is a result set and the arrows represent the operations applied. The “START” node represents the whole dataset, and the highlighted node “Relevant and not Cited Papers” is the result of the difference between the citations of the paper being reviewed and the top 20 most relevant papers of the Semantic Web area, according to the number of incoming citations.

The graph in Fig. 6 is more than just a visual representation of the exploration trail: it can also be used as a first-class object, where the user can parameterize the operations and reevaluate dependent branches. For example, the user could replace the set “Semantic Web Papers” in the exploration trail in Fig. 6 by a set of papers in another research field and reevaluate the entire branch, thus reusing an exploration trail for different papers of distinct research fields. In other words, it is possible to reapply the strategy used to solve a task, as represented by the exploration trail.

Fig. 6. Graph representation of the functional composition for the task “finding relevant and not cited papers”.

Once we recognize that an exploration is, in the end, also a function, the interaction issues for allowing the reevaluation of a functional composition become quite similar to the issues concerning the definition of bindings for the operations presented in the previous section. The additional step is to consider the union of the bindings from all operations in the composition as the bindings of the exploration. Therefore, the reevaluation of the functional composition requires the redefinition of one or more bindings of some operations. The interface could show the bindings and ask which ones must be replaced for the reevaluation. The same design decisions adopted for the definition of bindings for each argument type also apply to the redefinition of bindings of functional compositions (this feature is currently under development).

The interaction/interface design issues for exploration trail management and browsing are:

1. The visual encoding for the exploration trail, which includes both exploration sets and their dependencies;
2. Allowing the user to browse the exploration sets in the nodes of the exploration trail;
3. Allowing the user to access the bindings for specific operations/functional compositions;
4. Defining the interaction for binding redefinitions and reevaluations from the exploration trail.
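
To make the idea of the exploration trail as a first-class, reevaluable object more concrete, the following is a minimal sketch and not the XPlain implementation. It assumes that a trail can be modeled as a directed acyclic graph whose nodes are result sets and whose edges are the inputs of the operation that produced each set. The names `TrailNode`, `ExplorationTrail`, `rebind`, and `join_nodes`, the toy `PAPERS` data, the use of a top-2 ranking instead of top 20, and the direction chosen for the set difference (top-ranked papers minus cited papers) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple

Items = FrozenSet[str]

# Hypothetical toy dataset: paper id -> (research area, incoming citations).
PAPERS = {
    "p1": ("Semantic Web", 50),
    "p2": ("Semantic Web", 30),
    "p3": ("Semantic Web", 10),
    "p4": ("Databases", 40),
    "p5": ("Databases", 5),
}
# Hypothetical set of papers cited by the paper under review.
CITED_BY_REVIEWED = frozenset({"p2", "p5"})


@dataclass
class TrailNode:
    """A result set in the trail: `op` applied to the sets named in `inputs`."""
    name: str
    op: Callable[..., Items]
    inputs: Tuple[str, ...] = ()


class ExplorationTrail:
    """Exploration trail as a DAG of result sets (illustrative only)."""

    def __init__(self) -> None:
        self.nodes: Dict[str, TrailNode] = {}

    def add(self, name: str, op: Callable[..., Items], *inputs: str) -> None:
        # Inputs must already exist, which keeps the graph acyclic.
        for dep in inputs:
            if dep not in self.nodes:
                raise KeyError(f"unknown input set: {dep}")
        self.nodes[name] = TrailNode(name, op, inputs)

    def rebind(self, name: str, op: Callable[..., Items]) -> None:
        """Redefine the operation of a node while keeping its inputs."""
        self.nodes[name] = TrailNode(name, op, self.nodes[name].inputs)

    def evaluate(self, name: str, cache: Dict[str, Items] = None) -> Items:
        """Recompute `name` from its dependencies, visiting each node once."""
        cache = {} if cache is None else cache
        if name not in cache:
            node = self.nodes[name]
            args = [self.evaluate(dep, cache) for dep in node.inputs]
            cache[name] = node.op(*args)
        return cache[name]

    def join_nodes(self) -> List[str]:
        """Nodes whose operation combines two or more input sets."""
        return [n.name for n in self.nodes.values() if len(n.inputs) > 1]


def by_area(area: str) -> Callable[[Items], Items]:
    return lambda items: frozenset(p for p in items if PAPERS[p][0] == area)


def top_k(k: int) -> Callable[[Items], Items]:
    # Rank by incoming citations (descending), breaking ties by paper id.
    return lambda items: frozenset(
        sorted(items, key=lambda p: (-PAPERS[p][1], p))[:k]
    )


trail = ExplorationTrail()
trail.add("START", lambda: frozenset(PAPERS))
trail.add("Area Papers", by_area("Semantic Web"), "START")
trail.add("Top Relevant", top_k(2), "Area Papers")
trail.add("Cited Papers", lambda start: start & CITED_BY_REVIEWED, "START")
trail.add("Relevant and not Cited Papers",
          lambda top, cited: top - cited, "Top Relevant", "Cited Papers")

before = trail.evaluate("Relevant and not Cited Papers")

# Reuse the same strategy for another research field by redefining one binding.
trail.rebind("Area Papers", by_area("Databases"))
after = trail.evaluate("Relevant and not Cited Papers")
```

In this sketch, redefining the binding of a single upstream node is enough to reapply the whole strategy, because every dependent set is recomputed from its inputs on the next evaluation. The `join_nodes` helper identifies the sets that a tree view would have to repeat in two branches, which are the ones a graph view can show once.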
In summary, we have shown how separating the concerns of interaction/interface design from the operations of the functional layer, together with the use of the functional layer as a guide for what the interface should provide in specific task contexts, structures the discussion of interaction possibilities. Since the main concern of this work is to explore the design space of exploration environments, the XPlain interface is only one possible interface and interaction model for the functional layer; even though it presents full expressivity, it may not be efficient for all exploration contexts and users.

7 Conclusion and Future Directions

This work presents a novel way to approach the design space of exploration environment interfaces, based on the separation of interface/interaction aspects from the exploration operations and compositions, and from data access concerns. The main goal is to present and discuss abstract design issues and solution ideas for the interaction design of exploration environments from the perspective of a well-defined and expressive framework of exploration actions. We also based the discussion on the design and development of a new expressive exploration environment in order to demonstrate the occurrence of the design issues and possible solutions in a realistic design case.

As future directions, we plan to conduct user studies and analyze the usage of each operator within an exploration case. Once there is agreement on the operators and compositions required, usability studies must be carried out to devise efficient interfaces and interaction dialogs. Moreover, we plan to investigate the benefits of the reuse of functional compositions in future explorations.

Acknowledgement

The authors were partially supported by CNPq project 557128/200-9 National Science, Technology Institute on Web Science, CAPES, and Google Research Program.

8 References

1. Alahmari, F. et al.: Evaluating Semantic Browsers for Consuming Linked Data. Proceedings of the Twenty-Third Australasian Database Conference - Volume 124. pp. 89–98 Australian Computer Society, Inc., Darlinghurst, Australia (2012).
2. Amar, R. et al.: Low-level components of analytic activity in information visualization. Proc. - IEEE Symp. Inf. Vis. INFO VIS. 111–117 (2005).
3. Amar, R.A., Stasko, J.T.: Knowledge precepts for design and evaluation of information visualizations. IEEE Trans. Vis. Comput. Graph. 11, 4, 432–442 (2005).
4. Bates, M.J.: The design of browsing and berrypicking techniques for the online search interface. Online Inf. Rev. 13, 5, 407–424 (1989).
5. Belkin, N.J. et al.: Ask for Information Retrieval: Part I. Background and Theory. J. Doc. 38, 2, 61–71 (1982).
6. Bozzon, A. et al.: Exploratory search framework for Web data sources. VLDB J. 22, 5, 641–663 (2013).
7. Brehmer, M., Munzner, T.: A multi-level typology of abstract visualization tasks. IEEE Trans. Vis. Comput. Graph. 19, 12, 2376–2385 (2013).
8. Buschbeck, S. et al.: Parallel faceted browsing. CHI ’13 Extended Abstracts on Human Factors in Computing Systems on - CHI EA ’13. p. 3023 ACM Press, New York, New York, USA (2013).
9. Cattell, R.: Scalable SQL and NoSQL data stores. ACM SIGMOD Rec. 39, 4, 12 (2011).
10. Chi, E.H.: A Taxonomy of Visualization Techniques using the Data State Reference Model. Inf. Vis. 2000. InfoVis 2000. IEEE Symp. 69–75 (2000).
11. Di Iorio, A. et al.: Exploring Bibliographies for Research-related Tasks. Proceedings of the 24th International Conference on World Wide Web - WWW ’15 Companion. pp. 1001–1006 ACM Press, New York, New York, USA (2015).
12. Norman, D.A., Draper, S.W.: User Centered System Design; New Perspectives on Human-Computer Interaction. L. Erlbaum Associates Inc., Hillsdale, NJ, USA (1986).
13. Nunes, T.: A Model for Exploration of Semi-Structured Datasets. Pontifical Catholic University of Rio de Janeiro (2017).
14. Nunes, T., Schwabe, D.: Exploration of Semi-Structured Data Sources. 3rd Int. Work. Intell. Explor. Semant. Data (IESD 2014). (2014).
15. Nunes, T., Schwabe, D.: Frameworks for Information Exploration – A Case Study. 4th International Workshop on Intelligent Exploration of Semantic Data (IESD 2015). (2015).
16. Nunes, T., Schwabe, D.: Frameworks of Information Exploration - Towards the Evaluation of Exploration Systems. Proceedings of the 5th International Workshop on Intelligent Exploration of Semantic Data - IESD ’16. Kobe, Japan (2016).
17. Peroni, S. et al.: Setting our bibliographic references free: towards open citation data. J. Doc. 71, 2, 253–277 (2015).
18. Shneiderman, B.: The eyes have it: a task by data type taxonomy for information visualizations. Proceedings 1996 IEEE Symposium on Visual Languages. pp. 336–343 IEEE Comput. Soc. Press (1996).
19. White, R.W., Roth, R.A.: Exploratory Search: Beyond the Query-Response Paradigm. Synth. Lect. Inf. Concepts, Retrieval, Serv. 1, 1, 1–98 (2009).
20. Wildemuth, B.M., Freund, L.: Assigning search tasks designed to elicit exploratory search behaviors. Proc. Symp. Human-Computer Interact. Inf. Retr. - HCIR ’12. C, 1–10 (2012).
21. Wilson, M.L. et al.: Evaluating advanced search interfaces using established information-seeking models. J. Am. Soc. Inf. Sci. Technol. 60, 7, 1407–1422 (2009).