
Conversation Trainings - Towards a Multidimensional Visualization of Learning Flows

ALEXANDER MAASCH

ROMY BÜRGER

Additional Key Words and Phrases: Sankey Diagrams, Learning Analytics, Information Visualization

2018


Conversation trainings are an e-learning method designed for the transfer of communication-related skills. With increasing conversation length and number of learners, creating and evaluating conversations becomes a complex task that needs appropriate visual guidance and data visualization. In this paper we show the inherent complexities of conversation trainings and walk through an approach that eases conversation creation. We explain data collection with the Experience API and finally present a Sankey diagram based approach to visualize large amounts of collected learning data.

CCS Concepts: • Human-centered computing → Visual analytics; Information visualization; • Applied computing → E-learning;

1 INTRODUCTION

Virtual conversations with an avatar are one specific method of vocational education, used to train correct decisions, tonality and technical correctness in various business applications. While chat bots are already on the horizon of education, statically defined conversations still offer high customizability to very specific learning objectives. However, creating static conversation trainings and gaining insight into learner behaviour poses challenges for visualization and data querying.

A static e-learning conversation training consists of a sequence of virtual conversational situations. The learner is confronted with the situation and has a set of alternative reactions to choose from. A virtual conversational situation is typically defined by a statement of a virtual conversation partner. The conversation partner may be visually represented by a graphical avatar, photo or a video sequence. In addition, this virtual interlocutor may have variable states which may result in different visualizations. Each situation has one or more possible reactions in the form of statements that may be expressed by the learner. These reactions define which conversation path will be selected for the following conversation.

Defining a static conversation comes with the cost of managing many statements and every branch of the conversation. Branches can increase the number of alternatives exponentially. The high number of alternative paths can easily lead to problems: the conversation alternatives can no longer be displayed on screen at once with sufficient information, and creators face cognitive overload while trying to maintain an overall picture. Once distributed, the static conversation model is enriched with real usage data, and displaying additional data on an already crowded canvas becomes an increasingly challenging task.

This work shows how selecting a feasible auto layout for editing, together with filtering and user-controllable information depth, may lead to a balanced visualization of the multidimensional virtual conversation data, ultimately leading to better insights and providing valuable feedback for learning content creators.

VisBIA 2018 – Workshop on Visual Interfaces for Big Data Environments in Industrial Applications. Co-located with AVI 2018 – International Conference on Advanced Visual Interfaces, Resort Riva del Sole, Castiglione della Pescaia, Grosseto (Italy), 29 May 2018. © 2018 Copyright held by the owner/author(s).

To find an appropriate visualization for the conversation, we first have to take a look at the inner structure of conversations and virtual conversation trainings. An idealized conversation between two partners may be described as a sequence of statements. Each statement is followed by a verbal and emotional reaction of the conversation partner. Thus, a single conversation can be modelled as a sequence of its elements. Each element is a verbal statement, either of the virtual conversation partner or of the learner. The virtual conversation partner may also change its emotional state in response to the learner's statement. Such a simple model could look like the linear chat-like flow shown in Table 1, forming a single conversation path (see Fig. 1 (left)).

To create a challenging and explorable learning experience, a linear conversation has to be extended by alternative learning paths. One possible solution is to introduce choices on the learner side of the conversation and let the learner select one of several different statements. Each statement leads to a different path for the rest of the conversation. Since each path now splits up every two steps, a tree structure arises, as shown in Fig. 1 (middle).
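The growth of such a tree can be estimated with a small calculation. The following is a minimal sketch for an idealized, fully branching tree; the function name and the assumption that every learner step offers the same number of choices are illustrative and not taken from an actual training.

```python
def full_tree_size(choices_per_step: int, learner_steps: int) -> dict:
    """Count statements and paths of an idealized conversation tree.

    Assumption: every learner step offers the same number of choices and
    every choice leads to its own partner statement (no reuse).
    """
    b, n = choices_per_step, learner_steps
    # Learner statements appear in rounds 1..n, partner statements in rounds 0..n.
    learner = sum(b ** k for k in range(1, n + 1))
    partner = sum(b ** k for k in range(0, n + 1))
    return {
        "partner": partner,
        "learner": learner,
        "total": partner + learner,
        "paths": b ** n,  # one path per leaf of the tree
    }
```

Under these assumptions, three choices over ten learner steps already yield 59049 distinct paths, which illustrates why unshared branches quickly become hard to maintain.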

With an exponentially growing number of statements such conversation trainings would become very hard to maintain. Intuitively, it makes sense to reuse parts of the conversation training in different situations, such as greetings, introductions or technical explanations. While slightly reducing the variation of the conversation, reuse offers a big impact on maintainability. It also changes the way the statements are connected: more than one learner statement may now link to a virtual conversation partner statement. The branch-prone tree structure is reduced into a graph structure (see Fig. 1 (right)).

This graph has the following properties:
• it is directed,
• it is acyclic (even if a real-life conversation may contain cycles for the reason of enjoyment or mental disabilities, their absence is assumed in this paper as well as in demonstrative technical implementations),
• it has a single starting point and multiple finishing points,
• it has alternating layers for the virtual conversation partner and the learner.
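The properties listed above can be checked mechanically. The following is a minimal sketch, assuming a conversation graph stored as a dictionary of successor lists plus a speaker label per node; these data structures and the function name are illustrative assumptions, not the format of an actual authoring tool.

```python
from collections import deque


def check_conversation_graph(edges, owner):
    """Return a list of violated properties (empty if all hold).

    edges: dict mapping a node to the list of its successor nodes
    owner: dict mapping every node to "partner" or "learner"
    """
    indegree = {node: 0 for node in owner}
    for targets in edges.values():
        for dst in targets:
            indegree[dst] += 1

    problems = []

    # Single starting point: exactly one node without incoming edges.
    starts = [node for node, deg in indegree.items() if deg == 0]
    if len(starts) != 1:
        problems.append("expected exactly one starting point")

    # Acyclic: Kahn's algorithm visits every node only if there is no cycle.
    remaining = dict(indegree)
    queue = deque(starts)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for dst in edges.get(node, []):
            remaining[dst] -= 1
            if remaining[dst] == 0:
                queue.append(dst)
    if visited != len(owner):
        problems.append("graph contains a cycle")

    # Alternating layers: every edge must connect different speakers.
    for src, targets in edges.items():
        for dst in targets:
            if owner[src] == owner[dst]:
                problems.append(f"edge {src}->{dst} does not alternate speakers")

    return problems
```

Directedness is implied by the successor-list representation; multiple finishing points are simply nodes without successors and need no separate check.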

Additionally, the following content-oriented properties may be considered: For each attempt, the learner experience is created by only one single path through the conversation graph. Designing a great learner experience with sufficient learning time requires conversation creators to build a deep graph with long conversation paths. Because of this influence, the graph can be assumed to have fewer but longer conversation branches. At the same time, the number of alternatives in a single step can be assumed to be small. The conversation creator will always design only as many alternatives as the learner will be able to choose from, considering available display space, reading time and statement distinction.

[Fig. 1: conversation structures as a linear path (left), a tree (middle) and a graph (right); nodes are labeled A and L, with Start and End markers.]

3 VISUALIZING CONVERSATION GRAPHS

Even with reusable statements, conversation graphs may grow to a size and complexity where they become hard to maintain. Operations such as inserting or deleting statement nodes lead to large manual reorganization tasks. Automatically laying out the conversation graph can improve usability.

One applicable method of layering directed graphs was described by Sugiyama et al. in 1981 [9]. The proposed method consists of several steps:
• Removing cycles. This step creates an acyclic graph. As shown above, we assume conversation training graphs to be acyclic. Consequently, this step can be skipped for conversation models.
• Layering. In this step each element of the graph is assigned to a layer, maintaining a balanced layer size, with the goals of keeping the number of layers down to the maximum path length and reducing the number of edges that span several layers.
• Dummy node creation. Edges spanning across layers are now replaced with temporary dummy nodes, so that each edge only reaches the directly adjacent layer.
• Node ordering. The previous step allows for a less complex node reordering. Nodes are ordered to minimize edge crossings, following the graph direction. This reordering is processed layer by layer, only paying attention to the current adjacency.
• Assigning coordinates. In this step each node, including the previously created dummy nodes, is assigned coordinates. This process should aim at minimizing the number of bends by centering neighbors.
• Cleanup. Dummy nodes are deleted from the graph. It is now ready to be drawn.
• Handling of cycles. If edges have been removed in the first step to remove cycles, they are finally re-added to the graph. While crossings are acceptable here, this step should try to visually separate backwards-directed edges using spacing, curvature and/or style. This step may be skipped for conversation models as well.
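The layering, dummy node creation and node ordering steps above can be sketched as follows. This is a minimal illustration under stated assumptions: it uses longest-path layering and a single top-down barycenter sweep, which are common choices within this framework but are not confirmed as the variants used in the described implementation. Coordinate assignment and cleanup are omitted, and all function names are hypothetical.

```python
from collections import defaultdict


def _predecessors(edges):
    preds = defaultdict(list)
    for src, targets in edges.items():
        for dst in targets:
            preds[dst].append(src)
    return preds


def assign_layers(edges, nodes):
    """Longest-path layering: a node's layer is the length of the longest
    path leading to it from a node without predecessors."""
    preds = _predecessors(edges)
    layer = {}

    def depth(node):
        if node not in layer:
            layer[node] = 1 + max((depth(p) for p in preds[node]), default=-1)
        return layer[node]

    for node in nodes:
        depth(node)
    return layer


def insert_dummies(edges, layer):
    """Replace every edge spanning several layers by a chain of dummy nodes."""
    new_edges = defaultdict(list)
    new_layer = dict(layer)
    for src, targets in edges.items():
        for dst in targets:
            prev = src
            for lvl in range(layer[src] + 1, layer[dst]):
                dummy = ("dummy", src, dst, lvl)
                new_layer[dummy] = lvl
                new_edges[prev].append(dummy)
                prev = dummy
            new_edges[prev].append(dst)
    return dict(new_edges), new_layer


def order_layers(edges, layer):
    """One top-down sweep: sort each layer by the mean position (barycenter)
    of a node's predecessors in the layer above."""
    preds = _predecessors(edges)
    rows = defaultdict(list)
    for node, lvl in layer.items():
        rows[lvl].append(node)

    ordered = [sorted(rows[0], key=str)]
    for lvl in range(1, max(rows) + 1):
        pos = {node: i for i, node in enumerate(ordered[-1])}

        def barycenter(node):
            ps = preds[node]
            return sum(pos[p] for p in ps) / len(ps)

        ordered.append(sorted(rows[lvl], key=barycenter))
    return ordered
```

A typical call sequence would be `assign_layers`, then `insert_dummies`, then `order_layers` on the result. Practical implementations usually repeat the ordering sweep in both directions until the number of crossings stops improving.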

The described algorithm matches the typical structure of conversation trainings very well and improves the creator's user experience. However, in larger conversation trainings creators easily lose track of the flow of conversations, especially if the conversations can only partially be shown on screen. This challenge has been addressed by:
• Coloring the emotional state of the statements. This method visually groups emotionally related statements.
• Providing an overview of the whole conversation graph with navigation functions.
• Zoom functions to freely select the required level of detail.

An example of a layered graph visualization with the additional elements state colorization, overview navigation and zoom is shown in Fig. 2.

4 STANDARDIZED DATA TRACKING

After finishing the creation process, conversation trainings are packaged and deployed as standalone training content packages or as part of larger learning arrangements. For many years, learning management systems have been used to manage the runtime experience of learners. Standards like AICC [1], SCORM [2] and Experience API (also abbreviated as xAPI) [3] help to track the actual usage of the conversation trainings. AICC and SCORM are established communication protocols that enable the exchange of learning-related data between a learning management system and learning content packages. They are used to track the learning process and assessment results into a centralized database. SCORM, for example, supports the following tracking:
• start and finishing of a content package,
• given answer for questions (via IDs),
• time taken for answering,
• progress and total score of a question and a whole content package.

Although this kind of tracking already existed, it was hardly of use here due to the following limitations:
• Learning data is stored internally in databases inside the learning management systems,
• The learning data does not include sufficient textual information for multilingual use cases,
• It is not possible to add contextual data,
• It is not possible to report on learning data across multiple learning content packages.

In order to establish an automated tracking and evaluation process, we decided to set up tracking using the comparatively new standard xAPI. Thus, we could tackle the limitations of the existing LMS/SCORM combination as follows:
• Data is tracked in plain text in the format actor (subject), verb (predicate) and object. Different languages can be mapped in order to merge all answers belonging to the same question, regardless of language.
• xAPI can be used to track information across all content packages and even across learning management systems.
• Since xAPI has been designed for expandability, new verbs can be added ad hoc if necessary.
• All data is stored in an independent Learning Record Store and can be retrieved at any time using the open standard xAPI.
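To illustrate the actor, verb and object format, the following sketch builds one statement for a learner's choice as a plain dictionary. The field names follow the xAPI statement structure; the helper name, the e-mail address, the activity ID and the statement wording are hypothetical examples and not taken from the described system.

```python
import json


def make_statement(learner_email, verb_id, verb_label, activity_id, texts):
    """Build a minimal actor/verb/object statement.

    texts maps language tags (e.g. "en-US") to the wording of the
    conversation statement, so the same activity ID can carry several
    languages.
    """
    return {
        "actor": {"objectType": "Agent", "mbox": f"mailto:{learner_email}"},
        "verb": {"id": verb_id, "display": {"en-US": verb_label}},
        "object": {
            "objectType": "Activity",
            "id": activity_id,
            "definition": {"name": texts},
        },
    }


# Hypothetical example values for one learner reaction.
statement = make_statement(
    "learner@example.com",
    "http://adlnet.gov/expapi/verbs/answered",
    "answered",
    "https://example.com/conversations/demo/statements/l1",
    {"en-US": "How can I help you?", "de-DE": "Wie kann ich Ihnen helfen?"},
)
payload = json.dumps(statement)  # JSON body that would be sent to a Learning Record Store
```

Because answers in different languages share one activity ID in this sketch, they can be merged during evaluation regardless of the language shown to the learner.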

Tracking with Experience API captures learner experiences as sequences of experience statements. While it is not possible to directly track cognitive processes of learning, these tracked experiences capture the measurable aspects of a learner's interactions with a digital conversation training. Such a sequence of events is by itself not tied to the original conversation structure, may contain gaps, aborts or other inconsistencies, and is subject to downstream interpretation, reporting or visualization processes.

5 VISUALIZING LEARNING FLOW

After successfully creating and distributing a conversation training with integrated Experience API support, creators and tutors need insights into the learning data. Their questions include:
• Which reactions have been chosen by the learners?
• What are typical learning paths of the conversation?
• Are there underused statements and paths?
• Are there statements/reactions that learners struggle with?
• How is the performance of the learners distributed?
• Is the difficulty of the conversation training appropriate for the target audience?
• How does the actual learning time relate to the planned learning time?
• What is the overall quality of the conversation model?

Once the learning experience has been delivered to its audience, the Learning Record Store should contain a reasonable data set with the events of each single learning flow. The previously described visualization of the conversation graph now has to be extended by additional elements.
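Before any visualization, the stored events have to be grouped into one path per attempt. The following is a minimal sketch that assumes the events have already been retrieved and flattened into dictionaries with the hypothetical keys `attempt`, `timestamp` and `statement`; it further assumes that all timestamps share one ISO 8601 format so that they sort correctly as strings. It does not repair gaps or aborted attempts.

```python
from collections import defaultdict


def paths_by_attempt(events):
    """Group flattened events by attempt and order them by time.

    events: iterable of dicts with the keys "attempt", "timestamp"
            and "statement" (all names are assumptions of this sketch).
    Returns a dict mapping each attempt to its list of statement IDs.
    """
    grouped = defaultdict(list)
    for event in events:
        grouped[event["attempt"]].append(event)
    return {
        attempt: [e["statement"] for e in sorted(items, key=lambda e: e["timestamp"])]
        for attempt, items in grouped.items()
    }
```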

To answer the needs of creators and tutors, the interface needs to:
• Zoomably visualize the conversation graph,
• Show comparable quantities of statement experiences,
• Show the path frequency,
• Show individual paths to inspect taken decisions and
• Assist in differentiation between different target audiences (e.g. context and time).

The proposed visualization concept is inspired by parallel coordinates [4, 5] and flow diagrams such as Sankey diagrams [8] and Parallel sets [7]. Instead of showing individual learning paths, edges are grouped by source and target and aggregated by number of individual paths. The transformed edges of the conversation graph now form a quantitative flow. Traditionally, Sankey diagrams are used to visualize the flow of energy or materials in networks and processes. They illustrate quantitative information about flows, their relationships and their transformation [8]. For this reason Sankey diagrams are suitable for the visualization of quantitative aspects of learning path usage while being able to retain an overall view of the conversation.

In this concept, the nodes represent statements of the conversation, starting from the left and continuing to the right. Individual statement nodes are colored by emotional state (green, yellow and red) and by owner (virtual conversation partner vs. learner (black)). The vertical size of a node shows the frequency of the statement. These nodes are connected by a curve representing the aggregated flow from each conversation node to the following node. The width of this curve indicates the frequency of that relation.
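The aggregation behind node heights and curve widths can be sketched in a few lines. This is a minimal illustration, assuming each learning experience is available as a list of statement IDs in the order they were visited; the function names are hypothetical.

```python
from collections import Counter


def aggregate_flows(paths):
    """Count how many individual paths use each (source, target) edge.
    The count corresponds to the width of the curve between two nodes."""
    flows = Counter()
    for path in paths:
        for src, dst in zip(path, path[1:]):
            flows[(src, dst)] += 1
    return flows


def node_frequencies(paths):
    """Count how many paths visit each statement node.
    The count corresponds to the vertical size of the node."""
    counts = Counter()
    for path in paths:
        # In an acyclic graph a path visits each node at most once.
        counts.update(set(path))
    return counts
```

For three attempts, of which two take the learner statement `l1` and one takes `l2`, the edge from the opening statement to `l1` receives a flow of 2 and the edge to `l2` a flow of 1.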

The increasing complexity of the visualization as well as learning path scalability has been addressed by a zoom feature in combination with variable display of details. This concept uses three levels of detail (LOD) to allow the analysis of long training scenarios: in the lowest LOD only nodes are presented to allow the identification of the most frequently used statements, comparable through the height of each node (see Fig. 3, upper left corner). Zooming into the axis presents the second LOD, which shows the flows between the axes (see Fig. 3, upper right corner). The third LOD provides the statements in text boxes that allow the identification of interesting nodes (see Fig. 3, lower left part).

Target audience and learning time segmentation has to be addressed with additional elements. For this reason a time distribution of learning processes has been added at the bottom (see Fig. 3) showing when, how many learners and with which overall score they completed the training. This visualization may additionally be used as audience filter: by selecting a time range with a point and drag interaction, the displayed data set will be reduced to that time period. On the left side a scale has been added showing the duration of the conversation with a dot for each single attempt. In addition to the state indicator for each single node in the center area, a color indication for the score of each completed learning experience is used at two positions: the time distribution at the bottom and the conversation duration scale on the left side.

To get deeper insights into the individual decisions, the points on the left and the bars at the bottom, each representing a single learner experience, may be selected. An additional blue line appears showing the linear conversation flow of the learner’s experience throughout the conversation training (see Fig. 3 bottom right). For comparison, several learning experiences may be selected to be displayed by multiple blue lines on the conversation graph.

To follow a flow by filtering nodes, multiple conversation statements can be selected (see Fig. 4). Node selections are combined by logical AND across layers and by OR in a layer. Flows matching the selection are highlighted and give a quantitative insight into the connections between selected nodes across the whole conversation. While flows are selected, the display of selected experiences is adapted to the underlying flow: solid lines match the flow selection, experiences with dashed lines do not match the selected flow.
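The selection logic described above (OR within a layer, AND across layers) can be expressed compactly. The following is a minimal sketch, assuming a mapping from each node to its layer index is available from the layout; the function names and data structures are illustrative assumptions.

```python
from collections import defaultdict


def matches_selection(path, selected, layer_of):
    """Check whether a path matches a node selection.

    A path matches if, for every layer that contains selected nodes,
    it visits at least one of them (OR within a layer, AND across layers).
    """
    by_layer = defaultdict(set)
    for node in selected:
        by_layer[layer_of[node]].add(node)
    visited = set(path)
    return all(visited & nodes for nodes in by_layer.values())


def split_by_selection(paths, selected, layer_of):
    """Separate paths into matching (solid) and non-matching (dashed) ones."""
    solid = [p for p in paths if matches_selection(p, selected, layer_of)]
    dashed = [p for p in paths if not matches_selection(p, selected, layer_of)]
    return solid, dashed
```

With an empty selection every path matches, so nothing is drawn dashed.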

Hence, both visualization techniques are combined with each other: the flows serve for the analysis of the distribution in the data set, while single curves as used in parallel coordinates allow the comparison of different data values (cf. [6]).

6 CONCLUSIONS

Conversation trainings with their inherent graph structures are a challenge for visualization even at the beginning of the process, when training creators build their statements for both conversation partners. We showed that the graph has a layered structure in which layers for the virtual conversation partner and the learner alternate. We have also shown that using a layered auto layout for the directed conversation graph improves the visualization for the editing process.

Integration with data collected in a standardized way added new dimensions to the visualization. Sankey diagrams have been used for visualizing the quantities of learning paths along the edges between the statement nodes of the graph without adding too much complexity to the layered graph. By using the Sankey diagram based approach to display the flow of learning in a zoomable manner with variable levels of detail, we were able to provide a visualization that scales from small numbers of learners and nodes to wider audiences and larger conversation graphs.

A set of basic filters has been built to dive into data details; however, many dimensions remain in the space of learning context. Demographic attribution, place and time of learning or the embedding context of the conversation may be included in the visualization. Finding appropriate solutions to integrate this contextual information as well as an overall evaluation is subject to future work.

7 ACKNOWLEDGEMENTS

The demonstration of the parallel sets based learner experience visualization as web application has been developed by Marius Hogräfer, student at Technische Universität Dresden as part of a workshop on applied visual language. This work has been supported by the European Regional Development Fund and the Free State of Saxony (project no. 100238470).

REFERENCES

[1] Aviation Industry CBT Committee (AICC). 2015. AICC Specifications Document Archive hosted at Advanced Distributed Learning (ADL) Initiative. https://github.com/ADL-AICC/AICC-Document-Archive/.

[2] Advanced Distributed Learning (ADL) Initiative. 2009. SCORM. Technical Specification (4th Ed.). http://adlnet.gov/public/uploads/SCORM_2004_4ED_v1_1_Doc_Suite.zip.

[3] Advanced Distributed Learning (ADL) Initiative. 2016. Experience API Specification, Version 1.0.3. https://github.com/adlnet/xAPI-Spec/tree/1.0.3.

[4] Alfred Inselberg and Bernard Dimsdale. 1990. Parallel Coordinates: A Tool for Visualizing Multi-dimensional Geometry. In Proceedings of the 1st Conference on Visualization '90 (VIS '90). IEEE Computer Society Press, Los Alamitos, CA, USA, 361-378. http://dl.acm.org/citation.cfm?id=949531.949588

[5] Johansson and Forsell. 2016. Evaluation of Parallel Coordinates: Overview, Categorization and Guidelines for Future Research. IEEE Transactions on Visualization and Computer Graphics 22, 1 (Jan 2016), 579-588. https://doi.org/10.1109/TVCG.2015.2466992

[6] Mandy Keck, Martin Herrmann, Andreas Both, Dana Henkens, and Rainer Groh. 2014. Exploring Similarity. In Human Interface and the Management of Information. Information and Knowledge in Applications and Services, Sakae Yamamoto (Ed.). Springer International Publishing, Cham, 160-171.

[7] Kosara, Bendix, and Hauser. 2006. Parallel Sets: interactive exploration and visual analysis of categorical data. IEEE Transactions on Visualization and Computer Graphics 12, 4 (July 2006), 558-568. https://doi.org/10.1109/TVCG.2006.76

[8] Riehmann, Hanfler, and Froehlich. 2005. Interactive Sankey diagrams. In IEEE Symposium on Information Visualization, 2005. INFOVIS 2005. 233-240. https://doi.org/10.1109/INFVIS.2005.1532152

[9] Kozo Sugiyama, Shojiro Tagawa, and Mitsuhiko Toda. 1981. Methods for Visual Understanding of Hierarchical System Structures. IEEE Transactions on Systems, Man and Cybernetics 11, 2 (1981), 109-125. https://doi.org/10.1109/TSMC.1981.4308636