An Interactive Framework to Facilitate Behavioural Pattern Exploration in Event Data (Extended Abstract)

Leen Jooken
Faculty of Business Economics
UHasselt - Hasselt University
Hasselt, Belgium
leen.jooken@uhasselt.be

Index Terms: event data behavioural analytics, pattern mining, visual analytics, exploratory data analysis

I. PROBLEM STATEMENT AND POSITIONING WITH REGARD TO THE STATE OF THE ART

A lot of data that is collected by systems can be categorized or treated as event data. Event data is data that describes events in which the state of the subject to which the event relates changes. This subject can be a person, an object or a process. Data must meet three important characteristics in order to be categorized as event data: (1) an event happens instantaneously, therefore we do not consider begin or end timestamps, nor the duration, (2) it is possible to (partially) order events in time, (3) an event describes a change of state or context. The availability of this type of data has increased because of two characteristics of this digital age: (1) capturing the data has become easier, and (2) it has become easier to deal with huge amounts of data as a result of cheaper data storage, improved database technology and the availability of big data technology.

Event data can provide a different type of insight than traditional ‘rectangular’ data, because of how the information is stored. Traditional data typically describes some type of business object by means of a set of attributes [1], which means that it only provides a snapshot of the state of that business object. It cannot provide insights into the exact behaviour that led to that state. The ability to learn behavioural patterns is what makes event data so interesting. After all, understanding, predicting and correctly reacting to changes in behaviour are crucial business capabilities. Furthermore, including the behaviour perspective to complement traditional analysis can help build a multi-dimensional viewpoint to better solve certain business problems [1].

Behaviour, however, is an abstract concept that consists of many attributes and properties, for instance: the subject that carries out the action, the object on which a behaviour is imposed, the context in which behaviour manifests itself, the goal of the behaviour and the impact on the object or context [1]. Events, on the other hand, are the raw observation of behaviour and very hard to interpret in their raw form. The fact that this data is so low-level and fine-grained is what makes analyzing it so challenging. In order to analyze the behaviour, the low-level event data first needs to be transformed into higher-level behavioural patterns.

This is where our research problem arises: because this data is more complex than conventional transactional data, traditional data analysis techniques are not always appropriate to extract these insights. There is a wide variety of pattern mining techniques available, and although some could be used on (preprocessed) event data, they were not developed with event data in mind and often rely upon a set of assumptions which are not universal for event data. For example, process mining [2] expects each event to be related to an instance of the process.

However, exploring event data is challenging not only because this data is generally interrelated in a multidimensional manner, but also because of the large amounts of data, which create challenges for current techniques both in terms of feasibility and in terms of interpretability of the results. When the data consists of a lot of observations and many different types of items, these techniques tend to produce a long list of possibly interesting patterns, which is difficult to act on for a user who wants to explore the data. Furthermore, insights can only be learned from patterns that are meaningful with respect to the practitioner’s use case. Hence it is crucial that understanding of behavioural structures, semantics and dynamics of the event data is incorporated in the pattern discovery and analysis stage [1].

We therefore identified a set of needs related to exploring behavioural patterns in event data. Firstly, there is a need for clarity on which techniques can be used, and how they can be used, to mine for behavioural patterns in event data. Secondly, there is a need to effectively present the found patterns in a way that is intuitively comprehensible to the end user. Lastly, there is a need to incorporate domain knowledge in an interactive way while exploring the data to ensure the quality, correctness and relevance of the found patterns with respect to the practitioner’s use case.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

II. PROPOSED SOLUTION

The final envisioned solution is an exploration framework to guide the practitioner in interactively exploring behavioural patterns in their event data. The framework must meet the following requirements: (1) it should take raw event data as an input, (2) it should uncover insights into the behaviour of people, objects or systems, (3) it should enable interactive exploration, and (4) it must be possible to incorporate domain knowledge.

The development of this framework is an application of the behaviour informatics approach [1]. The scientific field of Behaviour Informatics focuses on the development of methodologies, techniques and practical tools for representing, modeling, analyzing, understanding and utilizing behaviour [1].

The PhD Project is divided into three work packages. Each work package meets one of the needs we identified in Sect. I, with the following deliverables:

1) a) a conceptual translation from the field of behaviour informatics [1] to a set of analytical challenges that describe different types of behavioural insights that can be learned
   b) an overview of how different pattern mining techniques, or underlying concepts of these techniques, can address these challenges and be used to answer related research questions
2) a framework that incorporates the most useful concepts from these pattern mining algorithms to facilitate the exploration of event data from a behavioural analytics perspective and visualizes the found patterns in a way that is intuitively comprehensible to the end user
3) a way in which the end user can interact with the framework and the feedback is reentered into the iterative analysis process

III. RESEARCH METHODOLOGY

A. Work Package 1

Since behaviour is an abstract concept that consists of many low-level aspects, there are many viewpoints on behaviour that can be investigated [1], and many different insights that can be mined. For this first work package we aim to map out the different domains of behaviour topics of interest and make a classification of which pattern mining techniques can provide answers to questions from each domain. We will explore pattern mining techniques from different domains to get an overview of the kind of behavioural insights that each technique can discover. Different techniques from the following domains, among others, will be examined: sequential pattern mining [3], [4], sequential rule mining [4], itemset mining [3], [4], episode mining [3], [4], periodic pattern mining [3], [4], high-utility pattern mining [4], association rule mining [3], [4], graph pattern mining [3], [4] and process mining [2].

B. Work Package 2

Using the results of work package 1 we aim to build a framework that incorporates underlying concepts of the previously mentioned pattern mining techniques to facilitate the easy exploration of behavioural patterns in the practitioner’s event data. This artifact will be designed and implemented following a Design Science approach [5], to safeguard that it will meet the predetermined requirements and the actual needs of the initial problem. Next, available visualization techniques need to be explored and put to the test, in order to present the resulting patterns in an intuitively comprehensible way to the end user.

C. Work Package 3

The focus of this last work package lies on incorporating expert knowledge into the analysis process. We plan to do this in an interactive way based on a visual analytics approach [6]. Visual analytics is described as the “science of analytical reasoning facilitated by interactive visual interfaces” [7]. At its base it is an iterative process in which automatic discovery and visualization are refined by feedback from the end user [6]–[8]. The interactivity of visual analytics makes it extremely well suited to exploratory analysis, and places it in the broader area of hybrid intelligence, aimed at creating synergies by letting machines and humans work together [9].

IV. RESULT VALIDATION

Validation of the results of the different work packages is a great challenge. To validate our framework’s ability to find meaningful patterns and its user experience, one or more case studies will be set up in which a practitioner can evaluate the performance of the framework on their data. However, it is important to keep in mind that this evaluation is subjective and highly influenced by the end result the practitioner already has in mind. A strategy needs to be worked out to minimize this influence and to evaluate not only patterns that are expected to be found but also patterns that are unusual or overlooked. A combination of a structured interview about the practitioner’s expectations beforehand, the capturing of the framework’s user experience and an in-depth interview afterwards seems to be the best approach.

REFERENCES

[1] L. Cao, “In-depth Behavior Understanding and Use: The Behavior Informatics Approach,” Information Sciences, vol. 180, no. 17, Sep 2010.
[2] W. Van der Aalst, Process Mining: Data Science in Action, 2nd ed. Springer-Verlag Berlin Heidelberg, 2016, 978-3-662-49851-4.
[3] C. C. Aggarwal and J. Han, Eds., Frequent Pattern Mining, 1st ed. Springer International Publishing, 2014, 978-3-319-07821-2.
[4] P. Fournier-Viger, C.-W. Lin, R. U. Kiran, Y. S. Koh, and R. Thomas, “A Survey of Sequential Pattern Mining,” Data Science and Pattern Recognition, vol. 1, no. 1, pp. 54–77, 2017.
[5] P. Johannesson and E. Perjons, An Introduction to Design Science, 1st ed. Springer International Publishing, 2014, 978-3-319-10632-8.
[6] D. Keim, G. Andrienko, J. D. Fekete, C. Görg, J. Kohlhammer, and G. Melançon, Visual Analytics: Definition, Process, and Challenges. Springer Berlin Heidelberg, 2008, pp. 154–175.
[7] J. J. Thomas and K. A. Cook, Eds., Illuminating the Path: The Research and Development Agenda for Visual Analytics. IEEE Computer Society, 2005, 978-0-769-52323-1.
[8] D. Sacha, A. Stoffel, F. Stoffel, B. C. Kwon, G. Ellis, and D. A. Keim, “Knowledge Generation Model for Visual Analytics,” IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 12, pp. 1604–1613, 2014.
[9] E. Kamar, “Directions in Hybrid Intelligence: Complementing AI Systems with Human Intelligence,” in Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16), March 2016.