Exploration of Anomalies in Cyclic Multivariate Industrial
            Time Series Data for Condition Monitoring
                 Josef Suschnigg                                            Belgin Mutlu                          Anna Katharina Fuchs
               Pro2Future, AVL List                            Pro2Future, Know-Center GmbH                                 AVL List
                  Graz, Austria                                         Graz, Austria                                     Graz, Austria

                   Vedran Sabol                                         Stefan Thalmann                                Tobias Schreck
               Know-Center GmbH                                           University of Graz                   Graz University of Technology
                  Graz, Austria                                             Graz, Austria                              Graz, Austria
ABSTRACT                                                                                  As the first contribution of this paper we characterize testbed
Industrial product testing is frequently performed in cycles, re-                      data and study how engineers achieve their data analysis goals.
sulting in cycle-dependent test data. Monitoring the condition of                      Based on the design study we propose an (1) extendable and
products under test involves analysis of large and complex test                        (2) versatile glyph based visual analytics approach for anomaly
data sets. Main tasks are to detect anomalies and dependencies                         detection in multivariate sensor data as our second contribution.
between observation variables, which appear to be challenging                          The extendibility and versatility are achieved by making
for engineers. In this paper, we present a flexible and                                underlying data analysis methods exchangeable and flexible by a
extendable visual analytics approach for anomaly detection focusing                    variable number of anomaly detectors. Additionally, we applied a
on cycle-dependent data. It is based on a glyph representation to                      technique enabling users to visually identify conspicuous sensor data
visualize anomaly scores of cycles with respect to interactively                       by a matrix representation for drill-down and further compar-
selected reference data. Our approach is built on a design study                       ative analysis. As a constraint, the concept of this work is only
in collaboration with an industrial engineering corporation, and                       applicable for cyclic (also periodic or seasonal) data. Nevertheless,
is demonstrated on real data from engines tested on automotive                         cyclic data is related to the repetitive behavior of many industrial
testbeds. Based on findings from evaluation results, we provide a                      applications. The approach has been designed for automotive
discussion and an outlook for future work.                                             testbed data with sensor-intensive technology, where anomalous
                                                                                       or erroneous events should be anticipated by visual data analysis.
                                                                                       The third contribution of this paper comprises the results of the pair
1    INTRODUCTION                                                                      analytics evaluation [2], which has been conducted in collaboration
                                                                                       with the target user group on the given use case data set. Results
The trend of digitization in industry (often used synonymously with so-called
                                                                                       are encouraging and open promising directions for future work.
Industry 4.0) generates large amounts of data by sensors and data
recordings from almost all machines and devices of the produc-
tion process with the promise of creating new usage opportunities                      2     RELATED WORK
[39]. Over all stages of the industrial product life cycle (PLC),                      This section discusses the related work conducted in analyzing
extracting valuable knowledge from generated data can lead to                          data using either algorithmic data analysis or visual analytics
an improvement regarding costs, quality and increased flexibility                      approaches. Furthermore, we detail the glyph representation as
(incl. safety, durability, reliability) [3][31]. However, for human                    this technique has been proven to be an effective manner to
perception, it can be overwhelming to observe and analyze large                        represent time series.
industrial data sets. Another important requirement of analyzing
industrial data is that extensive professional and domain-specific                     2.1    Automated data analysis approaches for
knowledge of users is required [42]. Addressing those challenges,
visual analytics research proposes tools supporting domain ex-
                                                                                              anomaly detection
perts to explore large and complex data sets [26]. Further, to                         One property of many typical industrial applications is the rep-
identify and address the industry’s needs, our case study has                          etition of specific tasks. To give an example, Maier et al. [21]
been conducted in close collaboration with an industry partner                         emphasized reoccurring processes (cycles) for automation and
from the automotive sector, focusing, among others, on engine                          production. In theory, data generated within such cycles should
testbeds. The main driver of the focus on the anomaly detection                        be highly comparable for anomaly detection. Anomalies are
is the common occurrence of real-world problems in automotive                          generally understood to represent patterns in data that do not
condition monitoring. Due to the large amount of testbed data,                         conform to a well-defined normal behavior [1]. The literature
data analysis is time-consuming and there is a risk that events                        provides a comprehensive collection of algorithms for the
which lead to the failure of an engine under test may be                               detection of anomalies in multivariate time series data. Anomaly
overlooked or not anticipated by engineers. We hence research the                      detection is often also referred to as novelty detection [23]
development of visual analytics approaches to support test en-                         or semi-supervised learning [6], whereas for those methods the
gineers in creating hypotheses and early warnings of potential                         definition of normal or rather reference data is needed. Anomaly
failure cases.                                                                         detection in time series [16] found attention from industry for sev-
                                                                                       eral applications, such as predictive maintenance [17], condition
© 2020 Copyright for this paper by its author(s). Published in the Workshop Proceed-   monitoring [40] or decision support systems [28]. Two groups
ings of the EDBT/ICDT 2020 Joint Conference (March 30-April 2, 2020, Copenhagen,       of anomaly detection algorithms are used in this work. The first
Denmark) on CEUR-WS.org. Use permitted under Creative Commons License At-
tribution 4.0 International (CC BY 4.0)
                                                                                       group are correlation-based approaches, which have been effec-
                                                                                       tively applied on industrial sensor data [40]. The idea behind
EDBT 2020, March 30-April 2, 2020, Copenhagen, Denmark                                                                       Suschnigg et al.


this approach is that changes of the bivariate correlation
between two sensors can be interpreted as an anomaly. The second
group are regression-based methods [12] or reconstruction-based
novelty detection methods [23]. The basic idea of this type of
anomaly detection is to build a regression model on reference
(normal) data. Estimation errors of the model in comparison to
measured data are rated as anomalies if they exceed a predefined
threshold.                                                             [Figure 1 plot: temperature in °C (y-axis, 110 to 120) over hours (x-axis, 0 to 2,000)]

2.2    Visual analytics for industrial application
Recently, a survey on visualization and visual analytics applica-
tions for smart manufacturing has been published [42]. It reveals                     Figure 1: Temperature incident
the diversity of several studies performed for industrial appli-       Temperature of the critical component over all cycles. For each
cations and the need of visual analytics. A few examples are           cycle over the whole durability test time the mean temperature
available, on how to solve the problem of finding anomalies in                      has been calculated for trend analysis.
multivariate time series data by visual analytics. An application
for finding anomalies in the power consumption of buildings
has been proposed by Janetzko et al. [18]. It suggests a model-        engine power density, speed and durability, or of legal nature,
based and a similarity-based anomaly score and visualizes them         such as, along with others, fuel economy, noise pollution and
in several visualization techniques such as, recursive patterns        exhaust gas emissions. For this research work, we analyzed data
[19], spiral graphs [34] and line charts. In the work of Wu et         from a durability test of an internal combustion engine. The
al. [35] anomalies are detected for condition monitoring by a          main goal of the test is to ensure the durability, reliability and life
model-based approach. The deviation of estimated and real val-         time expectations of the engine. Therefore, durability tests are
ues is visualized in a river plot view [9]. As an ongoing challenge,   conducted to let the engine undergo sufficiently high mechanical
the authors outline the problem of analysts to trust and make          and thermal loads (stresses) and a sufficient number of fatigue
use of the algorithms for condition monitoring. Many different         cycles (e.g., hundreds of hours) [37]. During durability tests, a
algorithms are available for several applications and finding          vast amount of data is collected by sensors which are either
appropriate models and parameters is a hard task. Considering this     commonly built into modern vehicles and accessed through the
problem, Xia et al. [36] proposed a visual analytics application       engine control unit (ECU), or sensors which have been mounted
to support users in finding the right model for dimensionality         on the engine and the testbed for testing purposes.
reduction. Another work addressing this challenge is presented            Throughout durability tests, engineers are observing the test,
by the EnsembleLens [38]. It is a visual analytics system to help      and are responsible for the performance and the condition of the
data mining experts to evaluate, compare and select available          testee. For that condition monitoring task, engineers generally
anomaly detection algorithms.                                          monitor a few familiar sensors for threshold violations, manually
                                                                       selected and defined by their given domain knowledge or by the
2.3    Glyph representation of cyclic time series                      customer. However, for novel engine designs, which have been
       data                                                            recently developed, there is no knowledge on all sensors and
                                                                       their thresholds, and it is often not appropriate to apply earlier
Besides the major summaries and surveys [7] [25], recently a           experiences. In fact, an important task in testing of novel engine
systematic review of experimental studies on data glyphs has           designs is the comparison for differences with previous designs.
been presented by Fuchs et al. [14]. To visualize multivariate         Therefore, our work is motivated by making use of the time series
time series data, glyphs are an appropriate choice and can en-         data of all sensors and the information they might contain.
able quick visual comparison of data values over time [13]. In            To give a practical example of the challenges engineers face
Ward and Lipchak [33], the visualization of a circular glyph for       during durability tests, in Figure 1 a line plot describing the
recognition of the evolution of a measurement of interest has          problem of a use case is shown. After 1,200 hours of a 2,000-hour
been proposed. Another glyph-based design for outlier detection        durability test a fatal error occurred, which increased the tem-
in social networks has been proposed by Cao et al. [10]. In their      perature of a critical component part of the engine by up to 8
work, glyphs visualize suspicious behavior of users, based on          °C over time. In the end, the durability test failed because of the
the z-score of several attributes, in a star-glyph like design. The    temperature increase of the critical component. The failure could
anomaly scores of entities are visualized by the intensity of the      not be anticipated, because of the large number of sensors.
red color in the cyclic glyph center. As examples for glyph-based      Consequently, it was a hard task to define rule-based thresholds or
time series visualizations, a few techniques for glyph designs for     measurements of concern, indicating the failure upfront.
comparison purposes are evaluated in the work of Fuchs et al.          However, in comparison to a simple rule-based anomaly detection
[13].                                                                  approach, more complex data analysis and models which also
                                                                       take the interplay of sensor data into account may lead to better
3     BACKGROUND ON AUTOMOTIVE                                         analysis results. Domain experts assume that it should be pos-
      TESTBEDS AND USE CASE                                            sible to anticipate such failures by advanced data analysis and
Our work has been conducted for automotive engines in the              visualization techniques. Therefore, to understand the current
context of the validation and verification phase of the industrial PLC data analysis workflow of engineers, we carried out a design
[3] [30]. After an engine has been developed, its requirements         study as basis for our proposed visual analytics application. More
are verified and validated in automotive testbed environments.         details on the data and the design study will be given in the next
Those requirements can be of functional nature, such as the            section.


4     DESIGN STUDY                                                       4.3     Tasks
As the first contribution of this paper, we characterize testbed         This design study is based on the domain question of whether an engine
data and study how engineers fulfill their condition monitoring          is in a non-critical condition during a durability test. For that
task through data analysis. This section relates to Miksch and           purpose, we define test cycles as the population unit (or entity,
Aigner’s “design triangle” [22] and is generally based on the design     or unit of analysis) [20]. Furthermore, engineers have a high
study methodology of Sedlmair et al. [27]. For the tasks aspects         level of understanding of using cycles as a granularity level for
of the design triangle we bridge from goals to tasks with the            their analysis. Due to the repetitive behavior of cycles, they are
design study analysis report as proposed by Lam et al. [20].             highly comparable, and therefore we define the data analysis
                                                                         goals of engineers as multiple population analysis. Consequently,
                                                                         we identified that engineers are pursuing all three multiple popu-
4.1    Data                                                              lation goals defined in the design study analysis report framework
                                                                         [20]: (a) compare entities (b) explain differences and (c) evaluate
Input data used in our proposed visual analytics approach for
                                                                         hypothesis. In the following, we further investigate the char-
anomaly detection is taken from an automotive engine testbed.
                                                                         acterization of those goals by their input, output and analysis
One common task within engine development is to carry out dura-
                                                                         steps.
bility tests. For such a durability test, a test cycle is specified to
verify the durability of an engine. The test cycle, which is defined        4.3.1 Compare Entities. Engineers attempt to detect popu-
by a given engine speed and engine torque profile over time, is          lation differences as their top level analysis goal. In order to
repeated over a period of several months until the target operating      achieve that, engineers are observing trends of several familiar
hours are reached. During a durability test, hundreds of sensor          channels by calculating the mean values of channels per cycle
measurement signals are acquired and stored continuously, while          and visually explore changes over cycles in a time-ordered line
the engine drives the given profile. In the automotive domain,           plot (see Figure 1). In trend charts, up to ten time series are com-
those sensor measured time series are called channels, which             pared either in juxta- and/or superpositioned lineplots [15]. The
we adopted throughout our research. Among others, channels               output of the compare entities analysis goal, is the observation
mainly record several engine speed, engine torque, temperatures,         of a conspicuous trend or anomaly, which is investigated in the
pressures and exhaust gas measures.                                      explain differences goal.
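
The trend analysis described for the compare entities goal (mean value of each channel per cycle, ordered by time) can be sketched as follows. This is a minimal illustration and not the engineers' actual tooling: it assumes that every cycle is available as a NumPy array of shape (channels, samples), and the function name cycle_means as well as the synthetic data are hypothetical.

```python
import numpy as np

def cycle_means(cycles):
    """Return an array of shape (n_cycles, n_channels) holding the
    mean value of every channel in every cycle."""
    return np.stack([np.asarray(c, dtype=float).mean(axis=1) for c in cycles])

# Synthetic stand-in data (assumption): 5 cycles, 3 channels, 100 samples each.
rng = np.random.default_rng(0)
cycles = [rng.normal(loc=100.0, scale=1.0, size=(3, 100)) for _ in range(5)]
cycles[4][0] += 8.0  # inject an offset into channel 0 of the last cycle

trend = cycle_means(cycles)   # one row per cycle, one column per channel
drift = trend[-1] - trend[0]  # change of each channel mean, first to last cycle
```

Plotting one column of trend over the cycle index gives a trend chart of the kind shown in Figure 1; a large entry in drift points to a channel whose mean has shifted over the test.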
   One cycle is stored within one file and can be seen as an
                                                                            4.3.2 Explain Differences. Regarding the domain question,
N × n-dimensional matrix, where N is the number of channels and n
                                                                         the output of the explain differences goal can be either that the
the length of the time series. All signals originally record
                                                                         observation is not relevant for the engine’s condition, or as a
numerical values at a frequency of 10 Hz. Note that channels are
                                                                         hypothesis one specific component of the engine is in a bad
aligned according to the given engine speed profile and therefore
                                                                         condition. In the next analysis goal, the hypothesis needs to be
cycles of the same length can be extracted. The target data
                                                                         evaluated.
contains records of c = 860 cycles of a 2,000-hour durability test,
in which each cycle has a duration of 140 minutes. Overall, the             4.3.3 Evaluate Hypothesis. Evidence that a component or part
dataset has 860 cycles × 480 channels × 84,000 numerical values.         of the engine is in a bad condition needs to be evaluated by
   To conclude, the dataset can be characterized as a cyclic,            engineers. The analysis steps of that goal do not differ from the
piecewise equi-length segmented, multivariate streaming time series      preceding goal. Domain experts explore channel time series
(according to the characterization of Shurkhovetskyy et al. [29]).       using their domain knowledge and attempt to find differences and
For our research work, the data could be seen as stationary, since       interesting patterns by comparing different channel line plots
the durability test has finished and the entire dataset was available    of multiple populations. The final confirmation or rejection of
for our work. However, we kept a streaming time series                   hypotheses is made after evidence has been collected through
scenario in mind where engineers use our proposed visual analytics       data analysis and, if required, investigated directly on the engine.
approach throughout a durability test.                                      Through that characterization of higher level analysis goals,
                                                                         we can derive lower level task definitions T1 - T5 to address them
                                                                         in our visual analytics design considerations and automated data
4.2    Users                                                             analysis:
Users of our proposed visual analytics application are devel-            T1 Identify population contrasts. Test cycles are the unit of
opment engineers with mechanical engineering background,                     analysis, or population, for engineers. As the first task, popu-
working with powertrains and engines on a regular basis. They                lation contrasts or differences are explored. This is achieved
have long-standing experience with engines in testbed                        by trend analysis of a few familiar channels. The problem of
environments and, as front-line analysts, also analyze data in practice      dealing with large numbers of channels that engineers face should be
to achieve their analysis goals (i.e., condition monitoring). Three          considered in the design by taking all channels into account.
users collaborated in our project systematically by participating        T2 Application and visualization of semi-supervised
in the design study and the pair analytics evaluation (see section 4         anomaly detection methods. Engineers detect interesting
and section 7). In general, the work with testbed data is essential          patterns and anomalies mainly by visually exploring line plot
for development engineers and offers the opportunity to measure              trends of channels. Comparing past cycles to current cycles
indicators regarding functional or legal requirements and engine             is related to a semi-supervised learning scenario and should
performance. During our research work, we also collaborated                  be considered for the choice of automated data analysis and
with data scientists who are daily working with testbeds and                 the visualization. First, the automation of data analysis to
powertrains. They constantly provided informal feedback from a               highlight interesting or conspicuous channels should be in-
different view throughout our work.                                          cluded into the visualization. Second, we assume that the
                                                                             combination of several anomaly detection algorithms leads


   to more significant findings, for which reason an ensemble           anomaly score calculation. Note, that the baseline anomaly score
   method [38] should be considered for the visualization.              is calculated by a baseline cycle, which data should be recorded
T3 Examine conspicuous channels in multiple populations.                temporally close to the reference cycle. Hence, we specify the
   After conspicuous channels have been detected by trend               cycle subsequent to the reference as the baseline cycle. Defining
   analysis, engineers drill-down to examine and find differ-           the upper limit (threshold) is critical and can be changed inter-
   ences between channel line plots of different populations.           actively in the visualization. In the following, we discuss two
   The comparison of interesting channels in different popula-          methods, which have been applied on industrial sensor data in
   tions should be considered in the design.                            prior research. Another criterion for selecting those two methods
T4 Detection of conspicuous channel relations. With the                 is the capability of identifying conspicuous channels separately
   exception of visualizing multiple trend line plots in juxtaposition, for further drill-down and comparative analysis (T3).
   engineers generally detect anomalies by univariate time
   series analysis of multiple channels. They also compare line
   plots with the given engine speed and engine torque.
                                                                        5.2    Correlation-based anomaly score
   Relations between channels at a broader scale should be              Inspired by the approach presented by Zhao et al. [40], we
   considered for the automated data analysis and visualization.        assume that a change of linear correlations between two sensors
T5 Reduce amount of data. To address the large number of                over time indicates an anomaly. During our research work, we
   channels, data reduction techniques should be taken into             investigated the application of correlation-based anomaly
   account for the visual analytics design. The main consideration      detection on testbed data. Despite its limitations, the method has
   for the visual analytics design is that interesting data should      also strengths that are beneficial in solving the tasks defined
   be highlighted to support engineers in their decision making.        in the design study. First, we highlight the main limitation in
                                                                        detecting anomalies by the change of linear sensor correlations,
Overall, data can only be analyzed by including extensive domain
                                                                        if throughout the durability test no linear correlation between
knowledge of users to the data analysis. Modern powertrains and
                                                                        two specific sensors exist. Nevertheless, we examined that in
engines are highly complex machines and therefore domain ex-
                                                                        testbed data many linear correlations between sensors exist. For
perts are necessary to interpret results of automated data analysis
                                                                        example, data from several temperature channels are likely to
through a visual analytics approach. In the next section we in-
                                                                        correlate. Experiments demonstrated that this method can detect
troduce the anomaly detection methods we applied to the visual
                                                                        anomalies in testbed data, and therefore has been applied to our
analytics approach.
                                                                        visual analytics approach.
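
The idea of scoring a cycle by the change of its linear channel correlations, and of rescaling that score between a baseline and a threshold as in the unified anomaly score, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in this work: cycles are assumed to be NumPy arrays of shape (channels, samples), the function names are hypothetical, averaging the absolute values of the difference matrix is an assumption, and the threshold value is arbitrary.

```python
import numpy as np

def correlation_change(reference, cycle):
    """Mean absolute difference between the Pearson correlation
    matrices of two cycles, each of shape (channels, samples)."""
    diff = np.corrcoef(cycle) - np.corrcoef(reference)
    return float(np.mean(np.abs(diff)))

def unified_score(raw, baseline, threshold):
    """Linearly rescale a raw score so that baseline -> 0 and threshold -> 1."""
    return (raw - baseline) / (threshold - baseline)

rng = np.random.default_rng(1)

def make_cycle(coupled):
    """Synthetic 3-channel cycle (assumption); channels 0 and 1 are
    linearly related only if `coupled` is True."""
    a = rng.normal(size=500)
    b = a + 0.1 * rng.normal(size=500) if coupled else rng.normal(size=500)
    c = rng.normal(size=500)
    return np.vstack([a, b, c])

reference = make_cycle(True)   # cycle selected as reference
baseline = make_cycle(True)    # cycle assumed to follow the reference
unseen = make_cycle(False)     # cycle in which the channel relation is lost

baseline_raw = correlation_change(reference, baseline)
unseen_raw = correlation_change(reference, unseen)
threshold = 0.1                # arbitrary raw-score threshold (assumption)
score = unified_score(unseen_raw, baseline_raw, threshold)
```

In this sketch a unified score of 0 corresponds to the baseline cycle and 1 to the threshold; scores are not clipped, so a cycle whose correlations deviate more than the threshold allows yields a value above 1. Note that np.corrcoef returns NaN for a channel that is constant within a cycle, so such channels would need to be handled separately.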
                                                                            The basis for the correlation-based anomaly score is the corre-
5     AUTOMATED DATA ANALYSIS:
                                                                        lation difference matrix, which represents the deviation of linear
      ANOMALY DETECTION METHODS                                         channel relations between two testbed cycles. The correlation
In research, anomaly detection often refers to a two-class              matrices for each sensor combination of the reference cycle, the
classification problem, in which data is either classified as an anomaly baseline cycle and the unseen cycle are calculated by using
or not. In general, a model is built on normal data, considering        Pearson’s correlation coefficient. Then, the correlation matrix of the
that the model can calculate an anomaly score on unseen data            reference cycle is subtracted from the correlation matrix of the
sets (apart from unsupervised methods). If the anomaly score            unseen cycle, which results in the correlation difference matrix.
exceeds a predefined threshold, the data record or the entire set       As a result, the anomaly score is calculated as the average of
is classified as an anomaly. We consider the application of             all values in the difference matrix and is mapped to the unified
semi-supervised anomaly detection methods for our design (T2). Most     anomaly score according to the method explained before.
techniques are specific to different observational features, in con-
sequence of which we assume that an ensemble-based approach
obtains more robust anomaly scores [38]. Therefore, we propose          5.3    Regression-based anomaly score
to map results of different anomaly detection algorithms to a           As the second anomaly score, we make use of regression models
unified value for comparison purposes and describe two anomaly          for regression-based anomaly detection [12]. For this research
detection methods used in the visual analytics approach.                work, we train regression models to estimate a time series.
                                                                        Subsequently, the model is applied to an unseen data set, in which
5.1    Unified anomaly score                                            the difference between the estimation and the real values
Test cycles are the engineers’ unit of analysis, for which reason       (residuals) can be interpreted as anomalies. Considering this method
we choose them as the granularity level for data analysis (T1).         for T1 and T2, an anomaly score between populations or cycles
To make different anomaly detection methods comparable in               needs to be calculated in a semi-supervised manner. Therefore,
an ensemble-based approach, we propose the following to map             regression models with data of a user-defined reference cycle are
anomaly scores to unified values between 0 and 1: (1) Interac-          trained for all channels separately. To make those channel regres-
tively select a reference cycle as input data for the training of the   sion models comparable, it is necessary to standardize data first,
anomaly detection model. (2) A baseline cycle is selected to cal-       i.e. standardization of the entire time series to values between
culate a baseline anomaly score. (3) Define a threshold anomaly         0 and 1. The regression models can now be used to estimate all
score, based on prior knowledge, domain knowledge or historical         channel time series for unseen cycles and the anomaly score of
data. (4) Further, the anomaly scores of cycles are calculated by       one cycle can be calculated as the mean average error averaged
linear scaling from 0 (baseline) to 1 (threshold). Therefore,           over all channels of a cycle. Subsequently, it can be mapped
our approach needs the definition of a reference cycle for model        to a unified anomaly score according to the method explained
training, a cycle for baseline definition and the definition of a       above.
threshold. The baseline anomaly score is used to consider a train-          We chose Random Forest regressor, as suggested by Breiman
ing error and therefore is taken as the lower limit of the unified      [8], considering that this model has been proven to perform well
                                                                            EDBT 2020, March 30-April 2, 2020, Copenhagen, Denmark
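
The mapping to a unified anomaly score described in subsection 5.1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the raw score values, and the clipping to the interval [0, 1] are assumptions made for this example. The text only states that the baseline score maps to 0, the threshold score maps to 1, and that detectors are combined in an equally weighted ensemble.

```python
from statistics import mean


def unified_score(raw, baseline, threshold, clip=True):
    """Linearly rescale a raw anomaly score so that baseline -> 0 and threshold -> 1."""
    if threshold <= baseline:
        raise ValueError("threshold must be larger than the baseline score")
    scaled = (raw - baseline) / (threshold - baseline)
    # Clipping to [0, 1] is an assumption; the text only names the target range.
    return min(1.0, max(0.0, scaled)) if clip else scaled


def ensemble_score(unified_scores):
    """Equally weighted average of the unified scores of all detectors."""
    return mean(unified_scores)


# Hypothetical raw scores of one cycle from two detectors, each with its own
# baseline (score of the baseline cycle) and user-defined threshold.
correlation_based = unified_score(raw=0.26, baseline=0.02, threshold=0.32)
regression_based = unified_score(raw=0.05, baseline=0.03, threshold=0.13)
ensemble = ensemble_score([correlation_based, regression_based])
print(round(correlation_based, 2), round(regression_based, 2), round(ensemble, 2))
```

Because every detector is rescaled with its own baseline and threshold, detectors with very different raw score ranges become comparable before they are averaged.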


in many domains [41]. As input data for training channel regression models, engine speed and engine torque are chosen, since those two channels are given by the test and strongly relate to the majority of the channels (T4). Also, sliding window features for those two channels are extracted, whereas sliding windows contain differences and mean values of three seconds into the past. We assume that in this time frame the most relevant information can be extracted for our models. The aim of this approach is not to estimate each channel as accurately as possible, but to detect changes in anomaly scores between populations. As with the correlation-based method, we are aware of the limitation that this method may not return a decent estimation for all channels, but it may be effective for some types of anomalies. This consideration also motivates the choice of an ensemble method.

               Figure 2: Proposed glyph design
The glyph visualizes two anomaly scores and their ensemble (aggregate). Anomaly Score 1 visualizes a value of 0.8, Anomaly Score 2 visualizes a value of 0.2, whereas the equally weighted center visualizes the Ensemble Anomaly Score of 0.5.

6     VISUAL ENCODING AND CONSIDERATIONS
This section explains how we use two anomaly scores for a glyph-based visualization. Also, an example of how to identify conspicuous channels within a cycle, either in a matrix representation or in a ranked channel list, is given. The visual considerations explained in this section are brought together in the prototype implementation, which realizes the visual analytics approach.

6.1    Cycle anomaly glyph
The proposed glyph in Figure 2 is flexible and independent of the underlying analytical methods for anomaly detection, as long as each method implements the framework for calculating unified anomaly scores between 0 and 1 (subsection 5.1). Anomaly scores are visualized in the outer circular segments of the glyph representation as the opacity value of the red background color. Our premise when designing the glyph was that no single algorithm can detect all kinds of anomalies relevant for different applications. As a result, we choose an extensible glyph design, achieved by its circular shape, which offers the capability of adding and removing anomaly scores in their corresponding circular segments. The main visual focus of the glyph stays at the center circle, which represents an equally weighted average of all anomaly scores, labeled as the ensemble anomaly score.
   During our work, the main concern of visualization experts regarding the presented glyph design was the benefit compared to simpler visualizations, such as line plots. As stated in the design study, line plots are a well-known visualization type and comprehensible to the target group. However, our approach has advantages over line-plot-based visualizations. In general, testbed cycles as granularity level are highly comprehensible for engineers. Therefore, cycles are visualized as individual and complete entities, whereas the glyph design offers the following opportunities: (1) As a visual entity it can be clearly selected by users for further exploration, reasoning and drill-down. Also, glyphs can be selected interactively to be defined as reference for the underlying semi-supervised learning algorithms to identify contrasts between populations (T1, T2). (2) The glyph design can be extended with several anomaly detection algorithms by adding additional outer circular segments. (3) Similar glyphs can be aggregated, clustered and arranged for further interpretation by domain experts and to save screen space. (4) The glyph can be visualized on its own as a quick overview of an engine’s condition. This is also related to the idea of involved engineers of having a simple "traffic light like" system, which also encouraged us to develop the presented glyph-based approach.

6.2     Identification of anomalous channels
As stated above, we choose two anomaly scores by their capability to further explore single anomalous channels. After an anomalous cycle has been identified in the glyph representation, users are interested in the cause of that anomaly. Therefore, we visually represent anomalies for both anomaly scores, as follows:
Matrix-based identification of anomalous channels. The correlation deviation matrix calculated for the correlation-based anomaly score is shown in step (c) of Figure 3. Basically, in this symmetrical matrix, deviations of correlations of channels within a given cycle with respect to the selected reference cycle are visualized. More specifically, we compute the difference of the correlation matrices of these two cycles, and show the result by color-coding the cells of the difference matrix. Hence, levels of red represent the anomaly score of channel correlations. This matrix representation supports the analysis goal to determine and quantify visual patterns for pattern-driven visual exploration. Together with appropriate matrix reordering methods, we can use this display to search for typical patterns in matrix visualizations, including line patterns and block patterns [4]. Most importantly, if one sensor shows an anomalous behavior, its correlation difference values to many or all other sensors will be rather large, leading to line patterns. Such visual patterns attract the attention of the analyst and are a starting point for drilling down into the respective sensor data (Figure 3 (e2)).
Ranked mean average error list. The regression-based anomaly score can be explored by the ranked mean average error list as proposed in Figure 3 (d). Channels that deviate from the reference are listed and ranked by their anomaly score. This enables a guided approach for exploring anomalies and simplifies data analysis. By clicking on channel names, users can explore the reference and the anomalous channel time series by visual comparison in juxtaposition for hypothesis generation (Figure 3 (e1)).

6.3     Prototype
The workflow of the approach, applied to data of the given use case, is exhibited and briefly described in Figure 3. It shows screenshots of the implemented prototype, whereas further explanations are given in the following: In (a) glyphs are placed in a grid, with each cell representing a test cycle in chronological


                                               Figure 3: Proposed visual analytics tool
  (a) Differences between the selected reference cycle and other cycles can be explored, whereas glyphs are positioned in a time-ordered
 grid. (b) Besides some filter capabilities, the anomaly score threshold can be interactively changed by a ruler. (c) Anomalies found
 by the correlation-based anomaly score can be explored in the matrix representation. (d) Anomalous channels found by the
 regression-based anomaly score can be explored by the ranked mean average error list. (e) Hypotheses can be evaluated by comparing
                                  channel time series in the reference cycle and the cycle of interest.
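
The correlation-based score behind the matrix view in panel (c) can be sketched in a few lines. This is a hedged illustration under stated assumptions, not the prototype's code: the channel names and sample values are invented, Pearson correlation is assumed as the correlation measure, constant channels are assigned a correlation of 0, and absolute differences are averaged so that positive and negative deviations do not cancel out. The text itself only says that the two correlation matrices are subtracted and that all values of the difference matrix are averaged.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sample lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant channel has no defined correlation; treating it as 0 is an assumption.
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def correlation_matrix(cycle):
    """cycle: dict mapping channel name -> list of samples; rows follow sorted names."""
    names = sorted(cycle)
    return [[pearson(cycle[a], cycle[b]) for b in names] for a in names]


def correlation_difference(reference, cycle):
    """Element-wise absolute difference of the two correlation matrices."""
    ref, cur = correlation_matrix(reference), correlation_matrix(cycle)
    return [[abs(r - c) for r, c in zip(ref_row, cur_row)]
            for ref_row, cur_row in zip(ref, cur)]


def raw_correlation_score(diff):
    """Average of all values in the difference matrix (raw, not yet unified)."""
    values = [v for row in diff for v in row]
    return sum(values) / len(values)


# Hypothetical three-channel cycles; "temp" behaves differently in the second one.
reference = {"speed": [1, 2, 3, 4], "torque": [2, 4, 6, 8], "temp": [1, 2, 3, 4]}
cycle = {"speed": [1, 2, 3, 4], "torque": [2, 4, 6, 8], "temp": [4, 3, 2, 1]}
diff = correlation_difference(reference, cycle)
names = sorted(reference)
# The row with the largest sum corresponds to the "line pattern" in the matrix.
suspect = names[max(range(len(names)), key=lambda i: sum(diff[i]))]
print(suspect, round(raw_correlation_score(diff), 3))
```

In this toy example only the row and column of the deviating channel carry large values, which is the line pattern described for the matrix representation.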

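The ranked list in panel (d) can be illustrated in the same spirit. The sketch below assumes that "mean average error" refers to the mean absolute error between the measured and the estimated channel values, that all channels have already been scaled to the range 0 to 1, and that the estimates come from some previously trained regression model. The channel names and numbers are hypothetical.

```python
def mean_absolute_error(actual, estimated):
    """Mean absolute residual between measured and estimated samples."""
    return sum(abs(a - e) for a, e in zip(actual, estimated)) / len(actual)


def ranked_channel_errors(actual, estimated):
    """Return (channel, error) pairs, largest error first, and the raw cycle score.

    actual / estimated: dicts mapping channel name -> list of scaled samples.
    The raw cycle score is the average of the per-channel errors.
    """
    errors = {ch: mean_absolute_error(actual[ch], estimated[ch]) for ch in actual}
    ranking = sorted(errors.items(), key=lambda item: item[1], reverse=True)
    cycle_score = sum(errors.values()) / len(errors)
    return ranking, cycle_score


# Hypothetical measured values and model estimates for one unseen cycle.
actual = {"oil_temp": [0.5, 0.6, 0.9], "boost": [0.2, 0.2, 0.3]}
estimated = {"oil_temp": [0.5, 0.5, 0.5], "boost": [0.2, 0.25, 0.3]}
ranking, cycle_score = ranked_channel_errors(actual, estimated)
for channel, error in ranking:
    print(f"{channel}: {error:.3f}")
print(f"raw cycle score: {cycle_score:.3f}")
```

The raw cycle score would still have to be mapped to the unified range between 0 and 1 before it is shown in the glyph.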

order (from top left to bottom right, inspired by the calendar-based view [32]). Note that the reference cycle is interactively selectable and represented by a white circle, as visible in the top left, or first, glyph. In (b) three interaction possibilities are visible: (i) to add flexibility to the visual comparison of glyphs, the user can interactively change the anomaly score threshold to values between 50% and 200% of the original value; (ii) the user can change the number of displayed glyphs by filtering them with a ‘from - to’ range slider; (iii) glyphs can be filtered so that only every x-th glyph is visualized. Both anomaly scores of interesting glyphs can be selected for further exploration by a drill-down in (c) and (d): In (c) a drill-down example to inspect and identify one or more conspicuous channels within the selected cycle by a matrix representation visualizing the correlation-based anomaly score is shown. An example of a visually perceptible line pattern is outlined, representing a possibly conspicuous channel. Further, the conspicuous channel can be selected in the matrix for exploration and comparative analysis with the reference line plot in (e2). In (d) an example of the ranked mean average error representation of the regression-based anomaly scores is given, whose drill-down capabilities are visible in (e1). In general, drill-down information needs to be investigated and interpreted by domain experts. However, our approach supports users in the identification of interesting data by visually highlighting deviating cycles and sensors. As a side note, interactive line plots and heatmaps in the prototype have been created with the JavaScript visualization library Plotly.js [24] and are anonymized in Figure 3 screenshots.

7    EVALUATION
We conducted a pair analytics evaluation [2] with three subject matter experts (SME), who represent the target user group identified in subsection 4.2, and the dataset described in subsection 4.1. The main target was to evaluate both the comprehensibility of the different views and the underlying automated data analysis, and the capabilities and limitations in supporting users with their daily condition monitoring analysis goals. According to the pair analytics protocol, the evaluation is done by a human-to-human interaction of one SME and one visual analytics expert (VAE), in which the SME acts as the navigator and the VAE as the driver (operator) of the visual analytics tool. In general, all three SME participants stated that the visual analytics tool can be of great benefit to support them in their daily work for two reasons: The visual analytics tool supports engineers in analyzing testbed data (1) more efficiently by highlighting interesting data on different granularity levels and (2) more effectively by enabling the analysis of the entire dataset and not only a subset of well-known channels. To give evidence for that statement, we connect participants’ comments and actions during the pair analytics evaluation to the task definition of subsection 4.3 in the following:
   Each evaluation session started with an introduction to the visual analytics approach and a short demonstration of the prototype. It is notable that all three participants (P1, P2, P3) gained a quick understanding of the concept for two reasons: First, we conducted the design study with the same engineers and connected findings of the study with explanations of our visual analytics tool. Second, the design study clearly identifies tasks and goals of engineers,


therefore the visual analytics prototype accurately addresses the needs of engineers. In general, participants appreciated our effort in developing a decision support system supporting engineers in handling the large amount of data for their condition monitoring tasks.
   The actual pair analytics evaluation sessions started by defining a reference cycle in the glyph-based overview (T2). P1 and P3 appreciated the capability of selecting the reference cycle interactively in the visualization. However, P2 questioned the necessity of interactively selecting the reference cycle, because the testee is likely to be in a good condition before the first cycle, considering that the testee runs through an extensive health check at the beginning of the entire durability test. We are aware of the fact that selecting the first cycle may be an appropriate default choice, but we wanted to keep the analysis more flexible.
   After the reference cycle had been selected, other glyphs in the overview turned red according to their anomaly scores (see Figure 1 (a)). All participants were immediately curious to explore those anomalies in the visualization and easily identified cycles that appeared interesting to them (T1). P2 and P3 pointed at cycles that had a more intensive color of red than the majority of all cycles, whereas P1 mentioned that all glyphs that visualize at least a small anomaly score are interesting. However, in a productive use scenario the exploring strategy may differ since not all cycles are available from the beginning, and new data would be explored incrementally on a regular basis as it becomes available. We emphasize that at this stage of the visual data analysis we successfully reduced the amount of data (T5) and enabled the further exploration of anomalies in the succeeding views.
   T3 and T4 are both achieved by exploring one of the two anomaly scores of a specific cycle: (1) The correlation-based anomaly score and the correlation difference visualization in Figure 1 (b) were comprehensible to the participants as they were able to identify conspicuous channels. However, participants articulated the need for a more guided approach to engage engineers using the matrix visualization, because it appears overloaded and thus overwhelming to engineers. (2) The regression-based anomaly score was also highly comprehensible to participants, since they have a general understanding of regression models. On the other hand, we avoided explaining the actual regression model Random Forest to participants in detail. In comparison to the correlation difference matrix, participants commented that the exploration of conspicuous channels is easier by the ranked mean average error list (Figure 2 (c)). Also, they expressed their interest in additional guided approaches in the other views, considering that such rankings represent a clear order on what channels to focus on, especially if they are short of time during their analysis.
   As the last step of the visual analysis, participants evaluated hypotheses of channels being anomalous by comparing anomalous line plots with their reference cycle equivalents (see Figure 1 (e1 + e2)). From a data perspective, engineers approved that all explored anomalies are interesting, because they highlight a significant difference to the reference. From a domain perspective, some of the anomalies were interesting, but others were explicable and irrelevant for the condition monitoring task. Another type of anomaly detected during the evaluation is defective or unconnected sensors. Line plots of those anomalies visualize a constant or noisy signal. Therefore, we characterize three types of anomalies that have been found during evaluation: (1) Domain irrelevant, (2) Domain relevant and (3) Defective sensors.
   Overall, we found that the visual analytics prototype received acceptance from all participants. They confirm the benefit of the proposed visual analytics approach and are interested in using the prototype in a productive scenario. In the next section, we discuss some of the aspects of the evaluation in greater detail and also consider generalizability and future work.

8   DISCUSSION AND FUTURE WORK
Our visual analytics approach has been designed and evaluated on testbed data, but we emphasize that it is not limited to the automotive domain. At least the glyph-based overview should be applicable to any other cyclic multivariate data set, as long as the underlying automated data analysis methods and dependent visualization techniques are adapted to the specific domain. The visual analytics prototype has been evaluated to be useful for collaborators and they clearly identified advantages in terms of efficiency and effectiveness in comparison to their current workflow. Also, the evaluation opened up many directions for future work: Analyzing up to a thousand cycles can be critical regarding the screen space. The filtering techniques applied in Figure 3 (b) can be improved by a more scalable solution and clearly need further attention. For example, similar glyphs can be aggregated to save space on the display. We also leave the scalability of the matrix representation for future work. As a generic abstraction of anomalies in the glyph-based overview, the calculations of anomaly scores are exchangeable and more extensive research on additional available anomaly detection methods for the use case needs to be done. The evaluation demonstrated that anomalies can be characterized in three manners. Hence, engineers should be able to provide feedback on their findings, i.e., to classify the relevance of anomalies, so that data can be analyzed more efficiently in further iterations. Even if the target users are not data mining experts, we experienced that they gain a quick understanding of the proposed workflow. However, for future work we will address guidance for visual analytics [11] to reduce system complexity from the user perspective and support users to further achieve their analysis goals. One promising avenue we see to this end is the application of visual interestingness measures [5] to automatically select cycle pairs from the database showing significant visual patterns.

9   CONCLUSION
We propose a visual analytics approach to improve the engineers’ daily work by the glyph-based visual analytics workflow. We have found promising results on the given use case, but the concept still needs to be proven with other datasets. We contribute to the need for visual analytics approaches for condition monitoring or anomaly detection in cyclic time series data. Further, our approach devises a methodology to reduce large amounts of industrial data, by drawing attention to anomalous cycles on a higher granularity level to increase efficiency and effectiveness of engineers’ data analysis work.

ACKNOWLEDGMENTS
This research work is done by Pro2Future and AVL List. Pro2Future is funded within the Austrian COMET Program (Competence Centers for Excellent Technologies) under the auspices of the Austrian Federal Ministry of Transport, Innovation and Technology, the Austrian Federal Ministry for Digital and Economic Affairs and of the Provinces of Upper Austria and Styria. COMET is managed by the Austrian Research Promotion Agency FFG.


REFERENCES
 [1] Shikha Agrawal and Jitendra Agrawal. 2015. Survey on Anomaly Detection using Data Mining Techniques. In Procedia Computer Science, Vol. 60. Elsevier, 708–713.
 [2] Richard Arias-Hernandez, Linda T Kaastra, Tera M Green, and Brian Fisher. 2011. Pair Analytics: Capturing Reasoning Processes in Collaborative Visual Analytics. In 44th Hawaii International Conference on System Sciences. IEEE, 1–10.
 [3] Eric Armengaud. 2017. Industry 4.0 as Digitalization over the Entire Product Lifecycle: Opportunities in the Automotive Domain. In Systems, Software and Services Process Improvement, 24th European Conference, EuroSPI 2017. Springer, 334–351.
 [4] Michael Behrisch, Benjamin Bach, Nathalie Henry Riche, Tobias Schreck, and Jean-Daniel Fekete. 2016. Matrix Reordering Methods for Table and Network Visualization. Computer Graphics Forum 35, 3 (2016), 693–716.
 [5] Michael Behrisch, Benjamin Bach, Michael Hund, Michael Delz, Laura von Rüden, Jean-Daniel Fekete, and Tobias Schreck. 2017. Magnostics: Image-Based Search of Interesting Matrix Views for Guided Network Exploration. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 31–40.
 [6] Gilles Blanchard, Gyemin Lee, and Clayton Scott. 2010. Semi-supervised novelty detection. Journal of Machine Learning Research 11 (2010), 2973–3009.
 [7] Rita Borgo, Johannes Kehrer, David H. S. Chung, Eamonn Maguire, Robert S. Laramee, Helwig Hauser, Matthew Ward, and Min Chen. 2013. Glyph-based Visualization: Foundations, Design Guidelines, Techniques and Applications. Eurographics 2013 - State of the Art Reports (2013), 39–63.
 [8] Leo Breiman. 2001. Random Forests. Machine Learning 45 (2001), 5–32.
 [9] P. Buono, C. Plaisant, A. Simeone, A. Aris, B. Shneiderman, G. Shmueli, and W. Jank. 2007. Similarity-Based Forecasting with Simultaneous Previews: A River Plot Interface for Time Series Forecasting. In 11th International Conference Information Visualization. IEEE, 191–196.
[10] Nan Cao, Conglei Shi, Sabrina Lin, Jie Lu, Yu Ru Lin, and Ching Yung Lin. 2016. TargetVue: Visual Analysis of Anomalous User Behaviors in Online Communication Systems. IEEE Transactions on Visualization and Computer Graphics 22 (2016), 280–289.
[11] Davide Ceneda, Theresia Gschwandtner, Thorsten May, Silvia Miksch, Hans-Jörg Schulz, Marc Streit, and Christian Tominski. 2017. Characterizing Guidance in Visual Analytics. IEEE Transactions on Visualization and Computer Graphics 23 (2017), 111–120.
[12] Varun Chandola, Arindam Banerjee, and Vipin Kumar. 2009. Anomaly Detection: A Survey. ACM Computing Surveys 41, 3, Article 15 (2009), 58 pages.
[13] Johannes Fuchs, Fabian Fischer, Florian Mansmann, Enrico Bertini, and Petra Isenberg. 2013. Evaluation of Alternative Glyph Designs for Time Series Data in a Small Multiple Setting. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’13). ACM, 3237–3246.
[14] Johannes Fuchs, Petra Isenberg, Anastasia Bezerianos, and Daniel Keim. 2017. A Systematic Review of Experimental Studies on Data Glyphs. IEEE Transactions on Visualization and Computer Graphics 23 (2017), 1863–1879.
[15] Michael Gleicher. 2018. Considerations for Visualizing Comparison. IEEE Transactions on Visualization and Computer Graphics 24, 1 (2018), 413–423.
[16] Manish Gupta, Jing Gao, Charu C Aggarwal, and Jiawei Han. 2014. Outlier Detection for Temporal Data: A Survey. IEEE Transactions on Knowledge and Data Engineering 26, 9 (2014), 2250–2267.
[17] Clemens Gutschi, Nikolaus Furian, Josef Suschnigg, Dietmar Neubacher, and Siegfried Voessner. 2019. Log-based predictive maintenance in discrete parts manufacturing. In 12th CIRP Conference on Intelligent Computation in Manufacturing Engineering, Vol. 79. Elsevier, 528–533.
[18] Halldór Janetzko, Florian Stoffel, Sebastian Mittelstädt, and Daniel A. Keim. 2014. Anomaly detection for visual analytics of power consumption data. Computers & Graphics 38 (2014), 27–37.
[19] Daniel Keim, H.-P Kriegel, and Mihael Ankerst. 1995. Recursive pattern: A technique for visualizing very large amounts of data. In Proceedings of the IEEE Visualization Conference. 279–286.
[20] Heidi Lam, Melanie Tory, and Tamara Munzner. 2018. Bridging from Goals to Tasks with Design Study Analysis Reports. IEEE Transactions on Visualization and Computer Graphics 24 (2018), 435–445.
[21] Alexander Maier, Tim Tack, and Oliver Niggemann. 2012. Visual Anomaly Detection in Production Plants. In Proceedings of the 9th International Conference on Informatics in Control, Automation and Robotics. SciTePress, 67–75.
[22] Silvia Miksch and Wolfgang Aigner. 2014. A matter of time: Applying a data–users–tasks design triangle to visual analytics of time-oriented data. Computers & Graphics 38 (2014), 286–290.
[23] Marco AF Pimentel, David A Clifton, Lei Clifton, and Lionel Tarassenko. 2014. A review of novelty detection. Signal Processing 99 (2014), 215–249.
[24] Plotly Technologies Inc. 2015. Collaborative data science. Montreal, QC. https://plot.ly
[25] Timo Ropinski, Steffen Oeltze, and Bernhard Preim. 2011. Survey of glyph-based visualization techniques for spatial multivariate medical data. Computers & Graphics 35 (2011), 392–401.
[26] Dominik Sacha, Andreas Stoffel, Florian Stoffel, Bum Chul Kwon, Geoffrey Ellis, and Daniel A. Keim. 2014. Knowledge generation model for visual analytics. IEEE Transactions on Visualization and Computer Graphics 20 (2014), 1604–1613.
[27] Michael Sedlmair, Miriah Meyer, and Tamara Munzner. 2012. Design Study Methodology: Reflections from the Trenches and the Stacks. IEEE Transactions on Visualization and Computer Graphics 18, 12 (2012), 2431–2440.
[28] Maxim Shcherbakov, Adriaan Brebels, N.L. Shcherbakova, V.A. Kamaev, O.M. Gerget, and D. Devyatykh. 2017. Outlier detection and classification in sensor data streams for proactive decision support systems. Journal of Physics: Conference Series 803 (2017).
[29] Georgiy Shurkhovetskyy, N Andrienko, G Andrienko, and Georg Fuchs. 2018. Data Abstraction for Visualizing Large Time Series. Computer Graphics Forum 37, 1 (2018), 125–144.
[30] Stefan Thalmann, Heimo Gursch, Josef Suschnigg, Milot Gashi, Helmut Ennsbrunner, Anna Katharina Fuchs, Tobias Schreck, Belgin Mutlu, Jürgen Mangler, Gerti Kappel, Christian Huemer, and Stefanie Lindstaedt. 2019. Cognitive Decision Support for Industrial Product Life Cycles: A Position Paper. In Cognitive 2019, Vol. 11. IARIA, 3–9.
[31] Stefan Thalmann, Juergen Mangler, Tobias Schreck, Christian Huemer, Marc Streit, Florian Pauker, Georg Weichhart, Stefan Schulte, Christian Kittl, Christoph Pollak, Matej Vukovic, Gerti Kappel, Milot Gashi, Stefanie Rinderle-Ma, Josef Suschnigg, Nikolina Jekic, and Stefanie N. Lindstaedt. 2018. Data Analytics for Industrial Process Improvement: A Vision Paper. In IEEE 20th Conference on Business Informatics (CBI), Vol. 02. 92–96.
[32] Jarke J Van Wijk and Edward R Van Selow. 1999. Cluster and calendar based visualization of time series data. In Proceedings IEEE Symposium on Information Visualization. 4–9.
[33] Matthew O. Ward and Benjamin N. Lipchak. 2000. A visualization tool for exploratory analysis of cyclic multivariate data. Metrika 51 (2000), 27–37.
[34] Marc Weber, Marc Alexa, and Wolfgang Müller. 2001. Visualizing time-series on spirals. In IEEE Symposium on Information Visualization, 2001. 7–13.
[35] Wenchao Wu, Yixian Zheng, Kaiyuan Chen, Xiangyu Wang, and Nan Cao. 2018. A Visual Analytics Approach for Equipment Condition Monitoring in Smart Factories of Process Industry. In IEEE Pacific Visualization Symposium. 140–149.
[36] Jiazhi Xia, Fenjin Ye, Wei Chen, Yusi Wang, Weifeng Chen, Yuxin Ma, and Anthony K.H. Tung. 2018. LDSScanner: Exploratory Analysis of Low-Dimensional Structures in High-Dimensional Datasets. IEEE Transactions on Visualization and Computer Graphics 24 (2018), 236–245.
[37] Qianfan Xin. 2013. 2 - Durability and reliability in diesel engine system design. In Diesel Engine System Design, Qianfan Xin (Ed.). Woodhead Publishing, 113–202.
[38] Ke Xu, Meng Xia, Xing Mu, Yun Wang, and Nan Cao. 2019. EnsembleLens: Ensemble-based Visual Exploration of Anomaly Detection Algorithms with Multidimensional Data. IEEE Transactions on Visualization and Computer Graphics 25 (2019), 109–119.
[39] Shen Yin and Okyay Kaynak. 2015. Big Data for Modern Industry: Challenges and Trends [Point of View]. Proc. IEEE 103, 2 (2015), 143–146.
[40] Pushe Zhao, Masaru Kurihara, Junichi Tanaka, Tojiro Noda, Shigeyoshi Chikuma, and Tadashi Suzuki. 2017. Advanced correlation-based anomaly detection method for predictive maintenance. In IEEE International Conference on Prognostics and Health Management. 78–83.
[41] Xun Zhao, Yanhong Wu, Dik Lun Lee, and Weiwei Cui. 2018. iforest: Interpreting random forests via visual analytics. IEEE Transactions on Visualization and Computer Graphics 25, 1 (2018), 407–416.
[42] Fangfang Zhou, Xiaoru Lin, Chang Liu, Ying Zhao, Panpan Xu, Liu Ren, Tingmin Xue, and Lei Ren. 2019. A survey of visualization for smart manufacturing. Journal of Visualization 22 (2019), 419–435.