=Paper=
{{Paper
|id=Vol-1525/paper-19
|storemode=property
|title=LOGDIG log file analyzer for mining expected behavior from log files
|pdfUrl=https://ceur-ws.org/Vol-1525/paper-19.pdf
|volume=Vol-1525
|dblpUrl=https://dblp.org/rec/conf/splst/HeikkinenH15
}}
==LOGDIG log file analyzer for mining expected behavior from log files==
SPLST'15
LOGDIG Log File Analyzer for Mining Expected
Behavior from Log Files
Esa Heikkinen, Timo D. Hämäläinen
Tampere University of Technology, Department of Pervasive Computing,
P.O. Box 553, 33101 Tampere, Finland
Email: esa.heikkinen@student.tut.fi
timo.d.hamalainen@tut.fi
Abstract. Log files are often the only way to identify and locate errors in a deployed system. This paper presents a new log file analyzing framework, LOGDIG, for checking expected system behavior from log files. LOGDIG is a generic framework, but it is motivated by logs that include temporal data (timestamps) and system-specific data (e.g. spatial data with coordinates of moving objects), which are present e.g. in Real Time Passenger Information Systems (RTPIS). The behavior mining in LOGDIG is state-machine-based, where a search algorithm in states tries to find desired events (with a certain accuracy) from log files. That differs from related work, in which transitions are directly connected to lines of log files. LOGDIG reads any log files and uses metadata to interpret the input data. The output is static behavioral knowledge and a human-friendly composite log for reporting results in legacy tools. Field data from a commercial RTPIS called ELMI is used as a proof-of-concept case study. LOGDIG can also be configured to analyze other systems' log files through its flexible metadata formats and a new behavior mining language.
Keywords: Log file analysis, data mining, spatiotemporal data mining, behavior computing, intruder detection, test oracles, RTPIS, Python
1 Introduction
Log files are often the only way to identify and locate errors in deployed software [1], especially in distributed embedded systems that cannot be simulated due to a lack of source code access, a virtual test environment or specifications. However, log analysis is no longer used only for error detection; even the whole system management has become log-centric [2].
Our work originally started 15 years ago with a commercial Real Time Passenger
Information System (RTPIS) product called ELMI. It included real-time bus tracking
and bus stop monitors displaying time of arrival estimates. ELMI consisted of several
mobile and embedded devices and a proprietary radio network. The system was too
heterogeneous for traditional debugging, which led to log file analysis as the primary
method. In addition, log file analysis helped in discovering the behavior of some black-box parts of the system, which contributed to the development of the open ELMI parts.
The first log tools were TCL scripts that were gradually improved in an ad-hoc manner. The tooling fulfilled the needs very well, but the maintenance got complicated and the tools fit only the specific system. This paper presents a new, general-purpose log analyzer tool framework called LOGDIG, based on the previous experience. It is intended for logs that include temporal data (timestamps) and optionally system-specific data (e.g. spatial data with coordinates of moving objects).
Large embedded and distributed systems generate many types of log files from multiple points of the system. One problem is that log information might not be intended for error detection but e.g. for a business, user or application context. This requires the capability to interpret log information. In addition, there can be complex behaviors, like a chain of sequential interdependent events, for which simple statistical methods are not sufficient [1]. This requires state sequence processing. LOGDIG is mainly intended for expected behavior mining.
State-of-the-art log analyzers connect state transitions one by one to log lines (i.e. events or records). LOGDIG is built on the opposite, new idea: log events do not directly trigger transitions; instead, events are searched for states by special state functions. The benefit is much more versatile searches, including searches over already read old log lines, which is not possible in the state of the art. LOGDIG also includes our new Python-based language, Behavior Mining Language (BML), but it is out of the scope of this paper.
The new contributions of this paper are i) a new method for searching log events for state transitions, ii) the LOGDIG architecture and implementation, and iii) a proof of concept with real RTPIS field data.
This paper is organized as follows. Section 2 describes the related work and Sec-
tion 3 the architecture of LOGDIG. Section 4 presents an RTPIS case study with real
bus transportation field data. The paper is concluded in Section 5.
2 Related work
We focus on discovering the realized behavior from logs, which is required to figure out whether the system works as expected. The behavior appears as an execution trace or as sequential interdependent events in log files. The search means detecting the events that contribute to the trace described as the expected behavior.
A typical log analyzer tool includes metamodels for the log file information, specification of analytical tasks, and engines to execute the tasks and report the results [2]. Most log analyzers are based on state machines. The authors in [5] conclude that the most appropriate and useful form for a formal log file analyzer is a set of parallel state machines making transitions based on lines from the log file. Thus, e.g. in [1], programs are validated by checking the conformity of log files: the records (lines) in a log file are interpreted as transitions of the given state machine.
The most straightforward log analyzing methods use small utility commands like grep and awk, spreadsheets like Excel, or database queries. More flexibility comes with scripts, e.g. Python (NumPy/SciPy), Tcl, Expect, Matlab, SciLab, R and Java (Hadoop). There are also commercial log analyzers like [3] and open-source log analyzers like [4]. However, in the following we focus on scientific proposals.
Valdman [1] has presented a general log analyzer, and Viklund [5] an analysis of debug logs. The authors in [6] have presented execution anomaly detection techniques for distributed systems based on log analysis. Execution anomalies include work flow errors and low performance problems. The technique has three phases: first abstracting the log files, then deriving an FSA, and finally checking execution times. The result is a model of behavior; later, the learned model can be compared with the current model to detect anomalies. This technique does not need extra languages or data to configure the analyzer. A database-based analysis is presented in [7], employing a database as the underlying reasoning engine to perform analyses in a rapid and lightweight manner. The TBL language is presented in [8] for validating expected behaviors in execution traces of a system; TBL is based on parameterized patterns. The domain-specific LOGSCOPE language, based on temporal logic, is presented in [9].
LFA [10] has been developed for general test result checking with log file analysis, and extended by LFA2 [11; 12] with the idea of generating log file analyzers from C++ instead of Prolog. That improved the general performance of LFA and allowed the LFAL language to be extended with new features, such as support for regular expressions. The authors in [13] have presented a broad study on log file analysis. LFA is clearly the closest to our work and the most widely reported.
Table 1 presents a comparison between our work and LFA/LFA2. The classification criteria are modified from [1]; for brevity we exclude a detailed description of each. To conclude, the main difference is that LFA/LFA2 connects transitions directly to the events (lines) of the log files, whereas LOGDIG takes a completely different approach that enables more versatile searches, like searching from old (already read) lines or searching "backward". New system-specific search features are also easier to add.
Table 1. Comparison of the LOGDIG and LFA/LFA2 analyzers

Feature                                    LOGDIG                  LFA/LFA2
1. Analyzed system and logs
   1. System type                          Distributed             Distributed
   2. Number of logs                       Not limited             1
   3. Size of logs                         Not limited             Not limited
   4. Formats of logs                      Not limited             Limited
2. Analyzing goals and results
   1. Processing                           Batch job               Batch job
   2. Type of knowledge                    Very complex            Complex
   3. Results                              SBK, RCL, test oracle   Test oracle
3. Analyzer
   1. Technology or formalism of engine    State machine           State machine
   2. Event indexing                       Time, id                No
   3. Search aspects (SA)                  Index, data, SSD        Data
   4. Adjustment of accuracy of SA         Yes                     Yes
   5. Pre-processing (and language)        Yes (PPL)               No
   6. Analyzing language                   BML                     LFAL
   7. Versatile reading from log           By index, not limited   Only forward
   8. Operational modes                    Multiple-pass           Single-pass
   9. State-machine-based engine           Yes                     Yes
      1. State machines                    1                       Not limited
      2. State functions                   ES algorithm            No
      3. State transitions                 Limited (F, N, E)       Not limited
      4. State transition functions        Yes                     Yes
      5. Variables                         Yes                     Yes
4. Analyzer's structure and interfaces
   1. Implementation                       Python (Tcl)            Prolog/C++
   2. Extensibility                        Via SBK, BMS, SSC       ?
   3. Integratability                      Command line            ?
   4. Modularity                           BMU, ESU, SSC, SSD, lib Lib
5. Analyzer's other features
   1. Robustness (runtime)                 Exit transition         Error transition
   2. User interface                       Command line            ?
   3. Compiled or interpreted              Interpreted             Compiled
   4. Speed                                Medium                  Fast
3 LOGDIG architecture
Fig. 1 depicts the overall architecture and log processing of LOGDIG. The analysis process refines raw data from the log files (bottom) into information (middle) and finally into knowledge (top). The straightforward purpose is to transform the dynamic behavior data from the log files into static behavior knowledge (the SBK file). This can be thought of as linking many found events from many "rows" of log files into one "row" of the SBK file. The static knowledge can be reported and further processed by legacy visualization, database and spreadsheet tools like Excel. Besides SBK, LOGDIG produces a Refined Composite Log (RCL) file. Its purpose is to link the found data from the log files together into one composite, human-readable text format described in the BML language. This helps simple keyword-based searches for reporting and further analysis.
[Figure: three-layer view. Bottom (DATA): logs (behaviour data) and System Specific Data (SSD). Middle (INFORMATION): metadata of the logs and pre-processing tasks (PPL) produce pre-processed logs. Top (KNOWLEDGE): the LOGDIG log file analyzer, driven by the BML mining language, produces the reports: static behavior knowledge (SBK) and the refined composite log (RCL).]
Fig. 1. Overview of LOGDIG analyzer log processing.
The LOGDIG processing has two main phases: log data pre-processing and behavior mining. Pre-processing transforms the original log files into a uniform, mining-friendly format, which is defined by our own Pre-Processing Language (PPL). PPL is not discussed further in this paper, but we use the common csv format with named columns as the output of the transform. The behavior mining phase uses our BML language to define the search for the events of the expected behavior.
Logs may consist of rows of unstructured text that can be parsed with regular expressions (Python). A timestamp is mandatory in every line. Pre-processed logs are structured in rows and columns as csv files. The first row is a header line that includes the column (variable) names. The columns are the row number, the timestamp and log-file-specific data.
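As an illustration of this pre-processing step, the following sketch parses unstructured log lines with a regular expression and emits the csv format described above (header line, then row number, timestamp and log-specific columns). The raw line format and column names are hypothetical, not ELMI's actual formats.

```python
import csv
import io
import re

# Hypothetical raw line: "2009-05-12 06:24:31 LOGIN bus=111 line=2132"
LINE_RE = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<msg>\w+) bus=(?P<bus>\d+) line=(?P<line>\d+)"
)

def preprocess(raw_lines):
    """Turn unstructured log lines into csv text: a header line with
    column names, then row number, timestamp and log-specific data."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["row", "timestamp", "msg", "bus", "line"])
    for rownum, line in enumerate(raw_lines, start=1):
        m = LINE_RE.match(line)
        if m:  # lines that do not parse are skipped
            writer.writerow([rownum, m["timestamp"], m["msg"],
                             m["bus"], m["line"]])
    return out.getvalue()

raw = ["2009-05-12 06:24:31 LOGIN bus=111 line=2132"]
print(preprocess(raw))
```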
Sometimes the analysis needs system-specific utility data for decisions in the behavior mining phase. This is called System Specific Data (SSD) and it is not log data. SSD files can come from system documentation and databases without any standard structure or format, as in our case study. SSD can be e.g. spatial data, like position boundary boxes for bus stops in an RTPIS. SSD is parsed using System Specific Code (SSC) written in Python.
[Figure: the user executes the BMU, which reads command line parameters and BML input, reads and writes the global BMS variable dictionary, and executes ESUs through the ESU interface. The ES algorithm (Basic System Code with optional System Specific Code) reads the pre-processed log file and SSD files (e.g. position boundary boxes) and returns Found, Not found or an Exit transition. State transition functions write the SBK and RCL result files, which external tools such as spreadsheets read to produce knowledge results like histograms and visualizations. Numbered arrows 1-17 mark the data and control flows referenced in Section 3.1.]
Fig. 2. Detailed architecture of the behavior mining phase of LOGDIG.
The detailed behavior mining process is presented in Fig. 2. We first describe the concepts and functional units. There are two main functional units, the Behavior Mining Unit (BMU) and the Event Search Unit (ESU), and one global data structure named Behavior Mining Storage (BMS). There can be many ESUs, one BMU and one BMS in LOGDIG.
BMS is a global, common dictionary-type data structure, which includes the variable names and their values used in LOGDIG. The initial BMS can be described in the BML language or given as command line parameters. It is possible to add new variables and values to BMS on the fly. The variables can be grouped as follows: command line, internal, and result variables. The result variables are further grouped as identify, log, time, searched, calculated, counter, error flag and warning flag variables (see the example in Table 2). BMS can be implemented as a shared memory structure, in which case it is very fast and can be used by many (external) tools and commands, even in real time.
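The description above can be sketched as a plain Python dictionary; this is an illustrative minimal model, not the actual LOGDIG implementation, and the command line syntax (`name=value`) is an assumption.

```python
class BMS(dict):
    """Behavior Mining Storage: a global dictionary of variable names
    and values. Initial content may come from command line parameters;
    new variables and values can be added on the fly."""

    def __init__(self, cmdline_args=None):
        super().__init__()
        # e.g. cmdline_args = ["line=2132", "sign=1031"]
        for arg in cmdline_args or []:
            name, _, value = arg.partition("=")
            self[name] = value

bms = BMS(["line=2132", "sign=1031"])
bms["RTAT_MSG"] = "6:42:33"  # a result variable set during mining
print(bms["line"], bms["RTAT_MSG"])
```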
BMU consists of five parts: a pre-processor of variable names, the state machine, state transition functions, the ESU interface and the composite timetable. The pre-processor converts the variable names in the state transition functions of a BML code to be usable in Python. The state machine is the main engine of BMU and is described in the BML language. The state transition functions are called from the state machine and they are also described in the BML language.
The ESU interface connects the state machine to an ESU. It converts the execute command and parameters into a form suitable for the ESU, and converts the result of the ESU for the state machine. The ESU interface can be modified depending on how the ESU has been implemented and where it is located relative to BMU. An ESU can be a separate command line command and can even be located on a separate computer. This opens interesting future possibilities for decentralization and performance.
The composite timetable structure collects, for the Refined Composite Log (RCL) file, all time-indexed "print" notifications from state transitions. The timetable is needed because the searched data are not always read from the log files in chronological (forward) order: BMU is capable of re-reading old lines from the same log and of reading older lines from other logs.
The BML language presents the expected behavior with a certain accuracy; it is a straightforward representation of the state machine, the (ESU) states with input parameters and the state transition functions. It can also include initialization of BMS variables and values. State transition functions can include Python code and BMS variables. Because BML is out of the scope of this paper, we do not go into details.
An ESU is a state of the state machine and it includes a state function named the Event Search (ES) algorithm. The ES algorithm includes a Basic System Code (BSC) and optionally a System Specific Code (SSC). BSC is the body of the ES algorithm and SSC is an extension to BSC. BSC reads the Basic System (BS) parameters from the state machine and starts searching the log on their basis. If SSC is in use (set in the mode parameter), the System Specific (SS) parameters are also read.
The Basic System (BS) parameters set the search mode, the log file name expression, the searched BMS log variable names, the time column name, as well as the start and stop time expressions for the ES algorithm. The System Specific (SS) parameters are used in the SSC part of the ES algorithm; e.g. for spatial data, as in our case study, the SS parameters are the latitude and longitude column names and the SSD filename expression. Every search mode has the same common task: search for an event between the start and the stop time (the ESU time windows in Fig. 5) such that the values of the variable names are the same in the log and in BMS.
The search modes can be:
• "EVENT:First": searches for the first matching event in the log
• "EVENT:Last": searches for the last matching event in the log
In the case of SSC, the search modes can also be System Specific (SS) modes (as in our case study):
• "POSITION:Entering": searches the log for the object's entering of a boundary box area (described in SSD)
• "POSITION:Leaving": searches the log for the object's leaving of a boundary box area (described in SSD)
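The basic EVENT modes can be sketched as follows: scan the pre-processed csv rows inside the ESU time window and keep those whose variable values match the current BMS values. This is a minimal illustration of the idea, not the actual ES algorithm; the column names and timestamp format are assumptions.

```python
import csv
import io
from datetime import datetime

def parse_t(s):
    return datetime.strptime(s, "%H:%M:%S")

def event_search(log_csv, mode, match_vars, bms, time_col, start, stop):
    """Find the first or last event between start and stop time whose
    values for match_vars equal the current BMS values.
    Returns ("F", row) on Found or ("N", None) on Not found."""
    rows = list(csv.DictReader(io.StringIO(log_csv)))
    hits = [r for r in rows
            if parse_t(start) <= parse_t(r[time_col]) <= parse_t(stop)
            and all(r[v] == bms[v] for v in match_vars)]
    if not hits:
        return "N", None
    hit = hits[0] if mode == "EVENT:First" else hits[-1]
    return "F", hit

log = ("timestamp,msg,line\n"
       "06:42:33,RTAT,2132\n"
       "06:45:03,RTAT,2132\n")
bms = {"line": "2132"}
result, row = event_search(log, "EVENT:Last", ["line"], bms,
                           "timestamp", "06:00:00", "07:00:00")
print(result, row["timestamp"])  # F 06:45:03
```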
3.1 Mining process
The mining process is described in Fig. 2. The user of LOGDIG starts the BMU execution (1), which reads the command line parameters (2) and the BML language (3) as input. On their basis BMU writes (4) the BMS variable values directly or via a state transition function, and executes (5) an ESU with the search BS parameters (6). If SS mode is in use in the current ESU, the SS parameters (7) are also read. Then the ESU reads (8) the pre-processed log file and, if SS mode is in use, also (9) the SSD file. Then the search starts.
After the ESU has finished the search, it returns (10) the result, which can be "Found" (F), "Not found" (N) or "Exit" (E). If "Found", the ESU writes (11) the found variables from the log file to the BMS variables. Then BMU decides what to do next based on the action attached to the result, as described in the BML language. Depending on the action, BMU runs a possible state transition function, which can read (12) BMS variables. Based on the action, a transition function may write one row to the (13) RCL or (14) SBK knowledge result files.
After that, BMU normally executes (5) a new search by starting an ESU with new parameters. BMU exits the mining phase if there are no searches left or "Exit" has been returned. "Exit" is a problem exception that should never happen, but if it occurs, the user is notified and LOGDIG is closed cleanly.
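The control loop just described can be condensed into a sketch: each state runs its ESU and maps the returned result to a transition function and a next state. This is an illustrative reduction of the BMU engine, not its actual code; the two-state machine at the bottom is hypothetical.

```python
def run_bmu(states, start_state, bms):
    """Sketch of the BMU main loop: execute the current ESU, then take
    the transition attached to the returned result ("F", "N" or "E").
    Each state maps a result to (transition_function, next_state);
    a next_state of None ends the mining, "E" always exits cleanly."""
    state = start_state
    trace = []
    while state is not None:
        esu, actions = states[state]
        result = esu(bms)          # "F", "N" or "E"
        trace.append((state, result))
        if result == "E":
            break                  # problem exception: exit cleanly
        func, state = actions[result]
        if func:
            func(bms)              # state transition function
    return trace

# Hypothetical two-state machine: S1 finds an event, S2 does not.
states = {
    "S1": (lambda bms: "F", {"F": (None, "S2"), "N": (None, None)}),
    "S2": (lambda bms: "N", {"F": (None, None), "N": (None, None)}),
}
print(run_bmu(states, "S1", {}))  # [('S1', 'F'), ('S2', 'N')]
```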
When BMU has completed, the SBK file can be read (15) by an external tool (like a spreadsheet) to write (17) better visualizations of the knowledge results. Any tool supporting the csv format can be used. Another option is to read (16) data directly from the BMS variables with a suitable tool to write (17) other (visualization) knowledge results.
State transition functions can use the framework's BMU library functions, e.g. for setting and writing the RCL and SBK files and for calculating time differences between timestamps. It is also possible to add new library functions as needed. Because the LOGDIG analyzer has been implemented in Python and works on top of any operating system, all features of Python and the operating system can be used as a platform in BMU and ESU.
4 Case study: EPT-case in ELMI
The real time passenger information system ELMI [14], deployed in Espoo, Finland in 1998-2009, displayed the waiting times of buses on bus stop monitors. ELMI included 300 buses and 11 bus stop monitors [15], 3 radio base stations, a central computer system (CCS) and a communication server (COMSE). Buses sent line login and location information to CCS via the radio base stations every 30 seconds. CCS then calculated the waiting time values and sent them to the bus stop monitors via COMSE. Because all messages of the system went through COMSE, it was also used for logging and for running the LOGDIG analyzer. There are 3 types of original logs: CCS, COMSE and BUS. Every bus has its own BUS log identified by the bus number. In the pre-processing phase of LOGDIG, the original logs are divided into message-specific log files: CCS_RTAT, CCS_AD, BUS_LOGIN and BUS_LOCAT. That makes the analysis faster and simpler.
We have named the main requirement EPT (Estimated Passing Time), to see when the bus passes the stop. The expected behavior can include one or more EPTs.
Other definitions include RTAT (Real bus Theoretical Arrival Time), including the bus number, line and monitor, as well as AD (Advanced or Delayed) compared to RTAT. The sequence of requirements for one EPT case is: 1) a bus driver sent a login for a line in a bus terminal, 2) the bus started to drive from the terminal, 3) when the bus was on the line and if necessary (started, delayed or ahead of schedule), CCS calculated estimated waiting times and sent them to the target monitors, 4) the bus arrived at a bus stop and its monitor, and 5) COMSE removed the waiting time from the monitor.
There are two SSDs (System Specific Data) in this case: 1) the customer's requirement document, which requires the estimation error to be below 120 seconds, and 2) the boundary box areas of the bus stops from the database of the ELMI system.
The various EPT cases are processed by the LOGDIG state machine shown in Fig. 4. It can search and analyze all EPTs in a time window of Expected Behavior (EB) for specific lines and a monitor, as described in Fig. 3. The time window can be set from the command line or in the BML language.
[Figure: time-of-day axis from 00:00 to 23:59 with the Expected Behavior (EB) time window from its start time to its stop time; the logs CCS_RTAT, CCS_AD, BUS_LOGIN, BUS_LOCAT and COMSE are drawn as parallel rows containing three example EPTs (EPT1-EPT3); a re-read arrow points from EPT2 back to earlier CCS_RTAT lines.]
Fig. 3. Time window from 03:00 to 23:00 and 3 example EPTs in the logs
[Figure: state machine with Start, ESU states S1-S6 and transition functions T1-T11; each ESU state has Found (F), Not found (N) and Exit (E) outcomes, T11 is an on-entry function of S1, and T10 prints the final results.]
Fig. 4. State machine structure of the EPT case study
[Figure: timing diagram of one EPT on the time-of-day axis, showing the searched events 1. RTAT (CCS), 2. LOGIN, 3. START, 4. coordinates and times before and after the stop (S4), 5. REAL PASS (calculated in T6), 6. COMSE PASS and 7. AD, each with its ESU time window; the inputs are the EB start and stop times (or the RTAT start time of the ESU, T11), max login, max rtat and the maximum error bounds (-120 s to +120 s); 8. marks the CCS and COMSE estimation errors (T9).]
Fig. 5. Timing diagram for one EPT
The behavior and the ESU states of the LOGDIG state machine in the EPT case are depicted in Fig. 5. The inputs of the EPT case (from the T1 transition function or the command line) are the start time, stop time, line, max login (the maximum time from LOGIN to RTAT), max rtat (the maximum time from RTAT to PASS) and the maximum error values (SSD 1: -120 to 120 seconds) for the estimations of CCS and COMSE.
The sequence is as follows:
• S1: Searches for the first RTAT message (inputs: line) from the CCS_RTAT log file in the given ESU time window (see Fig. 5):
─ On-entry function (before searching), calls T11: sets the start time of the ESU time window to search for the next RTAT message, or the original EB start time in the first "round" of the search
─ Found: sets RTAT variables. Calls T2: sets the input variables for the next state.
─ Not found: calls T10: prints the final results.
• S2: Searches for the first LOGIN message (inputs: log type, bus, line, direction) from the BUS_LOGIN log file in the given ESU time window:
─ Found: sets LOGIN variables. Calls T3: sets the input variables for the next state.
─ Not found: calls T4: sets the input variables for the next state.
• S3: Searches for the starting (leaving) place of the bus from the terminal bus stop in the given ESU time window from the BUS_LOCAT log file (input: bus). This needs SSD 2 to check positions:
─ Found: sets LOCAT variables. Calls T5: sets the input variables for the next state.
─ Not found: -
• S4: Searches for the arriving place of the bus at the target bus stop in the given ESU time window from the BUS_LOCAT log file. This needs SSD 2 to check positions:
─ Found: sets LOCAT variables. Calls T6: sets the input variables for the next state. Calculates the real passing time of the bus.
─ Not found: -
• S5: Searches for the first BQD message (inputs: bus, line) from the COMSE log file in the given ESU time window:
─ Found: sets BQD variables. Calls T7: sets the input variables for the next state.
─ Not found: calls T8: sets the input variables for the next state.
• S6: Searches for the last AD message (inputs: line, direction, bus) from the CCS_AD log file in the given ESU time window:
─ Found: sets AD variables. Calls T9: calculates the errors of the estimated waiting time, writes one row to the SBK file and writes the RCL knowledge from the composite timetable structure to the RCL file.
─ Not found: -
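The position checks in S3 and S4 rely on the SSD boundary boxes. A minimal sketch of the POSITION modes, assuming hypothetical coordinates and a simple rectangular box (this is not the actual SSC code of ELMI):

```python
def in_bbox(lat, lon, bbox):
    """bbox = (lat_min, lat_max, lon_min, lon_max)."""
    lat_min, lat_max, lon_min, lon_max = bbox
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max

def position_search(samples, bbox, mode):
    """Find the location sample where the object enters
    (outside -> inside) or leaves (inside -> outside) the boundary
    box described in the SSD. samples: (timestamp, lat, lon) tuples."""
    for prev, cur in zip(samples, samples[1:]):
        was_in = in_bbox(prev[1], prev[2], bbox)
        is_in = in_bbox(cur[1], cur[2], bbox)
        if mode == "POSITION:Entering" and not was_in and is_in:
            return "F", cur
        if mode == "POSITION:Leaving" and was_in and not is_in:
            return "F", cur
    return "N", None

# Hypothetical bus stop boundary box and three location samples.
stop_bbox = (60.20, 60.21, 24.65, 24.66)
track = [("6:45:10", 60.190, 24.650),
         ("6:45:40", 60.205, 24.655),
         ("6:46:10", 60.220, 24.650)]
print(position_search(track, stop_bbox, "POSITION:Entering"))
```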
There is an exit transition in every ESU state in case a problem occurs in reading a log file, e.g. the file is missing or there is a syntax error. The variables and their values (the BMS variables in Fig. 2) come directly from the log files and the command line initialization parameters, but there are also result-specific variables in the SBK files. The result variables of the SBK file are set in almost every state and state transition function, and they are added (as one row) to the SBK file after one EPT has been successfully analyzed, in the last transition (T9 in Fig. 4). The RCL knowledge is written in almost every state transition function to the composite timetable structure (see Fig. 2), and in T9 it is added to the RCL file. The search time windows in every state can vary depending on the (initialization variables and) variable values (timestamps of found events) given by the previous states.
Time indexing and the possibility to read already processed log lines are utilized in two ways; see the re-read between EPT1 and EPT2 in Fig. 3. When searching across different log files, the RTAT message from the CCS_RTAT log is read before the LOGIN message from the BUS_LOGIN log, even though LOGIN (EPT requirement step 1) comes before RTAT (EPT requirement step 2) in time. This way we can limit the complexity of the search and analyze only a certain RTAT and its line or monitor. Within the same log file, RTAT or AD messages exist for many EPT cases; if we want to analyze only one monitor and its line, we re-read the old log lines.
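Such re-reading is possible because the rows are accessed by time index rather than consumed as a forward stream. A minimal sketch of the idea, with hypothetical timestamps and payloads:

```python
from bisect import bisect_left, bisect_right

class IndexedLog:
    """Index-based log access: rows are kept in memory sorted by
    timestamp, so a search window may start before lines that were
    already consumed, i.e. old lines can be re-read."""

    def __init__(self, rows):
        # rows: (timestamp_str, payload) tuples; zero-padded HH:MM:SS
        # strings compare correctly in chronological order.
        self.rows = sorted(rows)
        self.times = [t for t, _ in self.rows]

    def window(self, start, stop):
        lo = bisect_left(self.times, start)
        hi = bisect_right(self.times, stop)
        return self.rows[lo:hi]

log = IndexedLog([("06:24:31", "LOGIN"), ("06:42:33", "RTAT"),
                  ("06:45:03", "AD")])
# The RTAT window is searched first ...
print(log.window("06:40:00", "06:43:00"))
# ... then an earlier window re-reads the older LOGIN line.
print(log.window("06:20:00", "06:30:00"))
```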
4.1 Case study results
The content of the SBK file in the ELMI case study is given in Table 2. It describes three example EPTs (a bus driving on line 2132 towards bus stop monitor 1031), which are also shown in Fig. 6. Let us consider a more detailed reading of the first EPT (EPT1). At 6:24:31 (LOGIN_MSG) in the terminal, the bus driver (bus number 111) logged in to line 2132, direction 1. Then at 6:25:01 (DRIVE) the
bus started from the terminal. At 6:42:33 (RTAT_MSG) CCS sent the first estimated arrival time (6:45:52, RTAT_VAL) to the monitor. At 6:45:03 (AD_MSG) CCS sent the last time correction for the waiting time; AD_VAL -25 tells that the bus was 25 seconds ahead of schedule. At 6:45:40 (PASS) the bus passed the bus stop; this is the estimation of LOGDIG. COMSE estimated the arrival time to be 6:46:02 (BQD_MSG). That means there was a 13 s error (PASS_TIME_ERR) between the estimated arrival time (RTAT_AD_VAL: 6:45:27) and the real arrival time (PASS: 6:45:40). The error of the COMSE estimation was -22 seconds (BQD_TIME_ERR). Because the absolute values of the errors are smaller than 120 seconds, the error flag EPT_ERR is 0. The warning flags are 0, which means that the LOGIN and BQD messages were found in the logs. In this case the whole EPT lasted from 6:24:31 (LOGIN_MSG) to 6:46:02 (BQD_MSG), in total 21:31 minutes.
Table 2. Three example EPTs (on line 2132 and bus stop monitor 1031) of the SBK file

Group        Num  Variable             EPT1     EPT2     EPT3
Identify      1   SBK_ID               13       35       52
Log           2   BUS                  111      451      228
              3   LINE                 2132     2132     2132
              4   DIR                  1        1        1
              5   SIGN                 1031     1031     1031
Time          6   RTAT_SRCH_TIME       6:36:01  7:55:47  8:48:08
Searched      7   LOGIN_MSG (S2)       6:24:31  7:48:17  8:43:09
              8   DRIVE (S3)           6:25:01  7:48:47  8:44:10
              9   RTAT_MSG (S1)        6:42:33  7:55:47  8:48:08
             10   RTAT_VAL (S1)        6:45:52  7:59:02  8:51:18
             11   PASS (S4)            6:45:40  8:01:57  8:51:45
             12   BQD_MSG (S5)         6:46:02  8:02:17  8:52:08
             13   AD_MSG (S6)          6:45:03  8:01:48  8:51:09
             14   AD_VAL (S6)          -25      166      11
Calculated   15   RTAT_AD_VAL (T9)     6:45:27  8:01:48  8:51:29
             16   PASS_LOGIN (T9)      1269     820      516
             17   PASS_DRIVE (T9)      1239     790      455
             18   PASS_RTAT (T9)       187      370      217
             19   PASS_TIME_ERR (T9)   13       9        16
             20   BQD_TIME_ERR (T9)    -22      -20      -23
Error flags  21   EPT_ERR              0        0        0
Warning      22   LOGIN_WARN (T4)      0        0        0
flags        23   BQD_WARN (T8)        0        0        0
The same EPTs are also presented in visual form in Fig. 6. The visual presentation has been generated from the SBK file using gnuplot and the Tcl script language. The horizontal axis represents the time of day and the vertical axis the time differences relative to the passing time. The values can be found directly from the SBK variable numbers 16-20.
Fig. 6. Example of visual representation of SBK-file (bus line 2132 and monitor 1031)
A few things can be seen in Fig. 6. The times between the 14 EPTs vary a lot. For example, the "LOGIN time" at about 8:00-9:00 and 16:00-17:00 seems to be smaller; these are typically the morning and afternoon rush hours. We also see that the real waiting time values shown on the monitors were quite short (RTAT time: about 200 s), but that is because in this case the trip from the terminal bus stop to the monitor was quite short. The estimation of waiting times seems to have worked quite well, because the error values (WT error and BQD error) are between -120 and +120 seconds in all EPT cases. These were quite normal results. Additionally, traffic jams and other problems can easily be seen from such a visual presentation.
All detected errors can later be explored statistically from the SBK file to get more detailed information on the causes of the errors, for example bus-, line- and sign- (bus stop) specific errors such as the following error percentages:
BUS,463 = 1 / 24 = 4.2 %
BUS,53 = 1 / 29 = 3.4 %
LINE,2132 DIR 1 = 1 / 42 = 2.4 %
LINE,2132 DIR 2 = 4 / 139 = 2.9 %
SIGN,1033 = 1 / 62 = 1.6 %
SIGN,1061 = 3 / 77 = 3.9 %
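Such percentages follow directly from grouping the SBK rows by a key column and dividing erroneous EPTs by the total. A hedged sketch (the row dictionaries are hypothetical, not the actual SBK reader):

```python
from collections import Counter

def error_percentages(sbk_rows, key):
    """Group SBK rows by a key column (e.g. BUS, LINE or SIGN) and
    report errors / total as a percentage."""
    totals, errors = Counter(), Counter()
    for row in sbk_rows:
        totals[row[key]] += 1
        if row["EPT_ERR"] != 0:
            errors[row[key]] += 1
    return {k: 100.0 * errors[k] / totals[k] for k in totals}

# Hypothetical SBK rows: one error out of two EPTs for bus 463.
rows = [{"BUS": "463", "EPT_ERR": 0}, {"BUS": "463", "EPT_ERR": 1}]
print(error_percentages(rows, "BUS"))  # {'463': 50.0}
```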
These kinds of results gave valuable "feedback" knowledge for understanding the real behavior of the system. The results also helped to fix bugs and behavior of the system and thus improve its quality.
The execution time for analyzing one bus line (2132 in this case study) was about 15 seconds on a normal PC. There were 181 EPTs in total in the analysis; the other EPTs were for other bus stop monitors. The execution time depends on the amount of daily log data and the analyzed bus line. Three key bus lines and their monitors have been used for analyzing the ELMI system.
LOGDIG can also be used to detect anomalies, like work flow errors or low performance of the execution. For example, a work flow error can be checked by the structure of the state machine, and low performance by the time window limits of the ESUs.
5 Conclusion
We have introduced the LOGDIG analyzer framework, which is capable of analyzing very complex expected behavior from logs. LOGDIG was motivated by the fact that no other tool was flexible enough, even though LFA is very close to the needs. Visualizations and other statistical post-processing have been left out, since plenty of tools are available for them.
LOGDIG is best suited for complex behavior problems, since the setup takes some time compared to e.g. simpler command line search tools. To improve the performance, the Python implementation can be further optimized or even written in a compiled language, and the overall execution can be distributed. If the source log files are available on different machines, the ESUs can be distributed directly there.
In the future, LOGDIG could be used as a higher-level analyzer that uses lower-level analyzers; for example, the event search algorithm could be replaced by e.g. LFA/LFA2. To conclude, LOGDIG fulfills the requirements for RTPIS-like applications, and offers a flexible framework for other domains.
6 References
1. Valdman, J. Log file analysis. Technical report, Department of Computer Science and Engineering, University of West Bohemia in Pilsen (FAV UWB), Czech Republic, Tech. Rep. DCSE/TR-2001-04 (2001), pp. 1-51.
2. Oliner, A., Ganapathi, A. & Xu, W. Advances and challenges in log analysis.
Communications of the ACM 55(2012)2, pp. 55-61.
3. Jayathilake, D. Towards structured log analysis. Computer Science and Software
Engineering (JCSSE), 2012 International Joint Conference on, 2012, IEEE. pp. 259-
264.
4. Matherson, K. Machine Learning Log File Analysis. Research Proposal (2015).
5. Viklund, J. Analysis of Debug Logs. Master of Science. 2013. Luleå University of
Technology, Department of Computer Science, Electrical and Space Engineering,
Sweden. 1-39 p.
6. Fu, Q., Lou, J., Wang, Y. & Li, J. Execution anomaly detection in distributed sys-
tems through unstructured log analysis. Data Mining, 2009. ICDM'09. Ninth IEEE
International Conference on, 2009, IEEE. pp. 149-158.
7. Feather, M.S. Rapid application of lightweight formal methods for consistency
analyses. Software Engineering, IEEE Transactions on 24(1998)11, pp. 949-959.
8. Chang, F. & Ren, J. Validating system properties exhibited in execution traces.
Proceedings of the twenty-second IEEE/ACM international conference on Automated
software engineering, 2007, ACM. pp. 517-520.
9. Groce, A., Havelund, K. & Smith, M. From scripts to specifications: the evolution
of a flight software testing effort. Proceedings of the 32nd ACM/IEEE International
Conference on Software Engineering-Volume 2, 2010, ACM. pp. 129-138.
10. Andrews, J.H. & Zhang, Y. General test result checking with log file analysis.
Software Engineering, IEEE Transactions on 29(2003)7, pp. 634-648.
11. Aulenbacher, I.L. Generating Log File Analyzers. Master of Science thesis, The University of Western Ontario, London, Ontario, Canada (2012), pp. 1-88.
12. Leal-Aulenbacher, I. & Andrews, J.H. Generating C Log File Analyzers. WSEAS Transactions on Information Science & Applications 10(2013)10.
13. Andrews, J.H. & Zhang, Y. Broad-spectrum studies of log file analysis. Proceed-
ings of the 22nd international conference on Software engineering, 2000, ACM. pp.
105-114.
14. Aaltonen, J. Implementation of GPS based real time passenger information sys-
tem. Licentiate in Technology. 1998. Tampere University of Technology. 1-76 p.
15. Heikkinen, E. Informaatiotaulun protokollakortin ohjelmisto. Master of Science.
1996. Tampere University of Technology. 1-68 p.