=Paper= {{Paper |id=Vol-2009/fmt-proceedings-2017-paper15 |storemode=property |title=A Bigram Supported Generic Knowledge-Assisted Malware Analysis System: BiG2-KAMAS |pdfUrl=https://ceur-ws.org/Vol-2009/fmt-proceedings-2017-paper15.pdf |volume=Vol-2009 |authors=Niklas Thür,Markus Wagner,Johannes Schick,Christina Niederer,Jürgen Eckel,Robert Luh,Wolfgang Aigner |dblpUrl=https://dblp.org/rec/conf/fmt/Thur0SNELA17 }} ==A Bigram Supported Generic Knowledge-Assisted Malware Analysis System: BiG2-KAMAS== https://ceur-ws.org/Vol-2009/fmt-proceedings-2017-paper15.pdf
  A Bigram Supported Generic Knowledge-Assisted
     Malware Analysis System: BiG2-KAMAS
   Niklas Thür (1), Markus Wagner (1), Johannes Schick (1), Christina Niederer (1), Jürgen Eckel (3),
   Robert Luh (2), Wolfgang Aigner (1)
   (1) Institute of Creative\Media/Technologies, St. Pölten University of Applied Sciences, Austria
   (2) Josef Ressel Center for Unified Threat Intelligence on Targeted Attacks, Austria
   (3) IKARUS Security Software GmbH, Austria
   Email: (1, 2) first.last@fhstp.ac.at, (3) eckel.j@ikarus.at


   Abstract: Malicious software, or “malware” for short, refers to software programs that are
designed to cause damage or to perform unwanted actions on the infected computer system.
Behavior-based analysis of malware typically utilizes tools that produce lengthy traces of
observed events, which have to be analyzed manually or by means of individual scripts. Due to the
growing amount of data extracted from malware samples, analysts are in need of an interactive tool
that supports them in their exploration efforts. In this respect, the use of visual analytics
methods and stored expert knowledge helps the user to speed up the exploration process and,
furthermore, to improve the quality of the outcome. In this paper, the previously developed KAMAS
prototype is extended with additional features, such as the integration of a bi-gram based
valuation approach, to cover further needs of malware analysts. The result is a new prototype,
which was evaluated by two domain experts in a detailed user study.

                        I. INTRODUCTION

   Malicious software, or malware for short, is one of the biggest threats to computer systems
these days [1]. ‘Malware’ refers to software programs that are designed to cause damage or perform
other unwanted actions on a computer or network. Malware therefore plays a big part in most
computer intrusions and security incidents. Malware includes, inter alia, viruses, trojan horses,
worms, rootkits, scareware, and spyware [1]. By now there are millions of malicious programs, and
the number is increasing every day.
   “Malware analysis is the art of dissecting malware to understand how it works, how to identify
it, and how to defeat or eliminate it” [1]. In malware analysis, there are two basic approaches to
examine a malware program: the static and the dynamic approach. Often the malware analyst only has
the potentially malicious executable, which includes the machine code but is not human-readable.
Therefore, static malware analysis involves the investigation of the malware executable as well as
certain reverse-engineering tasks to recover the sample’s source code. On the other hand, dynamic
analysis requires the execution of the malicious software on, e.g., a virtualized host machine to
detect the malware’s run-time behavior [1]. To cover all of the malware analyst’s needs, Wagner et
al. [2] performed a problem characterization and abstraction elaborating the analysts’ needs in
relation to behavior-based malware analysis. In the article by Wagner et al. [3], a design study
for a behavior-based knowledge-assisted malware analysis system (referred to as KAMAS) is
described. The malware analyst’s workflow involves the tasks of examining potentially malicious
behavior patterns, selecting them, categorizing them, and storing the found rules in the knowledge
database (KDB) [3]. We developed an interactive prototype, the Bi-Gram supported Generic
Knowledge-Assisted Malware Analysis System (BiG2-KAMAS) [4], to extend the KAMAS design study [3]
with new features. A focus group meeting with members of an Austrian IT security company, the
information security department of St. Pölten UAS, and the developers of the initial KAMAS
prototype was conducted to identify the tasks and needs for additional features requested by the
IT security company to extend the KAMAS design study [3]. Based on this feature list, the paper at
hand contributes the following:
   1) Integrating a generic data loading process enabling KAMAS to load any kind of data, based on
      a given structure;
   2) Storing benign rules and highlighting them when new cluster files are loaded, thereby
      supporting the analyst;
   3) Identifying malicious or benign call sequences by including a bi-gram based valuation;
   4) Presenting in detail two user studies validating the new features.
   This paper is structured as follows: Sect. II provides background knowledge about the work of
our collaborators and related work in the field of malware analysis. In Sect. III we describe the
prototype’s design, visualization methods and implementation. Furthermore, Sect. IV defines the
integration of additional knowledge in the prototype’s knowledge database. Sect. V shows the
prototype’s evaluation method, while results are discussed in Sect. VI.

                        II. RELATED WORK

   Shiravi et al. [5] published a survey related to network security visualization, comparing the
data sources and visualization techniques of thirty-eight different systems. Furthermore, Egele et
al. [6] presented a general literature survey of malware analysis techniques and tools. In their
work, they surveyed different approaches for dynamic automated malware analysis and compared them
based on their analysis techniques.








Fig. 1. The BiG2-KAMAS prototype and its three sections: Section 1 shows the knowledge base including the KDB (1a) with its new category for benign
activity. Beneath the knowledge base, the highlighting filters are displayed (1b). Section 2 shows the rule exploration area including the bigram visualization (2b)
and the new color highlighting for benign rules (2a). Finally, section 3 shows the call exploration area.



Likewise, Bazrafshan et al. [7] surveyed various heuristic malware detection techniques as well as
malware obfuscation techniques. Additionally, Wagner et al. [8] published a survey of 25 different
visualization systems for malware analysis. The objective of their work was to compare the
visualization methods and features of these systems and to categorize them along their novel
‘Malware Visualization Taxonomy’. Furthermore, McNabb and Laramee [9] published a survey of
surveys: Mapping The Landscape of Survey Papers in Information Visualization.
   In 2017, Wagner et al. [3] published a paper on a Knowledge-Assisted Malware Analysis System,
referred to as KAMAS. In their user study, they found that the experts are not only interested in
visualizing patterns. A supportive valuation approach was implemented by Luh et al. [10], [11],
calculating the degree of maliciousness based on system and API call bi-grams. Somarriba et al.
[12] presented another malware detection system, which targets Android malware behavior. In
addition, Marschalek et al. [13] published a system for threat detection using a real-time
monitoring agent to gather all or only selected system events and visualize these using event
propagation trees. Xiaofang et al. [14] published a malware variant detection approach using
similarity search “by processing malware as content fingerprint” [14]. Jain et al. [15] presented
a visual exploration approach for Android binary files. Their approach is based on the
visualization of Android .dex files to analyze and compare malicious Android executables. David et
al. [16] presented “a novel deep learning based method for automatic malware signature generation
and classification” [16]. Wrench and Irwin [17] published an approach in which they identify and
classify Remote Access Trojans (RATs) and other malicious software based on the programming
language PHP.

                        III. PROTOTYPE CONCEPT

   This section describes the new features of the ‘Bi-Gram supported Generic Knowledge-Assisted
Malware Analysis System’ (BiG2-KAMAS), conceptually grounded in the KAMAS prototype [3].

A. Data
   In its current iteration, BiG2-KAMAS bases its visualization on sequential traces of Windows
kernel operations amounting to benign and malicious application behavior in the context of OS and
user-initiated processes. These events are typically abstractions of raw system and API calls that
yield information about the general behavior of an unknown application sample or resident process
[8]. Raw calls may include wrapper functions (e.g. CreateFile) that offer a simple interface to
the application programmer, or native system calls (e.g. NtCreateFile) that represent the
underlying OS or kernel support functions. In the context of BiG2-KAMAS and its data providers,
events are collected directly from the Windows kernel. We employ a driver-based monitoring agent
[13] designed to collect and forward a number of events to a database server. This gives us
unimpeded access to events depicting operations related to process and thread control, image
loads, file management, registry modification, network socket interaction, and more. For example,
a shell event that
creates a new binary file on a system may be simply denoted as a triple
explorer.exe,file-create,sample.exe. Additional information captured in the background includes
various process and thread ID information required to uniquely identify an event within a system
session and to link individual events to a full sequence (trace) needed for further processing
stages. Based on the aforementioned traces, BiG2-KAMAS uses two distinct mechanisms to further
process arbitrary kernel event sequences:
   Pattern inference: Our framework has been developed in concert with an event extraction system
called SEQUIN [11]. SEQUIN uses grammar inference extended with statistical evaluation to
automatically identify and crop relevant sequences (rules) from traces of kernel-level behavioral
data for further processing and visualization. Generally speaking, grammar inference is the
process of computationally assembling a formal ruleset by examining the sentences of an unknown
language [18]. In the information security domain, grammar inference is primarily used for pattern
recognition, computational biology, natural language processing, language design programming, data
mining, and machine learning. Grammar inference has also been proven to be a feasible approach to
anomaly detection, since “algorithmic incompressibility is a necessary and sufficient condition
for randomness” [19]. We use grammar inference as a key component in the process of ‘compressing’
a sequential trace for extracting relevant behavioral patterns.

                                  TABLE I
   OPERATION OF SEQUITUR AFTER [20]. PROPERTY APPLICATIONS (BIGRAM UNIQUENESS, RULE UTILITY)
                        ARE NAMED IN THE REMARKS COLUMN.

   Symbol   String       Grammar        Remarks
   1        a            S → a
   2        ab           S → ab
   3        abc          S → abc
   4        abcd         S → abcd
   5        abcdb        S → abcdb
   6        abcdbc       S → abcdbc     bc appears 2x
                         S → aAdA       bigram uniqueness
                         A → bc
   7        abcdbca      S → aAdAa
                         A → bc
   8        abcdbcab     S → aAdAab
                         A → bc
   9        abcdbcabc    S → aAdAabc    bc reappears
                         A → bc
                         S → aAdAaA     bigram uniqueness
                         A → bc         aA appears 2x
                         S → BdAB       bigram uniqueness
                         A → bc
                         B → aA
   10       abcdbcabcd   S → BdABd      Bd appears 2x
                         A → bc
                         B → aA
                         S → CAC        bigram uniqueness
                         A → bc         B used only 1x
                         B → aA
                         C → Bd
                         S → CAC        rule utility
                         A → bc
                         C → aAd
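
The replacement steps listed in Table I can be approximated offline with a few lines of Python. The sketch below is not Sequitur itself, which builds its grammar incrementally [20]; it is a simpler greedy pair-replacement loop in the spirit of Re-Pair, followed by a rule utility pass. The function names (`pair_counts`, `replace_pair`, `infer_grammar`, `enforce_rule_utility`, `expand`) and the convention that lowercase letters are events and uppercase letters are rules are assumptions made for illustration only.

```python
from collections import Counter


def pair_counts(seq):
    """Count non-overlapping occurrences of each adjacent pair (bi-gram)."""
    counts = Counter()
    next_free = {}  # pair -> first index at which this pair may start again
    for i in range(len(seq) - 1):
        pair = (seq[i], seq[i + 1])
        if next_free.get(pair, 0) <= i:
            counts[pair] += 1
            next_free[pair] = i + 2
    return counts


def replace_pair(seq, pair, symbol):
    """Replace every non-overlapping occurrence of pair by symbol, left to right."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(symbol)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def infer_grammar(trace):
    """Greedy offline inference: the most frequent repeated pair becomes a rule."""
    seq, rules = list(trace), {}
    names = iter("ABCDEFGHIJKLMNOPQRTUVWXYZ")  # "S" is reserved for the start rule
    while True:
        counts = pair_counts(seq)
        if not counts:
            break
        pair, n = max(counts.items(), key=lambda kv: kv[1])
        if n < 2:  # bi-gram uniqueness holds: no pair occurs twice
            break
        name = next(names)
        rules[name] = list(pair)
        seq = replace_pair(seq, pair, name)
    rules["S"] = seq
    return rules


def enforce_rule_utility(rules):
    """Inline every rule that is referenced only once (rule utility)."""
    rules = {k: list(v) for k, v in rules.items()}
    changed = True
    while changed:
        changed = False
        uses = Counter(s for body in rules.values() for s in body if s in rules)
        for name in list(rules):
            if name != "S" and uses[name] == 1:
                body = rules.pop(name)
                for key, value in rules.items():
                    if name in value:
                        idx = value.index(name)
                        rules[key] = value[:idx] + body + value[idx + 1:]
                changed = True
                break
    return rules


def expand(rules, symbol="S"):
    """Expand a rule back into the original event string (lossless check)."""
    return "".join(expand(rules, s) if s in rules else s for s in rules[symbol])


grammar = enforce_rule_utility(infer_grammar("abcdbcabcd"))
print(grammar)  # {'A': ['b', 'c'], 'C': ['a', 'A', 'd'], 'S': ['C', 'A', 'C']}
```

For the example string abcdbcabcd this sketch ends with S → CAC, A → bc, C → aAd, which is the final grammar of Table I, and `expand(grammar)` returns the original string. On other inputs the greedy offline strategy may produce a different grammar than Sequitur would.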
   To achieve inference by compression in a computationally feasible way, we selected an algorithm
that losslessly produces (without changes to order and immutability) a context-free grammar (CFG)
in unsupervised operation. As opposed to context-sensitive grammars, languages created by a CFG
can be recognized in O(n^3) time, which is a relevant distinction for all future parsing efforts.
The choice ultimately fell on Sequitur [20]. Sequitur is a greedy compression algorithm that
creates a hierarchical structure (CFG) from a sequence of discrete symbols by recursively
replacing repeated phrases with a grammatical rule. The output is a compressed representation of
the original sequence. The algorithm creates this representation through the application of two
base properties: rule utility and bi-gram uniqueness. Rule utility requires that every rule is
used at least twice in the grammar, while bi-gram uniqueness requires that no pair of adjacent
symbols occurs more than once. Assuming we have a string abcdbcabcd, where every character
represents an event, the first bi-gram of that trace would be ab, followed by a second bi-gram bc,
and so forth. See Table I for a complete example of the process.
   Sequitur is linear in space and time. In terms of data compression, the algorithm can
outperform other designs that achieve data reduction by factoring out repetition. It is almost as
performant as designs that compress data based on probabilistic predictions [20].
   Bi-gram extraction and scoring: In addition to rule inference, BiG2-KAMAS uses precomputed
maliciousness scores of event bi-grams separately explored using a sentiment-like extraction
system based on the log likelihood ratio (LLR) test [10]. An LLR test is a statistical method used
to test model assumptions, namely the quality of fit of a reference (null) and an alternative
model. When determining the occurrence of rarely observed events, which are often at the core of
malicious traces, likelihood ratio tests show significantly better results than alternatives such
as χ² or z-score tests [21].
   In preparation for sentiment-assisted visualization, we use the LLR method to learn likely
benign and malicious event sequences in big corpora of recorded kernel operations (traces). The
resulting sentiment dictionary can be used to accurately and effectively determine if an
investigated event bi-gram is contextually suspicious. Specifically, we compute the LLR score for
each bi-gram to highlight collocations characteristic of sequences of malicious and benign system
events [10].
   The resulting occurrence counts (shown in Table II) are the basis for this calculation:
Following the approach by [21], we define the number of times both event tokens occur in
combination (k11), the number of times each token has been observed independently from the other
(k12 and k21, depending on the relative position in the bi-gram), and the number of times the
token was not present at all (k22).

                                  TABLE II
                        EVENT OCCURRENCE MATRIX [10]

                 A                !A
        B        k11 = k(AB)      k12 = k(!AB)
        !B       k21 = k(A!B)     k22 = k(!A!B)





   The same process is later applied to the pattern’s general occurrence in a labeled benign
versus malicious corpus. The final result is a normalized sentiment rating ranging from +1.0
(benign) to -1.0 (malicious). Unknown bi-grams are ultimately scored against the resulting
dictionary, the outcome of which is at the core of the bi-gram evaluation feature in the new
BiG2-KAMAS prototype.

B. Visualization Design
   Structure: Wagner et al. [3] describe in their article that since IT-security experts are
commonly familiar with programming IDEs, they used the design concept of IDEs like Eclipse or
Netbeans for their prototype. The updates to the new prototype also follow this design concept.
In contrast to the previous prototype, the new one has an additional view. In this initial view
the KDB is situated on the left side, which can be compared to the project view in Eclipse. On the
right side only the file load buttons are displayed, which can be compared to the initial view of
Eclipse, where no project has been opened yet.
   Coloring: For the rule highlighting as well as the Bi-Gram visualization we selected a
sequential color scheme from red to blue. Red indicates that the rule or bi-gram is malicious, and
blue stands for a benign rule or bi-gram. To avoid problems with red and green hues for colorblind
people [22, p. 124], we used blue instead of green and selected colorblind-safe qualitative colors
from Colorbrewer (http://colorbrewer2.org).
   Layout: The prototype is structured into three parts: knowledge base, rule exploration area and
call exploration area (see Figure 1). On the left side the knowledge base is visualized with its
‘Knowledge Database (KDB)’ (see Figure 1:1a) and the KDB’s color highlighting filters (see Figure
1:1b). The KDB is displayed as a tree, in which each category of the database can have several
subcategories. Each category with subcategories is shown with a box icon (see Figure 1:1a) and the
ones without subcategories are displayed with folder icons. Each rule that is stored in the
database is displayed with a paper icon. Beneath the KDB the ‘Knowledge Base Highlighting’ filters
are displayed (see Figure 1:1b). Each filter can be activated or disabled with its checkbox and
updates the result of the prototype’s filter pipeline and the visualization of the ‘Rule Overview
Table’ (see Figure 1:2a).
   After loading and translating the input file, the system updates the ‘Graphical User Interface’
(GUI) and visualizes new elements. In the middle the ‘Rule Exploration’ area (see Figure 1:2) is
visualized, while the right side contains the ‘Call Exploration’ area (see Figure 1:3).
   In the ‘Call Exploration’ area all the included system or API calls of the loaded input file
are represented in the call table (see Figure 1:2b) as described by Wagner et al. [3]. The rules
included in the input file are visualized in the rule overview table located in the ‘Rule
Exploration’ area (see Figure 1:2a). If the user loads several trace files, each trace file will
be displayed as one rule.
   The background of the third column of the ‘Rule Overview Table’ indicates whether a rule is
fully benign, partially benign, not known, partially malicious or fully malicious. The background
of the malicious rules will be painted in red and the background of the benign rules in blue. The
fully known rules will be displayed in a dark red/blue while the partially known rules are
highlighted in a light red/blue (see Figure 1:1b). The red color highlighting for malicious
activity is adopted from the KAMAS prototype [3]. If a rule is fully known and, therefore,
highlighted in dark red, the rule is included as-is in the KDB. A partially known rule is only a
part of one rule in the KDB. This kind of rule has at least one additional call at the beginning
or at the end of a fully known rule [3]. If an input file was loaded, the system automatically
calculates the knowledge state of each rule. For this purpose, the system compares each rule of
the input file with each rule of the KDB. After the calculation process the system highlights the
rules in the corresponding colors in the rule overview table.
   Bi-Gram Visualization: The rule detail table is located next to the rule overview table (see
Figure 1:2b). The rule detail table automatically updates its content when clicking on a rule in
the rule overview table and represents all system and API calls included in the selected rule.
From left to right, the table displays the unique id as well as the name of the call. The last
column visualizes the new bi-gram based valuation approach for the corresponding calls. As
mentioned before, the prototype uses the bi-gram approach of Luh et al. [10]. A bi-gram is an
n-gram with n = 2. An n-gram, in turn, is a coherent sequence of n elements. In this approach the
elements are system or API calls. Each bi-gram has a score in the range [-1, 1], which indicates
whether this pair of calls is malicious or benign. For bi-gram based valuation, two different
visualization approaches were implemented following a semantic zooming approach: First, if the
width of the bi-gram column is greater than 75px, the prototype visualizes the bi-gram values as
bar charts (see Figure 2:a), whereby each bar starts in the middle of the bi-gram column. If the
bi-gram score is between 0 and -1, the bi-gram is malicious. Therefore, the red bar unfurls from
the middle towards the left side. If the bi-gram score is between 0 and 1, the bi-gram is benign
and the bar is drawn from the middle to the right side in blue. The colors correspond to the KDB
highlighting. The visualization approach was chosen to give the user a quick but still precise
overview of the bi-gram based scores.
   If the width of the bi-gram column is smaller than 75px and the bar charts are therefore hardly
recognizable, the system switches to the second visualization. Here, the bi-gram values are
visualized as a color-filled rectangle (see Figure 2:b). As before, a red rectangle indicates that
the bi-gram is malicious and a blue one stands for a benign bi-gram. To visualize the value of the
malicious or benign bi-gram, the system changes the alpha value of the displayed color. Therefore,
the darker the color, the higher the respective value. Since the difference of an alpha value
between 255 and 240 is





                                                                                  data of these files. Contrary to a loaded Sequitur file, each
                                                                                  entry of the rule overview table represents an entire trace file.
                                                                                  Thus, if the user loads three traces the rule overview table will
                                                                                  have only three rows. Furthermore, due to the fact that the user
                                                                                  analyses several independent trace files the histogram for the
                                                                                  rule occurrence is insignificant. Therefore, only one histogram
                                                                                  for the trace length will be displayed in the rule filter area.
                                                                                     Rearrange: If the rule overview table and the call overview
                                                                                  table are loaded with data, the user can rearrange their content
                                                                                  by clicking on a table’s column. This will re-sort the included
                                                                                  data and update the visualization [3]. The content of the rule
                                                                                  detail table cannot be rearranged since the calls are shown in
                                                                                  their sequential order and should therefore not be changeable.
Fig. 2. The two different visualisation methods of the call bi-grams. The first method visualises
the bi-grams as bar charts (a), whereas the second visualisation uses the alpha channel to show
the severity of the bi-gram (b).

                                                                                     Filter: In the next step the user can reduce the number of
                                                                                  rules or trace files by using the rule/trace and call filters [3].
                                                                                  No matter which files were loaded, the user always has the
                                                                                  opportunity to filter the rules or traces by the included calls
not easy to recognize and every value below 100 is generally                      (events). The user can rearrange the call filters or select a
difficult to see, we decided to implement only four graduation                    specific call in the call overview table to reduce the number of
steps for the alpha value. The visualization with the alpha                       shown rules [3]. Furthermore, the analyst can filter the rules or
value is less precise than the visualization with the bar charts                  specific traces by using the filters in the rule exploration area.
but, at the same time, significantly easier to interpret. Table III               If loading a Sequitur file, the analyst can filter the rules by their
shows the different graduation steps and their value ranges.                      occurrence, length, whether they are equally distributed in the
                                                                                  input file or if they match, partially match, or don’t match the
                           TABLE III                                              stored rules in the KDB [3]. By changing the filter settings, the
     COLOUR GRADUATION STEPS FOR THE ALPHA VALUE BI-GRAM                          included rules in the rule overview table automatically update
                        VISUALISATION.
                                                                                  immediately. If one or more trace files were loaded, the analyst
            Colour         Alpha value     Value ranges                           can only filter the shown traces in the rule overview table by
            blue                200        >= 0.75
                                                                                  their length. In addition, the highlighting and filtering of the
                                                                                  KDB is switched off.
            blue                150        >= 0.5 && < 0.75
                                                                                     Details-on-Demand: If the user wants to analyze a rule or
            blue                100        >= 0.25 && < 0.5                       trace, he/she can open the rule/trace in the rule detail table
            blue                50         >= 0 && < 0.25                         by selecting it in the rule overview table. This will display
                                                                                  all the included calls in the rule detail table in their sequential
            red                 50         < 0 && >= -0.25
                                                                                  order [3]. The bi-grams provide information whether a combi-
            red                 100        < -0.25 && >= -0.5                     nation of two calls is malicious or benign. This should support
            red                 150        < -0.5 && >= -0.75                     the user in finding interesting call sequences more quickly.
            red                 200        < -0.75                                   Extract: Independent of the loaded files the analyst can
                                                                                  add a new rule to the database using two different ways.
                                                                                  One method is to simply select one rule or trace in the rule
C. Interaction                                                                    overview table and simply drag and drop it in one leaf category
   Like the KAMAS prototype of Wagner et al. [3], the BiG2-                       of the KDB. This will add the entire rule or trace file to the
KAMAS’s functionality will be described in accordance to                          database [3]. Alternatively, the analyst can select several calls
the four steps of the visual information seeking mantra of                        of interest in the call overview table and add these by dragging
Shneiderman et al. [23], namely overview, rearrange and filter,                   and dropping them to the KDB. When adding a new rule to
details-on-demand and, extract.                                                   the KDB, a popup window will show up where the analyst
   Overview: The BiG2-Kamas prototype has an additional                           can assign the rule a specific name. If the user has loaded a
initial view where the user can decide whether to load a                          Sequitur file, the system will now update the knowledge state
Sequitur input file or several raw trace files. When the analyst                  for all rules as well as the highlighting in the rule overview
loads a Sequitur file, the rule and call tables will be filled with               table for further analysis.
the rule and call data included in the input file. Each entry
in the rule overview table represents one rule of the loaded                      D. Implementation
cluster. Furthermore, the histograms in the rule exploration                        Since the BiG2-KAMAS prototype is based on the proto-
area give a quick impression of the distribution in the rule                      type of Wagner et al. [3], it also uses a data-oriented design
occurrence and length [3]. When the user loads one or more                        concept [24]. To increase the performance of the prototype,
trace files the rule and call tables will also be filled with the                 the system only works with integer comparisons. Therefore,




                                                                                111
 A Bigram Supported Generic Knowledge-Assisted Malware Analysis System: BiG2-KAMAS

the input data only includes the call ids. A call id can only be translated to the actual call value with an additional translation file. This translation file is also used for the bi-grams. The original bi-gram file has several columns in which only the string values of the system or API calls are stored. To increase the performance and to reduce memory usage, the BiG2-KAMAS prototype generates its own bi-gram file. When starting the prototype, the system uses MD5 hash values to determine whether the translation file or the original bi-gram file has changed. If so, the system converts the original bi-gram file to the translated bi-gram file, in which the integer values of the system calls are also stored. Like the prototype of Wagner et al. [3], the new prototype uses the action pipeline for filter options. This enables dynamic query environments and real-time data operations.

To evaluate the robustness and performance of the BiG2-KAMAS prototype, three different Sequitur cluster-grammar files containing between 10 and 500 rules were used. The file with 500 different rules contained a total of 30,000 system and API calls. To test the bi-gram functionality, a bi-gram file with nearly 117,500 bi-gram entries was loaded. On a machine with a 2.1 GHz dual-core processor and 12 GB of memory, it took the system about four minutes to translate the original bi-gram file to the translated bi-gram file. The malware and bi-gram samples were collected by collaborators in the Josef Ressel Center TARGET of St. Pölten UAS.

            IV. EXTERNALIZED KNOWLEDGE INTEGRATION

As Wagner et al. [3] described in their article, we integrated a knowledge database (KDB) to support the user during their analysis tasks. The KDB is based on the malware behavior schema of Dornhackl et al. [25]. The KDB is located at the left side of the prototype and is implemented in a hierarchical structure (tree structure). In the BiG2-KAMAS prototype the KDB was extended by one additional category to store the benign rule data, namely benign activity. In the current version of the prototype there is only one category to store benign rule data. Each category is displayed with either a box or a folder icon, the category description and the number of included rules in the integrated subfolders. The analyst can add new rules by drag & drop. When adding a new rule, the KDB automatically unfolds closed categories. Additionally, a popup window opens in which the analyst can enter a rule name. To investigate a rule stored in the KDB, the user can open a context menu by right-clicking on the chosen rule. The context menu will show two different menu items, namely ‘Information’ and ‘Delete’. The information menu item opens a popup window in which the analyst is presented with the following information:

  • Assigned Concept: This information tells the analyst in which schema category (concept) the rule is currently categorized. The assigned concept is implemented as a selection list to give the user the opportunity to change the assigned concept. For that purpose, the analyst must select a different concept in the list and press the save button at the bottom of the popup window.
  • Rule Name: Here, the actual rule name is displayed. The rule name is implemented as a text field to quickly change it if necessary.
  • Included Calls: Finally, the calls included in the stored rule are displayed in a table. Thus, the calls are visualized in their sequential order and each call will be shown with its unique call id, which corresponds to the call id of the translation file, and the actual call value. In the current version of the prototype it is only possible to investigate the included calls in their sequential order, but not to delete specific calls which are listed in the table.

The second menu item is the ‘Delete’ item, which allows the analyst to delete the currently selected rule. Furthermore, when selecting a concept instead of a rule, the BiG2-KAMAS prototype will show a context menu with which the user can disable a category and all its integrated subcategories. Thus, the analyst can disable the entire KDB or only specific categories. If the user disables a category, all the included rules will no longer be considered in the knowledge base highlighting and filtering.

When the user clicks the right mouse button to open the corresponding context menu before selecting a rule or category, the system automatically selects the rule/category at the current mouse position.

Searching: If the user searches for interesting rules or specific calls or call groups, he/she can use the call filter options to reduce the data to be analyzed. In the call exploration area, the user can search for a specific call by entering its name or use regular expressions to find an entire call group. Beneath the search text field the user can enable case-sensitive search with the corresponding checkbox ‘Case Sensitive’. Filtering or searching the calls affects the data shown in the call overview and rule overview table. Additionally, to find rules of interest the analyst can use the rule exploration filters or the knowledge base filters.

                  V. PROTOTYPE EVALUATION

This section describes the procedure of the performed user studies, the specific results, as well as further feature requests. For the prototype validation, a user study with two domain experts was conducted. The domain experts validated the functionality as well as the visual design interface.

Participants: Both participants work at St. Pölten UAS and have more than five years of experience in the field of malware analysis. The first participant is between 30 and 39 years of age, male, and holds a master’s degree. The second participant is between 60 and 69 years of age, male, and holds a PhD. Generally, both participants are well experienced in this field and can be categorized as experts.

Design and Procedure: Each participant was interviewed individually and had already tested the previous version of the prototype at least once. First, the participants received a short introduction to the new features of BiG2-KAMAS and also a quick reminder of the basic features and workflow. The participants were asked to mention additional missing functionalities and to criticize all potential usability issues.
Both participants took part in the same two scenarios: First, the participants had to load a Sequitur file, investigate the loaded rules and filter specific call sequences. At the end they had to store a rule in the KDB and name it. In the second scenario, the participants had to load three trace files. They were asked if they perceived any differences when loading trace files instead of a Sequitur file. At the end they had to investigate a rule stored in the KDB and move it to a different category.

Equipment and Materials: The latest version of the BiG2-KAMAS prototype was used in the evaluation. For the first user scenario, the participants had to load a Sequitur file with about 500 rules and 30,000 system and API calls. In the second scenario, three trace files with a length between ten and fifteen calls were used. The bi-gram file had a total number of about 117,000 bi-grams. The translated bi-gram file had already been generated so that the participants did not have to wait until the system finished the translation process. As evaluation equipment, two different setups were used. Both participants worked on a 13 inch MacBook Pro with a Retina display (screen resolution of 2560x1600) and a mouse for navigation. Participant 1 worked with an additional 20 inch monitor with a full HD screen resolution and an external keyboard. Each user test was conducted with the same version of the BiG2-KAMAS prototype and was documented on paper.

A. Results

The following section discusses the results of both scenarios, ‘Scenario 1’ (Sequitur file) and ‘Scenario 2’ (trace files). Neither participant had problems loading the different files for the user scenarios.

Scenario 1: Loading and analyzing a Sequitur file.
Both participants quickly recognized the additional color scheme for the new benign category. The colors for the knowledge base highlighting were assessed as easily understandable and the additional rule counter next to the knowledge base filters was mentioned as being very useful. Participant 1 mentioned that if a rule in the rule overview table is highlighted, it would be useful to know which rule or rules of the KDB match this rule in the table. Therefore, a tooltip which tells the user the names of the matching rules of the KDB would be helpful. Furthermore, participant 2 suggested always showing the rule counter of the KDB’s categories. If there are currently no rules in a category, the counter should be zero.

When participant 2 first saw the bar chart bi-gram visualization, he assumed it visualizes the occurrence of the combined call sequence. In contrast, the alpha color visualization was immediately recognized as an indicator for maliciousness or benignity. Participant 1 also mentioned that the alpha color visualization is easier and faster to recognize. Furthermore, both participants mentioned that the color visualization is not as precise as the bar chart visualization and therefore would only be useful for initial malware classification. Participant 1 suggested an additional tooltip to display the accurate bi-gram value. Participant 2 remarked that it would be more useful if the calls in the call overview table only showed the beginning and the end of the call’s value. This would simplify finding a specific call in a group of similar calls. Additionally, he recommended a search button for the regular expression call filter. This could help some users, since currently it is only possible to search by pressing the enter key. Adding a new rule to the KDB was no challenge for either participant and both valued the ability to give the rule a specific name.

Scenario 2: Loading and analyzing three trace files.
Neither participant had difficulties with loading the three trace files. They also recognized quickly that each entry in the rule overview table now represents one trace. Neither of them realized that the knowledge base filters and highlighting were disabled. Participant 1 suggested graying out the knowledge base filters to make it clear that these are disabled. Participant 2 proposed changing the headings for the trace file analysis view in order to avoid confusion. He remarked that it could be misleading if the headings say e.g. ‘Rule Overview Table’ when analyzing a trace file. Furthermore, both participants recommended changing the occurrence column in the rule overview table to the file names of the traces. As the last task, the participants had to change the corresponding category of a random rule. Although both participants solved this task easily, both remarked that it would be useful if the user could move a rule from one category to another via drag & drop.

B. Result Analysis

This section gives an overview of the issues which were mentioned during the expert reviews. Like Wagner et al. [3], each issue was rated based on Nielsen’s [26] severity ratings. Table IV shows the potential new features noted by the test persons and includes three columns: ‘feature requests’ (FR), ‘severities’ (SE) and the effort it would take to implement these changes [3]. The features mentioned in the table include small cosmetic changes as well as real usability improvements. The only feature mentioned by all participants is an additional tooltip which shows the actual bi-gram values.

                                       TABLE IV
LIST OF REMARKED FEATURE REQUESTS AND SEVERITIES AND THE EFFORT IT WOULD TAKE TO IMPLEMENT THEM IN THE PROTOTYPE. (FR: 1 = NICE TO HAVE, 2 = GOOD FEATURE, 3 = ENHANCES USABILITY; SE: 1 = MINOR, 2 = BIG, 3 = DISASTER; EFFORT: 1 = MIN, 2 = AVERAGE, 3 = MAX) [3].

Description                                                                                        FR   SE   Effort
KDB: Move a rule to another category by using drag & drop.                                         2    1    1
KDB: Show the rule counter even if zero rules are included.                                        1    -    1
KDB: Gray out the knowledge base filters if they are disabled.                                     2    1    1
Tables: Highlighted rules in the rule overview table should show the KDB’s corresponding rules.   3    2    3
Tables: Change the occurrence column to the trace file names.                                      2    1    2
Tables: Show only the begin and the end of the calls in the call overview table.                   3    2    2
Tables: Implement a search button for the call regex search.                                       1    -    1
Bigram: Tooltip to show the bi-gram values.                                                        3    -    1
Headings: Change the headings when loading trace files.                                            2    -    1





              VI. DISCUSSION & REFLECTION

The performed user studies described in Section V confirmed that the four feature requests defined in Section I are fulfilled by the BiG2-KAMAS prototype:

1) Generic data loading: The BiG2-KAMAS prototype is structured to enable the generic loading of data sequences. To make this possible, the input data as well as the prototype’s database are based on unique identifiers (ids) instead of the actual values. Thus, all system-internal comparisons are based on integer values instead of string values. Only with the corresponding translation table can the system translate the ids to the actual values. Thus, it is possible to load data sequences independent of their actual values as long as there is a translation table through which the prototype can translate the data. Furthermore, the system was adapted to also offer the opportunity to load raw system or API call based traces. In this state the KDB highlighting and filtering is disabled but the user can explore the loaded trace files and add new rules to the KDB. The prototype can load not only Sequitur call sequences, but also independent data sequences as long as the data sequence has the given structure and a translation file.

2) Extend the KDB with benign rules: To fulfill this requirement the KDB was extended with an additional category for benign activity. In this category, all rules which are identified as benign can be stored. Additionally, the KDB’s highlighting and filter pipelines were extended to identify and filter partially and fully benign rules. Rules with a partially or fully benign knowledge state are highlighted in blue in order to avoid the combination of the colors red and green.

3) Implementation of bi-gram based valuation: To support the bi-gram approach of Luh et al. [10], the prototype’s rule detail table was adapted. Since many domain experts mentioned [3] that the arc-diagram visualization is not very helpful, it was replaced by the bi-gram visualization. Bi-gram based valuation is implemented with two different approaches. If the width of the bi-gram column is bigger than 75px, the valuation is visualized with bar charts and colored in red (malicious) or blue (benign). If the width is less than 75px, the bi-gram visualization uses the alpha channel to show the severity of the bi-gram (see Table III).

4) User studies to validate the new features: The results of the user studies show further feature requests which could be implemented in a future project. However, both participants mentioned that the bi-gram visualization is very helpful for identifying potentially malicious or benign call sequences and, therefore, helps to decide whether a rule is malicious or not.

Future Work: For the behavior-based malware analysis process, it could be valuable to implement a rule creation process where the analyst can build their own rules based on the known system and API calls [27]. Furthermore, it could be beneficial to edit the stored rules in the KDB or to build new rules based on existing patterns. Further avenues for future work are to include possibilities to hide, shrink, and expand areas to provide the user with more flexibility. Moreover, the occurrence column of the Call Exploration area (see 1:3a) could be updated to show the relation to the total number of occurrences included in the loaded file. Additionally, normalizing the occurrence dataset and visualization to this total could be beneficial.

Categorization of BiG2-KAMAS: Like the KAMAS prototype [3], the BiG2-KAMAS prototype can be categorized as a Malware Forensic as well as a Malware Classification tool in the Malware Visualization Taxonomy of Wagner et al. [8]. However, due to the bi-gram based valuation the BiG2-KAMAS prototype offers the malware analyst additional assistance for the Individual Malware Analysis.

                      VII. CONCLUSION

In this work, we presented a design study for a Bi-gram Supported Generic Knowledge-Assisted Malware Analysis System (BiG2-KAMAS). The prototype is based on the KAMAS prototype [3] and extended by additional features such as generic data loading, an extension of the KDB to enable the analysis of benign rules, and the implementation of a bi-gram based valuation approach. The requirements were discussed in a focus group meeting and then implemented as part of a functional prototype. After implementing the new features, two user studies were conducted to evaluate the design and the functionality of the new BiG2-KAMAS prototype.

                      ACKNOWLEDGMENTS

The financial support by the Austrian Federal Ministry of Science, Research and Economy and the National Foundation for Research, Technology and Development is gratefully acknowledged.

This work was supported by the Austrian Science Fund (FWF) via the “KAVA-Time” project (P25489-N23) and by the Austrian Federal Ministry of Science, Research and Economy under the FFG Innovationscheck (no. 856429). We would also like to thank all focus group members and test participants who have agreed to volunteer in this project.

                         REFERENCES

 [1] M. Sikorski and A. Honig, Practical Malware Analysis: The Hands-On
     Guide to Dissecting Malicious Software, 1st ed. No Starch Press, 2012.
 [2] M. Wagner, W. Aigner, A. Rind, H. Dornhackl, K. Kadletz, R. Luh,
     and P. Tavolato, “Problem characterization and abstraction for visual
     analytics in behavior-based malware pattern analysis,” in Proceedings of
     the Eleventh Workshop on Visualization for Cyber Security, ser. VizSec
     ’14. ACM, 2014.
 [3] M. Wagner, A. Rind, N. Thür, and W. Aigner, “A knowledge-assisted
     visual malware analysis system: Design, validation, and reflection of
     KAMAS,” Computers & Security, vol. 67, pp. 1–15, 2017.
 [4] N. Thür, M. Wagner, J. Schick, C. Niederer, J. Eckel, R. Luh, and
     W. Aigner, “BiG2-KAMAS: Supporting knowledge-assisted malware anal-
     ysis with bi-gram based valuation,” in Poster of the 14th Workshop on
     Visualization for Cyber Security (VizSec), Phoenix, Arizona, USA, 2017.
 [5] H. Shiravi, A. Shiravi, and A. Ghorbani, “A survey of visualization
     systems for network security,” vol. 18, no. 8, pp. 1313–1329, 2012.
 [6] M. Egele, T. Scholte, E. Kirda, and C. Kruegel, “A survey on automated
     dynamic malware-analysis techniques and tools,” vol. 44, no. 2, pp. 6:1–
     6:42, 2008.
 [7] Z. Bazrafshan, H. Hashemi, S. Fard, and A. Hamzeh, “A survey on
     heuristic malware detection techniques,” 2013, pp. 113–120.
 [8] M. Wagner, F. Fischer, R. Luh, A. Haberson, A. Rind, D. A. Keim, and
     W. Aigner, “A survey of visualization systems for malware analysis,”
     in Eurographics Conference on Visualization (EuroVis) - STARs. The
     Eurographics Association, 2015.
 [9] L. McNabb and R. S. Laramee, “Survey of surveys sos - mapping
     the landscape of survey papers in information visualization,” Comput.
     Graph. Forum, vol. 36, no. 3, pp. 589–617, Jun. 2017. [Online].
     Available: https://doi.org/10.1111/cgf.13212
[10] R. Luh, S. Schrittwieser, and S. Marschalek, “LLR-based Sentiment
     Analysis for Kernel Event Sequences.” IEEE, 2017.
[11] R. Luh, G. Schramm, M. Wagner, and S. Schrittwieser, “Sequitur-based
     Inference and Analysis Framework for Malicious System Behavior,”
     2017.
[12] O. Somarriba, U. Zurutuza, R. Uribeetxeberria, L. Delosières, and
     S. Nadjm-Tehrani, “Detection and visualization of android malware
     behavior,” vol. 2016, p. e8034967, 2016.
[13] S. Marschalek, R. Luh, M. Kaiser, and S. Schrittwieser, “Classifying
     malicious system behavior using event propagation trees.” ACM Press,
     2015, pp. 1–10.
[14] B. Xiaofang, C. Li, H. Weihua, and W. Qu, “Malware variant detection
     using similarity search over content fingerprint.” IEEE, 2014, pp. 5334–
     5339.
[15] A. Jain, H. Gonzalez, and N. Stakhanova, “Enriching reverse engineering
     through visual exploration of android binaries,” in Proceedings of the 5th
     Program Protection and Reverse Engineering Workshop, ser. PPREW-5.
     ACM, 2015, pp. 9:1–9:9.
[16] O. E. David and N. S. Netanyahu, “DeepSign: Deep learning for
     automatic malware signature generation and classification.” IEEE, 2015,
     pp. 1–8.
[17] P. M. Wrench and B. V. W. Irwin, “Towards a PHP webshell taxonomy
     using deobfuscation-assisted similarity analysis.” IEEE, 2015, pp. 1–8.
[18] A. Stevenson and J. R. Cordy, “A survey of grammatical inference in
     software engineering,” Science of Computer Programming, vol. 96, pp.
     444–459, 2014.
[19] M. Li and P. Vitányi, An introduction to Kolmogorov complexity and
     its applications. Springer Heidelberg, 1997.
[20] C. G. Nevill-Manning and I. H. Witten, “Identifying hierarchical struc-
     ture in sequences: A linear-time algorithm,” J. Artif. Intell. Res. (JAIR),
     vol. 7, pp. 67–82, 1997.
[21] T. Dunning, “Accurate methods for the statistics of surprise and coinci-
     dence,” Computational linguistics, pp. 61–74, 1993.
[22] C. Ware, Information Visualization: Perception for Design. Elsevier,
     2012.
[23] B. Shneiderman, “The eyes have it: a task by data type taxonomy for
     information visualizations,” in Proc. of VL, 1996, pp. 336–343.
[24] R. Fabian, “Data-Oriented Design,” 2013, accessed on Nov. 11, 2015.
     [Online]. Available:
     http://www.dataorienteddesign.com/dodmain/dodmain.html
[25] H. Dornhackl, K. Kadletz, R. Luh, and P. Tavolato, “Malicious behavior
     patterns,” in SOSE. IEEE, 2014, pp. 384–389.
[26] J. Nielsen, Usability engineering. Boston: Academic Press, 1993.
[27] M. Wagner, A. Rind, G. Rottermanner, C. Niederer, and W. Aigner,
     “Knowledge-assisted rule building for malware analysis,” in Proceedings
     of the 10th Forschungsforum der österreichischen Fachhochschulen, FH
     des BFI Wien. Vienna, Austria: FH des BFI Wien, 2016.



