Toward Semantic Assessment of Vulnerability Severity: A Text Mining Approach

Yongjae Lee and Seungwon Shin
Korea Advanced Institute of Science and Technology
Daejeon, Republic of Korea
{ylee.cs, claude}@kaist.ac.kr

Abstract

A security vulnerability is a flaw in software or hardware systems that an adversary could exploit to compromise resources. Despite the never-ending effort to reduce and prevent vulnerabilities, their number has been constantly increasing. To deal with the vulnerabilities that are increasingly found in diverse systems, various methods to prioritize and manage them have been proposed. The de facto standard method used to assess and prioritize vulnerabilities by severity is CVSS (Common Vulnerability Scoring System), and many organizations have been using this system for vulnerability management. However, CVSS is limited in that it takes only some properties (e.g., ease of exploit, impact, etc.) of a vulnerability into account when measuring severity, and hence CVSS scores are often considered inaccurate or impractical. In this paper, we present a semantic approach to assess the severity of vulnerabilities by ranking them. Our ranking method uses relational information about how strongly two vulnerabilities are related or similar to each other. With this ranking method, we try to find which vulnerability has more common characteristics than others, since we believe that if a vulnerability has more common and popularly used characteristics, then it is likely to attract more attack trials. Based on this insight, we evaluate our ranking method with real vulnerability data and show that it can sift out the more critical vulnerabilities effectively.

Copyright © CIKM 2018 for the individual papers by the papers' authors. Copyright © CIKM 2018 for the volume as a collection by its editors. This volume and its papers are published under the Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction

A vulnerability is a flaw which exists in either hardware or software systems and can be used to threaten those systems [ALRL04]. A vulnerability itself is not a problem unless an adversary exploits it to make the systems fail in terms of security. In other words, a vulnerability can be used by malicious people to violate the systems' important security properties, namely Confidentiality, Integrity, and Availability (CIA) [ALRL04]. Therefore, swiftly finding and patching vulnerabilities is one of the most significant concerns for hardware and software manufacturers, security software vendors, and researchers.

Unfortunately, fixing the ever-increasing number of vulnerabilities is labor-intensive and time-consuming, so people want to prioritize vulnerabilities in order to identify the more critical ones. For example, we can decide to fix remotely exploitable vulnerabilities before locally exploitable ones, because the former can easily be exploited by most attackers. To this end, vulnerabilities are managed in a systematic manner: they are given a unique ID and stored in a central database, the NVD (National Vulnerability Database) [NIS17], and their severity is assessed by a severity assessment system.

Common Vulnerabilities and Exposures (CVE) [MSR06, MIT18] is the most popular vulnerability management scheme, which is operated by NVD. If someone finds and asks to register a newly discovered vulnerability, NVD issues a unique ID (CVE identifier) to the vulnerability. Once the vulnerability is registered in NVD, one can look up its related information by the issued CVE identifier, which includes a short description, references, a list of affected products, and a severity score. In particular, the severity score is evaluated by the Common Vulnerability Scoring System (CVSS) [FIR17], the de facto standard for quantifying a vulnerability's severity. With CVSS scores, one can sort vulnerabilities from highest to lowest, which helps prioritize the vulnerabilities to fix.

However, many researchers have argued that CVSS does not account for what security experts perceive in the wild [AM12, AM14]. For example, vulnerabilities with low CVSS scores are ranked in higher positions in bug bounty programs [MM16], and the CVSS scores of randomly selected vulnerabilities do not correlate well with severity scores manually evaluated by experts in the security field [HA15]. More specifically, take Heartbleed as an example. Heartbleed (CVE-2014-0160) is one of the most well-known vulnerabilities and received worldwide attention. It is an implementation flaw in OpenSSL, the most widely used open source encryption library and TLS implementation [Ope18]. This vulnerability can make servers leak confidential data, including the servers' encryption keys, which makes the problem much worse; yet its CVSS score is just 5.0 out of 10.0, a Medium severity level.

In this paper, we present a semantic approach to assess the severity of vulnerabilities (specifically CVEs) by analyzing the descriptions of a CVE. There are many text descriptions for a CVE, such as NVD entries, security blog posts, and manufacturers' web bulletins, and such descriptions explain how to exploit the CVE and what kind of damage can be caused if the CVE is exploited by attackers. Since those descriptions commonly present various characteristics of the CVE in human-readable natural language, we can glean insightful information from them with the help of Natural Language Processing (NLP) techniques.

To this end, we first collect text descriptions illustrating the characteristics of CVEs from various sources: NVD, blogs, and web bulletins. Next, we extract information from the text descriptions, including the type of product where a CVE is found, which versions of the product have the CVE, whether an easy-to-use exploit exists for the CVE, and so forth. Once such information is extracted, we apply a ranking method to understand the severity of CVEs clearly. Based on the extracted information, our ranking method first tracks how strongly CVEs are related or similar to one another. This relation can reveal whether the characteristics of a CVE are also shared by other CVEs. Finally, our ranking method sorts the CVEs in order, i.e., a CVE with more common characteristics is ranked higher. The intuition behind our ranking method is that if the characteristics of a CVE are more general, meaning that they could be commonly and widely adopted by attackers, then the CVE is more serious. To get the rank of each CVE, we employ the TextRank algorithm [MT04], an unsupervised ranking algorithm that can summarize a text and extract the important sentences or words within it.

To initially evaluate our proposed ranking approach, we have collected real CVE data and applied our method to the data. In addition, we compare our ranking results with CVSS scores to understand whether our approach can clearly reflect real-world opinions. Our initial results show that our approach provides much more realistic (and reasonable) ranking results than CVSS.

2 Method

Before we give a detailed explanation of our vulnerability ranking method, we illustrate our system overview in Figure 1. Our method operates in three phases: (1) corpus building, (2) graph building, and (3) vulnerability ranking. Our method is a text-oriented vulnerability ranking, and thus we need a large number of text descriptions about CVEs. Fortunately, NVD compiles related information in a database and allows free access to the information, so we glean CVE descriptions from NVD to build the CVE description corpus. After building the corpus, our method generates a vulnerability ranking graph in which a vertex represents a CVE. In the graph, vertices are linked to one another when there is a certain relation between two CVEs. We will discuss the relations that CVEs can have in the following sections. Once the graph building is complete, we run the TextRank algorithm on the graph to obtain importance scores for the CVEs, by which the CVEs are sorted.

2.1 Vulnerability ranking

Ranking model. Our vulnerability ranking method is based on TextRank [MT04], a graph-based and unsupervised ranking model. TextRank summarizes a text by ranking the sentences in the text according to their importance and singling out a set of higher-ranked sentences. In the model, a sentence is represented as a vertex, and two sentences are linked to each other if they share similar contents, or words. Although TextRank is an application of Google's PageRank [BP98] to text summarization, the two ranking methods differ in that TextRank operates on an undirected graph. This is because, unlike web pages, sentences do not have explicit reference relations. Therefore, TextRank cannot use the graph structure information which denotes that one node votes for another. Instead, TextRank assigns a similarity weight to each link between two nodes, and the nodes exchange the weight when calculating the importance score.
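The weight exchange just described can be sketched in Python as follows (a minimal illustration of ours, not the authors' implementation, whose formal definition is Equation 1); the damping factor d = 0.85 and the fixed iteration count are conventional choices rather than values taken from the paper.

```python
def textrank(weights, d=0.85, iters=50):
    """Iteratively compute importance scores on an undirected weighted graph.

    weights: dict mapping a node to {neighbor: similarity_weight}.
    Returns a dict mapping each node to its importance score.
    """
    # Total outgoing weight of each node (the graph is undirected,
    # so a node's "in" and "out" neighbors coincide).
    out_sum = {v: sum(nbrs.values()) for v, nbrs in weights.items()}
    score = {v: 1.0 for v in weights}
    for _ in range(iters):
        # Each node collects score from its neighbors, proportionally
        # to the similarity weight on the shared link.
        score = {
            v: (1 - d) + d * sum(
                w / out_sum[u] * score[u]
                for u, w in nbrs.items() if out_sum[u] > 0
            )
            for v, nbrs in weights.items()
        }
    return score

# Toy graph: A and B are strongly similar; C is only weakly tied to A.
g = {"A": {"B": 0.5, "C": 0.1}, "B": {"A": 0.5}, "C": {"A": 0.1}}
ranks = textrank(g)
```

In this toy graph, A accumulates the highest score because it receives weight from both B and C, mirroring the intuition that a vertex similar to many others ranks higher.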
[Figure 1: Vulnerability ranking system overview. In the corpus building phase, NVD descriptions pass through a Preprocessor (sentence boundary detection, PoS tagging); in the graph building and ranking phase, a Graph Builder links CVEs (matching identifiers with the regex (CVE-\d{4}-\d{4,7})) and evaluates link weights, producing the ranked list of CVEs.]

In the ranking graph G = (V, E), where V is the set of vertices and E is the set of edges, let us assume that there is a vertex V_i. For V_i, let In(V_i) and Out(V_i) be the set of predecessors of V_i and the set of successors of V_i, respectively. In addition, if there is a vertex V_j that belongs to In(V_i), the similarity weight between V_i and V_j is defined as w_{ji}. Then, the importance score of the vertex V_i can be computed as below:

IS(V_i) = (1 - d) + d \sum_{V_j \in In(V_i)} \frac{w_{ji}}{\sum_{V_k \in Out(V_j)} w_{jk}} IS(V_j)    (1)

where d is a damping factor, which denotes the probability (1 - d) for a random surfer on the graph to jump from one vertex to another randomly [BP98]. In this model, the importance score of a vertex is distributed to its successors proportionally to the similarity weights. Therefore, a vertex that is similar to a majority of the other vertices within the graph tends to have a higher importance score.

In our vulnerability ranking problem, we believe that a vulnerability that has similar properties to all kinds of vulnerabilities is important and thus needs to be fixed first. This is because, if such a vulnerability is found in a hardware or software product, it gives the product a broad attack surface. In other words, the product can be attacked in various ways. Here, for two vulnerabilities to have similar properties means that they can be used by similar types of attacks or violate the same part of the CIA triad.

How to represent a vulnerability in a graph? In our ranking graph, a vertex represents the short description of a CVE, which is less than 10 sentences long and recorded for every CVE in NVD. For example, a vertex labeled with CVE-2014-0160 represents the text description presented in Figure 2, which is excerpted from NVD (http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2014-0160). From the content words in the description, we can grasp how the vulnerability can be exploited by attackers and what kind of damage can be caused after the vulnerability is successfully exploited. In other words, in the CVE description, the characteristics of the CVE are expressed in natural language, and the presence of common characteristics between two vertices determines whether they are linked together or not. Notice that the text description cannot be used directly as a vertex; it needs to pass through predefined preprocessing steps, such as part-of-speech tagging and lemmatization, to be converted to a bag-of-words.

"The (1) TLS and (2) DTLS implementations in OpenSSL 1.0.1 before 1.0.1g do not properly handle Heartbeat Extension packets, which allows remote attackers to obtain sensitive information from process memory via crafted packets that trigger a buffer over-read, as demonstrated by reading private keys, related to d1_both.c and t1_lib.c, aka the Heartbleed bug."

Figure 2: Description of CVE-2014-0160 (Heartbleed)

How to define similarity between two vulnerabilities? For two CVEs to share similar characteristics can be defined as both of their bags-of-words containing similar words. If the two bags-of-words describing the two CVEs have similar words, we compute the similarity between them by employing text similarity measures such as the Jaccard index or TF-IDF cosine similarity [BYRN99]. In this work, we employ the Jaccard index [Pau12] as presented in Equation 2, where X and Y are the sets of unique words that constitute the bags-of-words of the two vulnerability descriptions.

JaccIndex(X, Y) = |X ∩ Y| / |X ∪ Y| = |X ∩ Y| / (|X| + |Y| - |X ∩ Y|)    (2)

Using this metric, we can measure how similar two CVEs are. For instance, we present three CVEs and their descriptions in Figure 3 and summarize their similarities in Table 1. Since both CVE-2015-0311 and CVE-2015-7645 are vulnerabilities of Adobe Flash Player and affect the same operating systems (i.e., MS Windows and Apple OS X), they are the most similar pair among the three vulnerability pairs. Next, CVE-2015-0311 and CVE-2015-2567 are similar to each other because they share the property that remote attackers can exploit the vulnerabilities via unknown vectors. In consequence, CVE-2015-0311 has various factors that attackers can exploit, such as Adobe Flash Player, the OS, and unknown attack vectors, and we can conclude that it should be handled earlier than the others.

(a) CVE-2015-0311: "Unspecified vulnerability in Adobe Flash Player through 13.0.0.262 and 14.x, 15.x, and 16.x through 16.0.0.287 on Windows and OS X and through 11.2.202.438 on Linux allows remote attackers to execute arbitrary code via unknown vectors, as exploited in the wild in January 2015."

(b) CVE-2015-7645: "Adobe Flash Player 18.x through 18.0.0.252 and 19.x through 19.0.0.207 on Windows and OS X and 11.x through 11.2.202.535 on Linux allows remote attackers to execute arbitrary code via a crafted SWF file, as exploited in the wild in October 2015."

(c) CVE-2015-2567: "Unspecified vulnerability in Oracle MySQL Server 5.6.23 and earlier allows remote authenticated users to affect availability via unknown vectors related to Server : Security : Privileges."

Figure 3: Three random CVEs registered in 2015 and their NVD descriptions.

CVE ID        | CVSS | Keywords
CVE-2014-5694 | 5.4  | Android, X.509 certificate, SSL, MITM
CVE-2014-3169 | 7.5  | Use-after-free, DOM, Google Chrome, DoS, remote attack
CVE-2014-6707 | 5.4  | 7Sage LSAT Prep - Proctor, Android, X.509 certificates, SSL, MITM, spoof
CVE-2014-1741 | 7.5  | Integer overflows, Blink, Google Chrome, remote attackers, DoS
CVE-2014-1316 | 5.0  | Heimdal, Apple OS X, remote attackers, DoS, Kerberos 5
CVE-2014-2536 | 4.3  | Multiple directory traversal, McAfee, remote authenticated users
CVE-2014-2279 | 6.4  | Multiple directory traversal, SeedDMS, remote authenticated users, read arbitrary files, .. (dot dot) in the logname parameter
CVE-2014-5836 | 5.4  | GittiGidiyor, Android, X.509 certificates, SSL, MITM, spoof
CVE-2014-0885 | 6.8  | CSRF, Admin Web UI, IBM Lotus Protector for Mail Security, remote authenticated users, unknown vectors
CVE-2014-5780 | 5.4  | Bouncy Bill, Android, X.509 certificates, SSL, MITM, spoof

Table 2: Top 10 CVEs generated by our ranking method, with their CVSS scores and keywords. CVSS scores range from 0.0 (not severe) to 10.0 (the most severe) in increments of 0.1.
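The preprocessing and similarity computation described in this section can be sketched as follows. This is our minimal, dependency-free illustration, not the authors' code: the tiny stop word list and regex tokenizer stand in for the full pipeline of sentence boundary detection, stop word removal, and lemmatization, and the CVE identifier regex is the one shown in Figure 1.

```python
import re

CVE_ID = re.compile(r"CVE-\d{4}-\d{4,7}")  # identifier pattern from Figure 1

# A deliberately tiny stop word list; a real pipeline would use a full
# list plus PoS tagging and lemmatization from an NLP library.
STOP = {"the", "a", "an", "and", "in", "on", "to", "via", "as", "of",
        "allows", "which", "do", "not", "by", "from", "that"}

def bag_of_words(description):
    """Reduce an NVD description to a set of lowercase content words."""
    tokens = re.findall(r"[a-z0-9][a-z0-9.\-]*", description.lower())
    return {t for t in tokens if t not in STOP and len(t) > 1}

def jaccard(x, y):
    """Equation 2: |X intersect Y| / |X union Y| over two word sets."""
    if not (x or y):
        return 0.0
    inter = len(x & y)
    return inter / (len(x) + len(y) - inter)

heartbleed = bag_of_words(
    "The TLS and DTLS implementations in OpenSSL 1.0.1 before 1.0.1g "
    "do not properly handle Heartbeat Extension packets"
)
```

Applying jaccard to the bags-of-words of the descriptions in Figure 3 reproduces pairwise similarities in the spirit of Table 1, up to differences in the exact preprocessing.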
3 Evaluation

To evaluate our ranking method, we randomly selected 100 CVEs from NVD that were registered in 2014 and constructed a small corpus consisting of the 100 CVE descriptions. The descriptions are converted to bags-of-words, a conversion that requires preprocessing normally consisting of sentence boundary detection, stop word removal, and lemmatization. After that, we run the TextRank algorithm on the CVE description corpus and obtain the ranks of the 100 CVEs. In Table 2, we present the CVEs ranked in the top 10 out of 100. Due to the page limitation, we could not give the whole description of each CVE but present some keywords in the table.

Taking a look at Table 2, we see that most of the CVEs are related to an X.509 public key certificate problem and can enable remote attacks such as Man-In-The-Middle (MITM) attacks and Denial of Service (DoS) attacks. In addition, the CVEs are reported to be found in widely used software products, including Android, Google Chrome, and Apple OS X. In summary, our ranking method ranks CVEs higher which (1) are related to a security hole (e.g., certificate verification bypass), (2) are found in popularly used products (e.g., Android), and (3) can cause well-known types of attacks (e.g., MITM and DoS).

Note that, while the CVSS score for CVE-2014-5694 is lower than that for CVE-2014-3169, CVE-2014-5694 is placed at a higher position than CVE-2014-3169 by our ranking method. This is because the keywords contained in the description of CVE-2014-5694 (i.e., Android, X.509, SSL, etc.) are found in other CVEs' descriptions more frequently than those of CVE-2014-3169 (i.e., DOM, Google Chrome, DoS), which affects the weights of the links to which the CVE is connected.

CVE pair                      | Jaccard index
CVE-2015-0311 & CVE-2015-7645 | 0.6182
CVE-2015-0311 & CVE-2015-2567 | 0.2223
CVE-2015-7645 & CVE-2015-2567 | 0.1132

Table 1: Pairs of the three CVEs registered in 2015 and their Jaccard indices

4 Discussion

Although our ranking method gives a new ranking of the vulnerabilities, reflecting facets that have not previously been used to assess vulnerability severity, it has issues to be addressed further. First, we build the nodes of the vulnerability ranking graph from NVD descriptions. However, the descriptions are short texts and provide limited information. On the other hand, there are useful sources from which we can glean detailed information about vulnerabilities. For example, one can search Microsoft Security Bulletins [Mic18] for web documents explaining newly discovered or patched vulnerabilities in the vendor's own words, and many researchers and practitioners post security-related information on their own blogs [Fee18]. In addition, exploits, which are sets of commands that infringe on a system using a vulnerability, are archived with related information in a database [Sec18], so we can collect another type of information that explains how to use a vulnerability from the viewpoint of practitioners.

Furthermore, to measure the similarity of two given vulnerabilities, we use the Jaccard index over their CVE descriptions. However, we can extend our similarity measure to consider not only such topological semantics but also distributional semantics such as word embeddings. In addition, it is not sufficient to measure two vulnerabilities' similarity by considering only the textual descriptions, because there are other factors that determine whether two given vulnerabilities are similar to each other, and these may not be expressed in the descriptions. For instance, bug bounty programs operated by many organizations, such as Google, Mozilla, and Facebook, give a bounty for a bug or a vulnerability to its discoverer. We can then assume that two vulnerabilities are similar if their bounties are set at a similar level.

5 Conclusion

In this work, we present a semantic way to assess vulnerabilities by examining their textual descriptions, from which we can grasp the characteristics of the vulnerabilities. We then build the vulnerability ranking graph by representing each vulnerability's characteristics, which are expressed in natural language, as a node, and run the TextRank algorithm on the graph to obtain the rank of the vulnerabilities. As future work, we will address the issues discussed in Section 4 to improve the performance of our ranking method to the degree to which security experts and practitioners can agree with our ranking results. To this end, we are going to carry out an expert-based performance evaluation of our ranking method, inspired by existing research work [HA15].

Acknowledgment

This research is (in part) based on the work supported by Samsung Research, Samsung Electronics.

References

[ALRL04] A. Avizienis, J. C. Laprie, B. Randell, and C. Landwehr. Basic concepts and taxonomy of dependable and secure computing. IEEE Transactions on Dependable and Secure Computing, 1(1):11-33, Jan 2004.

[AM12] Luca Allodi and Fabio Massacci. A preliminary analysis of vulnerability scores for attacks in wild: The ekits and sym datasets. In Proceedings of the 2012 ACM Workshop on Building Analysis Datasets and Gathering Experience Returns for Security, BADGERS '12, pages 17-24, New York, NY, USA, 2012. ACM.

[AM14] Luca Allodi and Fabio Massacci. Comparing vulnerability severity and exploits using case-control studies. ACM Trans. Inf. Syst. Secur., 17(1):1:1-1:20, August 2014.

[BP98] Sergey Brin and Lawrence Page. The anatomy of a large-scale hypertextual web search engine. In Proceedings of the Seventh International Conference on World Wide Web 7, WWW7, pages 107-117, Amsterdam, The Netherlands, 1998. Elsevier Science Publishers B.V.

[BYRN99] Ricardo A. Baeza-Yates and Berthier Ribeiro-Neto. Modern Information Retrieval. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1999.

[Fee18] Feedspot. Top 100 information security blogs for data security professionals. http://blog.feedspot.com/information_security_blogs, 2018.

[FIR17] FIRST. Common vulnerability scoring system. https://www.first.org/cvss, 2017.

[HA15] Hannes Holm and Khalid Khan Afridi. An expert-based investigation of the common vulnerability scoring system. Computers & Security, 53:18-30, 2015.

[Mic18] Microsoft. Microsoft security bulletins. https://technet.microsoft.com/en-us/security/bulletins.aspx, 2018.

[MIT18] MITRE. Common vulnerabilities and exposures. https://cve.mitre.org/, 2018.

[MM16] Nuthan Munaiah and Andrew Meneely. Vulnerability severity scoring and bounties: Why the disconnect? In Proceedings of the 2nd International Workshop on Software Analytics, SWAN 2016, pages 8-14, New York, NY, USA, 2016. ACM.

[MSR06] Peter Mell, Karen Scarfone, and Sasha Romanosky. Common vulnerability scoring system. IEEE Security and Privacy, 4(6):85-89, November 2006.

[MT04] Rada Mihalcea and Paul Tarau. TextRank: Bringing order into texts. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, EMNLP '04, 2004.

[NIS17] NIST. National vulnerability database. https://nvd.nist.gov/, 2017.

[Ope18] OpenSSL Software Foundation. OpenSSL. https://www.openssl.org/, 2018.

[Pau12] Paul Jaccard. The distribution of the flora in the alpine zone. New Phytologist, 11(2):37-50, 1912.

[Sec18] Offensive Security. The exploit database. https://www.exploit-db.org/, 2018.