An Ontology of Top 25 CWEs

                             Vladimir Dimitrov and Ivan Kolev

                              Faculty of Mathematics and Informatics
    University of Sofia St. Kliment Ohridski, 5 James Bourchier Blvd., 1164, Sofia, Bulgaria

                  cht@fmi.uni-sofia.bg, ivan.pl.kolev@gmail.com


       Abstract. The CWE Top 25 is a view of the 25 most dangerous software errors,
       or weaknesses (CWE). The CWE list is maintained by MITRE Corporation.
       Weaknesses are types of vulnerabilities: flaws that attacks can exploit.
       MITRE Corporation also maintains lists of vulnerabilities (CVE) and attacks
       (CAPEC). Investigating a given vulnerability, weakness or attack requires
       sophisticated navigation across these lists. The aim of the presented research
       is to represent the Top 25 CWEs and their related CVEs and CAPECs in an
       ontology in order to facilitate this navigation.


       Keywords: weakness, CWE, vulnerability, CVE, attack, CAPEC, ontology, OWL.


1    Introduction
The aim of this research is to create an ontology in the cybersecurity domain
based on the information contained in the CWE view Top 25 Most Dangerous Software
Errors and in the CVEs and CAPECs it references. CWEs, CVEs and CAPECs are
described in the next section.
     The ontology must be kept simple so that it is usable for educational purposes.

2    Basic Terms
A weakness is an error, bug or misconfiguration introduced at some stage of the
software life cycle.
     A vulnerability is a weakness exploited by some attack. Some weaknesses cannot
be exploited because no attack vector to them is available.
     MITRE Corporation maintains:
• CWE [1] – a community-developed list of software and hardware weakness types;
• CVE [2] – a list of publicly known cybersecurity vulnerabilities;
• CAPEC [3] – a community-developed list of known attack patterns employed to exploit
  known weaknesses.

     Every CVE is linked to concrete vendor(s), product(s) or product version(s).

 Copyright © 2020 for this paper by its authors. Use permitted under
 Creative Commons License Attribution 4.0 International (CC BY 4.0).

     CWEs are organized in several taxonomies at different abstraction levels. CWEs act
as types of CVEs.
     With the above-mentioned lists, MITRE Corporation supports the community process
of registering a vulnerability, classifying it under weaknesses and further investigating the
applicable attack patterns. The process starts from the vulnerability, but sometimes new
weaknesses and attack patterns have to be introduced.
     NIST’s NVD [4] is based on CVE. NVD uses the CVSS (Common Vulnerability
Scoring System) [5] impact metrics to evaluate vulnerabilities. The CVSS scores of the
CVEs in NVD are used to rank the Top 25 CWEs.

3    CWE Top 25 Most Dangerous Software Errors
The Top 25 Most Dangerous Software Errors (Top 25) [6] is published every year. It
is a list of the most widespread and critical weaknesses that can be discovered and
exploited as software vulnerabilities.
     The ranking methodology relies heavily on NVD and on the CVSS scores assigned there
to the CVEs. Only CVEs that have at least one CWE assigned to them as a root cause
participate in the calculations.
     Two basic values are assigned to each CWE weakness X mentioned in NVD CVEs:
    frequency – Fr(X)
    severity – Sv(X)
      Let Freq(X) be the number of references to the weakness X in NVD CVEs, and
let Fmin and Fmax be the minimum and maximum values of Freq(X) over its domain.
      Then the frequency of weakness X is:
    Fr(X) = (Freq(X) - Fmin) / (Fmax - Fmin)
      The value of Fr(X) is thus normalized to the interval [0, 1].
      Let AVG_CVSS(X) be the average CVSS score of the CVEs in which the weakness
X is mentioned as a root cause, and let CVSSmin and CVSSmax be the minimum and
maximum values of AVG_CVSS(X) over its domain.
      Then the severity of weakness X is:
    Sv(X) = (AVG_CVSS(X) - CVSSmin) / (CVSSmax - CVSSmin)
      The value of Sv(X) is likewise normalized to the interval [0, 1].
      Finally, the Top 25 score of weakness X is:
    Score(X) = Fr(X) * Sv(X) * 100
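
     As an illustration only, the ranking formulas above can be sketched in a few lines of
Python. The function name top25_scores, the input layout (one pair of CVSS score and list
of CWE IDs per CVE) and the sample records are assumptions made for this sketch; they are
neither NVD data nor the CWE team's implementation. Returning 0 when all values coincide
is also an assumed convention.

```python
from statistics import mean


def top25_scores(cve_records):
    """Return {cwe_id: Score(X)} with Score(X) = Fr(X) * Sv(X) * 100.

    cve_records: iterable of (cvss_score, [cwe_ids]) pairs, one pair per CVE.
    CVEs with an empty CWE list do not take part in the calculation.
    """
    cvss_by_cwe = {}
    for cvss, cwe_ids in cve_records:
        for cwe in cwe_ids:
            cvss_by_cwe.setdefault(cwe, []).append(cvss)
    if not cvss_by_cwe:
        return {}

    freq = {x: len(v) for x, v in cvss_by_cwe.items()}       # Freq(X)
    avg_cvss = {x: mean(v) for x, v in cvss_by_cwe.items()}  # AVG_CVSS(X)

    def normalize(value, lowest, highest):
        # Min-max normalization; 0.0 for a degenerate range is an assumption.
        return (value - lowest) / (highest - lowest) if highest > lowest else 0.0

    f_min, f_max = min(freq.values()), max(freq.values())
    c_min, c_max = min(avg_cvss.values()), max(avg_cvss.values())
    return {
        x: normalize(freq[x], f_min, f_max)
        * normalize(avg_cvss[x], c_min, c_max)
        * 100
        for x in cvss_by_cwe
    }


# Hypothetical records, invented for the example.
sample_records = [
    (9.8, ["CWE-119"]),
    (7.5, ["CWE-119"]),
    (8.0, ["CWE-119", "CWE-79"]),
    (6.1, ["CWE-79"]),
    (5.0, ["CWE-20"]),
    (4.0, []),  # no CWE assigned: skipped
]
scores = top25_scores(sample_records)
```

     Because both factors are min-max normalized, the least frequent weakness and the
weakness with the lowest average CVSS score always receive a score of 0 in this scheme.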
     The 2019 Top 25 is presented in Table 1. Additionally, the CWE team listed 15 more CWEs
(Table 2) that are risky but did not score high enough to enter the Top 25.


           Table 1. 2019 Top 25 Most Dangerous Software Errors (source [6]).


Rank  ID        Name                                                     Score
1     CWE-119   Improper Restriction of Operations within the Bounds     75.56
                of a Memory Buffer
2     CWE-79    Improper Neutralization of Input During Web Page         45.69
                Generation (‘Cross-site Scripting’)
3     CWE-20    Improper Input Validation                                43.61
4     CWE-200   Information Exposure                                     32.12
5     CWE-125   Out-of-bounds Read                                       26.53
6     CWE-89    Improper Neutralization of Special Elements used in an  24.54
                SQL Command (‘SQL Injection’)
7     CWE-416   Use After Free                                           17.94
8     CWE-190   Integer Overflow or Wraparound                           17.35
9     CWE-352   Cross-Site Request Forgery (CSRF)                        15.54
10    CWE-22    Improper Limitation of a Pathname to a Restricted        14.10
                Directory (‘Path Traversal’)
11    CWE-78    Improper Neutralization of Special Elements used in an  11.47
                OS Command (‘OS Command Injection’)
12    CWE-787   Out-of-bounds Write                                      11.08
13    CWE-287   Improper Authentication                                  10.78
14    CWE-476   NULL Pointer Dereference                                  9.74
15    CWE-732   Incorrect Permission Assignment for Critical Resource     6.33
16    CWE-434   Unrestricted Upload of File with Dangerous Type           5.50
17    CWE-611   Improper Restriction of XML External Entity               5.48
                Reference
18    CWE-94    Improper Control of Generation of Code (‘Code             5.36
                Injection’)
19    CWE-798   Use of Hard-coded Credentials                             5.12
20    CWE-400   Uncontrolled Resource Consumption                         5.04
21    CWE-772   Missing Release of Resource after Effective Lifetime      5.04
22    CWE-426   Untrusted Search Path                                     4.40
23    CWE-502   Deserialization of Untrusted Data                         4.30
24    CWE-269   Improper Privilege Management                             4.23
25    CWE-295   Improper Certificate Validation                           4.06


          Table 2. Additional 2019 Top 25 Most Dangerous Software Errors (source [6]).


Rank  ID        Name                                              NVD Count  Avg CVSS
26    CWE-835   Loop with Unreachable Exit Condition              218        6.610
                (‘Infinite Loop’)
27    CWE-522   Insufficiently Protected Credentials              150        8.460
28    CWE-704   Incorrect Type Conversion or Cast                 143        8.484
29    CWE-362   Concurrent Execution using Shared                 187        6.740
                Resource with Improper Synchronization
                (‘Race Condition’)
30    CWE-918   Server-Side Request Forgery (SSRF)                128        7.917
31    CWE-415   Double Free                                       111        7.981
32    CWE-601   URL Redirection to Untrusted Site (‘Open          159        6.141
                Redirect’)
33    CWE-863   Incorrect Authorization                           113        7.050
34    CWE-862   Missing Authorization                             92         7.491
35    CWE-532   Inclusion of Sensitive Information in Log         90         7.064
                Files
36    CWE-306   Missing Authentication for Critical Function      66         8.529
37    CWE-384   Session Fixation                                  76         7.083
38    CWE-326   Inadequate Encryption Strength                    73         7.278
39    CWE-770   Allocation of Resources Without Limits or         75         6.880
                Throttling
40    CWE-617   Reachable Assertion                               75         6.729


4      Ontology Contents
The CVE, CWE and CAPEC databases are maintained to support the life cycle of
vulnerabilities, weaknesses and attack patterns. Besides information about the
nature of their entries, they contain processing information needed for
registration, classification and maintenance.
     Initially, a new vulnerability is registered in the CVE database by one of the CVE
Numbering Authorities (CNAs). This means that it is already registered in some private
or public repository. Then an investigation process follows to accept or reject this
vulnerability. Sometimes the new vulnerability turns out to be an already known one. Next,
the investigation process relates the vulnerability (CVE entry) to a weakness (CWE entry)
and to an attack pattern (CAPEC entry). Sometimes it is impossible to clarify the
nature of the vulnerability and to relate it to any weakness and/or attack pattern. A
vulnerability may also be related to several weaknesses and/or several attack patterns. In
brief, this is what vulnerability processing consists of, without going into detail about the
life cycle and the phases of vulnerabilities, weaknesses and attack patterns.
      In the case of the Top 25 Most Dangerous Software Errors, all CVEs are related to
CWEs and CAPECs, simply because only such CVEs are used in the ranking procedure.
      The ontology is implemented in OWL [7] using Protégé [8]. It was initially developed
as a master's thesis by the second co-author under the supervision of the first one, and was
then redesigned by the first co-author.
      The CVE, CWE and CAPEC databases are represented in the ontology as disjoint classes.
Their definitions are as follows:
   Class: top25:CAPEC
         DisjointWith: top25:CVE, top25:CWE
   Class: top25:CVE
         DisjointWith: top25:CAPEC, top25:CWE
   Class: top25:CWE
         DisjointWith: top25:CAPEC, top25:CVE
     The elements of CVE, CWE and CAPEC have an identifier (ID), a positive integer; a
name (Name), a string; and a short description (Description), a string. These characteristics
are represented in the ontology as datatype properties; note that ID, like the other two, is
declared with the range xsd:string:
   DataProperty: top25:ID
         Characteristics: Functional
         Domain: top25:CAPEC or top25:CVE or top25:CWE
         Range: xsd:string
   DataProperty: top25:Name
         Characteristics: Functional
         Domain: top25:CAPEC or top25:CVE or top25:CWE
         Range: xsd:string
   DataProperty: top25:Description
         Characteristics: Functional
         Domain: top25:CAPEC or top25:CVE or top25:CWE
         Range: xsd:string
     CVE entries have external references to other repositories in which they are registered
under different identifiers. These references are important because they broaden the
view of the vulnerability, but at this time they are not included in the ontology.
     The information about the nature of a CVE entry is in its description. This is
semi-structured text about the vendor, product, version, component, root cause, attack vector
etc., but this text is hard to read even for a human expert. Information extraction from
the CVE description is out of the scope of this research.
      CVEs are not organized in any taxonomy; they form a simple list. The information
about the type of a CVE entry is carried by its CWE “type”, but a vulnerability may
be related to several CWEs. CWEs and CAPECs, in contrast, are organized in several
taxonomies.
      A CWE entry can have an extended description (ExtendedDescription) in addition
to its description. This additional description is included in the ontology as a datatype
property:
   DataProperty: top25:ExtendedDescription
         Characteristics: Functional
         Domain: top25:CWE
         Range: xsd:string
     The likelihood that a CWE entry is exploited is rated on a scale of several string values. It is
included in the ontology as the LikelihoodOfExploit datatype property:
   DataProperty: top25:LikelihoodOfExploit
         Characteristics: Functional
         Domain: top25:CWE
         Range: xsd:string
     How a CWE can be detected is given in the datatype property DetectionMethods of
the class CWE:
   DataProperty: top25:DetectionMethods
         Domain: top25:CWE
         Range: xsd:string
     Another important piece of information about a CWE is its mitigation methods. These are
represented in the datatype property PotentialMitigations:
   DataProperty: top25:PotentialMitigations
         Domain: top25:CWE
         Range: xsd:string
      Some CWEs are related to specific programming languages. In the CWE class,
this is represented as a datatype property (Languages). This property is a simplification,
because a CWE can be related not only to programming languages but also to platforms,
technologies etc.:
   DataProperty: top25:Languages
         Domain: top25:CWE
         Range: xsd:string
     Usually, the most detailed CWEs (at the abstraction level Variant) are linked to some
programming language, platform or technology, but the Top 25 Most Dangerous
Software Errors contains CWEs at different abstraction levels. Treating all CWEs as if
they were at the same abstraction level is another simplification. The situation with
CAPECs is the same.
      CWE entries have further interesting characteristics, but at this stage only those
listed above are included in the ontology.
      CAPECs are organized in several taxonomies. A CAPEC entry has many interesting
characteristics, but in our ontology CAPECs are represented only by their ID, Name and
Description.
      Relationships between classes are among the most important concepts in an ontology.
They are modelled as object properties.
      As already mentioned, weaknesses are the “types” of vulnerabilities. The object
property WeaknessEnumerations links CVEs to their CWEs. The MITRE Corporation
CVE list does not contain this relationship, but it is available in NVD under this name.
This object property is defined as follows:
   ObjectProperty: top25:WeaknessEnumerations
         Characteristics: Irreflexive, Asymmetric
         Domain: top25:CVE
         Range: top25:CWE
         InverseOf: top25:ObservedExamples
     The object property ObservedExamples links weaknesses to their vulnerabilities:
   ObjectProperty: top25:ObservedExamples
         Characteristics: Irreflexive, Asymmetric
         Domain: top25:CWE
         Range: top25:CVE
         InverseOf: top25:WeaknessEnumerations
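
     The effect of declaring the two properties as inverses of each other can be illustrated
outside OWL with a small Python sketch. Plain dictionaries stand in for the property
assertions here; the helper invert and the identifiers CVE-A and CVE-B are hypothetical
and serve only as placeholders. In the ontology itself, a reasoner derives the inverse
assertions from the InverseOf axiom.

```python
from collections import defaultdict


def invert(relation):
    """Turn a mapping subject -> set of objects into object -> set of subjects."""
    inverse = defaultdict(set)
    for subject, objects in relation.items():
        for obj in objects:
            inverse[obj].add(subject)
    return dict(inverse)


# Hypothetical WeaknessEnumerations assertions (CVE -> CWEs).
weakness_enumerations = {
    "CVE-A": {"CWE-79"},
    "CVE-B": {"CWE-79", "CWE-20"},
}

# ObservedExamples (CWE -> CVEs) follows from the assertions above.
observed_examples = invert(weakness_enumerations)
```

     Inverting the result again gives back the original mapping, which mirrors the fact that
each of the two properties is declared as the inverse of the other.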
     Weaknesses, on the other hand, are linked to the attack patterns that can be used
to exploit them. This relationship is represented as an object property with the name
AttackPatterns that links CWEs to their CAPECs:
   ObjectProperty: top25:AttackPatterns
         Characteristics: Irreflexive, Asymmetric
         Domain: top25:CWE
         Range: top25:CAPEC
         InverseOf: top25:RelatedWeaknesses
     Finally, the relationship of attack patterns to the weaknesses they exploit is
represented by the object property RelatedWeaknesses:
   ObjectProperty: top25:RelatedWeaknesses
         Characteristics: Irreflexive, Asymmetric
         Domain: top25:CAPEC
         Range: top25:CWE
         InverseOf: top25:AttackPatterns
     CVEs and CAPECs are not linked directly, either in the ontology or in the databases.
This relationship can be derived via CWEs.
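
     A minimal sketch of this derivation, assuming the two object properties are available
as plain Python dictionaries, is given below. The function name attack_patterns_for_cve and
all CVE and CAPEC identifiers in the sample data are hypothetical placeholders; in the
ontology the same two-step path would be followed through WeaknessEnumerations and
AttackPatterns.

```python
def attack_patterns_for_cve(cve, weakness_enumerations, attack_patterns):
    """Collect the CAPECs reachable from a CVE through its CWEs.

    weakness_enumerations: CVE id -> set of CWE ids
    attack_patterns:       CWE id -> set of CAPEC ids
    """
    capecs = set()
    for cwe in weakness_enumerations.get(cve, ()):
        capecs |= attack_patterns.get(cwe, set())
    return capecs


# Hypothetical sample assertions.
cve_to_cwes = {
    "CVE-A": {"CWE-79"},
    "CVE-B": {"CWE-79", "CWE-89"},
}
cwe_to_capecs = {
    "CWE-79": {"CAPEC-P", "CAPEC-Q"},
    "CWE-89": {"CAPEC-R"},
}

derived = attack_patterns_for_cve("CVE-B", cve_to_cwes, cwe_to_capecs)
```

     A CVE without any recorded CWE yields an empty set, which corresponds to the case
where the nature of a vulnerability could not be clarified.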

5    Ontology Usability
Navigation in our ontology is done via SPARQL queries. The starting point can be any
weakness, vulnerability or attack pattern. The path and its scope depend on the
case study scenario. Several case studies, roles (security manager, cybersecurity
operational team, procurement employee, cybersecurity trainee) and scenarios
have been investigated. The identified case studies, roles and scenarios are not
exhaustive, but it is clear that not all of these potential users can use SPARQL
queries in their everyday duties. A specialized user-friendly interface to the
ontology must be developed for them.
     The ontology presented here is very simple. It contains the basic knowledge about the
Top 25 Most Dangerous Software Errors and the related vulnerabilities and attack patterns,
but even now it is populated with 493 individuals (25 CWEs, 156 CVEs and 312 CAPECs).
Further extensions of this ontology would be populated with many thousands of CWEs,
CVEs and CAPECs. This population process has to be automated, which presupposes a stable
ontology structure. The CVE, CWE and CAPEC databases are relatively stable, but the devil
is in the details, so some automatic way of deriving the ontology structure may have to
be developed as well.
     Finally, it is clear that the real power of semantic search can be achieved by
introducing the CWE and CAPEC taxonomies. This task is not simple, because the
taxonomies in use are themselves complex, and further research on them is needed.
     This research will be used to update the current curricula at the University
of Sofia with cybersecurity topics, as described in [9].

6    Acknowledgements
This work was conducted using the Protégé resource, which is supported by grant
GM10331601 from the National Institute of General Medical Sciences of the
United States National Institutes of Health.
    This research is supported by the National Scientific Program “Information
and Communication Technologies for a Single Digital Market in Science,
Education and Security (ICTinSES)”, financed by the Ministry of Education and
Science.


References
1. MITRE Corporation, Common Weakness Enumeration, http://cwe.mitre.org, last accessed
   2020.
2. MITRE Corporation, Common Vulnerabilities and Exposures, http://cve.mitre.org, last
   accessed 2020.
3. MITRE Corporation, Common Attack Pattern Enumeration and Classification,
   http://capec.mitre.org, last accessed 2020.
4. NIST, National Vulnerability Database, http://nvd.nist.gov, last accessed 2020.
5. NIST, Vulnerability Metrics, http://nvd.nist.gov/vuln-metrics/cvss, last accessed 2020.
6. MITRE Corporation, CWE Top 25 Most Dangerous Software Errors, http://cwe.mitre.org/top25,
   last accessed 2020.
7. W3C, Semantic Web, Web Ontology Language (OWL), http://www.w3.org/OWL, last accessed
   2020.
8. Musen, M.A. The Protégé project: A look back and a look forward. AI Matters. Association of
   Computing Machinery Specific Interest Group in Artificial Intelligence, 1(4), June 2015. DOI:
   10.1145/2557001.25757003.
9. Kaloyanova K., Exploring Cybersecurity Curricula Designation Requirements. Computer and
   Communications Engineering, Vol. 13, No. 2/2019, pp 64-67.

