<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Tool for Automatic Enterprise Architecture Modeling</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Markus Buschle</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hannes Holm</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Teodor Sommestad</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mathias Ekstedt</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Khurram Shahzad</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Industrial Information and Control Systems, KTH Royal Institute of Technology</institution>
          ,
          <addr-line>Osquldas v. 12, SE-10044 Stockholm</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
      </contrib-group>
      <fpage>25</fpage>
      <lpage>32</lpage>
      <abstract>
        <p>Enterprise architecture is an approach that aims to provide decision support based on organization-wide models. The creation of these models is, however, cumbersome, as multiple aspects of an organization need to be considered. The Enterprise Architecture approach would be significantly less demanding if the data used to create the models could be collected automatically. This paper illustrates how a vulnerability scanner can be utilized for data collection in order to automatically create enterprise architecture models. We show how this approach can be realized by extending a previously presented Enterprise Architecture tool. An example is provided through a case study applying the tool to a real network.</p>
      </abstract>
      <kwd-group>
        <kwd>Enterprise Architecture</kwd>
        <kwd>Automatic data collection</kwd>
        <kwd>Automatic instantiation</kwd>
        <kwd>Software tool</kwd>
        <kwd>Security Analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Enterprise Architecture (EA) is a comprehensive approach for management and
decision-making based on models of the organization and its information
systems. An enterprise is typically described through dimensions such as Business,
Application, Technology and Information [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. These pictographic descriptions
are used for system-quality analysis to provide valuable support for IT and
business decision-making [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        As these models are intended to provide reliable decision support it is
imperative that they capture all the aspects of an organization which are of relevance.
Thus, they often grow very large and contain several thousands of entities and an
even larger number of relationships between them. The creation of such large
models is both time-consuming and costly, as many stakeholders are involved and
many different pieces of information have to be gathered. During the creation
process the EA models are also likely to become (partly) outdated [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Thus, in
order to provide the best possible decision support, it must be ensured that
EA models are both holistic and reflect the organization's current state.
      </p>
      <p>
        Automatic data collection and model creation would be preferable, as they
would reduce the modeling effort and increase the quality of the collected
data. In current EA tools two approaches to automatic data collection
can be found. The most common one is to import models that were made in
third-party software. For example, BiZZdesign Architect [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] can import from office
applications. Here the automation lies in the fact that data is reused and
does not need to be entered manually if it is already available. The interpretation
of data documented in the third-party software can, however, be resource- and
time-consuming, thus partly defeating the purpose of automatic data
collection. Other tools, for example Troux [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] allow the usage of SQL
queries in order to load information from available databases. This approach
focuses on the extraction of the data model and thereby the automatic creation
of the information architecture as well as the business architecture based on
process descriptions and similar documents.
      </p>
      <p>
        In this paper we present how the Enterprise Architecture Analysis Tool [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]
has been extended in order to automatically instantiate elements in EA models
based on results from network scans. In comparison to the previously described
approaches of other tools, our implementation focuses on the Application and
Technology layers of the organization. This information is gathered by applying
a vulnerability scanner that evaluates the network structure of an
enterprise. Thereby attached network hosts and the functionality they provide
can be discovered. Another difference is that the presented tool uses EA models
for system-quality analysis, whereas commercial applications focus on
modeling. As a running example we illustrate how a meta-model designed for cyber
security analysis [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] can be (partly) automatically instantiated. The presented
implementation is generic and can be used to support any kind of EA analysis.
      </p>
      <p>The remainder of this paper is structured as follows. Section two describes
the components used to realize the implementation and introduces the
meta-model that is used as a running example. Section three describes how the
automatically collected information is used to instantiate the meta-model
for security evaluation. Section four exemplifies the tool application on real data
collected by scanning a computer network used for security exercises. In
section five the presented tool and the underlying approach are discussed and
future work is described. Finally, section six concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>
        This section describes the three components that we combined in order to
automatically create EA models that can be used for security analysis. In subsection
2.1 the vulnerability scanner NeXpose [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which is used for data collection, is
explained. Subsection 2.2 describes the Enterprise Architecture Analysis Tool
that is used to generate the models and evaluate them with regard to
security aspects. Subsection 2.3 briefly introduces CySeMoL, the meta-model
that is partly instantiated using the automated data collection. The overall
architecture is shown in Figure 1.
      </p>
      <sec id="sec-2-1">
        <title>NeXpose</title>
        <p>
          The vulnerability scanner NeXpose was chosen in this project as it has
demonstrated good results in previous tests [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
        <p>
          NeXpose [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] is an active (i.e. it queries remote hosts for data) vulnerability
scanner capable of both authenticated and unauthenticated scans.
Authenticated scans involve providing the scanner with user accounts on the hosts. They are
typically less disruptive to normal operations and provide a higher degree of
accuracy. However, credentials are not always readily available
to the individual(s) performing a scan.
        </p>
        <p>NeXpose provides information regarding the network architecture in terms of
all devices that communicate over TCP or UDP, e.g. computers, firewalls
and printers. The scanner identifies the operating system or firmware
running on each scanned device as well as any running services. If the scanner
is given credentials, it is also able to assess all applications (and versions thereof)
installed on a device and all user/administrator accounts on that device.</p>
        <p>
          More security-related functions of the scanner include checks for
both software flaws and configuration errors. It is also capable of performing
web application scans. NeXpose has approximately 53,000 current signatures in
its engine, with every signature corresponding to a certain vulnerability.
NeXpose is also SCAP-compliant [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] and thus compliant with a suite of six commonly
used protocols developed by the National Institute of Standards and Technology
(NIST): i) Extensible Configuration Checklist Description Format (XCCDF),
ii) Open Vulnerability and Assessment Language (OVAL), iii) Common
Platform Enumeration (CPE), iv) Common Configuration Enumeration (CCE), v)
Common Vulnerabilities and Exposures (CVE) and vi) Common Vulnerability
Scoring System (CVSS).
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Enterprise Architecture Analysis Tool</title>
        <p>
          In [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] we presented a tool for EA analysis. This tool consists of two parts to
be used in succession. The first component allows the definition of meta-models
that describe a certain system quality of interest (1 in Figure 1). This is done
according to the PRM formalism [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] in terms of classes, attributes, and relations
between them. Thereafter the second component is used
to describe an enterprise as an instantiated model (2 in Figure 1)
that complies with the previously defined meta-model. As the PRM formalism
supports the expression of quantified theory, the described enterprise can be
evaluated with regard to the system quality described in the first
component.
        </p>
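        <p>The relation between a meta-model and an instantiated model can be sketched as follows. This is a minimal illustration and not the implementation of the tool: the names MetaModel, Instance and conforms, as well as the example classes Host and Service and the relation Runs, are hypothetical, and the probabilistic dependencies that a PRM attaches to attributes are left out entirely.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class MetaModel:
    # Class name -> set of attribute names that instances may carry.
    classes: dict
    # Allowed (source class, relation name, target class) triples.
    relations: set = field(default_factory=set)


@dataclass
class Instance:
    name: str
    cls: str
    attributes: dict = field(default_factory=dict)


def conforms(meta, instances, links):
    """Return True if instances and links use only what the meta-model defines."""
    by_name = {inst.name: inst for inst in instances}
    for inst in instances:
        allowed = meta.classes.get(inst.cls)
        # Unknown class, or an attribute the class does not declare.
        if allowed is None or not set(inst.attributes) <= set(allowed):
            return False
    for source, relation, target in links:
        if source not in by_name or target not in by_name:
            return False
        triple = (by_name[source].cls, relation, by_name[target].cls)
        if triple not in meta.relations:
            return False
    return True


# Hypothetical example meta-model and instance model.
meta = MetaModel(
    classes={"Host": {"name"}, "Service": {"port"}},
    relations={("Host", "Runs", "Service")},
)
instances = [
    Instance("h1", "Host", {"name": "web01"}),
    Instance("s1", "Service", {"port": 80}),
]
links = [("h1", "Runs", "s1")]
is_valid = conforms(meta, instances, links)
```
]]></preformat>
        <p>In this sketch a link in the wrong direction, or an instance of a class that the meta-model does not define, makes the check fail.</p>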
        <p>To use the results gained from NeXpose scans an extension of the tool was
necessary. The results of NeXpose’s scans can be exported to XML files (4 in
Figure 1), which are structured according to an XML schema definition (XSD) file<sup>1</sup> (3
in Figure 1). We added the possibility to create mappings between XSD files
and meta-models (5 in Figure 1) in order to automatically instantiate the
meta-model based on NeXpose’s XML files (6 in Figure 1). The mapping used is
discussed in section 3.</p>
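        <p>As a minimal sketch of the first step of such a mapping, the following Python fragment lists the named complex types declared in an XSD file, which are the candidates a mapping can refer to. The miniature schema is hypothetical: only the type names osType, softwareType and endpointType occur in this paper, whereas the attributes and the helper list_complex_types are assumptions made for illustration and reproduce neither the actual NeXpose schema nor the implementation of the tool.</p>
        <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

# Namespace of the XML Schema language itself.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical miniature schema, NOT the real NeXpose report schema.
SAMPLE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="osType">
    <xs:attribute name="product" type="xs:string"/>
  </xs:complexType>
  <xs:complexType name="softwareType">
    <xs:attribute name="product" type="xs:string"/>
    <xs:attribute name="version" type="xs:string"/>
  </xs:complexType>
  <xs:complexType name="endpointType">
    <xs:attribute name="port" type="xs:int"/>
  </xs:complexType>
</xs:schema>
"""


def list_complex_types(xsd_text):
    """Map each top-level complexType name to the names of its attributes."""
    root = ET.fromstring(xsd_text)
    types = {}
    for ctype in root.findall(XS + "complexType"):
        types[ctype.get("name")] = [
            attr.get("name") for attr in ctype.iter(XS + "attribute")
        ]
    return types


complex_types = list_complex_types(SAMPLE_XSD)
```
]]></preformat>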
      </sec>
      <sec id="sec-2-3">
        <title>CySeMoL</title>
        <p>
          This paper exemplifies the mapping functionality by instantiating a subset of
the meta-model of CySeMoL (Cyber Security Modeling Language) [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. This
modeling language follows the abstract model presented in [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] and uses the PRM
formalism to estimate the value of security attributes from an architecture model.
Its meta-model covers both technical and organizational aspects of security and
contains in total 20 entities, 30 entity relationships and a number of
interdependent attributes. Four of these entities and three of its relationships can be
mapped to elements produced by NeXpose. This subset of CySeMoL is depicted
in the left part of Figure 2. While only a subset of the total number of entities
and relations could be instantiated, this subset includes entities and relations
that occur with high multiplicity in enterprises and thus require much effort to
model.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>The mapping</title>
      <p>In this section we describe how we matched the structure of NeXpose’s results to
entities of the CySeMoL language in order to instantiate the language based on
scans. As described earlier, this was done based on the XSD file that describes
the structure of the reports.</p>
      <p>For our implementation we used four elements that a NeXpose result
contains. First we mapped fingerprintsType and osType to the OperatingSystem
class of CySeMoL, visualized as Mapping 1 in Figure 2. This allows us to
determine the operating system of a computer identified by NeXpose. The
second mapping (Mapping 2 in Figure 2) relates softwareType and
fingerprintType to SoftwareProduct in order to identify the software that is executed on
the considered system. Thirdly (Mapping 3), we mapped endpointsType and
endpointType to Service in order to identify on which ports services are provided
by a machine. Finally, a mapping from service_fingerprints_Type and
service_fingerprint_Type to SoftwareProduct was made (Mapping 4) in order to
describe the software that provides services on the machine of interest.</p>
      <p><sup>1</sup> The XSD file (Report_XML_Export_Schema.xsd) is part of the NeXpose
Community Edition, which can be downloaded from http://www.rapid7.com</p>
      <sec id="sec-3-1">
        <p>Additionally, we considered the hierarchical structure of the XSD file in order
to derive relationships. This made it possible to add the relationships Operates,
ControlledBy, and ProductOf as shown in Figure 2.</p>
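        <p>The four mappings can be sketched as a lookup table that is applied while walking through a report. The sketch below is illustrative only: the sample report and its element and attribute names are hypothetical stand-ins for the XSD types named above and are not taken from the actual NeXpose schema, and the function instantiate is an assumption rather than the implementation of the tool. Deriving the relationships Operates, ControlledBy and ProductOf from the nesting of elements is omitted.</p>
        <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified report. Element and attribute names are
# illustrative only and are NOT taken from the real NeXpose schema.
SAMPLE_REPORT = """
<report>
  <node address="10.0.0.5">
    <os product="Debian Linux" version="5.0"/>
    <software product="OpenSSL" version="0.9.8"/>
    <endpoint protocol="tcp" port="22">
      <service_fingerprint product="OpenSSH" version="5.1"/>
    </endpoint>
  </node>
</report>
"""

# Hypothetical element names mapped to CySeMoL classes, mirroring
# Mappings 1 to 4 described in the text.
ELEMENT_TO_CLASS = {
    "os": "OperatingSystem",                   # Mapping 1
    "software": "SoftwareProduct",             # Mapping 2
    "endpoint": "Service",                     # Mapping 3
    "service_fingerprint": "SoftwareProduct",  # Mapping 4
}


def instantiate(report_xml):
    """Return (class name, attributes) pairs for every mapped element."""
    root = ET.fromstring(report_xml)
    instances = []
    for element in root.iter():  # document order, including nested elements
        cls = ELEMENT_TO_CLASS.get(element.tag)
        if cls is not None:
            instances.append((cls, dict(element.attrib)))
    return instances


instances = instantiate(SAMPLE_REPORT)
```
]]></preformat>
        <p>Elements without an entry in the lookup table, such as the hypothetical node element, are skipped, which corresponds to instantiating only a subset of the meta-model.</p>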
        <p>[Figure 2 panels: the used subset of CySeMoL (left) and the structure of NeXpose’s reports (right).]</p>
        <p>In this section we describe how we tested the implementation on a real network.
We give a brief introduction to the background of the collected data. Afterwards
we show what the resulting auto-generated model looks like.</p>
      </sec>
      <sec id="sec-3-2">
        <title>The setup</title>
        <p>The main experimental setup was designed by the Swedish Defence Research
Agency (FOI) in Linköping, Sweden, with the support of the Swedish National
Defence College (SNDC). Also, a group of computer security specialists and
computer security researchers originating from various northern-European
governments, military, private sectors and academic institutions took part in designing
the network architecture.</p>
        <p>The environment was set up to represent a simplified critical information
infrastructure at a small electrical power utility. The environment was composed of
20 physical PC servers running a total of 28 virtual machines, divided into four
VLAN segments. Various operating systems and versions thereof were used in the
network, e.g. Windows XP SP2, Debian 5.0 and Windows Server 2003 SP1. Each
host had several different network services operating, e.g. web, mail, media,
remote connection and file sharing services. Furthermore, every host was more
or less vulnerable through software flaws and/or poor configurations.</p>
      </sec>
      <sec id="sec-3-3">
        <title>The result</title>
        <p>We performed a NeXpose scan on the environment and thereafter applied
the mapping presented in section 3. The resulting auto-generated model
consists of 28 instances of CySeMoL’s OperatingSystem class. Furthermore, 225
instances of the Service class and 141 instances of the SoftwareProduct class
were automatically generated. The generated components were related based on
the relations that are specified in CySeMoL. Figure 3 shows the resulting model
for one computer of the environment, as the full model is too big to be
shown here.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Discussion and future work</title>
      <p>This paper demonstrates that vulnerability scanners can provide useful support
for the creation of EA models. As mentioned earlier, the results of a scan do not
deliver a complete EA model, but require some completion work. The application
of an automated scan, however, significantly reduces the modeling effort and provides
an EA analyst with a model stub that he or she can complement with other
types of data.</p>
      <p>
        The validity and reliability of the proposed approach can be discussed from
two different viewpoints: i) how much of the meta-model can be captured,
both in scope (i.e. how much of the meta-model can be instantiated) and
context (i.e. whether the scanner provides all the information needed to accurately
capture the context of a variable), and ii) how accurate a vulnerability scanner
is at assessing the instantiated variables. Regarding i), most of the more
modeling-intensive concepts of CySeMoL are captured and all contexts are accurate. That
is, the scanner provides e.g. all the information regarding vulnerabilities that
CySeMoL requires. Regarding ii), the scanning accuracy in terms of assessing
vulnerabilities is studied in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The accuracy in terms of assessing software,
operating systems and similar elements will be examined in future work.
      </p>
      <p>It would also be interesting to look at other variables provided by automated
vulnerability scanning, e.g. user accounts of systems. Furthermore, automated
scanning could be mapped to more commonly used EA frameworks such as
ArchiMate [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] to increase the usage of the method.
      </p>
      <sec id="sec-4-1">
        <p>Additionally, future work might investigate how other data sources
can be used to provide input to automatic model creation and further
reduce the manual tasks necessary. Examples of such sources are access control
lists, ERP systems, accounting systems and UDDI registries. In particular, it needs
to be investigated how automatic data collection can be carried out for the domains
that have not been considered so far (the Business Layer and the information
architecture). The long-term goal is to minimize the manual effort required
to generate EA models.</p>
        <p>The fact that enterprises are changing in the course of time is an important
aspect too. The support for periodic scans leading to an automatic model update
might therefore be implemented in the present tool as well.</p>
        <p>It is also possible to collect information on vulnerabilities of services and
software. This is something that we aim to incorporate in a future project in
order to improve the analysis functionality.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>In this paper we presented an extension of our previously developed tool that
allows the automatic generation of elements for Enterprise Architecture models.
The input for these models is provided by a vulnerability scanner, which was used
to identify infrastructure elements and applications that were part of a computer
network. Our implementation is generic even though CySeMoL, a meta-model
for security analysis, was used as a running example. The data gained from the
vulnerability scanner can be used to instantiate any meta-model once a
mapping has been defined. The scan with NeXpose took less than an hour and
the creation of the EA model using that data was next to instantaneous. Thus,
it should be a viable option for EA architects. We have also illustrated the
architecture of our implementation and described the components used in detail.
Finally, we have presented a practical application of our implementation based
on real data. Thereby we have shown the feasibility of our approach.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Aier</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buckl</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Franke</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gleichauf</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnson</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Närman</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schweda</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ullberg</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>A survival analysis of application life spans based on enterprise architecture models</article-title>
          .
          <source>In: 3rd International Workshop on Enterprise Modelling and Information Systems Architectures</source>
          , Ulm, Germany. pp.
          <fpage>141</fpage>-<lpage>154</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. BiZZdesign: BiZZdesign Architect. http://www.bizzdesign.com (Mar
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Buschle</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ullberg</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Franke</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lagerström</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sommestad</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>A tool for enterprise architecture analysis using the PRM formalism</article-title>
          .
          <source>In: CAiSE2010 Forum PostProceedings</source>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Friedman</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Getoor</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koller</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pfeffer</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Learning probabilistic relational models</article-title>
          .
          <source>In: Proc. of the 16th International Joint Conference on Artificial Intelligence</source>
          . pp.
          <fpage>1300</fpage>-<lpage>1309</lpage>
          . Morgan Kaufmann (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Holm</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sommestad</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Almroth</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Persson</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>A quantitative evaluation of vulnerability scanning</article-title>
          .
          <source>Information Management &amp; Computer Security</source>
          (to be published)
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Johnson</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Quinn</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scarfone</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Waltermire</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>The technical specification for the security content automation protocol (SCAP)</article-title>
          .
          <source>NIST Special Publication</source>
          <volume>800</volume>
          ,
          <issue>126</issue>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Lankhorst</surname>
            ,
            <given-names>M.M.</given-names>
          </string-name>
          :
          <source>Enterprise Architecture at Work: Modelling, Communication and Analysis</source>
          . Springer, Berlin, Heidelberg, Germany, 2nd edn. (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8. Rapid7: NeXpose. http://www.rapid7.com/ (
          <year>Mar 2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Sommestad</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ekstedt</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnson</surname>
          </string-name>
          , P.:
          <article-title>A probabilistic relational model for security risk analysis</article-title>
          .
          <source>Computers &amp; Security</source>
          <volume>29</volume>
          (
          <issue>6</issue>
          ),
          <fpage>659</fpage>-<lpage>679</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Sommestad</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ekstedt</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nordström</surname>
            ,
            <given-names>L.:</given-names>
          </string-name>
          <article-title>A case study applying the Cyber Security Modeling Language</article-title>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11. Troux Technologies: Metis. http://www.troux.com/products/ (
          <year>Mar 2011</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>