<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Development of Algorithm for Improving Accuracy of Probability of Threat Implementation in Personal Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sergey Verevkin</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ksenia Naumova</string-name>
          <email>ksenia.naumovaks@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tatiana Tatarnikova</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pavel Bogdanov</string-name>
          <email>45bogdanov@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ekaterina</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Saint Petersburg</institution>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Proceedings of the 12th Majorov International Conference on Software Engineering and Computer Systems</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Russian State Hydrometeorological University</institution>
          ,
          <addr-line>Voronezhskaya st. 79, St. Petersburg, 192007</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The continuing growth in the number of information systems makes it necessary to ensure the cyber security of the information they contain, including both commercial secrets and the various types of information processed by State information systems. To comply with legislative and regulatory requirements, an organization must build a model of an illegal intruder and a model of threats to the security of the protected information system in order to determine the relevance of the vulnerabilities they describe. This article reviews the development of an algorithm that refines the existing methodology for determining actual threats to data security during processing in information systems, which is used when building a security threat model. The developed algorithm is relevant because it applies to the current methodology, which serves as the main document for determining the requirements for the information security system. We propose a four-stage algorithm for collecting intelligence from public sources (OSINT) to assess risks and determine the security state of an information system. The algorithm comprises collecting information from freely available databases of supervisory authorities, analyzing the organization's external network resources, identifying potential illegal intruders among the organization's employees, and checking the organization's internal network resources. The algorithm is iterative: the data collected during its first execution are fed back as input to subsequent cycles, providing material for a more detailed analysis. The information obtained through OSINT is analyzed and provided to the organization's management or the owner of the information system for use in determining the corresponding coefficients of the current methodology.</p>
        <p>Keywords: OSINT, corporate networks, security analysis, information security</p>
        <p>ORCID: 0000-0002-5255-940X (A. 1); 0000-0001-6972-5390 (A. 2); 0000-0002-6419-0072 (A. 3); 0000-0002-7533-7316 (A. 4); 0000-</p>
      </abstract>
      <kwd-group>
        <kwd>Personal</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Today, the need to ensure an organization's information security is increasingly raised not only by
large corporations and government entities but also by small private organizations. The main reason is
the rising value of the information processed in organizational networks, which has become the resource
most sought after by cybercriminals.</p>
      <p>2020 Copyright for this paper by its authors.</p>
      <p>To protect the information being processed, the current security state of the information system
must be properly assessed in accordance with the requirements of current federal laws and other
governing documents of supervisory bodies.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Justification of the existing problem</title>
      <p>
        When building an information protection system in an organization that processes client databases
containing personal data, an important step is to determine the current threats to personal data
processed in personal data information systems (ISPD) in accordance with the
current FSTEC methodology [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>Following this methodology, employees of the organization must determine the numerical
coefficients Y<sub>1</sub> and Y<sub>2</sub>, which indicate the state of initial security and the
probability of threat implementation, respectively.</p>
      <p>Unlike the first coefficient, which is determined from a table in the methodology, the value of the
Y<sub>2</sub> coefficient is determined from verbal estimates: unlikely, low, medium, and high.</p>
      <p>Such assessments are difficult to make without actual data on the current state of the
organization's information systems. The subsequent assessment of threat feasibility is harder still,
because it requires an impartial estimate of the likelihood of security incidents, including those
caused by the organization's own staff.</p>
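      <p>The role of the two coefficients can be illustrated with a short Python sketch. The numeric values assigned to the verbal estimates, the formula Y = (Y1 + Y2) / 20, and the feasibility bands are assumptions about the FSTEC methodology; they are not stated in this paper and should be checked against the methodology itself. The function and constant names are hypothetical.</p>
      <preformat>
```python
from bisect import bisect_left

# Assumed numeric values for the verbal estimates of Y2 (threat probability).
Y2_VALUES = {"unlikely": 0, "low": 2, "medium": 5, "high": 10}

# Assumed numeric values of Y1 for each level of initial security.
Y1_VALUES = {"high": 0, "medium": 5, "low": 10}

# Assumed upper bounds on Y1 + Y2 for each band (Y = 0.3, 0.6, 0.8).
SUM_BOUNDS = [6, 12, 16]
FEASIBILITY = ["low", "medium", "high", "very high"]


def threat_feasibility(initial_security, probability):
    """Return (Y, verbal feasibility) for one threat."""
    total = Y1_VALUES[initial_security] + Y2_VALUES[probability]
    # bisect_left keeps a boundary value (for example Y = 0.3) in the lower band.
    level = FEASIBILITY[bisect_left(SUM_BOUNDS, total)]
    return total / 20, level


if __name__ == "__main__":
    print(threat_feasibility("medium", "low"))
```
      </preformat>
      <p>Under these assumptions, moving a single threat from "low" to "high" probability can shift its feasibility by one or more bands, which is why an evidence-based choice of the verbal estimate matters.</p>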
    </sec>
    <sec id="sec-3">
      <title>3. Algorithm development</title>
      <p>To determine the value of the Y<sub>2</sub> coefficient correctly, we build an algorithm that uses
open-source software for intelligence gathering from open information sources (OSINT) to search for
existing threats.</p>
      <p>Among the methods for conducting OSINT, the four-stage cyclic method of data collection has
gained the greatest popularity:</p>
      <list list-type="order">
        <list-item><p>Definition of information search criteria.</p></list-item>
        <list-item><p>Retrieving the searched data from open sources.</p></list-item>
        <list-item><p>Analysis of the received information.</p></list-item>
        <list-item><p>Structuring the obtained information in order to use it for further data search.</p></list-item>
      </list>
      <sec id="sec-3-1">
        <p>
          The accuracy of the research therefore depends on the number of OSINT cycles performed, which
determines the depth of analysis of the collected information according to its type, its secrecy,
and the wishes of the organization's management [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
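        <p>The cyclic nature of the method can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name, the search callback, and the toy in-memory data source are hypothetical and stand in for real open-source lookups.</p>
        <preformat>
```python
def run_osint_cycles(seed_terms, search, max_cycles=3):
    """Repeat the four stages until no new items appear or max_cycles is reached."""
    known = set(seed_terms)
    frontier = set(seed_terms)  # stage 1: search criteria for the current cycle
    findings = []
    for cycle in range(1, max_cycles + 1):
        if not frontier:
            break
        # Stage 2: retrieve data from open sources for each criterion.
        retrieved = {term: search(term) for term in sorted(frontier)}
        # Stage 3: analyse the results and keep only items not seen before.
        new_items = set()
        for term, items in retrieved.items():
            for item in items:
                if item not in known:
                    new_items.add(item)
                    findings.append((cycle, term, item))
        # Stage 4: structure the new items as criteria for the next cycle.
        known |= new_items
        frontier = new_items
    return findings


# Hypothetical stand-in for open sources; all values are placeholders.
TOY_SOURCE = {
    "example.org": ["admin@example.org", "203.0.113.7"],
    "admin@example.org": ["J. Doe"],
}


def toy_search(term):
    return TOY_SOURCE.get(term, [])


if __name__ == "__main__":
    for row in run_osint_cycles(["example.org"], toy_search, max_cycles=2):
        print(row)
```
        </preformat>
        <p>In this sketch, a second cycle reaches an item that is only discoverable through a result of the first cycle, which is the sense in which more cycles give a deeper analysis.</p>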
        <p>An important feature of OSINT is the full analysis of the organization's information and personnel
resources. For this reason, we highlight three main steps of the algorithm being developed and consider
the most successful methods of their implementation.</p>
      </sec>
      <sec id="sec-3-2">
        <title>1) Analysis of public pages of the organization</title>
        <p>This step covers the collection and analysis of information about the organization published in
sources such as advertisements, the organization's websites, tax records, and other resources that yield
initial data on the organization's activities: organizational structure, positions, etc.</p>
        <p>Many software solutions exist; as an example, we consider Maltego, which provides a convenient
interface for visualizing the data found and the connections between them. Although Maltego has a free
version, the paid versions are the most effective, because they extend its capabilities with additional
third-party libraries that are connected through API keys. An example of the analysis and construction
of connections for data collected from the Russian State Hydrometeorological University (RSHU) website
(rshu.ru) is shown in Figure 1.</p>
        <p>
          The analysis yields the following information: contact details of the owners of network
resources, the hosting provider of the organization's website, personal data of employees whose phone
numbers are listed on the website, information about the organization's current and completed judicial
proceedings, and the dates of important events, such as the birthdays of company management and the
dates of corporate events, along with much other information that facilitates further data
collection [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>2) Analysis of employee information</title>
        <p>This step searches for current employees of the organization using the data obtained in the
previous step. The main goal is to collect information about as many employees as possible. The analysis
makes it possible to identify most of the organization's employees with high accuracy by examining their
social network accounts, personal e-mail addresses, phone numbers, home addresses, and the relationships
between employees.</p>
        <p>We use the OSINT Framework, which brings together a large number of tools for searching
information in open sources. Maltego, discussed earlier, can also be used for this purpose, but most of
its functionality for analyzing the social networks used in Russia requires the purchase of paid
packages. The main advantage of the OSINT Framework is that it gives the user access to the largest
possible amount of information from free sources and additionally marks paid resources. Figure 2 shows
the OSINT Framework options for Social Network and Mail Address Analysis.</p>
        <p>An important task of this step is to identify dissatisfied employees who openly express
dissatisfaction with colleagues and the organization as a whole. A dissatisfied employee is often a
potential victim of social engineers, who provoke the employee into helping them achieve their own
goals.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3) Analysis of the organization's network</title>
        <p>The last, but not least important, step is to analyze the current security state of the
organization's corporate networks. In this step, it is important to analyze the network infrastructure,
the system and application software in use, the security tools, the protocols, and any other information
that would allow an attacker to plan attacks against specific network components.</p>
        <p>Data about an organization's network can be analyzed in many different ways; the choice depends
on the type of network and the devices used in it. One of the best-known tools is Nmap. By running Nmap
against the IP address found with Maltego, we can obtain information about the system software used on
the hosting resource. Figure 3 shows the result of operating system detection for the hosting of the
rshu.ru website.</p>
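        <p>A scan of this kind could be scripted roughly as shown below. The sketch assumes that Nmap is installed and that the scan is authorized; it uses Nmap's -O (operating system detection) and -sV (service version detection) options, while the function names and the example address are hypothetical.</p>
        <preformat>
```python
import shutil
import subprocess


def build_nmap_command(target, detect_os=True, detect_versions=True):
    """Build the argument list for an Nmap scan of a single host."""
    command = ["nmap"]
    if detect_os:
        command.append("-O")   # OS detection, usually needs elevated privileges
    if detect_versions:
        command.append("-sV")  # service and version detection
    command.append(target)
    return command


def run_scan(target):
    """Run the scan and return Nmap's text output, or None if Nmap is absent."""
    if shutil.which("nmap") is None:
        return None
    result = subprocess.run(
        build_nmap_command(target),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address used as a placeholder.
    print(" ".join(build_nmap_command("192.0.2.10")))
```
        </preformat>
        <p>Keeping command construction separate from execution makes it possible to review the exact options before any traffic is sent to the target.</p>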
        <p>
          The main criterion for choosing a tool is the attacker's location relative to the
organization's network. If the attacker is inside a segment of the corporate network, sniffers are
needed to analyze network traffic for the use of vulnerable network protocols. For further penetration,
vulnerability scanners and Nmap analogues are needed to search for vulnerabilities in the border nodes
of the network or, when border devices such as firewalls are scanned remotely, to obtain information
about the protection in use [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>The result of this work is an algorithm that supports the determination of the verbal coefficients
of the probability of security threat implementation for information systems by means of a final report
generated from the results of external OSINT and an analysis of the organization's network. Repeating
the algorithm over multiple cycles can reveal new data on threats present in the information system,
which extends the security threat model and the information gathered at previous stages. The algorithm
can also be used when reevaluating the security of an information system to identify new sources of
threats and determine their relevance.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <article-title>Methodology for determining current threats to personal data security during their processing in personal data information systems</article-title>
          .
          <source>FSTEC</source>
          , 14.02.2008
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <source>Penetration Testing Execution Standard (PTES)</source>
          . URL: http://www.penteststandard.org/index.php/Main_Page
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <source>Maltego Desktop Application Guide</source>
          . URL: https://docs.maltego.com/support/solutions/articles/15000008703-client-requirements#networkrequirements-0-3
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Tatarnikova</surname>
            <given-names>T.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Volskiy</surname>
            <given-names>A.V.</given-names>
          </string-name>
          <article-title>Estimation of probabilistic-temporal characteristics of network nodes with traffic differentiation</article-title>
          //
          <source>Informatsionno-Upravliaiushchie Sistemy</source>
          .
          <year>2018</year>
          . V. 94, No. 3. P.
          <fpage>54</fpage>
          -
          <lpage>60</lpage>
          . DOI 10.15217/issn1684-8853.2018.3.5
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Tatarnikova</surname>
            <given-names>T.M.</given-names>
          </string-name>
          <article-title>Statistical methods for studying network traffic</article-title>
          //
          <source>Informatsionno-Upravliaiushchie Sistemy</source>
          .
          <year>2018</year>
          . V. 96, No. 5. P.
          <fpage>35</fpage>
          -
          <lpage>43</lpage>
          . DOI 10.31799/1684-8853-2018-5-35-43
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Bogatyrev</surname>
            <given-names>V.A.</given-names>
          </string-name>
          <article-title>Fault Tolerance of Clusters Configurations with Direct Connection of Storage Devices</article-title>
          //
          <source>Automatic Control and Computer Sciences</source>
          .
          <year>2011</year>
          . Vol.
          <volume>45</volume>
          , No.
          <issue>6</issue>
          . P.
          <fpage>330</fpage>
          -
          <lpage>337</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Bogatyrev</surname>
            <given-names>A.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bogatyrev</surname>
            <given-names>V.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bogatyrev</surname>
            <given-names>S.V.</given-names>
          </string-name>
          <article-title>Multipath Redundant Transmission with Packet Segmentation</article-title>
          . In:
          <source>2019 Wave Electronics and its Application in Information and Telecommunication Systems (WECONF)</source>
          .
          <year>2019</year>
          . 8840647. DOI 10.1109/WECONF.2019.8840643
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Bogatyrev</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Derkach</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <article-title>Evaluation of a Cyber-Physical Computing System with Migration of Virtual Machines during Continuous Computing</article-title>
          .
          <source>Computers</source>
          <year>2020</year>
          ,
          <volume>9</volume>
          ,
          <fpage>42</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Tatarnikova</surname>
            <given-names>T.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dzubenko</surname>
            <given-names>I.N.</given-names>
          </string-name>
          <article-title>IoT system for detecting dangerous substances by smell</article-title>
          //
          <source>Informatsionno-Upravliaiushchie Sistemy</source>
          .
          <year>2018</year>
          . V. 93, No. 2. P.
          <fpage>84</fpage>
          -
          <lpage>90</lpage>
          . DOI 10.15217/issn1684-8853.2018.2.84
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>