<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Cluster analysis and individual anthropogenic risk</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vladimir V. Moskvichev</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ulyana S. Postnikova</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olga V. Taseiko</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Krasnoyarsk Branch of the Federal Research Center for Information and Computational Technologies</institution>
          ,
          <addr-line>Krasnoyarsk</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Reshetnev Siberian State University of Science and Technology</institution>
          ,
          <addr-line>Krasnoyarsk</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Siberian Federal University</institution>
          ,
          <addr-line>Krasnoyarsk</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <fpage>526</fpage>
      <lpage>532</lpage>
      <abstract>
        <p>This article analyzes models and methods for assessing anthropogenic risk and outlines the general mathematical basis of risk analysis. Using multivariate statistical methods, an algorithm for analyzing the safety of Siberian territories is formulated; it makes it possible to define an acceptable level of risk for each territorial group (cities with a population of more than 70 000, towns with a population of less than 70 000, and municipal districts).</p>
      </abstract>
      <kwd-group>
        <kwd>Territorial risk</kwd>
        <kwd>hierarchical cluster analysis</kwd>
        <kwd>k-means method</kwd>
        <kwd>acceptable level of risk</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>For the effective implementation of the Presidential Decree “Strategy on the Russian Federation national
safety” of 02.07.2021, a range of measures is required to protect the population and territories
from natural and anthropogenic accidents.</p>
      <p>Risk analysis is the main mechanism of safety control, which takes damaging effects into account.
The key sources of risk are humans and their activities, together with natural and industrial processes. At
present, different methods for predicting and assessing the risk associated with natural and
man-made emergencies have been developed. Risk assessment models and methods can
be divided into two groups:
— for the analysis of the safety of industrial facilities;
— for the analysis of territorial entities.</p>
      <p>Minimizing technogenic risks, which reduces remediation costs, is necessary
for sustainable economic development.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Multivariate statistical methods for determining the acceptable level of risk</title>
      <p>At present, the acceptable level of risk is taken to be 1 · 10<sup>−5</sup>, which can be interpreted as one death per
100 000 people [1]; another method proposes measuring acceptable risk as an aggregate figure over
several years [2]. However, the official methods of risk assessment do not take into account the fact
that the population of a territorial entity varies from 5 000 to one million people and
more. Such differences in the population base influence the final risk level and lead to uncertainty. For
example, with an equal number of accidents and victims, territories with different
population sizes will have different risk values.</p>
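      <p>The effect of the population base can be illustrated with a minimal sketch. It assumes that individual risk is estimated as the number of deaths per year divided by the population, which is consistent with the interpretation of 1 · 10<sup>−5</sup> given above; the function name and the sample populations are hypothetical.</p>
      <preformat preformat-type="code">
```python
# Hypothetical illustration: the same number of deaths in territories
# of different size gives different individual risk values.

def individual_risk(deaths_per_year, population):
    """Annual individual risk, assumed here to be deaths / population."""
    if not population > 0:
        raise ValueError("population must be positive")
    return deaths_per_year / population


# One death per 100 000 people, the acceptable level quoted in the text.
ACCEPTABLE_RISK = 1e-5

# Assumed example: one death per year in territories of different size.
populations = [5_000, 100_000, 1_000_000]
risks = [individual_risk(1, n) for n in populations]

for n, r in zip(populations, risks):
    status = "above" if r > ACCEPTABLE_RISK else "not above"
    print(f"population {n:>9,}: risk = {r:.1e} ({status} the quoted level)")
```
      </preformat>
      <p>With a fixed threshold, the smallest territory exceeds the quoted level after a single event, while the largest one stays far below it. This is the uncertainty that grouping territories into clusters is meant to reduce.</p>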
      <p>
        In this work, hierarchical cluster analysis is proposed; it makes it possible to
divide the territory of the Siberian Federal District (SFD) into clusters and to choose a reference
group against which the other groups are compared when the acceptable level of risk is determined [3]. The
method is widely used in various fields of science [
        <xref ref-type="bibr" rid="ref1">4, 5</xref>
        ].
      </p>
      <p>Figure 1 shows the algorithm of the method for analyzing hazardous anthropogenic events
in the territory under consideration.</p>
      <p>At the 1st stage, the task of analyzing the anthropogenic safety of the SFD is set. At the 2nd stage,
the quantitative indicator on which the analysis is based is selected. Instead of
normalized data, the value of territory vulnerability to different kinds of accidents is used;
it is calculated from the data of the automated information and control system for the prevention and liquidation
of emergency situations for 1999–2017. Territory vulnerability is a complex
indicator of a probabilistic nature. It includes the probability of hazardous event formation and
the probabilities of emergency and death in different anthropogenic events:</p>
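      <p><disp-formula><tex-math>V = \{P_{h};\ P_{d};\ P_{e}\},</tex-math></disp-formula>
where <italic>V</italic> is the territory vulnerability; <italic>P<sub>h</sub></italic> is the probability of hazardous event formation; <italic>P<sub>d</sub></italic> is the
probability of death during a hazardous event; <italic>P<sub>e</sub></italic> is the probability of an emergency.</p>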
      <p>The next stage is to determine the distance between objects. The most common
measures of the distance between two points in the coordinate space
are the Euclidean, Chebyshev and city-block (Manhattan) distances. To check the accuracy of the
division into clusters, different distances should be used; if the division is adequate, the hierarchical trees will have
a similar form.</p>
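      <p>For reference, the three distance measures named above can be written as a short sketch using their standard textbook definitions. The function names and the sample points are illustrative and are not taken from the study.</p>
      <preformat preformat-type="code">
```python
# Standard definitions of the three distance measures named in the text.

def euclidean(a, b):
    """Square root of the sum of squared coordinate differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    """City-block distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


# Illustrative points, not data from the study.
p, q = (0.0, 0.0), (3.0, 4.0)
print(euclidean(p, q), manhattan(p, q), chebyshev(p, q))  # 5.0 7.0 4.0
```
      </preformat>
      <p>The three measures weight coordinate differences differently, so a division into clusters that keeps its shape under all of them is less likely to be an artifact of the chosen metric.</p>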
      <p>At the 5th stage, the clustering method (the way the distance between
clusters is calculated) is chosen. Ward’s method is best suited for objects that have a “blurred” structure with vague
concentrations. As a result, small and very compact clusters are formed. This method
differs from the others in that it uses analysis of variance to estimate the distance
between clusters [3].</p>
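      <p>The variance-based idea behind Ward’s method can be sketched as follows. In one common formulation, the distance between two clusters is the increase in the total within-cluster sum of squares that their merger would cause. The helper names and the sample points are hypothetical, and statistical packages may scale this quantity differently.</p>
      <preformat preformat-type="code">
```python
# Sketch of Ward's merging criterion: the "distance" between two clusters
# is taken as the increase in within-cluster sum of squares after a merge.

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(c) / n for c in zip(*points)]


def within_ss(points):
    """Sum of squared deviations of the points from their centroid."""
    c = centroid(points)
    return sum((x - m) ** 2 for p in points for x, m in zip(p, c))


def ward_distance(a, b):
    """Increase in within-cluster sum of squares if a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    gap = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) * gap / (len(a) + len(b))


# Illustrative clusters, not data from the study.
a = [(0.0, 0.0), (2.0, 0.0)]
b = [(10.0, 0.0)]

# The closed form agrees with the directly computed increase.
direct = within_ss(a + b) - within_ss(a) - within_ss(b)
print(ward_distance(a, b), direct)
```
      </preformat>
      <p>At each step of the agglomeration, the pair of clusters with the smallest such increase is merged, which is why the resulting clusters tend to be compact.</p>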
      <p>
        The next stage is to determine the number of clusters. There are various methods
for estimating the number of clusters in a hierarchical tree; however, no universal
method exists [
        <xref ref-type="bibr" rid="ref2">6</xref>
        ]. In this work, the k-means method is used to determine the number of
clusters. It makes it possible to set the number of clusters and, starting from two, to gradually
check the adequacy of the division of the hierarchical tree [3].
      </p>
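      <p>The step-by-step check of the number of clusters can be sketched as below. This is a minimal one-dimensional k-means (Lloyd’s algorithm) with a deterministic start, run for increasing k on invented values. The within-cluster sum of squares is used here as the adequacy measure, which is an assumption; the text does not specify the criterion applied in the study.</p>
      <preformat preformat-type="code">
```python
# Minimal one-dimensional k-means, run for k = 2, 3, ... to compare splits.

def assign(data, centers):
    """Put every value into the group of its nearest center."""
    groups = [[] for _ in centers]
    for v in data:
        nearest = min(range(len(centers)), key=lambda j: abs(v - centers[j]))
        groups[nearest].append(v)
    return groups


def kmeans_1d(values, k, iterations=100):
    """Return (centers, groups, within-cluster sum of squares)."""
    data = sorted(values)
    # Deterministic start: k values spread evenly over the sorted data.
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = assign(data, centers)
        updated = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    groups = assign(data, centers)
    wss = sum((v - c) ** 2 for g, c in zip(groups, centers) for v in g)
    return centers, groups, wss


# Invented sample values, not data from the study.
sample = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8, 9.9, 10.2, 10.0]

for k in range(2, 6):
    centers, groups, wss = kmeans_1d(sample, k)
    print(k, [len(g) for g in groups], round(wss, 3))
```
      </preformat>
      <p>For this sample the sum of squares drops sharply between k = 2 and k = 3 and changes little afterwards, which would point to three groups. The same kind of comparison can be made against the branches of the hierarchical tree.</p>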
      <p>The 7th stage is a quantitative assessment of the individual
anthropogenic risk under study. The acceptable level of risk is determined at the last stage.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Assessment of individual anthropogenic risk for the territories of the Siberian Federal District</title>
      <p>The SFD territories should be analyzed separately for the different kinds of
administrative-territorial units: cities (with a population of more than 70 000 people), towns (with a population of less than
70 000 people) and municipal districts. The anthropogenic load in cities differs from that in districts.</p>
      <p>The SFD has 31 cities with a population of more than 70 000 people; three of them
have one million inhabitants or more. Omsk, Novosibirsk and Krasnoyarsk are the most vulnerable
cities because of their significant population density and developed industry and infrastructure. There are also 46 towns
with a population of less than 70 000 and 268
municipal districts in the Siberian territory. Vulnerability values have been obtained for these territories as well. The
highest event frequency is found in the Irkutsk, Novosibirsk and Emelyanova districts.</p>
      <p>Using this method and the STATISTICA software, the sources and factors of
technogenic accidents are reviewed for SFD cities in order to assess the level of anthropogenic
hazard. The 31 cities with a population of more than 70 000 people are analyzed. Figure 2
shows the tree diagrams built with different distances (Euclidean,
Chebyshev, Manhattan). The tree diagrams have the same type of distribution, so we
can conclude that the division into clusters is adequate. To determine the number of
clusters, the k-means method is used, which makes it possible to distinguish five homogeneous groups.</p>
      <p>For each cluster, different kinds of hazardous anthropogenic events are analyzed (for big cities,
17 kinds of accidents are tracked); the mean values are given in Table 1. Every
cluster has its own dominant negative factors. The largest number of anthropogenic
accidents is observed in cluster V, which is characterized by high population density and developed industry.</p>
      <p>As a result, the values of individual risk and the risk intervals for every city have been obtained.
The 3rd group (cluster III) has the lowest value. This cluster is taken as the reference,
and its risk interval is taken as the acceptable level. Thus, there are three risk zones for big
SFD cities (Figure 3):
— acceptable level: risk ⩽ 1.07 · 10<sup>−5</sup>;
— elevated level: risk ∈ (1.07 · 10<sup>−5</sup>; 1.85 · 10<sup>−5</sup>];
— high level: risk &gt; 2.85 · 10<sup>−5</sup>.</p>
      <p>The acceptable risk zone contains 11 cities, all of which belong to cluster III. The elevated risk zone contains 13 cities
(clusters I and II). The high risk zone contains 6 cities from clusters IV and V
and one city each from clusters I and II (Mezhdurechinsk and Biisk).</p>
      <p>In a similar way, the values of individual anthropogenic risk have been obtained for towns with a population
of less than 70 000 people:
— acceptable level: risk ⩽ 3.48 · 10<sup>−6</sup>;
— elevated level: risk ∈ (3.48 · 10<sup>−6</sup>; 4.3 · 10<sup>−6</sup>];
— high level: risk &gt; 4.3 · 10<sup>−6</sup>.</p>
      <p>For municipal districts, the method makes it possible to divide the 268 districts into five homogeneous groups,
to analyze each group individually and to identify a reference cluster and its risk level. Risk values for
municipal districts range from 0 to 6.3 · 10<sup>−6</sup>. The following risk levels are obtained:
— acceptable level: risk ⩽ 3.85 · 10<sup>−6</sup>;
— elevated level: risk ∈ (3.85 · 10<sup>−6</sup>; 6.3 · 10<sup>−6</sup>];
— high level: risk &gt; 6.3 · 10<sup>−6</sup>.</p>
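      <p>Assigning a territorial unit to one of the three zones reduces to a lookup against the interval bounds. The sketch below uses the bounds quoted above for towns and municipal districts; the function name, the dictionary layout and the short zone labels are illustrative assumptions.</p>
      <preformat preformat-type="code">
```python
from bisect import bisect_left

# Short labels for the three risk levels listed in the text.
ZONES = ("acceptable", "elevated", "high")

# Bounds quoted in the text: (upper bound of the acceptable zone,
# upper bound of the middle zone) for each type of territorial unit.
BOUNDS = {
    "town": (3.48e-6, 4.3e-6),
    "municipal district": (3.85e-6, 6.3e-6),
}


def risk_zone(risk, unit_type):
    """Return the zone label for an individual risk value.

    bisect_left places a value equal to a bound into the lower zone,
    which matches the half-open intervals (a; b] used in the text.
    """
    return ZONES[bisect_left(BOUNDS[unit_type], risk)]


# Hypothetical risk values, not data from the study.
print(risk_zone(3.0e-6, "town"))                # acceptable
print(risk_zone(4.0e-6, "town"))                # elevated
print(risk_zone(5.0e-6, "town"))                # high
print(risk_zone(5.0e-6, "municipal district"))  # elevated
```
      </preformat>
      <p>The same risk value can fall into different zones depending on the type of territorial unit, which is the practical consequence of defining the acceptable level separately for each group.</p>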
      <p>The analysis of the technogenic safety of the Siberian territories showed that 74
territorial units are in the high risk zone, where measures to prevent and minimize
risk are necessary.</p>
    </sec>
    <sec id="sec-5">
      <title>4. Conclusion</title>
      <p>Sustainable development depends on the analysis, assessment and minimization of
anthropogenic risk. The assessment and analysis of territorial technogenic risk are important tools for
improving regional policy, strategy and tactics, which make it possible to minimize the consequences
for the targeted territory. As the man-made burden grows, threatening the restoration of
natural resources and increasing the risks to the life and health of the population, high-quality management
based on approaches to anthropogenic risk assessment is necessary.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work was carried out with the financial support of the Krasnoyarsk regional fund of Science
and Technology support within the framework of the project No. 2020061506473.</p>
      <p>[1] MR 2-4-71-40 methodical recommendations “The Order of development, verification,
evaluation and correction of electronic passports of Territories (objects): UTV”. Ministry of the
Russian Federation on Civil Defense, emergency situations and liquidation of consequences
of natural disasters from 15.07.2016. Available at: http://docs.cntd.ru/document/456080084.</p>
      <p>[2] Standard 22.10.02-2016 Safety in emergency situations. Emergency risk management.
Permissible risk of emergency situations. Approved by the order of Rosstandart No.
724art. of 29.06.2016.</p>
      <p>[3] Taseiko O., Ivanova U., Rihter E., Pitt A. Using multivariate statistics to solve risk
assessment problems for forest ecosystems // International Multidisciplinary Scientific
GeoConference Surveying Geology and Mining Ecology Management (SGEM). 2020-August (3.1).
P. 777–784.</p>
      <p>[4] Tromelin A., Chabanet C., Audouze K., Koensgen F. Multivariate statistical analysis of a
large odorants database aimed at revealing similarities and links between odorants and
odors // Flavour Fragr. J. 2017. P. 1–21.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Shan</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            <given-names>S.F.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qian</surname>
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guo</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ding</surname>
            <given-names>A</given-names>
          </string-name>
          .
          <article-title>Chemical fingerprint and quantitative analysis for the quality evaluation of Platycladi cacumen by ultra-performance liquid chromatography coupled with hierarchical cluster analysis</article-title>
          //
          <source>Journal of Chromatographic Science</source>
          .
          <year>2018</year>
          . Vol.
          <volume>56</volume>
          . No. 1. P.
          <fpage>41</fpage>
          -
          <lpage>48</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Yatskiv</surname>
            <given-names>I.</given-names>
          </string-name>
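          ,
          <string-name>
            <surname>Gusarova</surname>
            <given-names>L.</given-names>
          </string-name>
          <article-title>Methods for determining the number of clusters by classifying without training</article-title>
          //
          <source>Transport and Telecommunication</source>
          .
          <year>2003</year>
          . Vol.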
          <volume>4</volume>
          . No. 1. P.
          <fpage>23</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>