Modeling Risks on Licensed Markets:
    on the Example of the Russian Alcohol Market

                  Olga M. Pisareva (1) and Anna I. Denisova (2)

                    State University of Management, Moscow, Russia
               (1) o.m.pisareva@gmail.com, (2) a.i.denisova@inbox.ru


      Abstract. The methodology of mathematical and computer modeling
      of risks of licensed commodity markets of the Russian Federation on the
      example of the alcohol market is presented in the article. The illustra-
      tive example shows the possibility of assessing the probability of a risky
      event using the logistic regression model. The methodology for the as-
      sessment of the expected damage cost from non-payment of taxes and
      fees on illegally circulating products to the budget is considered. A sim-
      ulation approach to risk modeling and its implementation in relation to
      the alcohol market is presented.

      Keywords: decision making, mathematical modeling, risk modeling, lo-
      gistic regression, risk prediction, alcohol market


1    Introduction
Markets associated with the production and sale of potentially hazardous
products (for example, the tobacco, alcohol and pharmaceutical markets) have
traditionally been the object of close attention and concern on the part of
state control bodies. Negative events in these markets can damage the national
economy, the life and health of people, the environment, etc. Therefore, it is
important to assess the risks of such events in a timely manner in order to
prevent or reduce their consequences.
     One of the forms of state regulation of such “potentially dangerous”
markets is licensing. In the Russian Federation, this is established by Federal
Law No. 99-FZ of 04.05.2011 “On licensing of certain types of activities”. The
work of the state control bodies in such markets involves the use of the
so-called risk-based management model, owing to the need to identify in a
timely manner the possible emergence of a risk source and to assess its
possible negative consequences.
     The problems of identifying, assessing and managing risks have been the sub-
ject of a large number of studies by such well-known scientists as Knight F. [5],
Senchagov V. [12], Purdy G. [11], Kachalov R. [4] and others. The methodol-
ogy of risk assessment and its consequences on the basis of mathematical and
computer modeling methods is considered in the works of such researchers as
Ayvazyan S. [2, 3], Varshavskiy A. [13], Makarov V. and Bakhtizin A. [7], Mkhi-
taryan V. [3,8], etc. We used the general methodological approaches presented in
the works of these scientists to model risks in licensed commodity markets [10].
We present the experience of constructing a model for assessing the risk of a
shortfall in consolidated state budget revenue from taxes and other charges
that should have been paid on illegally circulating products, using the alcohol
market as an example. The calculations are based on open data from the Federal
Service of State Statistics of the Russian Federation (www.gks.ru), the Treasury
of the Russian Federation (www.roskazna.ru) and the Federal Service for Alcohol
Market Regulation (www.fsrar.ru, AMR).
    In accordance with the ISO 31000 international standard [1], it is customary
to define risk (R) as a combination of the likelihood of occurrence of a risk
event (P) and its associated consequences (C):

                                  R=P ×C                                       (1)

Accordingly, we present two stages of the model implementation: modeling the
probability of a risk event and estimating the amount of damage from its
occurrence, as outlined in Fig. 1.


     Fig. 1. The scheme of the risk assessment on licensed commodity markets


2   Description of the risk probability modeling

By “risky events” (risk-events) we mean, first of all, recorded violations of
the current regulatory legal acts (RLA). Risk-indicators (risk-factors, signs of
risk) are facts that do not in themselves indicate that a risk-event has
occurred, but whose presence or absence, especially in combination, makes it
possible to judge how likely a risk-event is.
    Suppose we can establish a relationship between the risky event and the set
of accompanying risk-indicators in the form of a mathematical model that makes
it possible to measure the risk probability from the observed risk signs.
    Let the sample be represented by the matrix of independent variables
$X_{n \times m}$ and the result vector $Y_{n \times 1}$, where n is the sample
size and m is the number of independent variables. Let Y be a binary variable:
the value “1” means that the event occurred, and the value “0” means that it
did not. The values of risk-indicators (independent variables) can be either
discrete or continuous.
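    Then the source data can be represented as follows:

$$ Y = (y_i), \qquad y_i = \begin{cases} 1, & \text{if the event occurred for the } i\text{-th object}, \\ 0, & \text{otherwise}, \end{cases} $$

and $X = (x_{ij})$ is the matrix of risk-factors, where $x_{ij}$ is the value of
the j-th indicator for the i-th object, i = 1, ..., n; j = 1, ..., m.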
   In general, the task is to estimate the probability of the state of the system
on the basis of a set of external factors:

                                  P = F (X, Y ).                                (2)

Here P is the estimated probability of the event; F(X, Y) is the estimating
function; X is the matrix of risk characteristics and Y is the vector of events.
    The next necessary step is to collect statistical information about the
occurrence of risk events and the impact of the accompanying indicators on them,
and to select the dominant indicators (factors) among them.
    Taking into account the nature of the initial data and the experience of pre-
vious studies [2, 3, 7–9, 13], we can designate the following groups of methods
used in risk modeling: artificial intelligence methods (neural modeling, logical-
probabilistic method, etc.), methods of operations research (linear programming,
nonlinear optimization, etc.), econometric modeling and machine learning meth-
ods (decision tree, regression of binary choice, analysis of discriminant functions,
etc.).
    In practice, the binary logistic regression method has proved to be the most
satisfactory in terms of accuracy and ease of implementation. Here the
probability of the event (Y = 1) can be estimated as:

$$ P = F(z) = \frac{e^{z}}{1 + e^{z}}, \tag{3} $$

where $z = \beta_0 + \beta_1 x_1 + \dots + \beta_m x_m$; $\beta_0, \dots, \beta_m$
are the regression parameters; $x_1, \dots, x_m$ are the indicators of the
risk-event (the cut-off value is not less than 0.75).
    The coefficients of the model (3) are estimated by the maximum likelihood
method (MLE). A further advantage of logistic regression estimated by MLE is
that it does not require the residuals to be normally distributed. The scheme
for constructing the logistic regression is shown in Fig. 2. As an illustrative
example of risk factors, 8 indicators of unlawful acts are listed in Table 1.
The adequacy of the model results to actual observations was verified on data
for 2016.


 Fig. 2. The relationship of the stages and methods of logistic regression evaluation


        Table 1. Example of a list of independent variables (risk-indicators)

No   Variable   Interpretation
 1   x1         Returns to the supplier of large volumes of products (including
                repeatedly)
 2   x2         The names and brands of products that were noticed in illegal
                traffic earlier
 3   x3         Absence of GPS data on transportation of products
 4   x4         Excess of volumes of wholesale and retail turnover over its
                production
 5   x5         The organization being inspected had a violation earlier
 6   x6         The discrepancy between the volumes of purchased and used raw
                materials with the volumes of products produced
 7   x7         The presence of visual signs of non-compliance of products with
                state standards
 8   x8         Visually determined signs of forgery of documents


    To form a representative, reliable and statistically stable sample for
building a model, it is necessary to know the ratio of the number of “good” to
the number of “bad” outcomes in the general population. The minimum allowable
sample size (n), given the known ratio of realized to unrealized events, that
is, the share of “bad” outcomes, was found from the relation [3]:

$$ n = \frac{z_\gamma^{2} \times q \times (1 - q)}{\alpha^{2}}. \tag{4} $$

Here q is the share of “bad” outcomes, α is the maximum acceptable error of the
share estimate, and $z_\gamma$ is the quantile of the standard normal
distribution for the reliability level γ.
    According to AMR statistics, the share of inspections in which unlawful acts
were established was 86%, that is q = 0.86. With γ = 0.95, $z_\gamma$ = 1.96 and
α = 0.05 (here and below), formula (4) gives
$n = \frac{1.96^{2} \times 0.86 \times (1 - 0.86)}{0.05^{2}} = 185.01$, so the
minimum allowable sample size is 186.
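    The arithmetic behind formula (4) can be checked with a few lines of code.
The sketch below is illustrative only: the function name `min_sample_size` is
not from the paper, and rounding up to the next integer is assumed because a
sample size must be a whole number.

```python
import math


def min_sample_size(q: float, alpha: float, z_gamma: float) -> int:
    """Smallest whole sample size that satisfies formula (4)."""
    n_exact = z_gamma ** 2 * q * (1.0 - q) / alpha ** 2
    # Round up: a fractional observation is not possible.
    return math.ceil(n_exact)


# Values used in the text: q = 0.86, alpha = 0.05, z_gamma = 1.96.
n = min_sample_size(q=0.86, alpha=0.05, z_gamma=1.96)
print(n)  # 186
```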
    The set of the most important risk indicators was chosen in two stages.
First, experts reduced the number of risk factors from 252 to 8 using the
Delphi procedure, of which three rounds were conducted. A group of regional
representatives of the Federal Service for Alcohol Market Regulation, all with
the same official powers, took part in this procedure. Second, we additionally
evaluated the relationship between the factors and the result variable.
    Since estimating the relationship between the non-numerical variables did
not give clear results, it was decided to consider different combinations of
them. With a large number of factors the task becomes computationally complex,
so a software algorithm was implemented in the R programming language. It
produced a list of models with significant coefficients by automatically
re-estimating various combinations of the independent variables X.
    For each generated combination, a binary logistic model was constructed and
the significance of its coefficients was estimated. Insignificant coefficients
were eliminated one after another. Then, in the next iteration of the cycle, a
new set of factors was generated. As a result, the program formed a list of 5
models containing more than three significant coefficients. The number of
iterations is $k = C_8^8 + C_8^7 + C_8^6 + C_8^5 = 1 + 8 + 28 + 56 = 93$. We
restricted the search to combinations of at least five factors in order to
avoid unnecessary recalculation. This approach is quite flexible, does not
restrict the specification of the model and makes use of the possibilities of
computer modeling.
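    The enumeration step can be illustrated as follows. This is a sketch in
Python rather than the authors' R program, which is not reproduced in the
paper; the helper `candidate_sets` is a hypothetical name, and the model
fitting that would follow for each subset is deliberately left out.

```python
from itertools import combinations

# The eight risk indicators x1..x8 from Table 1.
indicators = [f"x{j}" for j in range(1, 9)]


def candidate_sets(names, min_size):
    """Yield every subset of `names` with at least `min_size` elements,
    starting from the full set and moving to smaller subsets."""
    for size in range(len(names), min_size - 1, -1):
        yield from combinations(names, size)


subsets = list(candidate_sets(indicators, min_size=5))
print(len(subsets))  # 93
```

Each subset would then be passed to a logistic regression routine, and
insignificant coefficients dropped, as described above.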
    Based on a comparison of their quality characteristics, a choice was made in
favor of the model:
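
$$ P\{y = 1 \mid x\} = \frac{e^{1.19x_1 + 1.41x_4 + 1.42x_5 + 1.36x_6}}{1 + e^{1.19x_1 + 1.41x_4 + 1.42x_5 + 1.36x_6}}, \tag{5} $$

where

$$ z = \underset{(0.012)}{1.19}\,x_1 + \underset{(0.003)}{1.41}\,x_4 + \underset{(0.002)}{1.42}\,x_5 + \underset{(0.004)}{1.36}\,x_6 $$

and the coefficients are significant (the values in parentheses are those
reported beneath the corresponding coefficients).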
    The model (5) has the minimum value of the Akaike information criterion
(AIC = 139.3) and the maximum value of the Nagelkerke coefficient of
determination ($R^2_{Ng}$ = 0.31). For the Hosmer–Lemeshow test, the critical
value exceeds the statistic,
$[\chi^2(\alpha, n + 1) = \chi^2(0.05, 5) = 11.07] > [HL = 4.592]$, and
p(HL) = 0.71 is high enough to indicate that the model values agree with the
observed ones, which means that the model (5) is well calibrated. The area
under the ROC curve (Receiver Operating Characteristic curve, shown in Fig. 3)
is 0.82, so the model separates the two outcomes rather well. Testing the
residuals for normality of the distribution by the likelihood ratio criterion
and for heteroscedasticity by the Breusch–Pagan test did not reveal anomalies
that would preclude the use of the model (5). In general, the results obtained
are typical for models of this type. The model has a high level of quality and
is suitable for solving practical problems.
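    To show how model (5) is applied to a single inspected object, the sketch
below evaluates the probability for a given set of binary indicators and
compares it with the 0.75 cut-off mentioned for model (3). The names `COEFFS`
and `risk_probability` are illustrative and not taken from the paper.

```python
import math

# Coefficients of model (5); the model has no intercept, and indicators
# that are not listed (x2, x3, x7, x8) do not enter it.
COEFFS = {"x1": 1.19, "x4": 1.41, "x5": 1.42, "x6": 1.36}


def risk_probability(indicators: dict) -> float:
    """P{y = 1 | x} from model (5) for 0/1 indicator values."""
    z = sum(COEFFS[name] * indicators.get(name, 0) for name in COEFFS)
    return math.exp(z) / (1.0 + math.exp(z))


# Large returns to the supplier (x1) and an earlier violation (x5).
p = risk_probability({"x1": 1, "x5": 1})
print(round(p, 4), p >= 0.75)  # 0.9315 True
```

The value agrees with the first row of Table 2 (P = 0.931502).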


                                 Fig. 3. ROC curve of model (5)


    Further, the marginal effects of the risk factors were evaluated. These
effects show the change in the probability when a factor changes by one unit
and signal a change in the uncertainty of the binary choice situation. In the
logistic model, a small change $\Delta x_k$ of the k-th independent variable
leads to the following change in the probability P{y = 1 | x}:

$$ \Delta P\{y = 1 \mid x\} \cong \frac{\partial P\{y = 1 \mid x\}}{\partial x_k}\,\Delta x_k = \frac{\partial F(x^{T}\beta)}{\partial x_k}\,\Delta x_k = \frac{\beta_k e^{z}}{(1 + e^{z})^{2}}\,\Delta x_k. $$

Here F(·) is the logistic model (3), $z = \beta_0 + \beta_1 x_1 + \dots + \beta_m x_m$,
$x_1, \dots, x_m$ are the values of the m independent variables, and
$\beta_0, \beta_1, \dots, \beta_m$ are the regression coefficients [9].
    If

$$ G = \frac{e^{z}}{(1 + e^{z})^{2}} = \frac{e^{1.19x_1 + 1.41x_4 + 1.42x_5 + 1.36x_6}}{\left(1 + e^{1.19x_1 + 1.41x_4 + 1.42x_5 + 1.36x_6}\right)^{2}}, $$

then, when a new indicator is identified, the probability changes by
$\Delta P\{y = 1 \mid x\} = 1.19 \times G \times \Delta x_1$;
$\Delta P\{y = 1 \mid x\} = 1.41 \times G \times \Delta x_4$;
$\Delta P\{y = 1 \mid x\} = 1.42 \times G \times \Delta x_5$;
$\Delta P\{y = 1 \mid x\} = 1.36 \times G \times \Delta x_6$.
    The factors x5 and x4 have the greatest, and approximately equal, effect on
the result variable; they are interpreted as “The organization being inspected
had a violation earlier” and “Excess of volumes of wholesale and retail turnover
over its production”. Variable x1 has the smallest effect.
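    The marginal effects can be computed numerically for any starting point. In
the sketch below the evaluation point (all four indicators absent, so z = 0 and
G = 0.25) is an assumption made only for illustration, and `marginal_effects`
is a hypothetical helper name. Since the indicators are binary, the linear
formula is only an approximation of the actual change in probability.

```python
import math

COEFFS = {"x1": 1.19, "x4": 1.41, "x5": 1.42, "x6": 1.36}  # model (5)


def marginal_effects(indicators: dict) -> dict:
    """Return beta_k * G for every factor, with G = e^z / (1 + e^z)^2."""
    z = sum(COEFFS[k] * indicators.get(k, 0) for k in COEFFS)
    g = math.exp(z) / (1.0 + math.exp(z)) ** 2
    return {k: beta * g for k, beta in COEFFS.items()}


# Assumed evaluation point: no indicator present, hence z = 0 and G = 0.25.
effects = marginal_effects({})
for name, value in sorted(effects.items(), key=lambda kv: -kv[1]):
    print(name, round(value, 4))
```

Because G is common to all factors at a given point, the ranking of the
effects always follows the ranking of the coefficients: x5, x4, x6, x1.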


3   An Assessment of the cost of damage and modeling of
    risks
As a characteristic of the direct economic damage (damage that directly affects
the economy), we take an estimate of the amount of taxes and charges not paid
to the state budget because of illegally circulating products in the licensed
market. A similar approach to assessing risk as underfunding was mentioned in
[6]. Note that rubles are converted to dollars at the weighted average rate for
2016: 1 dollar was equal to 67.04 rubles.
    The total damage from non-payment of excise taxes and VAT for the year
(C, dollars) is then equal to:

                          C = 0.18 × S0 × Q + S0 × A0 ,                          (6)

where $S_0$ is the annual volume of unaccounted products in the total sales
volume (decaliters, dl); $A_0$ is the averaged excise rate calculated on the
basis of the official rates specified in the Tax Code for all types of market
products:

$$ A_0 = \sum_{j=1}^{l} k_j a_j g_j, \tag{7} $$

$a_j$ is the excise rate for the j-th type of products; l is the number of
types of products; $g_j$ is the object of taxation (for example, excise on
alcohol is paid on the amount of anhydrous ethyl alcohol in the product); $k_j$
is the market share of the j-th type of product; Q is the average price per
unit of sold products (dollars):

$$ Q = \sum_{j=1}^{l} k_j q_j, $$

$q_j$ is the price of the j-th type of products (dollars per liter);

$$ S_0 = \frac{\Delta A \times S^{2}}{\Delta A \times S + B}, \tag{8} $$

where the total volume of products S (dl) is the sum of the volume of products
on which taxes are paid ($S_1$) and the volume of unaccounted products ($S_0$):
$S = S_1 + S_0$; B is the amount of excise taxes paid to the budget for the
year (dollars). In addition, we can note that

$$ A_0 = \frac{B}{S_1}. $$

    Harmonizing the units of measurement, we get the difference between $A_0$
and the averaged “actual” excise rate for all types of market products:

$$ \Delta A = \frac{B}{S_1} - \frac{B}{S_1 + S_0}. $$
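    The step from these definitions to formula (8) is not spelled out in the
text. One way to recover it, using only $S = S_1 + S_0$ and the expression for
$\Delta A$, is the following short derivation (a reconstruction, not the
authors' own working):

```latex
\begin{align*}
\Delta A &= \frac{B}{S_1} - \frac{B}{S}
          = \frac{B\,(S - S_1)}{S_1\,S}
          = \frac{B\,S_0}{(S - S_0)\,S}, \\
\Delta A \, S\,(S - S_0) &= B\,S_0, \\
\Delta A \, S^{2} &= S_0\,(\Delta A \, S + B), \\
S_0 &= \frac{\Delta A \, S^{2}}{\Delta A \, S + B}.
\end{align*}
```

Since $\Delta A = A_0 - B/S$, substituting it into the last line gives the
equivalent form $S_0 = S - B/A_0$, which can serve as a quick consistency check
of a computed value of $S_0$.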
Dividing C by the number of risk-events (N) detected in the year, we get the
average annual amount of damage from one risky event:

$$ \bar{C} = \frac{C}{N}. $$

However, one cannot expect the damage to be the same for every risk event, nor
should C̄ be expected to be the maximum value of the damage. Suppose that C̄ is
the value corresponding to the middle of the interval over which the amount of
damage varies randomly from event to event. Hence, we can suppose that 2 × C̄
is the maximum value of probable damage.
    Obviously, the damage cannot be negative, and a value of 0 means the
absence of damage. So, for example, let C be a random variable distributed
uniformly on the interval [0; 2 × C̄]: C = R[0; 1] × 2 × C̄. Here R[0; 1] is a
random quantity uniformly distributed on the interval [0; 1]; it is used to
generate pseudo-random numbers in the process of computer simulation. We also
take 2 × C̄ as the upper bound in order to take into account possible
variations in the amount of damage not related to the budget risk.
    Then, as a method for predicting the magnitude of the risk, we use computer
simulation based on the Monte–Carlo method: we perform a large number of
simulations ($N_{MC}$) for the values of the random variable. As a result, we
obtain not a single value of the magnitude of the risk, but its probability
distribution.
    Finally, we estimate the risk value in accordance with (1) and taking into
account the described methodology:

$$ R = \frac{e^{z}}{1 + e^{z}} \times C. \tag{9} $$

In addition, based on the obtained values, it is possible to calculate some char-
acteristics: the expected value of risk

$$ \bar{R} = \frac{\sum_{i=1}^{N_{MC}} R_i}{N_{MC}}; $$

the variance and standard deviation of risk

$$ S^{2} = \frac{\sum_{i=1}^{N_{MC}} (R_i - \bar{R})^{2}}{N_{MC} - 1}, \qquad s = \sqrt{S^{2}}; $$

the coefficient of risk variation

$$ c_v = \frac{s}{\bar{R}}; $$

here $N_{MC}$ is the number of simulated Monte–Carlo runs and $R_i$ is the risk
value in the i-th run.


4   Results of computer modeling of the risk value

Let us illustrate the proposed methodology for risk modeling on licensed
commodity markets using the example of non-payment of taxes and fees on
illegally circulating products in the Russian alcohol market.
    According to the Federal State Statistics Service, the volume of alcohol
sales (S) in 2016 was 974.5 million decaliters (of which 106.9 million dl of
products containing more than 9% alcohol; 8.8 million dl with content less than
9%; 51.9 million dl of wine, excluding sparkling wines; 22.0 million dl of
sparkling wines; 780.6 million dl of beer; 4.3 million dl of other products).
The corresponding shares of the retail sales market by kind of alcohol product
are: 1) containing more than 9% alcohol – 0.11; 2) with content less than 9% –
0.009; 3) wine, excluding sparkling wines – 0.053; 4) sparkling wines – 0.023;
5) beer – 0.801; 6) other (in particular, various fruit wines) – 0.004.
    The Tax Code (in the version that came into force in 2016) specifies the
following excise rates: 1) 500 rubles per liter of anhydrous ethyl alcohol;
2) 400 rubles per liter of anhydrous ethyl alcohol; 3) 9 rubles per liter of
products; 4) 26 rubles per liter; 5) 20 rubles per liter; 6) 9 rubles per liter.
    Then the average value of the excise rate, based on the official rates
approved in the Tax Code and taking into account the market conditions,
calculated by formula (7), is A0 = 0.63 dollars per liter. The amount of budget
revenues from excise duties on alcoholic products is 5.2 billion dollars.
    Proceeding from (8), it is possible to estimate the volume of illegal
products in the total volume of goods sold: S0 = 1447.5 million liters. So the
damage from non-payment of excise taxes is S0 × A0 = 0.91 billion dollars per
year, and the share of illegal products in the volume sold is
1447.5/9745 = 0.149.
    According to calculations, the weighted average price (Q) of alcohol
products, calculated from data on average prices for different types of
alcoholic beverages (data of the Federal State Statistics Service), amounted to
3.15 dollars per liter in 2016.
    Then the damage from non-payment of VAT (18%) is 0.18 × S0 × Q = 0.82
billion dollars. Based on (6), we can estimate the total damage in 2016:
C = 0.18 × S0 × Q + S0 × A0 = 1.73 billion dollars.
    The rounded average number of violations found on the alcohol market
(related to non-compliance with tax and customs legislation, the sale of
products with counterfeit excise or special marks or without marks, violation
of license conditions and illegal manufacture at legal enterprises) was
N = 22,200 over the last 5 years (according to the Federal Service of State
Statistics). So, taking N as the number of risky events, we can estimate the
average “cost” of one risk event: C̄ = 1.73 billion / 22,200 = 0.078 million
dollars.
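    The chain of calculations from formula (6) to the average damage per event
can be reproduced directly from the figures quoted above. The variable names
in this sketch are illustrative; the numeric inputs are the rounded values
given in the text, so the results match the reported ones only to the stated
precision.

```python
S0_LITERS = 1447.5e6   # volume of illegal products, liters
A0 = 0.63              # averaged excise rate, dollars per liter
Q = 3.15               # weighted average price, dollars per liter
VAT_RATE = 0.18        # VAT rate used in formula (6)
N_EVENTS = 22_200      # average number of violations found

excise_damage = S0_LITERS * A0                    # unpaid excise taxes
vat_damage = VAT_RATE * S0_LITERS * Q             # unpaid VAT
total_damage = vat_damage + excise_damage         # formula (6)
mean_damage_per_event = total_damage / N_EVENTS   # average damage per event

print(f"excise: {excise_damage / 1e9:.2f} bn, VAT: {vat_damage / 1e9:.2f} bn")
print(f"total: {total_damage / 1e9:.2f} bn")
print(f"per event: {mean_damage_per_event / 1e6:.3f} mln")
```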
    Finally, using formula (9), we obtain a model for estimating the magnitude
of the risk:

$$ R = P \times C = \frac{e^{z}}{1 + e^{z}} \times C = \frac{e^{1.19x_1 + 1.41x_4 + 1.42x_5 + 1.36x_6}}{1 + e^{1.19x_1 + 1.41x_4 + 1.42x_5 + 1.36x_6}} \times R[0; 1] \times 2 \times 0.078. $$

Then a simulation model for estimating the value of the risk was implemented
using the Monte–Carlo method. 22,200 runs were conducted, in accordance with
the average number of violations in the alcohol market over the last 5 years.
A fragment of the results is shown in Table 2.
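    A simulation of this kind can be sketched as follows. The paper does not
state how the indicator values were generated in each run, so the sketch
assumes that every indicator is an independent 0/1 draw with probability 0.5;
this assumption, the seed and the function name `simulate` are illustrative
only, and the printed figures will therefore not reproduce the reported ones
exactly.

```python
import math
import random
import statistics

COEFFS = (1.19, 1.41, 1.42, 1.36)   # model (5): x1, x4, x5, x6
MEAN_DAMAGE = 0.078                 # average damage per event, million dollars
N_RUNS = 22_200                     # number of Monte-Carlo runs


def simulate(n_runs: int, indicator_prob: float = 0.5, seed: int = 1):
    """Return (probabilities, risks) for n_runs simulated events.

    ASSUMPTION: each indicator is an independent Bernoulli(indicator_prob)
    draw; the source does not describe how indicators were generated.
    """
    rng = random.Random(seed)
    probs, risks = [], []
    for _ in range(n_runs):
        x = [1 if rng.random() < indicator_prob else 0 for _ in COEFFS]
        z = sum(b * xi for b, xi in zip(COEFFS, x))
        p = math.exp(z) / (1.0 + math.exp(z))
        damage = rng.random() * 2.0 * MEAN_DAMAGE   # uniform on [0, 2 * mean]
        probs.append(p)
        risks.append(p * damage)                    # formula (9)
    return probs, risks


probs, risks = simulate(N_RUNS)
mean_risk = statistics.fmean(risks)
std_risk = statistics.stdev(risks)                  # divides by N_MC - 1
cv = std_risk / mean_risk
share_high = sum(p > 0.8 for p in probs) / N_RUNS
print(f"max={max(risks):.4f} mean={mean_risk:.4f} cv={cv:.2f}")
print(f"share of runs with P > 0.8: {share_high:.2f}")
```

Under the stated assumption, 13 of the 16 possible indicator combinations give
P > 0.8 (all except no indicators, x1 alone and x6 alone), so the share of such
runs should be close to 13/16 = 0.8125.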


Table 2. Fragment of computational values of runs of the simulation model of risk
assessment

       No    x1   x4   x5   x6      P       C, million dollar   R, million dollar
        1    1    0    1    0    0.931502       0.034834            0.032448
        2    0    1    0    0    0.803766       0.132731            0.106685
        3    0    0    1    1    0.941585       0.123103            0.115912
        4    1    1    1    1    0.995413       0.027095            0.026971
        5    1    0    1    1    0.981476       0.089822            0.088158
       ···


     Based on the results of the experiments, the maximum value of the risk of
one unlawful act is 0.1552 million dollars. The expected value of the risk is
0.0697 million dollars, and the coefficient of variation is cv = 0.59, which
indicates a medium probability of a risk event occurring.
     It is useful to assess the probability of a risky event occurring with a
sufficiently high level of reliability, for example, 80%. We divide the number
of runs in which the probability of an unlawful act was above 0.8 by the total
number of runs: P = P0.8 / N_MC = 0.81 (a sufficiently large value in practical
problems). Note that this value is fairly close to 0.86, the proportion of
inspections in which violations were found, as reported by the AMR for 2016.


5   Conclusion

We note that the calculated share of illegal products among those sold on the
market (equal to 0.149) is comparable with the official data submitted by the
AMR in recent years. According to their information, the average share of
illegal products among those audited in 2013, 2014, 2015 and 2016 is estimated
at 0.23. The discrepancy is explained by the fact that some illegal products
can be identified before the sales stage, and by the fact that some sales are
not reflected in the documents either. In addition, we should expect the
modeling results to improve as additional information on inspections of market
participants is identified and accumulated. The arrears of excise duties on
wine, beer, and products with alcoholic content above and below 9% were
assessed at 66.102 billion rubles (0.986 billion dollars) (according to the
Report on taxes and fees, fines and tax sanctions in 2016 of the Federal Tax
Service of the Russian Federation). The amount of non-payments calculated in
this work was 0.91 billion dollars. This discrepancy arises primarily because
the report does not include arrears for all types of products, and the
averaging of some indicators in the calculations may affect accuracy.
    The presented checks of the quality of risk-event modeling in the markets
of licensed products point to the possibility of, and the need for, further
improvement, so that simulation results can be used more widely in the practice
of managing markets for potentially hazardous products.


References
 1. Risk management – risk assessment techniques, ISO/IEC 31010:2009
 2. Ayvazyan, S., et al.: Modeling risk patterns of Russian systemically important
    financial institutions. Review of Applied Socio-Economic Research 1(1), 70–80
    (2011)
 3. Ayvazyan, S., Mkhitaryan, V.: Applied Statistics. Fundamentals of Econometrics.
    Unity, Moscow (2001)
 4. Kachalov, R.: Economical Risk Management: theory and applications. Nestor–
    Historia, Moscow, Saint Petersburg (2012)
 5. Knight, F.: Risk, Uncertainty and Profit. Hart, Schaffner & Marx; Houghton Mifflin
    Co., Boston, MA (1921)
 6. Kovaleva, T.: Organization of budget management in the subject of the Russian
    Federation. Financy and Credit 5 (2003)
 7. Makarov, V., Bakhtizin, A.: Social modeling – new computer breakthrough (agent-
    oriented models). Economika, Moscow (2013)
 8. Mkhitaryan, V., Karelina, M.: Mathematical modeling of integration policy risks
    based on the method of fuzzy performance. In: Information Technologies in Eco-
    nomics and Management. Dagestan State Technical University. pp. 60–64 (2006)
 9. Nosko, V.: Econometrics for Beginners (Additional chapters). IETP, Moscow
    (2005)
10. Pisareva, O.: Methodological and applied aspects of modeling risks of unlawful
    events in the field of licensed economic activity. Economics and management: prob-
    lems, solutions 3(66)(6), 159–167 (2017), annual international round table “System
    Economics, Social and Economic Cybernetics, Soft Measurements in the Economy
    – 2017”, Moscow, June 8, 2017, Financial University under the Government of the
    Russian Federation
11. Purdy, G.: Raising the standard – the new ISO risk management standard (2009)
12. Senchagov, V.: Economic security: geopolitics, globalization, self-preservation and
    development. Institute of Economics of the Russian Academy of Sciences. Moscow.
    ZAO Finstatinform (2002)
13. Varshavskiy, A.: Challenging innovations: risks and responsibilities (on the example
    of food products of domestic production). CEMI RAS, Moscow (2009)