<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Claude Bernard Lyon 1 University</institution>
          ,
          <addr-line>43 boulevard du 11 Novembre 1918, 69622 Villeurbanne cedex</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute"</institution>
          ,
          <addr-line>ave. Beresteiskyi 37, Kyiv, 03056</addr-line>
        </aff>
      </contrib-group>
      <fpage>50</fpage>
      <lpage>62</lpage>
      <abstract>
        <p>In this paper, several car insurance claims problems are analyzed and solved by implementing existing statistical models on real-world datasets. The first problem studied is that of measuring the probability of a claim for a specific policy. It is solved using a set of families of generalized linear models, with an additional approach of analyzing the data with survival models. The best generalized linear model is then chosen according to statistical criteria. The second problem considers distinct classes of policies: the number of claims and the prices are forecasted for the different groups. As for the first problem, generalized linear models are used and the best model is chosen according to a statistical criterion. The third problem is that of scorecard generation; a brief interpretation of the built scorecard and its results is also provided.</p>
      </abstract>
      <kwd-group>
        <kwd>Car insurance</kwd>
        <kwd>Generalized linear models</kwd>
        <kwd>Scorecard</kwd>
        <kwd>Survival models</kwd>
        <kwd>Claims forecasting</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Insurance activity aims to protect the property interests of individuals and legal entities when insured events occur, at the expense of monetary funds formed from the insurance premiums paid by policyholders. One of the main conditions for the effective functioning of the insurance market is the reliability of its participants, the insurance companies. The ability of insurance companies operating in the market to fulfill their obligations promptly and in full, that is, their financial stability, is a prerequisite for the insurance function to be realized in practice. The current financial state of insurance companies requires the search for new forms and methods of increasing their competitiveness and financial stability. They need special decision-support systems for more effective assessment of policies, more precise forecasting of the probability of claims, evaluation of possible losses, and development of more flexible conditions for insurance policy evaluation.</p>
      <p>
        The variety of risk manifestation forms, and the frequency and complexity of the consequences of their realization, call for an in-depth analysis of possible risks and an economic-mathematical justification of the financial policy of insurance companies. For every car insurance company, the importance of proper policy selection for a given client cannot be overestimated. The insurance premium is formed according to the client's expected propensity to raise claims and the expected size of those claims [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1‒3</xref>
        ].
      </p>
      <p>
        Information for the determination of the terms and conditions of policies can be separated into two parts: data concerning the driver and data concerning the car. Age, driving history, and the length of the insurance policy are values that define the driver part of the information. However, some aspects, like the driver's habits, are hard to collect, describe, and analyze. On the other hand, information about cars can be specified and collected according to technical criteria [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. These data can range from the car type to the number of cylinders or airbags. A practical task is to model and forecast claims when information about cars is available in abundance, which requires selecting and filtering its most relevant parts.
      </p>
      <p>A completely separate issue is creating models that can predict some aspects of a claim based on selected data. One of the most important tasks is to predict the probability of a claim for a specific case. Insurance firms need a proper model for predicting and forecasting claims for different clients. At the same time, those methods should be easy to interpret, so that the key factors affecting the terms and conditions of policies can be explained to clients or regulators.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Problem statement</title>
      <p>This work concentrates on solving the main problems that appear in the insurance field. The first and foremost task is forecasting the claim expectation for a given client, which can be measured by the claim's probability. Companies need a way to approximate the chances of claims in order to properly form the policy terms for a given client. This task requires taking the client's data into account and forming a decision based on it.</p>
      <p>The second task is forecasting the number of claims for each group. The importance of this task is clear, as it is part of the company's policy selection. Clients can be grouped by aggregated values; for these groups the number of claims can be estimated and forecasting models can be built. The approach can follow two possible scenarios: modeling only the number of claims, or modeling the total spending on a group.</p>
      <p>The third task is model creation, which is usually paired with interpretation. This interpretation can provide valuable insight into which values increase the probability of a claim. It allows building scorecards that provide an easy tool for making decisions directly from the data provided by a client. The main objective of this study is to define not only the probability and cost (value) of each claim but also the subset of the most damaging cases.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <p>
        The appropriate approach usually depends on the task, but most importantly it is determined by the flow of data extraction and preparation. The same method can be applied to the same data, yet different approaches and pre-processing techniques may affect the results. For example, [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] provides a flow of data handling and objectives very similar to those used in this work: data are collected from an open platform, claims are analyzed, and the number of claims matching the task is predicted. However, due to dataset restrictions, additional preprocessing was applied there, which yielded comparatively sufficient results but lacked interpretability due to PCA usage.
      </p>
      <sec id="sec-3-1">
        <title>3.1. Generalized linear models</title>
        <p>
          Generalized linear models (GLM) were the main tool used in this research. They provide a unified framework for modeling and forecasting the target variables [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Due to the nature of the variables and the different tasks that were tackled, a number of distribution families were used to approach the problems from different sides and to select the most fitting one.
        </p>
        <p>The generalized linear model is an extension of the simple linear regression model. A linear relationship between variables is the simplest case for researching links between factors. However, this does not hold for most real-world processes, where the relationship is more complicated than linear. In this case a link function is introduced. Several distribution families were used in the research.</p>
        <p>The general form of a generalized linear model is
g(E(Y)) = Xβ,
where X denotes the independent variables, β is a parameter vector, and g is a link function that transforms the scale of the dependent variable Y to suit a linear relationship.</p>
        <p>Generalized models can be used for discrete or continuous variables, which gives them a significant advantage.</p>
        <p>Logistic regression (LR) is a statistical method used for classifying values into different categories. Its link function is the logit:
g(μ) = ln(μ / (1 − μ)).
In the scope of this research, logistic regression was used for modeling the claim probability for a single policy.</p>
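        <p>As a minimal sketch (not the paper's original code), a binomial GLM with the logit link can be fitted by iteratively reweighted least squares using only numpy; the age_of_car feature and the toy claim labels below are hypothetical stand-ins for the dataset's real covariates:</p>

```python
import numpy as np

def fit_logistic_glm(x, y, n_iter=25):
    """Fit a binomial GLM (logit link) by iteratively reweighted least squares."""
    X = np.column_stack([np.ones(len(x)), x])          # add intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                                 # linear predictor
        mu = 1.0 / (1.0 + np.exp(-eta))                # inverse logit link
        w = np.maximum(mu * (1.0 - mu), 1e-9)          # IRLS weights
        z = eta + (y - mu) / w                         # working response
        beta = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * z))
    return beta

def claim_probability(beta, x):
    """Predicted claim probability for new feature values."""
    X = np.column_stack([np.ones(len(x)), x])
    return 1.0 / (1.0 + np.exp(-(X @ beta)))

# hypothetical toy data: claim indicator vs. age of car
age_of_car = np.array([1.0, 2, 3, 4, 5, 6, 7, 8])
claimed = np.array([0, 0, 1, 0, 0, 1, 1, 1])
beta = fit_logistic_glm(age_of_car, claimed)
p = claim_probability(beta, age_of_car)
```

        <p>On this toy sample older cars get a higher predicted claim probability; in practice a library GLM solver with a binomial family would be run on the full 44-variable dataset.</p>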
        <p>The normal (Gaussian) generalized model uses an identity link function, which makes it the same as simple linear regression:
μ = Xβ.</p>
        <p>Poisson regression is a statistical model used when the dependent variable is a count of occurrences. Its link function is the logarithm:
g(μ) = ln(μ) = Xβ.</p>
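        <p>A Poisson GLM with the log link can be fitted the same way by Fisher scoring. The sketch below is an illustration on synthetic counts (not the paper's code): because the toy responses equal the log-linear mean exactly, the fit recovers the coefficients exactly:</p>

```python
import numpy as np

def fit_poisson_glm(x, y, n_iter=50):
    """Fit a Poisson GLM (log link) by Fisher scoring / IRLS."""
    X = np.column_stack([np.ones(len(x)), x])       # add intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)                            # inverse log link
        z = eta + (y - mu) / mu                     # working response
        # weighted least squares step; weights = mu (Poisson variance = mean)
        beta = np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (mu * z))
    return beta

# synthetic group-level claim counts with mean exp(0.3 + 0.2 x)
x = np.array([0.0, 1, 2, 3, 4])
y = np.exp(0.3 + 0.2 * x)          # noiseless means, so the fit is exact
beta = fit_poisson_glm(x, y)
```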
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Survival models</title>
        <p>
          To predict claims or similar events, such as deaths or accidents, survival models can be used. They can be utilized when the outcome can be traced over some period of time [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
        <p>The simplest form of a survival model is a table of all events with their timestamps of occurrence. It may give significant insight into the time periods when most events occur.</p>
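        <p>Such a life table leads directly to the Kaplan-Meier product-limit estimate of the survival curve. A self-contained sketch with hypothetical policy durations, where a claim plays the role of a "death" and a policy that ends without a claim is treated as censored:</p>

```python
from itertools import groupby

def kaplan_meier(durations, events):
    """Kaplan-Meier product-limit estimator.
    durations: observed time for each policy (e.g. years);
    events: 1 if a claim occurred at that time, 0 if censored."""
    pairs = sorted(zip(durations, events))
    at_risk = len(pairs)
    surv = 1.0
    curve = []                          # (time, S(t)) at each claim time
    for t, grp in groupby(pairs, key=lambda p: p[0]):
        grp = list(grp)
        claims = sum(e for _, e in grp)
        if claims:
            surv *= 1.0 - claims / at_risk
            curve.append((t, surv))
        at_risk -= len(grp)             # claims and censored leave the risk set
    return curve

# hypothetical data: claim at t=1; claim and one censoring at t=2; claim at t=3
curve = kaplan_meier([1.0, 2.0, 2.0, 3.0], [1, 1, 0, 1])
```

        <p>Sharp drops of the estimated curve mark the policy durations at which claims cluster, which is exactly the kind of threshold analysis used later in this work.</p>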
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Scorecards</title>
        <p>Scorecards are special tables constructed to provide a score for every feature; by summing the scores over a record, the total points can be estimated. By assigning levels to the score, records can be moved to one of the preselected categories.</p>
        <p>
          Scorecards are powerful practical tools that can be used to quickly identify high-risk policies
[
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. Scorecards are built using the Weight of Evidence (WoE), the Information Value (IV), and the Population Stability Index (PSI):
WoE_i = ln(%good_i / %bad_i),
IV = Σ_i (%good_i − %bad_i) × WoE_i,
where %good_i and %bad_i are the shares of non-claim and claim records falling into bin i.
        </p>
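        <p>These quantities are straightforward to compute from binned claim counts. A minimal sketch (the bin counts here are hypothetical):</p>

```python
import math

def woe_iv(goods, bads):
    """Weight of Evidence per bin and total Information Value.
    goods[i] / bads[i]: number of non-claim / claim policies in bin i."""
    total_good, total_bad = sum(goods), sum(bads)
    woe, iv = [], 0.0
    for g, b in zip(goods, bads):
        pg = g / total_good             # share of goods in this bin
        pb = b / total_bad              # share of bads in this bin
        w = math.log(pg / pb)           # WoE of the bin
        woe.append(w)
        iv += (pg - pb) * w             # bin's contribution to IV
    return woe, iv

# hypothetical binning of age_of_car into two bins
woe, iv = woe_iv(goods=[80, 20], bads=[20, 80])
```

        <p>Bins with zero counts need smoothing before taking the logarithm; by a common rule of thumb, an IV above roughly 0.3 marks a strong predictor.</p>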
        <p>3.4. Other methods and models</p>
        <p>The field of insurance prediction is rich with approaches and methods that are effective for forecasting the probability of claims [5‒9]. Some of them cover not only the same objective as the current study but are also applied to handling more financially oriented data, missing data, and combining the results of several models [5‒9]. A brief overview of these methods is given in Table 1.</p>
        <p>[Table 1: overview of related methods, e.g. (Abdelhadi et al., 2020) [21] — classification to predict claims occurrence.]</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4.1. Decision Tree</title>
        <p>A decision tree (DT) and its variations form a family of classification methods built on a tree structure, which handles the decision-making process through a binary decision at each step. This allows the method to be applied to data with non-linear relations between features and target variables. There are several extensions of the basic model: random forests and CART models, as well as multivariable trees. An example of such research is the work [22].</p>
        <p>Decision trees are typically used for classification tasks, so a direct comparison of this method with the regression family of methods is not straightforward. Some tasks, such as determining whether a claim will happen at all, can be approached by both methods, but for predicting a continuous variable only one of them could be used.</p>
        <p>The simplest model is straightforward: each node checks a feature and directs the flow to one of two possible branches until a leaf is reached. However, this model is not suitable for complex data, since it tends to overfit and its variable selection can be biased.</p>
        <p>One of the very popular extensions, also covered by the work [22], is CART (Classification and Regression Trees). It overcomes the limitation of the original model by allowing regression targets to be modeled and predicted without restricting the original capabilities for categorical variables.</p>
        <p>Another method is the Random Forest. It combines several decision trees, which in turn can be regression trees, and produces a single decision by weighting their outputs. It can be seen as a statistical machine-learning algorithm.</p>
        <p>A further development, perhaps not so widespread in the insurance field but noteworthy, is multivariable trees, which use multivariate response variables.</p>
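        <p>The binary-split idea behind all of these trees can be illustrated with a one-level tree, a decision stump, on a single hypothetical feature; full DT, CART, and random forest implementations repeat this split search recursively and over feature subsets:</p>

```python
def fit_stump(xs, ys):
    """Fit a decision stump: the single threshold on one numeric feature
    that minimises misclassification of a binary target."""
    candidates = [
        (sum((low if x <= t else 1 - low) != y for x, y in zip(xs, ys)), t, low)
        for t in sorted(set(xs))        # every observed value as a threshold
        for low in (0, 1)               # label predicted on the 'low' branch
    ]
    errors, t, low = min(candidates)    # split with the fewest errors
    return lambda x: low if x <= t else 1 - low

# hypothetical feature (e.g. age of car) with a claim indicator target
predict = fit_stump([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1])
```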
      </sec>
      <sec id="sec-3-5">
        <title>3.4.2. Machine Learning</title>
        <p>Support Vector Machines (SVM) is a method dedicated to providing solutions for both classification and regression problems. It is a supervised learning algorithm whose idea is based on a hyperplane: a subspace of one fewer dimension used for decision-making, creating a boundary that separates the target points.</p>
        <p>One of the key features of SVM is that, when no separating plane can be found in the current domain, the inputs can be transferred to a higher-dimensional space in order to find a hyperplane there. This allows obstacles posed by the target dimension to be overcome. SVM can also be modified to solve regression tasks, as in [23].</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Dataset</title>
      <p>The data necessary for model creation were obtained from a Car Insurance dataset provided on the Kaggle website [24]. The dataset is oriented toward the technical aspects of the car, with most variables describing physical parts of the machine for which the policy is formed.</p>
      <p>The dataset consists of two parts, used for training and testing, containing 58592 and 39063 records respectively. Each record is a unique policy with information about the policy owner and the car. The dataset indicates whether there was a claim during the upcoming 6 months of the insurance; this was the target value during the first stage of modeling. Additionally, the dataset contains a range of different features, with the total number of variables equal to 44.</p>
      <p>For further research, grouping by several variables was performed with the aim of forming groups of specific clients and policies for which the modeling was done.</p>
      <p>It is important to note that the dataset does not contain financial data: there is no information about the price of the cars or the insurance premiums for a policy.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Modeling Results</title>
      <p>Modeling and forecasting were done using generalized linear models of binomial, Gaussian, and Poisson types. A scorecard was also generated to assist in decision-making and interpretation of the results.</p>
      <p>Modeling the probability of a claim was done by building two models, binomial and Gaussian. The comparison presented in Table 2 has shown that the Gaussian model performs significantly better. The main explanatory variables are:
• age_of_car – how old the car is;
• policy_tenure – the length of the policy up to date;
• area_cluster – the area where most driving by the policyholder is done;
• make – the car's manufacturer;
• ncap_rating – the car's safety rating given by the agency.</p>
      <p>• atr – a synthesized variable based on the car's features: extra airbags, lamps, etc.
The target relationship is then represented by a formula of the form
claim_probability = g⁻¹(β0 + β1 × age_of_car + β2 × policy_tenure + …),
and the quality of the classification is summarized by the confusion matrix for the normal distribution, which is presented in Table 4.</p>
      <p>In the next stage, the modeling was based on survival theory. It is possible to construct a survival model where each claim is treated as the death of a member of the population, with the length of the policy counted as the measure of time. Thus, the claims population "survives" during the policy length interval. It was concluded that high-quality prognoses cannot be derived from the existing data when claims prediction is done at the level of a single policy.</p>
      <p>Let's build a Cox proportional hazards model of the form
h(t | x) = h0(t) × exp(xβ),
where h0(t) is the baseline hazard and x are the policy covariates; the model was fitted on n = 58592 policies with 3748 events (claims).</p>
      <p>It can be seen, however, that the length of the policy indeed has an effect on the number of claims, but this observation is rather trivial and cannot by itself drive a decision, since it would simply favor short-range policies (Figure 1). Nevertheless, the relationship between the length of the policy and the frequency of claims was revealed: at durations of 1, 1.6, and 1.7 years there is a sharp increase in claims. It is possible to perform a separation and, in the future, to focus on the threshold values found. Also, from the survival model it is easier to determine the duration of the most risky policies and to define possible new policy strategies.</p>
      <p>The calculation of individual cases (a claim for each policy separately) showed the absence of parameters and characteristics that would accurately indicate the onset of a claim: all probabilities for individual policies lie between 0.001 and 0.12. In this case, a decision was made to proceed to the consideration of individual segments.</p>
      <p>Grouping of the data by segment, manufacturer, and machine brand was performed. Thus, we moved from looking at an individual car to the segment as a whole, where individual characteristics are of little importance.</p>
      <p>Two values can be calculated for segments:
1. The number of claims in the segment.
2. Amount of payment by segments.</p>
      <p>It was decided to implement a further approach to working with groups. The Poisson generalized linear model was chosen to forecast the number of cases, and it showed a high level of accuracy. The modeled relationship has the form
ln(claims_number) = β0 + β1 × segment + …,
with the linear predictor built over the grouping variables.</p>
      <p>As can be seen in Figures 3 and 4, the prediction of the number of claims across groups has a higher quality. This also shows that, despite the low ability to predict each unique case, prediction for a group is a much easier task.</p>
      <p>Another approach was chosen for dealing with groups: forecasting the total price of all cars for which claims were issued. The Gaussian model was used as the most appropriate, and it also showed significant accuracy (Table 6).</p>
      <p>Additionally, the car's price was missing in the dataset. For this model, the following approach was used:
1. Find the average price for every class.
2. Adjust it according to the attribute feature.
3. Group the price per category to create a new feature – total (price).</p>
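      <p>Steps 1 and 3 of this scheme can be sketched as follows (a pure-Python illustration with hypothetical records; the per-attribute adjustment of step 2 is dataset-specific and omitted):</p>

```python
def impute_and_total(records):
    """Fill missing car prices with the segment average (step 1),
    then aggregate a total price per segment (step 3)."""
    sums, counts = {}, {}
    for r in records:                       # average known price per segment
        if r["price"] is not None:
            sums[r["segment"]] = sums.get(r["segment"], 0.0) + r["price"]
            counts[r["segment"]] = counts.get(r["segment"], 0) + 1
    average = {s: sums[s] / counts[s] for s in sums}
    totals = {}
    for r in records:                       # impute, then total per segment
        price = r["price"] if r["price"] is not None else average[r["segment"]]
        totals[r["segment"]] = totals.get(r["segment"], 0.0) + price
    return totals

records = [
    {"segment": "A", "price": 10.0},
    {"segment": "A", "price": None},        # missing: gets the A-average 10.0
    {"segment": "B", "price": 20.0},
]
totals = impute_and_total(records)
```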
      <p>The equation for modeling has the form
total_price = β0 + β1 × segment + β2 × model_group + …,
with an identity link over the grouping variables.</p>
      <p>We need to understand which variables, and which intervals of those variables, are the most significant for our insurance task. Information value (IV) is one of the most useful techniques for selecting important variables in a predictive model; it helps to rank the variables by their importance. Figure 5 presents how many claim cases there were and how they correlate with different values of the car's age.</p>
      <p>Finally, the scorecard was built (Table 7). It provides information about the values that are associated with a high risk of a claim for this dataset. Non-significant values have been filtered out. The remaining variables describe continuous data – the age of the policyholder and the policy tenure, for which binning was made – and categorical data: the area clusters, named in the initial dataset and ranging from C1 to C22, and variables related to technical aspects: rear mirror availability and functionality, brake type, and transmission type.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>Today, car insurance companies require a lot of information to decide on policies and conditions [5]. Even though a vast amount of information can be collected, this does not guarantee the ability to create a model that can predict a claim for a specific policy with a significant level of accuracy, due to the random nature of claims. Cases more or less prone to claims can be selected, but the results show that this does not allow a robust prediction. Among the built models for probability prediction, the Gaussian generalized model was chosen. It shows that the nature of claims cannot be determined from specific features or their combinations, since for the same key variables there are examples of policies both with and without claims. The obtained values show a high level of centering, which does not allow the selection of intervals for confident claim identification and hence undermines the usefulness of such an approach.</p>
      <p>The problem of single-claim prediction is the hardest one. For claims risk management, we need to forecast the probability of each claim and of each type of claim, and to develop a special scoring card that presents the key features automatically in an understandable and easily interpretable manner.</p>
      <p>More promising are the results for groups of claims, where policies with similar features are selected and combined into the same group. Such groups can be modeled and forecasted with a higher degree of accuracy, with respect to either the number of claims or the total price of the cars for which claims have been made. Overall, the results show a low ability to predict specific cases but relatively high confidence in forecasting for big groups.</p>
      <p>It is worth noting that different methods, like Random Forest, could perform better at the task of predicting claims per observation, which can be examined in subsequent research.</p>
      <p>Finally, the scorecard is a high-quality tool for making decisions about clients directly. It is not only easy to interpret but also easy to use. We used the scorecard to determine the key features in an understandable and easily interpretable manner. It yields good results on the grouped data and provides valuable insights about the tendencies. It is also useful to implement the scorecard instrument as a tool for telecom and various financial big-data tasks [25, 26], where scores and the influence of characteristics need to be evaluated as well.</p>
      <p>[5] V. Selvakumar, D. K. Satpathi, P. T. V. Praveen Kumar, V. V. Haragopal, Predictive Modeling of Insurance Claims Using Machine Learning Approach for Different Types of Motor Vehicles, Universal Journal of Accounting and Finance 9(1): 1–14, 2021. doi: 10.13189/ujaf.2021.090101.
[6] C. Ye, L. Zhang, M. Han, Y. Yu, B. Zhao, Y. Yang, Combining Predictions of Auto Insurance Claims, Econometrics 2022, 10(2), 19. doi: 10.3390/econometrics10020019.
[7] W. Yu, G. Guan, J. Li, Q. Wang, X. Xie, Y. Zhang, Y. Huang, X. Yu, C. Cui, Claim Amount Forecasting and Pricing of Automobile Insurance Based on the BP Neural Network, Complexity, Volume 2021. doi: 10.1155/2021/6616121.
[8] M. Hanafy, R. Ming, Improving Imbalanced Data Classification in Auto Insurance by the Data Level Approaches, International Journal of Advanced Computer Science and Applications (IJACSA), Vol. 12, No. 6, 2021. doi: 10.14569/IJACSA.2021.0120656.
[9] K. A. Smith, R. J. Willis, M. Brooks, An analysis of customer retention and insurance claim patterns using data mining: a case study, Journal of the Operational Research Society 51: 532–541. doi: 10.1057/palgrave.jors.2600941.
[10] C.-C. Günther, I. F. Tvete, K. Aas, G. I. Sandnes, Ø. Borgan, Modelling and predicting customer churn from an insurance company, Scandinavian Actuarial Journal, Vol. 2014, No. 1, 58–71. doi: 10.1080/03461238.2011.636502.
[11] K. P. M. L. P. Weerasinghe, M. C. Wijegunasekara, A Comparative Study of Data Mining Algorithms in the Prediction of Auto Insurance Claims, European International Journal of Science and Technology, Vol. 5, No. 1, January 2016.
[12] K. Fang, Y. Jiang, M. Song, Customer profitability forecasting using Big Data Analytics: A case study of the insurance industry, Computers &amp; Industrial Engineering (2016). doi: 10.1016/j.cie.2016.09.011.
[13] S. Subudhi, S. Panigrahi, Use of Optimized Fuzzy C-Means Clustering and Supervised Classifiers for Automobile Insurance Fraud Detection, Journal of King Saud University – Computer and Information Sciences (2017). doi: 10.1016/j.jksuci.2017.09.010.
[14] S. Mau, I. Pletikosa, J. Wagner, Forecasting the next likely purchase events of insurance customers: A case study on the value of data-rich multichannel environments, International Journal of Bank Marketing 36 (2018): 6. doi: 10.1108/IJBM-11-2016-0180.
[15] L. Jing, W. Zhao, K. Sharma, R. Feng, Research on Probability-based Learning Application on Car Insurance Data, in: Proceedings of the 2017 4th International Conference on Machinery, Materials and Computer (MACMC 2017), Amsterdam: Atlantis Press, 2018. doi: 10.2991/macmc-17.2018.14.
[16] G. Kowshalya, M. Nandhini, Predicting fraudulent claims in automobile insurance, in: Proceedings of the 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), Coimbatore, India, April 20–21, 2018; pp. 1338–1343. doi: 10.1109/ICICCT.2018.8473034.
[17] S. F. Sabbeh, Machine-learning techniques for customer retention: A comparative study, International Journal of Advanced Computer Science and Applications, Vol. 9, No. 2, 2018: 273–281.
[18] O. Stucki, Predicting the Customer Churn with Machine Learning Methods: Case: Private Insurance Customer Data, Master's dissertation, LUT University, Lappeenranta, Finland, 2019.
[19] K. C. Dewi, H. Murfi, S. Abdullah, Analysis Accuracy of Random Forest Model for Big Data – A Case Study of Claim Severity Prediction in Car Insurance, in: 2019 5th International Conference on Science in Information Technology (ICSITech), Yogyakarta, Indonesia, October 23–24, 2019; pp. 60–65. doi: 10.1109/ICSITech46713.2019.8987520.
[20] J. Pesantez-Narvaez, M. Guillen, M. Alcañiz, Predicting Motor Insurance Claims Using Telematics Data—XGBoost versus Logistic Regression, Risks 2019, 7(2), 70. doi: 10.3390/risks7020070.
[21] S. Abdelhadi, K. Elbahnasy, M. Abdelsalam, A proposed model to predict auto insurance claims using machine learning techniques, Journal of Theoretical and Applied Information Technology, 30 November 2020, Vol. 98, No. 22: 3428–3437.
[22] Z. Quan, E. A. Valdez, Predictive analytics of insurance claims using multivariate decision trees, Dependence Modeling 2018; 6: 377–407. doi: 10.1515/demo-2018-0022.
[23] E. Alamir, T. Urgessa, A. Hunegnaw, T. Gopikrishna, Motor Insurance Claim Status Prediction using Machine Learning Techniques, International Journal of Advanced Computer Science and Applications (IJACSA), Vol. 12, No. 3, 2021. doi: 10.14569/IJACSA.2021.0120354.
[24] Kaggle. URL: https://www.kaggle.com/datasets/ifteshanajnin/carinsuranceclaimpredictionclassification?resource=download
[25] N. Kuznietsova, P. Bidyuk, M. Kuznietsova, Data Mining Methods, Models and Solutions for Big Data Cases in Telecommunication Industry, Lecture Notes on Data Engineering and Communications Technologies, 2022, 77, pp. 107–127. doi: 10.1007/978-3-030-82014-5_8.
[26] N. Kuznietsova, E. Bateiko, Analysis and Development of Mathematical Models for Assessing Investment Risks in Financial Markets, CEUR Workshop Proceedings (ISSN 1613-0073), 2023, Vol. 3503, pp. 92–101. https://ceur-ws.org/Vol-3503/paper9.pdf.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Denuit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Marechal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pitrebois</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Walhin</surname>
          </string-name>
          ,
          <article-title>Actuarial modelling of claim counts: risk classification, credibility and bonus-malus systems</article-title>
          , Wiley,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Verbelen</surname>
          </string-name>
          , K. Antonio, G. Claeskens,
          <article-title>"Unravelling the predictive power of telematics data in car insurance pricing"</article-title>
          ,
          <source>Journal of the Royal Statistical Society</source>
          , Series C (Applied Statistics),
          <volume>67</volume>
          .5 (
          <year>2018</year>
          ):
          <fpage>1275</fpage>
          -
          <lpage>1304</lpage>
          . doi: 10.1111/rssc.12283.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Nelder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. W.M.</given-names>
            <surname>Wedderburn</surname>
          </string-name>
          ,
          <article-title>"Generalized linear models"</article-title>
          ,
          <source>Journal of the Royal Statistical Society: Series A (General)</source>
          ,
          <volume>135</volume>
          .3 (
          <year>1972</year>
          ):
          <fpage>370</fpage>
          -
          <lpage>384</lpage>
          . doi: 10.2307/2344614.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>N. V.</given-names>
            <surname>Kuznietsova</surname>
          </string-name>
          ,
          <string-name>
            <surname>P. I. Bidyuk</surname>
          </string-name>
          ,
          <article-title>Theory and practice of financial risk analysis: systemic approach</article-title>
          , Lira-K, Kyiv,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>