<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
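        <journal-title>IntelITSIS'2022: 3rd International Workshop on Intelligent Information Technologies and Systems of Information Security</journal-title>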
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Recognizing the Fictitious Business Entity on Logistic Regression Base</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Oksana Desyatnyuk</string-name>
          <email>o.desyatnyuk@wunu.edu.ua</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Arkadiusz Banasik</string-name>
          <email>arkadiusz.banasik@gmail.com</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Iryna Lukasevych-Krutnyk</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Krysovatyy</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lipianina-Honcharenko</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Svitlana Sachenko</string-name>
          <email>s_sachenko@yahoo.com</email>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Silesian University of Technology</institution>
          ,
          <addr-line>Kaszubska Str., 23, Gliwice, 44-100</addr-line>
          ,
          <country country="PL">Poland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>West Ukrainian National University</institution>
          ,
          <addr-line>Lvivska Str., 11, Ternopil, 46000</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <abstract>
        <p>The mechanisms for creating and operating fictitious enterprises are constantly being refined, which requires adequate means to combat them. A method for detecting fictitious enterprises on the basis of machine learning, namely Logistic Regression, is proposed. The developed method is represented by an algorithm and implemented in the R software environment. As an example, a binary sample of 1,048 enterprises was selected, of which 390 were fictitious. An exploratory data analysis (EDA) was performed, which makes it possible to examine the data set and clean the data if necessary. The correlation diagram shows that most of the parameters are only slightly correlated with each other, with only a partial correlation with the parameter K205.</p>
      </abstract>
      <kwd-group>
        <kwd>fictitious enterprises</kwd>
        <kwd>business entities</kwd>
        <kwd>classification</kwd>
        <kwd>machine learning</kwd>
        <kwd>Support Vector Machine Classification</kwd>
      </kwd-group>
      <conference>
        <conf-name>IntelITSIS'2022: 3rd International Workshop on Intelligent Information Technologies and Systems of Information Security</conf-name>
        <conf-date>March 23-25</conf-date>
      </conference>
      <custom-meta-group>
        <custom-meta>
          <meta-name>ORCID</meta-name>
          <meta-value>0000-0002-1384-4240 (O. Desyatnyuk); 0000-0002-4267-2783 (A. Banasik); 0000-0002-9557-7886 (I. Lukasevych-Krutnyk)</meta-value>
        </custom-meta>
      </custom-meta-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Shadow commodity-money transactions carried out through fictitious entrepreneurship are a concern
for the economy of any country. The participants in this criminal activity are commercial banks, a network
of fictitious enterprises, and legal enterprises. At the same time, fictitious entrepreneurship is an instrument
for committing a number of mercenary crimes, in particular tax evasion, smuggling, and fraud with financial
resources. The spread of fictitious entrepreneurship has various causes. In Ukraine, for example, it
resulted from the long-standing lack of legal regulation of private property, market and other
social relations in the past, and from the imperfection of the state and legal mechanisms for regulating business
activities today. Every year, the fiscal authorities of Ukraine alone identify about 6,000
fictitious legal entities. Given that the average turnover through the accounts of this category
of enterprises is about 5 billion dollars, budget losses from VAT alone amount to about 200 million
US dollars annually [21].</p>
      <p>According to OLAF, in 2020 (the year of the fight against COVID-19) 230 investigations were
completed, 375 recommendations were issued to the relevant national and European authorities, € 293.4
million was recommended for recovery to the EU budget, and 290 new investigations were opened after 1,098
preliminary analyses conducted by OLAF experts [26].</p>
      <p>Identifying an economic crime is a rather time-consuming procedure for law enforcement officers.
Therefore, developing an effective approach to identifying a fictitious enterprise is a relevant task.</p>
      <p>2022 Copyright for this paper by its authors.</p>
      <p>This paper proposes a method for detecting a fictitious enterprise based on classical machine
learning, namely Logistic Regression. The paper is structured as follows. Section 2 reviews
related work; Section 3 presents the method of classifying fictitious enterprises based
on Logistic Regression; Section 4 describes the implementation of the algorithm; Section 5 presents the
conclusions of the study.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Classification methods based on logistic regression are used in many areas: classification of tweet
authors [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]; detection of fake news on social networks [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]; classification of diseases [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]; bankruptcy
forecasting [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]; classification of production processes [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]; classification of text and images [
        <xref ref-type="bibr" rid="ref9">9, 24</xref>
        ]; in
tourism [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]; cash flow forecasting [20].
      </p>
      <p>
        The studies in [29, 30] examine the intended beneficial and unintended harmful
consequences of applying artificial intelligence to financial crime. In [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] a new
machine learning system for adult and adolescent autism screening is proposed, which contains
vital features and performs predictive analysis using logistic regression to identify important
information related to autism screening. Article [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] proposes a machine learning model for predicting
employee resignation. The prediction model is implemented with logistic regression,
using the cross-entropy function as the objective function and Newton’s method with
regularization to optimize the model. A confusion-matrix-based kernel logistic regression
(CM-KLOGR) for imbalanced data classification is also proposed in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        The article [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] discusses the latest economic research on tax compliance and its application.
Based on a unique data set [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] of leaked client lists from offshore financial institutions, it
was found that tax evasion is highly concentrated among the rich. The paper [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] uses administrative
microdata to study the impact of enforcement efforts on taxpayers’ reporting of offshore accounts
and income. Increasing the exchange of information between countries is a key policy tool in the fight
against cross-border tax evasion; [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] examines the short-term effect of the Common Reporting Standard
(CRS), the first global multilateral standard for automatic exchange of information. In [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], a
model of tax evasion was developed. In [18], a model is proposed to study the relationship between
economic growth and both labor and capital income tax evasion.
      </p>
      <p>Article [19] examines tax avoidance and the evolution of tax evasion, highlighting from a historical
point of view the factors that influence the emergence of these phenomena. It characterizes
the typical fraudster, so that auditors, knowing the type of person likely to commit tax fraud, can better
identify tax evasion and avoidance and treat it as an element
of audit risk. To combat tax evasion, the OECD has developed an automatic information exchange
(AIE) standard; [17] investigates the factors that explain the differences between two information exchange
mechanisms used in implementing the AIE standard. The results of that study show that the
differences are influenced by existing IT capabilities, compatibility, trust between information
exchange partners, differences in power, inter-organizational relations and the expected benefits of
implementing such mechanisms. Article [28] examines money laundering and, more
broadly, illegal financial transactions such as terrorist financing.</p>
      <p>
        It should be noted that the above-mentioned works do not describe the detection of fictitious
enterprises with the help of information technology. On the other hand, the closest analogues [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1-3</xref>
        ],
representing the Logistic Regression classification, do not investigate the use of this method to
determine a fictitious enterprise.
      </p>
      <p>Thus, the purpose of this article is to develop a method of classification of fictitious enterprises on
the basis of Logistic Regression as a basis for the appropriate software environment.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Materials and Methods</title>
      <p>To identify a fictitious enterprise quickly, a method for classifying fictitious
enterprises based on machine learning with Logistic Regression has been developed. The
advantages of logistic regression are that it is well studied; it is very fast and can work on very large
samples; it is practically unrivalled when the features are very numerous (hundreds of thousands or
more) and sparse; its coefficients can be interpreted; and it gives the probability of
assignment to each class. Its disadvantage is that it works poorly in problems in
which the dependence of the response on the features is complex and nonlinear.</p>
      <p>The developed method is represented by an algorithm (Fig. 1) and the following steps.</p>
      <p>Step 1. First, the input data are entered: enterprise code (ID); the parameter indicating whether
the enterprise is fictitious (Fit); company name (Company); legal address (Address); physical
addresses (FAddress1, …, FAddressn); KVED; names of managers (PIPKER); photos of equipment with
geolocation (Foto); presence in the unified register of legal entities and individuals (EDR);
presence in the database of VAT payers (P); timely payment of taxes (PO); availability of settlements
with counterparties (K); presence of company executives in the state register of
declarations (VKK); availability of licenses according to NACE (L); the presence of criminal cases
under Art. 205 of the Criminal Code of Ukraine (K205); mentions of company executives
together with keywords such as criminal case, corruption, offshore accounts, etc. (ZMI); availability of land at the legal
or physical address (ZD); presence of registered trademarks and service marks in the database of industrial
marks, the database of inventions and other databases of the Institute of Industrial Property of Ukraine
(TovZ); availability of issued motor third party insurance policies, based on the MTIBU policy check, the motor third
party database, a search by state car number, and a check of the status of the Green Card policy for cars owned
by the company (SP); availability of cars registered to the company and their owners (A); correspondence of
registered cars with insurance policies (A&amp;SP); presence in the database of exporters (E); presence
in the stock market database (F); cars registered to the company, or their owners,
listed as wanted (AR); weapons of the company owners listed as wanted (ZR);
cultural valuables of the company owners listed as wanted (KR); availability of construction licenses in the
company (LB); availability of real estate in the company (NM); availability of the company’s website
(NS); availability of equipment, recognized from the available photo, with the geolocation checked
against the production address (FR); availability of social network accounts of the company
and its affiliated employees (FC).</p>
      <p>Step 2. Conducting an EDA with the output of the results. Exploratory data analysis (EDA) is
the analysis of the basic properties of the data, finding general patterns, distributions and anomalies in them,
and building initial models, often using visualization tools.</p>
      <p>Step 3. Data conversion to binary expression.</p>
      <p>Step 4. Data cleaning [22, 25].</p>
      <p>Step 5. Data distribution. The data are divided into a test sample (25%) and a training sample (75%).</p>
      <p>Step 6. The variable most correlated with Fit is selected [22, 23].</p>
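      <p>Steps 5 and 6 can be illustrated with a minimal Python sketch (the implementation described in this paper is in R and is not reproduced here). The helper names train_test_split, pearson and most_correlated, the random seed and the toy records are assumptions made for illustration; only the column names Fit, K205, EDR and NS come from the text. For two 0/1 variables the Pearson coefficient coincides with the phi coefficient, which is why it is used here as the correlation measure.</p>
      <preformat><![CDATA[
```python
import math
import random

def train_test_split(rows, test_share=0.25, seed=42):
    """Shuffle the rows and split them into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_share)
    return shuffled[n_test:], shuffled[:n_test]

def pearson(xs, ys):
    """Pearson correlation; for two 0/1 variables it equals the phi coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def most_correlated(rows, target="Fit"):
    """Name of the column with the largest absolute correlation with the target."""
    ys = [r[target] for r in rows]
    names = [k for k in rows[0] if k != target]
    return max(names, key=lambda k: abs(pearson([r[k] for r in rows], ys)))

# Hypothetical toy records; only the column names come from the paper.
rows = [
    {"Fit": 1, "K205": 1, "EDR": 0, "NS": 0},
    {"Fit": 1, "K205": 1, "EDR": 0, "NS": 1},
    {"Fit": 1, "K205": 1, "EDR": 1, "NS": 0},
    {"Fit": 0, "K205": 0, "EDR": 1, "NS": 1},
    {"Fit": 0, "K205": 0, "EDR": 1, "NS": 0},
    {"Fit": 0, "K205": 0, "EDR": 1, "NS": 1},
    {"Fit": 0, "K205": 0, "EDR": 0, "NS": 1},
    {"Fit": 1, "K205": 0, "EDR": 0, "NS": 0},
]

train, test = train_test_split(rows)   # 75% / 25%
best = most_correlated(train)          # candidate for the first model variable
```
]]></preformat>
      <p>In this sketch the selection is made on the training part only, so that the test sample plays no role in choosing the variables.</p>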
      <p>Step 7. The model containing only the one selected independent variable is then checked for
significance using a partial F-test. If the significance of the model is not confirmed, the algorithm ends
because there are no significant input variables. Otherwise, this variable is entered into the model and
the algorithm proceeds to the next step.</p>
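      <p>Step 7.1. For each of the remaining variables, the value of the statistic F is calculated according to
formula (1), as the ratio of the increase in the regression sum of squares achieved by introducing
the corresponding additional variable x&lt;j&gt; into the model to the value MSE:</p>
      <p>
        <disp-formula id="eq1">
          <label>(1)</label>
          <tex-math><![CDATA[F_{\langle j\rangle} = \frac{SSR\left(x_{\langle 1\rangle},\dots,x_{\langle k\rangle},x_{\langle j\rangle}\right) - SSR\left(x_{\langle 1\rangle},\dots,x_{\langle k\rangle}\right)}{MSE}]]></tex-math>
        </disp-formula>
      </p>
      <p>Step 7.2. MSE is calculated as</p>
      <p>
        <disp-formula>
          <tex-math><![CDATA[MSE = \frac{SSE}{n-k-2},]]></tex-math>
        </disp-formula>
      </p>
      <p>where SSE is the sum of squared errors of the model constructed on the variables
x&lt;1&gt;, x&lt;2&gt;, …, x&lt;k&gt;, x&lt;j&gt;, so that MSE is the sum of squared errors per degree of freedom.</p>
      <p>The logistic regression model has the form</p>
      <p>
        <disp-formula id="eq2">
          <label>(2)</label>
          <tex-math><![CDATA[h_{\theta}(x) = g\left(\theta_0 + \theta_1 x_1 + \dots + \theta_n x_n\right) = g\left(\theta^{T} x\right), \quad \text{where } g(z) = \frac{1}{1+e^{-z}}.]]></tex-math>
        </disp-formula>
      </p>
      <p>
        Step 8. The calculated value Freal is compared with the tabulated value Ftable. Freal &gt; Ftable
indicates the need to include the variable Xn
in the regression model, while the probability that the decision to include it is incorrect is α = 0.05
(the values are taken from the Fisher criterion table) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>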
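      <p>The logistic function on which the model rests maps a linear combination of the inputs to a probability between 0 and 1. A minimal Python sketch follows; the function names g and h follow the usual textbook notation, and the coefficient values are hypothetical and chosen only for illustration.</p>
      <preformat><![CDATA[
```python
import math

def g(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def h(theta, x):
    """Hypothesis h(x) = g(theta_0 + theta_1*x_1 + ... + theta_n*x_n)."""
    z = theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))
    return g(z)

# Hypothetical coefficients for two binary features (illustration only).
theta = [-1.0, 2.5, -0.5]
p = h(theta, [1, 0])   # estimated probability of the positive class for x = (1, 0)
```
]]></preformat>
      <p>An observation is then assigned to the positive class when this probability reaches the chosen cut-off.</p>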
      <p>Fig. 1. Flowchart of the algorithm: data input (ID, Fit, Company, Address, FAddress1 … FAddressn, KVED, PIPKER1 … PIPKERn, Foto, EDR, P, PO, K, VKK, L, K205, ZMI, ZD, TovZ, SP, A, E, F, AR, ZR, KR, LB, NM, NS, FR, FC); EDA with results visualization; conversion to binary expression; data cleaning; data distribution into a test sample (25%) and a training set (75%); selection of the variable most correlated with the Fit parameter; loop i = 0 (1) m; calculation of the Freal criterion and Ftable; if Freal &gt; Ftable, inclusion of the corresponding variable in the model and selection of a new variable; construction of the logistic regression model; calculation of the accuracy of results (confusion matrix, ROC curve); end.</p>
      <p>Step 9. Of all the candidate variables for inclusion in the model, the one with the highest value
of the criterion calculated in step 8 is selected.</p>
      <p>Step 10. The significance of the independent variable selected in step 9 is checked. If its significance
is confirmed, it is included in the model and the algorithm returns to step 8 (with the new independent variable
in the model). Otherwise, the algorithm stops.</p>
      <p>Step 11. A logistic regression model is constructed from the variables obtained in step 9.</p>
      <p>Step 12. The obtained model is tested on the test sample, and the results are output as a
confusion matrix and ROC analysis.</p>
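      <p>The variable selection loop of steps 7 to 10 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation used in the paper: the sums of squares are taken from an ordinary least-squares fit, the increase in the regression sum of squares is computed as the equivalent decrease in the error sum of squares, the tabulated value f_table is passed in as a parameter (4.0 below is a hypothetical value, not one read from a Fisher table), and the helper names solve, sse and forward_select as well as the toy data are invented for the example. The sketch does not guard against collinear columns.</p>
      <preformat><![CDATA[
```python
def solve(a, b):
    """Solve the linear system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def sse(columns, y):
    """Sum of squared errors of a least-squares fit of y on an intercept plus columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design)) for a in range(p)]
    beta = solve(xtx, xty)
    return sum((y[i] - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
               for i, row in enumerate(design))

def forward_select(features, y, f_table):
    """Add the variable with the largest partial F while that F exceeds f_table."""
    selected, remaining = [], list(features)
    n = len(y)
    while remaining:
        base = sse([features[k] for k in selected], y)
        k_in = len(selected)
        scores = {}
        for name in remaining:
            full = sse([features[k] for k in selected + [name]], y)
            mse = full / (n - k_in - 2)          # SSE per degree of freedom
            scores[name] = (base - full) / mse if mse > 0 else float("inf")
        best = max(scores, key=scores.get)
        if scores[best] <= f_table:              # F_real does not exceed F_table: stop
            break
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical toy data; only the column names come from the paper.
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
features = {
    "K205": [1, 1, 1, 1, 0, 0, 0, 0, 0, 1],
    "EDR":  [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
}
selected = forward_select(features, y, f_table=4.0)
```
]]></preformat>
      <p>The variables returned by forward_select would then be passed to the logistic regression fit of step 11.</p>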
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Results and Discussion</title>
      <p>To implement the algorithm (Fig. 1) for classifying fictitious enterprises with machine learning
by Logistic Regression, the free programming language R was selected. R makes it possible to model statistical
indicators, has a large number of relevant libraries and is easy to use, so it is a good choice
for this task.</p>
      <p>A sample of 1,048 enterprises was selected to solve this problem, 390 of which are fictitious. Next,
an EDA was performed; in particular, the distribution of the Fit parameter is presented. The
output of plot_normality() comprises a histogram of the original data, a Q-Q plot of the original
data, a histogram of the log-transformed data and a histogram of the square-root-transformed data. The
binary values are clearly visible in the results.</p>
      <p>For binary values, the main graphical result is the correlation matrix (Fig. 2). The diagram shows
that most of the parameters are poorly correlated with each other; only a partial correlation is observed with
the parameter K205, which indicates the presence of criminal cases under Art.
205 of the Criminal Code of Ukraine. Because the correlations are weak, it is
difficult to determine directly whether a company is fictitious, so classification
algorithms based on machine learning are well suited to the task.</p>
      <p>For testing, the data set was divided into a training set (75%) and a test set (25%). The
model is trained to predict whether an enterprise is fictitious. Instead of modeling the response Y directly, logistic regression
models the probability that Y belongs to a certain category, in our case the probability of
fictitiousness. This probability is calculated using the logistic function. Thus, we build a model
based on logistic regression (Fig. 3).</p>
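      <p>As an illustration of how such a model yields a probability of fictitiousness, the following Python sketch fits a logistic regression by batch gradient descent on the log-loss. This is a simplified stand-in chosen for brevity: the fit reported in this paper was produced in R and converged by Fisher scoring, which is a different optimization procedure. The function names, learning rate, number of epochs and toy data are all assumptions.</p>
      <preformat><![CDATA[
```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit intercept and weights by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)                      # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict_proba(w, x):
    """Predicted probability of the positive (fictitious) class."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Hypothetical toy data: one binary feature (named after K205) and the label Fit.
X = [[1], [1], [1], [1], [0], [0], [0], [0], [0], [1]]
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

w = fit_logistic(X, y)
p_with_case = predict_proba(w, [1])      # feature present
p_without_case = predict_proba(w, [0])   # feature absent
```
]]></preformat>
      <p>With a single binary feature the fitted probabilities approach the observed class frequencies in each group, which gives a simple way to check the fit by hand.</p>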
      <p>As can be seen from the obtained results, standard errors, z-scores and p-values were determined for each of the
coefficients. None of the coefficients is significant here except for K205 and EDR,
which is consistent with the correlation results (see Fig. 2). The effectiveness of logistic regression is
assessed by certain key indicators:
• AIC (Akaike Information Criterion): this is the counterpart of R2 in logistic regression. It
measures the goodness of fit with a penalty applied for the number of parameters. Smaller AIC values indicate
that the model is closer to the truth. In the presented implementation, AIC = 291.54.
• Null deviance: the fit of the model with the intercept only, with n − 1 degrees of freedom. It is interpreted
as a chi-square value (the fitted value differing from the actual value in a hypothesis test). Residual
deviance: the fit of the model with all variables. It is also interpreted as a chi-square hypothesis test.
The example (see Fig. 3) shows that the deviance decreases by 843.62 with the loss of 23 degrees of freedom for the
predictor variables (degrees of freedom = number of observations − number of predictors). This
reduction in deviance is evidence of the suitability of the obtained model.
• Number of Fisher scoring iterations: the number of iterations before convergence, equal
to 8 for this task.</p>
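      <p>The relation between the log-likelihood, the deviance and the AIC can be made concrete with a short Python sketch. It relies on two standard facts: for 0/1 outcomes the deviance equals −2 times the log-likelihood, and AIC = 2k − 2 log L for a model with k estimated parameters. The labels and fitted probabilities below are hypothetical and are not the values behind the figures reported above.</p>
      <preformat><![CDATA[
```python
import math

def log_likelihood(y, p):
    """Bernoulli log-likelihood of labels y under predicted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi) for yi, pi in zip(y, p))

def deviance(y, p):
    """For 0/1 outcomes the saturated log-likelihood is 0, so deviance = -2 log L."""
    return -2.0 * log_likelihood(y, p)

def aic(y, p, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2.0 * n_params - 2.0 * log_likelihood(y, p)

# Hypothetical labels and fitted probabilities (illustration only).
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
p_null = [sum(y) / len(y)] * len(y)                            # intercept-only model
p_fit = [0.8, 0.8, 0.8, 0.8, 0.2, 0.2, 0.2, 0.2, 0.2, 0.8]     # model with one predictor

null_dev = deviance(y, p_null)
resid_dev = deviance(y, p_fit)
model_aic = aic(y, p_fit, n_params=2)                          # intercept + one predictor
```
]]></preformat>
      <p>The drop from null_dev to resid_dev is the quantity that the deviance comparison above refers to, and model_aic exceeds resid_dev by twice the number of parameters.</p>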
      <p>Next, consider how accuracy, sensitivity and specificity change with the chosen threshold. By
default, a 50% threshold on the probability of fictitiousness is used to assign observations to classes.
However, the graph (Fig. 4) shows that the probability threshold has two increases, from 1% to
50% and from 50% to 100%.</p>
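      <p>How the three indicators respond to the threshold can be explored with a simple sweep. The Python sketch below uses hypothetical labels and predicted probabilities, and the function name metrics_at is an assumption; it labels an observation as positive when its predicted probability is at least the threshold.</p>
      <preformat><![CDATA[
```python
def metrics_at(y, p, threshold):
    """Accuracy, sensitivity and specificity when p >= threshold is labelled positive."""
    tp = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi >= threshold)
    tn = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi < threshold)
    fp = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi >= threshold)
    fn = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi < threshold)
    return {
        "accuracy": (tp + tn) / len(y),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }

# Hypothetical labels and predicted probabilities (not the paper's data).
y = [1, 1, 1, 1, 0, 0, 0, 0]
p = [0.95, 0.80, 0.60, 0.40, 0.52, 0.30, 0.20, 0.05]

# Evaluate thresholds 0.05, 0.10, ..., 0.95.
sweep = {t / 100: metrics_at(y, p, t / 100) for t in range(5, 100, 5)}
at_50 = sweep[0.5]
at_55 = sweep[0.55]
```
]]></preformat>
      <p>In this toy example, moving the cut-off from 0.50 to 0.55 removes one false positive, so accuracy and specificity rise while sensitivity is unchanged.</p>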
      <p>Consider the indicators of accuracy, sensitivity and specificity (Fig. 5): the diagram shows that
accuracy and sensitivity begin to decrease at 55%. Therefore, consider the confusion matrix (Fig.
6) for a cut-off point of 55%, with Accuracy: 0.99.
The values of the confusion matrix (Table 1) for the training sample are as follows:
• True positive (TP) = 129; this means that 129 data points of the positive class were correctly
classified by the model;
• True negative (TN) = 121; this means that 121 data points of the negative class were correctly
classified by the model;
• False positive (FP) = 5; this means that 5 data points of the negative class were incorrectly
classified by the model as belonging to the positive class;
• False negative (FN) = 7; this means that 7 data points of the positive class were incorrectly
classified by the model as belonging to the negative class.</p>
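      <p>The standard ratios can be derived directly from the four counts listed above. The short Python sketch below uses only those counts; the variable names are chosen for the example.</p>
      <preformat><![CDATA[
```python
tp, tn, fp, fn = 129, 121, 5, 7      # counts listed in the text

total = tp + tn + fp + fn
accuracy = (tp + tn) / total
sensitivity = tp / (tp + fn)         # true positive rate
specificity = tn / (tn + fp)         # true negative rate
precision = tp / (tp + fp)

print(f"n={total} accuracy={accuracy:.3f} sensitivity={sensitivity:.3f} "
      f"specificity={specificity:.3f} precision={precision:.3f}")
# prints: n=262 accuracy=0.954 sensitivity=0.949 specificity=0.960 precision=0.963
```
]]></preformat>
      <p>The four counts sum to 262, a quarter of the 1,048 enterprises, and the accuracy computed from them is about 0.954; the text does not state how the value of 0.99 quoted for the 55% cut-off was obtained.</p>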
      <p>The ROC curve is a popular graph for displaying two types of errors simultaneously for all possible
thresholds. Therefore, we present the ROC-curve for our study (Fig. 6).</p>
      <p>Area under the curve: 0.9936</p>
      <p>As the ROC curve shows (see Fig. 6), the optimal threshold of the diagnostic assessment for
predicting fictitious enterprises is 0.6, at which sensitivity and specificity are 96.3% and 95.2%, respectively.
The predictive quality of the model for identifying fictitious enterprises is quite high, namely
AUC = 0.99.</p>
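      <p>Both the area under the ROC curve and a candidate optimal threshold can be computed from labels and predicted probabilities alone. The Python sketch below is an illustration under assumptions: the AUC is computed as the share of positive and negative pairs ranked correctly (ties count one half), and the threshold is chosen by Youden's J statistic, which is one common criterion; the paper does not state which criterion produced its threshold of 0.6. The data and function names are hypothetical.</p>
      <preformat><![CDATA[
```python
def roc_points(y, p):
    """(threshold, sensitivity, specificity) for every distinct predicted probability."""
    pos = sum(y)
    neg = len(y) - pos
    points = []
    for t in sorted(set(p)):
        tp = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi >= t)
        tn = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi < t)
        points.append((t, tp / pos, tn / neg))
    return points

def auc(y, p):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def best_threshold(y, p):
    """Threshold maximising Youden's J = sensitivity + specificity - 1."""
    return max(roc_points(y, p), key=lambda pt: pt[1] + pt[2] - 1)[0]

# Hypothetical labels and predicted probabilities (not the paper's data).
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
p = [0.95, 0.85, 0.70, 0.60, 0.35, 0.65, 0.30, 0.25, 0.15, 0.05]

area = auc(y, p)
cutoff = best_threshold(y, p)
```
]]></preformat>
      <p>For these toy values 23 of the 25 positive and negative pairs are ordered correctly, so the area is 0.92, and the J statistic is largest at the threshold 0.35.</p>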
      <p>The logistic regression model was used to predict the fictitiousness of an enterprise. A cut-off of 55%
gave a high Accuracy of 0.99, and the area under the ROC curve indicates a similarly high value of 0.99.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>A method for detecting a fictitious enterprise based on a classical machine learning method,
namely Logistic Regression, is proposed. It makes it possible to track fictitious enterprises quickly, which is
useful for public sector employees in preventing economic crimes.</p>
      <p>The method is implemented in the R software environment. To solve this problem, a binary sample
of 1,048 enterprises was selected, of which 390 are fictitious. The EDA makes it possible to clean the
data as needed. The correlation diagram shows that most parameters are poorly correlated with each
other. In particular, there is only a partial correlation with parameter K205, which
indicates the existence of criminal cases under the Criminal Code of Ukraine. A Logistic Regression
model was built: AIC = 291.54; the deviance decreases by 843.62 with the loss of 23 degrees of freedom for the predictor variables;
number of Fisher scoring iterations = 8. The prediction of the fictitiousness of enterprises achieved
Accuracy = 0.99 and AUC = 0.99. The confusion matrix gave the following classification results for
the training sample: 129 data points of the positive class were correctly classified by the model; 121 data
points of the negative class were correctly classified by the model; 5 data points of the negative class were
incorrectly classified as belonging to the positive class; 7 data points of the positive class were
incorrectly classified as belonging to the negative class.</p>
      <p>Further research is expected to develop an algorithm for recognizing images of enterprise
equipment, processing the geolocation data and converting them into binary values.</p>
    </sec>
    <sec id="sec-6">
      <title>6. References</title>
      <p>[17] R. A. Kurnia, D. Praditya, M. Janssen, A comparative study of business-to-government information sharing arrangements for tax reporting. in: Dwivedi Y., Ayaburi E., Boateng R., Effah J. (eds) ICT Unbounded, Social Impact of Bright ICT Adoption TDIT 2019. IFIP Advances in Information and Communication Technology, 558 (2019) 154-169. https://doi.org/10.1007/978-3030-20671-0_11</p>
      <p>[18] C. Bethencourt, L. Kunze, Social norms and economic growth in a model with labor and capital income tax evasion. Economic Modelling 86 (2019) 170-182. doi: 10.1016/j.econmod.2019.06.009.</p>
      <p>[19] D. Saxunova, R. Sulikova, R. Szarkova, Tax management hierarchy – Tax fraud and a fraudster. in: Proceedings of the Joint International Conference on Managing the Global Economy MIC’2017, Monastier di Treviso, Italy, 24–27 May 2017, University of Primorska Press.</p>
      <p>[20] K. Bazilevych, M. Mazorchuk, Y. Parfeniuk, V. Dobriak, I. Meniailov, &amp; D. Chumachenko, Stochastic modelling of cash flow for personal insurance fund using the cloud data storage. International Journal of Computing, 17 (2018) 153-162. https://doi.org/10.47839/ijc.17.3.1035</p>
      <p>[21] United Nations Department for Economic and Social Affairs. (2020). World economic situation and prospects 2020. p. 236.</p>
      <p>[22] H. Lipyanina, S. Sachenko, T. Lendyuk, V. Brych, V. Yatskiv, O. Osolinskiy, Method of Detecting a Fictitious Company on the Machine Learning Base. In: Hu Z., Petoukhov S., Dychka I., He M. (eds) Advances in Computer Science for Engineering and Education IV. ICCSEEA 2021. Lecture Notes on Data Engineering and Communications Technologies, 83 (2021). https://doi.org/10.1007/978-3-030-80472-5_12</p>
      <p>[23] A. Krysovatyy, H. Lipyanina-Goncharenko, S. Sachenko and O. Desyatnyuk, Economic Crime Detection Using Support Vector Machine Classification. Modern Machine Learning Technologies and Data Science Workshop. Proc. 3rd International Workshop (MoMLeT&amp;DS 2021). Volume I: Main Conference. Lviv-Shatsk, Ukraine, June 5-6, 2021, 830-840.</p>
      <p>[24] R. Gramyak, H. Lipyanina-Goncharenko, A. Sachenko, T. Lendyuk and D. Zahorodnia, Intelligent Method of a Competitive Product Choosing based on the Emotional Feedbacks Coloring, CEUR WS, (2021) 246-257.</p>
      <p>[25] Z. Hu, M. Ivashchenko, L. Lyushenko, D. Klyushnyk, Artificial Neural Network Training Criterion Formulation Using Error Continuous Domain, International Journal of Modern Education and Computer Science (IJMECS), 13 3 (2021) 13-22. doi: 10.5815/ijmecs.2021.03.02</p>
      <p>[26] The OLAF report 2020. OLAF, 2020, URL: https://ec.europa.eu/anti-fraud/system/files/202112/olaf_report_2020_en.pdf</p>
      <p>[27] M. Kantardzic, Data mining: concepts, models, methods, and algorithms. 3rd Edition. Wiley-IEEE Press. 2019, 672 p.</p>
      <p>[28] L. Corselli, Italy: money transfer, money laundering and intermediary liability, Journal of Financial Crime, (2020) Vol. ahead-of-print No. ahead-of-print. doi: https://doi.org/10.1108/JFC10-2019-0137</p>
      <p>[29] P. Yeoh, Artificial intelligence: accelerator or panacea for financial crime?, Journal of Financial Crime, 26 2 (2019) 634-646. doi: https://doi.org/10.1108/JFC-08-2018-0077</p>
      <p>[30] S. Dsouza, H. Habibniya, R. Demiraj, AI, a Provenance or Solution for Financial Crime. Manag Econ Res J, 7(2) (2021) 26140.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Thabtah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Abdelhamid</surname>
          </string-name>
          , &amp;
          <string-name>
            <surname>D. Peebles,</surname>
          </string-name>
          <article-title>A machine learning autism classification based on logistic regression analysis</article-title>
          .
          <source>Springer. Health Inf Sci Syst</source>
          <volume>7</volume>
          (
          <year>2019</year>
          )
          <article-title>article ID 12</article-title>
          . https://doi.org/10.1007/s13755-019-0073-5
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>W.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <article-title>Employee resignation prediction model based on machine learning</article-title>
          . In: Abawajy J.,
          <string-name>
            <surname>Choo</surname>
            <given-names>KK</given-names>
          </string-name>
          ., Xu
          <string-name>
            <given-names>Z.</given-names>
            ,
            <surname>Atiquzzaman</surname>
          </string-name>
          <string-name>
            <surname>M</surname>
          </string-name>
          . (eds),
          <source>Proceedings of the 2020 International Conference on Applications and Techniques in Cyber Intelligence (ATCI</source>
          '
          <year>2020</year>
          ),
          <source>Advances in Intelligent Systems and Computing</source>
          ,
          <volume>1244</volume>
          , (
          <year>2020</year>
          )
          <fpage>367</fpage>
          -
          <lpage>374</lpage>
          . https://doi.org/10.1007/978-3-
          <fpage>030</fpage>
          -53980-1_
          <fpage>55</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ohsaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Matsuda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Katagiri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Watanabe</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Ralescu</surname>
          </string-name>
          ,
          <article-title>Confusion-matrix-based kernel logistic regression for imbalanced data classification</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>29</volume>
          .9 (
          <year>2017</year>
          )
          <fpage>1806</fpage>
          -
          <lpage>1819</lpage>
          , doi: 10.1109/TKDE.
          <year>2017</year>
          .
          <volume>2682249</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>O.</given-names>
            <surname>Aborisade</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Anwar</surname>
          </string-name>
          ,
          <article-title>Classification for authorship of tweets by comparing logistic regression and Naive Bayes Classifiers</article-title>
          ,
          <source>in: Proceedings of the 2018 IEEE International Conference on Information Reuse and Integration (IRI)</source>
          , (
          <year>2018</year>
          )
          <fpage>269</fpage>
          -
          <lpage>276</lpage>
          , doi: 10.1109/IRI.2018.00049.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Goksu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Cavus</surname>
          </string-name>
          ,
          <article-title>Fake news detection on social networks with artificial intelligence tools: Systematic literature review</article-title>
          . In:
          <string-name>
            <surname>Aliev</surname>
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kacprzyk</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pedrycz</surname>
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jamshidi</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Babanli</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sadikoglu</surname>
            <given-names>F</given-names>
          </string-name>
          . (eds)
          <source>10th International Conference on Theory and Application of Soft Computing, Computing with Words and Perceptions ICSCCW-2019. Advances in Intelligent Systems and Computing</source>
          ,
          <volume>1095</volume>
          (
          <year>2020</year>
          )
          <fpage>47</fpage>
          -
          <lpage>53</lpage>
          . https://doi.org/10.1007/978-3-030-35249-3_5
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Research on logistic regression algorithm of breast cancer diagnose data by machine learning</article-title>
          ,
          <source>in: Proceedings of the 2018 International Conference on Robots &amp; Intelligent System (ICRIS)</source>
          , (
          <year>2018</year>
          )
          <fpage>157</fpage>
          -
          <lpage>160</lpage>
          , doi: 10.1109/ICRIS.2018.00049.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>F.</given-names>
            <surname>Barboza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Kimura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Altman</surname>
          </string-name>
          ,
          <article-title>Machine learning models and bankruptcy prediction</article-title>
          .
          <source>Expert Systems with Applications</source>
          <volume>83</volume>
          (
          <year>2017</year>
          )
          <fpage>405</fpage>
          -
          <lpage>417</lpage>
          , doi: 10.1016/j.eswa.2017.04.006.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>İ.</given-names>
            <surname>Kabasakal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.D.</given-names>
            <surname>Keskin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Koçak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Soyuer</surname>
          </string-name>
          ,
          <article-title>A prediction model for fault detection in molding process based on logistic regression technique</article-title>
          . In:
          <string-name>
            <surname>Durakbasa</surname>
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gençyılmaz</surname>
            <given-names>M</given-names>
          </string-name>
          . (eds)
          <source>Proceedings of the International Symposium for Production Research ISPR'2019, Lecture Notes in Mechanical Engineering</source>
          . (
          <year>2020</year>
          )
          <fpage>351</fpage>
          -
          <lpage>360</lpage>
          . https://doi.org/10.1007/978-3-030-31343-2_31
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Nieuwenhuis</surname>
          </string-name>
          , &amp;
          <string-name>
            <given-names>J.</given-names>
            <surname>Wilkens</surname>
          </string-name>
          ,
          <article-title>Twitter text and image gender classification with a logistic regression n-gram model</article-title>
          .
          <source>in: Proceedings of the Ninth International Conference of the Working Notes of CLEF 2018 - Conference and Labs of the Evaluation Forum (CLEF 2018)</source>
          . CEUR-WS, vol.
          <volume>2125</volume>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>V.</given-names>
            <surname>Krylov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sachenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Strubytskyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lendiuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lipyanina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zahorodnia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dorosh</surname>
          </string-name>
          , &amp;
          <string-name>
            <given-names>T.</given-names>
            <surname>Lendyuk</surname>
          </string-name>
          ,
          <article-title>Multiple regression method for analyzing the tourist demand considering the influence factors</article-title>
          .
          <source>in: Proceedings of the 2019 10th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications</source>
          (IDAACS'2019), Metz, France,
          <volume>2</volume>
          (
          <year>2019</year>
          )
          <fpage>974</fpage>
          -
          <lpage>979</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <article-title>Upper Critical Values of the F Distribution</article-title>
          . Information Technology Laboratory | NIST. URL: https://www.itl.nist.gov/div898/handbook/eda/section3/eda3673.htm
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Allingham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sandmo</surname>
          </string-name>
          ,
          <article-title>Income tax evasion: a theoretical analysis</article-title>
          .
          <source>Journal of Public Economics</source>
          <volume>1</volume>
          (
          <year>1972</year>
          )
          <fpage>323</fpage>
          -
          <lpage>338</lpage>
          , doi: 10.1016/0047-2727(72)90010-2.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Alstadsaeter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Johannesen</surname>
          </string-name>
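          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Zucman</surname>
          </string-name>
          ,
          <article-title>Tax evasion and inequality</article-title>
          .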
          <source>American Economic Review</source>
          <volume>109</volume>
          (
          <year>2019</year>
          )
          <fpage>2073</fpage>
          -
          <lpage>2103</lpage>
          , doi: 10.1257/aer.20172043.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>N.</given-names>
            <surname>Johannesen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Langetieg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Reck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Risch</surname>
          </string-name>
          &amp;
          <string-name>
            <given-names>J.</given-names>
            <surname>Slemrod</surname>
          </string-name>
          ,
          <article-title>Taxing hidden wealth: The consequences of US enforcement initiatives on evasive foreign accounts</article-title>
          .
          <source>American Economic Journal: Economic Policy</source>
          <volume>12</volume>
          (
          <year>2020</year>
          )
          <fpage>312</fpage>
          -
          <lpage>346</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Slemrod</surname>
          </string-name>
          ,
          <article-title>Tax compliance and enforcement</article-title>
          .
          <source>Journal of Economic Literature</source>
          <volume>57</volume>
          (
          <year>2019</year>
          )
          <fpage>904</fpage>
          -
          <lpage>954</lpage>
          , doi: 10.1257/jel.20181437.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>E.</given-names>
            <surname>Casi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Spengel</surname>
          </string-name>
          , &amp;
          <string-name>
            <given-names>B. M.</given-names>
            <surname>Stage</surname>
          </string-name>
          ,
          <article-title>Cross-border tax evasion after the common reporting standard: Game over?</article-title>
          <source>Journal of Public Economics</source>
          <volume>190</volume>
          (
          <year>2020</year>
          ),
          <fpage>104240</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>