<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">0219-3116</issn>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1007/978-981-15-3075-3</article-id>
      <title-group>
        <article-title>Machine Learning Methods to Detect Terrorist Financing</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aigerim K. Bolshibayeva</string-name>
          <email>a.bolshibayeva@iitu.edu.kz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sabina B. Rakhmetulayeva</string-name>
          <email>s.rakhmetulayeva@iitu.edu.kz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aliya K. Kulbayeva</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>International Information Technology University</institution>
          ,
          <addr-line>Manas St. 34/1, Almaty, 050040</addr-line>
          ,
          <country country="KZ">Kazakhstan</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <abstract>
        <p>Within the global issue of money laundering, this study conducts an extensive national risk assessment focused on Kazakhstan. Advanced methods are employed to identify vulnerabilities in both financial and non-financial sectors and to evaluate the risks linked to money laundering. The research applies both unsupervised and supervised learning methods to scrutinize patterns in financial transactions, with the goal of differentiating legitimate transactions from those that may involve money laundering. K-means clustering and logistic regression yield promising results in identifying irregularities and suspicious transactions. Through the incorporation of synthetic financial transaction data, the research provides insight into the concealed nature of money laundering practices. This study represents an initial stride towards improving anti-money laundering efforts and reinforcing the legal and institutional framework in Kazakhstan, and its findings offer valuable perspectives on the detection of money laundering and its ramifications for national and international security. The article also presents an overview of the data mining techniques that can be employed to detect financial offenses, including the financing of terrorist organizations, and describes data preprocessing for a machine learning model that detects such financing from publicly available data in order to identify the largest possible set of financing anomalies.</p>
      </abstract>
      <kwd-group>
        <kwd>Machine learning</kwd>
        <kwd>neural networks</kwd>
        <kwd>terrorist financing</kwd>
        <kwd>boosting</kwd>
        <kwd>classification algorithms</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Nowadays, the actions of terrorist organizations, groups, and individuals are acknowledged as one of the primary threats to the national security of the Republic of Kazakhstan. The extent of the repercussions of terrorist acts and the substantial number of people killed or injured as a direct result of their commission are the primary factors behind the severity of this threat.</p>
      <p>The degree of financial support, as well as the availability of material and technological
resources, has a direct bearing on the severity of terrorist action. In this context, one of the most
essential instruments in the battle against international terrorism is the practice of placing a
freeze on the assets of terrorist groups and shutting off routes that are used to finance terrorist
operations.</p>
      <p>The systems that financial institutions use today to detect suspicious (abnormal) behavior are mostly based on sets of rules created by domain experts. Because these rules are rigid and cannot be readily adapted, they are susceptible to being circumvented and manipulated. Another issue is that such systems produce a high number of false positives, which are time-consuming to process because of their volume.</p>
      <p>Machine learning-based solutions can be continually trained and updated with fresh data, which makes them adaptive and eliminates the need for specialists to develop new rules. The decision tree (DT) method serves as the foundation for the solution suggested in this research. DT is applicable because it is simple to visualize and comprehend, which enables it to provide a justification for the judgments it makes. The decision tree extensions examined in this study are geared toward reducing the number of false positives. Their drawback is that they also limit the number of fraudulent activities identified, resulting in a trade-off between the number of false positives and the number of fraudulent activities that go undetected.</p>
      <p>The main aim of the research is the development of a machine learning model to identify the
financing of terrorist organizations based on open data to determine the maximum possible set
of funding anomalies.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Literature review</title>
      <p>Terrorism, which degrades the standard of living of people all over the globe, is recognized as one of the most significant dangers facing contemporary civilization. Its purpose is to sow fear, worry, and insecurity on a scale broader than that of an individual, and its intended effect is instability. The most recent information from the Global Terrorism Database (GTD) indicates that 1,211 separate acts of terrorism were carried out throughout the globe in 2019, with 6,362 persons sustaining injuries as a direct consequence of these attacks [1].</p>
      <p>Figure 1 [2] is a graphic representation of a world map covering many forms of terrorist activity. According to this map, the sites of terrorist activity are concentrated in the region immediately surrounding Kazakhstan.</p>
      <p>Because of the impact that terrorism has on the economy, industry, financial institutions, and the country as a whole, it is important and unavoidable that suspicious conduct connected to terrorist funding be identified and investigated. Because of the significant quantity of financial data and the enormous number of transactions involved, banks offer a conducive environment for those financing terrorism to mask the origin of the funds. As a result, the techniques used for terrorist financing have become more complex and harder to track, and the detection of terrorist funding must strike a compromise between precision and processing speed. The most crucial step in solving this problem is to find an appropriate strategy for identifying terrorist funding in financial institutions and banks, which may be accomplished by choosing a machine learning methodology appropriate to the data set. There is a wide variety of strategies and procedures for identifying irregular money flows, which makes comparison challenging. "However, it is necessary to conduct an analysis, review, and comparison of various methods for detecting terrorist financing in order to identify crimes, patterns, and unusual behavior as well as money laundering groups for the purpose of financing terrorism" [3].</p>
      <p>Rule-based approaches were followed by the introduction of statistical methods and sequence matching in the late 1990s [4]. These methods were first used to identify unusual financial transactions. Over time, financial institutions began integrating statistical and machine learning models into their automated algorithms. Complying with anti-crime regulations is difficult given the very complex nature of these models, driven by the growing number of client transactions and automated customer interactions. Some of the most recent research suggests supervised learning techniques, which learn from a labeled dataset and then classify incoming data into several label categories, as a potential solution to the issue of huge dataset sizes. Consequently, supervised learning algorithms can only identify potentially suspicious behaviors, patterns, and transactions that are comparable to the data they were trained on.</p>
      <p>When a complicated transaction mechanism is encountered, experts must repeat the diagnostic procedure, potentially consulting a narrow specialist, which slows down the prompt discovery of abnormal financial processes. Additionally, the transaction-analysis rule sets used by experts eventually go out of date and are often unable to adapt to new circumstances.</p>
      <p>Utilizing ML approaches is one way to overcome the obstacles discussed above. Neural networks have become increasingly popular as a solution to a variety of data categorization, logging, and error-checking problems. In our proposal to design a system for the detection of the financial criminal actions of terrorist funding, we propose to use neural networks as the primary technique of data assessment and annotation, in conjunction with traditional machine learning (ML) approaches.</p>
      <p>Due to the significant role that it plays in both the fight against cybercrime and the
maintenance of a healthy economy, there is a wealth of research available on the subject of
identifying instances of financial fraud.</p>
      <p>Researchers commonly utilize outlier detection techniques [5] on severely skewed datasets when attempting to uncover instances of financial crime. Fraud in its many forms may also be committed against financial institutions. According to one study [6], there are four different types of financial fraud: fraudulent financial reporting, fraudulent transactions, fraudulent insurance, and fraudulent credit.</p>
      <p>The research paper "A machine learning approach for terrorist financing detection" by Raghavan, V., and Rakesh, V., published in 2019 in the International Journal of Computational Intelligence and Informatics, investigates the use of machine learning techniques to identify the sources of funding for terrorist operations.</p>
      <p>The purpose of this study is to build an effective model for the use of machine learning in the
identification of financial dealings that are associated with terrorist operations. The authors
recommend employing machine learning approaches, such as classification and clustering
algorithms, to examine financial data in order to detect typical patterns and anomalies that are
connected with terrorist funding.</p>
      <p>The article provides a description of a technique that evaluates and contrasts a number of
algorithms, as well as preparation of data, selecting and tweaking of machine learning models,
and assessment of different algorithms. The authors identify prospective instances of terrorist
financing by using a variety of machine learning methods, such as decision trees, random forests,
and support vector machines, to categorize monetary transactions as "normal" or "suspicious" in
order to locate possible funding sources for terrorist organizations.</p>
      <p>Based on the findings of the research, the authors suggest a machine learning model with a high success rate in recognizing financial transactions associated with terrorist activities. Comparisons are also drawn with existing methodologies, and the authors show why their methodology is preferable.</p>
      <p>The paper "Detecting terrorist financing using deep learning-based anomaly detection" by Yoon, B., Ahn, J. H., Kim, J., and Lee, D. H. was presented at the 18th IEEE International Conference on Machine Learning and Applications (ICMLA) in 2019 and published in the conference proceedings.</p>
      <p>This work combines deep learning and anomaly detection to develop a system for identifying financial support for terrorist activities. The authors propose deep neural networks, a kind of artificial neural network, as a method for analyzing financial data and locating abnormal patterns associated with terrorist funding.</p>
      <p>In the study, a methodology is outlined that involves data preparation, the development of a
deep learning model, and the training of the model via the application of labeled data. In order to
identify abnormalities in financial data, the authors make use of a variety of different designs for
deep neural networks, such as convolutional neural networks and recurrent neural networks.</p>
      <p>The authors perform tests and assess the usefulness of the suggested strategy by using data
sets linked to the funding of terrorist operations. According to the findings of the research, the
use of deep learning in conjunction with anomaly detection may be a useful method for the
identification of financial transactions that are affiliated with terrorist groups.</p>
      <p>Many different approaches have been investigated for detecting financial fraud. In the field of automobile insurance, Phua et al. [7] employed neural networks, naive Bayes, and decision trees to identify fraudulent activity. Ravisankar et al. (2011) were the first to combine SVM, genetic programming, logistic regression, and neural networks to identify financial reporting fraud at Chinese enterprises. Density-based clustering [8] and cost-sensitive decision trees have also been used to detect fraudulent activity. Sorournejad et al. [9] cover supervised and unsupervised machine learning methodologies, such as clustering, artificial neural networks (ANNs), support vector machines (SVMs), and hidden Markov models (HMMs). Wedge et al. [10] handle the issue of unbalanced data, which results in a relatively large number of false positives, and other studies likewise provide strategies to overcome this problem.</p>
      <p>In spite of this, there is a paucity of research on the subject of identifying the financial dealings
of terrorist groups, perhaps as a result of the recentness of technological advancements.</p>
      <p>The article by Albashrawi and colleagues [11] provides a comprehensive analysis of the techniques most often applied to the detection of financial fraud (Table 1).</p>
      <p>The use of an accurate representation of the data is one of the distinguishing characteristics of
the model that has been presented. Models that are able to enhance this representation have been
and continue to be the primary avenues of advancement in deep learning.</p>
      <p>The following methods, identified from the findings of the review, have been recognized as ways to enhance machine learning algorithms for the detection of anomalous financing:
• Collecting richer data: the efficiency of machine learning algorithms may be considerably improved by collecting data that is more diversified and up to date, focused on financial transactions and connections to terrorist operations. Obtaining data from a variety of sources, such as banking and financial institutions, law enforcement agencies, international organizations, and public information sources, may be part of this process.
• Applying deep learning and neural networks: powerful methods such as deep learning make it easier to process complicated financial data and to locate previously concealed patterns and anomalies. The capability of models to identify terrorist funding may be improved by using more complex architectures, such as convolutional and recurrent neural networks.
• Incorporating contextual information: social affiliations, location, and event data may assist in better understanding and predicting the connections between terrorist operations and financial transactions. When attempting to identify shifting patterns of terrorist funding, it is essential to take into account the dynamics of the situation over time.
• Loss functions and evaluation metrics: selecting suitable loss functions and evaluation metrics is one of the most crucial steps in developing machine learning algorithms. It is essential to take into consideration the particulars of the terrorist-funding detection task and to make an effort to reduce both false positives (false alarms) and false negatives (omissions).</p>
    </sec>
    <sec id="sec-3">
      <title>3. Aim and research question</title>
      <p>The main research questions of the study:
• Can we define a reliable methodology for the analysis and detection of various scenarios in the absence of labels or reliable data?
• Is it possible to extend the methodology in the presence of labels and generalize well even in the presence of unbalanced classes?
• How can we evaluate the quality of synthetic data?
• Can we improve unbalanced classification with synthetic data?</p>
      <p>There are three main directions for addressing these questions:
• Study the literature on the identification of terrorist financing and understand the various aspects of the problem.
• Solve the problem of detecting financial fraud on a publicly available set of data samples using supervised machine learning methods.</p>
      <p>Following these stages will result in the construction of a framework that incorporates the
ideas of analytics and machine learning in order to address fundamental issues.</p>
      <p>In this investigation, we begin by defining supervised machine learning algorithms. These are
programs that are able to recognize patterns in the data that link features (a quantifiable aspect
of the data) with labels (a specific aspect of the data). Learning takes place when algorithms
search for patterns by analyzing samples for which labels are already known.</p>
      <p>A model, which is an approximation of the underlying relationships between features and
labels, is the product of this process. During the test phase, the model assigns predictions to
samples that were not part of the training phase, and then we compare these assignments to the
labels that were already known.</p>
      <p>The purpose of this test is to see how effectively the model generalizes to scenarios that have
not yet been encountered. A supervised machine learning issue is referred to as a classification
when each label represents one choice from a set of potential classes. On the other hand, a
supervised machine learning problem is referred to as a regression when the labels are
continuous values.</p>
      <p>XGBoost [12], a well-known supervised learning tool in the data science field, is a modernized and accelerated version of the earlier Gradient Boosting Machines technique [13].</p>
      <p>An ensemble is a model that combines the findings of many other models in order to compensate for the deficiencies of each individual model. The majority of ensemble approaches may be categorized as either bagging or boosting [14].</p>
      <p>The boosting method is an ensemble of consecutive models, each of which corrects the mistakes made by the model before it in the series. In particular, the term "gradient boosting" refers to a method for reducing the prediction residuals that is based on gradient descent. Decision trees are the default and most popular ensemble elements for XGBoost; they have significant variance, which boosting can correct. XGBoost, like many other machine learning models, generates outputs that indicate the likelihood that a given sample belongs to a certain class. For binary classification, the decision comes from splitting the predicted probability into two classes using a threshold of 0.5.</p>
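As an illustrative sketch of this thresholding step (using scikit-learn's GradientBoostingClassifier and synthetic data as stand-ins for XGBoost and the real transaction features, which are not reproduced here):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic imbalanced stand-in data; the paper's features are not public.
X, y = make_classification(n_samples=500, n_features=10,
                           weights=[0.9, 0.1], random_state=0)

model = GradientBoostingClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# The model outputs class-membership probabilities; thresholding them
# at 0.5 yields the binary decision described above.
proba = model.predict_proba(X)[:, 1]
labels = (proba >= 0.5).astype(int)
```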
      <p>Following that, we examined the mechanism for detecting anomalies. This technique identifies instances within a sample that exhibit abnormal behavior and isolates them for further investigation; it is a long-standing objective of statistical analysis and machine learning, and there is a wide variety of effective approaches. One primary approach is to perform statistical tests that model the distribution function of the data in a frequentist interpretation, or to use a Bayesian interpretation to analyze the likelihood of sampling the input data under the modeled probability distribution function. Both approaches are viable. The most straightforward method for detecting anomalies is to take each variable and compute the z-score for each sample, a score tied to the point's distance from the mean, quantified in standard deviations.</p>
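A minimal sketch of this per-variable z-score check, with made-up transaction amounts and an illustrative cutoff of two standard deviations:

```python
import numpy as np

def z_scores(x):
    # Distance of each sample from the mean, in standard deviations.
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

# Flag samples more than 2 standard deviations from the mean as anomalous.
amounts = np.array([10.0, 12.0, 9.0, 11.0, 10.5, 250.0])
flags = np.abs(z_scores(amounts)) > 2.0
```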
      <p>The difficulty is that defining an anomaly score for each parameter independently ignores the intricate relationships between features. A multidimensional version of this approach uses the Mahalanobis distance; however, this extension assumes that the data are normally distributed, which is not always the case. The Isolation Forest (IF) technique [15] is predicated on the hypothesis that a sample should be considered anomalous if it is easier to separate it from the rest of the data by a random subdivision of the feature space. To begin, a sample is drawn from the larger data collection and a random dimension is selected. The sample is then split using a random value that falls within the allowed range of that dimension. Once the dimension and split point are specified, the algorithm generates the first node of the tree. Additional nodes are generated for the subsamples recursively until a split is impossible or an arbitrary tree depth has been reached. In this tree, a point closer to the root node corresponds to a case that is easier to isolate, although this may also occur by random chance. The algorithm then repeats the whole tree-building process with fresh samples until the required number of trees has been generated. As a last step, it determines the anomaly score by computing the average length of the traversal path over all of the trees.</p>
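A short sketch of the Isolation Forest idea using scikit-learn's implementation on synthetic two-dimensional data; the contamination level is an assumption chosen for illustration:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Mostly ordinary activity, plus two extreme points.
normal = rng.normal(loc=0.0, scale=1.0, size=(300, 2))
outliers = np.array([[8.0, 8.0], [-9.0, 7.0]])
X = np.vstack([normal, outliers])

# predict labels anomalies as -1 and inliers as +1; the score is based
# on the average isolation-path length across the trees.
forest = IsolationForest(n_estimators=100, contamination=0.01,
                         random_state=0).fit(X)
labels = forest.predict(X)
```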
      <p>An analysis of variance strategy will be used in order to determine the parameters and
attributes of the neural network before moving on to the next step of the process, which is the
selection of the machine learning models.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Carrying out the experiment</title>
      <p>The initial step in the preparation phase involves screening faulty values in the dataset and removing the rows that contain them. Experts in the field need to specify the expected format for each variable. Invalid samples may result from software errors, programming errors, or data-entry mistakes; in certain instances, criminals intentionally submit malformed information in order to get around the safeguards in place. As a result, invalid transactions need to be labeled as suspicious early in the pre-processing stage and filtered out of the remainder of the investigation.</p>
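A hedged sketch of this screening step in pandas; the column names and validity rules here are hypothetical, not the paper's actual schema:

```python
import pandas as pd

# Hypothetical transaction table for illustration only.
df = pd.DataFrame({
    "account": ["a1", "a2", "a3", "a4"],
    "amount": [120.0, -5.0, 300.0, None],
})

# Rows with missing or non-positive amounts are treated as invalid,
# labeled suspicious early, and filtered out of the rest of the analysis.
invalid = df["amount"].isna() | df["amount"].le(0)
suspicious = df[invalid]
clean = df[~invalid]
```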
      <p>Next, a number of transactions are aggregated per user, and a collection of vectors is generated as a result. The majority of fraud schemes may be identified by suspiciously high transaction and referral rates, consistent and constant amounts of activity, and extremely short pauses between activities. Recency, frequency, and monetary variables, commonly known as RFM variables, can capture this behavior and are widely employed in fraud research [16]. Any user with an out-of-the-ordinary value in one of the RFM variables should be viewed as potentially malicious; however, uncommon combinations of variable values are more difficult to identify. When RFM variables are computed from transactional data, the resulting data for each account is a collection of time series, whose length is determined by the number of transactions associated with the account. Aggregating the RFM variables for each user is required in order to compare vectors of constant size, but doing so results in an unavoidable loss of information. We therefore suggest applying several statistical functions to each time series at once, such as the median, mean, and standard deviation.</p>
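The proposed aggregation can be sketched in pandas; the RFM-style columns below are illustrative stand-ins for the real variables:

```python
import pandas as pd

# Hypothetical per-transaction records (columns are illustrative).
tx = pd.DataFrame({
    "account": ["u1", "u1", "u1", "u2", "u2"],
    "amount": [10.0, 12.0, 11.0, 500.0, 480.0],
    "gap_hours": [24.0, 30.0, 28.0, 0.1, 0.2],
})

# Collapse each account's variable-length time series into a fixed-size
# vector by applying several statistics at once, as suggested above.
features = tx.groupby("account").agg(["median", "mean", "std"])
features.columns = ["_".join(c) for c in features.columns]
```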
      <p>We train the anomaly detection algorithm on the given dataset using off-the-shelf machine learning implementations. In each case, some model parameters must be established before proceeding. We run the algorithm with a number of different parameter values and then plot the results against the Bayesian Information Criterion (BIC) [17] for each model. This measure cannot be interpreted on its own; however, when comparing the BIC of two models, lower values are preferable. By locating the point at which the slope of the curve stops noticeably decreasing, we identify the number of components involved.</p>
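One way to realize such a BIC sweep, sketched with a Gaussian mixture model in scikit-learn on synthetic data (the model family is an assumption for illustration; the text does not name one):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two well-separated blobs stand in for the real transaction features.
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(6, 1, (200, 2))])

# Fit a range of component counts and record the BIC for each; lower is
# better, and the elbow of the curve suggests the number of components.
bics = {}
for k in range(1, 6):
    gm = GaussianMixture(n_components=k, random_state=0).fit(X)
    bics[k] = gm.bic(X)
best_k = min(bics, key=bics.get)
```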
      <p>Each algorithm assigns every user a score. These scores are measured on different scales, making them impossible to compare without further processing. We therefore rescale the output of each algorithm so that the individual in the dataset who seems the most suspicious receives a score of one and the individual who seems the least suspicious receives a score of zero.</p>
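This rescaling step amounts to a simple min-max transformation:

```python
import numpy as np

def rescale_scores(raw):
    # Map raw anomaly scores so the most suspicious user gets 1 and the
    # least suspicious gets 0, making different algorithms comparable.
    raw = np.asarray(raw, dtype=float)
    return (raw - raw.min()) / (raw.max() - raw.min())

scores = rescale_scores([3.2, -1.0, 0.4, 7.9])
```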
      <p>Logistic regression (LOG), linear and quadratic discriminant analysis (LDA, QDA), least squares support vector machines (LS-SVM), decision trees (C4.5), neural networks (NN), nearest-neighbor classifiers (k-NN10, k-NN100), a gradient boosting algorithm, and random forests are the statistical methods utilized in this paper. We are particularly interested in the power and applicability of the random forest and gradient boosting classifiers, both of which have not been substantially examined in the context of credit scoring.</p>
      <p>The area under the receiver operating characteristic curve (also known as AUC) will be used
to assess each approach. According to Baesens and colleagues (2003), this is a measure of the
discriminating capability of a classifier that does not take into account class distribution or the
cost of misclassification.</p>
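A small sketch of AUC-based evaluation with scikit-learn on synthetic imbalanced data (logistic regression is used here purely as an example classifier):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data as a stand-in for a real transaction set.
X, y = make_classification(n_samples=1000, weights=[0.95, 0.05],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# AUC scores the ranking of predicted probabilities, so it does not
# depend on the class distribution or on a misclassification cost matrix.
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
```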
      <p>The K-means clustering algorithm is a widely adopted method for clustering data. It involves
dividing objects into clusters based on a specified number of clusters. The primary goal is to
maximize the similarity among objects within the same cluster while minimizing the similarity
between objects in different clusters. This algorithm is known for its simplicity and efficient
clustering capabilities. It finds applications in various domains, including data mining, pattern
recognition, and image analysis. When applied to stock prediction, it can quickly compute and
yield accurate clustering outcomes. However, it has some drawbacks, such as sensitivity to
initialization and susceptibility to getting stuck in local extremes.</p>
      <p>Here are the steps of the algorithm:
1. Begin with dataset A = {am, m = 1, 2, ..., M} containing M objects, and select i objects randomly as the initial cluster centers.
2. Calculate the distance between the m-th object (am) and the j-th cluster center (cj) using the formula:
D(am, cj) = √((am − cj)²) (1)
3. Determine the minimum distance Dmin(am, cj) from the m-th object (am) to the cluster centers and assign each object to the nearest cluster based on the condition:</p>
      <p>Cj = {am: D(am − cj) &lt; D(am − cz), 1 ≤ z ≤ i} (2)</p>
      <p>4. Compute the mean of the objects within each cluster to update the cluster center:
cj = (1/Nj) Σ am, am ∈ Cj (3)
where Nj represents the number of objects in cluster Cj.</p>
      <p>Repeat steps (2)-(4) until the algorithm converges.</p>
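The steps above can be sketched in plain NumPy; the two-blob data are synthetic, and a fixed iteration count stands in for a proper convergence test:

```python
import numpy as np

def kmeans(A, i, n_iter=50, seed=0):
    # Steps 1-4 above: random initial centers, nearest-center
    # assignment (Eq. 2), and mean updates of each center (Eq. 3).
    rng = np.random.default_rng(seed)
    centers = A[rng.choice(len(A), size=i, replace=False)]
    for _ in range(n_iter):
        # Distance of every object to every center (Eq. 1),
        # then the minimum-distance assignment of Eq. 2.
        d = np.linalg.norm(A[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # The mean of each cluster becomes its new center (Eq. 3).
        centers = np.array([A[labels == k].mean(axis=0) for k in range(i)])
    return labels, centers

A = np.vstack([np.random.default_rng(1).normal(0, 0.5, (50, 2)),
               np.random.default_rng(2).normal(5, 0.5, (50, 2))])
labels, centers = kmeans(A, i=2)
```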
      <p>The K-means clustering algorithm typically evaluates the clustering effectiveness using the sum-of-squared-error function:</p>
      <p>E = Σz=1..i Σaj∈Cz |aj − cz|² (4)
where i represents the number of clusters, Cz denotes the z-th cluster, aj represents an object in cluster z, cz is the cluster center, and |aj − cz|² represents the squared distance from object aj to cluster center cz.</p>
      <p>Given the nature of logistic regression models, two main types of analyses can be conducted:
• Assessing the significance of the relationship between each covariate and the dependent variable.
• Categorizing individuals into the two groups of the dependent variable based on their probability of belonging to either category.</p>
      <p>The Bayesian Information Criterion (also known as the Schwarz Criterion, SC) is employed for model selection among a group of parameterized models, each having a different number of parameters.</p>
      <p>One key distinction compared to the Akaike criterion is that the Bayesian Information Criterion penalizes the inclusion of additional parameters more heavily [15].</p>
      <p>In essence, lower values of both AIC and SC indicate a model's superior ability to fit the data. The figure reports these fit criteria for the money laundering model and for the fitted full model.</p>
      <p>Given that the p-value is below 0.05, the logistic regression model as a whole is statistically significant (Fig. 3).</p>
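A sketch of computing AIC and BIC for a fitted logistic regression; scikit-learn does not report these criteria directly, so they are derived here from the model's log-likelihood on synthetic data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=400, n_features=5, random_state=0)

# Fit an effectively unpenalized logistic regression (very large C), then
# compute AIC and BIC from the log-likelihood. BIC's log(n) factor
# penalizes extra parameters more heavily than AIC's constant 2.
clf = LogisticRegression(C=1e6, max_iter=1000).fit(X, y)
p = np.clip(clf.predict_proba(X)[:, 1], 1e-12, 1 - 1e-12)
loglik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
k = X.shape[1] + 1  # coefficients plus intercept
n = len(y)
aic = 2 * k - 2 * loglik
bic = k * np.log(n) - 2 * loglik
```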
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>This article is the first in a series of articles on this study.</p>
      <p>This article summarizes a review of data mining methods that can be used to detect financial crime and also identifies ways to improve machine learning approaches.</p>
      <p>The article discusses Kazakhstan's active efforts to bolster its legal framework and
infrastructure in the fight against terrorism financing and money laundering. It also emphasizes
that the national risk assessment in Kazakhstan aims to pinpoint vulnerabilities in financial and
non-financial sectors, create measures to reduce money laundering risks and promote a common
understanding of these risks among relevant authorities.</p>
      <p>In summary, the research explores the application of machine learning techniques, specifically K-means clustering and logistic regression, for detecting money laundering. The study's goal was to evaluate the effectiveness of these methods in identifying suspicious financial transactions. K-means clustering, a method for categorizing data, showed promise in grouping transactions based on their similarities within the feature space. However, it has certain limitations, such as sensitivity to initialization and the potential to get stuck in local optima. The choice of the number of clusters (K) was found to be critical and influenced the clustering results.</p>
      <p>On the other hand, logistic regression provided a straightforward approach for modeling the
likelihood of suspicious transactions, particularly with binary outcomes. The study used the
Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) to assess the
model's fit.</p>
      <p>Both clustering and regression techniques offer valuable tools for detecting money laundering, and their suitability depends on the specific use case and available data. Future research should focus on refining these models, addressing their limitations, and integrating real-world financial data to improve anti-money laundering efforts and reduce false positives and false negatives. In conclusion, these methods are essential steps in the ongoing battle against the complex and ever-changing problem of money laundering in the financial sector.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Acknowledgements</title>
      <p>This research has been funded by the Science Committee of the Ministry of Science and Higher
Education of the Republic of Kazakhstan (Grant No. AP19576825).</p>
    </sec>
    <sec id="sec-7">
      <title>7. References</title>
      <p>[1] R. Alhamdani, M. Abdullah, and I. Sattar. (2018). Recommender system for global terrorist database based on deep learning. International Journal of Machine Learning and Computing, vol. 8, pp. 571–576.
[2] Global Terrorism Database (GTD), https://www.start.umd.edu/gtd/.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>