Black-Box Adversarial Entry in Finance through Credit Card
Fraud Detection
Akshay Agarwal, Nalini Ratha
University at Buffalo, USA
aa298@buffalo.edu (A. Agarwal); nratha@buffalo.edu (N. Ratha)

In: International Workshop on Modelling Uncertainty in the Financial World (MUFin21), in conjunction with CIKM 2021, November 1, 2021, Online.


Abstract

In the literature, it is well established that machine learning algorithms trained on image classes are highly vulnerable to adversarial examples. However, very limited attention has been given to other types of inputs, such as speech, text, and tabular data. One application where little work has been done on adversarial example generation is financial systems. Since these systems process sensitive information for tasks such as credit fraud detection and default payment prediction, a poor understanding of the robustness of financial machine learning algorithms can be dangerous. One possible reason for such limited work is the challenge of crafting adversarial examples on financial databases. Financial databases are heterogeneous, and their features might have strong dependencies on each other, whereas image databases are homogeneous, and several existing works have shown that it is easy to attack classifiers trained on them. In this paper, for the first time, we analyze the vulnerability of several traditional machine learning classifiers trained on financial tabular databases. To check the robustness of these classifiers, a 'black-box and classifier agnostic' adversarial attack is proposed through mathematical operations on the features. In brief, the proposed research presents, for the first time, a detailed analysis of which classifiers are robust against minute perturbations of tabular features. Apart from that, through the perturbation of individual features, it is shown which column features are more or less sensitive with respect to incorrect classification by the classifier.

Keywords
Adversarial Attacks, Credit Card Fraud Detection, Machine Learning Classifiers, Vulnerability, Black-Box



1. Introduction

Recent research articles claim that in the last decade, from 2010 to 2020, the number of people holding a personal loan doubled from 11 million to 21 million [3]. At the same time, the amount of loan debt increased by three times, from $55 billion to $162 billion. Processing such a large number of loan applications and identifying any possible fraud is a tedious and time-consuming task for a human being. A possible solution to overcome this load is to utilize the power of machine learning (ML) algorithms. In the past, machine learning algorithms have shown tremendous success in solving a variety of tasks ranging from object recognition [4, 5] to person identification [6, 7, 8] and complex medical problems [9, 10, 11]. While machine learning algorithms are here to ease the human workload and perform these tasks with near perfection, recent research indicates that they are highly susceptible to minute perturbations of the input data.

Imagine a scenario where a corrupt individual comes into a bank for credit card approval and the issue of a lump sum amount of money. Due to the processing of multiple applications, which might be huge in number, and the time required for handling each application, machine learning algorithms are ideal for decision making. Machine learning generally requires a significant number of feature components, which in the case of a credit application can include age, available property, and the amount paid on any previous loan. A corrupt person can minutely change one or more feature components; such changes can easily be ignored by the system because there is no drastic change in the feature space, and the fraudulent application can hence be accepted. Apart from the difficulty of observing the feature space, the heterogeneous nature of tabular databases requires expert opinion to identify small modifications. The severity of credit fraud or loan fraud can be seen from recent news articles [1, 2, 12]. As per the statistics, in 2018, $24.26 billion was lost to payment card fraud worldwide, and the United States is one of the largest contributors, with almost 38.6% of reported credit card fraud cases. The sensitivity of machine learning algorithms to minute perturbations in other domains [13] requires that the ML algorithms used on tabular databases be secured to ensure correct decisions. Figure 1 shows the impact of credit card fraud on the worldwide community. Therefore, it is extremely important to extensively examine the vulnerability of machine learning algorithms before trusting their decisions in the financial domain.
Figure 1: The impact of credit card fraud worldwide and the demand for secure deployment of automated machine learning (ML) systems. The statistics are taken from multiple Internet sources [1, 2].



In this research, for the first time, we have extensively evaluated several machine learning models and their vulnerability against minute perturbations in the feature space (or input space). The credit card default prediction databases contain multiple features, such as age, gender, payment status, and education. An individual feature can affect the classification decision for reasons such as bias or mislabelling; for example, it might be believed that a highly educated individual would not commit fraud. Therefore, in this research, we have identified the sensitivity of various machine learning classifiers to individual features, both in their raw form and under minute perturbation. While the features play an important role, the optimization functions of the different classifiers also play an important role in learning the decision boundaries. Hence, a detailed experimental evaluation of multiple machine learning classifiers has been performed to show which classifiers are more robust or more sensitive to imperceptible perturbations. In brief, the contributions of this research are:

     • the first-ever black-box, inference-time, imperceptible adversarial attack on credit card default prediction is performed;
     • extensive ablation studies are conducted to find the importance of individual feature values for decision making;
     • a sensitivity analysis of multiple machine learning classifiers is presented to help build a robust finance system utilizing robust classifier(s);
     • a comprehensive survey of the existing adversarial attacks developed in other data domains showcases the need to develop adversaries that identify vulnerabilities in the finance space as well.

In the next section, a review of existing adversarial-example research is presented, followed by a description of the credit card default prediction databases. Next, exploratory data analysis is performed to examine the characteristics of the databases. Later, the machine learning classifiers chosen for the vulnerability analysis are described. Finally, the experimental results along with their analysis are presented to showcase the impact of the proposed 'black-box and classifier agnostic' adversarial perturbation.

2. Existing Adversarial Examples Research

Since the discovery of adversarial examples [14], several adversarial attack algorithms have been presented in the literature. The existing adversarial attacks can be divided based on the following two criteria: (i) intention and (ii) type of learning. The type of learning describes how much knowledge of the machine learning classifier is needed to fool it; attacks can accordingly be categorized into white-box and black-box. In the white-box setting, an attacker assumes complete knowledge of the system, such as its parameters and classification probabilities. On the other hand, black-box attacks do not utilize any information about the ML network in creating adversarial examples. In the real world, it is extremely difficult to acquire the knowledge of a machine learning classifier due to security measures and the existence of a wide variety of machine learning algorithms. For example, Goel et al. [15, 16, 17] have utilized the concepts of blockchain and cryptography to either change the structure of the networks or encrypt them, making it difficult to identify the exact parameters of the networks. Similarly, there exists a humongous number of machine learning algorithms, such as supervised, unsupervised, and ensemble learning; hence, assuming knowledge of the system an attacker wants to fool is difficult [18, 19]. Due to the above observations, a black-box attack is practical in the real world and at the same time difficult to achieve. Regarding intention, attacks are divided into targeted attacks and untargeted attacks.
Targeted attacks aim for the input data to be misclassified by the network into one of the desired classes. For example, a credit card defaulter would like to be classified as genuine by the machine learning classifier. In contrast, untargeted attacks aim for the input data to be misclassified into 'any' class except the true class.

Table 1
Characteristics of the Credit Card databases.

          Default Credit                           Australian Credit
Feature   Name              Type                   Feature   Type
   1      ID                Continuous             A1        Binary
   2      Limit-Bal         Continuous             A2        Continuous
   3      Sex               Binary                 A3        Continuous
   4      Education         Categorical            A4        Categorical
   5      Marriage          Categorical            A5        Categorical
   6      Age               Continuous             A6        Categorical
   7      Pay_0             Continuous             A7        Continuous
   8      Pay_2             Continuous             A8        Binary
   9      Pay_3             Continuous             A9        Binary
  10      Pay_4             Continuous             A10       Continuous
  11      Pay_5             Continuous             A11       Binary
  12      Pay_6             Continuous             A12       Categorical
  13      Bill_Amt1         Continuous             A13       Continuous
  14      Bill_Amt2         Continuous             A14       Continuous
  15      Bill_Amt3         Continuous             A15       Binary
  16      Bill_Amt4         Continuous
  17      Bill_Amt5         Continuous
  18      Bill_Amt6         Continuous
  19      Pay_Amt1          Continuous
  20      Pay_Amt2          Continuous
  21      Pay_Amt3          Continuous
  22      Pay_Amt4          Continuous
  23      Pay_Amt5          Continuous
  24      Pay_Amt6          Continuous
  25      Default Payment   Binary

In the literature, several adversarial attacks have been proposed. The majority of the attacks target visual object classification, and limited work has been done so far for other kinds of input information, such as speech and tabular data, and for machine learning paradigms such as reinforcement learning. The gradient is one of the most essential pieces of information in deep network learning, and several attacks utilize it. For example, the PGD attack [20] is one of the strongest attacks against visual image classifiers; it is performed over multiple iterations by projecting the gradient in the direction that leads to the strongest adversary. Other image-based attacks, such as DeepFool [21], add the perturbation to the image iteratively so that the image crosses the decision boundary of its corresponding class learned by the network. The above attacks learn the manipulation for each image separately, while it is also possible to learn a single universal noise vector that can be applied to multiple images to fool the network [22, 23].
The above-described attacks are performed in the white-box setting, utilizing complete knowledge of the classifier. Another disadvantage of white-box attacks is their limited transferability across models: since the attacks are generated using the knowledge of one classifier, which can be significantly different from unseen models, they achieve a poor success rate against unseen models [24, 25].

The other class of attacks is black-box attacks, which are more practical in the real world and can fool multiple classifiers. Black-box attacks can be further divided into query-based and generic manipulation-based attacks. In a query-based attack, some knowledge of the system is assumed, such as the decision of the classifier. Utilizing the classifier's decision on a given input, the noise is modified to achieve the desired intent of misclassification, i.e., targeted or untargeted. While query-based attacks are more successful against unseen models whose knowledge is unavailable, they are still bounded by the number of queries that can be sent for noise generation, and this limitation restricts practical deployment in many places. The other category, generic manipulation, is among the most successful because not utilizing any classifier knowledge makes such attacks classifier agnostic and able to fool multiple classifiers. Goswami et al. [26, 27] have proposed several image manipulations for fooling face recognition networks. The manipulations are partly inspired by domain knowledge of face recognition and therefore modify the landmark features of a face image, which were able to fool the recognition networks effectively. Agarwal et al. [28] do not utilize any external knowledge, including a perturbation vector, but instead extract the noise inherently present in an image. The authors use the observation that, due to several factors such as camera preprocessing steps and environmental factors, noise is inherently present in an image; they extract those noise patterns and use them as an adversarial pattern. The above-mentioned attacks operate in the image space. Limited attacks have also been proposed for other categories of networks or inputs, such as generative models [29], reinforcement learning [30], and cyberspace [31].

While adversarial attacks on machine learning classifiers, especially deep learning classifiers, are prevalent, the defense against them is also receiving significant attention. Several defense algorithms based on the following two motives have been proposed: (i) segregation of adversarial examples from clean examples [32] and (ii) mitigation of the impact of adversarial noise [27]. The defense algorithms have shown tremendous success in countering adversarial attacks in the image domain and show generalizability even in complex situations such as an unseen attack, unseen database, or unseen model [33, 34]. The existing research on adversarial examples can be further explored through the survey papers [35, 36, 37].

It is interesting to observe from the above discussion that adversarial machine learning is one of the fastest growing communities; however, only a few works exist on robustness in the financial domain.
Figure 2: Correlation matrix among the individual features of the Default Credit database.



The prime reason for this scarcity lies in the type of input. Financial data, especially tabular databases, are heterogeneous, in contrast to homogeneous image databases. Tabular features are not interchangeable, unlike the pixels of an image. Apart from that, images are rich in visual information; hence, humans can interpret them at a glance and easily identify whether any manipulation has been performed. Tabular data, in contrast, are less interpretable, and it is complex to identify a minor modification of an individual value. In the literature, a few research works have been proposed for crafting adversarial noise on tabular databases. Ballet et al. [38] and Levy et al. [39] have proposed imperceptible adversarial attacks that minimize the norm of the perturbation. The critical drawback of these attacks is that the attacker assumes complete knowledge of the classifier for learning the perturbation, which makes them less practical for real-world deployment. Another drawback is that norm-based perturbation of tabular features can yield unrealistic transformations [40]. Apart from that, the above attacks on tabular data are evaluated on a single classifier, i.e., a shallow neural network or a decision forest.

To overcome the limitations of the existing adversarial studies on tabular databases, we have proposed an adversarial manipulation method based on mathematical operators. The proposed attacks work in the black-box setting and do not utilize any information about a classifier. Therefore, the proposed attack is classifier agnostic and can be applied against 'any' classifier. In contrast to the existing research, which reports results on a limited set of classifiers, the proposed research studies the adversarial strength against multiple classifiers and shows that the proposed attack can fool each of them. Apart from this, the proposed manipulation also aims to reveal the role of individual tabular features in the classification.

3. Finance Databases

In this research, we have used two popular credit card default prediction databases, namely the Default Credit database [41, 42] and the Australian Credit database [12]. The Default Credit database is one of the largest databases for binary prediction of the default payment category. The database contains 30,000 data points belonging to two categories of default payment, i.e., yes or no. In total, the database consists of 24 features belonging to multiple types, such as binary (0 or 1), categorical (1 to n), and continuous. The ID feature merely identifies an individual in the database and hence plays no role in the classification of a data point.
Figure 3: Score of an individual feature computed using UFR (left) and MRMR (right) algorithm on the Default Credit Card
database.




Figure 4: Score of an individual feature computed using UFR (left) and MRMR (right) algorithm on the Australian Credit
Card database.



Therefore, the ID feature is discarded from the Default Credit database. It is clear from the description that each feature has a different scale; hence, it is important to bring every feature into the same range, such as between 0 and 1. We have performed min-max normalization to bring the scale of each feature into the same range. The Australian Credit database contains 14 features and aims to classify the data into binary categories of default payment. Similar to the Default Credit database, the Australian database consists of features of different scales, which are hence normalized using min-max scaling. The characteristics of both databases are given in Table 1. Contrary to some of the available research [38], which drops a few features for adversarial learning on credit databases, we have utilized every feature in the databases and analyzed its impact on adversary generation.
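
As a concrete illustration of this preprocessing step, the following is a minimal sketch of per-column min-max scaling. The file and column names in the usage comment are hypothetical; the paper does not prescribe a particular implementation.

```python
import pandas as pd

def min_max_normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Scale every feature column to the [0, 1] range, as described above."""
    normalized = df.copy()
    for col in normalized.columns:
        col_min, col_max = normalized[col].min(), normalized[col].max()
        if col_max > col_min:
            normalized[col] = (normalized[col] - col_min) / (col_max - col_min)
        else:
            normalized[col] = 0.0  # constant column: avoid division by zero
    return normalized

# Usage (hypothetical file and column names):
# df = pd.read_csv("default_credit.csv")
# X = min_max_normalize(df.drop(columns=["ID", "Default Payment"]))
```
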
4. Exploratory Data Analysis

Before performing the adversarial attack on the input features of the credit card databases, we have performed exploratory studies of the features, such as the correlation among the features and the relevance of the features.

4.1. Correlation Analysis

Figure 2 shows the correlation heatmap of the features of the Default Credit Card database. It is clear from the heatmap that no feature exhibits a strong correlation with the class variable (default payment). In contrast, features that belong to the same category, such as 'Pay_' and 'BILL_AMT', show strong correlations among themselves. For example, 'Pay_0' has a positive correlation value of 0.67 with the variable 'Pay_2'. The 'Pay_0' feature represents the repayment status in September 2005, and its value ranges between -1 and 9; the other pay features represent the repayment status between April and August 2005. The correlation among them shows that the repayment status of the current month, and in turn the credit default payment, is somewhat dependent on the status of the previous months. Compared to the repayment-status features, however, the 'BILL_AMT' features have very strong correlation values among themselves: a correlation value of at least 0.8 is observed between the different features. 'BILL_AMT' represents the amount of the bill statement between April 2005 and September 2005.
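
For reference, a heatmap like Figure 2 can be produced directly from the normalized table. This is a minimal matplotlib sketch, not the authors' plotting code.

```python
import pandas as pd
import matplotlib.pyplot as plt

def correlation_heatmap(df: pd.DataFrame) -> None:
    """Plot the pairwise Pearson correlation matrix of the tabular features."""
    corr = df.corr()  # Pearson correlation between every pair of columns
    fig, ax = plt.subplots(figsize=(10, 8))
    im = ax.imshow(corr.values, cmap="coolwarm", vmin=-1.0, vmax=1.0)
    ax.set_xticks(range(len(corr.columns)))
    ax.set_xticklabels(corr.columns, rotation=90)
    ax.set_yticks(range(len(corr.columns)))
    ax.set_yticklabels(corr.columns)
    fig.colorbar(im, ax=ax, label="Pearson correlation")
    plt.tight_layout()
    plt.show()
```
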
Figure 5: Credit Card default prediction accuracy on clean test set of Australian Credit Card (left) and Default Credit Card
(right) database. Log. Reg. represents the logistic regression and B. Trees represents the binary trees. S-NNet and D-NNet
represent the shallow and deep neural networks, respectively.




Figure 6: Credit card default prediction accuracy on the clean test set, on the perturbed set reflecting the maximum drop, and the difference in accuracy, using the Australian Credit Card (left) and Default Credit Card (right) databases. Log. Reg. represents logistic regression and B. Trees represents binary trees. S-NNet and D-NNet represent the shallow and deep neural networks, respectively. Clean shows the accuracy of the classifier on the original clean test set, whereas Perturbed shows the accuracy when the single feature whose perturbation causes the maximum drop in test accuracy is perturbed. Max. Drop is the difference between the clean and perturbed set accuracies.



4.2. Feature Importance

Another exploratory analysis has been performed by examining the importance of individual features with respect to the class label. For that, two feature selection (or feature weight assignment) algorithms, namely Univariate Feature Ranking (UFR) and Minimum Redundancy Maximum Relevance (MRMR) [43], are utilized. The advantage of both algorithms is that they accept categorical and continuous features for the classification problem. The UFR algorithm measures the independence of each feature with respect to the class variable using the chi-square test between them. The smaller the p-value for a particular feature, the higher the dependence between the feature and the class label, and hence the greater the importance of the feature for classification. The MRMR algorithm iteratively examines the features to find those which are mutually and maximally dissimilar to each other yet effective for decision making. The algorithm achieves its goal of selecting the important features by reducing the redundancy among the features and weighting the relevant features. For that, the MRMR algorithm computes the mutual information among the features and between each feature and the class label. The MRMR algorithm selects the best feature set $S$ for classification by maximizing the relevance score $V_S$ between the features $x$ and the class label $y$, while at the same time minimizing the redundancy score $W_S$ between pairs of feature values $x$ and $z$. $V_S$ and $W_S$ are defined using the following equations:

$$V_S = \frac{1}{|S|} \sum_{x \in S} I(x, y)$$

$$W_S = \frac{1}{|S|^2} \sum_{x, z \in S} I(x, z)$$

where $|S|$ represents the number of features in the optimal subset $S$.
Table 2
Adversarial vulnerability of multiple machine learning classifiers against the proposed perturbation defined in Equation 1 on the Australian Credit Card database. Colored boxes mark the sensitive features and the corresponding drop in classifier accuracy.
            Perturb    SVM                Logistic      Naive     Binary    Neural Network
                                                                                                KNN      DAC
            Feature    Linear    RBF      Regression    Bayes     Trees     Shallow   Deep
               1        86.13    85.55      84.39       79.77      84.97     83.81    87.28     82.08    86.13
               2        86.13    82.66      84.39       77.46      83.81     77.46    81.50     78.61    86.13
               3        86.13    84.39      83.81       71.67      79.77     80.35    84.97     80.35    84.97
               4        86.13    85.55      64.16       74.57      41.04     43.93    63.00     82.10    80.92
               5        86.13    86.13      70.52       83.81      79.77     72.25    71.10     78.61    82.08
               6        86.13    86.13      84.97       83.81      82.10     78.61    86.13     72.83    85.55
               7        86.13    84.97      84.39       41.62      84.39     75.14    83.24     50.87    85.55
               8        13.88    32.27      40.46       67.63      38.73     37.57    39.30     44.51    24.28
               9        86.13    84.39      85.55       79.19      76.88     83.24    83.24     80.35    86.70
              10        86.13    85.55      41.62       41.62      83.23     63.58    87.28     41.62    50.29
              11        86.13    87.28      86.70       78.61      83.23     85.55    82.66     82.66    85.29
              12        86.13    85.55      84.39       63.00      83.23     70.52    83.81     58.96    86.13
              13        86.13    83.24      80.92       79.19      79.19     78.61    82.66     77.46    84.97
              14        86.13    84.97      41.62       41.62      84.39     50.29    86.13     41.62    41.62


Finally, the mutual information quotient (MIQ) is calculated to select the subset of features using the following equation:

$$MIQ_x = \frac{V_x}{W_x}$$

where $V_x$ and $W_x$ are the relevance and redundancy values of a feature $x$, respectively.
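
As an illustration, the following is a minimal sketch of how such per-feature scores can be computed. It relies on scikit-learn utilities (an assumption; the paper does not name its implementation) and uses a simplified, non-iterative variant of the MRMR score rather than the full greedy selection.

```python
import numpy as np
from sklearn.feature_selection import chi2, mutual_info_classif, mutual_info_regression

def ufr_scores(X, y):
    """Univariate Feature Ranking: chi-square p-value per feature.
    A smaller p-value indicates stronger dependence between the feature
    and the class label. X must be non-negative, which holds after
    min-max scaling."""
    _, p_values = chi2(X, y)
    return p_values

def mrmr_miq_scores(X, y):
    """Per-feature mutual information quotient MIQ_x = V_x / W_x:
    relevance I(x, y) divided by the mean redundancy I(x, z) with all
    features z (a simplified, non-iterative MRMR score)."""
    relevance = mutual_info_classif(X, y)  # V_x = I(x, y)
    miq = np.zeros(X.shape[1])
    for i in range(X.shape[1]):
        # W_x: mean mutual information between feature i and every feature
        redundancy = mutual_info_regression(X, X[:, i]).mean()
        miq[i] = relevance[i] / max(redundancy, 1e-12)  # guard divide-by-zero
    return miq
```
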
The earlier adversarial studies discarded a few features and hence do not provide a complete picture of the credit card domain. We want to highlight that the proposed research is the first work presenting a detailed analysis helpful both for crafting the attack and for mitigating it by protecting the important features. Figure 3 and Figure 4 show the score plots of the features from the Default Credit and Australian Credit Card databases, respectively. On the Default Credit database, feature 6 (i.e., Age, as shown in Table 1) shows the highest importance irrespective of the feature selection algorithm. Feature 8 is found most relevant in the Australian database using both the UFR and MRMR feature selection algorithms.

5. Vulnerable Machine Learning Algorithms

In this research, we have used several machine learning algorithms to carefully investigate the impact of adversarial manipulation on the feature space of the credit card databases. To present a first-ever detailed study, nine different classifiers in total are used to extensively investigate adversarial fraud in the finance domain. Below, we describe each of the algorithms used for binary classification on the clean and manipulated features (a training sketch for the full suite follows the list):

   1. Support vector machine (SVM) [44]: It is one of the most popular machine learning classifiers because of its strong mathematical foundation. SVM learns the decision hyperplanes by maximizing the distance between the nearest points of each class. The SVM classifier has seen significant success in binary classification tasks such as presentation attack detection [45, 46] and adversarial example detection [34, 33]. Based on this success, SVM is an ideal choice for credit card default prediction as well. SVM solves the following optimization problem:

      $$\text{minimize } ||w|| \text{ such that } y_i(w^T x_i - b) \geq 1 \text{ for } i \in 1, \ldots, n$$

      where $w$ is the normal vector of the separating hyperplane and $b$ is the bias term; $x_i$ and $y_i$ are the $i$-th data point and its label, respectively, and $n$ is the total number of training data points. Upon solving the above problem, the classifier obtained is $x \rightarrow \mathrm{sign}(w^T x + b)$. In this research, we have used two variants of SVM, corresponding to the two kernels, namely 'linear' and 'radial basis function' (RBF), used for learning the separating hyperplane.
   2. Logistic Regression [47]: It is another simple and popular classifier for binary classification problems. It uses the logistic function to model the probabilities of the binary classes. The class of the input is predicted by taking the maximum of the class probabilities output by the model. The probability of each class is computed using the following logistic formula:

      $$\pi(X) = \frac{\exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}{1 + \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}$$

      where the $\beta$s are the parameters of the classifier and the $x$s are the feature values.
   3. Naive Bayes Classifier: It is based on the popular Bayes theorem, which can be written as follows:

      $$P(B|A) = \frac{P(A|B) P(B)}{P(A)}$$

      where $A$ and $B$ are two events and $P(A) \neq 0$. The classifier assumes that the features are independent of each other and that the observations follow a multivariate distribution.
   4. Binary Trees: This classifier works by segregating the classes based on the features at each level of the tree. At each level, the data are partitioned into classes, and the features for the level are selected based on an impurity function such as the Gini impurity. The classifier works iteratively and stops when either all the data points are exhausted or the leaf nodes are reached. As the name suggests, at most two child nodes are allowed at each level of the tree.
   5. Neural Network [48]: It is another highly successful machine learning architecture that maps the input features to the output classes through multiple intermediate layers. The intermediate layers are popularly referred to as hidden layers, and their number can vary from 1 upwards. In this research, we use a shallow NN (S-NNet) with one hidden layer and a deep NN (D-NNet) with 2 hidden layers, each having half as many neurons as the previous layer. Both NNs are trained using a stochastic gradient descent algorithm.
   6. K Nearest Neighbor (KNN) [49]: It is the simplest, training-free classifier, which works by measuring the distance between a test point and the training data points. The test point is classified into the class to which it has the lowest distance. In this research, we have used the K = 5 nearest data points to find the closest class using the 'Euclidean' distance.
   7. Discriminant Analysis Classifier (DAC): It is based on the assumption that the data points of different classes follow Gaussian distributions with different parameters. For classification, the Gaussian parameters of each class are estimated from the training set. To identify the class, the posterior probability of a point belonging to each class is calculated as follows:

      $$P(x|k) = \frac{1}{((2\pi)^d |\Sigma_k|)^{1/2}} \exp\left(-\frac{1}{2}(x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k)\right)$$

      where $P(x|k)$ is the probability of the point $x$ belonging to class $k$, and $\mu_k$ and $\Sigma_k$ are the Gaussian parameters of class $k$.
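
The paper does not name the implementation used for these nine classifiers; as an illustration, the following is a minimal scikit-learn sketch of the suite. The hidden-layer sizes and other hyperparameters are assumptions, and the per-class-covariance Gaussian model of the DAC is realized here with quadratic discriminant analysis.

```python
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

# The nine classifiers evaluated in this study; hyperparameters are illustrative.
classifiers = {
    "SVM (Linear)": SVC(kernel="linear"),
    "SVM (RBF)": SVC(kernel="rbf"),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Binary Trees": DecisionTreeClassifier(criterion="gini"),
    "S-NNet": MLPClassifier(hidden_layer_sizes=(16,), solver="sgd", max_iter=1000),
    "D-NNet": MLPClassifier(hidden_layer_sizes=(16, 8), solver="sgd", max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=5, metric="euclidean"),
    "DAC": QuadraticDiscriminantAnalysis(),
}

def fit_all(X_train, y_train):
    """Train every classifier on the min-max normalized training split."""
    for clf in classifiers.values():
        clf.fit(X_train, y_train)
    return classifiers
```
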
6. Proposed Adversarial Attack and Experimental Results

In this research, we have studied the impact of manipulations of individual features on credit default prediction. We have applied several mathematical operations to obtain the manipulated features. Broadly, the proposed classifier-agnostic and black-box attack can be defined using the following equation:

$$x_{pert} = \begin{cases} \mathrm{XOR}(x_{clean}, 1) & \text{if } x_{clean} \text{ is binary} \\ x_{clean} + \eta & \text{if } x_{clean} \text{ is continuous} \end{cases} \tag{1}$$

where $x_{pert}$ is the perturbed variant of the clean feature $x_{clean}$. $\eta$ is defined using several functions, such as $\eta = C$, where $C$ is a constant value between 0 and 1. Other mathematical functions explored for $\eta$ are $\exp(\max(x_{clean}) - \min(x_{clean}))$ and $\log(\max(x_{clean}) - \min(x_{clean}))$. Apart from that, another simple mathematical attack on the feature values, termed 'feature dropout' here, can be defined as:

$$x_{pert} = x_{clean} \cdot 0$$

A minimal sketch of this perturbation is given below.
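
The following NumPy sketch implements Equation 1 and the feature-dropout variant; it operates on one feature column at a time, and the default value of eta is an assumption.

```python
import numpy as np

def perturb_feature(x_clean: np.ndarray, is_binary: bool, eta: float = 0.1) -> np.ndarray:
    """Black-box, classifier-agnostic perturbation of one feature column
    (Equation 1): XOR-flip binary features, shift continuous ones by eta."""
    if is_binary:
        # XOR(x, 1) flips 0 -> 1 and 1 -> 0.
        return np.logical_xor(x_clean.astype(bool), True).astype(x_clean.dtype)
    return x_clean + eta

def eta_variants(x_clean: np.ndarray) -> dict:
    """Alternative choices of eta explored in the paper, based on the
    range of the (min-max normalized) feature column."""
    feature_range = x_clean.max() - x_clean.min()
    return {"constant": 0.1,                        # C in (0, 1); value assumed
            "exp": np.exp(feature_range),
            "log": np.log(feature_range + 1e-12)}   # guard against log(0)

def feature_dropout(x_clean: np.ndarray) -> np.ndarray:
    """'Feature dropout' variant: zero out the feature column."""
    return x_clean * 0
```
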
In this section, we describe the adversarial manipulation results and analysis using the constant-value modification as the attack. The databases are divided into training and testing sets, where the training set contains a randomly selected 75% of the total data points. The remaining 25% of the data points are used for the evaluation of each classifier trained on the training set.

The analysis of the results can be divided into the following parts: (1) accuracy on the clean test set, (2) robustness of a classifier, and (3) sensitive features with respect to the adversarial goal. The credit card default payment results of each classifier on the clean test set of both databases are reported in Figure 5. On the Australian database, compared to non-linear classifiers such as the RBF SVM and the neural networks, a linear classifier such as the linear SVM performs better. On the Default credit card database, in contrast, the RBF SVM performs best compared to the other linear and non-linear classifiers. It is interesting to note that going from a shallow to a deep neural network yields no significant improvement in accuracy on either database. In another observation, the Naive Bayes classifier performs the worst on the Australian credit card database and second-worst on the Default credit card database.
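
Under this 75/25 protocol, the per-feature sensitivity reported in Tables 2 and 3 can be reproduced with a loop of the following shape. This is a sketch: it reuses perturb_feature from above and assumes scikit-learn-style classifiers operating on NumPy arrays.

```python
from sklearn.metrics import accuracy_score

def per_feature_vulnerability(classifiers, X_test, y_test, binary_cols, eta=0.1):
    """Accuracy of every trained classifier when exactly one feature
    column of the test set is perturbed (as in Tables 2 and 3)."""
    results = {}
    for j in range(X_test.shape[1]):
        X_adv = X_test.copy()
        X_adv[:, j] = perturb_feature(X_adv[:, j], is_binary=j in binary_cols, eta=eta)
        results[j + 1] = {  # 1-indexed feature ids, matching the tables
            name: 100.0 * accuracy_score(y_test, clf.predict(X_adv))
            for name, clf in classifiers.items()
        }
    return results
```
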
Table 3
Adversarial vulnerability of multiple ML classifiers against the proposed perturbation defined in Equation 1 on the Default Credit Card database. Colored boxes mark the sensitive features and the corresponding drop in classifier accuracy.
            Perturb    SVM                Logistic      Naive      Binary   Neural Network
                                                                                                 KNN      DAC
            Feature    Linear    RBF      Regression    Bayes      Trees    Shallow   Deep
               1        81.19    78.45      78.55       80.10       69.25    78.60    80.30      78.69    78.95
               2        81.19    81.16      81.81       64.17       73.13    81.58    82.47      79.29    81.92
               3        81.19    78.76      78.61       81.10       69.27    79.48    81.68      78.43    79.04
               4        81.19    79.88      79.13       75.76       71.61    81.44    81.89      80.33    79.47
               5        81.19    80.21      82.32       34.41       66.40    77.79    81.35      77.57    81.81
               6        22.11    22.13      22.50       25.40       43.35    77.15    22.17      48.01    22.17
               7        41.76    77.91      81.36       25.96       61.28    81.63    32.99      57.56    79.05
               8        81.19    80.10      82.40       26.10       72.10    54.41    31.95      74.22    82.07
               9        81.19    80.00      81.89       25.77       71.60    29.39    75.21      74.36    81.99
              10        81.19    82.69      82.17       25.37       64.24    79.79    80.81      70.03    82.28
              11        81.19    81.61      81.35       25.75       69.63    74.36    80.91      72.25    81.48
              12        81.19    81.65      77.89       38.68       60.20    78.67    81.31      77.91    77.88
              13        81.19    82.15      30.83       33.40       68.25    42.83    40.40      61.17    71.67
              14        81.19    81.61      25.37       81.10       67.71    42.97    27.89      77.88    53.25
              15        81.19    82.12      79.28       42.78       67.93    80.61    78.15      78.17    79.19
              16        81.19    82.10      80.84       30.78       71.11    78.17    79.09      78.91    80.05
              17        81.19    81.93      81.41       63.55       66.99    78.85    30.00      79.80    81.41
              18        81.19    78.35      77.89       77.89       71.35    77.89    77.91      77.89    77.89
              19        81.19    79.10      77.89       77.89       71.30    76.23    80.36      77.89    77.90
              20        81.19    80.55      77.91       77.89       73.13    71.79    56.09      77.89    80.31
              21        81.19    78.61      78.11       77.89       69.38    82.33    78.11      77.89    78.83
              22        81.19    78.65      77.89       77.89       69.23    77.95    78.35      77.91    78.59
              23        81.19    78.48      77.96       77.89       68.47    77.91    79.85      77.89    79.33


Analysis concerning classifiers: In terms of the sensitivity of the classifiers, it is found that the SVM classifier is the least robust in terms of the magnitude of the accuracy drop on both the Australian and Default credit card databases. On the Australian database, the accuracy of the linear SVM drops from 86.13% to 13.88%. The relative drop in accuracy is 72.25%, which is the highest among all the classifiers used for credit default prediction. On the other hand, the Naive Bayes classifier, which performs the worst on the Australian database, shows the least drop in accuracy when the features are perturbed using the proposed black-box and model-agnostic attack. In other words, the Naive Bayes classifier is found to be the most effective in handling the perturbation. The accuracy of each classifier on the clean data, the lowest accuracy obtained under perturbation, and the difference reflecting the maximum drop in accuracy are reported in Figure 6 (left) for the Australian database. On the Default credit card database, the non-linear RBF SVM classifier is found to be the most vulnerable, with a relative drop in performance of 60.3%. The KNN classifier is found to be the most robust in terms of the relative drop in performance on the perturbed features compared to its accuracy on the clean features. It is interesting to note that, even on the Australian database, KNN shows the second-best robustness on the perturbed features. Figure 6 (right) shows the maximum sensitivity of each classifier on the Default credit card database.

Analysis concerning features: The default payment database contains 23 features after removing the ID feature, which is simply a sequence reflecting the observation number, and the class variable, i.e., default payment. The Australian database contains 14 features for classification. We want to mention that, in this research, we have shown the adversarial strength by perturbing a single feature only. On the Australian database, feature 8 is found to be the most sensitive feature, and perturbing it significantly affects the performance of every classifier. Apart from affecting the performance of every classifier, feature 8 shows the highest reduction in the accuracy of each classifier. Feature 8 contains binary values, and we have modified them through the 'XOR' operator, as shown in the proposed attack, Equation 1. The second most sensitive feature is feature 14, which contains continuous values; interestingly, however, both the linear and non-linear SVM, binary trees, and the deep neural network are found to be robust against a slight modification of it.

On the Default credit card database, feature 6 is found to be the weakest point of every classifier except the shallow neural network (S-NNet). The perturbation of feature 6 significantly drops the accuracy of the affected classifiers. The RBF SVM classifier is found to be sensitive against feature 6 only.
We want to highlight that both feature selection algorithms give the highest score to feature 6 on the Default credit card database, as shown in Figure 3. Similarly, on the Australian database, each classifier has been found highly sensitive to the highest-relevance features reported by the feature selection algorithms, as shown in Figure 4. The detailed analysis of the sensitivity of the individual features is given in Tables 2 and 3.

Other Manipulations: We want to highlight that the other mathematical operations, such as exp and log mentioned in Section 6, yield similar adversarial phenomena on each classifier.

6.1. Unwanted Phenomena for the Attacker

It is interesting to observe that the adversarial perturbation does not always reduce the performance of a classifier. Another interesting point is that perturbing the features which are least important for classification can inversely affect the goal of an attacker. The importance of the features can be calculated using the feature selection algorithms. For example, on the Australian database, feature 11 was found least relevant by both the UFR and MRMR feature selection algorithms. Interestingly, perturbing this feature significantly improves the performance of multiple classifiers: the performance of the RBF SVM, logistic regression, and shallow neural network (S-NNet) improves by 2.89%, 1.73%, and 2.31%, respectively. Similarly, perturbing the features found less relevant by the feature selection algorithms on the default database also yields performance improvements. For example, features 1 and 14 are among the least important features in the default payment database; however, perturbing them drastically increases the performance of the Naive Bayes classifier, which shows a jump of at least 5.55% in classification performance when these features are perturbed. From the above analysis, we suggest that careful attention is required while perturbing a feature: a random perturbation of an arbitrary feature set might not be fruitful for an attacker. Further analysis along these lines may also reveal future directions for improving the performance of a classifier by securing only the relevant features.

7. Conclusion

The adversarial vulnerability of visual classifiers has been extensively explored and paves the way for improving their robustness for secure real-world deployment. However, limited work has been done on financial databases, especially tabular databases. The probable reason is the heterogeneous nature of these databases and the low degree of freedom for perturbation. The degree of perturbation can be defined in terms of the number of values available for manipulation. For example, an image contains a significantly large number of values (pixels) available for manipulation, and they are easily interchangeable, whereas tabular finance databases contain a low number of features, which cannot easily be interchanged with each other. A few works exist to identify the vulnerability of ML algorithms on tabular databases; however, the limitations of the existing attacks are that they require white-box access to the classifiers and result in unwanted transformations of the features. In this research, we have proposed a first-ever black-box attack on tabular credit card default prediction databases. We have evaluated a broad set of machine learning classifiers, in contrast to the few-classifier vulnerability assessments in existing works. The proposed attack proves its classifier-agnostic strength by fooling each classifier. Apart from the evaluation of multiple classifiers, we have also studied the sensitivity with respect to the individual features of the databases. Interestingly, it is observed that the perturbation of certain features might hurt the aim of an attack, and therefore intelligent consideration is required. We hope the proposed research opens multiple research threads, both towards finding the vulnerabilities of tabular classifiers and towards improving their robustness.

References

 [1] Credit card fraud statistics in the United States, https://shiftprocessing.com/credit-card-fraud-statistics/, https://mk0shiftprocessor1gw.kinstacdn.com/wp-content/uploads/2019/10/CC-Fraud-reports-in-US-2-e1571769539315.jpg, 2020.
 [2] Facts + statistics: Identity theft and cybercrime, https://www.iii.org/fact-statistic/facts-statistics-identity-theft-and-cybercrime, 2020.
 [3] Personal loan statistics for 2020, https://www.fool.com/the-ascent/research/personal-loan-statistics/, 2020.
 [4] S. Girish, S. R. Maiya, K. Gupta, H. Chen, L. S. Davis, A. Shrivastava, The lottery ticket hypothesis for object recognition, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 762–771.
 [5] M. Mandal, L. K. Kumar, M. S. Saran, et al., Motionrec: A unified deep framework for moving object recognition, in: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2020, pp. 2734–2743.
 [6] M. Singh, S. Nagpal, M. Vatsa, R. Singh, Enhancing fine-grained classification for low resolution images, arXiv preprint arXiv:2105.00241 (2021).
 [7] M. Singh, S. Nagpal, R. Singh, M. Vatsa, Derivenet for (very) low resolution image classification, IEEE Transactions on Pattern Analysis and Machine Intelligence (2021).
 [8] S. Ghosh, R. Singh, M. Vatsa, Subclass heterogeneity aware loss for cross-spectral cross-resolution face recognition, IEEE Transactions on Biometrics, Behavior, and Identity Science 2 (2020) 245–256.
 [9] I. Nigam, R. Keshari, M. Vatsa, R. Singh, K. Bowyer, Phacoemulsification cataract surgery affects the discriminative capacity of iris pattern recognition, Scientific Reports 9 (2019) 1–9.
[10] Alphafold: a solution to a 50-year-old grand challenge in biology, https://deepmind.com/blog/article/alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology, 2020.
[11] F. O. Geraldes, Pushing the boundaries of computer-aided diagnosis of melanoma, The Lancet Oncology 22 (2021) 433.
[12] Statlog (Australian credit approval) data set, https://archive.ics.uci.edu/ml/datasets/Statlog+%28Australian+Credit+Approval%29, 2020.
[13] F. Pierazzi, F. Pendlebury, J. Cortellazzi, L. Cavallaro, Intriguing properties of adversarial ml attacks in the problem space, in: 2020 IEEE Symposium on Security and Privacy (SP), IEEE, 2020, pp. 1332–1349.
[14] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, R. Fergus, Intriguing properties of neural networks, arXiv preprint arXiv:1312.6199 (2013).
[15] A. Goel, A. Agarwal, M. Vatsa, R. Singh, N. Ratha, Deepring: Protecting deep neural network with blockchain, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019.
[16] A. Goel, A. Agarwal, M. Vatsa, R. Singh, N. Ratha, Securing cnn model and biometric template using blockchain, in: 2019 IEEE 10th International Conference on Biometrics Theory, Applications and Systems (BTAS), IEEE, 2019, pp. 1–7.
[17] A. Goel, A. Agarwal, M. Vatsa, R. Singh, N. K. Ratha,
[20] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, A. Vladu, Towards deep learning models resistant to adversarial attacks, arXiv preprint arXiv:1706.06083 (2017).
[21] S.-M. Moosavi-Dezfooli, A. Fawzi, P. Frossard, Deepfool: a simple and accurate method to fool deep neural networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2574–2582.
[22] K. R. Mopuri, A. Ganeshan, R. V. Babu, Generalizable data-free objective for crafting universal adversarial perturbations, IEEE Transactions on Pattern Analysis and Machine Intelligence 41 (2018) 2452–2465.
[23] J. Hayes, G. Danezis, Learning universal adversarial perturbations with generative models, in: 2018 IEEE Security and Privacy Workshops (SPW), IEEE, 2018, pp. 43–49.
[24] C. Xie, Z. Zhang, Y. Zhou, S. Bai, J. Wang, Z. Ren, A. L. Yuille, Improving transferability of adversarial examples with input diversity, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 2730–2739.
[25] X. Wang, K. He, Enhancing the transferability of adversarial attacks through variance tuning, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 1924–1933.
[26] G. Goswami, N. Ratha, A. Agarwal, R. Singh, M. Vatsa, Unravelling robustness of deep learning based face recognition against adversarial attacks, in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 32, 2018.
[27] G. Goswami, A. Agarwal, N. Ratha, R. Singh, M. Vatsa, Detecting and mitigating adversarial perturbations for robust face recognition, International Journal of Computer Vision 127 (2019) 719–742.
[28] A. Agarwal, M. Vatsa, R. Singh, N. K. Ratha, Noise is inside me! generating adversarial perturbations with noise derived from natural filters, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020,
     Dndnet: Reconfiguring cnn for adversarial robust-                pp. 774–775.
     ness, in: Proceedings of the IEEE/CVF Conference            [29] J. Kos, I. Fischer, D. Song, Adversarial examples for
     on Computer Vision and Pattern Recognition Work-                 generative models, in: IEEE Security and Privacy
     shops, 2020, pp. 22–23.                                          Workshops, 2018, pp. 36–42.
[18] K. Das, R. N. Behera, A survey on machine learning:         [30] Y.-C. Lin, Z.-W. Hong, Y.-H. Liao, M.-L. Shih, M.-Y.
     concept, algorithms and applications, International              Liu, M. Sun, Tactics of adversarial attack on deep
     Journal of Innovative Research in Computer and                   reinforcement learning agents, in: International
     Communication Engineering 5 (2017) 1301–1309.                    Joint Conference on Artificial Intelligence, 2017, pp.
[19] S. Pouyanfar, S. Sadiq, Y. Yan, H. Tian, Y. Tao, M. P.           3756–3762.
     Reyes, M.-L. Shyu, S.-C. Chen, S. Iyengar, A sur-           [31] I. Rosenberg, A. Shabtai, L. Rokach, Y. Elovici,
     vey on deep learning: Algorithms, techniques, and                Generic black-box end-to-end attack against state
     applications, ACM Computing Surveys (CSUR) 51                    of the art api call based malware classifiers, in: In-
     (2018) 1–36.                                                     ternational Symposium on Research in Attacks, In-
     trusions, and Defenses, Springer, 2018, pp. 490–510.          icone mask based face presentation attack, in: 2019
[32] A. Agarwal, R. Singh, M. Vatsa, N. Ratha, Are                 IEEE 10th International Conference on Biometrics
     image-agnostic universal adversarial perturbations            Theory, Applications and Systems (BTAS), IEEE,
     for face recognition difficult to detect?, in: 2018           2019, pp. 1–5.
     IEEE 9th International Conference on Biometrics          [46] A. Agarwal, D. Yadav, N. Kohli, R. Singh, M. Vatsa,
     Theory, Applications and Systems (BTAS), IEEE,                A. Noore, Face presentation attack with latex masks
     2018, pp. 1–7.                                                in multispectral videos, in: Proceedings of the
[33] A. Agarwal, R. Singh, M. Vatsa, N. K. Ratha, Im-              IEEE Conference on Computer Vision and Pattern
     age transformation based defense against adver-               Recognition Workshops, 2017, pp. 81–89.
     sarial perturbation on deep learning models, IEEE        [47] R. E. Wright, Logistic regression. (1995).
     Transactions on Dependable and Secure Computing          [48] J. J. Hopfield, Artificial neural networks, IEEE
     (2020).                                                       Circuits and Devices Magazine 4 (1988) 3–10.
[34] A. Agarwal, G. Goswami, M. Vatsa, R. Singh, N. K.        [49] K. S. Fu, T. M. Cover, Digital pattern recognition,
     Ratha, Damad: Database, attack, and model agnos-              volume 3, Springer, 1976.
     tic adversarial perturbation detector, IEEE Trans-
     actions on Neural Networks and Learning Systems
     (2021).
[35] J. Zhang, C. Li, Adversarial examples: Opportuni-
     ties and challenges, IEEE transactions on neural net-
     works and learning systems 31 (2019) 2578–2593.
[36] R. Singh, A. Agarwal, M. Singh, S. Nagpal, M. Vatsa,
     On the robustness of face recognition algorithms
     against attacks and bias, in: Proceedings of the
     AAAI Conference on Artificial Intelligence, vol-
     ume 34, 2020, pp. 13583–13589.
[37] A. Serban, E. Poll, J. Visser, Adversarial examples
     on object recognition: A comprehensive survey,
     ACM Computing Surveys (CSUR) 53 (2020) 1–38.
[38] V. Ballet, J. Aigrain, T. Laugel, P. Frossard, M. De-
     tyniecki, et al., Imperceptible adversarial attacks
     on tabular data, in: NeurIPS 2019 Workshop on
     Robust AI in Financial Services: Data, Fairness, Ex-
     plainability, Trustworthiness and Privacy (Robust
     AI in FS 2019), 2019.
[39] E. Levy, Y. Mathov, Z. Katzir, A. Shabtai, Y. Elovici,
     Not all datasets are born equal: On heterogeneous
     data and adversarial examples, arXiv preprint
     arXiv:2010.03180 (2020).
[40] E. Erdemir, J. Bickford, L. Melis, S. Aydore, Adver-
     sarial robustness with non-uniform perturbations,
     arXiv preprint arXiv:2102.12002 (2021).
[41] I.-C. Yeh, C.-h. Lien, The comparisons of data min-
     ing techniques for the predictive accuracy of prob-
     ability of default of credit card clients, Expert Sys-
     tems with Applications 36 (2009) 2473–2480.
[42] M. Lichman, Uci machine learning repository, https:
     //archive.ics.uci.edu/ml, 2013.
[43] C. Ding, H. Peng, Minimum redundancy feature
     selection from microarray gene expression data,
     Journal of bioinformatics and computational biol-
     ogy 3 (2005) 185–205.
[44] C. Cortes, V. Vapnik, Support-vector networks,
     Machine learning 20 (1995) 273–297.
[45] A. Agarwal, M. Vatsa, R. Singh, Chif: Convo-
     luted histogram image features for detecting sil-