A hybrid approach for feature selection in data mining modeling of credit scoring

Galyna O. Chornous1, Kostiantyn K. Pysanets1, Nataliia O. Yakovenko1

1 Taras Shevchenko National University of Kyiv, 90a, Vasylkivska str., Kyiv, 03022, Ukraine
chornous@univ.kiev.ua, knukkp@gmail.com, tasha.yakovenko@gmail.com

Abstract. Recent research shows that data mining techniques can be implemented in broad areas of the economy and, in particular, in the banking sector. One of the most pressing issues banks face is the non-repayment of loans by the population, which is directly related to the credit scoring problem. The main goal of this paper is to show the importance of applying feature selection in data mining modeling of credit scoring. The study demonstrates processes of data pre-processing, feature creation and feature selection that are applicable in real-life business situations for binary classification problems, using nodes from IBM SPSS Modeler. The results prove that applying a hybrid model of feature selection, which allows obtaining the optimal number of features, increases credit scoring accuracy. Compared to the expert judgmental approach, the proposed hybrid model is harder to explain but shows better accuracy and more flexible factor selection, which is an advantage in a fast-changing market.

Keywords: Credit Scoring Model, Feature Selection, Hybrid Approach, Data Mining, IBM SPSS Modeler

1 Introduction

Recent research shows that data mining techniques can be implemented in broad areas of the economy and, in particular, in the banking sector. Banks and other credit institutions face the need to process large amounts of data at a growing rate. The imperatives for the volume of data operations and the speed of their processing require these processes to be almost completely automated. These requirements apply not only to direct digitalization, but also to the procedures for developing appropriate mathematical models. Credit scoring models are a prime example. They are increasingly combined with new computational methods based on data mining.

An extremely important problem in scoring modeling was and remains the choice of the borrowers' characteristics that are decisive in loan decision making. In terms of the model, these characteristics are often known as explanatory variables, covariates, predictor attributes, predictor variables, independent variables or, typically, features. The set of the most influential features is not permanent: it changes over time and depends significantly on the macroeconomic situation and national specificities.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Researchers [16] analyzed 187 papers on credit scoring from 1992 to 2015 and noted that feature selection is the fourth of seven main types of objectives: proposing a new rating method, comparing traditional techniques, conceptual discussions, feature selection, literature reviews, performance measure studies and other issues. The survey's authors identified this objective in 95 articles, representing 51% of the total, with 52 of those articles published since 2011. Among the main studies of the 2002-2015 period devoted to feature selection, [2, 4, 6, 11, 14, 15, 18, 20, 22, 27, 28] can be highlighted. Relevant problems have also been actively researched in the last four years; examples of such publications are [2, 3, 8, 10, 17, 24-26].
The focus of relevant research on feature selection is increasingly shifting toward machine learning and hybridization methods.

Nowadays, software implementations of machine learning algorithms in credit scoring can rely on the following classes of software: business application packages (statistical packages and analytics platforms, such as SAS/STAT, SAS Enterprise Miner, IBM SPSS Modeler, STATISTICA Data Miner), open platforms (Python, R, Apache Spark) and cloud solutions (Microsoft Azure Machine Learning Studio, Google Machine Learning Engine). Despite all the advantages of cloud-based data analytics, it is not widely used in banks due to security concerns about passing confidential data to the cloud.

The biggest advantage of open environments is the ability to use many more algorithms compared to business application packages. However, they impose additional qualification requirements on the developers of scoring models.

Therefore, statistical packages and analytical platforms are the most commonly used in banks. They include analytical pre-processing tools and ready-made, customizable machine learning algorithm templates. In addition, these packages allow configuring model settings and using interactive quality assessment techniques. This conclusion is also confirmed by the authors of [26].

Business application packages have powerful functionality for solving the feature selection problem, which can be improved by combining built-in approaches and hybridization. That is why it is extremely important to improve the functionality of these software products for developing scoring models in general, and for feature selection in particular.

The purpose of this study is to propose new hybrid approaches to feature selection that improve the quality of credit scoring models built on intelligent data analysis and machine learning approaches in today's analytics platforms.

2 Literature review

Over the last few decades, more and more attention has been paid to the problem of credit scoring [21, 23]. Artificial Neural Networks (ANNs) [25] and Support Vector Machines (SVM) [4, 20] are two soft computing methods commonly used in credit scoring modelling. Other methods, such as evolutionary algorithms and stochastic optimization techniques, have shown promising results in terms of prediction accuracy [26]. Besides, there are also traditional approaches based on expert knowledge, which allow developing expert judgmental models [7], scoring expert systems [1] and mixed models.

Feature selection algorithms, generally used as preprocessing methods in scoring model creation, can increase classification performance. They have a number of benefits [19]: decreasing the noise in the dataset; reducing the computational cost of acquiring proper models; helping to better understand the final models in the classification algorithms; simple application; assisting in updating the model.

We have studied many publications on the problem of credit scoring in data mining with feature selection techniques. Some of them are described below.

Starting with basic approaches, expert judgmental forms are a good source for creating an initial feature list. This is not common for long-established credit businesses; however, when new lending segments emerge under conditions of data shortage, it shows acceptable effectiveness.
One of the simplest and most widespread statistical approaches is the use of the 'weight of evidence' and 'information value' indicators, explained by Siddiqi in [21] (a brief computational sketch of these indicators is given below).

Kuhn and Johnson describe in [12-13] two main types of feature selection techniques: wrapper and filter methods. The filter approach considers the feature selection process as a step separate from the learning algorithm. The filter model uses evaluation functions to evaluate the classification performance of subsets of features. There are many such evaluation functions: feature importance, Gini, information gain, the information gain ratio, etc. A disadvantage of this approach is that there is no relationship between the feature selection process and the performance of the learning algorithm. The wrapper approach uses a machine learning algorithm to measure the goodness of the selected feature set. The measurement relies on the performance of the learning algorithm, such as its accuracy, recall and precision values.

The papers [25-26] systematize a credit scoring model based on deep learning and feature selection to evaluate the applicant's credit score from the applicant's input features.

The objective of many studies is to identify the best-performing feature selection techniques among conventional and heuristic techniques in various applications [14, 17, 28]. Many studies embody the optimization approach to find the best subset of predictors for improving scoring model performance [3, 6, 27]. For instance, in [3] the authors suggested studying local search, stochastic local search and variable neighborhood search for feature selection in credit scoring. The proposed feature selection is then combined with a support vector machine to classify the input data.

Publications of recent years show active use of principal component analysis for feature selection in credit scoring: PCA is a transformation process that reduces the number of features by extracting new independent features [7, 11, 25].

These days, there are more and more examples of the use of different hybrid feature selection techniques. A hybrid approach combining feature selection algorithms and ensemble learning classifiers in data mining models for credit scoring was used by Koutanaei, Sajedi and Khanbabaei [11]. Credit scoring modeling based on a feature selection approach and parallel Random Forest was described by Ha Van Sang, Ha Nam and Nguyen Duc Nhan [8]. A feature clustering approach to finding the optimal set of predictors was proposed by the authors of [24]. Kamalov and Thabtah suggested a new filtering method that combines and normalizes the scores of three major feature selection methods: information gain, chi-squared statistic and inter-correlation [10]. Chornous and Nikolskyi prove that a great improvement can be reached by applying a hybrid approach to the feature selection process on additional variables (more descriptive ones built from initial features) in the case of limited computational resources [5].

In comparison with others, this study proposes a unique approach to feature selection techniques, and its advantage is the accessibility of the approach for users of analytical platforms and business application packages, using the capabilities of the built-in tools and offering methods for combining them. The scoring modelling techniques mentioned in the studied works are commonly used, but their aim is mostly to show an increase in model performance accuracy directly.
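As an illustration of the weight-of-evidence and information-value indicators mentioned at the beginning of this section, the following Python sketch computes both for a single categorical feature. The column names, the toy data and the smoothing constant are our own illustrative assumptions, not taken from the paper's dataset.

```python
import numpy as np
import pandas as pd

def woe_iv(feature: pd.Series, default_flag: pd.Series, eps: float = 0.5):
    """WoE_i = ln(%good_i / %bad_i); IV = sum_i (%good_i - %bad_i) * WoE_i.

    `eps` is a small smoothing constant to avoid division by zero.
    """
    tab = pd.crosstab(feature, default_flag)  # rows: categories, cols: 0/1
    good = tab[0] + eps                       # non-defaulted borrowers
    bad = tab[1] + eps                        # defaulted borrowers
    pct_good = good / good.sum()
    pct_bad = bad / bad.sum()
    woe = np.log(pct_good / pct_bad)
    iv = ((pct_good - pct_bad) * woe).sum()
    return woe, iv

# Hypothetical usage on a toy frame with a paper-style flag EVER_30_DPD
df = pd.DataFrame({
    "WRK_FIELD": ["trade", "trade", "industry", "services", "industry", "services"],
    "EVER_30_DPD": [0, 1, 0, 0, 1, 0],
})
woe, iv = woe_iv(df["WRK_FIELD"], df["EVER_30_DPD"])
print(woe)               # per-category weight of evidence
print(f"IV = {iv:.3f}")  # rule of thumb: larger IV suggests a stronger predictor
```

Features are then ranked by their information value and the weakest are discarded.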
Previous studies compare different approaches or miss the relationship between the feature selection process and the performance of learning algorithms. So the authors' aim is to search for a conjunction of feature selection techniques that reaches, and explains, better model performance. Besides, it is obvious that the usage of feature selection techniques differs among countries, so the results of this particular study can be implemented in real business cases in Ukraine.

Important tasks of this research are to show the advantages of modern powerful analytical platforms (on the example of IBM SPSS Modeler) for solving credit scoring problems in general and feature selection in particular; to suggest the concept of an effective combination of their tools; and to show the experimental results of the joint application of options in the Feature Selection node and the PCA/Factor node to optimize the feature selection process and to model credit scoring on the example of Ukrainian bank data.

3 Methodology

Dataset and tools

The dataset was collected and systematized by a Ukrainian bank during the stages of providing consumer loans to individuals: it contains socio-economic information about the borrowers together with information about the timeliness of loan repayment. The original dataset consists of 61 fields with a record volume of 61,216. It is advisable to use all the data in this database, not just a subset, which allows building more accurate models; non-relevant records or non-relevant attributes may simply not be included. The available data is in several formats: numeric, categorical and logical.

All the computing work was done in nodes from IBM SPSS Modeler. This software product has powerful functionality for solving binary classification tasks, including credit scoring. To select important features, the built-in Feature Selection node and PCA/Factor node are used [9].

The Feature Selection node implements 3 key procedures (a sketch emulating this workflow with open-source tools is given at the end of this subsection):
1. Screening, which removes unimportant and problematic inputs and records, such as input fields with too many missing values or with too much or too little variation to be useful.
2. Ranking, which sorts the remaining inputs and assigns ranks based on importance.
3. Selecting, which identifies the subset of features to use in subsequent models.

The PCA/Factor node provides powerful data reduction techniques to reduce the complexity of data. Two approaches are provided. The first is Principal Component Analysis (PCA): a statistical dimensionality reduction technique in which correlated features are combined into principal components. The second is Factor analysis, which identifies underlying concepts, or factors, that explain the pattern of correlations within a set of observed fields. Factor analysis focuses on shared variance only; variance that is unique to specific features is not considered in estimating the model. Several methods of factor analysis are provided by the PCA/Factor node.

For extended comparison, we received an expert judgmental generic list of influencing factors, their attributes and weights for the studied segment from a professional of the Ukrainian credit market with more than 10 years of working experience.

For both approaches, the goal is to find a small number of derived features that effectively summarize the information in the original set of features, and to compare them with the expert model.
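Since the Feature Selection node itself is proprietary, the following is a minimal Python/scikit-learn sketch of the same screen-rank-select logic. The thresholds (50% missing, 90% single category) and the use of the chi-square score as the importance measure are illustrative assumptions, not the node's exact defaults.

```python
import pandas as pd
from sklearn.feature_selection import chi2

def screen_rank_select(X: pd.DataFrame, y: pd.Series, n: int = 20,
                       max_missing: float = 0.5, max_single_cat: float = 0.9):
    # 1. Screening: drop fields with too many missing values or
    #    dominated by a single value (too little variation to be useful).
    keep = []
    for col in X.columns:
        if X[col].isna().mean() > max_missing:
            continue
        if X[col].value_counts(normalize=True, dropna=False).iloc[0] > max_single_cat:
            continue
        keep.append(col)

    # 2. Ranking: score each surviving field against the target.
    #    chi2 stands in for the node's importance measures; it requires
    #    non-negative inputs, hence the dummy encoding of categories.
    X_enc = pd.get_dummies(X[keep]).fillna(0)
    scores, _ = chi2(X_enc, y)
    ranking = pd.Series(scores, index=X_enc.columns).sort_values(ascending=False)

    # 3. Selecting: keep the top-n fields for subsequent models.
    return ranking.head(n).index.tolist(), ranking
```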
To develop scoring models, IBM SPSS Modeler offers 16 base methods (examples include Decision Trees (CART, QUEST, C5.0, CHAID), Neural Network, SVM, Bayes Network, KNN, Logistic Regression and Discriminant analysis) and a large set of ensemble methods (bagging, boosting, Random Tree, Random Forest, XGBoost Tree, XGBoost Linear, XGBoost-AS) [9]. Moreover, the Auto Classifier node creates and compares a number of different binary models, allowing the best approach for development to be chosen. 16 modeling algorithms are supported, making it possible to select the best methods, the specific options for each, and the criteria for comparing the results.

Concept used

Our investigation starts with the stages of pre-processing the dataset and adding new features that describe the borrower's status better than the initial ones. Then, to the resulting dataset with the initial number of features, we applied modeling methods such as Decision Trees, Random Forest, Support Vector Machines, Neural Networks, Logistic Regression and the Expert approach. After this stage we applied a number of feature selection techniques in order to decrease the number of features and conducted modeling again under feature selection. We then compared the AUC values of the initial models and the models under feature selection. Finally, we applied a hybrid approach to feature selection analysis to obtain the optimal number of features, which increases credit scoring accuracy. The described stages of the experiment are presented in Figure 1.

Fig. 1. Proposed concept for the experiment

Data pre-processing

Data pre-processing includes several steps:
1. Removing missing values and irrelevant features.
2. Data integration, transformation and normalization.
3. Reclassifying categorical values.
4. Balancing data.

Feature creation

The process of feature creation gives an opportunity to embed business logic into the feature selection process by adding new feature interaction rules. The new features describe the borrower's status better than the initial ones and should improve modeling results, and hence banks' performance, on the credit scoring problem.

Feature selection

Feature selection is also a data pre-processing technique, used to select the relevant attributes for the experiment. Feature engineering is crucial for model optimization. This paper proposes feature selection by such ranking measures as Pearson chi-square, Likelihood-ratio chi-square, Cramer's V and Lambda, and feature selection using Principal Component Analysis. All feature selection techniques are presented in detail below.

Feature selection by ranking measures.

Feature selection by ranking measures was used to screen and rank features by importance. In this paper, we focus on 4 ranking measures (a computational sketch follows the list):
1. Pearson chi-square. Tests for independence of the target and the input without indicating the strength or direction of any existing relationship.
2. Likelihood-ratio chi-square. Similar to Pearson's chi-square but also tests for target-input independence.
3. Cramer's V. A measure of association based on Pearson's chi-square statistic. Values range from 0, which indicates no association, to 1, which indicates perfect association.
4. Lambda. A measure of association reflecting the proportional reduction in error when the variable is used to predict the target value. A value of 1 indicates that the input field perfectly predicts the target, while a value of 0 means the input provides no useful information about the target.
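For reference, all four measures can be reproduced outside Modeler from a feature-target contingency table. The sketch below, assuming a categorical input and a binary target, is illustrative rather than the node's internal implementation.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

def ranking_measures(feature: pd.Series, target: pd.Series) -> dict:
    tab = pd.crosstab(feature, target).to_numpy()
    n = tab.sum()

    # (1) Pearson chi-square and (2) likelihood-ratio (G-test) chi-square
    pearson_chi2, _, _, _ = chi2_contingency(tab)
    lr_chi2, _, _, _ = chi2_contingency(tab, lambda_="log-likelihood")

    # (3) Cramer's V, derived from the Pearson statistic
    r, c = tab.shape
    cramers_v = np.sqrt(pearson_chi2 / (n * (min(r, c) - 1)))

    # (4) Goodman-Kruskal lambda: proportional reduction in error when
    #     predicting the target (columns) from the input (rows)
    e_without = n - tab.sum(axis=0).max()  # errors with no predictor
    e_with = n - tab.max(axis=1).sum()     # errors using the modal target per row
    gk_lambda = (e_without - e_with) / e_without if e_without else 0.0

    return {"pearson_chi2": pearson_chi2, "lr_chi2": lr_chi2,
            "cramers_v": cramers_v, "lambda": gk_lambda}
```

Higher scores under any of these measures move a field up the importance ranking; the top n fields are then passed to the models.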
Feature selection by Principal Component Analysis.

Principal Component Analysis (PCA) finds linear combinations of the input fields that best capture the variance in the entire set of features, where the components are orthogonal (perpendicular) to each other. PCA focuses on all variance, including both shared and unique variance. Factor analysis and PCA can effectively reduce the complexity of data without sacrificing much of the information content. These techniques can help to build more robust models that execute faster than would be possible with the raw input fields.

Modeling

Typical methods for performing binary classification are Decision Trees, Random Forest, Support Vector Machines, Neural Networks and Logistic Regression [23]. The expert approach is an effective alternative to these methods. In this paper, we focus on four main methods: Support Vector Machines, Neural Networks, Logistic Regression and Decision Tree (CHAID). We are interested in achieving the best AUC value, because the higher the AUC, the better the distinguishing capacity of the classifier. It means that the features chosen by the mentioned feature selection techniques provide the combination of features that best improves the capability of a credit model to correctly identify whether a potential borrower will pay back a loan. After comparing and sorting the AUC values of the classification algorithms, the best one is selected. Besides, in order to achieve better performance, an ensemble of the chosen models can be developed.

A hybrid approach for feature selection analysis

To present an argument for reasonable feature selection, it is advisable to focus not only on the meaning of the measures described above. We suggest implementing a hybrid approach.

We propose to average the values of the statistical measures for each model for different numbers of fields, which allows obtaining the number of features to select that corresponds to the highest accuracy of the model. Taking into account the active use of Principal Component Analysis reported in the literature on this issue, it is also advisable to take advantage of the following weighting: the weight for PCA is 0.5, and the remainder is evenly distributed among the statistical measures (0.125 for each). Based on this hybridization of the measures, we can conclude whether or not the AUC values depend on the number of features to be selected, obtain the optimal number of features and develop a model using the best quality approach. To recognize the correspondence between the selection criteria and the models used, the AUC values for each measure for different numbers of features have to be averaged for each model. If a particular criterion takes precedence for most models, it can be used for feature selection and for developing the ensemble of models.
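To make the two averaging schemes concrete, the sketch below reproduces, under our reading of the method, the n = 15 row of Table 2 in the next section: the uniform average of the four statistical measures and the weighted average with 0.5 on PCA and 0.125 on each statistical measure.

```python
# AUC values for Logistic Regression with the top 15 features (Table 2)
auc = {
    "pearson_chi2": 0.740,
    "lr_chi2": 0.749,
    "cramers_v": 0.746,
    "lambda": 0.745,
    "pca": 0.681,
}
stat_keys = ["pearson_chi2", "lr_chi2", "cramers_v", "lambda"]

# Column (6): uniform average of the four statistical measures
avg_stat = sum(auc[k] for k in stat_keys) / len(stat_keys)

# Column (7): hybrid weighted average, 0.5 for PCA, 0.125 per statistical measure
weights = {**{k: 0.125 for k in stat_keys}, "pca": 0.5}
avg_hybrid = sum(weights[k] * auc[k] for k in auc)

print(f"average (1-4): {avg_stat:.3f}")    # -> 0.745
print(f"average (5-6): {avg_hybrid:.3f}")  # -> 0.713
```

The number of features whose row maximizes these averages is then taken as the optimal selection size for the given model.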
4 Experimental Results

Data pre-processing

Our experiment starts with the process of data pre-processing. First of all, data cleaning was performed (removing missing values and irrelevant features from the database). Fields were screened based on the following criteria: maximum percentage of missing values; maximum percentage of records in a single category and maximum number of categories as a percentage of records (for categorical fields); minimum coefficient of variation and minimum standard deviation (for numeric fields).

While processing the data we found a conflicting coding scheme in the database. Numerical attributes had two ways of separating the integer part from the fractional part: a comma (field "WRK_EXPERIENCE") and a semicolon (all other numeric attributes); another example is the date of birth field, which uses either "-" or "." separators. There are also two data formats in the WRK_NROFEMPLOYEES field, categorical and date. Gender ("Female", "female") is also indicated by different formulations.

To prepare the data for modeling, we created a numeric field for the borrower's age and a flag field for the borrower's gender, in which we reclassified the erroneously entered gender values. We also created new flag fields for the presence of a partner and the attitude towards the army, and reclassified the occupational names field into six occupations to facilitate further modeling.

Another important step is creating new features in order to embed business logic into the feature selection process. Thus, we added new feature interaction rules: the amount of income per family member, the amount of income per child, the amount of loan per term and the amount of income per payment. The practicality of this step is noted in many sources [21, 23].

The target role is set for the field that indicates whether or not a given customer defaulted on the loan. The potential target fields were EVER_1_DPD, EVER_30_DPD, EVER_60_DPD and EVER_90_DPD. We chose EVER_30_DPD as the target, because such loan delinquencies are beneficial for the bank: the borrowers also pay a delay penalty in addition to the loan repayments.

After data pre-processing, a dataset of 34 fields was prepared instead of the initial 62. It consists of 41,687 records. The data is in several formats: numeric (24), categorical (8) and logical (2). It should be mentioned that the resulting dataset was split into two samples: training (75%) and testing (25%).

To correct imbalances in the dataset we use the Balance node, which artificially increases the number of records for which the target field EVER_30_DPD equals "1". Balancing the data is essential to reduce the imbalance in the initial data, where the percentage of non-repayable loans is low in comparison with repayable ones (which is typical of the credit scoring problem in real business cases). Since many modeling techniques have trouble with biased data, they tend to learn only the positive cases (repayable loans) and ignore the negative ones. If the data are well balanced, with approximately equal numbers of positive and negative cases, models have a better chance of finding patterns that distinguish the two groups. In this case, a Balance node is useful for creating a balancing directive that increases the number of cases of non-repayable loans. In order not to misrepresent the true distribution in the results, we apply the Balance node only to the training sample of the data.
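This split-then-balance discipline can be mirrored outside Modeler. The following is a minimal sketch, assuming a pandas DataFrame with the EVER_30_DPD flag, that replaces the Balance node's boosting directive with random oversampling of the minority class on the training partition only.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

def split_and_balance(df: pd.DataFrame, target: str = "EVER_30_DPD",
                      test_size: float = 0.25, seed: int = 42):
    # 75/25 split first, stratified so both samples keep the true class mix
    train, test = train_test_split(df, test_size=test_size,
                                   stratify=df[target], random_state=seed)

    # Oversample defaulted loans (target == 1) in the TRAINING sample only,
    # so the test sample keeps the true class distribution
    minority = train[train[target] == 1]
    majority = train[train[target] == 0]
    minority_up = minority.sample(n=len(majority), replace=True, random_state=seed)
    train_balanced = pd.concat([majority, minority_up]).sample(frac=1, random_state=seed)

    return train_balanced, test
```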
Feature Selection Techniques and Modeling

This section provides the AUC values for the scope of methods with and without feature selection, including the expert approach model (Table 1).

Table 1. AUC results for the scope of models.

| Model               | AUC value |
| Logistic Regression | 0.601     |
| Neural Networks     | 0.596     |
| SVM                 | 0.583     |
| CHAID               | 0.696     |
| Expert              | 0.654     |

Without feature selection, the Decision Tree (CHAID) model demonstrated the best results, while the pure expert model showed the second-best separation power.

When scoring data with the Feature Selection node, the top n fields based on importance (4 different statistical measures) were selected (n = 30, 25, 20, 15, 10). Similar actions were performed for the PCA/Factor node. Next, Logistic Regression, Neural Network, Support Vector Machines and Decision Tree (CHAID) models were built for each case; a sketch of this evaluation loop follows the tables below. The choice of these models is explained by the results of applying the Auto Classifier node.

According to the expert approach, 20 factors were determined to be significant. The most valuable features are 'Loan payment to income ratio', 'Income to expenses ratio of borrower', 'Spouse availability' and 'Age of a borrower'. By significance, all factors can be divided into 4 groups with the same level of predictive strength within a group, containing 3, 1, 10 and 6 factors respectively. By logical criteria, all factors can be assigned to the social-demographic, lending terms, financial state and lending history types. Further changes to the number of factors are impractical, as they are time consuming and contradict the aim of a pure expert approach.

AUC results for the models with feature selection for n equal to 25, 20 and 15 are presented in Tables 2-5. Results for n equal to 10 and 30 are omitted due to their insignificant difference from the 25 and 15 options respectively.

Table 2. AUC results due to feature selection technique for the Logistic Regression.

| Number of features | (1) Pearson chi-square | (2) Likelihood-ratio chi-square | (3) Cramer's V | (4) Lambda | (5) PCA | (6) Average (1-4) | (7) Average (5-6) |
| 15 | 0.740 | 0.749 | 0.746 | 0.745 | 0.681 | 0.745 | 0.713 |
| 20 | 0.748 | 0.749 | 0.747 | 0.749 | 0.594 | 0.748 | 0.671 |
| 25 | 0.746 | 0.748 | 0.747 | 0.747 | 0.589 | 0.747 | 0.668 |

Table 3. AUC results due to feature selection technique for the Neural Network.

| Number of features | (1) Pearson chi-square | (2) Likelihood-ratio chi-square | (3) Cramer's V | (4) Lambda | (5) PCA | (6) Average (1-4) | (7) Average (5-6) |
| 15 | 0.702 | 0.715 | 0.704 | 0.700 | 0.677 | 0.705 | 0.691 |
| 20 | 0.713 | 0.721 | 0.732 | 0.709 | 0.595 | 0.719 | 0.657 |
| 25 | 0.648 | 0.683 | 0.705 | 0.701 | 0.592 | 0.684 | 0.638 |

Table 4. AUC results due to feature selection technique for the SVM.

| Number of features | (1) Pearson chi-square | (2) Likelihood-ratio chi-square | (3) Cramer's V | (4) Lambda | (5) PCA | (6) Average (1-4) | (7) Average (5-6) |
| 15 | 0.657 | 0.673 | 0.670 | 0.678 | 0.507 | 0.670 | 0.588 |
| 20 | 0.679 | 0.671 | 0.672 | 0.679 | 0.582 | 0.675 | 0.629 |
| 25 | 0.638 | 0.673 | 0.686 | 0.685 | 0.580 | 0.671 | 0.625 |

Table 5. AUC results due to feature selection technique for the CHAID.

| Number of features | (1) Pearson chi-square | (2) Likelihood-ratio chi-square | (3) Cramer's V | (4) Lambda | (5) PCA | (6) Average (1-4) | (7) Average (5-6) |
| 15 | 0.702 | 0.675 | 0.688 | 0.716 | 0.686 | 0.695 | 0.691 |
| 20 | 0.695 | 0.684 | 0.718 | 0.690 | 0.697 | 0.697 | 0.697 |
| 25 | 0.729 | 0.668 | 0.714 | 0.721 | 0.680 | 0.708 | 0.694 |
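As a hedged illustration of the loop behind Tables 2-5, the sketch below computes an AUC table of the same shape. The `rank_features` callback is a hypothetical helper (for instance, built on the ranking measures sketched in Section 3), and Logistic Regression stands in for the four Modeler models; the selected fields are assumed to be already numerically encoded.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def auc_table(train: pd.DataFrame, test: pd.DataFrame, target: str, rank_features,
              measures=("pearson_chi2", "lr_chi2", "cramers_v", "lambda"),
              sizes=(15, 20, 25)) -> pd.DataFrame:
    """AUC for every (measure, top-n) pair, in the shape of Tables 2-5.

    `rank_features(train, target, measure)` is assumed to return the input
    fields sorted by importance under the given measure.
    """
    rows = {}
    for n in sizes:
        row = {}
        for measure in measures:
            cols = rank_features(train, target, measure)[:n]  # top-n fields
            model = LogisticRegression(max_iter=1000)
            model.fit(train[cols], train[target])
            prob = model.predict_proba(test[cols])[:, 1]
            row[measure] = roc_auc_score(test[target], prob)
        row["average_1_4"] = sum(row[m] for m in measures) / len(measures)
        rows[n] = row
    return pd.DataFrame.from_dict(rows, orient="index")
```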
It is obvious that PCA has proven to be an ineffective variable selection technique for CHAID and SVM, since the AUC values are close to the corresponding values of the models without feature selection. The best results were achieved for the Logistic Regression and Neural Network models according to the Likelihood-ratio chi-square and Cramer's V criteria. On average, the use of feature selection techniques improves the AUC value by 11.2% compared to non-use. Note that the distribution of the cumulative gain in averaged AUC compared to the averaged AUC without feature selection is not uniform: the statistical criteria of Pearson chi-square, Likelihood-ratio chi-square, Cramer's V and Lambda improve it by 13.0-14.7%, but Principal Component Analysis does so only by 0.4%. That is why it is advisable for Ukrainian banks to try the alternative of feature selection by statistical measures, unlike the widespread foreign practice which prefers PCA [7, 11, 25].

The findings prove that in most cases the AUC measures of the classification algorithms increase as the number of features decreases, and reducing the variables to 20 improves model performance.

Hybrid approach to feature selection analysis

Tables 2-5 show the AUC results for different numbers of input fields, selected in accordance with 5 feature selection criteria, as well as the average values of the 4 statistical measures and the weighted average of all presented measures.

The AUC averages over all models (Table 6) confirm that the combination of statistical criteria allows obtaining the optimal number of features for modeling, 20, and by PCA, 15.

Table 6. Average AUC results for the Logistic Regression, Neural Network, SVM and CHAID.

| Number of features | (1) Pearson chi-square | (2) Likelihood-ratio chi-square | (3) Cramer's V | (4) Lambda | (5) PCA | (6) Average (1-4) | (7) Average (5-6) |
| 15 | 0.700 | 0.703 | 0.702 | 0.710 | 0.638 | 0.704 | 0.671 |
| 20 | 0.709 | 0.706 | 0.717 | 0.707 | 0.617 | 0.710 | 0.663 |
| 25 | 0.690 | 0.693 | 0.713 | 0.714 | 0.610 | 0.702 | 0.656 |

It is obvious that the AUC value is higher when using a uniform distribution of the statistical measures rather than the weighted approach that takes PCA into account.

An interesting fact is that as the number of selected features decreases, the feature sets chosen by the different techniques tend to become similar. The fields most frequently top-rated by the feature selection techniques are as follows:
1. CLN_YEARS (current age of a client);
2. WRK_FIELD (work field of a client);
3. SPOUSE (existence of a spouse);
4. INC_ALL_AP (total incomes of a client);
5. AMOUNT_PER_TERM (a sum of credit by term).

As for the features created to embed business logic into the feature selection process, we should mention that all of them are selected among the 20 important features by all criteria; moreover, PCA places them in the top six.

Attempts to determine the best feature selection technique by averaging AUC values over different models for different numbers of features were unsuccessful, as each model demonstrated a different best measure: for the Logistic Regression, Likelihood-ratio chi-square; for the Neural Network, Cramer's V; for the SVM, Lambda; for the CHAID, Pearson chi-square. As none of the criteria dominated in most models, we concluded that an ensemble was inappropriate in this case.

5 Conclusion

The study shows processes of data pre-processing, feature creation and feature selection that are applicable in real-life business situations for binary classification problems, using nodes from IBM SPSS Modeler. The results have proved that applying a hybrid model of feature selection, which allows obtaining the optimal number of features, increases credit scoring accuracy.

Compared to the expert judgmental approach, the proposed hybrid model is harder to explain but shows better accuracy and more flexible factor selection, which is an advantage in a fast-changing market.
Besides, the paper shows the accessibility of the approach for users of analytical platforms, the availability of tools in business application packages (IBM SPSS Modeler as an example) and a method of combining these tools. It is obvious that the usage of feature selection techniques differs among countries, so the results of this particular study can be implemented in real business cases in Ukraine.

It should be noted that Ukrainian banks may be advised to try feature selection according to such statistical measures as Pearson chi-square, Likelihood-ratio chi-square, Cramer's V and Lambda, rather than PCA. The study results on the Ukrainian lending market also show that the choice of features can be limited to 20, which allows obtaining the maximum AUC value.

The study empirically confirms that banks in Ukraine should consider a hybrid selection technique with equal weights for the statistical measures. The results show that using a hybrid approach to feature selection improves the AUC value by 11.2% compared to non-use, which is a clear advantage. On the other hand, the weak point of the approach is the increased amount of time spent on calculations.

References

1. Abdou, H., Pointon, J.: Credit scoring, statistical techniques and evaluation criteria: a review of the literature. Intelligent Systems in Accounting, Finance & Management. 18 (2-3), 59-88 (2011) DOI: 10.1002/isaf.325
2. Aryuni, M., Madyatmadja, E.: Feature selection in credit scoring model for credit card applicants in XYZ bank: A comparative study. International Journal of Multimedia and Ubiquitous Engineering. 10 (5), 17-24 (2015) DOI: 10.14257/ijmue.2015.10.5.03
3. Boughaci, D., Alkhawaldeh, A.A.: Three local search-based methods for feature selection in credit scoring. Vietnam Journal of Computer Science. 5, 107-121 (2018) DOI: 10.1007/s40595-018-0107-y
4. Chen, F.-L., Li, F.-C.: Combination of feature selection approaches with SVM in credit scoring. Expert Systems with Applications. 37 (7), 4902-4909 (2010) DOI: 10.1016/j.eswa.2009.12.025
5. Chornous, G., Nikolskyi, I.: Business-oriented feature selection for hybrid classification model of credit scoring. In: Proceedings of the 2018 IEEE Second International Conference on Data Stream Mining & Processing (DSMP), pp. 397-401. IEEE Press, Lviv (2018) DOI: 10.1109/DSMP.2018.8478534
6. Falangis, K., Glen, J.: Heuristics for feature selection in mathematical programming discriminant analysis models. Journal of the Operational Research Society. 61 (5), 804-812 (2010) DOI: 10.1057/jors.2009.24
7. Gietzen, T.: Credit Scoring vs. Expert Judgment - A Randomized Controlled Trial. SSRN Electronic Journal (2017) DOI: 10.2139/ssrn.2983076
8. Ha Van Sang, Ha Nam, Nguyen Duc Nhan: A Novel Credit Scoring Prediction Model based on Feature Selection Approach and Parallel Random Forest. Indian Journal of Science and Technology. 9 (20) (2016) DOI: 10.17485/ijst/2016/v9i20/92299
9. IBM SPSS Modeler 18.2 Modeling Nodes. Copyright IBM Corp. 1994-2018. ftp://public.dhe.ibm.com/software/analytics/spss/documentation/modeler/18.2/en/ModelerModelingNodes.pdf
10. Kamalov, F., Thabtah, F.: A Feature Selection Method Based on Ranked Vector Scores of Features for Classification. Annals of Data Science. 4, 483-502 (2017) DOI: 10.1007/s40745-017-0116-1
11. Koutanaei, F., Sajedi, H., Khanbabaei, M.: A hybrid data mining model of feature selection algorithms and ensemble learning classifiers for credit scoring. Journal of Retailing and Consumer Services. 27, 11-23 (2015) DOI: 10.1016/j.jretconser.2015.07.003
12. Kuhn, M., Johnson, K.: Applied Predictive Modeling. 2nd ed. Springer (2018)
13. Kuhn, M., Johnson, K.: Feature Engineering and Selection: A Practical Approach for Predictive Models. Chapman and Hall/CRC (2019)
14. Liang, D., Tsai, C.-F., Wu, H.-T.: The effect of feature selection on financial distress prediction. Knowledge-Based Systems. 73 (1), 289-297 (2014) DOI: 10.1016/j.knosys.2014.10.010
15. Liu, Y., Schumann, M.: Data mining feature selection for credit scoring models. Journal of the Operational Research Society. 56 (9), 1099-1108 (2005) DOI: 10.1057/palgrave.jors.2601976
16. Louzada, F., Ara, A., Fernandes, G.B.: Classification methods applied to credit scoring: A systematic review and overall comparison. Surveys in Operations Research and Management Science. 21 (2), 117-134 (2016) DOI: 10.1016/j.sorms.2016.10.001
17. Rozlini, M., Munirah, M.Y., Wahidi, N.: A Comparative Study of Feature Selection Techniques for Bat Algorithm in Various Applications. MATEC Web of Conferences. 150, 06006 (2018) DOI: 10.1051/matecconf/201815006006
18. Sadatrasoul, S., Gholamian, M., Shahanaghi, K.: Combination of feature selection and optimized fuzzy apriori rules: The case of credit scoring. International Arab Journal of Information Technology. 12 (2), 138-145 (2015)
19. Salappa, A., Doumpos, M., Zopounidis, C.: Feature selection algorithms in classification problems: an experimental evaluation. Optimization Methods and Software. 22 (1), 199-212 (2007) DOI: 10.1080/10556780600881910
20. Shi, J., Zhang, S.-Y., Qiu, L.-M.: Credit scoring by feature-weighted support vector machines. Journal of Zhejiang University: Science C. 14 (3), 197-204 (2013) DOI: 10.1631/jzus.C1200205
21. Siddiqi, N.: Intelligent Credit Scoring: Building and Implementing Better Credit Risk Scorecards. 2nd ed. Wiley (2017)
22. Somol, P., Baesens, B., Pudil, P., Vanthienen, J.: Filter- versus wrapper-based feature selection for credit scoring. International Journal of Intelligent Systems. 20 (10), 985-999 (2005) DOI: 10.1002/int.20103
23. Thomas, L.C., Edelman, D.B., Crook, J.N.: Credit Scoring and its Applications. 2nd revised ed. SIAM, Society for Industrial & Applied Mathematics (2017)
24. Tripathi, D., Edla, D., Kuppili, V., Bablani, A., Dharavath, R.: Credit Scoring Model based on Weighted Voting and Cluster based Feature Selection. Procedia Computer Science. 132, 22-31 (2018) DOI: 10.1016/j.procs.2018.05.055
25. Van-Sang Ha, Ha-Nam Nguyen: Credit scoring with a feature selection approach based deep learning. MATEC Web of Conferences. 54 (2016) DOI: 10.1051/matecconf/20165405004
26. Volkova, E.S., Gisin, V.B., Solov'ev, V.I.: Data Mining Techniques: Modern Approaches to Application in Credit Scoring. Finance and Credit. 23 (34), 2044-2060 (2017) DOI: 10.24891/fc.23.34.2044
27. Waad, B., Ghazi, B., Mohamed, L.: A three-stage feature selection using quadratic programming for credit scoring. Applied Artificial Intelligence. 27 (8), 721-742 (2013) DOI: 10.1080/08839514.2013.823327
28. Wang, J., Hedar, A.-R., Wang, S., Ma, J.: Rough set and scatter search metaheuristic based feature selection for credit scoring. Expert Systems with Applications. 39 (6), 6123-6128 (2012) DOI: 10.1016/j.eswa.2011.11.011