Melanoma Classification Using Feature Extraction Methods and Machine Learning Approaches

Sihui He1+*, Zheyang Huang2+, and Xinjie Zhong3

1 Faculty of Science, Western University, London, N6A 37K, Canada
2 Faculty of Engineering, The University of Sydney, Sydney, NSW 2006, Australia
3 School of Electrical Engineering, Zhejiang University, Hangzhou, 30058, China
+ They are both first authors
* Corresponding author: she334@uwo.ca

Abstract
Due to the unbalanced melanoma data and the complexity and resolution of the melanoma image backgrounds, classifying melanoma regions is very challenging. In this paper, EfficientNet-B5 (EffNet B5) models trained with different augmentation methods, and machine learning models that take feature vectors produced by feature extraction methods as input, are proposed to solve the classification task. GLCM, LBP, SFTA, and ResNet18 are chosen as the feature extraction methods in this study, while SVM, Random Forest, and XGBoost are chosen as the machine learning classification models. In this work, the most effective feature extraction algorithm and the classification algorithm with the best AUC scores are determined.

Keywords
Melanoma, Feature Extraction, Classification, Machine Learning

1. Introduction
Melanoma occurs when DNA damage from burning or tanning due to UV radiation triggers changes (or mutations) in the melanocytes, resulting in uncontrolled cellular growth [1]. It is a widespread cancer, with 6% of people estimated to get melanoma in 2021; in addition, an estimated 7,180 people (4,600 men and 2,580 women) will die of melanoma in the U.S. in 2021 [2]. However, it can be cured if it is found early, with a five-year survival rate estimated at 99%. Therefore, using AI techniques to help people detect melanoma earlier is essential for saving millions of people's lives. In this paper, our goal is to conduct classification tasks on the melanoma data sets 2020 SIIM-ISIC Melanoma Classification [3] and 2019 SIIM-ISIC Melanoma Classification [4-6].
We aim to find the augmentation methods that improve the performance of EffNets on the binary classification task by training EffNets with both position and color augmentation methods, such as flipping, hue, and saturation. In addition, the machine learning algorithms SVM, Random Forest, and XGBoost are implemented together with four feature extraction algorithms: Gray-Level Co-Occurrence Matrix (GLCM), Local Binary Patterns (LBP), Segmentation-Based Fractal Texture Analysis (SFTA), and a Residual Neural Network (ResNet18). By combining the machine learning algorithms with these feature extraction algorithms, we determine both the combination with the best performance on the classification of the melanoma data set and the machine learning algorithm with the highest detection rate on the positive cases.

2. Related Works
Before feature extraction and classification, the image data set should be pre-processed. Mikołajczyk [7] reviewed the classical data augmentation methods, such as picture rotating, cropping, zooming, and histogram-based methods, as well as the deep learning methods, such as style transfer using Generative Adversarial Networks (GANs). The classical methods are still popular and powerful. However, they found that combining classical methods with style transfer performs better. Style transfer can generate new images with high-level image synthesis and manipulation. However, it also has some disadvantages: because style transfer is based on texture and color transfer, it may be of limited use for images in which the structures are essential.

Copyright © 2022 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Texture features are thought to be important to the classification problems, as are the color features (RGB, HSV).
Scientists also use dermoscopy features such as the ABCD rule (Asymmetry, Border, Color, and Diameter). The gray-level co-occurrence matrix (GLCM) is a popular method for extracting texture features [8-11]. Wu et al. use 14 features extracted by GLCM and 16 features extracted by CS-LBP, and they reach 84.7615% accuracy with the SVM classifier [8]. The paper "Texture and color feature extraction for classification of melanoma using SVM" [10] shows that using texture features alone may lead to poor classification performance: with the same classifier (SVM), the classification results on texture features, RGB and HSV color space features, OPP features, and texture + RGB features are 76%, 92%, 89%, and 93%, respectively. Also, Alquran et al. [9] use SVM as the classifier and GLCM + ABCD features for feature extraction, and they report a score of 92%.

Researchers have also found that a smaller set of important features can perform as well as the original feature set [12, 13]. Wiesław Paja [12] obtains a 19.07% error rate with the original 13 features. After removing four features, the accuracy is almost the same; after removing half of the features, the error rate is 22.40%. This shows that performance does not degrade much when the less important features are removed. In addition, Suleiman Mustafa [13] extracts shape, color, and geometry features and obtains 86.87% accuracy with the SVM RBF classifier. Sequential backward selection (SBS) is then used to select features, which reduces the number of features to six with performance as good as that of the original features [9, 11]. Using Principal Component Analysis (PCA) to reduce the dimension also performs well.

Feature extraction based on deep learning gives better results than traditional methods [14]. Aimin Yang compares the classical feature extraction algorithms with two CNN structures. The results of LBP, LBP_C, LBP_M, LBP_S, Xception, and DenseNet are 97.12%, 98.20%, 93.21%, 92.01%, 99.01%, and 99.16%, respectively.
Xception and DenseNet are CNN structures, and they perform better than the other algorithms [15].

Besides the feature extraction method, the choice of classifier also affects performance on the same data set [11, 15, 16]. Arslan Javaid [11] uses the same features with SVM (Medium Gaussian), Random Forest, and Quadratic Discriminant classifiers and obtains 88.17%, 93.89%, and 90.84%, respectively. Şaban Öztürk [16] compares different feature extraction methods (GLCM, LBP, LBGLCM, GLRLM, SFTA) and different classifiers (SVM, KNN, LDA, Boosted Tree). They conclude that SFTA is better (> 92%) than the other feature extraction methods and that LBP is worse (< 90%). Furthermore, SFTA with Boosted Tree achieves the best performance of 94.3%.

Moreover, Fábio Perez [17] evaluates 9 CNN architectures on 5 sets of splits created from the ISIC Challenge 2017 dataset, with 3 repeated measures. The resulting 135 models show that the correlation between the performance of CNN architectures on ImageNet and their performance on target tasks seems smaller than other researchers thought. Also, using multiple models is better than using a single model. Although choosing models with the validation set is better, picking high-performance models at random is also competitive.

3. Methodology
3.1. Image Data Set and Data Pre-Processing
The data sets 2020 SIIM-ISIC Melanoma Classification [3] and 2019 SIIM-ISIC Melanoma Classification [4-6] are publicly available on the ISIC website. The training data sets and the training ground truth tables are downloaded from the ISIC website. The ground truth tables contain the standard lesion diagnosis for every image in the data set. The training sets from the ISIC website contain 58,457 dermoscopic images in total, with 5,106 positive cases and 53,351 negative cases. Each dermoscopic image is associated with a patient identified by a distinct patient ID.
Histopathology has been used to confirm all malignant diagnoses, whereas expert agreement, longitudinal follow-up, and histopathology have all been used to confirm benign diagnoses. The data set is randomly split into three subsets: 18% for the test set, 18% for the validation set, and 64% for the training set.

3.2. EfficientNets and Augmentation
For the augmentation experiments, different augmentation methods are applied to the training set to obtain a new training set, which is then used to train the EfficientNets.

3.2.1. EfficientNets (EffNets)
Mingxing Tan et al. [18] initially proposed EfficientNets for more efficient computing while also achieving state-of-the-art 84 percent top-1 accuracy on ImageNet, by designing a new baseline network, EfficientNet-B0, and applying compound scaling to it to obtain a new family of EfficientNets. In this paper, EfficientNet-B5 models with a batch size of 16 are used for our augmentation experiments.

3.2.2. Augmentations
Data augmentation is a common strategy for addressing an intrinsic data imbalance problem [19]. It allows practitioners to dramatically increase the diversity of the data available for training through position and color augmentation techniques, without directly collecting additional data. The color and position augmentation methods we chose for the experiments are random horizontal flipping, random vertical flipping, random rotation by 90 degrees, random hue with a maximum delta of 0.01, random saturation with lower bound 0.7 and upper bound 1.3, random contrast with lower bound 0.8 and upper bound 1.2, and random brightness with a maximum delta of 0.1. In addition to the traditional augmentation methods, we apply two non-traditional augmentation methods: randomly dropping out selected patches of the images and adding fake hairs to the images.

3.3. Feature Extraction Algorithms
3.3.1. Gray Level Co-occurrence Matrix (GLCM)
The gray level co-occurrence matrix (GLCM) feature extraction algorithm is a texture-based method that characterizes the spatial relationship between pixels with specific gray levels by extracting second-order statistical features from images [20]. The spatial relationship, also known as the offset, is in its basic form defined between two horizontally adjacent pixels of an image [21]. By defining the spatial relationship along different directions and distances, one can obtain a multidimensional feature vector that describes how frequently the corresponding pixel pairs occur [16]. In this study, we converted the images from the training set to grayscale and then conducted GLCM feature extraction on each image. The GLCM features we chose are dissimilarity and correlation. The distance we chose is four offsets, and the directions we chose are 45, 60, 100, 120, 135, 180, 200, 225, 240, 300, and 340 degrees.

3.3.2. Local Binary Patterns (LBP)
The local binary patterns (LBP) feature extraction algorithm is an effective method that is robust to illumination changes. Treating an image as an examined window, the LBP algorithm divides the window into equal-sized cells. The method compares each pixel in a cell to all of its neighbors, visiting the neighbors in clockwise or counterclockwise order [22]. After completing the comparisons for every central pixel, the method generates a 256-dimensional feature vector. The feature vector is a histogram that contains the frequency of each combination of neighbors that are smaller or larger than the center pixel.
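The LBP procedure described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the basic 3x3 operator (8 neighbors at radius 1), treats the whole image as a single cell, and skips border pixels; the function name `lbp_histogram` is hypothetical.

```python
import numpy as np

def lbp_histogram(gray: np.ndarray) -> np.ndarray:
    """Return a normalized 256-bin LBP histogram of a 2-D grayscale image.

    Assumption: basic 3x3 operator, whole image treated as one cell.
    Border pixels are skipped because they lack a full neighborhood.
    """
    gray = np.asarray(gray, dtype=np.int32)
    height, width = gray.shape
    center = gray[1:-1, 1:-1]
    # Neighbor offsets (row, col), visited clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dr, dc) in enumerate(offsets):
        neighbor = gray[1 + dr: height - 1 + dr, 1 + dc: width - 1 + dc]
        # A neighbor that is >= the center contributes 2**bit to the code.
        codes += (neighbor >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()

# Usage on a random stand-in image (not real dermoscopic data).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64))
features = lbp_histogram(image)
print(features.shape)
```

Normalizing the histogram keeps feature vectors comparable across images of different sizes; a per-cell variant would compute one such histogram per cell and concatenate them.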
Mathematically, the process of labeling the pixels can be described as [16]:

$$LBP_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p \qquad (1)$$

$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \qquad (2)$$

where $g_c$ is the gray value of the central pixel, $g_p$ is the gray value of its $p$-th neighbor, $P$ is the number of neighbors, and $R$ is the radius of the neighborhood.

3.3.3. Segmentation-based Fractal Texture Analysis (SFTA)
The SFTA feature extraction algorithm decomposes the gray-level input image into a series of binary images, from which the fractal dimensions of the resulting regions are calculated to represent the segmented texture patterns of the image. After multi-level thresholding is applied to the gray-level input image, the Two-Threshold Binary Decomposition (TTBD) algorithm converts the input image into a set of binary images [16]. The SFTA algorithm then receives the binary images as input and extracts the features from them. The SFTA algorithm's mathematical definition is given as [16]:

$$\Delta(x, y) = \begin{cases} 1, & \exists (x', y') \in N_8[(x, y)] : I_b(x', y') = 0 \wedge I_b(x, y) = 1 \\ 0, & \text{otherwise} \end{cases} \qquad (3)$$

where $I_b(x, y)$ is a binary image obtained from the original gray-level image by the TTBD method and $N_8[(x, y)]$ is the set of pixels that are 8-connected to $(x, y)$.

3.3.4. Residual Neural Networks (ResNets)
A residual neural network (ResNet) is a kind of artificial neural network (ANN) based on pyramidal cell constructions in the cerebral cortex. The most common ResNet models include double-layer or triple-layer skips with direct connections in between. These connections are called skip connections and are the core of residual blocks, the stacks of layers from which a ResNet model is built. In addition to the traditional feature extraction algorithms, a pre-trained ResNet18 provided by the PyTorch library torch.utils.model_zoo [23] is also used to extract a feature vector for each image in the data set.

3.3.5. Dimensionality Reduction of the Feature Vectors
Due to the strong independence of the ResNet18 features, PCA did not perform well in terms of dimensionality reduction.
Consequently, the random projection approach is employed to reduce the dimension of the feature vectors extracted by each feature extraction algorithm. The feature vectors of the training set are reduced to 15 and 30 dimensions for experimental purposes.

3.4. Machine Learning Classifiers
3.4.1. Support Vector Machines (SVMs)
The Support Vector Machine algorithm is a supervised learning model used to solve regression and classification problems. A support vector machine creates a collection of hyperplanes in a high-dimensional space that is used for classification [24, 25]. It then uses the points nearest to the hyperplanes as support vectors to determine the optimal decision boundary, which divides the data points into two classes with the minimum error. In other words, with the labeled training data points as inputs, the SVM algorithm generates an optimal hyperplane that can categorize new sample points.

3.4.2. Extreme Gradient Boosting (XGBoost)
XGBoost is a distributed gradient boosting library for supervised learning problems that offers better computational speed than other boosting methods [26]. In this study, we applied the XGBoost classifier with gbtree as the booster.

3.4.3. Random Forest
The random forest algorithm is an ensemble learning algorithm consisting of many decision trees [27]. When growing each tree, the random forest algorithm employs bagging and feature randomness to produce an uncorrelated forest of trees whose prediction is more accurate than that of any single decision tree.

4. Results
In this study, the four feature extraction methods GLCM, pre-trained ResNet18, LBP, and SFTA are utilized. After the dimensions of the feature vectors are reduced, the new vectors are used as inputs to three machine learning models, SVM, Random Forest, and XGBoost, which are evaluated on the test and validation sets.

4.1. Evaluation Metrics
4.1.1. AUC Scores (Area Under the ROC Curve)
The Area Under the ROC Curve (AUC) is defined as the area enclosed by the ROC curve and the coordinate axes [28]. A ROC curve (receiver operating characteristic curve) is a graph showing the performance of a classification model at all classification thresholds. The curve plots two parameters: the True Positive Rate and the False Positive Rate. The True Positive Rate (sensitivity rate) is defined as

$$TPR = \frac{TP}{TP + FN} \qquad (4)$$

where TP is the number of true positives and FN is the number of false negatives [29]. An algorithm with a high sensitivity rate performs better at predicting positive cases. The False Positive Rate is defined as

$$FPR = \frac{FP}{FP + TN} \qquad (5)$$

where FP is the number of false positives and TN is the number of true negatives [29]. Each point on the ROC curve corresponds to one threshold, so the curve is traced out as the threshold is changed. The value of the area is not greater than 1. Since the ROC curve is generally above the straight line y = x, AUC scores range from 0.5 to 1. The closer the AUC score of a detection model is to 1.0, the better the model separates the two classes. However, when the AUC score of a model is 0.5, the model is not applicable to classification tasks.

4.1.2. True Negative Rate (Selectivity Rate)
The true negative rate (selectivity rate) is defined as

$$TNR = \frac{TN}{TN + FP} \qquad (6)$$

where TN is the number of true negatives and FP is the number of false positives. The true negative rate measures the performance of models in detecting the negative cases: a detection model with a higher true negative rate performs better at detecting the negative cases.

4.2. EffNets with Augmentations
The experimental results are shown in Table 1. The augmentation examples are shown in Figure 1. Several conclusions can be drawn from the augmentation experiments. Traditional augmentation methods, such as horizontal flipping and saturation, can slightly improve the model performance.
However, changing the hue of the images might degrade the performance of the model: on the test set, the AUC score with hue augmentation is 0.0024 lower than the AUC score of the model without any augmentation. Therefore, augmentations might not always lead to model improvement. The two non-traditional augmentations, adding hair to the image and dropping selected patches from the image, were more effective than the traditional augmentation methods in improving the performance of EffNets: they have the two highest AUC scores on the validation set.

Table 1
The AUC scores of EfficientNet

Augmentation Method         Test Set    Validation Set
Rotation with 90 Degrees    0.9502      0.9177
Horizontal Flipping         0.9411      0.9148
Contrast                    0.9450      0.9176
Saturation                  0.9439      0.9120
Hue                         0.9368      0.9198
Brightness                  0.9422      0.9234
Hair Faking                 0.9461      0.9241
Drop Out                    0.9501      0.9237
Without Augmentation        0.9392      0.9091

Figure 1: Augmentation Examples

4.3. Machine Learning Classifiers
The AUC scores of the experiments are shown in Table 2. The following findings are made by evaluating the outcomes achieved with the various feature extraction methods and classifiers.
• Increasing the number of features extracted by GLCM and ResNet18 can improve the AUC scores of the models.
• In contrast, increasing the number of features in LBP and SFTA may not improve the models' AUC scores. The additional features may not be better at representing the entire picture in this data set.
• GLCM, on the other hand, has the highest AUC scores among the four feature extraction methods tested on the experimental data.
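The comparison reported in Table 2 follows a reduce-then-classify pattern that can be sketched as below. This is only an illustration under stated assumptions, not the authors' code: the paper does not name a library, so scikit-learn is assumed; `GaussianRandomProjection` stands in for the unspecified random projection variant; the feature matrix is synthetic; hyperparameters are defaults; and the helper name `evaluate` is hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.random_projection import GaussianRandomProjection
from sklearn.svm import SVC

def evaluate(features, labels, n_components, seed=0):
    """Project features to n_components dimensions, fit each classifier,
    and return {classifier name: AUC on the held-out split}."""
    x_train, x_val, y_train, y_val = train_test_split(
        features, labels, test_size=0.18, stratify=labels, random_state=seed)
    # Fit the projection on the training split only, then reuse it.
    projector = GaussianRandomProjection(
        n_components=n_components, random_state=seed)
    x_train = projector.fit_transform(x_train)
    x_val = projector.transform(x_val)
    models = {
        "SVM": SVC(probability=True, random_state=seed),
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(x_train, y_train)
        # AUC is computed from the predicted probability of the positive class.
        positive_prob = model.predict_proba(x_val)[:, 1]
        scores[name] = roc_auc_score(y_val, positive_prob)
    return scores

# Synthetic stand-in for extracted feature vectors (about 10% positives).
rng = np.random.default_rng(0)
labels = (rng.random(600) < 0.1).astype(int)
features = rng.normal(size=(600, 256)) + 0.5 * labels[:, None]

for k in (15, 30):
    print(k, evaluate(features, labels, k))
```

An XGBoost model, for example `xgboost.XGBClassifier(booster="gbtree")`, could be added to the `models` dictionary in the same way; it is left out here only to keep the sketch dependent on a single library.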
Table 2
The AUC scores of Different Classifiers with Different Feature Extraction Methods

Classifier   Feature Extraction   # of Features   Test     Validation
SVM          GLCM                 15              0.738    0.7199
SVM          GLCM                 30              0.7326   0.7801
SVM          LBP                  15              0.7699   0.7808
SVM          LBP                  30              0.7572   0.7933
SVM          SFTA                 15              0.7918   0.8127
SVM          SFTA                 30              0.8042   0.8196
SVM          ResNet18             15              0.7225   0.7073
SVM          ResNet18             30              0.7313   0.712
RF           GLCM                 15              0.6122   0.6842
RF           GLCM                 30              0.6425   0.7506
RF           LBP                  15              0.6521   0.7463
RF           LBP                  30              0.649    0.7704
RF           SFTA                 15              0.6503   0.6875
RF           SFTA                 30              0.5283   0.6034
RF           ResNet18             15              0.6308   0.6764
RF           ResNet18             30              0.6839   0.6758
XGBoost      GLCM                 15              0.6755   0.671
XGBoost      GLCM                 30              0.7206   0.803
XGBoost      LBP                  15              0.7347   0.7466
XGBoost      LBP                  30              0.7112   0.7868
XGBoost      SFTA                 15              0.6734   0.7122
XGBoost      SFTA                 30              0.6975   0.7429
XGBoost      ResNet18             15              0.678    0.696
XGBoost      ResNet18             30              0.6845   0.6979

4.4. Failure Modes Analysis
Since the models with 30 features have better AUC scores, the models that use thirty features from each feature extraction method are examined on the validation set. The validation set contains 10,525 images, including 908 positive cases and 9,617 negative cases. The predicted results of the algorithms range between 0 and 1. The sensitivity rates are shown in Table 3 and the selectivity rates are shown in Table 4. We carefully examined the distributions of the predictions and chose 0.25 as the threshold for deciding whether a predicted result detects a positive case: a predicted value lower than or equal to 0.25 is considered a failure to detect a positive case.

Table 3
The Sensitivity Rates of the Models on the Validation Set

           GLCM     LBP      SFTA     ResNet-18
RF         0.2192   0.9317   0.6101   0.5903
SVM        0.0518   0.4020   0.3425   0.0947
XGBoost    0.2037   0.4725   0.3139   0.1938

Table 4
The Selectivity Rates of the Models on the Validation Set

           GLCM     LBP      SFTA     ResNet-18
RF         0.2655   0.7567   0.7682   0.9337
SVM        0.9071   0.8978   0.9625   0.9799
XGBoost    0.9540   0.9150   0.9546   0.9540

Among the twelve combinations of feature extraction algorithm and machine learning model, the combination with the highest sensitivity rate, 0.9317, is the random forest model with the GLCM method. However, this combination does not have the highest AUC score.
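The thresholding rule used for this analysis can be sketched in a few lines of NumPy. The example is illustrative only: the scores below are made-up values rather than model outputs, and the function name `rates_at_threshold` is hypothetical. It returns the true positive rate and the true negative rate as defined in Section 4.1, counting a score as a positive prediction only when it is strictly greater than the threshold.

```python
import numpy as np

def rates_at_threshold(y_true, y_score, threshold=0.25):
    """Return (true positive rate, true negative rate) at a score threshold."""
    y_true = np.asarray(y_true).astype(bool)
    # Scores lower than or equal to the threshold count as negative predictions.
    y_pred = np.asarray(y_score) > threshold
    tp = np.sum(y_pred & y_true)
    fn = np.sum(~y_pred & y_true)
    tn = np.sum(~y_pred & ~y_true)
    fp = np.sum(y_pred & ~y_true)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up labels and scores: 3 positive cases and 5 negative cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.3, 0.2, 0.1, 0.6, 0.05, 0.25, 0.0]
tpr, tnr = rates_at_threshold(y_true, y_score)
print(f"TPR={tpr:.3f}, TNR={tnr:.3f}")  # TPR=0.667, TNR=0.800
```

Lowering the threshold trades true negatives for true positives, which is why a combination can score high on one rate and low on the other.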
In addition, the combination with the highest selectivity rate, 0.9799, is the SVM model with ResNet-18. Notably, the combination with the highest sensitivity rate has a selectivity rate of only 0.2655, while the combination with the highest selectivity rate has a sensitivity rate of only 0.0518. By comparing the AUC scores of the combinations, we found that even though the random forest algorithm has the lowest AUC scores, it performs best at correctly detecting the positive cases. Also, a model that performs well at detecting positive cases might fail to detect negative cases.

5. Conclusion
In this paper, four feature extraction methods are used: GLCM, pre-trained ResNet18, LBP, and SFTA. SVM, Random Forest, and XGBoost are used to perform classification tasks on the test and validation sets using the feature vectors from these feature extraction techniques. In addition, EfficientNet-B5 models are implemented with different augmentation methods. According to the experimental results, augmentation methods generally improve the performance of EffNets; the hair faking method has the largest AUC score on the validation set, and the rotation method has the largest AUC score on the test set. However, randomly changing the hue of images fails to improve the models, as it might blur the complexities of the images. For the machine learning models with different feature extraction methods, the SVM algorithm with SFTA features has the best AUC scores, whereas the random forest algorithm performs better at detecting positive cases. Since the data set is significantly unbalanced, it needs to be rebalanced to achieve higher sensitivity rates in future work. The SMOTE approach can be used to balance the positive and negative cases. Meanwhile, since EffNets have higher AUC scores, other deep learning models with transfer learning approaches can also be implemented to improve the accuracy of predictions.

6. References
[1] Curtin, J. A., Fridlyand, J., Kageshita, T., Patel, H. N., Busam, K. J., Kutzner, H., Cho, K. H., Aiba, S., Bröcker, E. B., LeBoit, P. E., Pinkel, D., Bastian, B. C. (2005) Distinct sets of genetic alterations in melanoma. The New England Journal of Medicine, 353(20): 2135–2147. https://doi.org/10.1056/NEJMoa050092
[2] American Cancer Society. (2021) Melanoma skin cancer: Understanding melanoma. https://www.cancer.org/cancer/melanomaskin-cancer.html.
[3] Rotemberg, V., Kurtansky, N., Betz-Stablein, B., Caffery, L., Chousakos, E., Codella, N., Combalia, M., Dusza, S., Guitera, P., Gutman, D., Halpern, A., Helba, B., Kittler, H., Kose, K., Langer, S., Liopyris, K., Malvehy, J., Musthaq, S., Nanda, J., Reiter, O., Shih, G., Stratigos, A., Tschandl, P., Weber, J., Soyer, P. (2021) A patient-centric dataset of images and metadata for identifying melanomas using clinical context. Scientific Data, 8(1): 1-8.
[4] Tschandl, P., Rosendahl, C., Kittler, H. (2018) The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Scientific Data, 5(1): 1-9.
[5] Codella, N. C., Gutman, D., Celebi, M. E., Helba, B., Marchetti, M. A., Dusza, S. W., ... Halpern, A. (2018) Skin lesion analysis toward melanoma detection: A challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), hosted by the International Skin Imaging Collaboration (ISIC). In: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018). pp. 168-172.
[6] Combalia, M., Codella, N. C. F., Rotemberg, V., Helba, B., Vilaplana, V., Reiter, O., Halpern, A. C., Puig, S., Malvehy, J. (2019) BCN20000: Dermoscopic Lesions in the Wild. https://arxiv.org/pdf/1908.02288.pdf
[7] Mikołajczyk, A., Grochowski, M. (2018) Data augmentation for improving deep learning in image classification problem. In: International Interdisciplinary PhD Workshop (IIPhDW). Poland. pp. 117-122.
[8] Wu, X., et al. (2018) Hyphae Detection in Fungal Keratitis Images with Adaptive Robust Binary Pattern. IEEE Access, 6: 13449-13460.
[9] Alquran, H., et al. (2017) The melanoma skin cancer detection and classification using support vector machine. In: IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT). Aqaba, Jordan. pp. 1-5.
[10] Kavitha, J. C., Suruliandi, A. (2016) Texture and color feature extraction for classification of melanoma using SVM. In: International Conference on Computing Technologies and Intelligent Data Engineering (ICCTIDE'16). Kovilpatti, India. pp. 1-6.
[11] Javaid, A., Sadiq, M., Akram, F. (2021) Skin Cancer Classification Using Image Processing and Machine Learning. In: International Bhurban Conference on Applied Sciences and Technologies (IBCAST). Islamabad, Pakistan. pp. 439-444.
[12] Paja, W., Wrzesień, M. (2013) Melanoma important features selection using random forest approach. In: 6th International Conference on Human System Interactions (HSI). Sopot, Poland. pp. 415-418.
[13] Mustafa, S., Kimura, A. (2018) A SVM-based diagnosis of melanoma using only useful image features. In: International Workshop on Advanced Image Technology (IWAIT). Chiang Mai, Thailand. pp. 1-4.
[14] Yang, A., Yang, X., Wu, W., Liu, H., Zhuansun, Y. (2019) Research on Feature Extraction of Tumor Image Based on Convolutional Neural Network. IEEE Access, 7: 24204-24213.
[15] He, K., Zhang, X., Ren, S., Sun, J. (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 770-778.
[16] Öztürk, Ş., Akdemir, B. (2018) Application of feature extraction and classification methods for histopathological image using GLCM, LBP, LBGLCM, GLRLM and SFTA. Procedia Computer Science, 132: 40-46.
[17] Perez, F., Avila, S., Valle, E. (2019) Solo or ensemble? Choosing a CNN architecture for melanoma classification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Long Beach, CA, USA. pp. 2775-2783.
[18] Tan, M., Le, Q. V. (2020) EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. https://arxiv.org/pdf/1905.11946.pdf
[19] Mikołajczyk, A., Grochowski, M. (2018) Data augmentation for improving deep learning in image classification problem. In: International Interdisciplinary PhD Workshop (IIPhDW). Poland. pp. 117-122.
[20] Haralick, R. M., Shanmugam, K., Dinstein, I. H. (1973) Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics, (6): 610-621.
[21] Mohanaiah, P., Sathyanarayana, P., GuruKumar, L. (2013) Image texture feature extraction using GLCM approach. International Journal of Scientific and Research Publications, 3(5): 1-5.
[22] Ojala, T., Pietikainen, M., Harwood, D. (1994) Performance evaluation of texture measures with classification based on Kullback discrimination of distributions. In: Proceedings of 12th International Conference on Pattern Recognition. Jerusalem. Vol. 1, pp. 582-585.
[23] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., ... Chintala, S. (2019) PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, 32: 8026-8037.
[24] Vapnik, V. (1998) The support vector method of function estimation. In: Nonlinear Modeling. Springer, Boston, MA. pp. 55-85.
[25] Vapnik, V. (1995) The Nature of Statistical Learning Theory. Springer-Verlag, New York.
[26] Chen, T., Guestrin, C. (2016) XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM. pp. 785–794. https://doi.org/10.1145/2939672.2939785
[27] Ho, T. K. (1995) Random decision forests. In: Proceedings of 3rd International Conference on Document Analysis and Recognition. Vol. 1, pp. 278–282.
[28] Hand, D. J. (2009) Measuring classifier performance: a coherent alternative to the area under the ROC curve. Machine Learning, 77(1): 103-123.
[29] Sultana, N., Puhan, N., Mandal, B. (2018) DeepPCA Based Objective Function for Melanoma Detection. In: International Conference on Information Technology (ICIT). Bhubaneswar, India. pp. 68-72.