Recognition of Skin Diseases and Exanthema with Deep Learning Techniques

Antonio Crinieri1, Luca Terzi1, Francesco Ruggeri2[0000−0002−5827−2565], Ricardo Anibal Matamoros Aragon2[0000−0002−1957−2530], Francesco Epifania2, and Luca Marconi2

1 B4 Service SRL, Italy
{antonio.crinieri,luca.terzi}@b4service.it
2 Social Things SRL, Italy
{francesco.ruggeri,ricardo.matamoros,francesco.epifania,luca.marconi}@socialthingum.com

Abstract. This study addresses the recognition of dermatological and exanthematic diseases with deep learning techniques, in order to diagnose malignant diseases at an early stage and, more generally, to bring the pathology identified by the models to the attention of the person. A fundamental part of the research was the study of the state of the art, and this paper reports the studies considered most relevant. Two models are presented: the Convolutional Network Disease (CND) model and the CND-InceptionV3 model, which uses the transfer learning technique. These two models were used in an experimental phase that analysed the performance achievable on the ISIC archive dataset. The paper then describes the work carried out to improve the dataset by associating syntactic-semantic information with it. Finally, the last section draws conclusions from the values obtained and outlines future developments that could improve the performance of the reported models.

Keywords: Transfer Learning · Skin Diseases · Convolutional Neural Networks.

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
1 State-of-the-art analysis

1.1 Dermatologist-level classification of skin cancer with deep NN

The aim of this study [1] is to diagnose skin cancer, first by visual recognition and then by medical techniques such as biopsy or histopathological examination. As reported in the article, automatic recognition in this field is very complex because skin lesions vary widely in appearance. The performance of the model was certified by dermatologists. Transfer learning was used, specifically the GoogleNet Inception V3 convolutional network, pre-trained on 1.28 million images from the 2014 ImageNet Large Scale Visual Recognition Challenge. The training phase used a dataset of about 130 thousand clinical images of 2000 different pathologies. The reported accuracy is 72%.

1.2 Computer-Aided Clinical Skin Disease Diagnosis Using CNN and Object Detection Models

This study [2] uses two datasets of images collected from the web: SKIN-10, which contains 10000 images of 10 diseases, and SKIN-100, which contains 20000 images of 100 diseases. They achieved an accuracy of 79%. Two methodologies were applied: the first uses SVM, KNN and decision tree models, and the second uses deep learning techniques. The results show that the second performs better. Classification is done by a model called Ensemble Net, which combines four pre-trained models: ResNet50, DenseNet121, Nasnetmobile and Pnasnet5large.

1.3 Skin Disease Recognition Method Based On Image Color And NN

This study [3] uses the ISIC 2018 dataset, created for the workshop “Skin lesion analysis towards melanoma detection”. It covers several dermatological diseases: melanoma (1113 images), melanocytic nevus (6705 images), basal cell carcinoma (514 images), actinic keratosis (327 images), benign keratosis (1099 images), dermatofibroma (115 images) and vascular lesions (142 images).
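The class counts listed above are strongly skewed. A short calculation makes the imbalance explicit; it uses only the figures quoted in this section, and the variable names are illustrative rather than taken from the cited study.

```python
# Image counts per class, as quoted above for the ISIC 2018 dataset.
counts = {
    "melanoma": 1113,
    "melanocytic nevus": 6705,
    "basal cell carcinoma": 514,
    "actinic keratosis": 327,
    "benign keratosis": 1099,
    "dermatofibroma": 115,
    "vascular": 142,
}

total = sum(counts.values())

# Share of the dataset taken by each class.
shares = {name: n / total for name, n in counts.items()}

# Ratio between the largest and the smallest class.
imbalance_ratio = max(counts.values()) / min(counts.values())

for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<22} {n:>5} {shares[name]:>6.1%}")
print(f"total = {total}, largest/smallest = {imbalance_ratio:.1f}")
```

With these figures the nevus class accounts for roughly two thirds of the images, and the largest class is about 58 times the size of the smallest.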
Three pre-trained CNNs were used: ResNet 50, DenseNet121 and MobileNet. The authors do not report the accuracy achieved by the model.

1.4 Studies on Different CNN Algorithms for Face Skin Disease Classification Based on Clinical Images

This study [4] refers to the Xiangya Derm dataset, the largest Chinese dataset of dermatological disease images, containing 2656 images of faces with six common diseases: seborrheic keratosis, actinic keratosis, rosacea, lupus erythematosus, basal cell carcinoma and squamous cell carcinoma. Five pre-trained CNNs were used: ResNet-50, Inception-v3, DenseNet121, Xception and Inception-ResNet-v2. Initially, the five models were trained on facial images only; the Inception-ResNet-v2 model achieved the highest performance (recall 67.2% and accuracy 63.7%). Transfer learning was then applied: the model was pre-trained on images of other parts of the body, and its parameters were used as the initial parameters for the new model. Of the two models, the second obtained better results (recall 77% and accuracy 70.8%).

2 Description of context

Dermatological diseases have a serious impact on people’s lives and health. They are also very common; examples include dermatitis, infections caused by bacteria and viruses, parasitic skin infections, mycosis, acne, rosacea, benign tumours and malignancies. Many of these diseases are the expression of internal or endocrine causes in the human organism, and these features are widely studied and taken into account in determining the occurrence of skin diseases. The particular nature of these diseases is the reason why this domain has more than a century of research behind it.

In particular, this paper aims to present an efficient approach to the recognition of these diseases, based on methodologies from the deep learning domain.
In addition, the next section reports the structure of the custom model and describes a model known in the literature, both designed to address disease recognition and classification through image analysis.

3 Models description: CND and CND-InceptionV3

A convolutional neural network (CNN), shown in Figure 1, is one of the most common algorithms for deep learning, a type of machine learning in which a model learns to perform classification tasks directly from images, videos, text or sounds. CNNs are particularly useful for finding patterns in images and for recognising objects, faces and scenes. They learn directly from image data, using patterns to classify images and eliminating the need for manual feature extraction. Two models using this technology have been implemented.

The first approach was to define the structure of a custom model called Convolutional Network Disease (CND), taking inspiration from the state of the art [5] and from the CNN structure found in several image classification problems. The model was built using the most well-known layers in this field:

– Convolutional layer: a convolution passes a filter over an image, processes it and extracts features from it. Applying the filter highlights the details of the image, which makes it possible to identify the objects it contains.
– Pooling layer: the pixels are grouped and reduced to a subset. There are different variants of pooling, one of which is max pooling: the image is divided into blocks of 2x2 pixels and only the highest value in each block is kept. Applying this filter reduces the size to a quarter of the original while maintaining the original characteristics.
– Fully connected layer: the output from the convolutional layers represents high-level features in the data.
While that output could be flattened and connected to the output layer, adding a fully connected layer allows non-linear combinations of these features to be learned [6].

Fig. 1. Generic representation of the structure of a convolutional neural network

The second approach uses transfer learning, a technique that takes knowledge gained while solving one problem and applies it to a different but related problem [7]. The Inception V3 network was used [7], pre-trained on the ImageNet database, which contains more than 14 million images of 20 thousand different categories; the resulting model is called CND-InceptionV3.

4 Experimental stage

The dataset used for this study is the ISIC archive [8]. Only three categories of dermatological and exanthematous diseases were considered: nevus (5000 images), pigmented benign keratosis (2000 images) and melanoma (1000 images). This choice was driven by the lack of balance in the available dataset. The images were pre-processed, resized to 200x200 pixels and then split into training and testing sets.

5 Results of the experiment

The metrics used to validate the models are accuracy (the percentage of correct classifications) and loss (a number measuring the error between the value the model returns and the correct value). The first model (CND) achieved 75% accuracy and a loss of 0.55 in the training phase, and 72% accuracy and a loss of 0.64 in the validation phase (Figure 2). The second model, CND-InceptionV3, performed very well in the training phase (100% accuracy, 0.001 loss) but poorly in the validation phase, with an accuracy of 60% and a loss of 2.5 (Figure 3); its validation performance also fluctuates. This model overfits: it fits the training data too well and loses generality, so it is perfect in the training phase but makes many errors in the validation phase. Table 1 shows the values obtained in the test phase.
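The accuracy and loss metrics defined in Section 5 can be made concrete with a minimal sketch. It assumes that the loss is the categorical cross-entropy, which the paper does not state explicitly; the predicted probabilities, the labels and the class index mapping are invented for illustration.

```python
import math


def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class is the true label."""
    correct = 0
    for p, y in zip(probs, labels):
        predicted = max(range(len(p)), key=p.__getitem__)
        if predicted == y:
            correct += 1
    return correct / len(labels)


def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += -math.log(max(p[y], eps))  # eps guards against log(0)
    return total / len(labels)


# Hypothetical model outputs for three classes
# (0 = nevus, 1 = pigmented benign keratosis, 2 = melanoma).
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],  # true class is 2: a confident mistake
    [0.2, 0.2, 0.6],
]
labels = [0, 1, 2, 2]

acc = accuracy(probs, labels)
loss = cross_entropy(probs, labels)
print(f"accuracy = {acc:.2f}, loss = {loss:.3f}")
```

In this toy example the single misclassified sample contributes most of the loss, because the true class received a probability of only 0.1. This is one way a high loss can coexist with a moderate accuracy.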
Model             Accuracy   Loss
CND-InceptionV3   17%        23.5
CND               75%        0.53

Table 1. Results of analysed models.

Fig. 2. Graphs illustrating the accuracy and loss of the CND model

6 Medical reports analysis

A subsequent development was the engineering of a parser to analyse the medical reports of patients with skin diseases. A first implementation analyses the fields of the blood analysis report and extracts information useful for generating metadata to be associated with the patients’ images. This parser makes it possible to build a better dataset for training the described models, since it adds semantic information describing the content of the images under analysis. This information can be used together with the images in the dataset because deep learning techniques allow the creation of hybrid models, i.e. the introduction of semantic content in the training phase of the model [9] [10].

Fig. 3. Graphs illustrating the accuracy and loss of the CND-InceptionV3 model

7 Conclusions

The problem addressed in this paper is complex because it belongs to a large and active research domain, and it is further complicated by the lack of a dataset suitable for training models that achieve acceptable performance on the task at hand. The two models and approaches used in this study, although constrained by the limited dataset, reach a level of performance suggesting that a more accurate dataset would make it possible to improve them. In conclusion, deep learning techniques remain, both in the literature and according to the values obtained, the best approach to continue using. In addition, a hybrid model is planned that will make use of the textual content produced by the analysis phase implemented with the proposed parser algorithm.

References

1.
Esteva, A., Kuprel, B., Novoa, R., Ko, J., Swetter, S., Blau, H., and Thrun, S., 2017. “Dermatologist-level classification of skin cancer with deep neural networks”. Nature, 542, 01.
2. He, X., He, X., Wang, S., Shi, S., Tang, Z., Wang, Y., Zhao, Z., Dai, J., Ni, R., Zhang, X., Liu, X., Wu, Z., Yu, W., and Chu, X., 2019. “Computer-aided clinical skin disease diagnosis using CNN and object detection models”. 2019 IEEE International Conference on Big Data (Big Data), pp. 4839–4844.
3. Kshirsagar, P., 2020. “Skin disease recognition method based on image color and neural network”.
4. Wu, Z., Zhao, S., Peng, Y., He, X., Zhao, X., Huang, K., Wu, X., Fan, W., Li, F., Chen, M., Li, J., Huang, W., Chen, X., and Li, Y., 2019. “Studies on different CNN algorithms for face skin disease classification based on clinical images”. IEEE Access, 7, pp. 66505–66511.
5. Krizhevsky, A., Sutskever, I., and Hinton, G. E., 2017. “ImageNet classification with deep convolutional neural networks”.
6. Lin, Z., Memisevic, R., and Konda, K., 2015. “How far can we go without convolution: Improving fully-connected networks”.
7. Yang, Z., and Liu, Z., 2020. “The risk prediction of Alzheimer’s disease based on the deep learning model of brain 18F-FDG positron emission tomography”. Saudi Journal of Biological Sciences, 27(2), pp. 659–665.
8. Center, U. D. The international skin imaging collaboration.
9. Liu, K., Kang, G., Zhang, N., and Hou, B., 2018. “Breast cancer classification based on fully-connected layer first convolutional neural networks”. IEEE Access, 6, pp. 23722–23732.
10. Liu, Y., Jain, A., Eng, C., Way, D., Lee, K., Bui, P., Kanada, K., de Oliveira Marinho, G., Gallegos, J., Gabriele, S., Gupta, V., Singh, N., Natarajan, V., Peng, L., Webster, D., Ai, D., Huang, S., Liu, Y., Dunn, C., and Coz, D. D., 2020. “A deep learning system for differential diagnosis of skin diseases”. Nature Medicine.