Implementing Vision Transformers in Dermatological Practice: A Web Application for Melanoma Screening

Daniele Sirico1,†, Giuseppe Accardo1,† and Valentina Esposito1,*

1 Data Jam srl, Centro Direzionale Isola F8, Via F. Lauria, Naples, 80143, Italy

Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
* Corresponding author.
† These authors contributed equally.
Email: dg.sirico@almaviva.it (D. Sirico); gi.accardo@almaviva.it (G. Accardo); v.esposito@almaviva.it (V. Esposito)
ORCID: 0000-0002-2760-9209 (D. Sirico)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073.

Abstract

In this study, we introduce a web-based application designed to improve melanoma detection accuracy using a Vision Transformer (ViT) model. The application processes dermatological images to identify potential melanomas, with the aim of improving on the precision of conventional screening methods. The development process involved fine-tuning the ViT model on a diverse dataset of dermatoscopic images to promote robustness and reliability across a wide range of images. The web application is designed to be intuitive, allowing easy access and use by dermatologists and potentially by the general public for preliminary screening purposes. This research not only underscores the viability of ViT models in medical imaging but also offers a practical tool for early melanoma detection, thereby contributing to better clinical outcomes and facilitating early treatment interventions.

Keywords

AI, Melanoma, Computer Vision, Vision Transformer

1. Introduction

The early diagnosis of melanoma, a highly aggressive form of skin cancer, is crucial for improving patient outcomes [1, 2, 3]. When detected at an early stage, melanoma can often be treated effectively, significantly reducing mortality rates [1]. However, the challenge lies in the timely and accurate identification of potential melanomas among a vast array of skin lesions, which requires a high level of expertise and experience. In this context, the application of artificial intelligence (AI) in medical imaging has emerged as a major advancement [4, 5]. AI, particularly deep learning models, has shown remarkable success in enhancing the accuracy and efficiency of diagnostic processes in various medical fields [6, 7], including dermatology [8].

The integration of AI into medical imaging for melanoma detection allows for the analysis of dermatological images with a level of detail and precision that surpasses human capability [9]. This not only aids dermatologists in making more informed decisions but also has the potential to democratize access to high-quality diagnostic services, especially in under-resourced areas.

Furthermore, the advent of user-friendly web applications for medical purposes represents a significant step forward. Applications built on platforms like Streamlit offer an accessible interface for both medical professionals and the general public, facilitating widespread screening and awareness. Such web applications can turn smartphones and personal computers into tools for preliminary screening, empowering individuals to seek professional advice at the earliest suspicion of melanoma. This approach to healthcare leverages the ubiquity of internet-connected devices to bridge the gap between advanced diagnostic technologies and end-users [10]. The ease of use and accessibility of these applications are critical factors in their adoption and effectiveness, enabling timely intervention and potentially saving lives. In sum, the convergence of AI in medical imaging and user-friendly web applications marks a pivotal moment in the fight against melanoma, offering new horizons for early detection and treatment.

To date, the ABCDE method is the standard approach used by medical professionals for the diagnosis of melanoma, emphasizing the evaluation of lesions based on Asymmetry, Border irregularity, Color variation, Diameter larger than 6 mm, and Evolution. Despite its widespread adoption and utility in raising awareness, this method's subjective nature can lead to variability in diagnostic accuracy, potentially overlooking early-stage melanomas or prompting unnecessary biopsies of benign lesions [1]. In response, artificial intelligence, especially through deep learning algorithms, offers a more objective and reproducible analysis than these traditional methods. By incorporating AI-driven diagnostic capabilities into a user-friendly web application, the project aims not only to enhance diagnostic precision but also to make sophisticated melanoma screening tools accessible to a wider population. This initiative marks a step forward in improving early detection rates and patient outcomes by bridging the gap between traditional diagnostic methods and the potential of modern technology [11, 12].

In this study, we have developed a dataset derived from the ISIC (International Skin Imaging Collaboration) Challenge datasets of 2019 and 2020. Using this dataset, we have fine-tuned the pre-trained Vision Transformer (ViT) Large model provided by Google [13], with the specific objective of classifying dermatological images into two distinct categories: "Melanoma" and "No Melanoma". While the ViT model inherently possesses a broad capability for image recognition owing to its extensive initial training on diverse data, our fine-tuning process adapts it to the specific features of skin lesion images and improves its accuracy and sensitivity on them. This targeted refinement improves the model's diagnostic precision, making it an effective tool for distinguishing between melanoma and non-melanoma cases. By tailoring the ViT architecture to the nuances of dermatological conditions, we aim to advance the field of medical imaging and offer a more accurate, AI-driven approach to melanoma detection.

2. Dataset Preparation

In this work, we have constructed a dataset starting from the data of the ISIC Challenges of 2019 [14, 15, 16] and 2020 [17], with which we fine-tuned Google's pre-trained Vision Transformer (ViT) Large model, aiming to distinguish between two classes: "Melanoma" and "No Melanoma". The model has been specifically adapted to the characteristics of dermatological images. Although the network already possesses a broad capacity for recognition due to its prior training, fine-tuning allows us to refine its precision and sensitivity to the peculiarities of skin lesions, thereby improving the accuracy of diagnoses.

For the dataset preparation, with the objective of achieving balanced classes between Melanoma and No Melanoma, images were extracted from ISIC 2019 and ISIC 2020. The resulting dataset comprises 8,223 training images (3,890 No Melanoma, 4,333 Melanoma) and 1,450 test images. Combining the two sources significantly increases the sample size, especially the number of images within the Melanoma class of interest. Specifically, the "Melanoma" class draws on images from both datasets, while the "No Melanoma" class was formed using images exclusively from ISIC 2019. This approach yields a more equitable distribution of classes, enhancing the model's ability to learn and accurately differentiate between Melanoma and No Melanoma, which is crucial for the model's performance in real-world diagnostic applications.

In addition to the dataset preparation from ISIC 2019 and ISIC 2020, the fine-tuned model was subsequently tested on an entirely different dataset, MEDNODE [18], to assess its generalization capability and performance in real-world scenarios. This step was crucial for validating the effectiveness of our fine-tuning process and checking that the model could classify melanoma across diverse datasets. The MEDNODE dataset, distinct in its composition and image characteristics, provided a challenging environment to evaluate the model's robustness and adaptability. This further testing underscores the model's potential for application in a wide range of clinical settings: the model maintains high precision in detecting melanoma even when confronted with data significantly different from that on which it was trained, although with reduced sensitivity (see Section 3.3). This cross-dataset validation is a critical aspect of our research, supporting the model's utility as a tool in the early detection of melanoma.

3. Fine Tuning and Validation

3.1. Fine Tuning

The concept of transfer learning, applied to the context of melanoma detection using Google's pre-trained Vision Transformer (ViT) Large model, leverages a model developed for general vision tasks and adapts it to the specific challenge of identifying melanoma in dermatological images. This approach benefits from the original model's learning capabilities, significantly reducing the time and resources needed compared with training from scratch. Thanks to its prior training, the model can be optimized to recognize the specific features of melanoma with greater efficiency.

During the fine-tuning process, the pre-trained ViT Large model from Google is further optimized for melanoma detection. In this phase, the model is specifically adapted to the characteristics of dermatological images. Although the network already has a broad recognition capability thanks to its previous training, fine-tuning allows for the refinement of its precision and sensitivity to the peculiarities of skin lesions, thus enhancing diagnostic accuracy.

The development environment used for training the model was an AWS p3.2xlarge instance with the following specifications: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz, 64GB RAM, and a Tesla V100-SXM2 graphics card with 16GB of VRAM.

For the inference phase, no dedicated GPU hardware is required. This flexibility in hardware requirements for inference means that the fine-tuned model can be deployed in a wide range of environments, making it accessible for clinical use without the need for high-performance computing resources.

3.2. Validation on ISIC Test Set

The results obtained from the model on the test set, as detailed in Table 1, show that it distinguishes well between melanoma and non-melanoma lesions, with substantial precision, recall, and F1-score values across both categories. For the "No Melanoma" class, precision was 0.86, with a recall of 0.83 and an F1-score of 0.85, across a support of 677 cases. These values indicate a good balance between precision and recall for non-melanoma instances, reflecting the model's efficiency in minimizing false positives while effectively recognizing true negatives.

Table 1: Performance Evaluation of Skin Lesion Classification on the ISIC Test Set. This table presents a detailed breakdown of classification metrics for distinguishing between melanoma and non-melanoma skin lesions. Each class is evaluated based on precision, recall, F1-score, and support.

Class          Precision   Recall   F1-Score   Support
No Melanoma    0.86        0.83     0.85       677
Melanoma       0.86        0.88     0.87       773

Conversely, the "Melanoma" class achieved precision and recall scores of 0.86 and 0.88, respectively, with an F1-score of 0.87 over 773 instances. These metrics highlight the model's capability in reliably detecting melanoma lesions, with the higher recall indicating a particular strength in reducing false negatives, which is crucial for melanoma screening, where the cost of missing a positive diagnosis is exceedingly high.

The consistency in precision across both classes, paired with the balanced F1-scores, attests to the model's robustness, suggesting its potential as a dependable tool in the diagnostic toolkit. Such performance, detailed in Table 1, affirms the significant role of advanced AI models, such as the Vision Transformer, in dermatological diagnostics.

Further insight into the model's performance is provided by Figure 1, which depicts the confusion matrix and the ROC curve for the model's predictions. The confusion matrix illustrates the model's accuracy in classifying the test cases, showing the true positives and true negatives alongside the false positives and false negatives. The ROC curve further shows the model's diagnostic ability across different thresholds, that is, how it balances sensitivity and specificity. Together, Table 1 and Figure 1 offer a comprehensive overview of the model's diagnostic performance, highlighting its potential to improve melanoma screening and diagnosis through the integration of AI technologies into clinical practice, thereby enhancing diagnostic accuracy and facilitating improved patient care.

Figure 1: Diagnostic Performance of the Vision Transformer Model for Melanoma Detection on the ISIC test set. On the left, the confusion matrix illustrates the model's accuracy in classifying 1: 'Melanoma' and 0: 'No Melanoma' cases, with the number of true positives, true negatives, false positives, and false negatives. On the right, the ROC (Receiver Operating Characteristic) curve.

3.3. Validation on MEDNODE Dataset

The model's performance on the MEDNODE dataset, as detailed in Table 2, reflects its diagnostic precision in distinguishing between melanoma and non-melanoma cases. With a precision of 0.83 and a recall of 0.98 for "No Melanoma", the model correctly identifies nearly all non-melanoma cases, as evidenced by an F1-score of 0.90 across 100 instances. This high recall means that very few benign lesions are misclassified as melanoma, which is vital for avoiding unnecessary further testing and anxiety for patients.

Table 2: Performance Evaluation of Skin Lesion Classification on the MEDNODE Dataset. This table presents a detailed breakdown of classification metrics for distinguishing between melanoma and non-melanoma skin lesions. Each class is evaluated based on precision, recall, F1-score, and support.

Class          Precision   Recall   F1-Score   Support
No Melanoma    0.83        0.98     0.90       100
Melanoma       0.96        0.71     0.82       70

Conversely, for the "Melanoma" category, the precision stands at 0.96, showing the model's reliability in its melanoma predictions. However, the recall of 0.71 highlights a potential area for improvement in capturing all true melanoma cases, with an F1-score of 0.82 across 70 instances reflecting the balance between precision and the need to improve recall.

Figure 2 provides further insight into these results through a visual representation. The left side of the figure features the confusion matrix, illustrating the model's accuracy in classifying cases into "Melanoma" and "No Melanoma" categories, with the true positive, true negative, false positive, and false negative counts. The right side of the figure displays the ROC curve, showing the model's ability to differentiate between the two classes at various threshold levels, thus highlighting the trade-off between sensitivity (true positive rate) and specificity (true negative rate).

Figure 2: Diagnostic Performance of the Vision Transformer Model for Melanoma Detection on the MEDNODE Dataset. On the left, the confusion matrix illustrates the model's accuracy in classifying 1: 'Melanoma' and 0: 'No Melanoma' cases, with the number of true positives, true negatives, false positives, and false negatives. On the right, the ROC (Receiver Operating Characteristic) curve.

Together, Table 2 and Figure 2 offer a comprehensive overview of the model's diagnostic performance on the MEDNODE dataset. They underline the model's strengths in identifying non-melanoma cases with high accuracy while also pointing out the necessity for further refinement to enhance its sensitivity to melanoma detection. This balance between precision and recall, especially for a condition as critical as melanoma, is paramount in developing AI diagnostic tools that can effectively assist clinicians in making accurate and timely diagnoses, thereby improving patient care and outcomes.

In essence, the model's strong performance on the MEDNODE dataset, a collection distinct from the training set provided by the ISIC challenges, enhances its credibility as a robust diagnostic tool. It affirms the potential for AI-driven models, specifically those based on advanced architectures like the Vision Transformer, to improve melanoma detection. By effectively bridging the gap between different imaging sources and maintaining high diagnostic accuracy, the model paves the way for broader clinical adoption, offering a promising tool for early melanoma detection and thereby improving patient outcomes through timely and accurate diagnoses.

4. Streamlit WebApp

In recent years, data visualization has become increasingly crucial in the comprehension and communication of information, especially in the realm of medical image analysis. Streamlit has emerged as a powerful and flexible tool for creating interactive web applications, particularly suited to the manipulation and analysis of visual data. This section describes the role of Streamlit in developing a web-based application aimed at melanoma detection through the analysis of dermatological images.

Leveraging the capabilities of Streamlit, the objective was to construct an interactive application directly linked to the melanoma recognition system based on the image model trained and discussed in the previous sections. This application provides an intuitive user interface, allowing users to upload and analyze dermatological images to assess the potential presence of melanoma. Furthermore, the application displays the results from the melanoma recognition system, enhancing the user's understanding of and interaction with the diagnostic process. The web application is divided into four main sections:

Why: This section explains the importance of early melanoma detection, providing users with background information on the significance of timely diagnosis and how advancements in AI and medical imaging have facilitated this process.

How: Here, the application explains the underlying technology and algorithms that power the melanoma detection process, offering insights into the workings of the AI model and the role of deep learning in analyzing dermatological images.

Image Upload: Users are presented with a straightforward mechanism to upload dermatological images. This functionality reflects the application's user-friendly design, so that users can easily navigate the process of submitting images for analysis.

Prediction: Upon image submission, this segment of the application presents the AI model's predictions. It visualizes the diagnostic results, including the probability of melanoma presence. This section exemplifies the critical role of data visualization in making complex AI analyses accessible and understandable to users, facilitating an informed interpretation of the results.

The emphasis on data visualization and ease of use within the Streamlit application not only democratizes access to advanced melanoma detection tools but also significantly enhances the user experience. By translating sophisticated AI diagnostics into intuitive visual outputs, the application bridges the gap between complex medical data and actionable insights, making it a valuable tool in the early detection of melanoma.

5. Conclusion

In conclusion, this article has presented a web application developed for the early detection of melanoma, leveraging the capabilities of Vision Transformers (ViTs) and the intuitive platform of Streamlit. The initial stages of this work involved the construction of a dataset from the ISIC Challenges of 2019 and 2020, followed by the fine-tuning of a pre-trained ViT Large model from Google, with the dual objectives of distinguishing between "Melanoma" and "No Melanoma" cases and adapting the model to the unique characteristics of dermatological images. The dataset was prepared so as to obtain balanced classes, thereby enhancing the model's learning and predictive accuracy.

Subsequent testing on the distinct MEDNODE dataset confirmed the model's robustness and adaptability, demonstrating significant diagnostic precision across varying conditions. The application of transfer learning techniques further exemplified the utility of leveraging existing AI models for specialized tasks, reducing both the time and resources required compared with developing a model from scratch. The development environment, characterized by high-performance computing resources, supported the model's training and validation phases, while the modest requirements for the inference phase underscored the model's practical applicability in diverse clinical settings.

The Streamlit-based web application represents a significant stride towards democratizing access to advanced diagnostic tools. By offering an intuitive interface for uploading and analyzing dermatological images, coupled with real-time presentation of diagnostic results, the application emphasizes the critical role of data visualization in enhancing user engagement and understanding. Each segment of the application, from providing background on melanoma detection to visualizing AI-generated predictions, is designed to make the complex process of AI-driven diagnostics accessible to a broad audience.

In essence, this work illustrates the synergy between AI technology and user-centric application design in addressing the critical healthcare challenge of early melanoma detection. By combining the technical capabilities of Vision Transformers with the accessibility and clarity provided by Streamlit, this initiative paves the way for future advancements in the field of medical imaging and diagnosis. It illustrates the potential of AI not only to improve diagnostic processes but also to empower individuals with the tools and knowledge necessary for early detection and intervention, ultimately contributing to better healthcare outcomes and the broader goal of reducing melanoma-related mortality.

References

[1] H. Tsao, J. M. Olazagasti, K. M. Cordoro, J. D. Brewer, S. C. Taylor, J. S. Bordeaux, M.-M. Chren, A. J. Sober, C. Tegeler, R. Bhushan, et al., Early detection of melanoma: reviewing the ABCDEs, Journal of the American Academy of Dermatology 72 (2015) 717–723.
[2] L. E. Davis, S. C. Shalin, A. J. Tackett, Current state of melanoma diagnosis and treatment, Cancer Biology & Therapy 20 (2019) 1366–1379.
[3] M. Rastrelli, S. Tropea, C. R. Rossi, M. Alaibac, Melanoma: epidemiology, risk factors, pathogenesis, diagnosis and classification, In Vivo 28 (2014) 1005–1011.
[4] H. Bhatt, V. Shah, K. Shah, R. Shah, M. Shah, State-of-the-art machine learning techniques for melanoma skin cancer detection and classification: A comprehensive review, Intelligent Medicine 3 (2023) 180–190.
[5] D. Pirone, A. Montella, D. Sirico, M. Mugnano, D. Del Giudice, I. Kurelac, M. Tirelli, A. Iolascon, V. Bianco, P. Memmolo, et al., Phenotyping neuroblastoma cells through intelligent scrutiny of stain-free biomarkers in holographic flow cytometry, APL Bioengineering 7 (2023).
[6] D. Pirone, D. Sirico, L. Miccio, V. Bianco, M. Mugnano, P. Ferraro, P. Memmolo, Speeding up reconstruction of 3D tomograms in holographic flow cytometry via deep learning, Lab on a Chip 22 (2022) 793–804.
[7] H. Polo Friz, V. Esposito, G. Marano, L. Primitz, A. Bovio, G. Delgrossi, M. Bombelli, G. Grignaffini, G. Monza, P. Boracchi, Machine learning and LACE index for predicting 30-day readmissions after heart failure hospitalization in elderly patients, Internal and Emergency Medicine 17 (2022) 1727–1737.
[8] S. Jain, N. Pise, et al., Computer aided melanoma skin cancer detection using image processing, Procedia Computer Science 48 (2015) 735–740.
[9] A. Bosco, S. Capuozzo, B. Celano, M. Gravina, S. Marrone, M. P. Maurelli, V. Moscato, G. Pontillo, M. Postiglione, A. M. Rinaldi, et al., AI in healthcare: Activities of the University of Naples Federico II node of the CINI-AIIS Lab (2023).
[10] J. Lorca-Cabrera, C. Grau, R. Martí-Arques, L. Raigal-Aran, A. Falcó-Pegueroles, N. Albacar-Riobóo, Effectiveness of health web-based and mobile app-based interventions designed to improve informal caregiver's well-being and quality of life: a systematic review, International Journal of Medical Informatics 134 (2020) 104003.
[11] R. Indraswari, R. Rokhana, W. Herulambang, Melanoma image classification based on MobileNetV2 network, Procedia Computer Science 197 (2022) 198–207.
[12] G. Cirrincione, S. Cannata, G. Cicceri, F. Prinzi, T. Currieri, M. Lovino, C. Militello, E. Pasero, S. Vitabile, Transformer-based approach to melanoma detection, Sensors 23 (2023) 5677.
[13] B. Wu, C. Xu, X. Dai, A. Wan, P. Zhang, Z. Yan, M. Tomizuka, J. Gonzalez, K. Keutzer, P. Vajda, Visual transformers: Token-based image representation and processing for computer vision (2020). arXiv:2006.03677.
[14] M. Combalia, N. C. Codella, V. Rotemberg, B. Helba, V. Vilaplana, O. Reiter, C. Carrera, A. Barreiro, A. C. Halpern, S. Puig, et al., BCN20000: Dermoscopic lesions in the wild, arXiv preprint arXiv:1908.02288 (2019).
[15] P. Tschandl, C. Rosendahl, H. Kittler, The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions, Scientific Data 5 (2018) 1–9.
[16] N. C. Codella, D. Gutman, M. E. Celebi, B. Helba, M. A. Marchetti, S. W. Dusza, A. Kalloo, K. Liopyris, N. Mishra, H. Kittler, et al., Skin lesion analysis toward melanoma detection: A challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), hosted by the International Skin Imaging Collaboration (ISIC), in: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), IEEE, 2018, pp. 168–172.
[17] V. Rotemberg, N. Kurtansky, B. Betz-Stablein, L. Caffery, E. Chousakos, N. Codella, M. Combalia, S. Dusza, P. Guitera, D. Gutman, et al., A patient-centric dataset of images and metadata for identifying melanomas using clinical context, Scientific Data 8 (2021) 34.
[18] I. Giotis, N. Molders, S. Land, M. Biehl, M. F. Jonkman, N. Petkov, MED-NODE: A computer-assisted melanoma diagnosis system using non-dermoscopic images, Expert Systems with Applications 42 (2015) 6578–6585.
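
As a supplementary worked example for the per-class metrics reported in Sections 3.2 and 3.3, the following Python sketch shows how precision, recall, and F1-score follow from confusion-matrix counts. The paper reports only the rounded metrics and supports, not the raw counts, so the counts used here are an assumption: they were chosen to be consistent with the MEDNODE supports (100 and 70) and with the rounded values in Table 2. The names `ClassMetrics` and `class_metrics` are hypothetical helpers, not part of the authors' code.

```python
from dataclasses import dataclass


@dataclass
class ClassMetrics:
    precision: float
    recall: float
    f1: float
    support: int


def class_metrics(tp: int, fp: int, fn: int) -> ClassMetrics:
    """Precision, recall and F1 for one class from its confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    # Support is the number of true members of the class.
    return ClassMetrics(precision, recall, f1, support=tp + fn)


# ASSUMED counts (not reported in the paper), with "Melanoma" as the
# positive class: TN/FP are No Melanoma cases, FN/TP are Melanoma cases.
TN, FP, FN, TP = 98, 2, 20, 50

melanoma = class_metrics(tp=TP, fp=FP, fn=FN)
# For the "No Melanoma" class the roles of the counts are swapped.
no_melanoma = class_metrics(tp=TN, fp=FN, fn=FP)
accuracy = (TP + TN) / (TP + TN + FP + FN)

for name, m in (("No Melanoma", no_melanoma), ("Melanoma", melanoma)):
    print(f"{name:12s} P={m.precision:.2f} R={m.recall:.2f} "
          f"F1={m.f1:.2f} support={m.support}")
print(f"accuracy={accuracy:.2f}")
```

Under these assumed counts the rounded values match Table 2, which illustrates why a melanoma recall of 0.71 corresponds to roughly 20 missed melanomas out of 70, even though almost every positive prediction is correct.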
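
The ROC curves described for Figures 1 and 2 are obtained by sweeping the decision threshold over the model's melanoma scores and recording the resulting false positive and true positive rates. The Python sketch below shows one minimal way to do this with the standard library only. The labels and scores are toy values invented for illustration, not the paper's data, and `roc_points` and `auc` are hypothetical helper names.

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int],
               scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points obtained by lowering the threshold step by step."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    # Highest scores first: each distinct score is one candidate threshold.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data (assumption): 1 = Melanoma, 0 = No Melanoma.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

curve = roc_points(labels, scores)
print(curve)
print(f"AUC = {auc(curve):.3f}")
```

Each point on the curve is one operating point: moving right trades specificity (1 - FPR) for sensitivity (TPR), which is the trade-off discussed in Section 3.3.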
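
The Prediction section of the web application (Section 4) displays the probability of melanoma presence. The paper does not describe how this probability is computed; a common choice for a two-class classifier, assumed here, is to apply a softmax to the model's two output logits and compare the melanoma probability against a threshold. The Python sketch below illustrates that assumption with invented logits. The function names, the 0.5 default threshold, and the label order (index 0 for 'No Melanoma', index 1 for 'Melanoma', following the figure captions) are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

LABELS = ("No Melanoma", "Melanoma")  # index 0 / index 1


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits: Sequence[float], threshold: float = 0.5) -> Tuple[str, float]:
    """Return the predicted label and the melanoma probability."""
    p_melanoma = softmax(logits)[1]
    label = LABELS[1] if p_melanoma >= threshold else LABELS[0]
    return label, p_melanoma


# Invented logits for illustration only.
label, prob = predict([2.0, 0.0])
print(f"{label} (melanoma probability {prob:.3f})")

# A lower threshold flags more lesions as melanoma, trading precision for recall.
screening_label, _ = predict([2.0, 0.0], threshold=0.1)
print(screening_label)
```

For a screening tool, lowering the threshold is one possible way to address the lower melanoma recall observed on MEDNODE, at the cost of more false positives; whether that trade is acceptable is a clinical decision outside the scope of this sketch.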