=Paper=
{{Paper
|id=Vol-3762/465
|storemode=property
|title=Implementing Vision Transformers in Dermatological Practice: A Web Application for Melanoma Screening
|pdfUrl=https://ceur-ws.org/Vol-3762/465.pdf
|volume=Vol-3762
|authors=Daniele Sirico,Giuseppe Accardo,Valentina Esposito
|dblpUrl=https://dblp.org/rec/conf/ital-ia/SiricoAE24
}}
==Implementing Vision Transformers in Dermatological Practice: A Web Application for Melanoma Screening==
Daniele Sirico¹†, Giuseppe Accardo¹† and Valentina Esposito¹*
¹ Data Jam srl, Centro Direzionale Isola F8, Via F. Lauria, Naples, 80143, Italy
Abstract
In this study, we introduce a pioneering web-based application designed to enhance melanoma detection accuracy through
the innovative use of a Vision Transformer (ViT) model. Leveraging the power of advanced deep learning architectures,
our application processes dermatological images to identify potential melanomas with a precision previously unattainable
in conventional screening methods. The development process involved fine-tuning the ViT model on a diverse dataset of
dermatoscopic images, ensuring robustness and reliability across a wide range of images. The web application is intuitively
designed, allowing for easy access and use by dermatologists and potentially by the general public for preliminary screening
purposes. This research not only underscores the viability of ViT models in medical imaging but also offers a practical tool
for early melanoma detection, thereby contributing to better clinical outcomes and facilitating early treatment interventions.
Keywords
AI, Melanoma, Computer Vision, Vision Transformer
Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
* Corresponding author.
† These authors contributed equally.
Email: dg.sirico@almaviva.it (D. Sirico); gi.accardo@almaviva.it (G. Accardo); v.esposito@almaviva.it (V. Esposito)
ORCID: 0000-0002-2760-9209 (D. Sirico)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
1. Introduction
The early diagnosis of melanoma, a highly aggressive form of skin cancer, is crucial for improving patient outcomes [1, 2, 3]. When detected at an early stage, melanoma can often be treated effectively, significantly reducing mortality rates [1]. However, the challenge lies in the timely and accurate identification of potential melanomas among a vast array of skin lesions, which requires a high level of expertise and experience. In this context, the application of artificial intelligence (AI) in medical imaging has emerged as a groundbreaking advancement [4, 5]. AI, particularly deep learning models, has shown remarkable success in enhancing the accuracy and efficiency of diagnostic processes in various medical fields [6, 7], including dermatology [8].
The integration of AI into medical imaging for melanoma detection allows for the analysis of dermatological images with a level of detail and precision that surpasses human capability [9]. This not only aids dermatologists in making more informed decisions but also has the potential to democratize access to high-quality diagnostic services, especially in under-resourced areas.
Furthermore, the advent of user-friendly web applications for medical purposes represents a significant leap forward. Applications built on platforms like Streamlit offer an accessible interface for both medical professionals and the general public, facilitating widespread screening and awareness.
Such web applications can transform smartphones and personal computers into powerful tools for preliminary screening, empowering individuals to seek professional advice at the earliest suspicion of melanoma. This approach to healthcare leverages the ubiquity of internet-connected devices to bridge the gap between advanced diagnostic technologies and end-users [10]. The ease of use and accessibility of these applications are critical factors in their adoption and effectiveness, enabling timely intervention and potentially saving lives. In sum, the convergence of AI in medical imaging and user-friendly web applications marks a pivotal moment in the fight against melanoma, offering new horizons for early detection and treatment.
To date, the ABCDE method is the standard approach used by medical professionals for the diagnosis of melanoma, emphasizing the evaluation of lesions based on Asymmetry, Border irregularity, Color variation, Diameter larger than 6 mm, and Evolution. Despite its widespread adoption and utility in raising awareness, this method’s subjective nature can lead to variability in diagnostic accuracy, potentially overlooking early-stage melanomas or prompting unnecessary biopsies of benign lesions [1]. In response, Artificial Intelligence, especially through deep learning algorithms, presents a significant advancement by providing an objective and precise analysis far surpassing the traditional methods. By incorporating AI-driven diagnostic capabilities into a user-friendly web application, the project aims not only to enhance diagnostic precision but also to make sophisticated melanoma screening tools accessible to a wider population. This initiative marks a crucial step forward in improving early detection rates and patient outcomes by
bridging the gap between traditional diagnostic methods and the potential of modern technology [11, 12].
In this study, we have developed a comprehensive dataset derived from the ISIC (International Skin Imaging Collaboration) Challenge datasets of 2019 and 2020. Using this dataset, we have fine-tuned the pre-trained Vision Transformer (ViT) Large model provided by Google [13], with the specific objective of classifying dermatological images into two distinct categories: "Melanoma" and "No Melanoma". The model has been adapted to the unique characteristics of dermatological imagery. While the ViT model inherently possesses a broad capability for image recognition owing to its extensive initial training on diverse data, our fine-tuning process has enhanced its accuracy and sensitivity to the specific features of skin lesion images. This targeted refinement improves the model’s diagnostic precision, making it an effective tool for distinguishing between melanoma and non-melanoma cases. By leveraging the ViT architecture and tailoring it to the nuances of dermatological conditions, we aim to advance the field of medical imaging and offer a more accurate, AI-driven approach to melanoma detection.
2. Dataset Preparation
In this work, we have constructed a dataset starting from the data of the ISIC Challenges of 2019 [14, 15, 16] and 2020 [17], with which we fine-tuned Google’s pre-trained Vision Transformer (ViT) Large model, aiming to distinguish between two classes: "Melanoma" and "No Melanoma". The model has been specifically adapted to the characteristics of dermatological images. Although the network already possesses a broad capacity for recognition due to its prior training, fine-tuning allows us to refine its precision and sensitivity to the peculiarities of skin lesions, thereby improving the accuracy of diagnoses.
For the dataset preparation, with the objective of achieving balanced classes between Melanoma and No Melanoma, images were extracted from ISIC 2019 and ISIC 2020. The resulting dataset comprises 8,223 training images (3,890 No Melanoma, 4,333 Melanoma) and 1,450 test images; combining the two challenges substantially increases the sample size, especially for the Melanoma class of interest. Specifically, the "Melanoma" class benefited from contributions from both datasets, while the "No Melanoma" class was formed using images exclusively from ISIC 2019. This balanced approach ensures a more equitable distribution of classes, enhancing the model’s ability to learn and accurately differentiate between Melanoma and No Melanoma, which is crucial for the model’s performance in real-world diagnostic applications.
In addition to the dataset preparation from ISIC 2019 and ISIC 2020, the fine-tuned model was subsequently tested on an entirely different dataset, MEDNODE [18], to assess its generalization capability and performance in real-world scenarios. This step was crucial for validating the effectiveness of our fine-tuning process and ensuring that the model could accurately classify melanoma across diverse datasets. The MEDNODE dataset, distinct in its composition and image characteristics, provided a challenging environment to evaluate the model’s robustness and adaptability. This further testing underscores the model’s potential for application in a wide range of clinical settings, demonstrating its ability to maintain high levels of accuracy and sensitivity in detecting melanoma, even when confronted with data significantly different from that on which it was trained. This cross-dataset validation is a critical aspect of our research, confirming the model’s utility as a reliable tool in the early detection of melanoma, potentially revolutionizing dermatological diagnostics.
3. Fine Tuning and Validation
3.1. Fine Tuning
The concept of transfer learning, applied to the context of melanoma detection using Google’s pre-trained Vision Transformer (ViT) Large model, leverages a model developed for general vision tasks and adapts it to the specific challenge of identifying melanoma in dermatological images. This approach benefits from the original model’s learning capabilities, significantly reducing the time and resources needed compared with training from scratch. Thanks to its prior training, the model can be optimized to recognize the specific features of melanoma with greater efficiency.
During the fine-tuning process, the pre-trained ViT Large model from Google is further optimized for melanoma detection. In this phase, the model is specifically adapted to the characteristics of dermatological images. Although the network already has a broad recognition capability thanks to its previous training, fine-tuning allows for the refinement of its precision and sensitivity to the peculiarities of skin lesions, thus enhancing diagnostic accuracy.
The model was trained on an AWS p3.2xlarge instance with the following specifications: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz, 64GB RAM, and a Tesla V100-SXM2 graphics card with 16GB of VRAM.
Inference does not require dedicated GPU hardware. This flexibility in hardware requirements ensures that the fine-tuned model can be deployed in a wide range of environments, making it accessible for clinical use without the need for high-performance computing resources.
Table 1
Performance Evaluation of Skin Lesion Classification on the ISIC Test Set: This table presents a detailed breakdown of classification metrics for distinguishing between melanoma and non-melanoma skin lesions. Each class is evaluated based on precision, recall, F1-score, and support, highlighting the model’s ability to accurately identify and classify each condition within the dataset.
Class         Precision  Recall  F1-Score  Support
No-Melanoma   0.86       0.83    0.85      677
Melanoma      0.86       0.88    0.87      773
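As a reading aid for the per-class figures in Table 1, the following minimal sketch recomputes the F1-score from the reported precision and recall, and estimates the overall accuracy implied by recall and support. It is not the authors' evaluation code: the names `TABLE_1`, `f1_score`, and `implied_accuracy` are illustrative, and because the table values are rounded to two decimals, the recomputed numbers can only be expected to match to within about 0.01.

```python
# Per-class metrics exactly as reported in Table 1 (ISIC test set).
TABLE_1 = {
    "No-Melanoma": {"precision": 0.86, "recall": 0.83, "f1": 0.85, "support": 677},
    "Melanoma": {"precision": 0.86, "recall": 0.88, "f1": 0.87, "support": 773},
}


def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_accuracy(table: dict) -> float:
    """Estimate overall accuracy from per-class recall and support.

    recall * support approximates the number of correctly classified
    samples in each class; rounding in the table makes this an estimate.
    """
    correct = sum(row["recall"] * row["support"] for row in table.values())
    total = sum(row["support"] for row in table.values())
    return correct / total


for name, row in TABLE_1.items():
    recomputed = f1_score(row["precision"], row["recall"])
    print(f"{name}: recomputed F1 = {recomputed:.3f}, reported F1 = {row['f1']:.2f}")

print(f"Implied overall accuracy: {implied_accuracy(TABLE_1):.3f}")
```

The same two functions can be applied to any other per-class report with the same four columns, which makes it easy to check that precision, recall, F1-score, and support in a results table are mutually consistent.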
3.2. Validation on ISIC Test Set
The results obtained from the model on the test set, as detailed in Table 1, underscore its adeptness in distinguishing between melanoma and non-melanoma lesions, with substantial precision, recall, and F1-score metrics across both categories. For the "No Melanoma" class, precision is 0.86, with a recall of 0.83 and an F1-score of 0.85, across a support of 677 cases. This level of accuracy in identifying non-melanoma instances indicates a strong balance between precision and recall, reflecting the model’s efficiency in minimizing false positives while effectively recognizing true negatives.
The "Melanoma" class shows precision and recall scores of 0.86 and 0.88, respectively, achieving an F1-score of 0.87 over 773 instances. These metrics highlight the model’s capability in reliably detecting melanoma lesions, with the higher recall indicating a particular strength in reducing false negatives, which is crucial for melanoma screening, where the cost of missing a positive diagnosis is exceedingly high.
The consistency in precision across both classes, paired with the balanced F1-scores, attests to the model’s robustness, suggesting its potential as a dependable tool in the diagnostic toolkit. Such performance, detailed in Table 1, affirms the significant role that advanced AI models such as the Vision Transformer can play in dermatological diagnostics.
Further insight into the model’s performance is provided by Figure 1, which depicts the confusion matrix and the ROC curve for the model’s predictions. The confusion matrix visually illustrates the model’s accuracy in classifying the test cases, offering a clear depiction of the true positive and negative rates, alongside the instances of false positives and negatives. The ROC curve, accompanying this matrix, further elucidates the model’s diagnostic ability across different thresholds, showing how it balances sensitivity and specificity. Together, Table 1 and Figure 1 offer a comprehensive overview of the model’s diagnostic performance, highlighting its potential to revolutionize melanoma screening and diagnosis through the integration of cutting-edge AI technologies into clinical practices, thereby enhancing diagnostic accuracy and facilitating improved patient care.
Figure 1: Diagnostic Performance of the Vision Transformer Model for Melanoma Detection on ISIC test set. On the left, the confusion matrix illustrates the model’s accuracy in classifying 1: ’Melanoma’ and 0: ’No Melanoma’ cases, with the number of true positives, true negatives, false positives, and false negatives. On the right, the ROC (Receiver Operating Characteristic) curve.
3.3. Validation on MEDNODE Dataset
The model’s performance on the MEDNODE dataset, as detailed in Table 2, reflects its diagnostic precision in distinguishing between melanoma and non-melanoma cases. With a precision of 0.83 and a recall of 0.98 for "No Melanoma," the model demonstrates a high capability in correctly identifying non-melanoma cases, as evidenced by an F1-score of 0.90 across 100 instances. This high recall means that few benign lesions are flagged as melanoma, which is vital for avoiding unnecessary further testing and anxiety for patients.
For the "Melanoma" category, the precision stands at 0.96, showing the model’s reliability in its melanoma predictions. However, the recall of 0.71 highlights a potential area for improvement in capturing all true melanoma cases, with an F1-score of 0.82 across 70 instances reflecting the balance between precision and the need to improve recall.
Figure 2 provides further insight into these results through a visual representation. The left side of the figure features the confusion matrix, illustrating the model’s
accuracy in classifying cases into "Melanoma" and "No Melanoma" categories, revealing the true positive, true negative, false positive, and false negative counts. The right side of the figure displays the ROC curve, showcasing the model’s ability to differentiate between the two classes at various threshold levels, thus highlighting the trade-off between sensitivity (true positive rate) and specificity (true negative rate).
Together, Table 2 and Figure 2 offer a comprehensive overview of the model’s diagnostic performance on the MEDNODE dataset. They underline the model’s strengths in identifying non-melanoma cases with high accuracy while also pointing out the necessity for further refinement to enhance its sensitivity to melanoma detection. This balance between precision and recall, especially for a condition as critical as melanoma, is paramount in developing AI diagnostic tools that can effectively assist clinicians in making accurate and timely diagnoses, thereby improving patient care and outcomes.
In essence, the model’s strong performance on the MEDNODE dataset, a collection distinct from the training set provided by the ISIC challenges, enhances its credibility as a robust diagnostic tool. It affirms the potential for AI-driven models, specifically those based on advanced architectures like the Vision Transformer, to revolutionize melanoma detection. By effectively bridging the gap between different imaging sources and maintaining high diagnostic accuracy, the model paves the way for broader clinical adoption, offering a promising tool for early melanoma detection and thereby improving patient outcomes through timely and accurate diagnoses.
Table 2
Performance Evaluation of Skin Lesion Classification on the MEDNODE Dataset: This table presents a detailed breakdown of classification metrics for distinguishing between melanoma and non-melanoma skin lesions. Each class is evaluated based on precision, recall, F1-score, and support, highlighting the model’s ability to accurately identify and classify each condition within the dataset.
Class         Precision  Recall  F1-Score  Support
No-Melanoma   0.83       0.98    0.90      100
Melanoma      0.96       0.71    0.82      70
Figure 2: Diagnostic Performance of the Vision Transformer Model for Melanoma Detection on MEDNODE Dataset. On the left, the confusion matrix illustrates the model’s accuracy in classifying 1: ’Melanoma’ and 0: ’No Melanoma’ cases, with the number of true positives, true negatives, false positives, and false negatives. On the right, the ROC (Receiver Operating Characteristic) curve.
4. Streamlit WebApp
In recent years, data visualization has become increasingly crucial in the comprehension and communication of information, especially in the realm of medical image analysis. Streamlit has emerged as a powerful and flexible tool for creating interactive web applications, particularly excelling in the manipulation and analysis of visual data. This section describes the role of Streamlit in developing a web-based application aimed at melanoma detection through the analysis of dermatological images.
Leveraging the capabilities of Streamlit, the objective was to construct an interactive application directly linked to the melanoma recognition system based on the image models trained and discussed in the previous sections. This application provides an intuitive user interface, allowing users to upload and analyze dermatological images to assess the potential presence of melanoma. Furthermore, the application displays the results from the melanoma recognition system, enhancing the user’s understanding and interaction with the diagnostic process. The web application can be segmented into four main sections:
Why: This section elucidates the importance of early melanoma detection, providing users with background information on the significance of timely diagnosis and how advancements in AI and medical imaging have facilitated this process.
How: Here, the application explains the underlying technology and algorithms that power the melanoma detection process, offering insights into the workings of the AI model and the role of deep learning in analyzing dermatological images.
Image Upload: Users are presented with a straightforward mechanism to upload dermatological images. This functionality underscores the application’s user-friendly design, ensuring that users can easily navigate the process of submitting images for analysis.
Prediction: Upon image submission, this segment of the application presents the AI model’s predictions. It
visualizes the diagnostic results, including the probability of melanoma presence. This section exemplifies the critical role of data visualization in making complex AI analyses accessible and understandable to users, facilitating an informed interpretation of the results.
The emphasis on data visualization and ease of use within the Streamlit application not only democratizes access to advanced melanoma detection tools but also significantly enhances the user experience. By translating sophisticated AI diagnostics into intuitive visual outputs, the application bridges the gap between complex medical data and actionable insights, making it an invaluable tool in the early detection of melanoma.
5. Conclusion
In conclusion, this article has presented a comprehensive exploration of an innovative web application developed for the early detection of melanoma, leveraging the advanced capabilities of Vision Transformers (ViTs) and the intuitive platform of Streamlit. The initial stages of this work involved the construction of a dataset from the ISIC Challenges of 2019 and 2020, followed by the fine-tuning of a pre-trained ViT Large model from Google, with the dual objectives of distinguishing between "Melanoma" and "No Melanoma" cases and adapting the model to the unique characteristics of dermatological images. This process was underscored by the strategic preparation of the dataset to ensure balanced classes, thereby enhancing the model’s learning and predictive accuracy.
Subsequent testing on the distinct MEDNODE dataset confirmed the model’s robustness and adaptability, demonstrating significant diagnostic precision across varying conditions. The application of transfer learning techniques further exemplified the utility of leveraging existing AI models for specialized tasks, reducing both the time and resources required for model development from scratch. The development environment, characterized by high-performance computing resources, facilitated the model’s training and validation phases, while the streamlined requirements for the inference phase underscored the model’s practical applicability in diverse clinical settings.
The Streamlit-based web application represents a significant stride towards democratizing access to advanced diagnostic tools. By offering an intuitive interface for uploading and analyzing dermatological images, coupled with real-time presentation of diagnostic results, the application emphasizes the critical role of data visualization in enhancing user engagement and understanding. Each segment of the application, from providing background on melanoma detection to visualizing AI-generated predictions, is designed to make the complex process of AI-driven diagnostics accessible to a broad audience.
In essence, this work illustrates the synergy between cutting-edge AI technology and user-centric application design in addressing the critical healthcare challenge of early melanoma detection. By marrying the technical prowess of Vision Transformers with the accessibility and clarity provided by Streamlit, this initiative paves the way for future advancements in the field of medical imaging and diagnosis. It stands as a testament to the potential of AI to not only revolutionize diagnostic processes but also to empower individuals with the tools and knowledge necessary for early detection and intervention, ultimately contributing to better healthcare outcomes and the broader goal of reducing melanoma-related mortality.
References
[1] H. Tsao, J. M. Olazagasti, K. M. Cordoro, J. D. Brewer, S. C. Taylor, J. S. Bordeaux, M.-M. Chren, A. J. Sober, C. Tegeler, R. Bhushan, et al., Early detection of melanoma: reviewing the ABCDEs, Journal of the American Academy of Dermatology 72 (2015) 717–723.
[2] L. E. Davis, S. C. Shalin, A. J. Tackett, Current state of melanoma diagnosis and treatment, Cancer Biology & Therapy 20 (2019) 1366–1379.
[3] M. Rastrelli, S. Tropea, C. R. Rossi, M. Alaibac, Melanoma: epidemiology, risk factors, pathogenesis, diagnosis and classification, In Vivo 28 (2014) 1005–1011.
[4] H. Bhatt, V. Shah, K. Shah, R. Shah, M. Shah, State-of-the-art machine learning techniques for melanoma skin cancer detection and classification: A comprehensive review, Intelligent Medicine 3 (2023) 180–190.
[5] D. Pirone, A. Montella, D. Sirico, M. Mugnano, D. Del Giudice, I. Kurelac, M. Tirelli, A. Iolascon, V. Bianco, P. Memmolo, et al., Phenotyping neuroblastoma cells through intelligent scrutiny of stain-free biomarkers in holographic flow cytometry, APL Bioengineering 7 (2023).
[6] D. Pirone, D. Sirico, L. Miccio, V. Bianco, M. Mugnano, P. Ferraro, P. Memmolo, Speeding up reconstruction of 3D tomograms in holographic flow cytometry via deep learning, Lab on a Chip 22 (2022) 793–804.
[7] H. Polo Friz, V. Esposito, G. Marano, L. Primitz, A. Bovio, G. Delgrossi, M. Bombelli, G. Grignaffini, G. Monza, P. Boracchi, Machine learning and LACE index for predicting 30-day readmissions after heart failure hospitalization in elderly patients, Internal and Emergency Medicine 17 (2022) 1727–1737.
[8] S. Jain, N. Pise, et al., Computer aided melanoma
skin cancer detection using image processing, Procedia Computer Science 48 (2015) 735–740.
[9] A. Bosco, S. Capuozzo, B. Celano, M. Gravina, S. Marrone, M. P. Maurelli, V. Moscato, G. Pontillo, M. Postiglione, A. M. Rinaldi, et al., AI in healthcare: Activities of the University of Naples Federico II node of the CINI-AIIS lab (2023).
[10] J. Lorca-Cabrera, C. Grau, R. Martí-Arques, L. Raigal-Aran, A. Falcó-Pegueroles, N. Albacar-Riobóo, Effectiveness of health web-based and mobile app-based interventions designed to improve informal caregiver’s well-being and quality of life: a systematic review, International Journal of Medical Informatics 134 (2020) 104003.
[11] R. Indraswari, R. Rokhana, W. Herulambang, Melanoma image classification based on MobileNetV2 network, Procedia Computer Science 197 (2022) 198–207.
[12] G. Cirrincione, S. Cannata, G. Cicceri, F. Prinzi, T. Currieri, M. Lovino, C. Militello, E. Pasero, S. Vitabile, Transformer-based approach to melanoma detection, Sensors 23 (2023) 5677.
[13] B. Wu, C. Xu, X. Dai, A. Wan, P. Zhang, Z. Yan, M. Tomizuka, J. Gonzalez, K. Keutzer, P. Vajda, Visual transformers: Token-based image representation and processing for computer vision (2020). arXiv:2006.03677.
[14] M. Combalia, N. C. Codella, V. Rotemberg, B. Helba, V. Vilaplana, O. Reiter, C. Carrera, A. Barreiro, A. C. Halpern, S. Puig, et al., BCN20000: Dermoscopic lesions in the wild, arXiv preprint arXiv:1908.02288 (2019).
[15] P. Tschandl, C. Rosendahl, H. Kittler, The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions, Scientific Data 5 (2018) 1–9.
[16] N. C. Codella, D. Gutman, M. E. Celebi, B. Helba, M. A. Marchetti, S. W. Dusza, A. Kalloo, K. Liopyris, N. Mishra, H. Kittler, et al., Skin lesion analysis toward melanoma detection: A challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), hosted by the International Skin Imaging Collaboration (ISIC), in: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), IEEE, 2018, pp. 168–172.
[17] V. Rotemberg, N. Kurtansky, B. Betz-Stablein, L. Caffery, E. Chousakos, N. Codella, M. Combalia, S. Dusza, P. Guitera, D. Gutman, et al., A patient-centric dataset of images and metadata for identifying melanomas using clinical context, Scientific Data 8 (2021) 34.
[18] I. Giotis, N. Molders, S. Land, M. Biehl, M. F. Jonkman, N. Petkov, MED-NODE: A computer-assisted melanoma diagnosis system using non-dermoscopic images, Expert Systems with Applications 42 (2015) 6578–6585.