<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Annals of Oncology</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1093/annonc/mdp322</article-id>
      <title-group>
        <article-title>AI-Driven CAD for Histological Analysis in Mountain Regions: Advancing Local Healthcare and Sustainable Development</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Matteo Calabrese</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chiara B. Salvemini</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michela Assale</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefano Sartor</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roberta Patetta</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laura Caramanico</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Cavalli</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefano Gustincich</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jean Marc Christille</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Azienda USL della Valle D'Aosta, U. Parini Hospital, Struttura Complessa di Anatomia Patologica</institution>
          ,
          <addr-line>Aosta</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>CECAM-EPFL</institution>
          ,
          <addr-line>1015 Lausanne</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Fondazione Clément Fillietroz ONLUS, Astronomical Observatory of the Autonomous Region of the Aosta Valley (OAVdA)</institution>
          ,
          <addr-line>Nus</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Non-coding RNA and RNA-based therapeutics, Center for Human Technology, Italian Institute of Technology (IIT)</institution>
          ,
          <addr-line>Genova</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Non-coding RNA and RNA-based therapeutics, Italian Institute of Technology (IIT)</institution>
          ,
          <addr-line>CMP</addr-line>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>VdA</institution>
          ,
          <addr-line>Aosta</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2009</year>
      </pub-date>
      <volume>20</volume>
      <issue>2009</issue>
      <fpage>1319</fpage>
      <lpage>1329</lpage>
      <abstract>
        <p>Accurate quantification of the Ki-67 proliferation index is essential in cancer diagnostics, yet manual evaluation remains time-consuming and prone to inter-observer variability. This study presents a machine vision-based CAD tool designed to compute the Ki-67 proliferation index from digitized histopathological images, aiding diagnostic workflows at the U. Parini hospital in Aosta. Developed through an interdisciplinary approach integrating artificial intelligence and computational pathology, the system was validated against expert annotations, demonstrating strong concordance and clinical reliability. Beyond its scientific contributions, this initiative fosters digital transformation in healthcare, improving diagnostic accessibility in mountainous regions while promoting local economic and technological development.</p>
      </abstract>
      <kwd-group>
        <kwd>AI</kwd>
        <kwd>Machine Vision (MV)</kwd>
        <kwd>Computer-Aided Diagnosis (CAD)</kwd>
        <kwd>scientific research</kwd>
        <kwd>local healthcare system</kwd>
        <kwd>economic and social development</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Artificial intelligence (AI) is revolutionizing healthcare by enhancing diagnostic accuracy, personalizing
treatments, and expanding access to medical services, particularly in logistically complex regions. In
rural and mountainous areas, such as Italy’s Aosta Valley, healthcare delivery faces unique challenges,
including geographical isolation and limited medical resources. Implementing AI-driven solutions in
these regions can bridge healthcare disparities by providing advanced diagnostic tools and supporting
clinical decision-making.</p>
      <p>Accurate assessment of cellular proliferation is crucial in cancer diagnostics, with the Ki-67
protein serving as a key biomarker. Traditionally, pathologists manually evaluate Ki-67 expression in
histopathological images, a process that is time-consuming and susceptible to inter-observer variability.
Computer-aided diagnosis (CAD) systems utilizing machine vision (MV) techniques have emerged to
automate this task, offering consistent and efficient analysis of the Ki-67 proliferation index.</p>
      <p>The integration of such technologies is particularly significant in remote and mountainous regions. By
deploying AI-driven diagnostic tools locally, these areas can undergo digital transformation, enhancing
healthcare delivery and fostering collaborations between research centers, hospitals, and medical
professionals. This synergy not only improves diagnostic practices but also contributes to the overall
quality of life in these communities.</p>
      <p>2nd Workshop “New frontiers in Big Data and Artificial Intelligence” (BDAI 2025), May 29-30, 2025, Aosta, Italy. Corresponding author: M. Calabrese (calabrese@oavda.it; ORCID 0000-0002-2637-2422).</p>
      <p>© 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>This paper presents a machine vision tool developed within the CMP3 - 5000 genomi@vda1 project,
designed to analyse histopathological images of breast cancer and compute the relative Ki-67 proliferation
index. This tool has been implemented and deployed at the local hospital U. Parini in Aosta, enabling
clinicians to utilize it as a supportive CAD system for their diagnostic processes. The objectives of this
paper are to describe the machine vision tool and the CAD infrastructure, and to contextualize their
development within the local setting. We aim to demonstrate how initiatives that link local research
centers with healthcare institutions can promote collaboration, drive scientific and technological
research, and enhance diagnostic practices, thereby improving the quality of life in mountainous regions
like the Aosta Valley.</p>
      <p>The structure of this paper is as follows: Section 2 provides the scientific, clinical, and regional context, including
a literature review on Ki-67 assessment and the need for CAD systems; Section 3 describes the MV
algorithms and the data utilized; Section 4 presents the results, validation, and implementation of the
solution in clinical practice; Section 5 offers a discussion on the findings and their implications; and
Section 6 concludes the paper, summarizing the contributions and future directions.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background and motivation</title>
      <sec id="sec-2-1">
        <title>2.1. Scientific rationale</title>
        <p>
          Early detection of breast cancer is crucial for improving patient survival and quality of life [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. As the
most common malignancy among women worldwide - with millions of new cases annually - efficient
diagnostic tools are essential for managing clinical workloads and ensuring timely, accurate diagnoses.
Diagnostic approaches include self- and clinical examinations, as well as imaging modalities such as
mammography, ultrasound, and magnetic resonance imaging (MRI) [
          <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
          ]. Mammography is the primary
screening tool, while ultrasound and MRI are particularly useful in cases of dense breast tissue and
complex presentations. Nonetheless, biopsy remains the definitive method for confirming a diagnosis
[
          <xref ref-type="bibr" rid="ref4">4, 5, 6</xref>
          ]. However, challenges such as dense tissue and the potential for false-positive or false-negative
results persist [7]. Ongoing research into advanced diagnostic techniques and molecular biomarkers
promises to further improve the diagnosis and treatment of breast cancer [8, 9]. In clinical practice,
following tissue extraction through biopsy or surgical resection, histopathological analysis is conducted.
This process involves preparing tissue sections and staining them - typically with Hematoxylin and
Eosin (H-E) [10] or alternative markers such as DAB and Ki-67 [11] - to enhance cellular details before
digitizing the slides into high-resolution images that play a vital role in clinical assessment [12].
        </p>
        <p>Artificial intelligence (AI) applications in medical imaging have significantly advanced histopathology
through Machine Vision (MV), enabling automated detection and quantification of biomarkers [13,
14, 15, 16]. Our work has focused on digital histological slides of breast cancer as a case study for
developing machine vision systems. In particular, slides stained with H-E and Ki-67 serve as the basis
for evaluating algorithm performance and ensuring diagnostic precision [17]. Additionally, machine
learning has found broad application in digital pathology [18, 19].</p>
        <p>The Ki-67 marker is widely recognized as both a predictive and prognostic biomarker in breast
carcinoma. It is commonly used to assess cellular proliferation [20, 21], with elevated levels indicating
a poorer prognosis [22]. Endorsed by the 2009 St. Gallen International Breast Cancer Conference [23],
the use of proliferation markers like Ki-67 informs optimal treatment decisions for early-stage breast
cancer. The Ki-67 index is calculated as the ratio of Ki-67-positive tumor cells to the total tumor cell
count, or:</p>
        <sec id="sec-2-1-1">
          <p><disp-formula id="eq-1"><label>(1)</label><tex-math>\mathrm{Ki67} = \frac{\text{Number of cells positive to Ki-67}}{\text{Total number of tumor cells}} \cdot 100,</tex-math></disp-formula>
and it is typically estimated in clinical practice either by an average method, where pathologists count
positive cells across several regions of a slide, or by the hotspot method, which focuses on areas with
particularly intense staining [24]. Recent studies [25] suggest that a Ki-67 level above 10-14% indicates
a high-risk prognosis, but these thresholds remain a subject of ongoing debate [26], complicating the
standardization of manual and automated scoring systems. The International Ki-67 Working Group
(IKWG) proposes Ki-67 cutoffs of 5% and 30% for prognosis, with intermediate values considered
a “gray zone” [27]. At U. Parini Hospital (Aosta), a 20% threshold is commonly used; around this
level, therapeutic decisions require comprehensive clinical evaluation beyond the Ki-67 index. High
inter-observer variability in classification and differences in threshold selection further hinder reliable
assessment [28]. Threshold variability for risk stratification and inter-laboratory differences, such as
region-of-interest selection, complicate evaluation of automated tools.</p>
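          <p>As an illustration of Eq. (1) and of the cutoffs quoted above, the following minimal Python sketch computes the index from cell counts, contrasts a pooled "average" estimate with a "hotspot" estimate, and maps the result onto the IKWG 5% and 30% cutoffs. The function names, the pooling of counts across regions, the choice of the highest-index region as the hotspot, and the handling of values lying exactly on a cutoff are assumptions made for illustration, not the procedure used at the hospital.</p>
          <preformat preformat-type="code"><![CDATA[
```python
def ki67_index(positive_cells, total_tumor_cells):
    """Eq. (1): percentage of Ki-67-positive tumor cells."""
    if total_tumor_cells <= 0:
        raise ValueError("total_tumor_cells must be positive")
    if not 0 <= positive_cells <= total_tumor_cells:
        raise ValueError("positive_cells must lie between 0 and the total")
    return 100.0 * positive_cells / total_tumor_cells


def average_index(regions):
    """Average method (assumed here to pool counts over all regions).

    regions: list of (positive_cells, total_tumor_cells) tuples.
    """
    positives = sum(p for p, _ in regions)
    totals = sum(t for _, t in regions)
    return ki67_index(positives, totals)


def hotspot_index(regions):
    """Hotspot method (assumed here to be the region with the highest index)."""
    return max(ki67_index(p, t) for p, t in regions)


def ikwg_category(index, low=5.0, high=30.0):
    """Map an index onto the 5% / 30% cutoffs; boundary handling is assumed."""
    if index <= low:
        return "low"
    if index >= high:
        return "high"
    return "gray zone"


# Hypothetical counts for three regions of one slide.
regions = [(12, 200), (45, 150), (30, 250)]
print(average_index(regions), ikwg_category(average_index(regions)))
print(hotspot_index(regions), ikwg_category(hotspot_index(regions)))
```
]]></preformat>
          <p>With these hypothetical counts the two estimation methods place the same slide in different categories, which is the kind of discrepancy that makes threshold and region selection matter when automated tools are compared with manual scoring.</p>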
          <p>Several computational solutions support automated Ki-67 quantification. Among open-source options,
QuPath [29] - which uses the StarDist algorithm for cell detection - has been widely employed in
studies comparing manual scoring and digital image analysis [30, 31, 32, 33]. Our work integrates this
detection model into a custom system tailored for the hospital’s clinical workflow, encompassing data
management, high-performance computing (HPC) processing, and modules for cell normalization and
filtering. Other deep learning approaches (e.g. YOLO [34], VGG networks [35], and U-Net with ResNet
backbones [36]) focus on nucleus-level classification rather than reporting a whole-slide Ki-67 index.
Finally, commercial closed-source platforms (e.g. Visiopharm and DeepBio) designed for clinical use
employ proprietary algorithms, but their methods and validation data are often undisclosed.</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. CAD in clinical practice</title>
        <p>Computer-Aided Diagnosis (CAD) is an advanced field in medical imaging that uses digital imaging and
artificial intelligence algorithms to support healthcare professionals in interpreting complex medical
images [37, 38, 39]. CAD systems have been widely adopted for the quantitative assessment of
immunohistochemical staining, facilitating the identification and measurement of biomarkers critical for
diagnosis [12]. Recent developments have expanded these systems to include the analysis of standard
hematoxylin and eosin (H-E) images, which form the foundation of pathological diagnosis [40, 41]. By
integrating advanced algorithms, CAD improves diagnostic accuracy and decision making, enabling the
detection of subtle patterns and anomalies that may be difficult to discern with the naked eye [42, 15].
Furthermore, this approach improves efficiency by streamlining image analysis and allowing clinicians
to devote more time to complex diagnostic tasks.</p>
        <p>In the context of local clinical practice, particularly in resource-limited or remote settings such
as mountain regions, CAD offers significant benefits by providing reliable and automated support
that complements the expertise of local clinicians. Our system, while innovative, has been designed
with the specific intention of integrating seamlessly into existing clinical workflows without causing
disruption. For this work, our multidisciplinary team has worked closely with clinical partners, notably
the Specialized Unit of Pathological Anatomy<sup>2</sup> in Aosta, to translate clinical requirements into effective
CAD tools. This collaboration has ensured that our machine vision algorithms are not only effective in
real-world contexts but also respect and accommodate common clinical practices, reinforcing rather
than interfering with the essential role of clinical judgment in patient care [43, 44, 45].</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. European and regional relevance</title>
        <p>Mountain regions face significant challenges, notably limited access to specialist care, which necessitates
the adoption of innovative technologies such as AI-driven machine vision and CAD tools. European
best practices and regulatory frameworks support this integration by promoting multi-stakeholder
collaboration and user-centered design. A European Parliament study [46] proposes engaging clinicians,
patients, social scientists, and regulators throughout the development process while also emphasizing
enhanced education programs to improve AI literacy among healthcare professionals and the public.
Similarly, the new European AI Act calls for the development of harmonized guidelines and standards
through close collaboration between AI developers, healthcare professionals, and patient communities
to ensure effective monitoring of health-related AI applications [47].</p>
        <p><sup>2</sup> "Struttura complessa di Anatomia Patologica" is the formal name of the unit at the hospital.</p>
        <sec id="sec-2-3-1">
          <p>Complementing these initiatives, the Standing Committee of European Doctors (CPME) guidelines<sup>3</sup> stress that AI systems must be
designed based on actual healthcare demands and adhere to ethical and data protection standards.
Finally, the white paper Sustainable AI to Drive Global Health<sup>4</sup> underscores the transformative potential
of data and AI in addressing global health challenges.</p>
          <p>Collectively, these frameworks provide a comprehensive roadmap for deploying AI in underserved
mountain regions, ensuring that technological advancements are tailored to overcome local
healthcare disparities while aligning with international best practices. Emerging evidence indicates that
the integration of AI-driven machine vision and CAD systems can substantially improve healthcare
outcomes in logistically complex mountain regions. For example, studies have shown that telemedicine
and AI-assisted diagnostics can enhance diagnostic accuracy and facilitate timely treatment [48, 49].
Practical implementation of AI in medicine has also been associated with earlier disease
detection and improved patient outcomes, ultimately enhancing quality of life [50]. Moreover, AI
integration streamlines data sharing and promotes interdisciplinary collaboration, strengthening ties
between hospitals and research institutions [51]. Together, these advances not only optimize resource
allocation and reduce costs but also contribute to broader economic and social development in mountain
regions.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Machine Vision for Computer-Aided Diagnosis</title>
      <p>In this Section, we describe the workflow developed for the machine vision tool to segment nuclei and
compute the Ki-67 index. For a detailed overview of the workflow, please see Figure 1.</p>
      <sec id="sec-3-0">
        <title>3.1. Data</title>
        <p>A representative dataset of histological samples stained with hematoxylin and Ki-67 was used during
the development phase. This dataset [52], called DataDev, consists of 694 images, each
measuring 512×512 pixels, obtained from samples of 32 patients with breast cancer. This dataset was
used to calibrate the system’s parameters based on a semi-qualitative assessment of the segmentation,
combining statistical analysis with visual inspection. For the results and the clinical validation phase,
a different dataset, DataVal, was provided by the U. Parini hospital, with each slide accompanied
by a Ki-67 index assigned by expert pathologists. This dataset comprises 13 anonymised
histological slides from tumor biopsies performed on female patients diagnosed with breast cancer<sup>5</sup>.
Both datasets represent breast cancer cases at different stages of aggressiveness. The machine vision
system was customised and optimised on the DataDev dataset and subsequently validated in a blinded
manner on the DataVal dataset provided by the hospital; the validation dataset was used exclusively
for evaluation, without any involvement in the calibration phase.</p>
        <p><sup>3</sup> https://www.cpme.eu/api/documents/adopted/2024/11/cpme_ad_09112024_073.final.policy.on.deployment.of.ai.in.healthcare.pdf</p>
        <p><sup>4</sup> https://www.feam.eu/wp-content/uploads/Sustainable-AI-to-Drive-Global-Health-white-paper-11-Sep.pdf</p>
      </sec>
      <sec id="sec-3-1">
        <title>Data preparation and annotation for DataVal dataset</title>
        <p>Hematoxylin-Ki-67 stained slides were prepared by incubating them with a chromogenic substrate, such as DAB, until the desired color
developed (typically 5–10 minutes), with the brown DAB chromogen highlighting proliferating cells.
A light hematoxylin counterstain was then applied to enhance the histological context, binding to
all nuclei to facilitate histological analysis. The slides from the research dataset and the test image
were digitized using the Aperio AT2 scanner<sup>6</sup>. The process involved setting the scanner parameters,
including a magnification of 20X or 40X, and saving images in .svs format. The digitized images
were then assigned a clinical Ki-67 index, determined by medical professionals through microscopic
observation. The combined workflow of histological preparation, staining, digitization, and clinical
annotation ensures the quality and accuracy of subsequent analyses for both diagnosis and research. A
distinctive feature of these images is their pyramidal structure, where each image contains multiple
resolution levels, allowing access to different versions with varying detail. The choice of resolution
level also depends on the magnification used during digitization (20X or 40X) and the specific size of
the acquired slide. For this study, analysis was conducted at level 0 (maximum resolution) to ensure the
highest level of detail for histological sample evaluation. As a reference, a magnification of 40X (20X)
corresponds to a resolution of 0.25 µm/pixel (0.5 µm/pixel).</p>
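        <p>The relation between magnification, pyramid level, and physical size can be made concrete with a short Python sketch. It assumes the level-0 resolutions quoted above (0.25 µm/pixel at 40X, 0.5 µm/pixel at 20X); the downsample factor of a pyramid level is treated as a given input (OpenSlide, for instance, exposes such factors through its level_downsamples property), and the function names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
# Level-0 resolution in micrometres per pixel, as quoted in the text.
BASE_MPP = {40: 0.25, 20: 0.5}


def mpp_at_level(magnification, downsample=1.0):
    """Micrometres per pixel at a pyramid level with the given downsample factor."""
    return BASE_MPP[magnification] * downsample


def area_px_to_um2(area_px, magnification, downsample=1.0):
    """Convert an area measured in pixels to square micrometres."""
    mpp = mpp_at_level(magnification, downsample)
    return area_px * mpp ** 2


# The same pixel area corresponds to four times more tissue at 20X than at 40X.
print(area_px_to_um2(400, 40))  # level 0 of a 40X scan
print(area_px_to_um2(400, 20))  # level 0 of a 20X scan
```
]]></preformat>
        <p>One consequence is that any size criterion expressed in pixels describes different physical sizes at 20X and 40X, so pixel-based settings are best interpreted together with the scan magnification.</p>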
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Architecture and algorithm details</title>
        <p>The machine vision system is designed as a modular pipeline for processing digitized histological
slides to compute proliferation indices. The libraries used in the pipeline include OpenSlide<sup>7</sup> for
opening and managing image files, Scikit-image<sup>8</sup> for creating masks to remove areas without biological
tissue, DeepZoom<sup>9</sup> for tile extraction, SciPy<sup>10</sup> and Scikit-learn<sup>11</sup> for stain separation and other
image processing tasks such as color deconvolution and normalization, and finally StarDist for image
segmentation. The workflow of the machine vision pipeline is shown in Figure 1.</p>
        <p>A. Tile extraction. The images produced by the Aperio scanner are ultra-high-resolution, often too
large to be processed in a single operation. Tiling divides the original image into smaller, more
manageable sections, called "tiles." Each tile retains the
original resolution, allowing for detailed analysis without losing important information. During tile
extraction, non-biological regions are removed, focusing only on relevant tissue regions. This step is
crucial as it significantly reduces computational load, enabling subsequent algorithms to operate more
efficiently. To prevent nuclei from being cut at tile boundaries and causing segmentation errors, an
overlap method was implemented. However, the final results indicate that this effect does not significantly
impact the Ki-67 computation.</p>
        <p><sup>5</sup> All 13 patients provided written informed consent (consenso informato), allowing the use of their sensitive data for research
purposes. In compliance with ethical guidelines, all images were anonymised, ensuring that only the hospital’s medical staff
had access to patients’ identities and sensitive information.</p>
        <p><sup>6</sup> https://www.leicabiosystems.com/sites/default/files/2020-10/Aperio_AT2_Brochure_USA.pdf</p>
        <p><sup>7</sup> https://openslide.org/api/python/</p>
        <p><sup>8</sup> https://scikit-image.org/docs/stable/api/skimage.html</p>
        <p><sup>9</sup> https://github.com/openzoom/deepzoom.py</p>
        <p><sup>10</sup> https://scipy.org/</p>
        <p><sup>11</sup> https://scikit-learn.org/stable/index.html</p>
        <p>B. Image pre-processing: stain separation and normalization. Histological images can vary
significantly in coloration due to differences in staining protocols, acquisition conditions, or scanner
types. Tile normalization standardizes image coloration, ensuring comparability across samples and
preventing color variations from affecting automated analysis [53]. This process relies on a reference
image representing the desired chromatic standard, adjusting each tile to match it as closely as possible
by correcting brightness, contrast, and hue. For this work, we used the normalization method as in
[54]. The result is a dataset with uniform chromatic properties, improving the reliability of subsequent
analyses. Histological images are typically stained with specific dyes to highlight different cellular
components. Hematoxylin stains cell nuclei in blue or purple, while eosin colors the cytoplasm,
connective tissue, and other structures in shades of pink. Ki-67 selectively stains proliferating cell nuclei
in brown, making them identifiable within the histological sample (see examples in Figure 2).
Stain separation, or color deconvolution, isolates the chromatic information associated with each dye.
This technique is particularly useful for converting images into grayscale representations, where each
channel corresponds to a single stain.</p>
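        <p>Steps A and B can be sketched in a few lines of Python with NumPy. The sketch is not the pipeline's implementation: the tile size of 512 pixels and the overlap of 32 pixels are hypothetical values, and the stain separation uses the fixed hematoxylin, eosin, and DAB stain vectors published by Ruifrok and Johnston rather than the SciPy and Scikit-learn based procedure described above. It shows how overlapping tile origins can be laid out and how optical densities are unmixed into one grayscale channel per stain.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

# Rows: hematoxylin, eosin, DAB stain vectors in RGB optical-density space
# (Ruifrok and Johnston values; an assumption, not the pipeline's calibration).
RGB_FROM_HED = np.array([[0.65, 0.70, 0.29],
                         [0.07, 0.99, 0.11],
                         [0.27, 0.57, 0.78]])
HED_FROM_RGB = np.linalg.inv(RGB_FROM_HED)


def tile_origins(width, height, tile=512, overlap=32):
    """Top-left corners of overlapping tiles that cover a width x height region."""
    step = tile - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the tile size")
    xs = range(0, max(width - overlap, 1), step)
    ys = range(0, max(height - overlap, 1), step)
    return [(x, y) for y in ys for x in xs]


def separate_stains(rgb):
    """Color deconvolution of an RGB image with float values in (0, 1].

    Returns an array of the same shape whose last axis holds the
    hematoxylin, eosin, and DAB concentrations of each pixel.
    """
    rgb = np.clip(np.asarray(rgb, dtype=float), 1e-6, 1.0)
    optical_density = -np.log10(rgb)  # Beer-Lambert law
    return optical_density @ HED_FROM_RGB


# Synthetic check: a pixel stained with known hematoxylin and DAB amounts.
known = np.array([0.3, 0.0, 0.8])
pixel = 10.0 ** (-(known @ RGB_FROM_HED))
print(separate_stains(pixel.reshape(1, 1, 3))[0, 0])
print(tile_origins(1000, 600))
```
]]></preformat>
        <p>In this sketch the DAB channel plays the role of the Ki-67 grayscale image and the hematoxylin channel that of the counterstain image passed on to segmentation.</p>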
      </sec>
      <sec id="sec-3-3">
        <title>C. Nuclei segmentation and feature extraction</title>
        <p>After stain separation, each tile produces two
grayscale images: one highlighting Ki-67 staining, and the other emphasizing hematoxylin.
Segmentation is then performed to automatically identify cell nuclei, a crucial step for calculating the Ki-67 index
[55, 56]. The segmentation algorithms detect the darkest areas in the grayscale images, corresponding
to cell nuclei, either those stained with hematoxylin, appearing purple, or those marked by Ki-67,
appearing brown.</p>
        <p>The segmentation process was performed using StarDist, a deep learning-based algorithm designed
for the detection of star-convex objects, particularly useful for segmenting cell nuclei in histological
images. StarDist utilizes a convolutional neural network (CNN) to predict the shape and location of
nuclei, allowing for precise delineation of overlapping or irregularly shaped nuclei through star-convex
polygons. This method has shown superior performance over traditional segmentation techniques in
various biological image analysis tasks [57]. The code was customized to suit our processing pipeline
and evaluated on the DataDev dataset, taking advantage of the pre-trained neural network weights
included in the code distribution. Validation was then conducted on the DataVal dataset by comparing
the automated results to manual annotations, demonstrating the algorithm’s accuracy and reliability for
segmenting cell nuclei in this study (see Sec. 4). The implementation used in this work is based on the
open-source version of the code<sup>12</sup>.</p>
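        <p>To make the notion of a star-convex polygon concrete, the following Python sketch rebuilds a nucleus outline from a centre point and a list of radial distances measured along equally spaced rays, which is the kind of shape description such a network predicts. It is an illustration of the representation only and does not call the StarDist library; the function names and the choice of 32 rays are assumptions.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math


def star_convex_polygon(center, distances):
    """Vertices of a polygon defined by radial distances along equally spaced rays."""
    cx, cy = center
    n = len(distances)
    return [
        (cx + d * math.cos(2 * math.pi * k / n),
         cy + d * math.sin(2 * math.pi * k / n))
        for k, d in enumerate(distances)
    ]


def polygon_area(vertices):
    """Polygon area from the shoelace formula."""
    area = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        area += x0 * y1 - x1 * y0
    return abs(area) / 2.0


# A roughly circular nucleus of radius 10 pixels described by 32 rays.
outline = star_convex_polygon((100.0, 100.0), [10.0] * 32)
print(len(outline), polygon_area(outline))
```
]]></preformat>
        <p>Because every vertex is reached by a straight ray from the centre, touching nuclei can be described by separate polygons even when their pixels are adjacent, which is why the representation suits crowded histological images.</p>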
        <p>The segmentation algorithm creates a table containing various morphological properties of the
detected nuclei, including area, perimeter, ellipticity, and color intensity. These data are used for filtering,
selecting only nuclei with specific characteristics. Filtering is essential to focus on relevant cells, as
without it, segmentation would produce an inaccurate count, mistakenly identifying all cells in the slide
instead of only those of interest. To ensure the algorithm correctly distinguishes and counts only the
relevant cells, thresholds were applied based on three properties: area, staining intensity, and ellipticity.</p>
        <p>D. Proliferation index calculation. To calculate the Ki-67 index, the property tables generated
by the segmentation algorithm were used to automatically count the cells in the segmented images.
Specifically, the numerator of the formula (see Eq. (1)) was obtained by counting the segmented elements
from the grayscale image derived from the Ki-67 stain separation. Similarly, the denominator was
calculated by counting the segmented elements from the grayscale image derived from the hematoxylin
stain separation. For examples of nuclei segmentation and index computation, see Figure 2.</p>
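        <p>The counting step can be sketched as follows in Python. The sketch assumes that each tile yields two already filtered property tables, one from the Ki-67 channel and one from the hematoxylin channel, represented here as plain lists of records; the slide-level index is then obtained by pooling the counts over all tiles. The data layout and the function name are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def slide_ki67_index(tiles):
    """Pool per-tile nucleus counts into a slide-level Ki-67 index (percent).

    tiles: iterable of (ki67_table, hematoxylin_table) pairs, where each
    table is a list with one record per segmented nucleus.
    """
    positives = 0
    total = 0
    for ki67_table, hematoxylin_table in tiles:
        positives += len(ki67_table)      # numerator of Eq. (1)
        total += len(hematoxylin_table)   # denominator of Eq. (1)
    if total == 0:
        raise ValueError("no nuclei were segmented in the hematoxylin channel")
    return 100.0 * positives / total


# Two hypothetical tiles; each record stands for one segmented nucleus.
tile_a = ([{"area": 300}] * 3, [{"area": 280}] * 10)
tile_b = ([{"area": 310}] * 1, [{"area": 290}] * 10)
print(slide_ki67_index([tile_a, tile_b]))
```
]]></preformat>
        <p>Pooling counts before dividing weights each tile by the number of nuclei it contains, so sparsely populated tiles do not dominate the slide-level value.</p>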
        <p>This system was tested on the High-Performance Computing (HPC) cluster at Pont-Saint-Martin,
utilizing GPU nodes for optimized processing. Execution times for pre-processing and segmentation
were recorded, with an average of 8 minutes for pre-processing and 16 minutes for segmentation and
Ki-67 extraction, both deemed suitable for clinical workflows.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.3. CAD implementation</title>
        <p>The CAD system is implemented as a web-based application accessible via a secure VPN, restricted
to authorized personnel authenticated through access credentials. It ensures proper data handling
in compliance with current regulations, including GDPR<sup>13</sup>. The system is deployed on an HPC
cluster and structured using Singularity/Apptainer containers. Web apps were developed using
Dash-Plotly<sup>14</sup>, a Python library and framework for publishing and visualizing data through a
website. The backend of Dash is built on Flask<sup>15</sup>, a lightweight web application framework, while the
frontend consists of a static web page populated with React components, which provide interactivity
to users without the need for a full-page reload<sup>16</sup>.</p>
        <p>The application features two dashboards: one for image upload and high-resolution preview, and
another interactive viewer displaying processed results, including nuclear segmentation metrics and
Ki-67 values. Upon image upload, metadata is stored in a backend database, and the file is saved on
the HPC. Processing is automated, generating segmentation and proliferation data. The interactive
viewer enables clinicians to select regions of interest (ROIs) and obtain real-time Ki-67 index updates,
facilitating focused analysis. Designed to integrate seamlessly into existing clinical workflows, the
interface provides clear visualizations and actionable data to support clinical decision-making.</p>
        <p><sup>13</sup> https://www.edps.europa.eu/data-protection/our-work/subjects/health_en</p>
        <p><sup>14</sup> https://dash.plotly.com/</p>
        <p><sup>15</sup> https://flask.palletsprojects.com/en/stable/</p>
        <p><sup>16</sup> In more detail, Dash-Plotly handles the frontend and data exchange between the frontend and backend. The lifecycle of
each application is managed by uWSGI, an application server that provides load balancing and health check functionalities.
To complete the stack, Nginx was chosen as the web server for its native support of the uwsgi protocol. Nginx also provides
data compression, offloading both the network and the Python application.</p>
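        <p>The ROI-based update described above can be illustrated with a small Python sketch. It assumes that the processed results store a centroid for every segmented nucleus and that an ROI is an axis-aligned rectangle in slide coordinates; the record fields "x" and "y", the rectangle format, and the function names are hypothetical and do not reflect the application's internal data model.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def nuclei_in_roi(table, roi):
    """Records whose centroid lies inside roi = (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = roi
    return [
        n for n in table
        if x_min <= n["x"] <= x_max and y_min <= n["y"] <= y_max
    ]


def roi_ki67_index(ki67_table, hematoxylin_table, roi):
    """Ki-67 index (percent) restricted to one ROI, or None if the ROI is empty."""
    total = len(nuclei_in_roi(hematoxylin_table, roi))
    if total == 0:
        return None
    positives = len(nuclei_in_roi(ki67_table, roi))
    return 100.0 * positives / total


# Hypothetical centroids in pixel coordinates.
ki67 = [{"x": 10, "y": 10}, {"x": 250, "y": 40}]
hema = [{"x": 12, "y": 8}, {"x": 30, "y": 30}, {"x": 60, "y": 90},
        {"x": 80, "y": 20}, {"x": 240, "y": 45}]
print(roi_ki67_index(ki67, hema, (0, 0, 100, 100)))
print(roi_ki67_index(ki67, hema, (500, 500, 600, 600)))
```
]]></preformat>
        <p>Because the nuclei are segmented once for the whole slide, changing the ROI only requires re-filtering the stored tables, which is what makes near-immediate index updates feasible in an interactive viewer.</p>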
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and validation</title>
      <p>The algorithm aims to compute the Ki-67 index for entire whole-slide images. DataDev contains nuclei
annotated for Ki-67 positivity, extracted as cropped regions from externally sourced slides with varying
sizes and quality. To prevent overfitting, we did not retrain the StarDist model on DataDev. Instead, we
used DataDev to calibrate nucleus detection by tuning post-segmentation filters on mean intensity (&gt;
80 on a 0–255 scale), area (150–4000 pixels), and ellipticity (&gt; 0.2). These thresholds were chosen based
on descriptive statistics, qualitative inspection of segmentation results, and expert pathologist feedback.</p>
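      <p>The post-segmentation filter can be written as a simple predicate over the property table. The Python sketch below uses the thresholds reported above (mean intensity above 80 on a 0–255 scale, area between 150 and 4000 pixels, ellipticity above 0.2); the record field names and the treatment of the area bounds as inclusive are assumptions made for illustration.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def keep_nucleus(nucleus, min_intensity=80.0, min_area=150, max_area=4000,
                 min_ellipticity=0.2):
    """True if a segmented object passes all three calibrated thresholds."""
    return (
        nucleus["mean_intensity"] > min_intensity
        and min_area <= nucleus["area"] <= max_area
        and nucleus["ellipticity"] > min_ellipticity
    )


def filter_table(table, **thresholds):
    """Keep only the records that satisfy keep_nucleus."""
    return [n for n in table if keep_nucleus(n, **thresholds)]


# Hypothetical property table for one tile.
table = [
    {"mean_intensity": 120.0, "area": 600, "ellipticity": 0.7},   # kept
    {"mean_intensity": 60.0, "area": 600, "ellipticity": 0.7},    # too faint
    {"mean_intensity": 120.0, "area": 90, "ellipticity": 0.7},    # too small
    {"mean_intensity": 120.0, "area": 600, "ellipticity": 0.1},   # too elongated
]
print(len(filter_table(table)))
```
]]></preformat>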
      <p>The system’s performance was evaluated using the DataVal dataset. Nuclear segmentation outputs
were assessed both quantitatively and qualitatively by comparing segmented nuclei and their statistical
properties per tile. Visual inspection by expert pathologists at the hospital confirmed accurate
identification of most nuclei, although some misclassifications occurred with non-tumoral structures (e.g., red
blood cells, lymphocytes). Introducing morphological thresholds significantly improved segmentation
accuracy by excluding these artifacts. Importantly, the DataVal dataset was used exclusively for
evaluation, without any involvement in training or optimization of the machine vision system.</p>
      <p>The Ki-67 index computation was validated by comparing system-generated values with those
assigned by pathologists (Figure 3). Due to the limited sample size (N = 13), we compared automated
and annotated Ki-67 indices using a ±5% discrepancy threshold, selected in consultation with expert
pathologists. This threshold reflects a clinically meaningful margin and aligns with inter-laboratory
variability and IKWG guidelines [27, 28]. While the automated calculation closely aligned with expert
assessments in most cases, two out of 13 images exhibited discrepancies exceeding 5%: patients 4 (id:
I2304166) and 5 (id: I2302878). In patient 4, clinical values remain high (around 70%), while the algorithm
tends to underestimate Ki-67. For patient 5, the clinical value is approximately 10%, whereas algorithm
estimates are 28% at 40X magnification and 37% at 20X. This prompted an exploration of alternative
region selection methods, such as hotspot and bootstrap-based approaches, to refine Ki-67 computation.
Error analysis revealed that a major source of error was the inclusion of regions that are not clinically
relevant or contain tumor cells which should not be counted for Ki-67. This is consistent with the
performance variability depending on whether regions of interest are preselected, as demonstrated
in reported studies using tools like QuPath (see Sec. 2.1). To address this, we kept the automated cell
segmentation and counting system active for index computation while integrating a CAD tool for
selecting relevant areas. This web application enables clinicians to upload images, define regions of
interest, and receive index computations within minutes.</p>
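      <p>The comparison rule used in this validation can be sketched as below. The Ki-67 index is the percentage of Ki-67-positive nuclei among all counted nuclei. The function names are hypothetical, and two assumptions are made: the ±5% threshold is read as an absolute difference in percentage points, and a difference of exactly 5 points counts as agreement. The nucleus counts in the usage note are invented; only the patient 5 percentages come from the text.</p>
      <preformat>
```python
def ki67_index(positive_nuclei, total_nuclei):
    """Percentage of Ki-67-positive nuclei among all counted nuclei."""
    if total_nuclei == 0:
        raise ValueError("no nuclei counted")
    return 100.0 * positive_nuclei / total_nuclei


def within_tolerance(automated, reference, tolerance=5.0):
    """True when two indices differ by at most `tolerance` percentage points."""
    return tolerance >= abs(automated - reference)


# Patient 5 from the text: clinical value about 10%,
# automated estimates 28% (40X) and 37% (20X).
patient5_flags = [within_tolerance(value, 10.0) for value in (28.0, 37.0)]
```
      </preformat>
      <p>For example, <monospace>ki67_index(30, 200)</monospace> gives 15.0, and both entries of <monospace>patient5_flags</monospace> are <monospace>False</monospace>, matching the discrepancy reported for that patient.</p>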
      <sec id="sec-4-1">
        <title>4.1. Clinical validation and impact</title>
        <p>The validation of the system supports its clinical utility for accurate,
reproducible, and efficient quantification of the Ki-67 index, a key biomarker in breast cancer diagnosis and
prognosis. In a validation that compared automated Ki-67 calculations with expert pathologist
annotations, the automated index fell within the ±5% threshold for 11 of the 13 images, supporting its
use in assisting routine pathological assessments. The system was ultimately deployed to the hospital via
a dedicated web application. This web app, developed as a CAD tool, enables only authenticated clinicians
at Parini hospital to access the
system remotely via VPN, providing an interactive platform for uploading, processing, and visualizing
histopathological images. The application allows real-time visualization of whole-slide images, displays
segmentation results, and provides a detailed breakdown of the Ki-67 index for each analyzed region
(for an example, see Figure 4).</p>
        <p>By automating and standardizing Ki-67 quantification, the tool significantly reduces inter-observer
variability and minimizes diagnostic subjectivity, enhancing diagnostic confidence, particularly in
borderline cases. Moreover, it improves workflow efficiency, allowing pathologists to focus on more
complex evaluations rather than time-consuming manual quantifications. The validation process also
identified and addressed challenges such as misclassification of non-tumoral cells and discrepancies
in region selection. To enhance accuracy, the system incorporates morphological thresholds to refine
segmentation and enables manual selection of regions of interest (ROI) for Ki-67 calculation, ensuring
alignment with clinical best practices. Additionally, the web app’s structured data storage and retrieval
system facilitates longitudinal analysis and integration with hospital databases, potentially supporting
further predictive analytics and research applications.</p>
        <p>The deployment of this web-based CAD tool represents a step toward integrating AI-driven solutions
into clinical practice, enabling real-time, evidence-based decision-making. While the system
demonstrated robust performance, ongoing evaluations in clinical settings will further refine its capabilities
and address limitations. Future improvements may include AI-assisted ROI selection and deep-learning
enhancements for more precise segmentation.</p>
      </sec>
      <sec id="sec-4-2">
        <title>5.1. Scientific contributions and interdisciplinary collaboration</title>
        <p>Our study demonstrates significant advances in AI-driven medical imaging [18, 19] by developing an
innovative CAD system, practically implemented within the clinical workflow, that automates
high-resolution histological analysis and precise quantification of the Ki-67 proliferation index [38, 12]. A
first validation against expert pathologist annotations confirmed its clinical utility, notably reducing
inter-observer variability and diagnostic subjectivity [42, 15]. Comprehensive validation of the system’s
impact will be a key focus of future work, which aims to systematically assess detection performance as
well as clinical utility and user experience. The system is designed to efficiently collect statistical results
and expert-selected regions, enabling future model retraining and optimization for the specific use case,
as well as more meaningful performance evaluations. Moreover, the deployment of a secure, web-based
CAD tool facilitates real-time access and analysis, further bridging the gap between technological
innovation and clinical practice. This work contributes to the expanding research on CAD systems
and exemplifies the power of interdisciplinary collaboration in addressing complex clinical challenges,
particularly in resource-limited settings [47].</p>
      </sec>
      <sec id="sec-4-3">
        <title>5.2. Economic and social benefits</title>
        <p>This pilot study underscores the potential for technology transfer and innovative research to drive
local economic growth in mountain regions. By focusing on the computation of the Ki-67 proliferation
index, our CAD platform not only enhances diagnostic accuracy but also lays the foundation for a
comprehensive digital pathology infrastructure, one that can facilitate secure image sharing among
clinicians and spur the development of new digital health services. In line with European regulations
and ethical guidelines governing AI in clinical settings [46], our system was developed following best
practices that ensure both scientific integrity and clinical reliability. Furthermore, the implementation
of such cutting-edge technology can foster job creation and stimulate local markets while significantly
improving the quality of care, community resilience, and sustainable development in mountain territories
like the Aosta Valley.</p>
      </sec>
      <sec id="sec-4-4">
        <title>5.3. Clinical and regional impact</title>
        <p>The CAD tool significantly enhances diagnostic accuracy and supports local clinicians in
under-resourced mountain regions by fostering a collaborative environment where machine-based analysis
and expert human judgment work synergistically [43]. Its streamlined platform enables easy sharing of
digital images, facilitating second opinions and external consultation, which is particularly valuable
in small, low-population areas where clinicians may encounter fewer complex cases. This system not
only improves workflow efficiency but also ensures that even rare or challenging diagnostic scenarios
receive expert review, ultimately boosting the quality of care and promoting regional resilience [50].</p>
      </sec>
      <sec id="sec-4-5">
        <title>5.4. Future prospects</title>
        <p>Future developments of the CAD system will focus on scalability and on extending its diagnostic capabilities
to case studies beyond breast cancer, e.g., lung or prostate cancer. One promising direction
is the integration of additional biomarkers, such as the evaluation of PD-L1 expression in tumor
cells, which plays a critical role in immunotherapy decision-making. Incorporating other imaging
modalities and advanced deep-learning techniques could further enhance the system’s clinical utility,
transforming it into a comprehensive digital pathology platform. Strengthening collaborations between
local institutions, research organizations, and industry partners will be essential to driving innovation,
fostering technological advancements, and promoting sustainable healthcare solutions. These efforts will
not only improve diagnostic accuracy and efficiency but also contribute to the long-term development
of digital health services in remote and underserved regions.</p>
        <p>In parallel, a more comprehensive evaluation is planned. This will involve a systematic assessment
of the system’s performance in comparison with other software solutions and expert annotations. The
ongoing collection of data and results through use of the CAD tool will enable this evaluation and also
support future model refinement, contributing to the scientific advancement of automated Ki-67
quantification.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>6. Conclusion</title>
      <p>This study presented a machine vision CAD tool designed to compute the Ki-67 index from digitized
histopathological images. Developed as a web-based platform, the system facilitates seamless integration
into clinical workflows, enabling automated and standardized quantification. Validation against
hospital-provided images demonstrated strong concordance with expert assessments, confirming its reliability as
a diagnostic support tool. Beyond its clinical impact, the platform fosters interdisciplinary collaboration
and exemplifies how AI-driven innovations can enhance healthcare accessibility in mountainous
regions. By promoting digital pathology, second-opinion consultations, and efficient diagnostics, this
initiative supports both scientific progress and regional healthcare development. Continued research
and collaboration will be key to expanding the system’s capabilities and validating its results, ensuring its
long-term impact in both local and broader medical and scientific contexts.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The OAVdA<sup>17</sup> is managed by the Fondazione Clément Fillietroz-ONLUS, which is supported by the
Regional Government of the Aosta Valley, the Town Municipality of Nus and the “Unité des Communes
valdôtaines Mont-Émilius”. We acknowledge that all simulations, machine vision development, and
testing were performed on the CMP3@vda - 5000genomi project High Performance Computing (HPC)
cluster, managed by Engineering D.HUB. 5000genomi@VdA is a scientific project that has enabled
the creation of a new research Center dedicated to Personalized, Preventive and Predictive Medicine
(CMP3VdA) for neurodevelopmental, neurodegenerative and oncological diseases. 5000genomi@VdA
is carried out by a research consortium led by IIT-Istituto Italiano di Tecnologia (Italian Institute of
Technology), comprising Università della Valle d’Aosta, Città della Salute e della Scienza di Torino,
Fondazione Clément Fillietroz-ONLUS, and Engineering D.HUB.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>This document was reviewed using generative AI strictly for grammar and text revision, ensuring
clarity and coherence without altering the original content or analysis.</p>
      <p>report (oscar) procedure performed in a multidisciplinary one-stop breast clinic, Cancers 15 (2023)
4967. URL: http://dx.doi.org/10.3390/cancers15204967. doi:10.3390/cancers15204967.</p>
      <p>[5] S. J. Schnitt, T. W. Jacobs, Pathology of breast cancer: a review, The American Journal of Surgical
Pathology 43 (2019) 757–766.</p>
      <p>[6] C. W. Elston, I. O. Ellis, Pathological features of breast cancer, European Journal of Cancer 27
(1991) 123–130.</p>
      <p>[7] P. A. Carney, D. L. Miglioretti, B. C. Yankaskas, et al., Individual and combined effects of age, breast
density, and hormone replacement therapy use on the accuracy of screening mammography, Annals
of Internal Medicine 138 (2003) 168–175. doi:10.7326/0003-4819-138-3-200302040-00008.</p>
      <p>[8] S. R. Cummings, et al., The future of cancer screening: biomarkers and early detection, Cancer
Epidemiology, Biomarkers and Prevention 26 (2017) 1–10. doi:10.1158/1055-9965.EPI-16-0603.</p>
      <p>[9] N. L. Henry, D. F. Hayes, Cancer biomarkers, Molecular Oncology 6 (2012) 140–146.
doi:10.1016/j.molonc.2012.01.010.</p>
      <p>[10] J. K. C. Chan, The wonderful colors of the hematoxylin-eosin stain in diagnostic surgical pathology,
Int. J. Surg. Pathol. 22 (2014) 12–32.</p>
      <p>[11] C. R. Taylor, S.-R. Shi, B. Chaiwun, L. Young, S. Imam, R. J. Cote, Strategies for improving the
immunohistochemical staining of various intranuclear prognostic markers in formalin-paraffin
sections: Androgen receptor, estrogen receptor, progesterone receptor, p53 protein,
proliferating cell nuclear antigen, and ki-67 antigen revealed by antigen retrieval techniques, Human
Pathology 25 (1994) 263–270. URL: http://dx.doi.org/10.1016/0046-8177(94)90198-8.
doi:10.1016/0046-8177(94)90198-8.</p>
      <p>[12] A. C. Dufour, A. H. Jonker, J.-C. Olivo-Marin, Deciphering tissue morphodynamics using bioimage
informatics, Philosophical Transactions of the Royal Society B: Biological Sciences 372 (2017)
20150512. URL: http://dx.doi.org/10.1098/rstb.2015.0512. doi:10.1098/rstb.2015.0512.</p>
      <p>[13] E. Meijering, A. E. Carpenter, H. Peng, F. A. Hamprecht, J.-C. Olivo-Marin, Imagining the future
of bioimage analysis, Nature Biotechnology 34 (2016) 1250–1255. URL: http://dx.doi.org/10.1038/nbt.3722.
doi:10.1038/nbt.3722.</p>
      <p>[14] A. Kan, Machine learning applications in cell image analysis, Immunology &amp; Cell Biology 95
(2017) 525–530. URL: http://dx.doi.org/10.1038/icb.2017.16. doi:10.1038/icb.2017.16.</p>
      <p>[15] G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. van der Laak,
B. van Ginneken, C. I. Sánchez, A survey on deep learning in medical image analysis, Medical
Image Analysis 42 (2017) 60–88. URL: http://dx.doi.org/10.1016/j.media.2017.07.005.
doi:10.1016/j.media.2017.07.005.</p>
      <p>[16] S. A. Alowais, S. S. Alghamdi, N. Alsuhebany, T. Alqahtani, A. I. Alshaya, S. N. Almohareb,
A. Aldairem, M. Alrashed, K. Bin Saleh, H. A. Badreldin, M. S. Al Yami, S. Al Harbi, A. M. Albekairy,
Revolutionizing healthcare: the role of artificial intelligence in clinical practice, BMC Med. Educ.
23 (2023) 689.</p>
      <p>[17] S. S. Alahmari, D. Goldgof, L. O. Hall, P. R. Mouton, A review of nuclei detection and segmentation
on microscopy images using deep learning with applications to unbiased stereology counting,
IEEE Transactions on Neural Networks and Learning Systems 35 (2024) 7458–7477.
URL: http://dx.doi.org/10.1109/TNNLS.2022.3213407. doi:10.1109/tnnls.2022.3213407.</p>
      <p>[18] A. Esteva, et al., Dermatologist-level classification of skin cancer with deep neural networks,
Nature 542 (2017) 115–118.</p>
      <p>[19] B. E. Bejnordi, et al., Diagnostic assessment of deep learning algorithms for detection of lymph
node metastases in women with breast cancer, JAMA 318 (2017) 2199–2210.</p>
      <p>[20] M. C. U. Cheang, S. K. Chia, D. Voduc, et al., Ki67 index is a strong predictor of breast cancer survival
in the ncic ctg ma.21 trial, Breast Cancer Research 11 (2009) R87. doi:10.1186/bcr2425.</p>
      <p>[21] S. E. Pinder, et al., The prognostic value of ki67 in breast cancer: a meta-analysis, Breast Cancer
Research and Treatment 81 (2003) 253–266.</p>
      <p>[22] M. Dowsett, et al., Prognostic value of ki67 in breast cancer: a systematic review of the literature,
Breast Cancer Research 13 (2011) R105.</p>
      <p>[23] A. Goldhirsch, J. Ingle, R. Gelber, A. Coates, B. Thürlimann, H.-J. Senn, Thresholds for therapies:</p>
      <p>in: Proceedings of the IEEE International Conference on Computer Vision, 2013, pp. 964–971.</p>
      <p>[57] U. Schmidt, M. Weigert, C. Broaddus, G. Myers, Cell detection with star-convex polygons, in:
Medical image computing and computer assisted intervention–MICCAI 2018: 21st international
conference, Granada, Spain, September 16-20, 2018, proceedings, part II 11, Springer, 2018, pp.
265–273.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <collab>American Cancer Society</collab>
          ,
          <source>Breast Cancer Early Detection and Diagnosis</source>
          ,
          <year>2023</year>
          . Retrieved from https://www.cancer.org.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>N. F.</given-names>
            <surname>Boyd</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Martin</surname>
          </string-name>
          , et al.,
          <article-title>Mammographic density and the risk and detection of breast cancer</article-title>
          ,
          <source>New England Journal of Medicine</source>
          <volume>356</volume>
          (
          <year>2007</year>
          )
          <fpage>227</fpage>
          -
          <lpage>236</lpage>
          . doi:10.1056/NEJMoa062790.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ciatto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Houssami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bernardi</surname>
          </string-name>
          , et al.,
          <article-title>Integration of 3d digital mammography with tomosynthesis for population breast-cancer screening (storm): A prospective comparison study</article-title>
          ,
          <source>The Lancet Oncology</source>
          <volume>14</volume>
          (
          <year>2013</year>
          )
          <fpage>583</fpage>
          -
          <lpage>589</lpage>
          . doi:10.1016/S1470-2045(13)70134-7.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>V.</given-names>
            <surname>Suciu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>El Chamieh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Soufan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-C.</given-names>
            <surname>Mathieu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Balleyguier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Delaloge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Balogh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-Y.</given-names>
            <surname>Scoazec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chevret</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Vielh</surname>
          </string-name>
          ,
          <article-title>Real-world diagnostic accuracy of the on-site cytopathology advance</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>