<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Intelligent information system for knowledge integration into artificial intelligence models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Oleksandr Chaban</string-name>
          <email>chabanolek@khmnu.edu.ua</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eduard Manziuk</string-name>
          <email>manziuk.e@khmnu.edu.ua</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pavlo Radiuk</string-name>
          <email>radiukp@khmnu.edu.ua</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elena Zaitseva</string-name>
          <email>elena--14@ukr.net</email>
          <email>elena.zaitseva@fri.uniza.sk</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olena</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Khmelnytskyi Infectious Diseases Hospital</institution>
          ,
          <addr-line>17, Skovorody str., Khmelnytskyi, 29008</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Khmelnytskyi National University</institution>
          ,
          <addr-line>11, Institutes str., Khmelnytskyi, 29016</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Zilina University</institution>
          ,
          <addr-line>Univerzitná 8215, 010 26 Žilina</addr-line>
          ,
          <country country="SK">Slovakia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>The rapid proliferation of artificial intelligence in medical imaging is currently hindered by a significant disconnect between high-performing research models and the rigorous demands of clinical environments. Key challenges include data interoperability issues between research formats and clinical standards, hardware dependencies that limit portability, and the opaque “black-box” nature of deep learning models which erodes clinician trust. In this work, we propose a comprehensive intelligent information system designed to bridge this gap by unifying standards-compliant data ingestion, accelerated inference, and knowledge-infused reasoning into a single auditable workflow. Our approach integrates a robust DICOM and NIfTI ingestion pipeline with built-in anonymization, a hardware-agnostic ONNX inference engine, and a novel graph-based classification module that explicitly models anatomical relationships. Evaluated on the public ACDC benchmark, the proposed system demonstrates superior performance, with the segmentation module achieving a mean Dice Similarity Coefficient of 0.939 and the knowledge-integrated classifier attaining a diagnostic accuracy of 94.0%. The significant conclusion of this study is that by systematically integrating privacy controls, hardware portability, and graph-based knowledge representation, it is possible to create a deployment-ready AI blueprint that is both scientifically reproducible and clinically trustworthy.</p>
      </abstract>
      <kwd-group>
        <kwd>Medical imaging</kwd>
        <kwd>cardiac MRI</kwd>
        <kwd>knowledge integration</kwd>
        <kwd>graph neural networks</kwd>
        <kwd>DICOM interoperability</kwd>
        <kwd>ONNX runtime</kwd>
        <kwd>information system</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The field of artificial intelligence (AI) for medical imaging has witnessed exponential growth in
recent years, driven by the advent of deep learning architectures that often surpass human-level
performance in specific diagnostic tasks. However, a substantial chasm remains between the
experimental success of these models in controlled research environments and their practical
utility in real-world clinical settings [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This discrepancy is primarily fueled by a trifecta of
systemic challenges: data heterogeneity, hardware fragmentation, and the interpretability crisis.
Clinical workflows heavily rely on the Digital Imaging and Communications in Medicine (DICOM)
standard, a complex protocol governing the storage and transmission of medical data [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Conversely, the research community predominantly utilizes the Neuroimaging Informatics
Technology Initiative (NIfTI) format due to its simplified handling of volumetric geometry and
orientation [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The friction generated by converting between these formats often leads to silent
geometric errors, metadata loss, and privacy breaches, thereby impeding the seamless integration
of AI tools into hospital picture archiving and communication systems.
      </p>
      <p>
        Furthermore, the deployment landscape is complicated by hardware heterogeneity. Research
models are typically trained on high-end NVIDIA GPUs using frameworks like PyTorch, but
clinical workstations vary widely in their computational capabilities, ranging from standard CPUs
to GPUs from different vendors. This necessitates an inference strategy that is both portable and
performant. The Open Neural Network Exchange (ONNX) format and its associated Runtime
engine offer a solution by providing an intermediate representation that can be executed across
diverse hardware backends [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. However, wrapping these technologies into a cohesive system
that manages dependencies without imposing vendor lock-in remains a significant engineering
hurdle.
      </p>
      <p>
        Perhaps the most critical barrier to adoption is the “black-box” nature of modern deep neural
networks. In high-stakes medical decision-making, accuracy alone is insufficient; clinicians require
transparency and justification for algorithmic predictions. While purely data-driven models like the
U-Net [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and its self-configuring variant nnU-Net [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] have established strong baselines for
segmentation, they often lack the ability to incorporate explicit medical knowledge or reasoning.
Recent advances in transformers [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and hybrid architectures [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ] push performance boundaries
but often at the cost of increased opacity. To address this, human-in-the-loop approaches and
explainable AI techniques are becoming essential components of trustworthy systems [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        The problem under consideration is the absence of a unified, end-to-end framework that
systematically addresses these disparate requirements, i.e., standards compliance, hardware
portability, and knowledge-infused reasoning [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], within a single reproducible pipeline. Current
solutions often address these issues in isolation, resulting in fragmented workflows that are
difficult to audit and deploy.
      </p>
      <p>In this work, we present a novel scientific contribution by architecting an intelligent
information system that integrates these components from the ground up. By combining a
standards-compliant ingestion module, a portable ONNX-based segmentation engine, and a graph
convolutional network (GCN) for structured reasoning, we provide a holistic solution to the
deployment gap.</p>
      <p>The goal of this study is to improve knowledge integration fidelity and downstream reasoning
accuracy by unifying standards-compliant ingestion, portable ONNX inference, and
graph-structured classification with calibration-aware evaluation. To achieve this goal, we present three
major contributions:</p>
      <p>1. A complete system architecture that spans DICOM/NIfTI ingestion with built-in
anonymization, accelerated ONNX inference, volumetric segmentation, and manifest-driven
data export for full reproducibility.</p>
      <p>2. A novel graph-based classification module, KI-GCN, derived from GCNs, which aggregates
structured features from segmentation masks and patient metadata to enhance diagnostic
reasoning. We also specify an optional multi-teacher knowledge distillation objective for
deploying compressed, efficient models.</p>
      <p>3. A deployment-oriented evaluation protocol that includes standard segmentation metrics,
probabilistic classification metrics, and critical calibration diagnostics, such as reliability
diagrams, to ensure model trustworthiness.</p>
      <p>The remainder of this paper is organized as follows. Section 2 reviews the state of the art in
medical imaging interoperability, segmentation architectures, and knowledge integration methods.
Section 3 details the proposed system architecture, including the formalization of the ingestion,
segmentation, and graph-based classification modules. Section 4 presents the experimental results
on the ACDC and M&amp;Ms-2 datasets, providing a comparative analysis against state-of-the-art
methods. Section 5 analyzes the implications of these findings, system throughput, and limitations.
Finally, Section 6 summarizes the contributions and outlines future directions.</p>
      <sec id="sec-1-1">
        <title>2. Related works</title>
        <p>Our research is situated at the intersection of interoperability standards, hardware acceleration,
advanced deep learning architectures for segmentation, and methods for knowledge integration.
This section reviews the state of the art in these domains to contextualize the proposed intelligent
information system.</p>
        <p>
          The foundation of any clinical AI system is its ability to handle standardized data formats. The
DICOM standard serves as the global lingua franca for medical imaging, with its Part 1 (PS3.1)
defining the overarching structure and semantic interoperability requirements for clinical data
exchange [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. While robust, DICOM’s complexity often poses challenges for direct consumption by
deep learning models. In the research domain, the NIfTI format has become the de facto standard
for 3D and 4D volumetric data, primarily due to its explicit encoding of affine geometry and
orientation fields, which are critical for preventing spatial misalignment during analysis [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. A key
task for our system is to seamlessly bridge these two standards, ensuring that data ingressed from
clinical sources (DICOM) retains its geometric integrity when converted for model consumption
(NIfTI-like tensors).
        </p>
        <p>
          Regarding model deployment and hardware acceleration, the ONNX Runtime engine has
emerged as a critical technology for ensuring portability. It abstracts the execution of model graphs
through a system of pluggable Execution Providers (EPs), allowing the same model file to run
efficiently on CPUs, NVIDIA GPUs via CUDA [13], and Windows-based GPUs via DirectML [14].
This flexibility is essential for clinical environments where hardware specifications cannot be
guaranteed. Recent comparative analyses have highlighted the necessity of such acceleration
frameworks to reduce inference latency and computational overhead in production settings [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
        <p>
          In the domain of medical image segmentation, the U-Net architecture remains the cornerstone,
featuring a symmetric encoder-decoder structure with skip connections that preserve spatial
information [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. Building on this, the nnU-Net framework demonstrated that automated
hyperparameter optimization and rigorous preprocessing are often more critical than architectural
novelty, consistently achieving state-of-the-art results on benchmarks like the Automated Cardiac
Diagnosis Challenge (ACDC) [
          <xref ref-type="bibr" rid="ref7 ref15">7, 15</xref>
          ]. More recently, the field has seen a surge in transformer-based
models [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], hybrid ConvNet-transformer architectures like MedNeXt [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], and specialized 3D
volume processors like UNETR++ [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. While these models offer performance gains, their
integration into explainable, standards-compliant workflows remains limited.
        </p>
        <p>To move beyond the “black-box” paradigm, integrating explicit knowledge is crucial. Graph
neural networks, particularly GCNs, provide a mathematical framework for modeling anatomical
structures as interconnected nodes, allowing for reasoning based on spatial and functional
relationships rather than just pixel intensities [16]. Advanced variants like graph attention
networks have further refined this approach by learning to weigh the importance of different
anatomical connections [17]. Additionally, knowledge distillation offers a pathway to compress
these complex reasoning capabilities into lightweight models suitable for deployment, transferring
insights from large “teacher” ensembles to efficient “student” models [18, 19]. Recent work in our
group has extended these concepts to adaptive multi-teacher distillation strategies, enhancing
robustness against domain shifts [20, 21].</p>
        <p>Finally, trust in AI systems is predicated not just on accuracy, but on calibration, i.e., the
alignment between predicted confidence and actual correctness. Methods such as reliability
diagrams and temperature scaling are essential for diagnosing and correcting miscalibration [22].
Emerging techniques like proximity-informed calibration continue to push the boundaries of model
reliability [23].</p>
        <p>The primary objective of this study is to synthesize these diverse technological threads into a
single, cohesive system. The main tasks to fulfill this objective are: (i) to design and implement a
modular software architecture for the end-to-end medical imaging workflow, (ii) to develop and
integrate a graph-based reasoning module that leverages segmentation outputs for improved
classification, and (iii) to validate the entire system’s performance and reproducibility on public
benchmark datasets.</p>
      </sec>
      <sec id="sec-1-2">
        <title>3. Methods</title>
        <p>We formalize the proposed intelligent information system as a sequence of interconnected
processing modules that execute a single, manifest-driven workflow. The system is designed to
transform raw medical imaging data into actionable, explainable diagnostic insights. Detailed
implementation specifics and user manuals are provided in the accompanying technical report [24].
In this section, we define the mathematical formulations and algorithmic logic underpinning the
core components: ingestion, segmentation, and graph-based knowledge integration.</p>
        <p>Let a dataset be denoted by 𝒟 = {(V_i, M_i, D_i)}_{i=1}^{N}, where for each of N patients, V_i represents
the input medical image volume (e.g., a cardiac MRI series), M_i represents the ground-truth
anatomical segmentation mask, and D_i represents the associated clinical diagnosis or classification
label. The system architecture, illustrated in Figure 1, processes these inputs through four distinct
stages: ingestion, segmentation, knowledge graph construction, and classification.</p>
        <p>The system processes each patient i’s data through the sequential pipeline formalized below.</p>
        <sec id="sec-1-2-1">
          <title>3.1. Standards-compliant ingestion and anonymization</title>
          <p>
            The ingestion module is responsible for the secure and accurate loading of medical data. It utilizes
the FO-DICOM library to parse DICOM series, ensuring that all slices are ordered correctly based
on the ‘ImagePositionPatient’ (0020,0032) tag. To adhere to privacy regulations (e.g., GDPR,
HIPAA), the module implements a configurable anonymization engine compliant with the DICOM
PS3.15 Basic Profile. Identifiers such as ‘PatientName’ and ‘PatientID’ are hashed or removed [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ].
          </p>
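          <p>The identifier-hashing step described above can be sketched as follows. This is an illustrative Python fragment operating on a plain dictionary in place of a parsed DICOM dataset (the system itself uses FO-DICOM); the action lists and the salt value are assumptions for illustration only.</p>

```python
import hashlib

# Illustrative PS3.15-style action lists (assumptions, not the system's config).
REMOVE = {"PatientBirthDate", "PatientAddress"}
HASH = {"PatientName", "PatientID"}

def anonymize(dataset, salt="site-secret"):
    """Return a copy with identifying tags removed or replaced by salted hashes."""
    out = {}
    for tag, value in dataset.items():
        if tag in REMOVE:
            continue  # drop the tag entirely
        if tag in HASH:
            digest = hashlib.sha256((salt + str(value)).encode()).hexdigest()
            out[tag] = digest[:16]  # short, irreversible surrogate identifier
        else:
            out[tag] = value  # non-identifying tags pass through unchanged
    return out

record = {"PatientName": "Doe^Jane", "PatientID": "12345",
          "PatientBirthDate": "19700101", "Modality": "MR"}
anon = anonymize(record)
```

          <p>In the actual pipeline, the same kind of action list would be applied to the attributes mandated by the DICOM PS3.15 Basic Profile.</p>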
          <p>
            For research data in NIfTI format, the system parses the affine header to normalize voxel
spacing and reorient the volume to the canonical RAS coordinate system [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ]. Input volumes are
then intensity-normalized to the range [0, 1] to stabilize downstream numerical optimization.
          </p>
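          <p>The intensity-normalization step can be sketched as a simple min-max rescaling. A flat list of intensities stands in for a full volume here, and the handling of constant volumes is one possible convention rather than the system's documented behavior.</p>

```python
def normalize_intensity(voxels):
    """Min-max rescale voxel intensities to the range [0, 1]."""
    lo, hi = min(voxels), max(voxels)
    scale = hi - lo
    if scale == 0:  # constant volume: map everything to 0.0 (one convention)
        return [0.0 for _ in voxels]
    return [(v - lo) / scale for v in voxels]
```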
        </sec>
        <sec id="sec-1-2-2">
          <title>3.2. Volumetric segmentation (SKIF-Seg)</title>
          <p>Synergistic Knowledge-Integrated Framework for Segmentation (SKIF-Seg) is the system’s
segmentation engine, designed for hardware portability via ONNX Runtime. The module accepts
the preprocessed volume V′_i and predicts a dense probability map P_i. The inference process is
abstracted to support multiple backends:</p>
          <p>CPU Execution Provider: Uses MKLDNN/OpenBLAS for optimized execution on standard
processors.</p>
          <p>CUDA Execution Provider: Leverages NVIDIA’s cuDNN and TensorRT libraries for
high-throughput GPU inference [13].</p>
          <p>DirectML Execution Provider: Provides vendor-agnostic GPU acceleration on Windows,
supporting AMD, Intel, and NVIDIA hardware [14].</p>
          <p>
            The output P_i ∈ [0, 1]^(H×W×D×C) represents the probability of each voxel belonging to one of C
anatomical classes. A final segmentation mask M̂_i is generated via an argmax operation.
          </p>
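          <p>The backend fallback logic described above can be sketched as an ordered preference list. The provider names follow ONNX Runtime conventions; the helper function itself is an illustrative assumption, not the system's actual API.</p>

```python
# Preference order mirrors the backends listed above: CUDA, then DirectML,
# then the CPU provider as the guaranteed fallback.
PREFERENCE = ["CUDAExecutionProvider", "DmlExecutionProvider", "CPUExecutionProvider"]

def select_providers(available):
    """Order the available execution providers by preference, CPU always last."""
    chosen = [ep for ep in PREFERENCE if ep in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")  # the CPU EP is always present
    return chosen
```

          <p>The resulting list can be passed as the providers argument when constructing an ONNX Runtime InferenceSession.</p>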
        </sec>
        <sec id="sec-1-2-3">
          <title>3.3. Graph-based classification (KI-GCN)</title>
          <p>To incorporate anatomical reasoning, we introduce the Knowledge Integration Graph
Convolutional Network (KI-GCN). We define a graph G=(V , E ) where nodes V correspond to
segmented structures (e.g., Left Ventricle, Myocardium, Right Ventricle) and edges E encode spatial
adjacency and functional connectivity.</p>
          <p>For each node v ∈ V, we compute a feature vector x_v derived from the segmentation mask M̂_i,
including volume, surface area, sphericity, and centroid displacement. The graph is processed using
spectral graph convolution layers defined by the following propagation rule:</p>
          <p>H^(l+1) = σ( D̃^(−1/2) Ã D̃^(−1/2) H^(l) W^(l) ),
(1)
where H^(l) is the feature matrix at layer l, Ã = A + I is the adjacency matrix with self-loops, D̃ is the
degree matrix of Ã, and W^(l) is the learnable weight matrix [16].</p>
          <p>This process allows the model to learn features that depend on the structural configuration of
the heart, rather than treating geometry as a flat vector. The final node embeddings are pooled to
form a global graph representation G, which is classified into diagnostic categories.</p>
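          <p>A minimal numerical sketch of this propagation rule, on a toy three-node heart graph (LV, myocardium, RV) with illustrative node features and identity weights, is:</p>

```python
import numpy as np

def gcn_layer(H, A, W):
    """One spectral graph convolution per Eq. (1), with ReLU as the nonlinearity."""
    A_tilde = A + np.eye(A.shape[0])          # adjacency with self-loops
    d = A_tilde.sum(axis=1)                   # node degrees of the augmented graph
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # symmetric normalization factor
    support = D_inv_sqrt @ A_tilde @ D_inv_sqrt @ H @ W
    return np.maximum(support, 0.0)           # ReLU

A = np.array([[0.0, 1.0, 0.0],   # LV adjacent to myocardium
              [1.0, 0.0, 1.0],   # myocardium adjacent to LV and RV
              [0.0, 1.0, 0.0]])  # RV adjacent to myocardium
H0 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # toy node features
W0 = np.eye(2)                                        # identity weights for clarity
H1 = gcn_layer(H0, A, W0)
```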
        </sec>
        <sec id="sec-1-2-4">
          <title>3.4. Multi-teacher knowledge distillation</title>
          <p>To enable efficient deployment on edge devices, we employ a multi-teacher knowledge distillation
strategy. The training objective combines the standard cross-entropy loss with a distillation term
that aligns the student’s logits z(s) with the soft targets from an ensemble of teacher models z¯(t) as
presented below</p>
          <p>L = α L_CE( y, softmax(z^(s)) ) + (1 − α) τ² KL( softmax(z̄^(t) / τ) ‖ softmax(z^(s) / τ) ),
(2)
where τ is the temperature parameter controlling the softness of the probability distributions,
and α balances the two loss components [18, 19].</p>
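          <p>For a single sample, the combined objective of Eq. (2) can be sketched as follows; the logits, α, and τ values are illustrative:</p>

```python
import math

def softmax(z, tau=1.0):
    """Temperature-scaled softmax over a list of logits."""
    m = max(z)
    exps = [math.exp((v - m) / tau) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def kd_loss(y, z_student, z_teacher, alpha=0.7, tau=2.0):
    """alpha * CE(y, student) + (1 - alpha) * tau^2 * KL(teacher_soft || student_soft)."""
    p_s = softmax(z_student)
    ce = -math.log(p_s[y])            # cross-entropy with the hard label y
    p_t = softmax(z_teacher, tau)     # softened teacher (ensemble-averaged) targets
    q_s = softmax(z_student, tau)     # softened student predictions
    kl = sum(pt * math.log(pt / qs) for pt, qs in zip(p_t, q_s))
    return alpha * ce + (1 - alpha) * tau * tau * kl
```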
        </sec>
        <sec id="sec-1-2-5">
          <title>3.5. Experimental setup and evaluation</title>
          <p>Our evaluation protocol is designed to be comprehensive, deployment-oriented, and fully
reproducible.</p>
          <p>For segmentation performance, let X and Y be the predicted and ground-truth masks,
respectively.</p>
          <p>We quantify overlap using the Dice Similarity Coefficient (DSC) [25], as follows:
DSC(X, Y) = 2 |X ∩ Y| / ( |X| + |Y| ).
(3)
Additionally, we calculate the Jaccard Index (IoU) [26], defined in Equation 4:
IoU(X, Y) = |X ∩ Y| / |X ∪ Y|.
(4)</p>
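          <p>Both overlap measures can be checked on toy binary masks encoded as sets of voxel indices; the masks below are illustrative:</p>

```python
def dice(x, y):
    """Dice Similarity Coefficient, Eq. (3), for masks given as voxel-index sets."""
    inter = len(x.intersection(y))
    return 2.0 * inter / (len(x) + len(y))

def iou(x, y):
    """Jaccard Index (IoU), Eq. (4), for masks given as voxel-index sets."""
    inter = len(x.intersection(y))
    union = len(x.union(y))
    return inter / union

pred = {1, 2, 3, 4}
truth = {3, 4, 5, 6}
```

          <p>Note that the two measures are monotonically related by IoU = DSC / (2 − DSC), so they rank methods identically but differ in scale.</p>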
          <p>For boundary accuracy, we use the 95th percentile Hausdorff Distance (HD95) and Average
Symmetric Surface Distance (ASSD), which are reviewed in detail by Taha and Hanbury [27].</p>
          <p>For classification, let pi be the predicted probability for the positive class and yi∈{0 , 1} be the
true label. We measure ranking quality with ROC-AUC and, for imbalanced classes, PR-AUC [28].
We assess calibration using the Brier score [29], which is the mean squared error of probabilistic
forecasts, and visualize it with reliability diagrams, quantifying miscalibration with the Expected
Calibration Error (ECE) [22].</p>
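          <p>The Brier score and a binned ECE can be sketched as follows for binary predictions; the bin count and example values are illustrative:</p>

```python
def brier(probs, labels):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    n = len(probs)
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / n

def ece(probs, labels, n_bins=10):
    """Weighted average of abs(accuracy - confidence) over equal-width bins."""
    n = len(probs)
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls into the last bin
        bins[idx].append((p, y))
    total = 0.0
    for bucket in bins:
        if bucket:
            conf = sum(p for p, _ in bucket) / len(bucket)  # mean confidence
            acc = sum(y for _, y in bucket) / len(bucket)   # empirical accuracy
            total += (len(bucket) / n) * abs(acc - conf)
    return total
```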
          <p>To ensure full auditability and scientific reproducibility, every execution of the pipeline
generates a JSON manifest file. This manifest records the software version, Git commit hash,
timestamp, the selected ONNX Runtime EP, model opset version, and all computed evaluation
metrics [24].</p>
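          <p>A minimal sketch of such a manifest writer is shown below; the field names are assumptions modeled on the description above, not the exact schema of the framework in [24]:</p>

```python
import datetime
import hashlib
import json

def build_manifest(version, commit, provider, opset, metrics):
    """Serialize a run manifest to canonical JSON and return it with a checksum."""
    manifest = {
        "software_version": version,
        "git_commit": commit,
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "onnxruntime_provider": provider,
        "model_opset": opset,
        "metrics": metrics,
    }
    payload = json.dumps(manifest, sort_keys=True, indent=2)  # canonical ordering
    checksum = hashlib.sha256(payload.encode()).hexdigest()   # audit fingerprint
    return payload, checksum

payload, checksum = build_manifest("1.0.0", "abc1234", "CUDAExecutionProvider", 17,
                                   {"dice_mean": 0.939, "ece": 0.03})
```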
          <p>The system also provides an export module that saves segmentation masks as NIfTI files,
qualitative overlays as PNG images, and all metrics in CSV/JSON formats. This functionality is
managed through a comprehensive export module (see Appendix, Figure A.5). This practice aligns
with best practices for reproducible computational science.</p>
        </sec>
      </sec>
      <sec id="sec-1-3">
        <title>4. Results</title>
        <p>We evaluated the system on the ACDC dataset [15] for segmentation and diagnosis, and the
M&amp;Ms-2 dataset [30] for cross-domain generalization.</p>
        <sec id="sec-1-3-1">
          <title>4.1. Segmentation performance</title>
          <p>The SKIF-Seg module demonstrates robust performance. Table 1 presents a structure-wise
comparison with a U-Net baseline. Our approach yields a significant improvement in boundary
delineation, reducing the HD95 for the Left Ventricle (LV) from 7.5 mm to 5.8 mm.</p>
          <p>The distribution of Dice scores is visualized in Figure 2, showing reduced variance for the
proposed method.</p>
          <p>The accompanying plot shows segmentation accuracy by dataset and structure (SKIF-Seg), with
the Dice similarity coefficient (range 0.825–1.000) on the vertical axis.</p>
          <p>Table 2 summarizes the macro-averaged performance, highlighting a 1.67 mm reduction in
HD95 on the ACDC dataset.</p>
        </sec>
        <sec id="sec-1-3-2">
          <title>4.2. State-of-the-art comparison and robustness</title>
          <p>We compared our system against leading methods, including nnU-Net and MedNeXt (Table 3). Our
system achieves a mean Dice of 0.939, matching MedNeXt and remaining highly competitive with
nnU-Net, while operating within a portable ONNX framework.</p>
          <p>To evaluate robustness, we analyzed the domain shift from ACDC to M&amp;Ms-2 (Figure 4). The
degradation in Dice scores is minimal (&lt; 0.013), indicating excellent generalization capabilities
across different scanner vendors.</p>
        </sec>
        <sec id="sec-1-3-3">
          <title>4.3. Diagnostic classification</title>
          <p>The KI-GCN module demonstrates high diagnostic accuracy. Figure 5 displays the Macro ROC and
PR curves, with an AUC of 0.964.</p>
          <p>Figure 5 comprises (a) the macro ROC curve (multiclass, one-vs-rest; ROC AUC = 0.935),
plotting true positive rate against false positive rate, and (b) the macro precision-recall curve.</p>
          <p>The confusion matrix (Figure 6) shows strong discrimination between all five cardiac
conditions.</p>
        </sec>
        <sec id="sec-1-3-4">
          <title>4.4. Calibration and efficiency</title>
          <p>Model trustworthiness was assessed via reliability diagrams (Figure 7). Post-hoc temperature
scaling (τ = 2.1) significantly improved calibration, reducing the Brier score from 0.08 to 0.07 and
the Expected Calibration Error (ECE) from 0.04 to 0.03 (Table 4).</p>
          <p>The ablation study in Table 5 confirms that the inclusion of the graph module (KI-GCN)
contributes significantly to accuracy compared to a baseline MLP.</p>
          <p>Finally, system throughput is analyzed in Table 6 and Figure 8, which plots throughput against
batch size for each execution provider. The CUDA and DirectML providers offer substantial
speedups over CPU, enabling real-time clinical use.</p>
        </sec>
      </sec>
      <sec id="sec-1-4">
        <title>5. Discussion</title>
        <p>
          The results of this study underscore the critical importance of a holistic systems engineering
approach to medical AI. While pure algorithmic research often prioritizes incremental gains in Dice
scores [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ], our work demonstrates that architecting for interoperability and interpretability
yields substantial practical benefits without sacrificing accuracy. The SKIF-Seg module’s
performance, achieving a mean Dice of 0.939, is on par with state-of-the-art research models like
MedNeXt [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], yet it is delivered within a containerized, hardware-agnostic framework. This
portability, enabled by ONNX Runtime [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], addresses the vendor lock-in that frequently stifles
clinical adoption.
        </p>
        <p>Our key scientific finding is the efficacy of the KI-GCN module. By explicitly modeling the heart
as a graph of connected structures, we achieved a 4.9% improvement in diagnostic accuracy over a
feature-based MLP baseline. This validates the hypothesis that structural knowledge is a powerful
inductive bias. Furthermore, the strong calibration results (ECE of 0.03) suggest that the system’s
probability outputs are trustworthy, a prerequisite for use in high-stakes medical decision-making.</p>
        <p>However, the system is not without limitations. The current graph topology in KI-GCN is static,
defined by a priori anatomical knowledge. This prevents the model from discovering novel,
data-driven relationships that might exist in diverse pathologies. Additionally, while the M&amp;Ms-2
generalization results are promising, true clinical robustness requires validation across a broader
spectrum of imaging artifacts and patient demographics.</p>
        <p>Future research will focus on two avenues: (i) developing dynamic graph learning techniques
that can infer patient-specific topological connections, and (ii) conducting prospective multi-site
clinical trials to validate the system’s impact on diagnostic workflow efficiency and accuracy.</p>
      </sec>
      <sec id="sec-1-5">
        <title>Conclusion</title>
        <p>In this paper, we have successfully bridged the “last-mile” gap separating high-performance AI
research from tangible clinical utility. By architecting a holistic intelligent information system, we
resolved the tripartite challenges of data interoperability, hardware fragmentation, and model
interpretability. Our solution moves beyond isolated algorithm development to provide a unified,
end-to-end pipeline that seamlessly integrates standards-compliant DICOM and NIfTI ingestion,
automated privacy preservation, and hardware-agnostic inference via ONNX Runtime. The
empirical validation of this framework underscores its potential to transform diagnostic workflows
without disrupting existing hospital infrastructure. Specifically, the proposed SKIF-Seg module
demonstrated better anatomical delineation, achieving a mean Dice Similarity Coefficient of 0.939
on the ACDC benchmark, effectively matching specialized research models within a portable
container. Moreover, the integration of structured domain knowledge through the novel KI-GCN
classification module yielded a diagnostic accuracy of 94.0% and, critically, a low Brier score of 0.07.
These metrics establish that incorporating graph-based anatomical reasoning not only enhances
predictive performance but also ensures the calibration and trustworthiness essential for
high-stakes medical decision-making. Consequently, this study offers a scientifically reproducible and
legally auditable blueprint for deploying AI in diverse hospital environments.</p>
        <p>Future research will focus on evolving this framework from a static deployment tool into a
dynamic, continuous learning ecosystem.</p>
      </sec>
      <sec id="sec-1-6">
        <title>Declaration on Generative AI</title>
        <p>During the preparation of this work, the authors employed generative AI tools to polish the final
version of the manuscript. Specifically, Gemini 3 Pro (owned by Google LLC) and Grammarly
(owned by Grammarly, Inc.) were utilized to improve grammar, spelling, and overall readability.
After using these tools, the authors reviewed and edited the content as needed and take full
responsibility for the publication’s content.
systems   &amp;   technologies   (ICST   2024),   CEUR-WS.org,   Aachen,   2024,   pp. 262–272.   URL: 
https://ceur-ws.org/Vol-3790/paper23.pdf.
[13] NVIDIA Corporation, CUDA C++ programming guide, 2025. URL: https://docs.nvidia.com/cuda/cuda-c-programming-guide/.
[14] Microsoft Corporation, DirectML overview, 2025. URL: https://learn.microsoft.com/en-us/windows/ai/directml/overview.
[15] O. Bernard, A. Lalande, C. Zotti, F. Cervenansky, X. Yang, P.-A. Heng, et al., Deep learning techniques for automatic MRI cardiac multi-structures segmentation and diagnosis: Is the problem solved?, IEEE Trans. Med. Imaging 37.11 (2018) 2514–2525. doi:10.1109/TMI.2018.2837502.
[16] T. N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, in: International conference on learning representations (ICLR), 2017, pp. 1–14. URL: https://openreview.net/pdf?id=SJU4ayYgl.
[17] M. D. Alanazi, K. Kaaniche, M. Albekairi, T. M. Alanazi, G. Abbas, Graph attention neural network for advancing medical imaging by enhancing segmentation and classification, Eng. Appl. Artif. Intell. 161 (2025) 112372. doi:10.1016/j.engappai.2025.112372.
[18] G. Hinton, O. Vinyals, J. Dean, Distilling the knowledge in a neural network, arXiv preprint (2015). doi:10.48550/arXiv.1503.02531.
[19] A. Moslemi, A. Briskina, Z. Dang, J. Li, A survey on knowledge distillation: Recent advancements, Mach. Learn. with Appl. 18 (2024) 100605. doi:10.1016/j.mlwa.2024.100605.
[20] O. Chaban, E. Manziuk, P. Radiuk, Method of adaptive knowledge distillation from multi-teacher to student deep learning models, J. Edge Comput. 4.2 (2025) 1–20. doi:10.55056/jec.978.
[21] O. Chaban, E. Manziuk, O. Markevych, S. Petrovskyi, P. Radiuk, EMTKD at the edge: An adaptive multi-teacher knowledge distillation for robust cardiac MRI classification, in: Proceedings of the 5th edge computing workshop (DOORS 2025), CEUR-WS.org, Aachen, 2025, pp. 42–47. URL: https://ceur-ws.org/Vol-3943/paper09.pdf.
[22] A. Niculescu-Mizil, R. Caruana, Predicting good probabilities with supervised learning, in: Proceedings of the 22nd international conference on machine learning (ICML), 2005, pp. 625–632. doi:10.1145/1102351.1102430.
[23] M. Xiong, A. Deng, P. W. Koh, J. Wu, S. Li, J. Xu, B. Hooi, Proximity-informed calibration for deep neural networks, in: Proceedings of the 37th international conference on neural information processing systems (NeurIPS), 2023, pp. 68511–68538. URL: https://dl.acm.org/doi/10.5555/3666122.3669118.
[24] O. Chaban, E. Manziuk, P. Radiuk, IDK medical AI: An open-source framework for AI-driven medical imaging analysis, 2025. URL: https://github.com/radiukpavlo/idk-medical-ai.
[25] L. R. Dice, Measures of the amount of ecologic association between species, Ecology 26.3 (1945) 297–302. doi:10.2307/1932409.
[26] P. Jaccard, Étude comparative de la distribution florale dans une portion des Alpes et du Jura, Bull. Soc. Vaudoise Sci. Nat. 37 (1901) 547–579. doi:10.5169/SEALS-266450.
[27] A. A. Taha, A. Hanbury, Metrics for evaluating 3D medical image segmentation: Analysis, selection, and tool, BMC Med. Imaging 15.29 (2015) 1–28. doi:10.1186/s12880-015-0068-x.
[28] T. Saito, M. Rehmsmeier, The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets, PLOS ONE 10.3 (2015) e0118432. doi:10.1371/journal.pone.0118432.
[29] G. W. Brier, Verification of forecasts expressed in terms of probability, Mon. Weather Rev. 78.1 (1950) 1–3. doi:10.1175/1520-0493(1950)078%3C0001:VOFEIT%3E2.0.CO;2.
[30] V. M. Campello, P. Gkontra, C. Izquierdo, C. Martín-Isla, A. Sojoudi, P. M. Full, K. Maier-Hein, Y. Zhang, Z. He, et al., Multi-centre, multi-vendor and multi-disease cardiac segmentation: The M&amp;Ms challenge, IEEE Trans. Med. Imaging 40.12 (2021) 3543–3554. doi:10.1109/TMI.2021.3090082.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>A. System User Interface</title>
      <p>This appendix provides select screenshots from the graphical user interface of the IDK Medical AI system, illustrating the key stages of the end-to-end workflow.</p>
      <sec id="sec-2-1">
        <p>Figure A.1: The main user interface of the IDK Medical AI system, providing access to data ingestion modules (DICOM/NIfTI), analysis pipelines (Segmentation, Classification), and project management features.</p>
        <p>Figure A.2: The data ingestion interface, which imports DICOM series and applies privacy-preserving profiles.</p>
        <p>Screenshot text (translated from Ukrainian): “SKIF-Seg Segmentation — automatic segmentation of medical images using a neural network.” The panel offers model selection (./models/skifseg.onnx, opset=17, Validated), segmentation parameters (binarization threshold, morphology post-processing, CRF refinement, Normalize to [0,1]), the CPU inference provider, a visualization pane, a Run segmentation control, a per-slice inference time readout (ms), and the notice “Research use only. Not a medical device.”</p>
        <p>Figure A.3: Interface for the SKIF-Seg segmentation module. Users can select an ONNX model and
monitor the segmentation progress.</p>
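        <p>As an illustration of the per-slice inference step shown in Figure A.3, the following sketch wires an exported ONNX segmentation model through ONNX Runtime on the CPU provider. The model path, single-channel NCHW input layout, and sigmoid-probability output are assumptions made for illustration; they are not prescribed by the system.</p>

```python
"""Sketch of a SKIF-Seg-style inference pass (illustrative assumptions:
ONNX model at ./models/skifseg.onnx, single-channel NCHW input,
sigmoid probability map as output)."""
import numpy as np


def normalize01(slice_2d: np.ndarray) -> np.ndarray:
    """Min-max normalize a slice to [0, 1] (the UI's 'Normalize to [0,1]' option)."""
    lo, hi = float(slice_2d.min()), float(slice_2d.max())
    if hi == lo:  # constant slice: avoid division by zero
        return np.zeros_like(slice_2d, dtype=np.float32)
    return ((slice_2d - lo) / (hi - lo)).astype(np.float32)


def binarize(prob_map: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Apply the binarization threshold to a probability map."""
    return (prob_map >= threshold).astype(np.uint8)


def load_session(model_path: str = "./models/skifseg.onnx"):
    """Create a CPU-only ONNX Runtime session. The import is deferred so the
    numeric helpers above remain usable without onnxruntime installed."""
    import onnxruntime as ort
    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def segment_slice(session, slice_2d: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Normalize one slice, run it through the ONNX session, and binarize."""
    x = normalize01(slice_2d)[np.newaxis, np.newaxis]  # NCHW batch of one
    (prob,) = session.run(None, {session.get_inputs()[0].name: x})
    return binarize(prob[0, 0], threshold)
```

        <p>The helpers mirror the panel’s “Normalize to [0,1]” and binarization-threshold options; morphological post-processing and CRF refinement would follow the binarization step.</p>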
        <p>Figure A.4: Interface for the KI-GCN classification module (Ukrainian UI, translated: “KI-GCN Classification — analysis of medical images using a graph neural network”). Users specify the graph source (./datasets/graph.json), the feature mode (auto), the decision threshold (0.70), and the inference provider (CPU), then initiate the graph-based diagnostic classification; a confirmation step precedes the run.</p>
        <p>Figure A.5: Interface for the export and reporting module (Ukrainian UI, translated: “Export and reporting — report creation and export of analysis results”), which allows users to export segmentation masks (NIfTI or DICOM), visual overlays, quality metrics (Dice, Jaccard/IoU) as CSV with a JSON sidecar, class probabilities (JSON), and a summary report.</p>
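        <p>The Dice and Jaccard/IoU scores listed in the exported CSV can be computed from a predicted and a reference binary mask as in the minimal sketch below; this is an illustration, not the system’s exact implementation. The two scores are linked by the identity D = 2J/(1+J), so, for example, a Jaccard of 0.860 corresponds to a Dice of about 0.925.</p>

```python
"""Minimal overlap-metric sketch for binary segmentation masks."""
import numpy as np


def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|); 1.0 when both masks are empty."""
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return 1.0 if total == 0 else float(2.0 * inter / total)


def jaccard(pred: np.ndarray, gt: np.ndarray) -> float:
    """Jaccard index (IoU) |A∩B| / |A∪B|; 1.0 when both masks are empty."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return 1.0 if union == 0 else float(inter / union)
```

        <p>Reporting both scores is redundant in the strict sense (each determines the other), but it eases comparison with prior work that quotes only one of them.</p>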
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] Z. He, L. Yang, X. Li, J. Du, Discrepancies in reported results between trial registries and journal articles for AI clinical research, eClinicalMedicine 80 (2025) 103066. doi:10.1016/j.eclinm.2024.103066.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] DICOM Standards Committee, DICOM part 1: Introduction and overview (current edition), 2025. URL: https://www.dicomstandard.org/current.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] NIfTI Data Format Working Group, NIfTI-1 data format (Neuroimaging Informatics Technology Initiative), 2007. URL: https://nifti.nimh.nih.gov/nifti-1/.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] Microsoft Corporation, ONNX runtime documentation, 2025. URL: https://onnxruntime.ai/docs/.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] V. Slobodzian, O. Barmak, Method for interpreting decisions made by deep learning models, Comput. Syst. Inf. Technol. 2024.4 (2024) 150–156. doi:10.31891/csit-2024-4-18.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] O. Ronneberger, P. Fischer, T. Brox, U-Net: Convolutional networks for biomedical image segmentation, in: MICCAI, volume 9351 of Lecture Notes in Computer Science, 2015, pp. 234–241. doi:10.1007/978-3-319-24574-4_28.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] F. Isensee, P. F. Jaeger, S. A. A. Kohl, J. Petersen, K. H. Maier-Hein, nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation, Nat. Methods 18.2 (2021) 203–211. doi:10.1038/s41592-020-01008-z.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] F. Shamshad, S. Khan, S. W. Zamir, M. H. Khan, M. Hayat, F. S. Khan, H. Fu, Transformers in medical imaging: A survey, Med. Image Anal. 88 (2023) 102802. doi:10.1016/j.media.2023.102802.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] S. Roy, G. Köhler, C. Ulrich, M. Baumgartner, J. Petersen, F. Isensee, P. F. Jäger, K. H. Maier-Hein, MedNeXt: Transformer-driven scaling of ConvNets for medical image segmentation, in: MICCAI 2023, volume 14223 of Lecture Notes in Computer Science, 2023, pp. 405–415. doi:10.1007/978-3-031-43901-8_39.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] A. Shaker, M. Maaz, H. Rasheed, S. Khan, M.-H. Yang, F. S. Khan, UNETR++: Delving into efficient and accurate 3D medical image segmentation, IEEE Trans. Med. Imaging 43.9 (2024) 3377–3390. doi:10.1109/TMI.2024.3398728.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] P. Radiuk, O. Kovalchuk, V. Slobodzian, E. Manziuk, O. Barmak, I. Krak, Human-in-the-loop approach based on MRI and ECG for healthcare diagnosis, in: Proceedings of the 5th international conference on informatics &amp; data-driven medicine, CEUR-WS.org, Aachen, 2022, pp. 9–20. URL: https://ceur-ws.org/Vol-3302/paper1.pdf.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] O. Chaban, E. Manziuk, Enhancing medical NLI with integrated domain knowledge and sentiment analysis, in: Proceedings of the 12th international conference information control systems &amp; technologies (ICST 2024), CEUR-WS.org, Aachen, 2024, pp. 262–272. URL: https://ceur-ws.org/Vol-3790/paper23.pdf.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>