<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title></journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Evaluation of the Privacy of Images Generated by ImageCLEFmedical GANs 2025 Based on Pre-trained Model Feature Extraction Methods</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dengtao Zhang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xutao Yang</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Information Science and Engineering, Yunnan University</institution>
          ,
          <addr-line>Kunming 650504, Yunnan</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>Our team's primary contributions to the ImageCLEFmedical GANs 2025 task are as follows. This task evaluates whether medical images generated by Generative Adversarial Networks (GANs) utilized specific real images while training the generative models. We developed a methodology that fine-tuned and evaluated multiple pretrained models based on a contrastive learning framework, combined with a Mixture of Experts (MoE) strategy to fuse these models. Leveraging the similarity of feature extractions between generated and real images, we performed a binary classification task to identify real images that were potentially used during GAN training. Our best-performing model achieved a Cohen's Kappa score of 0.108 among the submitted results. Our experimental findings demonstrate that our approach can effectively distinguish between "used" and "unused" real images in the context of GAN training. Our code is publicly available at https://github.com/zhangdt123/image.</p>
      </abstract>
      <kwd-group>
        <kwd>GANs</kwd>
        <kwd>pre-trained model</kwd>
        <kwd>contrastive learning</kwd>
        <kwd>MoE</kwd>
        <kwd>Medical Imaging</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Deep learning has achieved remarkable progress in medical image analysis, demonstrating powerful
capabilities in tasks such as classification, detection, and segmentation. However, the high performance
of deep neural networks typically depends on large-scale, high-quality, and accurately annotated
datasets. Due to the high cost of image acquisition, the need for expert annotation, and concerns about
patient privacy, obtaining sufficient training data in medical imaging is often challenging. As a result,
models are limited in terms of generalization and robustness [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        In this context, deep generative models—such as Generative Adversarial Networks (GANs), Variational
Autoencoders (VAEs), and Diffusion Models—offer a promising solution. By learning the underlying
distribution of real medical images, these models can synthesize structurally coherent and semantically
consistent images, effectively mitigating the problem of data scarcity to a certain extent [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Furthermore,
synthetic images can be leveraged for data augmentation, enhancing the stability of model training under
limited data conditions and even providing additional "virtual samples" for clinically rare conditions,
thereby broadening the applicability of medical AI models [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>Despite the significant potential of deep generative models in medical imaging, their application
is accompanied by a range of non-negligible concerns, particularly in the areas of privacy protection,
ethical compliance, and clinical usability.</p>
      <p>
        First, training generative models often requires access to large volumes of real patient imaging data.
Without stringent data de-identification and access control measures, there is a risk of compromising
patient privacy [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Moreover, although the generated images are synthetic, their underlying feature
representations may still retain identifiable information from the original data, especially when employing
high-fidelity models such as Generative Adversarial Networks (GANs). This risk of "re-identification"
makes it difficult for synthetic data to fully comply with data protection regulations such as the
General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act
(HIPAA) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Second, generative models are susceptible to misuse. Utilizing synthetic medical images without
thorough validation may cause models to rely on spurious features, ultimately compromising their
diagnostic accuracy in real-world clinical settings [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. For example, models may learn to recognize
abnormal structures or lesion patterns that do not exist in authentic data, thereby undermining the
reliability of clinical decisions. Additionally, if synthetic images are used in diagnostic tasks without
expert annotation or review, this could lead to legal disputes concerning medical accountability and
malpractice [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        To advance research on the controllability, quality assessment, and clinical applicability of generative
models in medical imaging, the ImageCLEF initiative has organized a series of medical challenge
tasks. Our team’s username is taozi. The 2025 ImageCLEFmedical competition includes a dedicated
subtask focusing on generative models [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] aiming to evaluate generative methods’ feasibility and
practical value in real-world medical scenarios. This study focuses on the competition’s subtask titled
"ImageCLEFmed GAN 2025: Training Data Analysis and Fingerprint Detection," [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] which centers on
analyzing synthetic biomedical images to determine whether specific real images were used during the
training of generative models. It is also referred to as Subtask 1: “Detect Training Data Usage”. Specifically, for
each real image in the test set, the goal is to predict whether it was used in generating a given synthetic
image (label 1) or not (label 0). The core challenge is to detect the presence of "fingerprints" of training
data within the synthetic outputs.
      </p>
      <p>Since synthetic images are generated by modeling the data distribution of real images, they often
exhibit strong statistical similarity to authentic samples. The closer a generated image’s distribution is
to that of real images, the higher its perceived quality and visual realism. In this study, we formulate the
task as a binary classification problem: determining whether a given real image was used during the
generation process. To achieve this, we compute image similarity scores—higher similarity indicating
likely usage and lower similarity suggesting non-usage.</p>
      <p>Our approach primarily leverages a contrastive learning strategy combined with three pre-trained
models and a Mixture-of-Experts (MoE) framework for training and evaluation. The pre-trained models
include ResNet50, ViT-B/16, and EfficientNet-B0. First, we utilize these models to precompute similarity
matrices between synthetic images and real training images, forming a candidate pool of positive samples
for the contrastive learning framework. Next, the pre-trained models act as feature extractors, with a
dynamic projection head applied for dimensionality reduction, enabling multi-level feature decoupling
and adaptive parameter tuning. Finally, the trained deep learning models extract features from input
images, and inter-feature similarities are computed. Within the contrastive learning framework, these
features provide a more accurate representation of image similarity.</p>
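The final decision step described above can be sketched as follows (a minimal NumPy sketch assuming features have already been extracted; function and variable names are ours, not from the released code):

```python
import numpy as np

def classify_by_similarity(gen_feats, real_feats, threshold=0.5):
    """Label each real image 1 ("used") if its best cosine similarity to any
    generated image exceeds the threshold, else 0 ("unused")."""
    # L2-normalize so that the dot product equals cosine similarity
    g = gen_feats / np.linalg.norm(gen_feats, axis=1, keepdims=True)
    r = real_feats / np.linalg.norm(real_feats, axis=1, keepdims=True)
    sim = r @ g.T                      # [num_real, num_generated] similarities
    best = sim.max(axis=1)             # best-matching generated image per real
    return (best > threshold).astype(int)
```

The threshold itself is a tuned hyperparameter; the experiments in Section 4 sweep it as part of the ResNet50 configuration search.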
      <p>By integrating contrastive learning with diverse pre-trained models, we can comprehensively evaluate
the similarity between generated and real images, thereby effectively solving the binary classification
task. Based on the individual performance of the pre-trained models, we apply a Mixture-of-Experts
(MoE) mechanism to fuse their outputs, offering richer feature representations and more robust
guarantees for image analysis and interpretation.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        With the widespread application of deep learning in medical image analysis, models’ reliance on
large-scale, high-quality annotated datasets has become increasingly prominent. However, acquiring medical
images is often constrained by high collection costs, stringent ethical approvals, specialized manual
annotations, and concerns over patient privacy. These practical limitations hinder the generalization
and robustness of AI models in tasks such as lesion detection, organ segmentation, and modality
transformation. As a result, synthesizing medical images using deep generative models—as a means of
data augmentation, supplementation, or even substitution for real samples—has emerged as a prominent
research focus in recent years [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ].
      </p>
      <p>
        Typical applications of synthetic images in the medical domain include data augmentation, where
additional training images are generated in few-shot scenarios to enhance model performance [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ];
modality completion and transformation, such as generating one modality from another (e.g., CT to MRI),
thereby facilitating multimodal learning and registration [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]; and privacy-preserving data sharing,
where synthetic images are used to construct open-source medical datasets, alleviating restrictions on
real data distribution.
      </p>
      <p>
        Nevertheless, medical image synthesis faces multiple challenges. First, unlike natural images, medical
images exhibit highly specialized anatomical and pathological structures. If these are not accurately
expressed in the synthetic output, the result may appear visually plausible yet lack clinical relevance [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
Second, there is currently no standardized, objective, and reproducible framework for evaluating the
quality of generated medical images. This not only hampers the assessment of diagnostic utility but also
impedes fair benchmarking across models [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Moreover, if model training does not adequately mitigate
data leakage risks, there remains the possibility that real patient images are "implicitly memorized" and
reproduced in the generated output [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>
        In terms of technical approaches, the dominant generative models currently include:
Generative Adversarial Networks (GANs), such as pix2pix, CycleGAN, and StyleGAN, are widely
used for generating and translating CT, MRI, and X-ray images due to their ability to produce highly
detailed and visually realistic results [
        <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
        ]. Variational Autoencoders (VAEs) are better suited for
modeling the latent distribution of images, generating stable but less detailed outputs, and are often
employed in scenarios requiring control over anatomical shape or organ structure [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Diffusion
Models have recently gained traction in medical imaging for their iterative generation process, offering
improved stability and higher-quality synthesis compared to GANs, particularly in high-resolution
image generation tasks [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Conditional Generative Models (e.g., cGANs, VAE-GANs) incorporate
structural information such as labels, semantic maps, or medical text, enabling the production of
synthetic images with higher clinical fidelity.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <p>Figure 1 illustrates the overall architecture of our proposed dynamic contrastive learning model. The
system begins with an input image, which is augmented and passed through a pre-trained ResNet-50
backbone (initialized with ImageNet weights) to extract high-level visual features. These features
are subsequently projected into an embedding space using a dynamic projection head incorporating
multi-layer perceptrons and generating normalized representations.</p>
      <p>To facilitate contrastive learning, a similarity matrix precomputed from image features is utilized
to dynamically construct a positive pool for each generated image. This enables adaptive positive
matching by selecting real samples with similarity scores exceeding a predefined threshold or selecting
top-k similar examples when insufficient matches are available. Simultaneously, negative samples
are obtained through a FIFO memory queue, which stores normalized embeddings from previous
mini-batches, ensuring stable and diverse contrastive pairs.</p>
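The FIFO memory queue can be sketched in a few lines (a minimal sketch; the class name is ours, and a real MoCo-style queue would live on the GPU as a tensor ring buffer):

```python
from collections import deque
import numpy as np

class FeatureQueue:
    """FIFO queue of L2-normalized embeddings from past mini-batches,
    served as negative samples for contrastive learning."""
    def __init__(self, capacity):
        self.buf = deque(maxlen=capacity)   # oldest entries evicted first
    def enqueue(self, batch_embeddings):
        for e in batch_embeddings:
            self.buf.append(e / np.linalg.norm(e))
    def negatives(self):
        return np.stack(self.buf)           # [queue_len, dim] matrix
```

Because entries come from many past batches, the negatives stay diverse even when individual mini-batches are small.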
      <p>A momentum encoder—synchronized with the online encoder using exponential moving average
updates—is employed to encode the positive samples, reducing noise and enhancing representation
consistency. Both online and momentum embeddings are input into the InfoNCE contrastive loss
function, where the model is trained to minimize the distance between positive pairs while maximizing
separation from negative instances.</p>
      <p>This dynamic sampling strategy, in combination with momentum encoding and a structured memory
queue, significantly enhances the model’s capacity for robust representation learning, especially under
distributional shifts between generated and real samples.</p>
      <sec id="sec-3-1">
        <title>3.1. MoCo Framework</title>
        <p>MoCo (Momentum Contrast), proposed by Kaiming He’s team, is a self-supervised learning framework
designed to extract effective visual representations from unlabeled data using contrastive learning. The
core idea of MoCo involves two key innovations:</p>
        <p>First, it introduces a dynamic queue of negative samples, which stores feature representations of
previous batches extracted by a momentum encoder. This allows for a large and consistent set of negative
examples, which is crucial for effective contrastive learning. Second, MoCo employs a momentum
encoder whose parameters are updated using an exponential moving average (EMA) from the online
encoder. This design ensures stability in feature representation across different training iterations.
</p>
        <p>θ_momentum ← m · θ_momentum + (1 − m) · θ_online (1)</p>
        <p>On the one hand, this mechanism generates stable feature representations, preventing the rapid
updates of the online encoder from destabilizing the contrastive learning objective. On the other
hand, since the momentum encoder extracts all features in the negative sample queue, it ensures
consistency within the queue.</p>
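As an illustration, the EMA update above can be sketched in a few lines (a minimal sketch over plain parameter lists; the function name and the value of m are ours, with m = 0.999 being a common MoCo default):

```python
def ema_update(momentum_params, online_params, m=0.999):
    """θ_momentum ← m · θ_momentum + (1 − m) · θ_online, applied element-wise.
    A large momentum coefficient m makes the key encoder evolve slowly,
    which keeps the features stored in the negative queue consistent."""
    return [m * p_m + (1.0 - m) * p_o
            for p_m, p_o in zip(momentum_params, online_params)]
```

In a full framework this would run once per training step, after the online encoder's gradient update.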
        <p>Regarding the loss function, MoCo employs the InfoNCE loss (a contrastive loss), which encourages
the model to bring positive sample pairs closer in the feature space while pushing apart negative pairs.
ℒ = − log [ exp(q · k+/τ) / ( exp(q · k+/τ) + Σ exp(q · k−/τ) ) ] (2)</p>
        <p>Here, q represents the query feature, k+ is the positive key feature, k− is the negative key feature,
and τ is the temperature coefficient. The query feature is obtained by feeding an augmented version
of the current input image through the online encoder (e.g., a ResNet). The momentum encoder
processes a different augmentation of the same image (or another positive sample image) to generate
the corresponding key feature.</p>
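The InfoNCE loss for a single query can be sketched as follows (a minimal NumPy sketch; the function name is ours, and inputs are assumed to be L2-normalized feature vectors):

```python
import numpy as np

def info_nce(q, k_pos, k_negs, tau=0.07):
    """InfoNCE for one query: -log( exp(q·k+/τ) /
    (exp(q·k+/τ) + Σ exp(q·k−/τ)) ). k_negs has shape [K, dim]."""
    # Stack positive and negative similarities into one logit vector
    logits = np.concatenate(([q @ k_pos], k_negs @ q)) / tau
    logits -= logits.max()                  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                # positive sits at index 0
```

A well-matched positive pair drives the loss toward zero, while a mismatched positive yields a large loss.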
        <p>We adopted the widely used MoCo v2 as the baseline contrastive learning framework, enhancing it
with a nonlinear projection head (MLP) and richer data augmentation strategies. To better address the
characteristics of the generated image detection task, we made the following extensions to the MoCo
v2 framework:</p>
        <p>Positive sample selection: Rather than relying solely on data augmentations to create positive pairs,
we selected positives based on precomputed semantic similarity between the generated image and real
images used during training. This improves the quality and relevance of positive samples.</p>
        <p>[Figure: expert backbones (ResNet50, EfficientNet-B0, ViT-B/16), dynamic projection head (embedding + adaptive temperature), momentum encoder, FIFO queue of negatives, gating network, multi-head attention, and fused expert feature feeding the InfoNCE contrastive loss.]</p>
        <p>Adaptive temperature coefficient: Instead of using a fixed global temperature, we introduced
sample-specific temperature values, allowing the model to adjust its learning focus dynamically for each
sample. This is particularly beneficial in tasks with complex feature distributions and varying sample
dificulty—such as in generated image detection.</p>
        <p>Loss function: We adopted a Hybrid Contrastive Loss, which calculates similarity using negative
samples from the queue and incorporates additional supervision from the momentum encoder’s
perspective. This hybrid formulation combines standard contrastive loss with momentum-aware guidance
to enhance representation learning.</p>
        <p>ℒ1 = w · (1/N) ∑_{i=1}^{N} [ −pos_sim_i/τ_i + log( exp(pos_sim_i/τ_i) + ∑_{j=1}^{K} exp(q_i · k−_j/τ_i) ) ] (3)</p>
        <p>ℒ2 = (1 − w) · (1/N) ∑_{i=1}^{N} [ −pos_sim_i/τ_fixed + log( exp(pos_sim_i/τ_fixed) + ∑_{j=1}^{K} exp(q_i · k−_j/τ_fixed) ) ] (4)</p>
        <p>ℒtotal = ℒ1 + ℒ2 (5)</p>
        <p>For each positive sample i in the batch, the cosine similarity between the online feature q_i and the
momentum feature k+_i is computed as follows:
pos_sim_i = q_i · k+_i (6)
For each negative sample k−_j in the queue, the dot product between the online feature q_i and the queue
vector is calculated as follows:
neg_sim_{i,j} = q_i · k−_j, j ∈ {1, 2, 3, ..., K} (7)
Here, K denotes the capacity of the queue. The resulting negative sample similarity matrix has the
shape [N, K], where N is the batch size. The formula τ_fixed = τ.detach() indicates that the gradient
with respect to the temperature values is blocked. The weight parameter w regulates the balance
between the online loss and the momentum loss, with a default value of 0.7.</p>
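A minimal NumPy sketch of this hybrid loss (function and argument names are ours; `pos_sim` holds the batch of positive similarities, `neg_sims` the [N, K] negative-similarity matrix):

```python
import numpy as np

def hybrid_loss(pos_sim, neg_sims, tau_adaptive, tau_fixed=0.07, w=0.7):
    """Hybrid contrastive loss: the same softmax-style term is evaluated once
    with per-sample adaptive temperatures (L1) and once with a fixed
    temperature (L2), then mixed as w*L1 + (1-w)*L2 (default w = 0.7)."""
    pos_sim = np.asarray(pos_sim, dtype=float)
    neg_sims = np.asarray(neg_sims, dtype=float)

    def term(tau):
        tau = np.broadcast_to(np.asarray(tau, dtype=float), pos_sim.shape)
        pos = pos_sim / tau
        # log( exp(pos_sim/τ) + Σ_j exp(neg_sim_j/τ) ), per sample
        lse = np.log(np.exp(pos) + np.exp(neg_sims / tau[:, None]).sum(axis=1))
        return float(np.mean(-pos + lse))   # batch mean of -pos/τ + logsumexp

    return w * term(tau_adaptive) + (1.0 - w) * term(tau_fixed)
```

When the positive similarity equals every negative similarity, each term reduces to log(1 + K), which is a useful sanity check.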
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Dynamic Projection Head</title>
        <p>Traditional contrastive learning frameworks (e.g., MoCo, SimCLR) typically employ fixed linear or
multilayer perceptron (MLP) projection heads to merely map high-dimensional features—output by
the backbone network—into a lower-dimensional space. As illustrated in the diagram, we extend this
architecture in our work, primarily comprising two core enhancements:</p>
        <p>Feature Dimension Adaptation: We project the backbone network’s 2048-dimensional features (output
from ResNet50) into a 256-dimensional contrastive learning space. Dynamic Temperature Generation:
We propose generating sample-wise adaptive temperature parameters τ_i ∈ (0.05, 0.2) dynamically
based on input features. The temperature parameter τ_i is adapted for each sample as follows:
τ_i = 0.05 + 0.15 · σ(W · z_i + b)
where σ denotes the Sigmoid function, and z_i is the feature vector after projection.</p>
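The temperature mapping can be sketched directly (a minimal NumPy sketch; W and b stand in for the head's learnable parameters, which would normally be trained):

```python
import numpy as np

def dynamic_temperature(z, W, b):
    """τ_i = 0.05 + 0.15 · sigmoid(W · z_i + b): maps each projected
    feature z_i to a per-sample temperature strictly inside (0.05, 0.2)."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    return 0.05 + 0.15 * sigmoid(z @ W + b)
```

Because the sigmoid is bounded in (0, 1), the generated temperatures can never leave the (0.05, 0.2) interval, regardless of the input features.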
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Mixture of Experts (MoE)</title>
        <p>Mixture of Experts (MoE) is a machine learning paradigm that integrates multiple sub-models (experts)
and dynamically weights their outputs. The core idea is to allow different experts to specialize in
learning distinct subspaces of the input data, enabling adaptive feature fusion through a gating network
that intelligently assigns weights to each expert.</p>
        <p>In this work, we introduce a novel design of the MoE architecture tailored to better meet the
requirements of our task. Specifically, we incorporate three heterogeneous networks as experts: ResNet50 (for
local texture), EfficientNet-B0 (for fine-grained features), and Vision Transformer (for global semantics).
After unifying their outputs to the same dimensional space, we apply L2 normalization to eliminate
discrepancies in magnitude. A gating network is then constructed to dynamically fuse expert features
using a two-stage cascade structure:</p>
        <p>g(x) = Softmax(W2 · ReLU(W1 · concat(z_r, z_e, z_v))) (8)
z_r, z_e, z_v denote the image feature vectors extracted by the three pre-trained models, respectively,
and are processed with L2 normalization. The first layer of the gating network is parameterized by
a weight matrix W1 ∈ R^{512×6144}, which projects the concatenated 6144-dimensional input into a
512-dimensional hidden space. The second layer is parameterized by W2 ∈ R^{3×512}, responsible for
generating the weight scores corresponding to each expert.</p>
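The two-stage gate can be sketched for a single sample as follows (a minimal NumPy sketch with random untrained weights; names are ours, but the shapes follow the paper: W1 is 512×6144, W2 is 3×512):

```python
import numpy as np

def gate_weights(z_r, z_e, z_v, W1, W2):
    """Softmax(W2 · ReLU(W1 · concat(z_r, z_e, z_v))): one weight per expert.
    Inputs are the (assumed L2-normalized) 2048-dim expert features."""
    x = np.concatenate([z_r, z_e, z_v])        # 6144-dim concatenation
    h = np.maximum(W1 @ x, 0.0)                # ReLU hidden layer (512-dim)
    logits = W2 @ h                            # one logit per expert (3)
    e = np.exp(logits - logits.max())          # numerically stable softmax
    return e / e.sum()
```

The softmax output is a convex combination, so the fused feature is always a weighted average of the three experts.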
        <p>Secondly, a multi-head cross-attention mechanism (with 8 heads) is introduced to enhance inter-expert
feature interaction:</p>
        <p>F = MultiHead(Q = K = V = F_stack) (9)
The query, key, and value matrices are all derived from the stacked feature representations of the
three experts, F_stack ∈ R^{B×3×2048}. The MultiHead module employs an 8-head cross-attention
mechanism, where each head has a dimensionality of d_k = 2048/8 = 256. The overall process is shown
in Figure 2.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <sec id="sec-4-1">
        <title>4.1. Evaluation Metrics</title>
        <p>To simplify our experimental analysis, we did not partition a validation set from the dataset but instead
utilized the entire training set for model training. Subsequently, we submitted our results to the
ImageCLEFmed GANs 2025: Recognition of Training Data Fingerprints Challenge. This challenge is
formulated as a binary classification task, with evaluation criteria comprising several key performance
metrics: the Kappa value, accuracy, precision, recall, and F1-score. Notably, the Kappa value has been
designated as the primary evaluation metric for this year’s competition. The definitions for these
metrics are as follows:
κ = (p_o − p_e) / (1 − p_e)
Accuracy = (TP + TN) / (TP + FP + TN + FN) (14)</p>
        <p>p_o represents the probability of observed agreement, i.e., the proportion of instances where two
evaluators (raters) assign the same classification in practice. p_e denotes the probability of expected
chance agreement, assuming that the two evaluators classify instances independently and randomly.
True Positives (TP) refer to the number of samples where the model correctly predicts the positive class,
and the actual class is also positive. False Positives (FP) represent the number of samples where the
model incorrectly predicts the positive class for instances that are actually negative. True Negatives (TN)
indicate the number of samples where the model correctly predicts the negative class for true negative
instances. False Negatives (FN) correspond to the number of samples where the model erroneously
predicts the negative class for instances that are actually positive.</p>
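These definitions can be computed directly from predicted and true labels (a minimal sketch; the function name is ours, and in practice a library such as scikit-learn's `cohen_kappa_score` would serve the same purpose):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy and Cohen's kappa from the confusion counts defined above."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    n = tp + tn + fp + fn
    p_o = (tp + tn) / n                         # observed agreement = accuracy
    # chance agreement from the marginal label frequencies of both raters
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (p_o - p_e) / (1 - p_e) if p_e != 1 else 1.0
    return p_o, kappa
```

Unlike accuracy, kappa discounts agreement expected by chance, which is why it is preferred for the imbalanced label distribution of this task.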
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Experimental Results</title>
        <p>In this experiment, we evaluated the task across three pre-trained models. During the testing/inference
phase, inheriting the objective design from contrastive learning pre-training (a characteristic of the
MoCo framework), we primarily employed cosine similarity to assess the relationship between generated
images and real images. Based on similarity scores, classifications were performed to derive the Kappa
value, accuracy, precision, recall, and F1 score.</p>
        <p>To enhance the diversity of the training data and improve the model’s generalization capability, we
applied data augmentation techniques to the training images, including random cropping, flipping,
color jittering, and input normalization. These operations ensured that contrastive learning could
effectively discriminate semantic features. Additionally, within the contrastive learning framework, for
each generated image, we selected at least two positive samples and five negative samples for training.
This approach allowed us to evaluate the performance of different pre-trained models during testing.</p>
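The augmentations described above can be sketched without a deep-learning framework (a minimal NumPy sketch; in practice torchvision transforms would be used, and the crop size here is illustrative):

```python
import numpy as np

def augment(img, rng, crop=200):
    """Random crop, random horizontal flip, and per-channel normalization
    applied to an H×W×C image array."""
    h, w, _ = img.shape
    y = rng.integers(0, h - crop + 1)           # random crop origin
    x = rng.integers(0, w - crop + 1)
    out = img[y:y + crop, x:x + crop].astype(float)
    if rng.random() < 0.5:
        out = out[:, ::-1]                      # horizontal flip
    mean = out.mean(axis=(0, 1)); std = out.std(axis=(0, 1)) + 1e-8
    return (out - mean) / std                   # normalize each channel
```

Color jittering, which the pipeline also uses, would be one more per-channel scaling step and is omitted here for brevity.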
        <p>By systematically incorporating diverse pre-trained models, we aimed to assess their individual
contributions to the final task outcomes. This enabled the integration of these models to construct a
suitable Mixture of Experts (MoE) hybrid model. Through this comprehensive experimental design, we
could more accurately evaluate the feature extraction capabilities of each pre-trained model for image
characterization, providing valuable insights for future improvements in image processing and analysis.
Key parameter categories and the specific values used in the experiments are summarized in the table.</p>
        <p>For ResNet50, we systematically tested various combinations of the following critical parameters:
the number of negative samples drawn per image; the minimum number of positive samples selected;
the weighting coefficient balancing the online loss and the momentum loss; the training batch size;
and the similarity threshold for classification decisions (based on cosine similarity). These
parameters represent a critical hyperparameter set that directly influences model training and evaluation
outcomes.</p>
        <p>We performed predictions on all 500 generated images and submitted these results. To evaluate
model performance, we adopted Cohen’s Kappa as the primary evaluation metric due to its significant
advantages in computer vision tasks involving imbalanced class distributions or scenarios requiring
consistency assessment, making it particularly suitable for this task. Additionally, the F1-score was
used as a secondary metric, as it holistically integrates precision and recall, providing a more
comprehensive performance evaluation. This evaluation protocol ensured a thorough understanding of
model performance under varying conditions. In total, we submitted seven distinct sets of results.
The table summarizes partial detailed scores, displaying the specific conditions and corresponding
evaluation metrics for each submission. These outcomes facilitate further analysis and model refinement
to enhance its practical applicability and effectiveness.</p>
        <p>As shown in the table, we submitted a total of ten results and selected six representative results for
presentation. Through experimentation, we observed that different parameter combinations significantly
influence the outcomes. Due to hardware constraints, our parameter combination optimizations were
primarily focused on ResNet50. When ID number is 1107, ResNet50 uses Configuration 3 from Table 1.
ID number 1179 uses Configuration 2. ID number 1875 uses Configuration 1. Notably, the ResNet50 of
configuration 3 demonstrated relatively superior performance on the test set compared to other models.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>In this study, we employed multiple pre-trained models integrated with contrastive learning frameworks
for classification. By defining a similarity threshold, images were classified based on their similarity
scores between real and generated images. Features of the generated images were extracted using
different pre-trained models and then matched against real images. Depending on the performance of
each pre-trained model, their outputs were fused via a Mixture of Experts (MoE) strategy to leverage
their complementary strengths. Moving forward, we plan to investigate methods to further optimize
the MoE framework, focusing on refining dynamic weighting mechanisms and adaptive model selection.
This aims to enhance cross-model collaboration for more precise feature alignment and maximize the
collective performance of the integrated models.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, ChatGPT-4o and Grammarly were used to check grammar and
spelling. After using these tools, the author reviewed and edited the content as needed and takes full
responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G.</given-names>
            <surname>Litjens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kooi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. E.</given-names>
            <surname>Bejnordi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A. A.</given-names>
            <surname>Setio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ciompi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ghafoorian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A. van der</given-names>
            <surname>Laak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. van</given-names>
            <surname>Ginneken</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. I.</given-names>
            <surname>Sánchez</surname>
          </string-name>
          ,
          <article-title>A survey on deep learning in medical image analysis</article-title>
          ,
          <year>2017</year>
          . doi:10.1016/j.media.2017.07.005.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>X.</given-names>
            <surname>Yi</surname>
          </string-name>
          , E. Walia,
          <string-name>
            <given-names>P.</given-names>
            <surname>Babyn</surname>
          </string-name>
          ,
          <article-title>Generative adversarial network in medical imaging: A review</article-title>
          ,
          <source>Medical Image Analysis</source>
          <volume>58</volume>
          (
          <year>2019</year>
          ). doi:10.1016/j.media.2019.101552.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Frid-Adar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Diamant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Klang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Amitai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Goldberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Greenspan</surname>
          </string-name>
          ,
          <article-title>GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification</article-title>
          ,
          <source>Neurocomputing</source>
          <volume>321</volume>
          (
          <year>2018</year>
          ). doi:10.1016/j.neucom.2018.09.013.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Kaissis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Makowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Rückert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. F.</given-names>
            <surname>Braren</surname>
          </string-name>
          ,
          <article-title>Secure, privacy-preserving and federated machine learning in medical imaging</article-title>
          ,
          <source>Nature Machine Intelligence</source>
          <volume>2</volume>
          (
          <year>2020</year>
          ). doi:10.1038/s42256-020-0186-1.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Shokri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stronati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Shmatikov</surname>
          </string-name>
          ,
          <article-title>Membership inference attacks against machine learning models</article-title>
          ,
          <source>in: Proceedings - IEEE Symposium on Security and Privacy</source>
          ,
          <year>2017</year>
          . doi:10.1109/SP.2017.41.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Luck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Honari</surname>
          </string-name>
          ,
          <article-title>Distribution matching losses can hallucinate features in medical image translation</article-title>
          ,
          <source>in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</source>
          , volume
          <volume>11070</volume>
          LNCS,
          <year>2018</year>
          . doi:10.1007/978-3-030-00928-1_60.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Tonekaboni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>McCradden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Goldenberg</surname>
          </string-name>
          ,
          <article-title>What clinicians want: Contextualizing explainable machine learning for clinical end use</article-title>
          ,
          <source>in: Proceedings of Machine Learning Research</source>
          , volume
          <volume>106</volume>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>B.</given-names>
            <surname>Ionescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.-C.</given-names>
            <surname>Stanciu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-G.</given-names>
            <surname>Andrei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Radzhabov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Prokopchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.-D.</given-names>
            <surname>Ştefan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-G.</given-names>
            <surname>Constantin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dogariu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kovalev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Damm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rückert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Ben</given-names>
            <surname>Abacha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>García Seco de Herrera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Friedrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bloch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Brüngel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Idrissi-Yaghir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. M. G.</given-names>
            <surname>Pakull</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Bracke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Pelka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Eryilmaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Becker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-W.</given-names>
            <surname>Yim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Codella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Novoa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Malvehy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dimitrov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. M.</given-names>
            <surname>Shan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Koychev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Hicks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gautam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Riegler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Thambawita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Halvorsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fabre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Macaire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lecouteux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schwab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Heinrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kiesel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wolter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of ImageCLEF 2025: Multimedia retrieval in medical, social media and content recommendation applications</article-title>
          ,
          <source>in: Experimental IR Meets Multilinguality, Multimodality, and Interaction, Proceedings of the 16th International Conference of the CLEF Association (CLEF 2025), Springer Lecture Notes in Computer Science LNCS</source>
          , Madrid, Spain,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.-G.</given-names>
            <surname>Andrei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Constantin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dogariu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Radzhabov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.-D.</given-names>
            <surname>Ştefan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Prokopchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kovalev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ionescu</surname>
          </string-name>
          ,
          <article-title>Overview of ImageCLEFmedical 2025 GANs task: Training data analysis and fingerprint detection</article-title>
          ,
          <source>in: CLEF2025 Working Notes, CEUR Workshop Proceedings</source>
          , CEUR-WS.org, Madrid, Spain,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>H. C.</given-names>
            <surname>Shin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. A.</given-names>
            <surname>Tenenholtz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. K.</given-names>
            <surname>Rogers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. G.</given-names>
            <surname>Schwarz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Senjem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Gunter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. P.</given-names>
            <surname>Andriole</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Michalski</surname>
          </string-name>
          ,
          <article-title>Medical image synthesis for data augmentation and anonymization using generative adversarial networks</article-title>
          ,
          <source>in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</source>
          , volume
          <volume>11037</volume>
          LNCS,
          <year>2018</year>
          . doi:10.1007/978-3-030-00536-8_1.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Chartsias</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Joyce</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. V.</given-names>
            <surname>Giuffrida</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Tsaftaris</surname>
          </string-name>
          ,
          <article-title>Multimodal MR synthesis via modality-invariant latent representation</article-title>
          ,
          <source>IEEE Transactions on Medical Imaging</source>
          <volume>37</volume>
          (
          <year>2018</year>
          ). doi:10.1109/TMI.2017.2764326.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hayes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Melis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Danezis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. D.</given-names>
            <surname>Cristofaro</surname>
          </string-name>
          ,
          <article-title>LOGAN: Membership inference attacks against generative models</article-title>
          ,
          <source>Proceedings on Privacy Enhancing Technologies</source>
          <year>2019</year>
          (
          <year>2019</year>
          ). doi:10.2478/popets-2019-0008.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Wolterink</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Dinkla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Savenije</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. R.</given-names>
            <surname>Seevinck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>van den Berg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Išgum</surname>
          </string-name>
          ,
          <article-title>Deep MR to CT synthesis using unpaired data</article-title>
          ,
          <source>in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</source>
          , volume
          <volume>10557</volume>
          LNCS,
          <year>2017</year>
          . doi:10.1007/978-3-319-68127-6_2.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Karras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Laine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Aila</surname>
          </string-name>
          ,
          <article-title>A style-based generator architecture for generative adversarial networks</article-title>
          ,
          <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>
          <volume>43</volume>
          (
          <year>2021</year>
          ). doi:10.1109/TPAMI.2020.2970919.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>D. P.</given-names>
            <surname>Kingma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Welling</surname>
          </string-name>
          ,
          <article-title>Auto-encoding variational Bayes</article-title>
          ,
          <source>in: 2nd International Conference on Learning Representations, ICLR 2014 - Conference Track Proceedings</source>
          ,
          <year>2014</year>
          . doi:10.61603/ceas.v2i1.33.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Abbeel</surname>
          </string-name>
          ,
          <article-title>Denoising diffusion probabilistic models</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>2020</volume>
          -December,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>