<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Optimizing Synthetic Data from Scarcity: Towards Meaningful Data Generation in High-Dimensional Low-Sample Size Domains</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Danilo Danese</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Politecnico di Bari</institution>
          ,
          <addr-line>Via E. Orabona, 4, 70126 Bari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Deep learning has revolutionized artificial intelligence by enabling the extraction of intricate representations from large datasets. Generative models have emerged as powerful tools for data synthesis by mimicking the distributions of training data. Despite their advancements, these models encounter critical concerns, including biases, privacy risks, and the authenticity of the generated data. These challenges underscore the necessity of incorporating fairness, expert insights, and comprehensive evaluations into their development. Applications like medical imaging pose challenges due to scarce high-quality data and demanding requirements for condition-specific synthesis. Furthermore, the management of high-dimensional low-sample size (HDLSS) data accentuates the demand for sophisticated representation learning techniques, enabling the generation of effective synthetic data from limited clinical datasets. The complexity of longitudinal medical data, characterized by intricate temporal correlations, further challenges existing methodologies, revealing their limitations. In light of the above, my doctorate research path intends to focus on two main objectives: (i) employ cutting-edge techniques to advance beyond the current state of the art in data synthesis, and (ii) bridge the gap between privacy, fairness, and the generation of meaningful synthetic data by leveraging XAI and HCI for further robustness.</p>
      </abstract>
      <kwd-group>
        <kwd>Synthetic Data Augmentation</kwd>
        <kwd>Sensitive Domains</kwd>
        <kwd>Generative Models</kwd>
        <kwd>XAI</kwd>
        <kwd>HCI</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Deep learning has revolutionized artificial intelligence by enabling the extraction of intricate
patterns from vast datasets. However, in many fields, including healthcare, the scarcity of
labeled data remains a significant bottleneck despite the increasing availability of large datasets.
This challenge is particularly acute in healthcare due to the limited availability of patient
cohorts and the high dimensionality of data, such as neuroimaging with millions of voxels.
Traditional statistical analyses may be unreliable [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] due to the sparse representation of the
population. While algorithms based on deep learning frameworks show impressive performance,
their effectiveness relies heavily on the availability of training samples, often requiring large
datasets to avoid overfitting and ensure statistically meaningful results [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Acquiring
high-quality reference standards for labeling demands substantial investments in time, financial
resources, and human expertise [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Moreover, class imbalance is prevalent in manually labeled
healthcare datasets, with certain categories underrepresented. To address the limited availability
of medical training data, researchers have proposed various data augmentation (DA) techniques
to synthetically generate additional examples, although these methods have limitations in
capturing the full diversity and complexity of real-world clinical data. DA offers a promising
approach to expand datasets by generating synthetic, labeled samples. This approach employs
a variety of techniques such as the injection of Gaussian noise, cropping, flipping, and padding
to generate new samples that retain the original image’s label, thereby preserving the semantic
meaning of the data [
        <xref ref-type="bibr" rid="ref4">4</xref>
]. However, the effectiveness of DA depends heavily on the original data,
as transformations beneficial for one dataset may introduce bias or be ineffective for another.
The susceptibility to misleading transformations is exemplified by rotating a "6" to resemble
a "9". This issue is pertinent in medical domains, where the complexity of medical images,
including diverse anatomy, irregular tumor shapes, and occasional anatomical inconsistencies,
can render traditional operations ineffective. The limitations of DA can result in the generation
of irrelevant or anomalous images, disrupting model performance. Additionally, imbalanced
class representation can bias models toward overrepresented categories, necessitating careful
evaluation and transformation methods to address class imbalance effectively. The primary aim
of DA extends beyond augmenting data volume to faithfully replicate the true data distribution.
This involves generating synthetic samples that not only blend indistinguishably with the
original data but also encapsulate the complex relationships and patterns inherent within the
dataset. Synthetic samples generated through DA must not only preserve the label of their source
data but also embody the nuanced statistical characteristics that underpin their authenticity. As
a result, it should be difficult to discern synthetic samples from real ones.
      </p>
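      <p>To make these operations concrete, the following is a minimal sketch (assuming PyTorch and torchvision; the image size, noise level, and crop size are illustrative) of a label-preserving augmentation pipeline combining noise injection, padding, cropping, and flipping:</p>
      <preformat>
# Minimal sketch of label-preserving augmentations (PyTorch/torchvision assumed).
import torch
from torchvision import transforms

def add_gaussian_noise(img, std=0.05):
    # Inject zero-mean Gaussian noise; the class label is unchanged.
    return (img + std * torch.randn_like(img)).clamp(0.0, 1.0)

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),     # flipping
    transforms.Pad(4, padding_mode="reflect"),  # padding
    transforms.RandomCrop(64),                  # cropping back to 64x64
    transforms.Lambda(add_gaussian_noise),      # Gaussian noise injection
])

x = torch.rand(1, 64, 64)   # toy grayscale image with values in [0, 1]
x_aug = augment(x)          # new sample carrying the same label as x
      </preformat>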
      <p>Balancing Realism and Variability. While imposing constraints to enhance the realism
and clinical validity of synthesized medical images is crucial, it is important to acknowledge
the potential trade-off between realism and variability. Overly strict restrictions on the
generative models may limit the diversity and representative capacity of the generated samples,
introducing bias into the synthetic data. Ensuring a high degree of realism is essential for the
effective utilization of synthetic data in
medical applications. However, if the generative process is excessively constrained, the models
may fail to capture the true underlying distribution of the data, leading to an incomplete or
skewed representation of the target domain. This could result in synthetic samples that do not
accurately reflect the inherent diversity and variability present in real clinical data, potentially
hindering the generalization capabilities of models trained on such data.</p>
      <p>To address this challenge, it is crucial to strike a delicate balance between imposing
constraints for realism and allowing sufficient flexibility for variability. One approach could involve
incorporating domain knowledge and expert feedback through human-computer interaction
(HCI) techniques. By leveraging the expertise of medical professionals, the generative process
can be guided to prioritize clinically relevant features and constraints, while still allowing for a
certain degree of variability within acceptable bounds.</p>
      <p>Moreover, the integration of explainable AI (XAI) methodologies can provide insights into
the generative models’ decision-making processes, enabling a clearer understanding of the
factors influencing the generated samples. Through XAI-guided analysis, it may be possible
to identify and adjust specific model components or parameters that contribute to unrealistic
artifacts or limited variability, without compromising overall realism.</p>
      <p>Additionally, iterative refinement cycles with domain experts could be implemented, wherein
generated samples undergo evaluation for realism and diversity, leading to adjustments in the
generative process. This iterative approach can help fine-tune the balance between realism and
variability, ensuring that the synthetic data accurately reflects the complexities and nuances of
real clinical data while maintaining representative diversity.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background Technologies</title>
      <p>Image augmentation techniques constitute a fundamental subset of DA methods that manipulate
existing image samples through various transformations. These include geometric operations
such as elastic transformations, which deform an image’s spatial dimensions through displacement
fields, introducing localized shape variations without constraints on parallelism or aspect ratios.
However, elastic warping may produce anatomically implausible samples. On the other hand,
erasing transformations replace selected image regions with constant intensity values or random
noise, while pixel-level techniques adjust attributes like brightness, contrast, saturation, and
noise. These augmentations mainly modify original data, potentially limiting generalization.
Augmented samples also tend to strongly correlate with originals.</p>
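      <p>The displacement-field mechanism behind elastic transformations can be sketched as follows (assuming NumPy and SciPy; alpha and sigma are illustrative magnitude and smoothing parameters, and anatomical plausibility of the output is not guaranteed):</p>
      <preformat>
# Minimal sketch of elastic deformation via a smoothed random displacement field.
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def elastic_deform(image, alpha=30.0, sigma=4.0, seed=0):
    # alpha scales displacement magnitude; sigma smooths the random field
    # so that nearby pixels move together.
    rng = np.random.default_rng(seed)
    h, w = image.shape
    dx = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Resample the image at the displaced coordinates (bilinear, order=1).
    return map_coordinates(image, [ys + dy, xs + dx], order=1, mode="reflect")
      </preformat>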
      <p>
Synthetic generation offers a different approach to overcome the limitations of traditional
techniques. Unlike manipulating existing data, these methods create entirely new samples
from scratch, potentially introducing greater diversity and complexity. Specialized models,
tailored to specific modalities and tasks, further enhance those capabilities. While promising,
synthetic techniques require increased computational resources and more complex architectures
compared to basic transformations. A crucial challenge remains to ensure the visual fidelity and
realism of generated samples, as unrealistic artifacts can negatively impact model performance.
Generative Adversarial Networks (GANs). GANs [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] have showcased their capacity to
produce lifelike images, rendering them extensively utilized in medical research [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and
incorporated into various DA assessments [
        <xref ref-type="bibr" rid="ref7">7</xref>
]. However, despite their efficacy, GANs are not without
limitations; challenges include learning instability, convergence difficulties, and susceptibility
to mode collapse [
        <xref ref-type="bibr" rid="ref8">8</xref>
], wherein the generator produces only a limited variety of samples, thereby
limiting its ability to diversify the data and improve model generalizability. Moreover,
previous research [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] has shown that GANs can sometimes "hallucinate" features in generated
images, potentially introducing artifacts that mimic or hide real features. Medical image datasets
frequently display significant class imbalance, with a bias towards healthy or normal cases.
One prevalent augmentation strategy involves integrating synthesized pathological lesions into
otherwise healthy images. However, Cohen et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] have identified significant challenges with
this method, particularly when employing Cycle Generative Adversarial Networks (CycleGANs)
for data translation tasks, whether involving unpaired or paired data. These studies demonstrate
a significant limitation: CycleGANs may fail to accurately retain all known and potentially
unknown class labels during the translation process.
      </p>
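      <p>For reference, the adversarial training step of GANs [<xref ref-type="bibr" rid="ref5">5</xref>] can be sketched as follows (assuming PyTorch; G and D are placeholder generator and discriminator networks, with D returning one logit per sample); the instabilities and mode collapse discussed above stem from this min-max interplay:</p>
      <preformat>
# Minimal sketch of one adversarial training step (PyTorch assumed).
import torch
import torch.nn.functional as F

def gan_step(G, D, opt_g, opt_d, real, z_dim=100):
    z = torch.randn(real.size(0), z_dim)
    fake = G(z)

    # Discriminator update: separate real samples from generated ones.
    d_loss = F.binary_cross_entropy_with_logits(
        D(real), torch.ones(real.size(0), 1)
    ) + F.binary_cross_entropy_with_logits(
        D(fake.detach()), torch.zeros(real.size(0), 1)
    )
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update (non-saturating loss): make D label fakes as real.
    g_loss = F.binary_cross_entropy_with_logits(
        D(fake), torch.ones(real.size(0), 1)
    )
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
      </preformat>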
      <p>
        Variational autoencoders (VAEs). VAEs [
        <xref ref-type="bibr" rid="ref10">10</xref>
] offer greater output diversity and avoid mode
collapse compared to GANs, but they often produce blurry, low-fidelity images due to minimizing
the Kullback-Leibler divergence [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. While VAEs have advantages over GANs in stability and
sample variety, their image quality limitations have restricted their adoption for DA. Nonetheless,
Chadebec et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] achieved a noteworthy result with the introduction of a geometry-aware
VAE tailored for DA in HDLSS scenarios. By integrating Riemannian geometry into the model,
they enhanced the learning process of latent representations, enabling the generation of realistic
samples even with sparse data. Their model demonstrated substantial performance gains over a
standard VAE in an MRI classification task, achieving an approximate 8% increase in accuracy
when trained on a dataset consisting of 50 real and 5,000 synthetic MRIs.
      </p>
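      <p>A minimal sketch of the VAE objective [<xref ref-type="bibr" rid="ref10">10</xref>] follows (assuming PyTorch); the Kullback-Leibler divergence discussed above appears in closed form alongside a reconstruction term:</p>
      <preformat>
# Minimal sketch of the VAE loss and reparameterization trick (PyTorch assumed).
import torch
import torch.nn.functional as F

def vae_loss(x, x_recon, mu, logvar, beta=1.0):
    # Reconstruction term plus closed-form KL(N(mu, sigma^2) || N(0, I)).
    recon = F.mse_loss(x_recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl

def reparameterize(mu, logvar):
    # Sample z = mu + sigma * eps so gradients flow through mu and logvar.
    eps = torch.randn_like(mu)
    return mu + torch.exp(0.5 * logvar) * eps
      </preformat>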
      <p>
Diffusion Models (DMs). Recent academic literature has notably expanded the use of DMs for
image synthesis [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. DMs achieve high-fidelity sample generation by approximating complex
real-world data distributions through a series of simpler distributions progressively ’diffused’
together. This process effectively captures the intricacies of the original data, resulting in
more realistic and diverse generated samples. This capability is particularly advantageous in
image synthesis tasks, as general-purpose images often present a diverse array of textures,
colors, and other visual attributes that challenge simpler parametric models. However, despite
these strengths, DMs also present certain limitations. Compared to GANs and VAEs, DMs
can be computationally demanding and require a significant amount of data for accurate
calibration. Moreover, DMs entail prolonged sampling times due to the extensive steps in the
reverse diffusion process, posing challenges for real-time applications or scenarios requiring
a substantial volume of samples. Consequently, researchers propose solutions to enhance
sampling efficiency while maintaining sample quality and diversity. A notable approach is
progressive distillation [
        <xref ref-type="bibr" rid="ref14">14</xref>
], which involves distilling a pre-trained deterministic diffusion
sampler with numerous steps into a novel diffusion model requiring fewer sampling steps.
      </p>
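      <p>A minimal sketch of the training step of a denoising diffusion model [<xref ref-type="bibr" rid="ref13">13</xref>] is given below (assuming PyTorch; eps_model is a placeholder noise-prediction network and the linear schedule is illustrative):</p>
      <preformat>
# Minimal sketch of DDPM training (PyTorch assumed).
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)            # noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative signal level

def ddpm_loss(eps_model, x0):
    # Noise a batch x0 of shape (B, C, H, W) to random timesteps, then
    # train the network to predict the injected noise.
    t = torch.randint(0, T, (x0.size(0),))
    a = alphas_bar[t].view(-1, 1, 1, 1)
    eps = torch.randn_like(x0)
    # Closed form: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps
    x_t = a.sqrt() * x0 + (1.0 - a).sqrt() * eps
    return F.mse_loss(eps_model(x_t, t), eps)
      </preformat>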
      <p>
        While DMs progress rapidly, their application in medical contexts lags, highlighting a gap
between state-of-the-art (SotA) generative models in general domains and those in medical fields.
There is an ongoing pursuit of computational efficiency in general-purpose generative models.
Wuerstchen [
        <xref ref-type="bibr" rid="ref15">15</xref>
], a novel architecture, demonstrates competitive performance and cost-effectiveness
for large-scale DMs. Central to Wuerstchen’s innovation is its employment of a latent diffusion
technique that acquires a detailed semantic image representation, reducing computational
costs while surpassing SotA. Wuerstchen achieves comparable results with less data, doubling
inference speed and reducing time and costs.
      </p>
      <p>
State Space Models (SSMs). Parallel to DMs, research explores computationally efficient
architectures. Mamba [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], spanning modalities like language, audio, and genomics, utilizes
efficient Selective State Space Models for sequence processing, demonstrating speed improvements
compared to Transformers [
        <xref ref-type="bibr" rid="ref17">17</xref>
], particularly evident when handling longer sequences. Applying
SSMs to vision is challenging due to data characteristics. Vision-specific architectures like Vim
[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], with bidirectional Mamba blocks and position embeddings, aim to overcome limitations of
self-attention. Another work [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], inspired by SSMs and Vision Transformers, maintains
global receptive fields while improving efficiency. Notably, addressing direction-sensitivity in
images enables the processing of visual data as ordered sequences. These architectures show
promise in various vision tasks, especially at higher resolutions, potentially paving the way for
sensitive domains like medical imaging.
      </p>
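      <p>The core recurrence underlying such SSM layers can be sketched as follows (assuming NumPy; shapes and dynamics are toy values, and, unlike selective SSMs such as Mamba [<xref ref-type="bibr" rid="ref16">16</xref>], the matrices here do not depend on the input):</p>
      <preformat>
# Minimal sketch of a discretized linear state space recurrence:
# h_t = A h_{t-1} + B u_t ;  y_t = C h_t
import numpy as np

def ssm_scan(A, B, C, u):
    h = np.zeros((A.shape[0], 1))
    ys = []
    for u_t in u:                  # recurrent scan: linear in sequence length
        h = A @ h + B * u_t        # state update
        ys.append((C @ h).item())  # scalar readout
    return np.array(ys)

rng = np.random.default_rng(0)
A = 0.9 * np.eye(4)                # stable toy state dynamics
B = rng.normal(size=(4, 1))
C = rng.normal(size=(1, 4))
y = ssm_scan(A, B, C, rng.normal(size=32))
      </preformat>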
    </sec>
    <sec id="sec-3">
      <title>3. Exploratory Approach: Integrating XAI and RLHF</title>
      <p>Explainable Artificial Intelligence (XAI). XAI has emerged as a critical field in the era of deep
learning, aiming to address the opaque and complex nature of modern machine learning models.
As these models become increasingly sophisticated and are deployed in sensitive domains,
the need for transparency, interpretability, and accountability has become fundamental. XAI
techniques aim to clarify how these models make decisions, helping users and researchers
understand why they make certain predictions, recognize any biases, and ensure compliance
with ethical and regulatory guidelines. In the context of synthetic data generation, XAI plays a
pivotal role in ensuring the reliability and trustworthiness of the generated data.</p>
      <p>
        As previously mentioned, synthetic DA is fundamental in fields with scarce, sensitive, or
costly real-world data. Among GANs, VAEs, SSMs and DMs, the latter have gained significant
attention due to their ability to generate high-fidelity samples by approximating complex
real-world data distributions, yet their complex architectures and iterative generation processes pose
challenges in terms of interpretability and explainability. Integrating XAI techniques into DMs
for synthetic data generation can provide valuable insights into the model’s decision-making
process, as proposed by Park et al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. By visualizing and interpreting the denoising process,
researchers can identify the regions and visual concepts that the model focuses on at each time
step, ensuring that the generated data accurately captures the desired features.
      </p>
      <p>
Although this approach was initially designed for general-purpose images, its application within the medical
domain proves advantageous for ensuring accurate representation and generation of relevant
anatomical structures and pathological features. An initial strategy for developing forthcoming
frameworks may involve the adaptation of tools such as DF-RISE and DF-CAM [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], derived from
RISE and Grad-CAM. DF-RISE and DF-CAM are complementary techniques that provide insights
into the DM’s decision-making process from external and internal perspectives, respectively.
DF-RISE reveals the denoising levels and regions of focus, while DF-CAM unveils the specific
visual concepts prioritized by the model at each denoising step. This approach can facilitate the
comprehension of which visual concepts (e.g., specific organs, tissues, lesions) are prioritised at
diferent time steps during image synthesis. It can also aid in fine-tuning the model to better
capture the desired features during experiments. By visualising the denoising levels using
DF-RISE, it is possible to comprehend the semantic and detail levels recovered by the DM
during generation. This can help ensure that the model accurately captures both high-level
semantic information (e.g. organ structures) and fine-grained details (e.g. lesions, abnormalities).
Furthermore, DF-CAM can help visualize and interpret the visual concepts the DM focuses
on at each inference step during medical image generation. This can provide insights into the
model’s decision-making process and assist in identifying potential biases or inconsistencies in
the generated images.
      </p>
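      <p>For intuition, a RISE-style occlusion analysis, from which DF-RISE derives, can be sketched as follows (assuming NumPy; this is not the DF-RISE implementation of [<xref ref-type="bibr" rid="ref20">20</xref>], and score_fn is a placeholder for any scalar model response, such as the change in a denoiser’s output):</p>
      <preformat>
# Hedged RISE-style saliency sketch (NumPy assumed; not the DF-RISE code).
import numpy as np

def rise_saliency(score_fn, image, n_masks=500, p_keep=0.5, cells=8, seed=0):
    # Assumes image height/width are divisible by `cells`.
    rng = np.random.default_rng(seed)
    h, w = image.shape
    saliency = np.zeros((h, w))
    for _ in range(n_masks):
        # Coarse Bernoulli grid, upsampled so kept regions are contiguous.
        grid = rng.binomial(1, p_keep, (cells, cells)).astype(float)
        mask = np.kron(grid, np.ones((h // cells, w // cells)))
        # Weight each mask by the scalar model response to the masked input.
        saliency += score_fn(image * mask) * mask
    return saliency / (n_masks * p_keep)
      </preformat>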
      <p>
        Reinforcement Learning from Human Feedback (RLHF). The synergy between XAI and
DMs can be further amplified through RLHF [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] to improve the generation process. In the
context of DMs, RLHF enables learning from human preferences and feedback, enhancing their
ability to generate data that aligns with human expectations [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. The integration of XAI and
RLHF facilitates the acquisition of insights into the decision-making mechanisms of the model
and enables active refinement of the generation process to better align with human values.
      </p>
      <p>
        In this task, RLHF can be used to fine-tune DMs for generating synthetic medical images
that align with medical standards and requirements. The process involves training a DM to
generate medical images based on specific classes. Human evaluators (e.g. domain experts)
then provide feedback on the generated images, indicating which ones better align with the
task requirements, such as improving class-image alignment, or refining aesthetic quality. In
this work [
        <xref ref-type="bibr" rid="ref22">22</xref>
], the authors proposed the D3PO method, an extension of Direct Preference Optimization (DPO), which directly fine-tunes
DMs based on human feedback without requiring a separate reward model. This approach
is more direct, cost-effective, and minimizes computational overhead compared to traditional
RLHF methods that rely on a reward model and are incompatible with the strict requirement
of domain-expert supervision. Physicians can provide valuable feedback on the quality and
relevance of medical images, guiding the fine-tuning process of DMs.
      </p>
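      <p>The preference-optimization principle behind D3PO can be illustrated with a DPO-style loss (assuming PyTorch; this sketches the general objective, not the exact formulation of [<xref ref-type="bibr" rid="ref22">22</xref>]):</p>
      <preformat>
# Hedged sketch of a DPO-style preference loss (PyTorch assumed).
import torch
import torch.nn.functional as F

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    # logp_w / logp_l: current-model log-probs of the expert-preferred
    # and rejected samples; ref_* are the same under a frozen reference.
    # The loss pushes the model toward the expert-preferred sample.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -F.logsigmoid(margin).mean()
      </preformat>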
    </sec>
    <sec id="sec-4">
      <title>4. Research Questions and Methodology</title>
      <p>This research work aims to investigate the capabilities and the limits of modern generative
models in the context of HDLSS domains. Specifically, the primary objective of this study
is to develop a comprehensive framework capable of effectively addressing the fundamental
constraints associated with scenarios characterized by HDLSS data such as medical imaging. This
framework will focus on systematically addressing challenges related to privacy, fairness, and
consistency in discrete segments. To ensure the reliability and robustness of the methodology,
the incorporation of Human-Computer Interaction (HCI) techniques can facilitate systematic
validation of significant outcomes. By harnessing the expertise of domain specialists, these
techniques can guide both the generation of synthetic data and the final output, thereby
validating the integrity and credibility of the framework. Moreover, it is imperative to devise a
solution that transcends limitations inherent to specific environments by leveraging reproducible
datasets and extending its applicability to a diverse array of real-world datasets, as real-world
data may exhibit unpredictable or anomalous behavior.</p>
      <p>RQ1: What are the current limitations of using modern generative models for
sensitive domains?
Modern generative models have immense potential to benefit healthcare, but their adoption
remains limited. Key challenges include: ensuring the clinical validity of synthesized images;
handling multimodal data; scarce annotated datasets; protecting privacy; explaining outputs;
computational efficiency; and robustness to data distribution shifts. A comprehensive
examination of these limitations can help identify critical gaps and requirements, informing the
development of tailored methodologies, validation procedures, guidelines, and best practices.</p>
      <p>Specifically, how to ensure that data from the same patient or source remains entirely
within either the training or validation set, preventing any overlap or leakage between the
two? Additionally, how to establish appropriate similarity measures that go beyond human
visual evaluation to objectively assess whether the synthetic data accurately captures the true
underlying distribution of the target domain?</p>
      <p>To address this, it is necessary to develop robust evaluation protocols that involve splitting
the limited dataset into two distinct subsets (training and validation) in a principled manner,
ensuring that data from the same source (e.g., patient) is consistently allocated to either subset.
This strict separation is crucial to maintain the validity of the DA process and subsequent model
evaluation, restricting the generation of synthetic data solely to the training subset.</p>
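      <p>Such a source-level separation can be implemented, for instance, with group-aware splitting (a minimal sketch assuming scikit-learn; variable names are illustrative):</p>
      <preformat>
# Minimal sketch of a patient-level split (scikit-learn assumed): all samples
# sharing a patient identifier land in the same subset, preventing leakage.
from sklearn.model_selection import GroupShuffleSplit

def patient_level_split(X, y, patient_ids, test_size=0.2, seed=0):
    splitter = GroupShuffleSplit(n_splits=1, test_size=test_size,
                                 random_state=seed)
    train_idx, val_idx = next(splitter.split(X, y, groups=patient_ids))
    return train_idx, val_idx  # synthesize only from X[train_idx]
      </preformat>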
      <p>However, rather than relying solely on human visual inspection, which can be subjective
and limited, we must establish quantitative similarity measures that can objectively evaluate
the extent to which the synthetic data accurately represents the characteristics and nuances
present in the validation set. These measures should be designed to assess the ability of models
trained on the augmented data to generalize and accurately label or process the validation data,
thereby indirectly evaluating the quality and representativeness of the synthetic samples.</p>
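      <p>As one candidate for such a measure, a Fréchet-distance-style comparison of real and synthetic feature statistics can be sketched as follows (assuming NumPy and SciPy; the feature extractor, e.g., a pretrained network, is assumed and not shown):</p>
      <preformat>
# Hedged sketch of a Frechet-distance-style similarity measure between
# real and synthetic feature sets (NumPy/SciPy assumed).
import numpy as np
from scipy import linalg

def frechet_distance(feats_real, feats_synth):
    # Gaussian approximation of each feature set: compare means/covariances.
    mu_r, mu_s = feats_real.mean(0), feats_synth.mean(0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_s = np.cov(feats_synth, rowvar=False)
    covmean = linalg.sqrtm(cov_r @ cov_s).real  # matrix square root
    diff = mu_r - mu_s
    return diff @ diff + np.trace(cov_r + cov_s - 2.0 * covmean)
      </preformat>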
      <p>Finally, it is essential to acknowledge that in certain scenarios, the available data may be so
scarce that confident label prediction becomes impossible without introducing additional real
data into the dataset. In such extremely data-limited situations, relying solely on augmentation
techniques is unlikely to be sufficient and may, in fact, aggravate bias rather than address it.
To make meaningful progress, it is crucial to recognize the inherent limitations of DA and
its inability to circumvent the fundamental issue of inadequate real data in certain contexts.
Identifying and quantifying the data scarcity thresholds beyond which augmentation alone
becomes ineffective would represent a significant step forward.
      <p>By addressing these challenges and adhering to strict train-test separation principles, we
can ensure a rigorous and reliable evaluation of synthetic data generation methods in sensitive
domains, ultimately enabling the development of more effective and trustworthy DA techniques
for applications with limited and scarce data.</p>
      <p>
RQ2: Can the combination of XAI and HCI in generative models effectively enhance
the quality and robustness of new data?
The integration of XAI into DA is still limited. Apart from XAI-guided classification tasks, some
work [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] has proposed XAI-guided generation pipelines but has focused narrowly on GANs.
In general, XAI aims to enhance understanding of AI systems’ outputs and decision-making,
improving human interpretability. Within the domain of DA, XAI may provide insights into
existing methods, analyze their impact on model performance, and even guide the development
of new, robust augmentation strategies with the help of domain experts, bridging the gap between
privacy, fairness, and the generation of meaningful synthetic data.
      </p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>Deep generative models for medical image augmentation address the fundamental challenge of
scarce training data in healthcare applications. While traditional augmentation provides some
benefits, its efficacy remains limited. This research focuses on progressing from foundational
models like GANs and VAEs to more advanced techniques including diffusion models and SSMs.
Key advantages of generative augmentation include producing realistic synthetic data and
capturing the true underlying distribution. However, limitations persist. The dual objectives of
this work are: (i) employing cutting-edge advances to push beyond the current SotA in medical
image synthesis, and (ii) addressing the interplay between privacy, fairness, and the generation
of meaningful synthetic data by leveraging XAI and HCI for enhanced robustness.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>K. S.</given-names>
            <surname>Button</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Ioannidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Mokrysz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. A.</given-names>
            <surname>Nosek</surname>
          </string-name>
          , et al.,
          <article-title>Power failure: why small sample size undermines the reliability of neuroscience</article-title>
          ,
          <source>Nature Reviews Neuroscience</source>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Shorten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. M.</given-names>
            <surname>Khoshgoftaar</surname>
          </string-name>
          ,
          <article-title>A survey on image data augmentation for deep learning</article-title>
          ,
          <source>Journal of big data</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>Litjens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kooi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Bejnordi</surname>
          </string-name>
          , et al.,
          <article-title>A survey on deep learning in medical image analysis</article-title>
          ,
          <source>Medical Image Analysis</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.</given-names>
            <surname>Garcea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Serra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Lamberti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Morra</surname>
          </string-name>
          ,
          <article-title>Data augmentation for medical imaging: A systematic literature review</article-title>
          ,
          <source>Computers in Biology and Medicine</source>
          <volume>152</volume>
          (
          <year>2023</year>
          )
          <fpage>106391</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>I.</given-names>
            <surname>Goodfellow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pouget-Abadie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mirza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Warde-Farley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ozair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Courville</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <article-title>Generative adversarial nets</article-title>
          ,
          <source>in: NIPS</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>V.</given-names>
            <surname>Sandfort</surname>
          </string-name>
          , et al.,
          <article-title>Data augmentation using generative adversarial networks (cyclegan) to improve generalizability in ct segmentation tasks</article-title>
          ,
          <source>Scientific Reports</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.-H.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Heidari</surname>
          </string-name>
          , et al.,
          <article-title>Generative adversarial networks in medical image augmentation: A review</article-title>
          ,
          <source>Computers in Biology and Medicine</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Mescheder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Geiger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nowozin</surname>
          </string-name>
          ,
          <article-title>Which training methods for gans do actually converge?</article-title>
          ,
          <source>in: International conference on machine learning, PMLR</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>3481</fpage>
          -
          <lpage>3490</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Luck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Honari</surname>
          </string-name>
          ,
          <article-title>Distribution matching losses can hallucinate features in medical image translation</article-title>
          ,
          <source>in: MICCAI</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D. P.</given-names>
            <surname>Kingma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Welling</surname>
          </string-name>
          ,
          <article-title>Auto-encoding variational bayes</article-title>
          ,
          <source>ICLR</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>I.</given-names>
            <surname>Goodfellow</surname>
          </string-name>
          ,
          <article-title>Generative adversarial networks</article-title>
          ,
          <source>NIPS 2016 Tutorial</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>C.</given-names>
            <surname>Chadebec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Thibeau-Sutre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Burgos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Allassonnière</surname>
          </string-name>
          ,
          <article-title>Data augmentation in high dimensional low sample size setting using a geometry-based variational autoencoder</article-title>
          ,
          <source>IEEE Trans. Pattern Anal. Mach. Intell.</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Abbeel</surname>
          </string-name>
          ,
          <article-title>Denoising diffusion probabilistic models</article-title>
          ,
          <source>NIPS</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Salimans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ho</surname>
          </string-name>
          ,
          <article-title>Progressive distillation for fast sampling of diffusion models</article-title>
          ,
          <source>arXiv preprint arXiv:2202.00512</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>P.</given-names>
            <surname>Pernias</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Rampas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Richter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Pal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Aubreville</surname>
          </string-name>
          ,
          <article-title>Wuerstchen: An efficient architecture for large-scale text-to-image diffusion models</article-title>
          ,
          <year>2023</year>
          . arXiv:2306.00637.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Dao</surname>
          </string-name>
          ,
          <article-title>Mamba: Linear-time sequence modeling with selective state spaces</article-title>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ł.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          ,
          <article-title>Attention is all you need</article-title>
          ,
          <source>in: NIPS</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Liao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Vision mamba: Efficient visual representation learning with bidirectional state space model</article-title>
          ,
          <source>CoRR</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yu</surname>
          </string-name>
          , et al.,
          <article-title>Vmamba: Visual state space model</article-title>
          ,
          <source>CoRR</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>J.-H.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-J.</given-names>
            <surname>Ju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-W.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Explaining generative diffusion models via visual analysis for interpretable decision-making process</article-title>
          ,
          <source>Expert Systems with Applications</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>P.</given-names>
            <surname>Christiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Leike</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Martic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Legg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Amodei</surname>
          </string-name>
          ,
          <article-title>Deep reinforcement learning from human preferences</article-title>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>K.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lyu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Using human feedback to fine-tune diffusion models without any reward model</article-title>
          ,
          <source>CVPR</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>S.</given-names>
            <surname>Narteni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Orani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Ferrari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Verda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Cambiaso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mongelli</surname>
          </string-name>
          ,
          <article-title>A new xai-based evaluation of generative adversarial networks for imu data augmentation</article-title>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>