<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Causal Synthetic Data Generation in Recruitment</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andrea Iommi</string-name>
          <email>andrea.iommi@phd.unipi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antonio Mastropietro</string-name>
          <email>antonio.matropietro@di.unipi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Riccardo Guidotti</string-name>
          <email>riccardo.guidotti@unipi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anna Monreale</string-name>
          <email>anna.monreale@unipi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Salvatore Ruggieri</string-name>
          <email>salvatore.ruggieri@unipi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Pisa</institution>, <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <institution>ISTI-CNR Pisa</institution>, <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>The importance of Synthetic Data Generation (SDG) has increased significantly in domains where data quality is poor or access is limited due to privacy and regulatory constraints. One such domain is recruitment, where publicly available datasets are scarce due to the sensitive nature of information typically found in curricula vitae, such as gender, disability status, or age. This lack of accessible, representative data presents a significant obstacle to the development of fair and transparent machine learning models, particularly ranking algorithms that require large volumes of data to effectively learn how to recommend candidates. In the absence of such data, these models are prone to poor generalisation and may fail to perform reliably in real-world scenarios. Recent advances in Causal Generative Models (CGMs) offer a promising solution. CGMs enable the generation of synthetic datasets that preserve the underlying causal relationships within the data, providing greater control over fairness and interpretability in the data generation process. In this study, we present a specialised SDG method involving two CGMs: one modelling job offers and the other modelling curricula. Each model is structured according to a causal graph informed by domain expertise. We use these models to generate synthetic datasets and evaluate the fairness of candidate rankings under controlled scenarios that introduce specific biases.</p>
      </abstract>
      <kwd-group>
        <kwd>Causal Generative Models</kwd>
        <kwd>Ranking</kwd>
        <kwd>Recruitment</kwd>
        <kwd>Bias simulation</kwd>
        <kwd>Fairness evaluation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Synthetic data generation is gaining importance, especially in contexts where data quality is low,
privacy concerns are prominent, or regulatory constraints limit data availability. Poor-quality datasets,
often containing missing or unrepresentative information, can significantly impair the performance of
Machine Learning (ML) models [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In many domains, collecting real data is prohibitively expensive or
logistically challenging, and ensuring coverage of all relevant scenarios is rarely straightforward [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Moreover, in high-risk settings such as healthcare [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], business [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], or recruitment [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], extensive
preprocessing and privacy-preserving measures often degrade data utility, further motivating the
need for high-quality synthetic alternatives. Synthetic Data Generators (SDGs) offer a promising
solution to challenges related to data scarcity, privacy, and regulatory compliance [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In healthcare, for
example, SDGs support disease modelling and drug discovery while preserving patient confidentiality.
Models such as SynSys [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and CorGAN [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] address data availability and privacy concerns in medical
applications. Similarly, in business domains, strict privacy regulations often hinder research and
development. In [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], it is demonstrated how SDGs can be utilised to simulate financial scenarios
under specific constraints.
      </p>
      <p>As in other sensitive domains, the availability of accessible datasets in human recruitment is limited
due to the private nature of attributes such as gender, disability, and age, which are pieces of information
that candidates may be reluctant to disclose. Consequently, synthetic datasets play a crucial role in
this field, enabling the training and evaluation of ranking models not only in terms of performance
but also in terms of fairness within human recommendation systems. In addition to improving model
effectiveness, synthetic data helps mitigate the risk of disclosing sensitive attributes, thereby addressing
both ethical and legal concerns. Unfortunately, generating synthetic datasets that accurately reflect
real-world data is a non-trivial task. Indeed, to be effective, synthetic data must closely replicate the
underlying statistical properties of the original data. Furthermore, in socially sensitive domains, the
generation process must also ensure fairness and interpretability to prevent biased outcomes.</p>
      <p>
        Causal Generative Models (CGMs) can address these needs by explicitly encoding causal relationships
using Structural Causal Models (SCMs) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Indeed, unlike deep learning approaches such as Generative
Adversarial Networks (GANs) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] or Variational Autoencoders (VAEs) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], which excel at capturing
complex non-linear patterns, CGMs provide transparent and interpretable mechanisms grounded in
causality. While deep learning models can uncover correlations, they often fail to reveal the underlying
causal structure and may introduce spurious associations due to opaque training processes [
        <xref ref-type="bibr" rid="ref12">12, 13</xref>
        ]. In
fact, in high-risk domains, the importance of interpretability is underscored by the European Union’s AI
Act, which mandates transparency and human oversight in AI systems [14]. This regulation highlights
the need for transparent ML models and data generation processes that can be audited and understood
by domain experts.
      </p>
      <p>In this paper, we present an SDG system grounded on two CGMs, one for job offers and one for
curricula, each structured according to causal graphs derived from interviews with domain experts.
These graphs capture the decision-making processes underlying the creation of job offers and candidate
profiles. We use the CGMs to generate synthetic datasets that simulate realistic recruitment scenarios.</p>
      <p>As a test-bed of our SDG, we explore fairness in ranking tasks. We introduce a controlled bias by
incorporating a parametric causal link between the gender attribute and working hours, simulating
gender disparities as discussed in social sciences [15]. This setup enables us to assess how such bias,
when propagated through the data, influences the fairness of rankings of job candidates produced by
ML ranking models. In summary, the contribution of this work is threefold: (i) the formulation of
causal graphs to model the HR domain for job offer and curriculum generation, (ii) a new approach to
tabular synthetic data generation, causality-grounded and intrinsically interpretable, and (iii) a public
and extendible GitHub Python repository<sup>1</sup> in which causal mechanisms that work with
multiple data types are deployed.</p>
      <p>The rest of this paper is organised as follows. After reviewing works on synthetic data generation in
Section 2, we briefly review the key concepts behind our proposal in Section 3. Then, in Section 4, we
describe our proposal. In Section 5, we present the experimental results. Finally, Section 6 summarises
our contributions and outlines potential directions for future research.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <sec id="sec-2-1">
        <p>We begin by reviewing the literature on synthetic data generation for tabular data.</p>
        <p>Early statistical methods, such as SMOTE [16], generate synthetic samples by fitting empirical
distributions and interpolating between existing data points. While effective in leveraging marginal
distributions, these techniques often fall short in capturing complex feature interactions, which can
result in the generation of less realistic (low-fidelity) and less representative (biased) synthetic data.</p>
        <p>
          Deep learning (DL)–based models, including Generative Adversarial Networks (GANs), Variational
Autoencoders (VAEs), and Diffusion Models [
          <xref ref-type="bibr" rid="ref1 ref6">6, 1</xref>
          ], show impressive capabilities in learning complex
data distributions. However, these models also present several limitations. First, as model complexity
increases, they require large volumes of training data. In data-scarce settings, these models often
struggle to learn accurate distributions, resulting in poor generalisation. Second, DL–based generative
models are prone to mode collapse. The latter is a phenomenon where the model generates samples from
only a limited subset of the true distribution, namely, a subset with high probability mass. This issue,
as discussed in [17], undermines the diversity and representativeness of the synthetic data, making
it difficult to ensure dataset quality. Lastly, these models typically function as black boxes, offering
limited transparency into the data generation process. This lack of interpretability poses challenges in
high-stakes applications, where understanding the model’s behaviour is critical.</p>
        <p><sup>1</sup>https://github.com/jacons/CausalSDG</p>
      </sec>
      <sec id="sec-2-2">
        <p>In contrast to deep generative models, probabilistic graphical models, such as Bayesian Networks
(BNs), are better suited under certain conditions, namely when (i) datasets are limited in size, (ii)
consistency is critical, meaning generated samples must adhere to domain-specific constraints, and (iii)
strong correlations exist among features [18, 19]. BNs explicitly model the conditional dependencies
between variables, allowing them to capture the joint distribution of the data more transparently.
This structure enables BNs to replicate the statistical properties of an empirical dataset with greater
interpretability compared to DL–based approaches. Structural Causal Models (SCMs) extend the
capabilities of BNs by embedding causal semantics into the graphical structure. While BNs focus
on statistical dependencies, SCMs incorporate structural equations that define how each variable is
generated from its causes. This allows SCMs not only to model observational distributions but also
to support interventional and counterfactual reasoning. As a result, SCMs provide a more expressive
framework for generating synthetic data that is both statistically consistent and causally grounded,
making them particularly valuable in domains where fairness, transparency, and causal interpretability
are essential.</p>
        <p>A growing body of research has focused on developing methods for generating synthetic data that
explicitly mitigate fairness concerns while preserving utility. Specifically, [20, 21, 22] extend GAN–
based methods, and [23] adopts a genetic approach. Unlike these approaches, (i) we focus on the
specific domain of recruiting (curricula and job offer datasets), (ii) we adopt an SCM approach with
controllable bias parameters, and (iii) the data generation process is fully interpretable. The focus on a
specific domain permits us to derive causal dependencies among features by eliciting them from expert
knowledge, thus overcoming the difficulty of discovering causal dependencies from observational data.
We instead use observational data to learn the structural equations given the known causal dependencies
among features. To the best of our knowledge, this is the first approach to using SCMs for SDG in the
recruiting domain.</p>
        <p>The closest works are [22] and [24]. van Breugel et al. [22] is a generic approach, which assumes a
given causal graph and learns the structural equations through conditional GANs. The data generation
process first intervenes on the derived SCM by eliminating dependencies that lead to unfairness in
a downstream model (these dependencies may be specific to the fairness metric being considered).
Subsequently, it generates data by applying the (GAN–based) structural equations of the remaining
dependencies. On the other hand, our method integrates fairness constraints directly into the causal
mechanisms during the generative process, permitting us to satisfy fairness requirements without
post-hoc interventions. Barbierato et al. [24] present a methodology for bias-controllable
synthetic data generation using parametric causal mechanisms. Their experimental framework explores
fairness metrics by systematically varying a bias parameter during the data generation process. However,
their approach is limited to continuous variables and lacks the ability to learn causal mechanisms directly
from data, relying instead on predefined parametric forms. In contrast, our approach supports mixed
data types and utilises a learned causal mechanism fitted from observational data.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Background</title>
      <p>To keep the paper self-contained, we provide a brief outline of the key concepts underlying our proposal.</p>
      <sec id="sec-3-1">
        <title>3.1. The ESCO Taxonomy and the EQF Classification System</title>
        <p>The European Skills, Competences, Qualifications, and Occupations [25] (ESCO, https://esco.ec.europa.eu/)
is a multilingual classification system developed by the European Union (EU) to provide a standardised
framework for describing skills and occupations. It aims to support better matching between individuals
and job opportunities, as well as between education and labour market needs.</p>
        <p>ESCO is organised into two main pillars: Occupations and Skills. The Occupations pillar provides a
structured vocabulary for consistently describing occupations. This structure is based on hierarchical
relationships between concepts, using the International Standard Classification of Occupations (ISCO-08)
as its foundational taxonomy. Each occupation entry includes several attributes, such as a Unique
Resource Identifier (URI), a preferred term that represents the concept in a specific language, a set
of non-preferred terms including synonyms, spelling variants, declensions, and abbreviations, and a
textual description. Similarly, the Skills pillar provides a taxonomy for describing competencies and
knowledge. It mirrors the hierarchical organisation of the Occupations pillar and shares the same set
of attributes. This consistency facilitates interoperability and integration across different systems and
languages. ESCO also defines associations between occupations and skills, categorising them as either
essential or optional. These associations enhance the system’s ability to support detailed profiling and
more accurate matching in both employment and educational contexts.</p>
        <p>The European Qualifications Framework (EQF,
https://europass.europa.eu/en/europass-digital-tools/european-qualifications-framework) serves as a common European reference framework to
facilitate the comparison of qualifications across different countries and education systems. Established
by the EU, the EQF aims to promote transparency, mobility, and lifelong learning by aligning national
qualification systems through a shared structure into eight levels: from Level 1, which corresponds to
basic general knowledge and skills, to Level 8, which reflects the highest level of expertise, typically
associated with doctoral-level qualifications. Each level is defined by a set of descriptors that express
the expected learning outcomes in terms of knowledge, skills, and competence.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Structural Causal Models</title>
        <p>
          A Structural Causal Model (SCM) [
          <xref ref-type="bibr" rid="ref9">26, 9, 27</xref>
          ] describes a data-generating process by relating random
variables in cause-effect pairs. Let X = {X_1, . . . , X_n} be n observable random variables, defined by a
set F of structural equations:
X_i := f_i(PA_i, U_i)   for i = 1, . . . , n   (1)
where U = {U_1, . . . , U_n} are n independent exogenous (unobserved) random variables, and PA_i ⊆
X ∖ {X_i} are the causal parents of X_i. The equations describe the causal mechanism by which an
X_i is generated from its causal parents and an exogenous variable U_i. Formally, an SCM ℳ is a tuple
ℳ = ⟨U, P(U), X, F⟩, where P(U) = ∏_i P(U_i) is the probability distribution of the exogenous
variables. The parental relations in an SCM induce a causal graph G, in which the nodes represent random
variables and a directed edge X_j → X_i denotes a causal relation between X_j ∈ PA_i and X_i. We assume
Directed Acyclic Graphs (DAGs), meaning there are no loops in G, so the data generation process can
proceed by following a topological order of the variables given the graph. Under the Markov property
assumption, the induced probability on X can then be factorised as P(X) = ∏_i P(X_i | PA_i).
        </p>
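        <p>The generation process sketched above (fit the structural equations, then sample variables in a topological order of the DAG) can be illustrated with a minimal, self-contained sketch. This is not the paper’s released code; the variable names and mechanisms below are illustrative assumptions.</p>

```python
# Illustrative sketch (not the paper's released code): once the structural
# equations are fitted, an SCM generates records by ancestral sampling,
# i.e. visiting variables in a topological order of the causal DAG.
def topological_order(parents):
    """Kahn's algorithm over a dict {variable: list of parent variables}."""
    remaining = {v: set(ps) for v, ps in parents.items()}
    order = []
    while remaining:
        ready = [v for v, ps in remaining.items() if not ps]
        assert ready, "causal graph must be acyclic"
        for v in ready:
            order.append(v)
            del remaining[v]
        for ps in remaining.values():
            ps -= set(ready)
    return order

def ancestral_sample(parents, mechanisms, n):
    """mechanisms[v] maps a partial record (dict) to a value for v."""
    order = topological_order(parents)
    records = []
    for _ in range(n):
        record = {}
        for v in order:
            record[v] = mechanisms[v](record)
        records.append(record)
    return records
```

Each mechanism only ever reads values of variables that precede it in the topological order, which is exactly what the DAG assumption guarantees.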
        <p>Let us assume that we know the causal graph G, and that we are given a dataset of i.i.d. observations.
Parametric and non-parametric approaches can be used to infer the structural equations. In this work,
we consider the following approach based on the type of X_i.</p>
        <p>For a continuous variable X_i, we model the task as a regression problem of the dependent continuous
variable X_i given the independent variables PA_i and the exogenous variable U_i. In particular, we
assume additive noise, namely, the structural equation is of the form: X_i = f_i(PA_i) + U_i, where f_i is
a regressor trained from the dataset of observations, and the exogenous variable U_i is assumed to be
empirically distributed as the residual X_i − f_i(PA_i).</p>
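        <p>The additive-noise mechanism can be sketched as follows. LinearRegression is an illustrative choice of regressor, as the text does not prescribe a specific model class for continuous variables.</p>

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Minimal sketch of the additive-noise mechanism X_i = f_i(PA_i) + U_i.
# The regressor class is an assumption; any scikit-learn regressor would do.
class ContinuousMechanism:
    def fit(self, pa, x):
        """pa: (n, d) array of parent values; x: (n,) observed child values."""
        self.model = LinearRegression().fit(pa, x)
        # The exogenous U_i is kept as the empirical residual distribution.
        self.residuals = x - self.model.predict(pa)
        return self

    def sample(self, pa, rng):
        """Draw X_i for new parent values by adding a bootstrapped residual."""
        noise = rng.choice(self.residuals, size=len(pa), replace=True)
        return self.model.predict(pa) + noise
```

Resampling residuals, rather than assuming a parametric noise distribution, matches the empirical treatment of U_i described above.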
        <p>For X_i discrete, we assume X_i ∼ Cat(p̂_i(PA_i)), namely X_i is categorically distributed with
probabilities given by the predictions of a probabilistic ML classifier p̂_i trained from the dataset of observations.
If a variable in PA_i is of set type (e.g., skills required in a job offer or those possessed by a job applicant),
we first one-hot encode all possible values in the set.</p>
        <p>For X_i of set type with possible values in {s_1, . . . , s_m}, f_i samples k ∼ U(min_i, max_i) values
without replacement, where each s_j is sampled with probability equal to the empirical conditional
distribution P(s_j ∈ X_i | PA_i) over the dataset of observations. Here, min_i and max_i are user parameters.
For the probabilistic classifier, we adopt a HistGradientBoostingClassifier from the scikit-learn Python library.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <p>In this section, we outline: (i) the formulation of causal graphs to model the HR domain for job offer
and curriculum generation and (ii) the design of a data preprocessing pipeline.</p>
      <sec id="sec-4-1">
        <title>4.1. Causal Graphs for SDG</title>
        <p>A central component of our approach is the causal graph representing the structure of job offers and
curricula. Rather than relying on automated causal discovery from observational data, we construct this
graph through expert elicitation, based on qualitative insights gathered from HR professionals. To this
end, we conducted four semi-structured interviews with HR representatives from three Italian companies
of varying sizes: one with over 50 employees, another with over 200 employees, and a third with more
than 5,000 employees. These organisations operate in distinct sectors and organisational scales,
providing a diverse perspective on hiring practices. The interviews were designed to uncover the
real-world decision-making processes behind the creation of job offers. Participants were asked to describe
the key attributes considered when drafting job postings, as well as the relationships among these
elements. The insights obtained were instrumental in defining the structure and dependencies encoded
in the causal graph, ensuring that it reflects domain-specific knowledge and practical constraints.</p>
        <p>The Process Leading to Job Offers. In the companies interviewed, the recruitment process is closely
integrated with annual budget planning. Each production unit submits its needs for occupations, which
are then reviewed and approved through an internal administrative process. These needs may arise from
various factors, including employee resignations, unplanned replacement costs, increased workloads,
or skill shortages within teams. A common sequence emerged across the interviews regarding how
job postings are formulated. The process typically begins with the budget, which is defined during
the annual planning phase. This budget plays a pivotal role in shaping the job offer: it determines the
experience of the sought candidate (e.g., junior or senior), the nature of the employment contract, and
its duration. Specifically, higher budget availability, combined with organisational needs, affects the
number of working hours and the contract type (intended as contract duration: permanent or fixed-term).
Under the guidance of the department head who initiated the staffing request, HR professionals identify
the required hard skills. Based on these, they determine the appropriate seniority level (e.g., years
of experience) and specify technical qualifications, such as degrees or certifications. This structured
approach reflects a rationale in which workforce needs are translated into a set of competencies
that candidates must possess. These competencies are typically associated with specific educational
backgrounds or levels of professional experience. Moreover, defining education and experience through
the lens of skills provides an initial validation mechanism: by requiring a degree in a specific field, HR
professionals implicitly assume that the candidate possesses the associated skills. Interestingly, some
interviews revealed that the process can also operate in reverse, starting with an ideal candidate profile
and subsequently adjusting it to fit within budget constraints. Thus, the budget may either serve as
the starting point for defining job requirements or act as a constraint to be considered after the ideal
profile has been outlined. Figure 1 shows the job posting generation process as reconstructed from the
interviews.</p>
        <p>Causal Graphs for Job Offers and Curricula. Job offers typically do not include all of the variables
of the graph from Figure 1. For instance, the budget of companies is not disclosed. Other variables
may be missing for specific data collections. In our datasets (see next section), in particular, we lack
the contract type, the salary, and the age target. To reflect these limitations, we present a simplified
version of the causal graph used in our experiments, shown in Figure 2 (left). Despite this simplification,
our SDG is designed to be fully parametric with respect to the input DAG–based causal graph. This
design enables the system to be applied to richer datasets than the one used in our current experiments,
granting broader applicability in future scenarios where more complete data is available. The causal
graph for job offers models the determination of required education and experience based on the skills
necessary for a given occupation. This reflects a form of backwards reasoning: starting from the desired
skills, the HR professionals infer the qualifications that provide a reasonable guarantee of possessing
those skills. In other words, they answer the question: What educational background and years of
experience are typically needed to ensure a candidate has the required competencies? Conversely, for job
applicants, the causal direction is reversed. A candidate’s skills are shaped by their education and years
of experience. Additionally, the job sector in which the candidate has specialised plays a significant
role in skill development. It is important to note that occupation is a more specific concept than job
sector. For example, while the job sector might be Information and Communication Technology (ICT), the
occupation could be Python Developer. In the causal graph for curricula, age is modelled as a determinant
of the years of experience a candidate has acquired. This reflects the natural progression of professional
development over time. The full structure of this graph is shown in Figure 2 (right).</p>
        <p>The graph also includes the variable working hours, which represents whether a candidate is looking
for a part-time or full-time contract. The study presented in [15] offers valuable insights into the ways
gender and societal norms influence working time patterns among men and women. It highlights
that women are significantly more likely than men to engage in part-time employment and are less
inclined to work overtime. Importantly, this trend is not merely a matter of personal preference but is
deeply embedded in prevailing social constructs. The primary driver of this disparity lies in traditional
gender roles, which continue to position women as the primary caregivers within the family. To model
such a potential bias, we introduce a directed edge from gender to working hours, with the strength
of this dependency modulated by a parameter α. This parameter allows us to control the degree
of gender–based bias in the data generation process and will be varied in the experimental analysis
presented later in the paper. An example of a job offer and a corresponding curriculum, including the
variables considered in their respective causal graphs, is provided in Table 1.</p>
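        <p>One way such a parametric gender → working hours mechanism could look is sketched below. The functional form (linearly interpolating each group’s part-time probability with α) and the baseline probability are our own illustrative assumptions; the paper only states that α modulates the strength of the dependency.</p>

```python
import numpy as np

# Hypothetical parametrisation of the gender -> working-hours edge.
# alpha = 0 removes the dependence; alpha = 1 yields maximal disparity.
def sample_working_hours(gender, alpha, base_p_part_time=0.3, rng=None):
    """gender: array of 'F'/'M'; returns 'part-time' or 'full-time' per person."""
    rng = rng or np.random.default_rng()
    gender = np.asarray(gender)
    p = np.full(gender.shape, base_p_part_time, dtype=float)
    # Shift part-time probability up for women and down for men as alpha grows.
    p = np.where(gender == "F", p + alpha * (1 - p), p * (1 - alpha))
    draws = rng.random(gender.shape)
    return np.where(draws < p, "part-time", "full-time")
```

At α = 0 both groups share the same part-time probability; at α = 1 every woman is assigned part-time and every man full-time, giving the experimenter a single knob for the injected bias.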
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Downstream Task: Ranking</title>
        <p>The primary downstream application of the SDG framework is to evaluate, or potentially train, a
ranking model that orders applicants for a given job position. In this section, we present the end-to-end
pipeline, from dataset generation to the evaluation of ranking performance and fairness metrics. To
enable the assessment of gender–based (un)fairness in ranking models, our SDG approach incorporates
the parameter α within the causal graph generating curricula (see Section 4.1). This parameter allows
for controlled interventions on gender-related attributes, facilitating rigorous fairness analysis in the
ranking process.</p>
        <p>Ranking Pipeline. Learning to Rank (LTR) methods originate from Information Retrieval [28]. They
are typically classified into three main categories w.r.t. a given user query: pointwise, where individual
documents are scored independently; pairwise, where the model learns to predict the relative order
between pairs of documents; and listwise, where the entire list of documents is considered simultaneously
to optimize a ranking metric. In the context of recruitment, the job offer represents a user query, and the
curricula identify the documents to be ranked. Since candidates may choose which job offer to apply to,
or at least to which job sector, the curricula to be ranked for a given job offer j will be denoted by C_j.</p>
        <p>
          Let us consider the case of pointwise ranking here. A model is trained based on a dataset of
observations consisting of pairs (v(c, j), r) where: v(c, j) are the fitness values of the curriculum c w.r.t. the job offer j,
and r is a relevance score assigned by an HR professional to the fitness values. The fitness values are
numbers in the interval [0, 1] that measure how much each requirement of a job offer is satisfied by a
candidate, with 0 indicating no match and 1 indicating a full match. With reference to the graphs of
Figure 1, a fitness value is computed for skills (resp., working hours/education and
qualification/experience) of the candidate w.r.t. skills (resp., working hours/education and qualification/experience) of the
job offer. For instance, the match for skills can be computed as the fraction of skills required by the job
offer that the candidate possesses. Regarding the relevance score r, when the data is collected from
past candidate selections, it is provided by the HR professional assessing the candidates. In the case of
synthetically generated data, such information is lacking. To experiment with our SDG, we assume a
linear model:
r(c, j) = w · v(c, j) + ε
(2)
where w ∈ R^d is a weight vector that encodes the importance of each fitness value and ε ∼ N(0, σ^2) is
a small noise value modelling uncertainty. Due to their interpretability, linear models are often adopted
in real ranking systems. Eq. (2) provides the “ground-truth” for evaluating a learned ranking model. We
emphasise that, while an actual recruitment system would utilise a ranker such as LambdaMART or
ListNet with scores assigned by HR, training and evaluating such models is beyond the scope of this
work. Indeed, in our experiment, we are interested in how fairness measurement changes at varying
values of the weights w.
        </p>
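        <p>The skill-fitness computation and the linear ground-truth score of Eq. (2) can be sketched as follows; the example skill sets and weight values are illustrative, not taken from the paper.</p>

```python
import numpy as np

# Sketch of the fitness computation and the ground-truth relevance of Eq. (2).
def skill_fitness(required, possessed):
    """Fraction of the job offer's required skills that the candidate has."""
    required, possessed = set(required), set(possessed)
    return len(required & possessed) / len(required) if required else 1.0

def relevance(v, w, sigma=0.01, rng=None):
    """r(c, j) = w . v(c, j) + eps, with eps ~ N(0, sigma^2)."""
    rng = rng or np.random.default_rng()
    return float(np.dot(w, v) + rng.normal(0.0, sigma))

# Example: a candidate matching half of the required skills and the
# requested working hours (fitness vector of two components, weights 0.7/0.3).
v = [skill_fitness({"python", "sql"}, {"python", "excel"}), 1.0]
score = relevance(v, w=[0.7, 0.3], sigma=0.0)   # deterministic with sigma = 0
```

Setting σ = 0 removes the noise term and makes the ground-truth score reproducible, which is convenient when checking rankings by hand.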
        <p>For a fixed job offer j, the ground-truth ranking of curricula in C_j is the one obtained by descending
value of r(c, j). Candidates with the same score will have the same ranking position. Let us denote by
τ_c the rank position of c. We further denote by τ̂_c the rank position of c as determined by another (e.g.,
a learned) ranking model.</p>
        <p>Performance and Fairness Metrics. Several evaluation metrics have been considered to quantify
the performance of ranking models. They measure how close τ̂ and τ are over all ranked candidates
(Kendall’s tau, Spearman’s rho) or over the top k ranked candidates (precision, NDCG), for a fixed
job offer j, or for a set of job offers (average precision). In this paper, however, we are not evaluating
ranking models, but rather the properties of the datasets generated by our SDG. The performance
metric of an SDG is the fidelity of the generated data to the distribution of real data. However, due to the
issues mentioned regarding the collection of representative real data, we opt for a “by-design” approach,
assuming that the causal graphs depicted in Figure 2 are faithful to the true distribution of job offers
and curricula. The conditional distribution of a variable given its parent nodes is enforced by fitting the
structural equations as described in Section 3.2. Whether or not the causal graphs from Figure 2 are
valid in a specific context, such as countries or industry sectors, remains to be determined context by
context, possibly adapting the causal graphs in case of different/additional variables that are observable.</p>
        <p>
The other property that our SDG can model is the bias of the generated data. We link such a notion
to the (un)fairness of the downstream task. Several fairness metrics have been considered in the
literature [29, 30]. Since we do not consider a ranking model learned from data, we restrict ourselves
to metrics that regard the ground-truth ranking model from Eq. (2), possibly at the variation of the
weights w. We will consider the group fairness metrics [29] of demographic parity (DP) and
normalised discounted difference (rND):
DP_k(w) = 1 − (P(π(c) ≤ k | c protected, c ∈ C) − P(π(c) ≤ k | c unprotected, c ∈ C))
(3)
DP_k(w) ranges in [0, 2], with 2 denoting full unfairness against the protected group (only candidates
from the unprotected group are selected in the top-k positions), 1 denoting fairness (equal probability
of being chosen among protected and unprotected), and 0 denoting reverse unfairness against the
unprotected group. The rND metric improves over DP by considering: (1) the deviation of the fraction
of protected candidates in the top-k positions from the proportion of protected candidates applying for
the position; (2) multiple thresholds k.
        </p>
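<p>The two group fairness computations can be sketched as follows on precomputed rank positions. Demographic parity is evaluated at a cutoff k, and the rND term is the log-discounted absolute deviation of the protected share in the top-k, left unnormalised (the factor Z is omitted); the ranks and group labels are invented for illustration:</p>
<preformat>
```python
import math

def demographic_parity_at_k(ranks, protected, k):
    # DP_k = 1 - (P(rank in top-k | protected) - P(rank in top-k | unprotected))
    prot = [r for r, p in zip(ranks, protected) if p]
    unprot = [r for r, p in zip(ranks, protected) if not p]
    p_prot = sum(1 for r in prot if k >= r) / len(prot)
    p_unprot = sum(1 for r in unprot if k >= r) / len(unprot)
    return 1.0 - (p_prot - p_unprot)

def rnd_unnormalised(ranks, protected, ks=(5, 10, 15, 20)):
    # Log-discounted absolute deviation of the protected share in the top-k
    # from the overall protected share; dividing by the normalising factor Z
    # (the maximum attainable value) would map the result into [0, 1].
    share = sum(protected) / len(ranks)
    total = 0.0
    for k in ks:
        topk = sum(1 for r in ranks if k >= r)
        topk_prot = sum(1 for r, p in zip(ranks, protected) if p and k >= r)
        total += abs(topk_prot / topk - share) / math.log2(k)
    return total

# Toy example: 20 candidates, the protected group occupies the top 10 ranks.
ranks = list(range(1, 21))
protected = [True] * 10 + [False] * 10
print(demographic_parity_at_k(ranks, protected, 5))  # below 1: protected over-represented
```
</preformat>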
<p>
rND(w) = (1/Z) ∑_{k=5,10,15,...} (1/log₂(k)) · | |{c ∈ C : π(c) ≤ k, c protected}| / |{c ∈ C : π(c) ≤ k}| − |{c ∈ C : c protected}| / |C| |
(4)
where Z is a normalising factor (making the metric comparable across different |C|'s) so that rND(w) ∈ [0, 1],
where 0 means fairness (proportional representation of protected candidates in the top positions).
        </p>
        <p>
Bias Parametrization. We describe here how the parameter β of the causal graph of the curricula
(Figure 2 right) can be used to control the degree of bias induced in the generated datasets. Let us denote
the variables gender and working hours by G and W. G is the sensitive variable, assuming the value 0 for
male (unprotected group) or 1 for not male (protected group), while W assumes the value f for full-time or
p for part-time. When there is no edge between G and W, the generation of working hours boils down
to sampling from the empirical distribution P(W | G) over the dataset of observations. The parameter β, which we
assume to be a pair β = (β₀, β₁), can then be used to shift such a distribution either for the protected
group, through β₀, or for the unprotected group, through β₁. Exponential tilting, power transformation,
or other probability-shifting/skewing methods can be used. We adopt here exponential tilting, for which
the shifted conditional probabilities are: P′(W = f | G = 0) = e^(−β₁/2)/Z₁ · P(W = f | G = 0), and
P′(W = p | G = 0) = e^(β₁/2)/Z₁ · P(W = p | G = 0), where Z₁ is a normalising constant. Similarly,
we shift the conditional probability of the protected group: P′(W = f | G = 1) = e^(−β₀/2)/Z₀ · P(W =
f | G = 1), and P′(W = p | G = 1) = e^(β₀/2)/Z₀ · P(W = p | G = 1), where Z₀ is a normalising
constant.
        </p>
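<p>The exponential tilting step can be sketched for the binary working-hours variable as follows. The full-time shares in the example are the empirical ones reported in Section 5 (76% for male applicants and 41% for not-male applicants), under the e^(±β/2) tilting form above:</p>
<preformat>
```python
import math

def tilt(p_full, beta):
    # Exponential tilting of a binary full-/part-time distribution:
    # the full-time mass is scaled by exp(-beta/2), the part-time mass
    # by exp(+beta/2), then renormalised (the constant Z). Larger beta
    # shifts probability mass towards part-time.
    w_full = math.exp(-beta / 2.0) * p_full
    w_part = math.exp(beta / 2.0) * (1.0 - p_full)
    return w_full / (w_full + w_part)  # new probability of full-time

print(round(tilt(0.76, 0.0), 2))   # beta = 0 leaves the distribution unchanged
print(round(tilt(0.41, -1.5), 2))  # negative beta skews towards full-time
```
</preformat>
<p>Consistently with the observation in Section 5, tilting the not-male distribution with β₀ = −1.5 brings its full-time share (41%) close to the male one (76%).</p>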
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Experiments</title>
<p>Experimental Settings. In this section, we illustrate some experiments on the functionalities of our
SDG. For each experimental parameter (β₀ and β₁, discussed later), we conducted 10 runs, in each of
which we generated 300 job offers and 1,000 curricula. For each job offer, all curricula are
considered as candidates. The motivation behind the generated data size is that, in our ranking pipeline,
there is no effective model training process, except for the generator. In particular, the relevance score
for each candidate is calculated using Eq. (2), and the ranking is produced accordingly. Hence, the number
of job offers and curricula generated for each run represents a good compromise between observing the
changes in the metrics and the execution time.</p>
<p>Data is generated according to the causal graphs shown in Figure 2, with the structural equations fitted
from the preprocessed datasets described in Appendix A. The share of male and not-male (women and
non-binary) applicants is 50%-50%. Regarding the ranking model pipeline, the fitness values between
curriculum and job offer characteristics are defined through matching functions (MFs). In particular, the
MF for education returns 1 if the candidate's education level is equal to or greater than the level required
in the job offer, and 0 otherwise. The MF for experience checks whether the candidate's experience falls
within the interval required by the job offer. The MF for skills calculates the fraction of skills required
by the job offer that the candidate possesses. Finally, the MF for working hours tests for equality between
the form required in the job offer and the form desired by the candidate. For example, for the data from
Table 1, the fitness value vector is f(c, j) = [1.0, 1.0, 1/3, 1.0], respectively, for the MFs of the features
education, experience, skills, and working hours.</p>
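<p>The four matching functions can be sketched as follows; the candidate/job pair is a hypothetical example (not the data from Table 1) in which the candidate holds one of the three required skills, so the resulting fitness vector has the same form [1.0, 1.0, 1/3, 1.0]:</p>
<preformat>
```python
def fitness_vector(cv, job):
    # MF education: the candidate's EQF level meets or exceeds the required one.
    edu = 1.0 if cv["education"] >= job["education"] else 0.0
    # MF experience: the candidate's years fall within the required interval.
    lo, hi = job["experience"]
    exp = 1.0 if (cv["experience"] >= lo and hi >= cv["experience"]) else 0.0
    # MF skills: fraction of the required skills that the candidate possesses.
    required = set(job["skills"])
    skl = len(required.intersection(cv["skills"])) / len(required)
    # MF working hours: exact match of the contract form.
    hours = 1.0 if cv["hours"] == job["hours"] else 0.0
    return [edu, exp, skl, hours]

# Hypothetical pair: the candidate holds one of the three required skills.
job = {"education": 4, "experience": (2, 5),
       "skills": {"sql", "python", "spark"}, "hours": "full-time"}
cv = {"education": 6, "experience": 3,
      "skills": {"python", "java"}, "hours": "full-time"}
print(fitness_vector(cv, job))  # [1.0, 1.0, 0.333..., 1.0]
```
</preformat>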
<p>In the following experiments, we investigate how changing the working hours preferences conditional
on gender affects the fairness metrics DP and rND. These experiments aim to simulate social
norms dictating that women should prioritise family responsibilities over their careers, leading them
to opt for part-time contracts that offer more free time but may limit career advancement opportunities.
The generated job offers exhibit a skewed distribution of working hours: 86.6% of the positions are
full-time contracts, while only 13.3% are part-time contracts. The generated curricula also have skewed
distributions: male applicants prefer full-time contracts (76%), while not-male applicants prefer
part-time contracts (59%). We vary β = (β₀, β₁) to investigate shifted distributions. Table 2
presents the average shifted distributions over the 10 experimental runs for a few values of β₀ and β₁ (see Section
4.2). In the experiments, we vary β₀ and β₁ from −4.0 to 4.0.</p>
<p>In particular, we denote by β₀ the parameter for the “No-man” distribution and by β₁ the one for
“Man”. We recall that larger β's result in a redistribution of probability mass towards part-time. Notice
that for β₀ = −1.5, the distributions of male and not-male are almost identical. The same occurs for
β₁ = 1.5.</p>
<p>We explore four weighting vectors for the ranking model of Eq. (2). All of them assign the same
weights to the fitness values of education (0.8), experience (0.5), and skills (1.0), and set ε ∼ N(0, 0.01²),
while for working hours we consider four cases: 0, 0.5, 0.8, and 1.0. Since gender can only affect the
score through working hours (cf. Figure 2 right), when the weight is 0, the scores of the ranking
model are independent of gender; hence, the ranking model is fair.</p>
<p>We emphasise that the weights considered for the experiment correspond to a rough approximation
of general HR decision-making. Whether or not they constitute the effective importance values in a
candidate evaluation is beyond the scope of this work. What matters is evaluating the fairness metrics
at varying values of the working hours weight and of the β's.</p>
<p>Results. Figure 3a illustrates the behaviour of Demographic Parity (DP) under the not-male conditional
distribution shift. When the parameter β₀ is negative, the distribution of preferences for not-male
candidates becomes skewed towards full-time employment. This shift results in a more balanced alignment
between job requirements and candidate preferences. Hence, DP is close to the fair value of 1 for all weights of
working hours in the ranking model. However, as β₀ increases, the distribution shifts towards part-time
contracts. Given the dominance of full-time job offers, this misalignment causes the ranking system to
increasingly favour candidates from the unprotected group, i.e., males. As a result, DP values decline
significantly, highlighting a growing disparity and reduced fairness for the protected group, except for
the ranking model in which the mediating variable working hours has zero weight.</p>
<p>A reversed trend is observed in Figure 3b, which illustrates the variation of DP with respect to
changes in the parameter β₁. For negative values of β₁, male candidates are predominantly associated
with full-time contracts. As a result, they are favoured by the ranking model (except in the case where
the model assigns zero weight to the working hours feature), leading to high DP values that indicate
potential unfairness against not-male candidates. As β₁ increases, the distribution of male candidates
shifts towards part-time contract preferences. This shift reduces their alignment with the majority of
job offers, which are primarily full-time, and causes the model to assign higher scores to not-male
candidates. Hence, DP values fall below 1, highlighting unfairness against male candidates. Notice that
fairness (DP ≈ 1) is achieved for all ranking models when the conditional distributions of working hours
are the same for male and not-male candidates, namely for β₁ = 1.5 (cf. also Table 2).</p>
<p>Similar patterns can be observed for the rND metric in Figures 3c and 3d. The metric is computed
by considering k = 5, 10, 15, 20 in Eq. (4). Unlike Demographic Parity (DP), which measures average
differences in exposure between groups, rND captures absolute deviations. As a result, trends where
DP deviates from 1 correspond to elevated rND values. This reflects a greater degree of ranking
disparity, as rND penalises any deviation from perfect parity, regardless of direction. As with DP, the
most favourable rND values are observed at β₀ = −1.5 and β₁ = 1.5, corresponding to the settings
where the conditional distributions of working hours are equivalent for male and not-male candidates.
These configurations yield the most balanced exposure across groups, minimising both average and
absolute disparities in ranking outcomes.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
<p>We introduced a synthetic data generation system for recruitment, based on expert-informed causal
graphs that enable explicit control over bias and interpretability. Through causal modelling, we assessed
the fairness impact of biased data on a linear point-wise ranking model using the DP and rND metrics. Our
experiments demonstrate that controlled distributional shifts in the generative process can significantly
influence ranking fairness, positively or negatively, especially as the weight of bias-related features
increases. To foster further research in high-risk human recommendation scenarios, we will release the
system as open-source software3.</p>
<p>Despite promising results, two main limitations remain: the current approach supports only tabular
data and depends on the availability of both high-quality training data and well-structured causal graphs;
additionally, our experiments suffered from heterogeneous and non-standardised training data, which
required approximations during preprocessing and may have reduced representativeness. Future work
will focus on (i) comparing our approach with the closest related works, pointing out the advantages and
limitations of the generative methods, (ii) enhancing feature engineering to expand the causal graphs with
additional attributes, (iii) analysing more complex discrimination scenarios, such as intersectional bias,
and (iv) integrating external knowledge, such as the ESCO ontology, into the SDG to improve the
diversity of the generated data.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, I used Grammarly in order to improve my grammar and spelling
checking. After using these tools, I reviewed and edited the content as needed.</p>
      <p>[13] A. Komanduri, X. Wu, Y. Wu, F. Chen, From identifiable causal representations to controllable
counterfactual generation: A survey on causal generative modeling, Trans. Mach. Learn. Res. 2024
(2024).
[14] G. Pavlidis, Unlocking the black box: Analysing the EU artificial intelligence act’s framework for
explainability in AI, CoRR abs/2502.14868 (2025).
[15] J. Mazei, N. Backhaus, A. M. Wöhrmann, C. Brauner-Sommer, J. Hüffmeier, Similar, but different:
gender differences in working time arrangements and the work–life interface, Collabra: Psychology
9 (2023) 87546.
[16] K. W. Bowyer, N. V. Chawla, L. O. Hall, W. P. Kegelmeyer, SMOTE: synthetic minority
oversampling technique, CoRR abs/1106.1813 (2011).
[17] Y. Kossale, M. Airaj, A. Darouichi, Mode collapse in generative adversarial networks: An overview,
in: 2022 8th International Conference on Optimization and Applications (ICOA), IEEE, 2022, pp.
1–6.
[18] A. Schoen, G. Blanc, P. Gimenez, Y. Han, F. Majorczyk, L. Mé, A tale of two methods: Unveiling
the limitations of GAN and the rise of bayesian networks for synthetic network traffic generation,
in: EuroS&amp;P Workshops, IEEE, 2024, pp. 273–286.
[19] H. M. Combrink, V. Marivate, B. Rosman, Comparing synthetic tabular data generation between
a probabilistic model and a deep learning model for education use cases, CoRR abs/2210.08528
(2022).
[20] D. Xu, S. Yuan, L. Zhang, X. Wu, Fairgan: Fairness-aware generative adversarial networks, in:</p>
      <p>IEEE BigData, IEEE, 2018, pp. 570–575.
[21] M. Abroshan, A. Elliott, M. M. Khalili, Imposing fairness constraints in synthetic data generation,
in: AISTATS, volume 238 of Proceedings of Machine Learning Research, PMLR, 2024, pp. 2269–2277.
[22] B. van Breugel, T. Kyono, J. Berrevoets, M. van der Schaar, DECAF: generating fair synthetic data
using causally-aware generative networks, in: NeurIPS, 2021, pp. 22221–22233.
[23] F. Mazzoni, M. M. Manerba, M. Cinquini, R. Guidotti, S. Ruggieri, GenFair: A genetic
fairness-enhancing data generation framework, in: DS, volume 14276 of Lecture Notes in Computer Science,
Springer, 2023, pp. 356–371.
[24] E. Barbierato, M. L. D. Vedova, D. Tessera, D. Toti, N. Vanoli, A methodology for controlling bias
and fairness in synthetic data generation, Applied Sciences 12 (2022) 4619.
[25] J. D. Smedt, M. le Vrang, A. Papantoniou, ESCO: towards a semantic web for the european labor
market, in: LDOW@WWW, volume 1409 of CEUR Workshop Proceedings, CEUR-WS.org, 2015.
[26] F. Li, A forecaster’s review of judea pearl’s causality: Models, reasoning and inference, second
edition, 2009, CoRR abs/2308.05451 (2023).
[27] A. R. Nogueira, A. Pugnana, S. Ruggieri, D. Pedreschi, J. Gama, Methods and tools for causal
discovery and causal inference, WIREs Data Mining Knowl. Discov. 12 (2022).
[28] T. Liu, Learning to Rank for Information Retrieval, Springer, 2011. URL: https://doi.org/10.1007/
978-3-642-14267-3. doi:10.1007/978-3-642-14267-3.
[29] E. Pitoura et al., Fairness in rankings and recommendations: an overview, VLDB J. 31 (2022) 431–
458. URL: https://doi.org/10.1007/s00778-021-00697-y. doi:10.1007/S00778-021-00697-Y.
[30] M. Zehlike, K. Yang, J. Stoyanovich, Fairness in ranking, part I: score-based ranking, ACM Comput.</p>
      <p>Surv. 55 (2023) 118:1–118:36.
[31] G. K. Palshikar, S. Pawar, A. S. Banerjee, R. Srivastava, N. Ramrakhiyani, S. Patil, D. Thosar, J. Bhat,
A. Jain, S. Hingmire, S. Chaurasia, P. Mandloi, D. Chalavadi, RINX: A system for information and
knowledge extraction from resumes, Data Knowl. Eng. 147 (2023) 102202.
[32] C. Gan, T. Mori, A few-shot approach to resume information extraction via prompts, in: NLDB,
volume 13913 of Lecture Notes in Computer Science, Springer, 2023, pp. 445–455.</p>
    </sec>
    <sec id="sec-8">
      <title>A. Data Preprocessing for SDG</title>
<p>The other key input of the SDG consists of two datasets of i.i.d. observations of job offers and curricula.
From such datasets, we can derive the structural equations of the SCM as described in Section 3.2.
However, several preprocessing steps are necessary to ensure that the datasets are of high quality and
suitable for modelling. To facilitate this, our system provides a suite of APIs designed to perform various
data transformations aimed at standardising and enriching the raw data. We take advantage of a Large
Language Model (LLM)4 to align the raw data with the ESCO taxonomy (see Section 3.1). The task
of extracting features from curricula and job offers through LLMs is receiving increasing attention,
and recent studies have addressed similar challenges [31, 32]. It is important
to emphasise that, while ESCO and the EQF are EU-centric taxonomies, they serve as tools to ensure
comparability between job offers and curricula. Our proposed methodology is modular, incorporating
external knowledge, such as the O*NET ontology adopted in the US.</p>
<p>In this section, we describe the preprocessing procedures applied to two raw data sources: a
real-world dataset of 10,000 job offers (J) and a semi-synthetic dataset of 1,020 curricula (C). The job
offers dataset was provided by a major recruiting company based in Spain. The curricula dataset was
obtained from the FINDHR project (https://github.com/findhr), which combines features extracted from
real-world curricula voluntarily donated for research purposes. Several challenges arise when working
with these two data sources. First, the job offers are written in Spanish, whereas the curricula are
in English. Second, the underlying taxonomies for education, qualifications, and related attributes
differ significantly between the two datasets. The following subsections detail the preprocessing
transformations applied to harmonise these datasets, organised by the features represented in the two
DAGs shown in Figure 2.</p>
<p>Preprocessing Education and Qualifications. Education and qualification naming conventions
vary significantly across countries and may also differ between the two datasets, J and C. For instance,
in Spain, educational credentials are typically expressed using the national system, including degrees
such as “Bachillerato”, “Ingeniería Técnica”, and “Ciclo Formativo de Grado Superior”. In contrast,
English-language curricula often refer to broader categories such as “Bachelor's Degree” or “Master's Degree”
to denote tertiary education levels. To enable meaningful cross-country comparisons, we adopted
a mapping developed by domain experts to translate raw educational and qualification data into
European Qualifications Framework (EQF) levels (see Section 3.1). For example, “Bachillerato”
corresponds to EQF Level 4, while “Ingeniería Técnica” maps to EQF Level 6, which is generally equivalent
to a Bachelor's degree in other national systems.</p>
<p>Job offer descriptions typically specify a single education level, usually indicating the minimum
required qualification for the position. In contrast, curricula often list multiple educational achievements.
Thus, determining a candidate's EQF level in the C dataset involves several steps. We begin by applying
a keyword-based approach to extract and classify educational entries. For example, a phrase such as
“BSc in Computer Science” is interpreted as a Bachelor's degree. These inferred degree types are then
mapped to their corresponding EQF levels. When a candidate's CV includes multiple qualifications, we
select the highest EQF level as a representative indicator of their overall educational attainment.</p>
      <p>Preprocessing Occupation and Job Sector. In the J dataset, there is a “job title” feature that
captures the title associated with each job offer. However, these titles exhibit considerable variability in
both structure and specificity. For example, some entries are well-defined, such as “Data Scientist” or
“ICT System Architect”, while others are either overly generic (e.g., “Developer”) or excessively detailed
(e.g., “Web Developer (PHP, JS proficiency) – full remote contract”). This inconsistency often stems from
differing practices among HR departments. Some prefer concise titles with detailed descriptions, while
others embed extensive information directly into the job title field. Such heterogeneity complicates the
task of comparing or categorising job offers in a standardised manner. To address this, we implemented
a multi-phase alignment process.</p>
      <sec id="sec-8-1">
<title>Footnote 4: Specifically, we use gemma2-9B, available at https://huggingface.co/google/gemma-2-9b.</title>
<p>The alignment process leverages LLMs and the ESCO taxonomy to normalise job titles
across the dataset. Such an alignment process consists of three steps. Step 1: Title Refinement via
LLM. We use an LLM to generate a cleaner, more representative job title based on the original title
and the accompanying job description. This step reduces noise and ambiguity, producing titles that
better reflect the underlying occupation. Step 2: ESCO Occupation Retrieval. We query the ESCO API
(https://esco.ec.europa.eu/en/use-esco/use-esco-services-api) using the LLM-generated job title. The
API returns a list of relevant ESCO occupations, which serve as candidate labels for standardisation. This
step is crucial, as the ESCO search engine performs more effectively when provided with well-structured
input. Step 3: Final Classification via LLM. We use the LLM again to select the most appropriate ESCO
occupation label from the list retrieved in the previous step. This ensures that each job title (i.e.,
occupation in the terminology of Figure 2) is mapped to a standardised occupational category.</p>
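<p>The three-step loop can be sketched as follows; llm_refine, esco_search, and llm_select are hypothetical stand-ins (trivial stubs here) for the two LLM calls and the ESCO API query, not real client functions:</p>
<preformat>
```python
def llm_refine(title, description):
    # Step 1 (stub): an LLM would rewrite a noisy title into a cleaner one;
    # here we just drop parenthesised details and trailing whitespace.
    return title.split("(")[0].strip()

def esco_search(title):
    # Step 2 (stub): the ESCO API would return candidate occupation labels;
    # here a tiny in-memory catalogue stands in for the service.
    catalogue = {"web developer": ["web developer", "software developer"]}
    return catalogue.get(title.lower(), [])

def llm_select(title, candidates):
    # Step 3 (stub): an LLM would pick the best label; here, the first one.
    return candidates[0] if candidates else None

def align_title(raw_title, description):
    refined = llm_refine(raw_title, description)
    return llm_select(refined, esco_search(refined))

print(align_title("Web Developer (PHP, JS proficiency) - full remote contract", ""))
```
</preformat>
<p>Returning None when no candidate label is retrieved mirrors the case where the ESCO search yields no usable occupation.</p>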
<p>For the C dataset, job sectors were already relatively normalised. Therefore, we applied only the
second and third steps of the alignment process: querying the ESCO API and classifying the result
using the LLM. Through the above pipeline, we achieved a consistent ESCO-based standardisation of
occupation and job sector across both datasets, enabling reliable comparisons and downstream modelling.</p>
        <p>Preprocessing Skills. The skill domains in the two datasets differ significantly, making direct
comparison non-trivial. The objective of the skills alignment process is to produce an ESCO-compliant
list of skills for both J and C. To achieve this, we design a multi-step procedure that leverages contextual
information, specifically the ESCO occupation in J or the ESCO job sector in C, previously aligned (see
Section 4.1).</p>
        <p>The first step involves separating skills that are already ESCO-compliant from those that are not. This
is accomplished by matching skill terms against the ESCO taxonomy depending on the language of the
dataset. Then, we search the terms among the principal concepts, known as preferred labels, and the
associated synonyms provided by the ESCO APIs.</p>
        <p>In the second step, we focus on the non-ESCO skills. These are further classified into three categories
using the LLM: “Competence and Knowledge”, “Software, Frameworks, Tools and Similar”, and “Other”.
The classification is performed by using an in-context learning approach, where the LLM receives not
only the skill to be classified but also contextual information such as the job sector and the surrounding
skill set. This context significantly improves classification accuracy by allowing the model to consider
semantic relationships beyond isolated terms. For skills categorised as “Competence and Knowledge”, we
query the ESCO API to retrieve a list of relevant ESCO skill terms. The LLM is then used to select the
most semantically appropriate term based on the provided context. For “Software, Frameworks, Tools and
Similar”, we first manually selected a set of ten representative ESCO concepts. The LLM then associates
each technical skill with the most relevant concept from this list. This step was necessary because the
ESCO API often fails to return meaningful results for highly specialised technical terms not covered by
the taxonomy. Skills classified as “Other” were deemed too noisy or incomplete by the LLM and were
excluded from further processing.</p>
<p>In the final step, we isolate language-related skills from the remaining set and categorise all
ESCO-compliant skills into “Hard Skills” and “Soft Skills” following the ESCO classification scheme. The result
is a harmonised and structured skill set for each dataset, consisting of three categories: “Hard Skills”,
“Soft Skills”, and “Language Skills”. Figure 4 provides a visual overview of the preprocessing pipeline for
skills.</p>
<p>Working Hours and Contract Type. Job offer descriptions often include various contractual details.
Based on this observation, we designed a task for the LLM to extract two specific types of information:
working hours and contract type. For each job offer description, we issue two separate prompts to
the LLM, each consisting of a query accompanied by the relevant context. The first prompt aims to
classify the job as either a full-time or a part-time contract, based on the information provided in the job
description. In cases where explicit references to working hours are absent, the LLM is instructed to
infer the appropriate classification from the surrounding context. The second prompt focuses on identifying
the nature of the contract duration: “fixed-term” or “permanent”. This classification follows the same
procedure as the working hours task, relying on both explicit cues and contextual inference when
necessary. This approach enabled the extraction of structured contractual information from unstructured
job descriptions, contributing to a more comprehensive and standardised representation of job offers.
Finally, we emphasise that although we extracted the “Contract Type” feature from the job descriptions,
we observed that it did not align with the curriculum characteristics. Consequently, we opted not to
include this feature in the experiments, to preserve simplicity and interpretability in the results.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Trapp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stenger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Leppich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kounev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Leznik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. T.</given-names>
            <surname>Foster</surname>
          </string-name>
          ,
          <article-title>Comprehensive exploration of synthetic data generation: A survey</article-title>
          ,
          <source>CoRR abs/2401</source>
          .02524 (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Abufadda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Mansour</surname>
          </string-name>
          ,
          <article-title>A survey of synthetic data generation for machine learning</article-title>
          ,
          <source>in: ACIT</source>
          , IEEE,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H.</given-names>
            <surname>Murtaza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. F.</given-names>
            <surname>Khan</surname>
          </string-name>
          , G. Murtaza,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zafar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bano</surname>
          </string-name>
          ,
          <article-title>Synthetic data generation: State of the art in health care domain</article-title>
          ,
          <source>Comput. Sci. Rev</source>
          .
          <volume>48</volume>
          (
          <year>2023</year>
          )
          <fpage>100546</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Assefa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dervovic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mahfouz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Tillman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Reddy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Veloso</surname>
          </string-name>
          ,
          <article-title>Generating synthetic data in finance: opportunities, challenges and pitfalls</article-title>
          , in: ICAIF, ACM,
          <year>2020</year>
          , pp.
          <volume>44</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>44</lpage>
          :
          <fpage>8</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Beretta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ercoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ferraro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Guidotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Iommi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mastropietro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Monreale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Rotelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ruggieri</surname>
          </string-name>
          ,
          <article-title>Requirements of explainable AI in algorithmic hiring</article-title>
          , in: AIMMES, volume
          <volume>3744</volume>
          of
          <source>CEUR Workshop Proceedings</source>
          , CEUR-WS.org,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <article-title>Machine learning for synthetic data generation: a review</article-title>
          ,
          <source>CoRR abs/2302.04062</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Dahmen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Cook</surname>
          </string-name>
          ,
          <article-title>Synsys: A synthetic data generation system for healthcare applications</article-title>
          ,
          <source>Sensors</source>
          <volume>19</volume>
          (
          <year>2019</year>
          )
          <fpage>1181</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Torfi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Fox</surname>
          </string-name>
          ,
          <article-title>Corgan: Correlation-capturing convolutional generative adversarial networks for generating synthetic healthcare records</article-title>
          , in: FLAIRS, AAAI Press,
          <year>2020</year>
          , pp.
          <fpage>335</fpage>
          -
          <lpage>340</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Peters</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Janzing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schölkopf</surname>
          </string-name>
          ,
          <source>Elements of causal inference: foundations and learning algorithms</source>
          , The MIT Press,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>I. J.</given-names>
            <surname>Goodfellow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pouget-Abadie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mirza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Warde-Farley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ozair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Courville</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <article-title>Generative adversarial networks</article-title>
          ,
          <source>CoRR abs/1406.2661</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D. P.</given-names>
            <surname>Kingma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Welling</surname>
          </string-name>
          ,
          <article-title>An introduction to variational autoencoders</article-title>
          ,
          <source>CoRR abs/1906.02691</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Vallverdú</surname>
          </string-name>
          ,
          <source>Causality for Artificial Intelligence - From a Philosophical Perspective</source>
          , Springer,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>