<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>EXTRACT: Explainable Transparent Control of Bias in Embeddings</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zhijin Guo</string-name>
          <email>zhijin.guo@bristol.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zhaozhen Xu</string-name>
          <email>zhaozhen.xu@bristol.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martha Lewis</string-name>
          <email>martha.lewis@bristol.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nello Cristianini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Bath</institution>
          ,
          <addr-line>Claverton Down, Bath BA2 7AY</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Bristol</institution>
          ,
          <addr-line>Beacon House, Queens Road, Bristol, BS8 1QU</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <abstract>
        <p>Knowledge Graphs are a widely used method to represent relations between entities in various AI applications, and Graph Embedding has rapidly become a standard technique to represent Knowledge Graphs in such a way as to facilitate inferences and decisions. As this representation is obtained from behavioural data, and is not in a form readable by humans, there is a concern that it might incorporate unintended information that could lead to biases. We propose EXTRACT: a suite of Explainable and Transparent methods to ConTrol bias in knowledge graph embeddings, so as to assess and decrease the implicit presence of protected information. Our method uses Canonical Correlation Analysis (CCA) to investigate the presence, extent and origins of information leaks during training, then decomposes embeddings into a sum of their private attributes by solving a linear system. Our experiments, performed on the MovieLens-1M dataset, show that a range of personal attributes can be inferred from a user's viewing behaviour and preferences, including gender, age and occupation. Further experiments, performed on the KG20C citation dataset, show that the information about the conference in which a paper was published can be inferred from the citation network of that article. We propose four transparent methods to maintain the capability of the embedding to make the intended predictions without retaining unwanted information. A trade-off between these two goals is observed.</p>
      </abstract>
      <kwd-group>
        <kwd>Fairness</kwd>
        <kwd>Knowledge graph embedding</kwd>
        <kwd>Learning representations</kwd>
        <kwd>Recommender system</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Knowledge graphs are structures that encode entities and the relationships between them, and
are useful representations of real world phenomena. Knowledge graphs can be employed in an
array of applications, including linguistic representation learning [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], question answering [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ],
multihop reasoning [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], or recommender systems [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>This study focuses on link prediction, a core task in Knowledge Graph Embedding
to which practical problems such as recommending user actions or answering entity-specific
queries can be reduced. Using user gender as an example of private information, we examine its influence on
representations and its ethical implications within AI systems.</p>
      <p>https://research-information.bris.ac.uk/en/persons/martha-lewis (M. Lewis);</p>
      <p>
        Recent studies showed that personal information can be inferred from user behaviour. A
study of Facebook users showed that certain private traits (including gender and race) could
be reconstructed on the basis of their public “likes” [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]; and a study of word embeddings
showed that these representations contained information related to gender and race that might
potentially affect the decisions of algorithms that make use of those representations [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. We
investigate this problem with specific application to detecting and removing bias in knowledge
graphs.
      </p>
      <p>
        We address issues, similar to prior word embedding research, concerning vector
representations derived from word co-occurrence [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], recasting this process as a particular facet of the
graph embedding problem. The notable “compositionality” feature in word embeddings signified
the potential for analogical reasoning and the risk of undesirable encodings, such as gender
biases [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. It was observed that these distributions also contain additional information including
associations and biases that reflect customs and practices. For example, the embeddings of
color names were not gender neutral [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], nor were those of job titles or academic disciplines.
Engineering disciplines and leadership jobs tended to be represented in a “more male” way than
artistic disciplines or service jobs [
        <xref ref-type="bibr" rid="ref6 ref9">9, 6</xref>
        ].
      </p>
      <p>
        Related Work. The presence of gender information in word embeddings was already reported
in Bolukbasi et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], in an article aptly entitled “Man is to Computer Programmer as Woman
is to Homemaker?”, as well as in Mikolov et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. An interesting possibility is the presence of
similar biases in Knowledge Graph embedding, which would lead both to opportunities and
challenges, and which would require attention. Our research significantly extends the field by
introducing detection methods under the EXTRACT framework. In a “fair classification” context,
Madras et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] aimed to predict a label $Y$ from data $X$, while ensuring fairness with respect to a
binary sensitive attribute $A$. Our detection methods are tailored to identify specific dimensions
in user embeddings that are associated with private or sensitive attributes. Notably, our linear
decomposition system reveals that user behaviour can be approximated as a sum of demographic
vectors, uncovering another layer of embedded bias.
      </p>
      <p>
        When it comes to bias removal, a range of strategies exists. Regularization methods focusing
on fairness have been explored [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. The LFR framework [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] aimed to encode non-sensitive
attributes while minimizing sensitive ones. Its extension, ALFR, has been adopted in
recommendation systems [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Works like Zhu et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] employ adversarial learning for equal score
distribution, while Bose and Hamilton [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] and Wu et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] extended ALFR’s scope by imposing
compositional fairness. Notably, Fisher et al. [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] used adversarial loss for model neutrality.
Our contributions in debiasing offer transparency and interpretability. We employ linear
transformations to identify bias direction and leverage first-moment loss to minimize distributional
disparities during training.
      </p>
      <p>Our Contributions. We introduce EXTRACT: Explainable and Transparent ConTrol of bias
in embeddings, a two-step process to detect and mitigate bias in Knowledge Graph Embeddings,
evaluated on the MovieLens 1M and KG20C citation datasets. Our detection methods
include logistic classifiers (detect-LC), canonical correlation analysis (detect-CCA), and a linear
decomposition (detect-LD). Bias removal uses two linear projection methods (remove-LP and
remove-LP-multi) and two retraining methods (remove-FM and remove-FM-multi).</p>
      <p>The proposed methods are effective and interpretable, and two of the detection methods,
detect-CCA and detect-LD, are novel. Specifically, detect-CCA constitutes a novel use of CCA, applied here
to bias detection for the first time. Additionally, detect-LD contributes a unique decomposition of the node
embedding space into interpretable dimensions, signifying attribute presence or absence - a
method not previously implemented. Our methods remove-LP and remove-LP-multi can be
applied directly to pre-existing embeddings, saving computational resources by eliminating the
need for retraining from scratch. Despite slight performance decreases, our methods are comparable to
state-of-the-art results while remaining concise, interpretable, and explainable.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Mathematical Preliminaries</title>
      <p>Knowledge Graph-Based Recommendation Algorithms. In a knowledge graph $\mathcal{G} =
(\mathcal{V}, \mathcal{F}, \mathcal{A})$, the vertices $\mathcal{V}$ represent real-world entities, each endowed with a set $\mathcal{A}$ of attributes.
Relationships between these vertices are captured in a set $\mathcal{F}$ of facts. Each fact $f \in \mathcal{F}$ is
a triple $(h, r, t)$, where $r$ belongs to a set of relations $\mathcal{R}$ and acts as a directed edge
from head $h$ to tail $t$. For instance, taking a set of users and movies as vertices, user ratings
form the edges.</p>
      <p>
        A knowledge graph embedding maps vertices to vectors while preserving graph topology; a triple is
represented by embeddings $(\mathbf{h}, \mathbf{R}, \mathbf{t})$ in $\mathbb{R}^d$. The score function $s(\mathbf{h}, \mathbf{R}, \mathbf{t})$ governs the linking between nodes.
      </p>
      <p>
        Rating Prediction. In alignment with Berg et al. [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], we define a function $P$ that, given
a triple of embeddings $(\mathbf{h}, \mathbf{R}, \mathbf{t})$, calculates the probability of the relation against all potential
alternatives:
        <disp-formula id="eq-1"><label>(1)</label><tex-math>P(\mathbf{h}, \mathbf{R}, \mathbf{t}) = \operatorname{SoftArgmax}(s(f)) = \frac{e^{s(f)}}{e^{s(f)} + \sum_{r' \neq r \in \mathcal{R}} e^{s(f')}}</tex-math></disp-formula>
In the above formula, $f = (h, r, t)$ denotes a true triple, and $f' = (h, r', t)$ denotes a corrupted
triple, that is, a randomly generated one, which we use as a proxy for a negative example (a pair of
nodes that are not connected).
      </p>
      <p>
        Assigning numerical values to the relations $r$, the predicted relation is then just the expected value
$\text{prediction} = \sum_{r \in \mathcal{R}} r \, P(\mathbf{h}, \mathbf{R}, \mathbf{t})$. In our application of viewers and movies, the set of relations
$\mathcal{R}$ could be the possible ratings that a user can give a movie. The predicted rating is then
the expected value of the ratings, given the probability distribution produced by the scoring
function. Here $s(f)$ refers to the scoring function in Yang et al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
      </p>
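      <p>The rating prediction above can be illustrated with a minimal NumPy sketch. It assumes a bilinear-diagonal (DistMult-style) score as a stand-in for the scoring function of Yang et al.; the function names, the random embeddings, and the five rating values are hypothetical and are not taken from the paper's code.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def distmult_score(h, R, t):
    """Assumed score: bilinear-diagonal product of head, relation and tail."""
    return float(np.sum(h * R * t))

def rating_distribution(h, relations, t):
    """Softmax over the scores of all candidate relations (Equation 1)."""
    scores = np.array([distmult_score(h, R, t) for R in relations])
    scores -= scores.max()              # subtract the max for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()

def expected_rating(h, relations, t, values):
    """Probability-weighted mean of the numerical relation values."""
    return float(np.dot(rating_distribution(h, relations, t), values))

rng = np.random.default_rng(0)
d = 8                                             # hypothetical embedding size
user, movie = rng.normal(size=d), rng.normal(size=d)
rating_embeddings = rng.normal(size=(5, d))       # one relation per rating 1..5
rating_values = np.arange(1, 6)

probs = rating_distribution(user, rating_embeddings, movie)
pred = expected_rating(user, rating_embeddings, movie, rating_values)
print(probs, pred)
```
]]></preformat>
      <p>Because the prediction is an expectation over the rating distribution, it is a real number between the lowest and highest rating rather than one of the discrete ratings.</p>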
      <p>
        To learn a graph embedding, we follow the setting of Bose and Hamilton [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] and minimise
        <disp-formula id="eq-2"><label>(2)</label><tex-math>L = -\sum_{f \in \mathcal{F}} \log \frac{e^{s(f)}}{e^{s(f)} + \sum_{f' \in \mathcal{F}'} e^{s(f')}}</tex-math></disp-formula>
This loss function maximises the probability of true triples $f$ and minimises the probability
of corrupted triples $f'$.
      </p>
      <sec id="sec-2-1">
        <title>Entity prediction</title>
        <p>
          Unlike the MovieLens 1M setting, where the quantity to predict is the relation (the movie rating),
here we use a margin-based loss [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] to predict the tail entities (which could be a paper, affiliation or
domain). In this loss function, $f$ represents a true triple while $f'$ represents a
corrupted triple:
          <disp-formula id="eq-3"><label>(3)</label><tex-math>L = \sum_{f \in \mathcal{F}} \sum_{f' \in \mathcal{F}'} \left[ \gamma + s(f') - s(f) \right]_{+}</tex-math></disp-formula>
Here, $[\cdot]_{+}$ means keeping the positive part of the loss function, $\gamma$ is the margin, and we want $s(f) &gt; s(f')$.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Evaluation Metrics</title>
        <p>We use four metrics to evaluate our performance on the link prediction
task: root mean square error (RMSE, $\sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2}$, where $\hat{y}_i$ is our predicted
relation and $y_i$ is the true relation); Hits@K, the probability that our target value is in the top
$K$ predictions; mean rank (MR), the average ranking of each prediction; and mean reciprocal
rank (MRR). These are standard metrics
in the knowledge graph embedding community.</p>
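        <p>The four metrics can be computed as in the sketch below. The function names are illustrative, and the ranks are assumed to be 1-based positions of the true entity among the model's ranked predictions.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def rmse(predicted, actual):
    """Root mean square error between predicted and true relation values."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

def ranking_metrics(ranks, k=10):
    """Hits@K, mean rank and mean reciprocal rank from 1-based ranks."""
    ranks = np.asarray(ranks, dtype=float)
    return {
        "hits@k": float(np.mean(ranks <= k)),   # share of targets in the top k
        "mr": float(np.mean(ranks)),            # average rank of the target
        "mrr": float(np.mean(1.0 / ranks)),     # average reciprocal rank
    }

rating_error = rmse([1.0, 2.0], [1.0, 4.0])
metrics = ranking_metrics([1, 2, 4], k=2)
print(rating_error, metrics)
```
]]></preformat>
        <p>Lower is better for RMSE and MR, whereas higher is better for Hits@K and MRR.</p>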
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Explainable Bias Detection</title>
      <p>The general problem is as follows. Given a knowledge graph $\mathcal{G} = (\mathcal{V}, \mathcal{F}, \mathcal{A})$, suppose that one
attribute $a \in \mathcal{A}$ is private: we do not wish users of the embedded graph to be able to predict the
value of $a$ for a given vertex $v \in \mathcal{V}$. We show here that it is possible to predict the value of some
private attributes even when that attribute is not used in the embedding algorithm.</p>
      <p>We give three methods to detect private
information in the vertices $\mathcal{V}$: a logistic classifier based method (detect-LC), Canonical
Correlation Analysis (detect-CCA), and a linear decomposition method (detect-LD).</p>
      <p>
        detect-LC: Logistic Classifier. Consider a knowledge graph $\mathcal{G} = (\mathcal{V}, \mathcal{F}, \mathcal{A})$
that has been embedded in a $d$-dimensional space, $\mathbb{R}^d$, based solely on the information
from $\mathcal{V}$ and $\mathcal{F}$. Suppose a subset of vertices $\mathcal{S} \subseteq \mathcal{V}$
is labelled with the presence or absence of
attributes $a_i$. For example, in a database of users and movies, we might have a subset $\mathcal{S}$ of users,
with each node labelled as over 18 or under 18. We train a logistic classifier to predict the value
of $a_i$ for each vertex $v \in \mathcal{S}$. See Figure 1 for an illustration. In the case of our movie rating
example, private attributes could be gender, age, and occupation.
      </p>
      <p>
        detect-CCA: Canonical Correlation Analysis. Canonical Correlation Analysis (CCA) is
used to measure the correlation between two multivariate random variables [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
Just like the univariate correlation coefficient, it is estimated on the basis of two aligned samples
of observations.
      </p>
      <p>As above, we assume that a subset $\mathcal{S} \subseteq \mathcal{V}$ of vertices is labelled with the presence or absence
of attributes $a_i$. Supposing we have $n$ attributes, we can then form a Boolean-valued matrix $\mathbf{A}$ of
dimension $|\mathcal{S}| \times n$, where each row corresponds to a vertex in $\mathcal{S}$, and each column corresponds
to an attribute. In this matrix, each $v \in \mathcal{S}$ is therefore represented by a vector of ones and zeros
corresponding to the presence or absence of each attribute. Suppose we have a $|\mathcal{S}| \times d$ matrix of
vertex embeddings $\mathbf{U}$ that we have learnt via our knowledge graph embedding. We compute
the CCA between $\mathbf{A}$ and $\mathbf{U}$: we learn vectors $\mathbf{w}_A$ and $\mathbf{w}_U$ such that the correlation between
the projected $\mathbf{A}$ and the projected $\mathbf{U}$ is maximised, that is, $\rho = \max_{(\mathbf{w}_A, \mathbf{w}_U)} \operatorname{corr}(\mathbf{A}\mathbf{w}_A, \mathbf{U}\mathbf{w}_U)$. Note that
there are $k$ correlations corresponding to $k$ components.</p>
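      <p>As an illustration of detect-CCA, the sketch below computes canonical correlations between a Boolean attribute matrix and an embedding matrix with plain NumPy (orthonormal bases followed by an SVD). This is one standard way to obtain canonical correlations and is not necessarily the routine used in the paper; the synthetic data, in which attribute information is deliberately planted in the embeddings, and all names are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def orthonormal_basis(M, tol=1e-10):
    """Orthonormal basis of the column space of the column-centred matrix M."""
    M = M - M.mean(axis=0)
    Q, s, _ = np.linalg.svd(M, full_matrices=False)
    return Q[:, s > tol * s.max()]      # drop directions with no variance

def canonical_correlations(A, U):
    """Canonical correlations between A and U, largest first."""
    Qa, Qu = orthonormal_basis(A), orthonormal_basis(U)
    # Singular values of Qa^T Qu are the cosines of the principal angles.
    return np.linalg.svd(Qa.T @ Qu, compute_uv=False)

rng = np.random.default_rng(0)
n_users, n_attributes, d = 500, 3, 6
A = rng.integers(0, 2, size=(n_users, n_attributes)).astype(float)
# Synthetic embeddings that leak attribute information, plus noise.
U = A @ rng.normal(size=(n_attributes, d)) + 0.5 * rng.normal(size=(n_users, d))

rho_true = canonical_correlations(A, U)[0]
rho_permuted = [
    canonical_correlations(A[rng.permutation(n_users)], U)[0]
    for _ in range(100)
]
print(rho_true, max(rho_permuted))
```
]]></preformat>
      <p>Shuffling the rows of the attribute matrix breaks the pairing between users and attributes, so the permuted correlations give a reference level against which the true correlation can be compared.</p>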
      <p>In the example of viewers and movies, we use this method to compare two descriptions of
users, illustrated in Figure 2. One matrix is based on demographic information, represented
by Boolean vectors. The other matrix is based on their behaviour, computed from their movie
ratings only.</p>
      <p>detect-LD: Linear Decomposition. Again assuming we have a matrix of entity embeddings
$\mathbf{U}$ and a Boolean matrix $\mathbf{A}$ encoding the attribute information, we investigate the possibility
that the entity embeddings can be decomposed into a linear combination of embeddings
corresponding to attributes. Specifically, we investigate whether we can learn a matrix $\mathbf{X}$ by solving
$\mathbf{A}\mathbf{X} = \mathbf{U}$.</p>
      <p>
        We use
methods from Xu et al. [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] to test whether an entity embedding $\mathbf{u}$ can be decomposed in this way.
      </p>
      <p>In our example of viewers and movies, we define a set of users as $\mathcal{S}$ and the Boolean encoding
matrix of the attributes as $\mathbf{A}$. We aim to solve the linear system $\mathbf{A}\mathbf{X} = \mathbf{U}$, so that each user
embedding can be decomposed into three components (gender, age, occupation) as follows:
$\mathbf{u} = \sum_i a_i \mathbf{x}_i$, where $\mathbf{u}$ is a user embedding, $i$ ranges over all possible values of each private
attribute, $\mathbf{x}_i$ is an embedding corresponding to the $i$th attribute value, and $a_i \in \{0, 1\}$ indicates the
presence or absence of that attribute value. Figure 3 shows a schematic of detect-LD, with a running example of decomposing
a user embedding into gender and age (18-30 and 31-60) components.</p>
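      <p>A minimal sketch of the decomposition step, assuming the linear system is solved in the least-squares sense with the Moore-Penrose pseudo-inverse. The toy attribute matrix (gender and two age bands), the synthetic embeddings, and the variable names are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy data: columns are [female, male, age 18-30, age 31-60].
A = np.array([[1, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 1, 0],
              [0, 1, 0, 1]], dtype=float)
X_true = rng.normal(size=(4, 5))                  # one 5-d vector per attribute value
U = A @ X_true + 0.01 * rng.normal(size=(4, 5))   # embeddings = attribute sum + noise

X = np.linalg.pinv(A) @ U        # least-squares solution of AX = U
U_hat = A @ X                    # embeddings reconstructed from attribute vectors

l2_loss = np.linalg.norm(U - U_hat)
cosine = np.sum(U * U_hat, axis=1) / (
    np.linalg.norm(U, axis=1) * np.linalg.norm(U_hat, axis=1)
)
print(l2_loss, cosine)
```
]]></preformat>
      <p>The attribute matrix here is rank-deficient, because the gender columns and the age columns each sum to one, so the recovered attribute vectors are not unique even though the reconstruction is; the pseudo-inverse returns the minimum-norm solution.</p>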
      <sec id="sec-3-1">
        <title>Hypothesis Testing with Random Permutations</title>
        <p>
          We test whether our methods are able
to detect private information by comparing with randomly shuffled data. Our null hypothesis is
that the embedding of a vertex $v$ and its attributes $a$ are independent. To test whether this is the
case, we employ a non-parametric statistical test, whereby we directly estimate the $p$-value as
the probability that we could obtain a “good”<sup>1</sup> value of the test statistic under the null hypothesis.
If the probability of obtaining the observed value of the test statistic is less than 1%, we reject
the null hypothesis. Specifically, we randomly shuffle the pairing of vertices and attributes
100 times, and compute the same test statistic each time. If the test statistic of the paired data is better
than that of the randomly shuffled data across all 100 random permutations, we conclude that
the correctly paired data performs better at a 1% significance level. The test statistic for the
logistic classifier is the accuracy of predicted labels, for CCA it is the correlation $\rho$, and for the
linear system $\mathbf{A}\mathbf{X} = \mathbf{U}$ we look at the L2 norm loss of the linear system, cosine similarity and
retrieval accuracy, a metric defined in [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ].
        </p>
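        <p>The permutation procedure can be sketched generically as follows. The helper name, the toy statistic (absolute correlation between one embedding coordinate and a binary attribute), and the synthetic data are hypothetical; the p-value estimate counts the observed statistic as one draw, which is a common convention and an assumption here.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def permutation_test(statistic, U, labels, n_permutations=100,
                     higher_is_better=True, seed=0):
    """Compare a statistic on true pairings with randomly shuffled pairings."""
    rng = np.random.default_rng(seed)
    observed = statistic(U, labels)
    permuted = np.array([
        statistic(U, labels[rng.permutation(len(labels))])
        for _ in range(n_permutations)
    ])
    if higher_is_better:
        n_as_good = np.sum(permuted >= observed)
    else:
        n_as_good = np.sum(permuted <= observed)
    # Empirical p-value; the observed value is counted as one of the draws.
    p_value = (n_as_good + 1) / (n_permutations + 1)
    return observed, permuted, p_value

def abs_correlation(U, labels):
    """Toy statistic: |Pearson correlation| of coordinate 0 with the labels."""
    return abs(np.corrcoef(U[:, 0], labels)[0, 1])

rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=200)
U = rng.normal(size=(200, 4))
U[:, 0] += 2.0 * labels          # plant attribute information in one coordinate

observed, permuted, p_value = permutation_test(abs_correlation, U, labels)
print(observed, permuted.max(), p_value)
```
]]></preformat>
        <p>With 100 permutations, an observed statistic that beats every permuted value gives an estimate of 1/101, just under the 1% threshold used in the text.</p>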
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Transparent Control of Bias</title>
      <p>
        As well as detecting bias in a knowledge graph, we also wish to remove it. In the following, we
give two methods for removing bias based on linear transformations, remove-LP and
its extension remove-LP-multi, and two methods (remove-FM and remove-FM-multi)
based on penalising differences between the distributions of items with different protected
characteristics. The difference in distribution is approximated by comparing only the first
moments, though the method can be extended to further moments.
      </p>
      <p>
        remove-LP: Linear Transformation. We build on previous work on debiasing word embeddings
[
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] to eliminate sensitive attributes from user embeddings. Suppose we have a knowledge
graph $\mathcal{G} = (\mathcal{V}, \mathcal{F}, \mathcal{A})$ and an embedding of that graph, and we are interested in the attributes of
a subset $\mathcal{S} \subseteq \mathcal{V}$ of the vertices. Suppose we have a private attribute $a$ that we wish to protect,
which takes two values $p$ and $q$. Let $U_p$ represent the set of entity embeddings with attribute value
$p$, and $U_q$ the set of entity embeddings with attribute value $q$. If we identify a hyperplane that best
separates these two sets of points, then $\hat{\mathbf{b}}$ can be defined as the normalised vector orthogonal to
the hyperplane pointing from $p$ to $q$. Given a vertex embedding $\mathbf{u}$, we can project $\mathbf{u}$ onto the
hyperplane orthogonal to $\hat{\mathbf{b}}$ via $\mathbf{u}_{\perp} = \mathbf{u} - (\mathbf{u} \cdot \hat{\mathbf{b}}) \, \hat{\mathbf{b}}$.
      </p>
      <p>
        remove-LP-multi: Linear Shifting for 2 or more classes. For more than two classes, we
propose a new linear-shifting method. Again suppose we have a graph $\mathcal{G} = (\mathcal{V}, \mathcal{F}, \mathcal{A})$, an embedding
of this graph, and a subset $\mathcal{S} \subseteq \mathcal{V}$ of vertices with particular attributes. Suppose that we are able to
detect information about an attribute $a$ that we wish to keep private, and $a$ can take one of $k$
values. We take the set of entity embeddings $U = \{\mathbf{u}\}$, each having a value $a_i$ of the attribute $a$.
For instance, in our running example of movie rating, $\mathcal{S}$ may be users, $a$ may be occupation,
and the values $a_i$ could be {academic, artist, clerical, college student, ...}. Now, the set $U$ of
embeddings is partitioned into $k$ sets $\{U_1, ..., U_k\}$ according to the value of $a$. We calculate the
centroid $\mathbf{c}_i$ of each of the sets $U_i$, and then calculate the centroid $\mathbf{c}$ of all the centroids $\mathbf{c}_i$. Finally,
we shift each vector $\mathbf{u} \in U_i$ by $\mathbf{c} - \mathbf{c}_i$, so that all the sets $U_i$ have the same centroid.
        <fn><label>1</label><p>Either high or low, depending on the statistic.</p></fn>
      </p>
      <p>remove-FM: First-Moment Loss. Ideally, two sets of items that differ only along protected
dimensions should have the same distribution in the embedding space. Therefore we aim to
reduce the difference between distributions of items which differ in that way by penalising the
distance between the first moments of their empirical distributions (i.e. between their empirical
means) during the training process.</p>
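      <p>The two projection-based methods described above, remove-LP and remove-LP-multi, can be sketched in a few lines of NumPy. The function names and synthetic data are hypothetical, and for the binary case the bias direction is taken to be the normalised difference of the two class means, which is a simplifying assumption standing in for the separating-hyperplane normal.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def remove_lp(U, b):
    """Project each row of U onto the hyperplane orthogonal to direction b."""
    b_hat = b / np.linalg.norm(b)
    return U - np.outer(U @ b_hat, b_hat)

def remove_lp_multi(U, labels):
    """Shift every class so that all class centroids coincide."""
    classes = np.unique(labels)
    centroids = np.stack([U[labels == c].mean(axis=0) for c in classes])
    target = centroids.mean(axis=0)           # centroid of the class centroids
    shifted = U.copy()
    for c, centroid in zip(classes, centroids):
        shifted[labels == c] += target - centroid
    return shifted

rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 50)
offsets = np.array([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]])
U = rng.normal(size=(150, 3)) + offsets[labels]   # class-dependent embeddings

# Assumption: bias direction = difference between the means of classes 1 and 0.
b = U[labels == 1].mean(axis=0) - U[labels == 0].mean(axis=0)
U_projected = remove_lp(U, b)
U_shifted = remove_lp_multi(U, labels)
```
]]></preformat>
      <p>Both operations act on already trained embeddings, so no retraining is needed; the projection removes one direction entirely, whereas the shift only aligns class means and leaves the within-class structure untouched.</p>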
      <p>
        Again suppose we have a graph $\mathcal{G} = (\mathcal{V}, \mathcal{F}, \mathcal{A})$, a subset of vertices of interest $\mathcal{S}$, an embedding
of this graph, and that we are able to detect information about an attribute $a$ that we wish
to keep private. Consider two subsets of vertex embeddings $U_p = \{\mathbf{u}_p\}$ and $U_q = \{\mathbf{u}_q\}$ with
private attribute $a$ taking values $p$ and $q$. For the first-moment loss we consider each of $U_p$ and
$U_q$ as multivariate distributions of the two classes $p$ and $q$. We aim to remove the difference in
the first moments of the two classes by aligning the mean coordinates of the two classes, as
illustrated in Figure 4. We follow Zemel et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and implement this
method on our knowledge graph. In order to do this we minimize the following loss function:
        <disp-formula id="eq-4"><label>(4)</label><tex-math>L = \mathrm{Loss} + \lambda \left\lVert \frac{y_p}{N_p} \sum_{\mathbf{u}_p \in U_p} \mathbf{u}_p + \frac{y_q}{N_q} \sum_{\mathbf{u}_q \in U_q} \mathbf{u}_q \right\rVert</tex-math></disp-formula>
where $y \in \{-1, 1\}$, with $y_p = 1$ and $y_q = -1$; $N_p$ and $N_q$ stand for the counts of vertices with the attribute
taking values $p$ and $q$; $\lambda$ is a weight parameter that gives more weight to the regularization; and Loss
refers to Equation 2 or 3, for predicting relations or entities respectively. In the example of the knowledge
graph of viewers and movies, we aim to neutralize gender information in the first moment of
the two multivariate distributions by equalizing the mean coordinates across genders.
      </p>
      <p>
        remove-FM-multi: First-Moment Loss for 2 or more classes. For more than two classes,
we adapt the “mean-match” [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] process. We impose a constraint, limiting the norm of the
embeddings to 1. Our rationale is that this constraint might preserve
or even boost the predictive capability of the model. We aim to test this conjecture and, if
corroborated, demonstrate its implications for the robustness of the predictive model. We
propose a new loss function, given in Equation 5:
      </p>
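      <p>
        <disp-formula id="eq-5"><label>(5)</label><tex-math>L = \lambda \cdot \mathrm{Loss} + \sum_{i} \frac{1}{N_i} \left( \lVert \mathbf{u}_i - \mathbf{c}_i \rVert_2 - 1 \right)^2 + \sum_{i,j} \frac{1}{N_{i,j}} (\mathbf{c}_i - \mathbf{c}_j)^2</tex-math></disp-formula>
where $i, j \in \{\text{Class 1}, \text{Class 2}, \text{Class 3}, \ldots\}$, $N_i$ is the count of vertices of class $i$, $N_{i,j}$ is the count of
combinations of different classes, $\mathbf{c}_i$ is the center point of class $i$, and $\lambda$ is the weight parameter that
gives more weight to the Loss (the original loss in Equation 2 or 3) for predicting relations/entities.
      </p>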
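      <p>For the two-class case, the first-moment regulariser of Equation 4 reduces to the norm of the difference between the two class means. The sketch below only evaluates that penalty on fixed arrays; in training it would be added to the embedding loss inside an automatic-differentiation framework. The function names and toy values are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def first_moment_penalty(U_p, U_q):
    """Norm of the difference between the empirical means of the two classes."""
    return float(np.linalg.norm(U_p.mean(axis=0) - U_q.mean(axis=0)))

def regularised_loss(base_loss, U_p, U_q, lam):
    """Base embedding loss plus the weighted first-moment penalty."""
    return base_loss + lam * first_moment_penalty(U_p, U_q)

U_p = np.array([[1.0, 0.0], [3.0, 0.0]])   # class p embeddings, mean (2, 0)
U_q = np.array([[0.0, 0.0], [0.0, 0.0]])   # class q embeddings, mean (0, 0)

penalty = first_moment_penalty(U_p, U_q)
total = regularised_loss(1.0, U_p, U_q, lam=50)
print(penalty, total)
```
]]></preformat>
      <p>The penalty is zero exactly when the two class means coincide, so a larger weight pushes the optimiser harder towards matching means at the expense of the prediction loss.</p>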
    </sec>
    <sec id="sec-5">
      <title>5. Experimental Study</title>
      <sec id="sec-5-1">
        <title>5.1. Datasets and Training Details</title>
        <sec id="sec-5-1-1">
          <title>MovieLens</title>
          <p>
            This experiment was conducted on the MovieLens 1M dataset [
            <xref ref-type="bibr" rid="ref26">26</xref>
            ], which consists
of a large set of movies and users, and a set of movie ratings for each individual user. It is widely
used to create and test recommender systems.
          </p>
          <p>
            Users and movies each have attributes. For example, users have demographic information
such as gender, age, or occupation. Whilst this information is typically used to improve the
accuracy of recommendations, we use it to test whether the embedding of a user correlates
to private attributes, such as gender or age. Crucially, we compute our graph embedding
based only on ratings, leaving out user attributes. Experiments for training knowledge
graph embeddings are implemented with the OpenKE [
            <xref ref-type="bibr" rid="ref21">21</xref>
            ] toolkit.
          </p>
          <p>We use a 90:10 train:test split to train user and movie embeddings with triples
(user, rating, movie). Initial embeddings are randomly assigned and optimized to minimize
the loss in Equation 2. We sample 10 corrupted entities and 4 corrupted relations for each true
relation. The learning rate is 0.01 and the number of epochs is 300. Link prediction on the 10% test set
yields an RMSE score of 0.88 (Table 2).</p>
          <p>
            Citation Network. The KG20C knowledge graph is constructed from 20 top AI conferences
[
            <xref ref-type="bibr" rid="ref27">27</xref>
            ]. The entities consist of 5407 papers, 8060 authors, 692 affiliations, 1920 domains, and 20
conferences. We train entity embeddings for papers, authors, affiliations, and domains, and
relation embeddings of the following four types: “Author in affiliation”, “Author write paper”,
“Paper cite paper” and “Paper in domain”.
          </p>
          <p>We view the venue, i.e. conference, as information that we wish to detect and potentially
remove. As an application, suppose we are building a search engine and wish to avoid the “echo
chamber” of groups of authors citing each other’s work. Detecting conference information
could help to return a more diverse set of search results.</p>
          <p>Embeddings for entities and relation types were randomly initialized and trained to minimize
the loss in Equation 3, using a 0.05 learning rate over 300 epochs and a margin, $\gamma$, set to
6. Link prediction task metrics (Hits@10: 0.29, Hits@1: 0.075, MRR: 0.15, and MR: 1369.93)
attest to our embeddings’ quality. The bias-removal strength, $\lambda$, was set in turn to 50, 100,
and 150, increasing the weight of the loss function’s bias-removal component. With 1/3 of triples sharing the
same head and relation but different tails, individual triples were tested to maintain accuracy.
To ensure the reproducibility of our results, we have made the code available at
https://github.com/ZhijinGuo/EXTRACT.</p>
        </sec>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. MovieLens Bias Detection</title>
        <p>We now apply our three methods for bias detection to investigate the extent to which private
information can be detected in user embeddings trained without that information.</p>
        <p>detect-LC on MovieLens. We use a logistic classifier to predict gender, age, and occupation
from the embeddings. We sub-sample the data to have balanced datasets with respect to gender,
age, and occupation.</p>
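        <p>A minimal sketch of this detect-LC setup: balanced sub-sampling, an 80:20 split, and a logistic classifier on the embeddings. The paper's classifiers are implemented with scikit-learn; to stay dependency-light, this sketch instead fits logistic regression by plain gradient descent in NumPy, and the synthetic embeddings, helper names, and hyperparameters are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def balanced_indices(labels, rng):
    """Sub-sample indices so that both classes are equally represented."""
    pos, neg = np.flatnonzero(labels == 1), np.flatnonzero(labels == 0)
    n = min(len(pos), len(neg))
    idx = np.concatenate([rng.choice(pos, n, replace=False),
                          rng.choice(neg, n, replace=False)])
    return rng.permutation(idx)

def fit_logistic(X, y, lr=0.5, epochs=1000):
    """Gradient descent on the mean logistic loss; returns weights and bias."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def accuracy(w, b, X, y):
    """Share of examples whose predicted label matches the true label."""
    return float(np.mean(((X @ w + b) > 0).astype(int) == y))

rng = np.random.default_rng(0)
n, d = 600, 10
labels = (rng.random(n) < 0.3).astype(int)      # imbalanced toy attribute
shift = np.zeros(d)
shift[:4] = 1.5                                  # attribute leaks into 4 dims
X = rng.normal(size=(n, d)) + np.outer(labels, shift)

idx = balanced_indices(labels, rng)
X_bal, y_bal = X[idx], labels[idx]
split = int(0.8 * len(idx))                      # 80:20 train:test split
w, b = fit_logistic(X_bal[:split], y_bal[:split])
test_accuracy = accuracy(w, b, X_bal[split:], y_bal[split:])
print(test_accuracy)
```
]]></preformat>
        <p>Balancing the classes first means that a classifier with no access to attribute information would score about 0.50, which is the baseline the reported accuracies are compared against.</p>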
        <p>
          Using an 80:20 training-to-test split across all datasets, we trained logistic classifiers for each
attribute, implemented via the scikit-learn toolkit [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]. The results show that the classifiers
can predict private traits, with 73% accuracy in predicting gender, approximately 72%
and 68% in predicting younger and older users respectively, and 59% accuracy for occupation
prediction. These binary classification results outperform the 0.50 baseline.
        </p>
        <p>detect-CCA on MovieLens. We collect attribute information for all 6040 users and represent
their personal attributes with Boolean indicator vectors $\mathbf{a}$, which encode the value of each
attribute (gender, age, and occupation). See Figure 2 for a schematic of these vectors.</p>
        <p>We apply CCA to calculate the correlation between users and their attributes. We apply
the non-parametric statistical test described in Section 3. Specifically, our null hypothesis is
that users’ movie preferences are not correlated with their attributes. We calculate Pearson’s
correlation coefficient (PCC) between the projected $\mathbf{A}\mathbf{w}_A$ and the projected $\mathbf{U}\mathbf{w}_U$. We go on to calculate
the PCC for 100 randomly generated pairings of user and attribute embeddings, and find
that the PCC between true pairs of attribute and user embeddings is higher each time. We
therefore reject the null hypothesis at a 1% significance level. The correlation coefficients
for real pairs and random pairs are reported in Figure 5a.</p>
        <p>
          Figure 5: (a) Pearson’s correlation coefficient (PCC) for true
user-attribute pairings and 100 permuted pairings. PCC is calculated between projected $\mathbf{A}$ and
projected $\mathbf{U}$. The x-axis gives the index $k$ of the component and the
y-axis gives the PCC value. The PCC value for real
pairings is larger than for any permuted pairings.
(b) t-SNE visualisation [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] of paper embeddings,
colour-coded by conference. Notice that even
though we do not use conference information in
learning these embeddings, information about
the conferences is clearly present in the
embeddings.
        </p>
        <p>detect-LD on MovieLens. We investigate whether a user embedding can be reconstructed
as a linear sum of attribute embeddings. We find that a user embedding can be reconstructed as
a linear combination of its attributes by solving the linear system described in Section 3 with the
pseudo-inverse method. In order to interpret the user embeddings in terms of user attributes such
as gender, age and occupation, we first group the users by age and gender and compute
the mean embedding of each of the 14 groups of users. We then group the users by gender, age and
occupation and compute the mean embedding of each of the 241 groups of users. Two test statistics are used
to test our linear system with significance threshold $\alpha = 0.01$.</p>
        <p>As with the CCA setting, we permuted the pairing of users 100 times. Table 1 shows the
observed p-value for three different statistics, which is the probability of seeing that value of
the statistic under the null hypothesis. We first decompose the user embedding into gender and
age. Our results show the linear system is able to decompose the user embedding with a loss of
0.47, which is lower than every loss for a random permutation (1.11-2.11). The cosine similarity
is 99.8%, higher than for any permuted pairs. The identity retrieval accuracy is 0.79, which is higher
than for any randomly permuted pairs (0.0-0.14). Therefore, the null hypothesis is rejected. This
shows that a user embedding can be reconstructed as a linear combination of gender and age.</p>
        <p>When decomposing the embedding into gender, age and occupation, the L2 norm loss is 17.87,
which is lower than every loss for a random permutation (18.90-19.56). The cosine similarity is
97.1%, higher than for any permuted pairs. The identity retrieval accuracy is
only 0.23, which is low in absolute terms but still higher than for any randomly permuted pairs (0.00-0.08).
Therefore, the null hypothesis is rejected.</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Citation Network Bias Detection</title>
        <p>We consider a dataset of scientific papers, whose entities are the papers, their authors, the
conference in which those papers were presented, and the scientific domain they belong to. For
the sake of example, we treat as sensitive the information about the specific conference in
which a given paper was presented (we could pretend that this information should not be used
as part of career assessment, and should therefore be hidden).</p>
        <p>We embed scientific papers using their relations with the sets of authors and domains, and
then test whether that embedding contains enough information for a logistic classifier to
learn to predict the specific conference. Note that this is a very strict requirement: the
embedding of the paper must contain no information at all that is relevant to this task.
As in the movie recommender system, we treat the conference information in this citation
network as a sensitive attribute.</p>
        <p>Using 5407 papers with their author, citation, domain, and author affiliation relations,
we generate an embedding for each paper and label it with one of the 20 conferences; the labelled
embeddings are then split into training and test sets. Our logistic classifier (detect-LC) yields
55% accuracy in conference venue prediction, surpassing the 13% baseline (5b). Despite not being
trained on conference data, the embeddings can still distinguish different conferences.</p>
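        <p>A probe of this kind can be sketched as follows. The example trains a plain multinomial logistic regression by gradient descent on synthetic embeddings and compares it with a majority-class baseline. The data, the four-class setup, the hyperparameters and the choice of a majority-class baseline are all assumptions for illustration; the paper's own classifier and baseline may be defined differently.</p>

```python
import numpy as np

def majority_baseline(train_labels, test_labels):
    """Accuracy obtained by always predicting the most frequent training class."""
    most_common = np.bincount(train_labels).argmax()
    return float(np.mean(test_labels == most_common))

def fit_softmax(x, y, n_classes, learning_rate=0.2, steps=500):
    """Multinomial logistic regression trained by plain gradient descent."""
    xb = np.hstack([x, np.ones((len(x), 1))])      # append a bias column
    weights = np.zeros((xb.shape[1], n_classes))
    one_hot = np.eye(n_classes)[y]
    for _ in range(steps):
        logits = xb @ weights
        logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs = probs / probs.sum(axis=1, keepdims=True)
        weights = weights - learning_rate * xb.T @ (probs - one_hot) / len(xb)
    return weights

def predict(weights, x):
    """Predicted class index for each row of x."""
    xb = np.hstack([x, np.ones((len(x), 1))])
    return (xb @ weights).argmax(axis=1)

# Toy embeddings (assumed): 4 classes whose means lie on separate axes.
rng = np.random.default_rng(0)
n_classes, dim, n = 4, 5, 400
labels = rng.integers(0, n_classes, size=n)
embeddings = 4.0 * np.eye(n_classes, dim)[labels] + rng.normal(size=(n, dim))

# Standardise with training statistics, then split into train and test.
train, test = slice(0, 300), slice(300, n)
mean, std = embeddings[train].mean(axis=0), embeddings[train].std(axis=0)
scaled = (embeddings - mean) / std

weights = fit_softmax(scaled[train], labels[train], n_classes)
accuracy = float(np.mean(predict(weights, scaled[test]) == labels[test]))
baseline = majority_baseline(labels[train], labels[test])
print(f"classifier accuracy {accuracy:.2f} vs majority baseline {baseline:.2f}")
```

        <p>If the probe scores well above the baseline on held-out embeddings, the attribute is linearly recoverable from the representation, which is the sense of leakage examined in this section.</p>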
      </sec>
      <sec id="sec-5-4">
        <title>5.4. Transparent Control of Bias</title>
        <p>We apply the techniques introduced in section 4 to remove information about age and gender
from MovieLens-1M, and information about conference from the KG20C citation network.
[Figure 6: (a) Gender - rating trade-off; (b) Age - rating trade-off.]
remove-LP for MovieLens We investigate the ability to remove the gender and age bias
detected in the knowledge graph user embedding whilst maintaining the ability of the embedding
to predict behaviour. We apply remove-LP (section 4) to remove gender and age information
from the data splits described in section 5.2. We run the bias removal process for 10 iterations.
Results are reported in table 2.</p>
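        <p>An iterative linear-projection removal of this general kind can be sketched as below. The precise definition of remove-LP is given in section 4 and is not reproduced here, so this is a stand-in under stated assumptions: at each iteration a least-squares direction that predicts the binary attribute is estimated and projected out of the embeddings. The function names and the toy data are hypothetical.</p>

```python
import numpy as np

def bias_direction(x, labels):
    """Unit least-squares direction that best predicts the binary label."""
    centred_x = x - x.mean(axis=0)
    centred_y = labels - labels.mean()
    w = np.linalg.pinv(centred_x, rcond=1e-10) @ centred_y
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w

def remove_by_projection(x, labels, n_iterations=3):
    """Repeatedly project the embeddings onto the complement of the bias direction."""
    x = np.array(x, dtype=float)
    for _ in range(n_iterations):
        w = bias_direction(x, labels)
        x = x - np.outer(x @ w, w)      # remove the component along w
    return x

def class_gap(x, labels):
    """Distance between the mean embeddings of the two classes."""
    gap = x[labels == 1].mean(axis=0) - x[labels == 0].mean(axis=0)
    return float(np.linalg.norm(gap))

# Toy embeddings (assumed): the first coordinate leaks a binary attribute.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=1000)
embeddings = rng.normal(size=(1000, 16))
embeddings[:, 0] += 2.0 * labels

debiased = remove_by_projection(embeddings, labels)
print(f"class gap before {class_gap(embeddings, labels):.3f}, "
      f"after {class_gap(debiased, labels):.3f}")
```

        <p>Each iteration discards one direction of the embedding space, so the number of iterations acts as a dial between removing more attribute information and keeping more of the information used for prediction.</p>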
        <p>As depicted in figures 6a and 6b, there is a trade-off between predictive power and bias. As
the number of iterations increases, gender classification accuracy drops from 73% to 54%, close to
baseline accuracy, while RMSE rises slightly from 0.88 to 0.93. A similar pattern is observed
in age bias removal, with accuracy for under/over 50s decreasing from 0.72 to 0.62 and RMSE
rising from 0.88 to 1.01.
remove-FM for MovieLens We use remove-FM (section 4) to remove bias during the training
of the embeddings. We again use the embeddings and splits described in section 5.2, and investigate
how well predictive power is retained as bias information is removed. Results are reported
in table 2. As we increase the weighting parameter of remove-FM, the accuracy of the gender
classifier decreases, although not monotonically. The RMSE increases slightly from 0.88 to 0.96.
remove-LP-multi for KG20C For the KG20C dataset, we employ linear shifting generalised
to two or more classes to eliminate bias after training, adjusting the paper
embeddings so that the sets of papers labelled with different conferences converge at a common
center. Results, detailed in Table 3, indicate that the model, while becoming conference neutral,
experiences a minor decrease in predictive power (Hits@10 falls from 0.29 to 0.21).
remove-FM for KG20C We apply the multi-class version of remove-FM (section 4), in which
the embeddings are debiased during training. A weighting parameter controls the strength
of the prediction term in the loss function. We set this parameter to 1 and to 20, the larger
value increasing the influence of the prediction part of the loss. As shown in Table 3, the
larger value slightly improves Hits@10 but decreases Hits@1 and MR. Additionally, more conference
information is retained, with classification accuracy increasing from
0.14 to 0.26, though Hits@10 drops 0.14, signifying a trade-off between prediction power and
unwanted information removal.</p>
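        <p>The multi-class shifting idea behind remove-LP-multi, moving the sets of papers from different conferences to a common center, can be sketched as follows. The choice of the global mean as the common center, the function name and the toy data are assumptions; the exact rule is the one defined in section 4.</p>

```python
import numpy as np

def shift_to_common_center(embeddings, labels):
    """Translate each class so that its mean coincides with the global mean."""
    shifted = np.array(embeddings, dtype=float)
    center = shifted.mean(axis=0)           # computed before any class is moved
    for label in np.unique(labels):
        mask = labels == label
        shifted[mask] += center - shifted[mask].mean(axis=0)
    return shifted

# Toy embeddings (assumed): three classes with different offsets.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=300)
offsets = np.array([[3.0, 0.0], [0.0, 3.0], [-3.0, -3.0]])
embeddings = offsets[labels] + rng.normal(size=(300, 2))

shifted = shift_to_common_center(embeddings, labels)
center = embeddings.mean(axis=0)
for label in range(3):
    gap = np.linalg.norm(shifted[labels == label].mean(axis=0) - center)
    print(f"class {label}: distance of class mean from center = {gap:.2e}")
```

        <p>A pure translation of this kind leaves the geometry within each class untouched and removes only the differences between class means, so any attribute information carried by the shape of the classes would remain.</p>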
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Discussion and Conclusions</title>
      <p>In this work, we introduced EXTRACT, a comprehensive framework that combines three
methods for detecting and four methods for removing biases in knowledge graph embeddings.
Using the MovieLens-1M and KG20C datasets, EXTRACT efficiently identified leaks of sensitive
attributes and unintended conference affiliations. This two-step process ensured that specific
dimensions of user embeddings, correlating to private or sensitive attributes, were identified and
appropriately treated. Our findings showed that the correlations between the private attributes
and the user representations were significantly higher than random, validating the efficacy of
EXTRACT in capturing statistical patterns that reflect private attribute information.</p>
      <p>Furthermore, our linear decomposition system, an integral part of the EXTRACT framework,
revealed that a user-behaviour embedding can be approximated by a sum of user-demographic
vectors. This validated the framework’s ability to decompose user embeddings into a weighted
sum of attribute embeddings, thereby allowing for the targeted removal or control of bias.</p>
      <p>Our four approaches to debiasing were more transparent and explainable than more complex
methods. The remove-LP method was the most successful at debiasing, whilst retaining relatively
good performance on the link prediction task. Overall, EXTRACT served as a unified, two-step
approach that advanced the field towards ethical and transparent knowledge graph embeddings.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Logan</surname>
            <suffix>IV</suffix>
          </string-name>
          ,
          <string-name>
            <given-names>N. F.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. E.</given-names>
            <surname>Peters</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gardner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <article-title>Barack's wife hillary: Using knowledge-graphs for fact-aware language modeling</article-title>
          ,
          <source>arXiv preprint arXiv:1906.07241</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <article-title>Cfo: Conditional focused neural question answering with large-scale knowledge bases</article-title>
          ,
          <source>arXiv preprint arXiv:1606.01994</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Bauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bansal</surname>
          </string-name>
          ,
          <article-title>Commonsense for generative multi-hop question answering tasks</article-title>
          ,
          <source>arXiv preprint arXiv:1809.06309</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Cambria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marttinen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <article-title>A survey on knowledge graphs: Representation, acquisition, and applications</article-title>
          ,
          <source>IEEE Transactions on Neural Networks and Learning Systems</source>
          <volume>33</volume>
          (
          <year>2021</year>
          )
          <fpage>494</fpage>
          -
          <lpage>514</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kosinski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Stillwell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Graepel</surname>
          </string-name>
          ,
          <article-title>Private traits and attributes are predictable from digital records of human behavior</article-title>
          ,
          <source>Proceedings of the National Academy of Sciences</source>
          <volume>110</volume>
          (
          <year>2013</year>
          )
          <fpage>5802</fpage>
          -
          <lpage>5805</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Caliskan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Bryson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Narayanan</surname>
          </string-name>
          ,
          <article-title>Semantics derived automatically from language corpora contain human-like biases</article-title>
          ,
          <source>Science</source>
          <volume>356</volume>
          (
          <year>2017</year>
          )
          <fpage>183</fpage>
          -
          <lpage>186</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pennington</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>GloVe: Global vectors for word representation</article-title>
          ,
          <source>in: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP)</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Corrado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>26</volume>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Jonauskaite</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sutton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Cristianini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Mohr</surname>
          </string-name>
          ,
          <article-title>English colour terms carry gender and valence biases: A corpus study using word embeddings</article-title>
          ,
          <source>PLoS ONE</source>
          <volume>16</volume>
          (
          <year>2021</year>
          )
          <elocation-id>e0251559</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T.</given-names>
            <surname>Bolukbasi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Y.</given-names>
            <surname>Zou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Saligrama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Kalai</surname>
          </string-name>
          ,
          <article-title>Man is to computer programmer as woman is to homemaker? debiasing word embeddings</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>29</volume>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D.</given-names>
            <surname>Madras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Creager</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Pitassi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zemel</surname>
          </string-name>
          ,
          <article-title>Learning adversarially fair and transferable representations</article-title>
          ,
          <source>in: International Conference on Machine Learning, PMLR</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>3384</fpage>
          -
          <lpage>3393</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <article-title>Bias and debias in recommender system: A survey and future directions</article-title>
          ,
          <source>ACM Transactions on Information Systems</source>
          <volume>41</volume>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>39</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Zemel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Swersky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Pitassi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Dwork</surname>
          </string-name>
          ,
          <article-title>Learning fair representations</article-title>
          ,
          <source>in: International conference on machine learning, PMLR</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>325</fpage>
          -
          <lpage>333</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>H.</given-names>
            <surname>Edwards</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Storkey</surname>
          </string-name>
          ,
          <article-title>Censoring representations with an adversary</article-title>
          ,
          <source>arXiv preprint arXiv:1511.05897</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Caverlee</surname>
          </string-name>
          ,
          <article-title>Measuring and mitigating item under-recommendation bias in personalized ranking systems</article-title>
          ,
          <source>in: Proceedings of the 43rd international ACM SIGIR conference on research and development in information retrieval</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>449</fpage>
          -
          <lpage>458</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Hamilton</surname>
          </string-name>
          ,
          <article-title>Compositional fairness constraints for graph embeddings</article-title>
          ,
          <source>in: International Conference on Machine Learning, PMLR</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>715</fpage>
          -
          <lpage>724</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>L.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Learning fair representations for recommendation: A graph-based perspective</article-title>
          ,
          <source>in: Proceedings of the Web Conference 2021</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>2198</fpage>
          -
          <lpage>2208</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J.</given-names>
            <surname>Fisher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mittal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Palfrey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Christodoulopoulos</surname>
          </string-name>
          ,
          <article-title>Debiasing knowledge graph embeddings</article-title>
          ,
          <source>in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          ,
          <publisher-name>Association for Computational Linguistics</publisher-name>
          , Online,
          <year>2020</year>
          , pp.
          <fpage>7332</fpage>
          -
          <lpage>7345</lpage>
          . URL: https://aclanthology.org/2020.emnlp-main.595. doi:10.18653/v1/2020.emnlp-main.595.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>R.</given-names>
            <surname>van den Berg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. N.</given-names>
            <surname>Kipf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Welling</surname>
          </string-name>
          ,
          <article-title>Graph convolutional matrix completion</article-title>
          ,
          <source>arXiv preprint arXiv:1706.02263</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>B.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-t.</given-names>
            <surname>Yih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <article-title>Embedding entities and relations for learning and inference in knowledge bases</article-title>
          ,
          <source>arXiv preprint arXiv:1412.6575</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>X.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Lv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>OpenKE: An open toolkit for knowledge embedding</article-title>
          ,
          <source>in: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Association for Computational Linguistics</source>
          , Brussels, Belgium,
          <year>2018</year>
          , pp.
          <fpage>139</fpage>
          -
          <lpage>144</lpage>
          . URL: https://aclanthology.org/D18-2024. doi:10.18653/v1/D18-2024
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>J.</given-names>
            <surname>Shawe-Taylor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Cristianini</surname>
          </string-name>
          , et al.,
          <article-title>Kernel methods for pattern analysis</article-title>
          , Cambridge university press,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Cristianini</surname>
          </string-name>
          ,
          <article-title>On compositionality in data embedding</article-title>
          ,
          <source>in: Advances in Intelligent Data Analysis XXI: 21st International Symposium, IDA 2023</source>
          , Springer,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>A.</given-names>
            <surname>Sutton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lansdall-Welfare</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Cristianini</surname>
          </string-name>
          ,
          <article-title>Biased embeddings from wild data: Measuring, understanding and removing</article-title>
          ,
          <source>in: International Symposium on Intelligent Data Analysis</source>
          , Springer,
          <year>2018</year>
          , pp.
          <fpage>328</fpage>
          -
          <lpage>339</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kamishima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Akaho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Asoh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sakuma</surname>
          </string-name>
          ,
          <article-title>Enhancement of the neutrality in recommendation</article-title>
          ,
          <source>in: Decisions@RecSys</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>8</fpage>
          -
          <lpage>14</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Harper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Konstan</surname>
          </string-name>
          ,
          <article-title>The MovieLens datasets: History and context</article-title>
          ,
          <source>ACM Transactions on Interactive Intelligent Systems (TiiS)</source>
          <volume>5</volume>
          (
          <year>2015</year>
          )
          <fpage>1</fpage>
          -
          <lpage>19</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>H. N.</given-names>
            <surname>Tran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Takasu</surname>
          </string-name>
          ,
          <article-title>Exploring scholarly data by semantic query on knowledge graph embedding space</article-title>
          ,
          <source>in: Digital Libraries for Open Knowledge: 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019, Oslo, Norway, September 9-12, 2019, Proceedings 23</source>
          , Springer,
          <year>2019</year>
          , pp.
          <fpage>154</fpage>
          -
          <lpage>162</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pedregosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varoquaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gramfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thirion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Grisel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dubourg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanderplas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cournapeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Duchesnay</surname>
          </string-name>
          ,
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          (
          <year>2011</year>
          )
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>L.</given-names>
            <surname>Van der Maaten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Hinton</surname>
          </string-name>
          ,
          <article-title>Visualizing data using t-SNE</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>9</volume>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>