<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Similarity Measure Learning for Analogical Transfer</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Chunyang Fan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Sorbonne Université</institution>
          ,
          <addr-line>CNRS, LIP6, F-75005 Paris</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Université Sorbonne Paris Nord, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé - LIMICS, INSERM, UMR 1142</institution>
          ,
          <addr-line>F-93000, Bobigny</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Analogical transfer infers unknown information by leveraging known similarities between situations. This doctoral research investigates similarity measure learning tailored for analogical transfer, focusing on classification. A unified theoretical framework, called the Similarity Measure Learning Architecture (SiMeLAr), is introduced to systematically optimize similarity measures for analogical tasks. Besides this generic question, we focus on the specific case of the method called CoAT (Complexity-based Analogical Transfer). To address its discontinuity, we propose a continuous variant that enables gradient-based optimization. Future work will explore the application of this variant to real-world domains, such as culinary and medical use cases.</p>
      </abstract>
      <kwd-group>
        <kwd>Analogical Transfer</kwd>
        <kwd>Similarity Measure Learning</kwd>
        <kwd>Metric Learning</kwd>
        <kwd>Case-Based Reasoning</kwd>
        <kwd>Machine Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Analogical transfer infers information from known similar cases by assuming that their similarity in
certain components implies similarity in others [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Case-based prediction (CBP) methods (see e.g. [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]
for surveys) apply this general principle to solve supervised machine learning tasks such as classification or
regression: they consider a data instance as a case with two components, respectively called situation
and outcome, that correspond to the instance features and its label. They then apply the general
analogical transfer principle that, in this context, is expressed as follows: if two situations are similar,
their outcomes should also be similar. Therefore, they predict the outcome of a new situation by
leveraging its similarities with situations from the case base.
      </p>
      <p>
        Similarity measures thus play a central role in analogical transfer, which can be seen as
essentially transferring similarity knowledge [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This role raises questions related to the topic of metric
learning [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ].
      </p>
      <p>
        This thesis aims at studying the interactions between analogical transfer and metric learning and at
developing a methodology to learn optimized similarity measures for case-based prediction. Validation will
occur in two domains: culinary recipe transfer [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and breast cancer management decision-making [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <sec id="sec-1-1">
        <title>1.1. State of the Art</title>
        <p>This research lies at the intersection of similarity measure learning and analogical transfer, necessitating
an overview of both fields and their interactions.</p>
        <p>
          Similarity Measure Learning Similarity measures quantify how alike two objects are, often derived
from distances transformed by decreasing functions [
          <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
          ]. Metric learning optimizes these measures
using data-driven constraints [
          <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
          ]. Numerous approaches have been proposed; they include, for instance,
linear [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], nonlinear [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], local metrics [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], and histogram-based methods [
          <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
          ].
        </p>
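        <p>As a concrete instance of deriving a similarity measure from a distance transformed by a decreasing function, as mentioned above, the following Python sketch builds a Mahalanobis-style similarity; the matrix M, the exponential transform, and all names are illustrative assumptions, not taken from the paper:</p>

```python
import math

def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a PSD matrix M,
    using plain Python lists."""
    d = [xi - yi for xi, yi in zip(x, y)]
    Md = [sum(M[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    return sum(di * mdi for di, mdi in zip(d, Md))

def similarity(x, y, M):
    """Similarity obtained from the distance via a decreasing transform."""
    return math.exp(-mahalanobis_sq(x, y, M))

# Assumed metric matrix: weights the first feature more heavily.
M = [[2.0, 0.0], [0.0, 0.5]]
print(similarity([0.0, 0.0], [1.0, 0.0], M))  # exp(-2): less similar
print(similarity([0.0, 0.0], [0.0, 1.0], M))  # exp(-0.5): more similar
```

        <p>Linear metric learning then amounts to optimizing the entries of M under data-driven constraints.</p>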
        <p>
          More recently, deep metric learning (DML) has employed neural networks projecting data into embedding
spaces, leading to approaches that can be categorized into pair-based methods using a contrastive
loss [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], triplet-based methods [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], similarity cloud-based methods [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] and clustering-based methods
capturing global data structures [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. Additional approaches integrate attribute-specific measures, e.g.,
k-Prototypes [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], or unsupervised hyperbolic methods for hierarchical data [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ].
        </p>
        <p>
          In the context of case-based reasoning, some works further propose data-driven approaches to
learning or tuning similarity measures that rely less on expert-driven designs [
          <xref ref-type="bibr" rid="ref21 ref22">21, 22</xref>
          ].
        </p>
        <p>
          Analogical Transfer Analogical transfer infers the outcomes of new situations from similar known
instances by mapping similarities (see e.g. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]). The analogical transfer principle can be formulated as:
if two situations are similar with respect to some criteria, then it is plausible that they are
also similar with respect to other criteria (see e.g. [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ]). Computational analogy formalizes this
principle through various methods that can for instance be categorized into [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]: instance-label
alignment [
          <xref ref-type="bibr" rid="ref25 ref26">25, 26</xref>
          ], negative constraints [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ], label support measures [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ], domain knowledge integration [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ],
and Complexity-based Analogical Transfer (CoAT) [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ].
        </p>
        <p>
          Case-based prediction (CBP) considers a case base CB, a set of cases where each case is a pair (s, r) ∈ 𝒮 × ℛ, where 𝒮 and ℛ respectively denote the situation and the outcome spaces. They are respectively equipped with two similarity measures σ<sub>𝒮</sub> : 𝒮 × 𝒮 → ℝ<sup>+</sup> and σ<sub>ℛ</sub> : ℛ × ℛ → ℝ<sup>+</sup>. We describe in more detail the CoAT [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ] approach below, a recent model that has shown promising results in analogical transfer tasks.
        </p>
        <p>CoAT relies on a global indicator, computed on the whole case base instead of evaluating locally the similarity between source-target pairs: applying the analogical transfer principle to CBP, according to which similar situations imply similar outcomes, the incompatibility indicator counts the number of triplets (a, b, c) in CB that violate this principle, i.e. σ<sub>𝒮</sub>(s<sub>a</sub>, s<sub>b</sub>) ≥ σ<sub>𝒮</sub>(s<sub>a</sub>, s<sub>c</sub>) but σ<sub>ℛ</sub>(r<sub>a</sub>, r<sub>b</sub>) &lt; σ<sub>ℛ</sub>(r<sub>a</sub>, r<sub>c</sub>):</p>
        <p>Γ(σ<sub>𝒮</sub>, σ<sub>ℛ</sub>, CB) := ∑<sub>(a, b, c) ∈ CB³</sub> Ind<sub>θ</sub>(a, b, c)   (1)</p>
      </sec>
      <sec id="sec-1-2">
        <title>1.2. Research Questions</title>
        <p>where θ = (σ<sub>𝒮</sub>, σ<sub>ℛ</sub>, CB), and Ind<sub>θ</sub>(a, b, c) := 1{σ<sub>𝒮</sub>(s<sub>a</sub>, s<sub>b</sub>) ≥ σ<sub>𝒮</sub>(s<sub>a</sub>, s<sub>c</sub>)} · 1{σ<sub>ℛ</sub>(r<sub>a</sub>, r<sub>b</sub>) &lt; σ<sub>ℛ</sub>(r<sub>a</sub>, r<sub>c</sub>)}.</p>
        <p>As a classifier, for a new situation s<sub>new</sub>, when leveraging the situation similarity to predict its outcome, CoAT identifies the most plausible outcome, defined as the one minimizing the incompatibility of the case base augmented with the candidate new case (s<sub>new</sub>, r):</p>
        <p>r̂<sub>new</sub> := arg min<sub>r ∈ ℛ</sub> E<sub>CoAT</sub>(s<sub>new</sub>, r), where E<sub>CoAT</sub>(s<sub>new</sub>, r) := Γ(σ<sub>𝒮</sub>, σ<sub>ℛ</sub>, CB ∪ {(s<sub>new</sub>, r)})</p>
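        <p>To make these definitions concrete, here is a minimal Python sketch of the incompatibility count Γ and the resulting classifier; the toy similarity measures sigma_s and sigma_r and the case base are illustrative assumptions, not the authors' implementation:</p>

```python
from itertools import product

def gamma(sigma_s, sigma_r, case_base):
    """Incompatibility indicator: count triplets (a, b, c) whose situations
    satisfy sigma_s(s_a, s_b) >= sigma_s(s_a, s_c) while the outcomes
    violate the principle, i.e. sigma_r(r_a, r_b) < sigma_r(r_a, r_c)."""
    count = 0
    for (sa, ra), (sb, rb), (sc, rc) in product(case_base, repeat=3):
        if sigma_s(sa, sb) >= sigma_s(sa, sc) and sigma_r(ra, rb) < sigma_r(ra, rc):
            count += 1
    return count

def coat_predict(s_new, candidate_outcomes, sigma_s, sigma_r, case_base):
    """CoAT prediction: return the outcome minimizing the incompatibility
    of the case base augmented with the candidate case (s_new, r)."""
    return min(candidate_outcomes,
               key=lambda r: gamma(sigma_s, sigma_r, case_base + [(s_new, r)]))

# Toy 1-D case base: situation similarity = negative distance,
# outcome similarity = label equality (both assumed for illustration).
sigma_s = lambda x, y: -abs(x - y)
sigma_r = lambda r1, r2: 1.0 if r1 == r2 else 0.0
cb = [(0.0, "A"), (0.2, "A"), (1.0, "B"), (1.2, "B")]
print(coat_predict(0.1, ["A", "B"], sigma_s, sigma_r, cb))  # prints "A"
```

        <p>Note the cubic cost in |CB| of the naive triple loop, which is why controlling the set of triplets matters in practice.</p>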
        <p>
          Similarity measure learning and analogical transfer are interdependent: effective analogical transfer requires suitably learned measures, and optimal measures can be guided by analogical prediction tasks. Taking CoAT as an example, the incompatibility indicator Γ enables metric learning beyond classification. In fact, defining parameters θ and incompatibility E<sub>θ</sub>, Badra et al. [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] identify CoAT as an energy-based model (EBM). Many loss functions have been proposed to optimize the performance of EBMs, such as the hinge loss ℓ(s, r) := max(0, E<sub>θ</sub>(s, r) − min<sub>r′ ≠ r</sub> E<sub>θ</sub>(s, r′) + m). Such approaches open the way to optimizing the similarity measure used in CoAT.
        </p>
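        <p>The EBM hinge loss discussed above can be sketched in a few lines; the toy energy table standing in for E<sub>θ</sub> is an assumption for illustration:</p>

```python
def hinge_loss(energy, s, r_true, labels, margin=1.0):
    """EBM hinge loss: push the energy of the correct label below the
    lowest energy among the other labels by at least `margin`."""
    e_true = energy(s, r_true)
    e_best_other = min(energy(s, r) for r in labels if r != r_true)
    return max(0.0, e_true - e_best_other + margin)

# Hypothetical energy table standing in for E(s, r) = Gamma(CB + {(s, r)}).
energies = {("s1", "A"): 0.0, ("s1", "B"): 3.0}
energy = lambda s, r: energies[(s, r)]

print(hinge_loss(energy, "s1", "A", ["A", "B"]))  # 0.0: correct label wins by >= margin
print(hinge_loss(energy, "s1", "B", ["A", "B"]))  # 4.0: wrong label is penalized
```

        <p>Minimizing this loss over the parameters of σ<sub>𝒮</sub> drives the energy landscape toward ranking the true outcome first.</p>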
        <p>This opens the following questions: Do similarity measures optimized for a specific algorithm,
for example CoAT, generalize across analogical classifiers? Can we tailor measures specifically for
analogical classifiers or vice versa?</p>
        <p>This research investigates how metric learning algorithms influence analogical transfer classifier
performance, e.g. measured by their accuracy, and considering in particular the case of CoAT.
Challenges include studying classifier-metric interactions, defining mathematical frameworks for analogical
similarity learning, and taking into account computational complexity constraints.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Research Direction and Methods</title>
      <sec id="sec-2-1">
        <title>2.1. Objectives</title>
        <p>The objectives of this research include both theoretical and practical aspects. The former aims to
study the relationship between similarity measure learning and analogical transfer, with a focus on
the classification task. This involves (1) establishing a theoretical framework that unifies similarity
measure learning and analogical transfer, and (2) studying the optimization of similarity measures using
a specific analogical transfer model. On the practical side, the analogical transfer model will be applied
to different domains, including culinary recipe cases and medical use cases.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Methodology</title>
        <p>In order to achieve the first objective in the theoretical part, I propose a unified framework that
combines similarity measure learning and analogical transfer by abstracting their common processes,
namely measure, interaction and optimization. This work includes the following targets. Firstly, it aims
to provide a clear mathematical framework. By doing so, my aim is to identify parts of existing
metric learning methods that can be considered as using components of analogical transfer such as Γ. After
that, I would like to provide a base of theoretical proofs for the research questions introduced in
Section 1.2, for example, whether guarantees can be provided about the prediction accuracy, i.e., studying
the probability P(arg min<sub>r ∈ ℛ</sub> E<sub>CoAT</sub>(s<sub>new</sub>, r) = r<sub>new</sub>). Eventually, this work could help to guide
the development of new models. This part will be evaluated by a systematic literature survey and
mathematical demonstrations: we will investigate models that fit into this framework and characterize
formally their expression within it. In addition, we will prove the theorems under this
framework using mathematical methods.</p>
        <p>
          After establishing the theoretical framework, the next step is to study the optimization of similarity
measures for the analogical transfer model CoAT [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ] presented in Section 1.1. Since its incompatibility
indicator Γ (see Eq. 1) is a sum of binary indicators, and therefore a step function, it is non-continuous
at a finite number of points, and its gradient is zero at all other points. As a consequence, traditional
optimization methods based on differentiation cannot be applied to find the optimal parameter values θ,
in particular the similarity measure σ<sub>𝒮</sub>.
        </p>
        <p>I would like to explore the use of metric learning methods to optimize σ<sub>𝒮</sub> for CoAT. This involves a
more detailed theoretical analysis of the limitations of the original CoAT model, such as its discrete and
non-differentiable nature limiting its optimization capabilities, and proposing a method to overcome
these limitations. The proposed method will be evaluated on real datasets to assess its effectiveness
in improving the performance of CoAT, using classical supervised learning quality metrics, such as
accuracy, as well as algorithmic complexity.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Progress Summary</title>
      <p>At present, the theoretical part of my study comprises a framework that integrates similarity measure
learning with analogical transfer (Section 3.1), and an investigation into optimizing similarity measures
for the analogical transfer model CoAT (Section 3.2).</p>
      <sec id="sec-3-1">
        <title>3.1. Identification of Common Architecture in Metric Learning</title>
        <p>
          In order to explore the relationship between metric learning and analogical transfer, I propose a theoretical framework based on the state-of-the-art literature (especially inspired by CoAT [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ]): a common structure of metric learning models, called the Similarity Measure Learning Architecture (SiMeLAr), which consists of three core components:
        </p>
        <p>1. The measure component defines the similarity measures (σ<sub>𝒮</sub> and σ<sub>ℛ</sub>) independently within the
situation space (𝒮) and the outcome space (ℛ), establishing intra-space relationships.
2. The interaction component introduces an interaction function linking metrics from the situation
and outcome spaces, establishing inter-space relationships. This function ensures consistency, so
that similar input features correspond to similar outputs.
3. The optimization component employs a parameterized loss function and an aggregated total loss
function to optimize the model parameters. It guides the training process, aligning the interaction
of metrics toward effective learning.</p>
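        <p>The three components above could hypothetically fit together as follows; the decomposition follows the text, but every concrete function here (the two measures, the interaction penalty, the squared loss) is an assumed placeholder, not part of SiMeLAr itself:</p>

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

@dataclass
class SiMeLAr:
    """Sketch of the three components: two intra-space measures, an
    interaction function linking them, and a loss aggregated over pairs."""
    sigma_s: Callable[[float, float], float]      # measure on situations
    sigma_r: Callable[[str, str], float]          # measure on outcomes
    interaction: Callable[[float, float], float]  # links the two measures
    loss: Callable[[float], float]                # penalizes bad interactions

    def total_loss(self, cases: Sequence[Tuple[float, str]]) -> float:
        """Aggregated total loss over all ordered pairs of cases."""
        total = 0.0
        for sa, ra in cases:
            for sb, rb in cases:
                inter = self.interaction(self.sigma_s(sa, sb), self.sigma_r(ra, rb))
                total += self.loss(inter)
        return total

model = SiMeLAr(
    sigma_s=lambda x, y: 1.0 / (1.0 + abs(x - y)),       # similarity in [0, 1]
    sigma_r=lambda a, b: 1.0 if a == b else 0.0,
    interaction=lambda ss, sr: max(0.0, ss - sr),        # similar situations should have similar outcomes
    loss=lambda v: v ** 2,
)
print(model.total_loss([(0.0, "A"), (0.1, "A"), (5.0, "B")]))
```

        <p>Swapping the placeholders lets different metric learning models be expressed against the same three-component interface.</p>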
        <p>Based on this framework, I develop several theorems characterizing when the loss function can be
guaranteed to be differentiable and convex, and study the relationship between the loss function and
the accuracy of the model.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Continuous CoAT</title>
        <p>
          Beyond the general case, I look at the particular case of CoAT [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ]. The original CoAT model is based
on the incompatibility function Γ given in Eq. (1), defined as a sum of binary indicators Ind<sub>θ</sub>.
However, as discussed in Section 2.2, this definition does not allow applying gradient-based optimization methods.
Besides, it is not sensitive to small changes in the input data or parameters, which can lead to poor
performance in some cases.
        </p>
        <p>To overcome these limitations, I propose a new continuous energy function, which measures the
extent of violation of the analogical transfer principle rather than merely counting violations. Specifically,
in Ind<sub>θ</sub>, I propose to replace the first term of the product with max(0, λ − σ<sub>𝒮</sub>(s<sub>a</sub>, s<sub>c</sub>) + σ<sub>𝒮</sub>(s<sub>a</sub>, s<sub>b</sub>)),
where λ is a margin parameter: this hinge quantifies the degree of violation. By doing so, the proposed method
enables gradient-based optimization thanks to its continuity and differentiability. Additionally, I propose
to restrict the set of triplets (a, b, c) involved in the computation of Γ (see Eq. 1), to reduce
the algorithmic complexity, and to introduce a normalization term in order to deal with multi-class
classification scenarios.</p>
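        <p>The contrast between the binary indicator and its continuous surrogate can be sketched as follows; the argument values and the margin are illustrative, and the outcome-side test is abstracted into a boolean:</p>

```python
def ind_hard(sig_ab, sig_ac, outcome_violated):
    """Original binary indicator: fires iff sigma_S(a,b) >= sigma_S(a,c)
    while the outcome similarities violate the transfer principle."""
    return float(sig_ab >= sig_ac and outcome_violated)

def ind_continuous(sig_ab, sig_ac, outcome_violated, margin=0.1):
    """Continuous variant: the step on situation similarities becomes a
    hinge that grows with the extent of the violation."""
    return max(0.0, margin + sig_ab - sig_ac) * float(outcome_violated)

# The hard indicator is flat almost everywhere (zero gradient); the hinge
# increases with sig_ab, so a gradient step on sigma_S can lower the energy.
print(ind_hard(0.7, 0.5, True))        # 1.0, regardless of how large the gap is
print(ind_continuous(0.7, 0.5, True))  # ~0.3, grows with the violation
print(ind_continuous(0.3, 0.5, True))  # 0.0: violation smaller than the margin allows
```
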
        <p>
          Experimental Study Ongoing work studies the proposed function for metric learning, combining it
with several common loss functions used in energy-based models, such as the MCE loss, hinge loss and direct
loss [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ]. We evaluate the proposed model Continuous CoAT (C-CoAT) against the original CoAT
model on classical datasets, employing various similarity measures (Polynomial and Mahalanobis) and
optimization methods (Adam and SGD).
        </p>
        <p>Results show that the new continuous energy-based method outperforms the original CoAT in 4 out
of the 6 considered datasets, particularly when the similarity measures are optimized. Notably, datasets
with purely categorical attributes present challenges, which we are currently investigating in more detail.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion and Future Work</title>
      <p>My research advances the integration of similarity measure learning with analogical transfer for
classification tasks. A unified theoretical framework (SiMeLAr) is proposed to clarify structural elements
common to metric learning and analogical transfer. The proposed Continuous Complexity-based
Analogical Transfer (C-CoAT) resolves discontinuity and zero-gradient limitations of the original CoAT,
and opens the way to improving the classification performance via gradient-based optimization of the
similarity measure it relies on, validated through empirical evaluations.</p>
      <p>Future work will further explore whether the similarity measures optimized for C-CoAT are also
effective for other classifiers, and extend the approach to real-world applications such as cooking and medical use
cases.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work is funded by the SMeLT project, ANR-22-CE23-0032-03.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The authors did not use any generative AI during the preparation of this work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H.</given-names>
            <surname>Gust</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Krumnack</surname>
          </string-name>
          , K.-U. Kühnberger,
          <string-name>
            <given-names>A.</given-names>
            <surname>Schwering</surname>
          </string-name>
          ,
          <article-title>Analogical Reasoning: A Core of Cognition</article-title>
          .,
          <source>KI</source>
          <volume>22</volume>
          (
          <year>2008</year>
          )
          <fpage>8</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>I.</given-names>
            <surname>Gilboa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schmeidler</surname>
          </string-name>
          ,
          <article-title>Case based Predictions: Introduction, Introduction to Case-Based Prediction</article-title>
          . World Scientific Publishers (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>F.</given-names>
            <surname>Badra</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.-J. Lesot</surname>
          </string-name>
          ,
          <article-title>Case-based prediction - A survey, Int</article-title>
          .
          <source>Journal of Approximate Reasoning</source>
          <volume>158</volume>
          (
          <year>2023</year>
          )
          <fpage>108920</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bellet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Habrard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sebban</surname>
          </string-name>
          , Metric Learning,
          <source>Synthesis Lectures on Artificial Intelligence and Machine Learning</source>
          , Springer,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Suárez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>García</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Herrera</surname>
          </string-name>
          ,
          <article-title>A tutorial on distance metric learning: Mathematical foundations, algorithms, experimental analysis</article-title>
          ,
          <source>prospects and challenges, Neurocomputing</source>
          <volume>425</volume>
          (
          <year>2021</year>
          )
          <fpage>300</fpage>
          -
          <lpage>322</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Badra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bendaoud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bentebibel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.-A.</given-names>
            <surname>Champin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cojan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cordier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Després</surname>
          </string-name>
          , S. JeanDaubias, J. Lieber,
          <string-name>
            <given-names>T.</given-names>
            <surname>Meilender</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mille</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Nauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Toussaint</surname>
          </string-name>
          , TAAABLE: Text Mining, Ontology Engineering, and
          <article-title>Hierarchical Classification for Textual Case-Based Cooking</article-title>
          , in: M.
          <string-name>
            <surname>Schaaf</surname>
          </string-name>
          (Ed.),
          <source>9th European Conf. on Case-Based Reasoning - ECCBR</source>
          <year>2008</year>
          ,
          <string-name>
            <given-names>Workshop</given-names>
            <surname>Proc</surname>
          </string-name>
          .,
          <year>2008</year>
          , pp.
          <fpage>219</fpage>
          -
          <lpage>228</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Redjdal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bouaud</surname>
          </string-name>
          , G. Guézennec,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gligorov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Seroussi</surname>
          </string-name>
          ,
          <article-title>Reusing Decisions Made with One Decision Support System to Assess a Second Decision Support System: Introducing the Notion of Complex Cases</article-title>
          ,
          <source>Studies in Health Technology and Informatics</source>
          <volume>281</volume>
          (
          <year>2021</year>
          )
          <fpage>649</fpage>
          -
          <lpage>653</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Santini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <article-title>Similarity measures</article-title>
          ,
          <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>
          <volume>21</volume>
          (
          <year>1999</year>
          )
          <fpage>871</fpage>
          -
          <lpage>883</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>M.-J. Lesot</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Rifqi</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Benhadda</surname>
          </string-name>
          ,
          <article-title>Similarity measures for binary and numerical data: a survey, Int</article-title>
          .
          <source>Journal of Knowledge Engineering and Soft Data Paradigms</source>
          <volume>1</volume>
          (
          <year>2009</year>
          )
          <fpage>63</fpage>
          -
          <lpage>84</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>E.</given-names>
            <surname>Xing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jordan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Russell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <article-title>Distance Metric Learning with Application to Clustering with Side-Information</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>15</volume>
          , MIT Press,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J. V.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kulis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. S.</given-names>
            <surname>Dhillon</surname>
          </string-name>
          ,
          <article-title>Information-theoretic metric learning</article-title>
          ,
          <source>in: Proc. of the 24th Int. Conf. on Machine Learning, ICML '07</source>
          ,
          <string-name>
            <surname>ACM</surname>
          </string-name>
          ,
          <year>2007</year>
          , pp.
          <fpage>209</fpage>
          -
          <lpage>216</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kalousis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Woznica</surname>
          </string-name>
          ,
          <article-title>Parametric Local Metric Learning for Nearest Neighbor Classification</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>25</volume>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kedem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tyree</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Sha</surname>
          </string-name>
          , G. Lanckriet,
          <string-name>
            <given-names>K. Q.</given-names>
            <surname>Weinberger</surname>
          </string-name>
          ,
          <article-title>Non-linear Metric Learning</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>25</volume>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>F.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Guibas</surname>
          </string-name>
          ,
          <article-title>Supervised Earth Mover's Distance Learning and Its Computer Vision Applications</article-title>
          , in:
          <string-name>
            <given-names>A.</given-names>
            <surname>Fitzgibbon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lazebnik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Perona</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Schmid</surname>
          </string-name>
          (Eds.),
          <source>Computer Vision - ECCV 2012</source>
          , Springer,
          <year>2012</year>
          , pp.
          <fpage>442</fpage>
          -
          <lpage>455</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hermans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Beyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Leibe</surname>
          </string-name>
          ,
          <article-title>In Defense of the Triplet Loss for Person Re-Identification</article-title>
          ,
          <source>arXiv</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>E.</given-names>
            <surname>Hoffer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ailon</surname>
          </string-name>
          ,
          <article-title>Deep Metric Learning Using Triplet Network</article-title>
          , in:
          <string-name>
            <given-names>A.</given-names>
            <surname>Feragen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pelillo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Loog</surname>
          </string-name>
          (Eds.),
          <source>Similarity-Based Pattern Recognition</source>
          , Springer,
          <year>2015</year>
          , pp.
          <fpage>84</fpage>
          -
          <lpage>92</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>T.</given-names>
            <surname>Gabel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Godehardt</surname>
          </string-name>
          ,
          <article-title>Top-Down Induction of Similarity Measures Using Similarity Clouds</article-title>
          , in:
          <string-name>
            <given-names>E.</given-names>
            <surname>Hüllermeier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Minor</surname>
          </string-name>
          (Eds.),
          <source>Case-Based Reasoning Research and Development</source>
          , Springer,
          <year>2015</year>
          , pp.
          <fpage>149</fpage>
          -
          <lpage>164</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>H. O.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Jegelka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Savarese</surname>
          </string-name>
          ,
          <article-title>Deep Metric Learning via Lifted Structured Feature Embedding</article-title>
          ,
          <source>2016 IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)</source>
          (
          <year>2016</year>
          )
          <fpage>4004</fpage>
          -
          <lpage>4012</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <article-title>Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values</article-title>
          ,
          <source>Data Mining and Knowledge Discovery</source>
          <volume>2</volume>
          (
          <year>1998</year>
          )
          <fpage>283</fpage>
          -
          <lpage>304</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>J.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <article-title>Unsupervised Hyperbolic Metric Learning</article-title>
          ,
          <source>in: Proc. of the IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>12465</fpage>
          -
          <lpage>12474</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>A.</given-names>
            <surname>Jaiswal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bach</surname>
          </string-name>
          ,
          <article-title>A Data-Driven Approach for Determining Weights in Global Similarity Functions</article-title>
          , in:
          <string-name>
            <given-names>K.</given-names>
            <surname>Bach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Marling</surname>
          </string-name>
          (Eds.),
          <source>Case-Based Reasoning Research and Development</source>
          , volume
          <volume>11680</volume>
          , Springer,
          <year>2019</year>
          , pp.
          <fpage>125</fpage>
          -
          <lpage>139</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>B. M.</given-names>
            <surname>Mathisen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Aamodt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Langseth</surname>
          </string-name>
          ,
          <article-title>Learning similarity measures from data</article-title>
          ,
          <source>Progress in Artificial Intelligence</source>
          <volume>9</volume>
          (
          <year>2020</year>
          )
          <fpage>129</fpage>
          -
          <lpage>143</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>T. R.</given-names>
            <surname>Davies</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Russell</surname>
          </string-name>
          ,
          <article-title>A Logical Approach to Reasoning by Analogy</article-title>
          , in:
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McDermott</surname>
          </string-name>
          (Ed.),
          <source>Proc. of the 10th Int. Joint Conf. on Artificial Intelligence (IJCAI'87)</source>
          , Morgan Kaufmann Publishers,
          <year>1987</year>
          , pp.
          <fpage>264</fpage>
          -
          <lpage>270</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>F.</given-names>
            <surname>Badra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-J.</given-names>
            <surname>Lesot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barakat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Marsala</surname>
          </string-name>
          ,
          <article-title>Theoretical and Experimental Study of a Complexity Measure for Analogical Transfer</article-title>
          , in:
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Keane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Wiratunga</surname>
          </string-name>
          (Eds.),
          <source>Case-Based Reasoning Research and Development</source>
          , volume
          <volume>13405</volume>
          , Springer,
          <year>2022</year>
          , pp.
          <fpage>175</fpage>
          -
          <lpage>189</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>F.</given-names>
            <surname>Badra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Sedki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ugon</surname>
          </string-name>
          ,
          <article-title>On the Role of Similarity in Analogical Transfer</article-title>
          , in:
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Cox</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Funk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Begum</surname>
          </string-name>
          (Eds.),
          <source>Case-Based Reasoning Research and Development</source>
          , volume
          <volume>11156</volume>
          , Springer,
          <year>2018</year>
          , pp.
          <fpage>499</fpage>
          -
          <lpage>514</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>L.</given-names>
            <surname>Miclet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bayoudh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Delhay</surname>
          </string-name>
          ,
          <article-title>Analogical Dissimilarity: Definition, Algorithms and Two Experiments in Machine Learning</article-title>
          ,
          <source>Journal of Artificial Intelligence Research</source>
          <volume>32</volume>
          (
          <year>2008</year>
          )
          <fpage>793</fpage>
          -
          <lpage>824</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>E.</given-names>
            <surname>Hüllermeier</surname>
          </string-name>
          ,
          <source>Case-Based Approximate Reasoning</source>
          , number 44 in
          <series>Theory and Decision Library</series>
          , Springer,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>E.</given-names>
            <surname>Hüllermeier</surname>
          </string-name>
          ,
          <article-title>Possibilistic instance-based learning</article-title>
          ,
          <source>Artificial Intelligence</source>
          <volume>148</volume>
          (
          <year>2003</year>
          )
          <fpage>335</fpage>
          -
          <lpage>383</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lieber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Nauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Prade</surname>
          </string-name>
          ,
          <article-title>When Revision-Based Case Adaptation Meets Analogical Extrapolation</article-title>
          ,
          <source>in: 29th Int. Conf. on Case-Based Reasoning (ICCBR 2021)</source>
          , volume
          <volume>12877</volume>
          of Lecture Notes in Computer Science (LNCS),
          <year>2021</year>
          , pp.
          <fpage>156</fpage>
          -
          <lpage>170</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>F.</given-names>
            <surname>Badra</surname>
          </string-name>
          ,
          <article-title>A Dataset Complexity Measure for Analogical Transfer</article-title>
          ,
          <source>in: Int. Joint Conf. on Artificial Intelligence</source>
          , volume
          <volume>2</volume>
          ,
          <year>2020</year>
          , pp.
          <fpage>1601</fpage>
          -
          <lpage>1607</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>F.</given-names>
            <surname>Badra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-J.</given-names>
            <surname>Lesot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Marquer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Couceiro</surname>
          </string-name>
          ,
          <article-title>Some Perspectives on Similarity Learning for Case-Based Reasoning and Analogical Transfer</article-title>
          ,
          <source>in: Workshop on the Interactions between Analogical Reasoning and Machine Learning</source>
          ,
          <source>IARML@IJCAI'2023</source>
          , volume
          <volume>3492</volume>
          , CEUR-WS.org,
          <year>2023</year>
          , pp.
          <fpage>16</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Y.</given-names>
            <surname>LeCun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chopra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hadsell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ranzato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <article-title>A tutorial on energy-based learning</article-title>
          , in:
          <string-name>
            <given-names>G.</given-names>
            <surname>Bakir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hofmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schölkopf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Taskar</surname>
          </string-name>
          (Eds.),
          <source>Predicting Structured Data</source>
          , MIT Press,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>