<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Gradient Boosting Similarity in Entity Matching</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sergei Fedorchenko</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sergei Arefiev</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>St Petersburg University</institution>
          ,
          <addr-line>7-9 Universitetskaya Embankment, St Petersburg, Russia, 199034</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Zurich</institution>
          ,
          <addr-line>Binzmühlestrasse 14, 8050 Zürich</addr-line>
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>We present a solution to the CLEF 2025 [1] individual animal identification task, leveraging pretrained embedding models and a boosting classifier for pairwise similarity learning. Our pipeline combines pretrained feature extraction, pairwise embedding comparison, and supervised boosting to determine individual-level similarity between image pairs. As team "Tim Riggins", we achieved competitive performance, obtaining a target-metric score of 0.618 for our selected submission and 0.629 for our best submission on the CLEF private leaderboard.</p>
      </abstract>
      <kwd-group>
        <kwd>Entity matching</kwd>
        <kwd>wildlife</kwd>
        <kwd>gradient boosting</kwd>
        <kwd>image similarity</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <sec id="sec-2-1">
        <title>2.1. Metric Learning</title>
        <p>
          Metric learning aims to project images into an embedding space where semantically similar items are
close and dissimilar items are far apart. Common approaches include triplet loss [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], which optimizes
relative distances between anchor, positive, and negative samples, and contrastive loss [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], which
operates on pairs of examples. These methods often require careful sampling or mining strategies to be
effective. ArcFace [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] improves stability and discriminative power by adding an angular margin to the
softmax loss, making it particularly effective for tasks with large numbers of classes.
        </p>
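        <p>As an illustration of the triplet objective described above, the following is a minimal PyTorch sketch (not part of our pipeline, which uses pretrained embeddings without metric-learning fine-tuning):</p>
        <preformat>
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Encourage the anchor-positive distance to be smaller than the
    # anchor-negative distance by at least the margin.
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
        </preformat>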
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Boosting Methods</title>
        <p>
          Gradient boosting methods are widely used for structured and tabular data due to their strong
performance, robustness to overfitting, and ability to handle heterogeneous feature types. These models
iteratively build an ensemble of weak learners, typically decision trees, to minimize a loss function
through stage-wise additive modeling. Among popular implementations, LightGBM [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] offers efficient
training with support for large-scale datasets, categorical features, and custom objective functions. In
the context of similarity learning, boosting models can be trained to predict whether a given pair of
examples belongs to the same class by using handcrafted features derived from image embeddings.
        </p>
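        <p>A minimal sketch of this pairwise formulation, with synthetic embeddings and illustrative pair features (cosine similarity and L2 distance here; our actual feature set is described in Section 5.5):</p>
        <preformat>
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
emb_a = rng.normal(size=(1000, 256))    # embedding of the first image in each pair
emb_b = rng.normal(size=(1000, 256))    # embedding of the second image
labels = rng.integers(0, 2, size=1000)  # 1 = same class, 0 = different

# Handcrafted pair features derived from the embeddings.
cos = np.sum(emb_a * emb_b, axis=1) / (
    np.linalg.norm(emb_a, axis=1) * np.linalg.norm(emb_b, axis=1))
l2 = np.linalg.norm(emb_a - emb_b, axis=1)
X = np.stack([cos, l2], axis=1)

clf = lgb.LGBMClassifier(objective="binary", n_estimators=300)
clf.fit(X, labels)
        </preformat>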
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Open Set Recognition</title>
        <p>
          Open Set Recognition (OSR) aims to address the realistic scenario where a model may encounter
inputs from classes not seen during training. Unlike traditional closed-set classifiers, OSR models must
detect and reject unfamiliar samples rather than misclassify them. Techniques such as OpenMax [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]
extend softmax classifiers by fitting Weibull distributions to activation vectors, enabling detection of
out-of-distribution inputs. Other methods apply thresholding in embedding space using distances to
class centroids or Mahalanobis scoring [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], making them compatible with metric learning approaches.
        </p>
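        <p>A sketch of the centroid-thresholding idea, assuming class means and a shared inverse covariance estimated on training embeddings (names are illustrative):</p>
        <preformat>
import numpy as np

def mahalanobis_scores(x, class_means, shared_cov_inv):
    # Score each embedding by its minimum Mahalanobis distance to any
    # class centroid; large scores indicate unknown (open-set) inputs.
    scores = []
    for mu in class_means:
        d = x - mu
        scores.append(np.einsum("ij,jk,ik->i", d, shared_cov_inv, d))
    return np.min(np.stack(scores, axis=1), axis=1)
        </preformat>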
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Energy-Based Models</title>
        <p>
          Energy-Based Models (EBMs) provide a principled framework for modeling uncertainty and detecting
out-of-distribution inputs. Instead of outputting class probabilities directly, these models compute
an energy score that reflects the compatibility of input and model. In the OSR context, inputs with
high energy are flagged as unfamiliar or anomalous. Liu et al. [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] propose training classifiers to
minimize energy on in-distribution samples while maximizing it on out-of-distribution data using
Outlier Exposure. EBMs can be seamlessly integrated with pre-trained embedding networks and have
shown superior calibration and robustness compared to traditional softmax classifiers.
        </p>
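        <p>The energy score proposed in [<xref ref-type="bibr" rid="ref9">9</xref>] is computed directly from classifier logits; a compact PyTorch version:</p>
        <preformat>
import torch

def energy_score(logits, T=1.0):
    # E(x) = -T * logsumexp(f(x) / T); higher energy suggests the input
    # is out-of-distribution relative to the training classes.
    return -T * torch.logsumexp(logits / T, dim=1)
        </preformat>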
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset</title>
      <p>
        The CLEF 2025 dataset includes images of individual animals across several species, such as lynxes,
salamanders, and sea turtles, with the distribution shown in Table 1, along with rich
metadata (e.g., species ID, individual ID, timestamp, and orientation) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The primary task is to
determine whether two images depict the same animal, making it a fine-grained entity recognition
problem. Each species presents unique challenges due to differences in visual variability, camera
conditions, and availability of labeled examples. The dataset also includes both known and unknown
individuals, requiring models to generalize beyond the training set.
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. Task Description</title>
      <p>
        The goal of the entity matching task is to determine whether a query animal image belongs to a known
individual from a reference database or represents a previously unseen individual. The challenge spans
multiple species and requires models to handle significant visual variability while generalizing to novel
instances. Evaluation is based on two core metrics [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]: BAKS (Balanced Accuracy on Known Samples),
which measures class-balanced accuracy over known individuals, and BAUS (Balanced Accuracy on
Unknown Samples), which assesses performance on identifying novel individuals. The final score is
computed as the geometric mean of BAKS and BAUS, encouraging balanced performance between
identification and novelty detection.
      </p>
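      <p>The final score is straightforward to reproduce; for example:</p>
      <preformat>
import math

def final_score(baks, baus):
    # Geometric mean of balanced accuracies on known and unknown samples.
    return math.sqrt(baks * baus)

print(final_score(0.70, 0.55))  # ~0.620: both metrics must be high to score well
      </preformat>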
    </sec>
    <sec id="sec-5">
      <title>5. Method</title>
      <sec id="sec-5-1">
        <title>5.1. Image Preprocessing</title>
        <p>To preserve the aspect ratio of the original images, we applied padding before resizing. Images were
then resized to 384 × 384 pixels when using the MegaDescriptor model, and to 512 × 512 pixels for
all other local descriptor extractors. This preprocessing ensured consistency across the input pipeline
while maintaining the structural integrity of key visual features.</p>
        <p>For normalization, we used ImageNet statistics (mean and standard deviation) when working with
MegaDescriptor; for the other models, we applied min-max normalization.</p>
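        <p>A minimal sketch of this preprocessing step using torchvision (function and variable names are ours):</p>
        <preformat>
import torchvision.transforms.functional as TF
from PIL import Image

def pad_and_resize(img, size):
    # Pad the shorter side so the image becomes square (preserving the
    # aspect ratio), then resize: 384 for MegaDescriptor, 512 otherwise.
    w, h = img.size
    side = max(w, h)
    left = (side - w) // 2
    top = (side - h) // 2
    img = TF.pad(img, [left, top, side - w - left, side - h - top])
    return TF.resize(img, [size, size])
        </preformat>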
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Global Feature Extraction</title>
        <p>
          MegaDescriptor [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] was used as our primary model for global feature extraction. Although originally
designed for local descriptor learning, MegaDescriptor can also produce dense feature maps that serve
as strong global representations when properly combined. We extracted fixed-length embeddings for
each image without any additional fine-tuning, using these as input for direct similarity comparisons.
The embeddings were used as the foundation for cosine similarity baselines. This approach allowed us
to leverage powerful pre-trained representations while maintaining a training-free feature construction
pipeline.
        </p>
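        <p>A sketch of the embedding extraction via timm, assuming the publicly released MegaDescriptor checkpoint on the Hugging Face hub (the exact checkpoint name is an assumption):</p>
        <preformat>
import timm
import torch

# Assumed checkpoint; MegaDescriptor expects 384x384 inputs in our setup.
model = timm.create_model(
    "hf-hub:BVRA/MegaDescriptor-L-384", pretrained=True, num_classes=0)
model.eval()

with torch.no_grad():
    batch = torch.randn(4, 3, 384, 384)  # placeholder for preprocessed images
    embeddings = model(batch)            # fixed-length global descriptors
        </preformat>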
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Local Feature-Based Matching</title>
        <p>
          We experimented with local descriptor extractors to match animal instances based on fine-grained
visual details. Specifically, we used three pre-trained models: ALIKED [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], Disk [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], and SuperPoint
[
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], all designed for keypoint detection and local descriptor extraction. Each model extracts sets
of keypoints and associated descriptors that can be matched across images to establish local visual
correspondences. These matches serve as a complementary signal to global embeddings, especially in
cases where texture, pose, or background differences make purely global similarity less reliable.
        </p>
      </sec>
      <sec id="sec-5-4">
        <title>5.4. Keypoints Information Aggregation</title>
        <p>
          To combine keypoint information from multiple local descriptor extractors, we used LightGlue [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] as
a lightweight and flexible matching framework. LightGlue was applied to the outputs of ALIKED, Disk,
and SuperPoint, performing keypoint matching between image pairs for each method independently.
The resulting correspondences were then aggregated to produce a richer and more reliable representation
of local visual similarity. This aggregation step allowed us to leverage complementary strengths
of different detectors—such as scale invariance, robustness to viewpoint changes, and localization
precision—to improve the quality of our pairwise features. The combined matches were used as part of
the input to our boosting model for final prediction.
        </p>
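        <p>A sketch of the per-extractor matching step using the LightGlue reference implementation (image paths are placeholders; the same pattern applies to ALIKED and DISK by swapping the extractor):</p>
        <preformat>
from lightglue import LightGlue, SuperPoint
from lightglue.utils import load_image, rbd

extractor = SuperPoint(max_num_keypoints=2048).eval()
matcher = LightGlue(features="superpoint").eval()

image0 = load_image("query.jpg")       # placeholder paths
image1 = load_image("reference.jpg")
feats0 = extractor.extract(image0)
feats1 = extractor.extract(image1)
out = matcher({"image0": feats0, "image1": feats1})
feats0, feats1, out = [rbd(x) for x in [feats0, feats1, out]]

num_matches = out["matches"].shape[0]  # one pairwise signal per extractor
        </preformat>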
      </sec>
      <sec id="sec-5-5">
        <title>5.5. Pairwise Feature Construction</title>
        <p>For each image pair, we compute the cosine similarity and the number of corresponding points from the local
feature extractors (Table 2), ending up with 9 features per sample pair. These serve as input features for
the boosting classifier.</p>
        <p>To improve the robustness of local feature-based similarity, we applied a top-k filtering strategy
using the MegaDescriptor model. For each pair of images, we extracted local descriptors and retained
only the k most similar keypoint correspondences based on descriptor distance. This step
helps to reduce noise from irrelevant or weak matches and emphasizes the most confident local
alignments between images.</p>
        <p>Using a cross-validation scheme, we found that a smaller value of k = 40 was sufficient for lynxes and turtles,
while salamanders required a higher threshold of k = 150 to maintain performance, likely due to
greater visual similarity and more challenging matching conditions within that class.</p>
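        <p>A sketch of the top-k filtering and feature assembly (the exact layout of the 9 features follows Table 2; the version below is an illustrative approximation):</p>
        <preformat>
import numpy as np

def top_k_matches(descriptor_distances, k):
    # Keep the k most confident correspondences (smallest descriptor
    # distance); k=40 for lynxes and sea turtles, k=150 for salamanders.
    return np.argsort(descriptor_distances)[:k]

def pair_features(cos_sim, match_counts, mean_match_scores):
    # Illustrative pairwise feature vector: global cosine similarity plus
    # per-extractor match statistics for ALIKED, DISK and SuperPoint.
    return np.array([cos_sim, *match_counts, *mean_match_scores])
        </preformat>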
      </sec>
      <sec id="sec-5-6">
        <title>5.6. Boosting Classifier</title>
        <p>We train a LightGBM binary classifier to predict whether a given pair of images depicts the same
entity. Positive pairs consist of images of the same individual; negative pairs consist of images of
different individuals.</p>
      </sec>
      <sec id="sec-5-7">
        <title>5.7. Validation</title>
        <p>To better simulate the conditions of the test set, we leveraged the timestamp information available in
the Kaggle dataset to construct train–test splits that follow the natural chronological order of image
collection.</p>
        <p>In addition, we held out 20% of the individuals during training and used them as unseen entities.
This setup allowed us to evaluate both identification of known individuals and detection of new ones,
closely aligning with the evaluation protocol of the challenge.</p>
        <p>Based on our validation results, we selected a threshold of 0.65 to distinguish new individuals from
known ones.</p>
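        <p>A compact sketch of the training and thresholding setup with synthetic data (the chronological split and the 20% held-out individuals are approximated; names are illustrative):</p>
        <preformat>
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 9))        # 9 pairwise features per candidate pair
y = rng.integers(0, 2, size=n)     # 1 = same individual, 0 = different
# Rows assumed sorted by the query image timestamp, so a plain index cut
# gives a chronological train/validation split.
cut = int(0.8 * n)

clf = lgb.LGBMClassifier(objective="binary", n_estimators=300)
clf.fit(X[:cut], y[:cut])

proba = clf.predict_proba(X[cut:])[:, 1]
is_known = proba >= 0.65           # threshold chosen on validation
        </preformat>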
      </sec>
      <sec id="sec-5-8">
        <title>5.8. Inference</title>
        <p>During inference, we followed the same procedure as in training. For each query image, we identified its
most similar reference images based on embedding similarity. These top-ranked pairs were then passed
through the boosting model to obtain binary match scores. The final prediction was assigned based on
the reference image with the highest predicted similarity score among the retrieved candidates. The approach
depicted in Figure 2 allowed us to efficiently combine retrieval-based filtering with supervised
matching for robust entity recognition.</p>
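        <p>A sketch of the inference loop under these assumptions (the retrieval depth k and the pair_features helper are hypothetical, mirroring Section 5.5):</p>
        <preformat>
import numpy as np

def predict_identity(query_emb, ref_embs, ref_ids, clf, pair_features,
                     k=10, threshold=0.65):
    # Retrieve the k most similar reference images by cosine similarity,
    # score each (query, reference) pair with the boosting model, and fall
    # back to "new_individual" when no score clears the threshold.
    sims = ref_embs @ query_emb / (
        np.linalg.norm(ref_embs, axis=1) * np.linalg.norm(query_emb))
    top = np.argsort(-sims)[:k]
    scores = clf.predict_proba(pair_features(query_emb, ref_embs[top]))[:, 1]
    best = int(np.argmax(scores))
    if scores[best] >= threshold:
        return ref_ids[top[best]]
    return "new_individual"
        </preformat>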
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Experiments and Results</title>
      <p>We report performance in Tables 4 and 5 using the official CLEF evaluation metrics. Our model
outperforms simple cosine thresholding and benefits from the added discrimination of the supervised
boosting model. Both the use of boosting instead of averaging and the application of padding led to
improved results.</p>
      <p>Despite the overall consistency between validation and private leaderboard scores, some
discrepancies were observed. For instance, the configuration with a threshold of 0.75 achieved the highest
private score (0.629), while a lower threshold of 0.6 yielded the best validation score (0.686). This
suggests that our validation split, although timeline-based and stratified, may not perfectly reflect the
distribution or difficulty of the private test set. Overfitting to the validation threshold likely accounts
for this performance gap, highlighting the importance of evaluating across multiple thresholds and data
partitions. In future work, incorporating more robust cross-validation or ensembling across decision
thresholds could help bridge this mismatch.</p>
      <p>Finetuning MegaDescriptor within a multi-class classification framework led to noticeable
performance improvements over using frozen embeddings. The model was trained to predict individual
entities directly, which encouraged more discriminative representations. However, this approach
proved computationally expensive and required extensive training data for convergence. In contrast,
keypoint-based matching methods offered a more efficient alternative, especially when used with
pretrained extractors and lightweight post-processing. As a result, we prioritized methods that combined
pretrained descriptors with retrieval and pairwise scoring.</p>
      <p>This pipeline demonstrated strong generalization performance and remained stable across validation
and test scenarios. By combining global and local features with a lightweight boosting classifier, it
effectively captured both coarse and fine-grained visual similarities. The method required minimal
fine-tuning and was computationally efficient at inference time. As a result, it secured 7th place on the
private leaderboard of the CLEF 2025 animal re-identification challenge.</p>
      <sec id="sec-6-1">
        <title>6.1. Ablation Study</title>
        <p>We evaluated several baseline strategies to understand the contribution of each component in our
pipeline. A simple cosine similarity between pretrained embeddings yielded limited performance,
especially in distinguishing hard negative pairs. We also experimented with training classification
models and using their penultimate-layer embeddings for cosine-based retrieval; while this improved
over the vanilla approach, it still lacked robustness. In contrast, our boosting-based method, which
leverages pairwise features, consistently outperformed both baselines by capturing more nuanced
relationships between image pairs. This highlights the value of supervised modeling on relational
features beyond raw embedding distances.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Discussion</title>
      <p>The pretrained embedding model captured useful semantic structure. Boosting provided flexible decision
boundaries in embedding space. Triplet-based training was unstable in our setup, and the pretrained
matcher consistently outperformed self-trained models.</p>
    </sec>
    <sec id="sec-8">
      <title>8. Conclusion and Future Work</title>
      <p>We presented a simple yet effective framework for entity matching based on pretrained visual
embeddings and a supervised boosting model trained on pairwise features. Rather than relying solely on
embedding distance thresholds, our approach learns to model similarity through handcrafted relational
features and a gradient boosting classifier, providing more flexibility and robustness to noise or domain
shifts. This framework proved particularly useful in scenarios where the embedding space alone was
insufficient to capture fine-grained species-level distinctions. In future work, we plan to explore
self-supervised pretraining to improve embedding quality, incorporate graph-based label propagation to
exploit the structure of embedding similarity graphs, and investigate hierarchical clustering techniques
to capture taxonomic relationships between species.</p>
    </sec>
    <sec id="sec-9">
      <title>9. Declaration on Generative AI</title>
      <p>During the preparation of this work, the author(s) used ChatGPT for grammar and spelling
checking. The changes were minimal and affected only grammar and punctuation, not the meaning of the
text. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s)
full responsibility for the publication’s content.</p>
    </sec>
    <sec id="sec-10">
      <title>Acknowledgments</title>
      <p>We thank the CLEF 2025 organizers and baseline authors for providing pretrained models and evaluation
scripts.</p>
    </sec>
    <sec id="sec-11">
      <title>A. Online Resources</title>
      <p>All code can be found in the GitHub repository https://github.com/SergeyFedorchenko/AnimalCLEF25_7th.git.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Picek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kahl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Goëau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Adam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Larcher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Leblanc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Servajean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Janoušková</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Matas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Čermák</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Papafitsoros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Planqué</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-P.</given-names>
            <surname>Vellinga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Klinck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Denton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Cañas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Martellucci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Vinatier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bonnet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joly</surname>
          </string-name>
          , Overview of LifeCLEF 2025:
          <article-title>Challenges on species presence prediction and identification, and individual animal identification</article-title>
          ,
          <source>in: International Conference of the Cross-Language Evaluation Forum for European Languages (CLEF)</source>
          , Springer,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.</given-names>
            <surname>Adam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Papafitsoros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kovář</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Čermák</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Picek</surname>
          </string-name>
          , Overview of AnimalCLEF 2025:
          <article-title>Recognizing individual animals in images</article-title>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Hofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ailon</surname>
          </string-name>
          ,
          <article-title>Deep metric learning using triplet network</article-title>
          ,
          <year>2018</year>
          . URL: https://arxiv.org/abs/1412.6622. arXiv:1412.6622.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Khosla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Teterwak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sarna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Isola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Maschinot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Krishnan</surname>
          </string-name>
          , Supervised contrastive learning,
          <year>2021</year>
          . URL: https://arxiv.org/abs/2004.11362. arXiv:2004.11362.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Xue</surname>
          </string-name>
          , I. Kotsia,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zafeiriou</surname>
          </string-name>
          , Arcface:
          <article-title>Additive angular margin loss for deep face recognition</article-title>
          ,
          <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>
          <volume>44</volume>
          (
          <year>2022</year>
          )
          <fpage>5962</fpage>
          -
          <lpage>5979</lpage>
          . URL: http://dx.doi.org/10.1109/TPAMI.2021.3087709. doi:10.1109/TPAMI.2021.3087709.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Ke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Finley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          , W. Ma,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Ye</surname>
          </string-name>
          , T.-Y. Liu,
          <article-title>Lightgbm: A highly efficient gradient boosting decision tree</article-title>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bendale</surname>
          </string-name>
          , T. Boult, Towards open set deep networks,
          <year>2015</year>
          . URL: https://arxiv.org/abs/1511.06233. arXiv:1511.06233.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shin</surname>
          </string-name>
          ,
          <article-title>A simple unified framework for detecting out-of-distribution samples and adversarial attacks</article-title>
          ,
          <year>2018</year>
          . URL: https://arxiv.org/abs/1807.03888. arXiv:1807.03888.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>W.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Owens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Energy-based out-of-distribution detection</article-title>
          ,
          <year>2021</year>
          . URL: https://arxiv.org/abs/2010.03759. arXiv:2010.03759.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>L.</given-names>
            <surname>Adam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Čermák</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Papafitsoros</surname>
          </string-name>
          , L. Picek, Wildlifereid-10k:
          <article-title>Wildlife re-identification dataset with 10k individual animals</article-title>
          ,
          <year>2025</year>
          . URL: https://arxiv.org/abs/2406.09211. arXiv:2406.09211.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>L.</given-names>
            <surname>Adam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Čermák</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Papafitsoros</surname>
          </string-name>
          , L. Picek,
          <article-title>Seaturtleid2022: A long-span dataset for reliable sea turtle re-identification</article-title>
          ,
          <source>in: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>7146</fpage>
          -
          <lpage>7156</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>V.</given-names>
            <surname>Čermák</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Picek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Adam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Papafitsoros</surname>
          </string-name>
          ,
          Wildlifedatasets:
          <article-title>An open-source toolkit for animal re-identification</article-title>
          ,
          <year>2023</year>
          . URL: https://arxiv.org/abs/2311.09118. arXiv:2311.09118.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. C. Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Aliked: A lighter keypoint and descriptor extraction network via deformable transformation</article-title>
          ,
          <year>2023</year>
          . URL: https://arxiv.org/abs/2304.03608. arXiv:2304.03608.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Tyszkiewicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Fua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Trulls</surname>
          </string-name>
          ,
          <article-title>Disk: Learning local features with policy gradient</article-title>
          ,
          <year>2020</year>
          . URL: https://arxiv.org/abs/2006.13566. arXiv:2006.13566.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>D.</given-names>
            <surname>DeTone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Malisiewicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rabinovich</surname>
          </string-name>
          , Superpoint:
          <article-title>Self-supervised interest point detection and description</article-title>
          ,
          <year>2018</year>
          . URL: https://arxiv.org/abs/1712.07629. arXiv:1712.07629.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>P.</given-names>
            <surname>Lindenberger</surname>
          </string-name>
          , P.-E. Sarlin,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pollefeys</surname>
          </string-name>
          , Lightglue: Local feature matching at light speed,
          <year>2023</year>
          . URL: https://arxiv.org/abs/2306.13643. arXiv:2306.13643.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>