<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Overview of BirdCLEF+ 2025: Multi-Taxonomic Sound Identification in the Middle Magdalena, Colombia</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Juan Sebastián Cañas</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff7">7</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefan Kahl</string-name>
          <email>stefan.kahl@cornell.edu</email>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff8">8</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tom Denton</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria Paula Toro-Gómez</string-name>
          <xref ref-type="aff" rid="aff7">7</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Susana Rodriguez-Buritica</string-name>
          <xref ref-type="aff" rid="aff7">7</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jose Luis Benavides-Lopez</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juan Sebastián Ulloa</string-name>
          <xref ref-type="aff" rid="aff7">7</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paula Caycedo-Rosales</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Holger Klinck</string-name>
          <xref ref-type="aff" rid="aff8">8</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hervé Goëau</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Willem-Pier Vellinga</string-name>
          <xref ref-type="aff" rid="aff9">9</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Robert Planqué</string-name>
          <xref ref-type="aff" rid="aff9">9</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexis Joly</string-name>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CIRAD, UMR AMAP</institution>
          ,
          <addr-line>Montpellier</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Centre for Biodiversity and Environment Research, University College London</institution>
          ,
          <addr-line>London WC1E 6BT</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Chemnitz University of Technology</institution>
          ,
          <addr-line>Chemnitz</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Departamento de Ciencias Biológicas, Universidad de los Andes</institution>
          ,
          <addr-line>Bogotá</addr-line>
          ,
          <country country="CO">Colombia</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Fundación Biodiversa Colombia</institution>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>Google DeepMind</institution>
          ,
          <addr-line>San Francisco</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff6">
          <label>6</label>
          <institution>Inria, LIRMM, University of Montpellier</institution>
          ,
          <addr-line>CNRS, Montpellier</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff7">
          <label>7</label>
          <institution>Instituto de Investigación de Recursos Biológicos Alexander von Humboldt</institution>
          ,
          <addr-line>Bogotá</addr-line>
          ,
          <country country="CO">Colombia</country>
        </aff>
        <aff id="aff8">
          <label>8</label>
          <institution>K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University</institution>
          ,
          <addr-line>Ithaca</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff9">
          <label>9</label>
          <institution>Xeno-canto Foundation</institution>
          ,
          <addr-line>Groningen</addr-line>
          ,
          <country country="NL">Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The BirdCLEF+ 2025 challenge focused on the simultaneous acoustic identification of birds, amphibians, mammals and insects in the Middle Magdalena Valley, a biodiversity hotspot in Colombia. This edition aimed to advance passive acoustic monitoring by tasking participants with developing reliable systems for detecting and identifying multi-taxonomic vocalizations from extensive soundscape recordings. Using training data provided by museum collections, citizen science projects and new unlabeled soundscapes, participants addressed the challenge of out-of-distribution generalization under field conditions and limited training data for many species. Participants used data augmentation, pseudo-labeling, and self-training to enhance model robustness and accuracy, often refining pseudo-labels iteratively. To improve scores and runtime efficiency, teams commonly employed Test-Time Augmentation, ensemble methods, and optimized inference. Sound Event Detection and CNN-based models dominated, frequently pretrained on external datasets. The highest-scoring submission achieved an ROC-AUC score of 0.930 on the private leaderboard (0.933 on the public leaderboard), with the top 10 systems differing by only 0.9% in their scores.</p>
      </abstract>
      <kwd-group>
        <kwd>LifeCLEF</kwd>
        <kwd>insect</kwd>
        <kwd>amphibian</kwd>
        <kwd>mammal</kwd>
        <kwd>bird</kwd>
        <kwd>song</kwd>
        <kwd>call</kwd>
        <kwd>species</kwd>
        <kwd>retrieval</kwd>
        <kwd>audio</kwd>
        <kwd>collection</kwd>
        <kwd>identification</kwd>
        <kwd>fine-grained classification</kwd>
        <kwd>evaluation</kwd>
        <kwd>benchmark</kwd>
        <kwd>bioacoustics</kwd>
        <kwd>passive acoustic monitoring</kwd>
        <kwd>PAM</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Some of the world’s most biodiversity-rich regions are also those where socioeconomic conflicts run
deepest [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. These areas often lack robust environmental governance, which heightens the tension
between conservation and economic exploitation. This institutional fragility exacerbates pressures on
ecosystems, undermining both ecological integrity and community well-being. One such region is the
Middle Magdalena Valley in Colombia, one of the world’s most biodiverse areas, yet it is undergoing
rapid land-use intensification [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        The Middle Magdalena Valley is a vital habitat for numerous taxonomic groups, including mammals,
amphibians, birds, and insects [
        <xref ref-type="bibr" rid="ref3 ref4 ref5 ref6 ref7 ref8">3, 4, 5, 6, 7, 8</xref>
        ], thriving in remarkable ecosystems such as humid tropical
lowland forests and extensive wetlands. However, economic development in the region—driven by
cattle ranching, mineral extraction, and oil palm cultivation—is severely impacting biodiversity and
diminishing Nature’s Contributions to People (NCP) [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ], including water quality and regulation,
soil fertility, and carbon sequestration [
        <xref ref-type="bibr" rid="ref11 ref12 ref13 ref14">11, 12, 13, 14</xref>
        ]. Therefore, it is essential to design and deploy
practical biodiversity diagnostic tools to assess environmental dynamics. In doing so, the community
and decision-makers will be better equipped to implement informed, timely strategies that harmonize
human development with ecosystem resilience.
      </p>
      <p>
        In this context, robust biodiversity monitoring is fundamental for rapidly and effectively assessing
ecosystem health. For instance, precisely measuring the impact of restoration activities is crucial for
identifying optimal treatments that lead to desired ecological outcomes [
        <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
        ]. Acoustic data emerges
as a powerful ecological signal for this purpose. Specifically, passive acoustic monitoring (PAM) [
        <xref ref-type="bibr" rid="ref17 ref18">17, 18</xref>
        ],
combined with deep learning models [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] for data analysis, offers a promising approach to assess the
efficacy of interventions and track long-term ecological changes. While several studies have explored
using sound to evaluate restoration, these approaches primarily rely on the presence of one taxonomic
group, such as birds or insects [
        <xref ref-type="bibr" rid="ref20 ref21">20, 21</xref>
        ], as a proxy for overall diversity.
      </p>
      <p>
        However, studying entanglement patterns between taxonomic groups could significantly advance our
understanding of complex ecological processes, as some studies have shown when examining patterns
of presence and absence of different taxonomic groups in the tropical forest [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. A crucial step for
generating time series of species presence and absence required for such analysis is the construction of
highly curated datasets. These datasets are essential for training and testing deep learning models before
their broad application to PAM data. Previous works have curated datasets for birds [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], insects [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ],
amphibians [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], and mammals [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. While recent work has merged different data sources to
analyze multi-taxonomic approaches in bioacoustics [
        <xref ref-type="bibr" rid="ref27 ref28">27, 28</xref>
        ], none has yet considered the co-existence
of multiple taxonomic groups in the same soundscape. Such soundscapes are especially rich in the Neotropics
(Figure 1), where co-occurrence, overlap, and different levels of activity are present in acoustic space [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ].
      </p>
      <p>
        Despite previous work on datasets and automatic species identifiers, there are no PAM datasets that
account for the ubiquitous multi-taxonomic nature of soundscapes. Furthermore, there are no multi-taxonomic
automatic models, as in the case of MegaDetector in camera trapping [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ], that can be used as a backbone
for different applications in PAM. To address both challenges, we present:
• The ESMT (El Silencio Multi-Taxonomic) dataset, composed of two parts: 1) 770 strongly labeled
soundscapes representing 15k bounding boxes of 4 taxonomic groups vocalizing simultaneously in
the Middle Magdalena Valley and 2) 11,340 unlabeled soundscapes from the same region.
• BirdCLEF+ 2025 (Bird Recognition Challenge), an integral part of LifeCLEF 2025 [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ], which tasked
participants with identifying bird, mammal, insect, and amphibian calls within soundscapes
from the Middle Magdalena Valley. The competition ended with 9,829 registrations and 2,757
participants on 2,161 teams. We had 76,381 submissions from 86 countries.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. El Silencio Multi-Taxonomic Dataset</title>
      <p>In this section, we describe the construction of the dataset that we release in this work: the ESMT dataset.
First, we selected a dedicated subset for strong-label annotation. Next, we created bounding boxes
over frequently observed sonotypes across various taxa. Finally, we assigned a taxonomic identification to
the sonotypes. In addition, we selected unlabeled soundscapes for further exploration. The dataset
is publicly available at https://github.com/redecoacustica/elsilencio-dataset/.</p>
      <sec id="sec-2-1">
        <title>2.1. Data collection</title>
        <p>
          We deployed recorders across the Middle Magdalena River Valley, around the forests of the Barbacoas
wetlands (Figure 2). We used a stratified sampling design across properties and areas with contrasting
compositions of forest and pasture. We deployed Audiomoth v1.2.0 [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ] passive recorders during March
and August 2023. Recorders were placed 1.5 m above the ground and programmed to capture one minute of
sound every five minutes with a sampling rate of 48 kHz.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Labeled soundscapes</title>
        <p>Selection: Seven sites with different forest compositions were selected. A random subset of recordings
within the 5-7 AM and 5-7 PM time frames was selected for annotation (110 per site). This resulted in a
total of 770 recordings, amounting to 12.8 hours.</p>
        <p>Annotation: We randomly selected 10 files from each site and listened to them to identify common sonotypes. We focused on the
most frequent and stereotypical sonotypes to decrease workload. Two expert annotators were in charge
of creating strong labels over the entire soundscape. One expert checked all the files for birds
and mammals (PCR), and the other expert annotated insects and amphibians (MPTG).</p>
        <p>Taxonomic identification: Birds and mammals were readily identified using previous works,
expertise, audio repositories and a species list provided by the System of Biodiversity from Colombia
(SiB Colombia). Amphibians were identified using a similar route but with some additional confirmation
among herpetologists.</p>
        <sec id="sec-2-2-1">
          <title>1https://github.com/redecoacustica/elsilencio-dataset/</title>
          <p>
            However, the hardest identification process was for insects. After an iterative
process [
            <xref ref-type="bibr" rid="ref33">33</xref>
            ] that included field work in the reserve and intensive manual verification led by an expert
entomologist (JLBL) with the Collection of Environmental Sounds (CSA) in Colombia [
            <xref ref-type="bibr" rid="ref34">34</xref>
            ], we identified
a subset of the insect sonotypes at the family level. Infrequent sonotypes were not identified.
          </p>
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Unlabeled soundscapes</title>
        <p>From the final 534,420 audio files collected, we randomly selected 11,340 unlabeled soundscapes. We
chose that specific quantity to keep the total size of the dataset below 50 GB. These files correspond
to 63 sites (180 files per site) and cover all possible hours and days of the collection period. We release the unlabeled
soundscapes to encourage exploration of algorithmic approaches that use unlabeled data to improve species
identification models.</p>
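        <p>The figures quoted for the recording schedule and the two subsets can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and the daily file count assumes a recorder that runs uninterrupted for 24 hours, which the text does not state.</p>
        <preformat><![CDATA[
```python
# Recording schedule: one 1-minute file every 5 minutes.
# Assumption: the recorder runs for a full 24 hours without gaps.
MINUTES_PER_DAY = 24 * 60
files_per_recorder_per_day = MINUTES_PER_DAY // 5

# Labeled subset: 7 sites x 110 one-minute recordings per site.
labeled_files = 7 * 110
labeled_hours = labeled_files / 60

# Unlabeled subset: 63 sites x 180 one-minute files per site.
unlabeled_files = 63 * 180
unlabeled_hours = unlabeled_files / 60

# Fraction of the 534,420 collected files that the unlabeled subset covers.
unlabeled_share = unlabeled_files / 534_420

print(f"files per recorder per day: {files_per_recorder_per_day}")
print(f"labeled: {labeled_files} files, {labeled_hours:.1f} h")
print(f"unlabeled: {unlabeled_files} files, {unlabeled_hours:.1f} h")
print(f"unlabeled share of all files: {unlabeled_share:.2%}")
```
]]></preformat>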
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. BirdCLEF+ 2025 Competition Overview</title>
      <p>Mobile and habitat-diverse species serve as valuable indicators of biodiversity change, as shifts in their
assemblages and population dynamics can signal the success or failure of ecological restoration efforts.
These species often respond rapidly to environmental changes, making them particularly useful for
detecting early signs of ecological improvement or degradation. However, traditional observer-based
biodiversity surveys across large areas are both costly and logistically demanding, often requiring
extensive fieldwork, expertise, and repeated visits to remote locations, challenges that limit the frequency
and scale of monitoring. In contrast, passive acoustic monitoring (PAM), combined with AI, offers a
scalable and non-invasive solution that enables conservationists to collect and analyze vast amounts of
ecological data with minimal human presence. PAM systems can operate continuously over extended
periods and in challenging environments, capturing the vocal activity of a wide range of taxa, including
birds, amphibians, and insects. When paired with automated species identification, it enables researchers
to monitor biodiversity across broad spatial and temporal scales, allowing more timely and data-driven
reviews of restoration outcomes.</p>
      <sec id="sec-3-1">
        <title>3.1. Goal/Task</title>
        <p>This competition aimed to advance automated species identification in soundscape data from the
Middle Magdalena Valley of Colombia, including the El Silencio Natural Reserve. Key objectives include
detecting species across diverse taxonomic groups, developing machine learning models capable of
recognizing rare and endangered species from limited training data, and leveraging unlabeled data to
improve detection and classification performance.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Evaluation protocol</title>
        <p>
          The challenge was hosted on Kaggle, following a similar evaluation setup as in previous years [
          <xref ref-type="bibr" rid="ref35">35</xref>
          ],
with hidden test data and a code competition format. We used a variant of macro-averaged ROC-AUC
as the evaluation metric, excluding classes with no true positive labels, allowing us to assess model
performance without relying on confidence threshold tuning and emphasizing species-level rather than
segment-level accuracy [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ]. Participants were asked to identify species in short, 5-second audio clips
extracted from labeled soundscape recordings, a length chosen to balance signal clarity with adequate
context. The dataset was kept under 50 GB to ensure accessibility and ease of use. To further support
participants, we provided starter code and documentation to help newcomers get started quickly.
        </p>
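        <p>The metric described above can be sketched in plain Python. This is a minimal illustration and not the official Kaggle scoring code: the function names are hypothetical, the per-class AUC uses the rank-sum (Mann-Whitney) formulation, and the sketch also skips a class whose labels are all positive, since the AUC is undefined there too.</p>
        <preformat><![CDATA[
```python
def roc_auc(labels, scores):
    """Binary ROC-AUC as the fraction of correctly ordered (positive, negative) pairs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined when only one class is present
    # A pair counts 1 if the positive scores higher, 0.5 on a tie, 0 otherwise.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc_skip_empty(y_true, y_score):
    """Mean of per-class AUCs, ignoring classes for which the AUC is undefined."""
    n_classes = len(y_true[0])
    per_class = []
    for c in range(n_classes):
        labels = [row[c] for row in y_true]
        scores = [row[c] for row in y_score]
        auc = roc_auc(labels, scores)
        if auc is not None:
            per_class.append(auc)
    return sum(per_class) / len(per_class)


# Toy example: 4 segments x 3 classes; the third class has no positives and is skipped.
y_true = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 0]]
y_score = [[0.9, 0.2, 0.1], [0.3, 0.8, 0.4], [0.6, 0.1, 0.2], [0.7, 0.3, 0.9]]
score = macro_auc_skip_empty(y_true, y_score)
print(score)
```
]]></preformat>
        <p>Because each class contributes equally to the mean, a rare species weighs as much as a common one, which is consistent with the stated emphasis on species-level rather than segment-level accuracy.</p>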
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Time limits</title>
        <p>Competitors were limited to 90 minutes of inference time on a CPU. This ensures that models are
cost-effective for real-world usage. A side effect is that it reduces the impact of ensembling, a common Kaggle
tactic that can obscure underlying model quality.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Dataset for the competition</title>
        <p>
          Building on lessons from previous editions, we refined the task to encourage participants to design
models tailored to the unique challenges of the competition. Training and test data were carefully
selected to reflect a range of bird and non-bird taxa 2, supporting this goal. As in past years, Xeno-canto
[
          <xref ref-type="bibr" rid="ref37">37</xref>
          ] remained the main source of training data, complemented by expertly annotated soundscape
recordings for testing. This year, we expanded the dataset to include contributions from iNaturalist
[
          <xref ref-type="bibr" rid="ref38">38</xref>
          ] and the Collection of Environmental Sounds (CSA) of the Humboldt Institute [
          <xref ref-type="bibr" rid="ref34 ref39">34, 39</xref>
          ], with a
focus on underrepresented species, those ecologically important but difficult to detect due to rarity or
elusive behavior. The training dataset included commonly occurring species identified via eBird and
iNaturalist observation data, supporting the development of robust models in cases where the target
species composition is unknown. As a result, some species were present in the training data but absent
from the test data, while still being representative of the target region.
        </p>
        <p>Test data sources were selected to capture a broad range of acoustic environments, incorporating
differences in call density, background noise, and recording formats (mono and stereo). Species labels
were excluded when fewer than five training recordings were available or when species identification
could not be confirmed with certainty. Unlabeled training data, designed to resemble the test set, were
also included to encourage exploration of semi-supervised and self-supervised learning techniques. In
total, the dataset consisted of more than 38,000 labeled training recordings covering 206 species, along
with 705 one-minute soundscape recordings for testing and evaluation.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results of BirdCLEF+ 2025</title>
      <p>A total of 2,025 teams with 2,569 competitors participated in the BirdCLEF+ 2025 competition,
submitting a total of 70,674 runs. As in recent years, two-thirds of the test data was allocated to
the private leaderboard and one-third to the public leaderboard. Based on the ROC-AUC metric,
the baseline score was 0.5, with random confidence scores for all birds across all segments. The
highest-scoring submission achieved 0.930 (0.933 on the public leaderboard), with the top 10 systems
differing by only 0.9% in their scores. The top 25 participant scores were above 0.905 (Figure 3).</p>
      <sec id="sec-4-1">
        <title>2We therefore renamed the competition from BirdCLEF to BirdCLEF+</title>
        <p>The Insecta class presented the most significant challenge in the competition, registering a mean
ROC-AUC of 0.667 ± 0.113 across its three considered classes (Figure 4a), which consistently appeared
at the lower end of the per-species ranking (Figure 4b). Following this, the Amphibia class achieved a
ROC-AUC of 0.840 ± 0.145, notably exhibiting the highest standard deviation, which is also evident in
its broad distribution across the per-species ranking. For the dominant Aves class, the mean ROC-AUC
was 0.936 ± 0.0809; while some avian species showed minimal performance differences among top
participants, others were found lower in the ranking with considerable variation between competitors
(Figure 4b). In contrast, the Mammalia class, represented by Alouatta seniculus, demonstrated high
performance with a ROC-AUC of 0.983 ± 0.020 and a low standard deviation, occupying the upper
part of the ranking.</p>
        <sec id="sec-4-1-1">
          <title>4.1. Online write-up</title>
          <p>Across submissions, several common strategies emerged in participants’ online write-ups3. Data
augmentation played a central role, with techniques such as Mixup, Cutmix, Sumix, Frequency and
Time Masking, Gain adjustments, Resampling, and FilterAugment widely used. Some teams also
introduced external noise, including human speech, to improve model robustness. Undersampled
species were typically addressed through upsampling, while pseudo-labeling and self-training
on the unlabeled soundscape data proved key for boosting accuracy. These strategies often
involved generating pseudo-labels from preliminary models, applying transformations (e.g., power
scaling, filtering low-confidence predictions), and iteratively refining the labels. Weighting more
confident pseudo-labeled examples more heavily during training also contributed to improved outcomes.</p>
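          <p>One round of the pseudo-label refinement described above might look like the following sketch. It is not taken from any team's solution: the function name, the exponent of 2.0 and the 0.5 confidence threshold are assumptions chosen purely for illustration.</p>
          <preformat><![CDATA[
```python
def refine_pseudo_labels(probs, power=2.0, min_conf=0.5):
    """Filter and sharpen the per-class probabilities predicted for one segment.

    Returns None when the segment is discarded, otherwise a tuple of
    (sharpened soft labels, sample weight for the training loss).
    The defaults for `power` and `min_conf` are hypothetical.
    """
    top = max(probs)
    # Filtering: drop segments on which the preliminary model is not confident.
    if top < min_conf:
        return None
    # Power scaling: raising probabilities to a power above 1 shrinks small
    # values much more than large ones, suppressing weak predictions.
    sharpened = [p ** power for p in probs]
    # Weighting: more confident segments contribute more during training.
    return sharpened, top


# Toy predictions from a preliminary model for three unlabeled segments.
predictions = [[0.9, 0.2, 0.1], [0.3, 0.2, 0.1], [0.6, 0.6, 0.05]]
kept = [r for r in map(refine_pseudo_labels, predictions) if r is not None]
print(len(kept), "of", len(predictions), "segments kept")
```
]]></preformat>
          <p>In an iterative scheme, a model trained on the kept segments would then produce the predictions for the next round.</p>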
          <p>For inference, teams commonly employed Test-Time Augmentation (TTA) by processing overlapping
audio segments and smoothing predictions over time, sometimes with delta shifts. Post-processing
steps, such as adjusting prediction confidence, applying power-based scaling, or calibrating outputs,
were used to further refine model predictions. Ensemble methods, including blending models from
different training folds or checkpoints, were instrumental in boosting final scores. To meet runtime
constraints, many participants optimized inference speed using tools like ONNX, OpenVINO, and
multiprocessing.</p>
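          <p>The overlapping-window and temporal-smoothing ideas can be illustrated with a short sketch. The 2.5-second hop and the (0.25, 0.5, 0.25) kernel are assumptions made for the example, not values reported by participants, and both function names are hypothetical.</p>
          <preformat><![CDATA[
```python
def window_starts(duration_s, window_s=5.0, hop_s=2.5):
    """Start times of overlapping analysis windows that fit inside the recording."""
    starts = []
    t = 0.0
    while t + window_s <= duration_s:
        starts.append(t)
        t += hop_s
    return starts


def smooth_over_time(preds, weights=(0.25, 0.5, 0.25)):
    """Weighted moving average of per-class scores across consecutive segments.

    `preds` holds one list of class scores per segment. At the edges the
    kernel is truncated and renormalized by the weights actually used.
    """
    n = len(preds)
    half = len(weights) // 2
    smoothed = []
    for t in range(n):
        acc = [0.0] * len(preds[0])
        total = 0.0
        for k, w in enumerate(weights):
            j = t + k - half
            if 0 <= j < n:
                total += w
                for c, p in enumerate(preds[j]):
                    acc[c] += w * p
        smoothed.append([a / total for a in acc])
    return smoothed


starts = window_starts(60.0)           # windows over a one-minute soundscape
spiky = [[0.0], [1.0], [0.0]]          # an isolated detection in the middle segment
print(len(starts), smooth_over_time(spiky))
```
]]></preformat>
          <p>Smoothing spreads an isolated high score to its neighbours and damps it, reflecting the expectation that a vocalizing animal is often audible in adjacent segments as well.</p>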
          <p>The dominant modeling approach was Sound Event Detection (SED), often enhanced with
dedicated SED heads. CNN-based models were also widely used, sometimes in hybrid combinations
with SED components. EfficientNet backbones were especially popular, though alternatives like
RegNet and NFNet also saw successful implementations. Some teams trained separate models for
taxonomic subgroups (e.g., Amphibia, Insecta), incorporating additional external datasets to improve
representation. Input features were typically log-transformed Mel spectrograms, with variation in
the number of mel bins, hop sizes, and frequency ranges. A variety of loss functions were explored,
including Cross Entropy, BCE With Logits Loss, and Focal Loss variants, with some evidence suggesting
Focal or Cross Entropy loss could offer marginal improvements with appropriate tuning. Pretraining
model backbones on large external datasets such as Xeno-canto prior to fine-tuning on the competition
data significantly boosted early performance.</p>
          <p>3 Individual write-ups can be accessed via the "Solution" icon on the leaderboard: https://www.kaggle.com/c/birdclef-2025/leaderboard</p>
        </sec>
        <sec id="sec-4-1-2">
          <title>4.2. Working notes</title>
          <p>
            We accepted four working notes for the proceedings, which document the approaches and methodologies
used by individual teams:
Tan &amp; Wang [
            <xref ref-type="bibr" rid="ref40">40</xref>
            ]: The authors developed an end-to-end classification model that uses two parallel
input branches (Dual Branch Network) to process Mel-spectrogram and MFCC features, respectively.
MFCCs are fed into a ResNet50 pretrained on ImageNet, while Mel features are passed through a
randomly initialized ConvNeXt-v2. The feature representations from both branches are fused late
in the pipeline to produce final species predictions. The study evaluates different combinations of
pretrained and randomly initialized backbones, with a focus on understanding how complementary
audio representations and model initialization strategies affect classification performance.
Gokulnath et al. [
            <xref ref-type="bibr" rid="ref41">41</xref>
            ]: Adopting a modular approach, this team frames bird species identification as a
set of binary classification tasks—one per species. Rather than using a multi-label model, the authors
treat the task as 206 independent detection problems, enabling species-specific data augmentation,
threshold tuning, and diagnostics. Extensive cross-validation and performance visualizations help
analyze which species benefit most from augmentation. The authors argue that this modular design
simplifies model interpretation, allows fine-grained tuning, and reduces the complexity of the output
layer.
          </p>
          <p>Sydorskyi &amp; Gonçalves [42]: This team employs an ensemble strategy using lightweight CNN
architectures (specifically EfficientNetV2-S and NFNet-L0) trained independently on log-Mel spectrograms
generated from 5-second audio segments. Augmentations such as MixUp and SpecAugment
were applied during training. The final predictions are computed by averaging the softmax outputs
of 15 different models, leveraging complementary strengths of the individual learners. Ensembling
improved prediction accuracy without introducing substantial computational complexity, making it
suitable for the competition despite the runtime constraint.</p>
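          <p>Averaging softmax outputs across ensemble members can be sketched as follows. The code is a generic illustration with hypothetical function names and toy logits, not the team's implementation.</p>
          <preformat><![CDATA[
```python
import math


def softmax(logits):
    """Numerically stable softmax over one model's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_average(per_model_logits):
    """Unweighted mean of the softmax outputs of several models for one segment."""
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_models for c in range(n_classes)]


# Two toy models, two classes: one model is undecided, the other favours class 0.
logits = [[0.0, 0.0], [math.log(3.0), 0.0]]
averaged = ensemble_average(logits)
print(averaged)
```
]]></preformat>
          <p>Averaging probabilities rather than logits keeps every member's contribution bounded, so a single overconfident model cannot dominate the result.</p>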
          <p>Miyaguchi et al. [43]: This submission presents a token-based classification pipeline that transforms
MFCC features into discrete tokens. MFCCs are clustered into 256 discrete tokens using k-means,
forming sequences analogous to text. A Word2Vec model is trained on these sequences to learn
embeddings, which are then fed into a compact transformer model (the “student”) trained to match the
outputs of a CNN-based classifier (the “teacher”) using KL divergence. This approach results in a model
that retains competitive classification performance but is fast enough to process the entire test set in
under 5 minutes on CPU.</p>
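          <p>Two ingredients of this pipeline, assigning feature frames to k-means tokens and scoring a student against a teacher with KL divergence, can be sketched in plain Python. Several assumptions are made here: the centroids are taken as already fitted, the outputs are treated as probability distributions, and the divergence is computed in the direction KL(teacher || student), which the summary above does not specify. All names are hypothetical.</p>
          <preformat><![CDATA[
```python
import math


def tokenize(frames, centroids):
    """Map each feature frame to the index of its nearest centroid (its token)."""
    tokens = []
    for frame in frames:
        dists = [sum((f - c) ** 2 for f, c in zip(frame, centroid))
                 for centroid in centroids]
        tokens.append(dists.index(min(dists)))
    return tokens


def kl_divergence(teacher, student, eps=1e-12):
    """KL(teacher || student) between two discrete probability distributions."""
    return sum(t * math.log((t + eps) / (s + eps))
               for t, s in zip(teacher, student) if t > 0)


# Toy example with 2 centroids; the pipeline described above uses 256.
centroids = [[0.0, 0.0], [1.0, 1.0]]
frames = [[0.1, 0.0], [0.9, 1.2], [0.4, 0.3]]
tokens = tokenize(frames, centroids)

teacher_probs = [0.5, 0.5]
student_probs = [0.9, 0.1]
loss = kl_divergence(teacher_probs, student_probs)
print(tokens, round(loss, 4))
```
]]></preformat>
          <p>The loss is zero only when the student reproduces the teacher's distribution, so minimizing it pushes the compact model toward the CNN's predictions.</p>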
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions and Lessons Learned</title>
      <p>The BirdCLEF+ 2025 competition showcased remarkable progress in acoustic species identification,
drawing 2,025 teams who submitted an impressive 70,674 runs. The top systems achieved exceptional
results, with the leading entry hitting a ROC-AUC of 0.930 (0.933 on the public leaderboard) and the top
25 participants consistently scoring above 0.905. This widespread participation and strong performance
underscore the significant advancements in bioacoustics species identification.</p>
      <p>Participants used several strategies to achieve these results. Key techniques included extensive
data augmentation (e.g., Mixup, masking, external noise), upsampling for undersampled species, and
crucial pseudo-labeling and self-training on unlabeled data to enhance performance. During inference,
Test-Time Augmentation (TTA) and post-processing refined predictions, while ensemble methods
further boosted scores. Runtime optimization was also a focus, often through tools like ONNX. The
predominant modeling approach was Sound Event Detection (SED), frequently integrated with CNNs
(e.g., EfficientNet backbones), with pretraining on large external datasets proving especially effective.</p>
      <p>Despite these impressive overall results, a deeper taxonomic analysis revealed persistent challenges.
Groups like Insects and Amphibians remain difficult to identify, primarily due to the limited availability
of data for these species and taxonomic uncertainty. Furthermore, not all bird species were equally easy
to classify, with some showing considerable performance variation among top competitors. Future work
should focus on new datasets for these groups and investigate which acoustic characteristics are the
strongest determinants of these performance disparities to inform more robust identification models.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>Compiling the dataset for this competition involved many people and institutions. We thank everyone
who contributed to recording, annotating, and processing this year’s data. We thank Earth Species
Project, Experiment.com and Footprint Coalition under a Science Engine grant AI for Interspecies
Communication for the initial grant that allowed the starting of the building of the ESMT dataset. We
also want to thank Kaggle for hosting the competition, with special thanks to Maggie Demkin and
Sohier Dane for their support in reviewing the dataset and setting up the competition. We are grateful
to Google for sponsoring the prize money. Lastly, we thank all participants for sharing their code bases
and write-ups with the Kaggle community.</p>
      <sec id="sec-6-1">
        <title>All results, code notebooks, and forum posts are publicly available at: https://www.kaggle.com/c/birdclef-2025</title>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used LanguageTool and Gemini for grammar and
spelling checks. After using these tools, the authors reviewed and edited the content as
needed and take full responsibility for the publication’s content.</p>
      <p>2025: Conference and Labs of the Evaluation Forum, September 09–12, 2025, Madrid, Spain, 2025.</p>
      <p>[42] V. Sydorskyi, F. Gonçalves, Tackling Domain Shift in Bird Audio Classification via Transfer
Learning and Semi-Supervised Distillation: A Case Study on BirdCLEF+ 2025, in: CLEF Working
Notes 2025, CLEF 2025: Conference and Labs of the Evaluation Forum, September 09–12, 2025,
Madrid, Spain, 2025.</p>
      <p>[43] A. Miyaguchi, M. Gustineli, A. Cheung, Distilling Spectrograms into Tokens: Fast and Lightweight
Bioacoustic Classification for BirdCLEF+ 2025, in: CLEF Working Notes 2025, CLEF 2025:
Conference and Labs of the Evaluation Forum, September 09–12, 2025, Madrid, Spain, 2025.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>B.</given-names>
            <surname>Vira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kontoleon</surname>
          </string-name>
          ,
          <article-title>Dependence of the poor on biodiversity: which poor, what biodiversity?</article-title>
          ,
          <source>Biodiversity conservation and poverty alleviation: Exploring the evidence for a link</source>
          (
          <year>2012</year>
          )
          <fpage>52</fpage>
          -
          <lpage>84</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Forero-Medina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Joppa</surname>
          </string-name>
          ,
          <article-title>Representation of global and national conservation priorities by colombia's protected area network</article-title>
          ,
          <source>PLoS One</source>
          <volume>5</volume>
          (
          <year>2010</year>
          )
          <fpage>e13210</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>F.</given-names>
            <surname>Vargas-Salinas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Aponte-Gutiérrez</surname>
          </string-name>
          ,
          <article-title>Diversidad y recambio de especies de anfibios y reptiles entre coberturas vegetales en una localidad del valle del magdalena medio, departamento de antioquia, colombia</article-title>
          ,
          <source>Biota colombiana</source>
          <volume>17</volume>
          (
          <year>2016</year>
          )
          <fpage>117</fpage>
          -
          <lpage>137</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>N.</given-names>
            <surname>Reyes-Amaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lozáno-Flórez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Flores</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Solari</surname>
          </string-name>
          ,
          <article-title>Distribution of the spix's disk-winged bat, thyroptera tricolor spix, 1823 (chiroptera: Thyropteridae) in colombia, with first records for the middle magdalena valley</article-title>
          ,
          <source>Mastozoología neotropical</source>
          <volume>23</volume>
          (
          <year>2016</year>
          )
          <fpage>127</fpage>
          -
          <lpage>137</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>W. A.</given-names>
            <surname>Valencia-Montoya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tuberquia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Guzmán</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cardona-Duque</surname>
          </string-name>
          ,
          <article-title>Pollination of the cycad zamia incognita a. lindstr. &amp; idárraga by pharaxonotha beetles in the magdalena medio valley, colombia: a mutualism dependent on a specific pollinator and its significance for conservation</article-title>
          ,
          <source>Arthropod-Plant Interactions</source>
          <volume>11</volume>
          (
          <year>2017</year>
          )
          <fpage>717</fpage>
          -
          <lpage>729</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Achury</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Suarez</surname>
          </string-name>
          ,
          <article-title>Richness and composition of ground-dwelling ants in tropical rainforest and surrounding landscapes in the colombian inter-andean valley</article-title>
          ,
          <source>Neotropical Entomology</source>
          <volume>47</volume>
          (
          <year>2018</year>
          )
          <fpage>731</fpage>
          -
          <lpage>741</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>E.</given-names>
            <surname>Arbeláez-Cortés</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Villamizar-Escalante</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Trujillo-Arias</surname>
          </string-name>
          ,
          <article-title>New voucher specimens and tissue samples from an avifaunal survey of the middle magdalena valley of bolívar, colombia, bridge geographical and temporal gaps</article-title>
          ,
          <source>The Wilson Journal of Ornithology</source>
          <volume>132</volume>
          (
          <year>2020</year>
          )
          <fpage>773</fpage>
          -
          <lpage>779</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>H. E.</given-names>
            <surname>Ramírez-Chaves</surname>
          </string-name>
          , et al.,
          <source>Mamíferos de Colombia</source>
          , v1.14, Sociedad Colombiana de Mastozoología, https://doi.org/10.15472/kl1whs,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Etter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>McAlpine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Possingham</surname>
          </string-name>
          ,
          <article-title>Historical patterns and drivers of landscape change in colombia since 1500: a regionalized spatial approach</article-title>
          ,
          <source>Annals of the Association of American Geographers</source>
          <volume>98</volume>
          (
          <year>2008</year>
          )
          <fpage>2</fpage>
          -
          <lpage>23</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C. A. C.</given-names>
            <surname>Ayram</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Etter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Díaz-Timoté</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Buriticá</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Ramírez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Corzo</surname>
          </string-name>
          ,
          <article-title>Spatiotemporal evaluation of the human footprint in colombia: Four decades of anthropic impact in highly biodiverse ecosystems</article-title>
          ,
          <source>Ecological Indicators</source>
          <volume>117</volume>
          (
          <year>2020</year>
          )
          <fpage>106630</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Molano</surname>
          </string-name>
          ,
          <source>En medio del Magdalena Medio</source>
          , Centro de Investigación y Educación Popular,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>L.</given-names>
            <surname>Potter</surname>
          </string-name>
          ,
          <article-title>Colombia's oil palm development in times of war and 'peace': Myths, enablers and the disparate realities of land control</article-title>
          ,
          <source>Journal of rural studies</source>
          <volume>78</volume>
          (
          <year>2020</year>
          )
          <fpage>491</fpage>
          -
          <lpage>502</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Salgado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Shurin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. I.</given-names>
            <surname>Vélez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Link</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lopera-Congote</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>González-Arango</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Jaramillo</surname>
          </string-name>
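          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Åhlén</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>De Luna</surname>
          </string-name>
          ,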
          <article-title>Causes and consequences of recent degradation of the magdalena river basin, colombia</article-title>
          ,
          <source>Limnology and Oceanography Letters</source>
          <volume>7</volume>
          (
          <year>2022</year>
          )
          <fpage>451</fpage>
          -
          <lpage>465</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>B.</given-names>
            <surname>Lora-Ariza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Piña</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. D.</given-names>
            <surname>Donado</surname>
          </string-name>
          ,
          <article-title>Assessment of groundwater quality for human consumption and its health risks in the middle magdalena valley, colombia</article-title>
          ,
          <source>Scientific Reports</source>
          <volume>14</volume>
          (
          <year>2024</year>
          )
          <fpage>11346</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>T.-A. Natalia</surname>
          </string-name>
          , et al.,
          <article-title>Role of a campesine reserve zone in the magdalena valley (colombia) in the conservation of endangered tropical rainforests</article-title>
          ,
          <source>Nature Conservation Research</source>
          <volume>8</volume>
          (
          <year>2023</year>
          )
          <fpage>49</fpage>
          -
          <lpage>63</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>P. H.</given-names>
            <surname>Brancalion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. H.</given-names>
            <surname>Joyce</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Antonelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. D.</given-names>
            <surname>Holl</surname>
          </string-name>
          ,
          <article-title>Moving biodiversity from an afterthought to a key outcome of forest restoration</article-title>
          ,
          <source>Nature Reviews Biodiversity</source>
          (
          <year>2025</year>
          )
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>L. S. M.</given-names>
            <surname>Sugai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. S. F.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Ribeiro Jr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Llusia</surname>
          </string-name>
          ,
          <article-title>Terrestrial passive acoustic monitoring: review and perspectives</article-title>
          ,
          <source>BioScience</source>
          <volume>69</volume>
          (
          <year>2019</year>
          )
          <fpage>15</fpage>
          -
          <lpage>25</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>R.</given-names>
            <surname>Gibb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Browning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Glover-Kapfer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. E.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <article-title>Emerging opportunities and challenges for passive acoustics in ecological assessment and monitoring</article-title>
          ,
          <source>Methods in Ecology and Evolution</source>
          <volume>10</volume>
          (
          <year>2019</year>
          )
          <fpage>169</fpage>
          -
          <lpage>185</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>D.</given-names>
            <surname>Stowell</surname>
          </string-name>
          ,
          <article-title>Computational bioacoustics with deep learning: a review and roadmap</article-title>
          ,
          <source>PeerJ</source>
          <volume>10</volume>
          (
          <year>2022</year>
          )
          <fpage>e13152</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>J.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Mitesser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. M.</given-names>
            <surname>Schaefer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Seibold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Busse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kriegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Rabl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Gelis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Arteaga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Freile</surname>
          </string-name>
          , et al.,
          <article-title>Soundscapes and deep learning enable tracking biodiversity recovery in tropical forests</article-title>
          ,
          <source>Nature communications</source>
          <volume>14</volume>
          (
          <year>2023</year>
          )
          <fpage>6191</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Do Nascimento</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pérez-Granados</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. B. R.</given-names>
            <surname>Alencar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. H.</given-names>
            <surname>Beard</surname>
          </string-name>
          ,
          <article-title>Time and habitat structure shape insect acoustic activity in the amazon</article-title>
          ,
          <source>Philosophical Transactions of the Royal Society B</source>
          <volume>379</volume>
          (
          <year>2024</year>
          )
          <fpage>20230112</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Burivalova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Maeda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Rayadin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Boucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Choksi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Roe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Truskinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Game</surname>
          </string-name>
          , et al.,
          <article-title>Loss of temporal structure of tropical soundscapes with intensifying land use in borneo</article-title>
          ,
          <source>Science of the Total Environment</source>
          <volume>852</volume>
          (
          <year>2022</year>
          )
          <fpage>158268</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>L.</given-names>
            <surname>Rauch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Schwinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wirth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Heinrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Huseljic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Herde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kahl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Sick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tomforde</surname>
          </string-name>
          , et al.,
          <article-title>Birdset: A large-scale dataset for audio classification in avian bioacoustics</article-title>
          ,
          <source>arXiv preprint arXiv:2403.10380</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>M.</given-names>
            <surname>Faiß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ghani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Stowell</surname>
          </string-name>
          ,
          <article-title>Insectset459: an open dataset of insect sounds for bioacoustic machine learning</article-title>
          ,
          <source>arXiv preprint arXiv:2503.15074</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Cañas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. P.</given-names>
            <surname>Toro-Gómez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. S. M.</given-names>
            <surname>Sugai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. D.</given-names>
            <surname>Benítez Restrepo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rudas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Posso Bautista</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. F.</given-names>
            <surname>Toledo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. H. R.</given-names>
            <surname>Domingos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. L.</given-names>
            <surname>de Souza</surname>
          </string-name>
          , et al.,
          <article-title>A dataset for benchmarking neotropical anuran calls identification in passive acoustic monitoring</article-title>
          ,
          <source>Scientific Data</source>
          <volume>10</volume>
          (
          <year>2023</year>
          )
          <fpage>771</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>E.</given-names>
            <surname>Dufourq</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Durbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Hansford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hoepfner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. V.</given-names>
            <surname>Bryant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Stender</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Chen</surname>
          </string-name>
          , et al.,
          <article-title>Automated detection of hainan gibbon calls for passive acoustic monitoring</article-title>
          ,
          <source>Remote Sensing in Ecology and Conservation</source>
          <volume>7</volume>
          (
          <year>2021</year>
          )
          <fpage>475</fpage>
          -
          <lpage>487</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hagiwara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hoffman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cusimano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Effenberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zacarian</surname>
          </string-name>
          ,
          <article-title>Beans: The benchmark of animal sounds</article-title>
          ,
          <source>in: ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</source>
          , IEEE,
          <year>2023</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>M.</given-names>
            <surname>Chasmai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shepard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Maji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Van Horn</surname>
          </string-name>
          ,
          <article-title>The inaturalist sounds dataset</article-title>
          ,
          <source>Advances in Neural Information Processing Systems</source>
          <volume>37</volume>
          (
          <year>2024</year>
          )
          <fpage>132524</fpage>
          -
          <lpage>132544</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>L. S. M.</given-names>
            <surname>Sugai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Llusia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Siqueira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. S.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <article-title>Revisiting the drivers of acoustic similarities in tropical anuran assemblages</article-title>
          ,
          <source>Ecology</source>
          <volume>102</volume>
          (
          <year>2021</year>
          )
          <fpage>e03380</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>S.</given-names>
            <surname>Beery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Morris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>Efficient pipeline for camera trap image review</article-title>
          ,
          <source>arXiv preprint arXiv:1907.06772</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>L.</given-names>
            <surname>Picek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kahl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Goëau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Adam</surname>
          </string-name>
          , et al.,
          <article-title>Overview of LifeCLEF 2025: Challenges on species presence prediction and identification, and individual animal identification</article-title>
          , in:
          <source>International Conference of the Cross-Language Evaluation Forum for European Languages</source>
          , Springer,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>A. P.</given-names>
            <surname>Hill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prince</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Piña Covarrubias</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. P.</given-names>
            <surname>Doncaster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Snaddon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rogers</surname>
          </string-name>
          ,
          <article-title>AudioMoth: Evaluation of a smart open acoustic device for monitoring biodiversity and the environment</article-title>
          ,
          <source>Methods in Ecology and Evolution</source>
          <volume>9</volume>
          (
          <year>2018</year>
          )
          <fpage>1199</fpage>
          -
          <lpage>1211</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>K.</given-names>
            <surname>Riede</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Balakrishnan</surname>
          </string-name>
          ,
          <article-title>Acoustic monitoring for tropical insect conservation</article-title>
          ,
          <source>Philosophical Transactions B</source>
          <volume>380</volume>
          (
          <year>2025</year>
          )
          <fpage>20240046</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Mendoza-Henao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Acevedo-Charry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Martínez-Medina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Barona-Cortés</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Córdoba-Córdoba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Caycedo-Rosales</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Ulloa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. G.</given-names>
            <surname>Borja-Acosta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Buitrago-Cardona</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Pantoja-Sánchez</surname>
          </string-name>
          ,
          <article-title>Past, present, and future of a tropical sounds collection from Colombia</article-title>
          ,
          <source>Bioacoustics</source>
          <volume>32</volume>
          (
          <year>2023</year>
          )
          <fpage>474</fpage>
          -
          <lpage>490</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kahl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Denton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Klinck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ramesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Srivathsa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Anand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Arvind</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Cp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sawant</surname>
          </string-name>
          , et al.,
          <article-title>Overview of BirdCLEF 2024: Acoustic identification of under-studied bird species in the Western Ghats</article-title>
          ,
          <source>CEUR-WS</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>B.</given-names>
            <surname>Van Merriënboer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hamer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dumoulin</surname>
          </string-name>
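          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Triantafillou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Denton</surname>
          </string-name>
          ,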
          <article-title>Birds, bats and beyond: Evaluating generalization in bioacoustics models</article-title>
          ,
          <source>Frontiers in Bird Science</source>
          <volume>3</volume>
          (
          <year>2024</year>
          )
          <fpage>1369756</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <source>Xeno-canto</source>
          , https://xeno-canto.org/, accessed Feb 13,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <source>iNaturalist</source>
          , https://www.inaturalist.org/, accessed Feb 13,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <source>Colección de Sonidos Ambientales (CSA) Mauricio Álvarez Rebolledo</source>
          , https://colecciones.humboldt.org.co/sonidos/, accessed Feb 13,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>J.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Dual-branch Network for Species Identification via Passive Acoustic Monitoring</article-title>
          ,
          in:
          <source>CLEF Working Notes 2025</source>
          , CLEF 2025: Conference and Labs of the Evaluation Forum, September 09-12, 2025, Madrid, Spain,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Gokulnath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gaikwad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Senthilnathan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. P.</given-names>
            <surname>Sawant</surname>
          </string-name>
          ,
          <article-title>One Detector per Bird: A Scalable Binary Classification Approach for BirdCLEF 2025</article-title>
          , in:
          <source>CLEF Working Notes</source>
          <year>2025</year>
          , CLEF
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>