=Paper=
{{Paper
|id=Vol-2936/paper-125
|storemode=property
|title=Overview of SnakeCLEF 2021: Automatic Snake Species Identification with Country-Level Focus
|pdfUrl=https://ceur-ws.org/Vol-2936/paper-125.pdf
|volume=Vol-2936
|authors=Lukáš Picek,Andrew Durso,Isabelle Bolon,Rafael Ruiz de Castaneda
|dblpUrl=https://dblp.org/rec/conf/clef/PicekDBC21
}}
==Overview of SnakeCLEF 2021: Automatic Snake Species Identification with Country-Level Focus==
<pdf width="1500px">https://ceur-ws.org/Vol-2936/paper-125.pdf</pdf>
<pre>
Overview of SnakeCLEF 2021: Automatic Snake
Species Identification with Country-Level Focus
Lukáš Picek¹, Andrew M. Durso², Isabelle Bolon³ and Rafael Ruiz de Castañeda³
¹ Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia, Czechia
² Department of Biological Sciences, Florida Gulf Coast University, Florida, USA
³ Institute of Global Health, Department of Community Health and Medicine, University of Geneva, Switzerland


                                 Abstract
                                 A robust and accurate AI-driven system as an assistance tool for snake species identification has vast
                                 potential to help lower deaths and disabilities caused by snakebites. With that in mind, we prepared
                                 the SnakeCLEF 2021: Automatic Snake Species Identification Challenge with Country-Level Focus,
                                 designed to provide an evaluation platform that can help track the performance of end-to-end AI-driven
                                 snake species recognition systems with a focus on overall country-wise performance. We have
                                 provided 386,006 photographs of 772 snake species collected in 188 countries and a country-species
                                 presence mapping for the challenge. In this paper, we report 1) a description of the provided data,
                                 2) evaluation methodology and principles, 3) an overview of the systems submitted by the participating
                                 teams, and 4) a discussion of the obtained results.

                                 Keywords
                                 LifeCLEF, SnakeCLEF, fine-grained visual categorization, global health, epidemiology, snake bite, snake,
                                 reptile, benchmark, biodiversity, species identification, machine learning, computer vision, classification


1. Introduction
Building an automatic and robust image-based system for snake species identification is an
important goal for biodiversity, conservation, and global health. With recent estimates of
81,410–137,880 deaths and up to three times as many victims of amputations, permanent disability and
disfigurement (globally each year) caused by venomous snakebite [1], such a system has the
potential to improve eco-epidemiological data and treatment outcomes (e.g. based on the specific
use of antivenoms) [2, 3]. This applies especially in remote geographic areas and developing
countries, where automatic snake species identification has the greatest potential to save lives.
   The difficulty of snake species identification – from both a human and a machine perspective
[4] – lies in the high intra-class and low inter-class variance in appearance, which may depend
on geographic location, color morph, sex, or age (Figure 1 and Figure 2). At the same time,
many species are visually similar to other species (e.g. mimicry [5]). Our knowledge of which
snake species occur in which countries is incomplete, and it is common that most or all images
of a given snake species might originate from a small handful of countries or even a single

CLEF 2021 – Conference and Labs of the Evaluation Forum, September 21–24, 2021, Bucharest, Romania
Email: picekl@kky.zcu.cz (L. Picek)
ORCID: 0000-0002-6041-9722 (L. Picek); 0000-0002-3008-7763 (A. M. Durso); 0000-0001-5940-2731 (I. Bolon);
0000-0002-2287-0985 (R. R. d. Castañeda)
                               © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

                               CEUR Workshop Proceedings (CEUR-WS.org)
country [6]. Furthermore, many snake species resemble species found on other continents, with
which they are entirely allopatric [7]. Knowing the geographic origin of an unidentified snake
can narrow down the possible correct identifications considerably. In no location on Earth do
more than 126 of the approximately 3,900 snake species co-occur [8]. Thus, regularization to
all countries is a critical component of any snake identification method. In previous LifeCLEF
Snake Species Identification challenges [9, 10] we measured relatively modest performance (a top
Macro F1 score of 0.625), showing that snake identification leaves considerable room for improvement.


Figure 1: Variation in Vipera berus (European Adder) color and pattern within central Europe. Examples
from Czechia, Germany, Switzerland and Poland, demonstrating different color morphs within a
species. Taken from iNaturalist: ©Thorsten Stegmann, ©jandetka, ©jandetka, and ©jandetka


Figure 2: Naja nigricincta from northern Namibia (left) and South Africa (right), demonstrating
geographical variation within a species. Taken from iNaturalist: ©Di Franklin, and ©bryanmaritz
2. Task description
The main goal of this challenge was to build a system capable of recognizing 772 snake
species based on the given unseen image and relevant geographical location, with a focus on
worldwide performance. Unlike the previous SnakeCLEF edition – where we used the disclosed
dataset – we did not ask the participants to submit their solutions through a Docker environment.
Only a simple CSV file with the Top1 species prediction for each image was expected.

2.1. Dataset
For this year’s challenge, we have prepared a new dataset with 409,679 images belonging to 772
snake species from 188 countries and all continents (386,006 images with labels targeted for
development and 23,673 images without labels for testing). In addition, we provide a simple
train/val (90% / 10%) split to validate preliminary results while ensuring the same species
distributions. Furthermore, we prepared a compact subset (70,208 images) for fast prototyping. The
test set data consists of 23,673 images submitted to the iNaturalist platform within the first four
months of 2021. Unlike in previous years, where the final testing set remained undisclosed, we
provided the test data to the participants.
   All data were gathered from online biodiversity platforms (i.e., iNaturalist, HerpMapper)
and further extended by data scraped from Flickr. In contrast to the previous SnakeCLEF
edition [10], we increased the number of images and covered countries, and filtered noisy labels
and duplicated images. In addition, we defined clean (iNaturalist / HerpMapper) and noisy
(Flickr) subsets within the development data. The provided dataset has a heavy long-tailed
class distribution, where the most frequent species (Thamnophis sirtalis) is represented by
22,163 images and the least frequent by just 10 (Achalinus formosanus). For additional dataset
parameters refer to Table 1 and Table 2.

Table 1
Details of the SnakeCLEF 2021 datasets and their comparison with previous edition.
 Dataset                   Species    Images       # of Countries     min per species   max per species
 SnakeCLEF 2020               783     259,214           145                 19              14,433
 SnakeCLEF 2021               772     386,006           188                 10              22,163
 SnakeCLEF 2021 Comp.         768      70,208           178                  1                 299


Table 2
SnakeCLEF 2021 data sources and their taxonomic and geographic coverage.
 Data Source        # of Species     # of Genera      # of Families      # of Images    # of Countries
 iNaturalist            762             265                   17           277,025           181
 HerpMapper             614             244                   17            58,351            98
 Flickr                 733             260                   18            50,630           125
 Total                  772             269                   18           386,006           188
2.1.1. Geographical Information
Considering that all snake species have distinct, largely stable geographic ranges, with a
maximum of 126 species of snakes occurring within the same 50 × 50 km² area [8], geographical
information plays a crucial role in correct snake species identification [11]. To evaluate this, we
have gathered two levels of geographical label (i.e., country and continent) for approximately
87% of the data. We have collected observations across 188 countries and all continents. A small
proportion of images (ca. 1–2%), particularly from Flickr, show captive snakes that are kept
outside of their native range (e.g., North American Pantherophis guttatus in Europe or Australian
Morelia viridis in the USA). We opted to retain these for three reasons:
   1. Users of an automated identification system may wish to use it on captive snakes (e.g., in
      the case of customs seizures [12, 13]).
   2. Bites from captive snakes may occur (although the identity of the snake would normally
      be clear in this case; e.g. [14, 15]).
   3. Captive snakes sometimes escape and can found introduced populations outside their
      native range (e.g. [16, 17]).
Additionally, we provide a mapping matrix (MM) describing species-country presence to allow
better worldwide regularization, based on the August 2020 release of The Reptile Database [18].
\[
MM_{cs} =
\begin{cases}
  1, & \text{if species } S \in \text{country } C \\
  0, & \text{else}
\end{cases}
\tag{1}
\]
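
The mapping matrix of Equation (1) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `build_mapping_matrix`, the country codes, and the input format (a list of country/species pairs) are hypothetical and not taken from the challenge data or from The Reptile Database; the two example records reuse the species and countries shown in Figure 1 and Figure 2.

```python
import numpy as np

def build_mapping_matrix(presence, countries, species):
    """Return MM with MM[c, s] = 1 if species s is listed for country c, else 0.

    presence  : iterable of (country, species) pairs (hypothetical input format)
    countries : ordered list of country identifiers (rows of MM)
    species   : ordered list of species names (columns of MM)
    """
    c_index = {c: i for i, c in enumerate(countries)}
    s_index = {s: i for i, s in enumerate(species)}
    mm = np.zeros((len(countries), len(species)), dtype=np.uint8)
    for country, sp in presence:
        mm[c_index[country], s_index[sp]] = 1
    return mm

# Illustrative records only (country codes are an assumption).
countries = ["CZ", "NA"]
species = ["Vipera berus", "Naja nigricincta"]
presence = [("CZ", "Vipera berus"), ("NA", "Naja nigricincta")]
mm = build_mapping_matrix(presence, countries, species)
```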


Figure 3: Worldwide snake species distribution, showing the number of species found in each country.
Large countries in the tropics (Brazil, Mexico, Colombia, India, Indonesia) have more than 300 species.
Figure 4: Percentage of snake species per country included in SnakeCLEF2021. The countries with the
best coverage are in Europe, Oceania, and North America.


   The vast majority (77%) of all images came from the United States and Canada, with 9% from
Latin America and the Caribbean, 5.7% from Europe, 4.5% from Asia, 1.8% from Africa, and
1.5% from Australia/Oceania. Bias at smaller spatial scales undoubtedly exists as well [6, 19],
largely due to where participants in citizen science projects and other snake photographers
are concentrated. Nevertheless, snake species from nearly every country were represented,
with 46/215 (21%) of countries having all of their snake species represented, mostly in Europe.
Nearly half of all countries (106/215; 49%) had more than 50% of their snake species represented
(Figure 4). Priority areas for improvement of the training dataset in future rounds are countries
with high snake species diversity and low citizen science participation, especially Indonesia,
Papua New Guinea, Madagascar, and several central African and Caribbean countries (Figure 3).

2.2. Timeline
The training data were made public in February 2021 through the AICrowd challenge page,
and anyone with research ambitions was able to register and participate in the competition.
Releasing the test data in mid-May, we provided up to 100 days to participants to work on their
submissions. The test data were released three days before the competition deadline, minimizing
the possibility of manual labelling and other exploits. Each team had an opportunity to submit
up to 10 submissions corresponding to different approaches or different settings of the same
method. The final evaluation was done via a CSV file containing the Top1 prediction for each given
test image. Once the submission phase was closed (mid-June), the participants were allowed to
submit so-called post-competition submissions to evaluate any interesting findings.
2.3. Evaluation Protocol
To assure focus on worldwide performance, we defined the macro F1 country performance
(Macro F1c) as the main metric. We calculate it as the mean of country F1 scores:
\[
\text{Macro } F1_c = \frac{1}{N} \sum_{c=0}^{N} F1_c ,
\qquad
F1_c = \frac{1}{\sum_{s=1}^{k} MM_{cs}} \times \sum_{s=0}^{N} F1_s \, MM_{cs}
\tag{2}
\]

  where c is the country index, s is the species index, F1c is the country performance, and MMcs is
the mapping matrix described in Subsection 2.1.1. To get F1s we use the following formula for
each species:
\[
F1_s = 2 \times \frac{P_s \times R_s}{P_s + R_s}
\tag{3}
\]
\[
P_s = \frac{tp_s}{tp_s + fp_s} ,
\qquad
R_s = \frac{tp_s}{tp_s + fn_s}
\tag{4}
\]
 To allow deeper comparison on different levels, we also measure the Top1 Accuracy and the
Macro F1 score. The Macro F1 score is calculated as the mean of all F1s scores:
\[
\text{Macro } F1 = \frac{1}{N} \sum_{s=0}^{N} F1_s
\tag{5}
\]

  where s is the species index and N the number of species. Final Macro F1 is calculated by
computing the F1 score for each species as the harmonic mean of the species Precision (Ps ) and
the Recall (Rs ).
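
The metrics of Equations (2) to (5) can be sketched as follows. This is a minimal illustration, not the official evaluation script: the function names are hypothetical, and two choices are assumptions not stated in the text, namely that F1s is set to 0 when a species has no true positives, false positives, or false negatives, and that countries with no species listed in the mapping matrix are skipped when averaging.

```python
import numpy as np

def per_species_f1(y_true, y_pred, n_species):
    """F1_s for each species from Top1 predictions (Equations 3 and 4)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    f1 = np.zeros(n_species)
    for s in range(n_species):
        tp = np.sum((y_pred == s) & (y_true == s))
        fp = np.sum((y_pred == s) & (y_true != s))
        fn = np.sum((y_pred != s) & (y_true == s))
        denom = 2 * tp + fp + fn
        # 2*tp / (2*tp + fp + fn) is algebraically equal to 2*P*R / (P + R).
        # Assumption: F1_s = 0 when the score is undefined.
        f1[s] = 2 * tp / denom if denom > 0 else 0.0
    return f1

def macro_f1(f1_s):
    """Mean of the per-species F1 scores (Equation 5)."""
    return float(np.mean(f1_s))

def macro_f1_country(f1_s, mm):
    """Mean over countries of the average F1_s of the species present there (Equation 2).

    mm[c, s] = 1 if species s occurs in country c, else 0.
    """
    mm = np.asarray(mm)
    counts = mm.sum(axis=1)
    valid = counts > 0  # assumption: skip countries without any listed species
    f1_c = (mm[valid] @ f1_s) / counts[valid]
    return float(np.mean(f1_c))

# Toy example: 3 species, 2 countries, 4 test images.
f1_s = per_species_f1([0, 0, 1, 2], [0, 1, 1, 2], n_species=3)
mm = np.array([[1, 1, 0],
               [0, 0, 1]])
score_species = macro_f1(f1_s)
score_country = macro_f1_country(f1_s, mm)
```

In the toy example the two scores differ because the country-level metric averages within each country first, so a species that is the only one listed for a country weighs as much as all species of a species-rich country combined.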

2.4. Working Notes
All participants were asked to provide a Working Note paper – a technical report with information
needed to reproduce the results of all submissions. All submitted Working Notes were reviewed
by 2-3 reviewers with a decent publication history and PhD in Computer Vision and Machine
Learning, ensuring a sufficient level of reproducibility and quality. The review process was
single-blind and offered up to two rebuttals.


3. Participants and Methods
Seven teams participated in the SnakeCLEF 2021 challenge and submitted a total of 46 runs.
We have seen a vast increase in interest related to automatic snake recognition from the last
year [20]. Interestingly, three participating teams originated from India – the country with
the most snakebites worldwide [21]. Most of the participants (6 out of 7) provided a technical
report describing each run, the methods and techniques used, and the evaluated
experiments [22, 23, 24, 25, 26, 27]. Each report had to pass a single-blind review,
ensuring a sufficient level of reproducibility and quality. For all the teams, we synthesized a short
description.
BME-TMIT [22]: BME-TMIT was the only team that used a two-stage approach with
detection and classification neural networks. EfficientDet [28] and EfficientNet [29] were
utilized for object detection and classification, respectively. Additionally, integrating the location
metadata increased the Macro F1c by 0.089 on the test data. Based on the evaluated experiments,
we can conclude that object detection and the inclusion of geographical data significantly
improved all measured performance metrics. With this approach, they achieved the highest
scores in all measured metrics – Macro F1c of 0.903, Macro F1 of 0.864, and 94.94% Top1 Accuracy.

CMP [23]: The CMP team experimented with different deep residual convolutional neural
networks (i.e., ResNet [30], ResNeXt [31], and ResNeSt [32]) and different loss functions,
including standard cross-entropy, weighted cross-entropy and soft F1 loss. Their experiments
showed that the standard cross-entropy loss achieved superior performance in all measured
metrics on the validation set. Their best method is an ensemble of two ResNeSt-200,
ResNet-101, and ResNeXt-101, combining the Top1 predictions by a majority voting strategy.
Additionally, they increased the performance with mixed-precision training and by dropping
the predictions of species not occurring in the country of the given image. Interestingly,
their best single model in terms of Macro F1c was fine-tuned just on the compact subset with
the almost flat distribution.
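
The two post-processing ideas described for this team, dropping species that do not occur in the image's country and majority voting over Top1 predictions, can be sketched as below. This is not the CMP team's code: the function names, the use of -1 for an unknown country, the fallback to unmasked scores, and the tie-breaking rule (lowest species index wins) are all assumptions made for the illustration.

```python
import numpy as np

def mask_by_country(probs, mm, country_idx):
    """Zero the scores of species not listed for each image's country.

    probs       : (n_images, n_species) model scores
    mm          : (n_countries, n_species) binary species-country mapping
    country_idx : (n_images,) country row per image, -1 if unknown (assumption)
    """
    probs = np.asarray(probs, dtype=float)
    country_idx = np.asarray(country_idx)
    masked = probs.copy()
    known = country_idx >= 0
    masked[known] = probs[known] * mm[country_idx[known]]
    # Assumption: if masking removed every candidate, keep the original scores.
    empty = masked.sum(axis=1) == 0
    masked[empty] = probs[empty]
    return masked

def majority_vote(top1_per_model):
    """Combine Top1 predictions of several models; ties go to the lowest index.

    top1_per_model : (n_models, n_images) integer species predictions
    """
    top1_per_model = np.asarray(top1_per_model)
    n_images = top1_per_model.shape[1]
    return np.array([np.bincount(top1_per_model[:, i]).argmax()
                     for i in range(n_images)])

# Toy example: 2 images, 3 species, 2 countries.
probs = np.array([[0.6, 0.3, 0.1],
                  [0.2, 0.5, 0.3]])
mm = np.array([[0, 1, 1],
               [1, 1, 1]])
masked = mask_by_country(probs, mm, country_idx=[0, -1])
votes = majority_vote([[1, 2], [1, 0], [0, 0]])
```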

FHDO-BCSG [24]: The FHDO-BCSG team utilized the EfficientNets [29] and the Vision
Transformers (ViT) [33] in their experiments. In a subsequent step, they multiplied the prior
probabilities of the location context with the model predictions. Without surprise, the combination of
both modes achieved the best performance, more precisely a Macro F1c score of 0.829.
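
Multiplying location priors with model predictions can be illustrated with the short sketch below. It is not the FHDO-BCSG implementation: the function name, the row-wise renormalisation, and the fallback to the raw model output when the product is all zero are assumptions, and how the priors themselves are estimated is left open here.

```python
import numpy as np

def apply_location_prior(probs, prior):
    """Re-weight model predictions by a location prior and renormalise each row.

    probs : (n_images, n_species) model probabilities
    prior : (n_images, n_species) prior probability of each species given the
            image's location (how it is estimated is not specified here)
    """
    probs = np.asarray(probs, dtype=float)
    prior = np.asarray(prior, dtype=float)
    weighted = probs * prior
    total = weighted.sum(axis=1, keepdims=True)
    safe_total = np.where(total > 0, total, 1.0)  # avoid division by zero
    # Assumption: keep the raw model output when the prior rules out everything.
    return np.where(total > 0, weighted / safe_total, probs)

# Toy example: the prior flips the Top1 prediction of the first image.
probs = np.array([[0.5, 0.4, 0.1],
                  [0.7, 0.2, 0.1]])
prior = np.array([[0.1, 0.6, 0.3],
                  [0.0, 0.0, 0.0]])
posterior = apply_location_prior(probs, prior)
```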

SSN [25]: The SSN team used a classical approach with just a single ResNeXt-50-V2 optimized with
Adam and plenty of image augmentations, i.e., random crop, transposition, horizontal/vertical
flip, shift, scale and rotation. With such an approach, they achieved a relatively small error rate
in terms of Top1 Accuracy (14.23%) but reached only 0.724 in terms of Macro F1c.

UAIC AI [26]: This team used the relatively old CNN architectures GoogLeNet [34], VGG16 [35]
and ResNet-18 [30]. Even though they did not achieve high scores, they helped us to understand
the magnitude of the difference in performance between "pioneer" and the current
state-of-the-art architectures on a long-tailed fine-grained dataset. Their best score – 0.785 Macro F1c – was
achieved by the ResNet-18 architecture.

SSN-MLRG [27]: The SSN-MLRG team used the Inception-ResNet-v2 [36] as a feature
extractor and concatenated the extracted image features with geographic information. The resulting
feature vector is then forwarded into a trained gradient boosting classifier. This approach achieved the
worst performance in the competition (0.269 Macro F1c) and revealed the superiority of the
neural network based classifiers.

Gokul: This participant built their solution primarily around ViT (ViT-Base-16) and the
CNN-based ResNet101-v2 architectures [20]. An ensemble of both, with a few bells and whistles,
improved the Country Based F1 score up to 0.877 (2nd place).
4. Results and Discussion
We report the achieved performance by all the collected runs in Figure 5, Figure 6, and Figure 7.
The best performing model achieved an impressive Macro F1c of 0.903 while having 94.82%
Top1 Accuracy and Macro F1 of 0.855. Interestingly, the model with the highest Macro F1c was
not the best in terms of Top1 Accuracy and Macro F1. The main outcomes we can derive from
the results are the following:

Object detection improves classification: Utilization of the detection network for a better
region of interest selection showed a significant performance gain in the case of the winning
team. However, such an approach requires additional labelling procedures and the construction
of two neural network models. Furthermore, a two-stage solution might be too heavy for
deployment on edge devices; thus, its use there may be impractical.

CNN outperforms ViT in snake recognition: Similar to last year’s challenge [10], all
participants featured deep convolutional neural networks. Besides CNNs, Vision
Transformers (ViT) [33] were utilized by two teams. Interestingly, the performance of the ViT was slightly
worse, which is contradictory to their performance in fungi recognition [37], thus showing that
ViT might not be the best option for all fine-grained tasks.

Geography improves classification: Same as last year, usage of geographical information
improved the recognition capability. No matter which technique was used, every team that
incorporated the location metadata information increased the system’s performance by a
significant margin, e.g., +0.089 and +0.103 Macro F1c, in the case of BME-TMIT and FHDO-BCSG
respectively.

Vast increase in performance: This year we experienced a significant performance increase
in all measured metrics. Comparing the top Macro F1 score achieved in 2020 (0.625) and
2021 (0.864), we can see a 2.75 times smaller error rate. This is mainly due to increasing research
efforts in automatic snake species identification. With a Top1 Accuracy close to 95%, the 2021
SnakeCLEF challenge helped to build a system that has similar performance to other approaches
for natural species recognition [38, 39, 40, 41].
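
The error-rate comparison above can be made explicit, assuming that "error rate" means 1 − Macro F1 (an interpretation; the text does not define the term):

```latex
\[
\frac{1 - 0.625}{1 - 0.864} = \frac{0.375}{0.136} \approx 2.76
\]
```

Under that assumption the ratio agrees with the reported factor of roughly 2.75 up to rounding.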

Increased interest in automatic snake species recognition: This year the SnakeCLEF
2021 challenge attracted seven research teams from India, Czechia, Germany, Romania, and
Hungary. This is so far the biggest participation in our Snake Identification challenges and
even exceeds participation in other well-established LifeCLEF challenges. In 2022 we hope that
interest will continue to increase.

[Figure residue: bar charts of per-run results with a Top1 Accuracy axis (0%–100%) and two score axes (0–1). Leading values legible in the extraction: Top1 Accuracy 94.94%, 94.82%, 94.39%, 92.97%, 92.91%; scores 0.903, 0.901, 0.878, 0.877, 0.876; and scores 0.864, 0.855, 0.840, 0.837.]
                                                                                                                                                                                                                                                                                                                           0,832
                                                                                                                                            92,88%                                                                                                                                                                                                                                                                                                                                                   0,875
                                                                                                                                                                                                                                                                                                                           0,831
                                                                                                                                          92,13%                                                                                                                                                                                                                                                                                                                                                     0,870
                                                                                                                                                                                                                                                                                                                           0,830
                                                                                                                                          92,03%                                                                                                                                                                                                                                                                                                                                                     0,866
                                                                                                                                                                                                                                                                                                                        0,804
                                                                                                                                          91,80%                                                                                                                                                                                                                                                                                                                                                     0,860
                                                                                                                                                                                                                                                                                                                        0,802
                                                                                                                                          91,68%                                                                                                                                                                                                                                                                                                                                                    0,842
                                                                                                                                                                                                                                                                                                                        0,800
                                                                                                                                          91,68%                                                                                                                                                                                                                                                                                                                                                    0,839
                                                                                                                                                                                                                                                                                                                        0,799
                                                                                                                                         91,17%                                                                                                                                                                                                                                                                                                                                                     0,839
                                                                                                                                                                                                                                                                                                                        0,796
                                                                                                                                         91,11%                                                                                                                                                                                                                                                                                                                                                     0,837
                                                                                                                                                                                                                                                                                                                       0,795
                                                                                                                                         91,02%                                                                                                                                                                                                                                                                                                                                                   0,832
                                                                                                                                                                                                                                                                                                                       0,795
                                                                                                                                         90,77%                                                                                                                                                                                                                                                                                                                                                   0,829
                                                                                                                                                                                                                                                                                                                       0,788
                                                                                                                                         90,66%                                                                                                                                                                                                                                                                                                                                                   0,823
                                                                                                                                                                                                                                                                                                                       0,786
                                                                                                                                         90,42%                                                                                                                                                                                                                                                                                                                                                   0,820
                                                                                                                                                                                                                                                                                                                       0,779
                                                                                                                                         90,35%                                                                                                                                                                                                                                                                                                                                                   0,819
                                                                                                                                                                                                                                                                                                                       0,778
                                                                                                                                         90,02%                                                                                                                                                                                                                                                                                                                                                   0,814
                                                                                                                                                                                                                                                                                                                       0,772
                                                                                                                                         90,10%                                                                                                                                                                                                                                                                                                                                                   0,810
                                                                                                                                                                                                                                                                                                                      0,763
                                                                                                                                         89,58%                                                                                                                                                                                                                                                                                                                                               0,789
                                                                                                                                                                                                                                                                                                                      0,763
                                                                                                                                         89,55%                                                                                                                                                                                                                                                                                                                                               0,785
                                                                                                                                                                                                                                                                                                                      0,753
                                                                                                                                         89,14%                                                                                                                                                                                                                                                                                                                                               0,783
                                                                                                                                                                                                                                                                                                                     0,752
                                                                                                                                         89,08%                                                                                                                                                                                                                                                                                                                                               0,774
                                                                                                                                                                                                                                                                                                                     0,745
                                                                                                                                                                                                                                                                                                                                 F1 - Macro
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             F1 - Country


                                                                                                                                                       Top1 Accuracy
                                                                                                                                         88,90%                                                                                                                                                                                                                                                                                                                                               0,773
                                                                                                                                                                                                                                                                                                                     0,741
                                                                                                                                        87,85%                                                                                                                                                                                                                                                                                                                                               0,766
                                                                                                                                                                                                                                                                                                                     0,741
                                                                                                                                        87,78%                                                                                                                                                                                                                                                                                                                                               0,762
                                                                                                                                                                                                                                                                                                                     0,738
                                                                                                                                        86,09%                                                                                                                                                                                                                                                                                                                                               0,753
                                                                                                                                                                                                                                                                                                                     0,737
                                                                                                                                       85,77%                                                                                                                                                                                                                                                                                                                                                0,752
                                                                                                                                                                                                                                                                                                                     0,728
                                                                                                                                       85,86%                                                                                                                                                                                                                                                                                                                                               0,727
                                                                                                                                                                                                                                                                                                                 0,706
                                                                                                                                       85,55%                                                                                                                                                                                                                                                                                                                                               0,726
                                                                                                                                                                                                                                                                                                                 0,705
                                                                                                                                      83,04%                                                                                                                                                                                                                                                                                                                                                0,724
                                                                                                                                                                                                                                                                                                                0,684
                                                                                                                                      82,96%                                                                                                                                                                                                                                                                                                                                            0,703
                                                                                                                                                                                                                                                                                                               0,665
                                                                                                                                     79,83%                                                                                                                                                                                                                                                                                                                                            0,695
                                                                                                                                                                                                                                                                                                               0,650
                                                                                                                                     77,63%                                                                                                                                                                                                                                                                                                                                         0,612
                                                                                                                                                                                                                                                                                                           0,605
                                                                                                                            59,80%                                                                                                                                                                                                                                                                                                                                           0,512
                                                                                                                                                                                                                                                                                        0,302
                                                                                                                      51,90%                                                                                                                                                                                                                                                                                                                                 0,293
                                                                                                                                                                                                                                                                                     0,219
                                                                                                                      51,90%                                                                                                                                                                                                                                                                                                                                 0,293
                                                                                                                                                                                                                                                                                     0,219
Figure 5: Official Macro F1c scores achieved by all runs to the SnakeCLEF 2021 competition.

Figure 6: Official Macro F1 scores achieved by all runs to the SnakeCLEF 2021 competition.

Figure 7: Official Top1 Accuracy scores achieved by all runs to the SnakeCLEF 2021 competition.

5. Conclusions and Perspectives
This paper presents an overview and the results of the second edition of the SnakeCLEF challenge,
organized in conjunction with the Conference and Labs of the Evaluation Forum (CLEF¹) and the
LifeCLEF² research platform [42]. This year, we based the evaluation on the worldwide species
distribution. We prepared the largest and most diverse snake image dataset to date,
covering 772 snake species with 409,679 images observed across 188 countries. It is also
the most challenging dataset for automated snake species recognition currently in existence.
For future editions, we plan to focus on the following:
   1. Extend the dataset with new and rare species and reduce its bias towards North
      America.
   2. Integrate the snake species toxicity level into the dataset and reduce the risk of
      medically critical mispredictions, i.e., confusing venomous species with non-venomous ones.
   3. Compare machine-learning-based algorithms with human experts to better evaluate how
      far automated systems are from human expertise [4].


Acknowledgments
L. Picek was supported by the UWB grant, project No. SGS-2019-027. A. M. Durso was supported
by the Fondation privée des Hôpitaux Universitaires de Genève (award QS04-20). We thank
the users and admins of open citizen science initiatives (iNaturalist, HerpMapper, and Flickr)
for their efforts in building these global datasets. We thank A. Flahault, the Fondation
Louis-Jeantet, and F. Chappuis for supporting R. Ruiz de Castañeda and this research at the
Institute of Global Health and at the Department of Community Health and Medicine of the
University of Geneva.


References
 [1] World Health Organization (WHO), Snakebite envenoming: Global situation, 2021 (accessed June 28, 2021).
     URL: https://www.who.int/news-room/fact-sheets/detail/snakebite-envenoming.
 [2] R. Ruiz de Castañeda, A. M. Durso, N. Ray, J. L. Fernández, D. J. Williams, G. Alcoba,
     F. Chappuis, M. Salathé, I. Bolon, Snakebite and snake identification: empowering neglected
     communities and health-care providers with ai, The Lancet Digital Health 1 (2019) e202–
     e203.
 [3] I. Bolon, A. M. Durso, S. Botero Mesa, N. Ray, G. Alcoba, F. Chappuis, R. Ruiz de Castañeda,
     Identifying the snake: First scoping review on practices of communities and healthcare
     providers confronted with snakebite across the world, PLoS one 15 (2020) e0229989.
 [4] A. M. Durso, I. Bolon, A. Kleinhesselink, M. Mondardini, J. Fernandez-Marquez, F. Gutsche-
     Jones, C. Gwilliams, M. Tanner, C. E. Smith, W. Wüster, et al., Crowdsourcing snake
     identification with online communities of professional herpetologists and avocational
     snake enthusiasts, Royal Society open science 8 (2021) 201273.
   ¹ http://www.clef-initiative.eu/
   ² http://www.lifeclef.org/
 [5] A. R. Davis Rabosky, C. L. Cox, D. L. Rabosky, P. O. Title, I. A. Holmes, A. Feldman, J. A.
     McGuire, Coral snakes predict the evolution of mimicry across new world snakes, Nature
     communications 7 (2016) 1–9.
 [6] A. M. Durso, R. Ruiz de Castañeda, C. Montalcini, M. R. Mondardini, J. L. Fernandez-
     Marques, F. Grey, M. M. Müller, P. Uetz, B. M. Marshall, R. J. Gray, et al., Citizen science
     and online data: Opportunities and challenges for snake ecology and action against
     snakebite, Toxicon: X (2021) 100071.
 [7] D. W. Pfennig, S. P. Mullen, Mimics without models: causes and consequences of allopatry
     in batesian mimicry complexes, Proceedings of the Royal Society B: Biological Sciences
     277 (2010) 2577–2585.
 [8] U. Roll, A. Feldman, M. Novosolov, A. Allison, A. M. Bauer, R. Bernard, M. Böhm, F. Castro-
     Herrera, L. Chirio, B. Collen, et al., The global distribution of tetrapods reveals a need for
     targeted reptile conservation, Nature Ecology & Evolution 1 (2017) 1677–1682.
 [9] A. Joly, H. Goëau, S. Kahl, B. Deneu, M. Servajean, E. Cole, L. Picek, R. R. De Castaneda,
     I. Bolon, A. Durso, et al., Overview of lifeclef 2020: a system-oriented evaluation of
     automated species identification and species distribution prediction, in: International
     Conference of the Cross-Language Evaluation Forum for European Languages, Springer,
     2020, pp. 342–363.
[10] L. Picek, R. Ruiz de Castañeda, A. M. Durso, S. P. Mohanty, Overview of the snakeclef
     2020: Automatic snake species identification challenge, in: CLEF task overview 2020,
     CLEF: Conference and Labs of the Evaluation Forum, Sep. 2020, Thessaloniki, Greece.,
     2020.
[11] H. C. Wittich, M. Seeland, J. Wäldchen, M. Rzanny, P. Mäder, Recommending plant taxa
     for supporting on-site species identification, BMC bioinformatics 19 (2018) 190.
[12] F. Hierink, I. Bolon, A. M. Durso, R. Ruiz de Castañeda, C. Zambrana-Torrelio, E. A. Eskew,
     N. Ray, Forty-four years of global trade in cites-listed snakes: Trends and implications for
     conservation and public health, Biological Conservation 248 (2020) 108601.
[13] D. J. Natusch, J. F. Carter, P. W. Aust, N. Van Tri, U. Tinggi, A. Riyanto, J. A. Lyons, et al.,
     Serpent’s source: Determining the source and geographic origin of traded python skins
     using isotopic and elemental markers, Biological Conservation 209 (2017) 406–414.
[14] A. Schaper, H. Desel, M. Ebbecke, L. D. Haro, M. Deters, H. Hentschel, M. Hermanns-
     Clausen, C. Langer, Bites and stings by exotic pets in europe: An 11 year analysis of 404
     cases from northeastern germany and southeastern france, Clinical Toxicology 47 (2009)
     39–43.
[15] B. J. Warrick, L. V. Boyer, S. A. Seifert, Non-native (exotic) snake envenomations in the us,
     2005–2011, Toxins 6 (2014) 2899–2911.
[16] M. Á. Cabrera-Pérez, R. Gallo-Barneto, I. Esteve, C. Patiño-Martínez, L. F. López-Jurado,
     et al., The management and control of the california kingsnake in gran canaria (canary
     islands): project life+ lampropeltis, Aliens: The Invasive Species Bulletin 32 (????) 20–28.
[17] F. Kraus, Alien reptiles and amphibians: a scientific compendium and analysis, volume 4,
     Springer Science & Business Media, 2008.
[18] P. Uetz, P. Freed, J. Hošek, et al., The reptile database, 2020. URL:
     https://reptile-database.reptarium.cz/advanced_search.
[19] E. E. Millar, E. C. Hazell, S. Melles, The ‘cottage effect’ in citizen science? spatial bias in
     aquatic monitoring programs, International Journal of Geographical Information Science
     33 (2019) 1612–1632.
[20] A. M. Durso, G. K. Moorthy, S. P. Mohanty, I. Bolon, M. Salathé, R. Ruiz de Castañeda,
     Supervised learning computer vision benchmark for snake species identification from
     photographs: Implications for herpetology and global health, Frontiers in Artificial
     Intelligence 4 (2021) 17.
[21] B. Mohapatra, D. A. Warrell, W. Suraweera, P. Bhatia, N. Dhingra, R. M. Jotkar, P. S.
     Rodriguez, K. Mishra, R. Whitaker, P. Jha, et al., Snakebite mortality in india: a nationally
     representative mortality survey, PLoS Negl Trop Dis 5 (2011) e1018.
[22] R. Borsodi, D. Papp, Incorporation of object detection models and location data into snake
     species classification, in: Working Notes of CLEF 2021 - Conference and Labs of the
     Evaluation Forum, 2021.
[23] R. Chamidullin, M. Šulc, J. Matas, L. Picek, A deep learning method for visual recognition of
     snake species, in: Working Notes of CLEF 2021 - Conference and Labs of the Evaluation
     Forum, 2021.
[24] L. Bloch, C. M. Friedrich, Efficientnets and vision transformers for snake species identification
     using image and location information, in: Working Notes of CLEF 2021 - Conference
     and Labs of the Evaluation Forum, 2021.
[25] K. Lekshmi, P. Balasundaram, G. Pradeep, S. Sekhar B, R. Kumar M, Automatic snake
     classification using deep learning algorithm, in: Working Notes of CLEF 2021 - Conference
     and Labs of the Evaluation Forum, 2021.
[26] L.-G. Coca, A.-T. Popa, R.-C. Croitoru, I. Bejan Luciana-Paraschiva, Adrian, Uaic-ai at
     snakeclef 2021: Impact of convolutions in snake species recognition, in: Working Notes of
     CLEF 2021 - Conference and Labs of the Evaluation Forum, 2021.
[27] D. Karthik, P. Mirunalini, J. Kumar, Snake species classification using transfer learning
     technique, in: Working Notes of CLEF 2021 - Conference and Labs of the Evaluation
     Forum, 2021.
[28] M. Tan, R. Pang, Q. V. Le, Efficientdet: Scalable and efficient object detection, in: Proceedings
     of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp.
     10781–10790.
[29] M. Tan, Q. Le, Efficientnet: Rethinking model scaling for convolutional neural networks,
     in: International Conference on Machine Learning, PMLR, 2019, pp. 6105–6114.
[30] K. He, X. Zhang, S. Ren, J. Sun, Deep Residual Learning for Image Recognition, in:
     Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
     2016.
[31] S. Xie, R. Girshick, P. Dollar, Z. Tu, K. He, Aggregated Residual Transformations for Deep
     Neural Networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern
     Recognition (CVPR), 2017.
[32] H. Zhang, C. Wu, Z. Zhang, Y. Zhu, H. Lin, Z. Zhang, Y. Sun, T. He, J. Mueller, R. Manmatha,
     M. Li, A. Smola, ResNeSt: Split-Attention Networks, 2020. arXiv:2004.08955.
[33] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. De-
     hghani, M. Minderer, G. Heigold, S. Gelly, et al., An image is worth 16x16 words: Trans-
     formers for image recognition at scale, arXiv preprint arXiv:2010.11929 (2020).
[34] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke,
     A. Rabinovich, Going deeper with convolutions, in: Proceedings of the IEEE conference
     on computer vision and pattern recognition, 2015, pp. 1–9.
[35] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image
     recognition, arXiv preprint arXiv:1409.1556 (2014).
[36] C. Szegedy, S. Ioffe, V. Vanhoucke, A. A. Alemi, Inception-v4, inception-resnet and the
     impact of residual connections on learning, in: Thirty-first AAAI conference on artificial
     intelligence, 2017.
[37] L. Picek, M. Šulc, J. Matas, J. Heilmann-Clausen, T. S. Jeppesen, T. Læssøe, T. Frøslev, Danish
     fungi 2020 – not just another image recognition dataset, 2021. arXiv:2103.10107.
[38] M. S. Norouzzadeh, A. Nguyen, M. Kosmala, A. Swanson, M. S. Palmer, C. Packer, J. Clune,
     Automatically identifying, counting, and describing wild animals in camera-trap images
     with deep learning, Proceedings of the National Academy of Sciences 115 (2018) E5716–
     E5725. doi:10.1073/pnas.1719367115.
[39] M. Willi, R. T. Pitman, A. W. Cardoso, C. Locke, A. Swanson, A. Boyer, M. Veldthuis,
     L. Fortson, Identifying animal species in camera trap images using deep learning and citizen
     science, Methods in Ecology and Evolution 10 (2019) 80–91. doi:10.1111/2041-210X.
     13099.
[40] M. Sulc, L. Picek, J. Matas, Plant recognition by inception networks with test-time class
     prior estimation, in: CLEF working notes 2018, CLEF: Conference and Labs of the Evalua-
     tion Forum, Sep. 2018, Avignon, France., 2018.
[41] L. Picek, M. Sulc, J. Matas, Recognition of the amazonian flora by inception networks with
     test-time class prior estimation, in: CLEF working notes 2019, CLEF: Conference and Labs
     of the Evaluation Forum, Sep. 2019, Lugano, Switzerland., 2019.
[42] A. Joly, H. Goëau, S. Kahl, L. Picek, T. Lorieul, E. Cole, B. Deneu, M. Servajean, R. Ruiz De
     Castañeda, I. Bolon, H. Glotin, R. Planqué, W.-P. Vellinga, A. Durso, H. Klinck, T. Denton,
     I. Eggel, P. Bonnet, H. Müller, Overview of lifeclef 2021: a system-oriented evaluation of
     automated species identification and species distribution prediction, in: Proceedings of
     the Twelfth International Conference of the CLEF Association (CLEF 2021), 2021.
A. Country Distribution

</pre>