Weighted Pseudo Labeling Refinement
for Plant Identification
Youshan Zhang, Brian D. Davison
Lehigh University, 113 Research Drive, Bethlehem, PA, 18015


Abstract
Unsupervised domain adaptation (UDA) focuses on transferring knowledge from a labeled source domain to an unlabeled target domain. However, existing domain adaptation methods struggle to handle DA scenarios with imbalanced labels or large domain discrepancies. In this paper, we propose a weighted pseudo labeling refinement model (WPLR) that balances the dataset using a weighted cross-entropy loss. We also utilize the CORAL loss to further reduce the domain difference. To improve the generalizability of the model, we develop an easy-to-hard pseudo labeling refinement process based on probabilistic soft selection to suppress noisy predicted target labels. Experimental results demonstrate that our WPLR model yields promising results on the PlantCLEF 2021 challenge.

Keywords
Unsupervised domain adaptation, Pseudo labeling refinement, Plant identification




1. Introduction
Automatic plant identification helps a general audience recognize plant species without the expertise of botanists. Deep neural networks can achieve high recognition performance when a large amount of labeled data is available for training, but they suffer significant performance degradation when deployed in a new domain due to domain shift. This domain shift, or domain mismatch, problem arises in the PlantCLEF plant identification task: due to the significant difference between herbarium sheets and real photos, classification models often do not generalize well to the novel field photo domain.
   To circumvent the domain shift issue, unsupervised domain adaptation (UDA) methods have been proposed, which transfer a model trained on a labeled source domain to an unlabeled target domain. Existing deep learning methods can be categorized into two major tracks: discrepancy-based methods [1, 2, 3] and adversarial learning methods [4, 5, 6]. The former align the distributions of the source and target domains by directly minimizing a difference metric between the feature distributions of the two domains, such as Maximum Mean Discrepancy (MMD) [1], CORrelation ALignment [2], Kullback-Leibler divergence [3], Jensen–Shannon divergence [7], and Wasserstein distance [8]. Methods in the latter category are inspired by GANs [9], and adversarial learning has shown its power in learning domain-invariant representations.
CLEF 2021 – Conference and Labs of the Evaluation Forum, September 21–24, 2021, Bucharest, Romania
" yoz217@lehigh.edu (Y. Zhang); bdd3@lehigh.edu (B. D. Davison)
~ https://sites.google.com/view/youshanzhang (Y. Zhang); http://www.cse.lehigh.edu/~brian/ (B. D. Davison)
 0000-0002-0074-0979 (Y. Zhang); 0000-0002-9326-3648 (B. D. Davison)
                                    © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Such a method consists of a domain discriminator and a feature extractor. The domain discriminator aims to distinguish the source domain from the target domain, while the feature extractor aims to learn domain-invariant representations that fool the domain discriminator [4, 5, 6]. There has been much exploration of adversarial learning methods, such as DANN [10], MCD [11], TADA [12], SymNets [13], and ACDA [14].
   Although many methods have been proposed for domain adaptation, most are evaluated on datasets with small domain divergence and may transfer poorly to datasets with large divergence; moreover, the data imbalance problem is not well addressed. To address these challenges, we offer two contributions:
   1. We propose a weighted cross-entropy loss to balance the categorical data. To minimize
      the domain divergence, we utilize the existing CORAL loss.
   2. To remove noisy pseudo labels in the target domain, we also employ an easy-to-hard
      pseudo labeling refinement process by probabilistic soft selection. We then form a high-
      quality pseudo-labeled target domain to improve the generalizability of the model.


2. Dataset
PlantCLEF 2021 is the large-scale dataset of the PlantCLEF 2021 task [15, 16], organized in the context of the LifeCLEF 2021 challenge. Fig. 1 shows some challenging images in this dataset, and Tab. 1 lists statistics of the PlantCLEF 2021 dataset. Due to the significant difference between herbarium sheets and real photos, it is extremely difficult to identify the correct class. The images are the same as in the PlantCLEF 2020 dataset [17], but the 2021 task also introduces five "traits" that exhaustively cover all species of the challenge.




Figure 1: Example images of the herbarium domain and photo domain. The large discrepancy between
the two domains makes it difficult to improve the performance of the model.
Table 1
Statistics of the PlantCLEF 2021 dataset

    Domain                               Number of Samples   Number of Classes
    Herbarium (H)                                  320,750                 997
    Herbarium_photo_associations (A)                 1,816                 244
    Photo (P)                                        4,482                 375
    Test (T)                                         3,186                   -



3. Methods
In this section, we first define the problem and notation for UDA, and then describe the different
components of our Weighted Pseudo Labeling Refinement (WPLR) model.

3.1. Problem and notation
In this work, we consider the unsupervised domain adaptation (UDA) classification problem in the following setting. There exists a labeled source domain $\mathcal{D}_{\mathcal{S}} = \{\mathcal{X}_{\mathcal{S}}^{i}, \mathcal{Y}_{\mathcal{S}}^{i}\}_{i=1}^{\mathcal{N}_{\mathcal{S}}}$ of $\mathcal{N}_{\mathcal{S}}$ labeled samples in $C$ categories and a target domain $\mathcal{D}_{\mathcal{T}} = \{\mathcal{X}_{\mathcal{T}}^{j}\}_{j=1}^{\mathcal{N}_{\mathcal{T}}}$ of $\mathcal{N}_{\mathcal{T}}$ samples without any labels (i.e., $\mathcal{Y}_{\mathcal{T}}$ is unknown). The samples $\mathcal{X}_{\mathcal{S}}$ and $\mathcal{X}_{\mathcal{T}}$ obey the marginal distributions $P_{\mathcal{S}}$ and $P_{\mathcal{T}}$, and the conditional distributions of the two domains are denoted $Q_{\mathcal{S}}$ and $Q_{\mathcal{T}}$. Due to the discrepancy between the two domains, the distributions are assumed to be different, i.e., $P_{\mathcal{S}} \neq P_{\mathcal{T}}$ and $Q_{\mathcal{S}} \neq Q_{\mathcal{T}}$. Our ultimate goal is to learn a classifier $F$ on top of a feature extractor $G$ that reduces the domain discrepancy and improves the generalization ability of the classifier on the target domain.




Figure 2: The weight of each class.
Figure 3: Architecture of the WPLR model. We first utilize NASNetLarge as the feature extractor 𝐺
to extract features from the two domains (𝐺(𝒳𝒮 ) and 𝐺(𝒳𝒯 )). The shared classifier 𝐹 is then trained
using the extracted features. ℒ𝒲𝒮 is the weighted source classification loss, ℒ𝐶𝑂𝑅𝐴𝐿 is the CORAL
loss, and ℒ𝒯 is the pseudo-labeled target domain classification loss. $\{Q(\mathcal{X}_{\mathcal{T}}^{j}), Q(\mathcal{Y}_{\mathcal{T}}^{j})\}_{j=1}^{n_{p_t}}$ is the pseudo-labeled target domain after 𝑇 rounds of the pseudo labeling refinement process. Best viewed in color.


3.2. Weighted source classifier
The classifier for the source domain is trained using the typical cross-entropy loss. However, the number of samples in each category is imbalanced. To handle this issue, we develop a weighted source classifier that balances the weight of each category based on the source samples. We define the weight of each class as

\[ W = \frac{\mathrm{median}\big(\{\mathcal{N}_{\mathcal{S}}^{c}/\mathcal{N}_{\mathcal{S}}\}_{c=1}^{C}\big)}{\{\mathcal{N}_{\mathcal{S}}^{c}/\mathcal{N}_{\mathcal{S}}\}_{c=1}^{C}}, \tag{1} \]

where $\mathcal{N}_{\mathcal{S}}^{c}$ is the number of samples in class $c$, $\{\mathcal{N}_{\mathcal{S}}^{c}/\mathcal{N}_{\mathcal{S}}\}_{c=1}^{C} \in \mathbb{R}^{997 \times 1}$ is the frequency of images in each class, and $\mathrm{median}(\cdot)$ takes the median value of the frequencies. Because the frequency values vary widely, the median represents the central frequency better than the mean would. Fig. 2 shows the weight of each class (997 classes in total). We therefore define the weighted cross-entropy loss for the labeled source domain in Eq. 2:

                                      𝒮   𝒩
                                   1 ∑︁
                            ℒ𝒲𝒮 =       𝑊𝑖 × ℒ𝑐𝑒 (𝐹 (𝐺(𝒳𝒮𝑖 )), 𝒴𝒮𝑖 ),                                 (2)
                                  𝒩𝒮
                                          𝑖=1

where ℒ𝑐𝑒 is the typical cross-entropy loss, 𝐹 is the classifier in Fig. 3, and 𝐹 (𝐺(𝒳𝒮𝑖 )) is the
predicted label.
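
As a concrete illustration of Eqs. (1) and (2), a minimal PyTorch sketch follows; the function and variable names are ours, and the paper specifies only the equations themselves.

```python
import torch
import torch.nn.functional as F


def class_weights(source_labels: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Median-frequency class weights W of Eq. (1)."""
    counts = torch.bincount(source_labels, minlength=num_classes).float()
    freq = (counts / counts.sum()).clamp(min=1e-12)  # N_S^c / N_S; avoid /0
    return freq.median() / freq                      # one weight per class


def weighted_source_loss(logits: torch.Tensor, labels: torch.Tensor,
                         weights: torch.Tensor) -> torch.Tensor:
    """Weighted cross-entropy L_WS of Eq. (2); logits = F(G(X_S))."""
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    return (weights[labels] * per_sample).mean()     # (1/N_S) sum_i W_i * L_ce
```

Note that passing `weight=weights` directly to `F.cross_entropy` with `reduction='mean'` would normalize by the sum of the selected weights rather than by $\mathcal{N}_{\mathcal{S}}$, so the explicit form above matches Eq. 2 more literally.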

3.3. CORAL loss
CORrelation ALignment (CORAL) loss [2] is a frequently used distance-based loss function for minimizing the difference between the source and target domains. We integrate the CORAL loss during training as follows:

\[ \mathcal{L}_{CORAL} = \frac{1}{4d^2} \big\| COV(F(G(\mathcal{X}_{\mathcal{S}}))) - COV(F(G(\mathcal{X}_{\mathcal{T}}))) \big\|_{F}^{2}, \tag{3} \]

where $d$ is the feature dimensionality, $COV(\cdot)$ computes the covariance matrix of the source or target features, and $\|\cdot\|_{F}^{2}$ denotes the squared matrix Frobenius norm. With this term, our model minimizes the domain divergence between the source domain and the target domain during training.
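
For reference, a compact PyTorch sketch of Eq. (3) could look as follows; the helper names are illustrative, and the unbiased (n − 1) covariance normalization is our assumption.

```python
import torch


def coral_loss(source_out: torch.Tensor, target_out: torch.Tensor) -> torch.Tensor:
    """CORAL loss of Eq. (3); inputs are (n, d) batches of F(G(X))."""
    d = source_out.size(1)

    def cov(x: torch.Tensor) -> torch.Tensor:
        x = x - x.mean(dim=0, keepdim=True)   # center each feature dimension
        return x.t() @ x / (x.size(0) - 1)    # (d, d) covariance matrix

    diff = cov(source_out) - cov(target_out)
    return (diff ** 2).sum() / (4 * d ** 2)   # squared Frobenius norm / 4d^2
```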

3.4. Pseudo labeling refinement
To further reduce the domain difference, we also generate pseudo labels for the target domain. However, the detrimental effects of bad pseudo-labels are still significant. To mitigate this issue, we employ a $T$-step recurrent easy-to-hard pseudo-label refinement process that improves the quality of the pseudo-labels in the target domain by imposing probabilistic soft selection [18, 19].
   The initial shared classifier $F$ is optimized by $\mathcal{L}_{\mathcal{WS}}$. At inference time, we directly obtain the prediction $F(G(\mathcal{X}_{\mathcal{T}}^{j}))$ for a target domain sample. Let $\mathrm{Softmax}(F(G(\mathcal{X}_{\mathcal{T}}^{j})))$ be the predicted probability for each class, and $\mathcal{Y}_{\mathcal{PT}}^{j} = \max(\mathrm{Softmax}(F(G(\mathcal{X}_{\mathcal{T}}^{j}))))_{index}$ be its dominant class label, where $\max(\cdot)_{index}$ returns the index of the maximum probability value. For probabilistic soft selection, a higher-quality pseudo label is one that satisfies $\max(\mathrm{Softmax}(F(G(\mathcal{X}_{\mathcal{T}}^{j})))) > p_t$, where $p_t$ is the probability threshold at the $t$-th refinement step. In the $T$-step recurrent easy-to-hard refinement, $p_t$ is high in early steps, which admit only easy examples, and lower in later steps, which admit harder examples; hence $p_1 > p_2 > \cdots > p_T$.
   In pseudo labeling refinement, we form a robust new pseudo-labeled domain according to

\[ \{Q(\mathcal{X}_{\mathcal{T}}^{j}), Q(\mathcal{Y}_{\mathcal{T}}^{j})\}_{j=1}^{n_{p_t}} \quad \text{if and only if} \quad \max(\mathrm{Softmax}(F(G(\mathcal{X}_{\mathcal{T}}^{j})))) > p_t, \tag{4} \]

where $Q(\cdot)$ denotes the high-quality selection and $n_{p_t}$ is the number of higher-quality pseudo labels for the target domain. We can hence mitigate the detrimental effects of bad pseudo-labels using Eq. 4.
Similar to Eq. 2, we define the pseudo-labeled target domain loss as

\[ \mathcal{L}_{\mathcal{T}} = \frac{1}{n_{p_t}} \sum_{j=1}^{n_{p_t}} W_{j} \times \mathcal{L}_{ce}\big(F(G(Q(\mathcal{X}_{\mathcal{T}}^{j}))), Q(\mathcal{Y}_{\mathcal{T}}^{j})\big), \tag{5} \]

where $W$ is the weight of each class and $\mathcal{L}_{ce}$ is the cross-entropy loss.
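
A sketch of the soft selection in Eq. (4) and the refined target loss in Eq. (5) is given below; `model` stands for the composition F(G(·)), and the helper names are ours.

```python
import torch
import torch.nn.functional as F


@torch.no_grad()
def select_pseudo_labels(model, x_t: torch.Tensor, p_t: float):
    """Probabilistic soft selection of Eq. (4): keep target samples whose
    maximum softmax probability exceeds the step threshold p_t."""
    probs = F.softmax(model(x_t), dim=1)
    conf, pseudo = probs.max(dim=1)        # confidence and dominant class
    keep = conf > p_t                      # easy-to-hard selection mask
    return x_t[keep], pseudo[keep]         # {Q(X_T^j), Q(Y_T^j)}, n_pt samples


def pseudo_target_loss(model, x_q, y_q, weights):
    """Weighted cross-entropy L_T of Eq. (5) on the refined pseudo labels."""
    per_sample = F.cross_entropy(model(x_q), y_q, reduction="none")
    return (weights[y_q] * per_sample).mean()
```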

3.5. WPLR model
Fig. 3 depicts the overall framework of our proposed WPLR model. Taken together, our model
minimizes the following objective function:

\[ \arg\min \Big( \mathcal{L}_{\mathcal{WS}} + \mathcal{L}_{CORAL} + \sum_{t=1}^{T} \mathcal{L}_{\mathcal{T}}^{t} \Big), \tag{6} \]

where $\mathcal{L}_{\mathcal{WS}}$ is the weighted source classification loss, $\mathcal{L}_{CORAL}$ is the CORAL loss, and $\mathcal{L}_{\mathcal{T}}^{t}$ is the pseudo-labeled target domain classification loss at refinement step $t$.
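
Putting the pieces together, one training round over Eq. (6) might look like the following sketch. Here `model`, `opt`, and the feature tensors are placeholders, and folding all $T$ thresholds into a single objective is one possible reading of the summation in Eq. (6).

```python
# One optimization round over Eq. (6); reuses weighted_source_loss,
# coral_loss, select_pseudo_labels and pseudo_target_loss from the
# sketches above. model (= F composed with G), opt, x_s, y_s, x_t and
# weights stand for the network, the Adam optimizer, the source/target
# features and the Eq. (1) class weights.
thresholds = [0.9, 0.8, 0.7, 0.6, 0.5]    # {p_t}, T = 5 (see Sec. 4.1)

opt.zero_grad()
logits_s = model(x_s)                     # source predictions F(G(X_S))
loss = weighted_source_loss(logits_s, y_s, weights)      # L_WS, Eq. (2)
loss = loss + coral_loss(logits_s, model(x_t))           # L_CORAL, Eq. (3)
for p_t in thresholds:                                   # sum_t L_T^t
    x_q, y_q = select_pseudo_labels(model, x_t, p_t)
    if len(y_q) > 0:                      # skip steps with no confident picks
        loss = loss + pseudo_target_loss(model, x_q, y_q, weights)
loss.backward()
opt.step()
```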
4. Experiments
4.1. Implementation details
We first extract features from the last fully connected layer [20, 21, 22] of a retrained NASNetLarge [23] model, so each image is represented by a feature vector of size 1 × 1000. Therefore, the feature representation of the herbarium domain (H) has size 320,750 × 1000, the herbarium_photo_associations domain (A) has size 1,816 × 1000, the photo domain (P) has size 4,482 × 1000, and the test domain (T) has size 3,186 × 1000. Domain H + A has size 322,566 × 1000. In Tab. 2, H → P denotes learning knowledge from domain H and applying it to domain P [24].
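
As a rough sketch of this step, the snippet below uses the `timm` package's `nasnetalarge` checkpoint as a stand-in for the retrained model (an assumption on our part; NASNetLarge is not in torchvision), with NASNetLarge's standard 331 × 331 input resolution.

```python
import timm
import torch

# The 1000-d output of the last fully connected layer serves as the
# per-image feature vector.
extractor = timm.create_model("nasnetalarge", pretrained=True).eval()


@torch.no_grad()
def extract_features(images: torch.Tensor) -> torch.Tensor:
    """images: (n, 3, 331, 331) batch -> (n, 1000) feature matrix."""
    return extractor(images)
```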
   We implement our approach using PyTorch. The outputs of the three Linear layers are 1000, 1000 and $|C|$, respectively. The parameters of recurrent pseudo labeling are $T = 5$ and $\{p_t\}_{t=1}^{5} = [0.9, 0.8, 0.7, 0.6, 0.5]$. The learning rate (0.001), batch size (64), optimizer (Adam) and number of epochs ($\mathcal{N}_{\mathcal{S}}/64$) are determined by performance on the source domain. Experiments are performed on a GeForce 1080 Ti. We also compare our results with four domain adaptation methods: DANN [10], ADDA [5], NASNetLarge-𝐴𝐶𝐿 [24] and BA3US [25].
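
The shared classifier can be sketched as three Linear layers with output sizes 1000, 1000 and |C|; the ReLU activations below are our assumption, since the paper does not state the non-linearities.

```python
import torch.nn as nn

num_classes = 997  # |C| for the herbarium source domain
classifier = nn.Sequential(
    nn.Linear(1000, 1000), nn.ReLU(),   # input: 1000-d extracted features
    nn.Linear(1000, 1000), nn.ReLU(),
    nn.Linear(1000, num_classes),       # outputs: 1000, 1000 and |C|
)
```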

4.2. Results

Table 2
Accuracy (%) on the PlantCLEF 2021 dataset for the photo domain

    Task                            A → P    H → P    H+A → P
    DANN [10]                        1.07     1.85       2.01
    ADDA [5]                         2.95     3.05       3.43
    BA3US [25]                       3.56     4.65       5.31
    NASNetLarge-𝐴𝐶𝐿 [24]             5.98     8.64       9.67
    WPLR − ℒ𝐶𝑂𝑅𝐴𝐿 − ℒ𝒯               6.03     9.12      10.03
    WPLR − ℒ𝒯                        6.12     9.23      11.46
    WPLR − ℒ𝐶𝑂𝑅𝐴𝐿                    6.22     9.47      12.51
    WPLR                             6.38     9.645     13.44



   Tab. 2 shows the results of our WPLR model on the photo domain. We report the accuracy over the whole photo domain, $Acc = \sum_{j=1}^{\mathcal{N}_{\mathcal{T}}} (\hat{\mathcal{Y}}_{\mathcal{T}}^{j} == \mathcal{Y}_{\mathcal{T}}^{j}) / \mathcal{N}_{\mathcal{T}} \times 100$, where $\hat{\mathcal{Y}}_{\mathcal{T}}$ is the predicted label for the target domain. Compared with the other four methods, our WPLR model achieves the highest accuracy on all three tasks, especially on the H+A → P task.
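
This accuracy reduces to a one-line comparison of predicted and true labels, e.g.:

```python
import torch

def target_accuracy(pred: torch.Tensor, true: torch.Tensor) -> float:
    """Acc = sum(pred == true) / N_T * 100 over the whole photo domain."""
    return (pred == true).float().mean().item() * 100.0
```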
   We also conduct a careful ablation study to demonstrate the effect of each loss function on the final classification accuracy. Note that the weighted source classification loss ℒ𝒲𝒮 is required for UDA. "WPLR − ℒ𝐶𝑂𝑅𝐴𝐿 − ℒ𝒯" is implemented without ℒ𝐶𝑂𝑅𝐴𝐿 and ℒ𝒯; it is a simple model that only reduces the source risk using ℒ𝒲𝒮, without minimizing the domain discrepancy. "WPLR − ℒ𝐶𝑂𝑅𝐴𝐿" reports results without the CORAL loss, and "WPLR − ℒ𝒯" reports results without the 𝑇-step pseudo labeling refinement process. We find that as more loss functions are added, the accuracy of our model keeps improving.
Table 3
MRR on the PlantCLEF 2021 challenge for the test domain

    Team                           Full test set    Sub-set of the test set
    Organizer's submission [15]            0.198                      0.093
    Neuon AI                               0.181                      0.158
    LU (ours)                              0.065                      0.037
    Domain_run                             0.065                      0.037
    To_be                                  0.056                      0.038

In terms of their effect on classification accuracy, the loss functions are ordered as ℒ𝒯 > ℒ𝐶𝑂𝑅𝐴𝐿. Therefore, the proposed weighted classification loss, the CORAL loss, and the easy-to-hard target domain pseudo labeling refinement are all effective in minimizing the target domain risk and improving accuracy.
   We also list the final performance of our model on the test domain in Tab. 3. Our model earns the second position among participating teams in the PlantCLEF 2021 challenge. We provided a total of nine submissions; their MRR on the full test set ranged from 0.034 to 0.065 as a result of varying hyperparameters (different numbers of iterations, 𝑇, and 𝑝𝑡).


5. Discussion
There are two compelling advantages of our WPLR model. First, we propose a weighted cross-entropy loss to mitigate the imbalanced data issue in the source domain. Second, we develop an easy-to-hard refinement process to improve the quality of pseudo labels in the target domain. This strategy relies on probabilistic soft selection and hence can push the shared classifier 𝐹 towards the target domain. Compared with the other baselines in Tab. 2, the 𝑇-step easy-to-hard refinement process is effective in improving classification accuracy and further reduces the domain discrepancy. However, our model only earns the second position in the challenge, and its results are somewhat lower than the Organizer's submission. One underlying reason is that our model cannot extract very robust invariant features. In future work, we will therefore consider designing a better feature extraction method to distill domain-invariant features across the two domains. In addition, we would like to include more external data during training (e.g., GBIF [26]).


6. Conclusion
In this paper, we propose a novel weighted pseudo labeling refinement (WPLR) method for domain adaptation to solve the plant identification problem. We develop a weighted cross-entropy loss to balance the categorical data and utilize the CORAL loss to minimize the domain divergence. We also employ an easy-to-hard pseudo labeling refinement process based on probabilistic soft selection, which improves the quality of pseudo labels and removes the detrimental effects of bad labels. Experimental results demonstrate that our proposed WPLR model outperforms several baselines.
References
 [1] E. Tzeng, J. Hoffman, N. Zhang, K. Saenko, T. Darrell, Deep domain confusion: Maximizing
     for domain invariance, arXiv preprint arXiv:1412.3474 (2014).
 [2] B. Sun, K. Saenko, Deep CORAL: Correlation alignment for deep domain adaptation, in:
     European Conference on Computer Vision, Springer, 2016, pp. 443–450.
 [3] Z. Meng, J. Li, Y. Gong, B. Juang, Adversarial teacher-student learning for unsupervised
     domain adaptation, in: 2018 IEEE International Conference on Acoustics, Speech and
     Signal Processing (ICASSP), IEEE, 2018, pp. 5949–5953.
 [4] Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Laviolette, M. Marchand,
     V. Lempitsky, Domain-adversarial training of neural networks, The Journal of Machine
     Learning Research 17 (2016) 2096–2030.
 [5] E. Tzeng, J. Hoffman, K. Saenko, T. Darrell, Adversarial discriminative domain adaptation,
     in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,
     2017, pp. 7167–7176.
 [6] Y. Zhang, H. Ye, B. D. Davison, Adversarial reinforcement learning for unsupervised
     domain adaptation, in: Proceedings of the IEEE/CVF Winter Conference on Applications
     of Computer Vision, 2021, pp. 635–644.
 [7] J. Jiang, X. Wang, M. Long, J. Wang, Resource efficient domain adaptation, in: Proceedings
     of the 28th ACM International Conference on Multimedia, 2020, pp. 2220–2228.
 [8] B. Bhushan Damodaran, B. Kellenberger, R. Flamary, D. Tuia, N. Courty, DeepJDOT: Deep
     joint distribution optimal transport for unsupervised domain adaptation, in: Proceedings
     of the European Conference on Computer Vision (ECCV), 2018, pp. 447–463.
 [9] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville,
     Y. Bengio, Generative adversarial nets, in: Advances in Neural Information Processing
     Systems, 2014, pp. 2672–2680.
[10] M. Ghifary, W. B. Kleijn, M. Zhang, Domain adaptive neural networks for object recognition,
     in: Pacific Rim International Conference on Artificial Intelligence, Springer, 2014, pp. 898–
     904.
[11] K. Saito, K. Watanabe, Y. Ushiku, T. Harada, Maximum classifier discrepancy for unsuper-
     vised domain adaptation, in: Proceedings of the IEEE Conference on Computer Vision
     and Pattern Recognition, 2018, pp. 3723–3732.
[12] X. Wang, L. Li, W. Ye, M. Long, J. Wang, Transferable attention for domain adaptation,
     in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, 2019, pp.
     5345–5352.
[13] Y. Zhang, H. Tang, K. Jia, M. Tan, Domain-symmetric networks for adversarial domain
     adaptation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern
     Recognition, 2019, pp. 5031–5040.
[14] Y. Zhang, B. D. Davison, Adversarial continuous learning in unsupervised domain adaptation, in: ICPR Workshops (2), 2020, pp. 672–687.
[15] H. Goëau, P. Bonnet, A. Joly, Overview of PlantCLEF 2021: cross-domain plant identifica-
     tion, in: Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum,
     2021.
[16] A. Joly, H. Goëau, S. Kahl, L. Picek, T. Lorieul, E. Cole, B. Deneu, M. Servajean, R. Ruiz De
     Castañeda, I. Bolon, H. Glotin, R. Planqué, W.-P. Vellinga, A. Dorso, H. Klinck, T. Denton,
     I. Eggel, P. Bonnet, H. Müller, Overview of LifeCLEF 2021: a system-oriented evaluation
     of automated species identification and species distribution prediction, in: Proceedings of
     the Twelfth International Conference of the CLEF Association (CLEF 2021), 2021.
[17] H. Goëau, P. Bonnet, A. Joly, Overview of LifeCLEF plant identification task 2020, in: CLEF Working Notes 2020, CLEF: Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 2020.
[18] Y. Zhang, B. D. Davison, Deep spherical manifold gaussian kernel for unsupervised domain
     adaptation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
     Recognition Workshops (CVPRW), 2021, pp. 4443–4452.
[19] Y. Zhang, B. D. Davison, Efficient pre-trained features and recurrent pseudo-labeling
     in unsupervised domain adaptation, in: Proceedings of the IEEE/CVF Conference on
     Computer Vision and Pattern Recognition Workshops (CVPRW), 2021, pp. 2719–2728.
[20] Y. Zhang, J. P. Allem, J. B. Unger, T. B. Cruz, Automated identification of hookahs (wa-
     terpipes) on Instagram: an application in feature extraction using convolutional neural
     network and support vector machine classification, Journal of Medical Internet Research
     20 (2018) e10513.
[21] Y. Zhang, B. D. Davison, Modified distribution alignment for domain adaptation with
     pre-trained Inception ResNet, arXiv preprint arXiv:1904.02322 (2019).
[22] Y. Zhang, B. D. Davison, Impact of ImageNet model selection on domain adaptation,
     in: Proceedings of the IEEE Winter Conference on Applications of Computer Vision
     Workshops, 2020, pp. 173–182.
[23] B. Zoph, V. Vasudevan, J. Shlens, Q. V. Le, Learning transferable architectures for scalable
     image recognition, in: Proceedings of the IEEE conference on computer vision and pattern
     recognition, 2018, pp. 8697–8710.
[24] Y. Zhang, B. D. Davison, Adversarial consistent learning on partial domain adaptation of
     PlantCLEF 2020 challenge, in: CLEF working notes 2020, CLEF: Conference and Labs of
     the Evaluation Forum, 2020.
[25] J. Liang, Y. Wang, D. Hu, R. He, J. Feng, A balanced and uncertainty-aware approach for
     partial domain adaptation, arXiv preprint arXiv:2003.02541 (2020).
[26] L. Picek, M. Sulc, J. Matas, Recognition of the Amazonian flora by Inception networks with test-time class prior estimation, in: Working Notes of CLEF 2019 - Conference and Labs of the Evaluation Forum, 2019.