<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Combining Present-Only and Present-Absent Data with Pseudo-Label Generation for Species Distribution Modeling Notebook for the LifeCLEF Lab at CLEF 2024</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Yi-Chia</forename><surname>Chen</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">National Taiwan University</orgName>
								<address>
									<settlement>Taipei</settlement>
									<country key="TW">Taiwan</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Tai</forename><surname>Peng</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">National Taiwan University</orgName>
								<address>
									<settlement>Taipei</settlement>
									<country key="TW">Taiwan</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Wei-Hua</forename><surname>Li</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">National Taiwan University</orgName>
								<address>
									<settlement>Taipei</settlement>
									<country key="TW">Taiwan</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Chu-Song</forename><surname>Chen</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">National Taiwan University</orgName>
								<address>
									<settlement>Taipei</settlement>
									<country key="TW">Taiwan</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Combining Present-Only and Present-Absent Data with Pseudo-Label Generation for Species Distribution Modeling Notebook for the LifeCLEF Lab at CLEF 2024</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">922BE9783B40B37C9627AF98B5BBA83A</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:57+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Species distribution modeling</term>
					<term>Presence-Only data</term>
					<term>Pseudo labels</term>
					<term>LifeCLEF</term>
					<term>multimodal deep learning</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Predicting the composition of plant species at specific times and locations is crucial for biodiversity management and conservation. In this report, we leverage data from the GeoLifeCLEF 2024 challenge, which includes approximately 5 million plant occurrence records from Europe, a training set of about 90,000 plots, and a test set with 5,000 plots. These data encompass various modalities, including satellite images, climatic time series, land cover, human footprint, bioclimatic, and soil variables. Our approach combines a pseudo-label training framework based on large-scale data and multimodal pretrained deep learning models to address challenges such as multi-label learning from single positive labels, strong class imbalance, and large-scale data processing. On the private test set, our method achieved a score of 0.36837, securing second place on the leaderboard, just 0.04 points behind first place. We discuss the design of our approach and reflect on the results. Our code is available on GitHub.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Species distribution modeling (SDM) is a field of research focused on predicting the species most likely to be observed at a given location and time. In recent years, the research community has collected a vast number of species observations from various regions, providing the opportunity to train deep learning models to predict species distributions.</p><p>In GeoLifeCLEF 2024 <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>, a large-scale training set is provided, but most of its samples carry only single or partial positive labels (about 5 million Presence-Only (PO) records versus only 90,000 Presence-Absence (PA) records with exhaustive labels). Effectively integrating the PO data with the PA data is therefore one of the key challenges.</p><p>In this report, we propose a hybrid model that combines different CNN-based architectures for SDM. Furthermore, we introduce the framework we employed during the competition, which effectively utilized the abundant PO data provided by the organizers to generate pseudo-labels. These pseudo-labels were subsequently combined with the PA data to fine-tune our models.</p><p>The rest of this report is structured as follows. Section 2 reviews related work. Section 3 provides a detailed description of the dataset and the evaluation metric for the competition. Section 4 introduces the proposed method. Section 5 presents the experimental results and ablation study. Finally, Section 6 concludes the report.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Background and Related Work</head><p>This section provides a brief overview of relevant work on single-positive multi-label learning (SPMLL) and its application to species distribution modeling.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Single-Positive Multi-Label Learning (SPMLL)</head><p>Multi-label learning (MLL) <ref type="bibr" target="#b2">[3]</ref> has had many practical applications, aiming to train models that assign multiple labels to each sample. However, collecting large amounts of training data with complete multi-label annotations is difficult and time-consuming; SPMLL has therefore been proposed to alleviate the burden of multi-label annotation.</p><p>Unlike MLL, the goal of SPMLL is to achieve multi-label learning from samples annotated with only a single positive label. In computer vision, several works employ pseudo-label generation. For example, Zhou et al. <ref type="bibr" target="#b3">[4]</ref> proposed an entropy-maximization (EM) loss and asymmetric pseudo-labeling. Xie et al. <ref type="bibr" target="#b5">[5]</ref> proposed Label-Aware global Consistency (LAC) regularization. Liu et al. <ref type="bibr" target="#b6">[6]</ref> provided a theoretical guarantee for learning from pseudo-labels in SPMLL and proposed MIME, which simultaneously trains the model and updates the pseudo-labels. Although these methods are effective, they were not designed specifically for species distribution modeling, and the number of categories they predict is smaller.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">SPMLL for Species Distribution Modeling</head><p>In GeoLifeCLEF 2022, several CNN-based SDM models <ref type="bibr" target="#b7">[7,</ref><ref type="bibr" target="#b8">8]</ref> were proposed for species distribution modeling. However, training CNN-based models for multi-label prediction using samples with single positive labels is challenging. In GeoLifeCLEF 2023, Ung et al. <ref type="bibr" target="#b9">[9]</ref> therefore proposed a three-step training strategy: pre-train the model on PA data with a BCE loss, train extensively on PO data with a cross-entropy loss, and finally fine-tune on PA data. Inspired by this work, we also designed a three-step process that makes good use of the single-label PO data to assist species distribution modeling.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Data and Evaluation Metric</head><p>In this section, we introduce the multimodal dataset provided by the GeoLifeCLEF 2024 competition and the evaluation metric used for the competition.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Data</head><p>The GeoLifeCLEF 2024 Challenge aims to predict plant species presence at specific locations using various related features based on the GeoLifeCLEF 2023 multimodal dataset <ref type="bibr" target="#b10">[10]</ref>. The dataset encompasses 38 European countries, covering eight biogeographic regions: Alpine, Atlantic, Black Sea, Boreal, Continental, Mediterranean, Pannonian, and Steppic. The data were collected between 2017 and 2021, ensuring comprehensive temporal and spatial coverage across Europe. GeoLifeCLEF 2024 includes approximately 10,000 plant species observed through both Presence-Only (PO) and Presence-Absence (PA) surveys. The PO data consists of 5 million records extracted from trusted sources, while the PA data comprises 90,000 surveys conducted by botanical experts.</p><p>The dataset incorporates several modalities of data, each providing unique insights into the environmental conditions affecting plant species distribution:</p><p>• Satellite Raster Images:</p><p>-Sentinel-2 Images: These include RGB and Near-Infra-Red (NIR) bands, capturing data over a 1280 meter × 1280 meter area at a 10-meter resolution, formatted into 128 × 128 pixel patches. 
-Landsat Time Series: This data spans from 2000 to 2020, offering quarterly median composites of six spectral bands (blue, green, red, NIR, SWIR1, and SWIR2) with a 30-meter resolution.</p><p>• Climatic Data:</p><p>-Bioclimatic Rasters: Nineteen low-resolution rasters describing various climatic variables, such as mean annual air temperature and precipitation, provided as GeoTIFF files with a 30-arcsecond resolution (1 km).</p><p>• Soil Variables:</p><p>-Soil-Grids: Nine low-resolution rasters detailing soil properties like pH, clay content, organic carbon, nitrogen, bulk density, sand, silt, and cation exchange capacity, measured at a depth range of 5 to 15 centimeters.</p><p>• Human Footprint:</p><p>-Sixteen rasters representing human activities and their pressures on the environment, including population density, road networks, and night-time lights. These data are provided for two time periods <ref type="bibr">(1993 and 2009)</ref>, allowing for the assessment of changes over time.</p><p>• Elevation and Land Cover:</p><p>-Elevation Data: High-resolution elevation data provided as a single GeoTIFF file with a 1-arcsecond resolution (30 m). -Land Cover: Multi-band raster files describing land cover classes using classifications like IGBP and LCCS, provided with a resolution of 500 meters.</p><p>The dataset matches species observations with different environmental factors commonly used in species distribution modeling, such as climate conditions, soil characteristics, land cover, and human impact. All data are provided at suitable spatial resolutions to support accurate modeling.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Evaluation Metric</head><p>The evaluation metric for the GeoLifeCLEF 2024 competition is the samples-averaged 𝐹1-score, calculated on a test set composed of species Presence-Absence (PA) samples. This metric addresses a multi-label classification problem, providing an average measure of the overlap between the predicted and actual sets of species present at specific locations and times.</p><p>The samples-averaged 𝐹1-score is computed using the following formula:</p><formula xml:id="formula_0">𝐹1 = (1/𝑁) ∑_{𝑖=1}^{𝑁} TP𝑖 / (TP𝑖 + (FP𝑖 + FN𝑖)/2)</formula><p>Here, 𝑁 is the total number of test PA samples; TP𝑖 (true positives) is the number of species correctly predicted to be present; FP𝑖 (false positives) is the number of species incorrectly predicted to be present; and FN𝑖 (false negatives) is the number of species that are present but not predicted.</p></div>
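As an illustration, the metric above can be sketched in a few lines of Python (the function name and the per-plot set representation are our own; this is not the official GeoLifeCLEF evaluation code):

```python
def samples_averaged_f1(y_true, y_pred):
    """Samples-averaged F1 over N test plots, per the formula above.

    y_true, y_pred: per-plot sets of species ids. Illustrative sketch
    only, not the official GeoLifeCLEF evaluation implementation.
    """
    scores = []
    for truth, pred in zip(y_true, y_pred):
        tp = len(truth & pred)   # correctly predicted species
        fp = len(pred - truth)   # predicted but absent
        fn = len(truth - pred)   # present but missed
        denom = tp + (fp + fn) / 2
        scores.append(tp / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# toy example with two test plots
truths = [{1, 2, 3}, {4}]
preds = [{1, 2}, {4, 5}]
print(round(samples_averaged_f1(truths, preds), 4))  # 0.7333
```

Note that, unlike a micro-averaged F1, this metric weights every test plot equally regardless of how many species it contains.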
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Proposed Method</head><p>This section introduces our proposed multimodal deep learning model and pseudo-label training framework.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Model architectures</head><p>To address the multi-label plant species prediction problem, we designed and experimented with a multimodal ensemble neural network model. This model integrates various data sources, including preprocessed tabular data (comprising metadata, Human Footprint, Landcover, and Soil), Landsat and Sentinel satellite imagery, and Bioclimatic Rasters. Each data type is processed by specialized neural networks before being fused for classification. The model architecture is illustrated in Figure <ref type="figure" target="#fig_0">1</ref>.</p><p>We use a Multi-Layer Perceptron (MLP) to extract features from the preprocessed tabular data. The input tabular feature vector first passes through a fully connected layer, projecting the input features to 1,000 neurons. This is followed by batch normalization <ref type="bibr" target="#b11">[11]</ref> to stabilize the data distribution and accelerate training. The activation function is ReLU <ref type="bibr" target="#b12">[12]</ref>, which introduces non-linearity to enhance the model's expressive capability. This structure is repeated three times, resulting in three hidden layers of 1,000 neurons each. Finally, a fully connected layer reduces the feature dimension to 512, outputting a 512-dimensional feature vector.</p><p>The Landsat data processing module adopts a convolutional neural network structure based on ResNet18 <ref type="bibr" target="#b13">[13]</ref>. To fully exploit the rich time series of remote sensing data, we modified the ResNet18 architecture to suit the characteristics of Landsat data. Initially, we apply layer normalization <ref type="bibr" target="#b14">[14]</ref> to the input Landsat data to stabilize the input data distribution. 
We then use the ResNet18 model for feature extraction but modify the first convolutional layer to accept 6 input channels instead of the original 3, accommodating the multi-spectral nature of Landsat data. This modification enables the model to better capture the rich information within Landsat data. To simplify the model structure and focus on feature extraction, we removed the max-pooling layer and the fully connected layer from the ResNet18 model, retaining only the convolutional layers. This design ensures the model can efficiently process Landsat data and output high-quality feature vectors.</p><p>To effectively utilize the Bioclimatic Rasters data, we designed a deep convolutional neural network structure based on ResNet18 <ref type="bibr" target="#b13">[13]</ref>, modified to suit the characteristics of Bioclimatic Rasters data. Initially, layer normalization is applied to the input data to stabilize its distribution, accommodating the diversity of Bioclim data. We also modified the first convolutional layer of the ResNet18 model from the default 3 input channels to 4. Additionally, we removed the max-pooling layer and the fully connected layer to ensure efficient extraction of features from the Bioclimatic Rasters data.</p><p>Handling Sentinel satellite imagery is crucial for species prediction based on geographic location. Sentinel-2 provides multispectral images, including red, green, blue, and near-infrared (NIR) bands. To leverage these high-resolution multispectral images, we employed a ResNet18 model self-supervised pretrained on the SSL4EO-S12 Earth observation dataset <ref type="bibr" target="#b15">[15]</ref>. This approach takes advantage of the off-the-shelf model's learning capability on large-scale datasets, enhancing feature extraction performance. 
We modified the first convolutional layer of ResNet18 from the default 3 input channels to 4 to accommodate the four spectral bands of Sentinel data. Specifically, the first convolutional layer was set with a kernel size of 7 × 7, a stride of 2, and padding of 3, enabling the extraction of more local features while maintaining spatial resolution. To initialize the added channel, we concatenated copies of the pretrained convolution kernels, preserving the original weight distribution. Through this design, the Sentinel data processing module effectively extracts spatial and spectral features from high-resolution multispectral images, providing rich feature representations for subsequent multimodal feature fusion.</p></div>
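The channel-expansion step described above (3 to 4 channels for Sentinel, 3 to 6 for Landsat) can be sketched with NumPy. The paper only states that kernels were concatenated without altering the original weight distribution; which source-channel kernels are copied into the new channels is our assumption here:

```python
import numpy as np

def expand_first_conv(weight, new_in_channels):
    """Expand a pretrained first-conv weight of shape (out, in, kH, kW)
    to more input channels by concatenating copies of existing kernels.
    Which source channels to copy is an assumption made here."""
    out_c, in_c, kh, kw = weight.shape
    extra = new_in_channels - in_c
    # cycle through the existing channel kernels for the new channels
    copies = [weight[:, i % in_c : i % in_c + 1] for i in range(extra)]
    return np.concatenate([weight] + copies, axis=1)

w = np.random.randn(64, 3, 7, 7)   # e.g. pretrained RGB kernels (7x7 in ResNet18)
w4 = expand_first_conv(w, 4)       # add a 4th channel for the NIR band
print(w4.shape)                    # (64, 4, 7, 7)
```

The same helper would produce a 6-channel weight for the Landsat branch; in a PyTorch model the expanded array would be assigned back to the first convolution's weight tensor.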
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Pseudo-label training framework</head><p>The GeoLifeCLEF 2024 competition provides two types of data, PA (Presence-Absence) and PO (Presence-Only), which exhibit significant differences in scale and quality. Although PO data suffers from biases due to the lack of standardized sampling, its vast volume (approximately five million records) is of immense value for model training, enriching the training data and enhancing prediction accuracy. Effective utilization of PO data is therefore crucial for improving performance in this competition. To this end, we designed a pseudo-label training framework based on both PO and PA data. This framework comprises three steps, as shown in Figure <ref type="figure" target="#fig_2">2</ref>.</p><p>In the first step, we train our model with PA data to equip it with an initial ability to classify multiple species. We assume that each input 𝑥 from 𝑋𝑝𝑎 corresponds to a label vector 𝑦 = {𝑦1, 𝑦2, ..., 𝑦𝐿} ∈ {0, 1}^𝐿, where 𝐿 denotes the total number of classes, 𝑦𝑖 = 1 indicates that species 𝑖 is present at the given location, and 𝑦𝑖 = 0 otherwise. The primary goal is to find a model 𝑀 that can accurately predict 𝑦 for each 𝑥. To achieve this, we train the model with the standard binary cross-entropy (BCE) loss.</p><p>In the second step, we use the pretrained model to derive pseudo-labels for each sample in the PO data. Given a sample (𝑥𝑝𝑜, 𝑦), our model 𝑀 predicts a label vector ỹ = {ỹ1, ỹ2, ..., ỹ𝐿} and the corresponding probabilities 𝑠 = {𝑠1, 𝑠2, ..., 𝑠𝐿} for each class based on 𝑥𝑝𝑜. To enhance the reliability of positive pseudo-labels, we introduce an ignore label (∅) and filter positive labels based on their confidence scores. 
We define two confidence thresholds, 𝑇− and 𝑇+, to aid in generating the final pseudo-labels 𝑦^𝑝 = {𝑦^𝑝_1, 𝑦^𝑝_2, ..., 𝑦^𝑝_𝐿} ∈ {0, 1, ∅}^𝐿, where 𝑦^𝑝_𝑖 is given by: </p><formula xml:id="formula_1">𝑦^𝑝_𝑖 = 1, if 𝑇+ &lt; 𝑠𝑖; ∅, if 𝑇− &lt; 𝑠𝑖 &lt; 𝑇+; 0, if 𝑠𝑖 &lt; 𝑇−<label>(1)</label></formula><p>With this filtering mechanism, we obtain more reliable positive labels, since only those with high confidence scores are retained. Positive labels with uncertain confidence (i.e., 𝑠𝑖 between 𝑇− and 𝑇+) are excluded from the loss calculation, and labels with very low confidence are treated as negative. Additionally, we retain the original positive labels from the PO data. Therefore, the final pseudo-label can be represented as:</p><formula xml:id="formula_2">𝑦^𝑝 = 𝑦^𝑝 ∪ 𝑦<label>(2)</label></formula><p>Finally, we train our model using the PO data with pseudo-labels together with the original PA data with multi-labels, enabling training on a larger volume of data. This three-step process significantly improves our performance; for the related ablation experiments, please refer to Section 5.1.</p></div>
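A minimal sketch of the thresholding in Eqs. (1)-(2), using an integer stand-in for the ignore label ∅. The positive threshold 0.2 matches the best value from the ablation study; the value of 𝑇− below is hypothetical, as the text does not report it:

```python
import numpy as np

IGNORE = -1  # integer stand-in for the ignore label ∅

def make_pseudo_labels(scores, observed_positive, t_minus=0.05, t_plus=0.2):
    """Pseudo-labels per Eqs. (1)-(2): 1 above T+, 0 below T-, ignored
    in between; the PO record's original positive label is always kept.
    t_minus is a hypothetical value (not reported in the text)."""
    y = np.full(scores.shape, IGNORE, dtype=int)
    y[scores > t_plus] = 1      # confident positives
    y[scores < t_minus] = 0     # confident negatives
    y[observed_positive] = 1    # union with the original PO label, Eq. (2)
    return y

s = np.array([0.9, 0.1, 0.01, 0.5])  # predicted probabilities for 4 species
print(make_pseudo_labels(s, observed_positive=1))
```

During the final training step, entries equal to the ignore value would simply be masked out of the BCE loss.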
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Experimental Results</head><p>In this section, we present the details of the experiments. Table <ref type="table" target="#tab_0">1</ref> shows the final Kaggle leaderboard scores.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Ablation Study</head><p>To better understand the impact of different components and design choices in our proposed method, we conducted an ablation study. The results are presented in Table <ref type="table">3</ref>.</p><p>We began with the baseline model provided by the organizers <ref type="foot" target="#foot_0">1</ref> and gradually added components to observe their effect on performance. The baseline achieved a Kaggle Private Score of 0.31535. Utilizing 5-fold cross-validation improved the score by 0.02075, reaching 0.33610. Next, we investigated the effect of different threshold values, as shown in Table <ref type="table" target="#tab_1">2</ref>. We found that setting the threshold to 0.2 yielded the best result, with a Kaggle Private Score of 0.34886, an improvement of 0.01276 over the previous step.</p><p>Incorporating tabular data provided a slight boost, increasing the score by 0.00618 to 0.35504. Adding a self-supervised pretrained ResNet (SSL pretrained ResNet) raised the score by 0.00520 to 0.36024. Finally, introducing our pseudo-labeling technique led to a significant improvement, raising the Kaggle Private Score to 0.36837, an increase of 0.00813 over the previous step.</p><p>These results demonstrate that each component of our proposed method contributes to the overall performance, with the pseudo-labeling technique being the most influential. The ablation study highlights the effectiveness of our design choices and validates the importance of utilizing both the PA and PO data through our pseudo-labeling framework.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1:</head><label>1</label><figDesc>Figure 1: The architecture of the proposed model.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head></head><label></label><figDesc>Dataset Split: We randomly split the original training set into a training set and validation set in an 8:2 ratio. The training set is used for model training, while the validation set is used for model performance evaluation and hyperparameter tuning. The final model is trained on the complete training set and evaluated on the officially provided test set.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2:</head><label>2</label><figDesc>Figure 2: Overview of the pseudo-label framework. In stage 1, we train our model on PA data with a 5-fold strategy. In stage 2, we run inference with the trained model on PO data to obtain abundant pseudo-labeled data. In the final stage, we train our model on PA data and partial PO data (with the pseudo-labels generated in stage 2) to obtain the final model.</figDesc><graphic coords="6,59.07,65.60,451.28,204.70" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Kaggle Scores</figDesc><table><row><cell>Private Rank</cell><cell>Private Score</cell><cell>Public Score</cell></row><row><cell>1</cell><cell>0.40890</cell><cell>0.41092</cell></row><row><cell>2 (Ours)</cell><cell>0.36837</cell><cell>0.37327</cell></row><row><cell>3</cell><cell>0.35292</cell><cell>0.35405</cell></row><row><cell>4</cell><cell>0.35220</cell><cell>0.35579</cell></row><row><cell>5</cell><cell>0.34898</cell><cell>0.34873</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Threshold Ablation Study Results</figDesc><table><row><cell>Threshold</cell><cell>Kaggle Private Score</cell></row><row><cell>Top 25</cell><cell>0.33610</cell></row><row><cell>Top 30</cell><cell>0.32966</cell></row><row><cell>Top 20</cell><cell>0.33915</cell></row><row><cell>0.2</cell><cell>0.34886</cell></row><row><cell>0.22</cell><cell>0.34721</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table"><head>Table 3</head><label>3</label><figDesc>Ablation Study Results</figDesc><table><row><cell>Method/Feature Added</cell><cell>Kaggle Private Score</cell><cell>Score Improvement</cell></row><row><cell>Baseline</cell><cell>0.31535</cell><cell>-</cell></row><row><cell>+ 5-fold cross validation</cell><cell>0.33610</cell><cell>+0.02075</cell></row><row><cell>+ Positive threshold=0.2</cell><cell>0.34886</cell><cell>+0.01276</cell></row><row><cell>+ Tabular Data</cell><cell>0.35504</cell><cell>+0.00618</cell></row><row><cell>+ SSL pretrained ResNet</cell><cell>0.36024</cell><cell>+0.00520</cell></row><row><cell>+ Pseudo Label</cell><cell>0.36837</cell><cell>+0.00813</cell></row></table></figure>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>This paper presents our participation in the GeoLifeCLEF 2024 competition. For the multi-label plant species prediction task, we propose a multimodal deep learning model consisting of multiple ResNet-based feature extractors, and we further use a pretrained model to ensure effective feature extraction from satellite imagery. To effectively utilize the huge amount of PO data, we propose a pseudo-label training framework that further improves the accuracy and robustness of the model on this task. Our experiments demonstrate that the proposed multimodal deep learning model improves results on both the public and private test sets, and our ablation experiments demonstrate the effectiveness of the proposed pseudo-label training framework.</p></div>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://www.kaggle.com/code/picekl/sentinel-landsat-bioclim-baseline-0-31626</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Acknowledgments</head><p>This work was supported in part by the National Science and Technology Council, Taiwan, under grants NSTC 112-2634-F-002-005 and NSTC 112-2634-F-006-002, and by National Taiwan University under grant 113L900902.</p></div>
			</div>


			<div type="availability">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Data Preprocessing: We normalize the tabular data to have a mean of 0 and a standard deviation of 1. We do not use any data augmentation.</p><p>Hyperparameters: We use the AdamW <ref type="bibr" target="#b16">[16]</ref> optimizer with an initial learning rate of 0.00025 and a weight decay of 0.01. The batch size is set to 64, and the total number of training epochs is 10. The</p></div>
			</div>
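The zero-mean, unit-variance normalization of the tabular data described above can be sketched as follows. Computing the statistics on the training split only is our assumption, as the text does not state which split they come from:

```python
import numpy as np

def standardize(train, test):
    """Scale tabular features to mean 0 and standard deviation 1,
    using statistics computed on the training split (an assumption)."""
    mu = train.mean(axis=0)
    sigma = train.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)  # guard constant columns
    return (train - mu) / sigma, (test - mu) / sigma

X_tr = np.array([[1.0, 10.0], [3.0, 30.0]])
X_te = np.array([[2.0, 20.0]])
Z_tr, Z_te = standardize(X_tr, X_te)
print(Z_tr.mean(axis=0), Z_tr.std(axis=0))  # each column now ~0 mean, ~1 std
```

Reusing the training-set mean and standard deviation for the test split keeps the two splits on the same scale without leaking test statistics into training.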

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Overview of GeoLifeCLEF 2024: Species presence prediction based on occurrence data and high-resolution remote sensing images</title>
		<author>
			<persName><forename type="first">L</forename><surname>Picek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Botella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Servajean</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Deneu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">Marcos</forename><surname>Gonzalez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Palard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Larcher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Leblanc</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Estopinan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bonnet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Joly</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working Notes of CLEF 2024 -Conference and Labs of the Evaluation Forum</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Overview of lifeclef 2024: Challenges on species distribution prediction and identification</title>
		<author>
			<persName><forename type="first">A</forename><surname>Joly</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Picek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kahl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Goëau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Espitalier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Botella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Deneu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Marcos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Estopinan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Leblanc</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Larcher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Šulc</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hrúz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Servajean</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference of the Cross-Language Evaluation Forum for European Languages</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">A review on multi-label learning algorithms</title>
		<author>
			<persName><forename type="first">M.-L</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z.-H</forename><surname>Zhou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE transactions on knowledge and data engineering</title>
		<imprint>
			<biblScope unit="volume">26</biblScope>
			<biblScope unit="page" from="1819" to="1837" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Acknowledging the unknown for multi-label learning with single positive labels</title>
		<author>
			<persName><forename type="first">D</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P.-A</forename><surname>Heng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">European Conference on Computer Vision</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="423" to="440" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title/>
		<author>
			<persName><forename type="first">M.-K</forename></persName>
		</author>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Label-aware global consistency for multi-label learning with single positive labels</title>
		<author>
			<persName><forename type="first">M.-K</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-H</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S.-J</forename><surname>Huang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in Neural Information Processing Systems</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="18430" to="18441" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Revisiting pseudo-label for single-positive multi-label learning</title>
		<author>
			<persName><forename type="first">B</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lv</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Geng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Machine Learning</title>
				<meeting><address><addrLine>PMLR</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="22249" to="22265" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Block label swap for species distribution modelling</title>
		<author>
			<persName><forename type="first">B</forename><surname>Kellenberger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Tuia</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF (Working Notes)</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="2103" to="2114" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Species distribution modeling based on aerial images and environmental features with convolutional neural networks</title>
		<author>
			<persName><forename type="first">C</forename><surname>Leblanc</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Joly</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Lorieul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Servajean</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bonnet</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF (Working Notes)</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="2123" to="2150" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Leverage samples with single positive labels to train CNN-based models for multi-label plant species prediction</title>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">Q</forename><surname>Ung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kojima</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wada</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">Working Notes of CLEF</title>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Botella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Deneu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Marcos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Servajean</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Estopinan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Larcher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Leblanc</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bonnet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Joly</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2308.05121</idno>
		<title level="m">The GeoLifeCLEF 2023 dataset to evaluate plant species distribution models at high spatial resolution across Europe</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Batch normalization: Accelerating deep network training by reducing internal covariate shift</title>
		<author>
			<persName><forename type="first">S</forename><surname>Ioffe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Szegedy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Machine Learning</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="448" to="456" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">F</forename><surname>Agarap</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1803.08375</idno>
		<title level="m">Deep learning using rectified linear units (ReLU)</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Deep residual learning for image recognition</title>
		<author>
			<persName><forename type="first">K</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</title>
				<meeting>the IEEE Conference on Computer Vision and Pattern Recognition</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="770" to="778" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Ba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Kiros</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">E</forename><surname>Hinton</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1607.06450</idno>
		<title level="m">Layer normalization</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">A A</forename><surname>Braham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Xiong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">M</forename><surname>Albrecht</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><forename type="middle">X</forename><surname>Zhu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2211.07044</idno>
		<title level="m">SSL4EO-S12: A large-scale multi-modal, multi-temporal dataset for self-supervised learning in Earth observation</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<author>
			<persName><forename type="first">I</forename><surname>Loshchilov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Hutter</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1711.05101</idno>
		<title level="m">Decoupled weight decay regularization</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
