Context-Aware AutoML for Accurate Wheat Disease Detection
Muhammad Uzair (1), Radwa ElShawi (1,*) and Stefania Tomasiello (1,2)
(1) Institute of Computer Science, University of Tartu, Estonia
(2) Department of Industrial Engineering, University of Salerno, Fisciano, Italy


Abstract
Timely detection and management of crop diseases are crucial for food security and agricultural productivity. Traditional methods, which rely on manual inspection, are often slow and prone to human error. With the rise of diseases like stripe rust in wheat, there is a growing need for efficient automated detection methods. This paper proposes a novel classification strategy that leverages Automated Machine Learning (AutoML) in combination with advanced feature engineering techniques. We develop a scalable framework that detects stripe rust by extracting comprehensive statistical features from images, distinguishing disease symptoms from healthy crops. To enhance feature quality, we employ Context-Aware Automated Feature Engineering, which iteratively generates meaningful features to capture subtle patterns in the data. Our method achieves 95.35% accuracy on the RustNet dataset, significantly outperforming the state-of-the-art ResNet-18 model, which achieved 85.2% accuracy. These findings highlight the potential of AutoML and automated feature engineering to revolutionize disease detection in agriculture, offering a cost-effective alternative to traditional deep learning methods that require extensive computational resources and expertise.

Keywords
AutoML, disease detection, feature engineering, large language models


Published in the Proceedings of the Workshops of the EDBT/ICDT 2025 Joint Conference (March 25-28, 2025), Barcelona, Spain. CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073.
* Corresponding author: Radwa ElShawi (radwa.elshawi@ut.ee).
Email: muhammad.uzair@ut.ee (M. Uzair); radwa.elshawi@ut.ee (R. ElShawi); stefania.tomasiello@ut.ee (S. Tomasiello)
ORCID: 1234-5678-9012 (R. ElShawi); 0000-0001-8208-8285 (S. Tomasiello)
© 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1. Introduction

The Food and Agriculture Organization (FAO) forecasts a 0.9% increase in global cereal utilization for 2023/24 compared to the previous year. Wheat, as the most widely cultivated crop globally, is essential to agriculture, with rising consumption expected in regions like the European Union, China, India, the UK, and the US [1]. However, wheat faces significant threats from diseases and pests, causing substantial annual losses, roughly one-fifth of global yield [2]. Among these, wheat stripe rust, caused by Puccinia striiformis f. sp. tritici, is particularly devastating, leading to severe yield losses [3]. This disease has become increasingly prevalent worldwide, posing serious risks to food security and agricultural sustainability.

Traditional methods for monitoring wheat rust rely on manual visual inspection, which is time-consuming, labor-intensive, and costly, making it impractical for large-scale agriculture [4]. Recent advancements in imaging technologies, especially the use of Unmanned Aerial Vehicles (UAVs), offer a promising alternative for automated crop disease detection. UAVs can capture high-resolution images of large fields, enabling more efficient and accurate disease monitoring [5, 6]. This technology, combined with advanced image processing techniques, holds great potential for timely and precise identification of disease outbreaks.

Effective and timely monitoring of yellow rust is essential for both disease management and sustainable crop production. Accurate disease mapping facilitates the judicious application of fungicides and enhances breeding programs by identifying resistant wheat varieties [6]. Machine learning (ML) techniques play a crucial role in achieving high precision in disease detection, focusing on extracting relevant features from images and utilizing classifiers such as Neural Networks, Random Forest, Support Vector Machines, and K-Nearest Neighbors [7, 8]. However, the complexity and manual effort required to develop these ML models have led to a growing interest in automating the ML process. This has spurred the development of Automated Machine Learning (AutoML) techniques [9, 10], which simplify the creation of ML pipelines by automating stages such as data preprocessing, feature engineering, model selection, and optimization. By reducing the need for manual intervention, AutoML streamlines the development of effective ML models, making advanced disease detection more accessible and efficient.

This study introduces a novel approach that integrates AutoML with context-aware feature engineering for the detection of stripe rust in wheat. We extract comprehensive statistical features from UAV-captured images and refine them using Context-Aware Automated Feature Engineering (CAAFE), a feature engineering method designed for tabular datasets [11]. CAAFE leverages a large language model (LLM) to iteratively generate additional semantically meaningful features based on the dataset description, enhancing the discriminatory power of the features. These refined features are then processed using the Tree-Based Pipeline Optimization Tool (TPOT) [12], an AutoML framework that automates the selection, optimization, and construction of classification models. Our proposed framework was rigorously evaluated on the publicly available RustNet dataset [6], achieving a remarkable accuracy of 95.35%. This represents a substantial improvement over the state-of-the-art ResNet-18 model, which attained an accuracy of only 85.2% [6].

2. Related Works

The application of Unmanned Aerial Vehicles (UAVs) for plant disease detection has garnered substantial interest, leading to the development of advanced methodologies that integrate image processing with Machine Learning (ML) algorithms. Gu et al. [13] introduced a method for detecting and quantifying the severity of narrow brown leaf spot, a common disease affecting rice crops. The methodology began with the extraction of color features and vegetation indices from UAV-acquired images. Pearson’s correlation analysis was then employed to identify the four most significant features, which were subsequently used as inputs for support vector regression, achieving a high degree of accuracy in disease severity estimation.
In the field of wheat disease detection, Liu et al. [14] focused on identifying powdery mildew using UAV imagery. They extracted textural features such as contrast, correlation, and variance, and applied Partial Least Squares Regression (PLSR) to quantify the disease’s impact. Additionally, a study on monitoring wheat scab using UAV remote sensing [15] emphasized the value of texture features derived from multiple spectral bands. When combined with vegetation indices, these features provided extensive data for disease monitoring, with Support Vector Regression (SVR) demonstrating effectiveness in predictive analysis. Zhang et al. [16] utilized a combination of spectral and textural features to detect Fusarium Head Blight in wheat crops, employing Logistic Regression to highlight the critical role of feature-rich datasets in accurate disease classification and monitoring. Subsequent studies [17, 18] advanced this approach by integrating spectral, textural, and color features with various classification models, including Support Vector Machines (SVM) and Neural Networks (NNs). These studies highlighted the significance of feature extraction techniques and the adaptability of ML algorithms in managing the complex datasets derived from UAV imagery. Furthermore, research on wheat yellow rust detection illustrated the interaction between traditional ML algorithms and Deep Learning (DL) techniques [6, 19, 20]. While ML methods such as SVR, NNs, and Random Forests demonstrated significant effectiveness, DL models have shown promising potential for enhancing accuracy and efficiency in disease detection tasks.

Broadening the application beyond wheat, research on UAV-based disease detection in rubber trees [21] and citrus plants [22] demonstrated the broad applicability of UAV-based disease detection techniques across different agricultural sectors. These studies emphasized the vital role of advanced image processing techniques and ML algorithms in enhancing global food security by enabling effective disease detection in a wide range of crops.

3. Methodology

Figure 1 illustrates the architecture of our approach, which consists of three main stages: data preprocessing (Section 3.2), automated feature engineering using CAAFE (Section 3.3), and model training and evaluation using an AutoML approach (Section 3.4). In the following subsections, we explain the building blocks of our approach.

Figure 1: Flowchart of the proposed framework

3.1. Dataset

In this study, we used a publicly available dataset, RustNet [6]. RustNet comprises 508 images categorized into two classes: disease and no disease. Of these, 281 images show disease and 227 show no disease. RustNet is based on images of two experimental wheat fields in Pullman, WA, USA, collected in 2021. Field 1, located at Palouse Conservation Field Station, comprised two winter wheat trials: one for testing fungicides on the ’PS 279’ variety and another for assessing stripe rust resistance in 23 winter wheat cultivars. Both trials had randomized designs with four replications and were planted on November 1, 2020. Urediniospores of P. striiformis were inoculated twice to induce disease. Field 2, at Spillman Agronomy Farm, housed spring wheat nurseries with regular irrigation. The Lemhi 66 cultivar planted in the borders was highly susceptible to stripe rust; three borders were inoculated and one was left non-infected. Images were collected only from the borders in Field 2.

3.2. Data preprocessing

Our preprocessing phase involves several key stages: initial image acquisition, conversion to grayscale, resizing, and feature extraction. During feature extraction, we compute essential statistical measures, including mean, standard deviation, variance, correlation, energy, entropy, contrast, skewness, kurtosis, and homogeneity.

3.3. Context-Aware Automated Feature Engineering

Feature engineering is a critical component of machine learning, as it involves transforming raw input data into features that can improve predictive performance [23, 24]. In our approach, we leverage CAAFE, an automated machine learning technique specifically designed for tabular datasets. CAAFE employs an LLM to iteratively generate semantically meaningful features based on a detailed description of the dataset. This process not only generates Python code for creating new features but also provides explanations for the relevance and utility of the generated features.

CAAFE operates iteratively on both the training and validation datasets, $D_{train}$ and $D_{valid}$, along with a description of the dataset’s context and features. In each iteration, CAAFE constructs a prompt that includes detailed information about the dataset and the specific feature engineering task, which is then passed to the LLM. Based on this prompt, the LLM generates code to alter or create new features. The generated code is executed on the current datasets ($D_{train}$ and $D_{valid}$), producing transformed datasets ($D'_{train}$ and $D'_{valid}$). An ML classifier is subsequently trained on $D'_{train}$ and evaluated on $D'_{valid}$. If the classifier’s performance on $D'_{valid}$ surpasses its performance on the original $D_{valid}$, the newly generated feature is retained, and the datasets are updated accordingly. If not, the feature is discarded, and the datasets remain unchanged.
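The accept-or-reject step of this iterative procedure can be illustrated with a minimal, self-contained sketch. It is not the CAAFE implementation: the candidate features are hand-written functions standing in for LLM-generated code, a one-feature decision stump stands in for the evaluation classifier, and all names (`fit_stump`, `caafe_style_loop`, the toy rows) are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]
Candidate = Tuple[str, Callable[[Row], float]]


def fit_stump(rows: List[Row], labels: List[int]) -> Tuple[str, float, int]:
    """Pick the single-feature threshold rule with the most training hits."""
    best_hits, best_rule = -1, ("", 0.0, 1)
    for col in sorted(rows[0]):
        for thr in sorted({r[col] for r in rows}):
            for above in (1, 0):  # label predicted when the value exceeds thr
                preds = [above if r[col] > thr else 1 - above for r in rows]
                hits = sum(p == y for p, y in zip(preds, labels))
                if hits > best_hits:
                    best_hits, best_rule = hits, (col, thr, above)
    return best_rule


def validation_accuracy(train, y_train, valid, y_valid) -> float:
    """Train the stand-in classifier on train and score it on valid."""
    col, thr, above = fit_stump(train, y_train)
    preds = [above if r[col] > thr else 1 - above for r in valid]
    return sum(p == y for p, y in zip(preds, y_valid)) / len(y_valid)


def caafe_style_loop(train, y_train, valid, y_valid, candidates: List[Candidate]):
    """Keep a candidate feature only if it improves validation accuracy."""
    best = validation_accuracy(train, y_train, valid, y_valid)
    kept = []
    for name, fn in candidates:
        new_train = [{**r, name: fn(r)} for r in train]
        new_valid = [{**r, name: fn(r)} for r in valid]
        score = validation_accuracy(new_train, y_train, new_valid, y_valid)
        if score > best:  # retain the feature and update the datasets
            train, valid, best = new_train, new_valid, score
            kept.append(name)
        # otherwise the feature is discarded and the datasets stay unchanged
    return kept, best, train, valid


# Toy data (hypothetical): class 1 has mean/variance = 2, class 0 has ratio 1.
train = [{"mean": m, "variance": v} for m, v in [(2, 1), (4, 2), (6, 3), (2, 2), (4, 4), (1, 1)]]
y_train = [1, 1, 1, 0, 0, 0]
valid = [{"mean": m, "variance": v} for m, v in [(8, 4), (3, 1.5), (3, 3), (5, 5)]]
y_valid = [1, 1, 0, 0]

candidates = [
    ("variance_squared", lambda r: r["variance"] ** 2),
    ("mean_variance_ratio", lambda r: r["mean"] / r["variance"]),
]
kept, best, final_train, final_valid = caafe_style_loop(train, y_train, valid, y_valid, candidates)
print(kept, best)
```

In this toy run the redundant `variance_squared` candidate leaves validation accuracy unchanged and is discarded, while the ratio feature separates the classes and is retained. In the actual framework, the candidates come from LLM-generated code rather than a fixed list.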
The prompt provided to the LLM includes semantic and descriptive information about the dataset, such as a user-generated dataset description, feature names, data types, the percentage of missing values, and random sample rows from the dataset. Additionally, a template for the expected format of the generated code and explanations is included, which improves the clarity and quality of the LLM’s output. To further enhance performance, chain-of-thought instructions guide the LLM through a series of intermediate reasoning steps, leading to more effective code generation. By utilizing CAAFE, we integrate domain knowledge into the feature engineering process while maintaining interpretability and optimizing predictive performance. This approach offers an efficient method for generating high-quality features in complex datasets.

3.4. AutoML approach

TPOT is an AutoML framework designed for constructing and optimizing machine learning pipelines for both classification and regression tasks. It utilizes tree-based genetic programming [25] to evolve pipelines by treating them as individuals within an evolutionary algorithm. Each pipeline is structured as a tree, with its nodes categorized as either Primitives or Terminals. Primitives represent operators that require input, such as machine learning algorithms needing data and hyperparameter values. Terminals, on the other hand, are constants that provide input to the Primitives. Notably, a Primitive can also act as input for another Primitive, allowing for complex pipeline configurations. The evolutionary process in TPOT operates by applying genetic operations such as mutation and crossover to the pipelines. Mutation involves making small modifications, such as changing a hyperparameter or introducing a new preprocessing step. Crossover, on the other hand, selects two pipelines that share common Primitives and allows them to exchange subtrees or branches. Once these operations are performed, each pipeline is evaluated and assigned a fitness score, which reflects its performance. This fitness score is used in the selection process to determine which pipelines should be retained and evolved further in the next generation, ultimately leading to the creation of highly optimized machine learning pipelines.

Generally, these pipeline trees could be arbitrarily large. Nevertheless, extensive machine learning pipelines usually have downsides. Longer pipelines with numerous hyperparameters can be challenging to fine-tune, are more prone to overfitting, complicate the understanding of the final model, and demand extended evaluation time, thus slowing down the optimization process. Due to these considerations, a multiobjective optimization technique, NSGA-II [26], is employed. It assists in selecting candidates based on the Pareto front, representing the balanced trade-off between pipeline length and performance.

4. Experimental Evaluation

4.1. Experimental setup

Training and test. For a fair comparison, we adopted the same train-test split methodology as outlined in the referenced study, allocating 70% of the RustNet dataset for training and 30% for testing [6]. Detailed information regarding these splits is provided in Table 1.

Class        Train   Test   Total
disease        208     73     281
no_disease     172     55     227
Total          380    128     508

Table 1: Number of images in train and test split for the RustNet dataset

Baselines. Given the randomized nature of the experiments reported in [6], we conducted new experiments using the same computational setup as described in their study. Specifically, we employed ResNet-18, following the original architecture and hyperparameters outlined in [6], and initialized the model with pre-trained weights.

CAAFE setting. We use OpenAI’s language models, including GPT-3.5, as the LLM within the CAAFE framework [27, 28]. These language models enable CAAFE to generate semantically meaningful features iteratively, enhancing the effectiveness of feature engineering. To ensure robust performance and accuracy, we conduct ten feature engineering iterations using the CAAFE framework. Additionally, in the iterative evaluation of code blocks, we employ TabPFN (Tabular Prior-data Fitted Network), as proposed by Hollmann et al. [29], to assess the effectiveness of generated features and their impact on model performance.

TPOT setting. To ensure a fair comparison, an equal time budget was allocated to TPOT and ResNet-18: each experiment was constrained to a 20-minute time limit. The input to TPOT is the data matrix produced by the CAAFE feature engineering step. TPOT was configured with 10 generations and a population size of 100. The resulting pipeline generated by TPOT, constrained by the specified time budget, is a multi-layer perceptron classifier with a learning rate of 0.01 and a regularization parameter of 0.0001. The latter is a penalty term constraining the size of the weights [30]; its aim is to reduce overfitting and enhance the generalization ability of the NN.

Hardware Resources. We conducted our experiments in a CPU environment running Windows 11 Pro 64-bit (10.0, Build 22621) with a 16 core Intel(R) Core(TM) i9-10885H Processor @ 2.40GHz, 32 GB DIMM memory, and 1000 GB SSD data storage. All the approaches have been implemented in Python.

Performance metrics. Since this study tackles a classification problem, the performance metrics used are Accuracy, Precision, Recall, and F1-score.

4.2. Results

4.2.1. Preprocessing

We followed the preprocessing steps described in Section 3.2. To convert the class associated with each image to a numerical equivalent, we adopted disease = 1 and no_disease = 0 for the RustNet dataset.

After the statistical features are extracted from the images, the resulting feature set is normalized using min-max normalization, so that each feature has a value between 0 and 1. The general formula for min-max normalization is:
$$x'_i = \frac{x_i - \min(X)}{\max(X) - \min(X)}$$

where $x'_i$ is the normalized value and $x_i \in X$, $i = 1, 2, \ldots, n$, is the original value.

Figure 2: Exemplary run of CAAFE on the RustNet image dataset. User-generated input is shown in blue, ML-classifier generated data in red, and LLM-generated code with syntax highlighting. The generated code contains a comment per generated/deleted feature following a template (Feature name, description of usefulness, features used in the generated code, and sample values). CAAFE improved the ACC on the validation dataset from 0.946 to 0.953 over 10 iterations, but only those improving ACC are shown.
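As a rough illustration of the preprocessing pipeline (statistical feature extraction followed by min-max normalization), the sketch below computes a subset of the measures listed in Section 3.2 for tiny grayscale images and rescales each feature to [0, 1]. The paper does not state how the texture measures are computed, so the horizontal co-occurrence offset, the definition of energy as the sum of squared co-occurrence probabilities, the mapping of constant features to 0, and all function names are assumptions made for this example.

```python
import math
from collections import Counter
from typing import Dict, List

Image = List[List[int]]  # grayscale rows of integer intensities (assumed 0..255, width >= 2)


def statistical_features(img: Image) -> Dict[str, float]:
    """Compute first-order statistics and simple co-occurrence texture measures."""
    pixels = [p for row in img for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(variance)
    # Standardized third and fourth moments; set to 0 for a constant image.
    skewness = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3) if std else 0.0
    kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * variance ** 2) if variance else 0.0
    # Shannon entropy (in bits) of the intensity histogram.
    entropy = sum(-(c / n) * math.log2(c / n) for c in Counter(pixels).values())

    # Co-occurrence counts of horizontally adjacent pixel pairs (assumed offset).
    pairs = Counter((row[j], row[j + 1]) for row in img for j in range(len(row) - 1))
    total = sum(pairs.values())
    contrast = sum(c / total * (a - b) ** 2 for (a, b), c in pairs.items())
    # Sum of squared probabilities; some libraries report its square root instead.
    energy = sum((c / total) ** 2 for c in pairs.values())
    homogeneity = sum(c / total / (1 + (a - b) ** 2) for (a, b), c in pairs.items())

    return {
        "mean": mean, "std": std, "variance": variance,
        "skewness": skewness, "kurtosis": kurtosis, "entropy": entropy,
        "contrast": contrast, "energy": energy, "homogeneity": homogeneity,
    }


def min_max_normalize(rows: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Apply x' = (x - min(X)) / (max(X) - min(X)) to each feature column."""
    normalized: List[Dict[str, float]] = [dict() for _ in rows]
    for name in rows[0]:
        values = [r[name] for r in rows]
        lo, hi = min(values), max(values)
        for out, v in zip(normalized, values):
            # A constant column has max(X) == min(X); map it to 0 (assumed convention).
            out[name] = (v - lo) / (hi - lo) if hi > lo else 0.0
    return normalized


# Three hypothetical 2x2 images: horizontal stripes, a flat patch, a checkerboard.
images = [[[0, 0], [255, 255]], [[10, 10], [10, 10]], [[0, 255], [255, 0]]]
features = [statistical_features(img) for img in images]
normalized = min_max_normalize(features)
print(normalized[2]["contrast"], normalized[1]["mean"])
```

The normalization is computed per feature column across all images, matching the formula above with $X$ taken as the set of values of one feature. In practice the minimum and maximum would typically be estimated on the training split only and reused for the test split.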

4.3. Feature Engineering

A demonstration of CAAFE using the RustNet dataset is illustrated in Figure 2. User inputs are highlighted in blue, ML-classifier-generated data in red, and LLM-generated code is presented with syntax highlighting. The code includes comments for each generated feature, adhering to a predefined template in CAAFE’s prompt. This template comprises the feature name, its utility description, the features utilized in the generated code, and sample values for these features. The retained generated features from CAAFE after 10 iterations include ’mean_variance_ratio’, calculated as the mean divided by the variance, and ’contrast_energy_ratio’, computed as the contrast divided by the energy. Incorporating the features generated by CAAFE into TPOT improved the accuracy on the validation dataset from 93.02%, achieved using TPOT alone, to 95.35%, as shown in Table 2.

Figure 3: ResNet-18 Accuracy for RustNet dataset

4.4. AutoML

The results, evaluated using TPOT and ResNet-18, are detailed in Table 2. For TPOT, two variants are considered: the baseline TPOT and TPOT with Context-Aware Automated Feature Engineering (CAAFE), referred to as TPOT (FE). The comparative results demonstrate that both variants of our proposed framework, TPOT and TPOT (FE), outperform the baseline ResNet-18 model. TPOT achieved an accuracy of 93.02%, which was further enhanced to 95.35% with the
integration of CAAFE. In contrast, ResNet-18 achieved a lower accuracy of 85.2% on the same dataset, highlighting the superior performance of our proposed approach. The limited number of epochs that ResNet-18 completed within the allocated time budget highlights the substantial computational effort it requires.

Dataset   Model       Accuracy   Precision   Recall   F1-Score
RustNet   TPOT          93.02      92.99      92.90     92.80
RustNet   TPOT (FE)     95.35      95.79      94.85     95.22
RustNet   ResNet-18     85.20      86.13      86.54     86.15

Table 2: Performance of TPOT and ResNet-18 on RustNet dataset (all values in %)

5. Conclusion

This study introduces a novel approach to detect stripe rust in wheat crops, using AutoML and rigorous feature engineering techniques. By extracting a comprehensive set of statistical features from original images and employing Context-Aware Automated Feature Engineering, we enhance the discriminative power of the extracted features. Our iterative feature generation process aims to capture subtle patterns and nuances, leading to superior effectiveness compared to state-of-the-art deep learning techniques. The considered wheat rust disease problem has already been tackled in the literature by employing several ML techniques, such as feed-forward NNs, KNN, SVM, and RF. All of them populated the search space of TPOT, which helped to determine the best one for the considered case. We compared our results against those of ResNet-18, a state-of-the-art technique used for the same kind of problem according to the most recent literature. The experiments were performed on a publicly available dataset retrieved from the relevant literature. Our approach outperformed the above-mentioned state-of-the-art technique, revealing a higher computational effort of the latter in the allotted computing time.

Acknowledgments

This work has been partially funded by the Estonian Research Council, grant PRG1604, through the funding of SusAn, FACCE ERA-GAS, ICT-AGRI-FOOD and SusCrop ERA-NET, and through the project Increasing the knowledge intensity of Ida-Viru entrepreneurship co-funded by the European Union.

References

[1] FAO, FAO cereal supply and demand brief, 2023. URL: https://www.fao.org/worldfoodsituation/csdb/en/. Accessed: 2024-11-22.
[2] S. Savary, L. Willocquet, S. Pethybridge, P. Esker, N. McRoberts, A. Nelson, The global burden of pathogens and pests on major food crops, Nature Ecology & Evolution 3 (2019) 1. doi:10.1038/s41559-018-0793-y.
[3] X. Chen, Pathogens which threaten food secu- [...] pathogen, Food Security 12 (2020). doi:10.1007/s12571-020-01016-z.
[4] J. Su, C. Liu, X. Hu, X. Xu, L. Guo, W.-H. Chen, Spatio-temporal monitoring of wheat yellow rust using UAV multispectral imagery, Computers and Electronics in Agriculture (2019). doi:10.1016/j.compag.2019.105035.
[5] D. Basurto-Lozada, A. Hillier, D. Medina, D. Pulido, S. Karaman, J. Salas, Dynamics of soil surface temperature with unmanned aerial systems, Pattern Recognition Letters 138 (2020). doi:10.1016/j.patrec.2020.07.003.
[6] Z. Tang, M. Wang, S. Michael, K.-H. Dammer, X. Li, R. Brueggeman, S. Sankaran, A. Carter, M. Pumphrey, Y. Hu, X. Chen, Z. Zhang, Affordable high throughput field detection of wheat stripe rust using deep learning with semi-automated image labeling, 2022. doi:10.20944/preprints202204.0177.v1.
[7] U. Shafi, R. Mumtaz, Z. Shafaq, S. Zaidi, Z. Mahmood, S. Zaidi, Wheat rust disease detection techniques: a technical perspective, Journal of Plant Diseases and Protection 129 (2022). doi:10.1007/s41348-022-00575-x.
[8] T. Hayıt, H. Erbay, F. Varçın, F. Hayıt, N. Akci, The classification of wheat yellow rust disease based on a combination of textural and deep features, Multimedia Tools and Applications 82 (2023) 1–19. doi:10.1007/s11042-023-15199-y.
[9] R. Elshawi, S. Sakr, Automated machine learning: Techniques and frameworks, in: R.-D. Kutsche, E. Zimányi (Eds.), Big Data Management and Analytics, Springer International Publishing, Cham, 2020, pp. 40–69.
[10] H. Eldeeb, M. Maher, O. Matsuk, A. Aldallal, R. El Shawi, S. Sakr, AutoMLBench: A comprehensive experimental evaluation of automated machine learning frameworks, 2022. doi:10.2139/ssrn.4516282.
[11] N. Hollmann, S. Müller, F. Hutter, Large language models for automated data science: Introducing CAAFE for context-aware automated feature engineering, in: Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL: https://openreview.net/forum?id=9WSxQZ9mG7.
[12] R. Olson, J. Moore, TPOT: A tree-based pipeline optimization tool for automating machine learning, 2019. doi:10.1007/978-3-030-05318-5_8.
[13] C. Gu, T. Cheng, N. Cai, W. Li, G. Zhang, X.-G. Zhou, D. Zhang, Assessing narrow brown leaf spot severity and fungicide efficacy in rice using low altitude UAV imaging, Ecological Informatics 77 (2023) 102208. doi:10.1016/j.ecoinf.2023.102208.
[14] Y. Liu, L. An, N. Wang, W. Tang, M. Liu, G. Liu, H. Sun, M. Li, Y. Ma, Leaf area index estimation under wheat powdery mildew stress by integrating UAV-based spectral, textural and structural features, Computers and Electronics in Agriculture 213 (2023) 108169. URL: https://www.sciencedirect.com/science/article/pii/S0168169923005574. doi:10.1016/j.compag.2023.108169.
[15] W. Zhu, Z. Feng, S. Dai, P. Zhang, X. Wei, Using UAV multispectral remote sensing with appropriate spatial resolution and machine learning to monitor wheat scab, Agriculture 12 (2022) 1785. doi:10.3390/
     rity: Puccinia striiformis, the wheat stripe rust                 agriculture12111785.
                                                                  [16] Y. Xiao, Y. Dong, W. Huang, L. Liu, H. Ma, Wheat fusar-
     ium head blight detection using uav-based spectral and
     texture features in optimal window size, Remote Sens-
     ing 13 (2021). URL: https://www.mdpi.com/2072-4292/
     13/13/2437. doi:10.3390/rs13132437.
[17] H. Zhang, L. Huang, W. Huang, Y. Dong, S. Weng,
     J. Zhao, H. Ma, L. Liu, Detection of wheat fusarium
     head blight using UAV-based spectral and image feature
     fusion, Frontiers in Plant Science 13 (2022) 1004427.
     doi:10.3389/fpls.2022.1004427.
[18] L. Liu, Y. Dong, W. Huang, X. Du, H. Ma, Monitoring
     wheat fusarium head blight using unmanned aerial
     vehicle hyperspectral imagery, Remote Sensing 12
     (2020) 3811. doi:10.3390/rs12223811.
[19] A. Guo, W. Huang, Y. Dong, H. Ye, H. Ma, B. Liu,
     W. Wu, Y. Ren, C. Ruan, Y. Geng, Wheat yellow
     rust detection using UAV-based hyperspectral
     technology, Remote Sensing 13 (2021) 123.
     doi:10.3390/rs13010123.
[20] C. Nguyen, V. Sagan, J. Skobalski, J. Severo, Early
     detection of wheat yellow rust disease and its impact on
     terminal yield with multi-spectral UAV-imagery, Remote
     Sensing 15 (2023) 3301. doi:10.3390/rs15133301.
[21] T. Zeng, J. Fang, C. Yin, Y. Li, W. Fu, H. Zhang,
     J. Wang, X. Zhang, Recognition of rubber tree powdery
     mildew based on UAV remote sensing with different
     spatial resolutions, Drones 7 (2023) 533.
     doi:10.3390/drones7080533.
[22] S. Ding, J. Jing, S. Dou, M. Zhai, W. Zhang, Citrus
     canopy SPAD prediction under Bordeaux solution
     coverage based on texture- and spectral-information
     fusion, Agriculture 13 (2023).
     URL: https://www.mdpi.com/2077-0472/13/9/1701.
     doi:10.3390/agriculture13091701.
[23] S. Wold, K. Esbensen, P. Geladi, Principal component
     analysis, Chemometrics and Intelligent Laboratory
     Systems 2 (1987) 37–52.
     doi:10.1016/0169-7439(87)80084-9.
[24] H. Eldeeb, R. El Shawi, Empowering machine learning
     with scalable feature engineering and interpretable
     AutoML, IEEE Transactions on Artificial Intelligence
     PP (2024) 1–16. doi:10.1109/TAI.2024.3400752.
[25] W. Banzhaf, P. Nordin, R. Keller, F. Francone, Genetic
     programming: An introduction on the automatic evo-
     lution of computer programs and its applications, 1998.
[26] K. Deb, A. Pratap, S. Agarwal, T. Meyarivan, A fast
     and elitist multiobjective genetic algorithm: NSGA-II,
     IEEE Transactions on Evolutionary Computation 6
     (2002) 182–197. doi:10.1109/4235.996017.
[27] OpenAI, GPT-3 can’t count syllables - or doesn’t “get”
     haiku, https://community.openai.com/t/gpt-3-cant-count-syllables-or-doesnt-get-haiku/18733,
     2021. Accessed: 2024-03-01.
[28] OpenAI, openai/openai-cookbook: Examples and guides
     for using the OpenAI API,
     https://github.com/openai/openai-cookbook, 2023.
     Accessed: 03/1/2023.
[29] N. Hollmann, S. Müller, K. Eggensperger, F. Hutter,
     TabPFN: A transformer that solves small tabular
     classification problems in a second, 2022.
     URL: https://arxiv.org/abs/2207.01848.
     doi:10.48550/ARXIV.2207.01848.
[30] MLPClassifier documentation,
     https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html,
     2024. Accessed: 2024-11-22.