<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Effectiveness of Data Resampling in Mitigating Class Imbalance for Object Detection</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Michał</forename><surname>Tomaszewski</surname></persName>
							<email>m.tomaszewski@po.edu.pl</email>
							<affiliation key="aff0">
								<orgName type="institution">Opole University of Technology</orgName>
								<address>
									<addrLine>Prószkowska 76 St</addrLine>
									<postCode>45-758</postCode>
									<settlement>Opole</settlement>
									<country key="PL">Poland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jakub</forename><surname>Osuchowski</surname></persName>
							<email>j.osuchowski@po.edu.pl</email>
							<affiliation key="aff0">
								<orgName type="institution">Opole University of Technology</orgName>
								<address>
									<addrLine>Prószkowska 76 St</addrLine>
									<postCode>45-758</postCode>
									<settlement>Opole</settlement>
									<country key="PL">Poland</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Effectiveness of Data Resampling in Mitigating Class Imbalance for Object Detection</title>
					</analytic>
					<monogr>
						<meeting>
							<address>
								<settlement>Ternopil</settlement>
								<country key="UA">Ukraine</country>
							</address>
							<date>2023</date>
						</meeting>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">BE685B6EFE314EC6CBFDE9F00D5967CB</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T16:40+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>object detection</term>
					<term>class imbalance</term>
					<term>resampling</term>
					<term>YOLO</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Mitigating class imbalance for object detection on digital images is a critical challenge in the field of computer vision. This problem stems from the uneven distribution of object classes within image datasets, where some classes are significantly more prevalent than others. In object detection tasks, the primary goal is to accurately identify and locate various objects within images. However, when dealing with imbalanced datasets, several significant issues arise. Firstly, training machine learning models on imbalanced data can result in bias: models tend to perform well on majority classes but struggle to recognize minority classes effectively. This bias is due to the disproportionate number of samples available for each class during training, leading to inadequate learning for underrepresented classes. Secondly, imbalanced datasets can lead to reduced detection accuracy, particularly for the minority classes. Models trained on such data may exhibit high overall accuracy but perform poorly when it comes to identifying rare objects or those from underrepresented classes. Moreover, the problem of class imbalance can lead to the loss of valuable information. Minority classes may include objects or instances that are crucial for the specific application, yet they are often overlooked due to their scarcity in the dataset. Furthermore, models trained on imbalanced data may struggle to generalize effectively to real-world scenarios where class distributions are more balanced. This limitation can hinder the practical applicability of object detection systems. The article aims to investigate the impact of data resampling methods on improving the object detection model YOLOv8m when dealing with an imbalanced image dataset. 
The described initial research involves using an object detection dataset with skewed class distributions and applying various resampling techniques, such as oversampling and undersampling, to balance the data. The research used the Insulator Defect Image Dataset (IDID), representing power line insulators, which contains one large class depicting undamaged insulators and two smaller classes depicting two types of damaged insulators. The implications of this research are practical, as it guides practitioners and researchers in selecting the most suitable resampling approach to address class imbalance in object detection tasks. Ultimately, this knowledge contributes to the development of more robust and reliable computer vision systems for real-world applications.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The problem of imbalanced classes in image datasets refers to a situation where the distribution of different classes or categories within the dataset is highly skewed. In other words, some classes have significantly more instances or samples compared to others, leading to an imbalance in class representation. This issue is particularly prevalent in image datasets used for machine learning and computer vision tasks. Imbalanced classes can pose several challenges in machine learning and image analysis:</p><p>• Model bias: Machine learning models trained on imbalanced datasets may develop biases towards the majority class, as they have more examples to learn from. This can result in poor performance for minority classes. • Reduced accuracy: Imbalanced datasets can lead to misleading accuracy metrics. A model may achieve a high accuracy by simply predicting the majority class most of the time, while failing to correctly classify minority class instances. • Limited generalization: Models trained on imbalanced data may struggle to generalize well to real-world scenarios where class distributions are more balanced. They may perform poorly on underrepresented classes. • Data collection cost: In some cases, collecting more data for underrepresented classes can be challenging, time-consuming, or expensive. To address this problem, researchers and practitioners employ various techniques, including:</p><p>• Resampling: Oversampling the minority classes (adding more instances) or undersampling the majority classes (removing instances) to balance class distribution. • Cost-Sensitive Learning: Assigning different misclassification costs to different classes to make the model more sensitive to minority classes. • Transfer Learning: Leveraging pre-trained models or features from large datasets to improve the performance on imbalanced data. 
• Synthetic Data Generation: Creating synthetic samples for underrepresented classes to increase their representation in the dataset. • Evaluation Metrics: Using appropriate evaluation metrics, such as precision, recall, F1-score, or Average Precision (AP), that consider the imbalanced nature of the dataset. These strategies aim to improve the model's ability to recognize and classify minority classes effectively, leading to more balanced and accurate image analysis results. Mitigating class imbalance for object detection is essential for building robust and effective computer vision systems that can accurately detect and locate objects across a wide range of classes, regardless of their representation in the dataset.</p><p>Research aimed at mitigating the class imbalance problem in image datasets holds substantial significance in the realm of computer vision and machine learning. This importance can be better understood through several key aspects.</p><p>Firstly, addressing the challenge of imbalanced datasets directly translates into improved model accuracy. Imbalanced datasets often lead to models excelling at recognizing majority classes while struggling with minority or underrepresented ones. Research in this area strives to create more balanced models that achieve higher overall accuracy by rectifying this bias toward majority classes.</p><p>Secondly, these studies enhance the capability to detect rare or uncommon damage in critical infrastructure, which holds paramount importance in various real-world applications. For instance, in the realm of power infrastructure, the identification of rare instances of damage or anomalies is crucial for ensuring the integrity and safety of power systems. Research aimed at mitigating data imbalance ensures that machine learning models can effectively recognize and address these infrequent cases, thereby enhancing the overall resilience and maintenance of critical infrastructure. 
Similarly, in medical imaging <ref type="bibr" target="#b0">[1]</ref>, identifying rare diseases or anomalies is vital for patient care. Research on mitigating dataset imbalance ensures that machine learning models can effectively recognize and handle these infrequent instances, thereby improving the quality of healthcare and diagnostics.</p><p>Furthermore, the generalization of models to real-world scenarios is a crucial consideration. Models trained on imbalanced datasets might falter when applied to scenarios with more balanced class distributions. This limitation can hinder the practical applicability of computer vision systems. Research in this field aims to enhance model generalization, making models more robust and reliable across a spectrum of real-world settings.</p><p>Another vital aspect is the prevention of data loss. Imbalanced datasets often lead to the underrepresentation of minority classes, resulting in the loss of valuable information. This research endeavours to ensure that all classes are adequately considered in the learning process, thereby harnessing the full potential of the available data. Moreover, in applications where fairness and ethical considerations are paramount, such as facial recognition technology, addressing class imbalance becomes crucial. It helps prevent the unfair favouring or disadvantaging of particular groups or classes, mitigating biases and ethical concerns associated with imbalanced datasets.</p><p>Cost efficiency is another notable benefit. Collecting and annotating datasets can be resource-intensive. Research on balancing imbalanced datasets can lead to more efficient data collection by prioritizing efforts in the areas that most require representation, optimizing resource allocation. Ongoing research in this area drives advancements in state-of-the-art techniques and methodologies for addressing class imbalance. 
This benefits the broader machine learning community by continually improving the toolkit available to researchers and practitioners, ultimately advancing the field as a whole.</p><p>The objective of the presented initial study is to assess the impact of data resampling methods on object detection tasks performed by a deep learning neural network. Specifically, the study aims to determine whether resampling techniques, such as different variants of oversampling or undersampling, can mitigate the challenges posed by class imbalance in object detection. As mentioned earlier, this problem particularly affects technical systems and the methods of their visual inspection, during which large amounts of image material illustrating correctly operating components are collected, while the relatively rare failures that such inspections aim to identify are, by their nature, usually represented by only a few instances in training datasets.</p><p>Therefore, the described study concerns the detection of damage in power line insulators. Overhead power lines are used to transmit electricity, which means that they are also one of the key elements of each country's security. Because the transmission of electricity plays a critical role in supporting industry, the economy, and defence, power disruptions resulting from failures in overhead power lines can lead to severe consequences for the entire country. Additionally, the safety and reliability of high-voltage lines can significantly impact the quality of life for citizens. To minimize interruptions in electricity supply and reduce their duration, electricity distribution companies should conduct frequent and comprehensive inspections of the power network <ref type="bibr" target="#b1">[2]</ref>. 
Despite ongoing research efforts in the field of automated inspection of high-voltage lines, for example <ref type="bibr" target="#b2">[3]</ref>, <ref type="bibr" target="#b3">[4]</ref>, <ref type="bibr" target="#b4">[5]</ref>, <ref type="bibr" target="#b5">[6]</ref>, <ref type="bibr" target="#b6">[7]</ref>, <ref type="bibr" target="#b7">[8]</ref>, <ref type="bibr" target="#b8">[9]</ref>, challenges and limitations remain that require addressing. To establish a dependable power grid, it is imperative to develop robust methods for the automated detection of various defects in its components.</p><p>An overhead power line consists of three essential parts: conductive wires, transmission towers, and power insulators. The primary functions of overhead line insulators are twofold: they provide insulation between the conductor and the ground and tower structure, while also offering mechanical support to the conductive wires. Particularly in the case of high-voltage lines, insulators used in overhead lines must undergo regular inspection due to their susceptibility to degradation. The analysis was carried out using the Insulator Defect Image Dataset (IDID) <ref type="bibr" target="#b9">[10]</ref>, which depicts power line insulators. The dataset contains images of undamaged insulators and different examples of damaged insulators.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related works</head><p>The problem of data imbalance is widely described in the literature on machine learning methods. Publication <ref type="bibr" target="#b10">[11]</ref> showed that the vast field of imbalanced data problems contains many challenges that require attention from the research community and intensive study; many directions in this branch of machine learning remain unexplored. The work builds on earlier review articles such as <ref type="bibr" target="#b11">[12]</ref>, <ref type="bibr" target="#b12">[13]</ref>, <ref type="bibr" target="#b13">[14]</ref> that cover different approaches to the problem of data imbalance.</p><p>The survey <ref type="bibr" target="#b14">[15]</ref> summarizes and discusses many studies, exploring a number of advanced techniques for learning from imbalanced data with deep neural networks. It shows that traditional machine learning techniques for handling class imbalance can be successfully extended to deep learning models. The survey also finds that nearly all research in this area has focused on computer vision tasks with convolutional neural networks.</p><p>Study <ref type="bibr" target="#b0">[1]</ref> investigates existing deep learning techniques for addressing class-imbalanced data, a critical challenge in real-world applications like fraud detection and cancer diagnosis. The authors state that, despite the growing popularity of deep learning, limited empirical research exists in this area. The survey examines existing studies, highlighting the effectiveness and limitations of deep learning models, particularly in computer vision tasks, and identifies areas for future research to bridge the gap in this important field of study. Several areas for future work are outlined. 
Applying the various approaches across a broader range of datasets and varying degrees of class imbalance, assessing their performance with multiple complementary metrics, and presenting statistical evidence will aid in selecting the most suitable deep learning method for upcoming applications dealing with class imbalance. Experimenting with deep learning methods for handling class imbalance in the realms of big data and rare-class scenarios holds significant benefits for the advancement of big data analytics. Additionally, the authors showed that further investigations involving non-convolutional deep neural networks are necessary to establish the generalizability of the presented methods to alternative architectural frameworks.</p><p>The work in <ref type="bibr" target="#b15">[16]</ref> presents a review of various learning issues caused by the imbalanced distribution of data and of different approaches to handling imbalanced data in classification. It was noted that the impact of imbalance on classification is detrimental and that this effect increases with the scale of the task, concluding that the classification of imbalanced data is an extensive research subject in the field of machine learning.</p><p>In <ref type="bibr" target="#b16">[17]</ref>, the authors investigate the effectiveness of complexity measures on real imbalanced datasets and how they are affected by applying different data imbalance treatments. The issue of classification with imbalanced datasets is also extensively presented in <ref type="bibr" target="#b17">[18]</ref>. 
The authors additionally provide a website <ref type="bibr" target="#b18">[19]</ref> containing many materials on the discussed subject.</p><p>Despite the large amount of research on data imbalance, the continual emergence of new machine learning techniques, in particular object detection algorithms, makes it necessary to conduct empirical research showing the effectiveness of different balancing techniques on different datasets for the new algorithms.</p><p>Relatively few works concern the problem of data imbalance in fault detection for particular technical systems. For example, the paper <ref type="bibr" target="#b19">[20]</ref> delves into the impact of data sampling techniques on improving cross-project defect prediction (CPDP) models. Employing eight data resampling methods, the authors resampled datasets and integrated them into CPDP model training after applying the Nearest Neighbour filter. The results demonstrated that data resampling methods effectively improved recall and related performance measures but had limited success in terms of AUC performance. While these methods helped mitigate class imbalance issues, further research is needed to enhance prediction performance.</p><p>In <ref type="bibr" target="#b20">[21]</ref>, the authors proposed a framework that addresses, through resampling techniques, the problem of data imbalance in supervised classification for the detection of non-technical losses (NTL) in electrical power grids. The authors stated that an issue that has received insufficient attention in other studies is the imbalance between fraudulent and non-fraudulent data, which can have a significant negative impact on the performance of supervised learning methods. The same NTL issue was described in the paper <ref type="bibr" target="#b21">[22]</ref>, where deep reinforcement learning (DRL) was used to address the data imbalance problem. 
The advantage of the proposed method is that the classifier can use partial input features without a pre-processing method for input feature selection.</p><p>Work <ref type="bibr" target="#b22">[23]</ref> describes the potential of hierarchical Federated Learning (FL) in heterogeneous Internet of Things (IoT) systems. In particular, the authors proposed an optimized solution for user assignment and resource allocation over a hierarchical FL architecture for heterogeneous IoT systems. This work focuses on a generic class of machine learning models that are trained using gradient-descent-based schemes while considering the practical constraints of non-uniformly distributed data across different users.</p><p>In summary, the existing literature on addressing class imbalance in image datasets primarily focuses on general applications of the proposed methods, overlooking the specific challenges posed by technical systems such as industrial automation, robotics, and critical infrastructure. This gap in the literature hinders the development of tailored solutions for real-world technical systems where misclassification can have significant consequences. Research in this area needs to prioritize the adaptation of algorithms to meet the unique data characteristics and operational constraints of technical systems. Additionally, there is a need for investigations into the scalability, efficiency, interpretability, and adaptability of class imbalance solutions in specific technical environments. Bridging this gap will be essential to ensure the robust and reliable deployment of image-based systems in critical technical domains.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methodology</head><head n="3.1.">Oversampling</head><p>The study examined various variants of oversampling and undersampling in order to check the impact of such actions on the efficiency of object detection. The mathematical model for oversampling (upsampling) individual classes in a dataset to the size of the largest class using random sampling with replacement can be defined as follows.</p><p>Assume a dataset `D` consisting of `N` samples, where each sample is associated with a class label `yi` and `i` denotes the index of the sample. We aim to oversample each class to the size of the largest class.</p><p>We define `K` as the target number of samples in each class, equal to the size of the largest class in the dataset. Our goal is to generate a new dataset `D'` in which each class `i` has exactly `K` samples.</p><p>The oversampling procedure using random sampling with replacement is: 1. Find the size of the largest class: `K = max(Ni)` over all classes `i`. 2. For each class `i`:</p><p>a. If `Ni &lt; K`, randomly select `K - Ni` additional samples from class `i` with replacement until the target size is reached.</p><p>b. If `Ni &gt;= K`, include all samples from class `i` in `D'`.</p></div>
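The oversampling model above can be sketched in a few lines of Python. This is a minimal illustration under the definitions given, not the authors' implementation; the toy sample lists and the fixed random seed are assumptions made for reproducibility:

```python
import random
from collections import Counter

def oversample(samples, labels, seed=0):
    """Oversample each class, with replacement, to the size of the largest class (K)."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    k = max(len(group) for group in by_class.values())  # K = size of the largest class
    balanced = []
    for label, group in by_class.items():
        # If Ni is below K, draw K - Ni extra samples with replacement; otherwise keep all.
        extra = [rng.choice(group) for _ in range(k - len(group))]
        balanced.extend((sample, label) for sample in group + extra)
    return balanced

# Toy example mirroring the paper's class names: 4 GIS instances vs. 2 BIS instances.
data = ["g1", "g2", "g3", "g4", "b1", "b2"]
labs = ["GIS", "GIS", "GIS", "GIS", "BIS", "BIS"]
counts = Counter(label for _, label in oversample(data, labs))
print(counts)  # every class now has 4 instances
```

Because the surplus is drawn with replacement, minority-class instances are duplicated, which is exactly the mechanism whose effect on detection quality the study evaluates.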
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Undersampling</head><p>The mathematical model for undersampling (downsampling) individual classes in a dataset to the size of the smallest class using random sampling without replacement is shown below.</p><p>Assume a dataset `D` consisting of `N` samples, where each sample is associated with a class label `yi` and `i` denotes the index of the sample. We aim to undersample each class to the size of the smallest class.</p><p>We define `K` as the target number of samples in each class, equal to the size of the smallest class in the dataset. Our goal is to generate a new dataset `D'` in which each class `i` has exactly `K` samples.</p><p>The undersampling procedure using random sampling without replacement is: 1. Find the size of the smallest class: `K = min(Ni)` over all classes `i`. 2. For each class `i`: a. If `Ni &gt; K`, randomly select `K` samples from class `i` without replacement. b. If `Ni &lt;= K`, include all samples from class `i` in `D'`.</p></div>
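Analogously, the undersampling model can be sketched in Python. Again this is an illustrative sketch under the definitions above, not the authors' code; the seeded random generator is an assumption:

```python
import random
from collections import Counter

def undersample(samples, labels, seed=0):
    """Undersample each class, without replacement, to the size of the smallest class (K)."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    k = min(len(group) for group in by_class.values())  # K = size of the smallest class
    balanced = []
    for label, group in by_class.items():
        # If Ni exceeds K, keep a random subset of size K; otherwise keep the whole class.
        kept = rng.sample(group, k) if len(group) > k else group
        balanced.extend((sample, label) for sample in kept)
    return balanced

data = ["g1", "g2", "g3", "g4", "b1", "b2"]
labs = ["GIS", "GIS", "GIS", "GIS", "BIS", "BIS"]
counts = Counter(label for _, label in undersample(data, labs))
print(counts)  # every class now has 2 instances
```

Sampling without replacement guarantees that no instance is duplicated, but at the cost of discarding majority-class data, which is the trade-off examined in the experiments.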
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">The deep learning architecture used for object detection</head><p>YOLOv8m <ref type="bibr" target="#b23">[24]</ref> was used to perform this experiment. It is the newest state-of-the-art You Only Look Once (YOLO) model, applicable to object detection, image classification, and instance segmentation tasks. YOLOv8 was developed by Ultralytics, who also created the influential and industry-defining YOLOv5 model. YOLOv8 includes numerous architectural and developer-experience changes and improvements over YOLOv5. YOLO has been nurtured by the computer vision community since its first release in 2015 by Joseph Redmon. The early versions of YOLO were maintained in C in Darknet, a custom deep learning framework. Subsequent versions were developed in PyTorch, a deep learning Python framework. In addition to its robust model foundation, the YOLO maintainers have made a dedicated effort to foster a thriving software ecosystem for the model. They proactively address concerns and enhance the repository's functionality in response to the community's needs.</p><p>There are five YOLOv8 models for object detection: YOLOv8n (nano), YOLOv8s (small), YOLOv8m (medium), YOLOv8l (large), and YOLOv8x (extra-large). YOLOv8 Nano (YOLOv8n) is the fastest and smallest, while YOLOv8 Extra Large (YOLOv8x) is the most accurate yet the slowest.</p><p>More information about the current version of YOLO is given in <ref type="bibr" target="#b24">[25]</ref>, <ref type="bibr" target="#b25">[26]</ref>, <ref type="bibr" target="#b26">[27]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Evaluation metrics</head><p>The following metrics were used to evaluate the study results: Precision, Recall, F1 score, and mAP (in different variations). Precision measures the accuracy of the positive predictions made by the model. In the context of image detection, precision represents the proportion of correctly detected objects (true positives) out of all objects that the model predicted as positive (true positives + false positives). Recall measures the model's ability to correctly detect all instances of the object in the dataset. In the context of image detection, recall represents the proportion of correctly detected objects (true positives) out of all actual instances of the object in the dataset (true positives + false negatives). The F1 score is the harmonic mean of precision and recall. It provides a balance between precision and recall and is especially useful when you want to consider both false positives and false negatives in your evaluation. The F1 score ranges from 0 to 1, where a higher score indicates better performance. A perfect F1 score of 1 means that the model has achieved perfect precision and recall, implying that it detects all instances of the object with no false positives or false negatives.</p><p>Mean average precision (mAP) is a crucial evaluation metric used in object detection tasks. It provides a comprehensive and quantitative measure of the accuracy and reliability of models employed in these tasks. Average precision (AP) is calculated for each class by computing the precision-recall curve. These measures are computed at different confidence thresholds to generate multiple data points forming the curve. mAP is then obtained by averaging AP values across all object classes. 
It signifies the overall accuracy and performance of the model across various classes, thereby providing a single, consolidated evaluation score.</p><p>mAP is reported at a chosen IoU threshold or a range of thresholds: mAP at 0.5 counts a detection as correct when its IoU with the ground truth is at least 0.5, while stricter settings, such as 0.75 or the average over thresholds from 0.5 to 0.95, demand increasingly precise localization. IoU, or Intersection over Union, is a commonly used metric in object detection tasks to measure the accuracy of bounding box predictions made by a model. It assesses the overlap between the predicted bounding box and the ground truth (actual) bounding box for an object in an image.</p><p>Refer to the YOLOv8 documentation for more information on each metric <ref type="bibr" target="#b27">[28]</ref>.</p></div>
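As a concrete reference, the per-detection quantities behind these metrics can be computed as follows. This is a self-contained sketch; the `(x1, y1, x2, y2)` box convention and the toy counts are assumptions for illustration:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # zero if the boxes do not overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Two unit-offset 2x2 boxes overlap in a 1x1 square: IoU = 1 / (4 + 4 - 1).
print(round(iou((0, 0, 2, 2), (1, 1, 3, 3)), 3))  # 0.143
print(precision_recall_f1(tp=8, fp=2, fn=2))      # approximately (0.8, 0.8, 0.8)
```

AP then summarizes precision and recall over confidence thresholds per class, and mAP averages AP over classes, so these functions are the lowest-level building blocks of the reported tables.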
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Data Collection and Preprocessing</head><p>The Insulator Defect Image Dataset (IDID) <ref type="bibr" target="#b9">[10]</ref>, representing power line insulators, was used in the research; it contains one large class depicting undamaged insulators and two smaller classes depicting two types of damaged insulators. The IDID dataset contains 1596 images in total. Annotations describing damaged and undamaged individual disks -components of power insulators -have been added to the individual images of the dataset. The detection targets were precisely these individual components, along with their assignment to the appropriate class.</p><p>The publicly available IDID dataset is divided into two parts: a "train dataset" and a "test dataset". For the purposes of this study, 1,000 examples representing undamaged insulator disks ("good insulator shell" -GIS class) and 200 examples of damaged insulator disks (100 examples of "flashover damage insulator shell" -FDIS class and 100 examples of "broken insulator shell" -BIS class) were selected from the "train dataset" part. These images, in different resampling variants, were used to train the detector.</p><p>All images from the "test dataset" part were used to test the effectiveness of the detector. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Experimental Results and Discussion</head><p>Based on the images selected from the IDID dataset, described in the "Data Collection and Preprocessing" section, and using various resampling methods, six variants of the training set, listed below, were prepared:</p><p>• Variant I: 1000 GIS, 100 FDIS, 100 BIS, starting case, imbalanced dataset,</p><p>• Variant II: 200 GIS, 100 FDIS, 100 BIS, undersampling so that the train dataset contains the same number of undamaged objects (GIS class) and damaged objects (FDIS class + BIS class), • Variant III: 100 GIS, 100 FDIS, 100 BIS, undersampling to the smallest classes FDIS, BIS, • Variant IV: 1000 GIS, 500 FDIS, 500 BIS, oversampling so that the train dataset contains the same number of undamaged objects (GIS class) and damaged objects (FDIS class + BIS class), • Variant V: 1000 GIS, 600 FDIS, 600 BIS, oversampling the smallest classes to triple their size and, as a result, reduce disproportions between the classes, • Variant VI: 1000 GIS, 1000 FDIS, 1000 BIS, oversampling to the largest class GIS. The listed numbers indicate the number of instances in each class. For each of the presented variants, the learning process was carried out. The mean results obtained for the test dataset are presented in Table <ref type="table" target="#tab_0">1</ref>. The best efficiency results, for each of the analyzed mean metrics, were obtained for Variant IV -oversampling so that the train dataset contains the same number of undamaged objects (GIS class) and damaged objects (FDIS class + BIS class). This variant did not reduce the number of examples for</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1:</head><label>1</label><figDesc>Figure 1: Examples representing each of the listed classes (GIS, FDIS and BIS).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head></head><label></label><figDesc>It contains 930 examples (instances) of class GIS, 70 examples of class FDIS and 66 examples of class BIS.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Summary of mean results for individual resampling variants.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head></head><label></label><figDesc></figDesc><table><row><cell>Variant</cell><cell>Variant I</cell><cell>Variant II</cell><cell>Variant III</cell><cell>Variant IV</cell><cell>Variant V</cell><cell>Variant VI</cell></row><row><cell>Class size</cell><cell>1000 GIS, 100 FDIS, 100 BIS</cell><cell>200 GIS, 100 FDIS, 100 BIS</cell><cell>100 GIS, 100 FDIS, 100 BIS</cell><cell>1000 GIS, 500 FDIS, 500 BIS</cell><cell>1000 GIS, 600 FDIS, 600 BIS</cell><cell>1000 GIS, 1000 FDIS, 1000 BIS</cell></row><row><cell>Mean AP at IoU threshold of 0.5 for all classes</cell><cell>0.498</cell><cell>0.535</cell><cell>0.452</cell><cell>0.580</cell><cell>0.501</cell><cell>0.455</cell></row><row><cell>Mean AP at IoU threshold of 0.75 for all classes</cell><cell>0.414</cell><cell>0.462</cell><cell>0.382</cell><cell>0.478</cell><cell>0.400</cell><cell>0.362</cell></row><row><cell>Mean AP at IoU threshold of 0.5 to 0.95 for all classes</cell><cell>0.336</cell><cell>0.373</cell><cell>0.318</cell><cell>0.409</cell><cell>0.341</cell><cell>0.307</cell></row></table></figure>
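The six training-set variants can be generated mechanically from the selected IDID examples. The sketch below is our own illustrative framing, not the authors' pipeline: the `resample_to` helper and the placeholder instance names are assumptions, while the per-variant target counts are taken from the paper:

```python
import random

# Target instance counts (GIS, FDIS, BIS) for the six training-set variants.
VARIANTS = {
    "I":   (1000, 100, 100),    # imbalanced baseline
    "II":  (200, 100, 100),     # undersample GIS to match FDIS + BIS
    "III": (100, 100, 100),     # undersample everything to the smallest class
    "IV":  (1000, 500, 500),    # oversample so damaged totals match undamaged
    "V":   (1000, 600, 600),    # oversample minority classes to triple size
    "VI":  (1000, 1000, 1000),  # oversample minority classes to the largest class
}

def resample_to(count, pool, rng):
    """Draw `count` items from `pool`: without replacement when shrinking,
    with replacement for the surplus when growing."""
    if count > len(pool):
        return pool + [rng.choice(pool) for _ in range(count - len(pool))]
    return rng.sample(pool, count)

rng = random.Random(0)
gis_pool = [f"gis_{i}" for i in range(1000)]   # placeholders for annotated instances
fdis_pool = [f"fdis_{i}" for i in range(100)]
bis_pool = [f"bis_{i}" for i in range(100)]

variant_iv = [resample_to(n, pool, rng)
              for n, pool in zip(VARIANTS["IV"], (gis_pool, fdis_pool, bis_pool))]
print([len(group) for group in variant_iv])  # [1000, 500, 500]
```

Under this framing, Variants II and III only ever shrink pools, while Variants IV to VI grow the minority pools by duplicating existing instances, matching the resampling models of Section 3.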
		</body>
		<back>
			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>the most numerous GIS class, while at the same time the five-fold oversampling of the least numerous classes (FDIS and BIS) allowed for their better representation.</p><p>In most cases, the worst results were obtained for Variant VI, i.e. oversampling to the size of the largest class (GIS). The only exception was the "Mean AP at IoU threshold of 0.5 for all classes" metric, for which Variant III scored marginally lower. It can therefore be seen that duplicating examples from a small class improves the results only up to a certain point (Variant IV), while excessive multiplication of the same instances degrades detection performance. Results for the individual resampling variants for each class are shown in Table <ref type="table">2</ref>. The per-class analysis showed that the F1 score, like the mean metrics, reached its best values for Variant IV. Undersampling the entire dataset to the least numerous class (Variant III), on the other hand, substantially reduced the size of the training dataset and consequently produced the worst F1 score.</p><p>In contrast, the Precision and Recall metrics yielded mixed results. Multiplying the number of instances improved Precision but at the same time degraded Recall, because the detector was unable to find objects that differed from the duplicated pattern (a problem of neural network generalization).</p><p>The mAP metric calculated for each class separately was the best for Variant IV, similar to the collectively calculated mAP and F1 score. 
In other cases, the results were worse; for this metric, however, the negative impact of resampling in the individual variants cannot be unequivocally assessed, because the instance detection efficiency varied ambiguously across classes. This aspect requires further research.</p><p>Addressing class imbalance in object detection is a critical challenge, and it is essential to acknowledge the limitations and challenges encountered during the described experiment.</p><p>Here are some limitations and challenges that should be considered.</p><p>The effectiveness of resampling techniques relies heavily on the quality and representativeness of the dataset. If the dataset does not accurately capture the real-world distribution of objects, the results may not generalize well to practical scenarios. The success of resampling methods on the specific dataset used (IDID) does not guarantee similar outcomes on other datasets with different object classes and distributions; the generalizability of the findings should therefore be assessed. The choice of hyperparameters for the resampling methods and the object detection models can significantly affect the results, so optimizing these hyperparameters is crucial for achieving the best performance.</p><p>For practical applications, it is important to assess the impact of resampling on the real-time inference speed of object detection systems, especially in latency-sensitive environments. Practical applications may also face resource constraints, such as limited labelled data for rare classes; research should explore strategies for addressing class imbalance in resource-constrained settings.</p><p>Resampling techniques may affect the interpretability of object detection models. 
Ensuring that the models provide meaningful explanations for their detections is essential, particularly in critical applications.</p><p>Addressing these limitations and challenges is crucial for advancing the field of object detection in the presence of class imbalance and for ensuring the applicability of research findings to real-world computer vision applications. The article describes preliminary research carried out for only one detection algorithm, five data resampling variants, and one dataset. To face these limitations and challenges, the authors plan to continue the described research on reducing the impact of class imbalance on object detection efficiency.</p></div>
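The per-class over/undersampling behind the variants discussed above can be sketched as random resampling of per-class example lists to target counts. A minimal illustration; the file names and the `resample` helper are assumptions for this sketch, not the authors' implementation:

```python
import random

def resample(files, target, rng):
    """Return exactly `target` examples: undersample without replacement,
    or oversample by appending randomly duplicated examples."""
    if target <= len(files):
        return rng.sample(files, target)                       # undersampling
    return files + rng.choices(files, k=target - len(files))   # oversampling

rng = random.Random(0)
gis  = [f"gis_{i}.jpg"  for i in range(1000)]   # majority class
fdis = [f"fdis_{i}.jpg" for i in range(100)]    # minority class
bis  = [f"bis_{i}.jpg"  for i in range(100)]    # minority class

# Variant IV: 1000 GIS, 500 FDIS, 500 BIS (five-fold oversampling of FDIS/BIS)
variant_iv = (resample(gis, 1000, rng)
              + resample(fdis, 500, rng)
              + resample(bis, 500, rng))
print(len(variant_iv))  # 2000
```

Variant III would instead call `resample(gis, 100, rng)`, shrinking the whole training set to 300 examples, which matches the drop in F1 score reported above.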
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.">Conclusion and Future Work</head><p>In summary, the article investigates how data resampling techniques can enhance the performance of object detection models in the presence of class imbalance, offering insights and guidance for improving the effectiveness of these models in practical applications.</p><p>The most efficient outcomes for each of the investigated mean metrics were achieved with Variant IV, in which oversampling was applied to ensure an equal number of undamaged (GIS class) and damaged (FDIS class + BIS class) objects in the training dataset. This approach preserved the abundance of GIS class examples while significantly improving the representation of the less common FDIS and BIS classes through five-fold oversampling. In the remaining scenarios, diverse outcomes were observed; these are elaborated upon in greater detail within the article.</p><p>This article described initial research into the influence of various resampling methods on detection efficiency. Potential future research directions are:</p><p>• Testing other known methods of resampling training data,</p><p>• Development of a new resampling method using the knowledge gained during the described experiments, • Testing other algorithms for detecting and classifying objects in digital images and their effectiveness depending on the data resampling method used, • Study of imbalance reduction in the test dataset, • Investigation of the use of semi-supervised learning approaches to leverage both labelled and unlabelled data, potentially reducing the reliance on extensive labelled data for minority classes, • Applying Generative Adversarial Networks (GANs) to generate synthetic samples for minority classes in object detection datasets. 
This approach can potentially improve model performance by providing more diverse training data, • Investigation of the use of ensemble models that combine multiple object detection models trained on resampled datasets to improve overall performance and robustness.</p><p>The findings of this research have practical implications for improving the accuracy and reliability of object detection models in real-world applications.</p><p>Understanding the effectiveness of data resampling can guide practitioners and researchers in selecting the appropriate approach to handling class imbalance in object detection. This knowledge can lead to more efficient computer vision systems, particularly in scenarios where imbalanced classes are common. Mitigating class imbalance in object detection is vital for developing robust computer vision systems capable of accurately identifying and locating objects across a diverse range of classes, regardless of their representation in the dataset. This research area continues to evolve, contributing to advancements in object detection and broader applications of computer vision in fields such as autonomous driving and medical imaging.</p></div>			</div>
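The Precision/Recall trade-off noted in the results discussion (oversampling raising Precision while lowering Recall) can be made concrete with the standard definitions; the detection counts below are hypothetical, not results from the experiment:

```python
def precision(tp, fp):
    """Fraction of detections that are correct."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of ground-truth objects that are found."""
    return tp / (tp + fn)

def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Hypothetical counts for a heavily oversampled class: confident detections
# on duplicated patterns (few false positives) but misses on unseen
# variations (many false negatives) -> high Precision, low Recall.
p = precision(tp=80, fp=10)   # 0.889
r = recall(tp=80, fn=40)      # 0.667
print(round(f1(p, r), 3))     # 0.762
```

Because F1 is a harmonic mean, it penalizes this kind of imbalance between the two components, which is why the paper reports it alongside Precision and Recall.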
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Data Preprocessing, Active Learning, and Cost Perceptive Approaches for Resolving Data Imbalance</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">P</forename><surname>Rana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">G</forename><surname>Mehta</surname></persName>
		</author>
		<idno type="DOI">10.4018/978-1-7998-7371-6</idno>
		<ptr target="https://doi.org/10.4018/978-1-7998-7371-6" />
		<imprint>
			<date type="published" when="2021">2021</date>
			<publisher>IGI Global</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Detection of Power Line Insulators in Digital Images Based on the Transformed Colour Intensity Profiles</title>
		<author>
			<persName><forename type="first">M</forename><surname>Tomaszewski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Gasz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Osuchowski</surname></persName>
		</author>
		<idno type="DOI">10.3390/s23063343</idno>
		<ptr target="https://doi.org/10.3390/s23063343" />
	</analytic>
	<monogr>
		<title level="j">Sensors</title>
		<imprint>
			<biblScope unit="volume">23</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="3343" to="3343" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">A Review on State-of-the-Art Power Line Inspection Techniques</title>
		<author>
			<persName><forename type="first">L</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Fan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Peng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liang</surname></persName>
		</author>
		<idno type="DOI">10.1109/tim.2020.3031194</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Instrumentation and Measurement</title>
		<imprint>
			<biblScope unit="volume">69</biblScope>
			<biblScope unit="issue">12</biblScope>
			<biblScope unit="page" from="9350" to="9365" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Data analysis in visual power line inspection: An indepth review of deep learning for component detection and fault diagnosis</title>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Miao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.arcontrol.2020.09.002</idno>
		<ptr target="https://doi.org/10.1016/j.arcontrol.2020.09.002" />
	</analytic>
	<monogr>
		<title level="j">Annual Reviews in Control</title>
		<imprint>
			<date type="published" when="2020-10">October 2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A Review of Fault Diagnosing Methods in Power Transmission Systems</title>
		<author>
			<persName><forename type="first">A</forename><surname>Raza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Benrabah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Alquthami</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename></persName>
		</author>
		<idno type="DOI">10.3390/app10041312</idno>
		<ptr target="https://doi.org/10.3390/app10041312" />
	</analytic>
	<monogr>
		<title level="j">Applied Sciences</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page">1312</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Unified Deep Learning Architecture for the Detection of All Catenary Support Components</title>
		<author>
			<persName><forename type="first">W</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nunez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Han</surname></persName>
		</author>
		<idno type="DOI">10.1109/access.2020.2967831</idno>
		<ptr target="https://doi.org/10.1109/access.2020.2967831" />
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="17049" to="17059" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Automatic autonomous vision-based power line inspection: A review of current status and the potential role of deep learning</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">N</forename><surname>Nguyen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Jenssen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Roverso</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.ijepes.2017.12.016</idno>
		<ptr target="https://doi.org/10.1016/j.ijepes.2017.12.016" />
	</analytic>
	<monogr>
		<title level="j">International Journal of Electrical Power &amp; Energy Systems</title>
		<imprint>
			<biblScope unit="volume">99</biblScope>
			<biblScope unit="page" from="107" to="120" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Evaluation of Power Insulator Detection Efficiency with the Use of Limited Training Dataset</title>
		<author>
			<persName><forename type="first">M</forename><surname>Tomaszewski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Michalski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Osuchowski</surname></persName>
		</author>
		<idno type="DOI">10.3390/app10062104</idno>
		<ptr target="https://doi.org/10.3390/app10062104" />
	</analytic>
	<monogr>
		<title level="j">Applied Sciences</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page">2104</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Object Description Based on Local Features Repeatability</title>
		<author>
			<persName><forename type="first">M</forename><surname>Tomaszewski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Michalski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Osuchowski</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-72254-8_28</idno>
		<ptr target="https://doi.org/10.1007/978-3-030-72254-8_28" />
		<imprint>
			<date type="published" when="2021-01-01">January 1, 2021</date>
			<biblScope unit="page" from="255" to="267" />
		</imprint>
	</monogr>
	<note>Advances in intelligent systems and computing</note>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Insulator Defect Detection</title>
		<author>
			<persName><forename type="first">P</forename><surname>Kulkarni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Lewis</surname></persName>
		</author>
		<ptr target="https://ieee-dataport.org/competitions/insulator-defect-detection" />
		<imprint>
			<date type="published" when="2023-09-06">September 6, 2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Learning from imbalanced data: open challenges and future directions</title>
		<author>
			<persName><forename type="first">B</forename><surname>Krawczyk</surname></persName>
		</author>
		<idno type="DOI">10.1007/s13748-016-0094-0</idno>
		<ptr target="https://doi.org/10.1007/s13748-016-0094-0" />
	</analytic>
	<monogr>
		<title level="j">Progress in Artificial Intelligence</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="221" to="232" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Learning from Imbalanced Data</title>
		<author>
			<persName><forename type="first">H</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">A</forename><surname>Garcia</surname></persName>
		</author>
		<idno type="DOI">10.1109/tkde.2008.239</idno>
		<ptr target="https://doi.org/10.1109/tkde.2008.239" />
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Knowledge and Data Engineering</title>
		<imprint>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="issue">9</biblScope>
			<biblScope unit="page" from="1263" to="1284" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ma</surname></persName>
		</author>
		<title level="m">Imbalanced Learning: Foundations, Algorithms, and Applications</title>
		<imprint>
			<publisher>John Wiley &amp; Sons</publisher>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">A Survey of Predictive Modelling under Imbalanced Distributions</title>
		<author>
			<persName><forename type="first">P</forename><surname>Branco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Torgo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Ribeiro</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.1505.01658</idno>
		<ptr target="https://doi.org/10.48550/arXiv.1505.01658" />
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Survey on deep learning with class imbalance</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Johnson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">M</forename><surname>Khoshgoftaar</surname></persName>
		</author>
		<idno type="DOI">10.1186/s40537-019-0192-5</idno>
		<ptr target="https://doi.org/10.1186/s40537-019-0192-5" />
	</analytic>
	<monogr>
		<title level="j">Journal of Big Data</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">1</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Classification of Imbalanced Data: Review of Methods and Applications</title>
		<author>
			<persName><forename type="first">P</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Bhatnagar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Gaur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bhatnagar</surname></persName>
		</author>
		<idno type="DOI">10.1088/1757-899x/1099/1/012077</idno>
	</analytic>
	<monogr>
		<title level="j">IOP Conference Series: Materials Science and Engineering</title>
		<imprint>
			<biblScope unit="volume">1099</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">12077</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Assessing the data complexity of imbalanced datasets</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">H</forename><surname>Barella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">P F</forename><surname>Garcia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">C P</forename><surname>De Souto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C</forename><surname>Lorena</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C P L F</forename><surname>De Carvalho</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.ins.2020.12.006</idno>
		<ptr target="https://doi.org/10.1016/j.ins.2020.12.006" />
	</analytic>
	<monogr>
		<title level="j">Information Sciences</title>
		<imprint>
			<biblScope unit="volume">553</biblScope>
			<biblScope unit="page" from="83" to="109" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics</title>
		<author>
			<persName><forename type="first">V</forename><surname>López</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Fernández</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>García</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Palade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Herrera</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.ins.2013.07.007</idno>
		<ptr target="https://doi.org/10.1016/j.ins.2013.07.007" />
	</analytic>
	<monogr>
		<title level="j">Information Sciences</title>
		<imprint>
			<biblScope unit="volume">250</biblScope>
			<biblScope unit="page" from="113" to="141" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Classification with Imbalanced Datasets</title>
		<ptr target="https://sci2s.ugr.es/imbalanced" />
	</analytic>
	<monogr>
		<title level="m">Soft Computing and Intelligent Information Systems</title>
				<imprint/>
	</monogr>
	<note>sci2s</note>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">An empirical study on the effectiveness of data resampling approaches for cross-project software defect prediction</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">E</forename><surname>Bennin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tahir</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">G</forename><surname>MacDonell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Börstler</surname></persName>
		</author>
		<idno type="DOI">10.1049/sfw2.12052</idno>
		<ptr target="https://doi.org/10.1049/sfw2.12052" />
	</analytic>
	<monogr>
		<title level="j">IET Software</title>
		<imprint>
			<date type="published" when="2021-11-28">November 28, 2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Improved practices in machine learning algorithms for NTL detection with imbalanced data</title>
		<author>
			<persName><forename type="first">G</forename><surname>Figueroa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">S</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">F</forename><surname>Avila</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">C</forename><surname>Chu</surname></persName>
		</author>
		<idno type="DOI">10.1109/pesgm.2017.8273852</idno>
		<ptr target="https://doi.org/10.1109/pesgm.2017.8273852" />
	</analytic>
	<monogr>
		<title level="m">2017 IEEE Power &amp; Energy Society General Meeting</title>
		<imprint>
			<date type="published" when="2017-07-01">July 1, 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Non-Technical Loss Detection Using Deep Reinforcement Learning for Feature Cost Efficiency and Imbalanced Dataset</title>
		<author>
			<persName><forename type="first">J</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">G</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">H</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">I</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">Y</forename><surname>Kim</surname></persName>
		</author>
		<idno type="DOI">10.1109/access.2022.3156948</idno>
		<ptr target="https://doi.org/10.1109/access.2022.3156948" />
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="27084" to="27095" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Communication-efficient hierarchical federated learning for IoT heterogeneous systems with imbalanced data</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Abdellatif</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Mhaisen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mohamed</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.future.2021.10.016</idno>
		<ptr target="https://doi.org/10.1016/j.future.2021.10.016" />
	</analytic>
	<monogr>
		<title level="j">Future Generation Computer Systems</title>
		<imprint>
			<biblScope unit="volume">128</biblScope>
			<biblScope unit="page" from="406" to="419" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<ptr target="https://github.com/ultralytics/ultralytics?ref=blog.roboflow.com" />
		<title level="m">GitHub - ultralytics/ultralytics</title>
				<imprint>
			<date type="published" when="2023-09-07">September 7, 2023</date>
		</imprint>
	</monogr>
	<note>roboflow</note>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<title level="m" type="main">What is YOLOv8? The Ultimate Guide. Roboflow Blog</title>
		<author>
			<persName><forename type="first">J</forename><surname>Solawetz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Francesco</forename></persName>
		</author>
		<ptr target="https://blog.roboflow.com/whats-new-in-yolov8/" />
		<imprint>
			<date type="published" when="2023-01-11">January 11, 2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">UAV-YOLOv8: A Small-Object-Detection Model Based on Improved YOLOv8 for UAV Aerial Photography Scenarios</title>
		<author>
			<persName><forename type="first">G</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>An</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><surname>Huang</surname></persName>
		</author>
		<idno type="DOI">10.3390/s23167190</idno>
		<ptr target="https://doi.org/10.3390/s23167190" />
	</analytic>
	<monogr>
		<title level="j">Sensors</title>
		<imprint>
			<biblScope unit="volume">23</biblScope>
			<biblScope unit="issue">16</biblScope>
			<biblScope unit="page" from="7190" to="7190" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Terven</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Cordova-Esparaza</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2304.00501.pdf" />
		<title level="m">A Comprehensive Review of YOLO: From YOLOv1 to YOLOv8 and Beyond. Under review in ACM Computing Surveys</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<ptr target="https://docs.ultralytics.com/reference/utils/metrics/#ultralytics.utils.metrics.Metric" />
		<title level="m">Ultralytics metrics</title>
				<imprint>
			<date type="published" when="2023-09-07">September 7, 2023</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
