<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn>1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Deep Segmentation of Fuel Rod End Plugs for Nuclear Assembly Inspection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vojtěch Bláha</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jan Blažek</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Charles University, Faculty of Mathematics and Physics, Department of Software and Computer Science Education</institution>
          ,
          <addr-line>Prague, Czech Republic</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Information Theory and Automation, Czech Academy of Sciences</institution>
          ,
          <addr-line>Prague</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Research Centre Řež</institution>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>In nuclear power generation, precise geometric conformity of fuel rods is essential for reactor safety and optimal performance. We present an end-to-end image processing system built around deep learning-based segmentation for automated inspection of fuel rod end plugs in grayscale side-view images. The system handles challenges posed by metallic surfaces with complex light reflections and structural ambiguity. We train U-Net and U-Net 3+ decoders with EfficientNet-V2-S and ConvNeXt-Tiny encoders on both real and synthetically rendered datasets. The proposed approach is evaluated across multiple configurations, with the best model achieving an average positional error of 0.42 mm and standard deviation of 0.39 mm. Our results demonstrate the feasibility of deploying convolutional architectures in real-world industrial inspection workflows.</p>
      </abstract>
      <kwd-group>
        <kwd>data</kwd>
        <kwd>industrial automation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Nuclear power remains a key contributor to low-carbon electricity generation, offering high reliability
and efficiency. The safe and effective operation of a nuclear reactor depends significantly on the
structural integrity and precise geometry of its core components, in our case the fuel rods. These rods
are sealed with metallic end plugs and arranged in grid-like structures within fuel assemblies. Accurate
measurement of plug positions is essential for verifying manufacturing quality by determining the
length of each fuel rod, which in turn is a key parameter for assessing structural conformity and fuel
burnup. Detecting deviations allows operators to identify design flaws or damaged rods, enabling
targeted replacements during scheduled outages and reducing the risk of in-service failures.</p>
      <p>Our experience with quality assurance (QA) involves visual inspection by human operators,
who are responsible for the image processing and measurement of rod growth. In past years, we
developed our first image processing system, which utilizes edge detection, morphological operations, and
geometric heuristics and produces reproducible results with an automation rate of about 80%. These approaches,
however, often fall short in robustness and precision, especially in the presence of complex reflections
on metallic surfaces. Consequently, we register an increasing demand for automated, reliable, and
accurate inspection systems that generate reproducible inspection data under industrial constraints.</p>
      <p>The recent success of deep learning, particularly convolutional neural networks (CNNs), in computer
vision tasks has paved the way for their application in industrial inspection. Semantic segmentation
using CNNs allows for pixel-precise classification of image regions and is well-suited for tasks requiring
precise object localization and measurement. However, deploying these methods in real-world nuclear
inspection scenarios presents three main challenges:
• Data scarcity: Annotated data is limited due to confidentiality and cost of expert labeling.
• Domain complexity: As illustrated in Figure 1, reflections on metallic surfaces — common
in real fuel assembly imagery — pose challenges for accurate segmentation. Figure 2 shows
an example of a synthetically rendered fuel assembly designed to closely resemble the visual
characteristics of real inspection data.
• Industrial constraints: The system must be reliable and verifiable, producing correct outputs
or clearly indicating uncertainty, as required in safety-critical environments.</p>
      <p>This paper proposes an image processing pipeline based on deep learning segmentation models,
trained on nuclear-industry-specific data. We focus on the segmentation of fuel rod end plugs and
adjacent grid structures from grayscale side-view images produced during visual inspections. Our
method integrates modern CNN architectures — U-Net and U-Net 3+ — tailored for image segmentation
with high-performing backbones such as EfficientNetV2-S and ConvNeXt-Tiny.</p>
      <p>To address data scarcity and promote generalization, we develop a synthetic image generation
pipeline using Blender, capable of rendering annotated fuel rod images with realistic geometry, textures,
and lighting. We also apply targeted data augmentation strategies to better capture the variability of
real-world data.</p>
      <p>The core contributions of this work include:
1. A detailed semantic segmentation pipeline optimized for fuel rod inspection.
2. A synthetic dataset simulating images acquired during real fuel inspection.
3. A rigorous evaluation of model variants and training strategies, achieving sub-millimeter accuracy.
4. A comparison with legacy rule-based systems, showing significant improvements in speed and
precision.
5. A deployment-ready model export with GPU acceleration.</p>
      <p>In the following sections, we describe the industrial requirements and image characteristics, review
related work, explain our data preparation and model architectures, present experimental results, and
discuss implications for industrial deployment and future extensions.</p>
    </sec>
    <sec id="sec-2">
      <title>1. Industrial Context and Problem Setup</title>
      <p>Visual inspection of nuclear fuel rod end plugs is typically performed during scheduled outages of
nuclear power plants. The goal is to assess the condition and assembly quality of fuel rods, including
both those in active use and those nearing the end of their fuel cycle.</p>
      <p>An example frame is shown in Figure 3, highlighting end plug boundaries and the grid’s upper edge.
Variations in appearance arise due to surface oxidation and irradiation reflections.</p>
      <p>Each image captures the upper part of the fuel assembly at the point where the fuel rods terminate,
using a high-resolution grayscale radiation-resistant camera under controlled lighting. Our scene
typically contains eleven vertically arranged fuel rod end plugs and the top edge of a horizontal spacer
grid. The imaging setup is designed so that the camera sensor is aligned parallel to the plane formed by
the peripheral rods, although slight deviations may occur due to mechanical tolerances, resulting in
minor left-right geometric skew.</p>
      <p>The primary goal is to estimate the full 2D position and shape of each visible end plug head, as
well as to accurately localize the upper edge of the spacer grid which serves as a reference point. The
perspective transformation used to correct geometric skew is identical to the original rule-based system;
the main advantage of our approach lies in the improved robustness and accuracy of segmentation
through deep learning.</p>
      <p>For each rod depicted in the image, we generate binary masks for two specific regions:
• The visible plug head
• The upper edge of the horizontal spacer grid</p>
      <p>The choice to use masks is a design decision; the 2D position of plugs could be estimated through
other methods. While the shape of the mask could, in principle, serve as an additional check for
assembly correctness, our current approach primarily relies on the size of the masked area rather than
detailed shape information.</p>
      <p>Although no strict specification is provided, the acceptable positional deviation is typically around
1.5 mm, based on downstream processing requirements and empirical tolerances. The expected detection
success rate is not formally quantified. Real-time performance is not a critical constraint, as the fuel
assembly is scanned over several minutes, allowing sufficient time for repeated measurements and
evaluation of rod lengths across the image sequence.</p>
      <p>In this context, the segmentation model should achieve high localization precision, robustness to
visual variability, and sufficient inference speed to support on-site decision-making.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
      <p>The problem of object segmentation in industrial settings, particularly for metallic components, has
been addressed by several computer vision and machine learning approaches. This section reviews the
key developments in semantic segmentation, synthetic data generation, and applications in nuclear and
manufacturing domains.</p>
      <sec id="sec-3-1">
        <title>2.1. Semantic Segmentation in Industrial Vision</title>
        <p>
          Since the introduction of UNet [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], semantic segmentation has seen significant advancements aimed at
improving contextual understanding, feature representation, and efficiency. UNet’s encoder-decoder
structure with skip connections effectively combines spatial and semantic information but is limited
in capturing global context. Subsequent models like PSPNet [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] and DeepLabV3 [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] addressed this by
introducing pyramid pooling and atrous spatial pyramid pooling, respectively, to incorporate multi-scale
contextual information. Deep residual networks replaced shallow encoders, enhancing feature extraction
capabilities, as seen in DeepLabV3+ [6]. Attention mechanisms and, more recently, Transformer-based
architectures such as TransUNet [7], SegFormer [8], and Mask2Former [9] have further improved
segmentation by modeling long-range dependencies. Additionally, lightweight models like ENet [10],
BiSeNet [11], and Fast-SCNN [12] have been developed for real-time applications, while refined decoders
and self-supervised techniques [13] continue to enhance boundary precision and reduce annotation
dependence. These developments collectively mark a shift toward more context-aware, accurate, and
efficient segmentation architectures.
        </p>
        <p>To address the demands of a new industrial application, we consider architectures that combine
U-Net-style decoders with modern encoder backbones such as EfficientNet [14] and ConvNeXt [15],
offering a strong balance between performance, scalability, and modular design.</p>
      </sec>
      <sec id="sec-3-2">
        <title>2.2. Synthetic Data and Domain Adaptation</title>
        <p>Due to confidentiality constraints surrounding real-world datasets, we propose training convolutional
neural networks (CNNs) using a synthetic dataset that can be openly shared with the research
community. Synthetic data not only circumvent privacy issues but also address challenges related to precise
annotations and the limited availability of labeled images. Our prior work [16] introduced a synthetic
generator for fuel assemblies, which we build upon in this study.</p>
        <p>Recent advances demonstrate that high-quality synthetic data generated via 3D modeling and
rendering tools, such as Blender, can effectively support CNN training. Studies like [17, 18] show that
synthetic-to-real transfer is feasible when rendered data incorporate diverse and realistic variations
in lighting, textures, and object poses. Furthermore, domain randomization [19] and techniques like
structured synthetic pipelines [20] and texture transfer [21] have proven valuable in bridging the domain
gap between synthetic and real images. In our approach, we leverage both photorealistic rendering and
domain variation strategies to simulate operational conditions and maximize model generalizability.</p>
      </sec>
      <sec id="sec-3-3">
        <title>2.3. Deep Learning in Nuclear and Manufacturing Inspection</title>
        <p>In the nuclear industry, most applications of machine learning have focused on reactor monitoring,
anomaly detection in sensor networks [22], or predictive maintenance. Visual inspection of fuel rods
remains underexplored due to strict confidentiality and domain-specific constraints.</p>
        <p>However, deep learning has been successfully applied to weld inspection [23], defect detection in
turbine blades [24], and PCB quality control [25]. These tasks share similar challenges, such as small
defect size, reflective surfaces, and class imbalance.</p>
        <p>To our knowledge, this paper is among the first to address deep segmentation of nuclear fuel rod
end plugs in a production-grade setting. Our work contributes a detailed, reproducible pipeline and
highlights the viability of CNN-based inspection under safety-critical constraints.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Data Preparation and Augmentation</title>
      <p>Deep learning-based semantic segmentation models rely on diverse, well-annotated training data. In
our study, we use a combination of real inspection images with manual annotations and synthetically
generated images. Since the real dataset is small and labor-intensive to annotate consistently, we apply
data augmentation to enhance model generalization. The same augmentation techniques are applied to
the synthetic data to maintain consistency across the training set.</p>
      <sec id="sec-4-1">
        <title>3.1. Real Dataset</title>
        <p>The real dataset consists of 523 manually annotated grayscale images captured from a nuclear fuel rod
assembly line. Each 720×576 image contains 11 vertically aligned peripheral rods. Expert annotators
labeled two key regions: the end plug and the corresponding grid bar. To ensure consistency, all
annotations were reviewed and refined based on inter-rater agreement. The limited size of the dataset
highlights the need for artificial diversity through synthetic data or augmentation. Due to a
non-disclosure agreement (NDA), these images cannot be publicly shown in this work.</p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Synthetic Dataset Generation</title>
        <p>To complement real images, we developed a parametric 3D rendering pipeline using Blender, building
upon the publicly available codebase introduced in [16]. The synthetic data generator is capable of
simulating diverse rod configurations, illumination conditions, and camera viewpoints. Key features
include:
• Photorealistic rendering using physically-based shaders and ray-traced lighting.
• Automatic generation of ground truth masks for plug and grid regions.
• Variable camera roll, pitch, and distance to simulate mechanical misalignment.
• Simulation of various lighting intensities and realistic image noise artifacts to enhance domain
variability.</p>
        <p>A total of 523 synthetic images were rendered with random configurations. Post-rendering, all images
were converted to 8-bit grayscale and resized to match real frame specifications.</p>
      </sec>
      <sec id="sec-4-3">
        <title>3.3. On-the-Fly Augmentations</title>
        <p>To further increase variability, we used the Albumentations library [26] to perform random
augmentations, applied only to the training dataset. The pipeline included:
• Geometric: Small affine transformations (rotation ±3°, scaling ±25%, horizontal flip).
• Photometric: Random brightness/contrast changes (±80%).
• Noise: Additive Gaussian noise, blur, and simulated compression artifacts.
• Cutout: Random erasure of image patches to simulate occlusions.</p>
        <p>Each image passed through 10–15 transformations during training. All transformations were
label-preserving, meaning masks were transformed in parallel with input images.</p>
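        <p>In production the pipeline above would be expressed with the Albumentations API; the following NumPy-only sketch illustrates just the label-preserving principle: geometric transforms are applied jointly to the image and its mask, while photometric changes touch only the image. The function name and noise magnitude are illustrative assumptions, not the paper's exact configuration.</p>

```python
import numpy as np

def augment(image, mask, rng=None):
    """Label-preserving augmentation sketch (illustrative, not the exact pipeline).

    Geometric ops (here: horizontal flip) are applied to BOTH image and mask;
    photometric ops (brightness/contrast, noise) are applied to the image only.
    """
    rng = rng if rng is not None else np.random.default_rng()
    img = image.astype(np.float32)

    # Geometric: horizontal flip, applied jointly so the mask stays aligned.
    if rng.random() < 0.5:
        img = img[:, ::-1]
        mask = mask[:, ::-1]

    # Photometric: random brightness/contrast (image only, +/-80% as in the text).
    contrast = 1.0 + rng.uniform(-0.8, 0.8)
    brightness = rng.uniform(-0.8, 0.8) * 128.0
    img = img * contrast + brightness

    # Noise: additive Gaussian noise (image only; sigma is an assumed value).
    img = img + rng.normal(0.0, 5.0, size=img.shape)

    return np.clip(img, 0, 255).astype(np.uint8), mask
```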
      </sec>
      <sec id="sec-4-4">
        <title>3.4. Data Splits and Normalization</title>
        <p>Models were trained and evaluated separately on both real and synthetic datasets to assess their
performance under controlled and practical conditions. We split the dataset into 60% training, 15%
validation, and 25% test sets.</p>
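        <p>The 60/15/25 split can be reproduced with a seeded shuffle, sketched below; the seed and helper name are our own illustration and are not taken from the paper.</p>

```python
import random

def split_dataset(items, train=0.60, val=0.15, test=0.25, seed=42):
    """Shuffle deterministically and cut into train/val/test partitions."""
    assert abs(train + val + test - 1.0) < 1e-9
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])
```

        <p>For a 523-image dataset this yields 313 training, 78 validation, and 132 test images (integer truncation puts the remainder into the test set).</p>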
        <p>This data pipeline provided a rich training signal while simulating the diversity encountered in
real manufacturing scenarios, ultimately contributing to the segmentation model’s robustness and
generalization.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Model Architectures and Training Protocols</title>
      <p>In designing our semantic segmentation framework, we prioritize three key criteria: fast training and
inference, support for rapid prototyping through modular design, and strong performance in terms
of convergence and generalization. To meet these goals, we adopt an encoder-decoder architecture
that enables clear separation between feature extraction and spatial reconstruction. This modular
structure facilitates easy substitution of encoder backbones and decoder strategies, allowing for flexible
experimentation and optimization. Additionally, we favor architectures that are computationally
eficient yet capable of learning robust representations from both real and synthetic data, making them
suitable for deployment in industrial inspection scenarios with limited training resources.</p>
      <sec id="sec-5-1">
        <title>4.1. Network Architectures</title>
        <sec id="sec-5-1-1">
          <title>We implemented and benchmarked four model variants:</title>
        </sec>
        <sec id="sec-5-1-2">
          <title>1. U-Net with EficientNetV2-S encoder 2. U-Net with ConvNeXt-Tiny encoder 3. U-Net 3+ with EficientNetV2-S encoder 4. U-Net 3+ with ConvNeXt-Tiny encoder</title>
          <p>All backbones were initialized with ImageNet-1k weights. The decoder was constructed with 2D
convolutions, batch normalization, ReLU activations, and bilinear upsampling blocks. U-Net 3+ variants
included dense skip connections and deep supervision at multiple decoder stages.</p>
          <p>For each dataset combination, two separate models were trained: one for predicting the plug regions
and another for the spacer grid. Each model outputs a probability map corresponding to either the
plug or grid class, with a final softmax layer applied to normalize the pixel-wise logits.</p>
        </sec>
      </sec>
      <sec id="sec-5-2">
        <title>4.2. Loss Functions</title>
        <p>To train the segmentation models, we used a composite loss function that combines Binary
Cross-Entropy (BCE) and Dice loss:</p>
        <p>ℒtotal(y, ŷ) = ℒBCE(y, ŷ) + ℒDice(y, ŷ), (1)
where yᵢ is the ground truth label for the i-th sample and ŷᵢ is the predicted probability for the positive
class.</p>
        <p>Binary Cross-Entropy ensures accurate pixel-wise classification, while Dice loss promotes shape
consistency and robustness to class imbalance. Their combination helps the model capture fine details
and produce coherent segmentation masks.</p>
        <p>The individual components are defined as:
ℒBCE(y, ŷ) = −(1/N) ∑ᵢ [ yᵢ log(ŷᵢ) + (1 − yᵢ) log(1 − ŷᵢ) ], (2)
ℒDice(y, ŷ) = 1 − (2 ∑ᵢ yᵢŷᵢ + ε) / (∑ᵢ yᵢ + ∑ᵢ ŷᵢ + ε). (3)</p>
        <p>We used hyperparameter values of 0.05 and 2.0.</p>
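        <p>For illustration, the composite loss can be sketched in plain NumPy; the paper trains with framework-native implementations, and the smoothing constant below is a common default rather than the paper's reported hyperparameters.</p>

```python
import numpy as np

def bce_loss(y, p, eps=1e-7):
    """Mean binary cross-entropy over all pixels."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

def dice_loss(y, p, smooth=1.0):
    """1 - Dice coefficient; promotes shape consistency and is robust to class imbalance."""
    inter = np.sum(y * p)
    return float(1.0 - (2.0 * inter + smooth) / (np.sum(y) + np.sum(p) + smooth))

def total_loss(y, p):
    """Composite loss: BCE for pixel-wise accuracy plus Dice for mask coherence."""
    return bce_loss(y, p) + dice_loss(y, p)
```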
      </sec>
      <sec id="sec-5-3">
        <title>4.3. Training Settings</title>
        <p>All models were trained using the Adam optimizer with its default hyperparameters: β₁ = 0.9, β₂ = 0.999,
and ε = 1e−7. Training proceeded for a maximum of 100 epochs, with early stopping based on
validation loss employed to prevent overfitting. The Adam optimizer’s adaptive learning rate adjustment
contributed to stable and efficient convergence across all tested architectures and dataset variants, even
in scenarios involving reflective artifacts, class imbalance, or synthetic-to-real domain shifts.</p>
        <p>Batch size was set to 32 and training was performed on an NVIDIA RTX A5000 GPU with 24 GB of
GDDR6 memory. Each model variant required approximately 1–3 hours to train.</p>
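        <p>Early stopping on validation loss can be sketched as a small helper; the patience value is an illustrative assumption, since the paper does not state it.</p>

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```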
      </sec>
      <sec id="sec-5-4">
        <title>4.4. Input Resolution</title>
        <p>Although the original images have a resolution of 720 × 576 pixels, the input resolution was downscaled
using bilinear interpolation to 128×128 pixels due to the computational demands of the chosen backbone
architectures. We are aware that downscaling the images can lead to a loss of precision. Although we
have identified this issue, a satisfactory solution is still under development.</p>
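        <p>To make the cost of downscaling concrete: one pixel of the 128×128 network input covers several pixels of the original 720×576 frame, so sub-pixel localization in network space maps to a multi-pixel uncertainty in the original image. A back-of-the-envelope check follows; only pixel ratios are shown, since the millimetre conversion depends on the unpublished camera geometry.</p>

```python
# Ratio between original frame resolution and network input resolution.
orig_w, orig_h = 720, 576
net_w, net_h = 128, 128

scale_x = orig_w / net_w  # horizontal original pixels per network pixel
scale_y = orig_h / net_h  # vertical original pixels per network pixel

print(scale_x, scale_y)  # 5.625 4.5
```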
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Experiments and Results</title>
      <p>We evaluated our segmentation models on both datasets using standard accuracy metrics, as well
as domain-specific evaluation criteria relevant to industrial quality control. This section presents our
experimental design, benchmark results, and visual analysis.</p>
      <sec id="sec-6-1">
        <title>5.1. Evaluation Metrics</title>
        <sec id="sec-6-1-1">
          <title>We measured segmentation performance using the following metrics:</title>
          <p>• Intersection over Union (IoU) for plug/grid masks
• F1-score- - for plug/grid positions, computed with spatial tolerance thresholds  and  (in
pixels) relative to the original image resolution of 720 × 576.</p>
        </sec>
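        <p>The two metrics can be sketched as follows; the greedy matching used for the position F1 is our own assumption, since the paper does not specify the matching procedure.</p>

```python
import numpy as np

def iou(pred, gt):
    """Intersection over Union of two boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter / union) if union else 1.0

def f1_with_tolerance(pred_pts, gt_pts, tol_x, tol_y):
    """F1-score of predicted positions: a prediction counts as a true positive if
    it falls within (tol_x, tol_y) pixels of a not-yet-matched ground-truth point."""
    matched, tp = set(), 0
    for px, py in pred_pts:
        for i, (gx, gy) in enumerate(gt_pts):
            if i not in matched and abs(px - gx) <= tol_x and abs(py - gy) <= tol_y:
                matched.add(i)
                tp += 1
                break
    fp = len(pred_pts) - tp
    fn = len(gt_pts) - tp
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0
```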
      </sec>
      <sec id="sec-6-2">
        <title>5.2. Model Performance</title>
      </sec>
      <sec id="sec-6-3">
        <title>5.3. Qualitative Results</title>
      </sec>
      <sec id="sec-6-4">
        <title>5.4. Comparison with Legacy Systems and Deployment</title>
        <p>Prior to the adoption of our CNN-based segmentation pipeline, a rule-based image processing system
was used for inspection. This legacy approach relied on edge detection and the periodicity of the fuel
assembly (FA) rods in the image to estimate the rods’ weld positions from pixel intensities.</p>
        <p>Table 6 compares plug detection performance between the legacy system and our best model
(EfficientNet-V2-S + U-Net 3+), showing significant improvements in F1 scores across test cases.</p>
        <p>Figure 5 illustrates improvements in rod height estimation, where the proposed method achieves a
mean error of 0.424 mm compared to 7.094 mm with the legacy system. The standard deviation is also
markedly reduced (0.393 mm vs. 10.567 mm), confirming better robustness.</p>
        <p>While the CNN-based system introduces slightly higher latency, GPU acceleration enables batch
processing without affecting throughput. Known limitations include minor segmentation degradation
near image borders and sensitivity to extreme camera angles. These are mitigated through data
augmentation, synthetic training examples, and postprocessing heuristics. Future work may explore
transformer-based architectures for enhanced global context modeling.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Conclusion and Future Work</title>
      <p>We have presented a complete, data-driven segmentation pipeline for the automated inspection of nuclear
fuel rod end plugs. By leveraging modern deep learning architectures, synthetic data augmentation,
and domain-specific evaluation, our system delivers robust, sub-millimeter plug and grid segmentation
under diverse industrial conditions.</p>
      <p>Among the four tested model variants, the U-Net 3+ decoder with an EfficientNet-V2-S encoder offered the
best trade-off between accuracy and speed. Our pipeline enabled the system to generalize across plug
types, lighting setups, and slight misalignments — a limitation of the previous rule-based framework.
The best model achieved an average positional error of just 0.424 mm, significantly outperforming the
legacy rule-based system (7.094 mm). On the test set, it also improved the F1-score from 61.94 to 96.18
for loose spatial tolerance (F1-8-5), and from 50.80 to 69.85 for stricter tolerance (F1-3-2). These results
confirm that the proposed method meets the sub-millimeter precision required in nuclear inspection
workflows.</p>
      <p>The CNN-based inspection module was successfully tested and has demonstrated stable performance
and real-time processing capabilities. It is well-positioned to serve as a practical enhancement to
the current inspection workflow, significantly increasing the level of automation in video analysis.
Furthermore, the potential for continuous model monitoring and periodic retraining offers a promising
path toward further improving accuracy and adaptability, ultimately supporting a more robust and fully
automated inspection process in the future.</p>
      <sec id="sec-7-1">
        <title>Future Work</title>
        <p>
          Our future research and development efforts will focus on the following directions:
• Exploration of Alternative Segmentation Architectures: Evaluate architectures tailored for
high-precision segmentation on small datasets, such as HRNet [27] and DeepLabv3+ [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. These
models offer strong performance through high-resolution feature preservation and multi-scale
context aggregation.
• Custom Backbone Design: Develop a flexible custom backbone with tunable width (filter
multiplier) and depth (number of layers) parameters. This would allow optimization for larger
input resolutions and a better balance between computational efficiency and segmentation accuracy.
• Enhanced Plug Annotation Protocol: Augment the current center-point plug annotations to
include plug width or bounding boxes, enabling more precise mask generation. Future work may
also explore full contour annotations if labeling cost is justified.
• Cross-Dataset Evaluation: Test the trained models on real-world datasets with different plug
types and assembly geometries to assess generalization performance and identify adaptation
needs.
        </p>
        <p>These research directions aim to systematically improve both the precision and adaptability of the
segmentation system. By enhancing annotation quality, exploring advanced architectures, and
evaluating cross-domain generalization, we intend to develop a more robust, fully automatic, reproducible and
precise solution that can meet the evolving demands of industrial nuclear fuel inspection.</p>
      </sec>
      <sec id="sec-7-2">
        <title>Final Remarks</title>
        <p>Our work demonstrates that modern convolutional neural networks — when combined with
photorealistic synthetic data and targeted augmentations — are not only effective for academic benchmarks
but also mature enough for industrial deployment. The methodology and insights presented here may
serve as a blueprint for future visual inspection systems across high-stakes manufacturing domains.</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>We thank Jaroslav Knotek for the synthetic fuel dataset generator and Marcin Kopeć for domain
expertise. This work was supported by Research Centre Řež, ČEZ a.s., and Charles University. We
also acknowledge the state support of the Technology Agency of the Czech Republic within the National
Competence Centre Programme, project TN02000012 „Center of Advanced Nuclear Technology II“,
which is partially co-financed within the National Recovery Plan from the European Instrument for
Recovery and Resilience.</p>
    </sec>
    <sec id="sec-9">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT-4 for grammar and spelling
checking. After using this tool, the authors reviewed and edited the content as needed and take full
responsibility for the publication’s content.
[6] L.-C. Chen, Y. Zhu, G. Papandreou, F. Schroff, H. Adam, Encoder-decoder with atrous separable
convolution for semantic image segmentation, in: Proceedings of the European conference on
computer vision (ECCV), 2018, pp. 801–818. doi:10.48550/arXiv.1802.02611.
[7] J. Chen, Y. Lu, Q. Yu, X. Luo, E. Adeli, Y. Wang, Y. Zhou, Transunet: Transformers make strong
encoders for medical image segmentation, arXiv preprint arXiv:2102.04306 (2021). doi:10.48550/
arXiv.2102.04306.
[8] E. Xie, W. Wang, Z. Yu, A. Anandkumar, J. M. Alvarez, P. Luo, Segformer: Simple and efficient
design for semantic segmentation with transformers, arXiv preprint arXiv:2105.15203 (2021).
doi:10.48550/arXiv.2105.15203.
[9] B. Cheng, A. Schwing, A. Kirillov, Masked-attention mask transformer for universal image
segmentation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
Recognition, 2022, pp. 1290–1299. doi:10.48550/arXiv.2112.01527.
[10] A. Paszke, A. Chaurasia, S. Kim, E. Culurciello, Enet: A deep neural network architecture for
real-time semantic segmentation, arXiv preprint arXiv:1606.02147 (2016). doi:10.48550/arXiv.
1606.02147.
[11] C. Yu, J. Wang, C. Peng, C. Gao, G. Yu, N. Sang, Bisenet: Bilateral segmentation network for
real-time semantic segmentation, in: Proceedings of the European Conference on Computer
Vision (ECCV), 2018, pp. 334–349. doi:10.48550/arXiv.1808.00897.
[12] R. P. K. Poudel, S. Liwicki, R. Cipolla, Fast-scnn: Fast semantic segmentation network, arXiv
preprint arXiv:1902.04502 (2019). doi:10.48550/arXiv.1902.04502.
[13] K. He, H. Fan, Y. Wu, S. Xie, R. Girshick, Momentum contrast for unsupervised visual
representation learning, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
Recognition, 2020, pp. 9729–9738. doi:10.48550/arXiv.1911.05722.
[14] M. Tan, Q. V. Le, Efficientnet: Rethinking model scaling for convolutional neural networks,
CoRR abs/1905.11946 (2019). URL: http://arxiv.org/abs/1905.11946. arXiv:1905.11946, [Accessed:
2025-04-28].
[15] Z. Liu, H. Mao, C.-Y. Wu, C. Feichtenhofer, T. Darrell, S. Xie, A convnet for the 2020s, 2022. URL:
https://arxiv.org/abs/2201.03545. arXiv:2201.03545, [Accessed: 2025-04-28].
[16] J. Knotek, J. Blažek, M. Kopeć, Simulating nuclear fuel inspections: Enhancing
reliability through synthetic data, Nuclear Engineering and Technology 57 (2025) 103571. URL:
https://www.sciencedirect.com/science/article/pii/S1738573325001391. doi:https://doi.org/10.
1016/j.net.2025.103571, [Accessed: 2025-04-28].
[17] S. R. Richter, V. Vineet, S. Roth, V. Koltun, Playing for data: Ground truth from computer
games, in: European Conference on Computer Vision (ECCV), Springer, 2016, pp. 102–118.
doi:10.1007/978-3-319-46475-6_7.
[18] J. Tremblay, T. To, S. Birchfield, Falling things: A synthetic dataset for 3d object detection and
pose estimation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
Recognition Workshops (CVPRW), 2018, pp. 2038–2041. doi:10.1109/CVPRW.2018.00257.
[19] J. Tobin, R. Fong, A. Ray, J. Schneider, W. Zaremba, P. Abbeel, Domain randomization for
transferring deep neural networks from simulation to the real world, in: IEEE/RSJ International Conference
on Intelligent Robots and Systems (IROS), 2017, pp. 23–30. doi:10.1109/IROS.2017.8202133.
[20] A. Kar, C. Häne, J. Malik, Meta-sim: Learning to generate synthetic datasets, in: Proceedings
of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 4551–4560.
doi:10.1109/ICCV.2019.00465.
[21] S. Zheng, T. Xiao, Z. Li, H. Yang, Y. Song, P. Luo, Unbiased synthetic data generation via texture
disentanglement for semantic segmentation, in: Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition (CVPR), 2022, pp. 15765–15775. doi:10.1109/CVPR52688.
2022.01536.
[22] Y. Shi, X. Xue, Y. Qu, J. Xue, L. Zhang, Machine learning and deep learning methods used in
safety management of nuclear power plants: A survey, in: 2021 International Conference on Data
Mining Workshops (ICDMW), 2021, pp. 917–924. doi:10.1109/ICDMW53433.2021.00120.
[23] Y. Chang, W. Wang, A deep learning-based weld defect classification method using radiographic
images with a cylindrical projection, IEEE Transactions on Instrumentation and Measurement 70
(2021) 1–11. doi:10.1109/TIM.2021.3124053.
[24] J. Liu, J. Liu, D. Yu, M. Kang, W. Yan, Z. Wang, M. G. Pecht, Fault detection for gas turbine hot
components based on a convolutional neural network, Energies 11 (2018). URL: https://www.mdpi.
com/1996-1073/11/8/2149. doi:10.3390/en11082149.
[25] L. Zhou, X. Ling, S. Zhu, Z. Sun, J. Yang, A self-supervised learning self-attention based
method for defect classification on PCB surface images, in: 2021 2nd International Conference
on Electronics, Communications and Information Technology (CECIT), 2021, pp. 229–234. doi:10.
1109/CECIT53797.2021.00047.
[26] A. Buslaev, V. I. Iglovikov, E. Khvedchenya, A. Parinov, M. Druzhinin, A. A. Kalinin, Albumentations:
Fast and flexible image augmentations, Information 11 (2020) 125. URL: http://dx.doi.org/10.3390/
info11020125. doi:10.3390/info11020125.
[27] J. Wang, K. Sun, T. Cheng, B. Jiang, C. Deng, Y. Zhao, D. Liu, Y. Mu, M. Tan, X. Wang, W. Liu,
B. Xiao, Deep high-resolution representation learning for visual recognition, 2020. URL:
https://arxiv.org/abs/1908.07919. arXiv:1908.07919, [Accessed: 2025-04-28].</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] U.S. Energy Information Administration,
          <article-title>The nuclear fuel cycle</article-title>
          , https://www.eia.gov/energyexplained/nuclear/the-nuclear-fuel-cycle.php,
          <year>2025</year>
          . [Accessed: 2025-06-24].
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>V.</given-names>
            <surname>Bláha</surname>
          </string-name>
          ,
          <article-title>Segmentation of Fuel Rod End Plugs</article-title>
          ,
          <source>Master's thesis</source>
          , Charles University, Department of Software and Computer Science Education, Prague, Czech Republic,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>O.</given-names>
            <surname>Ronneberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Fischer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Brox</surname>
          </string-name>
          ,
          <article-title>U-Net: Convolutional networks for biomedical image segmentation</article-title>
          ,
          <year>2015</year>
          . URL: https://arxiv.org/abs/1505.04597. arXiv:1505.04597, [Accessed: 2025-04-28].
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jia</surname>
          </string-name>
          ,
          <article-title>Pyramid scene parsing network</article-title>
          ,
          <year>2017</year>
          . URL: https://arxiv.org/abs/1612.01105. arXiv:1612.01105.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.-C.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Papandreou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schroff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Adam</surname>
          </string-name>
          ,
          <article-title>Encoder-decoder with atrous separable convolution for semantic image segmentation</article-title>
          ,
          <year>2018</year>
          . URL: https://arxiv.org/abs/1802.02611. arXiv:1802.02611, [Accessed: 2025-04-28].
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>