<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Deep Learning Algorithms for Fragmented Solid Objects Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pasquale Santaniello</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valerio Ponzi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roberta Avanzato</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Electrical, Electronics and Computer Engineering, University of Catania</institution>
          ,
          <addr-line>Catania</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute for Systems Analysis and Computer Science, Italian National Research Council</institution>
          ,
          <addr-line>Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>51</fpage>
      <lpage>58</lpage>
      <abstract>
        <p>This paper presents a series of experiments conducted on an artificially generated dataset of 3D shapes, specifically focusing on fragments of larger objects. Developed in collaboration with the Ente Parco Archeologico dei Fori Imperiali, this research aims to create a system capable of recognizing and classifying objects from their fragments, with a particular emphasis on ancient artifacts. To achieve this objective, we explore various methods for classifying fragmented objects, identifying the most effective approach for this specific task. Although there are multiple techniques for 3D shape classification, this study centers on the PointNet network, which directly processes 3D point cloud data. This method is not only computationally efficient but also well-suited for handling irregular and unordered data structures, making it particularly advantageous over traditional techniques. Furthermore, we investigate the impact of data augmentation and noise injection strategies to enhance the model's robustness. A comparative analysis with state-of-the-art architectures is also provided. Finally, we present the trained models developed on our artificial dataset, demonstrating classification performance on par with the best existing solutions, and highlighting the potential of our approach in the domain of fragmented object recognition.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Fragmented solid objects are prevalent across various domains in real-world scenarios, ranging from archaeological remains and geological samples to industrial debris analysis. The task of classifying these fragments, identifying whether they originate from external surfaces or internal structures, or associating them with their original object classes, is important for applications such as artifact restoration, digital reconstruction, and quality control.</p>
      <p>Unlike complete 3D models, fragments present partial, often noisy representations of the original objects. Surface degradation, random break patterns, and loss of significant geometric features introduce severe challenges in analyzing fragmented solids. Furthermore, fragments may exhibit highly irregular shapes, missing structural continuity, and limited discriminative features, making conventional 3D classification techniques inadequate.</p>
      <p>In practical applications, especially in cultural heritage preservation, the ability to automatically analyze and classify fragments could significantly accelerate the processes of cataloging, reassembly, and reconstruction of historical artifacts. However, building robust models for fragment classification requires addressing several inherent difficulties: the unordered and sparse nature of 3D data, variations in fragment size and orientation, and the need for generalization across different fragmentation patterns.</p>
      <p>This study aims to tackle these challenges by developing a deep learning-based framework capable of classifying fragments of solid objects based on their geometric properties. By simulating fragmentation processes on synthetic objects and introducing variability in fragment shapes and surfaces, we create a controlled yet challenging environment to evaluate the performance of learning algorithms. The ultimate goal is to bridge the gap between synthetic experiments and real-world applications, providing tools that can assist domain experts in reconstructing fragmented objects from incomplete information.</p>
      <p>SYSTEM 2025: 11th Sapienza Yearly Symposium of Technology, Engineering and Mathematics, Rome, June 4-6, 2025. ponzi@iasi.cnr.it (V. Ponzi); roberta.avanzato@unict.it (R. Avanzato). © 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
    </sec>
    <sec id="sec-1-1">
      <title>2. Related Works</title>
      <p>The classification of 3D shapes has been extensively studied, leading to the development of several deep learning architectures designed to process different representations of three-dimensional data. The most widely explored approaches include volumetric convolutional neural networks (volumetric CNNs) [<xref ref-type="bibr" rid="ref1">1</xref>], multi-view CNNs [<xref ref-type="bibr" rid="ref2">2</xref>], spectral CNNs [<xref ref-type="bibr" rid="ref3">3</xref>], and point-based methods such as PointNet [<xref ref-type="bibr" rid="ref4">4</xref>], as well as many other CNN-based approaches [<xref ref-type="bibr" rid="ref10 ref11 ref5 ref6 ref7 ref8 ref9">5, 6, 7, 8, 9, 10, 11</xref>].</p>
      <p>Early attempts at 3D shape classification relied on volumetric CNNs, which operate on voxelized representations of objects [<xref ref-type="bibr" rid="ref12">12</xref>]. These methods encode a 3D shape as a binary or real-valued 3D tensor and apply 3D convolutions to extract features. Although effective, volumetric approaches suffer from high memory consumption and computational complexity due to the sparsity of voxel grids, making them impractical for high-resolution models. More recent refinements, such as Vote3D [<xref ref-type="bibr" rid="ref13">13</xref>], attempt to mitigate these issues by improving the handling of sparse volumetric data, but overall efficiency remains a challenge.</p>
      <p>Multi-view CNNs offer a different solution by rendering 3D objects into multiple 2D images from various viewpoints. These images are then processed using standard 2D convolutional networks, exploiting well-established techniques from image classification. This approach has achieved state-of-the-art results on certain benchmarks, benefiting from the high performance of 2D CNNs. However, multi-view methods introduce a trade-off: while they capture shape details effectively, they rely on predefined viewpoints, leading to potential information loss and requiring an extensive number of views to achieve robust performance.</p>
      <p>An alternative approach is provided by spectral CNNs, which operate directly on 3D meshes by applying the Fourier transform [<xref ref-type="bibr" rid="ref14">14</xref>]. This enables classification in the frequency domain, offering advantages such as spectral pooling, which efficiently reduces dimensionality while preserving important structural information. Nevertheless, spectral CNNs are primarily designed for manifold-based representations, making their application to non-isometric shapes more challenging.</p>
      <p>A more direct and flexible solution is to learn from raw 3D point clouds [<xref ref-type="bibr" rid="ref15">15</xref>], which represent objects as unordered sets of points in space: P = {p_i | i = 1, ..., n}, with p_i ∈ R^3. Each point p_i encodes spatial coordinates and, optionally, additional attributes such as color or surface normals.</p>
      <p>PointNet represents a major breakthrough in this field, as it directly processes point cloud data without the need for intermediate projections or transformations. It is designed to be invariant to the ordering of points and effectively captures both local and global features through a symmetric function and a max-pooling aggregation step. Thanks to these properties, PointNet offers a highly efficient and accurate solution for 3D shape classification and segmentation, outperforming traditional approaches in both speed and performance.</p>
      <p>Another advantage of PointNet lies in its computational efficiency. Table 1 compares the number of parameters and floating-point operations (FLOPs) per sample for different architectures.</p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption><p>Comparison of computational efficiency for deep architectures in 3D shape classification.</p></caption>
        <table>
          <thead><tr><th>Architecture</th><th>Parameters (M)</th><th>FLOPs/Sample (M)</th></tr></thead>
          <tbody>
            <tr><td>PointNet</td><td>3.5</td><td>148</td></tr>
            <tr><td>SubVolume</td><td>16.6</td><td>3633</td></tr>
            <tr><td>Multi-View CNN</td><td>60.0</td><td>62057</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>From this analysis, it is evident that PointNet offers a favorable balance between accuracy and efficiency, making it a strong candidate for real-time applications and large-scale 3D data processing. While multi-view CNNs remain highly accurate, their computational complexity and memory requirements limit their practicality. Meanwhile, volumetric and spectral methods struggle with scalability, particularly for high-resolution models.</p>
      <p>PointNet++ [<xref ref-type="bibr" rid="ref16">16</xref>] extends the original architecture by incorporating hierarchical feature learning. By applying PointNet recursively on progressively refined subsets of points, PointNet++ captures local dependencies while preserving the efficiency and flexibility of the original model. This enhancement significantly improves performance on tasks such as object segmentation and classification. Recent works have also explored the flexibility of graph-based architectures for 3D data, such as Graph Neural Networks (GNNs) [<xref ref-type="bibr" rid="ref17 ref18">17, 18</xref>].</p>
      <p>The next section provides a detailed overview of the PointNet architecture, discussing its main components and design principles in the context of fragmented object classification.</p>
    </sec>
    <sec id="sec-1-2">
      <title>3. PointNet Architecture</title>
      <p>Before explaining why PointNet was chosen as the framework for our classification task, it is essential to examine the properties of point clouds, which differ significantly from other 3D representations such as voxel grids or meshes. A point cloud is defined as an unordered set of 3D points in R^3, where each point is defined by its spatial coordinates and, in some cases, additional attributes such as color or surface normals. Unlike structured data formats such as images or voxel grids, point clouds lack an inherent spatial arrangement, which introduces unique challenges for deep learning models. One of the most significant properties of point clouds is their unordered nature: since there is no predefined structure, a model designed to process point clouds must be permutation-invariant, meaning that different orderings of the same set of points should yield the same result. Another main aspect is the interplay between local and global dependencies. The geometric structure of a point cloud is dictated by the spatial relationships among its points, so a well-designed model must capture both fine-grained local details and broader global structures to accurately interpret the data. In addition, an effective classification model must exhibit invariance to rigid transformations such as rotations and translations: regardless of how an object is placed in space, its fundamental characteristics should remain recognizable, ensuring reliable classification across different orientations. These properties highlight the complexity of working with point clouds and the importance of designing specialized deep learning architectures that can effectively handle their unique characteristics.</p>
      <p>PointNet was chosen for our classification task because of its ability to directly process raw point-cloud data while preserving the main properties described above. It introduces an innovative approach that avoids the computational overhead associated with volumetric and multi-view methods. The network consists of three key components: a symmetric function for permutation invariance, a hierarchical feature aggregation mechanism, and a transformation network for spatial alignment. Since the order of points in a point cloud is arbitrary, the network aggregates information through a symmetric operation. Specifically, it approximates a general function f defined over a set of points as:</p>
      <p>f({x_1, ..., x_n}) ≈ g(h(x_1), ..., h(x_n)) (1)</p>
      <p>where h : R^3 → R^K represents a feature transformation that maps each point to a higher-dimensional space, while g : R^K × ... × R^K → R is a symmetric function, typically implemented as a max-pooling operation. This approach allows the network to extract meaningful point-wise features before aggregating them into a global descriptor, preserving permutation invariance.</p>
      <p>After extracting point-wise features, a global max-pooling operation is applied to condense the most relevant information into a fixed-length vector. This step enables the network to generalize across different point clouds, making it particularly effective for classification and segmentation tasks. By focusing on the most salient features, the model can recognize geometric structures regardless of the input point order.</p>
      <p>To further enhance robustness, PointNet incorporates a joint alignment mechanism that addresses the challenge of arbitrary geometric transformations in raw point cloud data. The network includes a transformation module that learns an affine transformation matrix and applies it to align input points, ensuring invariance to translation and rotation. This alignment improves the consistency of point cloud representations. To stabilize training, a regularization term is introduced to enforce the transformation matrix to be close to an orthogonal matrix:</p>
      <p>L_reg = ‖I − AAᵀ‖² (2)</p>
      <p>where A is the learned transformation matrix. This regularization helps maintain transformation stability, preventing unwanted distortions in aligned point clouds.</p>
      <p>Given these properties, PointNet is particularly well suited for our task of classifying 3D shape fragments. The next section details the dataset used in our experiments.</p>
    </sec>
    <sec id="sec-1-3">
      <title>4. Dataset</title>
      <p>One of the main challenges in this research was the lack of publicly available datasets for fragmented 3D objects. To address this, we generated an artificial dataset consisting of 3D objects created using the open-source software Blender [<xref ref-type="bibr" rid="ref19">19</xref>]. This dataset was designed to simulate the problem of classifying object fragments, a key step in reconstructing partial shapes.</p>
      <p>The dataset comprises three main object categories: spheres, cubes, and icospheres. Each object was systematically fragmented using a Python script within the Blender environment. After fragmentation, each piece was exported in STL format and labeled as either external or internal. External fragments contain portions of the object's outer surface, while internal fragments originate from the object's core. To better approximate real-world conditions, random surface deformations were applied to external fragments, simulating natural aging effects. Consequently, internal fragments retained smooth surfaces, while external fragments exhibited rough and irregular patterns. Fragmentation was carried out using two distinct cutting strategies. The first approach, regular cutting, involved dividing the object at fixed intervals along the x, y, and z axes, resulting in uniformly shaped fragments. The second approach, random cutting, introduced variability in the division intervals, producing fragments of irregular sizes.</p>
      <p>The dataset was structured into three different productions. The first production combined both regular and random cutting methods, yielding a total of 5,461 fragments. Among these, 1,130 were classified as internal fragments, with 564 originating from regular cuts and 566 from random cuts. The remaining 4,331 were labeled as external fragments, consisting of 2,164 generated through regular cuts and 2,167 through random cuts. The second production exclusively employed the random cutting method, generating a total of 7,259 fragments. Of these, 2,093 were categorized as internal fragments, while 5,166 were classified as external. Lastly, the Auriga fragmentation was conducted using a real object, the Auriga, which was fragmented specifically for testing purposes. This process produced only nine fragments, all of which were labeled as external.</p>
      <p>A challenge in this dataset is the imbalance between external and internal fragments, with external fragments being significantly more numerous. This imbalance could negatively impact model training, leading to biased predictions. To mitigate this, we applied data augmentation techniques [<xref ref-type="bibr" rid="ref20 ref21">20, 21</xref>], as detailed in the following section.</p>
      <table-wrap id="tab2">
        <label>Table 2</label>
        <caption><p>Dataset productions and oversampling factors used during training.</p></caption>
        <table>
          <thead><tr><th>Production ID</th><th>Oversampling Factor</th></tr></thead>
          <tbody>
            <tr><td>3</td><td>2</td></tr>
            <tr><td>4</td><td>3</td></tr>
            <tr><td>5</td><td>4</td></tr>
            <tr><td>6</td><td>1</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
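The symmetric-function formulation in Eq. (1) can be illustrated in a few lines of NumPy. This is a minimal sketch, not the authors' implementation: the shared per-point map h is a fixed random linear layer with ReLU (the weights `W`, `b` and the feature dimension K = 16 are hypothetical), and the symmetric function g is a max-pool over points, which is what makes the global descriptor independent of point order.

```python
import numpy as np

def pointnet_global_descriptor(points, W, b):
    """Toy PointNet-style descriptor: shared per-point map h, then max-pool g.

    points : (n, 3) array of unordered 3D points.
    W, b   : parameters of the shared linear layer h (hypothetical, fixed here).
    """
    # h: map each point independently to a higher-dimensional feature space
    features = np.maximum(points @ W + b, 0.0)   # shape (n, K), ReLU
    # g: symmetric aggregation (max over points) -> permutation-invariant
    return features.max(axis=0)                  # shape (K,)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 16))        # assumed feature dimension K = 16
b = rng.normal(size=16)

cloud = rng.normal(size=(128, 3))   # one synthetic point cloud
shuffled = rng.permutation(cloud)   # same point set, different ordering

d1 = pointnet_global_descriptor(cloud, W, b)
d2 = pointnet_global_descriptor(shuffled, W, b)
assert np.allclose(d1, d2)          # max-pooling ignores point order
```

Because g is applied element-wise over the point axis, any reordering of the rows of `cloud` produces the same descriptor, which is exactly the permutation invariance required of point-cloud models.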
    <sec id="sec-2">
      <title>5. Model</title>
      <p>The classification network takes as input a set of n points, applies input and feature transformations, and aggregates point features through a max-pooling operation. The final output consists of classification scores over k classes.</p>
      <p>The architecture includes two transformation networks. The first is a mini-PointNet that processes the raw point cloud and predicts a 3 × 3 transformation matrix. It consists of a shared multilayer perceptron (MLP) applied independently to each point, with output sizes 64, 128, and 1024, followed by a global max-pooling operation across all points and two fully connected layers with 512 and 256 units, respectively. The resulting matrix is initialized as the identity matrix. Except for the final layer, all layers use ReLU activations and batch normalization.</p>
      <p>The second transformation network follows a similar structure but outputs a 64 × 64 transformation matrix, also initialized as the identity. A regularization term, weighted by 0.001, is added to the softmax classification loss to encourage this matrix to remain close to orthogonal.</p>
      <p>Following the transformation steps, a global max-pooling layer aggregates the point-wise features into a compact global descriptor. This descriptor is then passed through an MLP consisting of three fully connected layers with 512, 256, and 2 units, respectively, for binary classification. In the case of multi-class classification, the final dense layer is modified to output 3 units instead of 2.</p>
      <p>Dropout with a keep ratio of 0.3 is applied after the second-to-last fully connected layer (dimension 256). This additional dropout layer was added because it empirically improved model performance. The network is trained using the Adam optimizer with an initial learning rate of 0.001, momentum 0.9, and a batch size of 32.</p>
      <p>Several models were developed during the experimental phase. The main differences between them lie in the data augmentation techniques applied prior to training. For binary classification, the best-performing model is referred to as Model 2 in the results. Model 3, Model 4, and Model 5 share the same architecture but were trained on different sample compositions. For the multi-class classification task, a slightly different model was used, in which only the final MLP layer differs, adapting the output dimension to the three target classes.</p>
    </sec>
    <sec id="sec-2-1">
      <title>6. Data Augmentation Techniques</title>
      <p>To address dataset imbalance and improve model generalization, we applied several data preprocessing and augmentation strategies, including oversampling, noise injection, random shuffling, and dataset mixing. First, we applied oversampling, replicating samples from the minority class (internal fragments) to balance the dataset; we experimented with different replication factors (k = 2, 3, 4) to assess their impact on model performance. Another strategy was random data shuffling, in which we shuffled the dataset after each training epoch. This minimized the variance of the batch composition and reduced the risk of overfitting, improving the generalization of the model.</p>
      <p>To further enhance the robustness of the model, we introduced noise injection, adding small random perturbations to the coordinates of the points. This technique prevented the model from memorizing exact spatial patterns, thereby improving its ability to generalize to unseen data. Lastly, we used dataset mixing, combining fragments from different dataset productions. This increased data variability and allowed us to test the model's ability to adapt to different fragmentation strategies. The final dataset configurations used in training are summarized in Table 2.</p>
      <p>Similar strategies to enhance robustness and generalization through data transformation and noise-based techniques have been successfully applied in other domains, such as EEG signal processing and affective computing, including the use of GAN-based denoising [<xref ref-type="bibr" rid="ref22">22</xref>], CycleGAN for cross-domain adaptation [<xref ref-type="bibr" rid="ref23">23</xref>], and Transformer architectures for sentiment classification [<xref ref-type="bibr" rid="ref24">24</xref>].</p>
    </sec>
    <sec id="sec-2-2">
      <title>7. Noise Addition and Tolerance</title>
      <p>Once we had obtained a well-performing model for classifying the dataset, thanks to the data augmentation techniques, we investigated how much noise (points jittered on all three axes) could be injected without changing the performance of the network, and whether adding noise increased or decreased its generalization capability. This procedure is also called noise injection: 'noise' is artificially added to the network's input data during the training process. Jitter is one particular way to implement noise injection, in which a noise vector is added to each training case between training iterations. This causes the training data to 'jitter' in the feature space during training, making it difficult for the network to find a solution that fits the original training dataset precisely, and thereby reducing overfitting.</p>
      <p>We inject noise into the dataset by drawing from a uniform random distribution applied to each training sample, varying the lower and upper bounds of the distribution to check tolerance and generalization capabilities. We tried different levels of jitter to understand when the model would collapse, testing uniform distribution ranges from (-0.005, +0.005) to (-0.100, +0.100). This last interval can be considered the breaking point of our model's performance, as we will show in the results section. At testing time, we instead ran experiments to check which level of jitter injection would allow the network to generalize better when classifying a different dataset. The noise ranges tested are listed in Table 3.</p>
      <table-wrap id="tab3">
        <label>Table 3</label>
        <caption><p>Noise ranges tested and the corresponding change in performance.</p></caption>
        <table>
          <thead><tr><th>Noise Range</th><th>Change of performance</th></tr></thead>
          <tbody>
            <tr><td>(-0.005, +0.005)</td><td>Default case</td></tr>
            <tr><td>(-0.010, +0.010)</td><td>Slow change of performance</td></tr>
            <tr><td>(-0.050, +0.050)</td><td>Slow change of performance</td></tr>
            <tr><td>(-0.100, +0.100)</td><td>High drop of performance</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>Then, to test which range of noise increases the generalization capability of the network, we created a new dataset, very different from the previous one, especially with respect to the minority class (the dataset was balanced by adding random fragments). Noise in the range of (-0.010, +0.010) makes our model (trained on the first dataset, different from the new one) achieve the same performance as a classifier trained directly on the new dataset (see the Results section for a detailed explanation). This likely means that this range of noise does not degrade performance on the original dataset while maintaining good generalization power on different datasets. All the results are shown in the Results section with the corresponding confusion matrices.</p>
    </sec>
    <sec id="sec-3">
      <title>8. Extension to Multi-Class Classification</title>
      <p>One thing we noticed during the implementation of our model concerns how the dataset was composed. As described in the Dataset section, the dataset used for training and testing the model was obtained by fragmenting objects from three classes: Sphere, Icosphere, and Cube. Considering that this kind of algorithm could ultimately be used to 'reconstruct' objects from their fragments, i.e. a 3D puzzle, we decided that a further step toward this goal was to implement multi-class classification instead of binary classification, deducing the 'family' of objects each fragment comes from. To do this, we first slightly modified the dataset composition and the related label extraction, and then modified the model so that it could predict over the three mentioned classes: Sphere, Icosphere, and Cube.</p>
      <p>Before describing how training and test samples are obtained for this classification task, we note that during our experiments the 'internal' fragments turned out to be very similar to one another, even across different starting classes, so using these fragments for the new classification task was too challenging for our application. We therefore created two productions of samples made entirely of 'external' fragments and used them to train and test the new model. Production 1 consists of all the external fragments of the first production described in the Dataset section, yielding 4,331 external fragments. For Production 2, we increased the number of training samples by adding all the 'external' fragments of the second production, obtaining 9,497 'external' fragments.</p>
      <p>Labels: having created a dataset made only of 'external' fragments, since they are the most 'discriminative' ones, we had to label each fragment with the original class from which it was fragmented. This was straightforward, since the name of each fragment exported from Blender contains the initial object it came from, so labels could be retrieved simply by processing the name string. The model used for prediction is very similar to the one described in the Model section; the only change is the last dense layer of the MLP, which has 3 units instead of 2.</p>
      <p>One possible way to further extend the task of classifying fragments into the objects they came from would be to introduce new families of objects that differ slightly from the existing ones, or to apply data augmentation that makes the model better able to distinguish between Sphere and Icosphere during classification.</p>
    </sec>
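The jitter scheme described above, uniform noise drawn independently for every coordinate with configurable bounds, can be sketched as follows. The bounds are the ones quoted in the text; the function name and the sample size of 1024 points are illustrative assumptions, not the authors' code.

```python
import numpy as np

def jitter_point_cloud(points, bound, rng):
    """Add uniform per-coordinate noise in [-bound, +bound] to a point cloud.

    points : (n, 3) array; bound : jitter half-range (hypothetical API).
    """
    noise = rng.uniform(-bound, bound, size=points.shape)
    return points + noise

rng = np.random.default_rng(42)
cloud = rng.normal(size=(1024, 3))      # one training sample (assumed 1024 points)

# Ranges explored in the text, from the default case to the breaking point.
for bound in (0.005, 0.010, 0.050, 0.100):
    jittered = jitter_point_cloud(cloud, bound, rng)
    assert jittered.shape == cloud.shape
    assert np.abs(jittered - cloud).max() <= bound   # noise stays within bounds
```

Applying a fresh draw of this noise at every training epoch is what makes the data 'jitter' in feature space, so the network never sees exactly the same coordinates twice.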
    <sec id="sec-4">
      <title>9. Results</title>
      <p>’break point’ of the net in which the noise is too much
and performances of the net will go down steeply,and
In this section, we present all the results obtained during also in retrieving the range of noise that will grow up
our experiments, providing comparisons and highlight- generalization performances of our network. We have
ing the most efective techniques for classifying 3D object to say that all these experiments are performed on the
fragments. We begin by reporting the accuracy achieved best performing model discovered until now , the one
by our model without any data augmentation, using stan- that reached 0.9222 of accuracy. In this table we show
dard noise injection. The dataset used for training is Pro- the comparison between various range of noise applied
duction 1, while the test set consists of a random subset to the best performing model and the ’break point’. In
of the training data. As a starting point for our experi- this case the test set is obtained by a percentage of the
ments, we compare the performance of models trained Training set.
on samples converted using Open3D and Trimesh.</p>
      <p>As the next step, we compared models trained with
diferent conversion libraries, varying sampling densities,
and applying oversampling as a data augmentation
technique. The table below summarizes the configurations and corresponding accuracies achieved on the Production 1 dataset. Our goal was to train a competitive model capable of reaching accuracy levels comparable to more recent architectures, such as PointNet++, solely by balancing the dataset through oversampling.</p>
      <table-wrap id="tab5">
        <label>Table 5</label>
        <caption>
          <p>Impact of conversion library, sampling density, and oversampling factor on model performance (Production 1 dataset).</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th>Conversion Library</th>
              <th>Sampling Points</th>
              <th>Oversampling Factor</th>
              <th>Accuracy</th>
            </tr>
          </thead>
          <tbody>
            <tr><td>Trimesh</td><td>1024</td><td>3</td><td>89.26%</td></tr>
            <tr><td>Trimesh</td><td>2048</td><td>3</td><td>88.74%</td></tr>
            <tr><td>Open3D</td><td>1024</td><td>3</td><td>92.22%</td></tr>
            <tr><td>Open3D</td><td>2048</td><td>3</td><td>89.90%</td></tr>
            <tr><td>Trimesh</td><td>1024</td><td>2</td><td>87.57%</td></tr>
            <tr><td>Trimesh</td><td>1024</td><td>4</td><td>83.12%</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>Thanks to these techniques, we obtained a model that significantly improved its ability to classify fragments, including internal ones, compared to the initial production. Moving forward, we focus on identifying best practices for preventing overfitting. Before doing so, it is important to highlight an interesting observation made during testing: after training a model with samples converted using either Trimesh or Open3D, once the model has converged, the choice of conversion library for the test set no longer affects performance, as both conversion methods yield identical results at test time. In practice, while the choice of conversion library can impact the model's performance during training, it does not influence the outcomes during testing.</p>
    </sec>
    <sec id="sec-5">
      <title>Noise injection</title>
      <p>We now turn to the noise-injection experiments. As observed, introducing noise within the range of (-0.100, +0.100) leads to a significant drop in performance, which we identify as the "breaking point" for noise injection in our application. Beyond this threshold, the model's classification accuracy deteriorates considerably. The next objective is to determine the optimal level of noise injection to improve generalization while avoiding overfitting. To this end, we exploit a different dataset, referred to as Production Final. Production Final differs from Production 1 mainly in the composition of internal fragments, as it aggregates internal samples from both the first and second productions, resulting in a more balanced dataset. To assess the impact of noise injection, we first train a new PointNet model from scratch on Production Final, treating it as a reference model. We then evaluate the best-performing model previously obtained (trained on Production 1) by applying various levels of noise injection, testing it on a subset of Production Final. By carefully analyzing the confusion matrices, we aim to identify the level of noise that enables the previously trained model to behave similarly to the new baseline model trained directly on Production Final. If a model trained with noise augmentation achieves classification patterns similar to the baseline, we can reasonably conclude that noise injection at that level enhances the model's generalization ability. In the following, we present the accuracy of the model trained from scratch on Production Final.</p>
      <sec id="sec-5-1">
        <title>Evaluation on Production Final</title>
        <p>We now test the best-performing model (model2), trained with different levels of jitter, on the same test set drawn from Production Final, since we want to see which model behaves most similarly to the baseline model trained directly on that dataset.</p>
        <p>[Table: accuracy of the baseline models trained from scratch on Production Final, using Open3D and Trimesh conversion.]</p>
        <p>We observe that the best-performing model, trained with a jitter range of (-0.010, +0.010), is able to classify samples from the new dataset almost as effectively as a model trained directly on it. Since the main difference between the two datasets lies in the internal fragments, which, according to the confusion matrices, are classified similarly by both models, we can conclude that injecting noise within this range slightly reduces performance on the original dataset but enhances the model's generalization capabilities. Regarding the binary classification task (internal vs. external fragments), the final evaluation concerns the predictions made by the best-performing model on the fragments of a real object (Auriga), as introduced in Section 2.</p>
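        <p>The similarity between the confusion matrices of two models, which we assess by inspection, could also be quantified. The helper below is a hypothetical sketch; the function names and the mean-absolute-difference metric are illustrative assumptions, not the procedure used in the experiments.</p>
        <preformat>
```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    # Counts: rows index the true class, columns the predicted class.
    m = np.zeros((n_classes, n_classes))
    np.add.at(m, (y_true, y_pred), 1.0)
    return m

def confusion_distance(y_true, pred_a, pred_b, n_classes):
    # Mean absolute difference between the row-normalized confusion
    # matrices of two models on the same test set: 0 means the two
    # models behave identically on every class.
    a = confusion_matrix(y_true, pred_a, n_classes)
    b = confusion_matrix(y_true, pred_b, n_classes)
    a /= np.maximum(a.sum(axis=1, keepdims=True), 1.0)
    b /= np.maximum(b.sum(axis=1, keepdims=True), 1.0)
    return np.abs(a - b).mean()

# Example with two classes (internal = 0, external = 1).
y_true = np.array([0, 0, 1, 1, 1])
model_a = np.array([0, 1, 1, 1, 1])
d_same = confusion_distance(y_true, model_a, model_a, 2)
```
        </preformat>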
      </sec>
      <sec id="sec-5-2">
        <title>Auriga’s fragments</title>
        <table-wrap id="tab-auriga">
          <caption>
            <p>Labels and predictions of the best-performing model on the nine fragments of the Auriga.</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th>Auriga’s fragment</th>
                <th>Label</th>
                <th>Prediction</th>
              </tr>
            </thead>
            <tbody>
              <tr><td>Fragment n.1</td><td>External</td><td>External</td></tr>
              <tr><td>Fragment n.2</td><td>External</td><td>External</td></tr>
              <tr><td>Fragment n.3</td><td>External</td><td>External</td></tr>
              <tr><td>Fragment n.4</td><td>External</td><td>External</td></tr>
              <tr><td>Fragment n.5</td><td>External</td><td>External</td></tr>
              <tr><td>Fragment n.6</td><td>External</td><td>External</td></tr>
              <tr><td>Fragment n.7</td><td>External</td><td>External</td></tr>
              <tr><td>Fragment n.8</td><td>External/Internal</td><td>External</td></tr>
              <tr><td>Fragment n.9</td><td>External</td><td>External</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
    </sec>
    <sec id="sec-10">
      <title>Conclusion</title>
      <p>In this work, we presented a series of experiments focused on the classification of 3D object fragments using deep learning techniques. Starting from an artificially generated dataset of fragmented shapes, we evaluated the impact of different preprocessing methods, data augmentation strategies, and model configurations. Our experiments demonstrated that simple oversampling techniques, combined with the use of Open3D for point cloud generation and an appropriate sampling density, allowed us to improve the classification performance of the baseline PointNet architecture, reaching an accuracy comparable to more advanced models such as PointNet++. We also investigated the role of noise injection as a means to enhance model generalization. Our results suggest that controlled noise injection, particularly in the range of (-0.010, +0.010), enables the model to better adapt to new datasets without severely impacting performance on the original data.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT and Grammarly for grammar and spelling checking, paraphrasing, and rewording. After using these tools/services, the authors reviewed and edited the content as needed and take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C. R.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nießner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Guibas</surname>
          </string-name>
          ,
          <article-title>Volumetric and multi-view cnns for object classification on 3d data</article-title>
          ,
          <source>in: Proceedings of the IEEE conference on computer vision and pattern recognition</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>5648</fpage>
          -
          <lpage>5656</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>H.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Maji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kalogerakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Learned-Miller</surname>
          </string-name>
          ,
          <article-title>Multi-view convolutional neural networks for 3d shape recognition</article-title>
          ,
          <source>in: Proceedings of the IEEE international conference on computer vision</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>945</fpage>
          -
          <lpage>953</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bruna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zaremba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Szlam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>LeCun</surname>
          </string-name>
          ,
          <article-title>Spectral networks and locally connected networks on graphs</article-title>
          ,
          <source>arXiv preprint arXiv:1312.6203</source>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C. R.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Mo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Guibas</surname>
          </string-name>
          ,
          <article-title>Pointnet: Deep learning on point sets for 3d classification and segmentation</article-title>
          ,
          <source>in: Proceedings of the IEEE conference on computer vision and pattern recognition</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>652</fpage>
          -
          <lpage>660</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Avanzato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Beritelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vaccaro</surname>
          </string-name>
          ,
          <article-title>Yolov3-based mask and face recognition algorithm for individual protection applications</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>2768</volume>
          ,
          <year>2020</year>
          , p.
          <fpage>41</fpage>
          -
          <lpage>45</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Fiani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ponzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <article-title>Keeping eyes on the road: Understanding driver attention and its role in safe driving</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>3695</volume>
          ,
          <year>2023</year>
          , p.
          <fpage>85</fpage>
          -
          <lpage>95</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Capizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L.</given-names>
            <surname>Sciuto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Monforte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Cascade feed forward neural network-based model for air pollutants evaluation of single monitoring stations in urban areas</article-title>
          ,
          <source>International Journal of Electronics and Telecommunications</source>
          <volume>61</volume>
          (
          <year>2015</year>
          )
          <fpage>327</fpage>
          -
          <lpage>332</lpage>
          . doi:10.1515/eletel-2015-0042.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>N.</given-names>
            <surname>Brandizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Bianco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Castro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wajda</surname>
          </string-name>
          ,
          <article-title>Automatic rgb inference based on facial emotion recognition</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>3092</volume>
          ,
          <year>2021</year>
          , p.
          <fpage>66</fpage>
          -
          <lpage>74</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>E.</given-names>
            <surname>Iacobelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>A machine learning based real-time application for engagement detection</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>3695</volume>
          ,
          <year>2023</year>
          , p.
          <fpage>75</fpage>
          -
          <lpage>84</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>V.</given-names>
            <surname>Ponzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wajda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>A comparative study of machine learning approaches for autism detection in children from imaging data</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>3398</volume>
          ,
          <year>2022</year>
          , p.
          <fpage>9</fpage>
          -
          <lpage>15</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G.</given-names>
            <surname>Lo Sciuto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Capizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Shikler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Organic solar cells defects classification by using a new feature extraction algorithm and an ebnn with an innovative pruning algorithm</article-title>
          ,
          <source>International Journal of Intelligent Systems</source>
          <volume>36</volume>
          (
          <year>2021</year>
          )
          <fpage>2443</fpage>
          -
          <lpage>2464</lpage>
          . doi:10.1002/int.22386.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Gezawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z. A.</given-names>
            <surname>Bello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yunqi</surname>
          </string-name>
          ,
          <article-title>A voxelized point clouds representation for object classification and segmentation on 3d data</article-title>
          ,
          <source>The Journal of Supercomputing</source>
          <volume>78</volume>
          (
          <year>2022</year>
          )
          <fpage>1479</fpage>
          -
          <lpage>1500</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D. Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Posner</surname>
          </string-name>
          ,
          <article-title>Voting for voting in online point cloud object detection</article-title>
          ,
          <source>in: Robotics: science and systems</source>
          , volume
          <volume>1</volume>
          , Rome, Italy,
          <year>2015</year>
          , pp.
          <fpage>10</fpage>
          -
          <lpage>15</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R. N.</given-names>
            <surname>Bracewell</surname>
          </string-name>
          ,
          <article-title>The fourier transform</article-title>
          ,
          <source>Scientific American</source>
          <volume>260</volume>
          (
          <year>1989</year>
          )
          <fpage>86</fpage>
          -
          <lpage>95</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bennamoun</surname>
          </string-name>
          ,
          <article-title>Deep learning for 3d point clouds: A survey</article-title>
          ,
          <source>IEEE transactions on pattern analysis and machine intelligence</source>
          <volume>43</volume>
          (
          <year>2020</year>
          )
          <fpage>4338</fpage>
          -
          <lpage>4364</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>C. R.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Guibas</surname>
          </string-name>
          ,
          <article-title>Pointnet++: Deep hierarchical feature learning on point sets in a metric space</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>30</volume>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>V.</given-names>
            <surname>Ponzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Graph neural networks: Architectures, applications, and future directions</article-title>
          ,
          <source>IEEE Access</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>V.</given-names>
            <surname>Ponzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Comito</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Pnmlr: Enhancing route recommendations with personalized preferences using graph attention networks</article-title>
          ,
          <source>IEEE Access</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>L.</given-names>
            <surname>Flavell</surname>
          </string-name>
          ,
          <article-title>Beginning blender: open source 3d modeling, animation, and game design</article-title>
          ,
          <source>Apress</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>C.</given-names>
            <surname>Shorten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. M.</given-names>
            <surname>Khoshgoftaar</surname>
          </string-name>
          ,
          <article-title>A survey on image data augmentation for deep learning</article-title>
          ,
          <source>Journal of Big Data</source>
          <volume>6</volume>
          (
          <year>2019</year>
          )
          <fpage>1</fpage>
          -
          <lpage>48</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Van Dyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.-L.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
          <article-title>The art of data augmentation</article-title>
          ,
          <source>Journal of Computational and Graphical Statistics</source>
          <volume>10</volume>
          (
          <year>2001</year>
          )
          <fpage>1</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>I. E.</given-names>
            <surname>Tibermacine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Citeroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Mancini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rabehi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. H.</given-names>
            <surname>Alharbi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.-S. M.</given-names>
            <surname>Elkenawy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Adversarial denoising of eeg signals: A comparative analysis of standard gan and wgan-gp approaches</article-title>
          ,
          <source>Frontiers in Human Neuroscience</source>
          <volume>19</volume>
          (
          <year>2025</year>
          )
          <fpage>1583342</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. E.</given-names>
            <surname>Tibermacine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Enhancing eeg signal reconstruction in cross-domain adaptation using cyclegan</article-title>
          ,
          <source>in: 2024 International Conference on Telecommunications and Intelligent Systems (ICTIS)</source>
          , IEEE,
          <year>2024</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>I. E.</given-names>
            <surname>Tibermacine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tibermacine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Guettala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <article-title>Enhancing sentiment analysis on seed-iv dataset with vision transformers: A comparative study</article-title>
          ,
          <source>in: Proceedings of the 2023 11th international conference on information technology: IoT and smart city</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>238</fpage>
          -
          <lpage>246</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>