<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Deep Learning Algorithms for Fragmented Solid Objects Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pasquale Santaniello</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valerio Ponzi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roberta Avanzato</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Electrical, Electronics and Computer Engineering, University of Catania</institution>
          ,
          <addr-line>Catania</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute for Systems Analysis and Computer Science, Italian National Research Council</institution>
          ,
          <addr-line>Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>51</fpage>
      <lpage>58</lpage>
      <abstract>
        <p>This paper presents a series of experiments conducted on an artificially generated dataset of 3D shapes, specifically focusing on fragments of larger objects. Developed in collaboration with the Ente Parco Archeologico dei Fori Imperiali, this research aims to create a system capable of recognizing and classifying objects from their fragments, with a particular emphasis on ancient artifacts. To achieve this objective, we explore various methods for classifying fragmented objects, identifying the most effective approach for this specific task. Although there are multiple techniques for 3D shape classification, this study centers on the PointNet network, which directly processes 3D point cloud data. This method is not only computationally efficient but also well-suited for handling irregular and unordered data structures, making it particularly advantageous over traditional techniques. Furthermore, we investigate the impact of data augmentation and noise injection strategies to enhance the model's robustness. A comparative analysis with state-of-the-art architectures is also provided. Finally, we present the trained models developed on our artificial dataset, demonstrating classification performance on par with the best existing solutions, and highlighting the potential of our approach in the domain of fragmented object recognition.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Fragmented solid objects are prevalent across various domains in real-world scenarios, ranging from archaeological remains and geological samples to industrial debris analysis. The task of classifying these fragments, identifying whether they originate from external surfaces or internal structures, or associating them with their original object classes, is important for applications such as artifact restoration, digital reconstruction, and quality control.</p>
      <p>Unlike complete 3D models, fragments present partial, often noisy representations of the original objects. Surface degradation, random break patterns, and loss of significant geometric features introduce severe challenges in analyzing fragmented solids. Furthermore, fragments may exhibit highly irregular shapes, missing structural continuity, and limited discriminative features, making conventional 3D classification techniques inadequate.</p>
      <p>In practical applications, especially in cultural heritage preservation, the ability to automatically analyze and classify fragments could significantly accelerate the processes of cataloging, reassembly, and reconstruction of historical artifacts. However, building robust models for fragment classification requires addressing several inherent difficulties: the unordered and sparse nature of 3D data, variations in fragment size and orientation, and the need for generalization across different fragmentation patterns.</p>
      <p>This study aims to tackle these challenges by developing a deep learning-based framework capable of classifying fragments of solid objects based on their geometric properties. By simulating fragmentation processes on synthetic objects and introducing variability in fragment shapes and surfaces, we create a controlled yet challenging environment to evaluate the performance of learning algorithms. The ultimate goal is to bridge the gap between synthetic experiments and real-world applications, providing tools that can assist domain experts in reconstructing fragmented objects from incomplete information.</p>
      <p>SYSTEM 2025: 11th Sapienza Yearly Symposium of Technology, Engineering and Mathematics, Rome, June 4-6, 2025. ponzi@iasi.cnr.it (V. Ponzi); roberta.avanzato@unict.it (R. Avanzato). © 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
    </sec>
    <sec id="sec-1-1">
      <title>2. Related Works</title>
      <p>The classification of 3D shapes has been extensively studied, leading to the development of several deep learning architectures designed to process different representations of three-dimensional data. The most widely explored approaches include volumetric convolutional neural networks (volumetric CNNs) [<xref ref-type="bibr" rid="ref1">1</xref>], multi-view CNNs [<xref ref-type="bibr" rid="ref2">2</xref>], spectral CNNs [<xref ref-type="bibr" rid="ref3">3</xref>], and point-based methods such as PointNet [<xref ref-type="bibr" rid="ref4">4</xref>], as well as many other CNN-based approaches [<xref ref-type="bibr" rid="ref10 ref11 ref5 ref6 ref7 ref8 ref9">5, 6, 7, 8, 9, 10, 11</xref>].</p>
      <p>Early attempts at 3D shape classification relied on volumetric CNNs, which operate on voxelized representations of objects [<xref ref-type="bibr" rid="ref12">12</xref>]. These methods encode a 3D shape as a binary or real-valued 3D tensor and apply 3D convolutions to extract features. Although effective, volumetric approaches suffer from high memory consumption and computational complexity due to the sparsity of voxel grids, making them impractical for high-resolution models. More recent refinements, such as Vote3D [<xref ref-type="bibr" rid="ref13">13</xref>], attempt to mitigate these issues by improving the handling of sparse volumetric data, but overall efficiency remains a challenge.</p>
      <p>Multi-view CNNs offer a different solution by rendering 3D objects into multiple 2D images from various viewpoints. These images are then processed using standard 2D convolutional networks, exploiting well-established techniques from image classification. This approach has achieved state-of-the-art results on certain benchmarks, benefiting from the high performance of 2D CNNs. However, multi-view methods introduce a trade-off: while they capture shape details effectively, they rely on predefined viewpoints, leading to potential information loss and requiring an extensive number of views to achieve robust performance.</p>
      <p>An alternative approach is provided by spectral CNNs, which operate directly on 3D meshes by applying the Fourier transform [<xref ref-type="bibr" rid="ref14">14</xref>]. This enables classification in the frequency domain, offering advantages such as spectral pooling, which efficiently reduces dimensionality while preserving important structural information. Nevertheless, spectral CNNs are primarily designed for manifold-based representations, making their application to non-isometric shapes more challenging.</p>
      <p>A more direct and flexible solution is to learn from raw 3D point clouds [<xref ref-type="bibr" rid="ref15">15</xref>], which represent objects as unordered sets of points in space: P = {p_i | i = 1, ..., n}, with p_i ∈ R^3. Each point p_i encodes spatial coordinates and, optionally, additional attributes such as color or surface normals.</p>
      <p>PointNet represents a major breakthrough in this field, as it directly processes point cloud data without the need for intermediate projections or transformations. It is designed to be invariant to the ordering of points and effectively captures both local and global features through a symmetric function and a max-pooling aggregation step. Thanks to these properties, PointNet offers a highly efficient and accurate solution for 3D shape classification and segmentation, outperforming traditional approaches in both speed and performance.</p>
      <p>Another advantage of PointNet lies in its computational efficiency. Table 1 compares the number of parameters and floating-point operations (FLOPs) per sample for different architectures.</p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption><p>Comparison of computational efficiency for deep architectures in 3D shape classification.</p></caption>
        <table>
          <thead><tr><th>Architecture</th><th>Parameters (M)</th><th>FLOPs/Sample (M)</th></tr></thead>
          <tbody>
            <tr><td>PointNet</td><td>3.5</td><td>148</td></tr>
            <tr><td>SubVolume</td><td>16.6</td><td>3633</td></tr>
            <tr><td>Multi-View CNN</td><td>60.0</td><td>62057</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>From this analysis, it is evident that PointNet offers a favorable balance between accuracy and efficiency, making it a strong candidate for real-time applications and large-scale 3D data processing. While multi-view CNNs remain highly accurate, their computational complexity and memory requirements limit their practicality. Meanwhile, volumetric and spectral methods struggle with scalability, particularly for high-resolution models.</p>
      <p>PointNet++ [<xref ref-type="bibr" rid="ref16">16</xref>] extends the original architecture by incorporating hierarchical feature learning. By applying PointNet recursively on progressively refined subsets of points, PointNet++ captures local dependencies while preserving the efficiency and flexibility of the original model. This enhancement significantly improves performance on tasks such as object segmentation and classification. Recent works have also explored the flexibility of graph-based architectures for 3D data, such as Graph Neural Networks (GNNs) [<xref ref-type="bibr" rid="ref17 ref18">17, 18</xref>].</p>
      <p>The next section provides a detailed overview of the PointNet architecture, discussing its main components and design principles in the context of fragmented object classification.</p>
    </sec>
    <sec id="sec-1-2">
      <title>3. PointNet Architecture</title>
      <p>Before explaining why PointNet was chosen as the framework for our classification task, it is essential to examine the properties of point clouds, which differ significantly from other 3D representations such as voxel grids or meshes. A point cloud is defined as an unordered set of 3D points in R^3, where each point is defined by its spatial coordinates and, in some cases, additional attributes such as color or surface normals. Unlike structured data formats such as images or voxel grids, point clouds lack an inherent spatial arrangement, which introduces unique challenges for deep learning models. One of the most significant properties of point clouds is their unordered nature: since there is no predefined structure, a model designed to process point clouds must be permutation-invariant, meaning that different orderings of the same set of points should yield the same result. Another main aspect is the interplay between local and global dependencies. The geometric structure of a point cloud is dictated by the spatial relationships among its points, so a well-designed model must capture both fine-grained local details and broader global structures to accurately interpret the data. In addition, an effective classification model must exhibit invariance to rigid transformations such as rotations and translations: regardless of how an object is placed in space, its fundamental characteristics should remain recognizable, ensuring reliable classification across different orientations. These properties highlight the complexity of working with point clouds and the importance of designing specialized deep learning architectures that can effectively handle their unique characteristics.</p>
      <p>PointNet was chosen for our classification task because of its ability to directly process raw point-cloud data while preserving the main properties described above. It introduces an innovative approach that avoids the computational overhead associated with volumetric and multi-view methods. The network consists of three key components: a symmetric function for permutation invariance, a hierarchical feature aggregation mechanism, and a transformation network for spatial alignment. Since the order of points in a point cloud is arbitrary, the network aggregates information through a symmetric operation. Specifically, it approximates a general function f defined over a set of points as:</p>
      <p>f({x_1, ..., x_n}) ≈ g(h(x_1), ..., h(x_n)) (1)</p>
      <p>where h : R^3 → R^K represents a feature transformation that maps each point to a higher-dimensional space, while g : R^K × ... × R^K → R is a symmetric function, typically implemented as a max-pooling operation. This approach allows the network to extract meaningful point-wise features before aggregating them into a global descriptor, preserving permutation invariance.</p>
      <p>After extracting point-wise features, a global max-pooling operation is applied to condense the most relevant information into a fixed-length vector. This step enables the network to generalize across different point clouds, making it particularly effective for classification and segmentation tasks. By focusing on the most salient features, the model can recognize geometric structures regardless of the input point order.</p>
      <p>To further enhance robustness, PointNet incorporates a joint alignment mechanism that addresses the challenge of arbitrary geometric transformations in raw point cloud data. The network includes a transformation module that learns an affine transformation matrix and applies it to align input points, ensuring invariance to translation and rotation. This alignment improves the consistency of point cloud representations. To stabilize training, a regularization term is introduced to enforce the transformation matrix to be close to an orthogonal matrix:</p>
      <p>L_reg = ‖I − AAᵀ‖² (2)</p>
      <p>where A is the learned transformation matrix. This regularization helps maintain transformation stability, preventing unwanted distortions in aligned point clouds.</p>
      <p>Given these properties, PointNet is particularly well suited for our task of classifying 3D shape fragments. The next section details the dataset used in our experiments.</p>
    </sec>
    <sec id="sec-1-3">
      <title>4. Dataset</title>
      <p>One of the main challenges in this research was the lack of publicly available datasets for fragmented 3D objects. To address this, we generated an artificial dataset consisting of 3D objects created using the open-source software Blender [<xref ref-type="bibr" rid="ref19">19</xref>]. This dataset was designed to simulate the problem of classifying object fragments, a key step in reconstructing partial shapes.</p>
      <p>The dataset comprises three main object categories: spheres, cubes, and icospheres. Each object was systematically fragmented using a Python script within the Blender environment. After fragmentation, each piece was exported in STL format and labeled as either external or internal. External fragments contain portions of the object's outer surface, while internal fragments originate from the object's core. To better approximate real-world conditions, random surface deformations were applied to external fragments, simulating natural aging effects. Consequently, internal fragments retained smooth surfaces, while external fragments exhibited rough and irregular patterns. Fragmentation was carried out using two distinct cutting strategies. The first approach, regular cutting, involved dividing the object at fixed intervals along the x, y, and z axes, resulting in uniformly shaped fragments. The second approach, random cutting, introduced variability in the division intervals, producing fragments of irregular sizes.</p>
      <p>The dataset was structured into three different productions. The first production combined both regular and random cutting methods, yielding a total of 5,461 fragments. Among these, 1,130 were classified as internal fragments, with 564 originating from regular cuts and 566 from random cuts. The remaining 4,331 were labeled as external fragments, consisting of 2,164 generated through regular cuts and 2,167 through random cuts. The second production exclusively employed the random cutting method, generating a total of 7,259 fragments. Of these, 2,093 were categorized as internal fragments, while 5,166 were classified as external. Lastly, the Auriga fragmentation was conducted using a real object, the Auriga, which was fragmented specifically for testing purposes. This process produced only nine fragments, all of which were labeled as external.</p>
      <p>A challenge in this dataset is the imbalance between external and internal fragments, with external fragments being significantly more numerous. This imbalance could negatively impact model training, leading to biased predictions. To mitigate this, we applied data augmentation techniques [<xref ref-type="bibr" rid="ref20 ref21">20, 21</xref>], as detailed in the following section.</p>
      <table-wrap id="tab2">
        <label>Table 2</label>
        <caption><p>Dataset productions and oversampling factors used during training.</p></caption>
        <table>
          <thead><tr><th>Production ID</th><th>Oversampling Factor</th></tr></thead>
          <tbody>
            <tr><td>3</td><td>2</td></tr>
            <tr><td>4</td><td>3</td></tr>
            <tr><td>5</td><td>4</td></tr>
            <tr><td>6</td><td>1</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
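The symmetric-function formulation in Eq. (1) can be illustrated in a few lines of NumPy. This is a minimal sketch, not the authors' implementation: the shared per-point map h is a fixed random linear layer with ReLU (the weights `W`, `b` and the feature dimension K = 16 are hypothetical), and the symmetric function g is a max-pool over points, which is what makes the global descriptor independent of point order.

```python
import numpy as np

def pointnet_global_descriptor(points, W, b):
    """Toy PointNet-style descriptor: shared per-point map h, then max-pool g.

    points : (n, 3) array of unordered 3D points.
    W, b   : parameters of the shared linear layer h (hypothetical, fixed here).
    """
    # h: map each point independently to a higher-dimensional feature space
    features = np.maximum(points @ W + b, 0.0)   # shape (n, K), ReLU
    # g: symmetric aggregation (max over points) -> permutation-invariant
    return features.max(axis=0)                  # shape (K,)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 16))        # assumed feature dimension K = 16
b = rng.normal(size=16)

cloud = rng.normal(size=(128, 3))   # one synthetic point cloud
shuffled = rng.permutation(cloud)   # same point set, different ordering

d1 = pointnet_global_descriptor(cloud, W, b)
d2 = pointnet_global_descriptor(shuffled, W, b)
assert np.allclose(d1, d2)          # max-pooling ignores point order
```

Because g is applied element-wise over the point axis, any reordering of the rows of `cloud` produces the same descriptor, which is exactly the permutation invariance required of point-cloud models.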
    <sec id="sec-2">
      <title>5. Model</title>
      <p>The classification network takes as input a set of n points, applies input and feature transformations, and aggregates point features through a max-pooling operation. The final output consists of classification scores over k classes.</p>
      <p>The architecture includes two transformation networks. The first is a mini-PointNet that processes the raw point cloud and predicts a 3 × 3 transformation matrix. It consists of a shared multilayer perceptron (MLP) applied independently to each point, with output sizes 64, 128, and 1024, followed by a global max-pooling operation across all points and two fully connected layers with 512 and 256 units, respectively. The resulting matrix is initialized as the identity matrix. Except for the final layer, all layers use ReLU activations and batch normalization.</p>
      <p>The second transformation network follows a similar structure but outputs a 64 × 64 transformation matrix, also initialized as the identity. A regularization term, weighted by 0.001, is added to the softmax classification loss to encourage this matrix to remain close to orthogonal.</p>
      <p>Following the transformation steps, a global max-pooling layer aggregates the point-wise features into a compact global descriptor. This descriptor is then passed through an MLP consisting of three fully connected layers with 512, 256, and 2 units, respectively, for binary classification. In the case of multi-class classification, the final dense layer is modified to output 3 units instead of 2.</p>
      <p>Dropout with a keep ratio of 0.3 is applied after the second-to-last fully connected layer (dimension 256). This additional dropout layer was added because it empirically improved model performance. The network is trained using the Adam optimizer with an initial learning rate of 0.001, momentum 0.9, and a batch size of 32.</p>
      <p>Several models were developed during the experimental phase. The main differences between them lie in the data augmentation techniques applied prior to training. For binary classification, the best-performing model is referred to as Model 2 in the results. Model 3, Model 4, and Model 5 share the same architecture but were trained on different sample compositions. For the multi-class classification task, a slightly different model was used, in which only the final MLP layer differs, adapting the output dimension to the three target classes.</p>
    </sec>
    <sec id="sec-2-1">
      <title>6. Data Augmentation Techniques</title>
      <p>To address dataset imbalance and improve model generalization, we applied several data preprocessing and augmentation strategies, including oversampling, noise injection, random shuffling, and dataset mixing. First, we applied oversampling, replicating samples from the minority class (internal fragments) to balance the dataset; we experimented with different replication factors (k = 2, 3, 4) to assess their impact on model performance. Another strategy was random data shuffling, in which we shuffled the dataset after each training epoch. This minimized the variance of the batch composition and reduced the risk of overfitting, improving the generalization of the model.</p>
      <p>To further enhance the robustness of the model, we introduced noise injection, adding small random perturbations to the coordinates of the points. This technique prevented the model from memorizing exact spatial patterns, thereby improving its ability to generalize to unseen data. Lastly, we used dataset mixing, combining fragments from different dataset productions. This increased data variability and allowed us to test the model's ability to adapt to different fragmentation strategies. The final dataset configurations used in training are summarized in Table 2.</p>
      <p>Similar strategies to enhance robustness and generalization through data transformation and noise-based techniques have been successfully applied in other domains, such as EEG signal processing and affective computing, including the use of GAN-based denoising [<xref ref-type="bibr" rid="ref22">22</xref>], CycleGAN for cross-domain adaptation [<xref ref-type="bibr" rid="ref23">23</xref>], and Transformer architectures for sentiment classification [<xref ref-type="bibr" rid="ref24">24</xref>].</p>
    </sec>
    <sec id="sec-2-2">
      <title>7. Noise Addition and Tolerance</title>
      <p>Once we had obtained a well-performing model for classifying the dataset, thanks to the data augmentation techniques, we investigated how much noise (points jittered on all three axes) could be injected without changing the performance of the network, and whether adding noise increased or decreased its generalization capability. This procedure is also called noise injection: 'noise' is artificially added to the network's input data during the training process. Jitter is one particular way to implement noise injection, in which a noise vector is added to each training case between training iterations. This causes the training data to 'jitter' in the feature space during training, making it difficult for the network to find a solution that fits the original training dataset precisely, and thereby reducing overfitting.</p>
      <p>We inject noise into the dataset by drawing from a uniform random distribution applied to each training sample, varying the lower and upper bounds of the distribution to check tolerance and generalization capabilities. We tried different levels of jitter to understand when the model would collapse, testing uniform distribution ranges from (-0.005, +0.005) to (-0.100, +0.100). This last interval can be considered the breaking point of our model's performance, as we will show in the results section. At testing time, we instead ran experiments to check which level of jitter injection would allow the network to generalize better when classifying a different dataset. The noise ranges tested are listed in Table 3.</p>
      <table-wrap id="tab3">
        <label>Table 3</label>
        <caption><p>Noise ranges tested and the corresponding change in performance.</p></caption>
        <table>
          <thead><tr><th>Noise Range</th><th>Change of performance</th></tr></thead>
          <tbody>
            <tr><td>(-0.005, +0.005)</td><td>Default case</td></tr>
            <tr><td>(-0.010, +0.010)</td><td>Slow change of performance</td></tr>
            <tr><td>(-0.050, +0.050)</td><td>Slow change of performance</td></tr>
            <tr><td>(-0.100, +0.100)</td><td>High drop of performance</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>Then, to test which range of noise increases the generalization capability of the network, we created a new dataset, very different from the previous one, especially with respect to the minority class (the dataset was balanced by adding random fragments). Noise in the range of (-0.010, +0.010) makes our model (trained on the first dataset, different from the new one) achieve the same performance as a classifier trained directly on the new dataset (see the Results section for a detailed explanation). This likely means that this range of noise does not degrade performance on the original dataset while maintaining good generalization power on different datasets. All the results are shown in the Results section with the corresponding confusion matrices.</p>
    </sec>
    <sec id="sec-3">
      <title>8. Extension to Multi-Class Classification</title>
      <p>One thing we noticed during the implementation of our model concerns how the dataset was composed. As described in the Dataset section, the dataset used for training and testing the model was obtained by fragmenting objects from three classes: Sphere, Icosphere, and Cube. Considering that this kind of algorithm could ultimately be used to 'reconstruct' objects from their fragments, i.e. a 3D puzzle, we decided that a further step toward this goal was to implement multi-class classification instead of binary classification, deducing the 'family' of objects each fragment comes from. To do this, we first slightly modified the dataset composition and the related label extraction, and then modified the model so that it could predict over the three mentioned classes: Sphere, Icosphere, and Cube.</p>
      <p>Before describing how training and test samples are obtained for this classification task, we note that during our experiments the 'internal' fragments turned out to be very similar to one another, even across different starting classes, so using these fragments for the new classification task was too challenging for our application. We therefore created two productions of samples made entirely of 'external' fragments and used them to train and test the new model. Production 1 consists of all the external fragments of the first production described in the Dataset section, yielding 4,331 external fragments. For Production 2, we increased the number of training samples by adding all the 'external' fragments of the second production, obtaining 9,497 'external' fragments.</p>
      <p>Labels: having created a dataset made only of 'external' fragments, since they are the most 'discriminative' ones, we had to label each fragment with the original class from which it was fragmented. This was straightforward, since the name of each fragment exported from Blender contains the initial object it came from, so labels could be retrieved simply by processing the name string. The model used for prediction is very similar to the one described in the Model section; the only change is the last dense layer of the MLP, which has 3 units instead of 2.</p>
      <p>One possible way to further extend the task of classifying fragments into the objects they came from would be to introduce new families of objects that differ slightly from the existing ones, or to apply data augmentation that makes the model better able to distinguish between Sphere and Icosphere during classification.</p>
    </sec>
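The jitter scheme described above, uniform noise drawn independently for every coordinate with configurable bounds, can be sketched as follows. The bounds are the ones quoted in the text; the function name and the sample size of 1024 points are illustrative assumptions, not the authors' code.

```python
import numpy as np

def jitter_point_cloud(points, bound, rng):
    """Add uniform per-coordinate noise in [-bound, +bound] to a point cloud.

    points : (n, 3) array; bound : jitter half-range (hypothetical API).
    """
    noise = rng.uniform(-bound, bound, size=points.shape)
    return points + noise

rng = np.random.default_rng(42)
cloud = rng.normal(size=(1024, 3))      # one training sample (assumed 1024 points)

# Ranges explored in the text, from the default case to the breaking point.
for bound in (0.005, 0.010, 0.050, 0.100):
    jittered = jitter_point_cloud(cloud, bound, rng)
    assert jittered.shape == cloud.shape
    assert np.abs(jittered - cloud).max() <= bound   # noise stays within bounds
```

Applying a fresh draw of this noise at every training epoch is what makes the data 'jitter' in feature space, so the network never sees exactly the same coordinates twice.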
    <sec id="sec-4">
      <title>9. Results</title>
      <p>’break point’ of the net in which the noise is too much
and performances of the net will go down steeply,and
In this section, we present all the results obtained during also in retrieving the range of noise that will grow up
our experiments, providing comparisons and highlight- generalization performances of our network. We have
ing the most efective techniques for classifying 3D object to say that all these experiments are performed on the
fragments. We begin by reporting the accuracy achieved best performing model discovered until now , the one
by our model without any data augmentation, using stan- that reached 0.9222 of accuracy. In this table we show
dard noise injection. The dataset used for training is Pro- the comparison between various range of noise applied
duction 1, while the test set consists of a random subset to the best performing model and the ’break point’. In
of the training data. As a starting point for our experi- this case the test set is obtained by a percentage of the
ments, we compare the performance of models trained Training set.
on samples converted using Open3D and Trimesh.</p>
      <p>As the next step, we compared models trained with
diferent conversion libraries, varying sampling densities,
and applying oversampling as a data augmentation
technique. The table below summarizes the configurations and corresponding accuracies achieved on the Production 1 dataset. Our goal was to train a competitive model capable of reaching accuracy levels comparable to more recent architectures, such as PointNet++, solely by balancing the dataset through oversampling.</p>
      <table-wrap id="tab5">
        <label>Table 5</label>
        <caption>
          <p>Impact of conversion library, sampling density, and oversampling factor on model performance (Production 1 dataset).</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th>Conversion Library</th>
              <th>Sampling Points</th>
              <th>Oversampling Factor</th>
              <th>Accuracy</th>
            </tr>
          </thead>
          <tbody>
            <tr><td>Trimesh</td><td>1024</td><td>3</td><td>89.26%</td></tr>
            <tr><td>Trimesh</td><td>2048</td><td>3</td><td>88.74%</td></tr>
            <tr><td>Open3D</td><td>1024</td><td>3</td><td>92.22%</td></tr>
            <tr><td>Open3D</td><td>2048</td><td>3</td><td>89.90%</td></tr>
            <tr><td>Trimesh</td><td>1024</td><td>2</td><td>87.57%</td></tr>
            <tr><td>Trimesh</td><td>1024</td><td>4</td><td>83.12%</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>Thanks to these techniques, we obtained a model that significantly improved its ability to classify fragments, including internal ones, compared to the initial production. Moving forward, we focus on identifying best practices for preventing overfitting. Before doing so, it is important to highlight an interesting observation made during testing: after training a model with samples converted using either Trimesh or Open3D, once the model has converged, the choice of conversion library for the test set no longer affects performance, as both conversion methods yield identical results at test time. In practice, while the choice of conversion library can impact the model's performance during training, it does not influence the outcomes during testing.</p>
    </sec>
    <sec id="sec-5">
      <title>Noise injection</title>
      <p>We now turn to the noise-injection experiments. As observed, introducing noise within the range of (-0.100, +0.100) leads to a significant drop in performance, which we identify as the "breaking point" for noise injection in our application. Beyond this threshold, the model's classification accuracy deteriorates considerably. The next objective is to determine the optimal level of noise injection to improve generalization while avoiding overfitting. To this end, we exploit a different dataset, referred to as Production Final. Production Final differs from Production 1 mainly in the composition of internal fragments, as it aggregates internal samples from both the first and second productions, resulting in a more balanced dataset. To assess the impact of noise injection, we first train a new PointNet model from scratch on Production Final, treating it as a reference model. We then evaluate the best-performing model previously obtained (trained on Production 1) by applying various levels of noise injection, testing it on a subset of Production Final. By carefully analyzing the confusion matrices, we aim to identify the level of noise that enables the previously trained model to behave similarly to the new baseline model trained directly on Production Final. If a model trained with noise augmentation achieves classification patterns similar to the baseline, we can reasonably conclude that noise injection at that level enhances the model's generalization ability. In the following, we present the accuracy of the model trained from scratch on Production Final.</p>
      <sec id="sec-5-1">
        <title>Evaluation on Production Final</title>
        <p>We now test the best-performing model (model2), trained with different levels of jitter, on the same test set drawn from Production Final, since we want to see which model behaves most similarly to the baseline model trained directly on that dataset.</p>
        <p>[Table: accuracy of the baseline models trained from scratch on Production Final, using Open3D and Trimesh conversion.]</p>
        <p>We observe that the best-performing model, trained with a jitter range of (-0.010, +0.010), is able to classify samples from the new dataset almost as effectively as a model trained directly on it. Since the main difference between the two datasets lies in the internal fragments, which, according to the confusion matrices, are classified similarly by both models, we can conclude that injecting noise within this range slightly reduces performance on the original dataset but enhances the model's generalization capabilities. Regarding the binary classification task (internal vs. external fragments), the final evaluation concerns the predictions made by the best-performing model on the fragments of a real object (Auriga), as introduced in Section 2.</p>
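        <p>The similarity between the confusion matrices of two models, which we assess by inspection, could also be quantified. The helper below is a hypothetical sketch; the function names and the mean-absolute-difference metric are illustrative assumptions, not the procedure used in the experiments.</p>
        <preformat>
```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    # Counts: rows index the true class, columns the predicted class.
    m = np.zeros((n_classes, n_classes))
    np.add.at(m, (y_true, y_pred), 1.0)
    return m

def confusion_distance(y_true, pred_a, pred_b, n_classes):
    # Mean absolute difference between the row-normalized confusion
    # matrices of two models on the same test set: 0 means the two
    # models behave identically on every class.
    a = confusion_matrix(y_true, pred_a, n_classes)
    b = confusion_matrix(y_true, pred_b, n_classes)
    a /= np.maximum(a.sum(axis=1, keepdims=True), 1.0)
    b /= np.maximum(b.sum(axis=1, keepdims=True), 1.0)
    return np.abs(a - b).mean()

# Example with two classes (internal = 0, external = 1).
y_true = np.array([0, 0, 1, 1, 1])
model_a = np.array([0, 1, 1, 1, 1])
d_same = confusion_distance(y_true, model_a, model_a, 2)
```
        </preformat>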
      </sec>
      <sec id="sec-5-2">
        <title>Auriga’s fragments</title>
        <table-wrap id="tab-auriga">
          <caption>
            <p>Labels and predictions of the best-performing model on the nine fragments of the Auriga.</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th>Auriga’s fragment</th>
                <th>Label</th>
                <th>Prediction</th>
              </tr>
            </thead>
            <tbody>
              <tr><td>Fragment n.1</td><td>External</td><td>External</td></tr>
              <tr><td>Fragment n.2</td><td>External</td><td>External</td></tr>
              <tr><td>Fragment n.3</td><td>External</td><td>External</td></tr>
              <tr><td>Fragment n.4</td><td>External</td><td>External</td></tr>
              <tr><td>Fragment n.5</td><td>External</td><td>External</td></tr>
              <tr><td>Fragment n.6</td><td>External</td><td>External</td></tr>
              <tr><td>Fragment n.7</td><td>External</td><td>External</td></tr>
              <tr><td>Fragment n.8</td><td>External/Internal</td><td>External</td></tr>
              <tr><td>Fragment n.9</td><td>External</td><td>External</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
    </sec>
    <sec id="sec-10">
      <title>Conclusion</title>
      <p>In this work, we presented a series of experiments focused on the classification of 3D object fragments using deep learning techniques. Starting from an artificially generated dataset of fragmented shapes, we evaluated the impact of different preprocessing methods, data augmentation strategies, and model configurations. Our experiments demonstrated that simple oversampling techniques, combined with the use of Open3D for point cloud generation and an appropriate sampling density, allowed us to improve the classification performance of the baseline PointNet architecture, reaching an accuracy comparable to more advanced models such as PointNet++. We also investigated the role of noise injection as a means to enhance model generalization. Our results suggest that controlled noise injection, particularly in the range of (-0.010, +0.010), enables the model to better adapt to new datasets without severely impacting performance on the original data.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT and Grammarly for grammar and spelling checking, paraphrasing, and rewording. After using these tools/services, the authors reviewed and edited the content as needed and take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C. R.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nießner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Guibas</surname>
          </string-name>
          ,
          <article-title>Volumetric and multi-view cnns for object classification on 3d data</article-title>
          ,
          <source>in: Proceedings of the IEEE conference on computer vision and pattern recognition</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>5648</fpage>
          -
          <lpage>5656</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>H.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Maji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kalogerakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Learned-Miller</surname>
          </string-name>
          ,
          <article-title>Multi-view convolutional neural networks for 3d shape recognition</article-title>
          ,
          <source>in: Proceedings of the IEEE international conference on computer vision</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>945</fpage>
          -
          <lpage>953</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bruna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zaremba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Szlam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>LeCun</surname>
          </string-name>
          ,
          <article-title>Spectral networks and locally connected networks on graphs</article-title>
          ,
          <source>arXiv preprint arXiv:1312.6203</source>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C. R.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Mo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Guibas</surname>
          </string-name>
          ,
          <article-title>Pointnet: Deep learning on point sets for 3d classification and segmentation</article-title>
          ,
          <source>in: Proceedings of the IEEE conference on computer vision and pattern recognition</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>652</fpage>
          -
          <lpage>660</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Avanzato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Beritelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vaccaro</surname>
          </string-name>
          ,
          <article-title>Yolov3-based mask and face recognition algorithm for individual protection applications</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>2768</volume>
          ,
          <year>2020</year>
          , p.
          <fpage>41</fpage>
          -
          <lpage>45</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Fiani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ponzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <article-title>Keeping eyes on the road: Understanding driver attention and its role in safe driving</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>3695</volume>
          ,
          <year>2023</year>
          , p.
          <fpage>85</fpage>
          -
          <lpage>95</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Capizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L.</given-names>
            <surname>Sciuto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Monforte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Cascade feed forward neural network-based model for air pollutants evaluation of single monitoring stations in urban areas</article-title>
          ,
          <source>International Journal of Electronics and Telecommunications</source>
          <volume>61</volume>
          (
          <year>2015</year>
          )
          <fpage>327</fpage>
          -
          <lpage>332</lpage>
          . doi:10.1515/eletel-2015-0042.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>N.</given-names>
            <surname>Brandizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Bianco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Castro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wajda</surname>
          </string-name>
          ,
          <article-title>Automatic rgb inference based on facial emotion recognition</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>3092</volume>
          ,
          <year>2021</year>
          , p.
          <fpage>66</fpage>
          -
          <lpage>74</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>E.</given-names>
            <surname>Iacobelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>A machine learning based real-time application for engagement detection</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>3695</volume>
          ,
          <year>2023</year>
          , p.
          <fpage>75</fpage>
          -
          <lpage>84</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>V.</given-names>
            <surname>Ponzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wajda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>A comparative study of machine learning approaches for autism detection in children from imaging data</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>3398</volume>
          ,
          <year>2022</year>
          , p.
          <fpage>9</fpage>
          -
          <lpage>15</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G.</given-names>
            <surname>Lo Sciuto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Capizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Shikler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Organic solar cells defects classification by using a new feature extraction algorithm and an ebnn with an innovative pruning algorithm</article-title>
          ,
          <source>International Journal of Intelligent Systems</source>
          <volume>36</volume>
          (
          <year>2021</year>
          )
          <fpage>2443</fpage>
          -
          <lpage>2464</lpage>
          . doi:10.1002/int.22386.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Gezawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z. A.</given-names>
            <surname>Bello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yunqi</surname>
          </string-name>
          ,
          <article-title>A voxelized point clouds representation for object classification and segmentation on 3d data</article-title>
          ,
          <source>The Journal of Supercomputing</source>
          <volume>78</volume>
          (
          <year>2022</year>
          )
          <fpage>1479</fpage>
          -
          <lpage>1500</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D. Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Posner</surname>
          </string-name>
          ,
          <article-title>Voting for voting in online point cloud object detection</article-title>
          ,
          <source>in: Robotics: science and systems</source>
          , volume
          <volume>1</volume>
          , Rome, Italy,
          <year>2015</year>
          , pp.
          <fpage>10</fpage>
          -
          <lpage>15</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R. N.</given-names>
            <surname>Bracewell</surname>
          </string-name>
          ,
          <article-title>The fourier transform</article-title>
          ,
          <source>Scientific American</source>
          <volume>260</volume>
          (
          <year>1989</year>
          )
          <fpage>86</fpage>
          -
          <lpage>95</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bennamoun</surname>
          </string-name>
          ,
          <article-title>Deep learning for 3d point clouds: A survey</article-title>
          ,
          <source>IEEE transactions on pattern analysis and machine intelligence</source>
          <volume>43</volume>
          (
          <year>2020</year>
          )
          <fpage>4338</fpage>
          -
          <lpage>4364</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>C. R.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Guibas</surname>
          </string-name>
          ,
          <article-title>Pointnet++: Deep hierarchical feature learning on point sets in a metric space</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>30</volume>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>V.</given-names>
            <surname>Ponzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Graph neural networks: Architectures, applications, and future directions</article-title>
          ,
          <source>IEEE Access</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>V.</given-names>
            <surname>Ponzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Comito</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Pnmlr: Enhancing route recommendations with personalized preferences using graph attention networks</article-title>
          ,
          <source>IEEE Access</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>L.</given-names>
            <surname>Flavell</surname>
          </string-name>
          ,
          <article-title>Beginning blender: open source 3d modeling, animation, and game design</article-title>
          ,
          <source>Apress</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>C.</given-names>
            <surname>Shorten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. M.</given-names>
            <surname>Khoshgoftaar</surname>
          </string-name>
          ,
          <article-title>A survey on image data augmentation for deep learning</article-title>
          ,
          <source>Journal of Big Data</source>
          <volume>6</volume>
          (
          <year>2019</year>
          )
          <fpage>1</fpage>
          -
          <lpage>48</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Van Dyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.-L.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
          <article-title>The art of data augmentation</article-title>
          ,
          <source>Journal of Computational and Graphical Statistics</source>
          <volume>10</volume>
          (
          <year>2001</year>
          )
          <fpage>1</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>I. E.</given-names>
            <surname>Tibermacine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Citeroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Mancini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rabehi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. H.</given-names>
            <surname>Alharbi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.-S. M.</given-names>
            <surname>Elkenawy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Adversarial denoising of eeg signals: A comparative analysis of standard gan and wgan-gp approaches</article-title>
          ,
          <source>Frontiers in Human Neuroscience</source>
          <volume>19</volume>
          (
          <year>2025</year>
          )
          <fpage>1583342</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. E.</given-names>
            <surname>Tibermacine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Enhancing eeg signal reconstruction in cross-domain adaptation using cyclegan</article-title>
          ,
          <source>in: 2024 International Conference on Telecommunications and Intelligent Systems (ICTIS)</source>
          , IEEE,
          <year>2024</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>I. E.</given-names>
            <surname>Tibermacine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tibermacine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Guettala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <article-title>Enhancing sentiment analysis on seed-iv dataset with vision transformers: A comparative study</article-title>
          ,
          <source>in: Proceedings of the 2023 11th international conference on information technology: IoT and smart city</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>238</fpage>
          -
          <lpage>246</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>