<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main"></title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Vasyl</forename><surname>Teslyuk</surname></persName>
							<email>m.teslyuk@lpnu.ua</email>
							<affiliation key="aff0">
								<orgName type="institution">Lviv Polytechnic National University</orgName>
								<address>
									<addrLine>12 S. Bandera Str</addrLine>
									<postCode>79013</postCode>
									<settlement>Lviv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Iryna</forename><surname>Kazymyra</surname></persName>
							<email>iryna.y.kazymyra@lpnu.ua</email>
							<affiliation key="aff0">
								<orgName type="institution">Lviv Polytechnic National University</orgName>
								<address>
									<addrLine>12 S. Bandera Str</addrLine>
									<postCode>79013</postCode>
									<settlement>Lviv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Volodymyr</forename><surname>Tsapiv</surname></persName>
							<email>volodymyr.tsapiv.mknus.2023@lpnu.ua</email>
							<affiliation key="aff0">
								<orgName type="institution">Lviv Polytechnic National University</orgName>
								<address>
									<addrLine>12 S. Bandera Str</addrLine>
									<postCode>79013</postCode>
									<settlement>Lviv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">57205F259D1B78DE9FB7D5316CFF4A39</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:54+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper presents a novel framework for real-time dynamic hand gesture recognition designed to enhance interaction with smart devices and touchless interfaces. The proposed system integrates Google Mediapipe for hand pose detection with a modified version of the DD-Net architecture, optimized for online classification of gestures using 2D and 3D data. Key innovations include introducing an auxiliary classification head to address the class imbalance and an attention mechanism to improve the recognition of partially observed gestures. The system is evaluated on the NVGesture and SHREC22 datasets, achieving an accuracy of 0.784 and 0.924, respectively, surpassing previous benchmarks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Keywords</head><p>real-time gesture recognition, human-machine interactions, dynamic hand gestures</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Human-computer interaction (HCI) has evolved significantly over the years, moving from traditional input methods such as keyboards and mice to more intuitive and natural modes of interaction. Hand gesture recognition is prominent among these due to its close alignment with human communication habits. Hand gestures, being an integral part of non-verbal communication, provide a seamless and intuitive way for humans to convey commands and intentions. This makes them ideal for controlling machines, particularly when hands-free, touchless interaction is required.</p><p>The relevance of real-time dynamic hand gesture recognition lies in its broad applicability across several emerging fields, including augmented reality (AR), virtual reality (VR), robotics, and smart environments. In AR and VR systems, where users are immersed in virtual spaces, traditional input devices can be cumbersome and break immersion. Gesture control offers a more natural alternative, enabling users to interact directly with virtual objects. Similarly, gesture recognition can facilitate smoother interaction between humans and machines in robotics, enabling more intuitive control in industrial or assistive contexts.</p><p>One of the most pressing applications of this technology is in the development of touchless interfaces, which have gained significant importance due to public health concerns, particularly following the COVID-19 pandemic. In public spaces like elevators, ATMs, and kiosks, touchless interaction can help reduce the spread of infectious diseases by minimizing physical contact with shared surfaces. Gesture recognition provides an ideal solution for these scenarios by allowing users to interact without direct contact, ensuring convenience and hygiene.</p><p>Despite these benefits, real-time dynamic hand gesture recognition remains a challenging task. 
The primary obstacles include the high computational cost of processing continuous streams of data and the need for low-latency, high-accuracy systems that can function effectively on devices with limited processing power, such as those equipped with CPUs only. While accurate, traditional offline gesture recognition methods are not designed for real-time data processing: they typically require pre-segmented sequences of hand poses for analysis. This presents a significant gap in real-time applications, which require systems to make instantaneous decisions based on continuously incoming data.</p><p>This research aims to develop a real-time 3D hand pose recognition framework that balances high accuracy with computational efficiency, making it suitable for smart devices and touchless interfaces. To achieve this, the following tasks were undertaken: (1) adapting the DD-Net architecture for efficient online inference; (2) designing a dual-head prediction system to enhance the detection of gestures and non-gestures; (3) evaluating the system on the NVGesture dataset <ref type="bibr" target="#b10">[11]</ref> and real-time input to validate its performance.</p><p>This framework advances the field of gesture recognition by offering a practical solution for real-time, low-latency applications with wide-reaching implications for smart environments and public interfaces.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related works</head><p>Dynamic hand gesture recognition has seen significant advancements, with various models addressing offline and online recognition tasks using skeletal, depth, and motion data. However, real-time gesture recognition, particularly in streaming video, presents unique challenges due to the need for low-latency processing and robustness to environmental variations.</p><p>One of the prominent approaches in hand gesture recognition is the STA-GCN model <ref type="bibr" target="#b0">[1]</ref>, which employs a two-stream graph convolutional network with spatial-temporal attention for skeleton-based hand gesture recognition. The two-stream architecture processes pose and motion streams separately, using spatial-temporal graph convolutional layers to capture hand gestures over time. This method incorporates temporal pyramid pooling to extract features across multiple time scales, enhancing the model's ability to recognize gestures. The approach was evaluated on the DHG14/28 and SHREC2017 <ref type="bibr" target="#b4">[5]</ref> datasets, demonstrating high accuracy. Still, the complexity introduced by spatial-temporal attention can increase computational overhead, making it less suitable for real-time applications where efficiency is vital.</p><p>Another relevant work focuses on robust feature extraction from skeletal and depth data. The Robust Hand Shape Features for Dynamic Hand Gesture Recognition study <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b13">14,</ref><ref type="bibr" target="#b14">15,</ref><ref type="bibr" target="#b15">16]</ref> proposes a multi-level feature LSTM model that extracts 3D geometric transformations from skeletal data and segmentation-based depth shape features. This method, tested on the DHG14/28 dataset, achieved state-of-the-art (SOTA) results by leveraging Conv1D and Conv2D pyramid structures with LSTM blocks. 
While this approach enhances accuracy, its reliance on both depth and skeletal data increases computational complexity, making it less feasible for real-time applications where only skeletal data might be available.</p><p>The DD-Net architecture <ref type="bibr" target="#b2">[3]</ref>, designed for skeleton-based action recognition, takes a different approach by simplifying the network structure to improve efficiency. DD-Net introduces a double-motion feature extraction method that captures joint distances and global motion variations. This lightweight model was tested on SHREC17 <ref type="bibr" target="#b4">[5]</ref> and JHMDB datasets, achieving competitive results for offline gesture classification. However, its design leaves room for optimization for real-time continuous recognition: it assumes the gesture's start and end are predefined, which makes it less adaptable to streaming data.</p><p>The challenge of continuous gesture recognition is addressed by methods presented in the SHREC 2022 Track on Online Detection of Heterogeneous Gestures <ref type="bibr" target="#b3">[4]</ref>. The contest evaluated online recognition methods where gestures are embedded in continuous sequences, interspersed with non-gestural motions. One such method is the Two-stage ST-GCN <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b12">13]</ref>, which uses a sliding window technique to identify gesture candidates before refining them with a larger classification model. This two-stage approach improves the system's ability to handle continuous data and boundary detection between gestures and non-gestures. 
However, the sliding window approach can introduce latency, especially if the window size is large, impacting the real-time responsiveness of the system.</p><p>Another approach, Transformer Network + Finite State Machine (TN-FSM) <ref type="bibr" target="#b3">[4]</ref>, uses temporal convolutional networks (TCNs) and transformers to classify gestures within continuous data streams. The model includes a logical state machine that helps delineate gesture boundaries. This method excels in handling non-gesture segments and works well for continuous streams. Still, transformers and state machines introduce computational complexity, limiting their use in real-time applications unless carefully optimized.</p><p>The DeepGRU architecture <ref type="bibr" target="#b5">[6]</ref> also contributes to this domain by employing gated recurrent units (GRUs) combined with a global attention mechanism. Designed for gesture recognition, DeepGRU achieves SOTA results on SHREC17 <ref type="bibr" target="#b4">[5]</ref> and SHREC19 <ref type="bibr" target="#b6">[7]</ref> datasets, offering an end-to-end approach for gesture classification from raw skeletal data. While highly accurate, the attention mechanism increases computational demand, and like many deep learning models, its performance in real-time settings depends on hardware capabilities.</p><p>Despite the advancements in accuracy and feature extraction, many of these models struggle to balance computational efficiency with real-time performance. Our work builds on these methods, particularly DD-Net and the SHREC competition models, by adapting the DD-Net architecture for online inference while integrating 3D and 2D hand pose data for robust recognition. By introducing residual connections and employing a sliding window approach optimized for low latency, our model aims to achieve real-time gesture recognition in CPU-constrained environments, ensuring accuracy and efficiency in practical applications.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methods</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Overview of the proposed approach</head><p>This study presents a real-time hand gesture recognition system that modifies an existing offline classification model to function online. The core model is based on the double-feature double-motion network (DD-Net) architecture, a 1D convolutional neural network (CNN) optimized for sequential data and designed originally for offline tasks. Our modifications adapt DD-Net for real-time inference, allowing it to classify continuous hand gestures while maintaining high accuracy and computational efficiency. The method works with Google Mediapipe <ref type="bibr" target="#b7">[8]</ref> for hand pose detection and 3D hand landmark acquisition, reflecting the integration of advanced technologies observed in previous studies <ref type="bibr" target="#b8">[9]</ref>. It is also compatible with other tracking devices such as Intel RealSense or Hololens 2.</p><p>The architecture captures temporal hand dynamics by buffering and preprocessing 3D landmarks into windows of 16 frames, making predictions continuously, and reducing spurious gestures through a post-processing step. Figure <ref type="figure" target="#fig_0">1</ref> presents an overview of the pipeline architecture.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Data acquisition and preprocessing</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Data acquisition</head><p>The system employs various hardware options, such as cameras and depth sensors, to acquire 3D hand landmarks:</p><p>Google Mediapipe provides real-time hand pose estimation, outputting 3D joint coordinates for each frame.</p><p>Other supported sensors include Intel RealSense or Hololens 2, which provide hand pose data in formats compatible with the model.</p><p>Each captured frame consists of the 3D positions of multiple hand joints. The system buffers these into windows of 16 frames to match the training configuration of the network. The preprocessing step thus ensures the model receives data with consistent temporal properties, preserving gesture dynamics.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Model input representation</head><p>The input representation of hand gestures is crafted to capture both static and dynamic aspects of the movement. Each 16-frame observation window is processed into three distinct views:</p><p>Geometric Layout (Joint Collection Distances, JCD): This view represents the spatial relationship between hand joints by calculating the Euclidean distances between every pair of joints. The JCD features are location and viewpoint invariant, ensuring the model can recognize gestures regardless of hand orientation. The resulting tensor is of size J(J−1)/2 × W, where J is the number of joints and W is the window size (16).</p><p>Short-term Slow Motion (Mslow): This view captures short-term motion dynamics by computing the linear velocity of each joint between consecutive frames. It reflects slow, continuous movements and is represented as a tensor of size J×(W−1).</p><p>Short-term Fast Motion (Mfast): Similar to Mslow, this view computes linear velocity but skips every other frame, focusing on faster motions. This results in a tensor of size J×(W/2−1). Mfast helps the model differentiate between quick, subtle movements and slower motions.</p><p>These three views are fed into separate embedding branches, creating a comprehensive multi-view description of the hand gesture.</p></div>
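<div xmlns="http://www.tei-c.org/ns/1.0"><p>The three views described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (jcd, velocities, multi_view) are hypothetical, velocity is approximated as the per-joint displacement vector between frames, and each joint keeps its three coordinate components, which the tensor sizes quoted in the text do not spell out.</p>

```python
import math


def jcd(frame):
    """Pairwise Euclidean distances between all joints of one frame.

    frame: list of J joints, each an (x, y, z) tuple.
    Returns a flat list of J * (J - 1) / 2 distances.
    """
    dists = []
    for i in range(len(frame)):
        for j in range(i + 1, len(frame)):
            dists.append(math.dist(frame[i], frame[j]))
    return dists


def velocities(window, step):
    """Per-joint displacement between neighbouring frames after keeping every `step`-th frame."""
    frames = window[::step]
    return [
        [tuple(b - a for a, b in zip(p, q)) for p, q in zip(prev, cur)]
        for prev, cur in zip(frames, frames[1:])
    ]


def multi_view(window):
    """Build the JCD, slow-motion and fast-motion views for one observation window."""
    return {
        "jcd": [jcd(frame) for frame in window],  # W entries
        "m_slow": velocities(window, 1),          # W - 1 entries
        "m_fast": velocities(window, 2),          # W / 2 - 1 entries
    }
```

<p>For a 16-frame window of 21 joints, this sketch yields 16 JCD vectors of 210 distances each, 15 slow-motion frames, and 7 fast-motion frames, matching the sizes given above.</p></div>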
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Model architecture</head><p>As mentioned above, the core model is based on the double-feature double-motion network (DD-Net) architecture. We adapt DD-Net for real-time inference, allowing it to classify continuous hand gestures while maintaining high accuracy and computational efficiency. The modified DD-Net model architecture is presented in Figure <ref type="figure" target="#fig_1">2</ref>. An important enhancement over the original DD-Net is the introduction of residual connections in the convolutional blocks (see Figure <ref type="figure" target="#fig_1">2</ref>). These connections, inspired by the ResNet architecture <ref type="bibr" target="#b9">[10]</ref>, allow the network to bypass layers when beneficial, facilitating better learning and addressing the vanishing gradient problem that can occur in deep networks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Feature concatenation and classification</head><p>After each embedding branch processes its respective input, the features are concatenated to form a unified representation of the hand gesture. This concatenated representation flows into two classification heads.</p><p>Primary Classification Head: This head performs fine-grained gesture classification, predicting specific gesture classes. It consists of several convolutional blocks followed by max-pooling layers for feature extraction and translation invariance. A global max-pooling layer is then applied, followed by two fully connected layers and a Softmax activation to output the final gesture class.</p><p>Auxiliary Classification Head: This head is designed to handle the major class imbalance in online scenarios, where the majority of frames contain no gestures. It classifies frames into three categories: static gestures, dynamic gestures, and no-gesture frames. A small attention unit enhances this branch, helping the model ignore irrelevant background movements and focus on meaningful hand gestures. This is particularly useful for frames at the start and end of gestures, where only part of the gesture is in the window. The attention unit uses pointwise convolution to generate a soft attention mask, which is multiplied by the concatenated embeddings to selectively focus on relevant features.</p></div>
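<div xmlns="http://www.tei-c.org/ns/1.0"><p>The attention unit of the auxiliary head can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name pointwise_attention is hypothetical, the pointwise convolution is written as a per-time-step linear map over channels, and a sigmoid is assumed for producing the soft mask, since the text does not name the activation.</p>

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def pointwise_attention(features, weights, bias):
    """Apply a soft attention mask to concatenated embeddings.

    features: T time steps, each a list of C channel values.
    weights:  C x C matrix of the pointwise (kernel size 1) convolution.
    bias:     list of C bias terms.
    The mask for each time step depends only on that time step, which is
    what makes the convolution pointwise.
    """
    attended = []
    for step in features:
        mask = [
            sigmoid(sum(w * x for w, x in zip(row, step)) + b)
            for row, b in zip(weights, bias)
        ]
        # Element-wise product of the mask and the original embedding.
        attended.append([m * x for m, x in zip(mask, step)])
    return attended
```

<p>With all-zero weights and biases the mask is 0.5 everywhere, so every feature is simply halved; trained weights would instead push the mask towards 0 for channels and time steps that carry irrelevant motion.</p></div>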
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5.">Model training</head><p>The dataset used for training is adapted to the sliding window format required by the network. For each frame t, the system collects joint data from frames t−W+1 to t, creating a 16-frame observation window. The corresponding label is the gesture class at frame t. The primary classification head is trained to predict the fine-grained gesture class using a cross-entropy loss. Using a separate cross-entropy loss, the auxiliary classification head is trained on the simplified three-class task (static, dynamic, no-gesture).</p><p>Both classification heads are optimized simultaneously, with cross-entropy loss applied to each. The model is trained using the Adam optimizer with a learning rate schedule that decays over time to ensure convergence.</p></div>
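<div xmlns="http://www.tei-c.org/ns/1.0"><p>The window construction and the two-headed objective described above can be sketched as follows. The names make_windows, cross_entropy and joint_loss are hypothetical, and the aux_weight parameter is an assumption: the text states only that both heads are optimized simultaneously and does not say how the two losses are weighted.</p>

```python
import math


def make_windows(frames, labels, size=16):
    """Pair each frame t with the frames t-size+1..t and the label at frame t.

    Frames that do not yet have a full window of history are skipped.
    """
    samples = []
    for t in range(size - 1, len(frames)):
        samples.append((frames[t - size + 1 : t + 1], labels[t]))
    return samples


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class under predicted probabilities."""
    return -math.log(probs[target])


def joint_loss(primary_probs, primary_target, aux_probs, aux_target, aux_weight=1.0):
    """Sum of the primary (fine-grained) and auxiliary (three-class) losses.

    aux_weight is an illustrative assumption, not a value from the paper.
    """
    primary = cross_entropy(primary_probs, primary_target)
    auxiliary = cross_entropy(aux_probs, aux_target)
    return primary + aux_weight * auxiliary
```

<p>A sequence of 20 frames with a window size of 16 produces 5 training samples under this scheme, the first one labelled with the class of frame 15.</p></div>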
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.6.">Model inference</head><p>During inference, the system processes each frame in real time, updating its prediction every 16 frames using the following steps:</p><p>1. Observation Window Sampling: For each frame t, a 16-frame observation window is created. The 3D joint data is preprocessed, and the multi-view description (JCD, Mslow, Mfast) is extracted.</p><p>2. Preliminary Classification: The multi-view embeddings are fed into both classification heads, producing a preliminary gesture class prediction for the current window.</p><p>3. Post-processing: A post-processing step is applied to reduce false positives and spurious gesture classifications. The last k preliminary predictions are stored in a buffer, and a majority voting mechanism is used to determine the final class. This helps smooth out predictions over time and eliminates noise from accidental or incomplete gestures.</p></div>
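<div xmlns="http://www.tei-c.org/ns/1.0"><p>The majority-voting post-processing step can be sketched with a fixed-length buffer. The class name GestureSmoother is hypothetical, and the default k=5 is an illustrative assumption, since the text does not report the value of k.</p>

```python
from collections import Counter, deque


class GestureSmoother:
    """Keep the last k preliminary predictions and return their majority class."""

    def __init__(self, k=5):
        # deque with maxlen drops the oldest prediction automatically.
        self.history = deque(maxlen=k)

    def update(self, prediction):
        """Store a new preliminary prediction and return the smoothed class."""
        self.history.append(prediction)
        label, _count = Counter(self.history).most_common(1)[0]
        return label
```

<p>A larger k suppresses more spurious predictions but delays the moment a new gesture wins the vote, so the buffer length trades robustness against latency.</p></div>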
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experiments</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Datasets</head><p>The proposed hand gesture recognition system was evaluated using two datasets: an adapted version of the NVGesture dataset <ref type="bibr" target="#b10">[11]</ref> and the SHREC'22 benchmark <ref type="bibr" target="#b3">[4]</ref>. These datasets cover a variety of gestures, allowing a comprehensive analysis of the model's performance on static and dynamic hand gestures.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>NVGesture dataset</head><p>The NVGesture dataset consists of over 1,500 video sequences depicting 25 unique hand gestures performed by 20 different participants. The dataset includes a variety of gestures, with both static gestures (e.g., "thumb up," "ok") and dynamic gestures (e.g., "stop," "right"). Originally, this dataset was designed as an image-based gesture dataset [Figure <ref type="figure" target="#fig_2">3</ref>], and thus, a significant amount of preprocessing was required to adapt it for 3D hand pose recognition using the proposed model. A custom annotation procedure was developed to prepare the dataset for model training. This procedure utilized Google's Mediapipe framework to automatically extract 2D and 3D joint positions for each gesture sequence. Given that some sequences contained background participants, which resulted in multiple hand detections, a size-based filter was applied. This filter ensured that only the largest detected hand, assumed to belong to the main performer closest to the camera, was retained for the training process.</p><p>Another preprocessing step involved resolving issues caused by motion blur, which led to missed detections in certain frames. A windowed interpolation method was applied to address this. In cases where the first and last frames of a 5-frame window were recognized, but intermediate frames were missing, linear interpolation was used to estimate the joint trajectories within the window. This process effectively filled in gaps in the data, ensuring more complete gesture annotations.</p></div>
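<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two preprocessing steps described above, the size-based hand filter and the windowed interpolation, can be sketched as follows. The names largest_hand and fill_gaps are hypothetical. Measuring hand size by 2D bounding-box area is an assumption, as is interpolating between the two end frames of the window; the text does not specify either detail.</p>

```python
def largest_hand(hands):
    """Return the detection with the largest 2D bounding-box area.

    hands: list of detections, each a list of joints whose first two
    components are the image-plane x and y coordinates.
    """
    def area(hand):
        xs = [joint[0] for joint in hand]
        ys = [joint[1] for joint in hand]
        return (max(xs) - min(xs)) * (max(ys) - min(ys))

    return max(hands, key=area)


def fill_gaps(frames, window=5):
    """Fill missed detections (None) by linear interpolation.

    A gap is filled only when both the first and the last frame of some
    `window`-frame span were detected; other gaps are left as None.
    """
    filled = list(frames)
    for start in range(len(frames) - window + 1):
        end = start + window - 1
        first, last = frames[start], frames[end]
        if first is None or last is None:
            continue
        for i in range(start + 1, end):
            if filled[i] is None:
                a = (i - start) / (end - start)
                filled[i] = [
                    tuple((1 - a) * u + a * v for u, v in zip(p, q))
                    for p, q in zip(first, last)
                ]
    return filled
```

<p>Gaps longer than three frames, or gaps without a detection at both ends of a window, stay unfilled in this sketch and would have to be discarded or handled separately.</p></div>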
<div xmlns="http://www.tei-c.org/ns/1.0"><head>SHREC'22 benchmark</head><p>The SHREC'22 benchmark features continuous recordings of 3D hand poses captured in simulated Mixed Reality interactions using a Hololens 2 device. The dataset is structured with training and testing sets comprising 144 sequences. Each sequence contains 16 gesture classes interleaved with non-significant hand movements (referred to as "non-gestures"). The dataset includes gestures categorized into four types: static gestures, dynamic coarse gestures, dynamic fine gestures, and periodic gestures.</p><p>Each sequence is annotated with start frames, end frames, and gesture labels, making it well-suited for gesture segmentation and classification tasks. Notably, the subjects in the training and testing sequences are different, which ensures a challenging cross-subject evaluation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Evaluation metrics</head><p>For the quantitative evaluation of the model's performance, the following standard metrics were employed:</p><p>Accuracy: Measures the percentage of correctly classified gestures.</p><p>Precision: Indicates the proportion of true positive predictions among all positive predictions, capturing the model's ability to avoid false positives.</p><p>Recall: Represents the proportion of true positive predictions among all actual positives, reflecting the model's capacity to identify all relevant gestures.</p><p>F1 Score: The harmonic mean of precision and recall, providing a single metric that balances false positives and false negatives.</p></div>
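<div xmlns="http://www.tei-c.org/ns/1.0"><p>For a multi-class problem, the four metrics above can be computed as in the following sketch. The function name classification_metrics is hypothetical, and macro-averaging over classes is an assumption: the text does not state which averaging scheme was used for precision, recall and F1.</p>

```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        # Guard against empty denominators for classes never predicted or never present.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

<p>Macro-averaging weights every class equally, which matters in online settings where non-gesture frames dominate; a weighted or micro average would give different numbers.</p></div>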
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Evaluation process</head><p>The evaluation was conducted in two phases: offline evaluation on benchmark datasets and real-time evaluation using live input from a webcam.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Phase 1: offline dataset evaluation</head><p>The first phase involved testing the model's performance on the NVGesture dataset and SHREC'22 benchmark. For the NVGesture dataset, the model was trained on the 3D joint data extracted from Mediapipe's world coordinate estimates. To ensure robustness, the evaluation was performed on two types of keypoint representations:</p><p>3D keypoints: Using the world-coordinate hand pose estimates provided by Mediapipe. 2D keypoints: Normalized 2D joint coordinates projected from the camera view.</p><p>After preprocessing and annotating the NVGesture dataset, the modified model was trained and evaluated on the test partition. The dataset includes a balanced mix of static and dynamic gestures, which allowed for a comprehensive evaluation of the model's ability to recognize various hand motions.</p><p>Key challenges included the handling of motion blur and occasional multiple detections in the background, which were mitigated through the custom preprocessing steps described earlier. The results of this evaluation provided insight into the model's robustness in recognizing static and dynamic gestures under various conditions.</p><p>The SHREC'22 benchmark posed additional challenges due to the continuous nature of the recordings, where gestures were interleaved with non-significant movements. 
The model was trained to distinguish between gesture and non-gesture frames, focusing on identifying each gesture's precise start and end.</p><p>Due to the variability in subjects and the mix of gesture types (static, dynamic coarse, dynamic fine, and periodic), the SHREC'22 dataset served as a valuable test of the model's generalization capabilities across different individuals and gesture styles.</p><p>The evaluation results on both datasets, including accuracy, precision, recall, and F1 score, provided quantitative measures of the model's performance and demonstrated its effectiveness in recognizing static and dynamic gestures.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Phase 2: real-time evaluation</head><p>In the second evaluation phase, the model was tested in a real-time environment to simulate practical application scenarios. The system was connected to a webcam, and the trained model was deployed to classify gestures as they were performed live. This evaluation was designed to mimic real-world usage, where the model must operate continuously and make online predictions.</p><p>Each gesture type (static, dynamic coarse, dynamic fine, periodic) was attempted three times, and the system's performance was recorded based on the number of successful attempts. A gesture was counted as successful if the model correctly classified it during the real-time session without significant delay or misclassification.</p><p>Both 2D and 3D keypoint representations were used during the real-time tests to compare performance. The success rate for each gesture type provided a practical assessment of the model's usability in real-time applications, offering valuable insights into its responsiveness and robustness.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">NVGesture dataset evaluation</head><p>The evaluation of the model on the NVGesture dataset was conducted using both 2D and 3D hand pose data. Performance metrics such as accuracy, recall, precision, and F1 score were used to assess the model's effectiveness in gesture recognition. The results are presented in Table <ref type="table" target="#tab_0">1</ref>. The analysis of the results shows that the model trained on 2D data slightly outperforms the one trained on 3D data in terms of accuracy and F1 score. The 2D model achieved an accuracy of 0.794, compared to 0.784 for the 3D model. However, the 3D data model showed higher precision (0.784 vs. 0.763), suggesting it might be better at minimizing false positives. Overall, the 3D data approach outperformed the original results from the NVGesture dataset, which reported an accuracy of 0.74 using colour data only. This demonstrates the improved performance of our model and its potential for more accurate real-time hand gesture recognition using 3D landmarks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Manual evaluation</head><p>A manual real-time evaluation was conducted using a webcam to assess the model's ability to generalize beyond the NVGesture dataset. This test involved classifying static gestures, dynamic gestures, and non-gesture frames. The results are summarized in Table <ref type="table" target="#tab_1">2</ref>. The manual evaluation highlights a significant performance gap between the 2D and 3D models in real-world applications. The model trained on 3D data demonstrated superior performance across all gesture types, with an especially notable improvement in static gesture recognition (0.893 accuracy in 3D vs. 0.345 in 2D). This suggests that the 2D model generalizes poorly to real-world data, whereas the 3D model performs well outside controlled dataset conditions. The 3D model's high accuracy for non-gesture frames (0.921) also reflects its ability to effectively distinguish between gesture and non-gesture movements in live input.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.">SHREC'22 dataset evaluation</head><p>The model's performance was also evaluated on the SHREC'22 dataset, which consists of continuous 3D hand pose recordings captured with a Hololens 2 device. The dataset only contains 3D data, so a direct comparison with 2D data was impossible. The results are provided in Table <ref type="table" target="#tab_2">3</ref>.</p><p>The model achieved high accuracy (0.9243) and F1 score (0.9241) on the SHREC'22 dataset, demonstrating its robustness in recognizing gestures in continuous, real-world recordings. The Static-Dynamic-Non-gesture (SDN) accuracy was particularly strong at 0.9497, highlighting the model's ability to accurately distinguish between different gesture types and non-gesture movements in mixed reality environments. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Discussions</head><p>This research demonstrates that using 3D hand pose data significantly improves real-time gesture recognition compared to previous approaches relying on 2D data or colour information. On the NVGesture dataset, our method achieved 0.784 accuracy, outperforming the original paper's 0.74 accuracy, which only utilized colour data. The improvement can be attributed to the detailed spatial information captured by 3D hand pose data, allowing for a better understanding of hand movements. Furthermore, our modifications to the DD-Net architecture, including residual connections and an auxiliary classification head to handle class imbalance, proved effective in refining gesture classification.</p><p>A key finding is that models trained on 2D data generalized poorly in real-world scenarios, as evidenced by our manual evaluation, where the 2D model struggled with static gestures (0.345 accuracy), while the 3D model performed significantly better (0.893 accuracy). This highlights the robustness of 3D data in handling varying environments, lighting conditions, and background noise, making it more suitable for real-world applications like smart devices and touchless interfaces.</p><p>Our evaluation on the SHREC'22 dataset, which captures continuous gestures in mixed-reality environments using Hololens 2, further validated the approach. The model achieved a strong 0.924 accuracy and reliably distinguished static gestures, dynamic gestures, and non-gestures, with 0.950 SDN accuracy. These results suggest that the method can be effectively integrated into augmented reality (AR) <ref type="bibr" target="#b16">[17]</ref> and virtual reality (VR) applications, where continuous gesture recognition is critical.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Conclusions</head><p>This research advances dynamic hand gesture recognition by presenting a robust real-time system that uses both 3D and 2D data for gesture classification. By adopting the DD-Net architecture and introducing modifications such as an auxiliary classification head and attention mechanisms, the system demonstrated improved accuracy, particularly when working with 3D data, achieving an accuracy of 0.784 on the NVGesture dataset and outperforming prior benchmarks based solely on 2D colour data. The automatic annotation procedure using Google Mediapipe allowed for efficient data preprocessing, further enhancing the system's performance.</p><p>The evaluation of the system was comprehensive, including tests on benchmark datasets like NVGesture and SHREC'22, as well as real-time manual evaluations using a webcam. The results showed that the system, particularly when trained on 3D data, generalizes well to real-world environments, achieving superior performance on dynamic gestures, static gestures, and non-gesture categories.</p><p>With an average model inference time of 5 ms, plus 22 ms per inference for Google Mediapipe (about 27 ms in total), the system remains within acceptable limits for real-time applications. This low-latency performance makes it particularly valuable for practical use cases, such as controlling smart devices or providing touchless interfaces in public spaces, where health concerns and convenience are of growing importance.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Overview of the pipeline architecture based on the Mediapipe hand pose estimator.</figDesc><graphic coords="4,78.00,104.64,444.96,199.92" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Overview of the modified DD-Net model architecture.</figDesc><graphic coords="5,78.72,192.96,443.52,330.24" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: NVGesture data collection setup and a sample of multimodal data.</figDesc><graphic coords="7,118.80,167.04,362.64,178.80" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>NVGesture Dataset Evaluation Results</figDesc><table><row><cell>Data Type</cell><cell>Accuracy</cell><cell>Recall</cell><cell>Precision</cell><cell>F1 Score</cell></row><row><cell>2D</cell><cell>0.794</cell><cell>0.794</cell><cell>0.763</cell><cell>0.768</cell></row><row><cell>3D</cell><cell>0.784</cell><cell>0.758</cell><cell>0.784</cell><cell>0.751</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Manual Evaluation Results</figDesc><table><row><cell>Gesture Type</cell><cell>2D Data</cell><cell>3D Data</cell></row><row><cell>Static</cell><cell>0.345</cell><cell>0.893</cell></row><row><cell>Dynamic</cell><cell>0.567</cell><cell>0.712</cell></row><row><cell>Non-gesture</cell><cell>0.790</cell><cell>0.921</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>SHREC'22 Dataset Evaluation Results</figDesc><table><row><cell>Metric</cell><cell>Value</cell></row><row><cell>Accuracy</cell><cell>0.924</cell></row><row><cell>Precision</cell><cell>0.926</cell></row><row><cell>Recall</cell><cell>0.924</cell></row><row><cell>F1 Score</cell><cell>0.924</cell></row><row><cell>SDN Accuracy</cell><cell>0.950</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">STA-GCN: Two-Stream Graph Convolutional Network with Spatial-Temporal Attention for Hand Gesture Recognition</title>
		<author>
			<persName><forename type="first">Wei</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zeyi</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jian</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Cuixia</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xiaoming</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hongan</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.1007/s00371-020-01955-w</idno>
		<ptr target="https://doi.org/10.1007/s00371-020-01955-w" />
	</analytic>
	<monogr>
		<title level="j">The Visual Computer</title>
		<imprint>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="issue">10-12</biblScope>
			<biblScope unit="page" from="2433" to="2444" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Robust Hand Shape Features for Dynamic Hand Gesture Recognition Using Multi-Level Feature LSTM</title>
		<author>
			<persName><forename type="first">Nhu-Tai</forename><surname>Do</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Soo-Hyung</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hyung-Jeong</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Guee-Sang</forename><surname>Lee</surname></persName>
		</author>
		<idno type="DOI">10.3390/app10186293</idno>
		<ptr target="https://doi.org/10.3390/app10186293" />
	</analytic>
	<monogr>
		<title level="j">Applied Sciences</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">18</biblScope>
			<biblScope unit="page">6293</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Make Skeleton-Based Action Recognition Model Smaller, Faster and Better</title>
		<author>
			<persName><forename type="first">Fan</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yang</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sakriani</forename><surname>Sakti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Satoshi</forename><surname>Nakamura</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the ACM Multimedia Asia</title>
				<meeting>the ACM Multimedia Asia<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">SHREC 2022 Track on Online Detection of Heterogeneous Gestures</title>
		<author>
			<persName><forename type="first">Marco</forename><surname>Emporio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ariel</forename><surname>Caputo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrea</forename><surname>Giachetti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marco</forename><surname>Cristani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Guido</forename><surname>Borghi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrea</forename><surname>D'Eusanio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Minh-Quan</forename><surname>Le</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.cag.2022.07.015</idno>
		<ptr target="https://doi.org/10.1016/j.cag.2022.07.015" />
	</analytic>
	<monogr>
		<title level="j">Computers &amp; Graphics</title>
		<imprint>
			<biblScope unit="volume">107</biblScope>
			<biblScope unit="page" from="241" to="251" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">SHREC&apos;17 Track: 3D Hand Gesture Recognition Using a Depth and Skeletal Dataset</title>
		<author>
			<persName><forename type="first">Quentin</forename><surname>De Smedt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hazem</forename><surname>Wannous</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jean-Philippe</forename><surname>Vandeborre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Joris</forename><surname>Guerry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bertrand</forename><surname>Le Saux</surname></persName>
		</author>
		<idno type="DOI">10.2312/3dor.20171049</idno>
	</analytic>
	<monogr>
		<title level="m">3DOR - 10th Eurographics Workshop on 3D Object Retrieval</title>
				<meeting><address><addrLine>Lyon, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2017-04">Apr 2017</date>
			<biblScope unit="page" from="1" to="6" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">DeepGRU: Deep Gesture Recognition Utility</title>
		<author>
			<persName><forename type="first">Mehran</forename><surname>Maghoumi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Joseph</forename><forename type="middle">J</forename><surname>Laviola</surname><genName>Jr</genName></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">Lecture Notes in Computer Science</title>
		<imprint>
			<biblScope unit="page" from="16" to="31" />
			<date type="published" when="2019">2019</date>
			<publisher>Springer International Publishing</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<author>
			<persName><forename type="first">Fabio</forename><forename type="middle">Marco</forename><surname>Caputo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Burato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Gianni</forename><surname>Pavan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Théo</forename><surname>Voillemin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hazem</forename><surname>Wannous</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jean-Philippe</forename><surname>Vandeborre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mehran</forename><surname>Maghoumi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Eugene</forename><forename type="middle">Matthew</forename><surname>Taranta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Razmjoo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Joseph</forename><forename type="middle">J</forename><surname>Laviola</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Fabio</forename><surname>Manganaro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Stefano</forename><surname>Pini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Guido</forename><surname>Borghi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Roberto</forename><surname>Vezzani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rita</forename><surname>Cucchiara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Nguyen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Minh-Triet</forename><surname>Tran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrea</forename><surname>Giachetti</surname></persName>
		</author>
		<title level="m">SHREC 2019 Track: Online Gesture Recognition</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">MediaPipe Hands: On-device Real-time Hand Tracking</title>
		<author>
			<persName><forename type="first">Fan</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Valentin</forename><surname>Bazarevsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrey</forename><surname>Vakunov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrei</forename><surname>Tkachenka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">George</forename><surname>Sung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chuo-Ling</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Matthias</forename><surname>Grundmann</surname></persName>
		</author>
		<idno>ArXiv abs/2006.10214</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Methods for the efficient energy management in a smart mini greenhouse</title>
		<author>
			<persName><forename type="first">V</forename><surname>Teslyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Tsmots</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gregus Ml</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Teslyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Kazymyra</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computers, Materials &amp; Continua</title>
		<imprint>
			<biblScope unit="volume">70</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="3169" to="3187" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Deep Residual Learning for Image Recognition</title>
		<author>
			<persName><forename type="first">Kaiming</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xiangyu</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shaoqing</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jian</forename><surname>Sun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Online Detection and Classification of Dynamic Hand Gestures with Recurrent 3D Convolutional Neural Networks</title>
		<author>
			<persName><forename type="first">Pavlo</forename><surname>Molchanov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xiaodong</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shalini</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kihwan</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Stephen</forename><surname>Tyree</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jan</forename><surname>Kautz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Dynamic Hand Gesture Recognition Using Improved Spatio-Temporal Graph Convolutional Network</title>
		<author>
			<persName><forename type="first">Jae-Hun</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kyeongbo</forename><surname>Kong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Suk-Ju</forename><surname>Kang</surname></persName>
		</author>
		<idno type="DOI">10.1109/tcsvt.2022.3165069</idno>
		<ptr target="https://doi.org/10.1109/tcsvt.2022.3165069" />
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Circuits and Systems for Video Technology: A Publication of the Circuits and Systems Society</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="issue">9</biblScope>
			<biblScope unit="page" from="6227" to="6239" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Dynamic Hand Gesture Recognition Using Multi-Branch Attention Based Graph and General Deep Learning Model</title>
		<author>
			<persName><forename type="first">Abu</forename><forename type="middle">Saleh Musa</forename><surname>Miah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Md</forename><forename type="middle">Al Mehedi</forename><surname>Hasan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jungpil</forename><surname>Shin</surname></persName>
		</author>
		<idno type="DOI">10.1109/access.2023.3235368</idno>
		<ptr target="https://doi.org/10.1109/access.2023.3235368" />
	</analytic>
	<monogr>
		<title level="j">IEEE Access: Practical Innovations, Open Solutions</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="4703" to="4716" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Real-Time Hand Gesture Detection and Classification Using Convolutional Neural Networks</title>
		<author>
			<persName><forename type="first">Okan</forename><surname>Kopuklu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ahmet</forename><surname>Gunduz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Neslihan</forename><surname>Kose</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Gerhard</forename><surname>Rigoll</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">14th IEEE International Conference on Automatic Face &amp; Gesture Recognition (FG 2019)</title>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Hand Gesture Recognition with Convolution Neural Networks</title>
		<author>
			<persName><forename type="first">Felix</forename><surname>Zhan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">MMTM: Multimodal Transfer Module for CNN Fusion</title>
		<author>
			<persName><forename type="first">Hamid</forename><forename type="middle">Reza</forename><surname>Vaezi Joze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Amirreza</forename><surname>Shaban</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michael</forename><forename type="middle">L</forename><surname>Iuzzolino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kazuhito</forename><surname>Koishida</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">The Analysis and Comparison of File Formats for the Construction of iOS OS Three-dimensional Objects for Augmented Reality Systems</title>
		<author>
			<persName><forename type="first">D</forename><surname>Ostrovka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Teslyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Veselý</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Protsko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">DCSMart</title>
				<meeting><address><addrLine>Lviv, Ukraine</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="325" to="334" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
