<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>SYSYEM</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>for Real-Time Psychometric and HCI Applications</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Emanuele Iacobelli</string-name>
          <email>iacobelli@diag.uniroma1.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Davide Pelella</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valerio Ponzi</string-name>
          <email>ponzi@diag.uniroma1.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Samuele Russo</string-name>
          <email>samuele.russo@duniroma1.it</email>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christian Napoli</string-name>
          <email>cnapoli@diag.uniroma1.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Eye Tracking, Machine Learning, Real-Time Application, Appearance-Based Eye Tracking System, Gaze Laterality Studies</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computational Intelligence, Czestochowa University of Technology</institution>
          ,
          <addr-line>42-201 Czestochowa</addr-line>
          ,
          <country country="PL">Poland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer, Automatic and Management Engineering, Sapienza University of Rome</institution>
          ,
          <addr-line>00185 Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Department of Psychology, Sapienza University of Rome</institution>
          ,
          <addr-line>00185 Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Institute for Systems Analysis and Computer Science, Italian National Research Council</institution>
          ,
          <addr-line>00185 Roma</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>learning</institution>
          ,
          <addr-line>particularly Convolutional Neural Networks</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>10</volume>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Eye-tracking technology has long been a valuable tool across various domains, and recent advancements in neural networks have significantly expanded its versatility and potential. However, real-world applications continue to face challenges such as accommodating users' natural movements, variations in lighting, occlusions of the eyes, and the limited availability of large, open-source datasets for training models. To address these issues, we developed a comprehensive pipeline that produces a lightweight and eficient model, requiring only an RGB camera as external hardware, making it easily deployable on standard PCs. Key input features include facial images, eye regions, head pose angles, the Eye Aspect Ratio (EAR), and a face grid that determines the face's location within the camera's frame. The model was trained using a custom dataset, in which participants were instructed to fixate on both randomly positioned points and the standard 9-point grid commonly employed in eye-tracking calibration. The resulting system was integrated into a real-time application, ofering fast and accessible gaze tracking, making it well-suited for studies requiring rapid gaze assessments across broad regions of the screen, such as psychometric research and Human-Computer Interaction (HCI) tasks. Its design is particularly advantageous for gaze laterality studies, which explore hemispheric dominance and attentional bias in cognitive and emotional processing, key concepts relevant to ADHD and dyslexia. Moreover, the system's capabilities naturally extend to emotional and decision-making tasks, where broad-area gaze tracking can support the analysis of preference formation and attentional patterns without the need for specialized hardware.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        The human senses gather approximately 11 million bits of
information per second, with about 80% being visual and
the remainder distributed among the other senses. Due to
the dominance of visual perception, AI-based technology
[
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4 ref5">1, 2, 3, 4, 5</xref>
        ] has become a valuable research tool in fields
such as psychology [
        <xref ref-type="bibr" rid="ref6 ref7 ref8 ref9">6, 7, 8, 9</xref>
        ], marketing [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ],
healthcare [
        <xref ref-type="bibr" rid="ref12 ref13 ref14 ref15">12, 13, 14, 15</xref>
        ], safety [
        <xref ref-type="bibr" rid="ref16 ref17">16, 17</xref>
        ], Human-Computer
Interaction (HCI) [18, 19, 20, 21], and Virtual Reality (VR)
and robotics [22, 23, 24]. This technology is particularly
crucial in psychometric applications, facilitating
studies on cognitive functions like focus, emotion
recognition, and decision-making, as well as in gaze laterality
research, where phenomena such as hemispheric
dominance and attentional bias are investigated. Historically,
professional systems relied on expensive hardware, such
as scleral search coils [25], electrooculography [26], EEG
(C. Napoli)
Similarly, [
        <xref ref-type="bibr" rid="ref20">31</xref>
        ] investigated reading performance in
children with ADHD, providing key insights into how the
condition afects oculomotor control and reading
ability, highlighting its potential for educational and clinical
applications. in [
        <xref ref-type="bibr" rid="ref21">32</xref>
        ] a similar approach is used for to
diagnose autism spectrum disorder. In addition, these systems
have proven efective in detecting dyslexia by capturing
distinctive eye movement patterns during reading tasks.
      </p>
      <sec id="sec-2-1">
        <title>This approach, powered by CNNs, enables early identification of dyslexia, allowing for timely interventions [33].</title>
      </sec>
      <sec id="sec-2-2">
        <title>To summarize, eye-tracking in gaze laterality research</title>
        <p>Attribution 4.0 International (CC BY 4.0).
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License provides a unique window into cognitive processes,
allowing for a deeper understanding of how attentional 2.1.2. Appearance-Based Approach
resources are allocated across the visual field. For that
reason, the motivation behind developing our lightweight Appearance-based methods aim to learn a direct mapping
real-time application is to enable more researchers to between the input image and the eye-gaze direction
withstudy gaze movement patterns without the need to in- out relying on camera calibration or geometric models
vest in expensive professional eye-tracking systems. By [44]. These methods are highly flexible, but they can be
reducing the cost and complexity, we aim to make this sensitive to head movements. Currently, the most
efectechnology more available for a wider range of studies tive approaches leverage convolutional neural networks
focused on cognitive and neurological research. As eye- (CNNs) and their variants to create mapping functions.
tracking becomes more accessible, its application in both While CNNs often achieve high accuracy on benchmark
research and clinical environments will continue to grow, datasets, they can struggle to generalize across diferent
ofering new avenues for understanding and addressing datasets unless trained on large-scale annotated datasets,
these conditions. which are time-consuming and complex to create.</p>
        <p>
          Recent works have made significant eforts to
overcome these challenges by creating diverse and
compre2. Related Works hensive datasets that improve the training and
generalization of CNN models. For example, the MPIIGaze
2.1. Eye-Tracking Approaches dataset [45] is a widely-used resource that contains over
200,000 images of 15 participants captured in real-world
In literature is possible to distinguish among 2 possible environments. This dataset helps improve gaze
predicapproaches, free from heavy specific instrumentation: tion in unconstrained settings, with variations in lighting,
Model-Based, and Appearance Based [
          <xref ref-type="bibr" rid="ref23">34</xref>
          ]. head pose, and other real-world factors.
Similarly, ETH-XGaze [46] provides a large dataset
2.1.1. Model-Based Approach with high-quality annotations, including images from 110
The model-based approach utilizes a 3D geometric model subjects captured under a wide range of head poses and
to determine the direction of the eye’s gaze. This is done lighting conditions. This dataset addresses the limitations
by calculating a vector that connects the 3D positions of smaller datasets and enables CNN models to learn
of the eyeball’s center and the pupil’s center. These po- robust gaze estimations in diverse environments.
sitions are derived from 2D eye landmarks and the 2D Additionally, the FAZE dataset [47] is designed
specifposition of the iris in the image, which are then projected ically to tackle domain generalization problems. FAZE
onto the 3D model. Initially, research in this area focused includes a large number of participants and images across
on developing accurate geometric models, but more re- diferent devices and environments, aiming to enhance
cent advancements have shifted towards improving the the generalization of appearance-based gaze estimation
precision of eye landmark detection using machine learn- models by incorporating domain adaptation techniques.
ing methods [
          <xref ref-type="bibr" rid="ref24 ref25 ref26 ref27 ref28 ref29">35, 36, 37, 38, 39, 40</xref>
          ]. For instance, [48] introduced GazeCapture, a dataset of
        </p>
        <p>
          For example, [
          <xref ref-type="bibr" rid="ref30">41</xref>
          ] describes an eye-tracking system videos recorded using smartphone front cameras under
that uses the Kinect v2 sensor. This device, equipped varying lighting conditions and head movements. They
with RGB and depth cameras, identifies facial landmarks used this dataset to train a CNN to predict the screen
coand computes the 3D gaze vector by combining face ori- ordinates a user is looking at on a smartphone or tablet.
entation with eye direction. Another system, presented The input to the CNN includes segmented images of the
in [42], employs the Supervised Descent Method (SDM) eyes and face, as well as a mask showing the face’s
loto detect 2D facial landmarks, while depth information cation in the image. To enhance real-time performance
from the Kinect is used to estimate the user’s 3D head (10–15 FPS on modern mobile devices), the authors
appose. The eye regions are further processed using the plied a technique called dark knowledge to reduce model
Starburst algorithm to estimate the pupil center for accu- complexity.
rate gaze tracking. An alternative approach, proposed by [49], works in a
        </p>
        <p>A more recent approach [43] uses a combination of desktop environment and uses an RGB camera to track
Unet and Squeezenet networks to significantly improve eye movement. The system first segments the eye region,
the accuracy and memory eficiency of eye-gaze tracking, detects the iris center and the inner eye corner and then
making it feasible even on smartphones. Although model- calculates an eye vector representing the eye’s movement.
based techniques ofer the advantage of being training- A second-order polynomial mapping function, combined
free and adaptable to various conditions, they can still with head pose information, is used to map this eye
vecface challenges with the precision of landmark detection tor to screen coordinates while compensating for head
and the accurate positioning of the iris. movements.</p>
        <p>More recent work [50] shifts the focus from traditional
eye-gaze tracking to time-varying signals such as the
vertical displacement between the iris and the inner eye
corner, which is less afected by head movements. Instead
of a direct mapping function, this method uses a CNN
to track multiple eye feature points, including the iris
center and eyelid positions. These points are then used
to generate eye movement signals, which are fed into a
specialized CNN for user behavior recognition.</p>
        <sec id="sec-2-2-1">
          <title>2.2. Challenges and Approach</title>
          <p>limitations, a brand-new collection of the dataset was
necessary, which was more suitable for the task of
interest. To address these challenges a system was designed
to record the user’s gaze on a PC’s screen optimizing the
data to the task of interest.</p>
          <p>Despite notable advancements, real-world applications
of eye-tracking technologies continue to face significant
challenges. These challenges arise from environmental
factors such as varying lighting conditions, reflections
in the images (e.g., glare), objects on the face (e.g.,
eyeglasses), diferences in contrast between the iris and pupil
due to varying iris colors, and individual variations in
eye anatomy. Additionally, the required computational
resources, combined with the limited range of vertical
eye movements, further complicate these
implementations. Furthermore, the end-to-end approach relies on
access to large-scale, publicly available datasets for train- 3.1.1. Recording
ing, which presents an additional hurdle. As a result,
despite their potential, these methods have not yet been
widely adopted, often being overshadowed by specialized
eye-tracking equipment designed for specific purposes.</p>
          <p>To address these challenges, a comprehensive pipeline
has been developed, encompassing dataset collection,
model architecture design, and real-time testing. The
goal is to utilize Convolutional Neural Networks to
create an end-to-end gaze prediction system that uses only
images captured from a standard laptop webcam, aiming
to achieve real-time performance.</p>
          <p>Data collection used 15-inch laptops in various
environmental and lighting conditions. To mitigate the potential
biases introduced by the use of a single webcam for all the
data, multiple webcams from diferent computers were
utilized. This strategy ensured the collection of a diverse
set of images, simulating possible real-world applications
and enhancing the robustness and generalization
capability of the model limiting the bias introduction.</p>
          <p>The custom dataset was gathered using specially
developed software designed to display nine strategically
chosen key points on the screen. These points included
one at each corner of the screen, one at the center, and
3. Implementation one at each of the four cardinal directions on the screen,
Nord, sud, est, and west, as illustrated in Figure 1.
ParticiThis section explores the implementation of the entire pants were instructed to fixate on each point sequentially,
pipeline, from data collection to the architecture and as they were shown, for a predetermined amount of time.
real-time tracking, expanding the key components. This method allowed the collection of data samples for
each gaze point while permitting participants to
natu3.1. Dataset Collection rally adjust their head orientation and position like in a
typical user interaction. Besides these 9 points, a variable
To develop a robust eye-gaze tracking system using just a number of random points were also shown on the screen,
portable computer’s webcam, the dataset is crucial. The one after the other.
actual available ones present many limitations, such as Additionally, the data collection process included
sesthe poor amount of data, poor quality data, or less lib- sions where participants were asked to wear glasses, to
erty in the disposition and interaction of the user with enrich the dataset with varied and challenging
condithe screen and the distance with the camera. Others, tions.
with higher volume data, are based on mobile devices,
not allowing an easy transition from vertical screens
of smartphones to horizontal PC screens, similarly, the
proximity and the relative angle of interaction to the
device itself are drastically diferent. To overcome these
32–41</p>
        </sec>
        <sec id="sec-2-2-2">
          <title>3.2. Data extraction and Annotation</title>
          <p>3.2.1. Face, mask grid and eyes</p>
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>Each video is then processed extracting candidate frames.</title>
        <p>Each frame is inspected and the cropped face image is
extracted if available. Face detection is executed using
MediaPipe Face Detection, a lightweight model based on
the BlazeFace architecture, which provides
state-of-theart techniques optimized for real-time applications. This
model also performs well under challenging conditions
such as partial occlusions, diverse facial orientations, Figure 2: Facial landmarks provided by Dlib’s 68 model, which
and varying lighting conditions. The MediaPipe detector detect the face and then the coordinate (x,y) of the 68 total
outputs the coordinates (,  ,  , ℎ) of the bounding box features, providing information about the aperture of mouth,
around the detected face, which will be used to generate eye, and the orientation of the head. The points from 37 to
the face grid. This grid will provide a spatial map of face 41 and from 43 to 48 will be leveraged for the computation of
positioning within the video frame, helping the model to the Eye Aspect Ratio. Points 37 and 46 are leveraged for the
understand where the face is positioned relative to the roll pose, points 28, and 9 for the tilt, and the 34, 37, and 46
entire frame. For each detected face, the bounding box for the yaw.
coordinates (,  ,  , ℎ) are scaled down to fit a grid of
size 25 × 25. The bounding box is then mapped to this
grid, marking cells where the face is 1 and all other cells Where  37,  38, … ,  48 are the landmarks around the
as 0. This binary grid serves as one of the inputs to the left and right eyes, respectively, according to the Figure 2.
model, facilitating the learning of spatial relationships This metric facilitates the identification of eyelid position
in the gaze estimation tasks. The pipeline proceeds to and blinks and leverages this information to improve the
the detection of the eyes, which employs either Haar robustness of the model.
cascades or the lib library, depending on which method
yields the most accurate results on the specific conditions, 3.2.3. Roll-Pitch-Yaw
as determined through a human-in-the-loop evaluation.</p>
        <p>While the Haar Cascades already provide a bounding box The head orientation is derived from facial landmarks
deto crop the region of interest of the eyes, the dlib uses the tected in each frame. Roll is determined by the tilt of the
landmark features of the eyes, considers padding around, line connecting the outer corners of the eyes (landmarks
and then crops. The eyes are not automatically included 37 and 46) relative to the horizontal axis, indicating left
in the dataset, instead, each pair is inspected to ensure or right head tilt. The pitch measures the vertical tilt of
they are successfully recognized and suficiently open. the head and is calculated from the vertical position of
This check is crucial for confirming the quality of the the top of the nose bridge (landmark 28) relative to the
data and that at least the horizontal position of the pupil chin (landmark 9), showing whether the head is tilted
can be discerned, excluding instances where the eyes are upward or downward. Yaw, indicating left or right head
fully closed. rotation around the vertical axis, is calculated from the</p>
        <p>The face grid together with the face and eye images position of the nose tip (landmark 34) relative to the
midare grouped with the gaze point as in Figure 3 and then point between the eyes (average of landmarks 37 and
expanded with the additional input features. 46). These three angles provide a comprehensive 3D
orientation of the head, enhancing the accuracy of gaze
3.2.2. Eye Aspect Ratio estimation without necessitating a 3D head model or
extensive computations, making the system adaptable
for real-time applications where the user’s head position
varies.</p>
        <p>In the end, this information is paired with the
corresponding gaze point on the screen, selected from nine
possible options.</p>
      </sec>
      <sec id="sec-2-4">
        <title>If both eyes are correctly detected, the pipeline proceeds</title>
        <p>to associate the corresponding Eyes Aspect Ratio. The
EAR is a geometric measure used to quantify the
openness of the eyes. It is computed for each eye using six
specific facial landmarks. For the left eye, the EAR is
calculated as follows:</p>
        <p>EARℎ
EAR 
=
=
‖ 38 −  42‖ + ‖ 39 −  41‖</p>
        <p>2‖ 37 −  40‖
‖ 44 −  48‖ + ‖ 45 −  47‖
2‖ 43 −  46‖</p>
      </sec>
      <sec id="sec-2-5">
        <title>Some preprocessing steps were performed before feeding the data into the model for training to ensure the reliability and robustness of the system.</title>
        <p>3.3.1. Image Resizing and Cropping Figure 4: Model Architecture pipeline: The model is organized
in 2 parallel pipelines that work on the eye and face. The first,
All images, face and eye regions, were resized to a uni- in red, takes as input the cropped eye images and the Eye
form dimension of 64 × 64 pixels to maintain consistency Aspect Ratio computed with the facial landmarks. The CNNs
across the dataset and to be fed into the model. that take as input the eyes share the parameters. The second,
in blue, takes as input the cropped face, the Mask grid, and
3.3.2. Histogram Equalization tthhee hgeaazdeppooisnet.. TThheeni mthaegoeustapruetsblaurrerecdonfcoartpenriavtaecdytroecaosomnpsu.te
3.4.1. Model Architecture
Histogram equalization was employed to improve feature
extraction. This technique adjusts pixel values in an
image to enhance overall contrast. By redistributing
the intensity levels, it equalizes the histogram of the
output image. This process makes the model more robust
in identifying relevant features under varied lighting
conditions.</p>
      </sec>
      <sec id="sec-2-6">
        <title>The model architecture draws inspiration from the</title>
        <p>iTracker model [48], incorporating modifications to
enhance performance. These modifications include
additional input features such as head pose angles (yaw, pitch,
3.3.3. Data Augmentation and roll), the Eye Aspect Ratio (EAR), and the
reorganization and reduction of the layers, to provide a lighter
Several data augmentation techniques were applied to model with faster convergence. The complete pipeline is
enhance the robustness of the model. Specifically, a ran- shown in Figure 4.
dom crop was used to simulate limited visibility of the The Eye Aspect Ratio incorporation started from the
face or eyes, and Gaussian Blur was employed to mimic consideration that, in normal conditions, users will tend
poor image quality or focus. Variability in brightness to open their eyes wider when looking at higher points
and saturation was introduced, along with random rota- on a screen and as narrow as they are looking downward
tions and random erasing of portions and filling it with on the screen. The integration of the EAR information
random values. These techniques help reduce overfitting aims to specifically enhance the sensitivity of the model
and improve the model’s ability to generalize from the to vertical gaze shifts, improving the performance of the
training data to unseen data in real-world applications. model on the vertical axe prediction and better handling</p>
        <p>These preprocessing steps, collectively, ensure that the cases in which the pupil is hardly observable by the
simdata fed into the model is of high quality, consistent in ple raw image provided by the webcam.
size and format, and varied enough to promote robust The integration of head orientation data, along with
learning and prediction accuracy. the face grid, aims to provide the model with
comprehensive information about the head’s spatial positioning,
3.4. Model without the necessity for computationally demanding
external 3D models of the head or the eyes. Leverages
In this section will be presented the model, the architec- the advantages of model-based methods while avoiding
ture, and the training. The object was the realization of their drawbacks.
an eficient model able to provide good performances and The model’s architecture is organized in two distinct
run in real-time on a real-world application. The core of semantic pathways for the eyes and face, each consisting
the implementation involved developing and training a of several convolutional layers followed by pooling layers,
convolutional neural network (CNN) to predict the gaze these layers are designed to capture fine-grained details
point based on the processed input features.
necessary for accurate gaze estimation. The eye pathway
processes separately the eye images with convolutional
layers with shared parameters between the right and left
eye, then the information is integrated with the EAR
of both eyes with a fully connected. The face pathway
processes the entire face region through a similar series
of convolutional layers, then combines this information
with the face grid, and roll pitch yaw angles.</p>
        <sec id="sec-2-6-1">
          <title>3.5. Loss Function</title>
        </sec>
      </sec>
      <sec id="sec-2-7">
        <title>The choice of an appropriate loss function is extremely</title>
        <p>important for the efectiveness of model training. During
development, two primary loss functions were evaluated: Figure 5: Model Training Plot: The mean absolute error (MAE)
Mean Squared Error (MSE) and Huber Loss. for pixel coordinates (x,y) is illustrated, with training data in</p>
        <p>Huber Loss was used to mitigate the outlier sensitivity blue and validation data in orange. The green and red lines
issue with MSE, and the large scale of pixel predictions. It represent the MAE for the x and y coordinates in the training
combines the best properties of MSE and Mean Absolute data, while purple and brown depict these in the validation
Error (MAE), behaving like MSE for small errors and like data. Notably, the MAE for the x coordinates is consistently
MAE for large errors, reducing the influence of outliers higher than for the y coordinates, likely due to the larger pixel
on the model’s training. The Huber Loss is defined as: scale on the laptop screen.</p>
        <p>1 ( −  )̂ 2
{ 2
 (| −  | ̂−
for | −  | ̂ ≤ 
otherwise
  ( ,  )̂ =</p>
        <p>12  ) toafinthinegucsoernsloisotkenincgy awtitthhet hlaepdtoaptasscert.eeTnhe1s5e-ifnrcahm,
emsaairneWhere  is a threshold parameter that dictates the transi- captured at a standard video frame rate of 30 frames per
tion point between the squared loss and the absolute loss. second, usually provided by commonly available
webThis property makes Huber Loss particularly promising cams, which balances between providing smooth video
for this application, as it balances the need for robust- and the computational load on the system. Each captured
ness with the sensitivity to small errors, critical for the frame undergoes a series of preprocessing steps like in
precise prediction of gaze points. As shown in 5, Huber the training phase to maintain consistent data and to
Loss provided a significant improvement in model con- enhance the model’s performance.
vergence and performance compared to MSE, leading to
more stable training and reduced gradient accumulation 3.6.2. Calibration</p>
      </sec>
      <sec id="sec-2-8">
        <title>The application allows to perform an optional calibration</title>
        <p>3.5.1. Regularization step to improve the results on the actual user of the
eyeTo further increase the robustness of the architecture, tracking. To perform the calibration, the system proceeds
were leveraged some regularization techniques. Together to show 9 points on the screen, in the 9 main
representawith the already cited data augmentation, working on tive points, for each collects the prediction provided by
the data, on the model side leveraged the dropout, with the model and compares it with the actual ground truth.
a hyperparameter tuning which led to a successful value Then leverage the diference between the two values to
of 0.2. The training loop then incorporated a learning improve further predictions of the actual user.
rate scheduler together with an early stopping.
3.6.3. Feature Extraction</p>
        <sec id="sec-2-8-1">
          <title>3.6. Real-Time Tracking</title>
        </sec>
      </sec>
      <sec id="sec-2-9">
        <title>The system ofers flexibility in selecting the method for</title>
        <p>The implementation of the real-time tracking functional- extracting eye patches from images, with options
includity represents an essential step for practical applications. ing the dlib68 and the eye cascade approaches. Following
The following section describes the system’s setup, the this it calculates the Eye Aspect Ratio and head angles.
operational flow, and the technologies employed. Then the model uses this information to predict the gaze
point in screen coordinates in real-time. This step is
com3.6.1. System Setup putationally intensive and is optimized to run eficiently
on standard consumer hardware without significant
deFor the real-time application, the system uses standard lays. The predicted gaze point is immediately displayed
laptop webcams, 1280 x 720p, to capture video frames on the user’s screen, providing real-time feedback.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. Results</title>
      <p>above the other, the main focus was paid to the vertical
movement recognition of the eyes, a critical aspect in the
The proposed eye-tracking system demonstrates signif- eye-tracking field.
icant and promising improvements in gaze estimation The comparison of the model with the MPIIGaze,
compared to existing methods, excelling not only in pre- ETHXGaze, and FAZE pointed out a series of
considdiction accuracy but also in eficiency—an essential fac- erations about the performances of the models. The
comtor for real-time applications. These gains are largely parison focused on the zone classification accuracy in a
attributed to model optimizations that resulted in a more grid setup of the screen (Figure 6),
lightweight design. Specifically, the system performed
smoothly on a laptop GPU, achieving a frame rate of 4.1.1. Four Cell Grid task
50 frames per second under optimal visibility and
environmental conditions. On a laptop CPU, the model Our model showed an overall accuracy of 88.5% with
prealso maintained commendable performance, consistently cision of 0.887 and recall of 0.885, excelling particularly in
delivering 30 frames per second without any drop in accu- the top left grid cell while showing weaker performance
racy. This places the system on par with state-of-the-art in the bottom right grid cell. This shows a promising
models but with fewer parameters, making it more efi- overall behavior, with room for improvement due to the
cient. unbalanced result in the four cells. Interestingly, the</p>
      <p>To validate the model’s efectiveness, a real-time appli- proposed architecture, compared to the other models,
cation was developed. This application captures the live showed a slightly better understanding of the top left and
video feed from the webcam, processes it through the top right cases, rather than the lower one.
model to predict the user’s gaze point, and then displays
the predicted point on the screen, providing immediate 4.1.2. Vertical Dual-Grid Task
feedback. To further assess performance, the model was
compared with several state-of-the-art architectures on
a common task, where the screen was divided into cells
to track accuracy. The system showed promising
results in terms of both inference time for real-time
deployment and prediction accuracy, performing competitively
against the benchmark models.</p>
      <p>The second task aimed to inspect the model’s
capability of recognizing horizontal movement, and the model
demonstrated a good 93% overall accuracy, with 0.935
precision and 0.912 recall, quite struggling with the right
section. This result comes from the previous
considerations on the 4-grid task, in which, the bottom-right case
was shown to be responsible for a drop-down in the
prediction performances, making sufering this lack also to
this other task when needed to correctly identify the
gaze-point into the right part of the screen.</p>
      <sec id="sec-3-1">
        <title>To evaluate the eye-tracking recordings and benchmark</title>
        <p>model performance, the real-time eye-tracking system
was leveraged to perform a Fixation-Zone task [51]. 4.1.3. Horizontal Dual-Grid Task
To maintain consistency, all the experiments were per- The third task focused on evaluating the model’s ability
formed on a laptop with an incorporated camera and to identify vertical eye movement accurately. Here, the
a 15-inch screen. The approach performs a zone-wise model achieved an accuracy of 91%, precision of 0.925,
classification accuracy, aggregated over the participants, and recall of 0.899, in correctly recognizing the vertical
where the users are instructed to fixate on specific regions grid cell observed. While it is apparent that other models
of the screen, which turn green for a certain amount of experience a significant decline in performance
transitime, free to move, as long as their gaze is constrained tioning from horizontal to vertical eye movement tasks,
within the boundaries. The experiments instructed to the proposed model exhibited only a slight drop in
perperform a total of 3 tasks, where each aimed to enforce formance. It still significantly outperformed the other
and study the model performances to specific behavior models, especially in the top cell case. Unfortunately,
and compare this information with other SOTA archi- the bottom cell exhibited a slightly lower precision of
tectures like the MPIIGaze, ETHXGaze, and the FAZE. the model when the gaze point approached the screen’s
In the first one, the screen was divided into 4 grid cells, center, leading to some misclassifications that slightly
determining the overall behavior of the model, and ob- exceeded the bottom grid cell boundary and resulted in
serving the general performances of eye-tracking all over errors.
the screen. The second task has 2 grid cells that split Unfortunately, many misclassification cases were also
vertically the screen on two sides, this allows to better linked to unfavorable user visibility or environmental
focus on the architecture capability to recognize the hor- conditions. These factors made predictions more
chalizontal movement of the gaze. In the last task, which lenging for the model, highlighting areas for
improvedivided the screen into two grid cells horizontally, one ment and the potential to surpass existing architectures.</p>
        <sec id="sec-3-1-1">
          <title>4.1. Comparison tasks</title>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Conclusions</title>
      <p>This work presented an end-to-end eye-tracking solution
designed to be lightweight, utilizing only a standard
webcam, while maintaining high accuracy and low resource
requirements. The results indicate that the proposed
system can be efectively applied in various real-world
scenarios, achieving robust performance in both vertical
and horizontal gaze detection. This versatility makes it a
practical tool for studies in areas such as psychometrics
and Human-Computer Interaction (HCI), especially those
focused on gaze laterality and cognitive assessments for
broad regions of the screen. Interestingly, the model
demonstrated significant robustness in detecting
vertical gaze movements, likely due to its high sensitivity to
eye aperture ratio, making it particularly adept at
distinguishing between upper and lower gaze positions. This
capability was confirmed during task evaluations, where
the system showed better precision in upper-screen
positions compared to lower ones. Some imprecision was
noted in central areas of the screen, particularly in
distinguishing between center-up and center-low positions,
likely due to the natural tendency for the eyes to be more
open in upper gaze positions. Despite these challenges,
the model maintained eficiency even on smaller laptop
screens and at greater distances, contrasting with typical
close-range setups required by mobile devices. Future
work could focus on enhancing the system’s robustness
under diverse lighting conditions and user poses by
enriching the dataset with more varied samples and a wider
range of user demographics. Increasing the number of
ifxation points during data collection could also provide
a more comprehensive understanding for the model,
improving precision across all screen areas. Additionally,
modifying the model to focus solely on the eye regions,
rather than the entire face, could improve its performance
in situations where face visibility is limited or when only
one eye is visible. This refinement would not only make
the model more eficient but also help it handle
challenging conditions such as medical constraints or occlusions
more efectively. In summary, the proposed system
represents a significant advancement in making eye-tracking
technology more accessible and practical for a wide range
of everyday applications, reducing the need for
expensive specialized hardware and ofering a versatile tool for
research and clinical environments.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>M. M. Mariani</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Perez-Vega</surname>
            ,
            <given-names>J. Wirtz,</given-names>
          </string-name>
          <article-title>Ai in marketing, consumer research and psychology: A systematic literature review and research agenda</article-title>
          ,
          <source>Psychology &amp; Marketing</source>
          <volume>39</volume>
          (
          <year>2022</year>
          )
          <fpage>755</fpage>
          -
          <lpage>776</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Banerjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Sindhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sindhu</surname>
          </string-name>
          , et al.,
          <article-title>Exploring the intersections of ai (artificial intelligence) in psychology and astrology: a conceptual inquiry for human well-being</article-title>
          ,
          <source>J Psychol Clin Psychiatry</source>
          <volume>15</volume>
          (
          <year>2024</year>
          )
          <fpage>75</fpage>
          -
          <lpage>77</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bonanno</surname>
          </string-name>
          ,
          <string-name>
            <surname>G. Capizzi,</surname>
          </string-name>
          <article-title>An hybrid neurowavelet approach for long-term prediction of solar wind</article-title>
          ,
          <source>in: Proceedings of the International Astronomical Union</source>
          , volume
          <volume>6</volume>
          ,
          <year>2010</year>
          , p.
          <fpage>153</fpage>
          -
          <lpage>155</lpage>
          . doi:
          <volume>10</volume>
          .1017/S174392131100679X.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pappalardo</surname>
          </string-name>
          ,
          <string-name>
            <surname>E. Tramontana,</surname>
          </string-name>
          <article-title>An agent-driven semantical identifier using radial basis neural networks and reinforcement learning</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>1260</volume>
          ,
          <year>2014</year>
          . URL: https://www.scopus.com/inward/ record.uri?eid=
          <fpage>2</fpage>
          -
          <lpage>s2</lpage>
          .
          <fpage>0</fpage>
          -
          <lpage>84919742629</lpage>
          &amp;partnerID=
          <volume>40</volume>
          &amp;md5=
          <fpage>c3ee8a3fa1716b39215326edfc67d955</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>G.</given-names>
            <surname>Capizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. Lo</given-names>
            <surname>Sciuto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          , E. Tramontana,
          <article-title>Advanced and adaptive dispatch for smart grids by means of predictive models</article-title>
          ,
          <source>IEEE Transactions on Smart Grid</source>
          <volume>9</volume>
          (
          <year>2018</year>
          )
          <fpage>6684</fpage>
          -
          <lpage>6691</lpage>
          . doi:
          <volume>10</volume>
          .1109/TSG.
          <year>2017</year>
          .
          <volume>2718241</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Clifton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ferreira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Henderson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. W.</given-names>
            <surname>Inhof</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. P.</given-names>
            <surname>Liversedge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. D.</given-names>
            <surname>Reichle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. R.</given-names>
            <surname>Schotter</surname>
          </string-name>
          ,
          <article-title>Eye movements in reading and information processing: Keith rayner's 40year legacy</article-title>
          ,
          <source>Journal of Memory and Language</source>
          <volume>86</volume>
          (
          <year>2016</year>
          )
          <fpage>1</fpage>
          -
          <lpage>19</lpage>
          . URL: https://www.sciencedirect.com/science/ article/pii/S0749596X15000960. doi:https://doi. org/10.1016/j.jml.
          <year>2015</year>
          .
          <volume>07</volume>
          .004.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>A comprehensive solution for formatics)</article-title>
          , volume
          <volume>14126</volume>
          LNAI,
          <year>2023</year>
          , p.
          <fpage>3</fpage>
          -
          <lpage>16</lpage>
          .
          <article-title>psychological treatment and therapeutic</article-title>
          path plan- doi:10.1007/978- 3-
          <fpage>031</fpage>
          - 42508-
          <article-title>0_1. ning based on knowledge base and expertise shar-</article-title>
          [18]
          <string-name>
            <given-names>R.</given-names>
            <surname>Jacob</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Karn</surname>
          </string-name>
          , Eye Tracking in Humaning,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>2472</volume>
          ,
          <string-name>
            <given-names>Computer</given-names>
            <surname>Interaction</surname>
          </string-name>
          and Usability Research:
          <year>2019</year>
          , p.
          <fpage>41</fpage>
          -
          <lpage>47</lpage>
          .
          <article-title>Ready to Deliver the Promises</article-title>
          , volume
          <volume>2</volume>
          ,
          <year>2003</year>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>G.</given-names>
            <surname>Lo Sciuto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          , A cloud-based pp.
          <fpage>573</fpage>
          -
          <lpage>605</lpage>
          . doi:
          <volume>10</volume>
          .1016/B978- 044451020
          <article-title>- 4/ lfexible solution for psychometric tests validation, 50031- 1. administration and evaluation</article-title>
          , in: CEUR Workshop [19]
          <string-name>
            <given-names>N.</given-names>
            <surname>Brandizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          , G. Galati, C. Napoli, AddressProceedings, volume
          <volume>2468</volume>
          ,
          <year>2019</year>
          , p.
          <fpage>16</fpage>
          -
          <lpage>21</lpage>
          .
          <article-title>ing vehicle sharing through behavioral analysis: A</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Falciglia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Betello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Learn- solution to user clustering using recency-frequencying visual stimulus-evoked eeg manifold for neural monetary and vehicle relocation based on neighimage classification</article-title>
          ,
          <source>Neurocomputing</source>
          <volume>588</volume>
          (
          <year>2024</year>
          ). borhood splits,
          <source>Information (Switzerland) 13</source>
          (
          <year>2022</year>
          ). doi:
          <volume>10</volume>
          .1016/j.neucom.
          <year>2024</year>
          .
          <volume>127654</volume>
          . doi:
          <volume>10</volume>
          .3390/info13110511.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pappalardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Tramontana</surname>
          </string-name>
          , A hy- [20]
          <string-name>
            <given-names>R.</given-names>
            <surname>Brociek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. D.</given-names>
            <surname>Magistris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Cardia</surname>
          </string-name>
          ,
          <string-name>
            <surname>F.</surname>
          </string-name>
          <article-title>Coppa, brid neuro-wavelet predictor for qos control and S. Russo, Contagion prevention of covid-19 by stability</article-title>
          , in: Lecture Notes in Computer Sci- means
          <article-title>of touch detection for retail stores</article-title>
          ,
          <source>in: CEUR ence (including subseries Lecture Notes in Arti- Workshop Proceedings</source>
          , volume
          <volume>3092</volume>
          ,
          <year>2021</year>
          , p.
          <fpage>89</fpage>
          - ifcial
          <source>Intelligence and Lecture Notes in Bioinfor- 94. matics)</source>
          , volume
          <volume>8249</volume>
          LNAI,
          <year>2013</year>
          , p.
          <fpage>527</fpage>
          -
          <lpage>538</lpage>
          . [21]
          <string-name>
            <given-names>N.</given-names>
            <surname>Brandizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Brociek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wajda</surname>
          </string-name>
          , First doi:
          <volume>10</volume>
          .1007/978- 3-
          <fpage>319</fpage>
          - 03524- 6_
          <fpage>45</fpage>
          .
          <article-title>studies to apply the theory of mind theory to green</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>G. De Magistris</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Russo</surname>
            , P. Roma,
            <given-names>J. T.</given-names>
          </string-name>
          <string-name>
            <surname>Starczewski</surname>
          </string-name>
          ,
          <article-title>and smart mobility by using gaussian area clusterC. Napoli, An explainable fake news detector based ing</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>3118</volume>
          ,
          <article-title>on named entity recognition and stance classifica- 2021</article-title>
          , p.
          <fpage>71</fpage>
          -
          <lpage>76</lpage>
          . tion applied to covid-19,
          <string-name>
            <surname>Information</surname>
            (Switzerland) [22]
            <given-names>V.</given-names>
          </string-name>
          <string-name>
            <surname>Ponzi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Russo</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          <string-name>
            <surname>Bianco</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Napoli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          Wa13 (
          <year>2022</year>
          ). doi:
          <volume>10</volume>
          .3390/info13030137. jda,
          <article-title>Psychoeducative social robots for an healthier</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Holzman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. R.</given-names>
            <surname>Proctor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. L.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. J.</given-names>
            <surname>Yasillo</surname>
          </string-name>
          ,
          <article-title>lifestyle using artificial intelligence: a case-study,</article-title>
          <string-name>
            <surname>H. Y. Meltzer</surname>
            ,
            <given-names>S. W.</given-names>
          </string-name>
          <string-name>
            <surname>Hurt</surname>
          </string-name>
          , Eye-tracking dysfunc- in
          <source>: CEUR Workshop Proceedings</source>
          , volume
          <volume>3118</volume>
          ,
          <article-title>tions in schizophrenic patients and their relatives</article-title>
          ,
          <year>2021</year>
          , p.
          <fpage>26</fpage>
          -
          <lpage>33</lpage>
          . Archives of general psychiatry
          <volume>31</volume>
          (
          <year>1974</year>
          )
          <fpage>143</fpage>
          -
          <lpage>151</lpage>
          . [23]
          <string-name>
            <given-names>N. N.</given-names>
            <surname>Dat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ponzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Vincelli</surname>
          </string-name>
          , Supporting
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S. I.</given-names>
            <surname>Illari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Avanzato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>A cloud- impaired people with a following robotic assistant oriented architecture for the remote assessment by means of end-to-end visual target navigation and follow-up of hospitalized patients, in: CEUR and reinforcement learning approaches</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>2694</volume>
          ,
          <year>2020</year>
          , p.
          <fpage>29</fpage>
          -
          <string-name>
            <surname>Workshop</surname>
            <given-names>Proceedings</given-names>
          </string-name>
          , volume
          <volume>3118</volume>
          ,
          <year>2021</year>
          , p.
          <fpage>51</fpage>
          -
          <lpage>35</lpage>
          .
          <fpage>63</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ponzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Puglisi</surname>
          </string-name>
          , S. Russo, [24]
          <string-name>
            <given-names>G.</given-names>
            <surname>Capizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          , M. Woźniak, LessenI. E. Tibermacine,
          <article-title>Exploiting robots as healthcare ing stress and anxiety-related behaviors by means resources for epidemics management and support of ai-driven drones for aromatherapy, in: CEUR caregivers</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , vol- Workshop Proceedings, volume
          <volume>2594</volume>
          ,
          <year>2020</year>
          , p.
          <fpage>7</fpage>
          -
          <lpage>ume</lpage>
          3686,
          <year>2024</year>
          , p.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
          <fpage>12</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <surname>S. I. Illari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Avanzato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          , Reduc- [25]
          <string-name>
            <given-names>E.</given-names>
            <surname>Whitmire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Trutoiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cavin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Perek</surname>
          </string-name>
          ,
          <string-name>
            <surname>B.</surname>
          </string-name>
          <article-title>Scally, ing the psychological burden of isolated oncologi- J.</article-title>
          <string-name>
            <surname>Phillips</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Patel</surname>
          </string-name>
          ,
          <article-title>Eyecontact: scleral coil eye cal patients by means of decision trees, in: CEUR tracking for virtual reality</article-title>
          ,
          <year>2016</year>
          , pp.
          <fpage>184</fpage>
          -
          <lpage>191</lpage>
          . Workshop Proceedings, volume
          <volume>2768</volume>
          ,
          <year>2020</year>
          , p.
          <fpage>46</fpage>
          -
          <lpage>doi</lpage>
          :10.1145/2971763.2971771. 53. [26]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <article-title>Fatigue driving detection based on</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Bhatia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kaur</surname>
          </string-name>
          ,
          <article-title>Eye tracking based electrooculography: a review, EURASIP Journal on driver fatigue monitoring and warning system</article-title>
          ,
          <source>in: Image and Video Processing</source>
          <year>2021</year>
          (
          <year>2021</year>
          )
          <fpage>33</fpage>
          . India International Conference on Power Electron- [27]
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. E.</given-names>
            <surname>Tibermacine</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . Tibermacine, ics
          <year>2010</year>
          (
          <article-title>IICPE2010</article-title>
          ),
          <year>2011</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . doi:
          <volume>10</volume>
          .1109/ D. Chebana,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nahili</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Starczewscki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          , IICPE.
          <year>2011</year>
          .
          <volume>5728062</volume>
          .
          <article-title>Analyzing eeg patterns in young adults exposed to</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Alfarano</surname>
          </string-name>
          , G. De Magistris,
          <string-name>
            <given-names>L.</given-names>
            <surname>Mongelli</surname>
          </string-name>
          , S. Russo,
          <article-title>diferent acrophobia levels: a vr study, Frontiers J</article-title>
          .
          <string-name>
            <surname>Starczewski</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Napoli</surname>
          </string-name>
          , A novel convmixer in
          <source>Human Neuroscience</source>
          <volume>18</volume>
          (
          <year>2024</year>
          ). doi:
          <volume>10</volume>
          .
          <article-title>3389/ transformer based architecture for violent behav- fnhum</article-title>
          .
          <year>2024</year>
          .
          <volume>1348154</volume>
          . ior detection, in: Lecture Notes in Computer [28]
          <string-name>
            <given-names>N.</given-names>
            <surname>Boutarfaia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tibermacine</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. E.</surname>
          </string-name>
          <article-title>TiberScience (including subseries Lecture Notes in Ar- macine, Deep learning for eeg-based motor imagery tificial Intelligence and Lecture Notes in Bioin- classification: Towards enhanced human-machine interaction and assistive robotics, in: CEUR Work- gaze tracking by combining eye-and facial-gaze shop Proceedings</article-title>
          , volume
          <volume>3695</volume>
          ,
          <year>2023</year>
          , p.
          <fpage>68</fpage>
          -
          <lpage>74</lpage>
          . vectors,
          <source>The Journal of Supercomputing</source>
          <volume>73</volume>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>M. W.</given-names>
            <surname>Johns</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tucker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Chapman</surname>
          </string-name>
          , K. Crow-
          <volume>3038</volume>
          -3052. ley, N. Michael,
          <article-title>Monitoring eye</article-title>
          and eyelid move- [42]
          <string-name>
            <given-names>X.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <surname>Z. Zhang,</surname>
          </string-name>
          <article-title>Eye gaze trackments by infrared reflectance oculography to mea- ing using an rgbd camera: a comparison with a sure drowsiness in drivers</article-title>
          ,
          <source>Somnologie</source>
          <volume>11</volume>
          (
          <year>2007</year>
          )
          <article-title>rgb solution</article-title>
          ,
          <source>in: Proceedings of the 2014 ACM 234-242</source>
          . International Joint Conference on Pervasive and
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>D. Y.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. W.</given-names>
            <surname>Park</surname>
          </string-name>
          , S.
          <string-name>
            <surname>-M. Cho</surname>
          </string-name>
          , S. Han, Ubiquitous Computing: Adjunct Publication,
          <year>2014</year>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Yoon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Choo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Shim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-W.</given-names>
            <surname>Jeon</surname>
          </string-name>
          , et al., pp.
          <fpage>1113</fpage>
          -
          <lpage>1121</lpage>
          .
          <article-title>Use of eye tracking to improve the identification</article-title>
          of [43]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <article-title>Realtime and accurate 3d attention-deficit/hyperactivity disorder in children, eye gaze capture with dcnn-based iris</article-title>
          and
          <source>pupil Scientific Reports</source>
          <volume>13</volume>
          (
          <year>2023</year>
          )
          <article-title>14469</article-title>
          . segmentation, IEEE transactions on visualization
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>S.</given-names>
            <surname>Caldani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Acquaviva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moscoso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Peyre</surname>
          </string-name>
          , and computer graphics 27 (
          <year>2019</year>
          )
          <fpage>190</fpage>
          -
          <lpage>203</lpage>
          . R. Delorme,
          <string-name>
            <given-names>M. P.</given-names>
            <surname>Bucci</surname>
          </string-name>
          , Reading performance in [44]
          <string-name>
            <given-names>I. E.</given-names>
            <surname>Tibermacine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tibermacine</surname>
          </string-name>
          , W. Guettala,
          <article-title>children with adhd: An eye-tracking study</article-title>
          ,
          <string-name>
            <surname>Annals</surname>
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Napoli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Russo</surname>
          </string-name>
          ,
          <source>Enhancing sentiment analof Dyslexia</source>
          <volume>72</volume>
          (
          <year>2022</year>
          )
          <fpage>552</fpage>
          -
          <lpage>565</lpage>
          .
          <article-title>ysis on seed-iv dataset with vision transformers:</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>V.</given-names>
            <surname>Ponzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wajda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>A com- A comparative study</article-title>
          ,
          <source>in: ACM International parative study of machine learning approaches for Conference Proceeding Series</source>
          ,
          <year>2023</year>
          , p.
          <fpage>238</fpage>
          -
          <lpage>246</lpage>
          .
          <article-title>autism detection in children from imaging data</article-title>
          ,
          <source>in: doi:10.1145/3638985.3639024. CEUR Workshop Proceedings</source>
          , volume
          <volume>3398</volume>
          ,
          <year>2022</year>
          , [45]
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sugano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fritz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bulling</surname>
          </string-name>
          , Mpiigaze: p.
          <fpage>9</fpage>
          -
          <lpage>15</lpage>
          .
          <article-title>Real-world dataset and deep appearance-based gaze</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>B.</given-names>
            <surname>Nerušil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Polec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Škunda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kačur</surname>
          </string-name>
          ,
          <article-title>Eye tracking estimation, IEEE transactions on pattern analysis based dyslexia detection using a holistic approach</article-title>
          ,
          <source>and machine intelligence</source>
          <volume>41</volume>
          (
          <year>2017</year>
          )
          <fpage>162</fpage>
          -
          <lpage>175</lpage>
          . Scientific Reports
          <volume>11</volume>
          (
          <year>2021</year>
          )
          <fpage>15687</fpage>
          . [46]
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , S. Park,
          <string-name>
            <given-names>T.</given-names>
            <surname>Beeler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bradley</surname>
          </string-name>
          , S. Tang,
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <surname>D.</surname>
          </string-name>
          <article-title>Hansen, In the eye of the beholder: A sur-</article-title>
          <string-name>
            <surname>O. Hilliges</surname>
          </string-name>
          ,
          <article-title>Eth-xgaze: A large scale dataset for vey of models for eyes and gaze, IEEE Transactions gaze estimation under extreme head pose</article-title>
          and
          <source>gaze on Pattern Analysis &amp;amp; Machine Intelligence</source>
          <volume>32</volume>
          variation, in: Computer Vision-ECCV
          <year>2020</year>
          :
          <article-title>16th (</article-title>
          <year>2010</year>
          )
          <fpage>478</fpage>
          -
          <lpage>500</lpage>
          . doi:
          <volume>10</volume>
          .1109/TPAMI.
          <year>2009</year>
          .
          <volume>30</volume>
          . European Conference, Glasgow, UK,
          <year>August</year>
          23-28,
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>F.</given-names>
            <surname>Fiani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>An advanced solu- 2020</article-title>
          , Proceedings, Part V 16, Springer,
          <year>2020</year>
          , pp.
          <source>tion based on machine learning for remote emdr 365-381. therapy, Technologies</source>
          <volume>11</volume>
          (
          <year>2023</year>
          ). doi:
          <volume>10</volume>
          .3390/ [47]
          <string-name>
            <given-names>S.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. D.</given-names>
            <surname>Mello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Molchanov</surname>
          </string-name>
          , U. Iqbal,
          <year>technologies11060172</year>
          .
          <string-name>
            <given-names>O.</given-names>
            <surname>Hilliges</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kautz</surname>
          </string-name>
          ,
          <article-title>Few-shot adaptive gaze es-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>E.</given-names>
            <surname>Iacobelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <surname>C. Napoli,</surname>
          </string-name>
          <article-title>A machine learning timation, in: Proceedings of the IEEE/CVF interbased real-time application for engagement detec</article-title>
          - national
          <source>conference on computer vision</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>tion</fpage>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>9368</volume>
          -
          <fpage>9377</fpage>
          . 3695,
          <year>2023</year>
          , p.
          <fpage>75</fpage>
          -
          <lpage>84</lpage>
          . [48]
          <string-name>
            <given-names>K.</given-names>
            <surname>Krafka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Khosla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kellnhofer</surname>
          </string-name>
          , H. Kannan,
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>F.</given-names>
            <surname>Fiani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>A fully automatic visual S. Bhandarkar</article-title>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Matusik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Torralba</surname>
          </string-name>
          ,
          <article-title>Eye trackattention estimation support system for a safer driv- ing for everyone</article-title>
          ,
          <year>2016</year>
          . arXiv:
          <volume>1606</volume>
          .05814. ing experience, in: CEUR Workshop Proceedings, [49]
          <string-name>
            <given-names>C.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <source>Webcam-based eye movevolume 3695</source>
          ,
          <year>2023</year>
          , p.
          <fpage>40</fpage>
          -
          <lpage>50</lpage>
          .
          <article-title>ment analysis using cnn</article-title>
          ,
          <source>IEEE Access 5</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>E.</given-names>
            <surname>Iacobelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ponzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          , Eye-
          <volume>19581</volume>
          -19587.
          <article-title>tracking system with low-end hardware</article-title>
          : Devel- [50]
          <string-name>
            <surname>Y.-m. Cheung</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          <string-name>
            <surname>Peng</surname>
          </string-name>
          ,
          <article-title>Eye gaze tracking with opment and evaluation, Information (Switzerland) a web camera in a desktop environment</article-title>
          ,
          <source>IEEE</source>
          <volume>14</volume>
          (
          <year>2023</year>
          ).
          <source>doi:10.3390/info14120644. Transactions on Human-Machine Systems</source>
          <volume>45</volume>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>S.</given-names>
            <surname>Pepe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tedeschi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Brandizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          , L. Ioc-
          <volume>419</volume>
          -430. chi, C. Napoli, Human attention assessment us- [51]
          <string-name>
            <given-names>S.</given-names>
            <surname>Saxena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. K.</given-names>
            <surname>Fink</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. B.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <article-title>Deep ing a machine learning approach with gan-based learning models for webcam eye trackdata augmentation technique trained using a cus- ing in online experiments, Behavior Retom dataset</article-title>
          ,
          <source>OBM Neurobiology 6</source>
          (
          <year>2022</year>
          ).
          <source>doi:10. search Methods</source>
          <volume>56</volume>
          (
          <year>2024</year>
          )
          <fpage>3487</fpage>
          -
          <lpage>3503</lpage>
          . URL: 21926/obm.neurobiol.2204139. https://doi.org/10.3758/s13428-023-02190-6.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>F.</given-names>
            <surname>Fiani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ponzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <source>Keeping eyes on the doi:10.3758/s13428-023-02190-6</source>
          . road:
          <article-title>Understanding driver attention and its role in safe driving</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>3695</volume>
          ,
          <year>2023</year>
          , p.
          <fpage>85</fpage>
          -
          <lpage>95</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ko</surname>
          </string-name>
          , U. Jang, H. Han,
          <string-name>
            <given-names>E. C.</given-names>
            <surname>Lee</surname>
          </string-name>
          , 3d
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>