<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Impact of Trial-wise and Test Data Leakage on EEG-Based Emotion Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Peihong Lei</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mengyao Wu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wenjun Yi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hanlin Mo</string-name>
          <email>mohanlin@xidian.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Artificial Intelligence, Xidian University, 266 Xinglong Section of Xifeng Road</institution>
          ,
          <addr-line>Xi'an, Shaanxi 710126</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>Deep learning-based approaches have significantly advanced emotion recognition technology using electroencephalography (EEG) data. However, data leakage poses a major threat to model generalizability. This paper analyzes how two common leakage patterns, test data leakage (test-set-driven hyperparameter tuning) and trial-wise data leakage (where segments of the same trial are split between training and test sets), lead to overestimation of deep learning model performance. We systematically quantify the impact of these two leakage types, applying four data processing approaches to the DEAP dataset: normal setting, test data leakage, trial-wise data leakage, and combined test data and trial-wise leakage. Six representative deep learning models were trained and tested under each data processing condition, with all other factors held identical across models to control variables. Experimental results demonstrate that under the three leakage conditions, all six models significantly outperform the normal setting: the minimum improvement in valence classification accuracy reached 35.71%, while the minimum improvement in arousal classification accuracy reached 25.00%. Architectures based on convolutional neural networks (CNN) were most affected, while transformer-based models showed smaller but still significant impacts. Further, visualization of the average intermediate features across all EEG data belonging to each class reveals that data leakage induces significant alterations in brain topography patterns. The severity of performance inflation followed the order: combined leakage &gt; trial-wise data leakage &gt; test data leakage. In summary, our findings underscore the critical importance of implementing rigorous data partitioning protocols and leakage-aware experimental designs in both affective computing and neuroscience research. Only in this way can we ensure that artificial intelligence assists researchers in uncovering genuine scientific laws rather than leading them astray.</p>
      </abstract>
      <kwd-group>
        <kwd>Electroencephalography (EEG)</kwd>
        <kwd>Emotion Classification</kwd>
        <kwd>Trial-wise Leakage</kwd>
        <kwd>Test Data Leakage</kwd>
        <kwd>Inflated Performance</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Affective computing has become an increasingly important area in human-computer interaction,
cognitive science, and affective neuroscience. Accurate recognition of human emotions enables
intelligent systems to better understand and respond to user states, thereby improving user experience and
interaction efficiency. Common modalities for emotion recognition include facial expressions, voice,
and physiological signals such as electroencephalography (EEG). Among these, EEG-based emotion
recognition has unique advantages: it directly measures brain activity, is less susceptible to
deliberate masking by the user, and provides fine-grained temporal information. Consequently, EEG-based
emotion recognition has broad potential applications in mental health monitoring, adaptive learning,
affective gaming, and neurofeedback systems.</p>
      <p>
        Recent advances in EEG-based emotion recognition have been largely driven by deep learning
approaches. Methods can be broadly categorized based on network architectures. Convolutional neural
networks have been applied to capture spatial-temporal patterns in EEG signals [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]; graph neural
networks model the relationships between electrode channels [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]; generative adversarial networks
(GANs) have been used to augment EEG data [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]; and Vision Transformers (ViTs) have recently been
explored for their ability to model long-range dependencies [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ]. These models accept various forms
of input, including raw EEG time series, frequency-domain features, or two-dimensional image-like
transformations of EEG data.
      </p>
      <p>
        The evaluation of these models typically follows either subject-dependent or subject-independent
protocols. In subject-dependent settings, both the training and testing data are derived from the same
individuals, whereas in subject-independent settings, the testing set contains only data from individuals
unseen during training. Early studies primarily focused on subject-dependent evaluation, achieving
increasingly high performance. This success, however, has been partially misleading: recent analysis
suggests that many of the apparent improvements were inflated due to hidden data leakage issues, even
in supposedly standard evaluation protocols [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ].
      </p>
      <p>
        One common source of leakage in subject-dependent experiments arises from splitting individual trials
into multiple segments. While segment-level classification followed by trial-level aggregation (e.g., via
voting) is a common practice, segments from the same trial can inadvertently appear in both training and
testing sets. A second, often overlooked source of data leakage stems from hyperparameter selection
using the test set. Specifically, selecting the number of training epochs or other hyperparameters
based on test set performance introduces information from the test set into the training process.
Theoretically, such selection should be performed on a separate validation set; however, EEG emotion
datasets are often small, prompting researchers to maximize the training data by using the test set
for hyperparameter tuning. Tuomas et al. first highlighted this issue in the field of micro-expression
recognition, demonstrating that commonly reported model performances were overestimated when
proper experimental protocols were enforced [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Despite its prevalence, this type of leakage has not
been systematically analyzed in EEG-based emotion recognition studies.
      </p>
      <p>In this work, we make the following contributions:
• We systematically analyze two major sources of data leakage in EEG-based emotion
recognition—trial-wise leakage and test data leakage—highlighting how these issues can inflate
reported performance and potentially mislead neuroscience interpretations.
• Based on the DEAP dataset, we evaluate six widely used EEG emotion classification models under
four experimental conditions: normally trained, trial-wise data leakage training, test data leakage
training, and combined leakage training. We quantitatively demonstrate the extent to which
performance is overestimated.
• Through visualization of intermediate model features, we show that data leakage alters EEG
topographic patterns, which may lead to erroneous conclusions in neuroscience and affective
computing research.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>This section primarily reviews data leakage in the field of EEG-based emotion classification, with a
focus on two types of leakage: trial-wise leakage and test data leakage. Notably, while the former has
received attention recently, discussion and analysis of the latter remain scarce.</p>
      <sec id="sec-2-1">
        <title>2.1. Trial-Wise EEG Data Leakage</title>
        <p>In EEG analysis tasks (not limited to affective computing), particularly under subject-dependent
conditions, it is common to segment continuous EEG signals into shorter epochs for model training and
evaluation. Subject-dependent means that the model uses data from the same subjects during both
training and testing, i.e., each subject’s data is utilized for model learning as well as performance
validation. Segmenting long EEG trials into multiple short segments (segmentation or windowing)
can effectively increase the number of training samples, enabling deep learning models to learn more
robust representations of brain activity patterns and facilitating adaptation to commonly used network
architectures [11, 12, 13, 14].</p>
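The windowing described above can be sketched as follows. This is our own minimal illustration, not the authors' code; the function name and the DEAP-like shapes (32 channels at 128 Hz) are assumptions:

```python
import numpy as np

def segment_trial(trial, win_len):
    """Split one continuous EEG trial (channels x time) into
    non-overlapping windows of win_len samples each."""
    n_ch, n_t = trial.shape
    n_win = n_t // win_len                 # drop any trailing remainder
    trimmed = trial[:, :n_win * win_len]
    # reshape the time axis into n_win contiguous blocks of win_len,
    # then move the window axis first: (n_win, channels, win_len)
    return trimmed.reshape(n_ch, n_win, win_len).transpose(1, 0, 2)

# e.g. a 60 s trial at 128 Hz with 32 channels, cut into 1 s windows
trial = np.random.randn(32, 60 * 128)
segments = segment_trial(trial, 128)
print(segments.shape)  # (60, 32, 128)
```

Each window keeps all channels together, so the segment count per trial is simply trial length divided by window length.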
        <p>
          However, special care must be taken during data partitioning: all segments from the same trial must
be assigned to the same data split (training, testing, or validation). If segments from the same trial
or subject are erroneously distributed across multiple splits, data leakage occurs. In fact, segments
from the same subject are far more similar to each other than to segments from different subjects [15].
Such leakage can cause the model to learn subject-specific brain activity patterns rather than abstract
representations that generalize to new subjects. As a result, classification accuracy on the test set may
appear significantly high, while the model’s generalization to new subjects is severely compromised,
leading to an overestimation of performance [16, 17]. Recently, some studies have started to address this
issue and systematically analyzed the impact of data leakage on model performance and generalizability
[
          <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Test EEG Data Leakage</title>
        <p>In addition to trial-wise data leakage, another type of data leakage that has long been overlooked in the
field of affective computing is test data leakage. A typical scenario occurs when test data are used to
select hyperparameters during model training, such as determining the number of training epochs or
choosing the optimal model. This effectively allows information from the test set to influence training,
leading to overly optimistic performance estimates.</p>
        <p>
          In the micro-expression recognition field, recent studies have systematically analyzed this issue.
Kapoor and Narayanan conducted a meta-review of data leakage and reproducibility in machine-learning-based
science, finding that more than 17 research fields and 329 articles were affected by data leakage
or similar problems [18]. Specifically, for micro-expression recognition, several articles from 2019–2022
were potentially impacted by test data leakage [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. The most common case was using test data to
determine the number of training epochs—that is, selecting the optimal model during training using the
test set. Analyses revealed that some methods reported F1-scores close to 80 in the original publications,
but after correcting for data leakage, the actual performance dropped to around 50. Other
issues included feature extraction or preprocessing using test data, which similarly constitute data
leakage, causing substantial positive bias. The seriousness of such leakage lies in its potential to mislead
researchers regarding the true capabilities of the models.
        </p>
        <p>
A similar problem exists in EEG-based affective computing, but it has not yet been systematically
identified or analyzed. For example, in 2023, Ding et al. proposed the TSception model [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], a deep
learning model for EEG-based emotion classification, and provided detailed experimental protocols and
source code on GitHub (https://github.com/yi-ding-cs/TSception). This work has attracted widespread
attention in the field, with nearly 300 citations. However, we found that during model training, the
authors used the test set, rather than a validation set, to select the optimal training model—i.e., to
determine the stopping epoch. This constitutes a typical case of test data leakage, which may lead to
overestimation of model performance.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>In this study, we consider an EEG-based emotion classification dataset consisting of N subjects, denoted
as:</p>
      <p>S = {S_1, S_2, . . . , S_N},
where S_i represents all EEG recordings from subject i. Each subject performs M trials, and each trial
corresponds to a specific emotional state:</p>
      <p>S_i = {T_{i,1}, T_{i,2}, . . . , T_{i,M}}
Each trial T_{i,j} is associated with an emotion label y_{i,j}, which can be represented either as one of K
discrete emotion classes (e.g., K = 6 basic emotions) or as a pair (v, a) in the valence–arousal (V–A)
space. To analyze data leakage, four data partitioning and training configurations are
designed, including normal setting, test data leakage, trial-wise leakage, and combined data leakage.</p>
      <p>Under subject-dependent conditions, the model is both trained and tested on data from the same
subject. That is, for each subject i, the data are divided into disjoint subsets:</p>
      <p>S_i = D_i^train ∪ D_i^val ∪ D_i^test,  with  D_i^train ∩ D_i^val = ∅,  D_i^train ∩ D_i^test = ∅,  D_i^val ∩ D_i^test = ∅</p>
      <p>This setting evaluates the model’s ability to learn individual-specific patterns of emotional responses,
which is a standard approach in EEG-based affective computing. Compared with the subject-independent
setting, it is relatively easier to achieve high performance, since the training and testing data come from
the same individual and thus share similar physiological and neural characteristics.</p>
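A leakage-free partition assigns whole trials, never individual segments, to each subset. A minimal sketch, assuming each segment carries a trial identifier (the helper and its name are ours; the 6:2:2 allocation follows the setup described in this section):

```python
import random

def split_by_trial(trial_ids, ratios=(0.6, 0.2, 0.2), seed=0):
    """Assign whole trials to train/val/test so that every segment
    of a trial lands in the same subset (no trial-wise leakage)."""
    trials = sorted(set(trial_ids))
    random.Random(seed).shuffle(trials)
    n = len(trials)
    n_tr = int(ratios[0] * n)
    n_va = int(ratios[1] * n)
    train = set(trials[:n_tr])
    val = set(trials[n_tr:n_tr + n_va])
    # per-segment subset membership, decided only by trial id
    return ['train' if t in train else 'val' if t in val else 'test'
            for t in trial_ids]

# 40 trials with 60 segments each (DEAP-style numbers)
ids = [t for t in range(40) for _ in range(60)]
split = split_by_trial(ids)
# every segment of a given trial shares one subset
assert all(len({s for i, s in zip(ids, split) if i == t}) == 1
           for t in range(40))
```

Shuffling segment indices directly, instead of trial identifiers, is exactly what produces the trial-wise leakage configuration analyzed below.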
      <p>Each trial T_{i,j} typically contains a long continuous EEG time series. To increase the number of
available samples and adapt to deep learning architectures, the trial is divided into L non-overlapping
segments:</p>
      <p>T_{i,j} = {x_{i,j,1}, x_{i,j,2}, . . . , x_{i,j,L}},
where x_{i,j,l} ∈ R^{C×P} represents a segment with C EEG channels and P time points per segment. This
segmentation process effectively increases the sample size,</p>
      <p>|D_seg| = N × M × L,
and enables models to learn more localized and stable representations of emotional brain activity.</p>
      <p>However, improper handling of segmented data during dataset partitioning can lead to data leakage,
as discussed in Section 2. To systematically investigate the impact of data leakage, we design four
experimental configurations based on different data partitioning and training strategies (as shown in
Figure 1). Let D_train, D_val, and D_test denote the training, validation, and testing sets, respectively.</p>
      <p>(1) Normal Setting. In the normal (non-leaking) condition, the segmented data are divided in a ratio of
6 : 2 : 2 for training, validation, and testing:</p>
      <p>|D_train| : |D_val| : |D_test| = 6 : 2 : 2</p>
      <p>All segments originating from the same trial are assigned to the same subset:</p>
      <p>x_{i,j,l} ∈ D_a ⇒ x_{i,j,l′} ∉ D_b,  a, b ∈ {train, val, test},  a ≠ b</p>
      <p>During training, the model f is optimized by minimizing the loss ℒ on D_train, and the best-performing
model is selected based on validation accuracy:</p>
      <p>f* = arg min_f ℒ(f, D_val)</p>
      <p>The final performance is then evaluated on D_test.</p>
      <p>(2) Test Data Leakage. In this setup, the data are correctly partitioned as in the normal condition.
However, during training, the test set is mistakenly used for model validation, meaning that the best
model is selected based on test performance:</p>
      <p>f* = arg min_f ℒ(f, D_test),</p>
      <p>resulting in overly optimistic evaluation results.</p>
      <p>(3) Trial-wise Data Leakage. This configuration follows the same data split ratio (6 : 2 : 2) but violates
the constraint that segments from the same trial remain in a single subset. In other words, for some
trials:</p>
      <p>∃ i, j :  x_{i,j,l} ∈ D_train,  x_{i,j,l′} ∈ D_test</p>
      <p>This introduces trial-wise data leakage, allowing information from a trial to appear in both the training
and testing phases, leading to inflated test accuracy.</p>
      <p>(4) Combined Trial-wise and Test Data Leakage. This represents the most severe case, where both
leakage types occur simultaneously: segments from the same trial are distributed across different
subsets, and the test set is also used to determine the optimal model:</p>
      <p>∃ i, j :  x_{i,j,l} ∈ D_train,  x_{i,j,l′} ∈ D_test,  and  f* = arg min_f ℒ(f, D_test)</p>
      <p>This configuration leads to severe overestimation of model performance and minimal generalizability
to unseen subjects.</p>
      <p>The four configurations defined above provide a controlled experimental framework for evaluating
how different types of data leakage (trial-wise and test data leakage) affect EEG-based emotion
classification. Subsequent experiments quantify the impact of each leakage scenario on model performance
and demonstrate how improper dataset handling can mislead conclusions about model effectiveness.</p>
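The difference between the normal and test-leakage selection rules reduces to which subset drives checkpoint selection. A toy sketch (the function, its name, and the loss curves are hypothetical; only the selection criterion changes):

```python
def select_best_epoch(histories, leak_test=False):
    """Pick the checkpoint epoch with the lowest validation loss
    (normal setting) or, in the leaky setting, the lowest *test*
    loss -- the mistake quantified in this paper."""
    key = 'test_loss' if leak_test else 'val_loss'
    return min(range(len(histories)), key=lambda e: histories[e][key])

# toy per-epoch loss curves
hist = [
    {'val_loss': 0.9, 'test_loss': 0.8},
    {'val_loss': 0.7, 'test_loss': 0.9},
    {'val_loss': 0.8, 'test_loss': 0.6},
]
print(select_best_epoch(hist))                  # 1 (chosen on val)
print(select_best_epoch(hist, leak_test=True))  # 2 (chosen on test)
```

In the leaky variant the reported test score is, by construction, the best test score seen during training, which is why it is biased upward.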
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <sec id="sec-4-1">
        <title>4.1. Experiment Setting</title>
        <p>We conduct experiments based on the widely used EEG emotion recognition benchmark dataset—the
DEAP dataset [19]. The dataset contains EEG recordings from 32 participants, with each participant
watching 40 one-minute music videos during the experiment to elicit different emotional states. The EEG
signals were collected using 32 electrode channels at a sampling rate of 512 Hz. Following the standard
preprocessing pipeline provided by the dataset, we downsampled the signals to 128 Hz and applied
a 4–45 Hz band-pass filter to remove low-frequency drifts and high-frequency noise. Additionally,
the first 3 seconds of each trial were removed to avoid unstable responses from participants at the
beginning of the video. Each trial has continuous emotion labels, including the dimensions of Valence
and Arousal, with values ranging from 1 to 9. To facilitate classification tasks, these continuous labels
were discretized into three levels: low (1–3), medium (4–6), and high (7–9).</p>
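A minimal sketch of this label discretization (the function name is ours; how a continuous rating falling between the stated ranges, e.g. 3.5, is binned is our assumption, since the text only gives the 1–3, 4–6, and 7–9 levels):

```python
def discretize(rating):
    """Map a continuous 1-9 DEAP rating to low/medium/high,
    following the 1-3 / 4-6 / 7-9 levels stated in the text.
    Boundary handling (<= 3, <= 6) is an illustrative choice."""
    if rating <= 3:
        return 'low'
    if rating <= 6:
        return 'medium'
    return 'high'

print([discretize(r) for r in (1.0, 4.5, 6.0, 7.2, 9.0)])
# ['low', 'medium', 'medium', 'high', 'high']
```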
        <p>
          We segmented each trial into non-overlapping 1-second segments. Based on the segmented data, we
designed four data partitioning and model training approaches as shown in Section 3, including normal
setting, test data leakage, trial-wise data leakage, and combined trial-wise and test data leakage. These
are used to analyze the differences in model performance under various data leakage conditions. We
selected six models for our experiments, including EEGNet[
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], FCBNet[20], TSception[
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], ATCNet[
          <xref ref-type="bibr" rid="ref7">7</xref>
          ],
VanillaTransformer, and ArjunViT[
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. The first three models (EEGNet, FCBNet, TSception) are based
on convolutional neural network (CNN) architectures, while the latter three are built on transformer
architectures. All models were implemented using the PyTorch and torcheeg frameworks. To isolate the
impact of data leakage on model performance while controlling for other variables, we maintained
identical training, validation, and testing procedures across all models. Specifically, we employed the
Adam optimizer with a learning rate of 0.001, a batch size of 64, and 100 training epochs throughout
the training process. All experiments were conducted on an NVIDIA 4090 GPU.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Quantitative and Visualization Analysis</title>
        <p>The experimental results presented in Tables 1–3 reveal several critical observations regarding the
impact of data leakage on emotion classification performance. All six models exhibit substantial
performance improvements under the various data leakage conditions compared to the normal setting.
For valence classification (Table 1), the average ACC increases from 0.42 in the normal setting to 0.69
under combined leakage, a 64.3% relative improvement. Similarly, for arousal classification (Table 2),
the average ACC rises from 0.28 to 0.68, a 142.9% relative improvement.</p>
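The relative improvements quoted here follow directly from the reported average accuracies; as a quick arithmetic check (the helper name is ours):

```python
def rel_improvement(normal, leaked):
    """Relative accuracy inflation, in percent."""
    return (leaked - normal) / normal * 100

print(round(rel_improvement(0.42, 0.69), 1))  # 64.3  (valence)
print(round(rel_improvement(0.28, 0.68), 1))  # 142.9 (arousal)
```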
        <p>The severity of performance inflation follows a clear hierarchy: combined leakage &gt; trial-wise data
leakage &gt; test data leakage. This pattern is consistent across both emotion dimensions and all evaluation
metrics, suggesting that trial-wise leakage has a more pronounced effect than test data contamination
alone. However, the impact of test data leakage alone should not be underestimated. For instance, in
valence classification, test data leakage caused an average ACC increase from 0.42 (normal setting)
to 0.67, while for arousal classification, ACC rose from 0.28 to 0.65. These substantial increments
demonstrate that even without trial-wise leakage, test data contamination alone can significantly inflate
performance metrics, potentially leading to overly optimistic evaluations and misleading conclusions
about model effectiveness.</p>
        <p>Traditional CNN-based architectures (EEGNet, ATCNet) show the most dramatic performance boosts
under leakage conditions. For instance, EEGNet’s ACC for arousal classification jumps from 0.30 to
0.90 under combined leakage. In contrast, transformer-based models (VanillaTransformer, ArjunViT)
demonstrate relatively more robustness, though they still exhibit significant inflation. Arousal
classification shows greater susceptibility to data leakage effects compared to valence classification. The
performance gap between normal and leaked conditions is more substantial for arousal, particularly in
F1-score metrics where arousal shows a 226% improvement versus 111% for valence.</p>
      <p>The observed performance differences underscore the necessity of rigorous experimental design
in EEG-based emotion recognition. The massive performance gaps (e.g., ACC differences up to 0.62
points) highlight how improper data handling can lead to severely overoptimistic results, potentially
misleading research conclusions and practical applications.</p>
        <p>For the TSception model trained on the valence classification task, we computed the average
intermediate features of all EEG data belonging to each class and visualized these averaged features, as
shown in Figure 2. The analysis reveals that data leakage induces significant alterations in the brain
topography patterns. Notably, trial-wise data leakage produces more pronounced changes compared to
test data leakage, which aligns consistently with the quantitative results presented in Tables 1–3.</p>
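The class-averaging step described here can be sketched as follows. This is a simplified illustration with toy arrays; how the intermediate activations are actually extracted from TSception (e.g., via forward hooks in PyTorch) is outside this sketch:

```python
import numpy as np

def class_average_features(features, labels):
    """Average intermediate feature vectors over all samples of each
    class -- the per-class quantity visualized as a brain topography."""
    return {c: features[labels == c].mean(axis=0)
            for c in np.unique(labels)}

# toy features: 6 samples, 32 "channel" activations each
feats = np.arange(6 * 32, dtype=float).reshape(6, 32)
labels = np.array([0, 0, 1, 1, 2, 2])
avg = class_average_features(feats, labels)
print(avg[0].shape)  # (32,)
```

Comparing these per-class averages across the four training conditions is what reveals the leakage-induced topography shifts.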
        <p>Previous studies have frequently employed brain topography analysis to investigate the relationships
between brain regions and different emotional states, thereby deriving insights that inform cognitive
science research. However, our findings demonstrate that when data leakage occurs in experiments, the
resulting conclusions regarding brain region correlations may be substantially flawed and unreliable.
This underscores the critical importance of rigorous data partitioning protocols in neuroscientific studies
involving EEG-based emotion recognition.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this paper, we systematically investigate the critical issue of data leakage in EEG-based emotion
recognition research. We identify and analyze two predominant sources of data leakage—test data
leakage and trial-wise data leakage—demonstrating how these issues can substantially inflate model
performance metrics. Through comprehensive experiments on the DEAP dataset using six established
EEG classification models, we quantitatively validate the performance overestimation across four distinct
experimental conditions, revealing the significant impact of different leakage scenarios. Further, our
visualization analysis of intermediate model features provides compelling evidence that data leakage
fundamentally distorts EEG topographic patterns, thereby challenging the validity of brain region
correlations derived from contaminated experimental setups. These findings collectively underscore the
critical need for rigorous data partitioning protocols and leakage-aware experimental designs in both
affective computing and neuroscience research communities to ensure the reliability and interpretability
of future findings.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Acknowledgments</title>
      <p>This work has partly been funded by the Academy of Finland for Academy Professor project EmotionAI
(Grant No.336116), the National Key R&amp;D Program of China (No.2017YFB1002703), and the National
Natural Science Foundation of China (Grant No.60873164, 61227802 and 61379082).</p>
    </sec>
    <sec id="sec-7">
      <title>7. Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT and DeepSeek for grammar and spelling
checks and for paraphrasing and rewording. After using these tools and services, the authors reviewed
and edited the content as needed and take full responsibility for the publication’s content.</p>
      <p>[11] J. Riascos, M. Molinas, F. Lotte, A comparative study on the impacts of data leakage during
feature selection using the CIC-IoT 2023 intrusion detection dataset, Proceedings of the European
Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (2024).
[12] S. Lemm, B. Blankertz, T. Dickhaus, K.-R. Müller, Introduction to machine learning for brain
imaging, NeuroImage 56 (2011) 387–399.
[13] H. Chen, J. Li, H. He, J. Zhu, S. Sun, X. Li, B. Hu, Toward the construction of affective brain-computer
interface: A systematic review, ACM Computing Surveys 57 (2025) 1–56.
[14] M. de Bardeci, C. T. Ip, S. Olbrich, Deep learning applied to electroencephalogram data in mental
disorders: A systematic review, Biological Psychology 162 (2021) 108117.
[15] M. Demuru, M. Fraschini, EEG fingerprinting: Subject-specific signature based on the aperiodic
component of power spectrum, Computers in Biology and Medicine 120 (2020) 103748.
[16] Z. Zhang, J. M. Fort, G. Mateu, Mini review: Challenges in EEG emotion recognition, Frontiers in
Psychology 14 (2024) 1289816.
[17] G. Ivucic, S. Pahuja, F. Putze, S. Cai, H. Li, T. Schultz, The impact of cross-validation schemes for
EEG-based auditory attention detection with deep neural networks, The 46th Annual International
Conference of the IEEE Engineering in Medicine &amp; Biology Society (2024).
[18] S. Kapoor, A. Narayanan, Leakage and the reproducibility crisis in machine-learning-based science,
Patterns 4 (2023) 100804.
[19] S. Koelstra, C. Muhl, M. Soleymani, J.-S. Lee, A. Yazdani, T. Ebrahimi, DEAP: A database for emotion
analysis using physiological signals, IEEE Transactions on Affective Computing 3 (2012) 18–31.
[20] R. Mane, E. Chew, K. Chua, K. K. Ang, N. Robinson, A. Vinod, S.-W. Lee, C. Guan, FBCNet: An
efficient multi-view convolutional neural network for brain-computer interface (2021).
https://arxiv.org/abs/2104.01233.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>V. J.</given-names>
            <surname>Lawhern</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Solon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. R.</given-names>
            <surname>Waytowich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Gordon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. P.</given-names>
            <surname>Hung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Lance</surname>
          </string-name>
          ,
          <article-title>Eegnet: a compact convolutional neural network for eeg-based brain-computer interfaces</article-title>
          ,
          <source>Journal of Neural Engineering</source>
          <volume>15</volume>
          (
          <year>2018</year>
          )
          <fpage>056013</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Robinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guan</surname>
          </string-name>
          ,
          <article-title>Tsception: capturing temporal dynamics and spatial asymmetry from eeg for emotion recognition</article-title>
          ,
          <source>IEEE Transactions on Affective Computing</source>
          <volume>14</volume>
          (
          <year>2023</year>
          )
          <fpage>2238</fpage>
          -
          <lpage>2250</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <article-title>Eeg emotion recognition using dynamical graph convolutional neural networks</article-title>
          ,
          <source>IEEE Transactions on Affective Computing</source>
          <volume>11</volume>
          (
          <year>2020</year>
          )
          <fpage>532</fpage>
          -
          <lpage>541</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Miao</surname>
          </string-name>
          ,
          <article-title>Eeg-based emotion recognition using regularized graph neural networks</article-title>
          ,
          <source>IEEE Transactions on Affective Computing</source>
          <volume>13</volume>
          (
          <year>2022</year>
          )
          <fpage>1290</fpage>
          -
          <lpage>1301</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Eegfusenet: hybrid unsupervised deep feature characterization and fusion for high-dimensional eeg with an application to emotion recognition</article-title>
          ,
          <source>IEEE Transactions on Neural Systems and Rehabilitation Engineering</source>
          <volume>29</volume>
          (
          <year>2021</year>
          )
          <fpage>1913</fpage>
          -
          <lpage>1925</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Arjun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Rajpoot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Panicker</surname>
          </string-name>
          ,
          <article-title>Introducing attention mechanism for eeg signals: emotion recognition with vision transformers</article-title>
          ,
          <source>The 43rd Annual International Conference of the IEEE Engineering in Medicine &amp; Biology Society</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>H.</given-names>
            <surname>Altaheri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Muhammad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Alsulaiman</surname>
          </string-name>
          ,
          <article-title>Physics-informed attention temporal convolutional network for eeg-based motor imagery classification</article-title>
          ,
          <source>IEEE Transactions on Industrial Informatics</source>
          <volume>19</volume>
          (
          <year>2023</year>
          )
          <fpage>2249</fpage>
          -
          <lpage>2258</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>G.</given-names>
            <surname>Brookshire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kasper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. M.</given-names>
            <surname>Blauch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. C.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Glatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Merrill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gerrol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. J.</given-names>
            <surname>Yoder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Quirk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lucero</surname>
          </string-name>
          ,
          <article-title>Data leakage in deep learning studies of translational eeg</article-title>
          ,
          <source>Frontiers in Neuroscience</source>
          <volume>18</volume>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>N. N.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Sweet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Harvey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Knapp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Krusienski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. E.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <article-title>The role of review process failures in affective state estimation: an empirical investigation of deap dataset</article-title>
          (
          <year>2021</year>
          ). https://www.arxiv.org/abs/2508.02417.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T.</given-names>
            <surname>Varanka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <article-title>Data leakage and evaluation issues in micro-expression analysis</article-title>
          ,
          <source>IEEE Transactions on Affective Computing</source>
          <volume>15</volume>
          (
          <year>2024</year>
          )
          <fpage>186</fpage>
          -
          <lpage>197</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>