CNN-LSTM Based Stress Recognition Using Wearables
Ritu Tanwar1,∗,†, Orchid Chetia Phukan2,†, Ghanapriya Singh1 and Sanju Tiwari3,†
1 Department of Electronics Engineering, National Institute of Technology, Uttarakhand, India
2 P.E.S University, Bengaluru, Karnataka, India
3 Universidade Autonoma de Tamaulipas, Mexico


Abstract
Mental health conditions such as depression and anxiety, as well as heart-related disorders, are a
major concern in today’s life. As a result, it is now crucial to monitor stress levels. Here, we propose
a hybrid deep learning model, a convolutional neural network-long short-term memory (CNN-LSTM)
network, that can recognize stress. Electrocardiogram (ECG), electromyography (EMG), body temperature,
electrodermal activity (EDA), and respiration signals vary when a human being is exposed to stress.
These physiological signals can be measured using wearable devices, which makes it possible to
recognize stress from wearables-based physiological signals. For stress detection, the CNN-LSTM model
was implemented for feature learning and three-class stress classification (baseline, stress,
amusement). The confusion matrix and accuracy were used to evaluate the model performance. The
proposed model obtained an accuracy of 90.20 % for stress classification, which is higher than that
reported in previous studies. Consequently, the proposed model can help people manage stress in office
working environments, in driving conditions, or in everyday life. Furthermore, integrating the proposed
model with other mechanisms such as attention and explainability may make it more accurate and more
useful for stress detection and healthcare systems development.

Keywords
Hybrid Deep Learning, Stress Recognition, Wearables, CNN-LSTM.


DLQ-2022: International Workshop on Deep Learning for Question Answering, Co-located with the KGSWC-2022,
November 21-23, 2022, Madrid, Spain.
∗ Corresponding author.
† These authors contributed equally.
Email: ritu.tanwar2012@gmail.com (R. Tanwar); orchidchetiaphukan1@gmail.com (O. C. Phukan);
ghanapriya@nitkkr.ac.in (G. Singh); tiwarisanju18@ieee.org (S. Tiwari)
ORCID: 0000-0003-0110-4197 (R. Tanwar)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073, http://ceur-ws.org


1. Introduction
The mental states of a human being, such as stress, are reflected in physiological signals [1].
Wearable sensors such as wrist-worn or chest-worn devices can be used to measure physiological
signal variations [2]. Many researchers have studied the correlation between physiological
signals and the stress levels of a human being [3], [4], [5]. Stress levels can be recognized by
observing patterns in the physiological signals. They are observed to be correlated with
biomedical signals such as movement, respiration, and body temperature [6], [7]. Therefore,
stress levels need to be detected for health monitoring of people affected by conditions such as
depression, anxiety, and stress.
   Previously, measuring physiological signals involved cumbersome procedures that required
wearing very uncomfortable body sensors and was also expensive. Advancements in wearable
technologies have since made it easy and comfortable to measure physiological signal variations
[8]. Wearable devices help monitor the human mental state and are also useful for other
purposes and applications [9], [10], [11]. In this study, several human biomedical signals are
used to recognize stress levels. A deep learning approach based on a CNN-LSTM model was
implemented to detect stress levels. The main contributions of this work are:
   1. Multi-physiological signals (ECG, EMG, body temperature, EDA, and respiration) are
      considered for stress monitoring,
   2. Wearable devices are used for collecting physiological signals in the dataset, and
   3. A hybrid CNN-LSTM deep learning approach is proposed to classify three-class stress
      levels (baseline, stress, amusement).
   The proposed methodology passes the information through different layers for pre-processing,
feature learning, and classification. The rest of the paper is structured in four sections:
Section 2 presents related work in the stress recognition domain using machine learning, deep
neural networks, and wearables; Section 3 describes the dataset and methodology used in this
study; Section 4 details the results of the proposed approach; and Section 5 provides the
conclusion.


2. Related work
Wearables-based stress recognition uses physiological signals such as EMG, EDA, and ECG to
determine the stress, neutral, and amusement levels of human beings. This approach exploits the
potential of machine learning and deep neural networks in healthcare [12, 13]. According to a
survey [14], wearables or sensors do not interfere with daily life activities and help monitor
stress levels quickly, whereas using uncomfortable sensors in laboratories to check stress levels
can be frustrating. This has led researchers to innovate in computational stress recognition by
optimizing sensor modalities and using machine learning or deep learning approaches.
   Many successful models based on machine learning and deep neural network methods such as
CNNs have been introduced for stress recognition in recent years. For stress detection, a study
presented a publicly available dataset: wearable stress and affect detection (WESAD). A
chest-worn sensor, the RespiBAN Professional, and a wrist-worn sensor, the Empatica E4, were
used to collect physiological signals such as three-axis accelerometer, EDA, ECG, EMG,
temperature, and respiration. Fifteen participants were chosen to collect the data and were
exposed to stress conditions. Three-class classification (stress, amusement, and baseline) was
evaluated using machine learning methods (e.g., k-Nearest Neighbor (kNN), Decision Tree (DT),
and Random Forest (RF)) [12].
   Montesinos et al. [15] detected acute stress using physiological signals from the wrist-based
Empatica E4 and chest-based Shimmer3 ECG wearable sensors, applying machine learning
algorithms such as kNN, DT, and RF, and reported an accuracy of 84.13 %. Can et al. [16] also
introduced an automatic stress recognition system using Samsung Gear S family and Empatica
E4 wearables. They used machine learning methods, e.g., kNN, linear regression (LR), RF,
support vector machine (SVM), and multilayer perceptron, to recognize stress in 21 participants.
Cosoli et al. [17] also applied machine learning techniques to the WESAD dataset for assessing
human reactions with LR and SVM approaches and reported accuracies of 75 % and 72.62 %.
   A stress recognition model based on a deep learning approach, reported in [18], achieved
78.8 % accuracy. It applied a CNN model to ECG and EDA physiological signals from stress
datasets. Another study [21] proposed a stress recognition model using the WESAD dataset and
implemented machine learning and deep neural networks to recognize the three-class stress
levels. The authors implemented DT, RF, kNN, SVM, linear discriminant analysis (LDA), and
artificial neural network (ANN) [19, 20] classifiers for stress level classification, with accuracy
rates of 68.16, 75.95, 74.71, 81.65, 74.82, and 84.32 %, respectively.
   A recent study proposed a hybrid CNN deep neural network method for stress recognition
with the WESAD dataset and reported 75.21 % accuracy for three-class stress classification. The
authors compared the results with conventional machine learning approaches and found lower
accuracy for the latter [22]. Other researchers attempted to detect stress on the StudentLife
dataset and implemented LSTM, CNN, and CNN-LSTM deep learning algorithms [23]. They
found accuracy rates of 62.83 % with LSTM, 60.43 % with CNN, and 60 % with CNN-LSTM [24].
These findings suggest that wearables data and stress are related. The outcomes of stress
recognition models vary depending on the wearables used, application areas, available stress
datasets, and methodological approach. Researchers use different methods to recognize stress
with wearable sensor data.


3. Materials and methods
This section describes the dataset used in this study, the pre-processing of the data, and the
methodology implemented to recognize stress from wearables.

3.1. Dataset used
WESAD, a publicly available stress dataset introduced by Schmidt et al. [12], was used to detect
stress levels. This dataset involves 15 participants and consists of multivariate time-series data.
The chest-worn RespiBAN Professional and the wrist-worn Empatica E4 wearables were used to
collect the biomedical signal data. The chest-worn device collects ECG, EDA, EMG, respiration,
body temperature, and three-axis accelerometer signals from the participants at a sampling
frequency of 700 Hz. The wrist-worn wearable collects BVP, EDA, body temperature, and
accelerometer signals at sampling frequencies of 64 Hz, 4 Hz, 4 Hz, and 32 Hz, respectively.
This dataset aims at inducing three mental states: baseline, stress, and amusement [12].
Schmidt et al. [12] observed that chest-worn devices provided the best results. Thus, only
chest-worn device signals were considered in this study. The proposed architecture includes
wearable data selection, pre-processing, modalities fusion, feature extraction, and classification
(Figure 1).

Figure 1: The proposed CNN-LSTM model for wearables-based stress detection. Chest-worn wearable
signals (ECG, EMG, body temperature, EDA, and respiration) are the input, and stress, baseline, and
amusement are the output.

3.2. Pre-processing
For data pre-processing, the chest-worn signals sampled at 700 Hz were selected. The different
modalities (ECG, EMG, body temperature, EDA, and respiration) were pre-processed to make
them ready to feed into the model. A windowing technique was applied in which the signals
were segmented into 5-second windows with a 2-second window shift. After pre-processing,
the data of each participant were concatenated to construct the training and test data for
stress recognition.
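The windowing step can be illustrated with a minimal sketch. It assumes a single-channel signal held in a Python list and drops any incomplete window at the end of the recording; the function name, the 20-second placeholder recording, and the handling of the trailing samples are illustrative assumptions, not details taken from the study.

```python
def sliding_windows(signal, fs_hz=700, window_s=5, shift_s=2):
    """Split a 1-D sequence into fixed-length, overlapping windows."""
    window_len = fs_hz * window_s   # 3500 samples per 5-second window at 700 Hz
    shift_len = fs_hz * shift_s     # 1400 samples between consecutive window starts
    windows = []
    start = 0
    # Assumption: a trailing segment shorter than one window is discarded.
    while start + window_len <= len(signal):
        windows.append(signal[start:start + window_len])
        start += shift_len
    return windows


# Hypothetical 20-second recording of one channel (values are placeholders).
recording = list(range(20 * 700))
segments = sliding_windows(recording)
print(len(segments), len(segments[0]))
```

With a 2-second shift and a 5-second window, consecutive windows overlap by 3 seconds, so each sample (away from the edges) contributes to more than one training segment.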

3.3. Hybrid deep learning approach: Convolutional neural network-long short-term memory (CNN-LSTM)
Figure 1 shows the hybrid CNN-LSTM model used in this work. The model fuses the chest-worn
device modalities to classify the baseline, stress, and amusement mental states. The
configuration details of the layers used in the proposed model are given in Table 1. The
CNN-LSTM model has three 2D convolution layers, two max-pooling layers, one batch
normalization layer, one flatten layer, two LSTM layers, two dense layers, and one dropout
layer. This configuration was selected through trial and error as the one with the best accuracy
rate. The convolution layers extract the important features with their sliding filters. The
rectified linear unit (ReLU) was selected as the activation function, enhancing the model’s
convergence speed and robustness. Each max-pooling layer, with a pool size of (2,2), halves the
feature-map dimensions to reduce the computational complexity. The batch normalization
layer normalizes the data and speeds up training and feature learning. After normalization, a
flatten layer creates a one-dimensional feature vector, which is given as input to the two LSTM
layers.
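The output shapes listed in Table 1 can be traced with a short sketch. It assumes that the 'same'-padded convolutions leave the two spatial dimensions unchanged, that max pooling uses a stride equal to the pool size with no padding (so odd sizes are floored), and that flattening keeps the first axis as a single time step. The helper names are hypothetical.

```python
def conv_same(shape, filters):
    """'same'-padded convolution: spatial size unchanged, channels = filters."""
    height, width, _ = shape
    return (height, width, filters)


def max_pool(shape, pool=(2, 2)):
    """Non-overlapping max pooling without padding (floor division)."""
    height, width, channels = shape
    return (height // pool[0], width // pool[1], channels)


shape = (5, 3500, 10)   # input row of Table 1, batch axis omitted
trace = [shape]
# (number of filters, followed by max pooling?) for the three convolution layers
for filters, pooled in [(10, True), (20, True), (30, False)]:
    shape = conv_same(shape, filters)
    trace.append(shape)
    if pooled:
        shape = max_pool(shape)
        trace.append(shape)

height, width, channels = shape
flattened = (height, width * channels)   # one time step of flattened features

for step in trace:
    print(step)
print(flattened)
```

Under these assumptions the trace ends at (1, 875, 30) before flattening and (1, 26250) after it, matching the corresponding rows of Table 1.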
   An LSTM contains a forget gate, an input gate, and an output gate to process the input time
series. These gates manage the addition or deletion of information to facilitate forgetting and
recollection.
Table 1
The proposed CNN-LSTM model configuration details
   Layer                 Output shape    No. of parameters   Activation function   Others
   Input                 (-,5,3500,10)   –                   –                     –
   2D conv               (-,5,3500,10)   260                 ReLU                  No. of kernels=5, padding=‘same’
   Max pooling           (-,2,1750,10)   –                   –                     Pool size=(2,2)
   2D conv               (-,2,1750,20)   820                 ReLU                  No. of kernels=5, padding=‘same’
   Max pooling           (-,1,875,20)    –                   –                     Pool size=(2,2)
   2D conv               (-,1,875,30)    2430                ReLU                  No. of kernels=5, padding=‘same’
   Batch normalization   (-,1,875,30)    120                 –                     –
   Flatten               (-,1,26250)     –                   –                     –
   LSTM                  (-,1,128)       13506048            Tanh                  –
   LSTM                  (-,60)          45360               Tanh                  –
   Dense                 (-,512)         31232               ReLU                  No of neurons, initializer = ‘’
   Dropout               (-,512)         0                   –                     Portion=0.3
   Dense                 (-,3)           1539                Softmax               –


In the LSTM network, the feature maps are transformed into matching hidden states. The
output from the LSTM layers is given to a dense layer, which is a fully connected layer in which
every neuron is connected to all outputs of the previous layer. A dropout layer is used between
the two dense layers to prevent overfitting of the fully connected network. The output of the
first dense layer, after dropout, is passed to the final dense layer with a softmax activation,
which classifies the stress state. The suggested framework is applicable to various multimodal
time-series data. It employs an end-to-end training methodology, without the requirement for
handcrafted features.
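The parameter counts of the recurrent, dense, and batch normalization layers in Table 1 can be checked with the standard counting formulas. The sketch below assumes LSTM cells with bias terms and no peephole connections, dense layers with one bias per unit, and batch normalization with four per-channel vectors (scale, offset, moving mean, moving variance); the function names are hypothetical.

```python
def lstm_params(input_dim, units):
    """Four weight sets (three gates plus the cell candidate), each with
    input weights, recurrent weights, and a bias vector."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """One weight per input-output pair plus one bias per output unit."""
    return (input_dim + 1) * units


def batchnorm_params(channels):
    """Scale, offset, moving mean, and moving variance for each channel."""
    return 4 * channels


counts = {
    "Batch normalization (30 channels)": batchnorm_params(30),
    "LSTM (26250 -> 128)": lstm_params(26250, 128),
    "LSTM (128 -> 60)": lstm_params(128, 60),
    "Dense (60 -> 512)": dense_params(60, 512),
    "Dense (512 -> 3)": dense_params(512, 3),
}
for name, value in counts.items():
    print(f"{name}: {value}")
```

These formulas reproduce the values 120, 13506048, 45360, 31232, and 1539 listed in Table 1. The first LSTM layer dominates the model size because it receives the full 26250-dimensional flattened feature vector.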


4. Results
The experiments were performed on the Google Colaboratory platform using the TensorFlow
library in Python. Out of the 15 participants’ data, the data of 14 participants were used for
training the model and the data of one participant for testing. The proposed model was trained
with a cross-entropy loss function. The confusion matrix, accuracy, and cross-entropy loss
curves for training and validation were used to determine the performance of the proposed
model for stress recognition. The results for baseline, stress, and amusement classification
obtained with the proposed model are given in Table 2, which also defines the performance
metrics used: accuracy, precision, and F1-score. Table 2 shows that the precision for the
baseline, stress, and amusement conditions is 92.28 %, 90.57 %, and 84.72 %, respectively.
Further, the F1-score for the baseline, stress, and amusement conditions is 94.41 %, 90.39 %,
and 77.91 %, respectively. Figure 2 shows the accuracy rate and cross-entropy loss with an
increasing number of epochs for training and validation of the model.
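A subject-wise split of the kind described above, with 14 participants for training and one for testing, can be sketched as follows. The text does not state whether the held-out participant was rotated, so the rotating (leave-one-subject-out) variant shown here is an assumption; a single split corresponds to taking only the first pair. The participant labels are hypothetical placeholders.

```python
def leave_one_subject_out(subject_ids):
    """Yield (train_ids, test_id) pairs, holding out one participant at a time."""
    for test_id in subject_ids:
        train_ids = [s for s in subject_ids if s != test_id]
        yield train_ids, test_id


# Hypothetical participant labels; the dataset's own identifiers may differ.
subjects = [f"P{i:02d}" for i in range(1, 16)]
splits = list(leave_one_subject_out(subjects))

train_ids, test_id = splits[0]
print(len(splits), len(train_ids), test_id)
```

Splitting by participant rather than by window keeps all overlapping windows of one person on the same side of the split, so the test score reflects performance on an unseen subject.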
Table 2
Details of performance metrics and values achieved by the CNN-LSTM model
     Performance     Definition                         Baseline (%)   Stress (%)   Amusement (%)
     metrics
     Accuracy        Ratio of correctly classified      96.63          90.20        72.12
                     samples to total number of
                     samples
     Precision       Ratio of true positive             92.28          90.57        84.72
                     predictions to all samples
                     predicted as positive
     F1-score        Harmonic mean of precision and     94.41          90.39        77.91
                     recall, where recall is the
                     ratio of true positive
                     predictions to all actual
                     positive samples
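As a consistency check on Table 2, the F1-score can be recomputed as the harmonic mean of precision and recall. The sketch below treats the per-class values in the accuracy row as recall; this is an assumption made only for the arithmetic, not a statement from the study. The values are copied from Table 2.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, accuracy row, reported F1-score) per class, in percent, from Table 2.
table2 = {
    "baseline": (92.28, 96.63, 94.41),
    "stress": (90.57, 90.20, 90.39),
    "amusement": (84.72, 72.12, 77.91),
}

gaps = {}
for label, (precision, accuracy_row, reported_f1) in table2.items():
    # Assumption: the per-class accuracy row is used in place of recall.
    computed = f1_score(precision, accuracy_row)
    gaps[label] = abs(computed - reported_f1)
    print(f"{label}: computed F1 = {computed:.2f}, reported F1 = {reported_f1:.2f}")
```

Under this assumption the recomputed values differ from the reported F1-scores by about 0.01 percentage points or less, which is within rounding of the tabulated figures.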


Figure 2: Training and validation accuracy rate (a) and cross-entropy loss (b) for the model


Table 3
Comparison with other previous studies on wearables-based stress recognition
    S.No.   Previous studies           Wearables used     Method used                 Accuracy (%)
    1.      Schmidt et al. [12]        RespiBAN and       Machine learning models     76.50
                                       Empatica E4        (DT, RF, LDA, kNN,
                                                          AdaBoost)
    2.      Chakraborty et al. [25]    RespiBAN and       Deep learning model         77.06
                                       Empatica E4        (CNN)
    3.      Montesinos et al. [15]     Shimmer3 and       Machine learning models     84.13
                                       Empatica E4        (kNN, DT, RF)
    4.      Gil-Martin et al. [26]     RespiBAN and       Deep learning model         77.21
                                       Empatica E4        (CNN)
    5.      Cosoli et al. [17]         RespiBAN and       Machine learning models     75
                                       Empatica E4        (LR, SVM)
    6.      Proposed model             RespiBAN and       Deep learning model         90.20
                                       Empatica E4        (CNN-LSTM)

   Table 3 compares the proposed model with previous studies on wearables-based stress
recognition [12], [25], [15], [17]. Schmidt et al. [12] used machine learning models (e.g., DT,
RF, LDA, and kNN) and obtained an accuracy of 76.50 % when chest-worn signals were
considered. Montesinos et al. [15] also used machine learning methods and obtained an
accuracy of 84.13 % with Shimmer3 and Empatica E4 wearables. Further, Cosoli et al. [17] used
machine learning models and obtained an accuracy of 75 %. Thus, the studies that used machine
learning models for stress detection with wearables obtained accuracies of up to approximately
85 % [12], [15], [17]. By comparison, the deep learning methods applied to physiological signals
acquired through wearables had accuracies of approximately 77 % [25], [26]. The proposed
hybrid CNN-LSTM deep learning approach achieves an accuracy of 90.20 %.


5. Conclusion
In this study, a hybrid CNN-LSTM based model for stress recognition is proposed. We achieved
an accuracy of 90.20 % by using chest-worn device signals such as ECG, EMG, body temperature,
EDA, and respiration. The results show accuracy improvements compared to previous studies
that classified stress levels. This is mostly attributable to the model’s capacity to benefit from
the extensive hierarchical and temporal dependence between different physiological signals.
This may help to explain why the CNN-LSTM model achieves higher accuracy than models
that employ only CNN or only LSTM structures when ECG, EMG, EDA, respiration, and
temperature are collected simultaneously as multivariate signals for stress identification. In
future work, we plan to integrate an attention mechanism to extract the stress-related
information effectively and improve the stress recognition accuracy rate. The proposed stress
recognition model would be helpful for stress management in everyday life. Moreover, it is
expected to assist people with depression, anxiety, and other stress-related problems.


References
 [1] I. B. Mauss, R. W. Levenson, L. McCarter, F. H. Wilhelm, J. J. Gross, The tie that binds?
     Coherence among emotion experience, behavior, and physiology, Emotion 5 (2005) 175.
 [2] D. P. Tobón Vallejo, A. El Saddik, Emotional states detection approaches based on
     physiological signals for healthcare applications: a review, Connected Health in Smart Cities
     (2020) 47–74.
 [3] A. Soni, K. Rawal, A review on physiological signals: Heart rate variability and skin
     conductance, in: Proceedings of First International Conference on Computing,
     Communications, and Cyber-Security (IC4S 2019), Springer, 2020, pp. 387–399.
 [4] M. Zanetti, L. Faes, M. De Cecco, A. Fornaser, M. Valente, G. Guandalini, G. Nollo,
     Assessment of mental stress through the analysis of physiological signals acquired from
     wearable devices, in: Italian Forum of Ambient Assisted Living, Springer, 2018, pp. 243–256.
 [5] A. Arza, J. M. Garzón-Rey, J. Lázaro, E. Gil, R. Lopez-Anton, C. de la Camara, P. Laguna,
     R. Bailon, J. Aguiló, Measuring acute stress response through physiological signals: towards
     a quantitative assessment of stress, Medical & biological engineering & computing 57
     (2019) 271–287.
 [6] C. Tsigos, G. P. Chrousos, Hypothalamic–pituitary–adrenal axis, neuroendocrine factors
     and stress, Journal of psychosomatic research 53 (2002) 865–871.
 [7] C. Schubert, M. Lambertz, R. Nelesen, W. Bardwell, J.-B. Choi, J. Dimsdale, Effects of stress
     on heart rate complexity—a comparison between short-term and chronic stress, Biological
     psychology 80 (2009) 325–332.
 [8] M. Ragot, N. Martin, S. Em, N. Pallamin, J.-M. Diverrez, Emotion recognition using
     physiological signals: laboratory vs. wearable sensors, in: International Conference on
     Applied Human Factors and Ergonomics, Springer, 2017, pp. 15–22.
 [9] H. Jebelli, B. Choi, S. Lee, Application of wearable biosensors to construction sites. I:
     Assessing workers’ stress, Journal of Construction Engineering and Management 145
     (2019) 04019079.
[10] L. Han, Q. Zhang, X. Chen, Q. Zhan, T. Yang, Z. Zhao, Detecting work-related stress with
     a wearable device, Computers in Industry 90 (2017) 42–49.
[11] F. de Arriba-Pérez, J. M. Santos-Gago, M. Caeiro-Rodríguez, M. Ramos-Merino, Study
     of stress detection and proposal of stress-related features using commercial-off-the-shelf
     wrist wearables, Journal of Ambient Intelligence and Humanized Computing 10 (2019)
     4925–4945.
[12] P. Schmidt, A. Reiss, R. Duerichen, C. Marberger, K. Van Laerhoven, Introducing WESAD, a
     multimodal dataset for wearable stress and affect detection, in: Proceedings of the 20th
     ACM International Conference on Multimodal Interaction, 2018, pp. 400–408.
[13] M. Reza, G. Hossain, A. Goyal, S. Tiwari, A. Tripathi, A. Bhan, P. Dash, et al., Automatic
     diabetes and liver disease diagnosis and prediction through SVM and kNN algorithms,
     in: Emerging Technologies in Data Mining and Information Security, Springer, 2021, pp.
     589–599.
[14] N. Sharma, T. Gedeon, Objective measures, sensors and computational techniques for
     stress recognition and classification: A survey, Computer methods and programs in
     biomedicine 108 (2012) 1287–1301.
[15] V. Montesinos, F. Dell’Agnola, A. Arza, A. Aminifar, D. Atienza, Multi-modal acute stress
     recognition using off-the-shelf wearable devices, in: 2019 41st Annual International
     Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE, 2019,
     pp. 2196–2201.
[16] Y. S. Can, N. Chalabianloo, D. Ekiz, C. Ersoy, Continuous stress detection using wearable
     sensors in real life: Algorithmic programming contest case study, Sensors 19 (2019) 1849.
[17] G. Cosoli, A. Poli, L. Scalise, S. Spinsante, Measurement of multimodal physiological
     signals for stimulation detection by wearable devices, Measurement 184 (2021) 109966.
[18] K. Radhika, V. R. M. Oruganti, Transfer learning for subject-independent stress detection
     using physiological signals, in: 2020 IEEE 17th India Council International Conference
     (INDICON), IEEE, 2020, pp. 1–6.
[19] S. Tiwari, O. Dogan, M. Jabbar, S. K. Shandilya, F. Ortiz-Rodriguez, S. Bajpai, S. Banerjee,
     Applications of machine learning approaches to combat COVID-19: A survey, Lessons from
     COVID-19 (2022) 263–287.
[20] D. Gaurav, F. O. Rodriguez, S. Tiwari, M. Jabbar, Review of machine learning approach for
     drug development process, in: Deep Learning in Biomedical and Health Informatics, CRC
     Press, 2021, pp. 53–77.
[21] P. Bobade, M. Vani, Stress detection with machine learning and deep learning using
     multimodal physiological data, in: 2020 Second International Conference on Inventive
     Research in Computing Applications (ICIRCA), IEEE, 2020, pp. 51–57.
[22] N. Rashid, L. Chen, M. Dautta, A. Jimenez, P. Tseng, M. A. Al Faruque, Feature augmented
     hybrid CNN for stress recognition using wrist-based photoplethysmography sensor, in:
     2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology
     Society (EMBC), IEEE, 2021, pp. 2374–2377.
[23] N. Kanagaraj, D. Hicks, A. Goyal, S. Tiwari, G. Singh, Deep learning using computer vision
     in self driving cars for lane and traffic sign detection, International Journal of System
     Assurance Engineering and Management 12 (2021) 1011–1025.
[24] Y. Acikmese, S. E. Alptekin, Prediction of stress levels with LSTM and passive mobile sensors,
     Procedia Computer Science 159 (2019) 658–667.
[25] S. Chakraborty, S. Aich, M.-i. Joo, M. Sain, H.-C. Kim, A multichannel convolutional neural
     network architecture for the detection of the state of mind using physiological signals
     from wearable devices, Journal of healthcare engineering 2019 (2019).
[26] M. Gil-Martin, R. San-Segundo, A. Mateos, J. Ferreiros-Lopez, Human stress detection with
     wearable sensors using convolutional neural networks, IEEE Aerospace and Electronic
     Systems Magazine 37 (2022) 60–70.