Development of a Prototype AI System for Real-time Emotion Prediction and Mental State Adjustment

Akihiro Sasaki1, Eriko Sugisaki1, Roberto Legaspi2, Yasushi Naruse3, Nao Kobayashi1

1 Healthcare Medical Group, Life Science Laboratories, and 2 AI Division, KDDI Research, Inc., Fujimino, Japan
3 Center for Information and Neural Networks, National Institute of Information and Communications Technology, Kobe, Japan


                         1. Introduction
Our emotional state influences our daily performance. Recent reports have highlighted a hormetic relationship between stress and cognitive performance [1, 2], suggesting that understanding the optimal balance of emotional states, rather than adopting a simple negative-positive emotion perspective, could better optimize behavioral performance. Our ultimate goal is to construct a system that discerns individual emotional states from physiological information and generates music, visuals, or conversations to facilitate the individual's transition toward a desired emotional state. To achieve this, it is essential to evaluate complex emotional states along several emotional axes and to assess their continuous fluctuations in as close to real time as possible.

2. Related work
Several studies have estimated emotion from physiological data (e.g., ECG, GSR, EEG), for example predicting depressed mood while listening to news audio [3] and predicting mood dynamics during video game play [4, 5]. In particular, the method of Ishikawa et al. [6] achieved simultaneous estimation of emotional states along six axes, namely Sad-Happy, Nervous-Relaxed, Fear-Relieved, Lethargic-Excited, Depressed-Delighted, and Angry-Serene, by incorporating cross-modal factors across multiple physiological modalities into the prediction model. Evaluated by the mean absolute percentage error (MAPE), the method achieved error rates below 25% on the Angry-Serene axis and below 36% on the Fear-Relieved axis.
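For reference, MAPE measures the average prediction error relative to the reported rating:

\[ \mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|, \]

where \(y_i\) is the reported emotional rating and \(\hat{y}_i\) the predicted value for instance \(i\).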

                         3. Emotion Estimation System During Music Listening
Here, we present an emotion estimation system that provides predicted emotional values on the same six emotional axes as Ishikawa et al. [6]. Notably, our system incorporates a real-time prediction capability, updating the predicted values every 0.5 seconds.
Model construction: We trained our model on 2,322 instances from 54 participants, each instance comprising physiological (EEG and ECG) data and emotional ratings. Participants listened to music for one minute while EEG and ECG were recorded, then rated their emotions on a 15-point scale along the six emotional axes. Explanatory variables were taken from the 10 seconds of EEG and ECG data immediately preceding the emotion rating, and the emotion ratings served as the objective variables. The model was trained using XGBoost [7] because its computational efficiency, small resource requirements, and fast training speed enable real-time estimation.
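As a concrete illustration, a minimal training sketch could look as follows; the file names, feature pipeline, and hyperparameters are our assumptions for exposition, not details taken from the paper.

```python
# Minimal sketch: one XGBoost regressor per emotional axis.
# File names, feature extraction, and hyperparameters are hypothetical.
import numpy as np
from xgboost import XGBRegressor

AXES = ["Sad-Happy", "Nervous-Relaxed", "Fear-Relieved",
        "Lethargic-Excited", "Depressed-Delighted", "Angry-Serene"]

# X: (2322, n_features) features from 10-second EEG/ECG windows
# Y: (2322, 6) emotion ratings on a 15-point scale
X = np.load("features.npy")
Y = np.load("ratings.npy")

models = {}
for i, axis in enumerate(AXES):
    model = XGBRegressor(n_estimators=200, max_depth=4, learning_rate=0.1)
    model.fit(X, Y[:, i])   # fit one axis at a time
    models[axis] = model
```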
                         Implementation of emotion estimation: Users wear EEG and ECG devices and transmit the
                         measured data to a computer via a smartphone using Bluetooth. Once the computer accumulates
                         10 seconds of data, it begins estimating the emotional state. The emotional state estimate is then
updated every 0.5 seconds based on the preceding 10 seconds of data. This system, as shown in
Figure 1, enables the real-time estimation of emotional state transitions during music or visual
content consumption.
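A minimal sketch of this prediction loop is given below; read_samples(), extract_features(), and display() are hypothetical stand-ins for the Bluetooth acquisition layer, the training-time feature pipeline, and the user interface, and the sampling rate is an assumption.

```python
# Sliding-window prediction loop: 10-second window, 0.5-second update interval.
# read_samples(), extract_features(), and display() are hypothetical stand-ins.
import time
from collections import deque

SAMPLE_RATE = 250                 # assumed sampling rate (Hz)
WINDOW = 10 * SAMPLE_RATE         # 10 seconds of samples
buffer = deque(maxlen=WINDOW)     # oldest samples drop out automatically

while True:
    buffer.extend(read_samples())            # newly received EEG/ECG samples
    if len(buffer) == WINDOW:                # wait until 10 s have accumulated
        x = extract_features(list(buffer))   # same features as at training time
        estimate = {axis: float(m.predict([x])[0])
                    for axis, m in models.items()}
        display(estimate)                    # update the six predicted values
    time.sleep(0.5)                          # 0.5-second update cycle
```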

Figure 1: Schematic diagram of our system design. The model uses EEG, ECG, and emotion ratings for training. Implementation involves continuously transferring these data via Bluetooth to a computer, updating emotion predictions every 0.5 seconds based on the previous 10 seconds of data.


4. Prospects and potential applications
In the future, we plan a comprehensive evaluation of our model's accuracy using MAPE and other relevant metrics, aiming for error rates below 20% in both training and real-world application. Future developments will integrate generative AI that produces music or visuals to guide users toward their desired emotional states, potentially enhancing presentations, work efficiency, and mental health.
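As a sketch of how such an evaluation could be run, assuming held-out arrays X_test and Y_test and the per-axis models from Section 3 (all names are hypothetical):

```python
# Hypothetical per-axis MAPE evaluation on held-out data.
from sklearn.metrics import mean_absolute_percentage_error

for i, axis in enumerate(AXES):
    pred = models[axis].predict(X_test)
    # sklearn returns a fraction; ratings are assumed nonzero on the 15-point scale
    mape = 100 * mean_absolute_percentage_error(Y_test[:, i], pred)
    print(f"{axis}: {mape:.1f}% (target < 20%)")
```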

Acknowledgements
This work was partially supported by Innovative Science and Technology Initiative for Security
(JPJ004596), Acquisition, Technology, and Logistics Agency, Japan.

References
[1] A. Oshri, Z. Cui, M. M. Owens, C. A. Carvalho, and L. Sweet, “Low-to-moderate level of perceived stress strengthens working memory: Testing the hormesis hypothesis through neural activation,” Neuropsychologia, vol. 176, p. 108354, Nov. 2022, doi: 10.1016/j.neuropsychologia.2022.108354.
[2] K. A. James, J. I. Stromin, N. Steenkamp, and M. I. Combrinck, “Understanding the relationships
    between physiological and psychosocial stress, cortisol and cognition,” Front. Endocrinol., vol.
    14, 2023, doi: 10.3389/fendo.2023.1085950.
[3] K. Fuseda, H. Watanabe, A. Matsumoto, J. Saito, Y. Naruse, and A. S. Ihara, “Impact of depressed
    state on attention and language processing during news broadcasts: EEG analysis and
    machine learning approach,” Sci. Rep., vol. 12, no. 1, p. 20492, Nov. 2022, doi:
    10.1038/s41598-022-24319-x.
[4] Y. Yokota, T. Soshi, and Y. Naruse, “Error-related negativity predicts failure in competitive
    dual-player video games,” PLOS ONE, vol. 14, no. 2, p. e0212483, Feb. 2019, doi:
    10.1371/journal.pone.0212483.
[5] Y. Yokota and Y. Naruse, “Temporal Fluctuation of Mood in Gaming Task Modulates Feedback
    Negativity: EEG Study With Virtual Reality,” Front. Hum. Neurosci., vol. 15, p. 536288, Jun.
    2021, doi: 10.3389/fnhum.2021.536288.
[6] Y. Ishikawa et al., “Learning Cross-Modal Factors from Multimodal Physiological Signals for Emotion Recognition,” in PRICAI 2023: Trends in Artificial Intelligence, F. Liu, A. A. Sadanandan, D. N. Pham, P. Mursanto, and D. Lukose, Eds., Lecture Notes in Computer Science, vol. 14325, Singapore: Springer Nature Singapore, 2024, pp. 438–450. doi: 10.1007/978-981-99-7019-3_40.
[7] T. Chen and C. Guestrin, “XGBoost: A Scalable Tree Boosting System,” in Proceedings of the
    22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San
    Francisco California USA: ACM, Aug. 2016, pp. 785–794. doi: 10.1145/2939672.2939785.