<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>The Empathic Dialogue Generation Model Based on Emotion Cause Perception</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yun Su</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bozhen Fan</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Haoran Bian</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Runhe Huang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yunhao Zhu</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Hosei University</institution>
          ,
          <addr-line>Tokyo, 1848584</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Northwest Normal University</institution>
          ,
          <addr-line>Lanzhou, 730070</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Current methods for empathetic dialogue generation often overlook the emotional causes that trigger changes in emotion. To address this issue, we present a novel framework that enhances empathetic response generation by identifying emotion causes within conversations. Our framework consists of two modules: one that comprehends emotions originating from both content and context, and another that features an emotion attention mechanism for empathy expression. Experimental results demonstrate that our proposed model is capable of perceiving emotion causes and can improve the quality of empathy expression.</p>
      </abstract>
      <kwd-group>
        <kwd>emotional conversation generation</kwd>
        <kwd>emotion cause detection</kwd>
        <kwd>empathetic response generation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The perception and expression of emotion are very important to
dialogue generation. Emotional causes are the events that trigger
changes in the speaker's emotions. Failure to analyze emotional
causes can lead to poor emotional perception [1]. To address this
issue, we propose a framework that improves the generation of
empathetic responses by endowing the empathetic dialogue model
with the ability to reason about human emotions in conversations.</p>
      <p>Our framework comprises two components: an emotion reasoner
and a response generator. The experimental results show that, by
considering emotional causes, our proposed model outperforms the
compared methods in generating empathetic responses.</p>
    </sec>
    <sec id="sec-approach">
      <title>2. Approach</title>
      <p>Our model architecture is illustrated in Figure 1. It consists of
two modules: the emotion reasoner and the response generator. The
first module, the emotion reasoner, is responsible for predicting both
the words related to the emotion cause and the corresponding
emotion tag. The second module, the response generator, integrates the
information provided by the emotion reasoner to generate an
appropriate response.</p>
      <p>For emotion reasoning, two encoders, namely a semantic
encoder and an emotional encoder, are employed to understand the
conversation context from both a content and an emotional
perspective and to locate the words related to emotional causes.</p>
      <p>The semantic encoder is used to process the historical dialogue
input, which is denoted as C = [CLS, w_1, w_2, ..., w_n], where [CLS] is a
semantic classifier token. For each word w_i in the input, the semantic
encoder assigns a word embedding vector, a position embedding
vector, and a conversation state embedding vector, which capture
the semantic information, the location in the context, and the
interlocutor information of each word, respectively. The obtained final
context representations are denoted as C̃ = [c̃_0, c̃_1, c̃_2, ..., c̃_n].</p>
      <p>Similarly, the emotional encoder is used to process the emotional
words in the semantic context C, which are denoted as E = [CLS, e_1,
e_2, ..., e_m], where, similar to C, [CLS] is an emotion classifier token.
For each emotional word e_i in the input, the encoder assigns a word
embedding vector, a position embedding vector, and an emotional
state embedding vector, which capture the emotional information
associated with each word. The multi-resolution emotional context is
then represented as Ẽ = [ẽ_0, ẽ_1, ẽ_2, ..., ẽ_m].</p>
      <p>To perceive the emotional information in the dialogue context, a
linear layer with a softmax operation projects the concatenation of c̃_0
and ẽ_0 into an emotion category distribution over the
coarse-grained emotional labels to identify the emotion signal the user
expressed:
P(ε | C, E) = softmax(W_ε [c̃_0; ẽ_0])  (1)</p>
      <p>Emotion cause detection is a sequence labelling problem. Each
word in the sequence is labelled with an emotion cause-oriented
label y_i ∈ {0, 1}, indicating whether the word is related to the emotion
cause. The label is computed with a linear layer coupled with a
softmax function, which outputs the probability that the word is
related to the emotion cause:
P(y_i | c̃_i) = softmax(W_c c̃_i + b_c)  (2)
Note that the [CLS] token is always labeled with 1. The sequence
of emotion cause-oriented labels is later used to select the
emotion cause-related words in the input sequence for the response
generator to attend to.</p>
      <p>Finally, the outputs of the two encoders are combined into the
final dialogue representation [C̃; Ẽ]. At the same time, based on the
semantic context vector representation, the tag sequence
Y = [y_0, y_1, ..., y_n] is obtained through a fully connected layer, and
each word in the conversation context is assigned an emotion cause
tag y_i ∈ {0, 1}. This tag sequence marks whether each word in the
conversation is an emotion cause word that triggers the user's
emotional changes, so that the model can better understand the
user's emotion and the reasons behind it.</p>
      <p>The emotion expression process is based on the transformer
decoder. An emotion attention mechanism is placed after the
cross-attention mechanism so that the dialogue generation model can
better focus on the emotion cause words in the input. The decoder
then produces the target sequence T = [t_1, t_2, ..., t_k] from the
dialogue context.</p>
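      <p>The reasoner described above can be summarized in a minimal sketch. The following plain-Python code is our own illustrative reconstruction, not the authors' implementation; all function names, dimensions, and weights are assumptions. Token representations are formed by summing word, position, and state embeddings; an emotion category head applies a linear layer with softmax to the concatenated [CLS] vectors (Eq. 1); and a cause-tagging head labels each token with y_i ∈ {0, 1} (Eq. 2).</p>
      <preformat>
```python
# Illustrative sketch of the emotion reasoner's two heads (Eqs. 1 and 2).
# Names, dimensions, and toy weights are our own assumptions.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def linear(vec, weight, bias):
    # weight: list of rows, one row per output unit
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weight, bias)]

def embed(word_emb, pos_emb, state_emb):
    # Each token representation = word + position + state embedding (summed).
    return [w + p + s for w, p, s in zip(word_emb, pos_emb, state_emb)]

def emotion_distribution(c0, e0, W_eps, b_eps):
    # Eq. (1): emotion category distribution from the concatenated
    # [CLS] vectors of the semantic (c0) and emotional (e0) encoders.
    return softmax(linear(c0 + e0, W_eps, b_eps))  # list concat = [c0; e0]

def cause_tags(token_vecs, W_c, b_c):
    # Eq. (2): per-token probability of being an emotion cause word,
    # reduced to a hard tag y_i in {0, 1} by argmax.
    probs = [softmax(linear(h, W_c, b_c)) for h in token_vecs]
    return [int(p[1] > p[0]) for p in probs]
```
      </preformat>
      <p>In a trained model the weight matrices W_ε and W_c would of course be learned jointly with the encoders; the sketch only makes the data flow of the two heads concrete.</p>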
      <p>At the same time, to improve the model's emotion recognition
and semantic perception, we employ a generative adversarial network
only during training. The discriminator, inspired by [2], comprises
two parts: an emotion discriminator and a semantic discriminator.</p>
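      <p>The emotion attention placed after the cross-attention can likewise be illustrated with a minimal sketch, again our own assumption rather than the paper's code: a scaled dot-product attention whose scores are masked so that the decoder attends only to positions the reasoner tagged as emotion cause words (at least one position is assumed to carry tag 1).</p>
      <preformat>
```python
# Illustrative sketch of an emotion attention step that restricts the
# decoder's focus to emotion cause words. Not the authors' code.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def emotion_attention(query, context_vecs, cause_tags):
    """Scaled dot-product attention over cause-tagged positions only.

    Assumes at least one position has cause_tags[i] == 1.
    """
    d = len(query)
    scores = []
    for h, tag in zip(context_vecs, cause_tags):
        if tag == 1:
            scores.append(dot(query, h) / math.sqrt(d))
        else:
            scores.append(float("-inf"))  # mask out non-cause words
    weights = softmax(scores)  # exp(-inf) == 0.0, so masked weights vanish
    # Weighted sum of the context vectors.
    return [sum(w * h[i] for w, h in zip(weights, context_vecs))
            for i in range(d)]
```
      </preformat>
      <p>In a full transformer decoder this step would use learned query/key/value projections and multiple heads; the mask derived from the cause tags is the part specific to the design above.</p>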
    </sec>
    <sec id="sec-2">
      <title>3. Experiments</title>
    </sec>
    <sec id="sec-3">
      <title>4. Conclusion</title>
      <p>The paper introduces a new framework that can enhance
empathetic response generation by incorporating information about the
causes of emotions. The evaluations demonstrate that the proposed
models can generate more meaningful and empathetic responses
compared to other existing approaches. By integrating emotional
reasoning into conversation models, our framework has the
potential to significantly improve the quality of human-computer
interaction, particularly in scenarios where empathetic communication
is essential.</p>
      <p>Dataset To better capture the emotional content in user utterances,
two diferent dataset are used: Empathtic Dialogues with emotional 5. Acknowledgments
causes labels[3]. And the NRC Word-Emotion Association Lexicon This work was supported by the National Natural Science
Founda(EmoLex) [4]. EmpatheticDialogues provides coarse-grained emo- tion of China (No. 61862058 and 8226070356), and in part by the
tional labels for the dialogues, while EmoLex provides fine-grained China Scholarship Council(CSC).
emotional labels for individual words. The emotion cause is
identiifed at the discourse level in the dialogues using an existing emotion
cause detection model and label them accordingly in EmpatheticDia- References
logues. This approach allows us to better understand the emotional
context of the dialogues and provide more accurate emotional labels [1] H. Herjanto, M. Amin, F. Okumus, and C. Cobanoglu, (2022).
for the model training. Airline service: Low-cost-carriers (LCCs) failure and passenger</p>
      <p>Baselines To assess our model efectiveness in capturing and emotional experience. Tourism Review, 77(3), 945-963.
generating empathetic responses with subtle emotional nuances, [2] Q. Li, H. Chen, Z. Ren, P. Ren, Z. Tu, Z. Chen,
we compare our model’s performance against several baselines, EmpDG:Multi-resolution Interactive Empathetic Dialogue
Genincluding the MoEL model [5], which is an extension of the Trans- eration, Proc. 28th Int. Conf. Comput. Linguist. (2020). URL:
former model that combines response representations from diferent https://doi.org/10.48550/arXiv.1911.08698
decoders optimized for diferent emotions; the MIME model [6] is [3] J. Gao, Y. Liu, H. Deng, W. Wang, Y. Cao, J. Du, R. Xu,
another Transformer-based model that considers emotion clustering Improving Empathetic Response Generation by
Recognizand emotional mimicry, and introduces sampling stochasticity dur- ing Emotion Cause in Conversations, Find. Assoc.
Coming training; the EMPDG model [2] is a kind of empathic dialogue put. Linguist. Find. ACL EMNLP 2021. (2021) 807–819. URL:
generation model based on generative adversarial network. https://doi.org/10.18653/v1/2021.findings-emnlp.70.</p>
      <p>Evaluation Results As shown in table 1, our results have certain [4] S. M. Mohammad, and P. D. Turney, (2013).
Crowdsourcadvantages in the accuracy of emotion recognition and the PPL of ing a word–emotion association lexicon. Computational
indialogue, which shows that our reasoning process on emotional telligence, 29(3), 436-465. URL:
https://doi.org/10.1111/j.1467causes helps the model to perceive emotion better, and at the same 8640.2012.00460.x
time, produces a more sympathetic expression. At the same accuracy [5] Z. Lin, A. Madotto, J. Shin, P. Xu, and P. Fung, MOEL:
rate of emotion recognition, our model has more advantages in Mixture of empathetic listeners, EMNLP-IJCNLP 2019
the value of ppl, which shows that our model can better perceive 2019 Conf. Empir. Methods Nat. Lang. Process. 9th Int. Jt.
the subtle emotional reasons and respond accordingly with the Conf. Nat. Lang. Process. Proc. Conf. (2019) 121–132. URL:
same recognition efect. These automatic evaluation results suggest https://doi.org/10.48550/arXiv.1908.07687
that our approach is efective in generating empathetic responses [6] N. Majumder, P. Hong, S. Peng, J. Lu, and D. Ghosal, A. Gelbukh,
with subtle emotional nuances and diverse language. At the same R. Mihalcea, S. Poria, MIME: Mimicking emotions for
empatime, the results of the manual evaluation show that our empathy thetic response generation, in Proc. Conf. Empirical Methods
expression and fluency of sentences are also better. Natural Lang. Process. (EMNLP), (2020), pp. 8968–8979. URL:
https://doi.org/10.48550/arXiv.2010.01454</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>