<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using Domain Knowledge to Enhance Deep Learning for Emotional Intelligence (Extended Abstract)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hortense Fong</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vineet Kumar</string-name>
          <email>vineet.kumarg@yale.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Yale School of Management</institution>
          ,
          <addr-line>New Haven CT 06511</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Copyright 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). In: N. Chhaya, K. Jaidka, J. Healey, L. H. Ungar, A. Sinha (eds.): Proceedings of the 3rd Workshop on Affective Content Analysis, New York, USA, 07-FEB-2020, published at http://ceur-ws.org</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        We propose a hierarchical classification architecture for identifying granular
emotions in unstructured text data. Whereas most existing emotion classifiers
focus on a coarse set of emotions, such as Ekman's six basic emotions (e.g., joy,
anger, sadness) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], we focus on a larger set of 24 granular emotions (e.g.,
irritation, envy, rage). Compared to coarse emotions, granular emotions are more
specific in the information they convey (e.g., intensity, context). For example,
sadness is a broad bucket of emotions that encompasses more specific
granular emotions ranging from disappointment to neglect to sympathy. Individuals
who are able to better recognize the nuance of their emotional state and
capture it with more specific words [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] are typically characterized as having greater
emotional intelligence [
        <xref ref-type="bibr" rid="ref1 ref4">4, 1</xref>
        ].
      </p>
      <p>Granular classification is challenging because it is a fine-grained classification
problem, which aims to distinguish subordinate-level categories. Such problems are
hardest when there is small inter-class variation but large intra-class variation. In the
case of emotions, individuals may use different word patterns to evoke the same
emotion and similar word patterns to evoke differing emotions. The underlying
idea for overcoming this challenge is to divide the data into similar subsets and
then train a separate classifier for each subset so that the model can learn to
more easily differentiate similar groups. Motivated by the psychology literature, we
use hierarchical classification to improve the identification of granular
emotions.</p>
      <p>
        The proposed classifier takes advantage of the semantic network of emotions
from the seminal work of Shaver et al. (1987), which maps out how individuals
categorize emotions [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The semantic network contains a level of coarse emotions
and a level of granular emotions that are subordinate to the coarse emotions.
The coarse level helps us divide the data into similar subsets of coarse
emotions. Building on this, we develop a classifier that first classifies input text into
one or more of the coarse emotions, capturing the idea of mixed emotions, and
subsequently classifies the input text into a granular emotion within the coarse
emotion(s) identified. The first level is a multi-label classifier and the second
level is a multi-class classifier.
      </p>
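      <p>The two-stage decision rule can be sketched as follows. This is a minimal illustration, not the paper's implementation: the emotion lists are a small subset of the Shaver et al. clusters, and the scoring functions and 0.5 threshold stand in for the trained networks.</p>

```python
# Stage 1: a multi-label classifier scores each coarse emotion independently.
# Stage 2: within each coarse emotion that fires, a multi-class classifier
# picks the single most likely granular emotion in that cluster.

GRANULAR = {  # illustrative subset of the coarse-to-granular hierarchy
    "love": ["affection", "longing", "lust"],
    "anger": ["irritation", "rage", "envy"],
    "sadness": ["disappointment", "neglect", "sympathy"],
}

def classify(text, coarse_scores, granular_scores, threshold=0.5):
    """coarse_scores: text -> {coarse: prob};
    granular_scores: (text, coarse) -> {granular: prob}."""
    results = []
    for coarse, p in coarse_scores(text).items():
        if p >= threshold:  # multi-label: several coarse emotions may fire
            fine = granular_scores(text, coarse)
            # multi-class: argmax among the granular emotions of this cluster
            results.append((coarse, max(fine, key=fine.get)))
    return results

# Toy scoring functions standing in for the trained neural networks:
coarse = lambda t: {"love": 0.1, "anger": 0.8, "sadness": 0.6}
fine = lambda t, c: {g: (0.9 if g in ("rage", "neglect") else 0.05)
                     for g in GRANULAR[c]}
print(classify("example tweet", coarse, fine))
# -> [('anger', 'rage'), ('sadness', 'neglect')]
```

Because the first stage is multi-label, a single tweet can yield several (coarse, granular) pairs, which is how the architecture captures mixed emotions.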
      <p>
        To train our classifier, we collect self-labeled English tweet data in which the
author has included an emotion hashtag. The emotion hashtags we use come
from the emotion words in each of the granular emotion clusters identified in
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Table 1 shares a few example tweets from our data set. We take a number
of steps to filter out uninformative tweets and to pre-process the tweets for use
in classification. In total, our data set contains 867,264 tweets.
      </p>
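      <p>The self-labeling step can be sketched as follows; the hashtag lists here are a tiny illustrative stand-in for the full Shaver et al. emotion-word clusters, and the filtering shown is only one of the pre-processing steps.</p>

```python
import re

EMOTION_HASHTAGS = {  # illustrative subset of granular-emotion word lists
    "rage": {"furious", "rage", "livid"},
    "affection": {"affection", "adore"},
    "neglect": {"lonely", "neglected"},
}

def label_tweet(tweet):
    """Return (granular_label, cleaned_text), or None to filter the tweet out."""
    tags = {t.lower() for t in re.findall(r"#(\w+)", tweet)}
    for emotion, words in EMOTION_HASHTAGS.items():
        if tags & words:
            # strip hashtags so the label is not trivially recoverable from the text
            cleaned = re.sub(r"#\w+", "", tweet).strip()
            return emotion, cleaned
    return None  # no emotion hashtag: uninformative for training

print(label_tweet("I could really use some of my friends right about now :( #lonely"))
# -> ('neglect', 'I could really use some of my friends right about now :(')
```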
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption>
          <p>Example tweets from the data set.</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th>Coarse Emotion</th>
              <th>Granular Emotion</th>
              <th>Tweet</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Love</td>
              <td>Affection</td>
              <td>The way his eyes look right before he goes to kiss me again.. Oh, I love that. #affection #handsome #hazeleyes</td>
            </tr>
            <tr>
              <td>Anger</td>
              <td>Rage</td>
              <td>For the 2nd time @verizon has erased information from a phone on my account. This time EVERYTHING. Rep said she backed it on the cloud but backed nothing!!! @VerizonSupport #furious</td>
            </tr>
            <tr>
              <td>Sadness</td>
              <td>Shame</td>
              <td>Stayed up til midnight last night baking 30 chocolate chip cookies and 30 snowballs for my friends' Christmas presents. I have two tests today #regret</td>
            </tr>
            <tr>
              <td>Sadness</td>
              <td>Neglect</td>
              <td>I could really use some of my friends right about now :( #lonely #upset #sad</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p>We train deep neural network classifiers since they have demonstrated
state-of-the-art performance on emotion classification. The inputs to the neural
networks are the tweet text and granular emotion labels. We compare the
performance of convolutional neural networks (CNNs) with variants of long short-term
memory networks (LSTMs) to show the impact of using a hierarchical classifier
versus a flat classifier. For the flat classifier, a separate binary classifier is trained
for each granular emotion, resulting in 24 classifiers. For the hierarchical
classifier, a separate binary classifier is trained for each coarse emotion and then a
multi-class classifier is trained for each coarse emotion to differentiate the
granular emotions, resulting in 12 classifiers.</p>
      <p>Our performance metrics of interest are precision, recall, and F1. In many
applications, recall is the measure of interest because false negatives are more
costly. For example, if a customer is feeling exasperated but this sentiment goes
unnoticed, the firm risks losing the customer. False positives, on the other hand,
are typically less costly. If a happy customer gets tagged as irritated, the firm
can easily realize the mislabeling and choose not to act on the tag.</p>
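      <p>These metrics follow the standard definitions; the small example below (with made-up counts, not results from our experiments) shows how a classifier can trade precision for recall and still raise F1, the pattern we observe for the hierarchical classifier.</p>

```python
# Precision, recall, and F1 from binary confusion counts.
def prf1(tp, fp, fn):
    precision = tp / (tp + fp)          # of tweets tagged positive, fraction correct
    recall = tp / (tp + fn)             # of truly positive tweets, fraction found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Hypothetical counts: the second classifier is less precise but much
# higher-recall, and its F1 is higher overall.
p1 = prf1(tp=40, fp=10, fn=60)   # precision 0.80, recall 0.40, F1 ~0.533
p2 = prf1(tp=70, fp=40, fn=30)   # precision ~0.64, recall 0.70, F1 ~0.667
```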
      <p>We report the results of each architecture in Table 2. The proposed
hierarchical classifier outperforms a single-stage flat classifier in terms of F1 by increasing
recall at the cost of precision. For example, the F1 for the bi-LSTM increases from
34.93% to 39.84%. We believe the overall F1 from the bi-LSTM can be improved
with additional fine-tuning or through the exploration of additional
neural architectures within the hierarchical structure (e.g., BERT).</p>
      <p>The hierarchical structure increases the interpretability of the model by
enabling interpretation at two levels rather than just one. At one level, it can
identify which words contribute to or take away from the positive classification
of each of the coarse emotions. At the second level, it can identify which words
within each coarse emotion category contribute to each of the granular
emotions. Not only does opening up the black box help address model validity, but
it also provides insight to end users on which specific terms are diagnostic of
each granular emotion.</p>
      <p>Overall, we find that the use of domain knowledge to inform the design of a
granular emotion classifier through a hierarchical structure improves recall and
F1 as well as model explainability.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Brackett</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rivers</surname>
            ,
            <given-names>S.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bertoli</surname>
            ,
            <given-names>M.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Salovey</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Emotional intelligence</article-title>
          . In: Handbook of emotions, pp.
          <fpage>513</fpage>
          &#8211;
          <lpage>531</lpage>
          .
          Guilford Publications
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Ekman</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Friesen</surname>
            ,
            <given-names>W.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ellsworth</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Emotion in the human face: Guidelines for research and an integration of findings</article-title>
          (
          <year>1972</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Lindquist</surname>
            ,
            <given-names>K.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gendron</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Satpute</surname>
            ,
            <given-names>A.B.</given-names>
          </string-name>
          :
          <article-title>Language and emotion: Putting words into feelings and feelings into words</article-title>
          .
          <source>In: Handbook of emotions</source>
          , pp.
          <fpage>579</fpage>
          &#8211;
          <lpage>594</lpage>
          .
          Guilford Publications
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Salovey</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mayer</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          :
          <article-title>Emotional intelligence</article-title>
          .
          <source>Imagination, cognition and personality 9</source>
          (
          <issue>3</issue>
          ),
          <fpage>185</fpage>
          &#8211;
          <lpage>211</lpage>
          (
          <year>1990</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Shaver</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schwartz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kirson</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Connor</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Emotion knowledge: further exploration of a prototype approach</article-title>
          .
          <source>Journal of personality and social psychology 52(6)</source>
          ,
          <fpage>1061</fpage>
          (
          <year>1987</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>