=Paper=
{{Paper
|id=Vol-1619/paper7
|storemode=property
|title=Praiseworthy Acts Recognition Using Web-based Knowledge and Semantic Categories
|pdfUrl=https://ceur-ws.org/Vol-1619/paper7.pdf
|volume=Vol-1619
|authors=Rafal Rzepka,Kohei Matsumoto,Kenji Araki
|dblpUrl=https://dblp.org/rec/conf/ijcai/RzepkaMA16
}}
==Praiseworthy Acts Recognition Using Web-based Knowledge and Semantic Categories==
Rafal Rzepka, Kohei Matsumoto and Kenji Araki

Graduate School of Information Science and Technology, Hokkaido University, Japan

{rzepka,matsumoto,araki}@ist.hokudai.ac.jp

(The second author is currently with Panasonic Co.)

Proceedings of the 4th Workshop on Sentiment Analysis where AI meets Psychology (SAAIP 2016), IJCAI 2016, pages 41-47, New York City, USA, July 10, 2016.

===Abstract===
In this paper we introduce our novel method for utilizing web mining and semantic categories for determining automatically if a given act is worth praising or not. We report how existing lexicons used in affective analysis and ethical judgement can be combined for generating useful queries for knowledge retrieval from a 5.5 billion word blog corpus. We also present how semantic categorization helped the proposed method to finally achieve 94% agreement with human subjects who decided which act, behavior or state should be praised. We also discuss how our preliminary findings might lead to developing an important social skill of a robotic companion or an automatic therapist during their daily interaction with children, elderly or depressed users.

===1 Introduction===
Predictions from world demographic trends show that the current ratio of people aged sixty or more (12.6%) will nearly double by 2050 (almost 22%; see www.unfpa.org/ageing). Younger generations would need to work more and worry more, not only about their aged parents but also about their children, to whom they would dedicate less time. Stress among the working-age group could be caused not only by work itself but also by the awareness of children and parents often left to their own devices. Data gathered by the American Depression and Bipolar Support Alliance (www.dbsalliance.org) indicate that depression most often strikes at age 32 in the United States, but it also poses an obvious problem among other age groups. One child in 33 and one in eight adolescents have clinical depression, and although as many as six million elderly people are affected by mood disorders, only 10% ever receive treatment. Precise numbers are often difficult to obtain as many subjects do not want to participate in studies, do not respond to surveys, do not answer the door or have insufficient language abilities (see www.nimh.nih.gov/health/statistics/prevalence/major-depression-among-adults.shtml). Problems related to psychological disorders could be alleviated by technological advancements, including progress in Artificial Intelligence, especially in cases of social withdrawal in which depressed adolescents prefer to deal with computers rather than with people. As psychology studies show [Hofmann et al., 2012], depression can be treated by cognitive behavioral therapies (CBT) as efficiently as by medication, and such treatment is based on conversation. Although computers are already used as supportive tools in CBT [Wright et al., 2005], we are far away from entrusting patients to autonomous therapists. However, we believe that various conversational rules utilized in dialog-based therapies and other positive aspects [Burnard, 2003; Zimmerman et al., 2009] of a conversation itself can be implemented in artificial agents like companion robots [Sarma et al., 2014]. In this paper we introduce our idea of how to utilize Natural Language Processing techniques, a set of lexicons and semantic categories to web mine the knowledge necessary for recognizing whether an action that is a dialog topic should be, for example, complimented by an agent.

====1.1 Importance of Praising====
We chose the act of praising to be implemented in our artificial agent for a variety of reasons. First of all, it is an evaluation task which positively influences a praised person [Kanouse et al., 1981] and motivates, especially children [Henderlong and Lepper, 2002]. Often seen in interpersonal interaction, praising is used to encourage others, to socialize, to integrate groups, and to influence people [Lipnevich and Smith, 2008]. It is believed to have beneficial effects on self-esteem, motivation and performance [Weiner et al., 1972; Bandura, 1977; Koestner et al., 1987]. It is widely acknowledged that praising oneself could substantially help in dealing with depression [Swann et al., 1992] and that praising improves behavior [Garland et al., 2008], academic performance [Strain et al., 1983] and work performance [Crowell et al., 1988]. But there is another interesting and difficult aspect of praising: the praiser has to be competent and share some relationship with the praised person [Carton, 1996]. Also, from the Artificial Intelligence point of view, the automatic distinction between praiseworthy and not praiseworthy acts is an interesting long-term challenge on the way to creating a righteous and trustworthy machine and, in this particular case, to investigating whether Web resources could become a sufficient knowledge base for such tasks. Our hypothesis is that knowing the polarity of the consequences of human acts might be the key to an automatic evaluation of these acts.

====1.2 State of the Art====
The authors have found only one research proposal dedicated specifically to automating praising. In 1998, [Tejima et al., 1998] published a two-page paper in which they describe their observations from physiotherapists’ sessions with the elderly.
The researchers proposed a simple verbal encouragement algorithm for walking training and implemented it later [Tejima and Bunki, 2001]; however, the effectiveness could not be confirmed due to the insufficient number of experimental subjects. Causing positive moods in interlocutors can be found as a sub-task in the Human-Computer Interaction (HCI) field, especially in learning-oriented agents [Fogg and Nass, 1997; Kaptein et al., 2010], but these studies utilize scenarios and manually created rules for when to praise. Systems that accept, in theory, any sentence as an input and recognize polarity or emotive categories were proposed in the fields of sentiment analysis and affect recognition [Wilson et al., 2005; Strapparava and Mihalcea, 2008], and the basic idea for our system is borrowed from their approaches. However, these methods cannot be utilized straightforwardly because being positive does not have to mean an act is worth praising (“I saw a movie” is labelled positive by these methods but it usually does not mean we need to react with a compliment to such a statement). For the English language there are promising methods for retrieving goodFor and badFor events [Deng and Wiebe, 2014] and for acquiring knowledge of stereotypically positive and negative events from personal blogs [Ding and Riloff, 2016]. Basically any new trend in the field [Cambria et al., 2013; Socher et al., 2013] should eventually help improve our results as soon as it is implemented for the Japanese language, which often has far fewer resources to keep up with the latest methods. For Japanese, [Rzepka and Araki, 2015] have proposed a system that evaluates textual inputs from a moral perspective. Similarly to our approach it uses lexicons, and one of them, based on Kohlberg’s theory of moral stages development [Kohlberg, 1981], includes praise-punishment polarized pairs. However, the lexicon contains only 14 praise-related words, limited to synonyms of the verb “praise”, which, as shown later in the comparison experiment, are insufficient for our purposes.

===2 System Overview===
The algorithm of our system is presented in Figure 1. In the first step an input act (a noun-verb pair, which we treat as the minimal semantic unit describing any act) in the Japanese language is morphologically analyzed by MeCab (taku910.github.io/mecab/) to determine a noun, a verb and the joining particle representing grammatical case (e.g. aisatsu-o suru “to greet someone” or yakusoku-o mamoranai “not keeping promises”, where the particle “o” indicates an object of the given verb). Then the system adds to the verb 15 suffixes representing conditions and temporal sequences to retrieve more adequate sentences (waruguchi itta ato “after calling names”, waruguchi iu toki “when called names”, waruguchi itte “called names and then”, etc.). Because particles are often omitted in colloquial Japanese, another set of 15 phrases without particles is created, and the final 30 phrases together with phrases with verbs in their basic (dictionary) form become queries for the 5.5 billion word YACIS corpus of Japanese raw blogs [Ptaszynski et al., 2012]. Text retrieved from the corpus is then cleaned: emoticons, usually used as sentence boundaries, are converted to full stops, and sentence candidates that are too long or too short are deleted. In the next step, the generated temporary corpus of sentences containing input acts is normalized to verb dictionary forms and divided into meaningful chunks by the Argument Structure Annotator ASA [Takeuchi et al., 2010] to avoid the overly granular division produced by the morphological analyzer. For instance “was | beat | ing | brother” becomes “beat brother”, and such transformations are made to increase the coverage of matching chunks with phrases from lexicons in the next step. Every match is scored 1 and the totals are compared. If more than 50% of the counts are positive or negative, the act is estimated as praiseworthy or not praiseworthy, respectively. Although in the morality estimation task a 60/40 ratio scored highest [Rzepka and Araki, 2015], in our task the 50/50 ratio achieved better results.

Figure 1: Algorithm for retrieving and analyzing consequences of acts in order to determine if they should be praised.

====2.1 Lexicons====
As mentioned in the Introduction, we hypothesized that measuring the polarity of act consequences might be the key to recognizing praiseworthy acts. Although aware of the possible problems mentioned in the Introduction, we decided to investigate how efficiently the existing emotion recognition methods could deal with our task. Therefore we first chose two different freely available lexicons used for lexicon-based polarity recognition in the Japanese language. The larger one was statistically generated from manually annotated sentences in the study of [Takamura et al., 2005]. It contains 55,102 words divided into positive (5,121 words) and negative (49,981 words) ones. Every word was automatically scored on a scale from a minimum of -1 to a maximum of 1, and the words closer to 0 tend to be inaccurately labeled (e.g. okaasan “mom” or narubeku “as possible” are marked as negative words); therefore using the whole (significantly unbalanced) set would cause drops in accuracy. In order to minimize this problem and to make the lexicon more balanced, after analyzing the entries we used the 3,000 most positive and 3,000 most negative words (closest to 1 and -1 from each side) and called the result the “Statistical Lexicon”.

Another lexicon used in polarity detection in Japanese texts was created manually by [Nakamura, 1993] from emotive sentences retrieved from Japanese literature. The words are separated into ten categories (Like, Joy, Relief, Dislike, Anger, Fear, Shame, Sadness, Excitement, Surprise), and because Excitement and Surprise have no distinct valence, these two categories were excluded. The combined words from Like, Joy and Relief form a positive subset, and Dislike, Anger, Fear, Shame and Sadness form a negative one. We call the resulting lexicon of 526 positive and 756 negative words (1,282 in total) the “Literature Lexicon” to make the comparison between lexicons easier to follow.

As mentioned before, a positive act does not necessarily imply being praiseworthy, therefore we also decided to test a lexicon used for ethical judgement by [Rzepka and Araki, 2015]. This relatively small set, containing 65 positive and 69 negative words (134 in total), was created by applying phrases related to the five stages of moral development proposed by [Kohlberg, 1981]: obedience / punishment, self-interest, social norms, authority / social order, and social contract. For example, in the obedience / punishment subset there are words like “punished”, “awarded”, “punishment” and “award”, and the authority / social order subset contains law-related words like “sentenced”, “legal” or “arrested”. To examine how emotional and social consequences work together, we created another lexicon, a combination of the Kohlberg’s theory-based set with Nakamura’s literature-based set. We named the former the “Ethical Lexicon” and the latter the “Combined Lexicon”.

===3 Experiments and Results===
In this section we introduce the experiments we conducted to investigate the effectiveness of our approach in the task of automatic praiseworthy act recognition.

====3.1 Input Acts====
The Web resources used in the study give an opportunity to process any kind of act, but this freedom causes difficulties with choosing a fair and balanced input. To deal with this problem we created two sets: one generated automatically and evaluated by subjects, and a second one created by the same subjects, who were specifically instructed to give examples of praiseworthy and not praiseworthy acts different from those they had labeled. By introducing these two types we tried to find a balance between “any input” (because the algorithm should recognize neutral acts) and a more specific, manually crafted set of correct data.

=====Automatically Generated Set=====
For creating the first set we utilized 200 verbs from the Statistical Lexicon with the highest hit number in the blog corpus (100 from the positive subset and 100 from the negative subset) and paired them with the nouns most frequently co-occurring within the Japanese Frames dataset, which was automatically generated from the biggest Japanese Web corpus [Kawahara and Kurohashi, 2006]. In order to limit the number of acts and to maintain sufficient coverage (to observe to what extent the automatically polarized words are efficient), we added two conditions. The noun object must be included in the Statistical Lexicon and the generated act must appear at least ten times in the blog corpus. Hence, if e.g. the verb “keep” from the lexicon co-occurred frequently with the object noun “promise” and the phrase “to keep a promise” was found at least ten times in the blog corpus, the phrase was treated as a common human act and became an input. With this method we generated 119 acts, which were then evaluated by three judges (one female in her fifties, one male university student and one female secondary school pupil) who labeled each act as praiseworthy, not praiseworthy or hard to tell. The majority vote (three judges agreed, or two agreed and the third answered “hard to tell”) resulted in 54 acts: 31 worth praising, such as tomodachi-o iwau (“to congratulate a friend”) or chichi-o shitau (“to admire one’s father”), and 23 not worth praising, such as tanin-o nikumu (“to hate somebody”) or itami-o shiiru (“to impose pain upon someone”). Two examples of acts on which agreement was not reached are hiza-o kussuru (“to bend one’s knees / to yield to someone”) and yami-o kowagaru (“to be afraid of darkness”). The labeled data became both the input and the first correct data set, and we named it the “Automatically Generated Set”.

=====Manually Created Set=====
Because the automatically retrieved input set was biased toward the Statistical Lexicon, we asked the same group of three people to think of acts worth praising and not worth praising. The created set (from now on called the “Manually Created Set”) contained 64 acts: 32 praiseworthy ones, such as shiken-ni goukaku suru (“passing an exam”) or tetsudai-o suru (“helping someone”), and 32 not worth praising, such as yakusoku-o mamoranai (“not to keep a promise”) or kenka-o suru (“to quarrel / to have a fight”). Unlike the Automatically Generated Set, and although the creators had seen examples of acts in the evaluation process, the Manually Created Set was not restricted and in consequence included more diverse forms containing not only negations but also adverbs and passive / double verbs, as in jiko-chuushin-teki ni koudou-o suru (“to act selfishly”) and iwareta koto-o yaranai (“not to do what one was told”).

{| class="wikitable"
|+ Table 1: Results for the Automatically Generated Set of input acts.
! Lexicon !! Matched / All !! Correct
|-
| Statistical Lexicon || 54 / 54 || 83.3%
|-
| Literature Lexicon || 42 / 54 || 66.7%
|-
| Ethical Lexicon || 17 / 54 || 58.8%
|-
| Combined Lexicon || 45 / 54 || 68.9%
|}

{| class="wikitable"
|+ Table 2: Results for the Manually Created Set of input acts.
! Lexicon !! Matched / All !! Correct
|-
| Statistical Lexicon || 52 / 64 || 63.5%
|-
| Literature Lexicon || 45 / 64 || 84.4%
|-
| Ethical Lexicon || 39 / 64 || 84.6%
|-
| Combined Lexicon || 44 / 64 || 90.9%
|}
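The counting rule from Section 2 (every lexicon match scores 1, and a polarity wins when it holds more than 50% of the matches) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes plain English sentences split on whitespace as a stand-in for the MeCab and ASA processing of Japanese blog text, uses tiny hypothetical word sets in place of the real lexicons, and inspects only the text following the act phrase, on the premise stated in Section 3.2 that consequences are usually written there. The names `right_side`, `count_matches` and `judge`, as well as the handling of ties and of acts without any match, are assumptions made for illustration.

```python
from collections import Counter

# Toy stand-ins for a polarity lexicon (hypothetical entries, not from the paper).
POSITIVE = {"happy", "thanked", "relieved"}
NEGATIVE = {"sad", "punished", "angry"}


def right_side(sentence, act_phrase):
    """Return the text after the first occurrence of act_phrase ('' if absent)."""
    _, found, tail = sentence.partition(act_phrase)
    return tail if found else ""


def count_matches(sentences, act_phrase, positive=POSITIVE, negative=NEGATIVE):
    """Score 1 for every lexicon word found to the right of the act phrase."""
    counts = Counter(positive=0, negative=0)
    for sentence in sentences:
        for token in right_side(sentence, act_phrase).split():
            word = token.strip(".,!?").lower()
            if word in positive:
                counts["positive"] += 1
            elif word in negative:
                counts["negative"] += 1
    return counts


def judge(counts, threshold=0.5):
    """Label the act by the polarity holding more than `threshold` of all matches."""
    total = counts["positive"] + counts["negative"]
    if total == 0:
        return "unmatched"      # assumption: no lexicon hit means no decision
    if counts["positive"] / total > threshold:
        return "praiseworthy"
    if counts["negative"] / total > threshold:
        return "not praiseworthy"
    return "undecided"          # assumption: an exact tie stays unlabeled


# Hypothetical retrieved sentences for the act "keep a promise".
SENTENCES = [
    "After I keep a promise my friend is happy.",
    "He was angry, but when I keep a promise I feel relieved.",
    "If you keep a promise you get thanked.",
]
counts = count_matches(SENTENCES, "keep a promise")
print(counts, judge(counts))
```

In this toy run the word "angry" precedes the act phrase and is therefore ignored, so all three matches are positive and the act is labeled praiseworthy. Passing `threshold=0.6` would correspond to the 60/40 ratio mentioned for the morality estimation task.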
====3.2 Effectiveness Comparison between Lexicons====
Having prepared two sets of acts with their human evaluation, we performed a series of experiments to examine our system’s accuracy when using the lexicons described above in the task of recognizing praiseworthy acts.

=====Statistical Lexicon=====
Tested with acts from the Automatically Generated Set, the Statistical Lexicon achieved 83.3% correct recognitions. To confirm our assumption that matching should be performed only on the right side of an act phrase, because that is where the consequences of the act are usually written (see Figure 2), we also ran additional tests and confirmed that analyzing left sides achieves significantly lower accuracy (66.7%). Matching within the whole sentence did not bring any improvement in results; besides, it doubled the searching time. Examples of correctly recognized acts are shouri-o iwau (“to celebrate victory”) and kenkou-o mamoru (“to care about one’s health”). On the other hand, tsumi-o kuiru (“to regret one’s sins”) or shi-o kanashimu (“to grieve one’s death”) were recognized incorrectly due to noisy polarity in the Statistical Lexicon.

Figure 2: Example sentence from the corpus with input act and a matched Ethical Lexicon word on the right side.

When tested with the Manually Created Set, the results of the Statistical Lexicon dropped as expected. Left side matching brought only 53.7% correct recognitions, while the right side matching again surpassed the left side, achieving 63.5%, and the whole sentences scored significantly lower (58.2%). All other comparisons of results between left side, right side and whole sentences confirmed this trend; therefore, in order to avoid confusion, all remaining results we introduce are from the matches performed on the right sides following input act phrases.

=====Literature Lexicon=====
The Literature Lexicon surpassed the much larger Statistical Lexicon when Manually Created Set acts were input but was significantly less accurate with the acts derived from the Statistical Lexicon, i.e. the Automatically Generated Set (see Table 1 and Table 2). The perfect matching rate of the Statistical Lexicon (54/54 matched) may suggest that if a new, less noisy method for the automatic estimation of word polarity is proposed and it covers all words in every possible input, the Statistical Lexicon would outperform the Literature Lexicon also when fed with acts from the Manually Created Set. Nevertheless, this would be very costly, and avoiding the polarization of neutral words seems to be difficult; hence we believe that using manually crafted, small lexicons is currently a more realistic approach for the automatic recognition (and annotation) of praiseworthy acts.

=====Ethical Lexicon=====
The smallest of all the lexicons used, based on Kohlberg’s theory and utilized in an automatic ethical recognition task, performed worst when the Automatically Generated Set of acts was input but outperformed both the Statistical and Literature Lexicons when the Manually Created Set of acts was used.

=====Combined Lexicon=====
We managed to confirm that the combination of the Ethical and Literature Lexicons performed better than the separate ones when the Manually Created Set of acts was used. However, its accuracy was still lower than that of the Statistical Lexicon matching sentences retrieved with the Automatically Generated Set of acts.

====3.3 Additional Experiments====
As we aim at recognizing praiseworthy acts in everyday conversation, the correct recognition of more natural input acts is more important than the correct recognition of less natural input acts. To check whether the Statistical Lexicon could perform better with the Manually Created Set, we conducted a series of additional tests varying the range of positive and negative words to see if the heuristically chosen size of 3,000 was correct. We examined 10 sizes, starting from 500 words and increasing by 500 each time up to 5,000, and also tested the whole unbalanced list from -1 to 1. It appeared (see Figure 3) that accuracy grows up to 1,500 words (an increase from 72.9% to 80.8%), but when larger sets are used the results start to decrease and never exceed those of the Literature Lexicon (84.4%).

Figure 3: Results of additional experiments for investigating accuracy changes when using different sizes of the Statistical Lexicon.

===4 Adding Semantic Categories===
After analyzing sentences which include praiseworthy acts but were not counted due to the insufficient number of words in the lexicons, we decided to examine if we could automatically add some valuable information to other words and see if the information influences the act of praising. We chose semantic categorization and used “Bunrui-Goi-hyo” (Word List by Semantic Principles) [NLRI, 1964], containing 32,600 semantically categorized words collected from 90 contemporary Japanese newspapers. For example, the list groups words under categories such as “Thoughts / Opinions / Doubts”, “Helping / Rescuing” or “Profit / Loss”. Our idea was to add simple weights (count +1) to words that belong to categories which tend to be praiseworthy. In order to examine which categories reveal such tendencies, we retrieved from the corpus all sentences containing acts labeled by human subjects as praiseworthy and not praiseworthy. Then a simple script counted how many other words in both datasets belong to which semantic category. For example, if a blog sentence was “I lost the confidence in myself after he spoke ill about me”, the script added negative points to categories such as “Profit / Loss” (lost) or “Thoughts / Opinions / Doubts” (confidence). Because some categories contained thousands of words and others only a few, we decided to assign weights according to differences between frequencies. Examples of categories with distinctly different frequencies are shown in Table 3. Then, in order to ease the imbalance between the sizes of both categories, we experimented with combinations of weight sets and discovered that accuracy is highest for both praiseworthy and not praiseworthy acts when the former uses weights created from group b) and the latter uses c) (refer to Table 3).

{| class="wikitable"
|+ Table 3: Examples of frequency differences of semantic categories specific to praiseworthy and not praiseworthy acts
! Difference !! Praiseworthy acts !! Not praiseworthy acts
|-
| a) More than 4 times || Helping / Rescuing, Giving / Receiving, Profit / Loss, Winning / Losing, School / Military, Lending / Borrowing, Physiology, Marking / Signing, etc. || Respecting / Thanking / Trusting, Creating / Writing, Old / New / Slow / Fast, Treatment, Graphs / Tables / Formulas, etc.
|-
| b) More than 3 times || Talents, Planning, Specialist jobs, Associations / Groups, Events / Ceremonies, etc. || Acquisition, Eye / Mouth / Nose functions, Roads / Bridges, Land vehicles, Fear / Anger, etc.
|-
| c) More than 2 times || Economy / Income / Expenditure, Formation, Meaning / Problem / Purpose, Desire / Expectance / Disappointment, etc. || Linguistic activities, Birds, Associations, Distress / Sorrow, Partners / Colleagues, etc.
|}

====4.1 Result Comparison====
To see if semantic categorization is effective, we repeated all experiments, scoring not only matched lexicon words but also other words that belong to specific categories (those with tendencies to be praiseworthy or not praiseworthy). Because among the semantic categories supposedly specific to praiseworthy acts there were ones like Losing and Disappointment, we expected rather low accuracy, but quite surprisingly semantic weighting helped to improve all previous results (see Table 4 and Table 5). Even when we excluded the lexicon word counts entirely, the semantic categories alone achieved slightly better precision than the Ethical Lexicon when the Automatically Generated Set of acts was input. The highest precision was obtained when the Manually Created Set was used, where semantic categories increased the precision of the Literature and Ethical Lexicons, achieving 94%.

{| class="wikitable"
|+ Table 4: Effectiveness comparison of implementing semantic categories (Automatically Generated Set).
! Method !! Matched / All !! Correct
|-
| Semantic Category (SC) || 52 / 54 || 78.8%
|-
| Statistical Lexicon + SC || 54 / 54 || 85.2%
|-
| Literature Lexicon + SC || 54 / 54 || 81.5%
|-
| Ethical Lexicon + SC || 52 / 54 || 76.9%
|-
| Combined Lexicon + SC || 54 / 54 || 85.2%
|}

{| class="wikitable"
|+ Table 5: Effectiveness comparison of implementing semantic categories (Manually Created Set).
! Method !! Matched / All !! Correct
|-
| Semantic Category (SC) || 50 / 64 || 92.0%
|-
| Statistical Lexicon + SC || 52 / 64 || 88.5%
|-
| Literature Lexicon + SC || 50 / 64 || 94.0%
|-
| Ethical Lexicon + SC || 50 / 64 || 90.0%
|-
| Combined Lexicon + SC || 50 / 64 || 94.0%
|}

===5 Conclusion, Future Work and Discussion===
In this paper we introduced a simple matching algorithm allowing an agent to recognize human acts worth praising with a maximum of 94% agreement with human subjects by using lexicons (word sets) and Web resources (a blog corpus). The best results were achieved by the Literature and Combined Lexicons with Semantic Categories support when manually created example acts were input. There is still plenty of room for improvement, and we plan to increase the coverage of the lexicons by matching synonyms, too. We are also experimenting with changing the counting method according to adverbs preceding matching phrases (“a little bit sad” could be scored lower than e.g. “so freaking sad”). As the act of praising is very subjective and depends on many factors, we are planning to perform wide, possibly intercultural, surveys.

We would like to conclude by underlining the wider importance of the ability of a machine to automatically recognize praiseworthy acts. Recent worries about Artificial Intelligence taking control over its users could be, at least in our opinion, eased by positive examples. Companion robots, while helping at home and e.g. running memory quizzes for users with Alzheimer’s disease, need to be trusted, and gaining that trust will be difficult without sharing similar values. Our common recognition and evaluation of a fellow human’s behavior can be measured with shallow sentiment analysis techniques on vast textual data which express our experiences and feelings. The proposed method demonstrates that noisy Web resources like blogs, when processed carefully, can become one way to equip artificial agents with a human-like capacity for telling right from wrong without leaning toward any specific philosophy or religion. We believe that a trustworthy machine should operate on estimating overall positive and negative consequences rather than on methods based on explicit rules decided by one or only a few programmers. The proposed system can easily “explain” its decisions by giving examples of retrieved experiences or by presenting a voting ratio, while most machine learning based methods are “black boxes” and may lead to trust issues. Having said that, we believe that our method could help to automatically annotate data, which is crucial for machine learning.

===References===
* [Bandura, 1977] Albert Bandura. Self-efficacy: toward a unifying theory of behavioral change. Psychological review, 84(2):191, 1977.
* [Burnard, 2003] Philip Burnard. Ordinary chat and therapeutic conversation: phatic communication and mental health nursing. Journal of Psychiatric and Mental Health Nursing, 10(6):678–682, 2003.
* [Cambria et al., 2013] E. Cambria, B. Schuller, Yunqing Xia, and C. Havasi. New avenues in opinion mining and sentiment analysis. Intelligent Systems, IEEE, 28(2):15–21, March 2013.
* [Carton, 1996] John S Carton. The differential effects of tangible rewards and praise on intrinsic motivation: A comparison of cognitive evaluation theory and operant theory. The Behavior Analyst, 19(2):237, 1996.
* [Crowell et al., 1988] Charles R Crowell, D Chris Anderson, Dawn M Abel, and Joseph P Sergio. Task clarification, performance feedback, and social praise: Procedures for improving the customer service of bank tellers. Journal of Applied Behavior Analysis, 21(1):65–71, 1988.
* [Deng and Wiebe, 2014] Lingjia Deng and Janyce Wiebe. Sentiment propagation via implicature constraints. In EACL, pages 377–385, 2014.
* [Ding and Riloff, 2016] Haibo Ding and Ellen Riloff. Acquiring knowledge of affective events from blogs using label propagation. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), 2016.
* [Fogg and Nass, 1997] B.J. Fogg and C. Nass. Silicon sycophants: the effects of computers that flatter. International Journal of Human-Computer Studies, 46(5):551–561, 1997.
* [Garland et al., 2008] Ann F Garland, Kristin M Hawley, Lauren Brookman-Frazee, and Michael S Hurlburt. Identifying common elements of evidence-based psychosocial treatments for children’s disruptive behavior problems. Journal of the American Academy of Child & Adolescent Psychiatry, 47(5):505–514, 2008.
* [Henderlong and Lepper, 2002] Jennifer Henderlong and Mark R Lepper. The effects of praise on children’s intrinsic motivation: a review and synthesis. Psychological bulletin, 128(5):774, 2002.
* [Hofmann et al., 2012] Stefan G Hofmann, Anu Asnaani, Imke JJ Vonk, Alice T Sawyer, and Angela Fang. The efficacy of cognitive behavioral therapy: a review of meta-analyses. Cognitive therapy and research, 36(5):427–440, 2012.
* [Kanouse et al., 1981] David E Kanouse, Peter Gumpert, and Donnah Canavan-Gumpert. The semantics of praise. New directions in attribution research, 3:97–115, 1981.
* [Kaptein et al., 2010] Maurits Kaptein, Panos Markopoulos, Boris Ruyter, and Emile Aarts. Two acts of social intelligence: the effects of mimicry and social praise on the evaluation of an artificial agent. AI & SOCIETY, 26(3):261–273, 2010.
* [Kawahara and Kurohashi, 2006] Daisuke Kawahara and Sadao Kurohashi. A fully-lexicalized probabilistic model for japanese syntactic and case structure analysis. In Proceedings of the Main Conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, HLT-NAACL ’06, pages 176–183, Stroudsburg, PA, USA, 2006. Association for Computational Linguistics.
* [Koestner et al., 1987] Richard Koestner, Miron Zuckerman, and Julia Koestner. Praise, involvement, and intrinsic motivation. Journal of personality and social psychology, 53(2):383, 1987.
* [Kohlberg, 1981] Lawrence Kohlberg. The Philosophy of Moral Development. Harper and Row, 1st edition, 1981.
* [Lipnevich and Smith, 2008] Anastasiya A Lipnevich and Jeffrey K Smith. Response to assessment feedback: The effects of grades, praise, and source of information. ETS Research Report Series, 2008(1):i–57, 2008.
* [Nakamura, 1993] Akira Nakamura. Kanjo hyogen jiten [Dictionary of Emotive Expressions]. Tokyodo Publishing, 1993.
* [NLRI, 1964] National Language Research Institute NLRI. Bunrui Goi Hyo (Word List by Semantic Principles, in Japanese). Shuei Shuppan, 1964.
* [Ptaszynski et al., 2012] Michal Ptaszynski, Pawel Dybala, Rafal Rzepka, Kenji Araki, and Yoshio Momouchi. Yacis: A five-billion-word corpus of japanese blogs fully annotated with syntactic and affective information. In Proceedings of The AISB/IACAP World Congress, pages 40–49, 2012.
* [Rzepka and Araki, 2015] Rafal Rzepka and Kenji Araki. Rethinking Machine Ethics in the Age of Ubiquitous Technology, chapter Semantic Analysis of Bloggers Experiences as a Knowledge Source of Average Human Morality, pages 73–95. Hershey: IGI Global, 2015.
* [Sarma et al., 2014] Bandita Sarma, Amitava Das, and Rodney D Nielsen. A framework for health behavior change using companionable robots. INLG 2014, page 103, 2014.
* [Socher et al., 2013] Richard Socher, Alex Perelygin, Jean Y Wu, Jason Chuang, Christopher D Manning, Andrew Y Ng, and Christopher Potts. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the conference on empirical methods in natural language processing (EMNLP), pages 1631–1642. Citeseer, 2013.
* [Strain et al., 1983] Phillip S Strain, Deborah L Lambert, Mary Margaret Kerr, Vaughan Stagg, and Donna A Lenkner. Naturalistic assessment of children’s compliance to teachers’ requests and consequences for compliance. Journal of Applied Behavior Analysis, 16(2):243–249, 1983.
* [Strapparava and Mihalcea, 2008] Carlo Strapparava and Rada Mihalcea. Learning to identify emotions in text. In Proceedings of the 2008 ACM symposium on Applied computing, pages 1556–1560. ACM, 2008.
* [Swann et al., 1992] William B Swann, Richard M Wenzlaff, and Romin W Tafarodi. Depression and the search for negative evaluations: more evidence of the role of self-verification strivings. Journal of Abnormal Psychology, 1992.
* [Takamura et al., 2005] Hiroya Takamura, Takashi Inui, and Manabu Okumura. Extracting semantic orientations of words using spin model. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pages 133–140. Association for Computational Linguistics, 2005.
* [Takeuchi et al., 2010] Koichi Takeuchi, Suguru Tsuchiyama, Masato Moriya, and Yuuki Moriyasu. Construction of argument structure analyzer toward searching same situations and actions. Technical Report 390, IEICE technical report. Natural language understanding and models of communication, Jan 2010.
* [Tejima and Bunki, 2001] Noriyuki Tejima and Hitomi Bunki. Feasibility of measuring the volition level in elderly patients when using audio encouragement during gait training physical therapy. In Engineering in Medicine and Biology Society, 2001. Proceedings of the 23rd Annual International Conference of the IEEE, volume 2, pages 1393–1395. IEEE, 2001.
* [Tejima et al., 1998] Noriyuki Tejima, Yoko Takahashi, and Hitomi Bunki. Verbal-encouragement algorithm in gait training for the elderly. In Engineering in Medicine and Biology Society, 1998. Proceedings of the 20th Annual International Conference of the IEEE, volume 5, pages 2724–2725. IEEE, 1998.
* [Weiner et al., 1972] Bernard Weiner, Heinz Heckhausen, and Wulf-Uwe Meyer. Causal ascriptions and achievement behavior: a conceptual analysis of effort and reanalysis of locus of control. Journal of personality and social psychology, 21(2):239, 1972.
* [Wilson et al., 2005] Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the conference on human language technology and empirical methods in natural language processing, pages 347–354. Association for Computational Linguistics, 2005.
* [Wright et al., 2005] Jesse H. Wright, Andrew S. Wright, Anne Marie Albano, Monica R. Basco, L. Jane Goldsmith, Troy Raffield, and Michael W. Otto. Computer-assisted cognitive therapy for depression: Maintaining efficacy while reducing therapist time. The American Journal of Psychiatry, 162(6):1158–64, Jun 2005.
* [Zimmerman et al., 2009] Frederick J Zimmerman, Jill Gilkerson, Jeffrey A Richards, Dimitri A Christakis, Dongxin Xu, Sharmistha Gray, and Umit Yapanel. Teaching by listening: The importance of adult-child conversations to language development. Pediatrics, 124(1):342–349, 2009.