Evolution of Emotions and Sentiments in an Online Learning Community

Ifeoma Adaji
Computer Science Department
University of Saskatchewan
Saskatoon, Saskatchewan, Canada
ifeoma.adaji@usask.ca

Oluwabunmi Olakanmi
Computer Science Department
University of Alberta
Edmonton, Alberta, Canada
olakanmi@ualberta.ca

ABSTRACT
We explore how the emotions and sentiments of users of Stack Overflow evolve over time based on their reputation scores. In this initial exploratory study, we compute four dimensions of emotions and sentiments: analytic, clout, authentic and tone for the question and answer posts of users in our dataset. Our results indicate that Stack Overflow users, both experts and non-experts, become more analytic and less authentic over time in both question and answer posts irrespective of their reputation score. In addition, while the clout of non-experts decreased over time in both question and answer posts, the clout of experts only decreased in their answer posts and not in their question posts. These results, though preliminary, could throw some light on the evolution of the emotional state of learners in online learning communities such as Stack Overflow.

CCS CONCEPTS
Human-centered computing → Collaborative and social computing systems and tools

KEYWORDS
Community question answering, Sentiment analysis, Stack Overflow, Online learning community, Lifelong learning

ACM Reference format:
Ifeoma Adaji, and Oluwabunmi Olakanmi. 2019. Evolution of Emotions and Sentiments in an Online Learning Community. In Adjunct Proceedings of Artificial Intelligence in Education (AIED). Chicago, IL, USA, 5 pages.

1 Introduction
With the advent of Web 2.0, more people are seeking answers to their computer programming questions on online forums such as Stack Overflow¹. Stack Overflow is a question and answer site where users can ask and answer specific computer programming questions. Active and helpful members of this community get rewarded for their participation in the form of reputation points and privileges². Despite the popularity of Stack Overflow and the reward system of the community, only a few users actively participate in the network [13].

Research [11] suggests that rewards can change the attitude and behavior of people in different ways. Because Stack Overflow is heavily based on rewards, it is important to investigate if people who have earned several rewards over time have changed their behavior in the community. It is equally important to understand the evolution of the behavior of users who have earned only a few rewards in the community over time. Therefore, we carried out a preliminary study of the emotions and sentiments of some users in the community over a five-year period.

Emotions and sentiments are central to cognition in learning because they can shape learners’ engagement and also influence their overall learning experiences [6], [8], [4]. In online learning environments such as Stack Overflow, where there are no dedicated instructors, there is a need to design awareness of users’ emotions and sentiments into the system, in order to arouse their feelings of security and self-confidence, which are important to encourage their continued participation. Research has measured emotions in online learning environments [5], but the trends and changes in learners’ emotions and sentiments have not been measured, particularly in the context of lifelong learning. Therefore, we measured the emotions and sentiments of Stack Overflow users with the aim of determining if the emotions and sentiments of learners change over time for users who have earned a high reputation, and also for users who have not earned a high reputation. In addition, we wanted to determine if there were any changes in emotions and sentiments of these groups of users in their question and answer posts.

¹ https://stackoverflow.com
² https://stackoverflow.com/help/whats-reputation

Copyright held by the author(s). Use permitted under the CC-BY license CreativeCommons.org/licenses/by/4.0/

2 Background

2.1 Stack Overflow
Stack Overflow³ is a community question and answer platform where users can ask and answer specific IT related questions. Authors of questions can earn reputation and rewards when their posts get upvoted. The upvotes and downvotes are used to compute the final score of a post and are also a means through which users earn reputation. By posting high-quality questions and useful answers, users can gain reputation. The higher the reputation score, the more privileges the user can earn. Privileges control what users can do in Stack Overflow. Stack Overflow currently has over 5 million users with over 11 million questions.

We chose Stack Overflow because it is a lifelong learning platform where professionals and programming enthusiasts can post questions and answers to support their continuous professional development. In addition, Stack Overflow data are readily available to be queried for the purpose of our research.

³ https://stackoverflow.com/

2.2 Linguistic Inquiry and Word Count (LIWC) Tool
In this paper, we identify the sentiments and emotions of users in Stack Overflow using the Linguistic Inquiry and Word Count tool (LIWC) [12], [16]. The LIWC tool reads text and determines what percentage of words in the text reflect various dimensions of sentiments and emotions of the writer based on its built-in dictionary of over 6,400 words. Although LIWC computes several dimensions of emotions and sentiments, this being an exploratory study, we are only interested in four dimensions: analytic, clout, authentic and tone.

According to the LIWC tool⁴, the dimension analytic represents the extent to which people use formal words, and how logical and hierarchical their thinking patterns are. People low in analytical thinking typically write in more narrative ways, use less formal logic and rely on knowledge gained from personal experiences [10]. On the other hand, people high in analytical thinking use formal logic, are more detailed in their explanations and avoid contradiction [10].

The dimension clout, as defined by the LIWC tool, is an indication of the social status, confidence or leadership displayed by an individual through their writing or speaking. People with higher clout typically use more first-person plural (such as “we”) and second-person singular pronouns (such as “you”). In addition, they use fewer first-person singular pronouns (such as “I”) [7]. People in this category tend to focus their attention outwards, towards the people they are interacting with. On the other hand, people low in clout are more self-focused and use more first-person singular pronouns (such as “I”) [7].

People that are high in the dimension authentic, according to the LIWC tool, reveal themselves to others (through their writing) in a more honest way. Such people are more personal, humble and vulnerable. On the other hand, people that are lower in authenticity use words that show lower cognitive complexity and more negative emotion words [9].

Tone, according to the LIWC tool, describes the emotion of the author. It summarizes negative and positive emotion dimensions into one variable. The higher the number, the more positive the tone.

We chose the LIWC tool because it has been used extensively in research for analyzing user-generated content in online systems [1, 2], [3], [16], [7], [15]. In addition, while several sentiment analysis libraries and packages detect positive, negative and neutral sentiments, the LIWC tool detects sentiments in addition to other traits such as emotions and personality traits [16].

3 Data and Methodology
To carry out this study, we used data from Stack Overflow’s data explorer⁵. Stack Overflow’s data explorer enables one to directly query Stack Overflow’s publicly available dataset. We were interested in the emotions and sentiments displayed in the last five years by users who are currently active in the system and who joined the community about the same time, just before the five-year period. In addition, we were interested in both question and answer posts. Furthermore, we also wanted to compare the users who earned high reputation to those who do not have high reputation in the five-year period. To meet these criteria, we selected posts of users based on the following:
1. Question and answer posts of users whose accounts were created in January 2013
2. Question and answer posts of users who meet the above criterion and have been active in 2019

To split the posts of users into two groups based on reputation, we computed the average reputation score of users who met the criteria stated above. The average reputation is 551. We thus split our users’ posts into two groups: posts created by users with a reputation score of at least 551 (the expert group) and posts created by users with a reputation score of less than 551 (the non-expert group)⁶. Table 1 summarizes the data that was collected from Stack Overflow for this study.

Table 1. Summary of data extracted from Stack Overflow’s data explorer for this study

Number of users                                14,894
Number of question posts   Expert group        43,035
                           Non-expert group    38,334
Number of answer posts     Expert group        137,425
                           Non-expert group    32,480

⁵ https://data.stackexchange.com/stackoverflow/query/new
⁶ These users are not necessarily experts or non-experts in the community. The terminology was only used to differentiate between the groups.
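The reputation-based split described above can be illustrated with a short sketch. It is not the authors' code: the function name, the dictionary layout and the toy reputation values are hypothetical, chosen only so that the mean works out to the threshold of 551 reported in the text.

```python
from statistics import mean


def split_by_mean_reputation(reputation_by_user):
    """Split users at the mean reputation: at or above it is 'expert', below is 'non-expert'."""
    threshold = mean(reputation_by_user.values())
    experts = sorted(u for u, rep in reputation_by_user.items() if rep >= threshold)
    non_experts = sorted(u for u, rep in reputation_by_user.items() if rep < threshold)
    return threshold, experts, non_experts


# Hypothetical toy reputations, not data from the study.
toy_reputations = {"u1": 1200, "u2": 40, "u3": 551, "u4": 413}
threshold, experts, non_experts = split_by_mean_reputation(toy_reputations)
print(threshold, experts, non_experts)
```

A user whose reputation equals the threshold falls in the expert group, matching the "at least 551" wording. Because reputation scores are typically skewed, a mean-based threshold can leave far fewer users above it than below it; the sketch makes no attempt to correct for that.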
⁴ http://liwc.wpengine.com/interpreting-liwc-output/

Using the LIWC tool, the dimensions analytic, clout, authentic, and tone were computed for each post for each year from 2014 to 2018. This was done for the expert and non-expert groups and for question and answer posts. The average score of each dimension for each year was then computed for each group of posts. We computed the Z-score of each year’s average to standardize the scores across the four dimensions of analytic, clout, authentic, and tone.

4 Results and Discussion
In this section, we present the results of our analysis of the emotions and sentiments expressed by both the expert and non-expert users in their question and answer posts.

4.1 Expert Users’ Group
Figure 1a shows the standardized results for answer posts of the four dimensions of emotions and sentiments over five years for users in Stack Overflow who joined the community in January 2013, have a reputation score of at least 551 and have been active in 2019. We termed these users experts for the purpose of this study. Figure 1b shows the results for the same group of users but for question posts.

Figure 1a. Sentiments and emotions of answers posted by expert users in our data set over 5 years

Figure 1b. Sentiments and emotions of questions posted by expert users in our data set over 5 years

Our results in Figures 1a and 1b suggest that the expert users became more analytical in writing question and answer posts over time. According to the LIWC tool⁷, the dimension analytic suggests the extent to which people use formal words, and how logical and hierarchical their thinking patterns are. While people with low analytical thinking typically write in more narrative ways and use less formal logic [10], people high in analytical thinking use formal logic and are more detailed in their posts [10]. Stack Overflow is a question and answer forum for computer programming questions. It is therefore not surprising that the users who have been active in the network over time have become more analytical in their writing.

Our results in Figures 1a and 1b also suggest that expert users become less authentic over time in both question and answer posts. According to the LIWC tool¹², people who reveal themselves to others (through their writing) in an authentic or honest way are more personal, humble and vulnerable. On the other hand, people that are lower in authenticity use words that show lower cognitive complexity and more negative emotion words [9]. This result was unexpected because the increase in analytic writing described in the previous paragraph does not suggest lower cognitive complexity. Therefore, we plan to explore this finding in future work.

The LIWC tool¹² defines clout as the social status, confidence and leadership people display through their writing. As shown in Figures 1a and 1b, while clout decreases over time for answer posts, it increases for question posts. People with higher clout typically use more first-person plural (such as “we”) and second-person singular pronouns (such as “you”). In addition, they use fewer first-person singular pronouns (such as “I”) [7]. Our results could be an indication that, over time, experts have displayed more confidence and leadership when asking questions than when answering them, using “we” and “you” more often and “I” less often. We intend to explore this further in future research.

The tone of users, as indicated in Figures 1a and 1b, increases over time in the answer posts and decreases in the question posts. The LIWC tool describes the tone as the emotions expressed by users in their posts. In the future, we will investigate this difference in emotions.

⁷ http://liwc.wpengine.com/interpreting-liwc-output/
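The standardized series plotted in Figures 1a and 1b follow the procedure from Section 3: average each dimension per year, then convert the yearly averages to Z-scores. The sketch below shows one way to do this, with z = (yearly average − mean of the yearly averages) / standard deviation of the yearly averages. The paper does not say whether the population or the sample standard deviation was used; the population form is assumed here. The function names, the record layout and the toy scores are hypothetical, and the LIWC scores themselves are taken as given since LIWC is a separate tool.

```python
from collections import defaultdict
from statistics import mean, pstdev


def yearly_means(posts, dimension):
    """Average one LIWC dimension over all posts of each year."""
    scores_by_year = defaultdict(list)
    for post in posts:
        scores_by_year[post["year"]].append(post[dimension])
    return {year: mean(scores) for year, scores in sorted(scores_by_year.items())}


def z_scores(value_by_year):
    """Standardize yearly averages: z = (value - mean) / standard deviation."""
    values = list(value_by_year.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # All yearly averages are identical, so there is no trend to standardize.
        return {year: 0.0 for year in value_by_year}
    return {year: (value - mu) / sigma for year, value in value_by_year.items()}


# Hypothetical toy scores, not data from the study.
toy_posts = [
    {"year": 2014, "analytic": 8.0},
    {"year": 2014, "analytic": 12.0},
    {"year": 2015, "analytic": 20.0},
    {"year": 2016, "analytic": 28.0},
    {"year": 2016, "analytic": 32.0},
]
analytic_means = yearly_means(toy_posts, "analytic")
analytic_z = z_scores(analytic_means)
print(analytic_means)
print(analytic_z)
```

Standardizing each dimension separately puts all four on a common scale, which is what allows them to share one axis in a figure. It also means that a Z-score only describes a year relative to the other years of the same series, not the absolute level of the dimension.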
4.2 Non-Expert Users’ Group
Figure 2a shows the standardized results for answer posts of the four dimensions of emotions and sentiments over five years for users in Stack Overflow who joined the community in January 2013, have a reputation score that is less than 551 and have been active in 2019. We termed these users non-experts for the purpose of this study. Figure 2b shows the results for the same group of users but for question posts.

Figure 2a. Sentiments and emotions of answers posted by non-expert users in our data set over 5 years

Figure 2b. Sentiments and emotions of questions posted by non-expert users in our data set over 5 years

Similar to the posts of experts shown in Figures 1a and 1b, the analytical thinking of non-experts also increased over time for both question and answer posts, as shown in Figures 2a and 2b. This is expected since Stack Overflow is a computer programming question and answer platform where learning occurs over time [14]. Because computer programming is analytical, users who use the platform often could come to write more analytical posts over time. Since our dataset includes users who, despite joining the platform in 2013, are still active in 2019, the result is not unexpected.

Clout decreased over time in both the question and answer posts of non-experts. Clout in LIWC refers to confidence and leadership. Our results could imply that the users with lower reputation scores are not confident in their posts, possibly because they are still learning.

The results presented here are our initial exploratory findings. More analyses have to be carried out to determine if, for example, the differences in sentiments and emotions over the five years are significant or not. We plan to do that in the future.

5 Conclusion and Future Work
We explored the five-year trend of sentiments and emotions of users on Stack Overflow.
The results of our sentiment analysis using the Linguistic Inquiry and Word Count tool suggest that, irrespective of their reputation score, users become more analytical and less authentic over time. While the clout of non-experts decreased over time in both question and answer posts, the clout of experts only decreased in their answer posts and not in their question posts.

These results, though preliminary, could shed some light on the evolution of the emotional state and sentiments of learners in online learning environments such as Stack Overflow, based on their reputation. More analyses are being carried out to determine the significance of the differences in the sentiments and emotions over the five-year period.

REFERENCES
[1] Adaji, I., Oyibo, K. and Vassileva, J. 2018. Understanding low review ratings in online communities: A personality based approach. CEUR Workshop Proceedings (2018).
[2] Adaji, I., Sharmaine, C., Debrowney, S., Oyibo, K. and Vassileva, J. 2018. Personality Based Recipe Recommendation Using Recipe Network Graphs. International Conference on Social Computing and Social Media (Las Vegas, Jul. 2018), 161–170.
[3] Bazelli, B., Hindle, A. and Stroulia, E. 2013. On the Personality Traits of StackOverflow Users. 2013 IEEE International Conference on Software Maintenance (Sep. 2013), 460–463.
[4] Clarizia, F., Colace, F., De Santo, M., Lombardi, M., Pascale, F. and Pietrosanto, A. 2018. E-learning and sentiment analysis. Proceedings of the 6th International Conference on Information and Education Technology - ICIET ’18 (New York, New York, USA, 2018), 111–118.
[5] Cleveland-Innes, M. and Campbell, P. 2012. Emotional presence, learning, and the online learning environment. The International Review of Research in Open and Distributed Learning. 13, 4 (2012), 269–292.
[6] Cleveland-Innes, M. and Campbell, P. 2006. Understanding emotional presence in an online community of inquiry. In 12th Annual SLOAN-C ALN Conference (Orlando, Florida, 2006).
[7] Kacewicz, E., Pennebaker, J.W., Davis, M., Jeon, M. and Graesser, A.C. 2014. Pronoun Use Reflects Standings in Social Hierarchies. Journal of Language and Social Psychology. 33, 2 (Mar. 2014), 125–143. DOI:https://doi.org/10.1177/0261927X13502654.
[8] Linnenbrink-Garcia, L. and Pekrun, R. 2011. Students’ emotions and academic engagement: Introduction to the special issue. Contemporary Educational Psychology. 36, 1 (Jan. 2011), 1–3. DOI:https://doi.org/10.1016/J.CEDPSYCH.2010.11.004.
[9] Newman, M.L., Pennebaker, J.W., Berry, D.S. and Richards, J.M. 2003. Lying Words: Predicting Deception from Linguistic Styles. Personality and Social Psychology Bulletin. 29, 5 (May 2003), 665–675. DOI:https://doi.org/10.1177/0146167203029005010.
[10] Nisbett, R., Peng, K., Choi, I. and Norenzayan, A. 2001. Culture and systems of thought: holistic versus analytic cognition. Psychological Review. 108, 2 (2001), 291.
[11] Oinas-Kukkonen, H. and Harjumaa, M. 2009. Persuasive systems design: Key issues, process model, and system features. Communications of the Association for Information Systems. 24, 1 (2009), 28.
[12] Pennebaker, J. 2001. Linguistic inquiry and word count: LIWC 2001. Mahwah: Lawrence Erlbaum Associates.
[13] Pudipeddi, J.S., Akoglu, L. and Tong, H. 2014. User churn in focused question answering sites: characterizations and prediction. Proceedings of the companion publication of the 23rd international conference on World wide web companion (2014), 469–474.
[14] Rekha, V. and Venkatapathy, S. 2015. Understanding the usage of online forums as learning platforms. Procedia Computer Science. Elsevier.
[15] Riffe, D., Lacy, S. and Fico, F. 2006. Analyzing media messages: Using quantitative content analysis in research. Routledge.
[16] Tausczik, Y.R. and Pennebaker, J.W. 2010. The Psychological Meaning of Words: LIWC and Computerized Text Analysis Methods. Journal of Language and Social Psychology. 29, 1 (Mar. 2010), 24–54. DOI:https://doi.org/10.1177/0261927X09351676.