A report of the CL-Aff OffMyChest Shared Task: Modeling Supportiveness and Disclosure

Kokil Jaidka(1), Iknoor Singh(2), Jiahui Lu(3), Niyati Chhaya(4), and Lyle Ungar(5)
(1) National University of Singapore, Singapore
(2) Panjab University, India
(3) Nanyang Technological University, Singapore
(4) Adobe Research, India
(5) University of Pennsylvania, USA
jaidka@nus.edu.sg

Abstract. This overview describes the official results of the CL-Aff Shared Task 2020 – #OffMyChest. The Shared Task comprised a semi-supervised classification task and an open-ended knowledge modeling task on a dataset of Reddit comments with annotations crowdsourced from Amazon Mechanical Turk. The Shared Task was organized as a part of the 3rd Workshop on Affective Content Analysis @ AAAI-20, held in New York, USA, on February 7, 2020. This paper compares the participating systems in terms of their accuracy and F-1 scores at predicting different facets of self-disclosure. Feedback from the system runs was used to weed out labeling errors in the test set. The annotated test and training datasets, instructions, and the scripts used for evaluation are available at the GitHub repository.

1 Introduction

There is a growing interest in understanding how humans initiate and hold conversations online. A plethora of social media platforms has emerged and been adopted by internet communities worldwide. Different cultures and communities have emerged around different social media platforms [3]: some social networking sites are intended more for discussions among professional contacts, e.g., LinkedIn; others are often appropriate for pursuing topical interests, e.g., Twitter, or for having reasoned debates, e.g., Reddit; still others were developed to provide technical support, e.g., StackOverflow. A defining feature of these platforms is how their social norms differ. On different platforms, people choose to respond differently to each other and share different kinds of information about themselves [6].
An interesting research problem that arises is to quantify the levels of disclosure and to apply them for cross-sectional or longitudinal analysis of social norms and platforms. In this Shared Task, we take the first step towards approaching these problems, by examining the affective aspect of online conversations among strangers. Our aim is to build a new resource to model how social media users reciprocate in conversations, with emotional and informational behavior that either offers self-revelation or moral support. In this paper, we introduce the OffMyChest conversation dataset and present the results of the concluded 2nd Computational Linguistics Affect Understanding (CL-Aff) Shared Task on modeling interactive affective responses. It was held in February 2020 as a part of the AAAI Annual Meeting in New York.

Copyright 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). In: N. Chhaya, K. Jaidka, J. Healey, L. H. Ungar, A. Sinha (eds.): Proceedings of the 3rd Workshop of Affective Content Analysis, New York, USA, 07-FEB-2020, published at http://ceur-ws.org

2 Background

Previous work exploring disclosure and support has usually examined its evidence in health forums [14,12]. In studies on general social media posts [11], women were found to self-disclose more than men, and people with a stronger desire for impression management are less likely to disclose about themselves online. Cross-platform differences in language can enable greater or lesser predictive accuracy at identifying users' demographic information [6]. Anonymity is one of the many technological affordances which is expected to make it easier for individuals to express negative feelings online [8]. Previous findings offer a way to understand how platform behavior can differ, but they do not differentiate between the information and emotional aspects of disclosure and support.
Our Shared Task is motivated to address this research gap and to offer a way to distinguish emotional expressions from emotional support and informational disclosure from informational support. The ability to distinguish between these aspects would allow targeted interventions where mental health issues may be evident or where users' personal information may be at risk when they share too many personal details about themselves.

The work closest to our interest has provided annotation schemes to codify the type of disclosure [2] and support [12] in online help forums. Their work reports that support forums offer a higher degree of self-disclosure than discussion forums [2]. Furthermore, they reported that self-disclosure was often reciprocal, and reciprocity was more likely among female than male respondents. Other findings suggest that it is emotional support [12], rather than informational support, that predicts users' longevity in a health support group. On the other hand, informational support satisfied members' short-term information needs.

We were inspired to explore how easily these notions of disclosure and support can generalize into understanding casual conversations between users. To denoise the data, we decided to focus on discussions of relationships and opted to focus on Reddit sub-communities, which are likely to offer better training data thanks to the enforced community rules and strict moderation.

First, we provide the definitional scope of disclosure and support for the CL-Aff Shared Task:

Emotional disclosure: Comments that mention the author's feelings. Examples:
  – "My only concern was for my son."
  – "Fuck me that is beautiful."
  – "Thanks for sharing the story."
  – "My heart melted reading this xx"
  – "I'm literally too jealous"
  – "My heart is breaking for you."

Informational disclosure: Comments that contain at least some personal information about the author.
Examples:
  – "I'm now 65 years old"
  – "I've worked with kids with ODD and autism."
  – "I live in West Philly."
  – "Sounds like our bipolar kid."
  – "She posted a screenshot of his porn history (gross)"
  – "My mum told me that she was sexually abused as a kid."

Emotional support: The comment is offering sympathy, caring, or encouragement. Examples:
  – "Good luck, this shit is tough"
  – "Good luck! but I'm afraid I have no advice"
  – "You sound like a great person"
  – "I'm so sorry."
  – "That's a great story."

Informational support: This comment is offering specific information, practical advice, or suggesting a course of action. Examples:
  – "I wouldnt.."
  – "You shouldn't.."
  – "You can't.."
  – "Why didn't you try this?"
  – "Please talk to a professional."

3 Corpus

On Reddit, discussions of relationships typically happen on the r/relationships community. However, a preliminary examination suggested that the discussions are not the kind of 'casual' conversations we were aiming for, and are instead more similar to a support forum. Responses to posts in this community would be skewed towards greater support and disclosure. We wanted a neutral, easy-to-generalize situation, where the pressure to reciprocate is substantively reduced.

After further exploration, we decided to mix data from two subreddits. The first one we selected was r/CasualConversation, a 'friendlier' sub-community where people are encouraged to share what's on their mind about any topic. In essence, this is similar to the posting behavior encouraged on a typical social media platform. The second one we selected was r/OffMyChest, intended as 'a mutually supportive community where deeply emotional things you can't tell people you know can be told.' We anticipated that a mixture of labeled data from both these platforms would give us a degree of heterogeneity in the confessional and emotional behavior while preserving the high topicality and post quality that is typical of Reddit posts.
We provide further details of the dataset in the following subsections.

3.1 Dataset description

The CL-Aff corpus comprises the following:
  – Unlabeled training set of posts (N = 17,392): The top posts in 2018 in r/CasualConversation and r/OffMyChest mentioning any of the terms boyfriend, girlfriend, husband, wife, gf, bf. Posts that are parents of comments in the training and test sets are separately identified.
  – Unlabeled training set of comments (N = 420,000): Over 420k sentences extracted from 130k comments posted to the unlabeled set of posts mentioned above.
  – Labeled training set (N = 12,860): 12,860 labeled sentences, extracted from the top comments posted to the top posts of the Reddit communities mentioned above.
  – Test set (N = 5,000): Labeled sentences, extracted from the top comments made to the posts mentioned above.

A detailed breakdown of the labeled training and test sets is provided in Table 1.

Table 1: CL-Aff #OffMyChest dataset statistics. Total number of instances and positive instances for each of the labels provided.

                            r/OffMyChest    r/CasualConversation
Training set
  Emotional disclosure      2449            1499
  Information disclosure    2749            2142
  Emotional support         901             349
  Information support       772             234
  Total observations        7613            5247
Test set
  Emotional disclosure      2301            1237
  Information disclosure    1237            1158
  Emotional support         1094            406
  Information support       854             316
  Total observations        3257            1743

3.2 Data collection

Data was collected by first subsetting on the posts discussing relationships that were posted to either r/OffMyChest or r/CasualConversation. Posts about relationships were identified based on the presence of the seed words relating to romantic partners. Posts were then deduplicated, and all their underlying comments were collected. A sentence splitter was applied to obtain sentences, and a random sample of sentences which were at least 10 characters in length was then used for the pilot and confirmatory annotation tasks.
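The collection steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the organizers' actual pipeline: the regular-expression sentence splitter, the exact-match deduplication, and all function names are hypothetical stand-ins, since the text does not name the tools that were used. Only the seed terms and the 10-character minimum come from the dataset description.

```python
import re

# Seed terms listed in the dataset description.
SEED_TERMS = ("boyfriend", "girlfriend", "husband", "wife", "gf", "bf")
SEED_PATTERN = re.compile(r"\b(?:" + "|".join(SEED_TERMS) + r")\b", re.IGNORECASE)

# Assumption: a naive splitter that breaks after ., ! or ? followed by
# whitespace. The paper does not say which sentence splitter was used.
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")

# Minimum sentence length stated in the data collection section.
MIN_SENTENCE_CHARS = 10


def mentions_relationship(post_text):
    """Return True if the post contains any seed term as a whole word."""
    return SEED_PATTERN.search(post_text) is not None


def deduplicate(posts):
    """Drop exact repeats (ignoring case and outer whitespace), keeping the first copy."""
    seen = set()
    unique = []
    for post in posts:
        key = post.strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(post)
    return unique


def candidate_sentences(comment_text):
    """Split a comment into sentences and keep those of at least 10 characters."""
    sentences = (s.strip() for s in SENTENCE_BOUNDARY.split(comment_text))
    return [s for s in sentences if len(s) >= MIN_SENTENCE_CHARS]
```

In this sketch, posts would first be filtered with `mentions_relationship`, then passed through `deduplicate`, and the comments under the surviving posts would be fed to `candidate_sentences` before random sampling. Whole-word matching is a design choice of the sketch, so that short seed terms such as "bf" do not match inside longer words.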
4 Annotation Annotators were required to annotate each moment according to the inset ques- tionnaire. The Disclosure and Support characteristics of each sentence were fi- nally transformed into a binary (yes/no) coding and the labels were assigned based on a simple majority agreement between five independent annotators. Only labels with 60% - 100% agreement were retained. The pairwise percentage agreement on the final dataset was 71.2% each for emotional and informational disclosure, and 84.5% and 83.9% for emotional and informational support. Instructions In this job, you will be presented with a comment made on Red- dit, a popular discussion forum worldwide. The topic of the discussion is a casual conversation or a confession. Review the text of the comment and help us by answering a few yes/no questions about it. Each HIT takes about 30 seconds: Is this comment SHARING PERSONAL FEELINGS? NO/A LIT- TLE/A LOT – NO: This comment does not mention the author’s feelings about anything. (”It’s a book by Hemingway”; ”Are you ok?”; ”She was really mad at me.”) – A LITTLE: This comment mentions the author’s mild positive or negative feelings. (”My only concern was for my son.”; ”Fuck me that is beautiful.”; ”Thanks for sharing the story.”) – A LOT: This comment contains deep positive or negative feelings or tears. (”My heart melted reading this xx”; ”I’m not crying, you’re crying!”; ”I’m literally too jealous”; ”My heart is breaking for you.”) Is this comment SHARING PERSONAL INFORMATION? NO/A LITTLE/A LOT – NO: This comment does not mention the author’s feelings about anything. (”It’s a book by Hemingway”; ”Are you ok?”; ”She was really mad at me.”) – A LITTLE: This comment mentions the author’s mild positive or negative feelings. (”My only concern was for my son.”; ”Fuck me that is beautiful.”; ”Thanks for sharing the story.”) – A LOT: This comment contains deep positive or negative feelings or tears. 
(”My heart melted reading this xx”; ”I’m not crying, you’re crying!”; ”I’m literally too jealous”; ”My heart is breaking for you.”) Is this comment SUPPORTIVE? YES/NO – YES: This comment is offering support to someone, either through sym- pathy, encouragement, or advice. (”Good luck, this shit is tough”; ”Good luck! but I’m afraid I have no advice”; ”Hey you tried your best”; ”Have you tried family therapy?”) – NO: This comment does not offer any support.. (”Thank you for your time.”; ”This is so sweet.”; ”Badass grandpa.”; I’m now 65 years old”; ”I’ve worked with kids with ODD and autism”; ”I live in West Philly.”) Is this comment SUPPORTIVE? YES/NO – GENERAL SUPPORT: The comment is offering general support through quotes and catchphrases. (”What’s the worst that could happen?”; ”You only die once.”; ”All’s well that ends well.” ) (YES/NO) – INFORMATIONAL SUPPORT: The sentence is offering information, ad- vice, or suggesting a course of action. (”I wouldnt..”; ”You shouldn’t..”; ”You can’t..”. ”Why didn’t you try this?”; ”Please talk to a professional.”) – EMOTIONAL SUPPORT: The sentence is offering sympathy, caring, or encouragement. (”Good luck, this shit is tough”; ”Good luck! but I’m afraid I have no advice”; ”You sound like a great person”; ”I’m so sorry.”; ”That’s a great story.”) 5 Overview of Approaches Twelve teams signed up, and six teams finally submitted their results by the Shared Task deadline. The following paragraphs discuss the approaches followed by the participating systems, sorted in alphabetical order: – GATech USA[4]: The team from GATech followed a semi-supervised ap- proach comprising transformer-based models. Their regularization was pred- icated on the assumption that the class distribution in the test set would be similar to that of the training set. – Gyrfalcon[10]: The team from Gyrfalcon Technology, California, proposed an algorithm to map English words into squared glyphs images, which they call Super Characters. 
    These were implemented on a CNN Domain-Specific Accelerator in order to capture properties of disclosure and support.
  – International Institute of Information Technology India (IIIT) [9]: The IIIT-H team employed a predictive ensemble model that combined predictions from multiple models based on fine-tuned contextualized word embeddings, namely RoBERTa and ALBERT.
  – Pennsylvania State University USA (PennState) [1]: The PennState team also followed an ensemble approach, but with BERT, LSTM, and CNN neural networks. In their first model, they performed classification using BERT, fine-tuned their word representations, and obtained the hidden attention and sentence representation features in the CNN model, where they replaced the typical embedding layer with the pre-trained BERT model.
  – Sungkyunkwan team (SKKU) [5]: The SKKU team used a semi-supervised approach, with the original posts as contextual information, and applied BERT, GloVe, and Emotional GloVe embedding models to represent the text for label prediction.
  – University of Ottawa (UOttawa) Canada [13]: The University of Ottawa team applied a deep multi-task learning approach that employed the logical relationship among the different labels to create 'fragment layers,' which were used to build a multi-task deep neural network.

6 Results

6.1 Task 1: Predicting Disclosure and Support

This section compares the participating systems in terms of their performance. The results of the best-performing system runs from each of the participating teams are provided in Figure 1. The performance of individual system runs is provided in Table 2 and Table 3. For the detailed implementation of the individual runs, please refer to the system papers, which are included in this proceedings volume.

Figure 1a shows that predicting disclosure was evidently a harder problem than predicting support. The best performance at predicting both emotional and informational disclosure was obtained from the team from UOttawa [13] (Accuracy = .69).
The second and third spots for predicting emotional disclosure went to IIIT [9] and GATech [4], with an accuracy of .62 and .61, respectively. Predictive performances for informational disclosure were rather close to one another, with Gyrfalcon [10] and GATech [4] coming in close second and third places, with accuracies of .64 and .63, respectively.

Figure 1b shows that IIIT [9], UOttawa [13], and GATech [4] were neck-and-neck at predicting emotional and informational support, with IIIT getting a slight edge thanks to its performance on emotional support. The most successful runs can be identified by referring to Table 4.

[Figure 1: two bar charts of prediction accuracy by team (UOttawa Canada, IIIT India, GATech USA, Penn State USA, Gyrfalcon USA, CAS China). Panel (a): prediction accuracy for disclosure (emotional and informational). Panel (b): prediction accuracy for support (emotional and informational).]

Fig. 1: Accuracy scores for the best-performing system runs on Task 1 for each of the participating teams.

Table 2: Systems' performance in Task 1a, ordered by their accuracy on predicting emotional disclosure.
                          Emotional disclosure    Informational disclosure
System                    Accuracy    F1          Accuracy    F1
U.Ottawa run 1 [13]       0.70        0.64        0.66        0.65
U.Ottawa run 2 [13]       0.69        0.64        0.65        0.65
IIIT India run 6 [9]      0.62        0.61        0.62        0.62
GATech [4]                0.61        0.60        0.63        0.63
IIIT India run 2 [9]      0.61        0.60        0.63        0.63
IIIT India run 3 [9]      0.61        0.60        0.62        0.62
IIIT India run 4 [9]      0.61        0.60        0.62        0.62
IIIT India run 5 [9]      0.61        0.60        0.62        0.62
IIIT India run 1 [9]      0.60        0.59        0.62        0.62
IIIT India run 7 [9]      0.60        0.59        0.62        0.62
Penn State [1]            0.56        0.56        0.60        0.60
Gyrfalcon run 7 [10]      0.56        0.54        0.57        0.57
SKKU run 3 [5]            0.53        0.53        0.58        0.58
SKKU run 1 [5]            0.50        0.50        0.59        0.59
SKKU run 4 [5]            0.49        0.49        0.54        0.54
Gyrfalcon run 8 [10]      0.46        0.46        0.50        0.48
SKKU run 2 [5]            0.46        0.46        0.62        0.62
Gyrfalcon run 9 [10]      0.45        0.45        0.63        0.62
Gyrfalcon run 4 [10]      0.45        0.45        0.64        0.62
Gyrfalcon run 3 [10]      0.40        0.39        0.61        0.60
Gyrfalcon run 10 [10]     0.39        0.38        0.62        0.62
Gyrfalcon run 5 [10]      0.37        0.36        0.63        0.62
Gyrfalcon run 6 [10]      0.32        0.28        0.57        0.57
Gyrfalcon run 1 [10]      0.30        0.25        0.57        0.57
Gyrfalcon run 2 [10]      0.30        0.24        0.49        0.48

Four of the six systems that did Task 1 also did the bonus Task 2 to share insights based on the hidden attention or fragment layers in their deep learning models. The visualizations provided by UOttawa [13] are helpful in understanding how exactly the logical relationships between different labels are computed. Interestingly, their approach did not use any of the unlabeled data. Instead, their fragment layers appeared to infer the hierarchical relationship underlying the categories of disclosure and support.

7 Error Analysis

We conducted a meta-analysis of system performances for Task 1 over all the sentences in the test set. When we filtered the sentences for which all or most of the approaches reported a false negative, we noted that the errors could be attributed to mislabeling, especially in the case of emotional disclosure, which had an unexpectedly high error rate.
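The filtering step in this analysis can be illustrated with a short sketch. It assumes binary gold labels and one list of binary predictions per system run; the function name, the data layout, and the 0.5 threshold used for "most" are illustrative assumptions rather than the procedure actually used by the organizers.

```python
def consensus_false_negatives(gold, system_predictions, min_fraction=0.5):
    """Return indices of gold-positive sentences that most systems predicted negative.

    gold: list of 0/1 gold labels, one per test sentence.
    system_predictions: dict mapping a run name to its list of 0/1 predictions.
    min_fraction: hypothetical cut-off; a sentence is flagged when strictly more
        than this fraction of runs produced a false negative for it.
    """
    n_systems = len(system_predictions)
    if n_systems == 0:
        return []

    flagged = []
    for i, label in enumerate(gold):
        if label != 1:
            continue  # false negatives only exist for gold-positive sentences
        misses = sum(1 for preds in system_predictions.values() if preds[i] == 0)
        if misses / n_systems > min_fraction:
            flagged.append(i)
    return flagged


# Toy example with three hypothetical runs over four sentences.
gold_labels = [1, 1, 0, 1]
runs = {
    "run_a": [0, 1, 0, 0],
    "run_b": [0, 1, 1, 1],
    "run_c": [0, 0, 0, 1],
}
suspect_indices = consensus_false_negatives(gold_labels, runs)
```

In the toy example, only the first sentence is missed by a majority of runs, so it is the only candidate that would be sent back for manual label review. Swapping the roles of 0 and 1 gives the analogous filter for consensus false positives.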
We expect that the mislabeling of emotional disclosure may have happened because we collapsed a 3-level annotation into a binary form, even though low-disclosure sentences may be vastly different from high-disclosure sentences. In Table 5, we provide a count of the labeling errors identified (and corrected) through this process. In the true spirit of a Shared Task, we have applied this feedback to identify and correct these labels. The data with corrected labels has been released. We encourage future researchers to test their approaches with the new labels.

Table 3: Systems' performance in Task 1b, ordered by their accuracy on predicting emotional support.

                          Emotional support       Informational support
System                    Accuracy    F1          Accuracy    F1
IIIT run 1 [9]            0.84        0.79        0.84        0.73
IIIT run 6 [9]            0.84        0.79        0.84        0.73
IIIT run 2 [9]            0.82        0.76        0.83        0.70
IIIT run 3 [9]            0.82        0.76        0.84        0.73
IIIT run 4 [9]            0.82        0.76        0.84        0.73
IIIT run 5 [9]            0.82        0.76        0.84        0.73
IIIT run 7 [9]            0.82        0.75        0.83        0.69
GATech [4]                0.82        0.75        0.83        0.73
U.Ottawa run 2 [13]       0.81        0.75        0.82        0.73
Penn State [1]            0.80        0.72        0.78        0.48
U.Ottawa run 1 [13]       0.80        0.71        0.82        0.70
SKKU run 3 [5]            0.77        0.64        0.79        0.59
SKKU run 1 [5]            0.77        0.63        0.80        0.59
Gyrfalcon run 4 [10]      0.74        0.57        0.75        0.55
Gyrfalcon run 8 [10]      0.74        0.62        0.62        0.57
Gyrfalcon run 1 [10]      0.74        0.57        0.65        0.58
Gyrfalcon run 7 [10]      0.73        0.59        0.68        0.58
Gyrfalcon run 3 [10]      0.72        0.63        0.53        0.51
Gyrfalcon run 6 [10]      0.72        0.58        0.76        0.51
Gyrfalcon run 10 [10]     0.72        0.63        0.71        0.57
Gyrfalcon run 5 [10]      0.71        0.62        0.75        0.56
SKKU run 4 [5]            0.71        0.45        0.77        0.46
Gyrfalcon run 2 [10]      0.71        0.64        0.69        0.58
SKKU run 2 [5]            0.70        0.43        0.77        0.45
Gyrfalcon run 9 [10]      0.70        0.63        0.71        0.56

As is expected in such tasks, other errors appeared to arise from knowledge that was implicit in a sentence and formed the basis of annotators' labels but was not directly present in the sentence.
For example, "Clearly, that's disturbing for anyone to experience." was marked positive for emotional disclosure by annotators, but was predicted to be negative by most participating systems.

Table 4: Legend for Task 1 System Runs.

System                    Run       Description
Gyrfalcon USA [10]        Run 1     Text only, fold 0
Gyrfalcon USA [10]        Run 2     Text only, fold 1
Gyrfalcon USA [10]        Run 3     Text only, fold 2
Gyrfalcon USA [10]        Run 4     Text only, fold 3
Gyrfalcon USA [10]        Run 5     Text only, fold 4
Gyrfalcon USA [10]        Run 6     Multimodal, fold 0
Gyrfalcon USA [10]        Run 7     Multimodal, fold 1
Gyrfalcon USA [10]        Run 8     Multimodal, fold 2
Gyrfalcon USA [10]        Run 9     Multimodal, fold 3
Gyrfalcon USA [10]        Run 10    Multimodal, fold 4
SKKU South Korea [5]      Run 1     BERT
SKKU South Korea [5]      Run 2     BERT + Emotional GloVe
SKKU South Korea [5]      Run 3     BERT + context
SKKU South Korea [5]      Run 4     BERT + Emotional GloVe + context
IIIT India [9]            Run 1     Model 1 (Weights to RoBERTa and ALBERT are 0 or 1)
IIIT India [9]            Run 2     Model 2 (Weights to RoBERTa and ALBERT are = .5)
IIIT India [9]            Run 3     Model 3
IIIT India [9]            Run 4     Model 4
IIIT India [9]            Run 5     Model 5
IIIT India [9]            Run 6     Finetuned RoBERTa large
IIIT India [9]            Run 7     Finetuned ALBERT xxlarge
UOttawa Canada [13]       Run 1     1024 dimensions, learning rate = 2e-5, 20 epochs
UOttawa Canada [13]       Run 2     512 dimensions, learning rate = 2e-5, 20 epochs

Table 5: The total number of errors reported, broken down by originating forum and label.

                          False Positives                         False Negatives
                          r/OffMyChest    r/CasualConversation    r/OffMyChest    r/CasualConversation
Emotional disclosure      601             371                     46              20
Information disclosure    288             129                     184             121
Emotional support         416             193                     44              12
Information support       123             53                      0               0

8 Conclusion and Future Work

The dataset of the 2nd CL-Aff Shared Task at AAAI-20 is the first annotated dataset of its kind about disclosure and support in social media discussions. We have published the complete dataset to GitHub. We plan to release other labels complementary to this dataset in future tasks. We conclude this overview with some of the main takeaways shared by our participating teams:
  – UOttawa suggests that when training a model on a task using noisy datasets, it is recommended to identify and separate the data-dependent noise from the signal, and to rely on patterns and relationships based on other features. Their exemplary approach does suggest new paradigms for conceptualizing deep multi-task learning problems. However, we wonder whether the presumptions could break, for instance, when the logical relationships are accidental. In the case of GATech [4], they relied on the label distribution information to regularize their models. However, we had consciously made the decision to have a larger proportion of positive cases in the test set, which may have ultimately hurt their model performance. Perhaps the takeaway would be to look for the semantic relationships in the data and not rely solely on numerical trends.
  – GATech reaffirms our belief in the power of semi-supervised learning for model training and prediction at scale, showing respectable performance with an entropy-minimization approach for generating more labeled data from the unlabeled sample provided. However, they rely on the data distribution to introduce another error term to minimize the entropy of the output, and to minimize the divergence in output and input label distributions. For future modeling, we would recommend this approach only if the data generation and sampling processes are the same for both the training and the test set.
  – Gyrfalcon's Super Characters approach did not appear to wholly satisfy its authors, who recommend possibly upsampling, data augmentation, or word replacement, especially when fine-tuning on small datasets.
  – While it would logically be expected that adding context to models would improve model accuracy, SKKU observed no such performance gain.
    They recommend that, rather than concatenation, adding suitable representations of context could be the right approach to enhance model performance.

Like our Shared Task last year [7], the findings do support the emerging notion about the English language as a contextualized emotional vector space, with the best performances reported by approaches that incorporated task-specific embeddings from other language models. Relying on emotional signals and the hierarchical structure of labels alone appears to have provided sufficient predictive performance. We note that in this version of the Shared Task, we did not observe any of our teams using syntactic information or building domain-specific embeddings, which were some of the more successful approaches last year. It remains an open problem whether the models trained on this data will generalize to measure disclosure and support on other platforms and conversations, and one for which we welcome future work and feedback.

Acknowledgement. Support for this research was provided by a Nanyang Presidential Postdoctoral Award and an Adobe Research Award.

References

1. Akiti, C., Rajtmajer, S., Squicciarini, A.: Contextual representation of self-disclosure and supportiveness in short text. In: Proceedings of the 3rd Workshop on Affective Content Analysis @ AAAI (AffCon2020). New York, New York (February 2020)
2. Barak, A., Gluck-Ofri, O.: Degree and reciprocity of self-disclosure in online forums. CyberPsychology & Behavior 10(3), 407–417 (2007)
3. Boyd, D.M., Ellison, N.B.: Social network sites: Definition, history, and scholarship. Journal of Computer-Mediated Communication 13(1), 210–230 (2007)
4. Chen, J., Wu, Y., Yang, D.: Semi-supervised models via data augmentation for classifying interactive affective responses. In: Proceedings of the 3rd Workshop on Affective Content Analysis @ AAAI (AffCon2020). New York, New York (February 2020)
5. Hyun, J., Bae, B.C., Cheong, Y.G.: [CL-Aff Shared Task] Multi-label text classification using an emotion embedding model. In: Proceedings of the 3rd Workshop on Affective Content Analysis @ AAAI (AffCon2020). New York, New York (February 2020)
6. Jaidka, K., Guntuku, S.C., Ungar, L.H.: Facebook versus Twitter: Differences in self-disclosure and trait prediction. In: Twelfth International AAAI Conference on Web and Social Media (2018)
7. Jaidka, K., Mumick, S., Chhaya, N., Ungar, L.: The CL-Aff happiness shared task: Results and key insights (2019)
8. Ma, X., Hancock, J., Naaman, M.: Anonymity, intimacy and self-disclosure in social media. In: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. pp. 3857–3869 (2016)
9. Pant, K., Dadu, T., Mamidi, R.: BERT-based ensembles for modeling disclosure and support in conversational social media text. In: Proceedings of the 3rd Workshop on Affective Content Analysis @ AAAI (AffCon2020). New York, New York (February 2020)
10. Sun, B., Yang, L., Sha, H., Lin, M.: Multi-modal sentiment analysis using super characters method on low-power CNN accelerator device. In: Proceedings of the 3rd Workshop on Affective Content Analysis @ AAAI (AffCon2020). New York, New York (February 2020)
11. Wang, Y.C., Burke, M., Kraut, R.: Modeling self-disclosure in social networking sites. In: Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing. pp. 74–85 (2016)
12. Wang, Y.C., Kraut, R., Levine, J.M.: To stay or leave? The relationship of emotional and informational support to commitment in online health support groups. In: Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work. pp. 833–842 (2012)
13. Xin, W., Inkpen, D.: [CL-Aff Shared Task] Detecting disclosure and support via deep multi-task learning. In: Proceedings of the 3rd Workshop on Affective Content Analysis @ AAAI (AffCon2020). New York, New York (February 2020)
14. Yang, D., Yao, Z., Kraut, R.: Self-disclosure and channel difference in online health support groups. In: Eleventh International AAAI Conference on Web and Social Media (2017)