<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <article-id pub-id-type="doi">10.7910/DVN/JZAS66</article-id>
      <title-group>
        <article-title>The CL-Aff Happiness Shared Task: Results and Key Insights</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kokil Jaidka</string-name>
          <email>jaidka@ntu.edu.sg</email>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Saran Mumick</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Niyati Chhaya</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lyle Ungar</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Adobe Research</institution>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Megagon Labs</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Nanyang Technological University</institution>
          ,
          <country country="SG">Singapore</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Pennsylvania</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This overview describes the official results of the CL-Aff Shared Task 2019 - in Pursuit of Happiness. The Shared Task comprised a semi-supervised classification task and an open-ended knowledge modeling task on a dataset of over 80,000 brief autobiographical accounts of happy moments, crowdsourced from Amazon Mechanical Turk. The Shared Task was organized as a part of the 2nd Workshop on Affective Content Analysis @ AAAI-19, held in Honolulu, USA on January 27, 2019. This paper compares the participating systems in terms of their accuracy and F-1 scores at predicting two facets of happiness. The complete annotated dataset is available on Harvard Dataverse at https://goo.gl/3rcZqf. The annotation instructions and the scripts used for evaluation are available in the Git repository at https://github.com/kj2013/claff-happydb.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The purpose of the CL-Aff Shared Task is to challenge the current understanding
of emotion through a task that models the experiential, contextual and
agentic attributes of happy moments. It has long been known that human affect is
context-driven, and that labeled datasets should account for these factors in
generating predictive models of affect. The Shared Task is organized in collaboration
with researchers at Megagon Labs and builds upon the HappyDB dataset [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
comprising human accounts of `happy moments'. The Shared Task comprised
two sub-tasks for analyzing happiness and well-being in written language, on a
corpus of over 80,000 descriptions of happy moments, as described here:
Given: An account of a happy moment, marked with the individual's demographics,
recollection time and relevant labels.
- Task 1: Predict the Agency and Sociality labels of the happy moment.
- Task 2: Suggest interesting ways to automatically characterize the happy
moments in terms of affect, emotion, participants and content.
(In the annotation task and the Shared Task, the label names we provided were
`Agency' and `Social'. We have since renamed `Social' to `Sociality' so that both
Agency and Sociality can be grammatically consistent.)
      </p>
      <p>The task, given its predictive and open-ended interpretive aspects, is relevant
to the computational linguistics, natural language processing, artificial
intelligence and psycholinguistics communities. The aim is to engage scholarly
interest and crowdsource new ideas and linguistic approaches to define
happiness. Details on the psycholinguistic underpinnings of the annotation task are
provided in a different, forthcoming paper [5].</p>
      <p>Evaluation: The performance of the systems was compared based on their
accuracy and F-1 measure at predicting the Agency and Sociality labels on the
unseen test dataset. This was done using an automatic evaluation script,
available on GitHub at https://github.com/kj2013/claff-happydb/.</p>
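      <p>As a worked illustration of the evaluation protocol, the sketch below computes accuracy and the F-1 measure for one binary label. It is a minimal re-implementation for illustration only; the official script in the Git repository remains authoritative, and the yes/no string encoding of the labels is an assumption.</p>

```python
def accuracy(gold, pred):
    # Fraction of moments whose predicted label matches the gold label.
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def f1(gold, pred, positive="yes"):
    # F-1 for the positive class: harmonic mean of precision and recall.
    tp = sum(g == p == positive for g, p in zip(gold, pred))
    fp = sum(p == positive and g != positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy run on four moments scored for the Agency label (hypothetical data):
gold = ["yes", "yes", "no", "no"]
pred = ["yes", "no", "no", "no"]
print(accuracy(gold, pred))  # 0.75
print(f1(gold, pred))        # 0.6666666666666666
```

      <p>In practice the same two functions are applied twice per system run: once for the Agency label and once for the Sociality label.</p>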
    </sec>
    <sec id="sec-2">
      <title>Dataset description</title>
      <p>The CL-Aff corpus comprises the following:</p>
      <p>- Labeled training set (N = 10,560): Single-sentence happy moments
from the available HappyDB corpus, annotated with the demographic labels of
the author, with labels that identify the 'agency' of the author and the
'social' characteristic of the moment, and with concept labels describing its
theme.</p>
      <p>- Unlabeled training set (N = 59,846): The remaining single-sentence
HappyDB happy moments, with only the demographic labels of the author.</p>
      <p>- Test set (N = 17,215): Previously unreleased, single-sentence happy
moments, freshly collected in the same manner as the original HappyDB data.
Authors' demographic labels were available to the Shared Task participants
but not the `agency' or `social' characteristics.</p>
      <p>The Agency and Sociality characteristics of each happy moment were decided
by a simple majority agreement between three independent annotators using a
binary (yes/no) coding.</p>
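      <p>The majority rule described above can be sketched in a few lines; the yes/no string encoding of annotator responses is an assumption for illustration.</p>

```python
def majority_label(votes):
    # Resolve one binary dimension (Agency or Sociality) by simple majority.
    # With three independent annotators and a binary yes/no coding, a
    # majority always exists, so no tie-breaking rule is needed.
    yes_votes = sum(v == "yes" for v in votes)
    return "yes" if yes_votes * 2 > len(votes) else "no"

# Three annotators judged the Sociality of one (hypothetical) happy moment:
print(majority_label(["yes", "no", "yes"]))  # yes
```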
      <sec id="sec-2-1">
        <title>Corpus development</title>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Collecting the happy moments</title>
      <p>
        We followed the format of the original HappyDB AMT task [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] to collect a second
dataset of 20,000 happy moments, which was to be the unseen test data in the
CL-Aff Shared Task. The following instructions were provided to the workers.
      </p>
      <p>Instructions</p>
      <p>What made you happy? Reflect on the past &lt;duration&gt;, and recall three
actual events that happened to you that made you happy. Describe your happy
moments with a complete sentence. Write three such moments. You will also
be asked to note for how long each event made you happy. This task also has
post-task questions. Please be sure to answer the questions. Examples of happy
moments we are NOT looking for (e.g., events in distant past, incomplete
sentence): The day I married my spouse; My dog.
&lt; Enter moment here &gt;
For how long did that event make you happy? Select the answer that is most
appropriate.</p>
      <p>Each AMT worker was required to enter three happy moments experienced
within a specific time period. Half of the questionnaires specified a time period
of 24 hours, while the other half specified a &lt;time period&gt; of 3 months. The
options provided for the follow-up question about the duration (i.e., the length)
of happiness were `All day, I'm still feeling it,' `Half a day,' `At least one hour,'
`A few minutes' or `Not Applicable.' After the participant answered these
questions, demographic information was collected about their country, age, gender
(`Male', `Female', `Other', `Not Applicable'), marital status (`single', `married',
`divorced', `separated', `widowed' or `Not Applicable'), and whether or not they
have children (`yes', `no').</p>
      <sec id="sec-3-1">
        <title>Annotation</title>
        <p>Annotators were required to annotate each moment along two binary
dimensions - Agency and Sociality. We draw from Paulhus' conceptualization of
self-presentation according to the two factors of Agency and Communion [7].
Previous work exploring the evidence of agency in writing has adapted it to mean the
locus of control, or the degree to which an author is in control of their surroundings
[9]. Sociality conceptualizes interpersonal engagement, evinced in writing as the
description of any activity performed with or in the company of others [6].</p>
        <p>Instructions Read the following happy moment. Choose any of the following
that applies:
Agency: Is the author in control? YES/NO
Examples of sentences where the author is in control (Answer is YES):
- "I ran on the treadmill for 20 minutes straight when I could barely do 5
minutes 3 months ago."
- "Going out to a special birthday lunch for my great-grandmother-in-law's
birthday."
Examples of sentences where the author is not in control (Answer is NO):
- "My youngest daughter got accepted to many prestigious universities and
accepted an offer to attend college in San Diego."
- "A small business deal change over for small profit."
Social: Does this moment involve people other than the author?
YES/NO
Please note that objects (e.g., bus, work) should not be counted as social.
Examples of sentences which involve other people (Answer is YES):
- "Going out to a special birthday lunch for my great-grandmother-in-law's
birthday."
- "My youngest daughter got accepted to many prestigious universities and
accepted an offer to attend college in San Diego."
Note that sometimes a person is implicitly involved although not explicitly
mentioned. In this case, we still wish to label the happy moment as social.
E.g., "I received compliments on my tattoo."
Examples of sentences which are not social (Answer is NO):
- "I ran on the treadmill for 20 minutes straight when I could barely do 5
minutes 3 months ago."
- "The bus came on time, so I reached work early."
&lt;Happy moment appears here&gt;
Agency: Is the author in control? YES/NO
Social: Does this moment involve people other than the author?
YES/NO</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Topic labeling</title>
      <p>Annotators were presented with a happy moment and a set of four potential
topics that it was likely describing. Annotators were asked to mark all the tags
that referred to what the moment was about. Each moment could receive a maximum of
four tags if at least two annotators agreed on them.</p>
      <p>Instructions Read the following text. Select all categories that are relevant
to the text from among those provided. If none of the categories is a great fit,
select "none of the above".
&lt;Topic 1&gt; &lt;Topic 2&gt; &lt;Topic 3&gt; &lt;Topic 4&gt;</p>
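      <p>The aggregation rule above (a tag survives only if at least two annotators selected it) can be sketched as follows; the tag names in the example are hypothetical.</p>

```python
from collections import Counter

def agreed_tags(annotations, min_agreement=2):
    # annotations: one set of selected tags per annotator.
    # Keep every candidate tag chosen by at least `min_agreement` annotators;
    # with four candidates shown, a moment can score at most four tags.
    counts = Counter(tag for tags in annotations for tag in set(tags))
    return {tag for tag, n in counts.items() if n >= min_agreement}

# Hypothetical example: three annotators label the same moment.
votes = [{"family", "leisure"}, {"family"}, {"leisure", "food"}]
print(sorted(agreed_tags(votes)))  # ['family', 'leisure']
```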
      <sec id="sec-4-1">
        <title>Overview of Approaches</title>
        <p>Eleven teams participated in the Shared Task. The following paragraphs discuss
the approaches followed by the participating systems, sorted in the order in
which they signed up to participate in the task.</p>
        <p>
          - Arizona State University (ASU) [10]: The team from ASU proposed a Word
Pair Convolutional Model (WoPCoM) to accomplish Task 1. The proposed
model is motivated by the hypothesis that a small set of word-pair features
is important to capturing the agency/social nature of happy moments.
They trained a convolutional neural network (CNN) to predict on the
unlabeled data.
- University of California Santa Cruz (UCSC) [15]: The UCSC team
participated in both tasks. For Task 1, they explored the use of syntactic, emotional,
and survey features with semi-supervised learning, specifically
experimenting with XGBoosted Forest and CNN models. For Task 2, the team trained
similar models to predict concepts, and based on the difficulty of doing so,
hypothesized about the nature of the themes in the happy moments.
- International Institute of Information Technology Hyderabad (IIIT-H) [12]:
The IIIT-H team employed an inductive transfer learning (ITL) technique.
They pre-trained an AWD-LSTM neural net on the WikiText-103 corpus, and
then introduced an extra step to adapt the model to happy moments.
- Gyrfalcon [11]: The team from Gyrfalcon Technology, California, proposed
an algorithm to map English words into squared glyph images. Then, they
applied a 2D-CNN model over these images in order to capture the sentiment.
- A*STAR [4]: The IHPC-A*STAR team participated in both tasks. For Task
1, they used emotion intensity in happy moments to predict Agency and
Sociality labels. They defined a set of five emotions (valence, joy, anger, fear,
sadness) and used a previously developed tool, CrystalFeel, to label each
moment with the corresponding five emotion intensities. Combining these
features with additional word-embedding features, they trained a logistic
regression model. For Task 2, the team explored how these different emotions
are manifested across the different concept labels.
- University of British Columbia (UBC) [8]: The UBC team primarily
experimented with different embedding methods, such as CoVe and ELMo, on
deep neural networks. They modeled their neural networks as long
short-term memory networks and BiLSTMs, with and without attention.
- University of Ottawa (UOttawa) [16]: The University of Ottawa team also
proposed a deep learning CNN solution. They experimented with different
kinds of word embeddings, and also experimented with training a multi-task
classifier to see whether performance could be enhanced by shared knowledge
between Agency and Sociality.
- Escuela Superior Politecnica del Litoral (ESPOL) [14]: The ESPOL team
proposed a semi-supervised adaptation of traditional k-means clustering
using neural networks.
- Sungkyunkwan team (SKKU) [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]: The SKKU team used a semi-supervised
approach. They built four one-class autoencoder models, one each for social,
non-social, agentic, and non-agentic moments. Each autoencoder model had a
deep learning architecture consisting of two neural networks, one for encoding
the input, and the other for reconstructing the compressed vector.
- Jordan University of Science and Technology (JUST) [13]: The JUST team
proposed a Recurrent Convolutional Neural Network, and combined
words with their context in order to get a more precise word embedding.
- Fraunhofer (FKIE) [3]: The team from Fraunhofer FKIE trained a three-layer
CNN. They experimented with different embeddings, including
FastText and GloVe. Additionally, they experimented with splitting the dataset
by demographic location of the author, and showed that training separate
classifiers on the splits enhanced performance.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>Results</title>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Task 1: Predicting Agency and Sociality</title>
      <p>This section compares the participating systems in terms of their performance.
Four of the eleven systems that did Task 1 also did the bonus Task 2. The results
are provided in Table 1. The detailed implementations of the individual runs are
described in the system papers included in this proceedings volume.
Some of the systems used their neural models of happiness for Task 1 to
produce visual knowledge representations [11], and general insights about happiness
[10,8,15,3]. Most notably, Gyrfalcon [11] transformed textual moments into
visualizations to explore whether they could encode more multi-dimensional
information in this manner. UBC [8] provided a visualization for "attention" in their
bi-directional long short-term memory networks, which highlights the patterns
that the neural network considered important while predicting Agency
and Sociality when a sequence of words was input into the model. ASU [10]
showed the codependence of the individual Agency and Sociality labels across
the dataset through a t-SNE visualization. Team 33 [3] and UCSC [15] both
attempted to capture the linguistic patterns in the construction of happiness and
their potential cultural underpinnings.</p>
      <sec id="sec-5-1">
        <title>Error Analysis</title>
        <p>In this section, we present a meta-analysis of system performances for Task 1
over all the (a) topics and (b) moments in the test set. Furthermore, in their
data pre-processing step, the team from Fraunhofer [3] identified that in the
subset of happy moments contributed by authors from India alone, there were
duplicate or near-duplicate happy moments in the data, which reduced the total
number of training samples by 25%. We will include data cleaning as an extra
preprocessing step in future data releases.</p>
        <p>Topic-level analysis: We expect that happiness in different situations
would be experienced and expressed differently. Table 3 aggregates the failures
produced by each of the approaches (out of the set of best approaches submitted
by each of the teams).</p>
        <p>Moment-level meta-analysis: We suspect that some of the errors in our
data may occur due to mislabeling, or due to the coding scheme not being applicable to
the moment. In Table 4 we provide the happy moments for which 100% of the
best approaches submitted by each of the teams reported failure. We observe
that in some of the cases (e.g., "Topanga running away to Cory"), the happy
moment was actually mislabeled, and thus the systems did in fact make the
correct prediction. Overall, many of the happy moments in this Table describe a
single moment in the author's life, which seems ordinary when considered in the
context of regular living. In some cases, the authors have attempted to explain
why the moment was special to them (e.g., the second part of the moment "I
finally got a hold of my auto mechanic, and that enabled me to schedule a time
to bring in my car to get my custom exhaust installed" only serves to explain
the significance of the moment to the author).</p>
      </sec>
      <sec id="sec-5-2">
        <title>Conclusion and Future Work</title>
        <p>Eleven teams participated in the inaugural CL-Aff Shared Task at AAAI-19. We
have published the complete dataset to Harvard Dataverse. Furthermore, we
expect to release other resources complementary to the challenges of modeling
affect and emotion from language.</p>
        <p>In summary, our meta-analysis of system performance identifies the following
key takeaways and recommendations:
- Predictive modeling approaches are greatly improved when modeled as a
semi-supervised task, enriched with unlabeled data or with knowledge or
feature vectors trained on a different domain. This also highlights the
generalizability of the Shared Task to other domains.
- Syntactic knowledge is important for modeling Agency and Sociality (and
hence, for modeling happiness). Participants incorporated the importance of
the head noun and subject-verb-object word order in their language
models, either through interacting layers in convolutional neural networks, or by
mining it using lexical pattern analysis methods.
- The CL-Aff dataset offers replicability of more traditional emotion modeling
approaches. It was feasible to apply the models developed on other annotated
emotion datasets to improve the predictive modeling performance on the
Shared Task [4]. We anticipate that language models from the CL-Aff dataset
will also generalize well to other problems and datasets for emotion and affect
analysis.
- In future work, scholars could consider training their classifiers on
domain-specific word embeddings derived from the Shared Task dataset
itself.
- Findings support the emerging notion of the English language as a
contextualized emotional vector space, with the best performances reported by
approaches that incorporated task-specific embeddings from other language
models, such as ELMo and CoVe.</p>
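        <p>To make the first takeaway concrete, the sketch below shows self-training, one simple way to enrich a classifier with unlabeled data. It is a deliberately minimal stand-in, not any participating team's method: the nearest-centroid base model, the single pseudo-labeling round, and the 2-d feature vectors are all illustrative assumptions (the actual entries used autoencoders, transfer learning, confidence thresholds and far stronger base models).</p>

```python
import math

def centroid_fit(X, y):
    # Fit a nearest-centroid classifier: one mean vector per class label.
    centroids = {}
    for label in set(y):
        points = [x for x, l in zip(X, y) if l == label]
        centroids[label] = [sum(c) / len(points) for c in zip(*points)]
    return centroids

def centroid_predict(centroids, x):
    # Predict the label whose centroid is closest in Euclidean distance.
    return min(centroids, key=lambda label: math.dist(centroids[label], x))

def self_train(X_lab, y_lab, X_unlab):
    # One round of self-training: fit on the labeled data, pseudo-label the
    # unlabeled pool, then refit on the combined set.
    model = centroid_fit(X_lab, y_lab)
    pseudo = [centroid_predict(model, x) for x in X_unlab]
    return centroid_fit(list(X_lab) + list(X_unlab), list(y_lab) + pseudo)

# Two labeled and two unlabeled toy moments, as 2-d feature vectors:
model = self_train([[0.0, 0.0], [1.0, 1.0]], ["no", "yes"],
                   [[0.1, 0.2], [0.9, 0.8]])
print(centroid_predict(model, [0.95, 0.9]))  # yes
```

        <p>The unlabeled points shift each class centroid toward the true data distribution, which is the same intuition behind the richer semi-supervised systems submitted to the task.</p>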
        <p>Acknowledgement. We thank Dr. Wang-Chiew Tan for her feedback
and Megagon Labs for contributing funds towards the CL-Aff dataset.</p>
        <p>3. Claeser, D.: Affective content classification using convolutional neural networks.
In: Proceedings of the 2nd Workshop on Affective Content Analysis @ AAAI
(AffCon2019). Honolulu, Hawaii (January 2019)
4. Gupta, R.K., Bhattacharya, P., Yang, Y.: What constitutes happiness? Predicting
and characterizing the ingredients of happiness using emotion intensity analysis.
In: Proceedings of the 2nd Workshop on Affective Content Analysis @ AAAI
(AffCon2019). Honolulu, Hawaii (January 2019)
5. Jaidka, K., Chhaya, N., Mumick, S., Killingsworth, M., Halevy, A., Ungar, L.:
Towards a typology of happiness: The CL-Aff annotated dataset of happy moments
(2019)
6. Paulhus, D.L., Robinson, J.P., Shaver, P.R., Wrightsman, L.S.: Measures of
personality and social psychological attitudes. Measures of Social Psychological Attitudes
Series 1, 17-59 (1991)
7. Paulhus, D.L., Trapnell, P.D.: Self-presentation of personality. Handbook of
Personality Psychology 19, 492-517 (2008)
8. Rajendran, A., Zhang, C., Abdul-Mageed, M.: Happy together: Learning and
understanding appraisal from natural language. In: Proceedings of the 2nd Workshop
on Affective Content Analysis @ AAAI (AffCon2019). Honolulu, Hawaii (January
2019)
9. Rouhizadeh, M., Jaidka, K., Smith, L., Schwartz, H.A., Buffone, A., Ungar, L.:
Identifying locus of control in social media language. In: Proceedings of the 2018
Conference on Empirical Methods in Natural Language Processing (2018)
10. Saxon, M., Bhandari, S., Ruskin, L., Honda, G.: Word pair convolutional model
for happy moment classification. In: Proceedings of the 2nd Workshop on Affective
Content Analysis @ AAAI (AffCon2019). Honolulu, Hawaii (January 2019)
11. Sun, B., Yang, L., Chi, C., Zhang, W., Lin, M.: [CL-Aff Shared Task] Squared English
word: A method of generating glyph to use super characters for sentiment
analysis. In: Proceedings of the 2nd Workshop on Affective Content Analysis @ AAAI
(AffCon2019). Honolulu, Hawaii (January 2019)
12. Syed, B., Indurthi, V., Shah, K., Gupta, M., Varma, V.: Ingredients for happiness:
Modeling constructs via semi-supervised content driven inductive transfer
learning. In: Proceedings of the 2nd Workshop on Affective Content Analysis @ AAAI
(AffCon2019). Honolulu, Hawaii (January 2019)
13. Talafha, B., Al-Ayyoub, M.: IoH-RCNN: Pursuing the ingredients of happiness using
recurrent convolutional neural networks. In: Proceedings of the 2nd Workshop
on Affective Content Analysis @ AAAI (AffCon2019). Honolulu, Hawaii (January
2019)
14. Torres, J., Vaca, C.: Neural semi-supervised learning for short texts. In:
Proceedings of the 2nd Workshop on Affective Content Analysis @ AAAI (AffCon2019).
Honolulu, Hawaii (January 2019)
15. Wu, J., Compton, R., Rakshit, G., Walker, M., Anand, P., Whittaker, S.:
CruzAffect at AffCon 2019 Shared Task: A feature-rich approach to characterize happiness.
In: Proceedings of the 2nd Workshop on Affective Content Analysis @ AAAI
(AffCon2019). Honolulu, Hawaii (January 2019)
16. Xin, W., Inkpen, D.: [CL-Aff Shared Task] Happiness ingredients detection using
multi-task deep learning. In: Proceedings of the 2nd Workshop on Affective
Content Analysis @ AAAI (AffCon2019). Honolulu, Hawaii (January 2019)</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Asai</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Evensen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Golshan</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Halevy</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lopatenko</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stepanov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suhara</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tan</surname>
            ,
            <given-names>W.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>HappyDB: A corpus of 100,000 crowdsourced happy moments</article-title>
          .
          <source>In: Proceedings of LREC 2018</source>
          .
          European Language Resources Association (ELRA), Miyazaki
          , Japan (May
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Cheong</surname>
            ,
            <given-names>Y.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Song</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bae</surname>
            ,
            <given-names>B.C.</given-names>
          </string-name>
          :
          <article-title>[cl-a shared task] modeling happiness using one-class autoencoders</article-title>
          .
          <source>In: Proceedings of the 2nd Workshop on Affective Content Analysis @ AAAI (AffCon2019)</source>
          . Honolulu, Hawaii
          (
          <year>January 2019</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>