The CL-Aff Happiness Shared Task: Results and
                Key Insights

       Kokil Jaidka1,2, Saran Mumick2,3, Niyati Chhaya4, and Lyle Ungar2
                1 Nanyang Technological University, Singapore
                      2 University of Pennsylvania, USA
                            3 Megagon Labs, USA
                           4 Adobe Research, India
                              jaidka@ntu.edu.sg


        Abstract. This overview describes the official results of the CL-Aff
        Shared Task 2019 – in Pursuit of Happiness. The Shared Task comprised
        a semi-supervised classification task and an open-ended knowledge
        modeling task on a dataset of over 80,000 brief autobiographical
        accounts of happy moments, crowdsourced from Amazon Mechanical Turk.
        The Shared Task was organized as a part of the 2nd Workshop on
        Affective Content Analysis @ AAAI-19, held in Honolulu, USA on
        January 27, 2019. This paper compares the participating systems in
        terms of their accuracy and F-1 scores at predicting two facets of
        happiness. The complete annotated dataset is available on Harvard
        Dataverse at https://goo.gl/3rcZqf. The annotation instructions and
        the scripts used for evaluation are available in the Git repository
        at https://github.com/kj2013/claff-happydb.


1     Introduction
The purpose of the CL-Aff Shared Task is to challenge the current understanding
of emotion through a task that models the experiential, contextual and agentic
attributes of happy moments. It has long been known that human affect is
context-driven, and that labeled datasets should account for these factors in
generating predictive models of affect. The Shared Task is organized in
collaboration with researchers at Megagon Labs and builds upon the HappyDB
dataset [1], comprising human accounts of ‘happy moments’. The Shared Task
comprised two sub-tasks for analyzing happiness and well-being in written
language, on a corpus of over 80,000 descriptions of happy moments, as
described here:
Given: An account of a happy moment, marked with the individual’s demographics,
recollection time and relevant labels.
 – Task 1: Semi-supervised classification task - Predict thematic labels (Agen-
   cy/Sociality) on unseen data, based on a small labeled and a large unlabeled
   training set (a minimal illustrative sketch follows this list).5
5
    In the annotation task and the Shared Task, the label names we provided were
    ‘Agency’ and ‘Social’. We have since renamed ‘Social’ to ‘Sociality’ so that both
    Agency and Sociality can be grammatically consistent.
 – Task 2: Suggest interesting ways to automatically characterize the happy
   moments in terms of affect, emotion, participants and content.
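
    To make Task 1 concrete, the following is a minimal self-training sketch
over TF-IDF features for the Agency label: train on the small labeled set, then
add confidently pseudo-labeled examples from the unlabeled set and retrain. It
is not any participant's system; the file names, column names and ‘yes’/‘no’
label values are assumptions for illustration only.

    # Minimal self-training sketch for Task 1 (Agency label). Not a
    # participant system; file names, column names and 'yes'/'no' label
    # values are assumptions about the released CSV layout.
    import pandas as pd
    import scipy.sparse as sp
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    labeled = pd.read_csv("labeled_train.csv")        # hypothetical file name
    unlabeled = pd.read_csv("unlabeled_train.csv")    # hypothetical file name

    vec = TfidfVectorizer(min_df=2, ngram_range=(1, 2))
    X_lab = vec.fit_transform(labeled["moment"])
    y_lab = (labeled["agency"] == "yes").astype(int)

    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_lab, y_lab)

    # One self-training round: pseudo-label the unlabeled moments, keep only
    # the confident predictions, then retrain on the enlarged training set.
    X_unlab = vec.transform(unlabeled["moment"])
    proba = clf.predict_proba(X_unlab)[:, 1]
    confident = (proba > 0.9) | (proba < 0.1)
    X_aug = sp.vstack([X_lab, X_unlab[confident]])
    y_aug = pd.concat(
        [y_lab, pd.Series((proba[confident] > 0.5).astype(int))],
        ignore_index=True,
    )
    clf.fit(X_aug, y_aug)

Participants were free to use any approach; this sketch only illustrates how
the labeled and unlabeled portions of the corpus fit together.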

    The task, given its predictive and open-ended interpretive aspects, is
relevant to the computational linguistics, natural language processing,
artificial intelligence and psycholinguistics communities. The aim is to engage
scholarly interest and crowdsource new ideas and linguistic approaches to
defining happiness. Details on the psycholinguistic underpinnings of the
annotation task are provided in a different, forthcoming paper [5].
    Evaluation: The performance of systems was compared based on their
accuracy and F-1 measure at predicting the Agency and Sociality labels on the
unseen test dataset. This was done using an automatic evaluation script,
available on GitHub6 .
6
    https://github.com/kj2013/claff-happydb/
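
    The official script is the one in the repository above; purely as an
illustration of the two reported metrics, the sketch below computes accuracy
and F-1 for one label with scikit-learn. The file and column names are
assumptions, not the script's actual interface.

    # Illustrative computation of the reported metrics (accuracy and F-1)
    # for the Agency label using scikit-learn. This is NOT the official
    # evaluation script; file and column names are assumptions.
    import pandas as pd
    from sklearn.metrics import accuracy_score, f1_score

    gold = pd.read_csv("gold_test_labels.csv")        # hypothetical file name
    pred = pd.read_csv("system_predictions.csv")      # hypothetical file name

    y_true = (gold["agency"] == "yes").astype(int)    # assuming yes/no labels
    y_pred = (pred["agency"] == "yes").astype(int)

    print("Accuracy:", accuracy_score(y_true, y_pred))
    print("F-1:", f1_score(y_true, y_pred))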


1.1     Dataset description

The CL-Aff corpus comprises the following:

 – Labeled training set (N = 10,560): Single-sentence happy moments
   from the available HappyDB corpus, annotated with the demographic labels
   of the author, with labels that identify the ‘agency’ of the author and the
   ‘social’ characteristic of the moment, and with concept labels describing
   its theme.
 – Unlabeled training set (N = 59,846): The remaining single-sentence
   HappyDB happy moments with only the demographic labels of the author.
 – Test set (N = 17,215): Previously unreleased, single-sentence happy mo-
   ments, freshly collected in the same manner as the original HappyDB data.
   Authors’ demographic labels were available to the Shared Task participants
   but not the ‘agency’ or ‘social’ characteristics.

   The Agency and Sociality characteristics of each happy moment were decided
by simple majority agreement among three independent annotators, using a
binary (yes/no) coding.
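
    As a concrete illustration of this aggregation rule (not the annotation
tooling itself), the following sketch derives the binary gold label from three
yes/no judgments; the column names are assumed for the example.

    # Majority-vote aggregation of three binary (yes/no) annotations into a
    # single gold label, as described above. Column names are assumptions.
    import pandas as pd

    annotations = pd.DataFrame({
        "annotator_1": ["yes", "no", "yes"],
        "annotator_2": ["yes", "no", "no"],
        "annotator_3": ["no",  "no", "yes"],
    })

    yes_votes = (annotations == "yes").sum(axis=1)   # yes votes per moment
    gold = (yes_votes >= 2).map({True: "yes", False: "no"})
    print(gold.tolist())                             # ['yes', 'no', 'yes']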


2     Corpus development

2.1     Collecting the happy moments

We followed the format of the original HappyDB AMT task [1] to collect a second
dataset of 20,000 happy moments, which was to be the unseen test data in the
CL-Aff Shared Task. The following instructions were provided to the workers.

      Instructions

    What made you happy? Reflect on the past , and recall three actual
    events that happened to you that made you happy. Describe your happy
    moments with a complete sentence. Write three such moments. You will also
    be asked to note for how long each event made you happy. This task also
    has post-task questions. Please be sure to answer the questions. Examples
    of happy moments we are NOT looking for (e.g., events in distant past,
    incomplete sentence): The day I married my spouse; My dog.
    < Enter moment here >
    For how long did that event make you happy? Select the answer that is most
    appropriate.

    Each AMT worker was required to enter three happy moments experienced
within a specific time period. Half of the questionnaires specified a time period
of 24 hours, while the other half with a