<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Accuracy Accuracy
En</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Monitoring Adolescents' Distress using Social Web data as a Source: the InsideOut Project</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Basili Roberto</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bellomaria Valentina</string-name>
          <email>bellomaria@revealsrl.it</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bugge Niels J.</string-name>
          <email>niels.bugge@gmail.com</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Croce Danilo</string-name>
          <email>croce@info.uniroma2.it</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>De Michele Francesco</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fiori Nastro Federico</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fiori Nastro Paolo</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michel Chantal</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Schmidt Stefanie J.</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Schultze-Lutter Frauke</string-name>
          <email>frauke.schultze-lutter@kjp.unibe.ch</email>
        </contrib>
      </contrib-group>
      <pub-date>
        <year>1062</year>
      </pub-date>
      <volume>76</volume>
      <abstract>
        <p>English. Social Media play an increasingly important role in the psychological and social development of adolescents and young adults, as they affect the quality of their interpersonal communication dynamics. The InsideOut project explores the possibility of using Social Web mining methodologies and technologies to collect information about adolescents' distress from their micro-blogging activities. The project promotes a complex language processing workflow for the collection, enrichment and summarization of user-generated content on Twitter. This paper presents the general architecture of the InsideOut Web Platform and the resources produced by a joint effort of computer science and mental health professionals.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Italian abstract. The role of Social Media in
psychological and social growth is increasingly
important, since it affects the quality and the dynamics
of interpersonal communication, especially for the
youngest generations. The InsideOut project explores the
applicability of methodologies and technologies that
allow detecting, on the Web, evidence of sources of
stress in adolescents. The project proposes a
language processing workflow able to handle
the collection, enrichment and summarization of
user-generated content on Twitter. The paper presents
the general architecture of the InsideOut Web
platform and the resources resulting from the joint
work of researchers from computer science and
medicine.</p>
    </sec>
    <sec id="sec-2">
      <title>Introduction</title>
      <p>
        Among adolescents, the use of Social Media, such
as Twitter, Facebook or Instagram, has grown
exponentially in the past years. This makes them a
valuable source of information on the well-being
of adolescents and, in particular, on their
mental health. Mental disorders are the main cause of
disability in adolescents and young adults
        <xref ref-type="bibr" rid="ref5">(Gore et al., 2011)</xref>
        , affecting an average of 10 to 20%
of youth worldwide
        <xref ref-type="bibr" rid="ref6">(Kieling et al., 2011)</xref>
        . Thus, given the emerging complex relationship between the
use of Social Media, mental health and well-being
        <xref ref-type="bibr" rid="ref3">(Best et al., 2014)</xref>
        , Social Media offer a direct view on the mental health and
well-being of adolescents.
      </p>
      <p>
        Social Media thus play an increasingly
important role in the psychological and social
development of adolescents, as they affect the
quality of their social interactions and networks. Any
attempt to study and govern mental health in
young communities (adolescents, students,
interest groups) requires an effective,
large-scale methodology to monitor the
behaviors on the Web that reflect and affect
mental habits, trends and social practices. The
possibility of predicting writers’ demographics from
their writings is an important research topic in the
Computational Linguistics community. In fact, the
idea that a writer’s style may reveal age, gender
or other sociodemographic information has
also been targeted in the “Plagiarism analysis,
Authorship identification, and Near-duplicate detection”
(PAN) shared tasks
        <xref ref-type="bibr" rid="ref10 ref11 ref9">(Rangel et al., 2014; Rangel et al.,
2015; Rangel et al., 2016)</xref>
        and in other experiences
        <xref ref-type="bibr" rid="ref13">(Sulis et al., 2016)</xref>
        , whose aim was to infer a user’s
gender, age, native language or personality traits
by analyzing the respective texts.
      </p>
      <p>In this paper, the InsideOut project is presented.
It explores the possibility of using Social Web
mining methodologies and technologies to collect
information about adolescents’ distress from their
micro-blogging activities. The project promotes
a complex language processing workflow for
the collection and enrichment of user-generated
content on Twitter: messages written by
a targeted community of users (e.g. from
a school) are enriched with semantic metadata
reflecting the expressed topics (e.g. social vs
intimate relationships) and the attitude of the
writers. The goal is to use this large-scale evidence
to support a comprehensive psychological
characterization of adolescent communities and to pave
the way towards effective prevention and
intervention efforts. The general
architecture of the InsideOut Web Platform and the
resources produced by a joint effort of
computer science specialists and mental health
professionals are presented. These resources supported
an exploratory evaluation, which reports inter-annotator
agreement scores and the performance obtained over real
data in the task of psychologically enriching user
writings.</p>
      <p>In the rest of the paper, Section 2 describes the
overall workflow underlying the InsideOut
Platform. Section 3 describes the semantic models
underlying the semantic annotation process, whose
first results are the annotated corpus and the
exploratory evaluation presented in Sections 4 and
5, respectively. Section 6 draws the conclusions.</p>
    </sec>
    <sec id="sec-3">
      <title>The InsideOut Web Platform</title>
      <p>The InsideOut Web Platform aims at supporting
mental health studies concerning the causes of
distress in adolescents. To this aim, a comprehensive
service-oriented architecture has been designed
and implemented to collect messages from
Social Networks (such as Twitter) written by targeted
communities of adolescents and enrich them with
semantic information reflecting discussed topics
and corresponding attitudes of the writers.</p>
      <p>This enables specific kinds of queries and data
aggregations, such as the pie chart shown in
Figure 1, which summarizes the topics discussed by
a community of users, e.g. concerning SCHOOL,
FAMILY, or ALCOHOL AND DRUGS. By
selecting a specific topic, such as SCHOOL, the system
shows only those messages where the writer
expresses a specific attitude, such as DISTRESS.
In the same Figure, the distressful messages
concerning school are shown, such as “Questa scuola
fa schifo...” (“This school sucks...”) or “Devo
studiare.” (“I have to study.”).</p>
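      <p>The kind of query described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's actual code: it assumes that each message already carries its annotations as (life event, sentiment, experience) tuples; the names messages, topic_distribution and filter_messages are hypothetical, and the labels attached to the example texts are illustrative.</p>
      <preformat><![CDATA[
```python
from collections import Counter

# Hypothetical annotated messages: text plus (life_event, sentiment, experience)
# tuples. The annotations below are illustrative, not taken from the corpus.
messages = [
    {"text": "Questa scuola fa schifo...",
     "triples": [("SCHOOL", "NEGATIVE", "DISTRESSFUL")]},
    {"text": "Devo studiare.",
     "triples": [("SCHOOL", "NEUTRAL", "DISTRESSFUL")]},
    {"text": "Per fortuna mia sorella mi aiuta!",
     "triples": [("FAMILY", "POSITIVE", "HELPFUL")]},
]


def topic_distribution(msgs):
    """Count how many annotations refer to each life event (pie-chart data)."""
    return Counter(le for m in msgs for (le, _s, _e) in m["triples"])


def filter_messages(msgs, life_event, experience):
    """Return the texts annotated with the given life event and experience."""
    return [m["text"] for m in msgs
            if any(le == life_event and e == experience
                   for (le, _s, e) in m["triples"])]


print(topic_distribution(messages))
print(filter_messages(messages, "SCHOOL", "DISTRESSFUL"))
```
]]></preformat>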
      <p>In order to enable such queries, the following
services have been implemented:</p>
      <p>Data collection services: services dealing with
the extraction of data (messages/user information)
from targeted social networks. These services are
designed to collect both messages referring to a
specific topic or hashtag, such as “#maturità”, and
messages exchanged between users belonging to
specific communities, such as the members of a
targeted school class. Among such services, we also
implemented Author Profiling services that
automatically determine the age of the writers (e.g. to
filter adolescents’ messages), but these specific
services are out of the scope of this work.</p>
      <p>Semantic annotation services: services dealing
with the semantic annotation of gathered
messages; once downloaded, they are automatically
annotated with the semantic metadata described in
the next section.</p>
      <p>Storage services: services to store (possibly
large-scale) collections of messages, communities
and semantic metadata in NoSQL databases,
implemented in MongoDB.</p>
      <p>Reporting services GUI: services that aggregate
messages, metadata and users to enable advanced
reports, such as the one shown in Figure 1.</p>
    </sec>
    <sec id="sec-5">
      <title>Distress Characterization: The semantic modeling</title>
      <p>In order to synthesize the amount of information
made available on Social Media, we need to look
at different semantic dimensions that can be
associated with the writer’s emotion, sentiment and
mental status. Given that no direct diagnosis of
an individual’s mental health can be drawn from
a single message (it is rather
informed by the observation of behaviors across
temporal and social dimensions), we need to frame the
mental-state-related information observable in
Social Media within a comprehensive description of
a subject.</p>
      <p>We therefore decided to focus on the experiential
dimension and to start from the so-called Life Event
dimension, which expresses topics of interest and
daily events in a young person’s life. At the
time of writing, these have been discretized into
eighteen different classes, as listed in Table 1.
Each message can be assigned to one or more
classes, characterizing the possibly multiple topics
mentioned in it. For
example, in the message “Odio la scuola ma adoro i
miei compagni” (“I hate school but I love my
classmates”) the writer refers to the SCHOOL and
SOCIAL RELATIONSHIP life events.</p>
      <p>
        Moreover, a Subjective emotional dimension
is targeted to capture the way the subject relates
to the event in the micro-blog post, i.e.,
whether the event is presented as clearly positive or
negative, as a rather neutral statement, or in an
ironic way. We followed the traditional
modeling for subjectivity analysis
        <xref ref-type="bibr" rid="ref12 ref2">(Rosenthal et al.,
2017; Barbieri et al., 2016)</xref>
        , adopting POSITIVE,
NEGATIVE and NEUTRAL classes; as an example,
“Odio la scuola” (“I hate school”) is NEGATIVE,
while “Domani la scuola è chiusa” (“Tomorrow
my school is closed.”) is NEUTRAL.
      </p>
      <p>Finally, a further dimension called Experience
aims to capture the writer’s personal affect towards
an event, e.g., whether it (i) is causing distress or
other negative feelings such as anger or sadness,
(ii) is regarded as helpful or causing positive
feelings such as happiness or affection or (iii) is not
associated with any perceivable emotional
reaction (neutral). As an example, a school
performance can be a positive experience if it is satisfactory
for the teacher or the parents, thus being
experienced as helpful by the writer, while it might be
experienced as a negative and distressing event
when the teacher’s or the parents’ judgment is negative.
It is worth noting that the Subjective and
Experience dimensions are correlated, but
they target different kinds of perception: the
message “Mi sono rotto una gamba.” (“I
broke my leg.”) can be considered DISTRESSFUL
for the writer even if no explicit approval or rejection is
expressed with respect to the event.</p>
      <p>The information observable in a tweet is thus
mapped into a set of three distinct
dimensions: (i) the type of Life Event le the message
relates to, (ii) the sentiment s towards the event (POSITIVE,
NEGATIVE or NEUTRAL) and (iii) the experience level
e related to the event (HELPFUL,
DISTRESSFUL or NEUTRAL). For example, the tweet
“Quanto odio la mia classe... per fortuna mia
sorella mi aiuta!” (“I hate my class so much...
thankfully, my sister helps me!”) is assigned
the (le, s, e) triples (SCHOOL, NEGATIVE,
DISTRESSFUL) and (FAMILY, POSITIVE, HELPFUL).</p>
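      <p>A possible in-memory representation of such triples is sketched below. It is only an illustration: the class name Annotation and its field names are hypothetical, and only the sentiment and experience values named in the text are validated, since the eighteen Life Event classes are listed in Table 1 and not reproduced here.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass

# Label sets named in the text; Life Event labels are left unchecked here.
SENTIMENTS = {"POSITIVE", "NEGATIVE", "NEUTRAL"}
EXPERIENCES = {"HELPFUL", "DISTRESSFUL", "NEUTRAL"}


@dataclass(frozen=True)
class Annotation:
    """One (le, s, e) triple attached to a message (hypothetical class)."""
    life_event: str   # le, e.g. "SCHOOL" or "FAMILY"
    sentiment: str    # s
    experience: str   # e

    def __post_init__(self):
        if self.sentiment not in SENTIMENTS:
            raise ValueError(f"unknown sentiment: {self.sentiment}")
        if self.experience not in EXPERIENCES:
            raise ValueError(f"unknown experience: {self.experience}")


tweet = "Quanto odio la mia classe... per fortuna mia sorella mi aiuta!"
annotations = [
    Annotation("SCHOOL", "NEGATIVE", "DISTRESSFUL"),
    Annotation("FAMILY", "POSITIVE", "HELPFUL"),
]
print([a.life_event for a in annotations])
```
]]></preformat>
      <p>Keeping one object per triple, rather than one label set per message, preserves the link between each Life Event and its own sentiment and experience values.</p>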
    </sec>
    <sec id="sec-6">
      <title>The InsideOut Annotated Corpus</title>
      <p>In the annotation process, annotators selected
tweets written by adolescents (previously
validated manually), both in English and
Italian, and enriched them with triples (le, s, e), as
discussed in the previous section. Each annotator
starts by associating
one or more le with a message<sup>1</sup> and, for each of
them, the corresponding s and e must be provided.
Each message was initially annotated by two
annotators.</p>
      <p>After this first stage, annotators who
disagreed were asked to reconcile their annotations, in order to
obtain a gold standard dataset. For Italian only,
we extended the dataset with a set of 963 messages
that were annotated by a single annotator,
without further refinement. The overall statistics of
the dataset are shown in Table 2.</p>
      <p>In order to measure the complexity of the
annotation process, we measured the inter-annotator
agreement<sup>2</sup>. Given the possibility to associate
more than one le with a message, we decided to
measure the agreement in terms of Precision, Recall
and F1, by considering the annotations confirmed
after the agreement step as gold standard and the [...]</p>
      <p><sup>1</sup>Each annotator can associate zero, one or more le with a
message.</p>
      <p><sup>2</sup>The inter-annotator agreement considered only
messages annotated by at least two annotators.</p>
      <table-wrap>
        <table>
          <thead>
            <tr><th>Dimension</th><th>Precision</th></tr>
          </thead>
          <tbody>
            <tr><td>Life Event</td><td>85.76%</td></tr>
            <tr><td>Sentiment</td><td>72.43%</td></tr>
            <tr><td>Experience</td><td>74.28%</td></tr>
            <tr><td>Life Event</td><td>80.69%</td></tr>
            <tr><td>Sentiment</td><td>63.77%</td></tr>
            <tr><td>Experience</td><td>64.16%</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>[...] counterpart: one of the main reasons for this is
that Italian messages were annotated by
native speakers, while English messages were
annotated by German native speakers.</p>
    </sec>
    <sec id="sec-7">
      <title>Exploratory Evaluation</title>
      <p>
        In order to assess the applicability of the
annotation process, we measured the quality of
the system in the automatic recognition of Life
Event (LE), Sentiment and Experience classes.
We modeled this problem as a classification task
and adopted the Support Vector Machine
learning algorithm
        <xref ref-type="bibr" rid="ref14">(Vapnik, 1995)</xref>
        in a One-vs-All
scheme, implemented within the Kernel-based
Learning Platform (KeLP)
        <xref ref-type="bibr" rid="ref4">(Filice et al., 2015)</xref>
        <sup>3</sup>. We evaluated the three targeted
dimensions of LE, Subjectivity and Experience
separately<sup>4</sup> in a 10-fold cross-validation scheme: in
each round one fold is selected as the test set, while
another set is used as the validation set to estimate the
SVM parameters. Each tweet is modeled by using
the following feature representations: a
Bag-of-words representation, Bag-of-n-grams (with n = 2
and n = 3) and a distributional representation
based on Word Embeddings
        <xref ref-type="bibr" rid="ref8">(Mikolov et al., 2013)</xref>
        , so that a message is represented as the linear combination of
the vectors of its nouns, verbs, adjectives and adverbs. For the
LE classifier, we built a similar distributional
representation of the eighteen LE definitions shown
in Table 1: we introduced additional features in
terms of an 18-dimensional vector containing the
cosine similarity between the distributional
representation of a tweet and each LE definition. For
Subjectivity and Experience, we added some
specific features, modeling the presence of emoticons,
punctuation marks (such as exclamation points),
upper-case words and elongated words. Moreover,
we added features such as the length of the
message (in terms of words and characters).
      </p>
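      <p>The similarity features used for the LE classifier can be sketched as below. This is a simplified illustration, not the KeLP-based implementation: the toy three-dimensional embeddings, the two abbreviated LE definitions and the function names are assumptions introduced here, the part-of-speech filtering described above is omitted, and the actual system uses eighteen definitions and trained word embeddings.</p>
      <preformat><![CDATA[
```python
import math

# Toy 3-dimensional embeddings (assumption); real vectors would come from a
# trained word embedding model.
EMBEDDINGS = {
    "odio": [0.9, 0.1, 0.0],
    "scuola": [0.1, 0.9, 0.1],
    "compiti": [0.0, 0.8, 0.2],
    "famiglia": [0.1, 0.0, 0.9],
}
DIM = 3


def text_vector(tokens):
    """Sum the vectors of known tokens (a linear combination with unit weights)."""
    vec = [0.0] * DIM
    for tok in tokens:
        for i, x in enumerate(EMBEDDINGS.get(tok, [0.0] * DIM)):
            vec[i] += x
    return vec


def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def le_similarity_features(tweet_tokens, le_definitions):
    """One cosine similarity per Life Event definition, in dictionary order."""
    tweet_vec = text_vector(tweet_tokens)
    return [cosine(tweet_vec, text_vector(tokens))
            for tokens in le_definitions.values()]


# Hypothetical, heavily abbreviated definitions for two Life Events.
le_definitions = {"SCHOOL": ["scuola", "compiti"], "FAMILY": ["famiglia"]}
features = le_similarity_features(["odio", "scuola"], le_definitions)
print(features)
```
]]></preformat>
      <p>With eighteen definitions, the same function would yield the 18-dimensional vector that is appended to the other feature representations.</p>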
      <p><sup>3</sup>Available at www.kelp-ml.org.</p>
      <p><sup>4</sup>When considering Subjectivity and Experience, a gold
standard Life Event is assumed.</p>
      <p>Regarding the LE dimension, we adopted a
conservative strategy: the system assigns an
LE to a message only when the SVM classifier
provides a positive confidence for the corresponding
class, and no LE is assigned otherwise.
Performance is thus measured in terms of Precision
(the percentage of le correctly introduced by the
system), Recall (the percentage of le from the
oracle that have been correctly recovered) and F1
(the harmonic mean of Precision and
Recall)<sup>5</sup>. Regarding the Subjective and Experience
dimensions, once an le is known, the classifier is
always required to associate the message with an s
and an e label, in order to generate consistent triples
of the form (le, s, e). Since this is a multi-class
scheme where the classifier always outputs exactly one class,
Precision is always equal to Recall<sup>6</sup>, and thus to
F1. In order to avoid redundancy, only one
measure is reported; it is referred to as Accuracy, as
it also corresponds to the percentage of messages
correctly associated with the gold-standard label.</p>
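      <p>The annotation-based measures described above can be illustrated with the following sketch. The function name and the toy gold and predicted labels are hypothetical; the example only shows how Precision and Recall diverge when the number of assigned labels can differ from the gold standard, and how they coincide when exactly one label is always assigned.</p>
      <preformat><![CDATA[
```python
def precision_recall_f1(gold, predicted):
    """Annotation-based scores; both arguments map message ids to label sets."""
    tp = fp = fn = 0
    for msg_id in set(gold) | set(predicted):
        g = gold.get(msg_id, set())
        p = predicted.get(msg_id, set())
        tp += len(g & p)   # labels correctly introduced by the system
        fp += len(p - g)   # labels introduced but absent from the gold standard
        fn += len(g - p)   # gold labels the system did not recover
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Life Events: a conservative system may assign fewer labels than the gold
# standard, which keeps Precision high and lowers Recall.
gold_le = {1: {"SCHOOL", "FAMILY"}, 2: {"SCHOOL"}, 3: {"SOCIAL RELATIONSHIP"}}
pred_le = {1: {"SCHOOL"}, 2: {"SCHOOL"}, 3: set()}
print(precision_recall_f1(gold_le, pred_le))

# Sentiment: exactly one label per message, so every error counts as one
# false positive and one false negative, and the three scores coincide.
gold_s = {1: {"NEGATIVE"}, 2: {"NEUTRAL"}, 3: {"POSITIVE"}}
pred_s = {1: {"NEGATIVE"}, 2: {"NEGATIVE"}, 3: {"POSITIVE"}}
print(precision_recall_f1(gold_s, pred_s))
```
]]></preformat>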
      <p>
        Preliminary results are shown in Table 4, both
for English and Italian. Regarding the LE
dimension, the adopted strategy results in a Precision
higher than 70%, but at the cost of a lower Recall. We
believe this is mainly due to the reduced size of the
dataset: this is even more relevant for English, where
a Recall of only 31% was obtained. This number
is consistently higher for the Italian dataset, where
almost twice as many examples are available
and almost half of the tweets were annotated
by only one person, thus reducing the odds of
differences in annotations. In any case, these results are
well above a random baseline: the
correct LE classification given a random
selection from 18 classes would achieve an F1 no higher
than 3%; if we require two correct classifications,
in line with the average number of le per tweet shown in
Table 2, this baseline drops to 0.3%. Moreover, it is
worth noting that the conservative
strategy was adopted to favor precision:
since we are able to collect a huge amount of
messages from social networks, we can afford to lose
some messages (often characterized by too little
information, as in very short messages) instead of
introducing too much noisy metadata into the
overall workflow. Results concerning sentiment are
generally consistent with international
benchmarks in English
        <xref ref-type="bibr" rid="ref12">(Rosenthal et al., 2017)</xref>
        or
in Italian
        <xref ref-type="bibr" rid="ref2">(Barbieri et al., 2016)</xref>
        , where almost all
systems achieved an Accuracy between 60% and
65% (even using larger datasets). Overall, this
result seems to be significant, as it is in line with the
outcome of the inter-annotator agreement.
However, further analysis is required to adopt more
complex models for the classification of such short
messages, such as more complex kernels
        <xref ref-type="bibr" rid="ref1">(Agarwal et al., 2011)</xref>
        or deep learning methods
        <xref ref-type="bibr" rid="ref7">(Kim, 2014)</xref>
        .
      </p>
      <p><sup>5</sup>Since a message can be associated with multiple le, the
evaluation is not message-based but annotation-based.</p>
      <p><sup>6</sup>It may be the case that the LE classifier produces a
number of le different from the number provided in
the gold standard. As a consequence, when evaluating this
specific classifier, each message potentially introduces a
different number of false positives and false negatives, so
Precision and Recall will diverge.</p>
    </sec>
    <sec id="sec-8">
      <title>Conclusions</title>
      <p>This paper summarizes the InsideOut project,
which explores the possibility of using Social Web
mining methodologies and technologies to gather
evidence about adolescents’ mental distress. The
semantic model defined here and the annotated
resource pave the way to long-term joint research
between computer science specialists and mental
health professionals. The outcomes suggest the
applicability of the devised methodology to larger
communities and different languages. Since the
system is currently active over Twitter, the final
version of the paper will discuss the outcomes of about 5 months
of continuous monitoring of
Italian- and English-speaking communities, with
evidence relevant to the future of our project as
a novel and ambitious Social Computational
Science application.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Apoorv</given-names>
            <surname>Agarwal</surname>
          </string-name>
          , Boyi Xie, Ilia Vovsha, Owen Rambow, and
          <string-name>
            <given-names>Rebecca</given-names>
            <surname>Passonneau</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Sentiment analysis of twitter data</article-title>
          .
          <source>In Proceedings of the Workshop on Languages in Social Media, LSM '11</source>
          , pages
          <fpage>30</fpage>
          -
          <lpage>38</lpage>
          , Stroudsburg, PA, USA.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Francesco</given-names>
            <surname>Barbieri</surname>
          </string-name>
          , Valerio Basile, Danilo Croce, Malvina Nissim, Nicole Novielli, and
          <string-name>
            <given-names>Viviana</given-names>
            <surname>Patti</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Overview of the evalita 2016 sentiment polarity classification task</article-title>
          .
          <source>In Proceedings Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016)</source>
          , Napoli, Italy, December 5-7, 2016.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Paul</given-names>
            <surname>Best</surname>
          </string-name>
          , Roger Manktelow, and
          <string-name>
            <given-names>Brian</given-names>
            <surname>Taylor</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Online communication, social media and adolescent wellbeing: A systematic narrative review</article-title>
          .
          <source>Children and Youth Services Review</source>
          ,
          <volume>41</volume>
          :
          <fpage>27</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Simone</given-names>
            <surname>Filice</surname>
          </string-name>
          , Giuseppe Castellucci, Danilo Croce, and
          <string-name>
            <given-names>Roberto</given-names>
            <surname>Basili</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Kelp: a kernel-based learning platform for natural language processing</article-title>
          .
          <source>In Proceedings of ACL: System Demonstrations</source>
          , Beijing, China, July.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Fiona M</given-names>
            <surname>Gore</surname>
          </string-name>
          , Paul JN Bloem, George C Patton, Jane Ferguson, Véronique Joseph, Carolyn Coffey, Susan M Sawyer, and
          <string-name>
            <given-names>Colin D</given-names>
            <surname>Mathers</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Global burden of disease in young people aged 10-24 years: a systematic analysis</article-title>
          .
          <source>The Lancet</source>
          ,
          <volume>377</volume>
          (
          <issue>9783</issue>
          ):
          <fpage>2093</fpage>
          -
          <lpage>2102</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Christian</given-names>
            <surname>Kieling</surname>
          </string-name>
          , Helen Baker-Henningham, Myron Belfer, Gabriella Conti, Ilgi Ertem, Olayinka Omigbodun, Luis Augusto Rohde, Shoba Srinath, Nurper Ulkuer, and
          <string-name>
            <given-names>Atif</given-names>
            <surname>Rahman</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Child and adolescent mental health worldwide: evidence for action</article-title>
          .
          <source>The Lancet</source>
          ,
          <volume>378</volume>
          (
          <issue>9801</issue>
          ):
          <fpage>1515</fpage>
          -
          <lpage>1525</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Yoon</given-names>
            <surname>Kim</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Convolutional neural networks for sentence classification</article-title>
          .
          <source>In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29</source>
          ,
          <year>2014</year>
          , Doha, Qatar, pages
          <fpage>1746</fpage>
          -
          <lpage>1751</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , Kai Chen, Greg Corrado, and
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Efficient estimation of word representations in vector space</article-title>
          .
          <source>CoRR, abs/1301.3781</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Francisco</given-names>
            <surname>Rangel</surname>
          </string-name>
          , Paolo Rosso, Irina Chugur, Martin Potthast, Martin Trenkmann, Benno Stein, Ben Verhoeven, and
          <string-name>
            <given-names>Walter</given-names>
            <surname>Daelemans</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Overview of the 2nd author profiling task at pan 2014</article-title>
          .
          <source>In CLEF evaluation labs and workshop</source>
          , pages
          <fpage>898</fpage>
          -
          <lpage>927</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Francisco</given-names>
            <surname>Rangel</surname>
          </string-name>
          , Fabio Celli, Paolo Rosso, Martin Potthast, Benno Stein, and
          <string-name>
            <given-names>Walter</given-names>
            <surname>Daelemans</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Overview of the 3rd author profiling task at pan 2015</article-title>
          .
          <source>In CLEF 2015 Evaluation Labs and Workshop Working Notes Papers</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Francisco</given-names>
            <surname>Rangel</surname>
          </string-name>
          , Paolo Rosso, Ben Verhoeven, Walter Daelemans,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Potthast</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Benno</given-names>
            <surname>Stein</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Overview of the 4th author profiling task at pan 2016: cross-genre evaluations</article-title>
          .
          <source>Working Notes Papers of the CLEF.</source>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Sara</given-names>
            <surname>Rosenthal</surname>
          </string-name>
          , Noura Farra, and
          <string-name>
            <given-names>Preslav</given-names>
            <surname>Nakov</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>SemEval-2017 task 4: Sentiment analysis in Twitter</article-title>
          .
          <source>In Proceedings of the 11th International Workshop on Semantic Evaluation, SemEval '17</source>
          , Vancouver, Canada, August. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Emilio</given-names>
            <surname>Sulis</surname>
          </string-name>
          , Cristina Bosco, Viviana Patti, Mirko Lai, Delia Irazú Hernández Farías, Letizia Mencarini, Michele Mozzachiodi, and
          <string-name>
            <given-names>Daniele</given-names>
            <surname>Vignoli</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Subjective well-being and social media. A semantically annotated twitter corpus on fertility and parenthood</article-title>
          .
          <source>In Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016)</source>
          , Napoli, Italy, December 5-7, 2016.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>V</given-names>
            <surname>Vapnik</surname>
          </string-name>
          .
          <year>1995</year>
          .
          <article-title>The nature of statistical learning theory.</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>