<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Detecting Early Onset of Depression from Social Media Text using Learned Confidence Scores</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ana-Maria Bucur</string-name>
          <email>ana-maria.bucur@drd.unibuc.ro</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Liviu P. Dinu</string-name>
          <email>ldinu@fmi.unibuc.ro</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Bucharest</institution>
          ,
          <country country="RO">Romania</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Computational research on mental health disorders from written texts spans an interdisciplinary area between natural language processing and psychology. A crucial aspect of this problem is prevention and early diagnosis, as suicide resulting from depression is the second leading cause of death for young adults. In this work, we focus on methods for detecting the early onset of depression from social media texts, in particular from Reddit. To that end, we explore the eRisk 2018 dataset and achieve good results with regard to the state of the art by leveraging topic analysis and learned confidence scores to guide the decision process.<sup>1</sup></p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Mental illnesses are a common problem of our
modern world. More than one in ten people were living
with a mental health disorder in 2017
        <xref ref-type="bibr" rid="ref1 ref16 ref18 ref24 ref25 ref27 ref29 ref30 ref9">(Ritchie and
Roser, 2018)</xref>
        , with women being the most affected.
These disorders affect people’s way of thinking,
mood, emotions, behaviour and their relationships
with others. Most mental illnesses remain
undiagnosed because of the social stigma around them.
      </p>
      <p>Depression is one of the main causes of
disability globally<sup>2</sup> and affects people of all ages.
Prevention aims to reduce depression and to save the
lives of people at risk of suicide, but it
is currently limited to raising awareness, programs
that cultivate positive thinking in people with depression,
and monitoring people who have attempted suicide or
self-harm.</p>
      <p>
        With the rise in social media use, more
computational efforts are made to detect mental illnesses
        <fn id="fn1"><label>1</label><p>Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p></fn>
        <fn id="fn2"><label>2</label><p>https://www.who.int</p></fn>
such as depression
        <xref ref-type="bibr" rid="ref7">(De Choudhury et al., 2013)</xref>
        and PTSD
        <xref ref-type="bibr" rid="ref5">(Coppersmith et al., 2015)</xref>
        , but also to
detect misogyny
        <xref ref-type="bibr" rid="ref1">(Anzovino et al., 2018)</xref>
        , irony and
sarcasm
        <xref ref-type="bibr" rid="ref14">(Khokhlova et al., 2016)</xref>
        from users’ texts.
      </p>
      <p>
        People tend to talk more about their emotions
and mental health problems online and to seek
support. The sources of mental health cues used for
detection are Twitter, Facebook, Reddit and forums
        <xref ref-type="bibr" rid="ref4">(Calvo et al., 2017)</xref>
        . Reddit<sup>3</sup> is a social media site
very similar to forums. It is organized in
subreddits with specific topics, some dedicated to mental
health problems. The use of throwaway accounts
to maintain anonymity promotes disclosure, and
users are more likely to share problems they have
not discussed with anyone before. The use of these
accounts makes it difficult for users to receive more
social support because the majority of them are
used only for one post
        <xref ref-type="bibr" rid="ref4">(Calvo et al., 2017)</xref>
        .
      </p>
      <p>
        In this work, we choose to tackle the problem
of detecting early onset of depression from users’
posts on social media, specifically from Reddit. As
such, we explore the eRisk 2018 dataset through
topic analysis by means of Latent Semantic
Indexing
        <xref ref-type="bibr" rid="ref8">(Deerwester et al., 1990)</xref>
        and learned
out-of-distribution confidence scores
        <xref ref-type="bibr" rid="ref1 ref16 ref18 ref24 ref25 ref27 ref29 ref30 ref9">(DeVries and Taylor,
2018)</xref>
        . Due to the nature of the dataset, we
repurpose the learned confidence score to make a
decision on whether to label the user as depressed
or non-depressed or to wait for more data, as test
chunks were progressively released every week.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2 Related Work</title>
      <p>
        Recent studies on depression detection from text
are reviewed by
        <xref ref-type="bibr" rid="ref12">Guntuku et al. (2017)</xref>
        . In the datasets used, people diagnosed with mental illnesses
are identified through screening
surveys, self-reported diagnoses in social media
posts, or membership in
forums related to mental health. The most used
features are topic modelling, n-grams, Linguistic
Inquiry and Word Count (LIWC), emotion and
metadata. The most used methods are Support Vector
Machines (SVM), Logistic Regression, Random
Forests and Neural Networks.
        <fn id="fn3"><label>3</label><p>https://www.reddit.com</p></fn>
      </p>
      <p>Coppersmith et al. (2016) show differences
in emoji use between suicidal users and
controls: neurotypical users are much more likely
to use emojis than users prior to a suicide attempt.
Before an attempt, at-risk users tend
to use more self-focused language, as do
people diagnosed with depression. The authors also
highlight changes in the emotions expressed in posts before
and after the suicide attempt, and find that users are more
likely to talk about suicide after an attempt than
before it.</p>
      <p>Sekulić et al. (2018) indicate that users
diagnosed with bipolar disorder use more first-person
singular pronouns, as do depressed people. They
also use more words associated with emotions,
both positive and negative, which is explained
by alternating episodes of mania and depression.</p>
      <p>Nalabandian et al. (2019) show that depressed
persons tend to use more negative words and more
self-focused language when writing about their
interactions with a close romantic partner than when
writing about other people around them. This is
because people experience different symptoms of
mental illness depending on their interactions with other
people.</p>
      <p>
        <xref ref-type="bibr" rid="ref17">Loveys et al. (2018)</xref>
        examine
differences in the language use of users with
depression from different cultures, with the aim of avoiding cultural biases.
Even though depression affects people all over the world,
the way they experience and express it is shaped
by their cultural context. Users from some ethnic
groups do not address mental health issues online
as much as others, which can make the
depression detection task more difficult. After topic modelling, the
authors show that the words in each topic vary
across ethnic groups, with people discussing
themes relevant to their culture.
      </p>
      <p>For diagnosis before the onset of the mental
health disorders, Eichstaedt et al. (2018) use users’
posts from Facebook to predict a future depression
diagnosis. De Choudhury et al. (2013) use a
classifier to predict users’ depression likelihood ahead of
the onset of illness, with different measures used:
language, linguistic style, emotion, ego-network,
demographics and user engagement.</p>
      <p>
        We chose to tackle the problem of detecting early
onset of depression from users’ Reddit posts. To
that end, we focus our efforts on processing the
eRisk 2018 dataset
        <xref ref-type="bibr" rid="ref16">(Losada et al., 2018)</xref>
        , given its
success at the Workshop for Early Risk Detection
on the Internet<sup>4</sup> within the Conference and Labs
of the Evaluation Forum (CLEF) and the many
submissions it attracted from participants.
      </p>
      <p>
        The teams from this workshop had different
detection systems, based on bag of words
ensembles
        <xref ref-type="bibr" rid="ref30">(Trotzek et al., 2018)</xref>
        , machine learning
models with hand-crafted features
        <xref ref-type="bibr" rid="ref1 ref16 ref18 ref24 ref24 ref25 ref27 ref29 ref3 ref30 ref30 ref9">(Trotzek et al.,
2018; Ramiandrisoa et al., 2018; Cacheda et al.,
2018; Ramírez-Cifuentes and Freire, 2018)</xref>
        or with
different text embeddings
        <xref ref-type="bibr" rid="ref23 ref24 ref30">(Trotzek et al., 2018;
Ramiandrisoa et al., 2018; Ragheb et al., 2018)</xref>
        , on
sentence-level analysis to detect self references and
extract different features
        <xref ref-type="bibr" rid="ref20">(Ortega-Mendoza et al.,
2018)</xref>
        , on Latent Dirichlet Allocation (LDA) topic
modelling
        <xref ref-type="bibr" rid="ref1 ref16 ref18 ref24 ref25 ref27 ref29 ref30 ref9">(Maupomé and Meurs, 2018)</xref>
        , models
combining Term Frequency-Inverse Document
Frequency (TF-IDF) with Convolutional Neural Networks
        <xref ref-type="bibr" rid="ref31">(Wang et al., 2018)</xref>
        or other machine learning
models. Most systems made their decision only after the last
chunk; only a few were able to emit a decision in
the first chunks.
      </p>
      <p>
        Several works addressing depression
        <xref ref-type="bibr" rid="ref26 ref28">(Schwartz
et al., 2014; Resnik et al., 2015)</xref>
        and PTSD
        <xref ref-type="bibr" rid="ref22 ref5">(Coppersmith et al., 2015; Preoțiuc-Pietro et al., 2015)</xref>
        use a topic modelling approach, showing that the
topics encountered in texts have important discriminative
power for distinguishing between persons
suffering from mental illnesses and healthy controls.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3 Dataset</title>
      <p>
        The Early Risk Detection on the Internet (eRisk)
workshops organized by CLEF explore the
technologies that can be used for people’s health and safety
and the issues related to building test collections
        <xref ref-type="bibr" rid="ref16">(Losada et al., 2018)</xref>
        . eRisk 2018 has two tasks,
for early detection of depression and anorexia. We
choose to focus on the task of detecting early onset
of depression of social media users.
      </p>
      <p>This task consists of sequentially processing
chunks of Reddit posts from depressed users and
controls. Submissions from each user are encoded
in an XML file, one XML file per subject and chunk of data.
Each XML file contains the id of the subject and their
posts and comments. Each submission has the
posting time and the actual text. If a submission does
not have a title, it is considered a comment. The
goal is to detect depression as early as possible, and
the dataset has to be processed in chronological
order. The test collection of posts from depressed
and non-depressed users is split into 10 chunks. As
training data, the teams had access to the data from
eRisk 2017, both train and test. The test chunks
were released one per week. Every week the
teams had to decide whether to label the user as
depressed or non-depressed or to wait for the test
data of the following week.<fn id="fn4"><label>4</label><p>https://early.irlab.org/</p></fn></p>
      <p>
        The dataset contains 125 depressed users and
752 non-depressed users as training data and 79
depressed users and 741 non-depressed users as test
data. The dataset has more posts and comments
from people without depression than from users
diagnosed with depression. From a total of 531,349
submissions, only 49,557 submissions are from
users diagnosed with depression. The average time
from the first to the last submission is between 2
and 3 years, so the posts were collected over a long
period of time
        <xref ref-type="bibr" rid="ref16">(Losada et al., 2018)</xref>
        .
      </p>
    </sec>
    <sec id="sec-4">
      <title>4 Method</title>
      <p>
        Our methodology for early diagnosis of depression
follows a classical Natural Language Processing
pipeline. To clean the users’ texts, we convert
them to lowercase, remove punctuation
and stopwords, replace numbers and URLs
with specific tokens, and perform stemming
with the Porter stemmer
        <xref ref-type="bibr" rid="ref21">(Porter, 1980)</xref>
        . To reduce
the dimension of the dictionary, we use
collocations
        <xref ref-type="bibr" rid="ref2">(Bouma, 2009)</xref>
        to extract meaningful bigrams
and trigrams.
      </p>
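      <p>The cleaning steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used for the experiments: the placeholder tokens <monospace>urltoken</monospace> and <monospace>numtoken</monospace>, the short stopword list and the function name <monospace>clean_text</monospace> are assumptions, and stemming is left as a pluggable callable rather than an actual Porter stemmer.</p>
      <preformat preformat-type="code">
```python
import re
import string

# Hypothetical placeholder tokens and a tiny stopword list, for illustration only.
URL_TOKEN = "urltoken"
NUM_TOKEN = "numtoken"
STOPWORDS = {"a", "an", "the", "and", "or", "to", "of", "is", "it", "i", "at"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NUM_RE = re.compile(r"\d+(?:[.,]\d+)*")
# Map every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans({ch: " " for ch in string.punctuation})


def clean_text(text, stem=lambda token: token):
    """Lowercase, replace URLs and numbers, strip punctuation and stopwords.

    `stem` is a pluggable stemmer; the identity default only stands in for
    a Porter stemmer, which is not implemented in this sketch.
    """
    text = text.lower()
    # URLs are replaced first so that digits inside them are not touched.
    text = URL_RE.sub(" " + URL_TOKEN + " ", text)
    text = NUM_RE.sub(" " + NUM_TOKEN + " ", text)
    text = text.translate(PUNCT_TABLE)
    return [stem(tok) for tok in text.split() if tok not in STOPWORDS]
```
      </preformat>
      <p>A stemmer from an NLP toolkit could be passed as <monospace>stem</monospace> to complete the pipeline; collocation extraction would operate on the resulting token lists.</p>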
      <p>The number of posts and comments from
non-depressed users is much higher than the number from
depressed users. To balance the two classes, we
downsample the majority class to a ratio of 2:1.</p>
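      <p>A minimal sketch of this downsampling step is given below. It assumes that the 2:1 ratio means two majority-class samples for every minority-class sample and that sampling is uniform at random; the function name and the fixed seed are illustrative choices, not details taken from the experiments.</p>
      <preformat preformat-type="code">
```python
import random


def downsample_majority(majority, minority, ratio=2, seed=0):
    """Randomly keep at most `ratio` majority samples per minority sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    target = min(len(majority), ratio * len(minority))
    return rng.sample(majority, target)
```
      </preformat>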
      <p>We train our Latent Semantic Indexing model
with 128 topics on the posts of every user. We use this
model to extract topic-model embeddings from
users’ texts and use them as input to a fully
connected neural network. The
network has three hidden layers of 512, 256 and 256
neurons respectively, with Leaky ReLU activations and
Dropout for regularization. We use a
random sample of 20% of the training data provided
by the organisers of the competition for validation.</p>
      <p>
        The network has two outputs, one for classifying
if the user is depressed or not and one for
confidence estimation. The motivation for using this
architecture is to learn the confidence
        <xref ref-type="bibr" rid="ref1 ref16 ref18 ref24 ref25 ref27 ref29 ref30 ref9">(DeVries and
Taylor, 2018)</xref>
        of our predictions and use it to make
a decision on whether to label a user or wait for
the next chunk of data. The learned confidence,
besides its use case in out-of-distribution detection,
can be used as a measure for how much the model
trusts its classification output to be correct. As
such, we consider the classification output only if
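the confidence exceeds a certain threshold. As
indicated by DeVries and Taylor (2018), the network loss
is computed by interpolating the predicted
probabilities p with the target y, using the computed
confidence score c, as follows:
        <disp-formula id="eq1"><label>(1)</label><tex-math><![CDATA[p'_i = c \cdot p_i + (1 - c)\, y_i]]></tex-math></disp-formula>
        The final loss is then given by:
        <disp-formula id="eq2"><label>(2)</label><tex-math><![CDATA[\mathcal{L} = -\sum_{i=1}^{M} \log(p'_i)\, y_i - \lambda \log(c)]]></tex-math></disp-formula>
      </p>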
      <p>Here M = 2 is the number of
classes in our case. The loss includes an additional term that
pushes the predicted confidence to be as high as
possible, weighted by the confidence penalty λ. We performed an ablation study on the
validation data over the confidence penalty λ.</p>
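      <p>As a worked illustration of the loss for a single example, the sketch below interpolates the predicted probabilities with the one-hot target using the confidence c, and adds the log-confidence term scaled by the penalty weight (named <monospace>lam</monospace> here). The function name and the small <monospace>eps</monospace> guard against taking the logarithm of zero are assumptions made for this sketch; it is not the training code used in the experiments.</p>
      <preformat preformat-type="code">
```python
import math


def confidence_loss(p, y, c, lam, eps=1e-12):
    """Loss for one example with learned confidence.

    p   -- predicted class probabilities (sums to 1)
    y   -- one-hot target
    c   -- predicted confidence in [0, 1]
    lam -- weight of the confidence penalty
    """
    # Interpolate predictions towards the target: p' = c * p + (1 - c) * y
    p_adj = [c * p_i + (1.0 - c) * y_i for p_i, y_i in zip(p, y)]
    # Negative log-likelihood computed on the adjusted probabilities
    task = -sum(y_i * math.log(max(q_i, eps)) for q_i, y_i in zip(p_adj, y))
    # Penalty that discourages asking for hints (low confidence)
    penalty = -math.log(max(c, eps))
    return task + lam * penalty
```
      </preformat>
      <p>With c = 1 the loss reduces to the usual cross-entropy; lowering c moves the adjusted probabilities towards the target at the cost of a larger penalty term.</p>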
      <p>A recent study by Hein et al. (2019) shows that
neural networks with ReLU activation functions
tend to be overconfident on incorrectly classified
samples. We therefore cannot rely only on the output
probabilities; the predicted confidence offers a
more reliable measure of the uncertainty of the
classification.</p>
      <p>As the number of submissions seen by the model
increases, we want to make a decision as early
as possible, and thus we use a decaying function
that progressively lowers the confidence
threshold. The decision function is defined as
follows:</p>
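      <p>
        <disp-formula id="eq3"><label>(3)</label><tex-math><![CDATA[D_w(x) = \begin{cases} \text{decide for } x & \text{if } c > T \cdot e^{-s w^2} \\ \text{wait for data} & \text{otherwise} \end{cases}]]></tex-math></disp-formula>
      </p>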
      <p>Where x is the embedding for the current user’s
posts, w is the week number (i.e. the current
chunk), s is a scaling factor and T is the initial
threshold. We choose T = 85% and progressively
scale it down to 40%. The scaling factor is
computed such that, at the final chunk, the threshold is
less than the smallest confidence encountered on
the training data.</p>
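      <p>The decaying threshold can be illustrated with a short sketch. It assumes that weeks are numbered from 1 to 10 and that the threshold should reach exactly 40% at the last week, which fixes the scaling factor s; in the paper s is instead tied to the smallest confidence seen on the training data, so the function names and the default values below are assumptions for illustration only.</p>
      <preformat preformat-type="code">
```python
import math


def scaling_factor(t_init, t_final, last_week):
    """Solve t_init * exp(-s * last_week**2) = t_final for s."""
    return math.log(t_init / t_final) / (last_week ** 2)


def threshold(week, t_init, s):
    """Confidence threshold for a given week: T * exp(-s * w^2)."""
    return t_init * math.exp(-s * week ** 2)


def should_decide(confidence, week, t_init=0.85, t_final=0.40, last_week=10):
    """Return True to emit a label now, False to wait for more data."""
    s = scaling_factor(t_init, t_final, last_week)
    return confidence > threshold(week, t_init, s)
```
      </preformat>
      <p>Under these assumptions a confidence of 0.5 is not enough to decide in the first week, but it is enough in the last one, when the threshold has decayed to 0.40.</p>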
      <p>At the test phase, the proposed model does not
make an independent decision for each chunk of
data in the test set. In the first chunk of data, if
the model is not confident enough to make a final
decision regarding the depressed or non-depressed
status of a user, then, starting with the second chunk
of data, we concatenate the current chunk with the
previously available chunks for the current user.
This way, the LSI model has more data for making
better informed predictions.
</p>
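      <p>The test-time procedure can be sketched as a loop over the weekly chunks of one user. The callables <monospace>predict</monospace> (returning a label and a confidence for all texts seen so far) and <monospace>threshold_for_week</monospace> are hypothetical stand-ins for the LSI embedding plus network and for the decaying threshold; returning <monospace>None</monospace> when no chunk is confident enough is a choice made for this sketch.</p>
      <preformat preformat-type="code">
```python
def classify_user(chunks, predict, threshold_for_week):
    """Process weekly chunks in order and stop at the first confident decision.

    chunks             -- list of lists of texts, one inner list per week
    predict            -- callable(texts) returning (label, confidence)
    threshold_for_week -- callable(week) returning the confidence threshold
    Returns (label, week) or (None, number_of_weeks) if never confident.
    """
    seen = []
    for week, chunk in enumerate(chunks, start=1):
        # Concatenate the current chunk with all previously available ones.
        seen.extend(chunk)
        label, confidence = predict(seen)
        if confidence > threshold_for_week(week):
            return label, week
    return None, len(chunks)
```
      </preformat>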
    </sec>
    <sec id="sec-5">
      <title>5 Results</title>
      <p>Our results on the eRisk 2018 dataset are presented in
Table 1. Although F1 is a standard evaluation
measure for imbalanced classification, it does not
account for the time component of the early detection
task. Losada and Crestani (2016) therefore propose an
evaluation metric better suited to this task, the
Early Risk Detection Error (ERDE).</p>
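      <p>ERDE is defined as:
        <disp-formula id="eq4"><label>(4)</label><tex-math><![CDATA[ERDE_o(d, k) = \begin{cases} c_{fp} & \text{if } d = FP \\ c_{fn} & \text{if } d = FN \\ lc_o(k) \cdot c_{tp} & \text{if } d = TP \\ 0 & \text{if } d = TN \end{cases}]]></tex-math></disp-formula>
      </p>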
      <p>
        The outcomes false positive (FP), false negative
(FN), true positive (TP) and true negative (TN) of a
prediction d are treated separately to avoid rewarding classifiers that always
predict the label of the majority class. lc<sub>o</sub>(k) ∈
[0, 1] encodes a cost for the delay in detecting a TP.
For the eRisk datasets, where the number of
negative labels is greater than the number of positive labels, the value
of c<sub>fn</sub> is 1 and c<sub>fp</sub> is 0.1296, set according to the
proportion of depressed users in the eRisk 2017 dataset
        <xref ref-type="bibr" rid="ref16">(Losada et al., 2018)</xref>
        . c<sub>tp</sub> is set to c<sub>fn</sub> because the
late detection of people at risk of depression can
have serious consequences, so a late detection is
considered equivalent to not detecting the depressed
user at all. The late detection of TN cases does not
affect the effectiveness of the system.
      </p>
      <p>The goal of the system is to detect as early as
possible people at risk of depression. For the detection
of non-depressed users, the time of the detection
is not relevant. The latency cost function, which
grows with k (the number of submissions seen by
the algorithm), is defined as:
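        <disp-formula id="eq5"><label>(5)</label><tex-math><![CDATA[lc_o(k) = 1 - \frac{1}{1 + e^{k - o}}]]></tex-math></disp-formula>
        where o represents the number of posts after which the
cost grows more quickly.</p>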
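      <p>The metric can be written down in a few lines. The sketch below uses the cost values stated in the text (c<sub>fn</sub> = c<sub>tp</sub> = 1 and c<sub>fp</sub> = 0.1296) as defaults; the function names, the boolean encoding of decisions and the clamp on the exponent, which avoids floating-point overflow for very late decisions, are assumptions made for illustration.</p>
      <preformat preformat-type="code">
```python
import math


def latency_cost(k, o):
    """lc_o(k) = 1 - 1 / (1 + exp(k - o)), with the exponent clamped."""
    return 1.0 - 1.0 / (1.0 + math.exp(min(k - o, 700)))


def erde(decision, truth, k, o, c_fp=0.1296, c_fn=1.0, c_tp=1.0):
    """ERDE for one user.

    decision -- True if the system labelled the user as depressed
    truth    -- True if the user is actually depressed
    k        -- number of submissions seen before the decision
    o       -- number of submissions after which the delay cost grows quickly
    """
    if decision and not truth:
        return c_fp                        # false positive
    if truth and not decision:
        return c_fn                        # false negative
    if decision and truth:
        return latency_cost(k, o) * c_tp   # true positive, penalised by delay
    return 0.0                             # true negative


def mean_erde(records, o):
    """Average ERDE over (decision, truth, k) triples."""
    return sum(erde(d, t, k, o) for d, t, k in records) / len(records)
```
      </preformat>
      <p>For example, a true positive emitted after exactly o submissions costs 0.5, while the same decision emitted much later costs almost as much as a false negative.</p>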
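      <p>Table 1. Method: Baseline LSI; LSI<sub>c</sub>, λ = 0.01; LSI<sub>c</sub>, λ = 0.1; LSI<sub>c</sub>, λ = 0.2; LSI<sub>c</sub>, λ = 0.4; LSI<sub>c</sub>, λ = 0.6; LSI<sub>c</sub>, λ = 0.8; Funez et al. (2018); Trotzek et al. (2018): 9.50%, 6.44%.</p>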
      <p>The detection task is difficult, as seen in the low
values of F1 and Precision. However, the task is
to predict the early onset of depression, and for that
the ERDE metrics are more appropriate, as they
account for prediction delay. The ERDE<sub>5</sub> metric
is very sensitive to delays: after the first 5
submissions from the user, the penalties grow quickly. In
contrast, for ERDE<sub>50</sub> the penalties grow
only after the first 50 submissions from the user.
The difference between ERDE<sub>5</sub> and ERDE<sub>50</sub> is
very important in practice because of the
consequences of late detection of signs of depression. As
the task suggests, the detection should be made as
early as possible.</p>
      <p>To measure the impact of the learned
out-of-distribution confidence,
we also trained a plain ReLU network with
cross-entropy loss. For this model, we applied a hard
threshold on the output probabilities to decide whether to
wait for more data or classify the sample. As shown
by Hein et al. (2019), ReLU networks can be overly
confident on misclassified examples. This is visible
in Table 1: the model has a low ERDE<sub>5</sub> score because
the output probabilities mostly take extreme values,
which means that for most users the model makes
a decision from the first chunk of data.</p>
      <p>We trained our model with different values of λ in
order to see the impact of the confidence
component on the results. Larger values of λ make the
model overly confident, as expected from Equation
2; the best performing model is the one with
λ = 0.2. Smaller values of λ generate a wider
confidence distribution on the training examples,
which facilitates the decision process, whereas extreme values
make the model either overly confident on every
example or not confident at all. This is consistent
with the findings of DeVries and Taylor (2018).</p>
      <p>In Table 1 we also present the two best
submissions from the eRisk 2018 Workshop: the one from
Funez et al. (2018), which has the best results for
the ERDE<sub>5</sub> metric, and the one from Trotzek et al.
(2018), which has the top ERDE<sub>50</sub> score.</p>
      <p>We can assume from these results that the topics
encountered in user writings have important
discriminatory power. Depressed users mostly write about
different subjects than non-depressed users,
consistent with the results of Resnik et al.
(2015). The writings of users diagnosed with
depression are more focused on their feelings and
their life events. Topics related to those themes
contain words such as someone kill, bad though, never
able to get, forever alone, life save, stay sober, i
am sad, still can’t, improve life, new hope, oneself,
tell anything, happy sad, hope one day. Texts from
non-depressed users are found in topics related to
their hobbies, containing specific words: black
mirror, first season, movie adaptation, hologram, nine
inch nails, jimi hendrix, artist name, vlog, game,
fallout, terra mistica, way to make money, paid
time, really proud, amazon whishlist, food industry,
white bread.
</p>
    </sec>
    <sec id="sec-6">
      <title>6 Conclusion</title>
      <p>In this paper, we use the eRisk 2018 dataset on
Early Detection of Signs of Depression for
depression classification from Reddit posts. Our method
uses Latent Semantic Indexing for topic modelling
and to generate the embeddings used as input for
our neural network, but focuses on using a learned
out-of-distribution confidence score alongside the
classification output to decide whether to label the
user or wait for more data. Besides its initial use
case in out-of-distribution detection, we repurposed
the confidence score as a measure for how much the
model trusts its classification output to be correct.
We showed that, in general, there is a significant
difference in writing topics depending on the users’
mental health, to the extent that the topics carry enough
information for use in classification.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>We would like to thank our reviewers for their
useful comments and suggestions, which helped us
improve this paper, and the organizers of the
eRisk Workshop for their efforts in encouraging
research on mental illness detection from social
media.</p>
      <p>Liviu P. Dinu was supported by a grant
of the Romanian Ministry of Education and
Research, CCCDI—UEFISCDI, project
number 411PED/2020, code
PN-III-P2-2.1-PED-20192271, within PNCDI III.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Maria</given-names>
            <surname>Anzovino</surname>
          </string-name>
          , Elisabetta Fersini, and
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Rosso</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Automatic identification and classification of misogynistic language on Twitter</article-title>
          .
          <source>In International Conference on Applications of Natural Language to Information Systems</source>
          , pages
          <fpage>57</fpage>
          -
          <lpage>64</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Gerlof</given-names>
            <surname>Bouma</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Normalized (pointwise) mutual information in collocation extraction</article-title>
          .
          <source>Proceedings of GSCL</source>
          , pages
          <fpage>31</fpage>
          -
          <lpage>40</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Fidel</given-names>
            <surname>Cacheda</surname>
          </string-name>
          , Diego Fernández Iglesias, Francisco Javier Nóvoa, and
          <string-name>
            <given-names>Victor</given-names>
            <surname>Carneiro</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Analysis and experiments on early detection of depression</article-title>
          .
          <source>CLEF (Working Notes)</source>
          ,
          <volume>2125</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Rafael A Calvo</given-names>
            ,
            <surname>David N Milne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M Sazzad</given-names>
            <surname>Hussain</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Helen</given-names>
            <surname>Christensen</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Natural language processing in mental health applications using non-clinical texts</article-title>
          .
          <source>Natural Language Engineering</source>
          ,
          <volume>23</volume>
          (
          <issue>5</issue>
          ):
          <fpage>649</fpage>
          -
          <lpage>685</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Glen</given-names>
            <surname>Coppersmith</surname>
          </string-name>
          , Mark Dredze, Craig Harman, Kristy Hollingshead, and Margaret Mitchell.
          <year>2015</year>
          .
          <article-title>CLPsych 2015 shared task: Depression and PTSD on Twitter</article-title>
          .
          <source>In Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality</source>
          , pages
          <fpage>31</fpage>
          -
          <lpage>39</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Glen</given-names>
            <surname>Coppersmith</surname>
          </string-name>
          , Kim Ngo, Ryan Leary, and
          <string-name>
            <given-names>Anthony</given-names>
            <surname>Wood</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Exploratory analysis of social media prior to a suicide attempt</article-title>
          .
          <source>In Proceedings of the Third Workshop on Computational Linguistics and Clinical Psychology</source>
          , pages
          <fpage>106</fpage>
          -
          <lpage>117</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Munmun De Choudhury</surname>
            ,
            <given-names>Michael</given-names>
          </string-name>
          <string-name>
            <surname>Gamon</surname>
            ,
            <given-names>Scott</given-names>
          </string-name>
          <string-name>
            <surname>Counts</surname>
            , and
            <given-names>Eric</given-names>
          </string-name>
          <string-name>
            <surname>Horvitz</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Predicting depression via social media</article-title>
          .
          <source>In Seventh international AAAI conference on weblogs and social media.</source>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Scott</given-names>
            <surname>Deerwester</surname>
          </string-name>
          , Susan T. Dumais, George W. Furnas,
          <string-name>
            <surname>Thomas</surname>
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Landauer</surname>
          </string-name>
          , and Richard Harshman.
          <year>1990</year>
          .
          <article-title>Indexing by latent semantic analysis</article-title>
          .
          <source>Journal of the American Society for Information Science</source>
          ,
          <volume>41</volume>
          (
          <issue>6</issue>
          ):
          <fpage>391</fpage>
          -
          <lpage>407</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Terrance DeVries and Graham W Taylor</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Learning confidence for out-of-distribution detection in neural networks</article-title>
          .
          <source>arXiv preprint arXiv:1802.04865</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Johannes C Eichstaedt</surname>
          </string-name>
          ,
          <string-name>
            <surname>Robert J Smith</surname>
            ,
            <given-names>Raina M Merchant</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lyle H Ungar</surname>
          </string-name>
          , Patrick Crutchley,
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Preoțiuc-Pietro</surname>
          </string-name>
          , David A Asch,
          and
          <string-name>
            <given-names>H Andrew</given-names>
            <surname>Schwartz</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Facebook language predicts depression in medical records</article-title>
          .
          <source>Proceedings of the National Academy of Sciences</source>
          ,
          <volume>115</volume>
          (
          <issue>44</issue>
          ):
          <fpage>11203</fpage>
          -
          <lpage>11208</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Dario G Funez</surname>
          </string-name>
          , Maria José Garciarena Ucelay, Maria Paula Villegas, Sergio Burdisso, Leticia C Cagnina, Manuel Montes-y-Gómez, and
          <string-name>
            <given-names>Marcelo</given-names>
            <surname>Errecalde</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>UNSL's participation at eRisk 2018 Lab</article-title>
          . In CLEF (Working Notes).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Sharath</given-names>
            <surname>Chandra Guntuku</surname>
          </string-name>
          , David Yaden,
          <string-name>
            <given-names>Margaret</given-names>
            <surname>Kern</surname>
          </string-name>
          , Lyle Ungar, and
          <string-name>
            <given-names>Johannes</given-names>
            <surname>Eichstaedt</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Detecting depression and mental illness on social media: an integrative review</article-title>
          .
          <source>Current Opinion in Behavioral Sciences</source>
          ,
          <volume>18</volume>
          :
          <fpage>43</fpage>
          -
          <lpage>49</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Matthias</given-names>
            <surname>Hein</surname>
          </string-name>
          , Maksym Andriushchenko, and
          <string-name>
            <given-names>Julian</given-names>
            <surname>Bitterwolf</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem</article-title>
          .
          <source>In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>
          , pages
          <fpage>41</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Maria</given-names>
            <surname>Khokhlova</surname>
          </string-name>
          , Viviana Patti, and
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Rosso</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Distinguishing between irony and sarcasm in social media texts: Linguistic observations</article-title>
          .
          <source>In 2016 International FRUCT Conference on Intelligence, Social Media and Web (ISMW FRUCT)</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>David E</given-names>
            <surname>Losada</surname>
          </string-name>
          and
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Crestani</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>A test collection for research on depression and language use</article-title>
          .
          <source>In International Conference of the Cross-Language Evaluation Forum for European Languages</source>
          , pages
          <fpage>28</fpage>
          -
          <lpage>39</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>David E</given-names>
            <surname>Losada</surname>
          </string-name>
          , Fabio Crestani, and
          <string-name>
            <given-names>Javier</given-names>
            <surname>Parapar</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Overview of eRisk 2018: Early risk prediction on the internet (extended lab overview)</article-title>
          .
          <source>In Proceedings of the 9th International Conference of the CLEF Association</source>
          , CLEF.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Kate</given-names>
            <surname>Loveys</surname>
          </string-name>
          , Jonathan Torrez, Alex Fine, Glen Moriarty, and
          <string-name>
            <given-names>Glen</given-names>
            <surname>Coppersmith</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Cross-cultural differences in language markers of depression online</article-title>
          .
          <source>In Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic</source>
          , pages
          <fpage>78</fpage>
          -
          <lpage>87</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Diego</given-names>
            <surname>Maupomé</surname>
          </string-name>
          and
          <string-name>
            <given-names>Marie-Jean</given-names>
            <surname>Meurs</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Using topic extraction on social media content for the early detection of depression</article-title>
          .
          <source>CLEF (Working Notes)</source>
          ,
          <volume>2125</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>Taleen</given-names>
            <surname>Nalabandian</surname>
          </string-name>
          and
          <string-name>
            <given-names>Molly</given-names>
            <surname>Ireland</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Depressed individuals use negative self-focused language when recalling recent interactions with close romantic partners but not family or friends</article-title>
          .
          <source>In Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology</source>
          , pages
          <fpage>62</fpage>
          -
          <lpage>73</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
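          <string-name>
            <given-names>Rosa María</given-names>
            <surname>Ortega-Mendoza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Adrián Pastor</given-names>
            <surname>López-Monroy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Anilu</given-names>
            <surname>Franco-Arcega</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Manuel</given-names>
            <surname>Montes-y-Gómez</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>PEIMEX at eRisk2018: Emphasizing personal information for depression and anorexia detection</article-title>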
          .
          <source>In CLEF (Working Notes).</source>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>Martin F</given-names>
            <surname>Porter</surname>
          </string-name>
          .
          <year>1980</year>
          .
          <article-title>An algorithm for suffix stripping</article-title>
          .
          <source>Program</source>
          ,
          <volume>14</volume>
          (
          <issue>3</issue>
          ):
          <fpage>130</fpage>
          -
          <lpage>137</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Preoţiuc-Pietro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Johannes</given-names>
            <surname>Eichstaedt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Gregory</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Maarten</given-names>
            <surname>Sap</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Laura</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Victoria</given-names>
            <surname>Tobolsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H Andrew</given-names>
            <surname>Schwartz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Lyle</given-names>
            <surname>Ungar</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>The role of personality, age, and gender in tweeting about mental illness</article-title>
          .
          <source>In Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality</source>
          , pages
          <fpage>21</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <given-names>Waleed</given-names>
            <surname>Ragheb</surname>
          </string-name>
          , Bilel Moulahi, Jérôme Azé,
          <string-name>
            <given-names>Sandra</given-names>
            <surname>Bringay</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Maximilien</given-names>
            <surname>Servajean</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Temporal mood variation: at the CLEF eRisk-2018 tasks for early risk detection on the internet</article-title>
          .
          <source>In Proceedings of the 9th International Conference of the CLEF Association.</source>
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <given-names>Faneva</given-names>
            <surname>Ramiandrisoa</surname>
          </string-name>
          , Josiane Mothe, Farah Benamara, and Véronique Moriceau.
          <year>2018</year>
          .
          <article-title>IRIT at e-Risk 2018</article-title>
          .
          <source>In Proceedings of the 9th International Conference of the CLEF Association.</source>
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <given-names>Diana</given-names>
            <surname>Ramírez-Cifuentes</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ana</given-names>
            <surname>Freire</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>UPF's participation at the CLEF eRisk 2018: Early risk prediction on the internet</article-title>
          . In
          <string-name>
            <surname>Cappellato</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            <given-names>N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nie</surname>
            <given-names>JY</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soulier</surname>
            <given-names>L</given-names>
          </string-name>
          , editors.
          <source>Working Notes of CLEF 2018 - Conference and Labs of the Evaluation Forum</source>
          ; 2018 Sep 10-14; Avignon, France. [Avignon]: CEUR Workshop Proceedings;
          <year>2018</year>
          . p.
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <given-names>Philip</given-names>
            <surname>Resnik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>William</given-names>
            <surname>Armstrong</surname>
          </string-name>
          , Leonardo Claudino, Thang Nguyen,
          <string-name>
            <given-names>Viet-An</given-names>
            <surname>Nguyen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jordan</given-names>
            <surname>Boyd-Graber</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Beyond LDA: Exploring supervised topic modeling for depression-related language in Twitter</article-title>
          .
          <source>In Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality</source>
          , pages
          <fpage>99</fpage>
          -
          <lpage>107</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <given-names>Hannah</given-names>
            <surname>Ritchie</surname>
          </string-name>
          and
          <string-name>
            <given-names>Max</given-names>
            <surname>Roser</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Mental health</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <string-name>
            <given-names>H Andrew</given-names>
            <surname>Schwartz</surname>
          </string-name>
          , Johannes Eichstaedt, Margaret Kern, Gregory Park, Maarten Sap, David Stillwell,
          <string-name>
            <given-names>Michal</given-names>
            <surname>Kosinski</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Lyle</given-names>
            <surname>Ungar</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Towards assessing changes in degree of depression through Facebook</article-title>
          .
          <source>In Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality</source>
          , pages
          <fpage>118</fpage>
          -
          <lpage>125</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <string-name>
            <given-names>Ivan</given-names>
            <surname>Sekulić</surname>
          </string-name>
          , Matej Gjurković, and Jan Šnajder.
          <year>2018</year>
          .
          <article-title>Not just depressed: Bipolar disorder prediction on Reddit</article-title>
          . arXiv preprint arXiv:1811.04655.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          <string-name>
            <given-names>Marcel</given-names>
            <surname>Trotzek</surname>
          </string-name>
          , Sven Koitka, and
          <string-name>
            <given-names>Christoph M</given-names>
            <surname>Friedrich</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Word embeddings and linguistic metadata at the CLEF 2018 tasks for early detection of depression and anorexia</article-title>
          .
          <source>In CLEF (Working Notes).</source>
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          <string-name>
            <given-names>Yu-Tseng</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Hen-Hsen</given-names>
            <surname>Huang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Hsin-Hsi</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>A neural network approach to early risk detection of depression and anorexia on social media text</article-title>
          .
          <source>In CLEF (Working Notes).</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>