BioInfo@UAVR at eRisk 2020: on the use of psycholinguistics features and machine learning for the classification and quantification of mental diseases

Alina Trifan[0000-0001-7613-1435], Pedro Salgado[0000-0001-6230-6484] and José Luís Oliveira[0000-0002-6672-6176]

DETI/IEETA, University of Aveiro, Portugal
{alina.trifan, psalgado, jlo}@ua.pt

Abstract. This paper describes the participation of the Bioinformatics group of the Institute of Electronics and Engineering Informatics of the University of Aveiro in the shared tasks of CLEF eRisk 2020^1. The eRisk initiative fosters Natural Language Processing research for the automatic detection of risk situations on the internet. As in previous years, the challenge was organized in two tasks, aimed at the early detection of signs of self-harm (T1) and at estimating the severity of depression (T2) in online forums. We addressed these tasks both from a standard machine learning perspective and from a behavioural point of view. The results we obtained endorse the use of social monitoring as a possible complement to more traditional public health surveillance and intervention practices.

Keywords: social mining · early detection · depression · self-harm · psycholinguistic patterns.

1 Introduction

In the last decade, the digitalization of social interactions has created opportunities for researchers and practitioners to use social media as a data source for learning about health and well-being from a different perspective. Social data, defined as data that is created by people with the goal of sharing it with others [24], is a fairly recent term that, together with the advances in text mining and Natural Language Processing (NLP), fueled the development of a new research area known as social media mining. Research initiatives such as CLEF Early Risk [20] dynamize the scientific advances and societal impact that this research area can have. They foster collaborative work on the topic of mental health and social data, and push forward new discoveries and insights that can potentially benefit public health.

^1 http://early.irlab.org/

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CLEF 2020, 22-25 September 2020, Thessaloniki, Greece.

The relation between social media and well-being is well recognized, as users of social media networks often share very personal feelings and beliefs. There are numerous online communities that provide support and counseling for users in need. Most importantly, these social interactions lead to an impressive data lake that represents an opportunity for scientific advancement and social good [14]. Reliable predictive models allow early detection of health conditions and pave the way for health interventions, by promoting relevant health services or by delivering useful health information [30]. In a systematic review on social mining for mental health, Alonso et al. [2] conclude that the use of social mining applied to diseases such as dementia, schizophrenia and depression can support clinical decisions and diagnosis prediction, and ultimately improve patients' quality of life. Because mental health issues are pressing societal issues, they demand new prevention and intervention strategies, and the early detection of mental illness is an essential step in that direction. This paper describes the participation of the BioInfo@UAVR team in the CLEF eRisk 2020 tasks.
This is our second participation in these tasks and our approach builds upon the methodology we used in 2019. As such, we combined standard machine learning algorithms with extended psycholinguistic and behavioral patterns derived from the literature. The methodology and associated results are presented in this paper, along with differences and improvements with respect to our previous participation, as well as a discussion of future work.

The rest of this paper is organized as follows: Section 2 overviews the current background in social data mining. The next two sections are dedicated to the description of each of the tasks, and include both the methodologies used and the results obtained. We conclude the paper and discuss possible improvements and future work in Section 5.

2 Background

Mental and behavioral health is one of the areas of health with the largest gap between the seriousness of the problem and the little information we have available. This makes it one of the most promising areas of research for social monitoring [24]. While the landscape of mental health has been changing over the last decades, traditional clinical research still faces the lack of precise and timely diagnosis. A standard diagnosis of mental health issues relies mostly on patient interviews and clinical diaries. In order to overcome these gaps, researchers explore social data in an attempt to better understand a wide range of mental health disorders. As such, big data and artificial intelligence offer exciting opportunities for the screening and prediction of mental problems [17]. Mental disorders include many different illnesses, with depression being the most prominent. Moreover, self-harm and anxiety can lead to suicidal ideation.

Across varied areas of medical research, data science approaches have allowed researchers to mine large healthcare datasets to detect patterns and to better understand a specific disease or its evolution [4,5,15,21,28,29,32,33]. Over the last decade, researchers have been using publicly available social media messages and interactions as a data source for studying a variety of mental health conditions [7-9,12,22]. Even if social media systems can deliver novel, reliable information, there is a challenge in determining how to act on this information. In areas without existing empirical data, where social monitoring systems deliver new information, careful validation and evaluation will be necessary to determine the extent to which the information can be relied on. A recent study by Ernala et al. [10] questions the validity of classification results when there is no medical confirmation of the diagnosis, and raises a meaningful discussion on the methodologies used so far for identifying patients at risk in online forums. One of the first demonstrations of suicide risk assessment through Reddit posts matched with clinical knowledge was reported by Shing et al. [27], and it paves the way towards bridging computational social mining and clinical research in the area of mental health.

3 Task 1 - Early detection of signs of self-harm

Task 1 consisted of sequentially processing pieces of evidence and detecting early traces of self-harm, as soon as possible. The collection contains social media writings from two categories of users: users that at some point in their history have harmed themselves, and control users, who have no history of self-harm. A labelled training collection was released prior to the evaluation period.

For the test stage, the organizers set up a server that iteratively releases user writings. After each round of writings released by the server, a decision had to be emitted for every user. Classifying a user as being prone to self-harm was considered an irreversible decision, while a decision of non-self-harming was open to updates in the following rounds.
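To make this round-based protocol concrete, the following sketch outlines a client loop of the kind used in such an iterative evaluation. It is written against a hypothetical REST interface: the endpoint names (getwritings, submit), the JSON fields (nick, content) and the server URL are illustrative assumptions, not the actual API specification provided to participating teams.

```python
import requests

SERVER = "https://erisk.example.org"   # placeholder; the real server URL is provided to teams
TEAM_TOKEN = "YOUR_TOKEN"              # hypothetical credentials
RUN_ID = 0

def decide(user_history):
    """Return (decision, score): 1 = self-harm alert (irreversible), 0 = no alert yet."""
    # ... classifier goes here ...
    return 0, 0.0

histories = {}          # user -> accumulated writings
alerted = set()         # users already flagged; a decision of 1 cannot be revoked

while True:
    # Hypothetical endpoint: fetch the next round of writings for every user.
    round_writings = requests.get(f"{SERVER}/getwritings/{TEAM_TOKEN}").json()
    if not round_writings:              # empty response: no more rounds
        break
    for item in round_writings:
        histories.setdefault(item["nick"], []).append(item["content"])

    decisions = []
    for user, history in histories.items():
        if user in alerted:
            decision, score = 1, 1.0    # keep the earlier positive decision
        else:
            decision, score = decide(history)
            if decision == 1:
                alerted.add(user)
        decisions.append({"nick": user, "decision": decision, "score": score})

    # Hypothetical endpoint: decisions must be posted before the next round is released.
    requests.post(f"{SERVER}/submit/{TEAM_TOKEN}/{RUN_ID}", json=decisions)
```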
Self-harm ideation often relates to depression and poor mental health; we were therefore interested in exploring psycholinguistic features that are found in the written or oral expressions of depressed users.

3.1 Dataset description

The training and test collections for this task have the same format as the collection described in [18]. They are collections of writings (posts or comments) from a set of social media users and, for each user, the collection contains a sequence of writings in chronological order. Unlike the same task that ran in 2019, this year's edition provided a training dataset. The characteristics of the training set are presented in Table 1.

Table 1. Task 1 training dataset.
                      Self-harm   Control
#subjects                    41       299
#submissions              6 927   163 506
avg #posts/subject        169.0     546.8
avg #words/post            24.8      18.8

3.2 Metrics

The evaluation metric that has been regularly used for the eRisk challenges is ERDE, the early risk detection error measure proposed by Losada et al. [18]. As identified in last year's overview report [19], this measure has several drawbacks, which led to the inclusion of alternative evaluation metrics. As such, Flatency, a measure proposed by Sadeque et al. [26], was also used. This measure takes into consideration the effectiveness of the decision (estimated with the F measure) and the delay in emitting the decision. A perfect system would get an Flatency of 1. These metrics are further complemented with a ranking evaluation of the systems after seeing k writings, for varying k.

3.3 Methods

For this task we submitted 3 different runs; each team was allowed to submit up to five. All runs had to complete one round's decisions prior to getting the next round of writings, which means there could not be any transfer of knowledge from one run to another. For all 3 runs, we followed a number of common steps in the preprocessing phase: the posts were lowercased and tokenized, and stopwords were filtered based on the stopwords list of the Natural Language Toolkit^2. For the first run, we also removed all non-alphabetic characters.

^2 https://www.nltk.org/

In this first run we followed a standard processing stream for text classification. We initially split the dataset into training and validation chunks, with a ratio of 2:1. We considered Bag of Words (BoW) and tf-idf based feature weighting with a linear Support Vector Machine trained with Stochastic Gradient Descent and a Passive Aggressive classifier. We trained both classifiers and compared them on the validation corpus. The SVM led to slightly better results in terms of F1 in the validation stage, so we retrained that model on the whole corpus (training + validation). We only started emitting decisions in the 10th round of server writings and we did all the classification online, without applying any offline knowledge; in the first 9 rounds all decisions were emitted as 0. This threshold for the delay in emitting the decision was selected based on our previous participation, where we concluded that each user had a history of at least 10 writings.
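A minimal sketch of this first run is given below, assuming scikit-learn. The 2:1 train/validation split and the model selection by F1 follow the description above, while the n-gram range, iteration counts and random seed are illustrative assumptions rather than the exact values used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier, PassiveAggressiveClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# texts: one concatenated (lower-cased, tokenised, stopword-filtered) document per user
# labels: 1 for self-harm users, 0 for control users
def build_run1_model(texts, labels):
    X_train, X_val, y_train, y_val = train_test_split(
        texts, labels, test_size=1 / 3, stratify=labels, random_state=42)

    candidates = {
        # hinge loss makes SGDClassifier behave as a linear SVM
        "svm_sgd": SGDClassifier(loss="hinge", max_iter=1000),
        "passive_aggressive": PassiveAggressiveClassifier(max_iter=1000),
    }
    best_name, best_f1 = None, -1.0
    for name, clf in candidates.items():
        pipe = Pipeline([
            ("tfidf", TfidfVectorizer(ngram_range=(1, 1))),  # BoW counts re-weighted by tf-idf
            ("clf", clf),
        ])
        pipe.fit(X_train, y_train)
        f1 = f1_score(y_val, pipe.predict(X_val))
        if f1 > best_f1:
            best_name, best_f1 = name, f1

    # retrain the winning configuration on the full corpus (training + validation)
    final = Pipeline([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 1))),
        ("clf", candidates[best_name]),
    ])
    final.fit(texts, labels)
    return final
```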
Our second run was based on a mixture of machine learning and psycholinguistic features. Its methodology combines five different feature extraction algorithms. The first two compute features by measuring the frequencies of specific characters or words. The first one acts before any preprocessing takes place and its purpose is to find emojis and punctuation symbols in the given text. The second one receives as argument a list of self-harm related keywords; synonyms as well as antonyms are extracted for every keyword using NLTK's WordNet [13]. Moreover, it also relies on a collection of word sets, among them absolutist words [1], first-person words and symptoms. We summarize in Table 2 the main linguistic features that we considered and the corresponding lexicon sources. The list of absolutist words is presented in Table 3.

Table 2. Linguistic features and source lexica.
Feature             Source
Negative words      https://www.enchantedlearning.com/wordlist/negativewords.shtml
Positive words      https://www.enchantedlearning.com/wordlist/positivewords.shtml
Symptoms            https://www.valleybehavioral.com/disorders/self-harm/
Related diseases    https://www.valleybehavioral.com/disorders/self-harm/
Harm lexicon        https://www.thesaurus.com/browse/harm
Depression lexicon  https://www.thesaurus.com/browse/depression
Anxiety lexicon     https://www.thesaurus.com/browse/anxiety

Table 3. Absolutist words validated by Al-Mosaiwi et al. [1].
absolutely   all         always    complete  completely  constant
constantly   definitely  entire    ever      every       everyone
everything   full        must      never     nothing     totally
whole

The third algorithm is a tf-idf vectorizer which turns the text into a tf-idf matrix. The fourth and fifth algorithms use paragraph vectors based on gensim^3 Doc2vec [16], with two different models, Distributed Memory and Distributed Bag of Words. The output of these five algorithms is concatenated and the best features are selected. These features are then fed to an AdaBoost classifier, which led to the best results in the validation stage among the different classifiers that we trained.

^3 https://radimrehurek.com/gensim/auto_examples/index.html
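The overall shape of this second run can be sketched as a feature union feeding a boosted classifier. The fragment below is an illustrative approximation assuming scikit-learn and gensim: the hand-crafted counters are abbreviated into a single lexicon counter (the emoji/punctuation counter is omitted), and the word lists, selected number of features and Doc2Vec dimensionalities are assumptions rather than the exact configuration used.

```python
import numpy as np
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import FeatureUnion, Pipeline

class LexiconCounter(BaseEstimator, TransformerMixin):
    """Counts occurrences of lexicon terms (e.g. absolutist or self-harm related words)."""
    def __init__(self, lexicons):
        self.lexicons = lexicons                      # dict: lexicon name -> set of words
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        rows = []
        for text in X:
            tokens = text.split()
            rows.append([sum(tok in lex for tok in tokens)
                         for lex in self.lexicons.values()])
        return np.array(rows, dtype=float)

class Doc2VecFeatures(BaseEstimator, TransformerMixin):
    """Paragraph vectors; dm=1 -> Distributed Memory, dm=0 -> Distributed Bag of Words."""
    def __init__(self, dm=1, vector_size=100):
        self.dm, self.vector_size = dm, vector_size
    def fit(self, X, y=None):
        docs = [TaggedDocument(t.split(), [i]) for i, t in enumerate(X)]
        self.model_ = Doc2Vec(docs, dm=self.dm, vector_size=self.vector_size, epochs=20)
        return self
    def transform(self, X):
        return np.array([self.model_.infer_vector(t.split()) for t in X])

lexicons = {"absolutist": {"always", "never", "completely", "nothing"},   # excerpt of Table 3
            "harm": {"harm", "hurt", "cut"}}                              # excerpt of the harm lexicon

run2 = Pipeline([
    ("features", FeatureUnion([
        ("lexicons", LexiconCounter(lexicons)),
        ("tfidf", TfidfVectorizer()),
        ("d2v_dm", Doc2VecFeatures(dm=1)),
        ("d2v_dbow", Doc2VecFeatures(dm=0)),
    ])),
    ("select", SelectKBest(f_classif, k=500)),        # keep the most informative features
    ("clf", AdaBoostClassifier(n_estimators=100)),
])
# run2.fit(train_texts, train_labels); run2.predict_proba(new_texts)
```

The class probabilities produced by such a pipeline are also what the third run, described next, compares against the scores of the first run whenever the two runs disagree.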
Our third and last run was a combination of the previous two: whenever the decisions emitted by the first two runs were not identical, it emitted the decision whose probability score was higher.

3.4 Results

The results obtained are shown in Table 4, along with the best results in this task, for comparison. The results of all participating teams can be found in [20]. Our second run obtained the best scores among our three submissions. This was somewhat expected, as it was the most complex one: it took into consideration not only linguistic features, but also psycholinguistic and behavioral patterns. Unfortunately, during the test period we were only able to process roughly a quarter of the total writings. This was mainly due to our late submission of the runs and to some backup disk writes that slowed down our script.

Table 4. Evaluation of BioInfo@UAVR's submissions in Task 1. The best results were added for comparison. It is important to note that no single team reached the best results in all metrics.
              P     R     F1    ERDE5  ERDE50  latency  speed  latency-weighted F1
Run 1         .609  .375  .464  .260   .178    14       .949   .441
Run 2         .591  .654  .621  .273   .120    11       .961   .597
Run 3         .629  .375  .470  .259   .177    13       .953   .448
Best results  .913  1     .754  .134   .071    1        .1     .658

Following the competition's test phase, we processed the whole corpus off-line with the same algorithms that we submitted in our best-performing run (the second one). We simulated the same round-based writing release, and the results obtained in this off-line simulation are very close to the best results obtained during the on-line test stage: a precision of 0.80, a recall of 0.58 and an F1-score of 0.67. Furthermore, when using a deep learning model as the classifier in the same pipeline, the results were even better: a precision of 0.75, a recall of 0.72 and an F1-score of 0.73.

4 Task 2 - Estimating the level of depression

This task was aimed at exploring the viability of automatically estimating the severity of multiple symptoms associated with depression [20]. Given a user's history of writings, participants had to work out a solution for predicting the user's response to each individual question included in Beck's Depression Inventory Questionnaire (BDI) [3]. The questionnaire assesses the presence of feelings such as sadness, pessimism, loss of energy, hunger/loss of appetite, etc. For each individual question, a numeric value between 0 and 3 is considered a valid answer, with the exception of two questions, whose possible answers were 0, 1a, 1b, 2a, 2b, 3a or 3b.

4.1 Dataset description

The training dataset was the dataset used in the test stage of the task's first edition, CLEF eRisk 2019 [19]. It contained one file per user, for 20 users, holding that user's history of writings on a social network. An annotation file, or ground truth file, was also provided, containing the answers of all users to each of the questions in the questionnaire. The number of writings per user varied from 30 to 1511; the average number of writings was 548, with a median of 328.5.

4.2 Metrics

The organizers of this task collected questionnaires filled in by social media users together with their history of writings. For each user, the history of writings was extracted right after the user provided the filled-in questionnaire. The questionnaires filled in by the users were considered the ground truth and were used to assess the quality of the responses provided by each participating team. The evaluation metrics reflect the differences between the answers of the questionnaire provided by the task participants and the ones provided by the users that were part of the dataset. Moreover, in the psychological domain it is customary to associate depression levels with categories. Depression levels are defined as the sum of the answers to all 21 questions of the questionnaire. The following depression categories were used to further extend the evaluation metrics:

• minimal depression - [0-9]
• mild depression - [10-18]
• moderate depression - [19-29]
• severe depression - [30-63]

The following metrics were considered for the evaluation of the results [20]:

• Hit Rate (HR) - the ratio of questions for which the automatically filled questionnaire has exactly the same answer as the real questionnaire.
• Average Hit Rate (AHR) - HR averaged over all users.
• Closeness Rate (CR) - a measure based on the absolute difference between the real and the participant-provided answer.
• Average Closeness Rate (ACR) - CR averaged over all users.
• Difference between Overall Depression Levels (DODL).
• Average DODL (ADODL) - DODL averaged over all users.
• Depression Category Hit Rate (DCHR) - the fraction of cases where the automated questionnaire led to a depression category that is equivalent to the depression category obtained from the real questionnaire.
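A rough sketch of how these metrics can be computed from a pair of questionnaires is given below. It assumes the normalizations described in the lab overview [20] (a maximum per-question difference of 3 and a maximum depression level of 63) and that the letter suffixes of the two special questions contribute only their numeric part to the depression level; these assumptions may differ in detail from the official evaluation script.

```python
def categorize(level):
    """Map a BDI depression level (0-63) to the four categories listed above."""
    if level <= 9:
        return "minimal"
    if level <= 18:
        return "mild"
    if level <= 29:
        return "moderate"
    return "severe"

def answer_value(ans):
    """'1a'/'1b' style answers contribute their numeric part to the depression level."""
    return int(str(ans).rstrip("ab"))

def evaluate(real, predicted):
    """real, predicted: lists of 21-answer questionnaires (one per user), aligned by user."""
    n_questions, max_diff, max_level = 21, 3, 63
    ahr = acr = adodl = dchr = 0.0
    for r, p in zip(real, predicted):
        r_vals = [answer_value(a) for a in r]
        p_vals = [answer_value(a) for a in p]
        # Hit Rate: fraction of questions answered exactly as in the real questionnaire
        ahr += sum(a == b for a, b in zip(r, p)) / n_questions
        # Closeness Rate: per-question absolute difference turned into a 0-1 closeness score
        acr += sum((max_diff - abs(a - b)) / max_diff
                   for a, b in zip(r_vals, p_vals)) / n_questions
        # Depression level = sum of all answers; DODL compares the two overall levels
        adodl += (max_level - abs(sum(r_vals) - sum(p_vals))) / max_level
        # DCHR: did the predicted level fall into the same depression category?
        dchr += categorize(sum(r_vals)) == categorize(sum(p_vals))
    n = len(real)
    return {"AHR": ahr / n, "ACR": acr / n, "ADODL": adodl / n, "DCHR": dchr / n}
```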
4.3 Methods

Our approach for solving this task built upon algorithms that we used in the CLEF eRisk 2019 edition, not only for this task but also for Task 1. In the previous year, for the training stage of Task 1, we used a machine learning model trained on the Reddit depression dataset of Yates et al. [31]. This dataset consists of all Reddit users who made a post between January and October 2016 matching high-precision patterns of self-reported diagnosis (e.g. "I was diagnosed with depression"). The depressed users were matched with control users who had never posted in a subreddit related to mental health and had never used a term related to it. In order to avoid a straightforward separation of the two groups, all posts of diagnosed users related to depression or mental health were removed.

The first step in this year's approach to Task 2 was to predict whether a user was depressed, using the classifier previously trained on the Yates et al. dataset. Next, we combined the score of this classification with several psycholinguistic and behavioral patterns, presented next. For each category, a score was calculated for each user as the number of occurrences of the features considered for that category, normalized with respect to the total number of occurrences of the same features over the dataset. These scores were then rescaled to the interval [0,3].

– Lexical category of a user's text - depressed users tend to give their texts an overall more negative connotation [9,23]. To this purpose we employed Empath^4, an NLP framework that we used for estimating the average polarity of a user's writings.
– Use of self-related words (e.g. I, myself, mine) - depressed users tend to use them more often in their writings [6,25].
– Use of absolutist words - Al-Mosaiwi et al. [1] recently showed that anxiety, depression, and suicidal ideation forums contain more absolutist words than control forums. The list of absolutist words used is presented in Table 3.
– Mentions of words related to mental disorders (e.g. depression, bipolar, schizophrenia, psychotic, ocd).
– Use of the words cry, guilt and their derivatives.
– Use of the words sleep, anxious and their derivatives.
– Use of the words irritated, fatigue, tired and their derivatives.

^4 https://github.com/Ejhfast/empath-client

This list is based on the psycholinguistic patterns and semantic clusters that we used in our previous participation in this shared lab. Compared to the approach that we took in our first participation in this task, we decided to remove some of the features that we used last year and we explored the use of Empath [11]. A statistical analysis of the training corpus revealed that the non-depressed users had relatively low depression scores, as would be expected. As such, the users that our trained model considered non-depressed were assigned low scores in all categories. Regarding the psycholinguistic features, our follow-up analysis of our eRisk 2019 submission revealed that some of the features we included last year did not contribute significantly to the overall scores.
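As an illustration of the scoring scheme described above, the sketch below counts a few of the listed features and applies the corpus-level normalization to [0,3]. The word lists are abbreviated excerpts, and the use of Empath's built-in negative_emotion category is an assumption about how the polarity-style score could be obtained, not the exact configuration we used.

```python
from collections import Counter
from empath import Empath

ABSOLUTIST = {"absolutely", "always", "never", "completely", "nothing"}  # excerpt of Table 3
SELF_WORDS = {"i", "me", "my", "mine", "myself"}
lexicon = Empath()

def raw_counts(tokens):
    """Count feature occurrences for one user (lower-cased tokens of all their writings)."""
    counts = Counter()
    counts["absolutist"] = sum(t in ABSOLUTIST for t in tokens)
    counts["self_reference"] = sum(t in SELF_WORDS for t in tokens)
    # Empath returns per-category term counts for the requested lexical categories
    empath_scores = lexicon.analyze(" ".join(tokens), categories=["negative_emotion"])
    counts["negative_emotion"] = empath_scores["negative_emotion"]
    return counts

def score_users(user_tokens):
    """Normalize each user's counts by the corpus total and rescale to the [0, 3] interval."""
    per_user = {user: raw_counts(toks) for user, toks in user_tokens.items()}
    totals = Counter()
    for counts in per_user.values():
        totals.update(counts)
    scores = {}
    for user, counts in per_user.items():
        scores[user] = {cat: 3.0 * counts[cat] / totals[cat] if totals[cat] else 0.0
                        for cat in counts}
    return scores
```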
4.4 Results

Task participants had to provide a result file with one line per user in the test dataset. Each line contained the username and 21 values that corresponded to the answers to the 21 questions included in Beck's Depression Inventory. The results obtained by our team are presented in Table 5, along with the best results obtained in this task for each of the metrics. The results of all participating teams can be found in [20]. While the general results obtained in this task have slightly improved since last year, they stand as proof of its difficulty. One important aspect to be mentioned is that our team obtained the best score for the AHR metric and the second best score for ACR.

Table 5. Evaluation of BioInfo@UAVR's submission in Task 2. The best results for each metric were added for comparison. It is important to note that no single team achieved the best results for all metrics.
              AHR     ACR     ADODL   DCHR
BioInfo@UAVR  38.30%  69.21%  76.01%  30.00%
Best scores   38.30%  69.41%  83.15%  35.71%

5 Conclusions and Future Work

We presented in this paper the results of our team's participation in the eRisk 2020 shared tasks. As this is our second participation in this shared lab, our submissions were built upon the core approaches used in the previous edition. We extended the previous work by considering more psycholinguistic and behavioral features, which led to more submissions for Task 1 and overall better results in both tasks. While we recognize the potential that social mining has for signaling a user's mental health status and for the early detection of risk situations, we have come to understand that one possible limitation of our work is the lack of clinical knowledge. As researchers with computational backgrounds, who are often unfamiliar with existing practices in mental healthcare, we are at the front line of developing new algorithms for social data. In order to better understand the tasks at hand and to improve the end solution, we will focus on bringing in the missing clinical perspective in our future participations.

Acknowledgments

This work was supported by the Integrated Programme of SR&TD SOCA (Ref. CENTRO-01-0145-FEDER-000010), co-funded by the Centro 2020 program, Portugal 2020, European Union, through the European Regional Development Fund.

References

1. Al-Mosaiwi, M., Johnstone, T.: In an absolute state: Elevated use of absolutist words is a marker specific to anxiety, depression, and suicidal ideation. Clinical Psychological Science p. 2167702617747074 (2018)
2. Alonso, S.G., de la Torre-Díez, I., Hamrioui, S., López-Coronado, M., Barreno, D.C., Nozaleda, L.M., Franco, M.: Data mining algorithms and techniques in mental health: A systematic review. Journal of medical systems 42(9), 161 (2018)
3. Beck, A.T., Ward, C.H., Mendelson, M., Mock, J., Erbaugh, J.: An inventory for measuring depression. Archives of general psychiatry 4(6), 561–571 (1961)
4. Benton, A., Coppersmith, G., Dredze, M.: Ethical research protocols for social media health research. In: Proceedings of the First ACL Workshop on Ethics in Natural Language Processing. pp. 94–102 (2017)
5. Chen, L., Hossain, K.T., Butler, P., Ramakrishnan, N., Prakash, B.A.: Syndromic surveillance of flu on twitter using weakly supervised temporal topic models. Data mining and knowledge discovery 30(3), 681–710 (2016)
6. Chung, C., Pennebaker, J.W.: The psychological functions of function words. Social communication 1, 343–359 (2007)
7. Coppersmith, G., Dredze, M., Harman, C.: Quantifying mental health signals in twitter.
In: Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality. pp. 51–60 (2014)
8. Coppersmith, G., Leary, R., Whyne, E., Wood, T.: Quantifying suicidal ideation via language usage on social media. In: Joint Statistics Meetings Proceedings, Statistical Computing Section, JSM (2015)
9. De Choudhury, M., Gamon, M., Counts, S., Horvitz, E.: Predicting depression via social media. ICWSM 13, 1–10 (2013)
10. Ernala, S.K., Birnbaum, M.L., Candan, K.A., Rizvi, A.F., Sterling, W.A., Kane, J.M., De Choudhury, M.: Methodological gaps in predicting mental health states from social media: Triangulating diagnostic signals. In: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. p. 134. ACM (2019)
11. Fast, E., Chen, B., Bernstein, M.S.: Empath: Understanding topic signals in large-scale text. In: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. pp. 4647–4657. ACM (2016)
12. Fatima, I., Abbasi, B.U.D., Khan, S., Al-Saeed, M., Ahmad, H.F., Mumtaz, R.: Prediction of postpartum depression using machine learning techniques from social media text. Expert Systems p. e12409
13. Fellbaum, C.: Wordnet. The encyclopedia of applied linguistics (2012)
14. Giannotti, F., Trasarti, R., Bontcheva, K., Grossi, V.: Sobigdata: social mining & big data ecosystem. In: Companion Proceedings of the The Web Conference 2018. pp. 437–438 (2018)
15. Kim, Y., Huang, J., Emery, S.: Garbage in, garbage out: data collection, quality assessment and reporting standards for social media data use in health research, infodemiology and digital disease detection. Journal of medical Internet research 18(2) (2016)
16. Le, Q., Mikolov, T.: Distributed representations of sentences and documents. In: International conference on machine learning. pp. 1188–1196 (2014)
17. Liang, Y., Zheng, X., Zeng, D.D.: A survey on big data-driven digital phenotyping of mental health. Information Fusion 52, 290–307 (2019)
18. Losada, D.E., Crestani, F.: A test collection for research on depression and language use. In: International Conference of the Cross-Language Evaluation Forum for European Languages. pp. 28–39. Springer (2016)
19. Losada, D.E., Crestani, F., Parapar, J.: Overview of eRisk 2019: Early Risk Prediction on the Internet. In: Experimental IR Meets Multilinguality, Multimodality, and Interaction. 10th International Conference of the CLEF Association, CLEF 2019. Springer International Publishing, Lugano, Switzerland (2019)
20. Losada, D.E., Crestani, F., Parapar, J.: Overview of eRisk 2020: Early Risk Prediction on the Internet. In: Arampatzis, A., Kanoulas, E., et al. (eds.) Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Eleventh International Conference of the CLEF Association (CLEF 2020). Springer International Publishing (2020)
21. Loveys, K., Crutchley, P., Wyatt, E., Coppersmith, G.: Small but mighty: Affective micropatterns for quantifying mental health from social media language. In: Proceedings of the Fourth Workshop on Computational Linguistics and Clinical Psychology—From Linguistic Signal to Clinical Reality. pp. 85–95 (2017)
22. MacAvaney, S., Desmet, B., Cohan, A., Soldaini, L., Yates, A., Zirikly, A., Goharian, N.: Rsdd-time: Temporal annotation of self-reported mental health diagnoses. In: Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic. pp. 168–173 (2018)
23. Park, M., Cha, C., Cha, M.: Depressive moods of users portrayed in twitter. In: Proceedings of the ACM SIGKDD Workshop on healthcare informatics (HI-KDD). vol. 2012, pp. 1–8. ACM New York, NY (2012)
24. Paul, M.J., Dredze, M.: Social monitoring for public health. Synthesis Lectures on Information Concepts, Retrieval, and Services 9(5), 1–183 (2017)
25. Rude, S., Gortner, E.M., Pennebaker, J.: Language use of depressed and depression-vulnerable college students. Cognition & Emotion 18(8), 1121–1133 (2004)
26. Sadeque, F., Xu, D., Bethard, S.: Measuring the latency of depression detection in social media. In: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. pp. 495–503. ACM (2018)
27. Shing, H.C., Nair, S., Zirikly, A., Friedenberg, M., Daumé III, H., Resnik, P.: Expert, crowdsourced, and machine assessment of suicide risk via online postings. In: Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic. pp. 25–36 (2018)
28. Trifan, A., Antunes, R., Matos, S., Oliveira, J.L.: Understanding depression from psycholinguistic patterns in social media texts. In: European Conference on Information Retrieval. pp. 402–409. Springer (2020)
29. Vaterlaus, J.M., Patten, E.V., Roche, C., Young, J.A.: #gettinghealthy: The perceived influence of social media on young adult health behaviors. Computers in Human Behavior 45, 151–157 (2015)
30. Wongkoblap, A., Vadillo, M.A., Curcin, V.: Researching mental health disorders in the era of social media: systematic review. Journal of medical Internet research 19(6), e228 (2017)
31. Yates, A., Cohan, A., Goharian, N.: Depression and self-harm risk assessment in online forums. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. pp. 2968–2978. Association for Computational Linguistics (2017)
32. Yun, G.W., Morin, D., Park, S., Joa, C.Y., Labbe, B., Lim, J., Lee, S., Hyun, D.: Social media and flu: Media twitter accounts as agenda setters. International journal of medical informatics 91, 67–73 (2016)
33. Zhang, J., Brackbill, D., Yang, S., Centola, D.: Identifying the effects of social media on health behavior: Data from a large-scale online experiment. Data in brief 5, 453–457 (2015)