<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Indicators</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Juan Martinez-Romo</string-name>
          <email>juaner@lsi.uned.es</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lourdes Araujo</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xabier Larrayoz</string-name>
          <email>xlarrayoz001@ikasle.ehu.eus</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maite Oronoz</string-name>
          <email>maite.oronoz@ehu.eus</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alicia Pérez</string-name>
          <email>alicia.perez@ehu.eus</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff1">
          <institution>Instituto Mixto UNED-ISCIII (IMIENS)</institution>
        </aff>
        <aff id="aff0">
          <label>0</label>
          <institution>HiTZ Basque Center for Language Technologies - Ixa (UPV/EHU)</institution>
          ,
          <addr-line>Manuel Lardizabal 1, 20018 Donostia</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <abstract>
        <p>Mental health problems, such as depression and pathological gambling, can have very serious consequences if left untreated and cause the patient a great deal of suffering. Research suggests that the way people write can reflect their mental well-being and mental health risks, and social media provides a source of user-generated text to study. Early detection is crucial for mental health problems, and with this in mind the eRisk shared task was created. This paper describes the participation of the OBSER-MENH team in subtasks T1 and T2 of the 2023 edition. In Task 1, participants had to provide rankings for the 21 symptoms of depression from the BDI-II Questionnaire; we used an approach based on semantic textual similarity using Transformers. Task 2 consisted of sequentially processing pieces of evidence and detecting early traces of pathological gambling as soon as possible. We implemented a penalty strategy in the loss function to deal with label imbalance, combining three feed-forward neural networks with varying penalty values.</p>
      </abstract>
      <kwd-group>
        <kwd>semantic textual similarity</kwd>
        <kwd>early risk detection</kwd>
        <kwd>depression detection</kwd>
        <kwd>pathological gambling detection</kwd>
        <kwd>natural language processing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Mental health conditions, including depression and pathological gambling, have a significant
impact on the lives of millions of individuals annually. Unfortunately, many individuals with
these disorders fail to seek timely medical attention, resulting in unnecessary suffering. Some
individuals are unaware of their need for treatment, while others avoid seeking help due to the
associated stigma. Regardless of the reasons, untreated mental illnesses tend to worsen over
time and can lead to severe consequences, such as substance abuse or even death.</p>
      <p>
        Language serves as a fundamental means of communication between individuals, allowing
for the transmission of intended messages while also conveying information about various
aspects of oneself, such as upbringing, mood, and emotional well-being. Numerous studies
have revealed a correlation between differences in language usage and writing style and the
presence of mental health conditions [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. By employing Natural Language Processing (NLP)
techniques, researchers can explore the application of language analysis to identify untreated
mental health problems.
      </p>
      <p>Social media platforms like Twitter and Reddit provide an extensive collection of
user-generated texts, where individuals interact with friends, follow discussion groups, and express
their thoughts and emotions. These platforms offer a vast amount of information that can be
leveraged by NLP-based techniques for various purposes. Recent research has employed such
techniques to automatically detect users who may be experiencing various mental health issues.</p>
      <p>In the context of mental health, early intervention is particularly crucial as it enhances
the likelihood of positive treatment outcomes. The longer a patient suffers without medical
intervention, the higher the chances of experiencing associated risks. Early detection aids
in identifying such cases before they escalate into more significant problems. While existing
literature primarily focuses on detecting individuals who already have established mental health
conditions, we contend that emphasizing early detection is vital for enabling prompt diagnosis
and intervention.</p>
      <p>
        To address this objective, the eRisk shared task was established. This shared task concentrates
on the early detection of mental health problems within social networks. Previous editions
of this initiative have targeted issues such as anorexia, self-harm, and pathological gambling.
In the 2023 eRisk shared task [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], three subtasks were proposed, and this paper outlines our
participation in subtasks T1 (Search for symptoms of depression) and T2 (Early Detection of
Signs of Pathological Gambling).
      </p>
      <p>The remainder of this paper is organized as follows: Section 2 describes our team's
participation in Task 1; Section 3 details our work on Task 2; finally, Section 4 presents our
conclusions and ideas for future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Task 1: Search for symptoms of depression</title>
      <p>
        The objective of this task is to rank sentences extracted from a corpus of user-generated
texts according to their relevance to a specific manifestation of depression. Participants
were asked to provide rankings for the 21 symptoms of depression outlined in the Beck
Depression Inventory (BDI-II) Questionnaire [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. A sentence is considered relevant to a
particular symptom if it contains information about the user's condition with respect to
that symptom. In other words, a sentence may be deemed relevant even if it indicates that the
user is experiencing no difficulties associated with the symptom in question.
      </p>
      <sec id="sec-2-0">
        <title>2.1. Related Work</title>
        <p>
          eRisk, under the umbrella of the international Conference and Labs of the Evaluation Forum
(CLEF), is one of the most important initiatives towards early detection of mental health-related
problems on the Internet. These evaluation campaigns provide a suitable environment for
the automatic identification of early risks, the publication of corpora and collections, and the
development of evaluation methodologies and metrics. In 2017, an initial pilot task [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] was proposed, devoted to predicting depression and aiming to determine, as soon as possible, whether
an individual exhibits traces of depression through the analysis of posts in social media. In
particular, the open-source platform Reddit was selected for building the collection of posts used
in this task [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. The task was also proposed in the 2018 and 2022 editions of eRisk. Different
features have been considered for the detection of signs of depression in this task: emotion and
sentiment words [7], readability features [8], words from depression-based lexicons or ontologies
[9], linguistic metadata [10, 11] or information on the writing style [12], as well as features
widely employed in NLP such as word embeddings or linguistic information like Part-of-Speech
tagging. Deep learning models based on neural networks (Long Short-Term Memory networks, or
LSTMs, and Convolutional Neural Networks, or CNNs) have achieved the best results in this task
[8, 10]. In the 2022 edition, many proposals were based on transformers, or on combinations of
these with other technologies [13].
        </p>
        <p>The current edition differs from previous editions in that the focus is now on ranking
sentences from each user's writings according to their relevance with respect to a symptom of
depression. We have resorted to the application of semantic similarity techniques to address it.
The methods used for the calculation of semantic similarity include vector-based representations
[14], graph-based models [15] and transformer-based models [16]. In this work we have
preferred the transformer-based methods, which currently provide the best results. These models use
attention to capture interactions between words and generate high-quality contextual
representations. Similarity between texts can be computed using distance or cosine similarity between
the corresponding transformer representations. Specifically, we have used BERT (Bidirectional
Encoder Representations from Transformers) [17], a language model that is known for its ability
to capture the context and relationships between words. It uses a transformer architecture
that applies multiple layers of attention and representational computations to process both
the information before and after a given word. This allows BERT to capture the meaning and
dependency of words in a broader context.</p>
      </sec>
      <sec id="sec-2-1">
        <title>2.2. Dataset Description</title>
        <p>The organizers provided participants with a TREC-formatted, sentence-tagged dataset together
with the BDI-II questionnaire.</p>
        <p>The dataset is composed of a set of 3107 documents, each of which is composed
of sentences. Below is an excerpt from document s_949.trec:
&lt;DOC&gt;
&lt;DOCNO&gt;s_949_1409_1&lt;/DOCNO&gt;
&lt;TEXT&gt;I suspect I have depression.&lt;/TEXT&gt;
&lt;/DOC&gt;</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.3. Proposed Model</title>
        <p>In order to calculate the similarity between the sentences and the 21 symptoms, we first
transformed each piece of text into an embedding and then we calculated the cosine distance
between the embeddings of each pair of pieces of text. The sentences are mapped such that
symptoms with similar meanings are close in the vector space. For this, we use the Sentence
Transformers (ST) framework [18] which employs a pre-trained BERT model to obtain the
contextual representation of the sentences and symptoms and applies a mean pooling method to
the output in such a way that it converts the embeddings of tokens to embeddings of sentences
of a fixed size. This mean pooling technique, used by default in ST, is done by averaging the
output embeddings.</p>
        <p>We have used several models derived from BERT to obtain the embeddings that represent both
the sentences and the text of every symptom in the questionnaire. BERT (and other transformer
networks) outputs an embedding for each token of the input text; as described above, mean
pooling over these token embeddings yields a fixed-sized sentence vector.</p>
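<p>As an illustration, the mean-pooling and cosine-similarity steps described above can be sketched as follows (the token embeddings are toy values, not the output of a real BERT model):</p>

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray) -> np.ndarray:
    """Average token embeddings (n_tokens x dim) into one sentence vector."""
    return token_embeddings.mean(axis=0)

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy token embeddings (illustrative values only, dim = 2).
sentence_tokens = np.array([[1.0, 0.0], [0.0, 1.0]])  # 2 tokens
symptom_tokens  = np.array([[1.0, 0.0], [1.0, 0.0]])

s_emb = mean_pool(sentence_tokens)  # -> [0.5, 0.5]
q_emb = mean_pool(symptom_tokens)   # -> [1.0, 0.0]
sim = cosine_sim(s_emb, q_emb)      # ~ 0.707
```

<p>In practice the Sentence Transformers framework performs both steps internally; the sketch only makes the pooling explicit.</p>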
        <p>The models used in the different runs are as follows:
• all-mpnet-base-v2: This model maps sentences and paragraphs to a 768-dimensional
dense vector space.
• all-distilroberta-v1: This model is pre-trained from the distilroberta-base model and
fine-tuned on a dataset of 1B sentence pairs.
• all-MiniLM-L12-v2: This model is pre-trained from the
microsoft/MiniLM-L12-H384-uncased model and fine-tuned on a dataset of 1B sentence pairs.</p>
        <p>For the models “all-mpnet-base-v2” and “all-distilroberta-v1”, two different runs were performed
depending on the number of terms used to calculate the semantic similarity. Various sizes were
tried in preliminary experiments, and results were finally submitted for the first 90 and 20
terms, respectively.</p>
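<p>Once each sentence and each symptom has a fixed-size embedding, producing the required ranking reduces to sorting sentences by cosine similarity to the symptom. A minimal sketch with toy two-dimensional embeddings (hypothetical values):</p>

```python
import numpy as np

def rank_sentences(symptom_emb: np.ndarray, sentence_embs: np.ndarray, top_k: int = 3):
    """Return sentence indices (and scores) sorted by cosine similarity to the symptom."""
    norms = np.linalg.norm(sentence_embs, axis=1) * np.linalg.norm(symptom_emb)
    sims = sentence_embs @ symptom_emb / norms
    order = np.argsort(-sims)[:top_k]   # descending similarity
    return order, sims[order]

symptom = np.array([1.0, 0.0])
sentences = np.array([[0.0, 1.0],   # unrelated
                      [1.0, 0.1],   # very similar
                      [0.5, 0.5]])  # partially similar
order, scores = rank_sentences(symptom, sentences, top_k=2)
# order -> [1, 2]
```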
      </sec>
      <sec id="sec-2-3">
        <title>2.4. Results</title>
        <p>In this section we analyze the results of our participation in the task. Once the runs from the
participating teams had been submitted, the organizers created the relevance judgements with the help
of human assessors using pooling. They then used the resulting qrels to evaluate the systems
with classical ranking metrics. The pre-trained models are available at
https://huggingface.co/sentence-transformers/all-mpnet-base-v2,
https://huggingface.co/sentence-transformers/all-distilroberta-v1 and
https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Task 2: Early Detection of Signs of Pathological Gambling</title>
      <p>
        The aim of this task is to identify signs of pathological gambling as early as possible [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. With
that aim, participants are given a set of social media posts that have to be processed in the order
in which they were written. The sooner a pathological gambler is detected by the system, the better.
The task is also addressed as a ranking problem: rather than assigning the labels 0
(non-pathological gambler) or 1 (pathological gambler), a score estimating the risk of suffering
such a disorder is computed.
      </p>
      <sec id="sec-3-1">
        <title>3.1. Related Work</title>
        <p>Early detection of signs of pathological gambling is crucial to increase the effectiveness of
psychological therapies and be able to help patients at an early stage of the disease.</p>
        <p>When this task was first introduced in 2021, eRisk did not offer labeled data [19]. Both
last year and this year, the task has been carried out with labeled training data provided [20].</p>
        <p>
          Regarding the methods employed to address this task, in 2021 the UNSL team, which attained
the best results, analyzed three different early alert policies based on standard classification
models, rule-based algorithms and deep learning models. In that case, the best result was achieved
with an SVM [21]. In the same year, the UPV-Symanto team made use of BERT transformers
to detect pathological gamblers [
          <xref ref-type="bibr" rid="ref7">22</xref>
          ].
        </p>
        <p>
          Last year, with training data made available, the UNSL team proposed a variant of the previous
year's approach, incorporating two score-normalization steps that reduced the runtime and improved
the model's performance [
          <xref ref-type="bibr" rid="ref8">23</xref>
          ]. The BLUE [
          <xref ref-type="bibr" rid="ref9">24</xref>
          ] team, who achieved similar results, trained a BERT
classifier with additional training data. The team that achieved the best F-score was NLP-UNED
[
          <xref ref-type="bibr" rid="ref10">25</xref>
          ] with an F1 score of 0.868. They used an Approximate Nearest Neighbour approach to assign
post-level labels.
        </p>
        <p>
          There was also an approach based on FFNNs, by the SINAI group [
          <xref ref-type="bibr" rid="ref11">26</xref>
          ], attaining an F-score
of 0.8. They fed the FFNN with vectors that encapsulated emotions, semantic information,
lexical diversity and volumetry of the posts. For our part, we found this approach
comprehensive and opted to explore it to obtain base learners with which to build an ensemble
model.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Dataset Description</title>
        <p>The dataset is composed of a set of XML files. Each file comprises a user's posts, each with a
separate title and the timestamp at which it was posted. The training files consist of the
test posts provided in the two previous editions, CLEF eRisk 2021 and 2022. Labels are given at
user level, being positive (denoted as “1”) if the user is classified as a pathological gambler and
negative (“0”) otherwise. The test set, provided iteratively by a server, has the same source as
the training data. Both the training and test data are quantitatively presented in Table 3.</p>
        <p>As presented in Table 3, the class distribution is imbalanced in both the train and test sets:
pathological gamblers do not exceed 6% and 5% of the training and test subsets, respectively.</p>
        <p>The texts were pre-processed as follows: first, the title and posts were concatenated; next,
stop-words were removed.</p>
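<p>A minimal sketch of this pre-processing step, assuming a toy stop-word list (a real run would use a full list, e.g. NLTK's):</p>

```python
# Hypothetical mini stop-word list for illustration only.
STOP_WORDS = {"i", "the", "a", "to", "and", "of"}

def preprocess(title: str, post: str) -> str:
    """Concatenate title and post, lowercase, and drop stop-words."""
    text = f"{title} {post}".lower()
    kept = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(kept)

print(preprocess("My story", "I need to stop betting"))
# -> "my story need stop betting"
```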
        <p>
          In order to obtain a numeric representation of the posts, we employed the Universal Sentence
Encoder (USE) [
          <xref ref-type="bibr" rid="ref12">27</xref>
          ]. This encoder, in its Deep Averaging Network (DAN) variant,
generates an embedding of dimension 512 as output.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Proposed Model</title>
        <p>Our approach focuses on tackling the skewed class distribution while also preventing
over-fitting, and entails an ensemble combining variants of a base model. In what follows we
describe the base models involved, then the combination strategies, and finally the measures
incorporated to deal with class imbalance.</p>
        <sec id="sec-3-3-1">
          <title>3.3.1. Base models</title>
          <p>
            Our approach consists of a simple Feed-Forward Neural Network (FFNN), implemented using the
Python torch library [
            <xref ref-type="bibr" rid="ref13">28</xref>
            ] fed by the posts represented by USE [
            <xref ref-type="bibr" rid="ref12">27</xref>
            ] encoder. A softmax activation
function is employed to predict the class at post level. With respect to the practical details, we
paid attention to over-fitting and tried to prevent it in two ways: on the one hand, the AdamW
optimizer was selected for the training process; on the other hand, a dropout rate of 0.1 was set.
          </p>
          <p>The class distribution is clearly imbalanced, with ‘Control’ being the majority label (nearly
20 times more frequent than the target ‘Pathological gambler’ label), as shown in Table 3. As
a consequence, the network can become biased towards the majority class and obtain low accuracy
on the minority class. To address the class imbalance, we applied a loss function based on
cross-entropy, also implemented in the torch library.</p>
          <p>During training, labels are assigned at post level: the loss is calculated using cross-entropy
between the predicted post-level labels (denoted y'_i) and the ground-truth labels (denoted y_i).
Additionally, a user-level label is estimated from the predicted post-level labels and then
compared with the user's ground-truth label to train the model. If any of the post-level predicted
labels (y'_1, ..., y'_n_k), where n_k is the number of posts of user k, is positive, the user is
considered positive (u'_k = 1), as in expression (1):
u'_k = 1 if ∃ i, 1 ≤ i ≤ n_k, such that y'_i > 0.5; u'_k = 0 otherwise. (1)</p>
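<p>Expression (1) amounts to flagging a user as positive whenever any post-level score exceeds 0.5; a minimal sketch:</p>

```python
def user_label(post_scores, threshold=0.5):
    """Expression (1): a user is positive if any post score exceeds the threshold."""
    return 1 if any(s > threshold for s in post_scores) else 0

user_label([0.1, 0.2, 0.9])  # -> 1 (one post exceeds the threshold)
user_label([0.1, 0.2, 0.3])  # -> 0 (no post exceeds the threshold)
```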
          <p>As an additional measure to address the class imbalance, we implemented a penalty at
user level: when a user's label is predicted incorrectly, the cross-entropy loss is multiplied
by a scalar. The higher the penalty, the more the loss is scaled. Our base penalty function is
presented in (2). We consider all post-level predictions of a user k as the vector
y⃗' = (y'_1, ..., y'_n_k), the ground-truth post-level labels as the vector y⃗ = (y_1, ..., y_n_k),
and λ as the penalty weight. The penalization is applied whenever the user label is predicted
incorrectly (u'_k ≠ u_k):
L = CE(y⃗', y⃗) if u'_k = u_k; L = CE(y⃗', y⃗) · λ otherwise, (2)
where CE denotes the cross-entropy loss.</p>
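<p>A minimal sketch of the penalized loss in (2), using a numpy binary cross-entropy in place of the torch implementation (the post scores are toy values, illustrative only):</p>

```python
import numpy as np

def cross_entropy(pred: np.ndarray, true: np.ndarray, eps: float = 1e-9) -> float:
    """Mean binary cross-entropy over a user's posts."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(true * np.log(pred) + (1 - true) * np.log(1 - pred)))

def penalized_loss(pred: np.ndarray, true: np.ndarray, penalty: float,
                   threshold: float = 0.5) -> float:
    """Expression (2): scale the loss by `penalty` when the user-level label
    derived from the predictions disagrees with the ground truth."""
    loss = cross_entropy(pred, true)
    u_pred = int((pred > threshold).any())   # expression (1) on predictions
    u_true = int((true > threshold).any())   # expression (1) on ground truth
    return loss if u_pred == u_true else loss * penalty

preds = np.array([0.2, 0.3, 0.1])   # no post flagged -> user predicted negative
labels = np.array([0.0, 1.0, 0.0])  # one positive post -> user is positive
penalized_loss(preds, labels, penalty=5)  # 5x the plain cross-entropy
```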
        </sec>
        <sec id="sec-3-3-2">
          <title>3.3.2. Ensemble models</title>
          <p>In order to analyze the impact of different penalty values on dealing with label imbalance,
we implemented ensemble models. We trained three base models varying the value of the penalty
weight (λ_0 = 5, λ_1 = 3, λ_2 = 2). Note that for each user post we obtain a post-level
prediction y'_j from each of the three base models M_j involved (0 ≤ j ≤ 2). We then built
three alternative ensemble models, each yielding its own prediction e', using the following
techniques:
• Max voting: the final prediction for a post is the maximum among the base models
involved, as in (3): e' = max_j y'_j.
• Average: the prediction is the average of all model predictions, as in (4):
e' = (1/3) ∑_j y'_j.
• F-score weighted average: different weights are assigned to each model's predictions
based on its F-score (f_j), to scale each model's contribution to the average, as in (5):
e' = ∑_j f_j · y'_j / ∑_j f_j.</p>
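<p>The three combination strategies (3)-(5) can be sketched on a single post as follows (the per-model scores and validation F-scores are hypothetical values):</p>

```python
import numpy as np

post_preds = np.array([0.9, 0.4, 0.6])     # one score per base model
f_scores   = np.array([0.80, 0.75, 0.70])  # hypothetical validation F-scores

e_max  = post_preds.max()                                 # (3) max voting
e_avg  = post_preds.mean()                                # (4) average
e_wavg = (f_scores * post_preds).sum() / f_scores.sum()   # (5) F-score weighted
```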
        </sec>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Results</title>
        <p>In a preliminary experiment we tested the base models with and without the penalty weight
presented in (2), also varying the penalty weight. Experimentally, we found that using the penalty
weight was worthwhile, and the best weighting values turned out to be λ_0 = 5, λ_1 = 3 and
λ_2 = 2. Next, we combined the selected base learners (each with the aforementioned penalty
weight) and built ensemble models following the three alternative combination methods
explored in Section 3.3.2. Table 4 shows the configuration used in each run: the first two
rows refer to base models, while the last three rows refer to ensemble models, defined in
(3), (4) and (5), combining the base learners listed in the corresponding column.</p>
        <p>The runs are configured as follows: OBSER-MENH 0 uses base learner M_0 with no ensemble;
OBSER-MENH 1 uses base learner M_1 with no ensemble; and OBSER-MENH 2, 3 and 4 each combine
the base learners M_0, M_1 and M_2 with one of the three ensemble strategies.</p>
        <p>Regarding the decision-based performance (Table 5), despite the fact that different penalty
values were employed, all the models behave similarly. The recall is 1 in all the runs and the
precision is close to 0. This means that the model effectively identifies the positive gamblers;
nevertheless, the low precision indicates a high number of false positives. In other words, the
model predicts a user as positive in almost all cases.</p>
        <p>In Table 6 we present the ranking-based performance results. These results concern the users'
level of risk estimated from the writings processed so far. As in the previous results, our
models show the same behaviour, with a small improvement in runs 0, 3 and 4 when taking into
account 1000 writings. The runs achieve good results in P@10 and NDCG@10 (1.00 in all runs);
however, when the ranking is calculated across 100 sample users (NDCG@100), our results worsen
to values between 0.64 and 0.65.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions and Future Work</title>
      <p>Our participation in the 2023 edition of eRisk focused on two tasks. On the one hand, Task 1
addressed the challenge of searching for symptoms of depression; on the other hand, Task 2
dealt with the problem of early detection of signs of pathological gambling.</p>
      <p>Regarding Task 1, we tackled the issue as a semantic similarity task by leveraging transformers
and employing pre-trained models to compute the similarity. Our approach yielded a highly
effective method, achieving strong performance and placing us second in the ranking of
best results. In the future we would like to address the problem of ambiguity in the use of
certain terms, which can confuse models based on semantic similarity.</p>
      <p>Our proposed approach for Task 2, early detection of signs of pathological gambling, was an
ensemble of three models. The aim of the ensemble models was to analyze the different
penalty weights applied to the FFNN. The implemented model was effective when predicting
the risk level of the patients, obtaining proficient results in that task; nevertheless, it was
not able to detect users who do not show signs of pathological gambling.</p>
      <p>Further research should be done to deal with the label imbalance and improve our method. It
would be interesting to assess the effects of oversampling the minority class, or to use the
Synthetic Minority Oversampling Technique (SMOTE) to generate artificial examples. Moreover,
the results suggest that performing an ensemble in this scenario is not beneficial, as all the
models show similar results. Therefore, a natural progression of this work is to build an
ensemble of different deep learning methods that show differences in their individual
performance.</p>
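<p>As a starting point for that future work, naive random oversampling of the minority class can be sketched as follows (SMOTE would instead synthesize interpolated feature vectors rather than duplicating examples; the helper below is purely illustrative):</p>

```python
import random

def oversample_minority(samples, labels, target_ratio=1.0, seed=0):
    """Naive random oversampling: duplicate minority-class items until the
    minority/majority size ratio reaches `target_ratio`."""
    rng = random.Random(seed)
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = []
    while len(minority) + len(extra) < target_ratio * len(majority):
        extra.append(rng.choice(minority))  # duplicate a random minority item
    out_samples = list(samples) + extra
    out_labels = list(labels) + [1 if minority is pos else 0] * len(extra)
    return out_samples, out_labels

X = ["post1", "post2", "post3", "post4", "post5"]
y = [0, 0, 0, 0, 1]                 # one pathological gambler, four controls
X2, y2 = oversample_minority(X, y)  # y2 now holds 4 positives and 4 negatives
```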
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>OBSER-MENH, with subprojects GELP (TED2021-130398B-C21) and LOTU
(TED2021-130398B-C22), is funded by MCIN/AEI/10.13039/501100011033 and by the European Union
“NextGenerationEU”/PRTR. In addition, this work was partially funded by the Spanish Ministry of
Science and Innovation (DOTT-HEALTH/PAT-MED PID2019-106942RB-C31, European
Commission (FEDER) and INDICA-MED PID2019-106942RB-C32), by the Basque Government (IXA
IT-1570-22, Ikasiker BOPV 11/07/2022); and by EXTEPA within Misiones Euskampus 2.0.</p>
      <p>[7] I. Abdou Malam, M. Arziki, M. Nezar Bellazrak, F. Benamara, A. El Kaidi, B. Es-Saghir,
Z. He, M. Housni, V. Moriceau, J. Mothe, et al., IRIT at e-Risk, CEUR Workshop Proceedings,
2017.
[8] M. Trotzek, S. Koitka, C. M. Friedrich, Linguistic metadata augmented classifiers at the
CLEF 2017 task for early detection of depression, in: CLEF (Working Notes), 2017.
[9] F. Sadeque, D. Xu, S. Bethard, UArizona at the CLEF eRisk 2017 pilot task: linear and recurrent
models for early depression detection, in: CEUR Workshop Proceedings, volume 1866,
NIH Public Access, 2017.
[10] M. Trotzek, S. Koitka, C. M. Friedrich, Word embeddings and linguistic metadata at the
CLEF 2018 tasks for early detection of depression and anorexia, in: CLEF (Working Notes),
2018.
[11] E. Campillo-Ageitos, J. Martinez-Romo, L. Araujo, UNED-MED at eRisk 2022: depression
detection with TF-IDF, linguistic features and embeddings, Working Notes of CLEF (2022)
5–8.
[12] F. Cacheda, D. F. Iglesias, F. J. Nóvoa, V. Carneiro, Analysis and experiments on early
detection of depression, CLEF (Working Notes) 2125 (2018) 43.
[13] H. Srivastava, L. N. S, S. S, T. Basu, NLP-IISERB@eRisk2022: Exploring the potential of bag
of words, document embeddings and transformer based framework for early prediction
of eating disorder, depression and pathological gambling over social media, in: CLEF
(Working Notes), 2022.
[14] H. T. Nguyen, P. H. Duong, E. Cambria, Learning short-text semantic similarity with
word embeddings and external knowledge sources, Knowledge-Based Systems 182 (2019)
104842.
[15] R. Mihalcea, P. Tarau, TextRank: Bringing order into text, in: Proceedings of the 2004
Conference on Empirical Methods in Natural Language Processing, 2004, pp. 404–411.
[16] A. Lauscher, I. Vulić, E. M. Ponti, A. Korhonen, G. Glavaš, Specializing unsupervised
pretraining models for word-level semantic similarity, in: Proceedings of the
28th International Conference on Computational Linguistics, International Committee
on Computational Linguistics, Barcelona, Spain (Online), 2020, pp. 1371–1383. URL:
https://aclanthology.org/2020.coling-main.118. doi:10.18653/v1/2020.coling-main.118.
[17] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional
transformers for language understanding, arXiv preprint arXiv:1810.04805 (2018).
[18] N. Reimers, I. Gurevych, Sentence-BERT: Sentence embeddings using siamese BERT-networks,
arXiv preprint arXiv:1908.10084 (2019).
[19] J. Parapar, P. Martín-Rodilla, D. E. Losada, F. Crestani, eRisk 2021: Pathological gambling,
self-harm and depression challenges, in: Advances in Information Retrieval: 43rd European
Conference on IR Research, ECIR 2021, Virtual Event, March 28–April 1, 2021, Proceedings,
Part II 43, Springer, 2021, pp. 650–656.
[20] J. Parapar, P. Martín-Rodilla, D. E. Losada, F. Crestani, Overview of eRisk 2022: Early risk
prediction on the internet, in: Experimental IR Meets Multilinguality, Multimodality, and
Interaction: 13th International Conference of the CLEF Association, CLEF 2022, Bologna,
Italy, September 5–8, 2022, Proceedings, Springer, 2022, pp. 233–256.
[21] J. M. Loyola, S. Burdisso, H. Thompson, L. C. Cagnina, M. Errecalde, UNSL at eRisk 2021:
A comparison of three early alert policies for early risk detection, in: CLEF (Working
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>De Choudhury</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Counts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Horvitz</surname>
          </string-name>
          ,
          <article-title>Social Media as a Measurement Tool of Depression in Populations</article-title>
          ,
          <source>in: Proceedings of the 5th Annual ACM Web Science Conference</source>
          , WebSci '13, Association for Computing Machinery, New York, NY, USA,
          <year>2013</year>
          , pp.
          <fpage>47</fpage>
          -
          <lpage>56</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Pennebaker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Mehl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. G.</given-names>
            <surname>Niederhofer</surname>
          </string-name>
          ,
          <article-title>Psychological Aspects of Natural Language Use: Our Words</article-title>
          , Our Selves,
          <source>Annual Review of Psychology</source>
          <volume>54</volume>
          (
          <year>2003</year>
          )
          <fpage>547</fpage>
          -
          <lpage>577</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Parapar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Martín-Rodilla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. E.</given-names>
            <surname>Losada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Crestani</surname>
          </string-name>
          ,
          <article-title>Overview of eRisk 2023: Early risk prediction on the internet</article-title>
          ,
          <source>in: Experimental IR Meets Multilinguality, Multimodality, and Interaction: 14th International Conference of the CLEF Association, CLEF 2023</source>
          , Springer International Publishing, Thessaloniki, Greece,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Beck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. H.</given-names>
            <surname>Ward</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mendelson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Erbaugh</surname>
          </string-name>
          ,
          <article-title>An inventory for measuring depression</article-title>
          ,
          <source>Archives of general psychiatry 4</source>
          (
          <year>1961</year>
          )
          <fpage>561</fpage>
          -
          <lpage>571</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D. E.</given-names>
            <surname>Losada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Crestani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Parapar</surname>
          </string-name>
          ,
          <article-title>eRisk 2017: CLEF lab on early risk prediction on the internet: experimental foundations</article-title>
          ,
          <source>in: Experimental IR Meets Multilinguality, Multimodality, and Interaction: 8th International Conference of the CLEF Association, CLEF 2017, Dublin, Ireland, September 11-14, 2017, Proceedings 8</source>
          , Springer,
          <year>2017</year>
          , pp.
          <fpage>346</fpage>
          -
          <lpage>360</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D. E.</given-names>
            <surname>Losada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Crestani</surname>
          </string-name>
          ,
          <article-title>A test collection for research on depression and language use</article-title>
          ,
          <source>in: Experimental IR Meets Multilinguality, Multimodality, and Interaction: 7th International Conference of the CLEF Association, CLEF 2016, Évora, Portugal, September 5-8, 2016, Proceedings 7</source>
          , Springer,
          <year>2016</year>
          , pp.
          <fpage>28</fpage>
          -
          <lpage>39</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>A.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chinea-Rios</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-S.</given-names>
            <surname>Uban</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Rössler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yenikent</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Chulvi-Ferriols</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Franco-Salvador</surname>
          </string-name>
          ,
          <article-title>UPV-Symanto at eRisk 2021: Mental health author profiling for early risk prediction on the internet</article-title>
          ,
          <source>in: Proceedings of the Working Notes of CLEF 2021, Conference and Labs of the Evaluation Forum, Bucharest, Romania, September 21st to 24th, 2021</source>
          , CEUR,
          <year>2021</year>
          , pp.
          <fpage>908</fpage>
          -
          <lpage>927</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Loyola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Burdisso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Errecalde</surname>
          </string-name>
          ,
          <article-title>UNSL at eRisk 2022: Decision policies with history for early classification</article-title>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>A.-M.</given-names>
            <surname>Bucur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cosma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Dinu</surname>
          </string-name>
          ,
          <article-title>Early risk detection of pathological gambling, self-harm and depression using BERT</article-title>
          ,
          <source>in: CEUR Workshop Proceedings, CEUR-WS</source>
          ,
          <year>2021</year>
          . doi:10.13140/RG.2.2.25060.50567.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>H.</given-names>
            <surname>Fabregat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Duque</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Araujo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Martínez-Romo</surname>
          </string-name>
          ,
          <article-title>UNED-NLP at eRisk 2022: Analyzing gambling disorders in social media using approximate nearest neighbors</article-title>
          ,
          <source>in: Conference and Labs of the Evaluation Forum</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Mármol-Romero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Jiménez-Zafra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Plaza-Del-Arco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Molina-González</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-T.</given-names>
            <surname>Martín-Valdivia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Montejo-Ráez</surname>
          </string-name>
          ,
          <article-title>SINAI at eRisk@CLEF 2022: Approaching early detection of gambling and eating disorders with natural language processing</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>3180</volume>
          , CEUR-WS,
          <year>2022</year>
          , pp.
          <fpage>961</fpage>
          -
          <lpage>971</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>D.</given-names>
            <surname>Cer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-y.</given-names>
            <surname>Kong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Hua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Limtiaco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>John</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Constant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Guajardo-Cespedes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-H.</given-names>
            <surname>Sung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Strope</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kurzweil</surname>
          </string-name>
          ,
          <article-title>Universal sentence encoder</article-title>
          ,
          <year>2018</year>
          . arXiv:1803.11175.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>A.</given-names>
            <surname>Paszke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gross</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Massa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lerer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bradbury</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Chanan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Killeen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Gimelshein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Antiga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Desmaison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kopf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>DeVito</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Raison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tejani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chilamkurthy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Steiner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Fang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chintala</surname>
          </string-name>
          ,
          <article-title>PyTorch: An imperative style, high-performance deep learning library</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          <volume>32</volume>
          , Curran Associates, Inc.,
          <year>2019</year>
          , pp.
          <fpage>8024</fpage>
          -
          <lpage>8035</lpage>
          . URL: http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>